Commit

add model metrics table when only validating the 10k most stable predictions for each model
janosh committed Jun 20, 2023
1 parent 3ac72e6 commit c4ca186
Showing 28 changed files with 715 additions and 362 deletions.
3 changes: 2 additions & 1 deletion matbench_discovery/metrics.py
@@ -101,8 +101,9 @@ def stable_metrics(
          DAF=precision / prevalence,
          Precision=precision,
          Recall=recall,
-         **dict(TPR=TPR, FPR=FPR, TNR=TNR, FNR=FNR),
          Accuracy=(n_true_pos + n_true_neg) / len(each_true),
+         **dict(TPR=TPR, FPR=FPR, TNR=TNR, FNR=FNR),
+         **dict(TP=n_true_pos, FP=n_false_pos, TN=n_true_neg, FN=n_false_neg),
          MAE=np.abs(each_true - each_pred).mean(),
          RMSE=((each_true - each_pred) ** 2).mean() ** 0.5,
          R2=r2_score(each_true, each_pred),
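For context, `stable_metrics` assembles standard classification and regression metrics from DFT vs predicted hull distances. A minimal self-contained sketch of what the function computes (simplified: the 0 eV/atom stability threshold is the benchmark's convention, and R² is computed by hand here instead of via `sklearn.metrics.r2_score` as in the real code):

```python
import numpy as np

def stable_metrics_sketch(each_true, each_pred, threshold=0.0):
    """Simplified sketch of matbench_discovery.metrics.stable_metrics.

    A material counts as stable when its energy above hull <= threshold (eV/atom).
    """
    each_true, each_pred = np.asarray(each_true), np.asarray(each_pred)
    is_stable, pred_stable = each_true <= threshold, each_pred <= threshold

    n_tp = int((is_stable & pred_stable).sum())  # stable, predicted stable
    n_fp = int((~is_stable & pred_stable).sum())  # unstable, predicted stable
    n_tn = int((~is_stable & ~pred_stable).sum())
    n_fn = int((is_stable & ~pred_stable).sum())

    precision = n_tp / (n_tp + n_fp)
    prevalence = is_stable.mean()  # fraction of truly stable materials

    ss_res = ((each_true - each_pred) ** 2).sum()
    ss_tot = ((each_true - each_true.mean()) ** 2).sum()

    return dict(
        DAF=precision / prevalence,  # discovery acceleration factor
        Precision=precision,
        Recall=n_tp / (n_tp + n_fn),
        Accuracy=(n_tp + n_tn) / len(each_true),
        TP=n_tp, FP=n_fp, TN=n_tn, FN=n_fn,
        MAE=np.abs(each_true - each_pred).mean(),
        RMSE=(((each_true - each_pred) ** 2).mean()) ** 0.5,
        R2=1 - ss_res / ss_tot,  # real code uses sklearn.metrics.r2_score
    )
```

Note how DAF falls out of the confusion matrix: it measures how much the model's hit rate for stable materials exceeds that of random selection.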
2 changes: 1 addition & 1 deletion matbench_discovery/plots.py
@@ -703,7 +703,7 @@ def cumulative_precision_recall(
      df = dfs[metric]
      ax.set(ylim=(0, 1), xlim=(0, None), ylabel=metric)
      for model in df_preds:
-         # TODO is this if really necessary?
+         # TODO is this really necessary?
          if len(df[model].dropna()) == 0:
              continue
          x_end = df[model].dropna().index[-1]
13 changes: 12 additions & 1 deletion matbench_discovery/preds.py
@@ -7,7 +7,7 @@
  from tqdm import tqdm

  from matbench_discovery import ROOT
- from matbench_discovery.data import Files, glob_to_df
+ from matbench_discovery.data import Files, df_wbm, glob_to_df
  from matbench_discovery.metrics import stable_metrics
  from matbench_discovery.plots import eVpa, model_labels, quantity_labels

@@ -131,13 +131,24 @@ def load_df_wbm_with_preds(


  df_metrics = pd.DataFrame()
+ df_metrics_10k = pd.DataFrame()  # look only at each model's 10k most stable predictions
+ prevalence = (df_wbm[each_true_col] <= 0).mean()

  df_metrics.index.name = "model"
  for model in PRED_FILES:
      each_pred = df_preds[each_true_col] + df_preds[model] - df_preds[e_form_col]
      df_metrics[model] = stable_metrics(df_preds[each_true_col], each_pred)
+     most_stable_10k = each_pred.nsmallest(10_000)
+     df_metrics_10k[model] = stable_metrics(
+         df_preds[each_true_col].loc[most_stable_10k.index], most_stable_10k
+     )
+     df_metrics_10k[model]["DAF"] = df_metrics_10k[model]["Precision"] / prevalence


  # pick F1 as primary metric to sort by
  df_metrics = df_metrics.round(3).sort_values("F1", axis=1, ascending=False)
+ df_metrics_10k = df_metrics_10k.round(3).sort_values("F1", axis=1, ascending=False)


  # dataframe of all models' energy above convex hull (EACH) predictions (eV/atom)
  df_each_pred = pd.DataFrame()
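The 10k-most-stable table in this hunk follows a simple recipe: rank each model's predicted hull distances, keep the 10,000 lowest, score only those rows, but keep DAF anchored to the prevalence of the full test set (hence the `df_wbm` import). A runnable sketch on synthetic data (the column names and toy distributions are illustrative stand-ins, not the real WBM columns):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
n_samples, top_k = 50_000, 10_000

# toy stand-in for df_preds: DFT hull distance plus a correlated, noisy model prediction
each_true = rng.normal(loc=0.1, scale=0.3, size=n_samples)
each_pred = each_true + rng.normal(scale=0.1, size=n_samples)
df = pd.DataFrame({"each_true": each_true, "each_pred": each_pred})

# prevalence is computed on the FULL set, not on the 10k subset
prevalence = (df["each_true"] <= 0).mean()

# a model's 10k most stable predictions = its smallest predicted hull distances
most_stable_10k = df["each_pred"].nsmallest(top_k)
subset_true = df["each_true"].loc[most_stable_10k.index]

# within the subset every material is predicted stable, so precision is simply
# the fraction that is truly stable; DAF compares that to dummy selection
precision = (subset_true <= 0).mean()
daf = precision / prevalence
print(f"{prevalence=:.3f} {precision=:.3f} {daf=:.2f}")
```

This also motivates the `DAF` overwrite in the diff: a DAF computed from the subset's own prevalence would be uninformative, so the module recomputes it against the full-set prevalence after calling `stable_metrics` on the subset.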
2 changes: 1 addition & 1 deletion models/bowsr/metadata.yml
@@ -25,7 +25,7 @@ requirements:
    megnet: 1.3.2
    numpy: 1.24.0
    pandas: 1.5.1
- trained_on_benchmark: false
+ trained_for_benchmark: false

hyperparams:
Optimizer Params:
4 changes: 2 additions & 2 deletions models/cgcnn/metadata.yml
@@ -20,7 +20,7 @@
    torch-scatter: 2.0.9
    numpy: 1.24.0
    pandas: 1.5.1
- trained_on_benchmark: true
+ trained_for_benchmark: true

hyperparams:
Ensemble Size: 10
@@ -57,7 +57,7 @@
    torch-scatter: 2.0.9
    numpy: 1.24.0
    pandas: 1.5.1
- trained_on_benchmark: true
+ trained_for_benchmark: true

hyperparams:
Ensemble Size: 10
3 changes: 2 additions & 1 deletion models/chgnet/metadata.yml
@@ -31,7 +31,8 @@ requirements:
    ase: 3.22.0
    pymatgen: 2022.10.22
    numpy: 1.24.0
- trained_on_benchmark: false
+ trained_for_benchmark: false
+ # training_set: MPTraj

hyperparams:
max_steps: 2000
95 changes: 29 additions & 66 deletions models/m3gnet/metadata.yml
@@ -1,66 +1,29 @@
- - model_name: M3GNet
-   model_version: 2022.9.20
-   matbench_discovery_version: 1.0
-   date_added: "2022-09-20"
-   date_published: "2022-02-05"
-   authors:
-     - name: Chi Chen
-       affiliation: UC San Diego
-       role: Model
-       orcid: https://orcid.org/0000-0001-8008-7043
-     - name: Shyue Ping Ong
-       affiliation: UC San Diego
-       orcid: https://orcid.org/0000-0001-5726-2587
-       email: ongsp@ucsd.edu
-   repo: https://github.com/materialsvirtuallab/m3gnet
-   url: https://materialsvirtuallab.github.io/m3gnet
-   doi: https://doi.org/10.1038/s43588-022-00349-3
-   preprint: https://arxiv.org/abs/2202.02450
-   requirements:
-     m3gnet: 0.1.0
-     pymatgen: 2022.10.22
-     numpy: 1.24.0
-     pandas: 1.5.1
-   trained_on_benchmark: false
-   notes:
-     description: M3GNet is a GNN-based universal (as in full periodic table) interatomic potential for materials trained on up to 3-body interactions in the initial, middle and final frame of MP DFT relaxations.
-     long: It thereby learns to emulate structure relaxation, MD simulations and property prediction of materials across diverse chemical spaces.
-     training: Using pre-trained model released with paper. Was only trained on a subset of 62,783 MP relaxation trajectories in the 2018 database release (see [related issue](https://github.com/materialsvirtuallab/m3gnet/issues/20#issuecomment-1207087219)).
-
- - model_name: M3GNet + MEGNet
-   model_version: 2022.9.20
-   matbench_discovery_version: 1.0
-   date_added: "2023-02-03"
-   date_published: "2022-02-05"
-   authors:
-     - name: Chi Chen
-       affiliation: UC San Diego
-       role: Model
-       orcid: https://orcid.org/0000-0001-8008-7043
-     - name: Weike Ye
-       affiliation: UC San Diego
-       orcid: https://orcid.org/0000-0002-9541-7006
-     - name: Yunxing Zuo
-       affiliation: UC San Diego
-       orcid: https://orcid.org/0000-0002-2734-7720
-     - name: Chen Zheng
-       affiliation: UC San Diego
-       orcid: https://orcid.org/0000-0002-2344-5892
-     - name: Shyue Ping Ong
-       affiliation: UC San Diego
-       orcid: https://orcid.org/0000-0001-5726-2587
-       email: ongsp@ucsd.edu
-   repo: https://github.com/materialsvirtuallab/m3gnet
-   url: https://materialsvirtuallab.github.io/m3gnet
-   doi: https://doi.org/10.1038/s43588-022-00349-3
-   preprint: https://arxiv.org/abs/2202.02450
-   requirements:
-     m3gnet: 0.1.0
-     megnet: 1.3.2
-     pymatgen: 2022.10.22
-     numpy: 1.24.0
-     pandas: 1.5.1
-   trained_on_benchmark: false
-   notes:
-     description: This combination of models uses M3GNet to relax initial structures and then passes it to MEGNet to predict the formation energy.
-     training: Using pre-trained model released with paper. Was only trained on a subset of 62,783 MP relaxation trajectories in the 2018 database release (see [related issue](https://github.com/materialsvirtuallab/m3gnet/issues/20#issuecomment-1207087219)).
+ model_name: M3GNet
+ model_version: 2022.9.20
+ matbench_discovery_version: 1.0
+ date_added: "2022-09-20"
+ date_published: "2022-02-05"
+ authors:
+   - name: Chi Chen
+     affiliation: UC San Diego
+     role: Model
+     orcid: https://orcid.org/0000-0001-8008-7043
+   - name: Shyue Ping Ong
+     affiliation: UC San Diego
+     orcid: https://orcid.org/0000-0001-5726-2587
+     email: ongsp@ucsd.edu
+ repo: https://github.com/materialsvirtuallab/m3gnet
+ url: https://materialsvirtuallab.github.io/m3gnet
+ doi: https://doi.org/10.1038/s43588-022-00349-3
+ preprint: https://arxiv.org/abs/2202.02450
+ requirements:
+   m3gnet: 0.1.0
+   pymatgen: 2022.10.22
+   numpy: 1.24.0
+   pandas: 1.5.1
+ trained_for_benchmark: false
+ notes:
+   description: M3GNet is a GNN-based universal (as in full periodic table) interatomic potential for materials trained on up to 3-body interactions in the initial, middle and final frame of MP DFT relaxations.
+   long: It thereby learns to emulate structure relaxation, MD simulations and property prediction of materials across diverse chemical spaces.
+   training: Using pre-trained model released with paper. Was only trained on a subset of 62,783 MP relaxation trajectories in the 2018 database release (see [related issue](https://github.com/materialsvirtuallab/m3gnet/issues/20#issuecomment-1207087219)).
+   testing: We also tried combining M3GNet with MEGNet where M3GNet is used to relax initial structures which are then passed to MEGNet to predict the formation energy.
2 changes: 1 addition & 1 deletion models/megnet/metadata.yml
@@ -29,7 +29,7 @@ requirements:
    pymatgen: 2022.10.22
    numpy: 1.24.0
    pandas: 1.5.1
- trained_on_benchmark: false
+ trained_for_benchmark: false

notes:
description: MatErials Graph Network is another GNN for material properties of relaxed structure which showed that learned element embeddings encode periodic chemical trends and can be transfer-learned from large data sets (formation energies) to predictions on small data properties (band gaps, elastic moduli).
2 changes: 1 addition & 1 deletion models/voronoi/metadata.yml
@@ -21,7 +21,7 @@ requirements:
    pymatgen: 2022.10.22
    numpy: 1.24.0
    pandas: 1.5.1
- trained_on_benchmark: true
+ trained_for_benchmark: true

notes:
description: A random forest trained to map the combo of composition-based Magpie features and structure-based relaxation-invariant Voronoi tessellation features (bond angles, coordination numbers, ...) to DFT formation energies.
2 changes: 1 addition & 1 deletion models/wrenformer/metadata.yml
@@ -25,7 +25,7 @@ requirements:
    pymatgen: 2022.10.22
    numpy: 1.24.0
    pandas: 1.5.1
- trained_on_benchmark: true
+ trained_for_benchmark: true

hyperparams:
Ensemble Size: 10
2 changes: 1 addition & 1 deletion paper
Submodule paper updated from 3ea614 to d7c7bf
10 changes: 5 additions & 5 deletions readme.md
@@ -1,6 +1,6 @@
- <h1 align="center" style="display: grid;">
-   <img src="https://raw.githubusercontent.com/janosh/matbench-discovery/main/site/static/favicon.svg" alt="Logo" width="80px">
-   Matbench Discovery
+ <h1 align="center">
+   <img src="https://github.com/janosh/matbench-discovery/raw/main/site/static/favicon.svg" alt="Logo" width="60px"><br>
+   Matbench Discovery
  </h1>

<h4 align="center" class="toc-exclude">
@@ -17,12 +17,12 @@ Matbench Discovery
  Matbench Discovery is an [interactive leaderboard](https://janosh.github.io/matbench-discovery) and associated [PyPI package](https://pypi.org/project/matbench-discovery) which together make it easy to benchmark ML energy models on a task designed to closely simulate a high-throughput discovery campaign for new stable inorganic crystals.

- In version 1 of this benchmark, we explore 8 models covering multiple methodologies ranging from random forests to graph neural networks, from one-shot predictors to iterative Bayesian optimizers and interatomic potential-based relaxers. We find [CHGNet](https://github.com/CederGroupHub/chgnet) ([paper](https://doi.org/10.48550/arXiv.2302.14231)) to achieve the highest F1 score of 0.59, $R^2$ of 0.61 and a discovery acceleration factor (DAF) of 3.06 (meaning a 3x higher rate of stable structures compared to dummy selection in our already enriched search space). See the [**full results**](https://janosh.github.io/matbench-discovery/preprint#results) in our interactive dashboard which provides valuable insights for maintainers of large-scale materials databases. We show these models have become powerful enough to warrant deploying them as triaging steps to more effectively allocate compute in high-throughput DFT relaxations.
+ So far, we've tested 8 models covering multiple methodologies ranging from random forests with structure fingerprints to graph neural networks, from one-shot predictors to iterative Bayesian optimizers and interatomic potential-based relaxers. We find [CHGNet](https://github.com/CederGroupHub/chgnet) ([paper](https://doi.org/10.48550/arXiv.2302.14231)) to achieve the highest F1 score of 0.59, $R^2$ of 0.61 and a discovery acceleration factor (DAF) of 3.06 (meaning a 3x higher rate of stable structures compared to dummy selection in our already enriched search space). We believe our results show that ML models have become robust enough to deploy them as triaging steps to more effectively allocate compute in high-throughput DFT relaxations. This work provides valuable insights for anyone looking to build large-scale materials databases.

  <slot name="metrics-table" />

  We welcome contributions that add new models to the leaderboard through [GitHub PRs](https://github.com/janosh/matbench-discovery/pulls). See the [usage and contributing guide](https://janosh.github.io/matbench-discovery/contribute) for details.

- For a version 2 release of this benchmark, we plan to merge the current training and test sets into the new training set and acquire a much larger test set (potentially at meta-GGA level of theory) compared to the v1 test set of 257k structures. Anyone interested in joining this effort please [open a GitHub discussion](https://github.com/janosh/matbench-discovery/discussions) or [reach out privately](mailto:janosh@lbl.gov?subject=Matbench%20Discovery).
+ Anyone interested in joining this effort please [open a GitHub discussion](https://github.com/janosh/matbench-discovery/discussions) or [reach out privately](mailto:janosh@lbl.gov?subject=Matbench%20Discovery).

  For detailed results and analysis, check out the [preprint](https://janosh.github.io/matbench-discovery/preprint) and [supplementary material](https://janosh.github.io/matbench-discovery/si).
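The DAF quoted in the README is simply precision divided by prevalence, i.e. how much a model enriches the hit rate over random selection. A quick worked check (the precision and prevalence values below are illustrative assumptions chosen to reproduce the quoted DAF, not figures from the paper):

```python
# DAF = precision / prevalence
prevalence = 0.167  # assumed: fraction of truly stable materials in the test set
precision = 0.511  # assumed: fraction of model-flagged materials that are stable
daf = precision / prevalence
print(f"DAF = {daf:.2f}")
```

So a DAF of ~3 means roughly 3 of every 10 model-selected candidates are stable, versus fewer than 2 of every 10 picked at random.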