Learning_from_models

This repository contains the performance datasets and the analysis notebooks for the publication "Learning from models: high-dimensional analyses on the performance of machine learning interatomic potentials".

Data

The performance datasets $\mathcal{D}_ {\mathrm{2300}}$ and $\mathcal{D}_ {\mathrm{124}}$ are in the directory data/. D124_log.csv corresponds to $\mathcal{D}_ {\mathrm{124}}$, but all error metrics are given as log errors ($\mathrm{log}_ {\mathrm{10}}$ $\delta$) and the benchmark row is labelled DFT (row 126, or index 125). $\mathcal{D}_ {\mathrm{2300}}^ {\mathrm{48}D}$ can be reproduced by removing the properties in the energy ranking category from $\mathcal{D}_ {\mathrm{2300}}$, so no separate file is provided for it. The data needed to reproduce the figures in the different sections are also in data/; the analysis notebooks give the exact location of the data for each figure.
Note: the N-optimal_random_variables.png in data/ is a figure mentioned in Notebook 04, which corresponds to an analysis not included in the publication.
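
As an illustration of how the log-error file might be handled, the sketch below separates the DFT benchmark row from the model rows and converts $\mathrm{log}_ {\mathrm{10}}$ errors back to linear errors. It runs on a small toy table rather than on data/D124_log.csv itself: the column names (`model`, `energy_log_error`, `force_log_error`), the placeholder values, and the helper functions `split_benchmark` and `log_to_linear` are all hypothetical and are not taken from the repository. Check the actual column layout of the CSV before adapting it.

```python
import numpy as np
import pandas as pd

# Toy stand-in for data/D124_log.csv. The column names and all values
# are hypothetical; the benchmark row values are placeholders (NaN).
toy = pd.DataFrame(
    {
        "model": ["model_A", "model_B", "DFT"],
        "energy_log_error": [-2.0, -3.0, np.nan],
        "force_log_error": [-1.0, -1.5, np.nan],
    }
)


def split_benchmark(df, label_col="model", benchmark="DFT"):
    """Return (model rows, benchmark rows) based on the label column."""
    is_benchmark = df[label_col] == benchmark
    models = df[~is_benchmark].reset_index(drop=True)
    bench = df[is_benchmark].reset_index(drop=True)
    return models, bench


def log_to_linear(df, label_col="model"):
    """Convert every log10 error column back to a linear error."""
    out = df.copy()
    metric_cols = [c for c in out.columns if c != label_col]
    out[metric_cols] = 10.0 ** out[metric_cols]
    return out


models, bench = split_benchmark(toy)
linear = log_to_linear(models)
print(linear)
```

With the real file, the toy table would be replaced by something like `pd.read_csv("data/D124_log.csv")`, with `label_col` set to whichever column holds the row labels.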

Notebooks

Each notebook corresponds to an analysis in the paper:

Notebook 01: Figure 4 (section 2.3)
Notebook 02: Figure 5 and Supplementary Figures S5 and S6 (section 2.4)
Notebook 03: Figure 7 and Supplementary Figures S8, S9, S10, and S11 (section 2.6)
Notebook 04: related to Supplementary Figure S7

Other algorithms and methods

Several algorithms and methods come from online sources, including the algorithms for computing Pareto fronts and the inverted generational distance, and the Cholesky method. The source websites are cited in the corresponding files and notebooks. References for the other packages and algorithms are given in the Methods section of the paper.
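
For readers unfamiliar with two of the methods named above, here is a minimal NumPy sketch of a Pareto front and of the inverted generational distance (IGD). It is not the implementation in Pareto.py or the notebooks. It assumes that every column is an error to be minimized, and it uses the common definition of IGD as the mean Euclidean distance from each reference point to its nearest point in the approximating set. The function names and the toy error values are hypothetical.

```python
import numpy as np


def pareto_front_mask(points):
    """Boolean mask of non-dominated rows, assuming lower is better.

    A row is dominated if some other row is no worse in every column
    and strictly better in at least one column.
    """
    points = np.asarray(points, dtype=float)
    mask = np.ones(points.shape[0], dtype=bool)
    for i in range(points.shape[0]):
        no_worse = np.all(points <= points[i], axis=1)
        strictly_better = np.any(points < points[i], axis=1)
        if np.any(no_worse & strictly_better):
            mask[i] = False
    return mask


def inverted_generational_distance(reference, approximation):
    """Mean distance from each reference point to its nearest approximation point."""
    reference = np.asarray(reference, dtype=float)
    approximation = np.asarray(approximation, dtype=float)
    # Pairwise Euclidean distances, shape (n_reference, n_approximation).
    diffs = reference[:, None, :] - approximation[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.min(axis=1).mean()


# Toy two-property error table (hypothetical values, one row per model).
errors = np.array([[1.0, 4.0], [2.0, 2.0], [4.0, 1.0], [3.0, 3.0], [5.0, 5.0]])
mask = pareto_front_mask(errors)
front = errors[mask]
igd_full = inverted_generational_distance(front, front)
igd_single = inverted_generational_distance(front, errors[[1]])
print(mask, igd_full, igd_single)
```

The double loop implied by the dominance check scales quadratically with the number of rows, which is fine for tables of a few thousand models but would need a faster algorithm for much larger sets.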

Citation

If you use the datasets or the analyses extensively, you may want to cite the following publication:
Y. Liu, Y. Mo, Learning from models: high-dimensional analyses on the performance of machine learning interatomic potentials. (ready to submit)

Folders and files

data
training_datasets
01_Introduction_and_pareto_fronts.ipynb
02_Analyzing_high-dimensional_performance.ipynb
03_Correlations_and_prediction_relations.ipynb
04_Analyzing_the_Pareto_fronts_of_random_variables.ipynb
Pareto.py
README.md
basics.py
network_mvc.py
