
mogroupumd/Learning_from_models


Learning_from_models

  This repository contains the performance datasets and the analysis notebooks for the publication "Learning from models: high-dimensional analyses on the performance of machine learning interatomic potentials".

Data

  The performance datasets $\mathcal{D}_ {\mathrm{2300}}$ and $\mathcal{D}_ {\mathrm{124}}$ are in the directory data/. The file D124_log.csv contains $\mathcal{D}_ {\mathrm{124}}$ with all error metrics expressed as log errors ($\mathrm{log}_ {\mathrm{10}} \delta$); its benchmark row is labelled DFT (row 126, or index 125). $\mathcal{D}_ {\mathrm{2300}}^ {\mathrm{48}D}$ can be reproduced by removing the properties in the energy ranking category from $\mathcal{D}_{\mathrm{2300}}$, so there is no separate file for it. The data needed to reproduce the figures in each section are also in data/; the analysis notebooks give the exact file locations for each figure.
Note: the N-optimal_random_variables.png in data/ is a figure mentioned in Notebook 04, which corresponds to an analysis not included in the publication.
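
The steps described in this section (separating the DFT benchmark row, converting log errors back to errors, and removing a category of properties) can be sketched as follows. This is a minimal illustration on made-up data, not code from the notebooks: the column names `model`, `prop_a`, `prop_b`, and `rank_c`, the helper functions, and the assumption that the row labels sit in a dedicated column are all hypothetical, so check the actual CSV headers before adapting it.

```python
import csv
import io


def load_performance_table(handle, label_column):
    """Read a CSV into {row_label: {property_name: float}}."""
    table = {}
    for row in csv.DictReader(handle):
        label = row.pop(label_column)
        table[label] = {name: float(value) for name, value in row.items()}
    return table


def split_benchmark(table, benchmark_label="DFT"):
    """Return (model rows, benchmark row); the label "DFT" follows the README."""
    models = {k: v for k, v in table.items() if k != benchmark_label}
    return models, table[benchmark_label]


def drop_properties(table, excluded):
    """Remove the named property columns from every row."""
    excluded = set(excluded)
    return {
        label: {n: v for n, v in props.items() if n not in excluded}
        for label, props in table.items()
    }


def log_to_linear(props):
    """Invert log10(delta) to recover delta for one row."""
    return {name: 10.0 ** value for name, value in props.items()}


# Toy stand-in for a log-error table; all names and values are invented.
toy_csv = io.StringIO(
    "model,prop_a,prop_b,rank_c\n"
    "m1,-1.0,-2.0,0.5\n"
    "m2,-1.5,-1.0,0.2\n"
    "DFT,-3.0,-3.0,0.0\n"
)
table = load_performance_table(toy_csv, label_column="model")
models, benchmark = split_benchmark(table)

# "rank_c" plays the role of an energy ranking property here.
reduced = drop_properties(models, ["rank_c"])
linear_m1 = log_to_linear(reduced["m1"])
```

With the real files, the list passed to `drop_properties` would hold the names of the energy ranking properties in $\mathcal{D}_{\mathrm{2300}}$, which are not listed in this README.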

Notebooks

Each notebook corresponds to an analysis in the paper:

  • Notebook 01: Figure 4, section 2.3
  • Notebook 02: Figure 5, Supplementary Figures S5 and S6, section 2.4
  • Notebook 03: Figure 7, Supplementary Figures S8, S9, S10, and S11, section 2.6
  • Notebook 04: Related to Supplementary Figure S7

Other algorithms and methods

  Several algorithms and methods are adapted from online sources, including the algorithms for computing Pareto fronts and the inverted generational distance, and the Cholesky method. Links to the original sources are given in the corresponding files and notebooks. References for the other packages and algorithms are listed in the Methods section of the paper.
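
For readers unfamiliar with two of the methods named above, the following is a minimal pure-Python sketch of a Pareto front and of the inverted generational distance (IGD). It is not the implementation used in this repository. It assumes that every objective is minimized and uses the common definition of IGD as the mean Euclidean distance from each reference-front point to its nearest point in the approximation set. The function names and the toy points are illustrative only.

```python
import math


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def inverted_generational_distance(reference_front, approximation):
    """Mean distance from each reference point to its nearest approximation point."""
    total = sum(
        min(math.dist(r, a) for a in approximation) for r in reference_front
    )
    return total / len(reference_front)


# Toy two-objective errors for five hypothetical models.
points = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
front = pareto_front(points)

# Compare an incomplete approximation of the front against the full front.
igd = inverted_generational_distance(front, [(1, 4), (4, 1)])
```

The pairwise dominance check is quadratic in the number of points, which is adequate for small tables. An IGD of zero means that every reference point is matched exactly by the approximation.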

Citation

If you use the datasets or the analyses extensively, you may want to cite the following publication:
Y. Liu, Y. Mo, Learning from models: high-dimensional analyses on the performance of machine learning interatomic potentials. (ready to submit)
