Questions on available DFT data for training

In the OC20 dataset, it seems that DFT data for the relaxed total system energy and the DFT-calculated adsorption energy are available for training new machine learning models. Are DFT-calculated energies for the relaxed slab only and the relaxed adsorbate only also available in OC20 for training new models, alongside the total relaxed system energy and the adsorption energy? Thanks!

Hi -

Slab-only data can be found here: ocp/DATASET.md at main · Open-Catalyst-Project/ocp · GitHub. As for adsorbate-only data, we just used a linear reference of a few adsorbates (see Table 5 here). For example, the energy of CO is the sum of its per-element terms: -7.282 + (-7.204) = -14.486 eV.
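To make the arithmetic concrete, here is a minimal Python sketch of the linear reference. It only contains the two values quoted above, and it assumes that -7.282 is the C term and -7.204 is the O term; check Table 5 for the actual assignment and for other elements. The function names, and the convention that the adsorption energy is the system energy minus the slab energy minus the adsorbate reference, are assumptions for illustration and not code from the ocp repository.

```python
# Per-element reference energies in eV.
# ASSUMPTION: -7.282 is the C term and -7.204 is the O term; only the two
# values quoted in this thread are included. Other elements come from Table 5.
ELEMENT_REFERENCE_EV = {"C": -7.282, "O": -7.204}


def adsorbate_reference_energy(composition):
    """Linear reference: sum over elements of (atom count * per-element energy)."""
    return sum(count * ELEMENT_REFERENCE_EV[element]
               for element, count in composition.items())


def adsorption_energy(e_system, e_slab, composition):
    """Hypothetical helper: E_ads = E_system - E_slab - E_adsorbate_reference."""
    return e_system - e_slab - adsorbate_reference_energy(composition)


co_energy = adsorbate_reference_energy({"C": 1, "O": 1})
print(f"{co_energy:.3f}")  # -14.486
```

With the relaxed system energy and the slab-only energy from the dataset, `adsorption_energy(e_system, e_slab, {"C": 1, "O": 1})` would then give the CO adsorption energy under the convention assumed here.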


Your timely answers are so helpful! Thanks a lot!

One more question: what do the labels below in Table 2 (S2EF Test) of the same paper mean?
ID, OOD Ads, OOD Cat, OOD Both
Thanks!
Found the answer in the paper: "Subsplits include sampling from the same distribution as training (In Domain), unseen adsorbates (Out of Domain (OOD) Adsorbate), unseen element compositions for catalysts (OOD Catalyst), and unseen adsorbates and catalysts (OOD Both)."
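In case it helps anyone else, the four labels can be read as a two-way split on whether the adsorbate and the catalyst composition appear in training. The function below is only a sketch of that reading, with a hypothetical name and boolean inputs; it is not how the dataset splits were actually generated.

```python
def s2ef_subsplit(adsorbate_seen, catalyst_seen):
    """Map (adsorbate seen in training?, catalyst composition seen?) to a label."""
    if adsorbate_seen and catalyst_seen:
        return "ID"        # same distribution as training
    if catalyst_seen:
        return "OOD Ads"   # unseen adsorbate, seen catalyst composition
    if adsorbate_seen:
        return "OOD Cat"   # seen adsorbate, unseen catalyst composition
    return "OOD Both"      # unseen adsorbate and unseen catalyst composition


for ads_seen in (True, False):
    for cat_seen in (True, False):
        print(ads_seen, cat_seen, s2ef_subsplit(ads_seen, cat_seen))
```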
