Inquiry about obtaining bulk structures for non-standard Bulk IDs in OC22 Dataset

Dear OC22 Team,

I am working with the OC22 dataset and have encountered an issue regarding the Bulk IDs provided in the OC22 Mapping file.

Issue Description:

  1. Standard Bulk IDs: Most Bulk IDs in the dataset follow the format “mp-xxxx”, which allows for easy retrieval of the corresponding bulk structures from the Materials Project database.

  2. Non-standard Bulk IDs: I’ve noticed several Bulk IDs that do not follow this format. For example, “TiRuO4-rutile”. These IDs cannot be directly used to query the Materials Project database.
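
The split between the two kinds of IDs can be sketched as below. This is only an illustration under stated assumptions: the mapping is taken to be a dictionary of per-system entries that each carry a `bulk_id` key, and any ID matching "mp-" followed by digits is treated as a Materials Project ID. The entries in `example_mapping` are hypothetical placeholders, not real records from the OC22 mapping file.

```python
import re

# Assumption: standard IDs look like "mp-<digits>", as described above.
MP_ID_PATTERN = re.compile(r"^mp-\d+$")


def split_bulk_ids(mapping):
    """Return (standard, non_standard) sorted lists of unique bulk IDs."""
    standard, non_standard = set(), set()
    for entry in mapping.values():
        # Assumption: each entry stores its bulk ID under the key "bulk_id".
        bulk_id = entry["bulk_id"]
        if MP_ID_PATTERN.match(bulk_id):
            standard.add(bulk_id)
        else:
            non_standard.add(bulk_id)
    return sorted(standard), sorted(non_standard)


# Hypothetical stand-in for the loaded mapping file.
example_mapping = {
    "system_a": {"bulk_id": "mp-1234"},
    "system_b": {"bulk_id": "TiRuO4-rutile"},
    "system_c": {"bulk_id": "mp-1234"},
}

standard, non_standard = split_bulk_ids(example_mapping)
print(standard)      # ['mp-1234']
print(non_standard)  # ['TiRuO4-rutile']
```

The second list contains exactly the IDs that cannot be used to query the Materials Project directly.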

Question:
For these non-standard Bulk IDs (e.g., “TiRuO4-rutile”), how can I obtain the corresponding DFT-relaxed bulk structures? I really need them. Is there a specific resource or method for accessing these structures, given that they are not directly available in the Materials Project database?

I would greatly appreciate any guidance, resources, or alternative methods you could provide for obtaining the bulk structures corresponding to these non-standard Bulk IDs.

Thank you for your time and assistance.

Best regards,

Hi -

Apologies for the delay! You can find all our raw DFT trajectories at the following link - Open Catalyst 2022 (OC22). Note that we only released this data for systems in the train/val sets, but you should be able to find most systems there!

Hi mshuaibi,

Thanks for your reply. I know how to get the DFT-relaxed surface structures from OC22, but now I would like to get the bulk structure associated with each of them from the Materials Project. The current problem is that for the non-standard Bulk IDs in the OC22 mapping file (e.g., “TiRuO4-rutile”), I could not find any related bulk information in the Materials Project. So I would like to know how to get the bulk structures and energies in such cases.

Thanks a lot!

Best regards