Questions about Data mapping information

lcb · March 27, 2024, 1:52pm

Hello, OCP team!
Recently I have been working with the Data mapping information (a Python pickle) to process relevant data. I need the slab and adsorbate information for each system in the OC22 dataset. Loading the pickle file gives a Python dictionary, but I’m confused about the meaning of one key, miller_index. In DATASET.md, miller_index is described as a 3-tuple of integers indicating the Miller indices of the surface. I don’t quite understand this explanation. Can I interpret this key as the location of the adsorbate on the slab? That is, for the same slab, would a different miller_index mean a different adsorbate position? Specifically, in an OER reaction, the adsorbates OH*, O*, and OOH* appear sequentially. Should OH*, O*, and OOH* all share the same miller_index?
Can you help explain in detail what this key (miller_index) means? Thank you.

mshuaibi · March 27, 2024, 8:16pm

Hi -

Not quite. When creating catalyst surfaces, you must first start with some bulk material, which is identified as bulk_id in the data mapping. Starting from that bulk material, there are different ways one can “slice” it to create different surfaces, for example:

[image of example surface structures]

These different surfaces are defined using the Miller index, which, in simple terms, is just a notation for the different planes of the bulk crystal. Different surfaces can perform differently, so it is important to study adsorbates placed on different Miller indices, e.g. Cu (100) vs. Cu (111).
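To make the distinction concrete, here is a minimal sketch of grouping systems by surface rather than by adsorbate position. It assumes the mapping is a dictionary keyed by sid whose entries contain bulk_id and miller_index; the ads_symbols key and every value below are invented for illustration and are not taken from the actual pickle.

```python
from collections import defaultdict

# Hypothetical entries shaped like the data mapping: sid -> metadata.
# Only bulk_id and miller_index are key names mentioned in this thread;
# the ads_symbols key and all values are made up for illustration.
mapping = {
    "sid_1": {"bulk_id": "bulk-A", "miller_index": (1, 0, 0), "ads_symbols": "OH"},
    "sid_2": {"bulk_id": "bulk-A", "miller_index": (1, 0, 0), "ads_symbols": "O"},
    "sid_3": {"bulk_id": "bulk-A", "miller_index": (1, 0, 0), "ads_symbols": "OOH"},
    "sid_4": {"bulk_id": "bulk-A", "miller_index": (1, 1, 1), "ads_symbols": "OH"},
}


def group_by_surface(mapping):
    """Group system ids by (bulk_id, miller_index), i.e. by surface cut."""
    surfaces = defaultdict(list)
    for sid, meta in mapping.items():
        key = (meta["bulk_id"], tuple(meta["miller_index"]))
        surfaces[key].append(sid)
    return dict(surfaces)


surfaces = group_by_surface(mapping)
for (bulk_id, hkl), sids in surfaces.items():
    print(bulk_id, hkl, sids)
```

In this toy data, the OH*, O*, and OOH* systems placed on the same surface share one miller_index, while the (1, 1, 1) entry is a different cut of the same bulk. Note that the pair (bulk_id, miller_index) may not identify a slab uniquely, since one plane can have several terminations.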

Hope that helps!

lcb · March 28, 2024, 2:01am

Thank you for your answer. I think I understand.
Regarding the picture you shared, can you tell me the Miller indices of the last two structures? I don’t understand how they were created.

mshuaibi · March 28, 2024, 3:20am

I don’t have the exact Miller indices for these; I mainly use them to illustrate different surfaces. You can look through this to get a better understanding of what various Miller indices correspond to: 1.2: Miller Indices (hkl) - Chemistry LibreTexts.
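As a rough illustration of what a 3-tuple (h, k, l) encodes, the sketch below computes where the plane cuts the cell axes and the spacing between adjacent planes. It assumes a simple cubic lattice; the function names and the lattice constant are hypothetical and not part of any OCP tooling.

```python
import math


def axis_intercepts(hkl):
    """Intercepts of the (hkl) plane on the cell axes, in units of the
    lattice vectors. A zero index means the plane is parallel to that
    axis, so the intercept is at infinity."""
    return tuple(math.inf if i == 0 else 1 / i for i in hkl)


def cubic_d_spacing(hkl, a):
    """Distance between adjacent (hkl) planes in a simple cubic lattice
    with edge length a: d = a / sqrt(h^2 + k^2 + l^2)."""
    h, k, l = hkl
    return a / math.sqrt(h * h + k * k + l * l)


a = 4.0  # hypothetical lattice constant in angstrom, for illustration only
for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1)]:
    print(hkl, axis_intercepts(hkl), round(cubic_d_spacing(hkl, a), 3))
```

The (1, 0, 0) plane cuts only the first axis, while (1, 1, 1) cuts all three at equal distances, which is why the two surfaces expose different atomic arrangements.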

lcb · March 28, 2024, 7:00am

Ok, I see. Thank you so much for helping me.

lcb · March 28, 2024, 2:41pm

Now I have a new question.
If I have a sid or a bulk_id for a material, how do I find the LMDB Data object for that material? Right now I can only iterate over all the materials in OC22 until I find the match. Is there another way?

mshuaibi · April 2, 2024, 4:19pm

You may have an easier time downloading the following: ocp/DATASET.md at main · Open-Catalyst-Project/ocp · GitHub. These are the full trajectories for the different sids; you can get what you need from them rather than searching the LMDBs.
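If you do stay with the LMDBs, one generic option is to scan them once and keep a sid-to-position index, so later lookups do not need another full pass. The sketch below only illustrates that idea: it assumes each record exposes a sid attribute, and it uses stand-in objects instead of real Data objects. build_sid_index is a hypothetical helper, not an OCP function.

```python
from types import SimpleNamespace


def build_sid_index(records):
    """Map each record's sid to its position, scanning the data once."""
    return {rec.sid: i for i, rec in enumerate(records)}


# Stand-in records with made-up values; a real dataset would yield
# Data objects here instead of SimpleNamespace instances.
records = [
    SimpleNamespace(sid=sid, natoms=natoms)
    for sid, natoms in [(101, 40), (205, 52), (309, 47)]
]

sid_to_idx = build_sid_index(records)

# Later lookups are dictionary accesses rather than full scans.
hit = records[sid_to_idx[205]]
print(hit.sid, hit.natoms)
```

The index is a plain dictionary, so it could also be pickled and reused across sessions as long as the dataset ordering does not change.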
