Opening LMDB files

lordiku · August 21, 2021, 3:19pm

I would like to open the LMDB file to see how the data in it is structured. Is there sample code I can use to open it and access the contents?

mshuaibi · August 21, 2021, 7:37pm

You can refer to the end of this tutorial for help with interacting with the LMDBs - ocp/lmdb_dataset_creation.ipynb at master · Open-Catalyst-Project/ocp · GitHub. Note this sample code is for S2EF LMDBs (specifying the directory as src). If you’re interested in interacting with the IS2RE dataset, something like this should do:

from ocpmodels.datasets import SinglePointLmdbDataset
dataset = SinglePointLmdbDataset({"src": "path/to/is2re/data.lmdb"})

lordiku · August 21, 2021, 8:16pm

Thank you very much for the quick response.
I am still struggling to install ocpmodels.
I have tried:
conda install ocpmodels
in a Jupyter notebook, but it does not work. Is there a way around this?

mshuaibi · August 21, 2021, 8:41pm

The repo is not installable via a conda package. Please follow the details here for step-by-step instructions: GitHub - Open-Catalyst-Project/ocp: https://opencatalystproject.org/.

lordiku · August 23, 2021, 6:34pm

Hi Muhammad,

Thank you for all your help.

I hope it is not too much to ask for complete code to open and view the data in the LMDB files. I keep running into errors when I try to follow the sample code you provided, which seems to be only for S2EF. I want to interact with the IS2RE dataset. Please guide me.

I don’t know where in the sample code I am supposed to put:

from ocpmodels.datasets import SinglePointLmdbDataset
dataset = SinglePointLmdbDataset({"src": "path/to/is2re/data.lmdb"})

I look forward to hearing from you.

mshuaibi · August 23, 2021, 10:22pm

If you can share your error messages I can better help you. Navigating the IS2RE dataset is similar to the S2EF dataset:

1. If you haven’t already done so, make sure to download the IS2RE LMDBs here - ocp/DATASET.md at master · Open-Catalyst-Project/ocp · GitHub. Uncompress the downloaded file.
2. Clone the OCP repo onto your machine.
3. Follow the installation instructions here to set up your conda environment - GitHub - Open-Catalyst-Project/ocp: https://opencatalystproject.org/. If you are using a non-gpu machine, use
conda-merge env.common.yml env.cpu.yml > env.yml
when you get to that step. Make sure to pip install -e . from within the cloned ocp directory.
4. Make note of the path to the downloaded+uncompressed IS2RE LMDBs, specifically find the data.lmdb file you’re interested in, let’s assume all/train/data.lmdb.
5. Run the following:

from ocpmodels.datasets import SinglePointLmdbDataset
dataset = SinglePointLmdbDataset({"src": "all/train/data.lmdb"})
sample = dataset[0]
print(sample)

output: Data(atomic_numbers=[86], cell=[1, 3, 3], cell_offsets=[2964, 3], distances=[2964], edge_index=[2, 2964], fixed=[86], force=[86, 3], natoms=86, pos=[86, 3], pos_relaxed=[86, 3], sid=2472718, tags=[86], y_init=6.282500615000004, y_relaxed=-0.025550085000020317)
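
If it helps, here is a minimal sketch of pulling the scalar fields out of a sample like the one printed above. The helper name scalar_summary and the stand-in object are assumptions for illustration only (they are not part of ocpmodels); the field names and values are copied from the output above. With a working install you would pass dataset[0] instead of the stand-in.

```python
from types import SimpleNamespace

# Scalar fields that appear in the printed sample above.
SCALAR_FIELDS = ("sid", "natoms", "y_init", "y_relaxed")


def scalar_summary(sample, fields=SCALAR_FIELDS):
    """Collect the named attributes of a sample into a plain dict.

    Attributes that are missing on the sample are reported as None.
    """
    return {name: getattr(sample, name, None) for name in fields}


# Hypothetical stand-in for dataset[0]; values copied from the output above.
stand_in = SimpleNamespace(
    sid=2472718,
    natoms=86,
    y_init=6.282500615000004,
    y_relaxed=-0.025550085000020317,
)

summary = scalar_summary(stand_in)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The array-like fields in the output (pos, force, edge_index and so on) are shown by shape only, so they are left out of this sketch.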

lordiku · August 24, 2021, 5:47am

Hi Muhammed,

I get this error when I try to run [from ocpmodels.datasets import SinglePointLmdbDataset]:

ModuleNotFoundError Traceback (most recent call last)
in
----> 1 from ocpmodels.datasets import SinglePointLmdbDataset

ModuleNotFoundError: No module named 'ocpmodels'

lordiku · August 24, 2021, 5:54am

mshuaibi · August 24, 2021, 2:25pm

Have you followed the installation instructions first? These errors come from the repo or its dependencies not being installed properly.
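
One quick way to narrow this down is to check, from the same notebook or shell where the import fails, which Python interpreter is running and whether it can see the package. This is only a diagnostic sketch using the standard library; the helper name check_module is made up for illustration. A common cause (an assumption here, not confirmed for this case) is that the notebook kernel is not the conda environment where pip install -e . was run.

```python
import importlib.util
import sys


def check_module(name):
    """Return where a top-level module would be loaded from, or None if it is not importable."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Namespace packages have no single origin file.
    return spec.origin if spec.origin is not None else "namespace package"


# The interpreter actually running this code (compare it with your conda env path).
print("Python interpreter:", sys.executable)

location = check_module("ocpmodels")
if location is None:
    print("ocpmodels is not visible to this interpreter.")
else:
    print("ocpmodels found at:", location)
```

If the interpreter path does not point into the environment you created during installation, the package was installed somewhere this kernel cannot see.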

nchia · June 5, 2023, 2:55pm

Hi!

Do you know why this happened? I installed ocpmodels properly, but I still could not run the code.
Thank you for your help

mshuaibi · June 5, 2023, 4:07pm

Hi -

Try the following:

from ocpmodels.datasets import LmdbDataset

dataset = LmdbDataset({"src": "Documents/etc/etc/train"})

Some context here - When loading S2EF data you only need to give the directory, and not a specific *.lmdb file. When using IS2RE data you need to specify the exact .lmdb file. We’re working on making this easier to use, but hopefully this resolves your issue for the time being.
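
As a rough sanity check before constructing the dataset, you can verify whether a path is a single .lmdb file (what IS2RE expects, per the note above) or a directory containing .lmdb files (what S2EF expects). The function name describe_lmdb_path and its return labels are hypothetical and not part of ocpmodels; it only looks at the filesystem and does not open the database.

```python
import tempfile
from pathlib import Path


def describe_lmdb_path(src):
    """Classify a dataset path as 'file', 'directory', or 'invalid'.

    'file'      -> a single *.lmdb file (IS2RE-style src)
    'directory' -> a folder holding one or more *.lmdb files (S2EF-style src)
    'invalid'   -> neither of the above
    """
    path = Path(src)
    if path.is_file() and path.suffix == ".lmdb":
        return "file"
    if path.is_dir() and any(path.glob("*.lmdb")):
        return "directory"
    return "invalid"


# Small demonstration on a temporary folder with an empty placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "data.lmdb").touch()
    print(describe_lmdb_path(folder / "data.lmdb"))  # single-file case
    print(describe_lmdb_path(folder))                # directory case
    print(describe_lmdb_path(folder / "missing"))    # path that does not exist
```

If the check reports "invalid" for the path you are passing as src, the error is likely a path problem rather than an installation problem.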

Sejal2002 · July 12, 2023, 5:44pm

Sir, can you please help me with the pip equivalent commands for creating the virtual environments with the required OCP dependencies?
The installation instructions are given to set up the Conda environment. However, I cannot run Conda on my cluster, hence the need to use Pip.

Thank you for your time!
