Customised dataset attribute


Hi OCP team, I’m currently trying some datasets from outside OCP, so I’m creating my own LMDB dataset. I noticed that the LMDB creation tutorial lists ‘pos_relaxed’ and ‘y_init’ as attributes necessary for the IS2RE/IS2RS tasks. However, I couldn’t find these two attributes used in your models, and I can train normally without them. May I know what these two attributes are used for?

Thank you very much.

pos_relaxed is used in only two places: when running relaxations, to check whether the data is val or test, and to compute IS2RS metrics on val. y_init is not used anywhere in the codebase. Neither property is needed by the actual models; we include them mainly for completeness. Someone interested in doing IS2RS directly (predicting relaxed positions without running a relaxation) could, for instance, use pos_relaxed in their loss function.
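
To make the last point concrete, here is a minimal sketch of what such a loss could look like. It is not code from the OCP repo: the function name relaxed_position_l1_loss is hypothetical, and it uses plain Python lists so it runs without dependencies. In a real model you would most likely compute the same quantity on tensors.

```python
from typing import Sequence

Vec3 = Sequence[float]


def relaxed_position_l1_loss(
    pos_pred: Sequence[Vec3], pos_relaxed: Sequence[Vec3]
) -> float:
    """Mean absolute error per coordinate between predicted and relaxed positions.

    Hypothetical helper for illustration only; not part of the OCP codebase.
    """
    if len(pos_pred) != len(pos_relaxed):
        raise ValueError("pos_pred and pos_relaxed must have the same number of atoms")
    if not pos_pred:
        raise ValueError("need at least one atom")

    total = 0.0
    count = 0
    # Accumulate |predicted - relaxed| over every atom and every x/y/z coordinate.
    for p, r in zip(pos_pred, pos_relaxed):
        for a, b in zip(p, r):
            total += abs(a - b)
            count += 1
    return total / count


# Two atoms; only the z coordinate of the first atom is off by 1.0.
pos_pred = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
pos_relaxed = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
loss = relaxed_position_l1_loss(pos_pred, pos_relaxed)
```

Here pos_relaxed plays the role of the training target, which is the only situation in which the attribute would need to be present in your own LMDB entries.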

Thank you for your explanation!


Hi, may I ask another unrelated question under this topic? I noticed that you recently updated the ocp repo, so I downloaded the new version. But when I load the trainer in train_s2ef_example, the following error occurred. May I know how I should solve this? For reference, I’m using a Mac on CPU (though I don’t think that’s the problem?), and this problem never occurred with the previous version of the repo.

Thank you very much.

Hi -

We recently pushed a fix for this. Can you pull the latest changes with git pull and try again?

Hi, I tried with the latest repo, but the error still occurred. Is there anything else that could be causing this problem?

Thank you.

Aha, this refers to the Jupyter notebook example. I have opened a PR to fix this: https://github.com/Open-Catalyst-Project/ocp/pull/265. You can copy this version, https://github.com/Open-Catalyst-Project/ocp/blob/tutorial_fix/docs/source/tutorials/train_s2ef_example.ipynb, until the changes are merged into the main branch.

Thank you very much!