How to construct an ASE Atoms object from an LMDB dataset?

Hello,
I’d like to build ASE Atoms objects from the provided LMDB database files.
I think someone mentioned in a previous post that the data is also available in extxyz format, but I can’t find it.
Could you share the link where I can download the extxyz files? Otherwise, should I build the ASE objects manually from the provided data?
I think the latter is a bit risky, since I might make a mistake and misinterpret your data.
Thank you in advance for your help!

Hi - If you’re looking for ASE-readable data, you can download .traj files here. If you want to convert LMDB data objects into ASE objects, we have code for that here.
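
For readers who want to see roughly what such a conversion involves, here is a minimal sketch. It is not the project's own converter. It assumes each LMDB entry is a data object exposing `atomic_numbers`, `pos`, `cell`, and optionally `tags`; those attribute names, the helper names `atoms_kwargs_from_data` and `to_ase_atoms`, and the choice of full periodic boundary conditions are all assumptions to check against the dataset documentation. The example entry is made up for illustration.

```python
from types import SimpleNamespace


def _to_list(values):
    # Tensors and NumPy arrays expose .tolist(); plain sequences are copied.
    return values.tolist() if hasattr(values, "tolist") else list(values)


def atoms_kwargs_from_data(data):
    """Collect keyword arguments for ase.Atoms from one data object.

    Assumed (hypothetical) attributes: atomic_numbers, pos, cell, tags.
    """
    numbers = [int(z) for z in _to_list(data.atomic_numbers)]
    positions = _to_list(data.pos)
    cell = _to_list(data.cell)
    if len(cell) == 1:
        # Assumption: the cell may carry a leading batch axis, shape (1, 3, 3).
        cell = cell[0]
    if len(numbers) != len(positions):
        raise ValueError("atomic_numbers and pos describe different atom counts")

    kwargs = {
        "numbers": numbers,
        "positions": positions,
        "cell": cell,
        "pbc": True,  # assumption: periodic in all three directions
    }
    tags = getattr(data, "tags", None)
    if tags is not None:
        kwargs["tags"] = [int(t) for t in _to_list(tags)]
    return kwargs


def to_ase_atoms(data):
    # Imported lazily so the helper above works without ASE installed.
    from ase import Atoms
    return Atoms(**atoms_kwargs_from_data(data))


# Stand-in for a single LMDB entry: CO in a 10 Å cubic cell (made-up values).
example = SimpleNamespace(
    atomic_numbers=[6.0, 8.0],
    pos=[[0.0, 0.0, 0.0], [0.0, 0.0, 1.13]],
    cell=[[[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]],
    tags=[2, 2],
)
kwargs = atoms_kwargs_from_data(example)
```

With ASE installed, `to_ase_atoms(example)` should give an `Atoms` object that can be written out with `ase.io.write`. Energy and force labels are not handled in this sketch; attaching them would need a calculator object and depends on how the labels are stored in your LMDB entries.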

Hope that helps!

Dear Mshuaibi,

Thank you for your efforts in organizing this competition. I still have some questions about how to use the data you publish.

  1. As you mentioned the Relaxation Trajectory file here in the OC20 dataset, I am wondering if it could be used in the competition.

  2. BTW, I also want to know if the OC20 Mapping file here can be used.

  3. Finally, I want to confirm whether the validation data and test data of S2EF and IS2RE can be used for training.

Thank you in advance for your help!

  1. As you mentioned the Relaxation Trajectory file here in the OC20 dataset, I am wondering if it could be used in the competition.

These are the entire trajectories for all systems in OC20. Training on them is not allowed for competition entries, but it is perfectly fine for results published outside of the competition.

For the competition, we’re only allowing training on the IS2RE and S2EF-2M datasets. S2EF-2M is a 2M subset of the complete trajectory data you asked about.

  2. BTW, I also want to know if the OC20 Mapping file here can be used.

Yeah, this is fine to use to design features etc. for your model for the competition.

  3. Finally, I want to confirm whether the validation data and test data of S2EF and IS2RE can be used for training.

EDIT: For the competition, it’s fine to use IS2RE val for training, but not IS2RE test or S2EF val/test. S2EF val can be used for hyperparameter tuning. We understand that the line between training on val and hyperparameter tuning on val is blurry and hard to enforce, but the spirit here is to ensure a fair comparison between entries. So the recommended workflow is to train on the training split and tune hyperparameters on val.

All of these are fine to use for results published outside of the competition. Note that the test dataset doesn’t come with energy / force labels to use as supervision for model training, but unsupervised approaches on this data are still possible.

Dear abhshkdz,

Thank you for your reply. I am still a little bit confused. When I run relaxation mode, I need to train my model on S2EF 2M, so can I use the S2EF val data as the validation set to tune the hyperparameters? Or do I have to split S2EF 2M into training and validation sets to run relaxation mode, so that the S2EF val data does not participate in training?

Thanks a lot!

Sorry for the confusion — yes, it’s fine to use val for hyperparameter tuning. Also edited my response above. Let me know if you have any follow-ups.