Missing catalyst slab trajectories

Hello! I have a question about the catalyst-only trajectories available here: https://dl.fbaipublicfiles.com/opencatalystproject/data/slab_trajectories.tar

This file contains 294,249 trajectory files for bare slabs. The corresponding mapping file, https://dl.fbaipublicfiles.com/opencatalystproject/data/mapping_adslab_slab.pkl, contains 368,581 unique bare slab SIDs, about 25% more than the number of trajectory files. Are these structures intentionally missing, or were they unstable? I am wondering whether I need to filter out the ads+cat systems corresponding to the missing bare slabs.

Thanks for any help.

Hi!

Sorry for the confusion on this. This was indeed intentional. mapping_adslab_slab.pkl includes all the slabs used in the train and validation set. However, for the in-distribution val splits, some of these slabs were also used in the test set. We decided to exclude any slabs that were also being used in the test set. We can plan on updating that pkl file to avoid future confusion. Thanks for catching this, I hope this clarifies it.
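
Until the pkl file is updated, the filtering asked about above could be sketched roughly as follows. This is only an illustration under assumptions not confirmed in this thread: that mapping_adslab_slab.pkl unpickles to a dict from adslab SID to slab SID, and that each extracted trajectory file is named `<slab SID><suffix>` (the `.traj` suffix is a guess, so adjust it to whatever the archive actually contains). The function names are hypothetical helpers, not part of any library.

```python
import pickle
from pathlib import Path


def load_mapping(pkl_path):
    """Load the adslab -> slab mapping (assumed to be a plain dict)."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)


def available_slab_sids(traj_dir, suffix=".traj"):
    """Collect slab SIDs from file names, assuming '<sid><suffix>' naming."""
    return {
        p.name[: -len(suffix)]
        for p in Path(traj_dir).iterdir()
        if p.name.endswith(suffix)
    }


def split_by_available_slabs(adslab_to_slab, slab_sids):
    """Split the mapping into entries with and without a slab trajectory."""
    kept = {a: s for a, s in adslab_to_slab.items() if s in slab_sids}
    dropped = {a: s for a, s in adslab_to_slab.items() if s not in slab_sids}
    return kept, dropped


# Toy data standing in for the real mapping and the extracted tar contents.
toy_mapping = {"random1": "random10", "random2": "random11", "random3": "random10"}
toy_slabs = {"random10"}
kept, dropped = split_by_available_slabs(toy_mapping, toy_slabs)
print(len(kept), "adslabs kept;", len(dropped), "dropped")
```

With the real files, one would call `load_mapping` on the downloaded pkl and `available_slab_sids` on the directory where the tar was extracted, then pass both to `split_by_available_slabs`. The `dropped` entries would be the ads+cat systems whose bare slab was withheld.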

Got it, that makes sense. Thanks for clarifying!