NeurIPS ‘21 Challenge Updates

mshuaibi · September 1, 2021, 5:22pm

Hi All -

With less than a month until the release of the test-challenge dataset split, we look forward to seeing what the community has to offer. We’d like to post a few updates addressing some of the concerns people have raised in the past; see the thread IS2RE Leaderboard Concerns.

We recognize that the resource availability of participants may vary drastically across research labs and industries. To encourage more participation, we will be recognizing 2 winners for the NeurIPS ‘21 Challenge based on: (1) The best overall performance and (2) The best performance using ONLY the IS2RE dataset (size 460,328): ocp/DATASET.md at master · Open-Catalyst-Project/ocp · GitHub. You will be prompted at submission time to specify whether you only used the IS2RE dataset or not.
Participants submitting to track (2) are prohibited from using any other datasets and/or pretrained S2EF models. Data augmentation is permitted as long as it comes ONLY from the IS2RE dataset. Pretraining in any form that uses S2EF data will not be allowed for this track. Participants submitting to track (1) are free to use any dataset. Using DFT is prohibited for both tracks.
Participants can submit to both tracks as long as the submissions follow the regulations mentioned above. We will be inviting the winners of each track for an oral presentation at NeurIPS. If a single team wins both tracks, we will additionally invite the second place team of track 2 to present.
An updated leaderboard will be released with the following additional column (among other minor additions): Dataset (mandatory): IS2RE-only vs Any

Please let us know if you have any questions or concerns. If you are unsure which track your approach falls under, we recommend you reach out to us sooner rather than later to avoid any future confusion.

Good Luck!

-The OCP Team

Jingtun · September 2, 2021, 1:48pm

Question: Does the IS2RE-only leaderboard (track (2)) allow using the Relaxation Trajectories dataset as augmentation?

mshuaibi · September 2, 2021, 2:05pm

No - Relaxation trajectories are what comprise the S2EF dataset, so using them for augmentation would not be permitted for track 2. Augmentation can only come from direct transformations of the IS2RE dataset (rotations, interpolations, etc.); no external sources are permitted here.
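
As an illustration only (not OCP code), a rotation-based augmentation that draws solely on an existing IS2RE structure could look like the sketch below. The function names `rotate_about_z` and `augment`, and the choice of the z-axis as the rotation axis, are assumptions made for this example. The cell vectors are rotated together with the atomic positions so that the periodic structure stays consistent, and the energy label is left unchanged because a rigid rotation does not alter it.

```python
import math
import random


def rotate_about_z(vectors, angle):
    """Rotate a list of (x, y, z) vectors about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in vectors]


def augment(positions, cell, rng):
    """Return a randomly rotated copy of one structure (hypothetical helper).

    `positions` holds atomic coordinates and `cell` the three lattice vectors.
    Both are rotated by the same random angle.
    """
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return rotate_about_z(positions, angle), rotate_about_z(cell, angle)
```

Calling `augment(positions, cell, random.Random(0))` yields a new copy of the same IS2RE sample; no S2EF or other external data is involved.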

ccc · September 2, 2021, 2:10pm

Hi OCP official teams,

A) Is cross-validation on the whole dataset (train + validation set) permitted for the final competition solution?
B) Can we ensemble multiple models by using different seeds?

Thanks for your reply!

mshuaibi · September 2, 2021, 5:29pm

Yes - cross-validation and ensembling are permitted.
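
For readers unfamiliar with seed ensembling, one simple form is to average the per-system energy predictions of several models trained with different random seeds. The sketch below is a generic illustration, not the OCP team's procedure; `ensemble_mean` is a hypothetical helper, and the numbers in the usage note are made up.

```python
def ensemble_mean(predictions_per_model):
    """Average predictions element-wise across models.

    `predictions_per_model` is a list with one entry per model; each entry is
    a list of predicted energies in the same system order.
    """
    n_models = len(predictions_per_model)
    return [sum(values) / n_models for values in zip(*predictions_per_model)]
```

For example, `ensemble_mean([[1.0, 2.0], [3.0, 4.0]])` combines two models' predictions for two systems into `[2.0, 3.0]`.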

borna · September 23, 2021, 1:51pm

Hi,

What do you mean by cross-validation on this task?

Thanks.

mshuaibi · September 23, 2021, 8:49pm

By cross-validation I refer to the general idea of using both train+validation data during training in ways other than a classic train-val split (e.g., k-fold cross-validation). You can find several examples online of exactly how this is done. We haven’t used these methods in any of the OC20 results presented. I hope this answers your question.
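
A minimal sketch of k-fold index splitting, assuming the train and validation samples have already been pooled into one indexed collection. This is a generic illustration rather than anything used in the OC20 results, and `k_fold_indices` is a hypothetical helper name.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, held_out_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out
```

Each sample is held out exactly once across the k folds, so one model can be trained per fold and every sample still contributes to training in the other k - 1 folds.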

borna · September 23, 2021, 9:04pm

So on the IS2RE dataset we can use train + all validation splits (560K) for training?

Thanks

mshuaibi · September 23, 2021, 9:37pm

Yes, you may use both the training and validation set and still be included in the IS2RE-only dataset track.

Jingtun · September 27, 2021, 6:12pm

Question: Does the IS2RE-only competition leaderboard (track (2)) allow using the test dataset (from the full IS2RE release, not the competition test dataset) as augmentation for training (e.g., for self-supervised learning)?

mshuaibi · September 27, 2021, 8:29pm

Test data of any kind is prohibited for training use in both tracks of the NeurIPS '21 Challenge. Such approaches are permitted in the main OC20 leaderboard, however.
