Thanks for the questions!
Just to reiterate our motivation behind organizing the challenge: we want to encourage methods that can accelerate IS2RE pipelines and make them considerably faster than DFT. Using DFT (at a level of theory equivalent to the OC20 data) at test time would be about as slow as the calculations we are trying to replace, and is hence not allowed.
Having said that, there might be other cheaper calculations (e.g. force fields, reactive force fields, some approximate tight binding methods) that are much faster. That’s fine to do, especially if these calculations take < 1 second per IS2RE prediction (simulating the entire relaxation may take slightly longer, < 10 seconds). We obviously don’t have a way to strictly enforce / check inference times since we just ask for predictions, but would appreciate it if you stick to the spirit and keep the ~1 second ballpark number in mind. Note that most tight binding methods are significantly more expensive than this.
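If it helps to check yourself against the ~1 second ballpark, here is a minimal sketch of how per-prediction wall-clock time could be measured. This is not part of any official tooling; `dummy_predict` is a hypothetical stand-in for your own model or cheap calculation, and `BUDGET_SECONDS` just encodes the ballpark number mentioned above.

```python
import time
from statistics import mean

# Ballpark per-prediction budget from the post above (an informal guideline).
BUDGET_SECONDS = 1.0


def mean_seconds_per_prediction(predict, systems):
    """Return the mean wall-clock time of predict(system) over all systems."""
    durations = []
    for system in systems:
        start = time.perf_counter()
        predict(system)
        durations.append(time.perf_counter() - start)
    return mean(durations)


def dummy_predict(system):
    # Hypothetical placeholder: a real model would return a relaxed energy.
    return 0.0


if __name__ == "__main__":
    avg = mean_seconds_per_prediction(dummy_predict, range(100))
    print(f"mean time per prediction: {avg:.6f} s")
    print("within budget" if avg < BUDGET_SECONDS else "over budget")
```

Averaging over a representative sample of test systems (rather than timing a single one) gives a fairer number, since cost usually varies with system size.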
Other auxiliary features (e.g. Bader charges, other element properties as in the CGCNN paper, etc.) are also fine to use to train models, but it is worth keeping in mind that some of these features (e.g. Bader charges) might not be available at test time. We recently released Bader charge data for the OC20 training / validation sets; see DATASET.md on the main branch of the Open-Catalyst-Project/ocp repository on GitHub.