We chose the cutoff radius to capture the relevant interactions while keeping computation tractable. Since the underlying DFT did not consider long-range interactions, we felt comfortable with 6 Å, similar to other works: https://aip.scitation.org/doi/full/10.1063/1.4966192 https://arxiv.org/pdf/1710.10324.pdf.
The nearest-neighbor limit was added to help with model efficiency. We ran the following experiment on a literature dataset before settling on a value we felt comfortable with:
num_neighbors    energy_mae (eV)
12               0.5655
30               0.4931
50               0.4876
100              0.4843
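To make the diminishing returns in the numbers above easier to see, here is a quick sketch that computes the relative drop in energy MAE between consecutive neighbor limits. The function name relative_gains is made up for illustration; only the two lists come from the experiment above.

```python
# Values copied from the experiment above.
num_neighbors = [12, 30, 50, 100]
energy_mae_ev = [0.5655, 0.4931, 0.4876, 0.4843]


def relative_gains(maes):
    """Fractional MAE reduction between each consecutive pair of settings."""
    return [(prev - cur) / prev for prev, cur in zip(maes, maes[1:])]


gains = relative_gains(energy_mae_ev)
steps = zip(num_neighbors, num_neighbors[1:])
for (lower, higher), gain in zip(steps, gains):
    print(f"{lower} -> {higher} neighbors: MAE drops by {gain:.1%}")
```

Going from 12 to 30 neighbors cuts the MAE by roughly 13%, while each later step gains only about 1% or less, which is the kind of trade-off to weigh against the extra cost of a denser graph.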
Users can still modify these parameters if they choose to: ocp/preprocess_ef.py at e6fdfb0d0194d50b4c500b1d1eea10f040821de3 · Open-Catalyst-Project/ocp · GitHub.
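For anyone unsure how the two parameters interact, here is a minimal sketch of a radius graph with a per-atom neighbor cap. This is not the code in preprocess_ef.py: build_edges and its arguments are hypothetical names, the default of 50 neighbors is arbitrary, and the sketch uses brute-force distances with no periodic boundary conditions, which a real catalyst system would need.

```python
import numpy as np


def build_edges(positions, cutoff=6.0, max_neighbors=50):
    """Toy radius graph: edges from each atom's nearest neighbors within cutoff.

    Hypothetical helper for illustration only (no periodic images).
    Returns (sources, targets) index arrays; each edge points source -> target.
    """
    positions = np.asarray(positions, dtype=float)
    # Brute-force pairwise distances, shape (n_atoms, n_atoms).
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # an atom is not its own neighbor

    sources, targets = [], []
    for i in range(len(positions)):
        # Step 1: the cutoff radius selects candidate neighbors.
        within = np.flatnonzero(dist[i] <= cutoff)
        # Step 2: the neighbor cap keeps only the closest candidates.
        nearest = within[np.argsort(dist[i, within])[:max_neighbors]]
        sources.extend(nearest.tolist())
        targets.extend([i] * len(nearest))
    return np.array(sources, dtype=int), np.array(targets, dtype=int)


# Four atoms on a line; the last one is beyond 6 Å from all the others.
atoms = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.5, 0.0, 0.0], [10.0, 0.0, 0.0]]
src, tgt = build_edges(atoms, cutoff=6.0, max_neighbors=1)
```

The cutoff decides which atoms can interact at all, and the cap only matters in dense regions where more than max_neighbors atoms fall inside that radius.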
More recently, we ran the following experiment exploring the trade-off between max neighbors, cutoff radius, and performance for an identical, small DimeNet++ model. Disclaimer: tuning the model hyperparameters for each of these combinations would likely change the numbers, for better or worse, but hopefully this gives you an idea.
Hope this helps!