Divergent Training Behavior with EquiformerV2 on ANI-2x Dataset with PBC

Hello community,

I’m using the EquiformerV2 model to train a neural network on LMDB splits I created from the ANI-2x dataset. Since the ANI-2x dataset only contains non-periodic structures, I implemented a workaround to mimic a periodic system by placing each molecule within a “dummy” 50 ų box.

I did this because training without PBC previously led to errors, but recent updates seem to have resolved these [*]. Now, however, I’m observing divergent behavior between two setups, one with and one without PBC.

Training Configurations:

  • With PBC: use_pbc=True, enforce_max_neighbors_strictly=True
  • Without PBC: use_pbc=False, enforce_max_neighbors_strictly=False

All other parameters are held constant between the two setups.


Observations:

  • Forces Magnitude Error: The error in force predictions drops more steadily with PBC, while it plateaus at a higher level without PBC.
  • Cosine Similarity: The PBC setup improves to around 0.8 in cosine similarity, whereas the non-PBC setup remains around 0.2, with little further improvement.
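
For readers unfamiliar with the two metrics, here is a rough sketch of how they can be computed per atom. It assumes the magnitude error is the mean absolute difference of force-vector lengths and the cosine similarity is the mean per-atom cosine between predicted and reference forces; the function names are illustrative and the evaluator used in training may differ in details:

```python
import math


def _norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def forces_cosine_similarity(pred, target, eps=1e-8):
    """Mean per-atom cosine of the angle between predicted and reference forces."""
    total = 0.0
    for p, t in zip(pred, target):
        dot = sum(a * b for a, b in zip(p, t))
        total += dot / max(_norm(p) * _norm(t), eps)  # eps guards zero-length forces
    return total / len(pred)


def forces_magnitude_error(pred, target):
    """Mean per-atom absolute difference between force-vector lengths."""
    return sum(abs(_norm(p) - _norm(t)) for p, t in zip(pred, target)) / len(pred)


# Toy data: predicted forces have the right length but point the wrong way.
target = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
pred = [(0.0, 1.0, 0.0), (0.0, 0.0, 2.0)]

print(forces_cosine_similarity(pred, target))
print(forces_magnitude_error(pred, target))
```

The toy data shows that the two metrics are independent: lengths can match exactly while directions are orthogonal, so a cosine similarity stuck near 0.2 points to a directional problem rather than a scale problem.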

Hypothesis:
One possible explanation is that, because I created the LMDB files using the a2g methods with this dummy periodic system, the dataset has been effectively “baked” with periodicity in mind. This might mean that it’s no longer possible to treat it as a truly aperiodic system, since the initial graph construction assumed periodicity.
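
One way to test this hypothesis is to compare the neighbor list a molecule gets with and without periodic images. The sketch below is a brute-force illustration only, not the graph construction used by AtomsToGraphs or the model; the 12 Å cutoff, the water-like coordinates, and the function name are assumptions. It ignores any cap on the number of neighbors, which also differs between the two setups via enforce_max_neighbors_strictly:

```python
import itertools
import math


def neighbor_pairs(positions, cutoff, box_length=None):
    """Directed edges (i, j, image_shift) with distance <= cutoff.

    box_length=None means an open (non-periodic) system. Otherwise a cubic
    box is assumed and only the 27 nearest images are searched, which is
    sufficient while cutoff <= box_length.
    """
    if box_length is None:
        shifts = [(0, 0, 0)]
        length = 0.0
    else:
        shifts = list(itertools.product((-1, 0, 1), repeat=3))
        length = box_length
    edges = set()
    for i, ri in enumerate(positions):
        for j, rj in enumerate(positions):
            for s in shifts:
                if i == j and s == (0, 0, 0):
                    continue  # skip the atom itself, but keep its periodic images
                shifted = tuple(rj[k] + s[k] * length for k in range(3))
                if math.dist(ri, shifted) <= cutoff:
                    edges.add((i, j, s))
    return edges


# Water-like geometry (approximate, in Å) placed at the centre of the box.
water = [
    (25.0, 25.0, 25.119),
    (25.0, 25.763, 24.523),
    (25.0, 24.237, 24.523),
]

open_edges = neighbor_pairs(water, cutoff=12.0)
big_box_edges = neighbor_pairs(water, cutoff=12.0, box_length=50.0)
small_box_edges = neighbor_pairs(water, cutoff=12.0, box_length=10.0)

print(open_edges == big_box_edges)
print(len(open_edges), len(small_box_edges))
```

If the box is larger than the molecule's extent plus the cutoff, no edge crosses a cell boundary and the periodic and open neighbor lists coincide, as in the 50 Å case here. Under that condition the dummy box alone would not change the graph, which would point toward the neighbor cap or the stored cell information as the more likely source of the difference.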

Another factor could be that PBC artificially enlarges the perceived system size: periodic copies of the original box are appended on every side, yet the molecule in each copy stays isolated from the others, which could make errors appear smaller relative to the total system size.

Thank you for any suggestions or ideas on interpreting these results.

[*] Edit: The errors stemmed from the AtomsToGraphs class, not from the EquiformerV2 code itself. It is not possible to convert non-periodic ASE Atoms objects to graphs using that class.

Happy to help. Could you move this discussion to our GitHub issues page? Thanks!

On it right now. I also want to address why the AtomsToGraphs class, and therefore LMDB creation, fails for non-periodic systems.