Pretrained OCP models

Hello, OCP team!
I saw that the latest EquiformerV2 model (August) achieves the best performance on the OC22 leaderboard, but I couldn't find its checkpoint file on GitHub. Can you provide it?
I also have another question: in Open-Catalyst-Project/ocp/tree/main/configs/oc22/s2ef I couldn't find the PaiNN and SpinConv model configs. Could you provide those as well? Thank you!

Hey, thanks for your interest in the OC22 models! Re the EquiformerV2 checkpoints – we have a little bit of cleaning up to do before we release them. Should be able to release them in a couple of weeks.

An EquiformerV2 model trained on OC22 is now available here: https://github.com/Open-Catalyst-Project/ocp/blob/main/MODELS.md#open-catalyst-2022-oc22

Thanks!

But when I used the EquiformerV2 checkpoint, I got an error in torch.nn.modules.module's load_state_dict function:

if len(error_msgs) > 0: raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format( self.__class__.__name__, "\n\t".join(error_msgs)))

The model I used is in ocpmodels.models.equiformer_v2.
The debugger reported: NameError("name 'module' is not defined").
So is something wrong with the model?

Could you share a code snippet please so we can reproduce this error at our end?

The following should work:

python main.py \
    --config configs/oc22/s2ef/equiformer_v2/equiformer_v2_N@18_L@6_M@2_e4_f100_121M.yml \
    --checkpoint path/to/eq2_121M_e4_f100_oc22_s2ef.pt \
    --mode validate --debug

Sure, here's the code:

from ocpmodels.models.equiformer_v2.equiformer_v2_oc20 import EquiformerV2_OC20

model_config = {
    'use_pbc': True,
    'regress_forces': True,
    'otf_graph': True,

    'enforce_max_neighbors_strictly': False,

    'max_neighbors': 20,
    'max_radius': 12.0,
    'max_num_elements': 90,

    'num_layers': 18,
    'sphere_channels': 128,
    'attn_hidden_channels': 64,
    # [64, 96] This determines the hidden size of message passing. Do not necessarily use 96.
    'num_heads': 8,
    'attn_alpha_channels': 64,  # Not used when `use_s2_act_attn` is True.
    'attn_value_channels': 16,
    'ffn_hidden_channels': 128,
    'norm_type': 'layer_norm_sh',  # ['rms_norm_sh', 'layer_norm', 'layer_norm_sh']

    'lmax_list': [6],
    'mmax_list': [2],
    'grid_resolution': 18,  # [18, 16, 14, None] For `None`, simply comment this line.

    'num_sphere_samples': 128,

    'edge_channels': 128,
    'use_atom_edge_embedding': True,
    'share_atom_edge_embedding': False,
    # If `True`, `use_atom_edge_embedding` must be `True` and the atom edge embedding will be shared across all blocks.
    'use_m_share_rad': False,
    'distance_function': 'gaussian',
    'num_distance_basis': 512,  # not used

    'attn_activation': 'silu',
    'use_s2_act_attn': False,
    # [False, True] Switch between attention after S2 activation or the original EquiformerV1 attention.
    'use_attn_renorm': True,  # Attention re-normalization. Used for ablation study.
    'ffn_activation': 'silu',  # ['silu', 'swiglu']
    'use_gate_act': False,  # [True, False] Switch between gate activation and S2 activation
    'use_grid_mlp': True,  # [False, True] If `True`, use projecting to grids and performing MLPs for FFNs.
    'use_sep_s2_act': True,  # Separable S2 activation. Used for ablation study.

    'alpha_drop': 0.1,  # [0.0, 0.1]
    'drop_path_rate': 0.1,  # [0.0, 0.05]
    'proj_drop': 0.0,

    'weight_init': 'uniform',  # ['uniform', 'normal']

    'load_energy_lin_ref': True,
    # Set to `True` for the test set or when loading a checkpoint that has `energy_lin_ref` parameters, `False` for training and val.
    'use_energy_lin_ref': True,  # Set to `True` for the test set, `False` for training and val.
}

model = EquiformerV2_OC20(model_config)

checkpoints_path = "D:\data\eq2_121M_e4_f100_oc22_s2ef.pt"

model.load_checkpoint(checkpoints_path)

for name, para in model.named_parameters():
    print(name)
    print(para)

I loaded the EquiformerV2 OC22 checkpoint file into the model myself, but I keep getting errors when I try to print the model parameters. I don't know why; isn't equiformer_v2_oc20 compatible with this checkpoint?

In addition, I have another question. The OC22 IS2RE config directory does not contain config files for several of the best models on the OC22 leaderboard. Can the S2EF and IS2RE configs load the same model parameters? If I only want to run OC22 IS2RE with EquiformerV2, do I only need to change the task?

Your code is pretty close! Try this:

import torch

from ocpmodels.models.equiformer_v2.equiformer_v2_oc20 import EquiformerV2_OC20


if __name__ == "__main__":
    # initialize the model
    model_config = {
        "use_pbc": True,
        "regress_forces": True,
        "otf_graph": True,
        "enforce_max_neighbors_strictly": False,
        "max_neighbors": 20,
        "max_radius": 12.0,
        "max_num_elements": 90,
        "num_layers": 18,
        "sphere_channels": 128,
        "attn_hidden_channels": 64,
        # [64, 96] This determines the hidden size of message passing. Do not necessarily use 96.
        "num_heads": 8,
        "attn_alpha_channels": 64,  # Not used when `use_s2_act_attn` is True.
        "attn_value_channels": 16,
        "ffn_hidden_channels": 128,
        "norm_type": "layer_norm_sh",  # ['rms_norm_sh', 'layer_norm', 'layer_norm_sh']
        "lmax_list": [6],
        "mmax_list": [2],
        "grid_resolution": 18,  # [18, 16, 14, None] For `None`, simply comment this line.
        "num_sphere_samples": 128,
        "edge_channels": 128,
        "use_atom_edge_embedding": True,
        "share_atom_edge_embedding": False,
        # If `True`, `use_atom_edge_embedding` must be `True` and the atom edge embedding will be shared across all blocks.
        "use_m_share_rad": False,
        "distance_function": "gaussian",
        "num_distance_basis": 512,  # not used
        "attn_activation": "silu",
        "use_s2_act_attn": False,
        # [False, True] Switch between attention after S2 activation or the original EquiformerV1 attention.
        "use_attn_renorm": True,  # Attention re-normalization. Used for ablation study.
        "ffn_activation": "silu",  # ['silu', 'swiglu']
        "use_gate_act": False,  # [True, False] Switch between gate activation and S2 activation
        "use_grid_mlp": True,  # [False, True] If `True`, use projecting to grids and performing MLPs for FFNs.
        "use_sep_s2_act": True,  # Separable S2 activation. Used for ablation study.
        "alpha_drop": 0.1,  # [0.0, 0.1]
        "drop_path_rate": 0.1,  # [0.0, 0.05]
        "proj_drop": 0.0,
        "weight_init": "uniform",  # ['uniform', 'normal']
        "load_energy_lin_ref": True,
        # Set to `True` for the test set or when loading a checkpoint that has `energy_lin_ref` parameters, `False` for training and val.
        "use_energy_lin_ref": True,  # Set to `True` for the test set, `False` for training and val.
    }
    model = EquiformerV2_OC20(
        num_atoms=-1, bond_feat_dim=-1, num_targets=-1, **model_config
    )

    # load pretrained weights
    checkpoint_path = r"D:\data\eq2_121M_e4_f100_oc22_s2ef.pt"

    ckpt = torch.load(checkpoint_path)
    state_dict = ckpt["state_dict"]
    # strip the "module.module." prefix added by the (distributed) data-parallel wrappers
    state_dict = {k[2 * len("module.") :]: v for k, v in state_dict.items()}

    model.load_state_dict(state_dict)
    model.eval()

    # load normalization statistics
    normalizers = ckpt.get("normalizers")

    # generate energy and force prediction on a sample
    from torch_geometric.data import Batch
    from ocpmodels.datasets import LmdbDataset

    dset = LmdbDataset({"src": "data/oc22/s2ef/val_id/"})

    batch = Batch.from_data_list([dset[0]])
    energy, forces = model(batch)

    energy = (
        energy.detach() * normalizers["target"]["std"]
        + normalizers["target"]["mean"]
    )
    forces = (
        forces.detach() * normalizers["grad_target"]["std"]
        + normalizers["grad_target"]["mean"]
    )

    print(batch.sid, batch.fid, energy, batch.y)
    # tensor([10315]) tensor([52]) tensor([-725.3206]) tensor([-725.3793])

Is it possible that S2EF and IS2RE’s config file model load the same parameters?

Almost! The S2EF models will have an additional force output block that the IS2RE models don’t have. So you could keep the rest of the model config the same and set regress_forces=False to initialize a model for predicting only energies.
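
For example, a rough sketch of initializing an energy-only model, reusing the import and the model_config dict from the snippet above (the is2re_config / is2re_model names here are just illustrative):

# Copy the S2EF hyperparameters and switch off force regression
is2re_config = dict(model_config)
is2re_config["regress_forces"] = False

# Same constructor call as before; no force output block should be created
is2re_model = EquiformerV2_OC20(
    num_atoms=-1, bond_feat_dim=-1, num_targets=-1, **is2re_config
)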

If I only want to implement OC22-IS2RE in the model equiformerV2, do I only need to modify the task?

Correct. Switch the trainer to energy (see example) and set regress_forces=False. If you’re initializing the model weights from a pretrained checkpoint, you might have to set strict_load=False, otherwise it’ll error out on trying to load the force output block weights as well.
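
If you're loading the S2EF checkpoint into that energy-only model by hand rather than through the trainer (so the strict_load config flag doesn't apply), the equivalent is PyTorch's strict=False. A sketch, reusing checkpoint_path and the prefix stripping from the snippet above, and the hypothetical is2re_model from the sketch before this one:

import torch

ckpt = torch.load(checkpoint_path, map_location="cpu")
state_dict = {k[2 * len("module.") :]: v for k, v in ckpt["state_dict"].items()}

# strict=False skips the force output block weights in the checkpoint,
# which the energy-only model does not define
missing, unexpected = is2re_model.load_state_dict(state_dict, strict=False)
print("Checkpoint keys not used (expected: force block):", unexpected)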

Thanks! This helped a lot!

I have successfully loaded the model, but now there is one more problem: my dataset consists of OUTCAR files, which I read into the AseReadMultiStructureDataset class. When I pass objects from this class into the model's forward function, I always get an error saying there is no natoms parameter, and I don't see why. If I want to use the OC22LmdbDataset class directly instead, I would first have to convert my OUTCAR data into LMDB files, and I am very confused about how to do that step. Thank you very much if you can answer me!
And my code is:

import torch
import torch.nn as nn

from ocpmodels.models.equiformer_v2.equiformer_v2_oc20 import EquiformerV2_OC20

if __name__ == "__main__":
    # initialize the model
    model_config = {
        "use_pbc": True,
        "regress_forces": True,
        "otf_graph": True,
        "enforce_max_neighbors_strictly": False,
        "max_neighbors": 20,
        "max_radius": 12.0,
        "max_num_elements": 90,
        "num_layers": 18,
        "sphere_channels": 128,
        "attn_hidden_channels": 64,
        # [64, 96] This determines the hidden size of message passing. Do not necessarily use 96.
        "num_heads": 8,
        "attn_alpha_channels": 64,  # Not used when `use_s2_act_attn` is True.
        "attn_value_channels": 16,
        "ffn_hidden_channels": 128,
        "norm_type": "layer_norm_sh",  # ['rms_norm_sh', 'layer_norm', 'layer_norm_sh']
        "lmax_list": [6],
        "mmax_list": [2],
        "grid_resolution": 18,  # [18, 16, 14, None] For `None`, simply comment this line.
        "num_sphere_samples": 128,
        "edge_channels": 128,
        "use_atom_edge_embedding": True,
        "share_atom_edge_embedding": False,
        # If `True`, `use_atom_edge_embedding` must be `True` and the atom edge embedding will be shared across all blocks.
        "use_m_share_rad": False,
        "distance_function": "gaussian",
        "num_distance_basis": 512,  # not used
        "attn_activation": "silu",
        "use_s2_act_attn": False,
        # [False, True] Switch between attention after S2 activation or the original EquiformerV1 attention.
        "use_attn_renorm": True,  # Attention re-normalization. Used for ablation study.
        "ffn_activation": "silu",  # ['silu', 'swiglu']
        "use_gate_act": False,  # [True, False] Switch between gate activation and S2 activation
        "use_grid_mlp": True,  # [False, True] If `True`, use projecting to grids and performing MLPs for FFNs.
        "use_sep_s2_act": True,  # Separable S2 activation. Used for ablation study.
        "alpha_drop": 0.1,  # [0.0, 0.1]
        "drop_path_rate": 0.1,  # [0.0, 0.05]
        "proj_drop": 0.0,
        "weight_init": "uniform",  # ['uniform', 'normal']
        "load_energy_lin_ref": True,
        # Set to `True` for the test set or when loading a checkpoint that has `energy_lin_ref` parameters, `False` for training and val.
        "use_energy_lin_ref": True,  # Set to `True` for the test set, `False` for training and val.
    }
    model = EquiformerV2_OC20(
        num_atoms=-1, bond_feat_dim=-1, num_targets=-1, **model_config
    )

    checkpoint_path = "D:\data\eq2_121M_e4_f100_oc22_s2ef.pt"
    ckpt = torch.load(checkpoint_path)
    state_dict = ckpt["state_dict"]
    state_dict = {k[2 * len("module."):]: v for k, v in state_dict.items()}
    model.load_state_dict(state_dict)

    for k in state_dict:
        print(k, '\t')

    model.to('cuda')

    # load normalization statistics
    normalizers = ckpt.get("normalizers")

    # generate energy and force prediction on a sample
    from torch_geometric.data import Batch
    from ocpmodels.datasets import LmdbDataset
    from ocpmodels.datasets.ase_datasets import AseReadMultiStructureDataset
    import pickle

    dataset = {
        'src': "D:\data\ldh\\try\\train\O\\0",
        'pattern': "**/OUTCAR",
        'a2g_args': {'r_energy': True},
        'ase_read_args': {'format': 'vasp-out', 'index': '0'},
        'keep_in_memory': False
    }

    my_dataset = AseReadMultiStructureDataset(dataset)

    #dset = LmdbDataset({"src": "D:\data\s2ef_test_lmdbs\\test_data\s2ef\\all\\test_id\\"})

    energy, forces = model(my_dataset).to('cuda')

    energy = (
        energy.detach() * normalizers["target"]["std"]
        + normalizers["target"]["mean"]
    )
    forces = (
        forces.detach() * normalizers["grad_target"]["std"]
        + normalizers["grad_target"]["mean"]
    )
    print(energy)

So I would like to ask how I can specifically load this OUTCAR data into a Dataset that can be used for training.

Thanks!

We just merged a fix for the natoms parameter in the ASE datasets. If you're still facing this issue, let me know whether that fix helps!
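
In case it helps once you've pulled that fix: one other thing to change in your script is that the model's forward expects a batched graph (a torch_geometric Batch, as in the LMDB example earlier), not the dataset object itself. A rough sketch with placeholder paths, assuming the model is constructed and moved to the GPU as in your script:

from torch_geometric.data import Batch
from ocpmodels.datasets.ase_datasets import AseReadMultiStructureDataset

# Same dataset config as in your script; src and pattern are placeholders
my_dataset = AseReadMultiStructureDataset(
    {
        "src": "path/to/outcar/folders",
        "pattern": "**/OUTCAR",
        "a2g_args": {"r_energy": True},
        "ase_read_args": {"format": "vasp-out", "index": "0"},
    }
)

# Collate individual Data objects into a Batch and move it to the model's device
batch = Batch.from_data_list([my_dataset[0]]).to("cuda")

energy, forces = model(batch)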