Setting up multiple GPUs in the notebook

Dear OCP team,
Thank you for building such a huge database for the community.

Currently, I am trying to understand the details of OCP and am testing a few models in a Jupyter notebook by following the tutorial, but I ran into a problem with the setup.

How can I set up multiple GPUs to train the model in the notebook?

From the command line, I was able to obtain the results as described in the guide (torch.distributed.launch with --nproc_per_node=4 and --num-gpus 4).
However, I would personally find it easier to work in the notebook, if that is possible.

Thank you in advance.

Sincerely,
Gwan-Yeong

Hi Gwan-Yeong

Thanks for your query!

Setting up multi-GPU training in a Jupyter notebook is pretty tricky. I would recommend looking at this response from the PyTorch team on the PyTorch Forums: "DistributedDataParallel on terminal vs jupyter notebook" (post #4, by pritamdamania87, in the distributed category).

We are unable to extend support for multi-GPU training in Jupyter notebooks in the OCP codebase at this time, since there are already open issues about this upstream in PyTorch.

We would recommend setting up multi-GPU training via the terminal.
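
For reference, here is a minimal sketch of how the terminal launch mentioned above is put together, written in Python so that each argument is explicit. It only builds and prints the command; it does not start training. The script name `main.py` is a placeholder assumption, and `--num-gpus 4` is taken from the question rather than checked against the current OCP options, so please adjust both to your setup.

```python
import shlex
import sys
from typing import List, Sequence


def build_launch_command(
    nproc_per_node: int,
    script: str,
    script_args: Sequence[str] = (),
) -> List[str]:
    """Build the argument list for a torch.distributed.launch run.

    `script` and `script_args` are placeholders supplied by the caller;
    nothing here is specific to the OCP codebase.
    """
    if nproc_per_node < 1:
        raise ValueError("nproc_per_node must be at least 1")
    return [
        sys.executable,                          # the current Python interpreter
        "-m",
        "torch.distributed.launch",              # PyTorch's launcher module
        f"--nproc_per_node={nproc_per_node}",    # one worker process per GPU
        script,
        *script_args,
    ]


if __name__ == "__main__":
    # "main.py" is a hypothetical training script name (assumption).
    cmd = build_launch_command(4, "main.py", ["--num-gpus", "4"])
    # Print the shell-quoted command so it can be pasted into a terminal.
    print(shlex.join(cmd))
```

The printed line can be pasted into a terminal as is. Passing the same list to `subprocess.run` would start the workers as separate processes in the same way, which is a different thing from running DistributedDataParallel inside the notebook kernel itself.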

Best
Siddharth


Dear Siddharth,

Thank you very much for your quick reply. I was not aware of that issue :)
In that case, I will use the terminal.

Thanks again!

BR,
Gwan-Yeong