
report error when I use multiple GPUs #10

Open
yangdongchao opened this issue Jan 22, 2022 · 4 comments

Comments

@yangdongchao

yangdongchao commented Jan 22, 2022

python3 train.py --base vas_codebook.yaml -t True --gpus 0,1,

When I try to run the code with two GPUs, it reports an error:

pytorch_lightning.utilities.exceptions.MisconfigurationException: You requested GPUs: [0, 1] But your machine only has: []

But if I use only GPU 0, no error occurs. So I want to ask: how can I train this code on multiple GPUs?
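For context, the `--gpus 0,1,` value is a comma-separated list of device indices (the trailing comma keeps it a list rather than a count). A rough sketch of how such a string becomes a device list, illustrative only and not PyTorch Lightning's actual implementation:

```python
def parse_gpu_ids(gpus: str) -> list:
    """Turn a '--gpus' string like '0,1,' into a list of device indices.

    Empty entries produced by a trailing comma are dropped, so
    '0,1,' and '0,1' both yield [0, 1]. Illustrative sketch only.
    """
    return [int(x) for x in gpus.split(",") if x.strip()]


print(parse_gpu_ids("0,1,"))  # [0, 1]
print(parse_gpu_ids("0,"))    # [0]
```

The error above means the requested list `[0, 1]` was compared against an empty list of available devices, i.e. PyTorch reported zero visible GPUs.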

@v-iashin
Owner

v-iashin commented Jan 22, 2022

Well, I don't know why it is happening on your side, I'm afraid. Are you using Windows?

If I were you, I would first check whether you can train a model (not SpecVQGAN) in a distributed setting using pure PyTorch.

If you can, I would look into PyTorch Lightning: it seems to be missing one of your GPUs.

Also, could you please share the output of nvidia-smi and torch.cuda.device_count()?
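One common reason for `torch.cuda.device_count()` returning 0 while nvidia-smi shows GPUs is a `CUDA_VISIBLE_DEVICES` mask. A minimal sketch of how masking changes what CUDA exposes (an assumption about the likely cause, not a diagnosis of this specific machine; `visible_gpu_count` is a hypothetical helper):

```python
from typing import Optional


def visible_gpu_count(physical: int, mask: Optional[str]) -> int:
    """Return how many devices CUDA would expose, given `physical`
    installed GPUs and a CUDA_VISIBLE_DEVICES mask string.

    An unset mask (None) exposes everything; an empty string hides
    every device, which is exactly the "your machine only has: []"
    situation. Illustrative sketch only.
    """
    if mask is None:
        return physical
    ids = [x for x in mask.split(",") if x.strip()]
    return len([i for i in ids if 0 <= int(i) < physical])


print(visible_gpu_count(2, None))  # 2
print(visible_gpu_count(2, ""))    # 0
print(visible_gpu_count(2, "0"))   # 1
```

Checking `echo $CUDA_VISIBLE_DEVICES` before launching training is a quick way to rule this out.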

@yangdongchao
Author


Thanks for your reply, I have solved this problem.

@v-iashin
Owner

How did you solve it?

@yangdongchao
Author


In fact, I didn't do anything. When I train the codebook, it still uses only one GPU, but when I train the transformer, it can use multiple GPUs. So I gave up on using multiple GPUs for codebook training.
