RuntimeError: expected scalar type Float but found Half with DCNv2 at backward #2187

Closed
zehuichen123 opened this issue Aug 10, 2022 · 5 comments

zehuichen123 commented Aug 10, 2022

Hi, I am trying to use fp16 in BEVDet (actually BEVDepth implemented on the BEVDet codebase) with mmcv==1.4.0, torch==1.9, torchvision==0.10.0. However, I encountered this problem:

Traceback (most recent call last):
  File "./tools/train.py", line 224, in <module>
    main()
  File "./tools/train.py", line 220, in main
    meta=meta)
  File "/nfs/chenzehui/code/BEVDet/mmdet3d/apis/train.py", line 208, in train_model
    meta=meta)
  File "/nfs/chenzehui/code/BEVDet/mmdet3d/apis/train.py", line 177, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/nfs/chenzehui/others/miniconda3/envs/bevdet/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/nfs/chenzehui/others/miniconda3/envs/bevdet/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 51, in train
    self.call_hook('after_train_iter')
  File "/nfs/chenzehui/others/miniconda3/envs/bevdet/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/nfs/chenzehui/others/miniconda3/envs/bevdet/lib/python3.7/site-packages/mmcv/runner/hooks/optimizer.py", line 224, in after_train_iter
    self.loss_scaler.scale(runner.outputs['loss']).backward()
  File "/nfs/chenzehui/others/miniconda3/envs/bevdet/lib/python3.7/site-packages/torch/_tensor.py", line 255, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/nfs/chenzehui/others/miniconda3/envs/bevdet/lib/python3.7/site-packages/torch/autograd/__init__.py", line 149, in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
  File "/nfs/chenzehui/others/miniconda3/envs/bevdet/lib/python3.7/site-packages/torch/autograd/function.py", line 87, in apply
    return self._forward_cls.backward(self, *args)  # type: ignore[attr-defined]
  File "/nfs/chenzehui/others/miniconda3/envs/bevdet/lib/python3.7/site-packages/torch/autograd/function.py", line 204, in wrapper
    outputs = fn(ctx, *args)
  File "/nfs/chenzehui/others/miniconda3/envs/bevdet/lib/python3.7/site-packages/mmcv/ops/modulated_deform_conv.py", line 129, in backward
    with_bias=ctx.with_bias)
RuntimeError: expected scalar type Float but found Half

There are some related issues about DCN in fp16 (e.g. #1004), but I noticed that they all occur in the forward pass, not in the backward function.

@zehuichen123
Author

I rechecked the code and found that when I change DCNv2 to DCN (https://github.com/HuangJunJie2017/BEVDet/blob/8bd2c041b249b5fc52cbdd1cfe45834cb98f7e00/mmdet3d/models/necks/view_transformer.py#L307), the error disappears. So does this bug only affect DCNv2?

@grimoire
Member

I guess it is because the bias has the wrong type.
We have a fix in 9b49fcc. You can use the latest MMCV, or apply that change to your local code if you want to stay on 1.4.0.
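
The type mismatch suggested above can be illustrated without MMCV. The following is a minimal sketch, assuming that under mixed precision the activations and offset arrive as fp16 while the layer's weight and bias remain fp32. `align_to_offset_dtype` is a hypothetical helper written for this illustration; it is not the content of commit 9b49fcc and not an MMCV API. It casts every tensor to the offset's dtype before a custom op would be called:

```python
import torch


def align_to_offset_dtype(x, offset, weight, bias=None):
    """Cast input, weight and bias to the dtype of `offset`.

    Hypothetical helper (not part of MMCV): it only shows the idea of
    making all tensors share one dtype before calling a custom op.
    """
    x = x.type_as(offset)
    weight = weight.type_as(x)
    if bias is not None:
        bias = bias.type_as(x)
    return x, weight, bias


# Assumed situation: fp16 activations, fp32 parameters.
x = torch.randn(1, 4, 8, 8).half()
offset = torch.randn(1, 18, 8, 8).half()
weight = torch.randn(4, 4, 3, 3)  # fp32 parameter
bias = torch.zeros(4)             # fp32 parameter

x, weight, bias = align_to_offset_dtype(x, offset, weight, bias)

# Collect the distinct dtypes; after alignment there should be only one.
dtypes = {t.dtype for t in (x, offset, weight, bias)}
print(dtypes)
```

If the bias were left out of the cast, it would stay fp32 while the other tensors are fp16, which is the kind of mismatch that "expected scalar type Float but found Half" points to.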

quhaoooo commented Sep 9, 2022

> I guess it is because the bias has wrong type. We have a fix in 9b49fcc. You can use the latest MMCV or update the code if you want to use 1.4.0

I got the same problem when I use AMP, and updating the code did not solve it:
[image]

@zehuichen123
Author

@quhaoooo You can simply set bias=False to avoid this problem. It seems that bias is not so important.
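
One case where dropping the bias is known to be harmless is a convolution followed directly by batch normalization, because the normalization subtracts the per-channel mean and so cancels any constant offset. Whether the DCNv2 layer in question is followed by such a layer is an assumption here, and a plain `nn.Conv2d` stands in for DCNv2 so the sketch runs on CPU without MMCV:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two convolutions with identical weights; only one has a bias.
conv_with_bias = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=True)
conv_no_bias = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
with torch.no_grad():
    conv_no_bias.weight.copy_(conv_with_bias.weight)
    conv_with_bias.bias.uniform_(-1.0, 1.0)  # make the bias clearly non-zero

# In training mode BatchNorm normalizes with the statistics of the batch.
bn = nn.BatchNorm2d(8)
bn.train()

x = torch.randn(4, 3, 16, 16)
with torch.no_grad():
    out_with_bias = bn(conv_with_bias(x))
    out_no_bias = bn(conv_no_bias(x))

# The per-channel bias is removed by the mean subtraction, so the two
# outputs agree up to floating-point rounding.
max_diff = (out_with_bias - out_no_bias).abs().max().item()
print(f"max abs difference: {max_diff:.2e}")
```

If no normalization layer follows the convolution, the bias does change the output, so this argument would not apply.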

@quhaoooo

> @quhaoooo You can simply set bias=False to avoid this problem. It seems that bias is not so important.

I just set bias=False, but got this problem:
[image]
