
[Fix] Support amp (pytorch >= 1.6.0) on DCN and DCNv2/ Add unit tests on DCN/DCNv2 amp #1029

Merged: 6 commits into open-mmlab:master on May 23, 2021

Conversation

@AronLin (Contributor) commented May 17, 2021

Motivation

mmcv 1.3.2 supports automatic mixed precision (amp) with pytorch >= 1.6.0, but the CUDA ops do not support it yet, resulting in runtime errors such as #1004 and #1028.

#1014 fixes the bug reported in #1004. However, when the input data type for DCN is float32, the bug still exists. The same bug also exists in DCNv2.

Here is what I'd like to do:

  • Fix the bugs for both float16 and float32 inputs when using amp.
  • Add amp unit tests for DCN/DCNv2 ops.

Modification

  • Cast the weight and input of DCN/DCNv2 manually according to the dtype of offset (see the sketch after this list).
    The offset in deform_conv is computed by a regular Conv2d, which supports amp, so the dtype of offset indicates whether fp16 or amp is in use.
  • Add amp unit tests for the DCN/DCNv2 ops (see the test sketch below).
    With amp (pytorch >= 1.6.0), the model dtype is not set manually, so the input dtype may be either float32 or float16; the unit tests should cover both cases.
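
Below is a minimal sketch of the casting idea. cast_to_offset_dtype is a hypothetical helper written only for illustration; the actual mmcv code performs the cast inline in the op, and none of the names here come from the repository.

import torch


def cast_to_offset_dtype(x: torch.Tensor, weight: torch.Tensor,
                         offset: torch.Tensor):
    # The offset is produced by a plain Conv2d, which amp casts correctly,
    # so its dtype reflects whether fp16/amp is active. The deform-conv
    # CUDA kernel needs all of its tensor arguments in a single dtype,
    # hence the manual cast of input and weight to match the offset.
    return x.type_as(offset), weight.type_as(offset)

During an amp forward pass on an fp32 model, autocast runs the Conv2d in float16, so offset comes out as float16 and both input and weight are cast down before the CUDA kernel runs.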

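A rough sketch of what such a test might look like, assuming DeformConv2dPack from mmcv.ops; the actual tests added in this PR may be structured differently.

import pytest
import torch

from mmcv.ops import DeformConv2dPack


@pytest.mark.skipif(not torch.cuda.is_available(), reason='requires CUDA')
@pytest.mark.parametrize('input_dtype', [torch.float32, torch.float16])
def test_deform_conv_amp(input_dtype):
    # Under amp the model stays in fp32 and autocast picks per-op dtypes,
    # so the input itself may arrive as either float32 or float16.
    x = torch.randn(1, 3, 10, 10, dtype=input_dtype, device='cuda')
    model = DeformConv2dPack(3, 4, kernel_size=3, padding=1).cuda()
    with torch.cuda.amp.autocast(enabled=True):
        out = model(x)
    assert out.shape == (1, 4, 10, 10)
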
BC-breaking (Optional)

No.

CLAassistant commented May 17, 2021

CLA assistant check
All committers have signed the CLA.

@AronLin changed the title from "fix fp16 bug on DCNv2" to "[Fix] fix fp16 bug on DCNv2" on May 17, 2021
@zhouzaida requested a review from ycxioooong on May 17, 2021 12:03
codecov bot commented May 17, 2021

Codecov Report

Merging #1029 (6c719c0) into master (e9f2a02) will decrease coverage by 0.00%.
The diff coverage is 33.33%.

@@            Coverage Diff             @@
##           master    #1029      +/-   ##
==========================================
- Coverage   65.35%   65.34%   -0.01%     
==========================================
  Files         155      155              
  Lines        9927     9930       +3     
  Branches     1806     1806              
==========================================
+ Hits         6488     6489       +1     
- Misses       3105     3107       +2     
  Partials      334      334              
Flag Coverage Δ
unittests 65.34% <33.33%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
mmcv/ops/modulated_deform_conv.py 49.56% <0.00%> (-0.88%) ⬇️
mmcv/ops/deform_conv.py 62.04% <100.00%> (+0.27%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@AronLin changed the title from "[Fix] fix fp16 bug on DCNv2" to "[Fix] Support amp (pytorch >= 1.6.0) on DCN and DCNv2/ Add unit tests on DCN/DCNv2 amp" on May 19, 2021
Comment on lines 73 to 76
# The dtype of "offset" indicates whether fp16 (pytorch < 1.6.0)
# or amp (pytorch >= 1.6.0) is in use, so we cast weight and input
# to the dtype of offset to temporarily support both fp16 and amp,
# whatever the pytorch version is.
A collaborator commented:
We may want to state why we are doing this, e.g., amp won't cast the input for DCN correctly.

@ZwwWayne merged commit 4bd3b50 into open-mmlab:master on May 23, 2021
@AronLin deleted the fixDCNv2FP16 branch on May 23, 2021 09:47