[AMP] support GPU BF16 amp for dygraph #39029
Conversation
Thanks for your contribution!
Sorry to inform you that 2a0bd30's CIs have passed for more than 7 days. To prevent PR conflicts, you need to re-run all CIs manually.
enum class AmpDtype {
  D0 = 0,  // float32
  D1,      // float16
  D2,      // bfloat16
};
use pten::dtype directly
Done, tks.
@@ -861,7 +861,7 @@ def train(layer, loader, loss_fn, opt):
             feed={feed_target_names[0]: tensor_img},
             fetch_list=fetch_targets)

-        self.assertTrue(np.allclose(pred.numpy(), results, atol=1.e-5))
+        self.assertTrue(np.allclose(pred.numpy(), results, atol=1.e-2))
why change precision?
Restored the original tolerance of 1.e-5.
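(Side note, my own arithmetic rather than anything stated in this thread: bfloat16 keeps only 8 significand bits, so values near 1 are spaced about 2**-7 apart, which is presumably why the bf16 test path wanted a tolerance near 1.e-2 rather than 1.e-5.)

# bfloat16: 1 sign bit, 8 exponent bits, 7 explicit mantissa bits.
# Machine epsilon (spacing of values just above 1.0) is 2**-7.
bf16_eps = 2.0 ** -7
print(bf16_eps)  # 0.0078125 -- the same order as the 1.e-2 tolerance above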
prop = paddle.device.cuda.get_device_capability()
cuda_version = paddle.version.cuda()
if cuda_version is not None:
    cuda_maj_decide = int(cuda_version.split('.')[0]) >= 11
what's the meaning of 'maj_decide'?
Changed to cuda_version_check.
@@ -163,6 +185,7 @@ def check_optimizers(optimizers):
 @signature_safe_contextmanager
 @dygraph_only
 def amp_guard(enable=True,
+              dtype='float16',
Would it be better to make 'dtype' the last parameter?
Done, tks.
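Presumably the signature after this change looks something like the sketch below (dtype moved to the end per the suggestion; the middle parameter names are assumptions based on the public auto_cast API, not taken from this diff):

def amp_guard(enable=True,
              custom_white_list=None,
              custom_black_list=None,
              level='O1',
              dtype='float16'):
    ...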
paddle/fluid/imperative/tracer.cc
Outdated
      if (amp_dtype_ == AmpDtype::D1) {
        new_ins = AutoCastInputs<VarType>(type, ins);
      }
    } else if (amp_level_ == AmpLevel::O2) {
      VLOG(5) << "Pure fp16 run operator: " << type;
-     new_ins = CastPureFp16Inputs<VarType>(type, ins);
+     if (amp_dtype_ == AmpDtype::D1) {
+       new_ins = CastPureFp16Inputs<VarType>(type, ins);
+     } else if (amp_dtype_ == AmpDtype::D2) {
+       new_ins = CastPureBf16Inputs<VarType>(type, ins);
+     }
Why does bf16 only support O2?
O1 is now supported: ops in the white list will use bf16, the others will use fp32. A conceptual sketch of that rule follows.
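A minimal sketch of the O1 decision rule the reply describes (the white list contents below are hypothetical, not the PR's actual list):

WHITE_LIST = {'matmul_v2', 'conv2d'}  # hypothetical contents

def o1_cast_dtype(op_type, amp_dtype='bfloat16'):
    # Under O1, only white-listed ops run in the low-precision dtype;
    # everything else stays in float32.
    return amp_dtype if op_type in WHITE_LIST else 'float32'

assert o1_cast_dtype('matmul_v2') == 'bfloat16'
assert o1_cast_dtype('softmax') == 'float32'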
python/paddle/amp/auto_cast.py
Outdated
@@ -19,6 +19,7 @@

 def auto_cast(enable=True,
+              dtype='float16',
Same comment as above for this parameter.
Done, tks.
python/paddle/fluid/data_feeder.py
Outdated
@@ -67,6 +67,9 @@ def convert_dtype(dtype):
     # however, jointly supporting python2 and python3, (as well as python4 maybe)
     # may still be a long-lasting problem.
     return str(dtype)
+    # NOTE(zhangbo): Now numpy not support bfloat, and paddle use uint16 to carry the data of bfloat16, and there binary is consistent.
... does not support ..., and paddle uses uint16 to represent ..., and their binaries are consistent.
Done, tks.
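The uint16 trick is easy to demonstrate with plain numpy (my illustration, not code from the PR): bfloat16 is the top 16 bits of an IEEE float32, so its bit pattern fits exactly in a uint16. (Truncation is used here for simplicity; a real converter would round to nearest even.)

import numpy as np

x = np.array([3.14159], dtype=np.float32)
bits = x.view(np.uint32)
bf16 = (bits >> 16).astype(np.uint16)            # bfloat16 bit pattern (truncated)
back = (bf16.astype(np.uint32) << 16).view(np.float32)
print(back)                                      # [3.140625], bf16 precision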
if dtype == 'float16':
    amp_dtype = "float16"
elif dtype == 'bfloat16':
    amp_dtype = "bfloat16"
amp_dtype = dtype would be enough here.
Done, tks.
LGTM
LGTM
LGTM
PR types
New features
PR changes
APIs
Describe
Support bf16 mixed precision training for GPU dygraph mode.
paddle.amp.auto_cast() gains a new dtype parameter, defaulting to 'float16', with allowed values 'float16' and 'bfloat16'.
Example of mixed precision training with bfloat16:
paddle.amp.auto_cast(enable=True, custom_white_list={}, custom_black_list={}, level='O2', dtype='bfloat16')
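For completeness, a minimal dygraph training step using the new argument (my sketch, assuming a Paddle build that includes this PR and an Ampere-class GPU with CUDA 11+; O1 is shown because it needs no extra model decoration):

import paddle

model = paddle.nn.Linear(4, 2)
opt = paddle.optimizer.SGD(learning_rate=0.01,
                           parameters=model.parameters())

x = paddle.rand([8, 4])
with paddle.amp.auto_cast(enable=True, level='O1', dtype='bfloat16'):
    loss = model(x).mean()   # white-listed ops run in bf16 under O1
loss.backward()
opt.step()
opt.clear_grad()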