Support BF16 training for sharding and dp #46846

GhostScreaming · 2022-10-10T08:24:20Z

PR types

New features

PR changes

OPs

Describe

Support bfloat16 datatype for reducer operator, fill kernel and sharding strategy.
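As background for the description above, here is a minimal sketch of what the bfloat16 format does to a value and why a sum accumulated directly in bfloat16 can stall. It is not taken from this PR: the helper names (`to_bfloat16_bits`, `from_bfloat16_bits`, `bf16`) are hypothetical, the rounding mode (round-to-nearest-even) is an assumption, and NaN handling is omitted.

```python
import struct


def to_bfloat16_bits(x: float) -> int:
    """Round x to bfloat16 and return the 16-bit pattern (hypothetical helper).

    bfloat16 keeps the top 16 bits of the float32 encoding. Rounding here is
    round-to-nearest-even; NaN inputs are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Rounding bias: 0x7FFF plus the lowest kept bit, so ties go to even.
    bits += 0x7FFF + ((bits >> 16) & 1)
    return (bits >> 16) & 0xFFFF


def from_bfloat16_bits(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


def bf16(x: float) -> float:
    """Round-trip x through bfloat16."""
    return from_bfloat16_bits(to_bfloat16_bits(x))


# Accumulate 1000 ones, rounding the running sum to bfloat16 after every add.
acc_bf16 = 0.0
for _ in range(1000):
    acc_bf16 = bf16(acc_bf16 + 1.0)

# Accumulate at higher precision and round only once at the end.
acc_wide = bf16(sum(1.0 for _ in range(1000)))

print(acc_bf16, acc_wide)
```

With 8 significant bits, the bfloat16 running sum stops growing once the spacing between representable values exceeds the addend, whereas rounding once at the end preserves the total. Whether and where Paddle's kernels accumulate at higher precision is not stated in this PR.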

paddle-bot · 2022-10-10T08:24:24Z

Your PR has been submitted successfully. Thank you for contributing to this open-source project!
Please watch for the results of the automated CI tests; see the Paddle CI Manual for details.

paddle-bot · 2022-10-10T08:24:26Z

✅ This PR's description meets the template requirements!
Please wait for other CI results.

sneaxiy

LGTM.

* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.
* support pure bfloat16
* support bf16 linear
* update PR to pass CI
* tiny fix where_grad_kernel.cu
* Support bfloat16 type for reducer and sharding.
* Fix some bug.
* Polish code.
* Polise code.
* Add bfloat16 datatype in fill_grad kernels.

Co-authored-by: sneaxiy <sneaxiy@126.com>
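The first item concerns inputs with more than INT32_MAX elements. The PR page does not show the root cause, so the following is only a sketch of one common way such bugs arise: an element count or index held in a signed 32-bit integer wraps around. The helper `wrap_int32` is hypothetical and simply emulates two's-complement storage.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(n: int) -> int:
    """Emulate storing n in a signed 32-bit integer (two's-complement wrap).

    Hypothetical helper for illustration; not part of Paddle.
    """
    return (n + 2**31) % 2**32 - 2**31


# An element count just past the 32-bit limit, and a larger one.
for numel in (INT32_MAX, INT32_MAX + 1, 3 * 2**30):
    print(numel, "->", wrap_int32(numel))
```

Any loop bound or offset computed from the wrapped value would be negative or too small, which is consistent with a wrong reduction result for very large inputs. This is an assumed mechanism, not a description of the actual fix.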

GhostScreaming and others added 15 commits September 14, 2022 10:10

Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.

1e70140

Merge branch 'reduce_sum' of https://github.com/GhostScreaming/Paddle into develop

e1f08a2

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop

ff1bfbc

support pure bfloat16

f4fe24f

support bf16 linear

b420a32

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into develop

5b7bc39

update PR to pass CI

7ff1388

tiny fix where_grad_kernel.cu

b9a7c14

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bfloat16

46662c4

Merge branch 'fix_bfloat16' of https://github.com/sneaxiy/Paddle into fix_bfloat16

9e18791

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bfloat16

29a9e77

Support bfloat16 type for reducer and sharding.

817d7ee

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bfloat16

6e15126

Fix some bug.

44abf06

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_bfloat16

7fe04f2

sneaxiy previously approved these changes Oct 14, 2022

Polish code.

384d497

GhostScreaming dismissed sneaxiy’s stale review via 384d497 October 14, 2022 11:28

GhostScreaming added 2 commits October 16, 2022 14:55

Polise code.

d012390

Add bfloat16 datatype in fill_grad kernels.

480c732

sneaxiy approved these changes Oct 17, 2022

sneaxiy changed the title ~~Fix bfloat16~~ Support BF16 training for sharding Oct 17, 2022

sneaxiy merged commit 0b39b24 into PaddlePaddle:develop Oct 17, 2022

sneaxiy changed the title ~~Support BF16 training for sharding~~ Support BF16 training for sharding and dp Oct 17, 2022

GhostScreaming added a commit to GhostScreaming/Paddle that referenced this pull request Oct 21, 2022

Revert "Support BF16 training for sharding (PaddlePaddle#46846)"

d58c1f6

This reverts commit 6adbed6.
