
[dygraph sharding stage 2] sharding broadcast overlap #46656

Merged: 5 commits from broadcast_overlap_forward into PaddlePaddle:develop on Oct 9, 2022

Conversation

@FeixLiu (Contributor) commented Sep 29, 2022

PR types

Others

PR changes

Others

Describe

Support sharding broadcast overlap for stage 2: overlap the broadcast of the updated parameters after the optimizer step with the forward computation of the next step, hiding the communication latency.

Usage

```python
# Keep a reference to the original (unwrapped) model.
origin_model = model
model, optimizer, scaler = group_sharded_parallel(
    model=model, optimizer=optimizer, level="os_g", scaler=scaler)
# The original model must be passed in to enable broadcast overlap.
model._set_broadcast_overlap(True, origin_model)
```
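For context, below is a minimal end-to-end sketch of where these calls sit in a stage 2 training script. It assumes the public `paddle.distributed.sharding` API (`level="os_g"` selects stage 2); the network, optimizer, hyperparameters, and training loop are illustrative, not taken from this PR.

```python
# A minimal sketch, not the PR's test code: the model, optimizer, and
# hyperparameters are illustrative. Launch on multiple GPUs, e.g.:
#   python -m paddle.distributed.launch --gpus=0,1 train.py
import paddle
from paddle.distributed import fleet
from paddle.distributed.sharding import group_sharded_parallel

fleet.init(is_collective=True)

model = paddle.nn.Sequential(
    paddle.nn.Linear(1024, 4096),
    paddle.nn.ReLU(),
    paddle.nn.Linear(4096, 1024),
)
optimizer = paddle.optimizer.AdamW(
    learning_rate=1e-4, parameters=model.parameters())
scaler = paddle.amp.GradScaler(init_loss_scaling=2**15)

origin_model = model  # keep the unwrapped model around
# level="os_g" selects sharding stage 2 (optimizer state + gradient sharding).
model, optimizer, scaler = group_sharded_parallel(
    model=model, optimizer=optimizer, level="os_g", scaler=scaler)
# Overlap the parameter broadcast with forward computation; the original
# model is needed so the overlap hooks can be attached to its layers.
model._set_broadcast_overlap(True, origin_model)

for step in range(10):
    x = paddle.randn([8, 1024])
    with paddle.amp.auto_cast():
        loss = model(x).mean()
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    optimizer.clear_grad()
```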

6.7B Loss Comparison

[image: loss curves of the 6.7B model with and without broadcast overlap]

6.7B Speed Comparison

| No broadcast overlap | Broadcast overlap | Gain |
| --- | --- | --- |
| 40077 | 43507 | +8.6% |
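For reference, the gain is the throughput ratio from the table above: 43507 / 40077 - 1 ≈ +8.6%.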

@FeixLiu force-pushed the broadcast_overlap_forward branch 2 times, most recently from 818838f to 69ae5d4 on October 8, 2022 03:41
@FeixLiu requested a review from sneaxiy on October 9, 2022 03:08
@FeixLiu merged commit d8b4ca9 into PaddlePaddle:develop on Oct 9, 2022
@FeixLiu deleted the broadcast_overlap_forward branch on October 9, 2022 07:30
FeixLiu added a commit to FeixLiu/Paddle that referenced this pull request Oct 17, 2022
fuyinno4 pushed a commit that referenced this pull request Oct 18, 2022
* [dygraph sharding] Overlap the reduce and the calculation for sharding stage 2. (#46495)

* [dygraph sharding stage 2] sharding broadcast overlap (#46656)

* Multi groups for broadcast of sharding stage 2 (#46894)