[Sharding]: update config DOC #32299
Thanks for your contribution!
c003485 to 69cedad
aafb3d4 to ba7ee5e
This configuration will affect the communication speed in sharding training,
and should be an empirical value decided by your model size and network topology.
sharding_segment_strategy(string): strategy used to segment the program (forward & backward operations). Two strategies are
available: "segment_broadcast_MB" and "segment_anchors". Segment is a concept used in sharding to overlap computation and
The concepts of segment_broadcast_MB and segment_anchors should be introduced here, shouldn't they?
updated
segment_broadcast_MB(float): segment by the parameter broadcast volume. Sharding will introduce parameter broadcast operations into the program, and
after every segment_broadcast_MB of parameters has been broadcast, the program will be cut into one segment.
This configuration will affect the communication speed in sharding training, and should be an empirical value decided by your model size and network topology.
Only enable sharding_segment_strategy = segment_broadcast_MB. when Default is 32.0 .
when Default is 32.0
done
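For readers following this thread, the segmentation rule described in the docstring can be sketched in plain Python. This is only an illustration of "cut a segment after every segment_broadcast_MB of parameters has been broadcast", not Paddle's implementation; the helper name `segment_by_broadcast_volume` and the sample sizes are hypothetical.

```python
def segment_by_broadcast_volume(param_sizes_mb, segment_broadcast_mb=32.0):
    """Group parameter indices into segments (hypothetical helper).

    A segment is closed as soon as the accumulated broadcast volume
    reaches segment_broadcast_mb; any leftover parameters form a final,
    smaller segment.
    """
    segments, current, volume = [], [], 0.0
    for index, size_mb in enumerate(param_sizes_mb):
        current.append(index)
        volume += size_mb
        if volume >= segment_broadcast_mb:
            segments.append(current)
            current, volume = [], 0.0
    if current:
        segments.append(current)
    return segments


# Made-up parameter sizes in MB, using the documented default of 32.0.
sizes = [10.0, 12.0, 16.0, 4.0, 40.0, 8.0]
print(segment_by_broadcast_volume(sizes))  # [[0, 1, 2], [3, 4], [5]]
```

A smaller threshold yields more, shorter segments, which is why the docstring calls the value empirical.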
segment_anchors(list): list of anchors used to segment the program, which allows finer control of program segmentation.
This strategy is experimental for now. Only enabled when sharding_segment_strategy = segment_anchors.

sharding_degree(int): specifies the number of GPUs within each sharding parallelism group; sharding will be turned off if sharding_degree=1. Default is 8.
sharding_degree(int) -> sharding_degree(int, optional); same for the entries below
done
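As a rough illustration of what sharding_degree controls, the sketch below partitions worker ranks into sharding groups. It assumes contiguous rank assignment and a world size divisible by sharding_degree; neither assumption is stated in the docstring, and the helper `sharding_groups` is hypothetical rather than a Paddle function.

```python
def sharding_groups(world_size, sharding_degree=8):
    """Partition ranks 0..world_size-1 into sharding groups (hypothetical helper).

    Returns an empty list when sharding_degree == 1, mirroring the
    documented behavior that sharding is turned off in that case.
    """
    if sharding_degree < 1:
        raise ValueError("sharding_degree must be a positive integer")
    if sharding_degree == 1:
        return []  # sharding is off
    if world_size % sharding_degree != 0:
        raise ValueError("world_size must be divisible by sharding_degree")
    # Assumption: ranks are grouped contiguously.
    return [
        list(range(start, start + sharding_degree))
        for start in range(0, world_size, sharding_degree)
    ]


print(sharding_groups(16))     # two groups of 8 ranks each
print(sharding_groups(4, 1))   # [] because sharding is off
```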
segment_anchors(list): list of anchors used to segment the program, which allows finer control of program segmentation.
This strategy is experimental for now. Only enabled when sharding_segment_strategy = segment_anchors.

sharding_degree(int): specifies the number of GPUs within each sharding parallelism group; sharding will be turned off if sharding_degree=1. Default is 8.
sharding_degree(int) -> sharding_degree(int, optional)
done
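The relationship between sharding_segment_strategy and the two strategy-specific options can be summarized with a small plain-dict sketch. The key names and the values 32.0 and 8 come from the docstring under review; the helper `active_segment_option` and the anchor name are hypothetical, and nothing here calls into Paddle.

```python
VALID_STRATEGIES = ("segment_broadcast_MB", "segment_anchors")


def active_segment_option(configs):
    """Return (name, value) of the option the chosen strategy reads.

    Hypothetical helper: each strategy-specific option only matters when
    sharding_segment_strategy names it.
    """
    strategy = configs["sharding_segment_strategy"]
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown sharding_segment_strategy: {strategy!r}")
    if strategy not in configs:
        raise KeyError(f"{strategy} must be set when it is the chosen strategy")
    return strategy, configs[strategy]


sharding_configs = {
    "sharding_segment_strategy": "segment_broadcast_MB",
    "segment_broadcast_MB": 32.0,  # documented default
    "sharding_degree": 8,          # documented default
}
print(active_segment_option(sharding_configs))

anchor_configs = {
    "sharding_segment_strategy": "segment_anchors",
    "segment_anchors": ["anchor_0"],  # made-up anchor name
}
print(active_segment_option(anchor_configs))
```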
@@ -845,7 +869,7 @@ def pipeline_configs(self):
 **Notes**:
 **Detailed arguments for pipeline_configs**

-**micro_batch**: the number of small batches in each user defined batch
+**micro_batch_size**: the number of small batches in each user defined batch
The Chinese documentation for this part has not been updated.
done~
One more thing to note: the documentation should not use "the user" as the subject of a sentence; in general, simply omit the subject.
updated~
LGTM
PR types
Others
PR changes
Docs
Describe
sharding: update config DOC
English
Chinese