
[Sharding]: update config DOC #32299

Merged

Conversation


@JZ-LIANG JZ-LIANG commented Apr 15, 2021

PR types

Others

PR changes

Docs

Describe

sharding: update config DOC

English:
[screenshot of the updated English documentation]

Chinese:
[screenshot of the updated Chinese documentation]

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI result first. See the Paddle CI Manual for details.

@JZ-LIANG JZ-LIANG changed the title sharding: update config DOC [Sharding]: update config DOC Apr 15, 2021
@JZ-LIANG JZ-LIANG force-pushed the static/hybrid-parallelism/4d-doc branch from c003485 to 69cedad on April 15, 2021 09:23
@JZ-LIANG JZ-LIANG force-pushed the static/hybrid-parallelism/4d-doc branch from aafb3d4 to ba7ee5e on April 15, 2021 12:42
This configuration will affect the communication speed in sharding training,
and should be an empirical value decided by your model size and network topology.
sharding_segment_strategy(string): strategy used to segment the program (forward & backward operations). Two strategies are
available: "segment_broadcast_MB" and "segment_anchors". A segment is a concept used in sharding to overlap computation and communication.
Contributor

The concepts of segment_broadcast_MB and segment_anchors need to be introduced, don't they?

Contributor Author

updated

segment_broadcast_MB(float): segment by the volume of parameters broadcast. Sharding will introduce parameter broadcast operations into the program, and
after every segment_broadcast_MB of parameters has been broadcast, the program will be cut into one segment.
This configuration will affect the communication speed in sharding training, and should be an empirical value decided by your model size and network topology.
Only enabled when sharding_segment_strategy = segment_broadcast_MB. Default is 32.0.
Contributor

when Default is 32.0

Contributor Author

done

segment_anchors(list): list of anchors used to segment the program, which allows a finer control of program segmentation.
This strategy is experimental for now. Only enabled when sharding_segment_strategy = segment_anchors.

sharding_degree(int): specifies the number of GPUs within each sharding parallelism group; sharding will be turned off if sharding_degree=1. Default is 8.
Contributor

sharding_degree(int) -> sharding_degree(int, optional) (same below)

Contributor Author

done

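To make the options above concrete, here is a minimal sketch of how they could be set through fleet.DistributedStrategy (the key names come from the documentation under review; the surrounding model and optimizer setup is omitted, and the values are only illustrative):

import paddle.distributed.fleet as fleet

# A minimal sketch of the sharding options documented in this PR.
strategy = fleet.DistributedStrategy()
strategy.sharding = True
strategy.sharding_configs = {
    # Segment the program by parameter broadcast volume.
    "sharding_segment_strategy": "segment_broadcast_MB",
    # Cut a new segment after every 32 MB of parameters is broadcast;
    # an empirical value depending on model size and network topology.
    "segment_broadcast_MB": 32.0,
    # Number of GPUs in each sharding parallelism group; 1 turns sharding off.
    "sharding_degree": 8,
}
# The strategy would then be passed to fleet.distributed_optimizer(...).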

@@ -845,7 +869,7 @@ def pipeline_configs(self):
**Notes**:
**Detailed arguments for pipeline_configs**

- **micro_batch**: the number of small batches in each user defined batch
+ **micro_batch_size**: the number of small batches in each user defined batch
Contributor

The Chinese documentation for this part has not been updated.

Contributor Author

done~
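For reference, a minimal sketch of where the renamed argument sits (again with an illustrative value; the pipeline model definition is omitted):

import paddle.distributed.fleet as fleet

# A minimal sketch showing the renamed pipeline argument.
strategy = fleet.DistributedStrategy()
strategy.pipeline = True
strategy.pipeline_configs = {
    # Renamed from micro_batch in this PR: the number of small batches
    # in each user-defined batch.
    "micro_batch_size": 2,
}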

@TCChenlong
Contributor

One more thing to note: the documentation should not use "the user" as the subject of a sentence; in general, simply omit the subject.

@JZ-LIANG
Contributor Author

One more thing to note: the documentation should not use "the user" as the subject of a sentence; in general, simply omit the subject.

updated~

@TCChenlong TCChenlong left a comment

LGTM

@ForFishes ForFishes merged commit e348901 into PaddlePaddle:develop Apr 20, 2021