
[AutoParallel] adapt lazyinit & fix pass #45840

Merged: 5 commits into PaddlePaddle:develop on Sep 9, 2022

Conversation

@zhaoyinglia (Contributor) commented on Sep 7, 2022:

PR types

New features

PR changes

Others

Describe

  • add lazy_init (a short usage sketch follows this list)
  • fix dist_loader to receive the complete user-defined feed_list
  • fix the recompute pass to reduce its time cost
  • fix reshard to be compatible with the sharding pass
  • fix the dp_overlap pass to be compatible with the sharding pass
  • fix the amp pass when a casted grad_var is renamed to grad_var@RENAME
  • fix the amp pass to record the map from grad_cast to fwd_cast for dist_default
  • fix the sharding pass when an op's input from the fp16 pass is empty
  • fix the sharding pass to remove the sum op when the var is not a local shard
  • update auto_parallel_gpt_model.py to the weight_sharing version in the unittest
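
For context, here is a minimal usage sketch of the lazy-init entry point this PR adapts. paddle.LazyGuard is Paddle's public API for deferring parameter creation; the rest of the snippet is illustrative and not taken from the PR:

```python
import paddle
from paddle import LazyGuard

# Eager construction (no LazyGuard): parameters are created and
# initialized immediately, so their values can be reused directly.
eager_linear = paddle.nn.Linear(8, 8)

# Lazy construction: parameter initialization is deferred, and the
# auto-parallel engine later materializes the parameters by running
# the startup_program (see the lazy_init discussion below).
with LazyGuard():
    lazy_linear = paddle.nn.Linear(8, 8)
```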

paddle-bot bot commented Sep 7, 2022

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

```diff
@@ -304,7 +338,10 @@ def main_program(self):

     @property
     def startup_program(self):
-        return self.concrete_program.startup_program
+        try:
+            return self.proxy_layer.startup_program
```
Contributor:

Why is it written this way?

Contributor Author (@zhaoyinglia):

Because the user may not have used LazyGuard at all; in that case we still need to guarantee that the earlier, non-lazy-init version of dynamic-to-static runs correctly.
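
A minimal sketch of the fallback pattern described here; the except branch is truncated in the excerpt above, so its body below is an assumption based on the pre-change line:

```python
@property
def startup_program(self):
    try:
        # Available when the model was built under LazyGuard.
        return self.proxy_layer.startup_program
    except Exception:
        # Assumption: without LazyGuard, fall back to the original
        # dynamic-to-static (dy2static) startup_program.
        return self.concrete_program.startup_program
```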

```diff
@@ -199,6 +206,7 @@ def __init__(self, layer, loss_func, metrics, inputs_spec, labels_spec):

         self.build_info = BuildInfo()
         self._logger = get_logger(logging.INFO)
+        self.lazy_init = False
```
Contributor:

What is the purpose of the lazy_init flag? Could you add a Note here explaining it?

Contributor Author (@zhaoyinglia):

It indicates whether the user used LazyGuard. If not (False), the parameters assigned when the model was initialized are used directly; if so (True), the parameters are initialized by executing the startup_program.
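
A sketch of the branch this flag controls; the method and helper names below are hypothetical, as only lazy_init itself appears in the diff:

```python
def _init_parameters(self, exe, place, startup_program, layer, scope):
    # Hypothetical illustration of the behavior described above.
    if self.lazy_init:
        # LazyGuard was used: parameter values do not exist yet, so
        # initialize them by executing the startup_program.
        exe.run(startup_program)
    else:
        # No LazyGuard: parameters already hold the values assigned at
        # model construction time; share them into the executor scope.
        for param in layer.parameters():
            scope.var(param.name).get_tensor().set(param.numpy(), place)
```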

```python
        }
        # slice param_value with dist_attr
        # share sliced_param_value with param_tensor in global_scope
        from .converter import Converter
```
Contributor:

Don't use dynamic imports; import statements should all be placed at the top of the file.

Contributor Author (@zhaoyinglia):

Fixed.
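
In general form, the suggestion is to hoist the import to module scope; a sketch with illustrative function names:

```python
# Before (discouraged): the import is buried in the function body and
# executes at call time.
def _init_params_before():
    from .converter import Converter
    ...

# After (recommended): a single import at the top of the module, e.g.
#     from .converter import Converter
# and the function body then uses Converter directly.
def _init_params_after():
    ...
```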

@Aurelius84 (Contributor) left a comment:
LGTM

@JZ-LIANG (Contributor) left a comment:

LGTM. The name of the auto_parallel/helper.py file should be made more specific in the future.

@JZ-LIANG JZ-LIANG merged commit bc2265f into PaddlePaddle:develop Sep 9, 2022