Add some passes which can be applied to Program #34730
Conversation
Thanks for your contribution!
fix compile error of op compat
LGTM for framework.py & block_desc.h
paddle/fluid/framework/block_desc.cc
@@ -238,5 +238,37 @@ BlockDesc *BlockDesc::ForwardBlock() const {
  return prog_->MutableBlock(static_cast<size_t>(desc_->forward_block_idx()));
}

void BlockDesc::MoveFrom(BlockDesc *block) {
  PADDLE_ENFORCE_NOT_NULL(
      block, platform::errors::InvalidArgument("block must be provided"));
We recommend capitalizing the first letter of the error message and ending it with a period. The same applies to the other cases.
Done.
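With the convention applied, the check above would read along these lines (a minimal sketch):

void BlockDesc::MoveFrom(BlockDesc *block) {
  // Message capitalized and ended with a period, per the review convention.
  PADDLE_ENFORCE_NOT_NULL(
      block, platform::errors::InvalidArgument("Block must be provided."));
  // ... rest of MoveFrom unchanged ...
}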
outputs[i]->ShareBufferWith(*inputs[i]);
VLOG(10) << "Share tensor buffer " << (*input_args)[i] << " -> "
         << (*output_args)[i];
if (!share_dims.empty() && share_dims[i]) {
Better to check share_dims.size() == n.
It has been checked here.
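For reference, a minimal sketch of such a guard (the name n for the pair count is an assumption):

// share_dims is expected to be empty or to hold one flag per tensor pair.
if (!share_dims.empty()) {
  PADDLE_ENFORCE_EQ(
      share_dims.size(), n,
      platform::errors::InvalidArgument(
          "share_dims should be empty or have the same size as inputs."));
}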
update_attr(attrs, attr_types, "nranks", 1, "size_t")
update_attr(attrs, attr_types, "use_cuda", False, "bool")
# TODO(zjl): how to skip fetch variables ?
update_attr(attrs, attr_types, "mem_opt_skip_vars",
No. Currently I only skip the variables that satisfy var.is_data == True.
const auto &startups =
    graph.Get<details::ProgramDescs>(details::kStartupProgramDescs);
VLOG(10) << "Merge startup programs";
MergePrograms(startup_program, startups, /*append=*/true);
I wonder why we need to merge here. If startup_program is the program the passes were applied to, why would it lose some vars or ops?
Commonly, startup_program would not lose any vars or ops. We only need to merge the newly created kStartupProgramDescs into the startup_program. Some passes may produce kStartupProgramDescs (a list of ProgramDescs) (see here) and run it once when using ParallelExecutor (see here). So when we apply passes to ProgramDescs, startup_program should contain these kStartupProgramDescs, so that the newly created ops can run, and run only once.
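For context, a rough sketch of what an append-mode merge could look like (an illustration only, not the actual MergePrograms implementation; the copying details are assumptions):

// Sketch: append block-0 vars and ops of each produced ProgramDesc to
// the destination program, skipping vars that already exist there.
void MergeProgramsSketch(ProgramDesc *dst, const details::ProgramDescs &srcs) {
  auto *dst_block = dst->MutableBlock(0);
  for (const auto &src : srcs) {
    const auto &src_block = src.Block(0);
    for (auto *var : src_block.AllVars()) {
      if (!dst_block->HasVar(var->Name())) {
        *dst_block->Var(var->Name()) = *var;  // assumes VarDesc is copyable
      }
    }
    for (auto *op : src_block.AllOps()) {
      dst_block->AppendOp()->CopyFrom(*op);  // append so new ops run last
    }
  }
}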
if (is_first_var_valid == is_second_var_valid) {
  continue;
}
Is it ok if is_first_var_valid and is_second_var_valid are both true?
Yes. I agree with you.
Thanks for your suggestions, @zhiqiu. I do think it is ok for is_first_var_valid and is_second_var_valid to both be true. I have added the logic. Please review again.
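A minimal sketch of the revised branching (the surrounding comparison loop is assumed):

// Both invalid: nothing to compare for this name. Exactly one valid:
// the two programs differ. Both valid: fall through and compare the
// variables themselves.
if (!is_first_var_valid && !is_second_var_valid) {
  continue;
}
if (is_first_var_valid != is_second_var_valid) {
  return false;
}
// both valid: proceed to the detailed variable comparison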
scope1 = paddle.static.Scope()
with paddle.static.scope_guard(scope1):
    self.executor.run(startup1)

scope2 = paddle.static.Scope()
with paddle.static.scope_guard(scope2):
    self.executor.run(startup2)
Suggested change:

paddle.seed(2021)
scope1 = paddle.static.Scope()
with paddle.static.scope_guard(scope1):
    self.executor.run(startup1)

paddle.seed(2021)
scope2 = paddle.static.Scope()
with paddle.static.scope_guard(scope2):
    self.executor.run(startup2)
Done. Thanks for your suggestions!
LGTM
PADDLE_ENFORCE_EQ(block.ID(), 0, platform::errors::Unimplemented(
                                     "Inplace can only perform in block 0."));
// only take block 0 gc_vars
const auto all_gc_vars =
Readability: suggest renaming all_gc_vars to op_gc_vars or a similar name. Then it is easier to see why its size matches all_ops.size(), and op_gc_vars[i] clearly means the gc_vars of op[i].
Done.
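After the rename the invariant reads naturally; a sketch (GetOpGCVars is a hypothetical placeholder for however the per-op list is built):

// op_gc_vars[i] lists the variables that can be garbage-collected right
// after all_ops[i] runs, so the size invariant below is self-evident.
const auto op_gc_vars = GetOpGCVars(block);  // hypothetical helper
PADDLE_ENFORCE_EQ(op_gc_vars.size(), all_ops.size(),
                  platform::errors::InvalidArgument(
                      "There should be one gc_vars entry per op."));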
for (const auto &pair : var_pair) {
  const auto &input_slot = pair.first;
  const auto &output_slot = pair.second;
  auto input_var = GetFirstVarName(op, input_slot, true);
It is not straightforward to read the meaning of the boolean without looking at the code of GetFirstVarName. Two suggestions:
1. Annotate the literal at the call site: /* is_input = */ true.
2. Change GetFirstVarName to GetNameMapFirstVar, then pass it op.Inputs() instead of op.
Done.
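Either option makes the call site self-explanatory (a sketch; the GetNameMapFirstVar signature is assumed):

// Option 1: annotate the boolean literal in place.
auto input_var = GetFirstVarName(op, input_slot, /*is_input=*/true);

// Option 2: pass the name map directly so no flag is needed.
auto input_var2 = GetNameMapFirstVar(op.Inputs(), input_slot);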
API.spec changed, but no Python API changed. Approve for this.
lgtm
LGTM for op benchmark CI
LGTM
PR types: New features
PR changes: Others
Describe:
Add some passes which can be applied to Program:
TODO: add fuse_allreduce_op_pass.