Fix bug of reduce_sum op. #46045
Conversation
1. Change fluid header files to phi header files.
2. Delete useless code.
3. Fix code style problems.
@@ -29,10 +78,42 @@ void SumRawKernel(const Context& dev_ctx,
  if (out_dtype == DataType::UNDEFINED && out->dtype() != x.dtype()) {
    out_dtype = out->dtype();
  }
  phi::Reduce<T, kps::AddFunctor, kps::IdentityFunctor>(
  if (x.numel() > INT_MAX) {
Use `x.numel() >= std::numeric_limits<int32_t>::max()` here instead of comparing against the `INT_MAX` macro.
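The suggested guard can be sketched as a small standalone helper (the helper name is hypothetical, not from the PR): prefer `std::numeric_limits<int32_t>::max()` over the C macro, and dispatch to the 64-bit indexing path once the element count no longer fits in `int32_t`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper sketching the reviewer's suggestion: switch to the
// 64-bit indexing path when numel no longer fits in int32_t.
bool NeedsInt64IndexPath(int64_t numel) {
  return numel >= static_cast<int64_t>(std::numeric_limits<int32_t>::max());
}
```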
// Calculate
eigen_out_tensor.device(*dev_ctx.eigen_device()) =
    eigen_x_tensor.sum(eigen_reduce_dim);
std::vector<int64_t> final_out_dim;
I prefer saving the original output dims first, like:
auto origin_out_dims = out->dims();
// some other codes...
out->Resize(origin_out_dims);
The code above is more readable.
// Resize Input Tensor
auto new_x = x;
int added_dims = EigenDimSize - x.dims().size();
std::vector<int64_t> new_dim(added_dims, 1);
Change `new_dim` to `new_x_dim`, which is more readable. Also change the type of `new_dim` from `std::vector<int64_t>` to `std::array<int64_t, EigenDimSize>`.
// Create Out Tensor
dev_ctx.Alloc<T>(out);
// Resize Out Tensor
std::vector<int64_t> new_reduced_dim(added_dims, 1);
Rename `new_reduced_dim` to `new_out_dim`. Use `std::array<int64_t, kReduceOutRank>` instead of `std::vector<int64_t>`.
new_dim.push_back(x.dims().at(i));
}
new_x.Resize(phi::make_ddim(new_dim));
auto eigen_x_tensor = EigenTensor<T, EigenDimSize>::From(x);
Change `x` to `new_x` here.
dev_ctx, x, reduce_all, dims.GetData(), keep_dim, out_dtype, out);
}
if (x.numel() > INT_MAX) {
#ifndef PADDLE_WITH_XPU_KP
It seems that `out_dtype` is not considered here. How about raising an exception directly if a valid `out_dtype` is provided?
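One way to act on this suggestion, sketched with a plain C++ exception (the real kernel would use Paddle's error-raising macros; the enum and function here are stand-ins, not the PHI types):

```cpp
#include <stdexcept>

// Stand-in for phi::DataType, just for this sketch.
enum class DataType { UNDEFINED, FLOAT32, FLOAT64 };

// Hypothetical guard: the large-tensor Eigen fallback does not perform an
// out_dtype cast, so reject a cast request instead of silently ignoring it.
void CheckOutDtypeSupported(DataType out_dtype, DataType x_dtype) {
  if (out_dtype != DataType::UNDEFINED && out_dtype != x_dtype) {
    throw std::runtime_error(
        "out_dtype casting is not supported on the large-tensor path");
  }
}
```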
dev_ctx, x, reduce_all, dims.GetData(), keep_dim, out_dtype, out);
}
if (x.numel() > std::numeric_limits<int32_t>::max()) {
#ifndef PADDLE_WITH_XPU_KP
Wrong macro position. It should be:
#ifndef PADDLE_WITH_XPU_KP
if (x.numel() > std::numeric_limits<int32_t>::max()) {
// Some other codes..
return;
}
#endif
// The original codes...
LGTM. Great work.
PR types
Bug fixes
PR changes
OPs
Describe
Fix a bug in the reduce_sum op: when `input.numel() > INT32_MAX`, its result is wrong. Use the Eigen tensor `sum` function to calculate the reduction in that case.