
merge master #115

Merged — 10 commits, Feb 23, 2024
4 changes: 2 additions & 2 deletions paddle/fluid/operators/fused/fused_seq_tensor_op.cc

@@ -47,10 +47,10 @@ class FusedSeqTensorOp : public framework::OperatorWithKernel {
         ad_slot_num, 0,
         platform::errors::InvalidArgument(
             "ad_slot_num [%ld] <= 0", ad_slot_num));
-    PADDLE_ENFORCE_LT(
+    PADDLE_ENFORCE_LE(
         ad_slot_offset, slot_num - 1,
         platform::errors::InvalidArgument(
-            "ad_slot_num [%ld] > slot_num - 1 [%ld]", ad_slot_offset, slot_num));
+            "ad_slot_num [%ld] > slot_num - 1 [%ld]", ad_slot_offset, slot_num));
     PADDLE_ENFORCE_GE(
         ad_slot_offset, 0,
         platform::errors::InvalidArgument(
4 changes: 2 additions & 2 deletions paddle/fluid/operators/fused/fused_seq_tensor_op.cu

@@ -145,8 +145,8 @@ __device__ void warpReduce(volatile T* cache, int tid) {

 #define THREAD_PER_BLOCK 128
 template <typename T>
-__global__ void reduce_sum_max_length(const T* input, // 1
-                                      T* mask_output, // mask
+__global__ void reduce_sum_max_length(const T* input,
+                                      T* mask_output,
                                       const size_t batch_count,
                                       const size_t ins_num,
                                       const size_t slot_num,