-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add FP16 option in save_combine_op #10471
Add FP16 option in save_combine_op #10471
Conversation
std::vector<int> lod1 = {0, 1, 2, 3}; | ||
int numel1 = 100; | ||
paddle::framework::LoD expect_lod1; | ||
float* expect1 = CreateForSaveCombineOp<float>(10, 10, lod1, "test_var1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it is related to the error.
But the last number of your lod vector info should match the first dimension of your tensor, which is 10 here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed (though it is not related to the error).
std::vector<int> lod3 = {0, 1, 2, 3}; | ||
int numel3 = 2050; | ||
paddle::framework::LoD expect_lod3; | ||
float* expect3 = CreateForSaveCombineOp<float>(1, 2050, lod3, "test_var3", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your lod3 = {0, 1, 2, 3}, but the first dimension of your tensor is 1. Not sure if this is right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kexinzhao . I found the reason which causes the failure. It seems that converting from int -> float16 -> float, results in loss of information.
After the values 2048, the values are:
i, float(float16(i))
2048, 2048
2049, 2048
2050, 2050
2051, 2050
2052, 2052
2053, 2052
2054, 2054
2055, 2054
2056, 2056
2057, 2056
… save_combine_fp16
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
… save_combine_fp16
… save_combine_fp16
Issue: it fails the check since the 3rd tensor has size > 2050.