
Update passes in quant2_int8_mkldnn_pass #38912

Merged (3 commits into PaddlePaddle:develop on Jan 27, 2022)

Conversation

@wozna (Contributor) commented Jan 12, 2022

PR types

Bug fixes

PR changes

Others

Describe

This PR updates the passes that are applied to the graph during quant2_int8_mkldnn_pass. It turned out that many passes had been added to CpuPassStrategy and to the EnableMkldnn pass list, but they were never added to the aforementioned Python script. I used the same order of passes as presented in paddle_pass_builder.cc.
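For context, quant2_int8_mkldnn_pass applies named IR passes to the graph one by one through Paddle's pass registry. Below is a minimal sketch of that pattern, assuming a simplified apply_pass helper (the real script's _apply_pass also attaches a scope and pass attributes); the subset of pass names is taken from paddle_pass_builder.cc, but the function names here are illustrative:

```python
# Sketch: applying named IR passes to an IrGraph in the order given by
# CpuPassStrategy in paddle_pass_builder.cc. Simplified for illustration;
# the upstream helper also wires a scope and per-pass attributes.
from paddle.fluid import core


def apply_pass(graph, pass_name):
    # IrGraph wraps a C++ ir::Graph; registered IR passes run on graph.graph.
    ir_pass = core.get_pass(pass_name)
    ir_pass.apply(graph.graph)
    return graph


def optimize_fp32_graph(graph):
    # A small subset of the FP32 optimization passes, in the
    # paddle_pass_builder.cc order.
    for name in [
        'simplify_with_basic_ops_pass',
        'conv_bn_fuse_pass',
        'conv_eltwiseadd_bn_fuse_pass',
        'conv_transpose_bn_fuse_pass',
    ]:
        graph = apply_pass(graph, name)
    return graph
```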

@paddle-bot-old commented

Thanks for your contribution!
Please wait for the CI result first. See the Paddle CI Manual for details.

@lidanqing-intel (Contributor) commented

@wozna @pmajchrzak Please verify that this PR does not cause an accuracy or performance drop for the int8 models in our existing daily CI. Thanks!

sfraczek previously approved these changes on Jan 13, 2022

@sfraczek (Contributor) left a comment


LGTM

@lidanqing-intel (Contributor) commented

@wozna Hi, could you upload, or ask Piotr to upload, a spreadsheet showing that this PR does not cause any accuracy or performance drop on the existing int8 models CI?

@wozna (Contributor, Author) commented Jan 19, 2022

I can confirm that the recent changes don't cause any accuracy or performance change in any model from our CI. Earlier I had a problem with scale_matmul_fuse_pass: I couldn't understand why this pass had to be in _quantize_fp32_graph rather than in _optimize_fp32_graph. It turned out that we use a scale operator to gather the scale for matmul, and when I moved scale_matmul_fuse_pass, 12 matmuls weren't quantized because the scale for their input was missing. That change slightly improved accuracy on Ernie, from 0.791165 to 0.8000, but caused a performance drop of 1.5%. That's why I returned to the previous order, but moved cpu_quantize_placement_pass after the scale_matmul and reshape_transpose_matmul/V2 passes.
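A rough sketch of the final ordering described above, reusing the apply_pass helper from the earlier sketch. The function name and the exact v2 pass name are illustrative; only the ordering (fuse passes first, placement pass after) reflects the comment:

```python
# Sketch of the ordering described above: the scale_matmul and
# reshape_transpose_matmul(_v2) fuses run first so matmul inputs already
# carry their scales; only then does cpu_quantize_placement_pass mark ops
# for quantization. Running placement earlier left 12 matmuls unquantized
# because their input scale was still held by a separate scale op.
def quantize_fp32_graph(graph):
    graph = apply_pass(graph, 'scale_matmul_fuse_pass')
    graph = apply_pass(graph, 'reshape_transpose_matmul_mkldnn_fuse_pass')
    graph = apply_pass(graph, 'reshape_transpose_matmul_v2_mkldnn_fuse_pass')
    graph = apply_pass(graph, 'cpu_quantize_placement_pass')
    return graph
```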

@wozna (Contributor, Author) commented Jan 19, 2022

@lidanqing-intel @sfraczek Could you please review again?

@sfraczek (Contributor) left a comment

LGTM.

@lidanqing-intel (Contributor) left a comment

LGTM

@lidanqing-intel (Contributor) commented

@baoachun Could you please review this PR? Thanks

@wozna wozna requested a review from baoachun January 24, 2022 11:02
@lidanqing-intel (Contributor) commented

@baoachun Hi, could you please approve this PR? Some models from the BML team still need the save_quant_model.py solution for now, so we still need to maintain save_quant_model.py. Thanks

@baoachun (Contributor) left a comment

LGTM

@wozna wozna closed this Jan 26, 2022
@wozna wozna reopened this Jan 26, 2022
@lidanqing-intel lidanqing-intel merged commit 0e235e5 into PaddlePaddle:develop Jan 27, 2022
@wozna wozna deleted the update_passes branch February 24, 2023 16:06