[Paddle-TRT] support new quant format from slim #46022

Merged (17 commits) on Oct 10, 2022

@zhoutianzi666 zhoutianzi666 commented Sep 14, 2022

PR types

Others

PR changes

Others

Describe

  • Support the new quantization format from PaddleSlim, which inserts a QDQ (quantize/dequantize) pair before every op.
  • Move identity_scale_op_clean_pass to after the quantization passes.
    • When the pattern QDQ -> scale -> QDQ -> scale appears, identity_scale_op_clean_pass rewrites it to QDQ -> QDQ, which later triggers a bug in delete_quant_dequant_linear_op_pass.
    • So identity_scale_op_clean_pass must run after delete_quant_dequant_linear_op_pass.
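
To illustrate why the pass order matters, here is a toy sketch in plain Python. It is not Paddle code: the graph is modelled as a flat list of op names, and both functions are hypothetical stand-ins for the real passes. In particular, the PR does not describe how delete_quant_dequant_linear_op_pass fails on QDQ -> QDQ, so the toy version simply assumes it rejects a QDQ that is not followed by a regular op.

```python
QDQ = "qdq"


def identity_scale_clean(ops):
    """Toy stand-in for identity_scale_op_clean_pass: drop every 'scale' op.

    Assumption: all scale ops in this toy graph are identities.
    """
    return [op for op in ops if op != "scale"]


def delete_qdq(ops):
    """Toy stand-in for delete_quant_dequant_linear_op_pass.

    Each QDQ is folded into the op that follows it. Assumption: the pass
    cannot handle a QDQ followed by another QDQ, so it raises in that case.
    """
    out = []
    for i, op in enumerate(ops):
        if op != QDQ:
            out.append(op)
            continue
        nxt = ops[i + 1] if i + 1 < len(ops) else None
        if nxt is None or nxt == QDQ:
            raise ValueError(f"QDQ at index {i} is not followed by a regular op")
    return out


def run(passes, ops):
    """Apply the passes in order, feeding each result to the next pass."""
    for p in passes:
        ops = p(ops)
    return ops


graph = [QDQ, "scale", QDQ, "scale", QDQ, "conv2d"]

# Order used by this PR: remove QDQ first, then clean identity scales.
good = run([delete_qdq, identity_scale_clean], graph)

# Previous order: cleaning scales first leaves QDQ -> QDQ behind.
try:
    run([identity_scale_clean, delete_qdq], graph)
    bad_order_failed = False
except ValueError:
    bad_order_failed = True
```

In this toy model the first order reduces the graph to a single `conv2d`, while the second order produces back-to-back QDQ nodes and fails.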

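This PR adds support for the new quantization format (QDQ inserted before every op).
The main changes are:

  • Prevent the Scale weight shared between Q and DQ from being deleted.
  • matmul_v2 now supports matrix * vector, which appears in picodet.
    • This also adds support for vec * vec along the way.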


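  • Move "identity_scale_op_clean_pass" to after the quantization passes, so that no QDQ -> QDQ structure appears during quantization; that structure triggers a bug in delete_quant_dequant_linear_op_pass. The root cause is the pattern-matching strategy. This is the same problem as in PR 46178.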
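
For the matmul_v2 change listed above, the matrix * vector and vec * vec cases can be sketched as a shape rule. The helper below is hypothetical and is not the converter code from this PR; it assumes the NumPy-style matmul convention, in which a 1-D operand is temporarily promoted to 2-D and the added dimension is dropped from the result.

```python
def matmul_out_shape(x_shape, y_shape):
    """Return the output shape of x @ y under NumPy-style matmul rules.

    A 1-D x is treated as (1, K) and a 1-D y as (K, 1); the promoted
    dimension is removed from the output again.
    """
    x, y = list(x_shape), list(y_shape)
    if not x or not y:
        raise ValueError("scalars are not valid matmul operands")

    x_vec, y_vec = len(x) == 1, len(y) == 1
    if x_vec:
        x = [1] + x          # (K,) -> (1, K)
    if y_vec:
        y = y + [1]          # (K,) -> (K, 1)

    if x[-1] != y[-2]:
        raise ValueError(f"inner dimensions differ: {x[-1]} vs {y[-2]}")

    # Broadcast the leading (batch) dimensions.
    xb, yb = x[:-2], y[:-2]
    n = max(len(xb), len(yb))
    xb = [1] * (n - len(xb)) + xb
    yb = [1] * (n - len(yb)) + yb
    batch = []
    for a, b in zip(xb, yb):
        if a != b and 1 not in (a, b):
            raise ValueError(f"batch dimensions not broadcastable: {a} vs {b}")
        batch.append(max(a, b))

    # Keep only the dimensions that were not added by promotion.
    rows = [] if x_vec else [x[-2]]
    cols = [] if y_vec else [y[-1]]
    return batch + rows + cols


mat_vec = matmul_out_shape((4, 8), (8,))   # matrix * vector
vec_vec = matmul_out_shape((8,), (8,))     # vec * vec (dot product)
```

Under these assumptions, matrix * vector yields a 1-D result and vec * vec yields a scalar (empty shape), so a converter that only handles 2-D or higher operands would need to reshape around the multiplication.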

@zhoutianzi666 zhoutianzi666 changed the title [Paddle-TRT] support new quant format form slim [Paddle-TRT] support new quant format from slim Oct 8, 2022
@zhoutianzi666 zhoutianzi666 force-pushed the new_new_slim branch 3 times, most recently from 2a3950f to b7bd70e Compare October 9, 2022 01:56
@zhangjun zhangjun left a comment


LGTM

Labels
contributor External developers