[AMP]split minimize and add unscale_ for GradScaler #3897

Merged

@zhangbo9674 zhangbo9674 commented Sep 18, 2021

1. Split GradScaler::minimize() into GradScaler::step() + GradScaler::update()

GradScaler::minimize():

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)
    scaled.backward()            
    scaler.minimize(optimizer, scaled)
    optimizer.clear_grad()

GradScaler::step() + GradScaler::update():

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)
    scaled.backward() 
    scaler.step(optimizer)
    scaler.update()
    optimizer.clear_grad()
  • minimize() and step()+update() are two ways of applying the parameter update under AMP. With Paddle 2.0, we recommend step()+update().

  • If the optimizer is a Paddle 1.0 optimizer, only minimize() can be used.
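
The relationship between the two call styles can be pictured with a toy, pure-Python model. This is only a sketch and not Paddle's implementation: `ToyOptimizer`, `StepUpdateScaler`, `run`, and the "halve the scale after an overflow" policy are hypothetical names and assumptions made for illustration. It assumes that step() applies the optimizer update only when the unscaled gradients are finite, and that update() adjusts the loss-scaling factor afterwards.

```python
import math


class ToyOptimizer:
    """Hypothetical stand-in for an optimizer: plain SGD on a list of floats."""

    def __init__(self, params, lr=0.1):
        self.params = params
        self.grads = [0.0] * len(params)
        self.lr = lr

    def step(self):
        for i, g in enumerate(self.grads):
            self.params[i] -= self.lr * g


class StepUpdateScaler:
    """Hypothetical loss scaler illustrating the step()/update() split."""

    def __init__(self, init_loss_scaling=1024.0):
        self.scale_value = init_loss_scaling
        self.found_inf = False

    def step(self, optimizer):
        # Unscale the gradients, then skip the update if any is inf/nan.
        optimizer.grads = [g / self.scale_value for g in optimizer.grads]
        self.found_inf = any(not math.isfinite(g) for g in optimizer.grads)
        if not self.found_inf:
            optimizer.step()

    def update(self):
        # Assumed policy for this sketch: halve the scale after an overflow.
        if self.found_inf:
            self.scale_value *= 0.5
        self.found_inf = False

    def minimize(self, optimizer):
        # In this toy model the one-call form is both halves in sequence.
        self.step(optimizer)
        self.update()


def run(use_minimize):
    """Run one iteration and return (parameter, loss scale)."""
    opt = ToyOptimizer([1.0])
    scaler = StepUpdateScaler(init_loss_scaling=1024.0)
    opt.grads = [2.0 * scaler.scale_value]  # gradient of a scaled loss
    if use_minimize:
        scaler.minimize(opt)
    else:
        scaler.step(opt)
        scaler.update()
    return opt.params[0], scaler.scale_value
```

In this sketch `run(True)` and `run(False)` return the same pair, which is the sense in which step()+update() replaces minimize(). Splitting the call leaves room to do other work between the two halves.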

2. Add GradScaler::unscale_(optimizer):

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)
    scaled.backward() 
    scaler.unscale_(optimizer)   # called on the scaler, not on the scaled loss
    scaler.step(optimizer)
    scaler.update()
    optimizer.clear_grad()
  • This API unscales the parameter gradients, i.e. it multiplies them by 1/(loss scaling ratio).
  • If unscale_() has not been called, minimize() or step() calls it internally; if it has already been called, the unscaling is not repeated.
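
The "unscale at most once" behaviour described above can be sketched in plain Python. This is a hypothetical toy, not Paddle's code: `UnscaleOnceScaler`, its `unscaled` flag, and the list-based `step(grads, params, lr)` signature are assumptions chosen to keep the example self-contained.

```python
class UnscaleOnceScaler:
    """Hypothetical scaler that remembers whether gradients were unscaled."""

    def __init__(self, init_loss_scaling=1024.0):
        self.scale_value = init_loss_scaling
        self.unscaled = False  # cleared by update() at the end of an iteration

    def unscale_(self, grads):
        # Multiply each gradient in place by 1 / (loss scaling ratio),
        # but only if this has not already happened in this iteration.
        if self.unscaled:
            return
        inv = 1.0 / self.scale_value
        for i in range(len(grads)):
            grads[i] *= inv
        self.unscaled = True

    def step(self, grads, params, lr=0.5):
        # step() unscales only when the caller has not done so already.
        self.unscale_(grads)
        for i, g in enumerate(grads):
            params[i] -= lr * g

    def update(self):
        self.unscaled = False


scaler = UnscaleOnceScaler(init_loss_scaling=1024.0)
grads = [2048.0, -512.0]   # gradients of the scaled loss
params = [1.0, 1.0]

scaler.unscale_(grads)     # grads is now [2.0, -0.5]
scaler.step(grads, params) # no second unscale; params is now [0.0, 1.25]
scaler.update()
```

Calling `scaler.step(...)` without the explicit `unscale_` call gives the same parameters here, since step() performs the unscaling itself in that case.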

3. Chinese documentation preview:

GradScaler:

[image]

step+update:

[image]

unscale_:

[image]

@paddle-bot-old

Thanks for your contribution!

@zhangbo9674 zhangbo9674 changed the title [AMP]splic minimize and add unscale_ for GradScaler [AMP]split minimize and add unscale_ for GradScaler Sep 22, 2021

@zhiqiu zhiqiu left a comment

LGTM

@TCChenlong TCChenlong merged commit 34b11f5 into PaddlePaddle:develop Sep 28, 2021
zhangbo9674 added a commit to zhangbo9674/docs that referenced this pull request Sep 29, 2021
* add GradScaler comment

* add GradScaler comment

* fix doc style

fix doc style

* refine style

Co-authored-by: Chen Long <1300851984@qq.com>
TCChenlong added a commit that referenced this pull request Sep 29, 2021
* add GradScaler comment

* add GradScaler comment

* fix doc style

fix doc style

* refine style

Co-authored-by: Chen Long <1300851984@qq.com>
