[AMP]split minimize and add unscale_ for GradScaler #3897

zhangbo9674 · 2021-09-18T11:58:10Z

1. Split `GradScaler::minimize()` into `GradScaler::step()` + `GradScaler::update()`:

GradScaler::minimize():

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)
    scaled.backward()            
    scaler.minimize(optimizer, scaled)
    optimizer.clear_grad()

GradScaler::step() + GradScaler::update():

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)
    scaled.backward() 
    scaler.step(optimizer)
    scaler.update()
    optimizer.clear_grad()

`minimize()` and `step()` + `update()` are the two ways to apply parameter gradient updates under AMP. For Paddle 2.0 we recommend `step()` + `update()`.
If the optimizer is a Paddle 1.0 optimizer, only `minimize()` can be used.
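To make the relationship between the two call styles concrete, here is a minimal pure-Python sketch. `ToyOptimizer` and `ToyScaler` are hypothetical stand-ins written for illustration, not Paddle's classes. The overflow policy (halve the scale after a non-finite gradient) is an assumption of this toy and does not reflect Paddle's defaults. It only shows that a combined `minimize()` can be read as `step()` followed by `update()`.

```python
import math


class ToyOptimizer:
    """Hypothetical stand-in for an optimizer: plain SGD on a single float."""

    def __init__(self, param, lr=0.1):
        self.param = param
        self.grad = 0.0
        self.lr = lr

    def step(self):
        self.param -= self.lr * self.grad

    def clear_grad(self):
        self.grad = 0.0


class ToyScaler:
    """Illustrative loss scaler; NOT Paddle's implementation."""

    def __init__(self, init_loss_scaling=1024.0):
        self.loss_scaling = init_loss_scaling
        self.found_inf = False

    def scale(self, value):
        return value * self.loss_scaling

    def step(self, optimizer):
        # Undo the loss scaling, then skip the update if the gradient overflowed.
        optimizer.grad /= self.loss_scaling
        self.found_inf = not math.isfinite(optimizer.grad)
        if not self.found_inf:
            optimizer.step()

    def update(self):
        # Assumed policy for this toy: halve the scale after an overflow.
        if self.found_inf:
            self.loss_scaling /= 2.0
        self.found_inf = False

    def minimize(self, optimizer):
        # The combined call is just the two halves run back to back.
        self.step(optimizer)
        self.update()


def train(use_minimize, steps=20):
    """Minimize (w - 3)^2 with either call style and return the final w."""
    opt = ToyOptimizer(param=0.0)
    scaler = ToyScaler()
    for _ in range(steps):
        # Gradient of the scaled loss: scale * d/dw (w - 3)^2.
        opt.grad = scaler.scale(2.0 * (opt.param - 3.0))
        if use_minimize:
            scaler.minimize(opt)
        else:
            scaler.step(opt)
            scaler.update()
        opt.clear_grad()
    return opt.param


print(train(use_minimize=True) == train(use_minimize=False))  # prints True
```

In this toy, both paths perform the same arithmetic in the same order, so they end at the same parameter value. Splitting the call simply gives the caller a point between `step()` and `update()` at which to run their own logic.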

2. Add `GradScaler::unscale_(optimizer)`:

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)
    scaled.backward() 
    scaler.unscale_(optimizer)
    scaler.step(optimizer)
    scaler.update()
    optimizer.clear_grad()

This API unscales the parameter gradients, i.e. it multiplies them by 1/(loss scaling ratio).
If `unscale_()` has not been called explicitly, `minimize()` or `step()` calls it internally; if it has already been called, the unscaling is not repeated.
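The "call once, never twice" behaviour can be sketched in a few lines of plain Python. This is an illustrative toy, not Paddle's code. The `_unscaled` flag, the list-of-floats gradients, and resetting the flag in `update()` are all assumptions made for the example.

```python
class ToyScaler:
    """Illustrative only; NOT Paddle's implementation."""

    def __init__(self, loss_scaling=1024.0):
        self.loss_scaling = loss_scaling
        self._unscaled = False  # hypothetical per-iteration flag

    def unscale_(self, grads):
        # Multiply every gradient by 1 / loss_scaling, at most once per iteration.
        if self._unscaled:
            return
        inv_scale = 1.0 / self.loss_scaling
        for i in range(len(grads)):
            grads[i] *= inv_scale
        self._unscaled = True

    def step(self, grads):
        # Unscales only if the caller has not already done so.
        self.unscale_(grads)
        return list(grads)  # the gradients an optimizer would apply

    def update(self):
        # Assumed: the flag is cleared here so the next iteration starts fresh.
        self._unscaled = False


# Explicit unscale_ followed by step(): gradients are divided by 1024 once.
scaler = ToyScaler()
grads = [2048.0, -512.0]
scaler.unscale_(grads)
applied_explicit = scaler.step(grads)
scaler.update()

# step() alone: the unscaling happens implicitly, with the same result.
grads = [2048.0, -512.0]
applied_implicit = scaler.step(grads)
scaler.update()

print(applied_explicit, applied_implicit)
```

Calling `unscale_()` explicitly is useful when the true (unscaled) gradient values are needed before the optimizer step, for example to inspect them. The guard ensures that doing so does not change the update that `step()` applies.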

3. Chinese documentation preview:

GradScaler:

step+update:

unscale_

paddle-bot-old · 2021-09-18T11:58:14Z

Thanks for your contribution!

zhiqiu

LGTM

* add GradScaler comment * add GradScaler comment * fix doc style fix doc style * refine style Co-authored-by: Chen Long <1300851984@qq.com>

zhangbo9674 added 2 commits September 18, 2021 11:52

add GradScaler comment

e685bb3

add GradScaler comment

8333d1c

zhangbo9674 changed the title ~~[AMP]splic minimize and add unscale_ for GradScaler~~ [AMP]split minimize and add unscale_ for GradScaler Sep 22, 2021

zhangbo9674 mentioned this pull request Sep 22, 2021

[AMP]split minimize and add unscale_ for GradScaler PaddlePaddle/Paddle#35825

Merged

zhiqiu approved these changes Sep 22, 2021

zhangbo9674 mentioned this pull request Sep 22, 2021

[cherry pick]split minimize and add unscale_ for GradScaler PaddlePaddle/Paddle#35927

Merged

TCChenlong and others added 4 commits September 23, 2021 22:26

fix doc style

ba18712

fix doc style

refine style

c5a29bb

Merge branch 'develop' of https://github.com/PaddlePaddle/docs into d…

ba1b973

…ev/splic-minimize_and_unscale

merge

fa940ad

TCChenlong approved these changes Sep 28, 2021

TCChenlong merged commit 34b11f5 into PaddlePaddle:develop Sep 28, 2021
