
[AMP]split minimize and add unscale_ for GradScaler #35825

Merged 11 commits into PaddlePaddle:develop from zhangbo9674:dev/split-minimize on Sep 22, 2021

Conversation

@zhangbo9674 (Contributor) commented Sep 17, 2021

PR types

New features

PR changes

APIs

Describe

1. Split GradScaler::minimize() into GradScaler::step() + GradScaler::update()

GradScaler::minimize():

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)
    scaled.backward()            
    scaler.minimize(optimizer, scaled)
    optimizer.clear_grad()

GradScaler::step() + GradScaler::update():

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)
    scaled.backward() 
    scaler.step(optimizer)
    scaler.update()
    optimizer.clear_grad()
  • minimize() and step()+update() are the two ways of applying parameter gradient updates under AMP. In Paddle 2.0 we recommend step()+update() (a fuller runnable sketch follows below).

  • If the optimizer comes from the Paddle 1.x API, only minimize() can be used.
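
For context, here is a fuller, self-contained sketch of the step()+update() flow. The toy model, optimizer, and random data are illustrative assumptions, not part of this PR:

    import paddle

    # Illustrative setup (assumption, not part of this PR): toy linear model + SGD.
    model = paddle.nn.Linear(10, 10)
    mse = paddle.nn.MSELoss()
    optimizer = paddle.optimizer.SGD(learning_rate=0.01, parameters=model.parameters())
    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    data = paddle.rand([4, 10])
    label = paddle.rand([4, 10])

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)   # multiply the loss by the loss scaling ratio
    scaled.backward()             # gradients are computed in scaled form
    scaler.step(optimizer)        # unscale the gradients, then apply the optimizer update
    scaler.update()               # adjust the loss scaling ratio for the next iteration
    optimizer.clear_grad()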

2. Add GradScaler::unscale_(optimizer):

    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        output = model(data)
        loss = mse(output, label)

    scaled = scaler.scale(loss)
    scaled.backward() 
    scaler.unscale_(optimizer)
    scaler.step(optimizer)
    scaler.update()
    optimizer.clear_grad()
  • This API unscales the gradients of the parameters, i.e. multiplies them by 1/(loss scaling ratio).
  • If unscale_ has not been called, minimize() or step() will call it internally; if it has, the unscaling is not repeated (a sketch of one use of unscale_ follows below).
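
One motivating use of unscale_ is inspecting or clipping gradients at their true magnitude before the optimizer step. Below is a minimal sketch, assuming a recent Paddle where param.grad is a Tensor; the manual by-norm clipping and the max_norm threshold are illustrative assumptions, not APIs added by this PR:

    import paddle

    # Illustrative setup (assumption, not part of this PR).
    model = paddle.nn.Linear(10, 10)
    optimizer = paddle.optimizer.SGD(learning_rate=0.01, parameters=model.parameters())
    scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

    with paddle.amp.auto_cast():
        loss = model(paddle.rand([4, 10])).mean()

    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)      # gradients now hold their true (unscaled) values

    max_norm = 1.0                  # assumed clipping threshold
    for p in model.parameters():
        if p.grad is not None:
            norm = float(paddle.linalg.norm(p.grad))
            if norm > max_norm:
                p.grad.scale_(max_norm / norm)   # in-place rescale of the gradient

    scaler.step(optimizer)          # step() sees unscale_ was already called and skips it
    scaler.update()
    optimizer.clear_grad()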

3. Docs review:

GradScaler:
[screenshot]
step+update:
[screenshot]
unscale_:
[screenshot]
Chinese documentation PR: PaddlePaddle/docs#3897

@paddle-bot-old commented:
Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@zhangbo9674 zhangbo9674 changed the title Dev/split minimize and unscale [AMP]split minimize and unscale Sep 17, 2021
@zhangbo9674 zhangbo9674 changed the title [AMP]split minimize and unscale [AMP]split minimize and add unscale_ for GradScaler Sep 22, 2021
@zhiqiu (Contributor) left a comment:
LGTM

@zhiqiu zhiqiu merged commit bf6f0e5 into PaddlePaddle:develop Sep 22, 2021
zhangbo9674 added a commit to zhangbo9674/Paddle that referenced this pull request Sep 22, 2021
* split minimize() to step() + update()

* add unscale and step for grad_scaler

* add unittest

* refine code in minimize

* delete step in loss_scaler

* fix example bug

* refine comment

* refine unittest

* add unittest
AnnaTrainingG pushed a commit to AnnaTrainingG/Paddle that referenced this pull request Sep 29, 2021
@zhangbo9674 zhangbo9674 deleted the dev/split-minimize branch March 2, 2023 02:56