[Kunlun] cherry-pick:fix multi xpu dygraph hang, test=kunlun #32696

Merged (1 commit) on May 1, 2021

Conversation

@vslyu (Contributor) commented Apr 29, 2021

PR types

Bug fixes

PR changes

Others

Describe

Cherry-pick of #32662.
Fixes a hang when running dygraph (dynamic graph) training on multiple Kunlun XPU cards.
Reference: PaddlePaddle/PaddleClas#690

python3.7 -m paddle.distributed.launch --xpus=2,3 --log_dir log tools/train.py -c ./configs/quick_start/ResNet50_vd_finetune_kunlun.yaml
2021-04-29 08:21:19,988 - INFO - epoch:19 , train step:0   , top1: 0.93750, top5: 1.00000, loss: 0.92145, lr: 0.000023, batch_cost: 0.96966 s, reader_cost: 0.23445 s, ips: 33.00113 images/sec, eta: 0:00:14
2021-04-29 08:21:26,911 - INFO - epoch:19 , train step:10  , top1: 0.93750, top5: 1.00000, loss: 1.03136, lr: 0.000003, batch_cost: 0.69550 s, reader_cost: 0.00019 s, ips: 46.01037 images/sec, eta: 0:00:03
2021-04-29 08:21:29,695 - INFO - END epoch:19  train top1: 0.92292, top5: 0.98542, loss: 0.95281,  batch_cost: 0.69007 s, reader_cost: 0.00025 s, batch_cost_sum: 3.45033 s, ips: 46.37236 images/sec.
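
One way to confirm from a log like the one above that a multi-card run reached the end of an epoch rather than hanging is to look for the `END epoch` line and collect the per-step throughput. The following is a minimal sketch, assuming the log format is exactly as shown in the three lines above; `summarize`, `STEP_RE`, and `END_RE` are hypothetical names for illustration and are not part of Paddle or PaddleClas.

```python
import re

# Sample lines copied from the training log in this PR description.
LOG = """\
2021-04-29 08:21:19,988 - INFO - epoch:19 , train step:0   , top1: 0.93750, top5: 1.00000, loss: 0.92145, lr: 0.000023, batch_cost: 0.96966 s, reader_cost: 0.23445 s, ips: 33.00113 images/sec, eta: 0:00:14
2021-04-29 08:21:26,911 - INFO - epoch:19 , train step:10  , top1: 0.93750, top5: 1.00000, loss: 1.03136, lr: 0.000003, batch_cost: 0.69550 s, reader_cost: 0.00019 s, ips: 46.01037 images/sec, eta: 0:00:03
2021-04-29 08:21:29,695 - INFO - END epoch:19  train top1: 0.92292, top5: 0.98542, loss: 0.95281,  batch_cost: 0.69007 s, reader_cost: 0.00025 s, batch_cost_sum: 3.45033 s, ips: 46.37236 images/sec.
"""

# Assumed patterns, derived only from the sample lines above.
STEP_RE = re.compile(
    r"epoch:(\d+)\s*,\s*train step:(\d+)\s*,.*?ips:\s*([\d.]+) images/sec"
)
END_RE = re.compile(r"END epoch:(\d+)\s")


def summarize(log_text):
    """Return ([(epoch, step, ips), ...], {finished epochs}) for a log string."""
    steps = []
    finished = set()
    for line in log_text.splitlines():
        end = END_RE.search(line)
        if end:
            # An END line means the epoch completed on this worker.
            finished.add(int(end.group(1)))
            continue
        match = STEP_RE.search(line)
        if match:
            epoch, step, ips = match.groups()
            steps.append((int(epoch), int(step), float(ips)))
    return steps, finished


if __name__ == "__main__":
    steps, finished = summarize(LOG)
    for epoch, step, ips in steps:
        print(f"epoch {epoch} step {step}: {ips} images/sec")
    print("finished epochs:", sorted(finished))
```

An epoch that has step lines but no matching `END epoch` line in any worker's log under `--log_dir` would be a hint, not proof, that the run stalled.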

@paddle-bot-old

Thanks for your contribution!
Please wait for the CI results first. See Paddle CI Manual for details.

@fuyinno4 fuyinno4 merged commit 2c1ed9b into PaddlePaddle:release/2.1 May 1, 2021

2 participants