
[WIP] Accelerate training by replacing DataContainer object scatter #1236

Closed
wants to merge 3 commits

Conversation

hhaAndroid
Contributor

@hhaAndroid hhaAndroid commented Aug 2, 2021

Motivation

While reproducing YOLOX, we found that the scatter step for DataContainer objects significantly increases training time. In practice, replacing the custom scatter reduced training time by about half.

Modification

Replace the custom scatter of the DataContainer object with PyTorch's built-in scatter.
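For context, the custom scatter walks each DataContainer in the batch and hands each device a slice of its payload. A minimal CPU-only sketch of that idea (`MiniDataContainer` and `scatter_slices` are illustrative stand-ins, not mmcv's actual API):

```python
class MiniDataContainer:
    """Toy stand-in for mmcv.parallel.DataContainer."""

    def __init__(self, data, stack=False):
        self.data = data
        self.stack = stack


def scatter_slices(container, num_devices):
    """Split the container's list payload into one chunk per device.

    Mirrors the core idea of the custom scatter: each device receives
    a contiguous slice of the per-sample items.
    """
    items = container.data
    # Ceiling division so the last device absorbs any remainder.
    chunk = (len(items) + num_devices - 1) // num_devices
    return [items[i * chunk:(i + 1) * chunk] for i in range(num_devices)]


batch = MiniDataContainer(data=[f"sample_{i}" for i in range(8)])
chunks = scatter_slices(batch, num_devices=4)
# Each of the 4 "devices" receives 2 samples.
```

The real implementation in `mmcv/parallel/_functions.py` additionally moves tensor payloads to the target GPUs; the PR's point is that PyTorch's own scatter already covers this case more efficiently.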

BC-breaking

None

Use cases

I have only tested MMDetection; the other frameworks have not been tested yet.

Note

I still need to run experiments comparing training and inference speed.
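One simple pattern for that speed comparison: wrap each scatter variant in a zero-argument callable and time it over repeated calls. The `time_fn` helper below is illustrative, not part of mmcv:

```python
import time


def time_fn(fn, repeats=100):
    """Return the mean wall-clock seconds of `fn` over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


# Usage: compare two implementations of the same operation, e.g. the
# custom DataContainer scatter vs. PyTorch's scatter, each wrapped in
# a zero-argument callable that scatters the same batch.
baseline = time_fn(lambda: sorted(range(1000)))  # placeholder workload
```

For GPU code, remember to synchronize (`torch.cuda.synchronize()`) inside the callable before timing, otherwise asynchronous kernel launches make the numbers meaningless.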

@hhaAndroid hhaAndroid changed the title Speed up DataContainer object scatter Accelerate training by replacing DataContainer object scatter Aug 2, 2021
@codecov

codecov bot commented Aug 2, 2021

Codecov Report

Merging #1236 (586e394) into master (285a052) will increase coverage by 0.02%.
The diff coverage is 33.33%.


@@            Coverage Diff             @@
##           master    #1236      +/-   ##
==========================================
+ Coverage   68.27%   68.29%   +0.02%     
==========================================
  Files         160      160              
  Lines       10599    10597       -2     
  Branches     1937     1937              
==========================================
+ Hits         7236     7237       +1     
+ Misses       2979     2976       -3     
  Partials      384      384              
Flag Coverage Δ
unittests 68.29% <33.33%> (+0.02%) ⬆️


Impacted Files Coverage Δ
mmcv/parallel/_functions.py 16.32% <33.33%> (+2.60%) ⬆️


@zhouzaida zhouzaida requested a review from hellock August 4, 2021 12:41
@zhouzaida zhouzaida mentioned this pull request Aug 4, 2021
@kennymckormick
Member

What's your use case? Does this only work for stack=False DataContainers?

@hhaAndroid
Contributor Author

What's your use case? Does this only work for stack=False DataContainers?

The behavior is exactly the same as before; the custom scatter is simply replaced with PyTorch's own scatter.

@hhaAndroid hhaAndroid changed the title Accelerate training by replacing DataContainer object scatter [WIP] Accelerate training by replacing DataContainer object scatter Aug 5, 2021
@ZCMax
Contributor

ZCMax commented Aug 10, 2021

I tried this PR in MMDetection3D with PointPillars:

Environment:
configs: hv_pointpillars_secfpn_6x8_160e_kitti-3d-car.py
mmcv_version: 1.3.8
GPUs: 4 V100

original implementations:

2021-08-09 23:18:34,975 - mmdet - INFO - Epoch [21][50/310] lr: 7.286e-03, eta: 3:40:21, time: 1.751, data_time: 0.679, memory: 5405, loss_cls: 0.1120, loss_bbox: 0.2178, loss_dir: 0.0244, loss: 0.3542, grad_norm: 0.7251
2021-08-09 23:19:50,129 - mmdet - INFO - Epoch [21][100/310] lr: 7.352e-03, eta: 3:41:41, time: 1.502, data_time: 0.229, memory: 5405, loss_cls: 0.1055, loss_bbox: 0.2167, loss_dir: 0.0245, loss: 0.3466, grad_norm: 0.8071
2021-08-09 23:20:10,814 - mmdet - INFO - Epoch [21][150/310] lr: 7.416e-03, eta: 3:40:21, time: 0.415, data_time: 0.022, memory: 5405, loss_cls: 0.1051, loss_bbox: 0.2159, loss_dir: 0.0231, loss: 0.3442, grad_norm: 0.7746
2021-08-09 23:20:52,095 - mmdet - INFO - Epoch [21][200/310] lr: 7.480e-03, eta: 3:40:01, time: 0.825, data_time: 0.259, memory: 5405, loss_cls: 0.1044, loss_bbox: 0.2087, loss_dir: 0.0224, loss: 0.3355, grad_norm: 0.6906
2021-08-09 23:21:08,705 - mmdet - INFO - Epoch [21][250/310] lr: 7.544e-03, eta: 3:38:30, time: 0.333, data_time: 0.017, memory: 5405, loss_cls: 0.1081, loss_bbox: 0.2156, loss_dir: 0.0216, loss: 0.3453, grad_norm: 0.7769
2021-08-09 23:21:24,320 - mmdet - INFO - Epoch [21][300/310] lr: 7.607e-03, eta: 3:36:58, time: 0.312, data_time: 0.028, memory: 5405, loss_cls: 0.1027, loss_bbox: 0.2125, loss_dir: 0.0217, loss: 0.3369, grad_norm: 0.7366
2021-08-09 23:21:27,592 - mmdet - INFO - Saving checkpoint at 21 epochs
2021-08-09 23:22:37,692 - mmdet - INFO - Epoch [22][50/310] lr: 7.683e-03, eta: 3:37:29, time: 1.387, data_time: 0.943, memory: 5405, loss_cls: 0.1130, loss_bbox: 0.2207, loss_dir: 0.0233, loss: 0.3570, grad_norm: 0.7560
2021-08-09 23:23:12,504 - mmdet - INFO - Epoch [22][100/310] lr: 7.745e-03, eta: 3:36:51, time: 0.696, data_time: 0.022, memory: 5418, loss_cls: 0.1025, loss_bbox: 0.2089, loss_dir: 0.0235, loss: 0.3349, grad_norm: 0.7138
2021-08-09 23:23:28,139 - mmdet - INFO - Epoch [22][150/310] lr: 7.806e-03, eta: 3:35:20, time: 0.314, data_time: 0.023, memory: 5418, loss_cls: 0.1057, loss_bbox: 0.2157, loss_dir: 0.0238, loss: 0.3453, grad_norm: 0.6475
2021-08-09 23:23:59,514 - mmdet - INFO - Epoch [22][200/310] lr: 7.867e-03, eta: 3:34:33, time: 0.626, data_time: 0.220, memory: 5418, loss_cls: 0.1049, loss_bbox: 0.2132, loss_dir: 0.0217, loss: 0.3398, grad_norm: 0.7137
2021-08-09 23:24:13,722 - mmdet - INFO - Epoch [22][250/310] lr: 7.927e-03, eta: 3:33:01, time: 0.285, data_time: 0.020, memory: 5418, loss_cls: 0.1047, loss_bbox: 0.2112, loss_dir: 0.0213, loss: 0.3373, grad_norm: 0.6982
2021-08-09 23:24:36,918 - mmdet - INFO - Epoch [22][300/310] lr: 7.987e-03, eta: 3:31:53, time: 0.464, data_time: 0.023, memory: 5418, loss_cls: 0.0999, loss_bbox: 0.2078, loss_dir: 0.0231, loss: 0.3307, grad_norm: 0.7557
2021-08-09 23:24:40,149 - mmdet - INFO - Saving checkpoint at 22 epochs
2021-08-09 23:25:24,172 - mmdet - INFO -
Car AP@0.70, 0.70, 0.70:
bbox AP:89.4112, 83.4955, 79.2196
bev AP:89.8205, 79.7134, 79.2923
3d AP:70.0940, 61.0183, 56.1177
aos AP:89.25, 82.82, 78.41
Car AP@0.70, 0.50, 0.50:
bbox AP:89.4112, 83.4955, 79.2196
bev AP:90.5806, 88.2620, 87.3935
3d AP:90.4840, 87.6858, 85.4203
aos AP:89.25, 82.82, 78.41

using this PR:

2021-08-10 01:44:01,949 - mmdet - INFO - Exp name: hv_pointpillars_secfpn_6x8_160e_kitti-3d-car.py
2021-08-10 01:44:01,950 - mmdet - INFO - Epoch(val) [20][943] KITTI/Car_3D_easy_strict: 71.4129, KITTI/Car_BEV_easy_strict: 89.4457, KITTI/Car_2D_easy_strict: 89.4210, KITTI/Car_3D_moderate_strict: 63.6602, KITTI/Car_BEV_moderate_strict: 83.4318, KITTI/Car_2D_moderate_strict: 85.5265, KITTI/Car_3D_hard_strict: 61.5196, KITTI/Car_BEV_hard_strict: 79.2837, KITTI/Car_2D_hard_strict: 79.6540, KITTI/Car_3D_easy_loose: 94.7463, KITTI/Car_BEV_easy_loose: 94.8973, KITTI/Car_2D_easy_loose: 89.4210, KITTI/Car_3D_moderate_loose: 88.8798, KITTI/Car_BEV_moderate_loose: 89.1970, KITTI/Car_2D_moderate_loose: 85.5265, KITTI/Car_3D_hard_loose: 87.9514, KITTI/Car_BEV_hard_loose: 88.6405, KITTI/Car_2D_hard_loose: 79.6540
2021-08-10 01:44:48,076 - mmdet - INFO - Epoch [21][50/310] lr: 7.286e-03, eta: 3:21:12, time: 0.902, data_time: 0.469, memory: 5405, loss_cls: 0.1121, loss_bbox: 0.2190, loss_dir: 0.0238, loss: 0.3548, grad_norm: 0.6942
2021-08-10 01:45:07,487 - mmdet - INFO - Epoch [21][100/310] lr: 7.352e-03, eta: 3:20:01, time: 0.388, data_time: 0.017, memory: 5405, loss_cls: 0.1060, loss_bbox: 0.2160, loss_dir: 0.0239, loss: 0.3460, grad_norm: 0.7869
2021-08-10 01:45:27,478 - mmdet - INFO - Epoch [21][150/310] lr: 7.416e-03, eta: 3:18:52, time: 0.400, data_time: 0.020, memory: 5405, loss_cls: 0.1057, loss_bbox: 0.2148, loss_dir: 0.0242, loss: 0.3447, grad_norm: 0.7293
2021-08-10 01:45:41,673 - mmdet - INFO - Epoch [21][200/310] lr: 7.480e-03, eta: 3:17:28, time: 0.284, data_time: 0.023, memory: 5405, loss_cls: 0.1050, loss_bbox: 0.2093, loss_dir: 0.0228, loss: 0.3370, grad_norm: 0.6972
2021-08-10 01:45:57,756 - mmdet - INFO - Epoch [21][250/310] lr: 7.544e-03, eta: 3:16:09, time: 0.321, data_time: 0.033, memory: 5405, loss_cls: 0.1088, loss_bbox: 0.2153, loss_dir: 0.0218, loss: 0.3458, grad_norm: 0.7547
2021-08-10 01:46:16,487 - mmdet - INFO - Epoch [21][300/310] lr: 7.607e-03, eta: 3:15:00, time: 0.375, data_time: 0.022, memory: 5405, loss_cls: 0.1016, loss_bbox: 0.2119, loss_dir: 0.0226, loss: 0.3361, grad_norm: 0.7507
2021-08-10 01:47:32,780 - mmdet - INFO - Epoch [22][50/310] lr: 7.683e-03, eta: 3:15:58, time: 1.464, data_time: 0.456, memory: 5405, loss_cls: 0.1135, loss_bbox: 0.2205, loss_dir: 0.0239, loss: 0.3578, grad_norm: 0.7754
2021-08-10 01:48:45,143 - mmdet - INFO - Epoch [22][100/310] lr: 7.745e-03, eta: 3:17:17, time: 1.447, data_time: 0.017, memory: 5418, loss_cls: 0.1020, loss_bbox: 0.2094, loss_dir: 0.0227, loss: 0.3341, grad_norm: 0.6982
2021-08-10 01:49:02,153 - mmdet - INFO - Epoch [22][150/310] lr: 7.806e-03, eta: 3:16:02, time: 0.339, data_time: 0.022, memory: 5418, loss_cls: 0.1056, loss_bbox: 0.2156, loss_dir: 0.0243, loss: 0.3455, grad_norm: 0.5944
2021-08-10 01:49:43,651 - mmdet - INFO - Epoch [22][200/310] lr: 7.867e-03, eta: 3:15:54, time: 0.830, data_time: 0.452, memory: 5418, loss_cls: 0.1060, loss_bbox: 0.2176, loss_dir: 0.0224, loss: 0.3459, grad_norm: 0.7582
2021-08-10 01:50:06,232 - mmdet - INFO - Epoch [22][250/310] lr: 7.927e-03, eta: 3:14:55, time: 0.451, data_time: 0.019, memory: 5418, loss_cls: 0.1046, loss_bbox: 0.2113, loss_dir: 0.0222, loss: 0.3382, grad_norm: 0.7089
2021-08-10 01:50:21,506 - mmdet - INFO - Epoch [22][300/310] lr: 7.987e-03, eta: 3:13:37, time: 0.307, data_time: 0.023, memory: 5418, loss_cls: 0.0999, loss_bbox: 0.2043, loss_dir: 0.0223, loss: 0.3264, grad_norm: 0.6924
2021-08-10 01:51:12,615 - mmdet - INFO -
Car AP@0.70, 0.70, 0.70:
bbox AP:89.6106, 83.8629, 79.3142
bev AP:89.9098, 80.0422, 79.3679
3d AP:72.0378, 62.7379, 56.7960
aos AP:89.37, 83.15, 78.34
Car AP@0.70, 0.50, 0.50:
bbox AP:89.6106, 83.8629, 79.3142
bev AP:90.5960, 88.8071, 87.5059
3d AP:90.5378, 88.3073, 85.5774
aos AP:89.37, 83.15, 78.34

With this PR, training appears faster and accuracy slightly higher.

@zhouzaida zhouzaida mentioned this pull request Aug 13, 2021
@hhaAndroid
Contributor Author

We found that this modification is not needed for now, so the PR is closed. If there are new developments later, it will be reopened.

@hhaAndroid hhaAndroid closed this Aug 16, 2021
4 participants