[Paper Reproduction] FLOPs calculation fails after training finishes in dynamic graph mode #38253

Closed
ETTR123 opened this issue Dec 17, 2021 · 13 comments

@ETTR123

ETTR123 commented Dec 17, 2021

1) PaddlePaddle version: 2.2.1
PR: https://github.com/PaddlePaddle/PaddleSeg/pull/1623

Training command:

python PaddleSeg/train.py --config PaddleSeg/configs/enet/enet_cityscapes_1024x512_adam_0.002_80k.yml

 W1218 13:14:42.893093   301 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1

W1218 13:14:42.893141 301 device_context.cc:465] device: 0, cuDNN Version: 7.6.
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:653: UserWarning: When training, we now always track global mean and variance.
"When training, we now always track global mean and variance.")
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/math_op_patch.py:253: UserWarning: The dtype of left and right variables are not the same, left dtype is paddle.float32, but right dtype is paddle.int64, the right dtype will convert to paddle.float32
format(lhs_dtype, rhs_dtype, lhs_dtype))
2021-12-18 13:14:58 [INFO] [TRAIN] epoch: 1, iter: 10/10, loss: 2.2953, lr: 0.001259, batch_cost: 1.1241, reader_cost: 0.85701, ips: 7.1169 samples/sec | ETA 00:00:00
<class 'paddle.nn.layer.conv.Conv2D'>'s flops has been counted
Cannot find suitable count function for <class 'paddle.nn.layer.pooling.MaxPool2D'>. Treat it as zero FLOPs.
<class 'paddle.nn.layer.norm.BatchNorm2D'>'s flops has been counted
Cannot find suitable count function for <class 'paddle.nn.layer.activation.PReLU'>. Treat it as zero FLOPs.
Cannot find suitable count function for <class 'paddle.nn.layer.common.Dropout2D'>. Treat it as zero FLOPs.
<class 'paddle.nn.layer.activation.ReLU'>'s flops has been counted
<class 'paddle.nn.layer.conv.Conv2DTranspose'>'s flops has been counted
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/tensor/creation.py:130: DeprecationWarning: np.object is a deprecated alias for the builtin object. To silence this warning, use object by itself. Doing this will not modify any behavior and is safe.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
if data.dtype == np.object:
Traceback (most recent call last):
  File "PaddleSeg/train.py", line 199, in <module>
    main(args)
  File "PaddleSeg/train.py", line 194, in main
    to_static_training=cfg.to_static_training)
  File "/home/aistudio/PaddleSeg/paddleseg/core/train.py", line 311, in train
    custom_ops={paddle.nn.SyncBatchNorm: op_flops_funs.count_syncbn})
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/hapi/dynamic_flops.py", line 113, in flops
    print_detail=print_detail)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/hapi/dynamic_flops.py", line 254, in dynamic_flops
    model(inputs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 914, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleSeg/paddleseg/models/enet.py", line 199, in forward
    x, max_indices1_0 = self.downsample1_0(x)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 914, in __call__
    outputs = self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleSeg/paddleseg/models/enet.py", line 504, in forward
    main, max_indices = self.main_max1(x)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 917, in __call__
    hook_result = forward_post_hook(self, inputs, outputs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/hapi/dynamic_flops.py", line 184, in count_io_info
    m.register_buffer('output_shape', paddle.to_tensor(y.shape))
AttributeError: 'tuple' object has no attribute 'shape'

PS:
1: self.main_max1 = paddle.nn.MaxPool2D(2, stride=2, return_mask=return_indices)
2: In "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/hapi/dynamic_flops.py", line 184, in count_io_info, changing
m.register_buffer('output_shape', paddle.to_tensor(y.shape))
to
m.register_buffer('output_shape', paddle.to_tensor(y[0].shape))
makes the error go away.
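
For illustration only, the idea behind the `y[0].shape` change can be sketched without Paddle. The helper name `output_shape` and the `FakeTensor` stand-in below are hypothetical and are not part of `dynamic_flops.py`; the sketch assumes that, when a layer returns a tuple or list, its first element is the main output whose shape should be recorded.

```python
def output_shape(y):
    """Return the shape of a layer output as a list.

    Assumption: for tuple/list outputs (for example a pooled tensor plus
    an index mask), the first element is the main output.
    """
    if isinstance(y, (list, tuple)):
        y = y[0]
    return list(y.shape)


class FakeTensor:
    """Hypothetical stand-in for a tensor; only `shape` matters here."""

    def __init__(self, shape):
        self.shape = shape


pooled = FakeTensor([1, 16, 256, 512])
mask = FakeTensor([1, 16, 256, 512])

print(output_shape(pooled))          # single-tensor output
print(output_shape((pooled, mask)))  # tuple output, first element is used
```

Whether recording only the first element is appropriate for every layer that returns several values is an open question; the sketch only shows why the one-line change avoids the `AttributeError`.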

@paddle-bot-old

Hi! We've received your issue; please be patient while waiting for a response. We will arrange for technicians to answer your question as soon as possible. Please double-check that you have provided a clear problem description, reproduction code, environment and version, and error messages. You may also check the API docs, the FAQ, past GitHub issues, and the AI community for an answer. Have a nice day!

@ghostxsl
Contributor

Have you already solved the problem? I'd suggest proposing the change in the PaddleSeg repo.

@ETTR123
Author

ETTR123 commented Dec 20, 2021

It is not solved, and I'm not sure what other side effects this change might have.

@ghostxsl
Contributor

Hello, the FLOPs calculation assumes by default that the model's output is a tensor. I'd suggest changing your model's output type.

@shiyutang
Contributor

@ghostxsl Hello, this problem occurs because MaxPool2D, when return_mask=True is specified, returns both the output and the index mask rather than a single tensor.

@ghostxsl
Contributor

Hello, currently only these layers should be supported: https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/hapi/dynamic_flops.py#L187-L207
Only these layers are counted. Did you define any custom_op?

@ghostxsl
Contributor

Could you check whether the outputs of all model.children are tensors?
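
One way to run such a check is sketched below. The function name `find_non_tensor_outputs` and its parameters are hypothetical. The sketch is duck-typed: it only assumes the model exposes `named_sublayers()` and that each sublayer exposes `register_forward_post_hook(hook)` returning a handle with `remove()`, which is how `paddle.nn.Layer` is understood to behave. It walks sublayers rather than only direct children, on the assumption that the failing layer may be nested.

```python
def find_non_tensor_outputs(model, run_forward, is_tensor):
    """Return (name, output type name) for sublayers whose output is not a tensor.

    model       : object with named_sublayers() yielding (name, layer) pairs
    run_forward : zero-argument callable that performs one forward pass
    is_tensor   : predicate deciding whether an output counts as a tensor
    """
    offenders = []
    handles = []

    def make_hook(name):
        # Post-forward hook signature: (layer, inputs, outputs)
        def hook(layer, inputs, outputs):
            if not is_tensor(outputs):
                offenders.append((name, type(outputs).__name__))
        return hook

    for name, layer in model.named_sublayers():
        handles.append(layer.register_forward_post_hook(make_hook(name)))
    try:
        run_forward()
    finally:
        # Always detach the temporary hooks, even if the forward pass fails.
        for handle in handles:
            handle.remove()
    return offenders
```

With Paddle installed, a call might look like `find_non_tensor_outputs(model, lambda: model(x), lambda y: isinstance(y, paddle.Tensor))`, where `x` is a sample input of the shape the model expects. A layer such as `MaxPool2D` created with `return_mask=True` would then be expected to show up with output type `tuple`.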

@shiyutang
Contributor

OK

> Could you check whether the outputs of all model.children are tensors?

@ETTR123
Author

ETTR123 commented Dec 20, 2021

> Hello, currently only these layers should be supported: https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/hapi/dynamic_flops.py#L187-L207 Only these layers are counted. Did you define any custom_op?

No.

@ETTR123
Author

ETTR123 commented Dec 20, 2021

> Could you check whether the outputs of all model.children are tensors?

Hello, I checked, and they are all tensors.

@ETTR123
Author

ETTR123 commented Dec 20, 2021

I put together a simple reproduction of the spot where the problem occurs.

test1
test2

@ghostxsl
Contributor

Hello, after an internal discussion, this turns out to be a known issue and a fix is in progress. I will post further updates here.

@ETTR123
Author

ETTR123 commented Dec 21, 2021

> Hello, after an internal discussion, this turns out to be a known issue and a fix is in progress. I will post further updates here.

OK
