[bugfix] EMAHook load state dict #507

Merged: 5 commits, Sep 9, 2022

Conversation

@okotaku okotaku (Contributor) commented Sep 2, 2022

Motivation

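When using YOLOX with a checkpoint whose head was trained with a different num_classes, EMAHook raised an error while loading the state dict. In the traceback below, the checkpoint has 80 classes and the current model has 3.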

  File "/opt/site-packages/mmdet/.mim/tools/train.py", line 120, in <module>
    main()
  File "/opt/site-packages/mmdet/.mim/tools/train.py", line 116, in main
    runner.train()
  File "/opt/conda/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1623, in train
    self.load_or_resume()
  File "/opt/conda/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1585, in load_or_resume
    self.load_checkpoint(self._load_from)
  File "/opt/conda/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1977, in load_checkpoint
    self.call_hook('after_load_checkpoint', checkpoint=checkpoint)
  File "/opt/conda/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1693, in call_hook
    getattr(hook, fn_name)(self, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/mmengine/hooks/ema_hook.py", line 187, in after_load_checkpoint
    self.ema_model.module.load_state_dict(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1605, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for YOLOX:
        size mismatch for bbox_head.multi_level_conv_cls.0.weight: copying a param with shape torch.Size([80, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 128, 1, 1]).
        size mismatch for bbox_head.multi_level_conv_cls.0.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([3]).
        size mismatch for bbox_head.multi_level_conv_cls.1.weight: copying a param with shape torch.Size([80, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 128, 1, 1]).
        size mismatch for bbox_head.multi_level_conv_cls.1.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([3]).
        size mismatch for bbox_head.multi_level_conv_cls.2.weight: copying a param with shape torch.Size([80, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([3, 128, 1, 1]).
        size mismatch for bbox_head.multi_level_conv_cls.2.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([3]).
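
The mismatch in the traceback can be illustrated with a small shape comparison. This is only a sketch of the general idea of a tolerant load (copy the entries whose shapes agree, skip the rest) and is not the change made in this PR. The helper `split_by_shape` and the `backbone.stem.conv.weight` entry are hypothetical; the two `bbox_head` shapes are taken from the error message. Note that in PyTorch, `load_state_dict(strict=False)` tolerates missing and unexpected keys but still raises on a size mismatch, which is why the shapes have to be checked separately.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def split_by_shape(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> Tuple[List[str], List[str]]:
    """Return (loadable, skipped) parameter names (hypothetical helper).

    A name is loadable only if the model also has it and both shapes agree.
    """
    loadable: List[str] = []
    skipped: List[str] = []
    for name, shape in checkpoint_shapes.items():
        if model_shapes.get(name) == shape:
            loadable.append(name)
        else:
            skipped.append(name)
    return loadable, skipped


# Checkpoint trained with 80 classes (shapes from the traceback).
checkpoint_shapes = {
    "bbox_head.multi_level_conv_cls.0.weight": (80, 128, 1, 1),
    "bbox_head.multi_level_conv_cls.0.bias": (80,),
    # Hypothetical entry, added only to show a parameter whose shape matches.
    "backbone.stem.conv.weight": (32, 12, 3, 3),
}

# Current model configured with 3 classes.
model_shapes = {
    "bbox_head.multi_level_conv_cls.0.weight": (3, 128, 1, 1),
    "bbox_head.multi_level_conv_cls.0.bias": (3,),
    "backbone.stem.conv.weight": (32, 12, 3, 3),
}

loadable, skipped = split_by_shape(checkpoint_shapes, model_shapes)
print("loadable:", loadable)
print("skipped:", skipped)
```

With real tensors, the same comparison would be made on `tensor.shape` for each entry of the checkpoint and the model state dicts before copying.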

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMCls.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

@okotaku okotaku changed the title from "Ema load state dict" to "[bugfix] EMAHook load state dict" Sep 2, 2022
@okotaku okotaku (Contributor, Author) commented Sep 5, 2022

I found that it raised an error when running the tests. I fixed it.

@ZwwWayne ZwwWayne added this to the 0.2.0 milestone Sep 5, 2022
Review thread on mmengine/hooks/ema_hook.py (outdated, resolved)
@RangiLyu RangiLyu added the bug Something isn't working label Sep 5, 2022
Three further review threads on mmengine/hooks/ema_hook.py (outdated, resolved)
@RangiLyu RangiLyu (Member) left a comment

LGTM

@ZwwWayne ZwwWayne merged commit a6f5297 into open-mmlab:main Sep 9, 2022
Labels
bug Something isn't working
3 participants