[BUG] EnvMatStat fails when two descriptors have the same hash #4151

Closed
iProzd opened this issue Sep 20, 2024 · 0 comments · Fixed by #4152
Labels
bug reproduced This bug has been reproduced by developers

Comments

Collaborator

iProzd commented Sep 20, 2024

Bug summary

When computing the data statistics, if two descriptors have the same hash (see get_hash below; e.g. repformer and repinit_tebd), the second one will try to load the stats computed for the first instead of computing its own.

    def get_hash(self) -> str:
        """Get the hash of the environment matrix.

        Returns
        -------
        str
            The hash of the environment matrix.
        """
        dscpt_type = "se_a" if self.last_dim == 4 else "se_r"
        return get_hash(
            {
                "type": dscpt_type,
                "ntypes": self.descriptor.get_ntypes(),
                "rcut": round(self.descriptor.get_rcut(), 2),
                "rcut_smth": round(self.descriptor.rcut_smth, 2),
                "nsel": self.descriptor.get_nsel(),
                "sel": self.descriptor.get_sel(),
                "mixed_types": self.descriptor.mixed_types(),
            }
        )
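
To illustrate why two descriptor blocks can collide, here is a minimal sketch. It assumes the hash helper simply hashes the JSON form of the dictionary; `toy_hash` and `env_mat_hash` are hypothetical stand-ins, `nsel` is assumed to be the sum of `sel`, and the numeric values are illustrative rather than taken from the example input.

```python
import hashlib
import json


def toy_hash(obj: dict) -> str:
    # Hypothetical stand-in for the project's get_hash helper:
    # hash the JSON form of the dictionary.
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def env_mat_hash(last_dim, ntypes, rcut, rcut_smth, sel, mixed_types) -> str:
    # Mirrors the fields used by get_hash in the listing above.
    dscpt_type = "se_a" if last_dim == 4 else "se_r"
    return toy_hash(
        {
            "type": dscpt_type,
            "ntypes": ntypes,
            "rcut": round(rcut, 2),
            "rcut_smth": round(rcut_smth, 2),
            "nsel": sum(sel),  # assumption: nsel is the total of sel
            "sel": sel,
            "mixed_types": mixed_types,
        }
    )


# Illustrative values only; not taken from input_torch_small.json.
shared = dict(last_dim=4, ntypes=2, rcut=4.0, rcut_smth=3.5, sel=[40], mixed_types=True)
hash_block_a = env_mat_hash(**shared)  # first descriptor block
hash_block_b = env_mat_hash(**shared)  # a different block with the same settings
hash_other = env_mat_hash(**{**shared, "rcut": 6.0})

print(hash_block_a == hash_block_b)  # same fields, same hash
print(hash_block_a == hash_other)    # a different rcut changes the hash
```

Nothing in the hashed dictionary identifies which block the statistics belong to, so any two blocks that agree on these seven fields share one cache entry.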

However, the computed stats do not appear to have been flushed to the file at that point (even after calling self.root.flush() in DPH5Path), so empty stats are loaded and an error is raised:

pt/utils/env_mat_stat.py:213, in EnvMatStatSe.__call__(self)
    211 for type_i in range(self.descriptor.get_ntypes()):
    212     if self.last_dim == 4:
--> 213         davgunit = [[avgs[f"r_{type_i}"], 0, 0, 0]]
    214         dstdunit = [
    215             [
    216                 stds[f"r_{type_i}"],
   (...)
    220             ]
    221         ]
    222     elif self.last_dim == 1:

KeyError: 'r_0'

Once the computation has finished, the next training run succeeds in loading the stats from the HDF5 file.
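
The failure mode can be sketched with a plain dictionary standing in for the HDF5 file. This is not the fix that was merged; `load_or_compute` and the cache layout are hypothetical, and only the `r_{type}` keys visible in the traceback are checked. The point is that an entry existing under the shared hash does not guarantee that its contents are readable yet.

```python
def load_or_compute(cache: dict, key: str, ntypes: int, compute):
    # Trust a cached entry only if every per-type key is present.
    cached = cache.get(key, {})
    needed = [f"r_{t}" for t in range(ntypes)]
    if all(name in cached for name in needed):
        return cached
    # Otherwise fall back to computing and storing the statistics.
    stats = compute()
    cache[key] = stats
    return stats


# An entry exists under the shared hash, but nothing readable is in it yet.
cache = {"shared_hash": {}}

# Naive loading reproduces the KeyError from the traceback.
try:
    cache["shared_hash"]["r_0"]
    naive_failed = False
except KeyError:
    naive_failed = True

# The guarded version recomputes instead of failing (dummy values).
stats = load_or_compute(cache, "shared_hash", 2, lambda: {"r_0": 1.0, "r_1": 2.0})
print(naive_failed, stats)
```

A guard like this only masks the symptom; the underlying question in this report is why the freshly written entries are not visible when the second descriptor looks them up.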

DeePMD-kit Version

devel

Backend and its version

PyTorch v2.1.2

How did you download the software?

Built from source

Input Files, Running Commands, Error Log, etc.

cd examples/water/dpa2
dp --pt train input_torch_small.json

Steps to Reproduce

see above

Further Information, Files, and Links

No response

@iProzd iProzd added the bug label Sep 20, 2024
@njzjz njzjz added the reproduced This bug has been reproduced by developers label Sep 20, 2024
njzjz added a commit to njzjz/deepmd-kit that referenced this issue Sep 20, 2024
Fix deepmodeling#4151.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
@njzjz njzjz linked a pull request Sep 20, 2024 that will close this issue
github-merge-queue bot pushed a commit that referenced this issue Sep 21, 2024
Fix #4151.

## Summary by CodeRabbit

- **New Features**
  - Enhanced path filtering logic to include a broader range of keys when generating subpaths.
- **Bug Fixes**
  - Improved the accuracy of path results returned by the `glob` method.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
@njzjz njzjz closed this as completed Sep 22, 2024

2 participants