
df_summary.index contains nan values #27

Closed
pbenner opened this issue May 4, 2023 · 1 comment
Labels
bug Something isn't working data Data loading and processing

Comments

@pbenner
Collaborator

pbenner commented May 4, 2023

> python data/wbm/fetch_process_wbm_dataset.py
[...]
Traceback (most recent call last):
  File "/home/pbenner/Source/tmp/matbench-discovery/data/wbm/fetch_process_wbm_dataset.py", line 331, in <module>
    df_summary.index = df_summary.index.map(increment_wbm_material_id)  # format IDs
  File "/home/pbenner/.local/opt/anaconda3/envs/crysfeat/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 6158, in map
    new_values = self._map_values(mapper, na_action=na_action)
  File "/home/pbenner/.local/opt/anaconda3/envs/crysfeat/lib/python3.10/site-packages/pandas/core/base.py", line 924, in _map_values
    new_values = map_f(values, mapper)
  File "pandas/_libs/lib.pyx", line 2834, in pandas._libs.lib.map_infer
  File "/home/pbenner/Source/tmp/matbench-discovery/data/wbm/fetch_process_wbm_dataset.py", line 147, in increment_wbm_material_id
    prefix, step_num, material_num = wbm_id.split("_")
AttributeError: 'float' object has no attribute 'split'

This is caused by NaN values in df_summary.index at positions 185450, 185451, 185473, 185474, 185476, 185477.
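
A minimal sketch of how the missing index labels could be located and skipped before mapping. The toy data and `format_id` are hypothetical stand-ins for the real `df_summary` and `increment_wbm_material_id`, whose exact output format is an assumption here. Dropping the affected rows is only one possible workaround, not necessarily what the script should do.

```python
import numpy as np
import pandas as pd


def format_id(wbm_id: str) -> str:
    """Hypothetical stand-in for increment_wbm_material_id (output format assumed)."""
    prefix, step_num, material_num = wbm_id.split("_")
    return f"wbm-{step_num}-{int(material_num) + 1}"


# Toy frame with a NaN index label, mimicking the failure in the traceback
df_summary = pd.DataFrame(
    {"energy": [0.1, 0.2, 0.3]},
    index=["step_1_0", np.nan, "step_2_5"],
)

# Integer positions of the missing index labels
nan_positions = np.flatnonzero(df_summary.index.isna())
print(f"NaN index labels at positions: {nan_positions.tolist()}")

# Keep only rows with a valid label, then format the IDs
df_clean = df_summary[df_summary.index.notna()].copy()
df_clean.index = df_clean.index.map(format_id)
print(df_clean.index.tolist())
```

Calling `format_id` directly on the unfiltered index raises the same `AttributeError`, because NaN is a float and has no `split` method.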

@janosh
Owner

janosh commented May 5, 2023

Good catch... again.

Sadly this script takes too many resources to run in GH CI. I tried in

- scripts/compile_metrics.py
- scripts/analyze_element_errors.py

but the jobs were killed, probably due to high memory use. So thanks for continually testing this. Hopefully it works now! 🤞

@janosh janosh added bug Something isn't working data Data loading and processing labels May 5, 2023
@janosh janosh closed this as completed in e5ca0d2 May 5, 2023
janosh added a commit that referenced this issue Jun 20, 2023
…process_wbm_dataset.py (closes #27)

---> 147     prefix, step_num, material_num = wbm_id.split("_")
     148 except ValueError:
     149     print(f"bad {wbm_id=}")