Adds multi format reader #250

Merged
merged 73 commits into from
Aug 12, 2024
Commits
7977c2a
Adds multi format reader
domna Feb 21, 2024
cb4102d
Adds standard parsings
domna Feb 22, 2024
da3638e
Add json config parsing
domna Feb 23, 2024
2851a0f
Use default factory for list
domna Feb 23, 2024
b04770a
Only resolve asterisk from same level
domna Feb 23, 2024
101ea45
Remove residual testing code
domna Feb 23, 2024
815b426
Adds eln callback and config argument
domna Mar 5, 2024
c101666
Adds list parsing for special prefixes
domna Mar 5, 2024
a9ae42e
Add **kwargs to example reader
domna Mar 6, 2024
cc1e898
Add nomad .pylintrc
domna Mar 7, 2024
79e8ad7
modified: pynxtools/dataconverter/readers/multi/reader.py
domna Mar 7, 2024
f85e9f8
Provide entry names and use __post_init__ for dataclass
domna Mar 7, 2024
a674cfd
Silence mypy false positive
domna Mar 7, 2024
0a082ee
Add ? notation for dependent keys
domna Mar 7, 2024
fe2889a
Ruff formatting
domna Apr 15, 2024
b1a82fd
Remove pylintrc
domna Jul 9, 2024
882e389
Move reader to src dir
domna Jul 11, 2024
b68fa57
Adds processing order and optional groups removal
domna Jul 17, 2024
c64a442
Move reader into src and delete old location
domna Jul 17, 2024
c3efb41
Fixes from testing
domna Jul 17, 2024
a1b0266
Adapt tests to definition overwrite message
domna Jul 17, 2024
e3e9049
Adapt get_data_dims
domna Jul 17, 2024
48144e7
update defs once again
lukaspie Jul 11, 2024
fde411b
temporarily use feature branches in plugin tests
lukaspie Jul 11, 2024
bb27e34
switch to plugin main branches again
lukaspie Jul 16, 2024
e3fc8d1
Add docs for rules in NeXus
lukaspie Jul 15, 2024
8c39500
Git reset in publish workflow
lukaspie Jul 17, 2024
8d30d57
fetch submodules, git reset before build
lukaspie Jul 17, 2024
b39d3fb
add workflow to simulate publishing
lukaspie Jul 17, 2024
4c1e3d4
Add git status after build
domna Jul 17, 2024
06de5e8
Run diff
domna Jul 17, 2024
43ee01c
Remove publish test
domna Jul 17, 2024
b3f89cf
Remove unused import
domna Jul 17, 2024
511e63a
Add post_processing step
domna Jul 18, 2024
0f74aa9
Write to new dict in json parsing, otherwise we always create an entr…
domna Jul 18, 2024
81ecefb
Copy values to new dict if no magic keyword is present
domna Jul 19, 2024
e242c04
Resolve links from written data
domna Jul 19, 2024
386756c
add fix for missing keys in dict again (from c1ecb70)
rettigl Jul 19, 2024
2a7ec30
Ruff format
domna Jul 19, 2024
4ac4f9b
Remove optional groups in the main loop
domna Jul 19, 2024
364f027
Call data dims when entry name is present
domna Jul 22, 2024
1f92f88
parse config before postprocess
lukaspie Jul 23, 2024
8662f2d
modify docs
lukaspie Jul 23, 2024
0e652e2
Pass keys to json callbacks
domna Jul 23, 2024
2c15e8a
Replace `ENTRY` instead of `ENTRY[entry]` in config dict
domna Jul 23, 2024
5f01080
Update prefix:value notation
domna Jul 23, 2024
c2c7df3
Correctly remove `!` dependent keys
domna Jul 25, 2024
5f27420
Allow lists of possible values in config file
lukaspie Jul 24, 2024
512503f
simplify static typing
lukaspie Jul 24, 2024
bff8566
Fix removal of missing groups
domna Jul 25, 2024
fdd349e
simplify code, make docstrings more elaborate
lukaspie Jul 25, 2024
7e5d88c
Configure handler for root logger
domna Jul 25, 2024
3aae156
Init config_dict
domna Jul 25, 2024
364ec87
pass parent key to parse_yml (#391)
lukaspie Jul 25, 2024
f94050e
Update default value for parent_key in FlattenSettings
domna Jul 25, 2024
476b9b7
Readd link flattening to flatten_json
domna Jul 25, 2024
fa7ad82
Fix the config dict sorting
domna Jul 26, 2024
a95f9bd
Use single pynxtools logger which does not propagate
domna Jul 26, 2024
c6743b8
Capture log calls from non-propagating loggers
domna Jul 26, 2024
1624c22
Add `create_link_dict` parameter for flatten_json
domna Jul 26, 2024
9f4ece8
propagate create_link_dict parameter
lukaspie Jul 26, 2024
6a55ceb
set create_link_dict to False for MultiFormatReader
lukaspie Jul 26, 2024
239b42c
re-add logger propagation
rettigl Jul 26, 2024
cff15f2
Adapt logging and don't parse objects if they are none
domna Jul 27, 2024
1f56dfe
strip ! prefix before parsing lists
rettigl Jul 27, 2024
6bb3bed
Remove unecessary checks for `!` notation
domna Jul 28, 2024
8053e75
make removed groups an info again
rettigl Jul 28, 2024
6276ed4
enhance docs of ParseJsonCallbacks
lukaspie Aug 12, 2024
eaec8a2
pop suppress_warning from kwargs
lukaspie Aug 12, 2024
5aeaee1
Remove caplog overwrite for non-root logger tests
domna Aug 12, 2024
c114f53
custom logging formatting
lukaspie Aug 12, 2024
99bdc5b
make highlighted logger levels a constant
lukaspie Aug 12, 2024
20f4182
update nexus-version
lukaspie Aug 12, 2024
2 changes: 1 addition & 1 deletion docs/learn/nexus-rules.md
@@ -40,4 +40,4 @@ Aside from these general rules, there are a number of special rules for NeXus na

 - There is also a set of reserved suffixes that are used to give additional information for a group or field. For the full list, see [Rules for Storing Data Items in NeXus Files](https://manual.nexusformat.org/datarules.html), section "Reserved suffixes".

-- Additionally to namefitting, data annotation can use further information. For example, in case of NXdata, the axes listed among the `@axes` shall fit to any instances of `AXISNAME` and data objects listed in `@signal` or `@auxiliary_signals` shall fit to instances of `DATA`. Such rules are typically given in the base classes (e.g., see [here](https://manual.nexusformat.org/classes/base_classes/NXdata.html#index-0) for NXdata). Any tool that makes use of the base classes should implement these special rules in its validation procedure. As an example, pynxtools has a special [function for handling NXdata](https://github.com/FAIRmat-NFDI/pynxtools/blob/474fe823112b8ee1e7b42ac80bb7408fdde22bd5/src/pynxtools/dataconverter/validation.py#L220).
+- Additionally to namefitting, data annotation can use further information. For example, in case of NXdata, the axes listed among the `@axes` shall fit to any instances of `AXISNAME` and data objects listed in `@signal` or `@auxiliary_signals` shall fit to instances of `DATA`. Such rules are typically given in the base classes (e.g., see [here](https://manual.nexusformat.org/classes/base_classes/NXdata.html#index-0) for NXdata). Any tool that makes use of the base classes should implement these special rules in its validation procedure. As an example, pynxtools has a special [function for handling NXdata](https://github.com/FAIRmat-NFDI/pynxtools/blob/474fe823112b8ee1e7b42ac80bb7408fdde22bd5/src/pynxtools/dataconverter/validation.py#L220).
23 changes: 23 additions & 0 deletions src/pynxtools/__init__.py
@@ -16,13 +16,36 @@
 # limitations under the License.
 #

+import logging
 import os
 import re
 from datetime import datetime

 from pynxtools._build_wrapper import get_vcs_version
 from pynxtools.definitions.dev_tools.globals.nxdl import get_nxdl_version

+LOGGER_LEVELS_TO_HIGHLIGHT = (logging.WARNING, logging.ERROR)
+
+
+class CustomFormatter(logging.Formatter):
+    """Formatter that specifically highlights errors and warnings."""
+
+    def format(self, record):
+        if not getattr(record, "prefixed", False):
+            if record.levelno in LOGGER_LEVELS_TO_HIGHLIGHT:
+                record.msg = f"{record.levelname}: {record.msg}"
+                # Mark the record as prefixed
+                record.prefixed = True
+        return super().format(record)
+
+
+logger = logging.getLogger("pynxtools")
+handler = logging.StreamHandler()
+formatter = CustomFormatter("%(message)s")
+handler.setFormatter(formatter)
+logger.addHandler(handler)
+logger.setLevel(logging.INFO)
+
 MAIN_BRANCH_NAME = "fairmat"


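Reviewer note: the `prefixed` flag in `CustomFormatter.format` appears to guard against the level name being prepended more than once when the same record passes through several handlers. The following is a minimal, self-contained sketch of that behavior, not the pynxtools code itself. The names `PrefixingFormatter`, `LEVELS_TO_HIGHLIGHT`, and the logger name `formatter_demo` are made up for illustration, and the two-handler setup is an assumption chosen to make the guard visible.

```python
import io
import logging

# Hypothetical stand-in for LOGGER_LEVELS_TO_HIGHLIGHT from the diff.
LEVELS_TO_HIGHLIGHT = (logging.WARNING, logging.ERROR)


class PrefixingFormatter(logging.Formatter):
    """Prepends the level name to warnings and errors, at most once per record."""

    def format(self, record):
        if not getattr(record, "prefixed", False):
            if record.levelno in LEVELS_TO_HIGHLIGHT:
                record.msg = f"{record.levelname}: {record.msg}"
                # Remember that this record already carries the prefix.
                record.prefixed = True
        return super().format(record)


stream = io.StringIO()
demo_logger = logging.getLogger("formatter_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo output isolated from the root logger

# Two handlers share the same record object, so format() runs twice per record.
for _ in range(2):
    demo_handler = logging.StreamHandler(stream)
    demo_handler.setFormatter(PrefixingFormatter("%(message)s"))
    demo_logger.addHandler(demo_handler)

demo_logger.info("converting")
demo_logger.warning("missing field")

output_lines = stream.getvalue().splitlines()
print(output_lines)
# → ['converting', 'converting', 'WARNING: missing field', 'WARNING: missing field']
```

Without the flag, the second handler would see the already modified `record.msg` and emit `WARNING: WARNING: missing field`.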
13 changes: 9 additions & 4 deletions src/pynxtools/dataconverter/convert.py
@@ -40,9 +40,7 @@
 from pynxtools.dataconverter.validation import validate_dict_against
 from pynxtools.dataconverter.writer import Writer

-logger = logging.getLogger(__name__)
-logger.setLevel(logging.INFO)
-logger.addHandler(logging.StreamHandler())
+logger = logging.getLogger("pynxtools")


 if sys.version_info >= (3, 10):
@@ -337,7 +335,12 @@ def main_cli():
     "--mapping",
     help="Takes a <name>.mapping.json file and converts data from given input files.",
 )
-
+@click.option(
+    "-c",
+    "--config",
+    type=click.Path(exists=True, dir_okay=False, file_okay=True, readable=True),
+    help="A json config file for the reader",
+)
 # pylint: disable=too-many-arguments
 def convert_cli(
     files: Tuple[str, ...],
@@ -349,6 +352,7 @@ def convert_cli(
     ignore_undocumented: bool,
     skip_verify: bool,
     mapping: str,
+    config: str,
     fail: bool,
 ):
     """This command allows you to use the converter functionality of the dataconverter."""
@@ -394,6 +398,7 @@ def convert_cli(
         nxdl,
         output,
         skip_verify,
+        config_file=config,
         ignore_undocumented=ignore_undocumented,
         fail=fail,
     )
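Reviewer note: the move from `logging.getLogger(__name__)` plus a per-module `StreamHandler` to a single `logging.getLogger("pynxtools")` can be illustrated with the standard library alone. The sketch below is an assumption-laden demo, not the project's code: the logger names `demo_pkg` and `demo_pkg.convert` are hypothetical, and an in-memory stream replaces stderr so the result can be inspected.

```python
import io
import logging

stream = io.StringIO()

# Package-level logger, configured once (as a package __init__ might do).
pkg_logger = logging.getLogger("demo_pkg")
pkg_logger.setLevel(logging.INFO)
pkg_logger.addHandler(logging.StreamHandler(stream))

# Old pattern: a per-module child logger that also attaches its own handler.
child = logging.getLogger("demo_pkg.convert")
child.addHandler(logging.StreamHandler(stream))
# The record is handled by the child's handler and then, through
# propagation, by the package handler as well.
child.warning("old pattern")

# New pattern: every module requests the same named logger object.
shared = logging.getLogger("demo_pkg")
shared.warning("new pattern")

lines = stream.getvalue().splitlines()
print(lines)
# → ['old pattern', 'old pattern', 'new pattern']
```

`logging.getLogger` returns the same object for the same name, so handler and level configuration only has to happen in one place.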
9 changes: 4 additions & 5 deletions src/pynxtools/dataconverter/helpers.py
@@ -42,8 +42,7 @@
     get_required_string as nexus_get_required_string,
 )

-logger = logging.getLogger(__name__)
-logger.setLevel(logging.INFO)
+logger = logging.getLogger("pynxtools")


 class ValidationProblem(Enum):
@@ -874,13 +873,13 @@ def write_nexus_def_to_entry(data, entry_name: str, nxdl_def: str):
     def update_and_warn(key: str, value: str, overwrite=False):
         if key in data and data[key] is not None and data[key] != value:
             report = (
-                f"This is overwritten by the actually used value '{value}'"
+                f"This is overwritten by the actually used value '{data[key]}'"
                 if overwrite
-                else f"The provided version '{value}' is kept. We assume you know what you are doing."
+                else f"The provided version '{data[key]}' is kept. We assume you know what you are doing."
             )
             logger.log(
                 logging.WARNING if overwrite else logging.INFO,
-                f"The entry '{key}' (value: {data[key]}) should not be changed by "
+                f"The entry '{key}' (value: {value}) should not be changed by "
                 f"the reader. {report}",
             )
         if overwrite or data.get(key) is None:
2 changes: 1 addition & 1 deletion src/pynxtools/dataconverter/readers/base/reader.py
@@ -18,7 +18,7 @@
 """The abstract class off of which to implement readers."""

 from abc import ABC, abstractmethod
-from typing import Tuple, Any
+from typing import Any, Tuple


 class BaseReader(ABC):
1 change: 1 addition & 0 deletions src/pynxtools/dataconverter/readers/example/reader.py
@@ -40,6 +40,7 @@ def read(
         template: dict = None,
         file_paths: Tuple[str] = None,
         objects: Tuple[Any] = None,
+        **_,
     ) -> dict:
         """Reads data from given file and returns a filled template dictionary"""
         data: dict = {}
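Reviewer note: adding `**_` to the example reader's `read` signature presumably lets it tolerate extra keyword arguments such as the new `config_file` without raising. The sketch below shows the general Python mechanism only. `StrictReader`, `TolerantReader`, and `call_reader` are hypothetical names, and whether the converter forwards `config_file` to `read` in exactly this way is an assumption.

```python
from typing import Any, Optional, Tuple


class StrictReader:
    """Hypothetical reader whose signature accepts only the three base arguments."""

    def read(
        self,
        template: Optional[dict] = None,
        file_paths: Optional[Tuple[str, ...]] = None,
        objects: Optional[Tuple[Any, ...]] = None,
    ) -> dict:
        return {"files": list(file_paths or ())}


class TolerantReader:
    """Hypothetical reader that swallows unknown keyword arguments via **_."""

    def read(
        self,
        template: Optional[dict] = None,
        file_paths: Optional[Tuple[str, ...]] = None,
        objects: Optional[Tuple[Any, ...]] = None,
        **_,
    ) -> dict:
        return {"files": list(file_paths or ())}


def call_reader(reader, **kwargs):
    """Calls reader.read and reports a TypeError as a string instead of raising."""
    try:
        return reader.read(**kwargs)
    except TypeError as exc:
        return f"TypeError: {exc}"


tolerant_result = call_reader(
    TolerantReader(), file_paths=("a.json",), config_file="config.json"
)
strict_result = call_reader(
    StrictReader(), file_paths=("a.json",), config_file="config.json"
)
print(tolerant_result)  # → {'files': ['a.json']}
print(strict_result)
```

The strict variant fails with a `TypeError` about the unexpected `config_file` keyword, which is the failure mode the one-line change avoids.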