Example data files for NOM sample preparation #146
Your concern is valid, but it's not really Poetry's fault. "ValueError: : Unknown CURIE prefix: @base" means that there's some value that is supposed to be a CURIE, but it doesn't have a prefix. Yes, the files can be validated individually with I'll look through this over the next day or two and share my insights. |
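One rough way to narrow down which value lacks a prefix is a line-based scan of a single example file, sketched below. This is only a heuristic, not a YAML parser: it assumes one `key: value` pair per line, and it assumes you already know which slots are supposed to hold CURIEs (it defaults to `id`; any other key names you pass are your own guess). The function name `find_unprefixed` is hypothetical, and a value that passes this check may still use a prefix the schema does not declare.

```shell
#!/bin/sh
# Sketch: list values of selected keys that do not look like prefix:local_id.
# find_unprefixed is an illustrative name, not part of nmdc-schema or LinkML.
# Usage: find_unprefixed FILE ['key1|key2|...']   (keys default to "id")
find_unprefixed() (
    file=$1
    keys=${2:-id}
    awk -v keys="$keys" -v sq="'" '
        BEGIN {
            key_re  = "^(" keys ")$"
            trim_re = "^[ \t\"" sq "]+|[ \t\"" sq "]+$"
        }
        {
            line = $0
            sub(/^[ \t]*(-[ \t]+)?/, "", line)   # drop indentation and a leading list dash
            sep = index(line, ":")
            if (sep == 0) next                   # not a key: value line
            key = substr(line, 1, sep - 1)
            val = substr(line, sep + 1)
            gsub(trim_re, "", val)               # drop surrounding blanks and quotes
            # Report non-empty values that do not start with "prefix:"
            if (key ~ key_re && val != "" && val !~ /^[A-Za-z_][A-Za-z0-9_.-]*:/)
                printf "%s:%d: %s: %s\n", FILENAME, NR, key, val
        }
    ' "$file"
)
```

For example, `find_unprefixed src/data/valid/Database-NOM-material-processing.yaml` prints one `file:line: key: value` entry per suspicious `id`, which is quicker than checking each CURIE by hand.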
@pkalita-lbl what is the recommended debugging next step when one validation crashes like this and doesn't report an |
I'm running this now, from
#!/bin/bash
# Path to the schema file
SCHEMA="../../nmdc_schema/nmdc_materialized_patterns.yaml"
# Directory containing the data files
DATA_DIR="../../src/data/valid"
# Loop over all files in the directory
for file in "$DATA_DIR"/*; do
    # Extract the class name from the filename by cutting on the first hyphen
    class_name=$(basename "$file" | cut -d'-' -f1)
    echo "$file"
    echo "$class_name"
    # Run the linkml-validate command
    linkml-validate --schema "$SCHEMA" --target-class "$class_name" "$file"
done
|
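A variant of that loop, as a sketch: it keeps going after a failure and prints only the files whose validation exited non-zero, so one crash does not bury the rest. It assumes `linkml-validate` returns a non-zero exit status when validation fails or crashes. The `VALIDATOR` variable and the `validate_dir` function are hypothetical names introduced here so the command can be swapped out.

```shell
#!/bin/sh
# Sketch: validate every example file and report the ones that fail.
# VALIDATOR and validate_dir are illustrative names, not part of nmdc-schema.
VALIDATOR="${VALIDATOR:-linkml-validate}"

# Usage: validate_dir SCHEMA DATA_DIR
# The body runs in a subshell so its variables do not leak into the caller.
validate_dir() (
    schema=$1
    data_dir=$2
    failed=""
    for file in "$data_dir"/*; do
        [ -f "$file" ] || continue
        # Class name is the filename up to the first hyphen, as in the loop above
        class_name=$(basename "$file" | cut -d'-' -f1)
        # Validator output goes to stderr so stdout carries only the failure list
        if ! "$VALIDATOR" --schema "$schema" --target-class "$class_name" "$file" >&2; then
            failed="$failed$file
"
        fi
    done
    printf '%s' "$failed"
)
```

Calling `validate_dir ../../nmdc_schema/nmdc_materialized_patterns.yaml ../../src/data/valid` would then print one failing path per line, while the validator's own messages still appear on stderr.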
You can see from the stacktrace that the problem isn't in the validation. It emanates from the line where This also speaks to the fact that |
@pkalita-lbl and I made the same discovery around the same time. The linkml-convert exercise didn't help. I'll try converting each valid example to RDF, which is where the absence of a CURIE prefix is especially problematic. |
@cmungall is there some way to assert a default CURIE base in the RDF conversion process? |
Step-wise RDF generation over the example files is revealing several errors |
src/data/valid/Database-AssemblyAnalysis-1.yaml
src/data/valid/Database-biosample-exhasutive.yaml
src/data/valid/Database-mags.yaml
src/data/valid/Database-NOM-material-processing.yaml
src/data/valid/Database-ReadQcAnalysisActivity-quality_fail.yaml
src/data/valid/MetabolomicsAnalysis-1.yaml
|
@anastasiyaprymolenna do you want me to fix these for you and push them back to your branch? We'll also have to think about when we are going to back-merge berkeley-schema-fy24 |
to src/data/problem/valid
The illegal CURIEs in
|
… into nom-example-sample-prep-metadata
It also concerns me that the identifier for the |
Same thing for
|
Summary for tonight:
|
Valid metadata to use for NOM workflows. Includes extraction and SPE protocols.
@turbomam The test fails to validate a CURIE, but the output from Poetry does not say which CURIE was rejected. Is there a way to get a more informative traceback from Poetry than what follows? Or is there a way to speed up testing for just one example file? There are 66 CURIEs in that file, and I would otherwise have to test them one by one.