Add gold_analysis_project_identifiers and (new) jgi_analysis_project_identifiers to WorkflowChain #1123

Closed
Tracked by #1466
aclum opened this issue Sep 16, 2023 · 7 comments · Fixed by microbiomedata/berkeley-schema-fy24#71
Labels
berkeley-fy24-refactor Label to describe issues created during the December 2023 hackathon for schema refactor

Comments

@aclum
Contributor

aclum commented Sep 16, 2023

While reviewing some things for the napa compliance squad, I realized there is no generic external_identifiers slot. For example, the only identifier slot on Class: MetagenomeAssembly is insdc_assembly_identifiers. JGI/GOLD and IMG have identifiers that are a direct or broad mapping. For example, img_identifiers is currently on Class: Biosample, but a more appropriate place for it is https://microbiomedata.github.io/nmdc-schema/MetagenomeAnnotationActivity/ and https://microbiomedata.github.io/nmdc-schema/MagsAnalysisActivity/

Currently the workflows team uses the GOLD analysis project identifiers (Ga*) to name the directories when ingesting the JGI workflow data. I'd like somewhere to store these in the schema so we can link back to the GOLD identifier that generated the data. At the moment I have to provide an offline table that maps the GOLD sequencing project, listed in the omics_processing_set record as https://microbiomedata.github.io/nmdc-schema/gold_sequencing_project_identifiers/, to GOLD analysis project identifiers, which currently don't have a designated slot.
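
As a rough illustration of the offline mapping table described above, the lookup could be sketched in Python like this. The identifier values, the `gold:` prefix, and the `attach_analysis_projects` helper are all hypothetical, and the record is a plain dict rather than an actual nmdc-schema object. Only the slot names gold_sequencing_project_identifiers and gold_analysis_project_identifiers come from this issue.

```python
# Hypothetical offline table: GOLD sequencing project (Gp*) -> GOLD analysis projects (Ga*).
# All identifier values below are made up for illustration.
SEQ_TO_ANALYSIS = {
    "gold:Gp0000001": ["gold:Ga0000001"],
    "gold:Gp0000002": ["gold:Ga0000002", "gold:Ga0000003"],
}


def attach_analysis_projects(record, table):
    """Return a copy of `record` with gold_analysis_project_identifiers filled in.

    `record` is a plain dict standing in for an omics processing record;
    it is not an nmdc-schema class instance.
    """
    updated = dict(record)
    analysis_ids = []
    for seq_id in record.get("gold_sequencing_project_identifiers", []):
        for ga_id in table.get(seq_id, []):
            if ga_id not in analysis_ids:  # keep order, skip duplicates
                analysis_ids.append(ga_id)
    updated["gold_analysis_project_identifiers"] = analysis_ids
    return updated


record = {
    "id": "example:omics-processing-1",  # hypothetical id
    "gold_sequencing_project_identifiers": ["gold:Gp0000002"],
}
linked = attach_analysis_projects(record, SEQ_TO_ANALYSIS)
```

With a designated slot in the schema, the Ga* identifiers could travel with the record itself instead of living in a separate table.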

@mslarae13
Contributor

Some of this should go in workflow chain

Maybe!

mbthornton-lbl self-assigned this Dec 15, 2023
mbthornton-lbl changed the title from "external identifiers for workflow execution activities" to "Add gold_analysis_project_identifiers and (new) jgi_analysis_project_identifiers to WorkflowChain" Dec 15, 2023
mbthornton-lbl added the berkeley-fy24-refactor label Dec 18, 2023
@mslarae13
Contributor

@mbthornton-lbl these slots are coming off Class:WorkflowExecution and will ONLY be on Class:WorkflowChain. Correct?

@mbthornton-lbl
Contributor

mbthornton-lbl commented Dec 20, 2023 via email

@aclum
Contributor Author

aclum commented Jan 2, 2024

The outstanding todo is to add img_identifiers to MetagenomeAnnotation and MagsAnalysis.

@mslarae13
Contributor

mslarae13 commented Jan 4, 2024

@aclum is the gold_analysis identifier part of this complete? Will img_identifiers be done here, or should that be a separate issue and PR?

@mslarae13
Contributor

The PR with both identifiers has been merged in.
A separate PR should be made for img_identifiers.

@aclum
Contributor Author

aclum commented Jan 19, 2024

The linked PR addresses img_identifiers in WorkflowExecution subclasses.


3 participants