Implement migrator that creates/populates workflow_chain_set (file: migrator_from_X_to_PR9.py) #127

Merged
merged 37 commits into from
May 17, 2024

@brynnz22 brynnz22 (Collaborator) commented Apr 10, 2024

Migrator creates the workflow_chain_set and maps all WorkflowExecution subclasses to a workflow chain instance by the omics processing id in the was_informed_by slot.

  • This migrator needs to run before the class name change from OmicsProcessing to DataGeneration, because it refers to the omics_processing_set.

Addresses changes in PR #9.
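
The grouping described above can be sketched roughly as follows. This is a minimal illustration, not the code in `migrator_from_X_to_PR9.py`: the function name `build_workflow_chains`, the shape of the returned mapping, and the example identifiers are assumptions; only the `was_informed_by` slot, the `nmdc:WorkflowChain` type, and the use of pre-minted IDs come from this thread.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_workflow_chains(
    workflow_executions: Iterable[dict],
    preminted_chain_ids: Iterable[str],
) -> Tuple[List[dict], Dict[str, str]]:
    """Group workflow execution documents by `was_informed_by`.

    Hypothetical helper: returns one WorkflowChain-like document per
    omics processing id, plus a mapping from each workflow execution id
    to the id of the chain it was assigned to.
    """
    # Collect workflow execution ids under the omics processing id they reference.
    groups: Dict[str, List[str]] = defaultdict(list)
    for doc in workflow_executions:
        groups[doc["was_informed_by"]].append(doc["id"])

    available_ids = iter(preminted_chain_ids)
    chains: List[dict] = []
    chain_id_by_execution_id: Dict[str, str] = {}

    # Sort so that the assignment of pre-minted ids is deterministic.
    for omics_processing_id in sorted(groups):
        chain_id = next(available_ids)
        chains.append(
            {
                "id": chain_id,
                "type": "nmdc:WorkflowChain",
                "was_informed_by": omics_processing_id,
            }
        )
        for execution_id in groups[omics_processing_id]:
            chain_id_by_execution_id[execution_id] = chain_id

    return chains, chain_id_by_execution_id
```

Workflow executions that share an omics processing id end up on the same chain, which is the behavior the description above calls for.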

@brynnz22 brynnz22 self-assigned this Apr 10, 2024
@eecavanna eecavanna changed the title Workflowchain migrator Implement migrator that creates and populates workflow_chain_set collection Apr 17, 2024
Here are the commands I ran:
```
# Install black.
pip install "black>=23.1,<25"

# Run black.
black nmdc_schema/migrators/migrator_from_X_to_PR9.py
```
@eecavanna

Here's the command I'm using to run the doctests in this migrator:

poetry run python -m doctest -v nmdc_schema/migrators/migrator_from_X_to_PR9.py
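
For readers unfamiliar with doctests, here is a small sketch of what that command exercises: examples embedded in docstrings are executed and their output is compared with the expected text. The function `add_prefix` and the helper `run_docstring` are hypothetical and are not part of the migrator; the `doctest` calls are from the Python standard library.

```python
import doctest


def add_prefix(identifier: str) -> str:
    """Prepend the `nmdc:` prefix to an identifier (hypothetical helper).

    >>> add_prefix("wfc-456")
    'nmdc:wfc-456'
    """
    return f"nmdc:{identifier}"


def run_docstring(func) -> doctest.TestResults:
    """Run the examples in one function's docstring and return the counts."""
    parser = doctest.DocTestParser()
    # Make the function itself available to the examples by name.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


if __name__ == "__main__":
    print(run_docstring(add_prefix))
```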

@eecavanna

I'll focus on these two issues:

  1. "No adapter was specified" message appears
Trying:
    m = Migrator()
Expecting nothing
No adapter was specified. Migration capability will be limited.
  2. populate_workflow_chain doctest fails
Trying:
    m.populate_workflow_chain()
Expecting:
    # After execution, the workflow_chain_set collection should contain a document with the following data:
    {'id': 'nmdc:wfc-456', 'was_informed_by': 'nmdc:omcp-123', 'analyte_category': 'metagenome', 'type': 'nmdc:WorkflowChain', 'name': 'Workflow Chain for metagenome analysis related to Study 1'}
**********************************************************************
File "/nmdc-schema/nmdc_schema/migrators/migrator_from_X_to_PR9.py", line 169, in migrator_from_X_to_PR9.Migrator.populate_workflow_chain
Failed example:
    m.populate_workflow_chain()
Exception raised:
    Traceback (most recent call last):
      File "/usr/local/lib/python3.9/doctest.py", line 1334, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest migrator_from_X_to_PR9.Migrator.populate_workflow_chain[4]>", line 1, in <module>
        m.populate_workflow_chain()
      File "/nmdc-schema/nmdc_schema/migrators/migrator_from_X_to_PR9.py", line 191, in populate_workflow_chain
        self.adapter.insert_document("workflow_chain_set", workflow_chain_doc)
    AttributeError: 'NoneType' object has no attribute 'insert_document'

The cautionary message was:
```
No adapter was specified. Migration capability will be limited.
```

Note: That message still appears with other doctests that
      aren't part of this file.
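
The traceback above suggests that the doctest constructs `Migrator()` without an adapter, so `self.adapter` is `None` when `insert_document` is called. One possible way for a doctest to avoid that is to pass in a small in-memory stand-in. The sketch below is an assumption-laden illustration: `InMemoryAdapter` and `SketchMigrator` are hypothetical names, and only the `insert_document(collection_name, document)` call shape is taken from the traceback.

```python
from typing import Dict, List, Optional


class InMemoryAdapter:
    """Hypothetical stand-in for a database adapter; keeps documents in a dict."""

    def __init__(self) -> None:
        self.collections: Dict[str, List[dict]] = {}

    def insert_document(self, collection_name: str, document: dict) -> None:
        # Create the collection on first use, then append the document.
        self.collections.setdefault(collection_name, []).append(document)


class SketchMigrator:
    """Simplified stand-in for the migrator class; not the real implementation."""

    def __init__(self, adapter: Optional[InMemoryAdapter] = None) -> None:
        self.adapter = adapter

    def populate_workflow_chain(self, workflow_chain_doc: dict) -> None:
        # Fail with a clear message instead of an AttributeError on None.
        if self.adapter is None:
            raise RuntimeError("An adapter is required to insert documents.")
        self.adapter.insert_document("workflow_chain_set", workflow_chain_doc)
```

With an adapter supplied, the inserted document can be inspected through `adapter.collections["workflow_chain_set"]`; without one, the explicit `RuntimeError` makes the cause easier to spot than the `AttributeError` shown above.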
@eecavanna

eecavanna commented Apr 19, 2024

The log from the most recent GitHub Actions failure says:

  RuntimeError

  Unable to find installation candidates for bioregistry (0.10.157)

The installation of a Python package failed. The failure is not related to the doctests.

I opened a ticket in nmdc-schema (microbiomedata#1932) about this failure. The failure is occurring in that repository also.

@brynnz22
Collaborator Author

> The log from the most recent GitHub Actions failure says:
>
>   RuntimeError
>
>   Unable to find installation candidates for bioregistry (0.10.157)
>
> The installation of a Python package failed. The failure is not related to the doctests.
>
> I opened a ticket in nmdc-schema (microbiomedata#1932) about this failure. The failure is occurring in that repository also.

Katherine has a PR up to solve this, so it's being worked on.

@eecavanna eecavanna changed the title Implement migrator that creates and populates workflow_chain_set collection Implement migrator that creates and populates workflow_chain_set collection (file: migrator_from_X_to_PR9.py) May 12, 2024
@eecavanna eecavanna changed the title Implement migrator that creates and populates workflow_chain_set collection (file: migrator_from_X_to_PR9.py) Implement migrator that creates/populates workflow_chain_set (file: migrator_from_X_to_PR9.py) May 12, 2024
@eecavanna

eecavanna commented May 15, 2024

Implementation of this migrator is currently blocked by the Minter (or, rather, "a Minter") not allowing people to mint IDs for WorkflowChains. I consider myself — @eecavanna — responsible for resolving that (with respect to this PR).

@brynnz22 brynnz22 marked this pull request as ready for review May 16, 2024 18:28
@brynnz22 brynnz22 requested a review from turbomam May 16, 2024 18:29
@brynnz22
Collaborator Author

@eecavanna @turbomam this migrator is finally ready for a final review and can be merged into main afterwards. The WorkflowChain IDs have been pre-minted. The change to the project.Makefile appears because I edited it and then recently changed it back to match main, so it should match main. Not sure why it's showing those interesting changes.

@brynnz22 brynnz22 requested a review from eecavanna May 16, 2024 18:31
@eecavanna

I'm reviewing this now. I will make some changes to variable names and push a new commit today.

@eecavanna

eecavanna commented May 16, 2024

As @brynnz22 said, the lines that GitHub is showing as being added to the project.Makefile via this PR already exist in that file on the tip of the main branch. I don't understand why GitHub is showing them as new lines in this PR. 🤷


@eecavanna eecavanna left a comment

Looks good to me! I am comfortable with this being merged into the main branch. Once it's merged in, it'll be easier for me to test "all Berkeley migrators" in series.

@eecavanna

eecavanna commented May 17, 2024

Hi @turbomam, once you merge this branch into main, will you also publish a new prerelease package to PyPI? I think publishing a new package to PyPI will unblock microbiomedata/nmdc-runtime#519 (comment) (I'm getting some path-related errors when installing the package directly from the GitHub repo as opposed to from PyPI, and I suspect it's because what's in the GitHub repo hasn't undergone whatever build steps GitHub Actions performs when building a package that will be published to PyPI).

@turbomam turbomam merged commit 0bac14b into main May 17, 2024
4 checks passed
@turbomam turbomam deleted the workflowchain_migrator branch May 17, 2024 16:35

3 participants