Develop a plugin interface to add new downloaders #328

Open
3 tasks
MattF-NSIDC opened this issue Oct 25, 2023 · 8 comments

Comments

@MattF-NSIDC
Member

MattF-NSIDC commented Oct 25, 2023

We would like to make data services more accessible, but we don't want earthaccess to be coupled with the APIs of all the different services.

Bri suggested adding support for asynchronous ordering as an access method, and @andypbarrett suggested support for the NSIDC subsetter. Joseph suggested that a plugin system would be a good way to decouple this significant complexity from the main earthaccess codebase. I think we have broad agreement that this is a good way forward. Does that sound right?

Next up, we need to talk about the interface! We need to be thoughtful about breaking changes here, because those will create work for all plugin projects (and those downstream of the plugins). What does developing a plugin look like? How will earthaccess discover/register installed plugins? What context will earthaccess provide to the plugin? If that includes credentials, how should we warn users to use only trusted plugins? What information will be reported back from the plugin to earthaccess? How do we test that we haven't introduced breaking changes to the interface and, if we have, apply the correct version change?

Identified tasks (please edit me!)

  • Think about what some likely plugins would look like. List out what has been explicitly requested so far in one place.
  • Specify an interface between earthaccess and its plugins
  • Edit me!


EDIT: cc @BriannaLind @jhkennedy

@jhkennedy
Collaborator

So I've been poking at this a little. Harmony and HyP3 both have Python clients which allow you to interact with those services, so building an earthaccess plugin with those should be straightforward -- most of the work is just translating the earthaccess search results into the form needed by those services. Just sketching it out in pseudocode, this could look like:

# HyP3
results = earthaccess.search(...)
jobs = earthaccess.hyp3.submit_rtc_jobs(results, **kwargs)
jobs.watch()

ds = earthaccess.smart_open(jobs)  # xarray dataset

# Harmony
results = earthaccess.search(...)
jobs = earthaccess.harmony.mosaic(results, **kwargs)
jobs.watch()

ds = earthaccess.smart_open(jobs)  # xarray dataset

# SBAS -> HyP3
results = earthaccess.search(...)
stacks = earthaccess.asf.sbas(results, **kwargs)
jobs = earthaccess.hyp3.submit_insar_jobs(stacks, **kwargs)
jobs.watch()

ds = earthaccess.smart_open(jobs)  # xarray dataset

@jhkennedy
Collaborator

Really, there are three fundamental interfaces for services:

  1. Submit granules to the service
  2. Wait for the service to do its thing
  3. Access the resulting data from the service

@nikki-t
Collaborator

nikki-t commented Jun 3, 2024

Some thoughts on how to do this, borrowing from @jhkennedy's comments above.

Abstract Plugin Interface: PluginInterface

  1. submit_granules(): Submit granules to the service.
  2. wait_for_service(): Wait for the service to do its thing.
  3. get_results(): Access the resulting data from the service.
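As a rough illustration of the interface listed above, here is a minimal sketch using Python's abc module. The method names come from the list; everything else (argument names, type hints, the idea of a "job" handle, and the IncompletePlugin class) is an assumption made for illustration, not an agreed design.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional, Sequence


class PluginInterface(ABC):
    """Hypothetical base class; only the three method names come from the thread."""

    @abstractmethod
    def submit_granules(self, granules: Sequence[Any], **kwargs: Any) -> Any:
        """Submit granules to the service and return some job handle (assumed)."""

    @abstractmethod
    def wait_for_service(self, job: Any, timeout: Optional[float] = None) -> Any:
        """Block until the service has finished processing the job."""

    @abstractmethod
    def get_results(self, job: Any) -> Any:
        """Access the resulting data from the service."""


class IncompletePlugin(PluginInterface):
    """Illustrative plugin that implements only one of the three methods."""

    def submit_granules(self, granules, **kwargs):
        return {"granules": list(granules)}


# ABC refuses to instantiate a subclass that leaves abstract methods undefined,
# so an incomplete plugin fails early instead of at call time.
error_message = None
try:
    IncompletePlugin()
except TypeError as err:
    error_message = str(err)
```

One appeal of an abstract base class is that a plugin missing part of the interface fails at instantiation, which gives earthaccess a cheap conformance check at load time.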

Plugin Discovery:
I propose we create a 'plugins' directory and follow the "Using namespace packages" section of this documentation.

earthaccess would then discover and load any plugins located in the 'plugins' directory and build a list of available plugins. The earthaccess.results.DataCollection.service method could use that list to report available plugins when returning the services associated with a collection.
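A minimal sketch of namespace-package discovery, using only the standard library's pkgutil and importlib. The function names and the earthaccess.plugins package mentioned in the comments are hypothetical; the demo at the end scans the standard-library json package purely so the snippet runs on its own.

```python
import importlib
import pkgutil
from types import ModuleType
from typing import Dict, Iterator


def iter_namespace(ns_pkg: ModuleType) -> Iterator[pkgutil.ModuleInfo]:
    """Yield info for every module found directly under the given package."""
    # The prefix makes each yielded name absolute, e.g. "pkg.submodule".
    return pkgutil.iter_modules(ns_pkg.__path__, ns_pkg.__name__ + ".")


def load_plugins(ns_pkg: ModuleType) -> Dict[str, ModuleType]:
    """Import every module under the package and map its name to the module."""
    return {
        info.name: importlib.import_module(info.name)
        for info in iter_namespace(ns_pkg)
    }


# In earthaccess this might be called as load_plugins(earthaccess.plugins),
# where earthaccess.plugins is a hypothetical namespace package.
# For a runnable demo, list the submodules of the stdlib json package instead.
import json

plugin_names = sorted(info.name for info in iter_namespace(json))
```

Note that this approach imports every module it finds, so a broken plugin could raise at discovery time unless the import is wrapped in error handling.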

Later on, we could let users install plugins with a command like pip install earthaccess[harmony], which would install and make available only the Harmony service plugin. This way the user only has to install what they need. It looks like this might require the use of entry points.
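For comparison, a sketch of what entry-point discovery could look like with the standard library's importlib.metadata. The group name "earthaccess.plugins" and the package and class names in the comment are assumptions, not anything earthaccess defines today.

```python
from importlib.metadata import EntryPoint, entry_points
from typing import Dict

# A plugin distribution would advertise itself in its own pyproject.toml,
# for example (all names here are hypothetical):
#
#   [project.entry-points."earthaccess.plugins"]
#   harmony = "earthaccess_harmony:HarmonyPlugin"


def discover_plugins(group: str = "earthaccess.plugins") -> Dict[str, EntryPoint]:
    """Return the entry points registered under `group`, keyed by name."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable interface.
        selected = eps.select(group=group)
    else:
        # Python 3.9 returns a plain dict mapping group name to entry points.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


# Nothing is imported until a plugin is actually requested:
#   plugin_cls = discover_plugins()["harmony"].load()
available = discover_plugins()
```

Because entry points are only loaded on demand, an installed but unused plugin costs nothing at import time, and an environment with no plugins installed simply yields an empty mapping.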

Steps to implement a plugin:

  1. Create a new directory with the name of your plugin in the 'plugins' directory.
  2. Create an __init__.py file in your plugin directory.
  3. Create a class that inherits from PluginInterface and implement the following methods:
    a. submit_granules()
    b. wait_for_service()
    c. get_results()
  4. Register your plugin by editing the pyproject.toml file. I need to do a little bit more research on this. I found xarray documentation and Python packaging documentation that might be helpful.
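To make step 3 concrete, here is a toy plugin that walks through the submit, wait, and get-results cycle. The EchoPlugin class, the dict used as a job handle, and the argument names are all invented for illustration; a compact stand-in base class is included only so the snippet runs by itself.

```python
import time
from abc import ABC, abstractmethod


class PluginInterface(ABC):
    """Compact stand-in for the proposed base class (signatures are assumed)."""

    @abstractmethod
    def submit_granules(self, granules): ...

    @abstractmethod
    def wait_for_service(self, job, poll_interval=0.0): ...

    @abstractmethod
    def get_results(self, job): ...


class EchoPlugin(PluginInterface):
    """Toy plugin: 'processes' granules by upper-casing their IDs."""

    def submit_granules(self, granules):
        # A real plugin would call the remote service and return its job ID.
        return {"pending": list(granules), "done": []}

    def wait_for_service(self, job, poll_interval=0.0):
        # Simulate polling: finish one granule per iteration until none remain.
        while job["pending"]:
            job["done"].append(job["pending"].pop(0).upper())
            time.sleep(poll_interval)
        return job

    def get_results(self, job):
        return list(job["done"])


plugin = EchoPlugin()
job = plugin.submit_granules(["g1", "g2"])
plugin.wait_for_service(job)
results = plugin.get_results(job)
```

A real plugin would replace the in-memory dict with calls to the service's own client and would likely need a timeout in the wait step.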

I would like to implement the PluginInterface class, test the discovery functionality to see how everything might work, and then regroup for feedback on a hackday!

@mfisher87
Member

I feel like plugins should be independent repositories so they can be packaged separately. That way we don't need to maintain optional dependencies within this repo for every plugin, and it gives plugin authors more autonomy. I like xarray's documentation about this! I think they're prescribing that the plugin's pyproject.toml would define metadata needed for xarray to discover it. I need to understand better how that discovery works. I like this model.

@nikki-t
Collaborator

nikki-t commented Jun 10, 2024

That's a great idea. I am also interested in better understanding the plugin discovery mechanism. So far I have been thinking of the plugin code as a part of earthaccess, but I will explore what xarray has done to see how plugins might be contained in their own repos.

@mfisher87
Member

I was hoping to catch some news at the end of today's hack day, sorry I missed you!

@nikki-t
Collaborator

nikki-t commented Jun 12, 2024

@mfisher87 - No worries, I dropped off a little early because I was stuck on failing unit tests for the services PR. (Luis offered to take a look at the PR, as the unit tests are failing on Python 3.10 - 3.12 but succeeding on Python 3.9.) But we didn't get to talk about the plugin architecture. Maybe we can reserve a little bit of time during the next hackday to discuss!

@mfisher87
Member

Sounds like a good plan :)
