🚸 Walkthrough on using RioXarray IterDataPipes #8

Merged 6 commits on Jun 6, 2022

weiji14 (Owner) commented Jun 6, 2022

Initial tutorial for an Earth Observation data pipeline using torchdata! It uses an example Sentinel-2 dataset over Singapore.

Preview at https://weiji14.github.io/zen3geo/walkthrough.html

Sentinel-2 image over Singapore on 20220115

Extends #6

TODO:

  • Install STAC related dependencies
  • Loading Cloud-Optimized GeoTIFFs from STAC
  • Constructing a DataPipe
  • Looping through DataPipe

References:

  • Planetary Computer SDK for Python
  • Python library for working with Spatiotemporal Asset Catalog (STAC)

Also updated deploy-docs.yml workflow with new docs dependencies.
Initial tutorial draft for an Earth Observation data pipeline using `torchdata`! Includes loading Cloud-Optimized GeoTIFFs from STAC and constructing a DataPipe. More to come on looping through DataPipes with DataLoader.
@weiji14 weiji14 added the documentation Improvements or additions to documentation label Jun 6, 2022
@weiji14 weiji14 added this to the 0.1.0 milestone Jun 6, 2022
@weiji14 weiji14 self-assigned this Jun 6, 2022
Mention that extra keyword arguments like `overview_level` can be passed into `read_from_rioxarray()`. Made a call for contributors to implement extra functionality like clip_box and reproject! Also fixed some typos here and there.
Using iter, for-loop or DataLoader. Which will you choose?
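The first two options in that commit message can be sketched without any dependencies. This is a minimal sketch under assumptions: `ToyDataPipe` is a hypothetical stand-in for an iterable-style DataPipe, and the strings are placeholders for the arrays a real pipeline would yield. It is not the code from the walkthrough.

```python
# Hypothetical stand-in for an iterable-style DataPipe; not the torchdata class.
class ToyDataPipe:
    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        # Yield one item at a time, as an iterable-style pipeline would.
        for item in self.items:
            yield item


# Placeholder strings standing in for the arrays a real pipeline would yield.
dp = ToyDataPipe(["scene_a", "scene_b"])

# Option 1: create an iterator and pull a single item from it.
first = next(iter(dp))

# Option 2: loop over every item with a for-loop.
looped = []
for item in dp:
    looped.append(item)
```

The third option, passing the DataPipe to a PyTorch `DataLoader`, needs PyTorch installed and is left out of this sketch.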
@weiji14 weiji14 marked this pull request as ready for review June 6, 2022 15:46
@weiji14 weiji14 merged commit 76b8349 into main Jun 6, 2022
@weiji14 weiji14 deleted the walkthrough branch June 6, 2022 15:58
weiji14 added a commit that referenced this pull request Sep 7, 2022
Just a random collection of mostly documentation-related patches. Patches the type hints from #52 and the isort import order from #35, mentions the functional name of IterDataPipe in walkthroughs #8 and #20, and removes the mention of a returned tuple to patch #33.

* 🏷️ Add specific type hints for mask_datapipe in geopandas.py

Should be either an xarray.DataArray or xarray.Dataset.

* 🚨 Sort spatialpandas imports in datashader.py

Ran isort to sort spatialpandas.geometry imports alphabetically. Also intersphinx linked the `.crs` attribute to geopandas.GeoDataFrame.crs.

* 💬 Mention functional name of IterDataPipe in walkthroughs

So people don't get confused about why the class form, like `Collator`, is mentioned but `.collate` is used instead.

* 📝 Remove mention of tuple being returned in test_pyogrio_reader

Forgot to edit the unit test's docstring. Patches #33.

* 🍻 It's GeoPackage and GeoDataFrame, not GeoTIFF and DataArray

Need to be more careful when copying and pasting stuff.
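The class-form versus functional-name distinction mentioned in the commit above can be illustrated with a toy registry. This is a sketch of the general idea only: `ToyPipe`, `ToyCollator` and the `functional` decorator are hypothetical names, and torchdata's actual registration mechanism may differ.

```python
# Toy illustration only; torchdata's real registration mechanism may differ.
class ToyPipe:
    _functional = {}  # maps functional names to pipe classes

    def __init__(self, source):
        self.source = source

    def __iter__(self):
        return iter(self.source)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            cls = ToyPipe._functional[name]
        except KeyError:
            raise AttributeError(name) from None
        # Bind this pipe as the source of the registered class.
        return lambda *args, **kwargs: cls(self, *args, **kwargs)


def functional(name):
    """Register a pipe class under a functional (method-style) name."""
    def register(cls):
        ToyPipe._functional[name] = cls
        return cls
    return register


@functional("collate")
class ToyCollator(ToyPipe):
    def __init__(self, source, collate_fn=list):
        super().__init__(source)
        self.collate_fn = collate_fn

    def __iter__(self):
        # Apply the collate function to every item from the source pipe.
        for item in self.source:
            yield self.collate_fn(item)


dp = ToyPipe([(1, 2), (3, 4)])
class_form = list(ToyCollator(dp))    # class form
functional_form = list(dp.collate())  # functional form, same result
```

Both spellings build the same object, which is why documentation can name the class while example code calls the method.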