Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add basic support for data layer metadata IO #237

Merged
merged 7 commits into from
Apr 24, 2023
Merged

Conversation

brendan-ward
Copy link
Member

Resolves #218

Helps support GeoPandas #2777 and #2850

GDAL silently ignores attempts to write metadata when not supported by the underlying driver.

The output of read_info(...) now includes a metadata key with the dict of metadata from the data layer (or None); note that depending on the driver, there may be metadata read even if not written. E.g., the metadata of a shapefile can be {'DBF_DATE_LAST_UPDATE': '2023-04-02'}.

I did not add metadata to read because it didn't seem useful there.

/cc @DahnJ

@brendan-ward brendan-ward added the enhancement New feature or request label Apr 2, 2023
@brendan-ward brendan-ward added this to the 0.6.0 milestone Apr 2, 2023
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering: does GDAL only support setting it on the layer object? Or could it also be set on the dataset object?

pyogrio/geopandas.py Outdated Show resolved Hide resolved
@brendan-ward
Copy link
Member Author

GDAL appears to support reading / writing at the dataset or layer level, and seems to keep track of which was used (so they roundtrip). This means we probably should pass this in as dataset_metadata or layer_metadata.

@jorisvandenbossche
Copy link
Member

Needs a merge of main.

I was initially wondering if we should given one of both (probably the layer metadata) the "preference" and use plain metadata for that one, for simplicity in case users just need that (I also think Fiona essentially only supports layer metadata, but maybe more accidentally, as the docstring indicates they support both, but the code seems to prefer the layer: https://github.com/Toblerity/Fiona/blob/bd2aabb950206da14a450bd0d863e0b9b92bf5bd/fiona/ogrext.pyx#L1455-L1477).
Often, when writing a GeoDataFrame to a file, you just have a single layer, and this distinction is not that important. On the other hand, it's probably more explicit to just use layer_metadata and dataset_metadata.

@brendan-ward
Copy link
Member Author

Adding metadata as an alias of layer_metadata seems reasonable.

@brendan-ward
Copy link
Member Author

@jorisvandenbossche @martinfleis looks like we might be running into an auth issue with CircleCI that is coming from the GeoPandas organization level? Perhaps an expired personal access token? I don't recall that we specifically set up CircleCI for pyogrio.

@martinfleis
Copy link
Member

That's probably caused by #243 and I believe will resolve once merged?

@jorisvandenbossche
Copy link
Member

Sorry, you can ignore the Circle CI failure. That's indeed from #243, I enabled Circle CI to try building linux arm wheels there (we use CircleCI for that in shapely). But since that's not yet working, will disable it again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: add support for metadata tags if supported by driver
3 participants