Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read with GeoArrow metadata for GDAL >= 3.8 #366

Merged
merged 2 commits into from
Feb 29, 2024

Conversation

kylebarron
Copy link
Contributor

Resolves half of #345. @jorisvandenbossche mentioned in #345 (comment) that he'd suggest making the change to geoarrow metadata. I'd also propose that we make this the default.

A test is added to verify that the metadata on the geometry field is set. The name should be geoarrow.wkb and the extension metadata should include a CRS field with PROJJSON, which I assert is EPSG:4326 for this test dataset.

@brendan-ward would you still prefer having a flag for the user to set this? Ref #345 (comment)

@kylebarron
Copy link
Contributor Author

kylebarron commented Feb 26, 2024

This will be exciting to get working with lonboard, a visualization library I'm working on. I added support for parsing from geoarrow.wkb input and with the CRS information it'll do multithreaded reprojection to WGS84, parallelizing over each Arrow chunk (conveniently GDAL will handle the chunking into row groups). So it should be really fast to plot data from a file (though today it still goes through shapely for the WKB -> GeoArrow parsing, which is a relative bottleneck, though still certainly fast enough)

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
I am personally fine with just defaulting to geoarrow without providing an explicit keyword to control this (in the end it's just for the metadata, not the actual data you get)

Copy link
Member

@brendan-ward brendan-ward left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Seems reasonable to go with geoarrow as the default and no need for a keyword, especially since it makes downstream handling easier.

@brendan-ward
Copy link
Member

Could you please add an entry to the changelog?

@brendan-ward brendan-ward added this to the 0.8.0 milestone Feb 28, 2024
@brendan-ward brendan-ward merged commit 0908672 into geopandas:main Feb 29, 2024
19 checks passed
@kylebarron kylebarron deleted the kyle/read-geoarrow-metadata branch February 29, 2024 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants