Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

geopandas: Use io.StringIO to read geojson data and handle compatibility with geopandas v0.x and v1.x #3247

Merged
merged 6 commits into from
May 20, 2024
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 23 additions & 17 deletions pygmt/helpers/tempfile.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,14 @@
Utilities for dealing with temporary file management.
"""

import io
import uuid
from contextlib import contextmanager
from pathlib import Path
from tempfile import NamedTemporaryFile

import numpy as np
from packaging.version import Version


def unique_name():
Expand Down Expand Up @@ -139,29 +141,33 @@
# 32-bit integer overflow issue. Related issues:
# https://github.com/geopandas/geopandas/issues/967#issuecomment-842877704
# https://github.com/GenericMappingTools/pygmt/issues/2497
if geojson.index.name is None:
geojson.index.name = "index"
geojson = geojson.reset_index(drop=False)
schema = gpd.io.file.infer_schema(geojson)
for col, dtype in schema["properties"].items():
if dtype in ("int", "int64"):
overflow = geojson[col].abs().max() > 2**31 - 1
schema["properties"][col] = "float" if overflow else "int32"
ogrgmt_kwargs["schema"] = schema
if Version(gpd.__version__).major < 1: # GeoPandas v0.x
# The default engine 'fiona' supports the 'schema' parameter.
if geojson.index.name is None:
geojson.index.name = "index"
geojson = geojson.reset_index(drop=False)
schema = gpd.io.file.infer_schema(geojson)
for col, dtype in schema["properties"].items():
if dtype in ("int", "int64"):
overflow = geojson[col].abs().max() > 2**31 - 1
schema["properties"][col] = "float" if overflow else "int32"
ogrgmt_kwargs["schema"] = schema
else: # GeoPandas v1.x.
# The default engine "pyogrio" doesn't support the 'schema' parameter
# but we can change the dtype directly.
for col in geojson.columns:
if geojson[col].dtype in ("int", "int64", "Int64"):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to add "Int64" here because:

>>> import pandas as pd
>>> pd.Int32Dtype().name
'Int32'
>>> pd.Int64Dtype().name
'Int64'

overflow = geojson[col].abs().max() > 2**31 - 1
dtype = "float" if overflow else "int32"
geojson[col] = geojson[col].astype(dtype)

Check warning on line 162 in pygmt/helpers/tempfile.py

View check run for this annotation

Codecov / codecov/patch

pygmt/helpers/tempfile.py#L158-L162

Added lines #L158 - L162 were not covered by tests
# Using geopandas.to_file to directly export to OGR_GMT format
geojson.to_file(**ogrgmt_kwargs)
except AttributeError:
# Other 'geo' formats which implement __geo_interface__
import json

import fiona

with fiona.Env():
jsontext = json.dumps(geojson.__geo_interface__)
# Do Input/Output via Fiona virtual memory
with fiona.io.MemoryFile(file_or_bytes=jsontext.encode()) as memfile:
geoseries = gpd.GeoSeries.from_file(filename=memfile)
geoseries.to_file(**ogrgmt_kwargs)
jsontext = json.dumps(geojson.__geo_interface__)
gpd.read_file(filename=io.StringIO(jsontext)).to_file(**ogrgmt_kwargs)

yield tmpfile.name

Expand Down
Loading