DOC: basic API docs (#18)
* DOC: basic API docs

* add yml

* try versioneer directly

* avoid installation

* no gdal

* install via requirements

* remove --no-deps

* try conda

* gdal

* use version directly

* use rst for api

* attribution

* docstring formatting

* review comments

* add follow-up note

* docstring formatting
martinfleis committed Nov 18, 2021
1 parent 804669d commit d60a25e
Showing 11 changed files with 336 additions and 23 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -16,4 +16,6 @@ Pipfile.lock

benchmarks/fixtures/*

-.libs
+.libs
+
+docs/build
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
14 changes: 14 additions & 0 deletions docs/environment.yml
@@ -0,0 +1,14 @@
name: pyogrio
channels:
  - conda-forge
dependencies:
  - python==3.9.*
  - gdal
  - numpy==1.19.*
  - numpydoc==1.1.*
  - Cython==0.29.*
  - docutils==0.16.*
  - myst-parser
  - pip
  - pip:
      - sphinx_rtd_theme
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
echo.installed, then set the SPHINXBUILD environment variable to point
echo.to the full path of the 'sphinx-build' executable. Alternatively you
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
15 changes: 15 additions & 0 deletions docs/source/api.rst
@@ -0,0 +1,15 @@
API reference
=============

Core
----

.. automodule:: pyogrio
   :members: list_drivers, list_layers, read_bounds, read_info, set_gdal_config_options, get_gdal_config_option, __gdal_version__, __gdal_version_string__

GeoPandas integration
---------------------

.. automodule:: pyogrio
   :members: read_dataframe, write_dataframe
   :noindex:
67 changes: 67 additions & 0 deletions docs/source/conf.py
@@ -0,0 +1,67 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
import sphinx_rtd_theme
from pyogrio import __version__


autodoc_mock_imports = [
    "geopandas",
    "pygeos",
]

# -- Project information -----------------------------------------------------

project = "pyogrio"
copyright = "2020-2021 Brendan C. Ward and pyogrio contributors"
author = "Brendan C. Ward and pyogrio contributors"

# The full version, including alpha/beta/rc tags
release = __version__


# -- General configuration ---------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx.ext.autodoc",
    "numpydoc",
    "sphinx.ext.autosummary",
    "sphinx_rtd_theme",
    "myst_parser",
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_rtd_theme"

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]
27 changes: 27 additions & 0 deletions docs/source/index.md
@@ -0,0 +1,27 @@
# pyogrio - Vectorized spatial vector file format I/O using GDAL/OGR

Pyogrio provides a
[GeoPandas](https://github.com/geopandas/geopandas)-oriented API to OGR vector
data sources, such as ESRI Shapefile, GeoPackage, and GeoJSON. It converts to
and from `geopandas.GeoDataFrame`s when the data source includes geometry and
`pandas.DataFrame`s otherwise.

Pyogrio uses a vectorized approach for reading and writing DataFrames, resulting in
\>5-10x speedups reading files and \>5-20x speedups writing files compared to using
non-vectorized approaches (Fiona and current I/O support in GeoPandas).
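
A minimal round-trip sketch, assuming `pyogrio` and `geopandas` are installed. `read_dataframe` and `write_dataframe` are the functions listed in the API reference; the `copy_layer` helper, its arguments, and the explicit `driver="GPKG"` default are illustrative assumptions, not part of pyogrio:

```python
def copy_layer(src_path, dst_path, driver="GPKG"):
    """Read a vector layer into a DataFrame and write it back out.

    Hypothetical helper for illustration only; not part of pyogrio.
    """
    # Imported lazily so this module can be imported without pyogrio installed.
    from pyogrio import read_dataframe, write_dataframe

    # A GeoDataFrame if the layer has geometry, a plain DataFrame otherwise.
    df = read_dataframe(src_path)
    write_dataframe(df, dst_path, driver=driver)
    return len(df)
```

Calling `copy_layer("input.shp", "output.gpkg")` (placeholder file names) would read the whole layer in one call and write it out in another, which is the vectorized pattern described above.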



```{warning}
This is an early version and the API is subject to substantial change.
```

```{toctree}
---
maxdepth: 2
caption: Contents
---
install
api
```
99 changes: 99 additions & 0 deletions docs/source/install.md
@@ -0,0 +1,99 @@
# Installation

## Requirements

Supports Python 3.6 - 3.9 and GDAL 2.4.x - 3.2.x
(prior versions will not be supported)

Reading to GeoDataFrames requires `geopandas>=0.8` with `pygeos` enabled.
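
The version ranges above can be checked up front. The following is a rough sketch using only the standard library; the function names are hypothetical and the bounds are simply copied from the requirements listed here:

```python
import sys

# Supported ranges quoted in the requirements above (inclusive).
MIN_PYTHON = (3, 6)
MAX_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info):
    """Return True if the major.minor version is within the documented range."""
    major_minor = tuple(version_info[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def gdal_is_supported(version_string, minimum=(2, 4), maximum=(3, 2)):
    """Check a GDAL version string such as "3.2.1" against the documented range."""
    parts = tuple(int(p) for p in version_string.split(".")[:2])
    return minimum <= parts <= maximum
```

Only the major and minor components are compared, so any patch release within a supported minor version passes.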

## Installation

### Conda-forge

This package is available on [conda-forge](https://anaconda.org/conda-forge/pyogrio)
for Linux, MacOS, and Windows.

```bash
conda install -c conda-forge pyogrio
```

This requires compatible versions of `GDAL` and `numpy` from `conda-forge` for
raw I/O support and `geopandas`, `pygeos`, and their dependencies for GeoDataFrame
I/O support.

### PyPI

This package is not yet available on PyPI because it involves compiled binary
dependencies. We are planning to release this package on PyPI for Linux and MacOS.
We are unlikely to release Windows packages on PyPI in the near future due to
the complexity of packaging binary packages for Windows.

### Common installation errors

You may see a driver error resulting from a `NULL` pointer exception, like this:

```
pyogrio._err.NullPointerError: NULL pointer error
During handling of the above exception, another exception occurred:
...
pyogrio.errors.DriverError: Data source driver could not be created: GPKG
```

This is likely the result of a collision between the GDAL version used by
`fiona` (included in `geopandas`) and the GDAL version needed here. To get
around it, uninstall `fiona`, then reinstall it so that it uses the system GDAL:

```bash
pip uninstall fiona
pip install fiona --no-binary fiona
```

Then restart your interpreter.


## Development

Clone this repository to a local folder.

Install an appropriate distribution of GDAL for your system. `gdal-config` must
be on your system path.

Building `pyogrio` requires `Cython`, `numpy`, and `pandas`.

Run `python setup.py develop` to build the extensions in Cython.

Tests are run using `pytest`:

```bash
pytest pyogrio/tests
```

### Windows

Install GDAL from an appropriate provider of Windows binaries. We've heard that
[OSGeo4W](https://trac.osgeo.org/osgeo4w/) works.

To build on Windows, you need to provide additional command-line parameters
because the location of the GDAL binaries and headers cannot be automatically
determined.

Assuming GDAL is installed to `c:\GDAL`, you can build as follows:

```bash
python -m pip install --install-option=build_ext --install-option="-IC:\GDAL\include" --install-option="-lgdal_i" --install-option="-LC:\GDAL\lib" --no-deps --force-reinstall --no-use-pep517 -e . -v
```

The `GDAL_VERSION` environment variable must be set if the version cannot be
autodetected using `gdalinfo.exe` (which must be on your system `PATH` for
autodetection to work).

The location of the GDAL DLLs must be on your system `PATH`.
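
Before building, it can help to confirm that one of the two version sources is available. This is a hypothetical pre-flight check using only the standard library; it is not the logic used by `setup.py`, and the order in which the two sources are tried is an assumption:

```python
import os
import shutil


def gdal_version_source(environ=None):
    """Report how the GDAL version could be determined, or None if it cannot.

    Hypothetical helper for illustration; the real detection lives in setup.py.
    """
    environ = os.environ if environ is None else environ
    if environ.get("GDAL_VERSION"):
        return "GDAL_VERSION environment variable"
    # shutil.which searches the given PATH (and honours PATHEXT on Windows,
    # so gdalinfo.exe is found as "gdalinfo").
    if shutil.which("gdalinfo", path=environ.get("PATH")):
        return "gdalinfo on PATH"
    return None
```

If this returns `None`, either set `GDAL_VERSION` or add the directory containing `gdalinfo.exe` to `PATH` before running the build command.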

`--no-use-pep517` is required in order to pass additional options to the build
backend (see https://github.com/pypa/pip/issues/5771).

Also see `.github/test-windows.yml` for additional ideas if you run into problems.

Windows is minimally tested; we are currently unable to get automated tests
working on our Windows CI.
36 changes: 19 additions & 17 deletions pyogrio/core.py
@@ -27,8 +27,8 @@ def list_drivers(read=False, write=False):
Returns
-------
dict
-Mapping of driver name to file mode capabilities: "r": read, "w": write.
-Drivers that are available but with unknown support are marked with "?"
+Mapping of driver name to file mode capabilities: ``"r"``: read, ``"w"``: write.
+Drivers that are available but with unknown support are marked with ``"?"``
"""

drivers = ogr_list_drivers()
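
The mapping described in this docstring can be filtered by capability. A small sketch, assuming the values are mode strings such as `"rw"`, `"r"`, or `"?"`; the `drivers_with` helper and the example mapping (including `SomeDriver`) are hypothetical:

```python
def drivers_with(capability, drivers):
    """Return sorted driver names whose mode string contains ``capability``."""
    return sorted(name for name, modes in drivers.items() if capability in modes)


# Hypothetical mapping in the shape the docstring describes.
example_drivers = {"GPKG": "rw", "OpenFileGDB": "r", "SomeDriver": "?"}

writable = drivers_with("w", example_drivers)
readable = drivers_with("r", example_drivers)
```

Drivers marked `"?"` match neither `"r"` nor `"w"`, so they are excluded from both lists.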
@@ -83,12 +83,12 @@ def read_bounds(
features. Must be less than the total number of features in the file.
max_features : int, optional (default: None)
Number of features to read from the file. Must be less than the total
-number of features in the file minus skip_features (if used).
+number of features in the file minus ``skip_features`` (if used).
where : str, optional (default: None)
Where clause to filter features in layer by attribute values. Uses a
restricted form of SQL WHERE clause, defined here:
http://ogdi.sourceforge.net/prop/6.2.CapabilitiesMetadata.html
-Examples: "ISO_A3 = 'CAN'", "POP_EST > 10000000 AND POP_EST < 100000000"
+Examples: ``"ISO_A3 = 'CAN'"``, ``"POP_EST > 10000000 AND POP_EST < 100000000"``
bbox : tuple of (xmin, ymin, xmax, ymax), optional (default: None)
If present, will be used to filter records whose geometry intersects this
box. This must be in the same CRS as the dataset.
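
As an illustration of the `bbox` semantics described above, here is a pure-Python sketch of an envelope intersection test. It is not how GDAL applies the filter internally; the helper names are hypothetical, and `bounds` is assumed to use the four-row layout (xmin, ymin, xmax, ymax) that `read_bounds` is documented to return:

```python
def bbox_intersects(bounds, bbox):
    """Return True if two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    xmin, ymin, xmax, ymax = bounds
    qxmin, qymin, qxmax, qymax = bbox
    return xmin <= qxmax and xmax >= qxmin and ymin <= qymax and ymax >= qymin


def filter_by_bbox(fids, bounds, bbox):
    """Keep the IDs whose bounding box intersects ``bbox``.

    ``bounds`` holds four rows (xmin, ymin, xmax, ymax), one column per feature.
    """
    # zip(*bounds) turns the four rows into one (xmin, ymin, xmax, ymax) per feature.
    return [fid for fid, box in zip(fids, zip(*bounds)) if bbox_intersects(box, bbox)]
```

For example, with three unit squares at offsets 0, 5, and 10, a query box of `(0.5, 0.5, 5.5, 5.5)` keeps the first two features and drops the third.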
@@ -97,7 +97,7 @@ def read_bounds(
-------
tuple of (fids, bounds)
fids are global IDs read from the FID field of the dataset
-bounds are ndarray of shape (4, n) containing xmin, ymin, xmax, ymax
+bounds are ndarray of shape (4, n) containing ``xmin``, ``ymin``, ``xmax``, ``ymax``
"""

return ogr_read_bounds(
@@ -113,7 +113,7 @@ def read_bounds(
def read_info(path, layer=None, encoding=None):
"""Read information about an OGR data source.
-`crs` and `geometry` will be `None` and `features` will be 0 for a
+``crs`` and ``geometry`` will be ``None`` and ``features`` will be 0 for a
nonspatial layer.
Parameters
@@ -129,13 +129,15 @@ def read_info(path, layer=None, encoding=None):
Returns
-------
dict
-{
-"crs": "<crs>",
-"fields": <ndarray of field names>,
-"encoding": "<encoding>",
-"geometry": "<geometry type>",
-"features": <feature count>
-}
+A dictionary with the following keys::
+
+{
+"crs": "<crs>",
+"fields": <ndarray of field names>,
+"encoding": "<encoding>",
+"geometry": "<geometry type>",
+"features": <feature count>
+}
"""
return ogr_read_info(str(path), layer=layer, encoding=encoding)

@@ -154,9 +156,9 @@ def set_gdal_config_options(options):
----------
options : dict
If present, provides a mapping of option name / value pairs for GDAL
-configuration options. True / False are normalized to 'ON' / 'OFF'.
-A value of None for a config option can be used to clear out a previously
-set value.
+configuration options. ``True`` / ``False`` are normalized to ``'ON'``
+/ ``'OFF'``. A value of ``None`` for a config option can be used to clear out a
+previously set value.
"""

_set_gdal_config_options(options)
@@ -173,7 +175,7 @@ def get_gdal_config_option(name):
Returns
-------
value of the option or None if not set
-'ON' / 'OFF' are normalized to True / False.
+``'ON'`` / ``'OFF'`` are normalized to ``True`` / ``False``.
"""

return _get_gdal_config_option(name)
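
The normalization described in the two docstrings above can be sketched in plain Python. This is only an illustration of the documented behavior, not pyogrio's implementation; both helper names are hypothetical, and passing other values through unchanged is an assumption:

```python
def to_gdal_value(value):
    """Normalize a Python value as described for set_gdal_config_options."""
    if value is True:
        return "ON"
    if value is False:
        return "OFF"
    # None is kept as-is: it signals that a previously set option is cleared.
    # Any other value is passed through unchanged (assumption).
    return value


def from_gdal_value(value):
    """Normalize a stored value as described for get_gdal_config_option."""
    if value == "ON":
        return True
    if value == "OFF":
        return False
    return value
```

The `is True` / `is False` checks are deliberate: they keep integers such as `1` or `0` from being treated as booleans.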