Implement Zarr spatial and proj conventions support#883
Implement Zarr spatial and proj conventions support#883emmanuelmathot wants to merge 26 commits intocorteva:masterfrom
Conversation
…g and formatting utilities
…sions' attribute in XRasterBase
- Introduced a new module `rioxarray/_convention/zarr.py` for handling Zarr conventions. - Implemented functions to read and write CRS and spatial metadata according to Zarr specifications. - Added an enumeration `Convention` in `rioxarray/enum.py` to manage geospatial metadata conventions (CF and Zarr). - Updated `rioxarray/_options.py` to include convention options and validators. - Modified `rioxarray/rioxarray.py` to utilize the new convention architecture for reading and writing CRS and transforms. - Created integration tests for the new convention architecture in `test/integration/test_convention_architecture.py`. - Added unit tests for convention functionality in `test/test_convention_architecture.py`.
|
If convention is |
- Added entry in history.rst for Zarr spatial and proj conventions support with Convention enum and Zarr-specific methods (corteva#883). - Included a new section for conventions in index.rst. - Documented the rioxarray.Convention class in rioxarray.rst.
…onvention handling
…hod and improve dimension detection fallback
… and improve clarity on default behaviors
…for Zarr conventions
…ove test coverage
|
In general, I would like to see only the minimum code needed for This process may even benefit from breaking down the pieces into separate MRs:
Dividing it up this way will help each component get the review it needs. But, we can stick to a big MR if you prefer, |
| return json.loads(projjson_str) | ||
|
|
||
|
|
||
| def calculate_spatial_bbox( |
There was a problem hiding this comment.
You could use rasterio.transform.array_bounds to simplify this.
|
|
||
| When reading geospatial metadata, rioxarray follows this priority order based on the global convention setting: | ||
|
|
||
| - **None (default)**: CF conventions first, with Zarr conventions as fallback if explicitly declared |
There was a problem hiding this comment.
I would like both conventions attempted when reading files regardless of the convention setting. The convention setting changing the search priority is fine. This way you can read in files from either format at the same time with the convention setting set.
|
For testing:
Imports:
|
|
|
||
| # Handle PROJJSON dict (Zarr proj:projjson convention) | ||
| if isinstance(crs_input, dict): | ||
| crs_input = CRS.from_json_dict(crs_input) |
There was a problem hiding this comment.
This should be handled with CRS.from_user_input below.
There was a problem hiding this comment.
| return attrs_out | ||
|
|
||
|
|
||
| def write_conventions( |
There was a problem hiding this comment.
This is an extra function I would prefer not included in this PR.
| "- [rio.write_transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.write_transform)\n", | ||
| "- [rio.transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.transform)" | ||
| "- [rio.transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.transform)\n", | ||
| "- [rio.write_zarr_crs()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.write_zarr_crs) - New Zarr method\n", |
| "- [rio.write_transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.write_transform)\n", | ||
| "- [rio.transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.transform)" | ||
| "- [rio.transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.transform)\n", | ||
| "- [rio.write_zarr_transform()](../rioxarray.rst#rioxarray.rioxarray.XRasterBase.write_zarr_transform) - Zarr-specific method" |
| "id": "5e08da24", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "## Zarr-Specific Methods\n", |
| from rioxarray._io import open_rasterio | ||
| from rioxarray._options import set_options | ||
| from rioxarray._show_versions import show_versions | ||
| from rioxarray.enum import Convention |
There was a problem hiding this comment.
I would prefer this not be added here and stay in the enum namespace for now.
| .. autoclass:: rioxarray.set_options | ||
|
|
||
|
|
||
| rioxarray.Convention |
| requires-python = ">=3.12" | ||
| dependencies = [ | ||
| "packaging", | ||
| "rasterio>=1.4.3", |
There was a problem hiding this comment.
Please don't remove the dependency pin.
| @@ -1,5 +1,5 @@ | |||
| default_language_version: | |||
| python: python3.12 | |||
There was a problem hiding this comment.
Please don't modify this in this PR.
| all = [ | ||
| "scipy" | ||
| ] | ||
| test = [ |
There was a problem hiding this comment.
Please don't modify dependencies in this PR.
alfredoahds
left a comment
There was a problem hiding this comment.
Seconding Alan's comment about minimizing the MR surface making this easier to review.
Overall this seems good, one nitpick is the recurring series of
if convention is ...:
...
elif convention is ...:
...
Maybe we can define a common interface for the convention modules and simplify this down? There will be some unused args in the zarr methods but having something like
convention = _get_convention(convention_input or get_option(CONVENTION))
parsed_crs = convention.read_crs(...)
would reduce that duplication/series of ifs.
| # Use CF convention logic for dimension detection | ||
| # Also use this as fallback when convention is None | ||
| if "x" in self._obj.dims and "y" in self._obj.dims: | ||
| self._x_dim = "x" | ||
| self._y_dim = "y" | ||
| elif "longitude" in self._obj.dims and "latitude" in self._obj.dims: | ||
| self._x_dim = "longitude" | ||
| self._y_dim = "latitude" | ||
| else: | ||
| # look for coordinates with CF attributes | ||
| for coord in self._obj.coords: | ||
| # make sure to only look in 1D coordinates | ||
| # that has the same dimension name as the coordinate | ||
| if self._obj.coords[coord].dims != (coord,): | ||
| continue | ||
| if ( | ||
| self._obj.coords[coord].attrs.get("axis", "").upper() == "X" | ||
| ) or ( | ||
| self._obj.coords[coord].attrs.get("standard_name", "").lower() | ||
| in ("longitude", "projection_x_coordinate") | ||
| ): | ||
| self._x_dim = coord | ||
| elif ( | ||
| self._obj.coords[coord].attrs.get("axis", "").upper() == "Y" | ||
| ) or ( | ||
| self._obj.coords[coord].attrs.get("standard_name", "").lower() | ||
| in ("latitude", "projection_y_coordinate") | ||
| ): | ||
| self._y_dim = coord |
There was a problem hiding this comment.
Thoughts on moving this bit into rioxarray._convention.cf.read_spatial_dimensions so there's a consistent interface for the convention options?
This is the first in a series of PRs splitting corteva#883 as requested by maintainers. **Changes:** - Add `Convention` enum with `CF` value - Add `convention` option to `set_options()` - Create cf.py module with extracted CF read/write logic - Refactor rioxarray.py to use the new CF module This PR establishes the framework for supporting multiple geospatial metadata conventions. Future PRs will add Zarr convention support. --- - [ ] Closes corteva#883 (partial - first of series) - [x] Tests added - [] Fully documented, including history.rst for all changes and rioxarray.rst for new API
Draft for Zarr conventions support according to plan in #882
docs/history.rstfor all changes anddocs/rioxarray.rstfor new API