Copilot AI commented Jan 26, 2026

Adds automatic chunk shape calculation, based on JZarr's implementation, for arrays created without explicit chunks. Each chunk dimension is capped at roughly 512 elements. The feature is available for both v2 and v3 arrays.

Changes

Default chunk shape calculation (v2 and v3)

  • Removed requirement to explicitly specify chunks (v2) or chunkGrid (v3) during array creation
  • Implements JZarr's algorithm: for each dimension, numChunks = shapeDim / 512 (integer division), and the chunk size is shapeDim / (numChunks + 1)
  • For dimensions < 512, uses full dimension as chunk size
  • Example: shape [2000, 1500] → chunks [500, 500]
  • Shared implementation in Utils.calculateDefaultChunks() ensures consistency between v2 and v3
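
For illustration, the per-dimension rule described above can be sketched as follows. This is a minimal sketch, not the code from this PR: the class name DefaultChunkSketch, the method defaultChunks, the long[]/int[] types, and the clamp to a minimum chunk size of 1 are all assumptions made for the example, and the real Utils.calculateDefaultChunks() may differ in signature and edge-case handling.

```java
import java.util.Arrays;

// Hypothetical sketch of the default chunk rule described in this PR.
// Names and types are illustrative, not the zarr-java API.
class DefaultChunkSketch {
    // Target upper bound per chunk dimension, as stated in the PR description.
    static final int TARGET = 512;

    static int[] defaultChunks(long[] shape) {
        int[] chunks = new int[shape.length];
        for (int i = 0; i < shape.length; i++) {
            long dim = shape[i];
            // Integer division: 0 for dimensions smaller than 512.
            long numChunks = dim / TARGET;
            // For dim < 512 this yields dim itself (full dimension).
            long chunk = dim / (numChunks + 1);
            // Assumption of this sketch: never emit a chunk size of 0.
            chunks[i] = (int) Math.max(1, chunk);
        }
        return chunks;
    }

    public static void main(String[] args) {
        // prints [500, 500]
        System.out.println(Arrays.toString(defaultChunks(new long[]{2000, 1500})));
        // prints [300, 256]
        System.out.println(Arrays.toString(defaultChunks(new long[]{300, 512})));
    }
}
```

Note that a dimension of exactly 512 is split into two chunks of 256 under this rule, since 512 / 512 = 1 and 512 / (1 + 1) = 256.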

V2 Implementation

  • Modified dev.zarr.zarrjava.v2.ArrayMetadataBuilder.build() to calculate default chunks when chunks == null

V3 Implementation

  • Modified dev.zarr.zarrjava.v3.ArrayMetadataBuilder.build() to calculate default chunks when chunkGrid == null
  • Creates RegularChunkGrid with calculated chunk shape

Dimension Separator Behavior

  • V2 arrays follow the Zarr v2 spec: if dimension_separator is not specified in metadata, it defaults to . (DOT)
  • Auto-detection was removed based on maintainer feedback due to performance concerns
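
To show what the default means in practice, here is a small sketch of how a v2 chunk key is formed when dimension_separator is absent versus explicitly set. The class SeparatorSketch and its methods are hypothetical helpers written for this example and are not part of zarr-java; only the rule itself (chunk indices joined by the separator, defaulting to ".") comes from the Zarr v2 spec.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not the zarr-java API.
class SeparatorSketch {
    // Falls back to "." when the metadata carries no dimension_separator.
    static String resolveSeparator(String fromMetadata) {
        return fromMetadata == null ? "." : fromMetadata;
    }

    // Joins the chunk grid indices with the resolved separator.
    static String chunkKey(long[] chunkIndices, String separatorFromMetadata) {
        String sep = resolveSeparator(separatorFromMetadata);
        return Arrays.stream(chunkIndices)
                .mapToObj(i -> Long.toString(i))
                .collect(Collectors.joining(sep));
    }

    public static void main(String[] args) {
        // prints 1.2 (separator missing from metadata)
        System.out.println(chunkKey(new long[]{1, 2}, null));
        // prints 1/2 (separator set explicitly)
        System.out.println(chunkKey(new long[]{1, 2}, "/"));
    }
}
```

Resolving the separator from metadata alone avoids listing the store, which was the cost concern raised about auto-detection.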

Example usage:

// V2 - chunks optional
Array.create(store, Array.metadataBuilder()
    .withShape(2000, 1500)
    .withDataType(DataType.UINT8)
    .build());  // Auto-calculates chunks [500, 500]

// V3 - chunkGrid optional
Array.create(store, Array.metadataBuilder()
    .withShape(2000, 1500)
    .withDataType(DataType.UINT8)
    .build());  // Auto-calculates chunk grid with [500, 500]

The feature is backward compatible. Existing code specifying chunks or chunkGrid explicitly continues to work unchanged.

Original prompt

This section details the original issue you should resolve.

<issue_title>[v2] Add support for default chunk shape and auto-detection of dimension separator</issue_title>
<issue_description>In JZarr, there are some convenience features that are currently missing in zarr-java. Implementing them would improve usability and compatibility.

  1. Default chunk shape selection

    • When chunkshape is not specified during array creation, JZarr automatically selects chunks of about size 512 (with some logic to make it fit).
    • Reference implementation: ArrayParams::build.
  2. Auto-detection of dimension separator

    • If an array is opened without a specified dimension_separator, JZarr can infer the correct separator from the filesystem layout.
    • Reference implementation: ZarrArray::findNestedChunks. </issue_description>

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 2 commits January 26, 2026 12:17
Co-authored-by: brokkoli71 <44113112+brokkoli71@users.noreply.github.com>
Co-authored-by: brokkoli71 <44113112+brokkoli71@users.noreply.github.com>
Copilot AI changed the title [WIP] Add support for default chunk shape and auto-detection of separator [v2] Add default chunk shape calculation and dimension separator auto-detection Jan 26, 2026
Copilot AI requested a review from brokkoli71 January 26, 2026 12:23
@brokkoli71
Member

Maybe the dimension separator auto-detection is not needed. The implementation with listing the store seems a bit expensive for such a small feature. I think a zarr v2 array should specify the dimension separator anyway if it is not "." right? @normanrz

@normanrz
Member

> Maybe the dimension separator auto-detection is not needed. The implementation with listing the store seems a bit expensive for such a small feature. I think a zarr v2 array should specify the dimension separator anyway if it is not "." right? @normanrz

Yeah, I agree. Yes, the dimension_separator was optional. If not present, it should be ".".

@brokkoli71
Member

@copilot the default chunk shape calculation should be available for v2 and v3

@brokkoli71 brokkoli71 changed the title [v2] Add default chunk shape calculation and dimension separator auto-detection Add default chunk shape calculation Jan 26, 2026
Copilot AI and others added 2 commits January 26, 2026 14:26
Co-authored-by: brokkoli71 <44113112+brokkoli71@users.noreply.github.com>
Co-authored-by: brokkoli71 <44113112+brokkoli71@users.noreply.github.com>
Author

Copilot AI commented Jan 26, 2026

> @copilot the default chunk shape calculation should be available for v2 and v3

Implemented in commits a435117 and f76f7d6. Default chunk shape calculation is now available for both v2 and v3 arrays using a shared utility method.

Warning

I tried to connect to the following addresses, but was blocked by firewall rules:

  • artifacts.unidata.ucar.edu
    • Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java /usr/lib/jvm/temurin-17-jdk-amd64/bin/java --enable-native-access=ALL-UNNAMED -classpath /usr/share/apache-maven-3.9.12/boot/plexus-classworlds-2.9.0.jar -Dclassworlds.conf=/usr/share/apache-maven-3.9.12/bin/m2.conf -Dmaven.home=/usr/share/apache-maven-3.9.12 -Dlibrary.jansi.path=/usr/share/apache-maven-3.9.12/lib/jansi-native -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/zarr-java/zarr-java org.codehaus.plexus.classworlds.launcher.Launcher -f pom.xml -B -V -e -Dfindbugs.skip -Dcheckstyle.skip -Dpmd.skip=true -Dspotbugs.skip -Denforcer.skip -Dmaven.javadoc.skip (dns block)

Copilot AI changed the title Add default chunk shape calculation Add default chunk shape calculation for v2 and v3 arrays Jan 26, 2026
Development

Successfully merging this pull request may close these issues.

[v2] Add support for default chunk shape and auto-detection of dimension separator

3 participants