feat(arrow): support ArrowReaderOptions in get_metadata #2003
xbattlax wants to merge 3 commits into apache:main
Force-pushed from 32f3eca to 126011a
arrow-arith = "57.1"
arrow-array = "57.1"
arrow-buffer = "57.1"
arrow-cast = "57.1"
arrow-ord = "57.1"
arrow-schema = "57.1"
arrow-select = "57.1"
arrow-string = "57.1"
In the linked issue #1934, I see @lgingerich mentioned:
"so there is no crate dependency upgrade work that needs done"
Do we still need this package upgrade as part of this change?
Yes, the upgrade to 57.1 is required. The ArrowReaderOptions::metadata_options() method used in this PR was only exposed starting in arrow 57.1 (see apache/arrow-rs#7393). Without the upgrade, the code won't compile.
See my now-closed PR: #1935
"so there is no crate dependency upgrade work that needs done"
This ended up not being true; an upgrade is necessary. That's why I closed my PR.
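
For context on why the manifest bump matters: Cargo treats a bare requirement such as "57.0" or "57.1" as a caret requirement, so "57.0" can still be satisfied by a 57.0.x release (for example, one already pinned in a lockfile), while "57.1" rules those out. A minimal sketch of that rule, assuming a major version above 0 and ignoring pre-release versions; `satisfies_caret` is a hypothetical helper for illustration, not part of Cargo or this PR:

```rust
/// Returns true if `version` (major, minor, patch) satisfies a bare Cargo
/// requirement "major.minor", which Cargo interprets as a caret requirement.
/// Assumes major > 0: the major version must match exactly and the minor
/// version must be at least the required minor. Any patch level is accepted
/// because the requirement's patch component defaults to 0.
fn satisfies_caret(req: (u64, u64), version: (u64, u64, u64)) -> bool {
    let (req_major, req_minor) = req;
    let (major, minor, _patch) = version;
    major == req_major && minor >= req_minor
}
```

Under this rule, a requirement of (57, 0) accepts 57.0.0, whereas (57, 1) rejects it and accepts 57.1.0, which is the guarantee the code in this PR needs.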
Respect ArrowReaderOptions by extracting metadata_options and passing it to ParquetMetaDataReader via with_metadata_options(). This allows callers to configure metadata decoding behavior.

This change requires parquet 57.1.0 which added the ParquetMetaDataReader::with_metadata_options() API.

Changes:
- Update arrow and parquet dependencies from 57.0 to 57.1
- Update bindings/python/Cargo.lock for the new versions
- Use metadata_options from ArrowReaderOptions in get_metadata

Closes apache#1934
Force-pushed from 65c2f99 to 1f0add9
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.
Summary
Respect ArrowReaderOptions in ArrowFileReader::get_metadata() by extracting metadata_options and passing it to ParquetMetaDataReader via with_metadata_options().

Changes
- Update arrow and parquet dependencies from 57.0 to 57.1
- Update bindings/python/Cargo.lock for the new versions
- Extract metadata_options from the ArrowReaderOptions parameter
- Pass it to ParquetMetaDataReader::with_metadata_options()

Notes
The ParquetMetaDataReader::with_metadata_options() API was added in parquet 57.1.0.

Security Audit
The security audit failure (RUSTSEC-2026-0001) is a pre-existing issue unrelated to this PR. It affects the rkyv dependency and is being addressed in #1994.

Closes #1934
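
For readers skimming the thread, the forwarding pattern described in the summary can be sketched roughly as below. This is not the PR's diff: `MetadataOptions`, `ReaderOptions`, `MetadataReader`, and the `skip_statistics` field are hypothetical stand-ins so the example compiles without the parquet crate. Only the method names `metadata_options()` and `with_metadata_options()` are taken from the discussion above; the real signatures may differ.

```rust
// Hypothetical stand-in for the metadata decoding options; the real type
// lives in the parquet crate and has different contents.
#[derive(Clone, Debug, Default, PartialEq)]
struct MetadataOptions {
    // Illustrative knob only, not a field of the real options type.
    skip_statistics: bool,
}

// Hypothetical stand-in for ArrowReaderOptions.
#[derive(Clone, Debug, Default)]
struct ReaderOptions {
    metadata_options: MetadataOptions,
}

impl ReaderOptions {
    fn with_metadata_options(mut self, opts: MetadataOptions) -> Self {
        self.metadata_options = opts;
        self
    }

    // Mirrors the accessor name that the PR relies on.
    fn metadata_options(&self) -> &MetadataOptions {
        &self.metadata_options
    }
}

// Hypothetical stand-in for ParquetMetaDataReader.
#[derive(Debug, Default)]
struct MetadataReader {
    metadata_options: MetadataOptions,
}

impl MetadataReader {
    fn new() -> Self {
        Self::default()
    }

    fn with_metadata_options(mut self, opts: MetadataOptions) -> Self {
        self.metadata_options = opts;
        self
    }
}

// Builds the metadata reader, forwarding the caller's metadata options when
// reader options are supplied and falling back to defaults otherwise.
fn build_metadata_reader(options: Option<&ReaderOptions>) -> MetadataReader {
    let reader = MetadataReader::new();
    match options {
        Some(opts) => reader.with_metadata_options(opts.metadata_options().clone()),
        None => reader,
    }
}
```

Calling `build_metadata_reader(Some(&options))` yields a reader carrying the caller's settings, and `build_metadata_reader(None)` keeps the defaults, so existing callers that pass no options are unaffected.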