[SS-103] Fix COPY FROM PARQUET unsorted map keys#36032
[SS-103] Fix COPY FROM PARQUET unsorted map keys#36032def- wants to merge 3 commits intoMaterializeInc:mainfrom
Conversation
|
Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone. PR title guidelines
Pre-merge checklist
|
|
Had to update the checks in the test you created - it appears MZ doesn't support the equality operator for maps (at least string maps): |
patrickwwbutler
left a comment
There was a problem hiding this comment.
feels a bit suspicious reviewing my own changes so I won't merge, but I'm fairly confident this is good to go - also tested without the sort changes and the TD fails as expected
def-
left a comment
There was a problem hiding this comment.
The fix looks reasonable to me!
martykulma
left a comment
There was a problem hiding this comment.
Nice! The reasoning for the sort needs to be documented in the source, but lgtm otherwise.
🦈
|
|
||
| # Regression test for BUG #41: COPY FROM Parquet Map keys not sorted. | ||
| # | ||
| # Materialize's Datum::Map requires keys to be in ascending sorted order. |
There was a problem hiding this comment.
This is a great clarifying comment about the sort - definitely ought to be in the source!
There was a problem hiding this comment.
👍 It'd be nice to (also) have it on the doc comment for Datum::Map (and/or push_dict_with?) if that type is never supposed to be constructed from unsorted values.
There was a problem hiding this comment.
I see it's already on push_dict_with:
materialize/src/repr/src/row.rs
Lines 2349 to 2351 in 15f2490
There was a problem hiding this comment.
I'd be even happier if we can get rid of that foot gun. Seems way too easy to use wrongly.
Fixes https://github.com/MaterializeInc/database-issues/issues/11290
Updates
ColReader::Maplogic to sort keys when reading in data.