@Shekharrajak
Contributor

Fixes #2944

The failure reported in #2944 (Comet fails when the local writer is enabled) occurs because RangeExec (and other non-Comet Spark operators) produce Spark's OnHeapColumnVector rather than the Arrow arrays that the native writer expects.

What changes are included in this PR?

  • Modified CometNativeWriteExec.doExecuteColumnar() to detect when the child operator is not a CometPlan
  • Added automatic conversion of Spark columnar batches to Arrow format using CometArrowConverters.columnarBatchToArrowBatchIter()
  • Added support for row-based input by converting rows to Arrow batches using CometArrowConverters.rowToArrowBatchIter()
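The three cases above can be sketched as a small dispatch. This is only an illustration under stated assumptions: `Plan`, `CometChild`, `SparkColumnarChild`, `SparkRowChild`, `InputPath`, and `WriterInput.choosePath` are hypothetical stand-ins invented for this sketch, not the real Spark or Comet classes, and the actual logic in `CometNativeWriteExec.doExecuteColumnar()` may be organized differently.

```scala
// Hypothetical stand-ins for plan nodes; not the real Spark/Comet types.
sealed trait Plan { def supportsColumnar: Boolean }

// A child that already produces Arrow-backed batches (plays the role of a CometPlan).
final case class CometChild(name: String) extends Plan { val supportsColumnar = true }

// A non-Comet child emitting Spark columnar batches (e.g. OnHeapColumnVector).
final case class SparkColumnarChild(name: String) extends Plan { val supportsColumnar = true }

// A non-Comet child emitting rows.
final case class SparkRowChild(name: String) extends Plan { val supportsColumnar = false }

// What the writer has to do with its input before handing it to native code.
sealed trait InputPath
case object PassThrough extends InputPath      // already Arrow, nothing to convert
case object ColumnarToArrow extends InputPath  // role of columnarBatchToArrowBatchIter
case object RowToArrow extends InputPath       // role of rowToArrowBatchIter

object WriterInput {
  // Pick the conversion path from the kind of child operator.
  def choosePath(child: Plan): InputPath = child match {
    case _: CometChild           => PassThrough
    case c if c.supportsColumnar => ColumnarToArrow
    case _                       => RowToArrow
  }
}
```

For example, `WriterInput.choosePath(SparkColumnarChild("range"))` selects `ColumnarToArrow`, which corresponds to the case described in the issue.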

How are these changes tested?

Added two new tests in CometParquetWriterSuite:

@Shekharrajak Shekharrajak force-pushed the fix/issue-2944-local-writer-arrow-array branch from 5a6966b to 37e4a23 Compare January 12, 2026 18:42
@mbutrovich
Contributor

Should this just be a modification to shouldApplySparkToColumnar in CometExecRule to insert the operator instead of duplicating that operator's logic?
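To make the alternative concrete, here is a rough sketch of what "insert the operator" could look like as a plan rewrite, so that the writer itself needs no conversion code. Every name below (`Node`, `SourceNode`, `ToArrowNode`, `NativeWriteNode`, `InsertToArrow`) is hypothetical and exists only for this illustration; it does not reflect the actual signature or behavior of `shouldApplySparkToColumnar` or `CometExecRule`.

```scala
// Hypothetical, simplified plan tree; not the real Spark/Comet classes.
sealed trait Node
final case class SourceNode(name: String, producesArrow: Boolean) extends Node
final case class ToArrowNode(child: Node) extends Node       // stand-in for a Spark-to-Arrow conversion operator
final case class NativeWriteNode(child: Node) extends Node   // stand-in for the native writer

object InsertToArrow {
  // Rewrite rule: wrap a non-Arrow child of the writer in a conversion node.
  // Plans that already feed Arrow data to the writer are returned unchanged.
  def apply(plan: Node): Node = plan match {
    case NativeWriteNode(src @ SourceNode(_, false)) => NativeWriteNode(ToArrowNode(src))
    case other                                       => other
  }
}
```

With a rule of this shape the conversion lives in one operator that the planner inserts, rather than being repeated inside the writer.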

@codecov-commenter

codecov-commenter commented Jan 15, 2026

Codecov Report

❌ Patch coverage is 88.88889% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 59.87%. Comparing base (f09f8af) to head (1302fc5).
⚠️ Report is 849 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...comet/serde/operator/CometDataWritingCommand.scala | 85.71% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3075      +/-   ##
============================================
+ Coverage     56.12%   59.87%   +3.74%     
- Complexity      976     1414     +438     
============================================
  Files           119      168      +49     
  Lines         11743    15584    +3841     
  Branches       2251     2590     +339     
============================================
+ Hits           6591     9331    +2740     
- Misses         4012     4944     +932     
- Partials       1140     1309     +169     


Development

Successfully merging this pull request may close these issues.

Comet fails when local writer enabled