Skip to content

Conversation

@andygrove
Copy link
Member

@andygrove andygrove commented Jan 8, 2026

Which issue does this PR close?

Closes #.

Rationale for this change

The current implementation of accumulators for sum_int handle all three eval modes, making the code complex to understand and also adds overhead due to matching on eval mode within loops.

What changes are included in this PR?

Refactored sum_int.rs from a single accumulator with eval mode branches inside loops into 6 separate accumulator implementations:

Single-row accumulators:

  • SumIntegerAccumulatorLegacy - uses wrapping arithmetic (no overflow checking)
  • SumIntegerAccumulatorAnsi - uses checked arithmetic, returns error on overflow
  • SumIntegerAccumulatorTry - uses checked arithmetic, returns None on overflow, tracks has_all_nulls

Groups accumulators:

  • SumIntGroupsAccumulatorLegacy
  • SumIntGroupsAccumulatorAnsi
  • SumIntGroupsAccumulatorTry

Key changes:

  • SumInteger::accumulator() and create_groups_accumulator() now use a match on eval_mode to instantiate the appropriate accumulator type
  • Each accumulator has a dedicated inner update_sum function without any eval mode branching
  • The Legacy/Ansi accumulators don't carry the has_all_nulls field since they don't need it
  • Added helper methods like overflowed() and group_overflowed() for the Try variants to improve readability

How are these changes tested?

Existing tests.

Benchmark results show some improvement for group accumulators but not for row accumulators.

sum_integer_accumulator/groups_legacy
                        time:   [53.969 µs 54.108 µs 54.283 µs]
                        change: [−14.477% −12.994% −11.194%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) high mild
  7 (7.00%) high severe
sum_integer_accumulator/groups_ansi
                        time:   [59.963 µs 60.076 µs 60.189 µs]
                        change: [−30.722% −30.522% −30.325%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
Benchmarking sum_integer_accumulator/groups_try: Warming up for 500.00 ms
Warning: Unable to complete 100 samples in 500.0ms. You may wish to increase target time to 520.5ms, enable flat sampling, or reduce sample count to 60.
sum_integer_accumulator/groups_try
                        time:   [100.99 µs 101.23 µs 101.45 µs]
                        change: [−7.5070% −6.2309% −5.2479%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

@codecov-commenter
Copy link

codecov-commenter commented Jan 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.99%. Comparing base (f09f8af) to head (e8a3a50).
⚠️ Report is 918 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3054      +/-   ##
============================================
+ Coverage     56.12%   59.99%   +3.86%     
- Complexity      976     1475     +499     
============================================
  Files           119      175      +56     
  Lines         11743    16165    +4422     
  Branches       2251     2681     +430     
============================================
+ Hits           6591     9698    +3107     
- Misses         4012     5114    +1102     
- Partials       1140     1353     +213     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@andygrove andygrove marked this pull request as ready for review January 8, 2026 20:11
@andygrove andygrove changed the title perf: refactor sum int with specialized implementations for each eval_mode [WIP] perf: refactor sum int with specialized implementations for each eval_mode Jan 8, 2026
@andygrove
Copy link
Member Author

@coderfender could you review?

Copy link
Contributor

@coderfender coderfender left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the individual accumulators and code LGTM . I will continue to review grups accumulator and provide an update soon

Copy link
Contributor

@coderfender coderfender left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andygrove
Copy link
Member Author

@comphead could you review when you have time

@parthchandra
Copy link
Contributor

This approach creates a lot of code duplication, the difference being mostly in overflow handling. How about we have a trait say, OverflowStrategy, and then implementations of the overflow strategy for each eval mode. Then the implementation can just use the appropriate implementation while most of the code remains the same.

@andygrove
Copy link
Member Author

This approach creates a lot of code duplication, the difference being mostly in overflow handling. How about we have a trait say, OverflowStrategy, and then implementations of the overflow strategy for each eval mode. Then the implementation can just use the appropriate implementation while most of the code remains the same.

If I understand correctly, you're suggesting a common implementation that invokes a dyn trait per row for overflow handling?

@coderfender
Copy link
Contributor

minor : I wonder if we could reuse Add in checked_arithmetic.rs like Spark to remove redundancy

@andygrove andygrove merged commit fd47751 into apache:main Feb 4, 2026
124 checks passed
@parthchandra
Copy link
Contributor

This approach creates a lot of code duplication, the difference being mostly in overflow handling. How about we have a trait say, OverflowStrategy, and then implementations of the overflow strategy for each eval mode. Then the implementation can just use the appropriate implementation while most of the code remains the same.

If I understand correctly, you're suggesting a common implementation that invokes a dyn trait per row for overflow handling?

Good point. I also considered a macro based approach but that just makes the code very unreadable. I guess this approach should be okay.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants