@andygrove
Member

Summary

  • Adds native Comet support for Spark's datediff function
  • Returns the number of days between two dates: datediff(endDate, startDate) = endDate - startDate
  • Because Date32 stores dates as days since the Unix epoch, the implementation is a simple subtraction
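
For readers unfamiliar with Date32, the subtraction described above can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the code from this PR: the function name `datediff_days`, the use of `Option<i32>` to stand in for nullable Arrow values, and the choice of wrapping subtraction on overflow are all assumptions made for the example.

```rust
/// Hypothetical helper (not from the PR): number of days from `start` to `end`,
/// where both are Date32-style values, i.e. days since 1970-01-01.
/// `None` models a SQL NULL; a null on either side yields a null result.
fn datediff_days(end: Option<i32>, start: Option<i32>) -> Option<i32> {
    match (end, start) {
        // Both present: the difference in days is a plain integer subtraction.
        // wrapping_sub is an assumption here; it avoids a panic on overflow.
        (Some(e), Some(s)) => Some(e.wrapping_sub(s)),
        // Either side null: propagate null.
        _ => None,
    }
}
```

Under these assumptions, `datediff_days(Some(10), Some(3))` yields `Some(7)`, swapping the arguments yields `Some(-7)`, and any `None` input yields `None`, which lines up with the positive, negative, and null cases listed in the test plan.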

Test Plan

  • Added unit tests in CometTemporalExpressionSuite
  • Tests cover: positive/negative differences, same dates, larger ranges, null handling
  • All existing tests pass

Note: This PR was generated with AI assistance.

Closes #3087

Adds native Comet support for Spark's datediff function, which returns
the number of days between two dates (endDate - startDate).

Closes apache#3087

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@andygrove andygrove marked this pull request as draft January 14, 2026 23:49
@codecov-commenter

codecov-commenter commented Jan 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.87%. Comparing base (f09f8af) to head (05c660f).
⚠️ Report is 848 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3145      +/-   ##
============================================
+ Coverage     56.12%   59.87%   +3.74%     
- Complexity      976     1414     +438     
============================================
  Files           119      168      +49     
  Lines         11743    15587    +3844     
  Branches       2251     2589     +338     
============================================
+ Hits           6591     9332    +2741     
- Misses         4012     4946     +934     
- Partials       1140     1309     +169     

FuzzDataGenerator.generateDataFrame(r, spark, schema, 1000, DataGenOptions())
}

test("datediff") {
Contributor

Tests using dates around leap years would be worth trying. For example:

datediff('1900-03-01', '1900-02-27') != datediff('2000-03-01', '2000-02-27')

Member Author

Good suggestion! I've added leap year edge case tests:

  • datediff('1900-03-01', '1900-02-27') = 2 days (1900 was NOT a leap year - divisible by 100 but not 400)
  • datediff('2000-03-01', '2000-02-27') = 3 days (2000 WAS a leap year - divisible by 400)
  • datediff('2004-03-01', '2004-02-28') = 2 days (2004 was a leap year - divisible by 4, not by 100)
  • datediff('2100-03-01', '2100-02-28') = 1 day (2100 will NOT be a leap year - divisible by 100 but not 400)

All tests pass.
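
The leap year cases above can be checked independently of Spark or Comet. The sketch below uses only the Rust standard library and is not the PR's test code: `is_leap_year`, `days_from_civil`, and `datediff` are hypothetical names, and the date-to-day-number conversion is the well-known proleptic Gregorian "days from civil" arithmetic rather than anything taken from Comet.

```rust
/// Gregorian leap year rule: divisible by 4, except centuries not divisible by 400.
fn is_leap_year(year: i64) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Hypothetical helper: days since 1970-01-01 for a proleptic Gregorian date,
/// i.e. the value a Date32 column would hold for that date.
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat March as the first month so the leap day falls at the end of the year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400); // 400-year cycle index
    let yoe = y.rem_euclid(400); // year within the cycle, 0..=399
    let mp = if month > 2 { month - 3 } else { month + 9 }; // March = 0
    let doy = (153 * mp + 2) / 5 + day - 1; // day within the March-based year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day within the cycle
    era * 146_097 + doe - 719_468 // shift so that 1970-01-01 maps to 0
}

/// Hypothetical helper: datediff(end, start) on (year, month, day) tuples.
fn datediff(end: (i64, i64, i64), start: (i64, i64, i64)) -> i64 {
    days_from_civil(end.0, end.1, end.2) - days_from_civil(start.0, start.1, start.2)
}
```

With these helpers, `datediff((1900, 3, 1), (1900, 2, 27))` evaluates to 2 and `datediff((2000, 3, 1), (2000, 2, 27))` evaluates to 3, because only 2000 contains a February 29. The leap day is absorbed when each date is converted to a day count, so the subtraction itself needs no special casing.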

@andygrove andygrove marked this pull request as ready for review January 15, 2026 02:25
Add tests for leap year handling as suggested in review:
- 1900 was NOT a leap year (divisible by 100 but not 400)
- 2000 WAS a leap year (divisible by 400)
- 2004 was a leap year (divisible by 4, not by 100)
- 2100 will NOT be a leap year (divisible by 100 but not 400)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Development

Successfully merging this pull request may close these issues.

[Feature] Support Spark expression: date_diff

3 participants