feat(spark): implement Spark date_diff function
#19845
| SELECT date_diff(column1, column2) | ||
| FROM VALUES | ||
| ('2009-07-30'::date, '2009-07-31'::date), | ||
| ('2009-07-31'::date, '2009-07-30'::date), |
Does it make sense to add a test case for different input types?
E.g. a Date and a String:
| ('2009-07-31'::date, '2009-07-30'::date), | |
| ('2009-07-31'::date, '2009-07-30'::date), | |
| ('2009-07-31'::date, '2009-07-30'), |
And also a Timestamp and a Date:
| ('2009-07-31'::date, '2009-07-30'::date), | |
| ('2009-07-31 23:45:01'::timestamp, '2009-07-30'::date), |
| fn invoke_with_args(&self, _args: ScalarFunctionArgs) -> Result<ColumnarValue> { | ||
| internal_err!( | ||
| "spark date_diff should have been simplified to standard subtraction" |
| "spark date_diff should have been simplified to standard subtraction" | |
| "Spark `date_diff` should have been simplified to standard subtraction" |
I am not sure whether the ASF branding rules need to be followed here. Usually Apache projects should be referred to with "Apache" in front, i.e. Apache Spark ...
| Ok(ExprSimplifyResult::Simplified(binary_expr( | ||
| end.cast_to(&DataType::Int32, info.schema())?, | ||
| Operator::Minus, | ||
| start.cast_to(&DataType::Int32, info.schema())?, |
What if the argument data type is Date64? The signature allows it.
Date32 stores the number of days since the epoch, but Date64 stores the number of milliseconds since the epoch, so subtracting the raw values would not yield a difference in days.
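To illustrate the concern, here is a minimal standalone sketch in plain Rust, not the PR's code and not using any DataFusion or Arrow API. The helper names (`date32_diff`, `date64_to_days`, `date64_diff`) and the constant are hypothetical; the only assumption is the Arrow convention that Date32 counts days and Date64 counts milliseconds since the Unix epoch.

```rust
/// Milliseconds in one day; Date64 values are expressed in this unit.
const MILLIS_PER_DAY: i64 = 86_400_000;

/// Date32 holds days since the Unix epoch, so a plain subtraction
/// already gives the difference in days.
fn date32_diff(end_days: i32, start_days: i32) -> i32 {
    end_days - start_days
}

/// Date64 holds milliseconds since the Unix epoch. Convert to whole days
/// first; `div_euclid` floors, so dates before 1970 are handled too.
fn date64_to_days(millis: i64) -> i64 {
    millis.div_euclid(MILLIS_PER_DAY)
}

/// Difference in days between two Date64 values (hypothetical helper).
fn date64_diff(end_millis: i64, start_millis: i64) -> i64 {
    date64_to_days(end_millis) - date64_to_days(start_millis)
}
```

With 2009-07-30 as day 14455 and 2009-07-31 as day 14456, `date32_diff(14456, 14455)` is 1, while subtracting the corresponding raw Date64 values gives 86_400_000. Normalizing to days first, or casting Date64 to Date32 before the subtraction, would be one way to keep the result in days.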
Which issue does this PR close?
datafusion-spark: Spark Compatible Functions #15914
Rationale for this change
Add support for the Spark `date_diff` function: https://spark.apache.org/docs/latest/api/sql/index.html#date_diff
What changes are included in this PR?
Are these changes tested?
Yes, in SLT (sqllogictest) files.
Are there any user-facing changes?
Yes.