Commit acc7012

Implemented DATEDIFF function (#262)
Resolves #221. Please check the issue for more details on the implementation.

This implementation calculates date/time differences for various units as follows:

- Years: Extracts the year from both dates and calculates the difference.
- Months: Calculates (years_diff * 12 + month_y) - month_x to get the total months difference.
- Days: Subtracts the start-of-day values of both dates to compute the days difference.
- Hours: Expands to ((days_diff * 24 + hours_y) - hours_x) for total hours difference.
- Minutes: Expands to ((hours_diff * 60 + minutes_y) - minutes_x) for total minutes difference.
- Seconds: Expands to ((minutes_diff * 60 + seconds_y) - seconds_x) for total seconds difference.
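
The unit-by-unit expansion described in the commit message can be modeled in plain Python. This is a minimal sketch, not the PyDough implementation (which emits SQL): the function name `datediff_sketch` is hypothetical, it accepts only the canonical plural unit names, and it assumes both inputs are `datetime.datetime` values.

```python
from datetime import datetime


def datediff_sketch(unit: str, x: datetime, y: datetime) -> int:
    """Hypothetical reference model of the expansion in the commit message."""
    # Years: difference between the year components only.
    years = y.year - x.year
    if unit == "years":
        return years
    # Months: (years_diff * 12 + month_y) - month_x
    months = (years * 12 + y.month) - x.month
    if unit == "months":
        return months
    # Days: subtract the start-of-day (date-only) values of both inputs.
    days = (y.date() - x.date()).days
    if unit == "days":
        return days
    # Hours: (days_diff * 24 + hours_y) - hours_x
    hours = (days * 24 + y.hour) - x.hour
    if unit == "hours":
        return hours
    # Minutes: (hours_diff * 60 + minutes_y) - minutes_x
    minutes = (hours * 60 + y.minute) - x.minute
    if unit == "minutes":
        return minutes
    # Seconds: (minutes_diff * 60 + seconds_y) - seconds_x
    seconds = (minutes * 60 + y.second) - x.second
    if unit == "seconds":
        return seconds
    raise ValueError(f"Unsupported unit: {unit!r}")
```

Because each unit ignores every smaller component, `datediff_sketch("years", datetime(2009, 12, 31), datetime(2010, 1, 1))` evaluates to 1 even though the two dates are a single day apart.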
1 parent f66f1fb commit acc7012

11 files changed, with 689 additions and 1 deletion

documentation/functions.md

Lines changed: 32 additions & 0 deletions
@@ -28,6 +28,7 @@ Below is the list of every function/operator currently supported in PyDough as a
2828
* [HOUR](#hour)
2929
* [MINUTE](#minute)
3030
* [SECOND](#second)
31+
* [DATEDIFF](#datediff)
3132
- [Conditional Functions](#conditional-functions)
3233
* [IFF](#iff)
3334
* [ISIN](#isin)
@@ -290,6 +291,37 @@ is from 0-59:
290291
Orders(is_lt_30_seconds = SECOND(order_date) < 30)
291292
```
292293

294+
<!-- TOC --><a name="datediff"></a>
295+
### DATEDIFF
296+
297+
Calling `DATEDIFF` between two timestamps returns the difference in one of `years`, `months`, `days`, `hours`, `minutes` or `seconds`.
298+
299+
- `DATEDIFF("years", x, y)`: Returns the **number of year boundaries crossed between x and y**. For example, if **x** is December 31, 2009, and **y** is January 1, 2010, it counts as **1 year apart**, even though they are only 1 day apart.
300+
- `DATEDIFF("months", x, y)`: Returns the **number of month boundaries crossed between x and y**. For example, if **x** is January 31, 2014, and **y** is February 1, 2014, it counts as **1 month apart**, even though they are only 1 day apart.
301+
- `DATEDIFF("days", x, y)`: Returns the **number of day boundaries crossed between x and y**. For example, if **x** is 11:59 PM on one day, and **y** is 12:01 AM the next day, it counts as **1 day apart**, even though they are only 2 minutes apart.
302+
- `DATEDIFF("hours", x, y)`: Returns the **number of hour boundaries crossed between x and y**. For example, if **x** is 6:59 PM and **y** is 7:01 PM on the same day, it counts as **1 hour apart**, even though the difference is only 2 minutes.
303+
- `DATEDIFF("minutes", x, y)`: Returns the **number of minute boundaries crossed between x and y**. For example, if **x** is 7:00 PM and **y** is 7:01 PM, it counts as **1 minute apart**, regardless of the seconds component of either timestamp.
304+
- `DATEDIFF("seconds", x, y)`: Returns the **number of second boundaries crossed between x and y**. For example, if **x** is at 7:00:01 PM and **y** is at 7:00:02 PM, it counts as **1 second apart**.
305+
306+
```py
307+
# Calculates, for each order, the number of days since January 1st 1992
308+
# that the order was placed:
309+
orders(
310+
days_since=DATEDIFF("days", datetime.date(1992, 1, 1), order_date)
311+
)
312+
```
313+
314+
The first argument in the `DATEDIFF` function supports the following aliases for each unit of time. The argument is **case-insensitive**, and if a unit is not one of the provided options, an error will be thrown:
315+
316+
- **Years**: Supported aliases are `"years"`, `"year"`, and `"y"`.
317+
- **Months**: Supported aliases are `"months"`, `"month"`, and `"mm"`.
318+
- **Days**: Supported aliases are `"days"`, `"day"`, and `"d"`.
319+
- **Hours**: Supported aliases are `"hours"`, `"hour"`, and `"h"`.
320+
- **Minutes**: Supported aliases are `"minutes"`, `"minute"`, and `"m"`.
321+
- **Seconds**: Supported aliases are `"seconds"`, `"second"`, and `"s"`.
322+
323+
Because the unit is case-insensitive, `"Days"`, `"DAYS"`, and `"d"` are all treated the same. Invalid or unrecognized units will result in an error.
324+
293325
<!-- TOC --><a name="conditional-functions"></a>
294326
## Conditional Functions
295327
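
The alias rules documented for `DATEDIFF` above (case-insensitive matching, three aliases per unit, an error for anything else) could be modeled as a simple lookup. This is only a sketch: the names `_UNIT_ALIASES` and `normalize_unit` are hypothetical, and raising `ValueError` is an assumption, since the documentation does not say which exception PyDough raises.

```python
# Hypothetical table built from the aliases listed in the documentation.
_UNIT_ALIASES = {
    "years": ("years", "year", "y"),
    "months": ("months", "month", "mm"),
    "days": ("days", "day", "d"),
    "hours": ("hours", "hour", "h"),
    "minutes": ("minutes", "minute", "m"),
    "seconds": ("seconds", "second", "s"),
}

# Invert the table so every alias maps to its canonical unit name.
_ALIAS_TO_UNIT = {
    alias: unit for unit, aliases in _UNIT_ALIASES.items() for alias in aliases
}


def normalize_unit(unit: str) -> str:
    """Map a user-supplied unit string to its canonical name."""
    key = unit.lower()  # the unit argument is case-insensitive
    if key not in _ALIAS_TO_UNIT:
        # Assumption: the real error type is not stated in the docs.
        raise ValueError(f"Unsupported DATEDIFF unit: {unit!r}")
    return _ALIAS_TO_UNIT[key]
```

Note that `"m"` resolves to minutes while `"mm"` resolves to months, so under case-insensitive matching `"M"` also means minutes.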

pydough/pydough_operators/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -17,6 +17,7 @@
1717
"CONTAINS",
1818
"COUNT",
1919
"ConstantType",
20+
"DATEDIFF",
2021
"DAY",
2122
"DEFAULT_TO",
2223
"DIV",
@@ -83,6 +84,7 @@
8384
BXR,
8485
CONTAINS,
8586
COUNT,
87+
DATEDIFF,
8688
DAY,
8789
DEFAULT_TO,
8890
DIV,

pydough/pydough_operators/expression_operators/README.md

Lines changed: 7 additions & 0 deletions
@@ -84,6 +84,13 @@ These functions must be called on singular data as a function.
8484
- `HOUR`: Returns the hour component of a datetime.
8585
- `MINUTE`: Returns the minute component of a datetime.
8686
- `SECOND`: Returns the second component of a datetime.
87+
- `DATEDIFF("unit", x, y)`: Returns the difference between two dates (y - x) in one of the following units:
88+
- **Years**: `"years"`, `"year"`, `"y"`
89+
- **Months**: `"months"`, `"month"`, `"mm"`
90+
- **Days**: `"days"`, `"day"`, `"d"`
91+
- **Hours**: `"hours"`, `"hour"`, `"h"`
92+
- **Minutes**: `"minutes"`, `"minute"`, `"m"`
93+
- **Seconds**: `"seconds"`, `"second"`, `"s"`
8794

8895
##### Conditional Functions
8996

pydough/pydough_operators/expression_operators/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -15,6 +15,7 @@
1515
"BinaryOperator",
1616
"CONTAINS",
1717
"COUNT",
18+
"DATEDIFF",
1819
"DAY",
1920
"DEFAULT_TO",
2021
"DIV",
@@ -77,6 +78,7 @@
7778
BXR,
7879
CONTAINS,
7980
COUNT,
81+
DATEDIFF,
8082
DAY,
8183
DEFAULT_TO,
8284
DIV,

pydough/pydough_operators/expression_operators/registered_expression_operators.py

Lines changed: 5 additions & 1 deletion
@@ -1,5 +1,5 @@
11
"""
2-
Definition bindings of builtin PyDough operators that reutrn an expression.
2+
Definition bindings of builtin PyDough operators that return an expression.
33
"""
44

55
__all__ = [
@@ -12,6 +12,7 @@
1212
"BXR",
1313
"CONTAINS",
1414
"COUNT",
15+
"DATEDIFF",
1516
"DAY",
1617
"DEFAULT_TO",
1718
"DIV",
@@ -149,6 +150,9 @@
149150
SECOND = ExpressionFunctionOperator(
150151
"SECOND", False, RequireNumArgs(1), ConstantType(Int64Type())
151152
)
153+
DATEDIFF = ExpressionFunctionOperator(
154+
"DATEDIFF", False, RequireNumArgs(3), ConstantType(Int64Type())
155+
)
152156
SLICE = ExpressionFunctionOperator(
153157
"SLICE", False, RequireNumArgs(4), SelectArgumentType(0)
154158
)

pydough/pydough_operators/type_inference/type_verifier.py

Lines changed: 37 additions & 0 deletions
@@ -103,3 +103,40 @@ def accepts(self, args: list[Any], error_on_fail: bool = True) -> bool:
103103
                )
104104
            return False
105105
        return True
106+
107+
108+
class RequireArgRange(TypeVerifier):
109+
    """
110+
    Type verifier implementation class that requires the
111+
    number of arguments to be within a range, both ends inclusive.
112+
    """
113+
114+
    def __init__(self, low_range: int, high_range: int):
115+
        self._low_range: int = low_range
116+
        self._high_range: int = high_range
117+
118+
    @property
119+
    def low_range(self) -> int:
120+
        """
121+
        The lower end of the range.
122+
        """
123+
        return self._low_range
124+
125+
    @property
126+
    def high_range(self) -> int:
127+
        """
128+
        The higher end of the range.
129+
        """
130+
        return self._high_range
131+
132+
    def accepts(self, args: list[Any], error_on_fail: bool = True) -> bool:
133+
        from pydough.qdag.errors import PyDoughQDAGException
134+
135+
        if not (self.low_range <= len(args) <= self.high_range):
136+
            if error_on_fail:
137+
                raise PyDoughQDAGException(
138+
                    f"Expected between {self.low_range} and {self.high_range} arguments, "
139+
                    f"received {len(args)}"
140+
                )
141+
            return False
142+
        return True
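
The behavior of the range verifier added above can be tried in isolation. This is a self-contained sketch rather than the PyDough class: `ArgRangeSketch` and `ArgCountError` are hypothetical stand-ins for `RequireArgRange` and `PyDoughQDAGException`, used so that the example runs without importing `pydough`.

```python
from typing import Any


class ArgCountError(Exception):
    """Hypothetical stand-in for PyDoughQDAGException."""


class ArgRangeSketch:
    """Accepts argument lists whose length lies in [low, high], inclusive."""

    def __init__(self, low: int, high: int):
        self.low = low
        self.high = high

    def accepts(self, args: list[Any], error_on_fail: bool = True) -> bool:
        if not (self.low <= len(args) <= self.high):
            if error_on_fail:
                raise ArgCountError(
                    f"Expected between {self.low} and {self.high} arguments, "
                    f"received {len(args)}"
                )
            return False
        return True


# A verifier for a function that takes two or three arguments.
verifier = ArgRangeSketch(2, 3)
within_range = verifier.accepts(["days", "x", "y"])
too_few = verifier.accepts(["days"], error_on_fail=False)
```

With `error_on_fail=False` the caller gets a boolean and can decide how to report the problem; with the default, an out-of-range argument count raises immediately.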
