Add support for datetime64[ns, UTC] via PandasColumn.datetime_column(..., tz='UTC') (#4117)
From #dagster-support thread:
@mrdavidlaing:
I have a PandasColumn.datetime_column that I'd like to constrain to only contain dates in UTC.
Is this possible?
Currently, if I define the column like this:
PandasColumn.datetime_column("at_date", non_nullable=True, is_required=True)
and pass it a dataframe with the column of type datetime64[ns, UTC] (from pandas.to_datetime(my_dates, utc=True)), I get the following error:
Warning! Type check failed. Violated "ColumnDTypeInSetConstraint" for column "at_date" - Column dtype must be in the following set {'datetime64[ns]'}.. DTypes received: datetime64[ns, UTC]
Which is pretty much the exact opposite of the constraint I'm trying to place on the column 🙂
@catherinewu:
Hi @mrdavidlaing 👋 that change would be very appreciated! feel free to tag me in the PR
Summary
Extends PandasColumn.datetime_column() with a tz param (defaults to None to preserve existing behaviour).
When tz='UTC', the column's dtype must be datetime64[ns, UTC], i.e. its values must be timezone-aware and in UTC.
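For context, here is a minimal pandas-only sketch of the distinction the new constraint relies on. It does not reproduce the dagster_pandas implementation: `is_utc_datetime` is a hypothetical helper and `my_dates` is made-up sample data, both used only to show how a timezone-naive column differs from one built with `utc=True`.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
my_dates = ["2021-01-01 00:00:00", "2021-01-02 12:30:00"]

# Without utc=True the resulting column is timezone-naive.
naive = pd.Series(pd.to_datetime(my_dates))
# With utc=True the resulting column is timezone-aware and in UTC.
aware = pd.Series(pd.to_datetime(my_dates, utc=True))


def is_utc_datetime(series: pd.Series) -> bool:
    """Hypothetical helper: True if the series has a tz-aware datetime dtype in UTC."""
    dtype = series.dtype
    # Timezone-aware datetime columns use DatetimeTZDtype; naive ones do not.
    return isinstance(dtype, pd.DatetimeTZDtype) and str(dtype.tz) == "UTC"


print(naive.dtype, is_utc_datetime(naive))
print(aware.dtype, is_utc_datetime(aware))
```

With this change, a column declared as PandasColumn.datetime_column("at_date", non_nullable=True, is_required=True, tz='UTC') is intended to accept the second series and reject the first.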
Test Plan
Added test_datetime_column_utc_ok() to the unit test suite.