This PR adds documentation for the custom dataframe type factory and validation logic available in dagster-pandas.
The above example provides a simple introduction to using dataframe types in dagster solids. There are also a lot of ways to maximize your workflow development experience by extending your plain DataFrame types. Luckily, dagster-pandas does this for you and provides an API for creating custom dataframe types that perform data quality checks, emit summary statistics, and enable safe/reliable IO for dataframe serialization/deserialization.
"Furthermore" just sounds awkwardly formal.
Other than the inline comments, I think we should do a progression where we introduce three layers of constraints.
- Pure datatype checking. Frame this as schema.
- Add mins/maxes etc.
- Add a totally custom constraint.
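To make the progression concrete, here is a rough, library-agnostic sketch of the three layers in plain Python. It is deliberately not the dagster-pandas API: every function name, the row format (a list of dicts), and the sample columns are hypothetical, chosen only to show how each layer builds on the previous one.

```python
# Library-agnostic sketch of the three constraint layers.
# All names below are hypothetical; none of this is dagster-pandas API.
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def check_schema(rows: List[Row], schema: Dict[str, type]) -> List[str]:
    """Layer 1: each schema column is present and has the expected type."""
    errors = []
    for i, row in enumerate(rows):
        for column, expected in schema.items():
            if column not in row:
                errors.append(f"row {i}: missing column {column!r}")
            elif not isinstance(row[column], expected):
                errors.append(f"row {i}: {column!r} is not {expected.__name__}")
    return errors


def check_range(
    rows: List[Row],
    column: str,
    min_value: Optional[float] = None,
    max_value: Optional[float] = None,
) -> List[str]:
    """Layer 2: values stay inside an optional [min_value, max_value] range."""
    errors = []
    for i, row in enumerate(rows):
        value = row[column]
        if min_value is not None and value < min_value:
            errors.append(f"row {i}: {column!r} is below minimum {min_value}")
        if max_value is not None and value > max_value:
            errors.append(f"row {i}: {column!r} is above maximum {max_value}")
    return errors


def check_custom(
    rows: List[Row],
    column: str,
    predicate: Callable[[Any], bool],
    description: str,
) -> List[str]:
    """Layer 3: an arbitrary user-supplied predicate on each value."""
    return [
        f"row {i}: {column!r} {description}"
        for i, row in enumerate(rows)
        if not predicate(row[column])
    ]


# Made-up sample data: the second row violates layers 2 and 3.
rows = [
    {"bike_id": 7, "amount_paid": 10},
    {"bike_id": -1, "amount_paid": 12},
]

schema_errors = check_schema(rows, {"bike_id": int, "amount_paid": int})
range_errors = check_range(rows, "bike_id", min_value=0)
custom_errors = check_custom(
    rows, "amount_paid", lambda v: v % 5 == 0, "must be divisible by 5"
)
```

The docs would presumably express the same three steps with the dagster-pandas column and constraint helpers; the point here is only the ordering, where each step adds one idea on top of the last.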
Let's import from the top level.
I would also include "schema validation".
Just use create_dagster_pandas_dataframe_type. I don't think we need to include the "factory" verbiage.
Stick to top-level imports.
Probably overkill to have a named output? Just return the df?