docs/sections/intro_tutorial/materializations.rst
- This file was moved from docs/sections/learn/tutorial/materializations.rst.
.. 10 lines not shown

:py:class:`Materialization <dagster.Materialization>` events. Like
:py:class:`TypeCheck <dagster.TypeCheck>` and :py:class:`ExpectationResult <dagster.ExpectationResult>`,
materializations are side-channels for metadata -- they don't get passed to downstream solids and
they aren't used to define the data dependencies that structure a pipeline's DAG.
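
The side-channel idea can be illustrated without Dagster at all. The following is a minimal sketch, not Dagster's implementation: ``SketchMaterialization``, ``SketchOutput``, ``sort_numbers``, and ``split_events`` are hypothetical names invented for this example. It shows a generator that yields a metadata event alongside its result, and a consumer that forwards only the result downstream.

```python
from dataclasses import dataclass
from typing import Any, Iterator, List, Tuple, Union


@dataclass
class SketchMaterialization:
    """Hypothetical stand-in for a metadata event; not dagster.Materialization."""

    label: str
    path: str


@dataclass
class SketchOutput:
    """Hypothetical stand-in for a downstream value; not dagster.Output."""

    value: Any


Event = Union[SketchMaterialization, SketchOutput]


def sort_numbers(numbers: List[int]) -> Iterator[Event]:
    """Yield a metadata event first, then the value downstream steps receive."""
    result = sorted(numbers)
    # The path is only recorded as metadata here; this sketch writes no file.
    yield SketchMaterialization(label="sorted_numbers", path="/tmp/sorted_numbers.csv")
    yield SketchOutput(result)


def split_events(events: Iterator[Event]) -> Tuple[List[SketchMaterialization], List[Any]]:
    """Separate side-channel metadata from the values that flow downstream."""
    metadata: List[SketchMaterialization] = []
    outputs: List[Any] = []
    for event in events:
        if isinstance(event, SketchOutput):
            outputs.append(event.value)
        else:
            metadata.append(event)
    return metadata, outputs


metadata, outputs = split_events(sort_numbers([3, 1, 2]))
```

Only the contents of ``outputs`` would be handed to a downstream step; ``metadata`` is kept for reporting.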

Suppose that we rewrite our ``sort_calories`` solid so that it saves the newly sorted data frame to
disk.

.. literalinclude:: ../../../examples/dagster_examples/intro_tutorial/materializations.py
   :lines: 23-43
   :linenos:
   :lineno-start: 23
   :caption: materializations.py

We've taken the basic precaution of ensuring that the saved csv file has a different filename for
each run of the pipeline. But there's no way for Dagit to know about this persistent artifact.
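
As a library-free illustration of the per-run filename precaution, here is a minimal sketch. It assumes a run identifier is available as a string; the ``write_sorted_rows`` helper, the ``name`` and ``calories`` columns, and the ``sorted_<run_id>.csv`` naming scheme are assumptions made for this example, not taken from ``materializations.py``.

```python
import csv
import os
import tempfile
import uuid


def write_sorted_rows(rows, run_id, directory):
    """Sort rows by the 'calories' key and save them to a per-run CSV file."""
    sorted_rows = sorted(rows, key=lambda row: int(row["calories"]))
    # Putting the run id in the filename keeps one run from overwriting another.
    path = os.path.join(directory, "sorted_{run_id}.csv".format(run_id=run_id))
    with open(path, "w", newline="") as fd:
        writer = csv.DictWriter(fd, fieldnames=["name", "calories"])
        writer.writeheader()
        writer.writerows(sorted_rows)
    return path


# Hypothetical sample data and randomly generated stand-ins for run ids.
rows = [{"name": "B", "calories": "110"}, {"name": "A", "calories": "70"}]
out_dir = tempfile.mkdtemp()
first_path = write_sorted_rows(rows, uuid.uuid4().hex, out_dir)
second_path = write_sorted_rows(rows, uuid.uuid4().hex, out_dir)
```

Both files end up on disk under distinct names, but nothing outside the function learns that they exist.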

So we'll add the following lines:

.. literalinclude:: ../../../examples/dagster_examples/intro_tutorial/materializations.py
   :lines: 23-53
   :linenos:
   :lineno-start: 23
   :emphasize-lines: 22-31
   :caption: materializations.py

Note that we've had to add the last line, yielding an :py:class:`Output <dagster.Output>`. Until
now, all of our solids have relied on Dagster's implicit conversion of the return value of a solid's

.. 13 lines not shown

calls this facility the
:py:func:`@output_materialization_config <dagster.output_materialization_config>`.

Suppose we would like to be able to configure outputs of our toy custom type, the
``SimpleDataFrame``, to be automatically materialized to disk as both a pickle and a .csv.
(This is a reasonable idea, since .csv files are human-readable and manipulable by a wide variety
of third-party tools, while pickle is a binary format.)

.. literalinclude:: ../../../examples/dagster_examples/intro_tutorial/output_materialization.py
   :lines: 29-64
   :linenos:
   :lineno-start: 29
   :caption: output_materialization.py
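
Stripped of the Dagster decorator, the underlying idea is a function that receives a config value and the data, then writes one file per requested format. The sketch below uses only the standard library. The ``materialize_rows`` name and the config shape (``csv`` and ``pickle`` keys, each with a ``path``, plus an optional ``sep`` for csv) are assumptions made for illustration and may not match the schema in ``output_materialization.py``.

```python
import csv
import os
import pickle
import tempfile


def materialize_rows(config, rows):
    """Write rows to each destination named in config and return the paths.

    Assumes rows is a non-empty list of dicts that share the same keys.
    """
    written = []
    if "csv" in config:
        path = config["csv"]["path"]
        sep = config["csv"].get("sep", ",")
        with open(path, "w", newline="") as fd:
            writer = csv.DictWriter(fd, fieldnames=list(rows[0].keys()), delimiter=sep)
            writer.writeheader()
            writer.writerows(rows)
        written.append(path)
    if "pickle" in config:
        path = config["pickle"]["path"]
        with open(path, "wb") as fd:
            pickle.dump(rows, fd)
        written.append(path)
    return written


# Hypothetical config and data for the sketch.
out_dir = tempfile.mkdtemp()
config = {
    "csv": {"path": os.path.join(out_dir, "rows.csv"), "sep": ";"},
    "pickle": {"path": os.path.join(out_dir, "rows.pickle")},
}
rows = [{"name": "A", "calories": "70"}, {"name": "B", "calories": "110"}]
paths = materialize_rows(config, rows)
```

Returning the written paths gives the caller something concrete to report for each file.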

We set the output materialization config on the type:

.. literalinclude:: ../../../examples/dagster_examples/intro_tutorial/output_materialization.py
   :lines: 67-74
   :linenos:
   :lineno-start: 67
   :emphasize-lines: 5
   :caption: output_materialization.py

Now we can tell Dagster to materialize intermediate outputs of this type by providing config:

.. literalinclude:: ../../../examples/dagster_examples/intro_tutorial/output_materialization.yaml
   :linenos:
   :lines: 6-10
   :lineno-start: 6
   :caption: output_materialization.yaml

When we run this pipeline, we'll see that materializations are yielded (and visible in the
structured logs in Dagit), and that files are created on disk (with the semicolon separator we
specified).

.. thumbnail:: output_materializations.png