Page MenuHomePhabricator

Simplify Airflow operators
ClosedPublic

Authored by natekupp on Jul 24 2019, 8:32 PM.

Details

Reviewers
max
Group Reviewers
Restricted Project
Commits
R1:fa38767633b3: Simplify Airflow operators
Summary

This diff simplifies the two base airflow operators we use, in advance of implementing correct Airflow task skip behavior when a step skip event is returned by Dagster step execution

Test Plan

unit

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

natekupp created this revision.Jul 24 2019, 8:32 PM
natekupp retitled this revision from Reorganize Airflow operators to Simplify Airflow operators.Jul 24 2019, 11:26 PM
natekupp updated this revision to Diff 3151.Jul 24 2019, 11:31 PM

add comment

natekupp edited the summary of this revision. (Show Details)Jul 25 2019, 4:41 PM
natekupp added a reviewer: Restricted Project.
max accepted this revision.Jul 25 2019, 4:58 PM
max added a subscriber: max.
max added inline comments.
python_modules/dagster-airflow/dagster_airflow/factory.py
68–69

is it just oop pedantry that makes me prefer even a no-op DagsterOperator abc?

102–128

do we prefer this to the operators having a single interface and then throwing away the args they don't consume?

python_modules/dagster-airflow/dagster_airflow/operators.py
122

Just noting that this landed in https://github.com/apache/airflow/pull/5369. I think we should consider vendoring the DockerOperator and PythonOperator to improve our Airlflow backwards compatibility and reduce the need for shims like this

311

conceivably we should introduce a stronger check, since we can have storage: in_memory:, which won't work for airflow

This revision is now accepted and ready to land.Jul 25 2019, 4:58 PM
natekupp updated this revision to Diff 3187.Jul 25 2019, 5:28 PM
natekupp marked 2 inline comments as done.

up

natekupp marked 2 inline comments as done.Jul 25 2019, 5:30 PM
natekupp added inline comments.
python_modules/dagster-airflow/dagster_airflow/factory.py
68–69

I started by leaving it, but that forces you into multiple inheritance since I can't put DagsterOperator between both DockerOperator and PythonOperator - maybe the right way to solve this is to inherit from DagsterOperator in our vendored versions of those, which would especially be nice to ensure we distinguish from the built-in Airflow versions of those.

102–128

I think so - I started w/ the other and it felt a little gross to dance around what gets routed through kwargs vs named params. but def willing to play around a bit more, maybe there's another way to structure this

python_modules/dagster-airflow/dagster_airflow/operators.py
122

ha, that's funny. yeah I like the idea of vendoring, maybe a good follow-up

311

good call, I added a check for this

This revision was automatically updated to reflect the committed changes.