
switch spark compute to pipe to stdout/stderr instead of structured logs
ClosedPublic

Authored by prha on Tue, Aug 27, 11:31 PM.

Details

Summary

Now that we capture the compute IO, we should switch the Spark solid
to writing its output directly to stdout/stderr instead of routing it through the Python logging system.

Depends on D897, D895, D893

Test Plan

Ran the event_ingest_pipeline

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

prha created this revision. Tue, Aug 27, 11:31 PM
prha added a reviewer: Restricted Project. Tue, Aug 27, 11:45 PM
alangenfeld requested changes to this revision. Wed, Aug 28, 8:20 PM
alangenfeld added a subscriber: alangenfeld.
alangenfeld added inline comments.
python_modules/libraries/dagster-spark/dagster_spark/utils.py
1–2

ideally this change would just make Popen leave stdout/stderr at their default behavior; does that not work with our new setup?

This revision now requires changes to proceed. Wed, Aug 28, 8:20 PM
alangenfeld added inline comments. Wed, Aug 28, 8:22 PM
python_modules/libraries/dagster-spark/dagster_spark/utils.py
1–2

maybe even use subprocess.check_call/output
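
For illustration, the two approaches under discussion can be sketched roughly as below. This is not the actual dagster-spark code: the function names `run_with_logging` and `run_inheriting_stdio` are hypothetical, and the sketch only assumes the standard library `subprocess` and `logging` modules.

```python
import logging
import subprocess
import sys


def run_with_logging(cmd, logger):
    """Hypothetical 'wrapper' style: pipe the child's output and re-emit each line via logging."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same pipe
        text=True,
    )
    for line in proc.stdout:
        logger.info(line.rstrip())
    return proc.wait()


def run_inheriting_stdio(cmd):
    """Hypothetical simplified style: leave stdout/stderr at their defaults.

    With no stdout/stderr arguments the child inherits the parent's streams,
    so whatever captures the parent's stdout/stderr also sees the child's output.
    check_call raises CalledProcessError on a non-zero exit code.
    """
    return subprocess.check_call(cmd)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    demo_cmd = [sys.executable, "-c", "print('hello from child')"]
    run_with_logging(demo_cmd, logging.getLogger("demo"))
    run_inheriting_stdio(demo_cmd)
```

The second form has no per-line forwarding loop, which is the simplification the inline comments point toward.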

prha updated this revision to Diff 4158. Sat, Aug 31, 1:54 AM

get rid of the spark logging wrapper

alangenfeld accepted this revision. Tue, Sep 3, 3:19 PM
This revision is now accepted and ready to land. Tue, Sep 3, 3:19 PM

👍🏻

max added a subscriber: max. Wed, Sep 4, 5:54 PM
max added inline comments.
python_modules/libraries/dagster-spark/dagster_spark/utils.py
1–2

will stdout/stderr logs still stream if we use one of the blocking calls? (call/check_call/output)

prha added inline comments. Fri, Sep 6, 6:47 PM
python_modules/libraries/dagster-spark/dagster_spark/utils.py
1–2

yep, still streams!
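
A small standalone sketch of why a blocking call can still stream: `subprocess.check_call` blocks the caller until the child exits, but the child writes straight to the inherited stdout/stderr, so lines appear as they are produced. `subprocess.check_output`, by contrast, captures stdout and returns it only after the child finishes. The child script and function names here are made up for the demonstration and are unrelated to the Spark job.

```python
import subprocess
import sys

# Hypothetical child process: prints three lines with a short pause between them.
CHILD = (
    "import time\n"
    "for i in range(3):\n"
    "    print('tick', i, flush=True)\n"
    "    time.sleep(0.1)\n"
)


def stream_with_check_call():
    """Blocks until exit, but output shows up live on the inherited stdout."""
    return subprocess.check_call([sys.executable, "-c", CHILD])


def capture_with_check_output():
    """Blocks until exit and returns stdout as a string; nothing is shown live."""
    return subprocess.check_output([sys.executable, "-c", CHILD], text=True)


if __name__ == "__main__":
    stream_with_check_call()
    captured = capture_with_check_output()
    print("captured after exit:", captured.splitlines())
```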

This revision was landed with ongoing or failed builds. Fri, Sep 6, 6:48 PM
This revision was automatically updated to reflect the committed changes.