[spark 1/N] Move Spark logic to a Spark resource
Abandoned, Public

Authored by natekupp on Oct 15 2019, 9:47 PM.

Details

Reviewers
alangenfeld
Group Reviewers
Restricted Project
Summary

Cleaning up / improving the interface for defining Spark jobs.

The motivation here is that I'd like to better deliver on the promise of seamless local development <> prod deployment.

To do that for, e.g., local Spark <> EMR, I think we need to decouple the solid implementation from the environment configuration required to physically execute the Spark job, i.e. move the latter to a resource, as this diff does. Otherwise, the user is forced to rewrite their jobs with different solids for production.

The downside is that we end up with one resource definition per solid (see the multiple jobs test). But, with a bit more work here, at least you wouldn't have to replace all of the solids in your production version; you could just swap out the local Spark resources for EMR resources.
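
A minimal sketch of the decoupling described above, written in plain Python rather than against the dagster API. Every name in it (`LocalSparkResource`, `EmrSparkResource`, `plan`, `MODES`, the cluster id, the class and jar names) is hypothetical and not taken from this diff; the resources only describe a submission and do not run anything.

```python
from typing import Dict


class LocalSparkResource:
    """Hypothetical resource: plans a job for a local Spark master."""

    def __init__(self, master_url: str = "local[*]") -> None:
        self.master_url = master_url

    def plan(self, main_class: str, jar_path: str) -> Dict[str, object]:
        # Only describes the submission; nothing is executed here.
        return {
            "target": "local",
            "args": ["--master", self.master_url, "--class", main_class, jar_path],
        }


class EmrSparkResource:
    """Hypothetical resource: plans the same job for an EMR cluster."""

    def __init__(self, cluster_id: str) -> None:
        self.cluster_id = cluster_id

    def plan(self, main_class: str, jar_path: str) -> Dict[str, object]:
        return {
            "target": "emr",
            "cluster_id": self.cluster_id,
            "args": ["--class", main_class, jar_path],
        }


def spark_job(resources: Dict[str, object], main_class: str, jar_path: str):
    """Stand-in for a solid: it knows what to run, not where to run it."""
    return resources["spark"].plan(main_class, jar_path)


# Stand-in for per-environment resource wiring. Only this mapping differs
# between local development and production; spark_job is unchanged.
MODES = {
    "local": {"spark": LocalSparkResource()},
    "prod": {"spark": EmrSparkResource(cluster_id="j-EXAMPLE")},
}


def execute(mode: str) -> Dict[str, object]:
    return spark_job(MODES[mode], "com.example.WordCount", "wordcount.jar")
```

Calling `execute("local")` and `execute("prod")` runs the same `spark_job` against different resources. With several Spark solids, each one would still need its own resource entry in an arrangement like this, which is the one-resource-per-solid downside mentioned above.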

Would love ideas on whether there's a better way to do this!

Test Plan

unit

Diff Detail

Repository
R1 dagster
Branch
spark_resource
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

natekupp created this revision. Oct 15 2019, 9:47 PM
natekupp retitled this revision from Move Spark logic to a Spark resource to [spark 1/N] Move Spark logic to a Spark resource. Sun, Oct 20, 5:16 AM
natekupp retitled this revision from [spark 1/N] Move Spark logic to a Spark resource to Move Spark logic to a Spark resource. Sun, Oct 20, 5:40 AM
natekupp edited the summary of this revision.
natekupp added a reviewer: Restricted Project.
natekupp edited the summary of this revision. Sun, Oct 20, 5:47 AM
natekupp updated this revision to Diff 5910. Sun, Oct 20, 5:50 AM
natekupp edited the summary of this revision.

up

I'll have to read the details of the diff again later, but the high-level question I have is whether we want to introduce breaking changes before 0.7.0 or whether we have to try to support old and new simultaneously.

natekupp retitled this revision from Move Spark logic to a Spark resource to [spark 1/N] Move Spark logic to a Spark resource. Mon, Oct 21, 11:30 PM
alangenfeld requested changes to this revision. (Edited) Thu, Oct 24, 3:23 PM

Based on our in-person conversations, I think it's hard to determine whether this is the right direction without an example that proves it solves the high-level problem of being able to swap local/sparkX/sparkY effectively.

This revision now requires changes to proceed. Thu, Oct 24, 3:23 PM
natekupp abandoned this revision. Tue, Oct 29, 9:45 PM

I'm going to abandon this one in favor of iterating on the more comprehensive RFC diff.