#ft just FYI, this is a follow-up on a conversation with @alangenfeld, so he is the primary reviewer :)
**Goal:** implement a single `spark_solid` that you can _actually_ develop on locally and deploy on EMR without overhauling all your code
This RFC substantially refactors `dagster-spark` and EMR.
The main code to look at is `create_spark_solid()` in `dagster_aws/emr/solids.py#47`, which defines a solid usable in either a "local" Spark or an EMR context. The test in `dagster_aws_tests/emr_tests/test_combined_solid.py` exercises it.
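To make the idea concrete, here is a rough sketch of the dispatch pattern such a combined solid implies: the same job spec becomes a `spark-submit` invocation locally, or an EMR step definition on a cluster. All names below (`build_spark_command`, the `mode` strings) are illustrative only, not the actual `dagster_aws` API.

```python
def build_spark_command(mode, main_class, jar_path, master_url="local[*]"):
    """Return a spark-submit argv (local) or an EMR step dict for one Spark job.

    Hypothetical sketch; the real create_spark_solid() builds this from
    solid config, not function arguments.
    """
    if mode == "local":
        # Launch via spark-submit against whatever master_url points at,
        # which may be a remote standalone/YARN cluster.
        return [
            "spark-submit",
            "--master", master_url,
            "--class", main_class,
            jar_path,
        ]
    elif mode == "emr":
        # On EMR, the same jar runs as a cluster step via command-runner.jar.
        return {
            "Name": "spark_solid step",
            "ActionOnFailure": "CANCEL_AND_WAIT",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "--class", main_class, jar_path],
            },
        }
    raise ValueError(f"unknown mode: {mode}")
```

The point is that user code only describes the job (jar, main class); which launcher runs it is an environment concern.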
**Also to note:**
- "Local" Spark could very well be a Spark client on a remote Dagster worker, which could in turn be pointing to a full-blown Spark cluster via `master_url`. This is the way you'd launch a Spark job for any deployment environment other than EMR or Dataproc.
- I _think_ this code is more or less directly usable for real production-grade pyspark workloads as well. It will just need a few minor tweaks to point to Python file targets instead of jars and main classes.
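Those "few minor tweaks" for pyspark would look roughly like this: the `spark-submit` argument list differs only in how the target is named (a `.py` file needs no `--class`). The function and parameter names here are made up for the sketch.

```python
def submit_args(target, main_class=None, master_url="local[*]"):
    """Build a spark-submit argv for either a JVM jar or a pyspark file.

    Illustrative only; not part of dagster-spark today.
    """
    args = ["spark-submit", "--master", master_url]
    if target.endswith(".py"):
        # pyspark: pass the Python file directly, no main class needed
        args.append(target)
    else:
        # JVM job: need the entry-point class plus the application jar
        args += ["--class", main_class, target]
    return args
```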