
Add a reference deployment for ECS
ClosedPublic

Authored by jordansanders on Tue, Jun 8, 8:52 PM.

Details

Summary

This demonstrates how to use docker-compose to deploy a Dagster stack to
ECS (and how to run the same stack locally for development). I chose the
docker-compose route over several alternatives
(CloudFormation template, AWS Copilot, Terraform, etc.):

  • CloudFormation and Terraform force our hand toward taking on too much devops responsibility (providing recommendations on how to set up VPCs, Load Balancers, etc.)
  • Copilot's tooling wasn't mature enough
  • Several users have recently expressed interest in trying to deploy our deploy_docker docker-compose.yaml to ECS.
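At a high level, the compose file follows the same shape as the deploy_docker example, with images pulled from a registry variable instead of built locally. A minimal sketch (service and variable names here are illustrative, not the exact file contents):

```yaml
# Hypothetical sketch of the compose layout -- not the literal file
version: "3.7"

services:
  dagit:
    image: ${REGISTRY_URL}/dagit
    ports:
      - "3000:3000"
  daemon:
    image: ${REGISTRY_URL}/daemon
  pipelines:
    image: ${REGISTRY_URL}/pipelines
  postgresql:
    image: postgres:11
    environment:
      POSTGRES_USER: postgres_user
      POSTGRES_PASSWORD: postgres_password
      POSTGRES_DB: postgres_db
```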

Test Plan

I'm mainly putting this up for comments right now but if we like the
direction it's heading, we could mimic the type of tests run in our
deploy_docker example.

Diff Detail

Repository
R1 dagster
Lint: Not Applicable
Unit Tests: Not Applicable

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed. Tue, Jun 8, 9:16 PM
Harbormaster failed remote builds in B31826: Diff 39195!
Harbormaster returned this revision to the author for changes because remote builds failed. Tue, Jun 8, 9:45 PM
Harbormaster failed remote builds in B31834: Diff 39203!

This direction looks good to me! And it seems like we can always mature the docker-compose over time when we need to.

I like this direction too.

I (and I think users as well) would benefit from a description of how and why this compose differs from the deploy_docker example (postgres? different image name scheme?). It would be super cool if the only difference between this and the deploy_docker example were a different dagster.yaml, maybe attached via a volume mount rather than baked in to the image.
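Sketching that volume-mount idea (paths here assume Dagster's conventional $DAGSTER_HOME layout; not what the diff currently does):

```yaml
services:
  dagit:
    image: ${REGISTRY_URL}/dagit
    volumes:
      # Overrides whatever dagster.yaml the image ships with,
      # so the image itself stays identical to deploy_docker
      - ./dagster.yaml:/opt/dagster/dagster_home/dagster.yaml
```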

I could be wrong but I think a lot of users will want to use RDS. Is postgres just for ease of trying things out?

yeah to johann's point - the docker-compose setup here is basically identical other than that the image comes from $REGISTRY_URL right? (and the RunLauncher will be an ECSRunLauncher eventually - although will that similarly be able to use the DockerRunLauncher with different Docker creds do you think?)

I don't think they necessarily need to share code, but making the difference clear seems like a good idea

Generally I think this looks good though

examples/deploy_ecs/Dockerfile_pipelines
23–24 ↗(On Diff #39203)

Can this be set in the docker-compose file instead? (This comment probably applies to the deploy_docker example as well.)
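i.e., something like the following (the variable shown is a hypothetical stand-in for whatever those Dockerfile ENV lines set):

```yaml
services:
  pipelines:
    environment:
      # Moved out of Dockerfile_pipelines so the image stays generic
      DAGSTER_HOME: /opt/dagster/dagster_home
```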

examples/deploy_ecs/README.md
20

We might want to clarify here how this would work when you have more than one repository location. 'pipelines' was intended to be an example name in the docker-compose example.

This revision is now accepted and ready to land. Wed, Jun 9, 2:08 PM

> I (and I think users as well) would benefit from a description of how and why this compose differs from the deploy_docker example (postgres? different image name scheme?). It would be super cool if the only difference between this and the deploy_docker example were a different dagster.yaml, maybe attached via a volume mount rather than baked in to the image.
>
> yeah to johann's point - the docker-compose setup here is basically identical other than that the image comes from $REGISTRY_URL right? (and the RunLauncher will be an ECSRunLauncher eventually - although will that similarly be able to use the DockerRunLauncher with different Docker creds do you think?)
>
> I don't think they necessarily need to share code, but making the difference clear seems like a good idea

I don't know yet if the ECSRunLauncher is actually going to share any of the mechanics of the DockerRunLauncher because doing so will probably require us to mount the Docker socket as a volume. I'd be a little surprised if we can do this in ECS and I have a hunch we'll probably need to spin up new tasks via the ECS API.
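To make the "spin up new tasks via the ECS API" hunch concrete, a hedged sketch of what that might look like with boto3 (cluster, task definition, and subnet names are placeholder assumptions, not the actual EcsRunLauncher):

```python
def build_run_task_kwargs(cluster, task_definition, subnets, run_id):
    """Build the arguments for an ecs.run_task() call for a single Dagster run."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {"subnets": subnets, "assignPublicIp": "ENABLED"}
        },
        # Tag the task so it can be correlated back to the Dagster run
        "tags": [{"key": "dagster/run_id", "value": run_id}],
    }


def launch_run(run_id):
    # Requires AWS credentials; values below are illustrative only
    import boto3

    ecs = boto3.client("ecs")
    return ecs.run_task(
        **build_run_task_kwargs(
            cluster="dagster",
            task_definition="dagster-run",
            subnets=["subnet-12345"],
            run_id=run_id,
        )
    )
```

No Docker socket needed: the launcher would hand the whole container lifecycle to ECS rather than talking to a local daemon.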

So the differences are basically:

  • no network definitions (just letting ECS take care of that for us so that it sets up things like service discovery using its defaults)
  • images come from the ECR repository
  • not exposing the docker socket

> I could be wrong but I think a lot of users will want to use RDS. Is postgres just for ease of trying things out?

Yeah, that's my assumption too. I'm including it for the same reason that the Helm charts include it - ease of getting up and running with an expectation that people will switch to their own database. I was considering adding a "Next steps" section with some high level instructions on what you might want to do to get this production ready.
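For instance, switching to RDS would presumably just mean pointing the postgres storage config in dagster.yaml at the RDS endpoint (a sketch; the env var names are illustrative):

```yaml
# dagster.yaml -- run storage pointed at an external (e.g. RDS) postgres
run_storage:
  module: dagster_postgres.run_storage
  class: PostgresRunStorage
  config:
    postgres_db:
      hostname:
        env: DAGSTER_PG_HOST   # e.g. mydb.abc123.us-east-1.rds.amazonaws.com
      username:
        env: DAGSTER_PG_USERNAME
      password:
        env: DAGSTER_PG_PASSWORD
      db_name:
        env: DAGSTER_PG_DB
      port: 5432
```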

examples/deploy_ecs/Dockerfile_pipelines
23–24 ↗(On Diff #39203)

👍

examples/deploy_ecs/README.md
20

Yeah, I can make it clearer that this is a general recipe for making your images available to ECR, but that you can provide your own images and create your own repositories (and that you don't even need to use ECR; you can use whatever Docker registry you want).
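The general recipe is the standard ECR flow, roughly (account ID and region are placeholders; any registry that docker push can reach works the same way):

```shell
# Authenticate docker to ECR (placeholders: account id, region)
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Create a repository per image (one-time)
aws ecr create-repository --repository-name dagit

# Build, tag, and push
docker build -t dagit .
docker tag dagit 123456789012.dkr.ecr.us-east-1.amazonaws.com/dagit
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/dagit
```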

Cherry-pick onto a branch that has the EcsRunLauncher.

Don't run tests (because none exist yet).

jordansanders retitled this revision from [RFC] Add a reference deployment for ECS to Add a reference deployment for ECS.
jordansanders edited the summary of this revision.

Clean up documentation, use a multi-stage dockerfile, attempt to fix tox.

examples/deploy_ecs/Dockerfile
2

@dgibson I noticed you're making some changes to the "deploy docker" example https://dagster.phacility.com/D8485.

Thoughts on this approach to using a multi-stage build instead of multiple similar dockerfiles? If we like it, we might want to make a similar change there as well.
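The multi-stage shape in question is roughly the following (a sketch; stage names, base image, and file names are illustrative, not the diff's exact contents):

```dockerfile
FROM python:3.8-slim AS base
RUN pip install dagster dagster-graphql dagster-postgres

# The "system" images share the base...
FROM base AS dagit
RUN pip install dagit
CMD ["dagit", "-h", "0.0.0.0", "-p", "3000"]

FROM base AS daemon
CMD ["dagster-daemon", "run"]

# ...and the user-code image layers the pipelines on top
FROM base AS pipelines
WORKDIR /opt/dagster/app
COPY repo.py .
CMD ["dagster", "api", "grpc", "-h", "0.0.0.0", "-p", "4000", "-f", "repo.py"]
```

Each image then builds from the one file via `docker build --target dagit .`, `--target pipelines`, etc., instead of maintaining multiple near-identical Dockerfiles.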

examples/deploy_ecs/Dockerfile
2

I don't feel super strongly either way. I expect users to rebuild the pipelines image much more frequently than the "system" ones like dagit and dagster, since it contains their code, but that doesn't seem incompatible with this.

Add a simple test to satisfy tox. Otherwise, pytest raises exit code 5 (no tests collected) so tox fails the build.
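The placeholder can be as small as this (hypothetical file name):

```python
# test_smoke.py -- placeholder so pytest collects at least one test
# and exits 0 instead of exit code 5 ("no tests collected")
def test_smoke():
    assert True
```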

This revision was automatically updated to reflect the committed changes.