This is an intentionally inflexible EcsRunLauncher that makes a couple
of assumptions about your ECS configuration:
- The process that initializes the launcher is also running in an ECS task (i.e., you can't mix and match this EcsRunLauncher with something running in K8s)
- That task runs with launchType = "FARGATE" and networkMode = "awsvpc"
This is how ECS is configured when using our initial ECS reference
deployment.
By making these assumptions, we can simplify the behavior of the
launcher. The launcher doesn't need to create its own task definition,
and the user doesn't need to separately configure a task definition for
runs. Instead, the launcher can inherit its task definition from its
parent process and override the command. For example, in the ECS
reference deployment, the parent process will be running a gRPC server.
The launcher will launch a run in a task using the same definition but
will override the command with dagster api execute_run.
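A minimal sketch of the override described above, assuming the boto3 ECS
client's run_task call is used to launch the run. The helper name
build_run_task_kwargs and its parameters are hypothetical, and any
further arguments to dagster api execute_run are omitted:

```python
from typing import Any, Dict, List


def build_run_task_kwargs(
    cluster: str,
    task_definition: str,
    container_name: str,
    subnets: List[str],
    command: List[str],
) -> Dict[str, Any]:
    """Build keyword arguments for an ECS run_task call (hypothetical helper).

    The task definition is reused unchanged from the parent task; only the
    container command is overridden.
    """
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        # Matches the launcher's assumptions: Fargate with awsvpc networking.
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {"subnets": subnets},
        },
        "overrides": {
            "containerOverrides": [
                {"name": container_name, "command": command},
            ],
        },
    }


kwargs = build_run_task_kwargs(
    cluster="my-cluster",              # placeholder values for illustration
    task_definition="my-family:3",
    container_name="dagster",
    subnets=["subnet-0123"],
    command=["dagster", "api", "execute_run"],
)
# With a real client: boto3.client("ecs").run_task(**kwargs)
```

Building the arguments separately from the API call keeps the override
logic easy to test without AWS credentials.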
To figure out what task our parent process is running in, we need to
introspect its task metadata:
https://docs.aws.amazon.com/AmazonECS/latest/userguide/task-metadata-endpoint-v4-fargate.html
In tests, we stub these responses since we aren't running on Fargate
and this metadata isn't set.
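As a rough sketch of that introspection: on Fargate, the v4 endpoint's
base URI is exposed in the ECS_CONTAINER_METADATA_URI_V4 environment
variable, and appending /task returns the task-level metadata. The
function names and the injectable fetch parameter are assumptions made
here so that the response can be stubbed outside of Fargate:

```python
import json
import os
from typing import Any, Callable, Dict
from urllib.request import urlopen


def _http_get_json(url: str) -> Dict[str, Any]:
    """Fetch a URL and decode the JSON body."""
    with urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def current_task_metadata(
    fetch: Callable[[str], Dict[str, Any]] = _http_get_json,
) -> Dict[str, Any]:
    """Return the metadata of the ECS task this process is running in.

    Raises KeyError when the environment variable is missing, which is the
    case whenever the process is not running inside an ECS task.
    """
    base_uri = os.environ["ECS_CONTAINER_METADATA_URI_V4"]
    return fetch(f"{base_uri}/task")
```

In a test, the environment variable can be set to any placeholder and
fetch replaced with a function that returns a canned dictionary
containing keys such as "Cluster" and "TaskARN".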
If we eventually allow the launcher to register its own task
definition, we'll need to be careful not to recreate task definitions
that already exist (since AWS imposes limits on the number of active
task definitions you can create) and to garbage collect (mark as
inactive) older task definitions.
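One possible shape for that garbage collection, sketched under the
assumption that a boto3-style ECS client is passed in. It is not
something the launcher does today, and deregister_old_revisions and its
keep parameter are hypothetical:

```python
from typing import Any, Dict, List


def deregister_old_revisions(ecs: Any, family: str, keep: int = 1) -> List[str]:
    """Mark all but the newest `keep` active revisions of a family INACTIVE."""
    arns: List[str] = []
    kwargs: Dict[str, Any] = {
        "familyPrefix": family,
        "status": "ACTIVE",
        "sort": "DESC",  # newest revisions first
    }
    while True:
        page = ecs.list_task_definitions(**kwargs)
        arns.extend(page.get("taskDefinitionArns", []))
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token

    # familyPrefix is a prefix match, so keep only exact family matches.
    # ARNs end in ".../<family>:<revision>".
    exact = [
        arn for arn in arns
        if arn.rsplit("/", 1)[-1].rsplit(":", 1)[0] == family
    ]

    stale = exact[keep:]
    for arn in stale:
        ecs.deregister_task_definition(taskDefinition=arn)
    return stale
```

Avoiding duplicate registrations would additionally require comparing a
candidate definition against the newest active revision before calling
register_task_definition; that comparison is left out here.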
Once the task is created, we tag it with the run_id. That way, we can
reference it later. To look up a task based on its tag, we first need to
list all tasks and then check each one to see if it has the tag. AWS
also has a ResourceGroupsTaggingAPI that could allow us to achieve this
with fewer API requests; that's something we can enhance later.
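The list-then-check lookup might look roughly like the following,
assuming a boto3-style ECS client. The function name and the tag_key
parameter are hypothetical; the text above only says the task is tagged
with the run_id, not which key is used:

```python
from typing import Any, Dict, Optional


def find_task_arn_by_tag(
    ecs: Any, cluster: str, tag_key: str, tag_value: str
) -> Optional[str]:
    """Return the ARN of the first listed task carrying the given tag."""
    kwargs: Dict[str, Any] = {"cluster": cluster}
    while True:
        # list_tasks returns only ARNs, so tags need a second request.
        page = ecs.list_tasks(**kwargs)
        arns = page.get("taskArns", [])
        if arns:
            described = ecs.describe_tasks(
                cluster=cluster, tasks=arns, include=["TAGS"]
            )
            for task in described.get("tasks", []):
                for tag in task.get("tags", []):
                    if tag.get("key") == tag_key and tag.get("value") == tag_value:
                        return task["taskArn"]
        token = page.get("nextToken")
        if not token:
            return None
        kwargs["nextToken"] = token
```

Each page costs two requests (one list_tasks, one describe_tasks), which
is the overhead a tagging-API lookup could reduce. Note that list_tasks
only returns running tasks unless a different desiredStatus is passed.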