
[prototype rfc] User Code Container Entrypoints in Dagit
Abandoned, Public

Authored by themissinghlink on Thu, Mar 26, 4:52 PM.

Details

Summary

This prototype hooks image handles into the dagit CLI and introduces a dagster container snapshot CLI. You can now build your user code as a container, and as long as your Dockerfile provides an ENTRYPOINT ["/usr/local/bin/dagster"] entrypoint, you are good to go. To watch this work, take the following Dockerfile, which represents the airline demo.

FROM dagster/dagster-py37:latest

RUN apt-get update \
    && mkdir -p /opt/dagster/ \
    && mkdir -p /opt/dagster/dagster_home

ADD ./dagster /opt/dagster

WORKDIR /opt/dagster
RUN pip install --upgrade pip \
    && make install_dev_python_modules

WORKDIR /opt/dagster/examples/dagster_examples/airline_demo

ENTRYPOINT ["/usr/local/bin/dagster"]

Then build an image by running docker build -t airline_user_code -f <path to dockerfile> .

Once your image is built, you can run the following from anywhere: dagit -p 3333 --image airline_user_code and everything should work!
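As a rough illustration of why the ENTRYPOINT matters: since the image's entrypoint is the dagster binary, any arguments placed after the image name in docker run are handed to dagster. The Python sketch below builds such a command for the prototype's container snapshot CLI. The function names are hypothetical, and the exact subcommand arguments and output format are assumptions, not the prototype's actual wiring.

```python
import subprocess
from typing import List


def build_snapshot_command(image: str) -> List[str]:
    """Build the docker argv that asks the image for a snapshot.

    Because the image's ENTRYPOINT is /usr/local/bin/dagster, everything
    after the image name is passed to dagster as arguments.
    """
    # "container snapshot" is the prototype CLI named in the summary; its
    # exact arguments are an assumption here.
    return ["docker", "run", "--rm", image, "container", "snapshot"]


def fetch_snapshot_text(image: str) -> str:
    """Run the container and return whatever the snapshot command prints."""
    completed = subprocess.run(
        build_snapshot_command(image),
        check=True,  # raise CalledProcessError on a non-zero exit code
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout as text
    )
    return completed.stdout
```

Calling fetch_snapshot_text("airline_user_code") would require Docker and the built image; build_snapshot_command can be inspected without either.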

Call Outs:

  • This really only works with the airline demo. I didn't want to spend too much time on this because Nick and Alex are building their own snapshots.
  • I am noticing that the whole ExecutionTargetHandle/ExecutionTargetData/Loader interface breaks down a bit once containers get thrown in. I was just cutting corners to get stuff working, but I would love to learn more about what their abstraction boundaries are (for example, why separate handle from data?).
  • Execution isn't "isolated" since we are serializing code. I haven't gotten around to the ContainerExecutionManager, which I will build tomorrow, but it shouldn't be hard.

I would love comments, and I figured this would be useful for our architecture meeting!

Test Plan

N/A

Diff Detail

Repository
R1 dagster
Branch
container-prototype-example (branched from master)
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

  • added docker as a dependency
  • self review fixes
themissinghlink retitled this revision from [prototype] User Code Container Entrypoints in Dagit to [prototype rfc] User Code Container Entrypoints in Dagit.
themissinghlink added a comment. Edited Thu, Mar 26, 5:51 PM

Moving forward, if this all looks good, I can start refining and producing landable diffs for the following:

  • The dagster container snapshot CLI.
  • The ImageLoader + wireup
themissinghlink edited the summary of this revision. Thu, Mar 26, 6:47 PM
alangenfeld added a comment. Edited Thu, Mar 26, 9:23 PM

ExecutionTargetHandle is such a mess already, so I'm skeptical that we want to do this image handling inside it, especially since what we are loading out of the image is actually going to be a RepositorySnapshot, which has a set of PipelineSnapshots, and not a real RepositoryDefinition. I have a hunch that putting a layer of indirection on top of ExecutionTargetHandle may be a better path.
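To make the distinction concrete, here is a minimal sketch of what a data-only repository snapshot could look like: plain, serializable records with no executable definitions attached. The field names (name, solid_names, pipeline_snapshots) and the JSON encoding are assumptions for illustration and are not dagster's actual RepositorySnapshot or PipelineSnapshot classes.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class PipelineSnapshot:
    # Hypothetical fields: just enough metadata to describe a pipeline.
    name: str
    solid_names: Tuple[str, ...]


@dataclass(frozen=True)
class RepositorySnapshot:
    # A repository snapshot is only a named collection of pipeline snapshots.
    name: str
    pipeline_snapshots: Tuple[PipelineSnapshot, ...]


def serialize(snapshot: RepositorySnapshot) -> str:
    """Encode the snapshot as JSON so it can cross the container boundary."""
    return json.dumps(asdict(snapshot), sort_keys=True)


def deserialize(text: str) -> RepositorySnapshot:
    """Rebuild the read-only snapshot from its JSON form."""
    raw = json.loads(text)
    pipelines = tuple(
        PipelineSnapshot(name=p["name"], solid_names=tuple(p["solid_names"]))
        for p in raw["pipeline_snapshots"]
    )
    return RepositorySnapshot(name=raw["name"], pipeline_snapshots=pipelines)
```

Because nothing in the snapshot is callable, the host process never needs to import user code to display it.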

python_modules/dagster/dagster/cli/container.py
11–20

Since this is the snapshot of a repo, I feel like dagster repository snapshot makes more sense.

Sounds good. I am going to take you all off as reviewers. Based on our meeting yesterday, the milestone is to modify all CLI commands in the dagster pipeline group so that users can pass in a --image <image name> option and get results from a read-only container with their user code. To make this happen, I am going to get the following revisions out:

  • A dagster repository CLI group with a snapshot command. This command will return a serialized representation of the repository snapshot that is made up of the pipeline snapshots that Nick has been working on!
  • An accompanying layer of indirection on top of `ExecutionTargetHandle`/`handle_for_entrypoint_cli_args` that hooks into the pipeline group commands and returns the relevant read-only snapshot data.
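One way to picture the layer of indirection in the second item is a small dispatcher that the pipeline commands call instead of touching the handle directly. This is only a sketch: load_repository_data and both loader callables are hypothetical names, and how the real commands would pass --image through is an assumption.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def load_repository_data(
    image: Optional[str],
    load_from_image: Callable[[str], T],
    load_from_handle: Callable[[], T],
) -> T:
    """Pick where repository data comes from for a CLI command.

    If the user passed --image, ask the container for read-only snapshot
    data; otherwise fall back to the existing in-process handle path.
    """
    if image is not None:
        return load_from_image(image)
    return load_from_handle()
```

Keeping the choice in one function means the individual commands do not need to know whether the data came from a container or from local code.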
themissinghlink abandoned this revision. Fri, Mar 27, 5:15 PM

I am gonna abandon this revision for queue management.