Page MenuHomePhabricator

Add a RepositoryLocationOrigin class that separates out the loading instructions for a repo location from the actual repository data that it loads
ClosedPublic

Authored by dgibson on Oct 27 2020, 3:06 AM.

Details

Summary

The high level goal here is to support dagit being able to better handle failures while loading repository locations.

Right now, if there's an issue loading a repo location, we don't have a great way to keep the parts of the repo location that could be used to reload it - most of the repo location handles, for example, assume that you have already successfully loaded the code pointers. This diff adds a separate RepositoryLocationOrigin class hierarchy that just captures the origin of the repo location (much like RepositoryOrigin or PipelineOrigin) - this is all information that we can count on being there before we load any user code. Now, the workspace load path just loads a list of these origins, then passes them into the Workspace (which creates handles from them, doing all the loading then). This paves the way for us to be able to reload repo locations that never loaded successfully in the first place.

As part of this work, we also need to make it so that default repo location names don't depend on the load being successful (otherwise there would be no name for repo locations that fail to load).

This diff doesn't have any actual functionality changes (other than the repo location name changes) - it just splits out handles from origins to lay groundwork for making those changes next.

Test Plan

BK

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Oct 27 2020, 4:16 AM
Harbormaster failed remote builds in B20209: Diff 24525!

use the relative filepath to set the location name for a python_file location when one isn't specified (rather than the absolute filepath), to make the name less crazy-long

more tweaks to the default location name to make it more human-readable while still being solely determined from the yaml and not the loaded repo data (reflected in the test_hello_world_XXX test)

this seems great to me -we can keep messing with default location names as we go

makeitso

python_modules/dagster/dagster/core/host_representation/origin.py
50–51

add a comment block about the purpose and mental model for this

This revision is now accepted and ready to land.Oct 28 2020, 10:59 PM