Page MenuHomePhabricator

Add repo stats to telemetry
ClosedPublic

Authored by catherinewu on Sun, May 17, 5:46 PM.

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

catherinewu created this revision.Sun, May 17, 5:46 PM
catherinewu requested review of this revision.Sun, May 17, 6:02 PM

add version

catherinewu edited the test plan for this revision. (Show Details)Mon, May 18, 10:20 PM
catherinewu added reviewers: schrockn, alangenfeld.

update doc string

update docs to remove cringe-y paragraph

I definitely think this is better than trying to re-derive the pipeline from the function args

python_modules/dagster/dagster/core/telemetry.py
225

feels like were jamming two substantially different things in to one function to share a relatively small amount of code/complexity - I bet with some light refactoring you could have a separate log_ function for this metadata and things will feel much better

226–227

theres a lot being swallowed by this broad except - I agree that telemetry should fail silently for transient error - but if something was wrong in the instanceof checks above for example we would never find it

231–243

does this need to be this flexible given the current 2 callsites? Part of the reason i argued for supplying the pipeline directly is we could be really clear about the expected type (and use check.inst_param for example)

232

could add an interface if we wanted for reconstructable pipeline

address comments

schrockn requested changes to this revision.Wed, May 20, 1:07 PM
schrockn added inline comments.
python_modules/dagster/dagster/cli/pipeline.py
371–373

we should get the instance first and thread it down log_repo_starts. Don't add additional calls to DagsterInstance.get()

python_modules/dagster/dagster/core/telemetry.py
227

minor nit: i can be nice to do early returns to avoid excessive nesting:

e.g.

if not os.path.exists(telemetry_id_path):
   return None

with open(telemetry_id_path, 'r') as telemetry_id_file:
   telemetry_id_yaml = yaml.safe_load(telemetry_id_file)
   if not isinstance(telemetry_id_yaml.get(INSTANCE_ID_STR), six.string_types):
     return None
   return telemetry_id_yaml[INSTANCE_ID_STR]
238

as noted above, thread an instance down to here

239–248

@alangenfeld what should this look like in a completely migrated world? this is pretty rough

253–261

I'd consider making this a namedtuple with checked argument, if nothing that just for documentation

This revision now requires changes to proceed.Wed, May 20, 1:07 PM
alangenfeld added inline comments.Wed, May 20, 3:00 PM
python_modules/dagster/dagster/core/definitions/reconstructable.py
98 ↗(On Diff #14276)

consider just IReconstructablePipeline I prefix for "interface"

python_modules/dagster/dagster/core/telemetry.py
239–248

based on the two existing callsites - the check above should change to ReconstructableExecutablePipeline and then this if else block goes away

catherinewu marked 6 inline comments as done.

up

put _set_telemetry_instance_id into a try/catch in case we encounter an error writing the user's file system

python_modules/dagster/dagster/core/telemetry.py
227

oh yeah, thats nicer

alangenfeld accepted this revision.Wed, May 20, 10:51 PM

looksgood

python_modules/dagster/dagster/core/telemetry.py
83–91

put some explanation here about why we want to flatten to a single schema type from the different actions.

maybe a name like TelemetryEntry, TelemetryRow, Telemetry___ would be good too

274

maybe slap some weird sigils in here like '<<unable_to_write_instance_id>>'

schrockn accepted this revision.Thu, May 21, 1:12 AM

good stuff thanks for knocking this out

python_modules/dagster/dagster/core/telemetry.py
228

This is interesting. This is for a later diff but I wonder if we need to start having the requirement that wrapped functions have an instance argument so that we make sure we get the correct one...

256

nesty nesty

This revision is now accepted and ready to land.Thu, May 21, 1:12 AM

@catherinewu you're going to have to rebase on https://dagster.phacility.com/D3022 which i just landed.

This revision was automatically updated to reflect the committed changes.