
monitor sensor 4/[dagit] show origin runs on the sensor's page
ClosedPublic

Authored by yuhan on May 13 2021, 5:15 AM.
Referenced Files
F2893842: D7895.diff
Sun, Mar 26, 5:25 AM

Details

Summary

dagit changes for mvp

backend

  • add origin_run_ids to JobTick and JobTickData
  • fetch origin runs based on tick.origin_run_ids
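
The backend in this diff is Python and its code is not shown here, so the following is only a data-flow sketch in TypeScript of what "fetch origin runs based on tick.origin_run_ids" could amount to. The types `TickDataSketch` and `RunSketch`, the status values, and the in-memory `Map` standing in for run storage are all assumptions made for illustration; only `originRunIds` mirrors a name from the summary.

```typescript
// Hypothetical shapes for illustration; not the actual Dagster types.
interface TickDataSketch {
  status: 'SUCCESS' | 'SKIPPED' | 'FAILURE';
  // Mirrors `origin_run_ids` from the summary: ids of the runs the sensor reacted to.
  originRunIds: string[];
}

interface RunSketch {
  runId: string;
  status: string;
}

// Resolve a tick's origin runs from a run store (here a Map, standing in for run storage).
// Ids that are not found in the store are skipped rather than raising.
function fetchOriginRuns(tick: TickDataSketch, runsById: Map<string, RunSketch>): RunSketch[] {
  const runs: RunSketch[] = [];
  for (const id of tick.originRunIds) {
    const run = runsById.get(id);
    if (run !== undefined) {
      runs.push(run);
    }
  }
  return runs;
}

// Usage: one known id and one unknown id.
const runStore = new Map<string, RunSketch>([
  ['run-a', {runId: 'run-a', status: 'FAILURE'}],
  ['run-b', {runId: 'run-b', status: 'FAILURE'}],
]);
const originRuns = fetchOriginRuns(
  {status: 'SUCCESS', originRunIds: ['run-a', 'missing']},
  runStore,
);
```

Skipping unknown ids is a choice made for the sketch; the real resolver may handle deleted runs differently.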

dagit
ticks:

  • when the user hovers a success tick, the label shows origin run ids
  • when the user clicks a success tick, the pop-up shows the origin runs, so users can go to those runs and find the engine events reported by the monitor sensor. the modal differentiates "Reacted Runs" (open to naming suggestions) from "Requested Runs" with extra info icons.
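
As a rough illustration of the hover behavior described above, the sketch below builds a label for a tick from its origin run ids. `HoverTickSketch`, `tickHoverLabel`, the "Reacted to:" wording, and the 8-character id truncation are all hypothetical; the actual label text in TickHistory.tsx is not shown in this revision's summary.

```typescript
// Hypothetical tick shape; only `originRunIds` corresponds to a name in the summary.
interface HoverTickSketch {
  status: 'SUCCESS' | 'SKIPPED' | 'FAILURE';
  originRunIds: string[];
}

// Build the hover label: success ticks with origin runs list the (shortened) run ids,
// every other tick falls back to its status.
function tickHoverLabel(tick: HoverTickSketch): string {
  if (tick.status !== 'SUCCESS' || tick.originRunIds.length === 0) {
    return tick.status;
  }
  // Assumption: ids are shortened to their first 8 characters for display.
  const shortIds = tick.originRunIds.map((id) => id.slice(0, 8));
  return `Reacted to: ${shortIds.join(', ')}`;
}
```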

Screen Shot 2021-06-04 at 9.20.47 PM.png (3×2 px, 1 MB)

sensor page:

  • same sensor metadata table but says "sensor does not target a pipeline" in the pipeline tag
  • currently the page won't display "latest runs" because
    1. that section currently means the targeted runs (runs created by the sensor), so the page shows a warning instead to avoid confusion
    2. backend isn't ready - there is no easy way yet to fetch the runs that the sensor reacted to and reported back to.

Screen Shot 2021-06-04 at 9.20.39 PM.png (3×2 px, 743 KB)

next steps

  1. build graphql layer (probably storage too) for latest reacted runs
  2. show "Reacted Runs" in "Latest Runs" and differentiate it from "Requested Runs" similar to the tick modal - if the approach makes sense

Test Plan

dagit

Diff Detail

Repository
R1 dagster
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

It seems like these runs could be interpreted as runs that were triggered by the sensor - how do we expect users to tell the difference?

Yep, I agree with @sandyryza... It's nice that we can distinguish between skip ticks (no runs failed) and ticks where the monitor function ran successfully (reported into a failed run), but it'd be nice to be able to do that without using the same UI that other sensors use for run requests. What would the UI look like if we did a retry request?

I'm imagining a pipeline sensor/monitor page will include three sections (panels):

  1. "Tick History": individual ticks: skipped, failed, succeeded. a successful tick would link to its associated runs - once we have 2 and 3 built, we can use the same UI to differentiate them in the tick's pop-up
  2. latest runs it reacted to - originating runs that the monitor will report back to.
  3. latest runs it created - we haven't enabled yielding a run request yet, but it will be the retry request case.

I'm only showing 1 on the page for the MVP. I'm thinking of 2 vs 3 as left vs right panels, where one shows the "input runs" and the other shows the "output runs".

I missed the other diffs in this stack so will defer to the reviewers who are up to speed. Happy to be added back if there are specific questions for me

Reacted Runs vs Requested Runs

yuhan edited the test plan for this revision.

mostly nits... biggest question is around the term "Reacted Runs". Just catching up to the latest thinking on it.

js_modules/dagit/packages/core/src/jobs/TickHistory.tsx
263

are we planning on supporting two non-empty lists of runs here (e.g. both requested runs and reacted runs)? if not, maybe we don't need to have the different sections and instead just have some sort of text description.

266

is this a new term that we're going to refer to a lot (i.e. should we introduce some sort of public reaction API to the sensor context)? if not, maybe we can have this be a little more narrowly scoped to "failed runs" or something specific to pipeline_failure_sensor?

282

style nit: not a huge fan of all-caps for emphasis. would prefer to distinguish using color / font-weight / font-size instead.

yuhan marked an inline comment as done.
yuhan edited the summary of this revision.

up

js_modules/dagit/packages/core/src/jobs/TickHistory.tsx
263

My thinking was to use the different sections to differentiate the pipeline failure sensor ticks modal from a regular sensor tick modal: in a regular sensor page, you will see requested runs after clicking a success tick; in a pipeline failure sensor page, you will see failed runs the sensor reacted to.

So I wanted to use different sections to show both "Failed Runs" and "Requested Runs" (and leave the "Requested Runs" section blank with a text description):

image.png (932×2 px, 126 KB)
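
The section layout described in this reply can be sketched roughly as follows. This is not the code from TickHistory.tsx: `ModalSectionSketch`, `tickModalSections`, the `isFailureSensor` flag, and the empty-state strings are hypothetical; only the section titles "Failed Runs" and "Requested Runs" come from the discussion.

```typescript
// Hypothetical description of one section in the tick modal.
interface ModalSectionSketch {
  title: string;
  runIds: string[];
  // Text shown in place of the run list when the section has no runs.
  emptyText: string | null;
}

// A regular sensor tick shows only "Requested Runs". A pipeline failure sensor tick
// shows "Failed Runs" first and keeps an empty "Requested Runs" section with a description.
function tickModalSections(
  isFailureSensor: boolean,
  originRunIds: string[],
  requestedRunIds: string[],
): ModalSectionSketch[] {
  const requested: ModalSectionSketch = {
    title: 'Requested Runs',
    runIds: requestedRunIds,
    emptyText: requestedRunIds.length === 0 ? 'No runs were requested by this tick.' : null,
  };
  if (!isFailureSensor) {
    return [requested];
  }
  const failed: ModalSectionSketch = {
    title: 'Failed Runs',
    runIds: originRunIds,
    emptyText: originRunIds.length === 0 ? 'This tick did not react to any failed runs.' : null,
  };
  return [failed, requested];
}
```

Keeping the "Requested Runs" section present but empty leaves room for the retry request case discussed earlier in the thread, where a monitor could have both lists populated.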

266

"Failed Runs" then

This revision is now accepted and ready to land. Jun 15 2021, 12:06 AM