Today

yuhan added inline comments to D7476: monitor sensor prototype.
Sat, May 15, 1:57 AM

Yesterday

yuhan accepted D7914: [docs] Re-add example for accessing resources within solid.
Fri, May 14, 8:49 PM
yuhan planned changes to D7476: monitor sensor prototype.

fixing db query logic

Fri, May 14, 8:43 PM
yuhan committed R1:fa5de79f23f2: 0.11.9 docs build (manual) (authored by yuhan).
0.11.9 docs build (manual)
Fri, May 14, 12:55 AM

Thu, May 13

yuhan accepted D7908: Remove 0.11.9rc8 docs.
Thu, May 13, 10:37 PM
yuhan added a comment to D7895: monitor sensor 4/[dagit] show origin runs on the sensor's page.

im imagining a pipeline sensor/monitor page will include three sections (panels):

  1. "Tick History": individual ticks: skipped, failed, succeeded. a successful tick would link to its associated runs - once we have 2 and 3 built, we can use the same ui to differentiate them in the tick's popup
  2. latest runs it reacted to - originating runs that the monitor will report back to.
  3. latest runs it created - we haven't enabled yielding a run request yet, but it will be the retry request case.
Thu, May 13, 9:46 PM
yuhan accepted D7904: update docs for sensor cursors.
Thu, May 13, 9:15 PM
yuhan added inline comments to D7476: monitor sensor prototype.
Thu, May 13, 8:45 PM
yuhan added a comment to D7476: monitor sensor prototype.

bumping my early comment in case you missed it.

In D7476#206864, @yuhan wrote:

naming spitballing:

  • pipeline_failure_sensor / pipeline_sensor which means it watches pipelines - accurate, no new noun
  • pipeline_failure_monitor / pipeline_monitor which creates a new noun "monitor" but the name "monitor" is more intuitive as it describes the use case not the machinery

as im writing, i think im leaning towards pipeline_failure_sensor / pipeline_sensor so we wont be introducing new nouns too early


If a monitor ultimately boils down to a sensor, is it worth introducing a new noun? I.e. should we just call this pipeline_failure_sensor?

i was thinking of subclassing the sensor def to a PipelineSensorDefinition in a later diff, and we can have a @pipeline_sensor to address the generic pipeline-sensing (i.e. monitoring) case.
it'd take arbitrary dagster event types and a set of pipeline pointers (e.g. pipeline_names), react to events and yield MonitorRequest. something like:

@pipeline_sensor(
    dagster_event_types=[DagsterEventType.PIPELINE_START],
    pipeline_names=["my_pipeline"],  # pipeline_names=None means it watches all pipelines
)
def alert_on_pipeline_start(context):
    ...

We probably will want to add a version that monitors / senses failures for a particular pipeline or set of pipelines. How do we imagine the name of that fitting in with the name of this API?

@pipeline_failure_sensor(pipeline_names=["my_pipeline"])
def alert_on_pipeline_failure_whitelisted(context):
    ...
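
a rough, self-contained sketch of how the two decorators above could relate, assuming the failure variant is just the generic one with the event type fixed. everything here is a hypothetical stand-in: EventType replaces DagsterEventType, and the decorators are toy versions, not the API in the diff.

```python
from enum import Enum
from typing import Callable, List, Optional


class EventType(Enum):
    """Stand-in for DagsterEventType (assumption, for illustration only)."""

    PIPELINE_START = "PIPELINE_START"
    PIPELINE_FAILURE = "PIPELINE_FAILURE"


def pipeline_sensor(
    event_types: List[EventType], pipeline_names: Optional[List[str]] = None
):
    """Toy decorator: call fn only for matching events and pipelines."""

    def decorator(fn: Callable[[str, EventType], None]):
        def evaluate(pipeline_name: str, event_type: EventType) -> bool:
            if event_type not in event_types:
                return False
            # pipeline_names=None means the sensor watches all pipelines
            if pipeline_names is not None and pipeline_name not in pipeline_names:
                return False
            fn(pipeline_name, event_type)
            return True

        return evaluate

    return decorator


def pipeline_failure_sensor(pipeline_names: Optional[List[str]] = None):
    """Failure variant: the generic sensor with the event type pinned."""
    return pipeline_sensor([EventType.PIPELINE_FAILURE], pipeline_names)


alerts = []


@pipeline_failure_sensor(pipeline_names=["my_pipeline"])
def alert_on_pipeline_failure(pipeline_name, event_type):
    alerts.append((pipeline_name, event_type.value))


alert_on_pipeline_failure("my_pipeline", EventType.PIPELINE_FAILURE)  # fires
alert_on_pipeline_failure("other_pipeline", EventType.PIPELINE_FAILURE)  # filtered out
alert_on_pipeline_failure("my_pipeline", EventType.PIPELINE_START)  # filtered out
```

with this shape, the naming question mostly reduces to what the generic decorator is called, since the failure variant inherits its noun.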
Thu, May 13, 8:15 PM
yuhan updated the diff for D7476: monitor sensor prototype.

rebase

Thu, May 13, 6:33 AM
yuhan retitled D7878: monitor sensor 1/ rename cursor to before_cursor, add arg after_cursor and ascending to RunStorage.get_runs from monitor sensor 1/ add arg after_cursor to RunStorage.get_runs to monitor sensor 1/ rename cursor to before_cursor, add arg after_cursor and ascending to RunStorage.get_runs.
Thu, May 13, 6:28 AM
yuhan updated the diff for D7878: monitor sensor 1/ rename cursor to before_cursor, add arg after_cursor and ascending to RunStorage.get_runs.

cursor -> before_cursor
+ ascending=False

Thu, May 13, 6:22 AM
yuhan requested review of D7895: monitor sensor 4/[dagit] show origin runs on the sensor's page.
Thu, May 13, 5:36 AM
yuhan updated the summary of D7476: monitor sensor prototype.
Thu, May 13, 5:17 AM
yuhan accepted D7855: [mypy] system_config/objects.py.
Thu, May 13, 4:20 AM
yuhan updated the diff for D7476: monitor sensor prototype.

mypy

Thu, May 13, 4:10 AM
yuhan added a comment to D7476: monitor sensor prototype.

naming spitballing:

  • pipeline_failure_sensor / pipeline_sensor which means it watches pipelines - accurate, no new noun
  • pipeline_failure_monitor / pipeline_monitor which creates a new noun "monitor" but the name "monitor" is more intuitive as it describes the use case not the machinery
Thu, May 13, 2:28 AM
yuhan updated the diff for D7476: monitor sensor prototype.

base decorator

Thu, May 13, 2:11 AM
yuhan added inline comments to D7476: monitor sensor prototype.
Thu, May 13, 2:05 AM
yuhan updated the summary of D7476: monitor sensor prototype.
Thu, May 13, 1:57 AM
yuhan updated the summary of D7476: monitor sensor prototype.
Thu, May 13, 1:56 AM
yuhan updated the diff for D7476: monitor sensor prototype.

+ MonitorRequest

Thu, May 13, 1:55 AM
yuhan updated the diff for D7880: monitor sensor 2/ allow nullable pipeline_name in sensor def.

+ is_monitor_sensor flag

Thu, May 13, 12:29 AM

Wed, May 12

yuhan added a comment to D7878: monitor sensor 1/ rename cursor to before_cursor, add arg after_cursor and ascending to RunStorage.get_runs.

to make sure im understanding it correctly, say we have runs = [1,2,3,4,5,6,7,8,9,10] and we will get:
get_runs(cursor=5, ascending=True, limit=2) = [6,7] where cursor is an after cursor
get_runs(cursor=5, ascending=False, limit=2) = [4,3] where cursor is a before cursor
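
a toy in-memory model of the question, assuming the usual keyset-pagination reading where the cursor itself is excluded and the runs nearest to it come first. this get_runs is a hypothetical stand-in over a plain list, not the real RunStorage.get_runs.

```python
from typing import List, Optional


def get_runs(
    runs: List[int],
    cursor: Optional[int] = None,
    ascending: bool = False,
    limit: Optional[int] = None,
) -> List[int]:
    """Toy model: order the runs, drop everything up to the cursor, apply limit."""
    ordered = sorted(runs, reverse=not ascending)
    if cursor is not None:
        if ascending:
            # "after" cursor: keep runs newer than the cursor
            ordered = [r for r in ordered if r > cursor]
        else:
            # "before" cursor: keep runs older than the cursor
            ordered = [r for r in ordered if r < cursor]
    return ordered if limit is None else ordered[:limit]


runs = list(range(1, 11))
after = get_runs(runs, cursor=5, ascending=True, limit=2)
before = get_runs(runs, cursor=5, ascending=False, limit=2)
```

under that assumption, `after` holds the two runs right after the cursor and `before` holds the two right before it, newest first.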

Wed, May 12, 7:18 PM
yuhan accepted D7868: Add Dagster alert events.

per offline discussion with @rexledesma - these events won't be persisted in OSS for now, so this should be safe to land - if we ever need to update, we can run a migration on our end.
i'd recommend stacking the call site diff and making sure the new events are working properly before landing :)

Wed, May 12, 6:56 PM
yuhan added inline comments to D7880: monitor sensor 2/ allow nullable pipeline_name in sensor def.
Wed, May 12, 6:53 PM
yuhan updated the test plan for D7880: monitor sensor 2/ allow nullable pipeline_name in sensor def.
Wed, May 12, 6:38 AM
yuhan requested review of D7880: monitor sensor 2/ allow nullable pipeline_name in sensor def.
Wed, May 12, 6:12 AM
yuhan updated the diff for D7476: monitor sensor prototype.

black

Wed, May 12, 5:49 AM
yuhan retitled D7476: monitor sensor prototype from pipeline hook prototype to monitor sensor prototype.
Wed, May 12, 5:45 AM
yuhan updated the diff for D7476: monitor sensor prototype.

up

Wed, May 12, 5:08 AM
yuhan updated the diff for D7476: monitor sensor prototype.

TODO:

  • allow querying multiple types: +DagsterEventType.PIPELINE_INIT_FAILURE
  • yield something to indicate the monitor success/error instead of "skipping"
Wed, May 12, 5:06 AM
yuhan requested review of D7878: monitor sensor 1/ rename cursor to before_cursor, add arg after_cursor and ascending to RunStorage.get_runs.
Wed, May 12, 4:35 AM

Tue, May 11

yuhan accepted D7864: Remove executor from concepts overview page.
Tue, May 11, 9:23 PM
yuhan requested changes to D7868: Add Dagster alert events.

to be clear, the diff itself looks good to me.
but bc im not clear about the use cases and where the call site would be, i'd like us to hold off on landing until we have a follow-up diff that shows the use case - just to avoid unnecessary backcompat parsing.

Tue, May 11, 8:26 PM
yuhan added a comment to D7868: Add Dagster alert events.

In which cases would we want these events? I'd like to slow down pushing this a bit because backward compat could be tricky, as they are going to be persisted in the event db.
questions are like:

  • what about errors?
  • do we need both start and success? - my take is alert execution itself usually is lightweight so not sure if start is necessary
Tue, May 11, 8:21 PM
yuhan planned changes to D7476: monitor sensor prototype.

not yet :)

Tue, May 11, 6:43 PM
yuhan added a comment to D7864: Remove executor from concepts overview page.

do we have redirects? if not, let's add the old/new url pair to https://github.com/dagster-io/dagster/blob/master/docs/next/util/redirectUrls.json

Tue, May 11, 5:30 PM
yuhan updated the diff for D7476: monitor sensor prototype.

@pipeline_failure_sensor

Tue, May 11, 5:28 AM

Mon, May 10

yuhan accepted D7850: [dagit] Fix sensor/schedule mutations.

thx for the quick fix 🙏

Mon, May 10, 11:00 PM
yuhan planned changes to D7476: monitor sensor prototype.

per offline convo, MVP is gonna be @failure_sensor, which watches all pipelines in a repo and targets 0 pipeline runs

Mon, May 10, 8:57 PM

Sat, May 8

yuhan accepted D7818: [docs] "base class" on concepts pages.
Sat, May 8, 1:44 AM

Fri, May 7

yuhan closed D7783: new arg of_type to EventLogStorage.get_logs_for_run.
Fri, May 7, 9:48 PM
yuhan committed R1:6d308238f48b: new arg of_type to EventLogStorage.get_logs_for_run (authored by yuhan).
new arg of_type to EventLogStorage.get_logs_for_run
Fri, May 7, 9:48 PM
yuhan retitled D7783: new arg of_type to EventLogStorage.get_logs_for_run from new arg dagster_event_type to EventLogStorage.get_logs_for_run to new arg of_type to EventLogStorage.get_logs_for_run.
Fri, May 7, 9:47 PM
yuhan added a comment to D7646: RFC solid RetryPolicy.

Do you have solid invocation > solid definition > pipeline default ? Do you allow applying them at each composition layer? If we did have TimeoutPolicy would it merge? Should SolidExecutionPolicy be a holder for one policy of each "type" and which is sourced given the decided precedence?

  • imo solid invocation > solid definition fallback sounds good.
  • im worried about pipeline-default solid-level policy vs pipeline-level policy. it's similar to the problem we are facing in hooks, though this one is simpler. from the lesson learned from the "solid-hook on a pipeline" confusion, if we are going to enable that pattern, i think we will need an explicit name to say "this is a solid retry policy, not a pipeline policy". that's ok in the max_retries case bc the difference is subtle, but it will result in a behavioral difference when it's a timeout policy.
  • adding one more layer to the party: should we also allow run config level policy : ) - i could imagine some pipeline takes too long and someone is tuning the perf via config and wants to change the retry/timeout policy via config too - again, not an mvp concern
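
a minimal sketch of the fallback order discussed above (solid invocation > solid definition > pipeline default), assuming the most specific layer that sets a policy wins outright and nothing is merged. RetryPolicy and resolve_retry_policy here are hypothetical illustrations, not the RFC's classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical policy holder for illustration."""

    max_retries: int


def resolve_retry_policy(
    invocation: Optional[RetryPolicy] = None,
    definition: Optional[RetryPolicy] = None,
    pipeline_default: Optional[RetryPolicy] = None,
) -> Optional[RetryPolicy]:
    """Return the first policy that is set, from most to least specific layer."""
    for policy in (invocation, definition, pipeline_default):
        if policy is not None:
            return policy
    return None


from_invocation = resolve_retry_policy(
    invocation=RetryPolicy(max_retries=5),
    definition=RetryPolicy(max_retries=2),
    pipeline_default=RetryPolicy(max_retries=1),
)
from_definition = resolve_retry_policy(
    definition=RetryPolicy(max_retries=2),
    pipeline_default=RetryPolicy(max_retries=1),
)
```

a run-config layer would just be one more entry at the front of that tuple, which is part of why the naming of each layer matters.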
Fri, May 7, 8:27 PM
yuhan accepted D7778: Update solid examples.

sogood

Fri, May 7, 8:09 PM
yuhan closed D7774: D7767 followup: check run_id in downstream.
Fri, May 7, 7:53 PM
yuhan committed R1:1b015a061014: D7767 followup: check run_id in downstream (authored by yuhan).
D7767 followup: check run_id in downstream
Fri, May 7, 7:53 PM
yuhan updated the diff for D7783: new arg of_type to EventLogStorage.get_logs_for_run.

up

Fri, May 7, 7:52 PM
yuhan updated the diff for D7783: new arg of_type to EventLogStorage.get_logs_for_run.

dagster_event_type -> of_type

Fri, May 7, 7:38 PM
yuhan added inline comments to D7783: new arg of_type to EventLogStorage.get_logs_for_run.
Fri, May 7, 7:34 PM
yuhan updated the diff for D7774: D7767 followup: check run_id in downstream.

update error msg

Fri, May 7, 7:16 PM
yuhan closed D7785: [docs] use class in search config.
Fri, May 7, 5:18 PM
yuhan committed R1:736ad8e56cd3: [docs] use class in search config (authored by yuhan).
[docs] use class in search config
Fri, May 7, 5:18 PM
yuhan closed D7784: [docs] util script that removes versioned content.
Fri, May 7, 5:18 PM
yuhan committed R1:48bc6ba9af09: [docs] util script that removes versioned content (authored by yuhan).
[docs] util script that removes versioned content
Fri, May 7, 5:18 PM
yuhan requested review of D7783: new arg of_type to EventLogStorage.get_logs_for_run.
Fri, May 7, 7:00 AM
yuhan requested review of D7785: [docs] use class in search config.
Fri, May 7, 6:39 AM
yuhan requested review of D7784: [docs] util script that removes versioned content.
Fri, May 7, 6:23 AM
yuhan added a comment to D7476: monitor sensor prototype.

reporting mechanism: insert some event in the logs of the originating run id which would point to the location of the compute logs of the hook, even though it's captured outside of the pipeline run context.

something similar to instance.report_event(origin_run_id, ...) sounds like a plan
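
a toy sketch of that idea, assuming report_event simply appends a message to the event log of the originating run. ToyInstance, its event_logs dict and the run id are hypothetical stand-ins, not the real instance API.

```python
from collections import defaultdict
from typing import Dict, List


class ToyInstance:
    """In-memory stand-in for an instance (assumption, for illustration only)."""

    def __init__(self) -> None:
        # maps a run id to the messages reported against that run
        self.event_logs: Dict[str, List[str]] = defaultdict(list)

    def report_event(self, origin_run_id: str, message: str) -> None:
        """Record a message in the originating run's log, outside that run's context."""
        self.event_logs[origin_run_id].append(message)


instance = ToyInstance()
instance.report_event("run_123", "failure sensor finished; see hook compute logs")
```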

Fri, May 7, 2:29 AM
yuhan added a comment to D7476: monitor sensor prototype.

I like the direction alex is proposing. this is my mental model of this direction:

image.png (460×1 px, 84 KB)

a sensor's flow could look like that^ where

  • the input could be either
    • external states (e.g. s3 file)
    • dagster-aware states (e.g. pipeline failure, pipeline success, dagster-aware assets)

( ^ we currently don't differentiate these two and for sensor-variant i don't think we need to either. but in terms of execution control, we could eventually separate the evaluation based on this and make the evaluation/execution/triggering more efficient and robust.)

Fri, May 7, 1:02 AM

Thu, May 6

yuhan requested review of D7774: D7767 followup: check run_id in downstream.
Thu, May 6, 7:18 PM
yuhan closed D7767: add run group tags to backfill jobs.
Thu, May 6, 7:13 PM
yuhan committed R1:5293ab6baf62: add run group tags to backfill jobs (authored by yuhan).
add run group tags to backfill jobs
Thu, May 6, 7:13 PM
yuhan updated the diff for D7767: add run group tags to backfill jobs.

backoff :/ - will handle error in a follow up

Thu, May 6, 6:51 PM
yuhan updated the diff for D7767: add run group tags to backfill jobs.

mypy

Thu, May 6, 6:29 PM
yuhan updated the diff for D7767: add run group tags to backfill jobs.

check invariant in downstream

Thu, May 6, 6:20 PM
yuhan added inline comments to D7767: add run group tags to backfill jobs.
Thu, May 6, 6:20 PM
yuhan added a comment to D7476: monitor sensor prototype.

How would this work in a world where we encourage people to print (or whatever) instead of using context.log?

i think either way, we can thread the message and report it back to the original run like [1]

Thu, May 6, 6:08 PM
yuhan added inline comments to D7767: add run group tags to backfill jobs.
Thu, May 6, 5:57 PM
yuhan added inline comments to D7767: add run group tags to backfill jobs.
Thu, May 6, 5:45 PM
yuhan updated the diff for D7767: add run group tags to backfill jobs.

up

Thu, May 6, 5:44 PM
yuhan published D7767: add run group tags to backfill jobs for review.
Thu, May 6, 5:36 PM

Wed, May 5

yuhan accepted D7751: check in build files from partition docs changes.
Wed, May 5, 10:40 PM
yuhan added inline comments to D7476: monitor sensor prototype.
Wed, May 5, 9:02 PM
yuhan updated the diff for D7476: monitor sensor prototype.

up

Wed, May 5, 7:48 PM
yuhan added reviewers for D7476: monitor sensor prototype: alangenfeld, prha, sandyryza.
Wed, May 5, 6:52 PM
yuhan updated the summary of D7476: monitor sensor prototype.
Wed, May 5, 7:31 AM
yuhan updated the summary of D7476: monitor sensor prototype.
Wed, May 5, 7:21 AM
yuhan retitled D7476: monitor sensor prototype from WIP convert hook_fn to a dummy pipeline and a sensor to pipeline hook prototype.
Wed, May 5, 7:21 AM
yuhan updated the diff for D7476: monitor sensor prototype.

no need to yield PipelineHookRunSuccess

Wed, May 5, 7:19 AM
yuhan updated the summary of D7476: monitor sensor prototype.
Wed, May 5, 7:16 AM
yuhan updated the summary of D7476: monitor sensor prototype.
Wed, May 5, 7:10 AM
yuhan updated the diff for D7476: monitor sensor prototype.

working but needs a lot of cleanup
sensor daemon + invoke hooks in sensor's evaluation_fn

Wed, May 5, 7:06 AM

Tue, May 4

yuhan accepted D7723: fix missing images on backfills page.
Tue, May 4, 8:04 PM

Sat, May 1

yuhan added a comment to D7689: update docs / examples to use `metadata` parameter instead of `metadata_entries`.

the solid-events.mdx page mentions both EventMetadata and EventMetadataEntry. having both without explaining the difference/relation on the page seems a bit confusing.

Sat, May 1, 12:45 AM

Thu, Apr 29

yuhan closed D7649: [docs] remove rc versioned content.
Thu, Apr 29, 6:08 PM
yuhan committed R1:9099ce588f27: [docs] remove rc versioned content (authored by yuhan).
[docs] remove rc versioned content
Thu, Apr 29, 6:06 PM
yuhan closed D7652: [docs] add HookContext.solid_exception example.
Thu, Apr 29, 5:32 PM
yuhan committed R1:38be68a7ebf7: [docs] add HookContext.solid_exception example (authored by yuhan).
[docs] add HookContext.solid_exception example
Thu, Apr 29, 5:32 PM
yuhan updated the diff for D7652: [docs] add HookContext.solid_exception example.

up

Thu, Apr 29, 5:10 PM

Wed, Apr 28

yuhan requested review of D7652: [docs] add HookContext.solid_exception example.
Wed, Apr 28, 11:03 PM
yuhan accepted D7650: include cereal.csv in assets.
Wed, Apr 28, 10:45 PM
yuhan updated the test plan for D7649: [docs] remove rc versioned content.
Wed, Apr 28, 10:36 PM
yuhan requested review of D7649: [docs] remove rc versioned content.
Wed, Apr 28, 10:28 PM
yuhan requested review of D7647: wip - log file path in io managers including intermediate storages.
Wed, Apr 28, 10:19 PM
yuhan added a comment to D7632: Default to re-executing step selection of current run if one is present.

drive by thought: agreed "full pipeline" is misleading (and i was the one that named it). maybe "Root Run (<insert solid selection if applicable>)"

Wed, Apr 28, 9:56 PM
yuhan accepted D7573: remove unsatisfied inputs from advanced tutorial sections.

good to land it, and let's follow up on the csv link in a separate diff then (would be great to get the link change into tmr's release tho)

Wed, Apr 28, 9:53 PM
yuhan added a comment to D7619: RFC: add basic event metadata using plain dictionary.

I like this direction! but let's make sure we have docs and examples ready when we release this change

Wed, Apr 28, 9:51 PM