Page MenuHomePhabricator

[sensors-4] add sensors to scheduler event loop
ClosedPublic

Authored by prha on Tue, Nov 3, 10:04 PM.

Details

Summary

1st pass at evaluating sensors in the schedule event loop.

There's a fair amount of code duplication, but can start combining if this ends up being the right approach.

This doesn't allow for customization of the evaluation interval or provide for threading to mitigate for latency from one sensor evaluation to affect other sensors / schedules.

Test Plan

bk

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

prha requested review of this revision.Tue, Nov 3, 10:21 PM
dgibson requested changes to this revision.Wed, Nov 4, 3:06 AM

this seems reasonable, time for tests?

python_modules/dagster/dagster/scheduler/sensor.py
55

update copy

82–84

this will be using external repository origin to work if you make the changes from the previous diff so I won't belabor that

98

setting the timestamp here doesn't mess up future checks if it fails does it? like if it suddenly crashed after doing this write and started up again, would it be idempotent?

If it would that's probably undesirable and maybe we should filter be completed ticks when determining the checkpoint time?

This revision now requires changes to proceed.Wed, Nov 4, 3:06 AM
prha marked 2 inline comments as done.

lint

will keep working through the stack - seems like this sensor format is not the final version since it just returns a boolean? and isn't currently offering any idempotency/run-exactly-once guarantees right?

python_modules/dagster/dagster/scheduler/scheduler.py
109–116 ↗(On Diff #25281)

this is going to need a rebase as this method no longer exists- should probably be a new SensorDaemon in the dagster daemon loop in dagster/daemon/daemon.py? Should be functionally the same though

python_modules/dagster/dagster/scheduler/sensor.py
43–50

Probably worth chatting about this part IRL, having some trouble totally following it - seems hard to guarantee idempotence/exactly-one-run-per-sensor-execution for? what if it fails in the middle of doing this?

Do we want to maintain the idempotence guarantees for sensors?

105–112

i had assumed the cursor would be based on the execution keys of the created runs, not just a string version of the current time? For idempotence purposes?

145–146

are we eventually going to include the cursor here? seems like it would need to be part of this in order for it to be useful?

python_modules/dagster/dagster_tests/scheduler_tests/test_sensor_run.py
35–45

this formatting is going to be updated right?

157

rename

The scaffolding here seems fine, we will need to sort out the exact details of the recovery-from-failure rules but it seems like that is happening later in the stack where we will actually use the execution_key and cursor params?

Will have a bit of a painful rebase though

This revision is now accepted and ready to land.Tue, Nov 17, 3:10 PM
This revision was landed with ongoing or failed builds.Mon, Nov 23, 3:49 PM
This revision was automatically updated to reflect the committed changes.