Page MenuHomePhabricator

Add cron filters for ScheduleDefinitions
ClosedPublic

Authored by sashank on Thu, Oct 3, 12:35 AM.

Details

Reviewers
alangenfeld
Group Reviewers
Restricted Project
Commits
R1:9b6881b9a80f: Add cron filters for ScheduleDefinitions
Summary

Adds a cron_filter argument to ScheduleDefinition, which is a function that returns a boolean. It is called right before executing a schedule, and a schedule is only executed if the cron_filter returns true.

Since cron_filter is a function and is not serializable, this diff includes a small refactor to ScheduleDefinition. The parts of ScheduleDefinition which need to be passed down to a Schedule are stored in ScheduleDefinitionData. Also, execution_params are no longer saved to an artifact file – they are grabbed from the code on disk in the ScheduleDefinition at schedule execution time.

An example schedule that runs only on weekdays:

def many_events_every_minute_filter():
    weekno = datetime.datetime.today().weekday()
    # Returns true if current day is a weekday
    return weekno < 5

many_events_every_minute = ScheduleDefinition(
    name="many_events_every_min",
    cron_schedule="* * * * *",
    execution_params={
        "environmentConfigData": {"storage": {"filesystem": {}}},
        "selector": {"name": "many_events", "solidSubset": None},
        "mode": "default",
    },
    cron_filter=many_events_every_minute_filter,
)
Test Plan

test_start_scheduled_execution_with_cron_filter

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

sashank created this revision.Thu, Oct 3, 12:35 AM
sashank updated this revision to Diff 5464.Thu, Oct 3, 12:50 AM

make gql

sashank edited the summary of this revision. (Show Details)Thu, Oct 3, 4:04 PM
sashank added a reviewer: Restricted Project.
sashank edited the summary of this revision. (Show Details)
sashank edited the summary of this revision. (Show Details)
sashank edited the summary of this revision. (Show Details)
sashank edited the summary of this revision. (Show Details)
sashank edited the summary of this revision. (Show Details)Thu, Oct 3, 4:08 PM

I just dont like the name cron_filter - theres nothing cron specific about this lifecycle hook and im meh on filter though could deal with it. Throwing out ideas should_proceed should_execute on_before_execute preflight_check before_execute

sashank updated this revision to Diff 5498.Thu, Oct 3, 5:47 PM

Good call - I like should_execute

  • cron_filter -> should_execute
alangenfeld accepted this revision.Thu, Oct 3, 5:51 PM

goforit

python_modules/dagster/dagster/core/definitions/schedule.py
8–9

ScheduleDefinitionData

dont love this name - dont have a better idea

This revision is now accepted and ready to land.Thu, Oct 3, 5:51 PM
sashank added inline comments.Thu, Oct 3, 5:52 PM
python_modules/dagster/dagster/core/definitions/schedule.py
8–9

agreed - it's internal so we can change it later

sashank updated this revision to Diff 5502.Thu, Oct 3, 5:56 PM

lint + rebase

sashank updated this revision to Diff 5512.Thu, Oct 3, 6:23 PM

rebase again