Page MenuHomePhabricator

Add integration test for using DagsterDaemonScheduler in helm chart, renmame run coordinator integration test to daemon integration test
ClosedPublic

Authored by dgibson on Dec 14 2020, 5:13 PM.

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Dec 14 2020, 5:31 PM
Harbormaster failed remote builds in B22788: Diff 27715!

test failure is not real

awesome. once you land, we should get this into the internal deployment - @johann do you want to make that change?

This revision is now accepted and ready to land.Dec 14 2020, 10:25 PM
helm/dagster/values.yaml
468 ↗(On Diff #27715)

Needs to be true by default now

helm/dagster/templates/configmap-instance.yaml
44

Should users be able to pass in the max_catchup_runs config

helm/dagster/templates/configmap-instance.yaml
44

meh, I don't anticipate that will be actually be used much, esp in a system that automatically restarts the scheduler when it goes down. Can always add it if a use case comes up

turn on dagsterDaemon (but run queueing off by default for now)

get some very basic integraiton test coverage (generalize the runcoordinator suite to the daemon in general). Not exercising the schduler specifically yet though

add a test that runs a schedule tick using the helm chart

helm/dagster/values.yaml
459 ↗(On Diff #27843)

given that this is enabled by default now, can we remove the [EXPERIMENTAL]: (Optional) Deploy a daemon for run coordination, etc. verbiage. We can also move this higher in the Helm chart (to be closer to the scheduler, and above optional components like Flower / redis / rabbitmq?)

Might also help to write out some more info for questions like how do I enable the DagsterDaemon scheduler (is scheduler.k8sEnabled = false sufficient or do I also need dagsterDaemon.enabled=true?). Or if I set the scheduler.k8sEnabled = true and queuedRunCoordinator.enabled = true, do i still need to set dagsterDaemon.enabled = true?

Might be good to define the valid configurations of these enabled flags and add that to the validation we currently do in _helpers.tpl

integration_tests/test_suites/celery-k8s-integration-test-suite/test_daemon_scheduler.py
19

would add a check that this returns 0 before we start schedule?

schedule_runs = dagster_instance_for_daemon.get_runs(
    PipelineRunsFilter(tags=PipelineRun.tags_for_schedule(reoriginated_schedule))
)
python_modules/dagster/dagster/scheduler/scheduler.py
303

?

cat's advice - also disable telemetry in a different way than setting the BUILDKTIE env var on the test image, that was messing things up

python_modules/dagster/dagster/scheduler/scheduler.py
303

this was a bad merge from a while ago - noticed when checking the pods logs while testing this. scheduler was launching runs directly instead of putting them on the run queue

helm/dagster/values.yaml
459 ↗(On Diff #27843)

Totally agree with all this - will do it separately though as part of the docs push if that's OK. Added it to our tracking spreadsheet so it won't slip through the cracks.

this looks good. I think it would be slightly better to add an env var that disables telemetry (ie DAGSTER_TELEMETRY_ENABLED) which can replace BUILDKITE in the Dockerfile and in the telemetry.py file so that we have more rigorous guarantees that tests won't send telemetry in the future. Also setting telemetry.enabled: false as the default in the Helm chart would be reasonable since we're very unlikely to get telemetry from k8s clusters anyways

This revision is now accepted and ready to land.Dec 16 2020, 5:17 PM
dgibson retitled this revision from Default helm chart to DagsterDaemonScheduler rather than SystemCronScheduler to Add integration test for using DagsterDaemonScheduler in helm chart, renmame run coordinator integration test to daemon integration test.
dgibson edited the summary of this revision. (Show Details)

rebase on top of separate diff that changes the helm chart to turn on the damon - this diff now just adds the integrtion test

i incorrectly thought I need to update the unit test image

use DAGSTER_DISABLE_TELEMETRY instead of BUILDKTIE