Page MenuHomePhabricator

Configable Postgres-backed DagsterInstance
ClosedPublic

Authored by max on Tue, Oct 1, 11:47 PM.

Details

Summary

Setup is performant against RDS, even running log spew, with dagster.yaml config like:

run_storage:
  module: dagster_postgres.run_storage
  class: PostgresRunStorage
  config:
    postgres_url: "postgresql://user:password@instance.us-west-1.rds.amazonaws.com:5432/db_name"

event_log_storage:
  module: dagster_postgres.event_log
  class: PostgresEventLogStorage
  config:
    postgres_url: "postgresql://user:password@instance.us-west-1.rds.amazonaws.com:5432/db_name"

Resolves #1760

Test Plan

Unit, manual integration

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

max created this revision.Tue, Oct 1, 11:47 PM
max updated this revision to Diff 5377.Wed, Oct 2, 2:36 PM

Rebase

max updated this revision to Diff 5380.Wed, Oct 2, 2:49 PM

Rebase

Harbormaster failed remote builds in B4339: Diff 5380!
max edited the summary of this revision. (Show Details)Wed, Oct 2, 10:59 PM
max edited the test plan for this revision. (Show Details)
max added reviewers: schrockn, alangenfeld, Restricted Project.
max updated this revision to Diff 5461.Wed, Oct 2, 11:28 PM

Rebase

max edited the summary of this revision. (Show Details)Wed, Oct 2, 11:34 PM
natekupp accepted this revision.Thu, Oct 3, 1:46 AM
natekupp added a subscriber: natekupp.

overall looks fine to me, but seems rather risky for the night before a release given all the threading business here... we should hammer on this hard tomorrow

python_modules/libraries/dagster-postgres/dagster_postgres/event_log.py
238

why this particular sleep value?

This revision is now accepted and ready to land.Thu, Oct 3, 1:46 AM
max added inline comments.Thu, Oct 3, 1:58 AM
python_modules/libraries/dagster-postgres/dagster_postgres/event_log.py
238

I'll add a comment - I figure this is something like a P95 for round-trips to RDS.

schrockn added inline comments.Thu, Oct 3, 5:52 PM
python_modules/dagster/dagster/seven/__init__.py
28

just do _ = ModuleNotFoundError or something and then you don't need the pylint ish

python_modules/libraries/dagster-postgres/dagster_postgres/event_log.py
225

you should probably be able to do

with dict_lock:
   ...
262–289

with

271

with

schrockn requested changes to this revision.Thu, Oct 3, 5:57 PM
schrockn added inline comments.
python_modules/libraries/dagster-postgres/dagster_postgres/event_log.py
134

this is pretty terrifying. we should really be doing explicit lifecycle management and not be relying on the finalizer

179

this seems like a big behavior change. what is going on here?

294–296

I don't think this is doing what you think. there are no copies behind made here. you can just del run_id_dict[run_id] directly

This revision now requires changes to proceed.Thu, Oct 3, 5:57 PM
max added inline comments.Thu, Oct 3, 6:10 PM
python_modules/libraries/dagster-postgres/dagster_postgres/event_log.py
134

we will have to do some pretty serious restructuring to get this on a more secure footing -- there isn't a proper lifecycle for these instance objects/members atm

179

yeah, i am happy to revert this but i figure there isn't a clear reason to recover from here?

294–296

see the comment above (https://docs.python.org/2/library/multiprocessing.html#multiprocessing.managers.SyncManager) -- this is the only way to mutate the proxies that i know of

max updated this revision to Diff 5508.Thu, Oct 3, 6:16 PM

Nits

schrockn accepted this revision.Thu, Oct 3, 6:26 PM

spicemustflow

python_modules/libraries/dagster-postgres/dagster_postgres/event_log.py
134

kk. let's make an issue to track this

179

I consider this a sort of "task failed successfully" situation. The process correctly communicated termination in an unexpected fashion. To me if you terminate a process by letting the an exception float this is more of a "task failed unsuccessfully" state

205

should this be while not done and not queue.empty():

?

289

what's going on here? this is same as doing del run_id_dict[run_id] there are no copies being made here.

This revision is now accepted and ready to land.Thu, Oct 3, 6:26 PM
max added inline comments.Thu, Oct 3, 6:34 PM
python_modules/libraries/dagster-postgres/dagster_postgres/event_log.py
134
179

Yep

205

I don't think so, since done isn't set in this block

289

it's very terrible and deceptive, but self.run_id_dict is not a dict -- see https://docs.python.org/2/library/multiprocessing.html#multiprocessing.managers.SyncManager

max updated this revision to Diff 5522.Thu, Oct 3, 6:43 PM

Rebase

schrockn added inline comments.Thu, Oct 3, 6:46 PM
python_modules/libraries/dagster-postgres/dagster_postgres/event_log.py
289

can you name it differently then

289

comment, a helper function with a good name. something

This revision was landed with ongoing or failed builds.Thu, Oct 3, 6:50 PM
This revision was automatically updated to reflect the committed changes.