Page MenuHomeElementl

Make run storage metadata class configurable, plus the ability to insert/filter tables using additional attributes
AbandonedPublic

Authored by dgibson on May 26 2021, 7:18 PM.

Details

Test Plan

BK (TestRunStorage subclass still passes)

Diff Detail

Repository
R1 dagster
Branch
multitentantdb (branched from master)
Lint
Lint Passed
Unit
No Test Coverage

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.May 26 2021, 7:42 PM
Harbormaster failed remote builds in B31264: Diff 38476!
Harbormaster returned this revision to the author for changes because remote builds failed.May 26 2021, 8:45 PM
Harbormaster failed remote builds in B31273: Diff 38488!
dgibson retitled this revision from Proof of concept: Add ability to subclass storages with arbitrary columns in both inserts and where clauses to Make run storage metadata class configurable, plus the ability to insert/filter tables using additional attributes.
dgibson edited the test plan for this revision. (Show Details)

up

Harbormaster returned this revision to the author for changes because remote builds failed.May 26 2021, 9:19 PM
Harbormaster failed remote builds in B31278: Diff 38494!
python_modules/dagster/dagster/core/storage/runs/sql_run_storage.py
62

values probably not columns

62

i think i'd prefer like:

def inject_additional_insert_values(**kwargs):
    return kwargs

and then e.g.

conn.execute(
    self._run_tags_table.insert(),  # pylint: disable=no-value-for-parameter
    [
        **self.inject_additional_insert_values(
            run_id=pipeline_run.run_id,
            key=k,
            value=v,
        )
        for k, v in pipeline_run.tags.items()
    ],
)
65

i think i'd prefer a more specific name here but not sure what. it'd be nice to make it clear this was for reads/queries and that it is adding clauses to queries. with_extra_select_clauses_for_table() is terrible but.

65

if my understanding is correct that concrete implementations of apply_additional_filters and additional_insert_columns will then look up the table name on the table class and then switch on the table name, i don't think i quite understand what the win of passing the table class around is vs. just passing around its name, or the key _bulk_actions_table etc

66

i think the implementation in the base class should just be:

def apply_additional_filters(self, select: Select, _table_class: str) -> Select:
    return query
113

interesting, this might be enough -- at least as long as we don't have association tables

185

i don't think this is necessary

250

realistically i think all these joins need to be parametrized also, which means that we need some kind of per-query functionality, not just per-table. i think we can perhaps do this in a pluggable way if we introduce some kind of QueryProvider

python_modules/libraries/dagster-mysql/dagster_mysql/run_storage/run_storage.py
54

why?