I need another pair of eyes on this one -
still unclear to me how useful this info is - i guess we can use it if someone reports a bug to us?
separate file maybe
using a finally to "catch" these errors feels a little off - what's the reason to avoid except over the errors explicitly?
should this be a raise_from so we can surface the original exception still?
should dagster instance migrate be in this diff? Seems like its going to be just calling upgrade on the instances stores so i assume not much additional code
Definitely, what's better is that with the declarative base, you get a nice models.py file because all of this gets consolidated to one class and will make automigrations easier later on.
We can consolidate all of this into one class if we use declarative_base factory. This is a bit weird as written because you are trying to use the connection level execute while also trying to use ORM features which might lead to transactional consistency issues with alembic as we add more tables and get into distributed territory. The best way to think about it is that SQLAlchemy makes available three major objects for executing queries: Engines, Connections, and Sessions. Engines/Connections are the lowest level abstraction which are great for executing raw SQL Queries. However, the moment you start using ORM features, best practice is to use Session execute objects. The reasoning for this is because they handle a bunch of transactional consistency issues for you. I fear we will end up trying to eventually redo all this one bug at a time. This blog post outlines how this can all come together:
You can also use sessionmakers to do automated session management, but we can get there later.
This is effectively re-implementing sessions, it might be worth exploring all of that.
what do you think of putting all of our db.Table defs in one file?
It'll be really helpful later on when we have N tables and want to have one spot to look at our schema; this chunk of code is also nice documentation of our storage layer, sort of how schema.graphql is for the GraphQL API
re: all our tables in one file, i feel fine about putting the schema for each individual storage in its own file, i feel very uneasy about mixing the run storage schema with the event log storage schema, etc. -- there is no reason to suppose that in general these will live in the same database. we've separated the storages in part because event log storage is generally >> run storage