Page MenuHomeElementl

[dagster-mysql] `wait_timeout` fix for MySQL Engine Connections
ClosedPublic

Authored by sidkmenon on May 20 2021, 7:49 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, May 14, 8:15 PM
Unknown Object (File)
Sat, May 13, 11:22 PM
Unknown Object (File)
Wed, May 10, 2:52 PM
Unknown Object (File)
May 6 2023, 9:26 PM
Unknown Object (File)
Apr 30 2023, 2:33 PM
Unknown Object (File)
Apr 8 2023, 11:55 PM
Unknown Object (File)
Mar 20 2023, 8:27 PM
Unknown Object (File)
Mar 20 2023, 3:51 PM

Details

Summary

By default, SQLAlchemy keeps a db.Engine's pooled connections open forever (pool_recycle=-1). This is a problem in MySQL because of a config value called wait_timeout, which sets the MySQL server to kill idle connections after 8 hours. Therefore, it was possible for code to check out a connection from SQLAlchemy (via self.connect() on one of the storages) where the underlying DB connection had been removed. To rectify this, I am setting the pooled connections to recycle after 1 hour.

Test Plan

bk and Integration

Local testing involved setting the wait_timeout variable on the MySQL DB to 2 seconds & then sleeping for 3 or more seconds. Before this fix in Dagit we'd encounter an OperationalError as the MySQL connection was dead; however, after this fix (setting MYSQL_POOL_RECYCLE to instead be 1 second), this issue was not present in a long-running (15 minute+) session with frequent reloading.

The issue repro code looked something like this:

with self.connect() as conn:
  conn.execute('SET SESSION wait_timeout = 2;')
time.sleep(3)

Diff Detail

Repository
R1 dagster
Branch
mysql-issue (branched from master)
Lint
Lint Passed
Unit
No Test Coverage

Event Timeline

sidkmenon edited the summary of this revision. (Show Details)
sidkmenon edited the test plan for this revision. (Show Details)

We should definitely add a test for this.

python_modules/libraries/dagster-mysql/dagster_mysql/run_storage/run_storage.py
70

might eventually want to make this configurable, but this seems fine for now.

alangenfeld added inline comments.
python_modules/libraries/dagster-mysql/dagster_mysql/event_log/event_log.py
59

not sure this matters for NullPool

Added tests & removed the NullPool wait_timeout fix - there's no pool for that case so there's nothing to recycle

alangenfeld added inline comments.
python_modules/libraries/dagster-mysql/dagster_mysql_tests/test_wait_timeout.py
26–28

some assert here would be good

This revision is now accepted and ready to land.May 20 2021, 9:02 PM
sidkmenon marked 2 inline comments as done.

Adding in asserts & cleaning up tests