Page MenuHomePhabricator

move API run launcher to store psutil.Process objects instead of POpen objects
AbandonedPublic

Authored by dgibson on Jul 7 2020, 9:26 PM.

Details

Summary

This is a prereq to being able to persist PIDs as part of the run launcher, load the process by PID later, and then terminate it. There's no way (that I can see) to create a POpen object for an already existing process.

No actual new functionality yet though.

Test Plan

Existing unit test coverage verifies that terminated runs (verify that a test starts failing if _is_alive just returns True always)
Run Dagit on sleepy pipeline locally, verify that terminating a run still terminates the process that it spins up
Is there any local Windows testing I should do as part of this?

Diff Detail

Repository
R1 dagster
Branch
psutil (branched from master)
Lint
Lint OK
Unit
No Unit Test Coverage

Event Timeline

dgibson created this revision.Jul 7 2020, 9:26 PM
Harbormaster returned this revision to the author for changes because remote builds failed.Jul 7 2020, 9:51 PM
Harbormaster failed remote builds in B14847: Diff 18214!
dgibson updated this revision to Diff 18256.Jul 8 2020, 3:10 AM

load module always, not just in windows

Harbormaster returned this revision to the author for changes because remote builds failed.Jul 8 2020, 3:22 AM
Harbormaster failed remote builds in B14882: Diff 18256!
dgibson updated this revision to Diff 18260.Jul 8 2020, 3:38 AM

test fix

Harbormaster returned this revision to the author for changes because remote builds failed.Jul 8 2020, 3:51 AM
Harbormaster failed remote builds in B14885: Diff 18260!
dgibson requested review of this revision.Jul 8 2020, 3:54 AM

i'm not sure what to make of the airline-tests build failure here, wondering if reviewers have any thoughts - the issue seems to be that the build environment doesn't have gcc? But all the other tests are able to include it without issue.

I updated the integration image today, so this could be related to that somehow.

dgibson updated this revision to Diff 18270.Jul 8 2020, 1:25 PM

Try making airline-demo use python:3.7.5 instead of python:3.7.5-slim-stretch (still haven't quite figured out how to run these BK tests locally)

Yeah to be clear it doesn't make that that that would be the issue, since the integration image includes an explicit install of gcc. It's just something that has changed.

yeah, I don't think it was you - https://github.com/docker-library/python/issues/60 suggests that it's b/c slim doesn't have gcc installed

dgibson planned changes to this revision.Jul 8 2020, 1:46 PM

ok, that seemed to work but made the airline demo build super slow, will look into making it keep slim while also including gcc, like most of the other integration images see to do

max added a comment.Jul 8 2020, 2:55 PM

mmm, i think you're looking for the (poorly named) os.kill API

@max that would work for the terminating part, but there were two other things in here that I wasn't sure how to do without psutil for an already-existing process:

  • check if the process is running (POpen::poll())
  • wait for the process to finish (POpen::communicate())
dgibson updated this revision to Diff 18300.Jul 8 2020, 5:02 PM

nate resolved my airline_demo build issue by removing the airline_demo build

max added a comment.Jul 8 2020, 5:06 PM

os.waitpid and psutil.pid_exists?

It looks like waitpid requires the pid to be a child process? Which it won't necessarily be in the future once we're attempting to handle the case where the run executor stopped and restarted while a child process is still running.

psutil.pid_exists looks like it would work too, is it any better than the Process.is_running() that I used in this diff though?

dgibson planned changes to this revision.Jul 8 2020, 6:37 PM
dgibson abandoned this revision.Mon, Jul 13, 2:11 PM