sandyryza (Sandy Ryza)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 3 2020, 4:04 PM (8 w, 2 d)

Recent Activity

Yesterday

sandyryza requested review of D3183: Streamline resource declarations in lakehouse example.
Sat, May 30, 5:51 PM
sandyryza committed R1:ed738d3fa4f9: In config error, reference Shape instead of dict (authored by sandyryza).
Sat, May 30, 3:59 PM
sandyryza closed D3151: In config error, reference Shape instead of dict.
Sat, May 30, 3:59 PM
sandyryza requested review of D3177: A stab at Lakehouse Table and type checking.
Sat, May 30, 12:10 AM

Fri, May 29

sandyryza added a comment to D3154: resource_with_config.

I originally started implementing a more general case - something like the partially_configured_resource described above. My impression though was that producing a curried config spec would require working pretty deeply inside the config machinery. I was grepping field snaps and EvaluationStacks, but then stepped back and thought it might not be a good use of my time if my immediate goal was fully configured resources. Maybe I was overcomplicating though and deep surgery isn't required.

Fri, May 29, 6:45 PM
sandyryza added a comment to D3154: resource_with_config.

I am a data infra eng. I am responsible for how Spark contexts are configured in prod, staging, test, etc. However, I still want my ops people and devs to be able to control a limited subset of the configuration. E.g. maybe I just want to expose a memory limit and nothing else (contrived example).

Fri, May 29, 6:26 PM
sandyryza added a comment to D3154: resource_with_config.

Do we think it's worth having both a static and a dynamic API?

Fri, May 29, 3:51 PM

Thu, May 28

sandyryza committed R1:336441d54fc3: PySpark EMR example (authored by sandyryza).
Thu, May 28, 11:21 PM
sandyryza closed D3106: PySpark EMR example.
Thu, May 28, 11:21 PM
sandyryza updated the diff for D3106: PySpark EMR example.

black

Thu, May 28, 9:47 PM
sandyryza requested review of D3154: resource_with_config.
Thu, May 28, 9:35 PM
sandyryza updated the diff for D3106: PySpark EMR example.
  • from_files instead of from_pkg_resources
Thu, May 28, 9:20 PM
sandyryza requested review of D3151: In config error, reference Shape instead of dict.
Thu, May 28, 4:52 PM

Wed, May 27

sandyryza committed R1:e676762f1a03: Lakehouse renovation (authored by sandyryza).
Wed, May 27, 11:16 PM
sandyryza closed D2925: Lakehouse renovation.
Wed, May 27, 11:16 PM
sandyryza added inline comments to D2925: Lakehouse renovation.
Wed, May 27, 11:16 PM
sandyryza added a comment to D2925: Lakehouse renovation.

@alangenfeld you requested changes on an earlier version - do you still have any reservations?

Wed, May 27, 8:38 PM
sandyryza updated the summary of D3064: Allow PresetDefinitions to wrap ModeDefinitions.
Wed, May 27, 8:17 PM
sandyryza requested review of D3064: Allow PresetDefinitions to wrap ModeDefinitions.
Wed, May 27, 8:16 PM
sandyryza updated the diff for D2925: Lakehouse renovation.

fewer globals

Wed, May 27, 8:15 PM
sandyryza added inline comments to D2925: Lakehouse renovation.
Wed, May 27, 8:13 PM
sandyryza updated the diff for D2925: Lakehouse renovation.

black

Wed, May 27, 5:06 PM
sandyryza updated the diff for D2925: Lakehouse renovation.

try to fix 3.5 tests

Wed, May 27, 4:51 PM
sandyryza accepted D3115: Remove duplicated code in buildkite pipeline.py.

I am a tox / buildkite n00b, so can't claim to have combed through this in great detail, but it seems like the right direction.

Wed, May 27, 4:36 PM
sandyryza accepted D3109: Move pylint to package tox files.

Thanks for doing god's work.

Wed, May 27, 3:39 PM
sandyryza updated the diff for D2925: Lakehouse renovation.

remove unrelated changes and fix tests

Wed, May 27, 3:37 PM
sandyryza updated the diff for D2925: Lakehouse renovation.
  • more lakehouse
Wed, May 27, 3:32 AM
sandyryza accepted D3114: Fix pylint issues with recent pylint.
Wed, May 27, 3:13 AM

Tue, May 26

sandyryza requested review of D3106: PySpark EMR example.
Tue, May 26, 10:15 PM
sandyryza added a comment to D2925: Lakehouse renovation.

@schrockn thanks for all the feedback. All reasonable. Regarding your main questions - it looks like they're about the SolidAsset, which I threw in on Friday and which is a bit half-baked. It might make the most sense to leave that for a separate revision?

Tue, May 26, 3:39 AM

Fri, May 22

sandyryza updated the summary of D2925: Lakehouse renovation.
Fri, May 22, 10:30 PM
sandyryza updated the diff for D2925: Lakehouse renovation.

nick and alex naming feedback

Fri, May 22, 10:28 PM

Thu, May 21

sandyryza added reviewers for D2925: Lakehouse renovation: schrockn, alangenfeld, nate, max, prha.
Thu, May 21, 8:47 PM
sandyryza updated the summary of D2925: Lakehouse renovation.
Thu, May 21, 8:24 PM
sandyryza requested review of D2925: Lakehouse renovation.
Thu, May 21, 8:08 PM
sandyryza committed R1:f282d5b4d351: "Stream" events from EMR to S3 (authored by sandyryza).
"Stream" events from EMR to S3
Thu, May 21, 5:38 PM
sandyryza closed D2968: "Stream" events from EMR to S3.
Thu, May 21, 5:38 PM
sandyryza updated the diff for D2968: "Stream" events from EMR to S3.

Fix a bug

Thu, May 21, 5:09 PM

Wed, May 20

sandyryza committed R1:08302e460fb2: "Stream" events from S3 to plan process (authored by sandyryza).
"Stream" events from S3 to plan process
Wed, May 20, 9:43 PM
sandyryza closed D2990: "Stream" events from S3 to plan process.
Wed, May 20, 9:43 PM
sandyryza added inline comments to D2990: "Stream" events from S3 to plan process.
Wed, May 20, 8:55 PM
sandyryza updated the diff for D2990: "Stream" events from S3 to plan process.
  • fix test_emr
Wed, May 20, 8:53 PM
sandyryza committed R1:55556d0ff2a7: Fix dagster_type auto_plugins arg type in doc (authored by sandyryza).
Wed, May 20, 8:20 PM
sandyryza closed D2993: Fix dagster_type auto_plugins arg type in doc.
Wed, May 20, 8:20 PM
sandyryza updated the diff for D2990: "Stream" events from S3 to plan process.
  • missing yield
Wed, May 20, 8:19 PM
sandyryza added a comment to D2990: "Stream" events from S3 to plan process.

@schrockn I tried the single threaded way, per your suggestion, and I think it ended up way cleaner.

Wed, May 20, 5:31 PM
sandyryza updated the diff for D2990: "Stream" events from S3 to plan process.
  • Single thread
Wed, May 20, 5:29 PM

Tue, May 19

sandyryza requested review of D2993: Fix dagster_type auto_plugins arg type in doc.
Tue, May 19, 6:21 PM
sandyryza requested review of D2990: "Stream" events from S3 to plan process.
Tue, May 19, 5:55 PM
sandyryza retitled D2968: "Stream" events from EMR to S3 from "Stream" Dagster events from EMR to S3 to "Stream" events from EMR to S3.
Tue, May 19, 5:17 PM

Mon, May 18

sandyryza updated the diff for D2968: "Stream" events from EMR to S3.
  • use real events
Mon, May 18, 10:02 PM
sandyryza added inline comments to D2968: "Stream" events from EMR to S3.
Mon, May 18, 9:54 PM
sandyryza requested review of D2968: "Stream" events from EMR to S3.
Mon, May 18, 5:05 PM

Thu, May 14

sandyryza requested review of D2918: Remove _object methods from IntermediateStore.
Thu, May 14, 10:03 PM

Wed, May 13

sandyryza committed R1:f61bae264b32: Remove _intermediate methods from IntermediateStore (authored by sandyryza).
Wed, May 13, 6:52 PM
sandyryza closed D2917: Remove _intermediate methods from IntermediateStore.
Wed, May 13, 6:52 PM
sandyryza requested review of D2917: Remove _intermediate methods from IntermediateStore.
Wed, May 13, 5:07 PM

Tue, May 12

sandyryza accepted D2884: Redo mocks with pyspark changes.
Tue, May 12, 8:05 PM

Mon, May 11

sandyryza added inline comments to D2739: ☠️ExecutionTargetHandle ☠️.
Mon, May 11, 10:42 PM

Fri, May 8

sandyryza committed R1:761db302ee52: PySpark odds and ends (authored by sandyryza).
Fri, May 8, 3:10 PM
sandyryza closed D2829: PySpark odds and ends.
Fri, May 8, 3:10 PM

Thu, May 7

sandyryza accepted D2830: 0.7.10 changelog.
Thu, May 7, 11:30 PM
sandyryza requested review of D2829: PySpark odds and ends.
Thu, May 7, 11:25 PM
sandyryza added inline comments to D2830: 0.7.10 changelog.
Thu, May 7, 11:19 PM
sandyryza added inline comments to D2578: Remote PySpark step execution on EMR.
Thu, May 7, 4:08 AM

Wed, May 6

sandyryza committed R1:2235c0520895: Remote PySpark step execution on EMR (authored by sandyryza).
Wed, May 6, 9:46 PM
sandyryza closed D2578: Remote PySpark step execution on EMR.
Wed, May 6, 9:45 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • Fix common prefix error in to_module_name_based_handle
Wed, May 6, 3:28 PM
sandyryza added inline comments to D2578: Remote PySpark step execution on EMR.
Wed, May 6, 3:41 AM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • Get airline tests working
Wed, May 6, 12:37 AM

Tue, May 5

sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • Get airline tests working
Tue, May 5, 10:34 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • DagsterSubprocessError
Tue, May 5, 9:18 PM
sandyryza added inline comments to D2578: Remote PySpark step execution on EMR.
Tue, May 5, 8:25 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • fix EMR tests and make PySpark resource non-lazy
Tue, May 5, 8:17 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • remove prints from intermediates_manager
Tue, May 5, 6:14 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • fix airline tests
Tue, May 5, 5:47 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • black
  • update example snapshots
  • pyspark test_resources is no longer applicable
Tue, May 5, 5:36 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • fix airline_demo prod_base.yaml
  • get simple_pyspark example working and fix handle tests
Tue, May 5, 5:09 PM
sandyryza added inline comments to D2578: Remote PySpark step execution on EMR.
Tue, May 5, 1:25 AM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.
  • remove @pyspark_solid
Tue, May 5, 1:25 AM
sandyryza added inline comments to D2578: Remote PySpark step execution on EMR.
Tue, May 5, 1:15 AM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.

Make test pass intermediates, clean up airline demo, enforce step launcher config invariant

Tue, May 5, 1:13 AM

Mon, May 4

sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.

Fix ETH

Mon, May 4, 11:23 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.

rsync before test_do_it_live_emr

Mon, May 4, 10:25 PM
sandyryza retitled D2578: Remote PySpark step execution on EMR from Remote PySpark step execution on EMR - big picture to Remote PySpark step execution on EMR.
Mon, May 4, 9:18 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.

Take out step_launcher_resource_key from solid definition

Mon, May 4, 9:15 PM
sandyryza updated the diff for D2578: Remote PySpark step execution on EMR.

Clean up loose ends and get test_do_it_live_emr working

Mon, May 4, 9:08 PM

Fri, May 1

sandyryza committed R1:21a402ce48fe: StepLauncher (authored by sandyryza).
Fri, May 1, 8:05 PM
sandyryza closed D2688: StepLauncher.
Fri, May 1, 8:05 PM
sandyryza updated the diff for D2688: StepLauncher.
  • fix test broken by rebase
Fri, May 1, 7:38 PM
sandyryza updated the diff for D2688: StepLauncher.

Spacing merge issue

Fri, May 1, 6:41 PM
sandyryza updated the diff for D2688: StepLauncher.
  • Doc LocalExternalStepLauncher and internal_step_launcher -> no_step_launcher
Fri, May 1, 5:43 PM

Apr 30 2020

sandyryza added inline comments to D2688: StepLauncher.
Apr 30 2020, 6:51 PM
sandyryza updated the diff for D2688: StepLauncher.

Find launcher using inheritance rather than resource key name

Apr 30 2020, 6:51 PM
sandyryza added inline comments to D2688: StepLauncher.
Apr 30 2020, 6:12 PM

Apr 29 2020

sandyryza added a comment to D2688: StepLauncher.

I think we should start with the outer retries, since we do want to be able to retry the launch - we can add the inner retries later.

Apr 29 2020, 11:41 PM
sandyryza updated the diff for D2688: StepLauncher.

Retries and check types in new

Apr 29 2020, 11:41 PM
sandyryza added inline comments to D2688: StepLauncher.
Apr 29 2020, 9:54 PM
sandyryza updated the diff for D2688: StepLauncher.

Replace InternalStepLauncher with None

Apr 29 2020, 9:53 PM
sandyryza updated the diff for D2688: StepLauncher.

Black

Apr 29 2020, 8:58 PM