
themissinghlink (Abhinava Singh)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 22 2019, 4:39 PM (17 w, 6 d)

Recent Activity

Fri, Feb 21

themissinghlink added a comment to D2098: [BUGFIX] - Thread in solid_subset into PartitionSetDefinitions.

Good catch! I am not actually sure I did this right, because the problem goes a lot deeper. It turns out solid_subset isn't being threaded through at all. This is my best guess at what the fix is.

Fri, Feb 21, 11:47 PM
themissinghlink updated the diff for D2098: [BUGFIX] - Thread in solid_subset into PartitionSetDefinitions.
  • thread into schedule definition and add test
Fri, Feb 21, 11:41 PM
themissinghlink added reviewers for D2098: [BUGFIX] - Thread in solid_subset into PartitionSetDefinitions: prha, sashank.
Fri, Feb 21, 9:22 PM
themissinghlink created D2098: [BUGFIX] - Thread in solid_subset into PartitionSetDefinitions.
Fri, Feb 21, 9:18 PM

Thu, Feb 20

themissinghlink committed R1:a9fd0cc6c591: integrate ml solid into pipeline and fix tests (authored by themissinghlink).
integrate ml solid into pipeline and fix tests
Thu, Feb 20, 10:32 PM
themissinghlink closed D2082: integrate ml solid into pipeline and fix tests.
Thu, Feb 20, 10:31 PM
themissinghlink updated the diff for D2082: integrate ml solid into pipeline and fix tests.
  • made pylint disabling more human readable
Thu, Feb 20, 10:19 PM
themissinghlink added inline comments to D2082: integrate ml solid into pipeline and fix tests.
Thu, Feb 20, 10:12 PM
themissinghlink added a comment to D2082: integrate ml solid into pipeline and fix tests.

Talked with @schrockn. The intermediate system should not be counted on after the compute step. This means we need to explicitly persist the ml model and yield the materialization. I will take on the eventing story in another revision because this requires a lot of change.
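
A minimal sketch of the pattern described in the comment above: the compute step writes the model to disk itself, reports where it put it, and only then emits its output. The `Materialization` and `Output` tuples, the `train_model_step` function, and the toy "model" are hypothetical stand-ins for illustration. They are not Dagster's classes and not the code from D2082.

```python
import os
import pickle
import tempfile
from collections import namedtuple

# Hypothetical stand-ins for a framework's event types (not Dagster's classes).
Materialization = namedtuple("Materialization", "label path")
Output = namedtuple("Output", "value")


def train_model_step(training_rows, model_dir):
    """Train a toy model, persist it explicitly, then report where it lives."""
    # Toy "model": the mean of the training values.
    model = {"mean": sum(training_rows) / float(len(training_rows))}

    # Persist the model ourselves instead of relying on intermediate storage
    # still being available after the compute step.
    model_path = os.path.join(model_dir, "model.pkl")
    with open(model_path, "wb") as handle:
        pickle.dump(model, handle)

    # Announce the persisted artifact, then hand the value downstream.
    yield Materialization(label="ml_model", path=model_path)
    yield Output(value=model)


model_dir = tempfile.mkdtemp()
events = list(train_model_step([1.0, 2.0, 3.0], model_dir))
```

Here `events` holds the materialization record first and the output second, so a consumer that only sees the event stream can still find the persisted file.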

Thu, Feb 20, 10:03 PM
themissinghlink committed R1:c9ce9abe59a9: Make input_hydration_configs and output_materialization_configs parametrizable… (authored by themissinghlink).
Make input_hydration_configs and output_materialization_configs parametrizable…
Thu, Feb 20, 6:58 PM
themissinghlink closed D2079: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types REDUX.
Thu, Feb 20, 6:58 PM
themissinghlink added reviewers for D2082: integrate ml solid into pipeline and fix tests: schrockn, nate, max.
Thu, Feb 20, 3:53 AM
themissinghlink created D2082: integrate ml solid into pipeline and fix tests.
Thu, Feb 20, 3:45 AM
themissinghlink updated the diff for D2079: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types REDUX.
  • got rid of silly name keyword param and made it required
Thu, Feb 20, 12:29 AM
themissinghlink updated the diff for D2079: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types REDUX.
  • made docs fixes and added comment
Thu, Feb 20, 12:14 AM
themissinghlink added inline comments to D2079: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types REDUX.
Thu, Feb 20, 12:04 AM

Wed, Feb 19

themissinghlink added a reviewer for D2079: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types REDUX: schrockn.
Wed, Feb 19, 11:46 PM
themissinghlink abandoned D2077: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types..

Holy shit. In all the excitement, I realized I never checked out a branch... I'm going to abandon this and get a new revision out with the fixes. I'll link this revision to that one, but it will need another LGTM.

Wed, Feb 19, 11:42 PM
themissinghlink created D2079: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types REDUX.
Wed, Feb 19, 11:42 PM
themissinghlink updated the diff for D2060: make dagster pandas input/output schema more flexible.
  • rebasing to pick up buildkite changes.
Wed, Feb 19, 11:09 PM
themissinghlink committed R1:bb22e5245081: Refactor generate_training_set pipeline. (authored by themissinghlink).
Refactor generate_training_set pipeline.
Wed, Feb 19, 11:05 PM
themissinghlink closed D2071: Refactor generate_training_set pipeline..
Wed, Feb 19, 11:05 PM
themissinghlink added inline comments to D2071: Refactor generate_training_set pipeline..
Wed, Feb 19, 11:02 PM
themissinghlink added a comment to D2077: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types..

On it.

Wed, Feb 19, 10:54 PM
themissinghlink added a reviewer for D2077: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types.: nate.
Wed, Feb 19, 10:17 PM
themissinghlink updated the diff for D2060: make dagster pandas input/output schema more flexible.
  • renamed to more explicit val
Wed, Feb 19, 9:55 PM
themissinghlink added a comment to D2060: make dagster pandas input/output schema more flexible.

Regardless of whether pandas' API changes in the future, we need to be compatible with all versions of pandas, which makes it difficult to have a strictly typed schema, right? I mean, it's a tradeoff, but I don't know if we want to tie ourselves to maintaining it.

Wed, Feb 19, 9:04 PM
themissinghlink updated the diff for D2071: Refactor generate_training_set pipeline..
  • addressed feedback
Wed, Feb 19, 8:46 PM
themissinghlink added inline comments to D2071: Refactor generate_training_set pipeline..
Wed, Feb 19, 8:46 PM
themissinghlink added reviewers for D2077: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types.: schrockn, max.
Wed, Feb 19, 8:36 PM
themissinghlink created D2077: Make input_hydration_configs and output_materialization_configs parametrizable for custom dagster dataframe types..
Wed, Feb 19, 7:41 PM
themissinghlink added a reviewer for D2071: Refactor generate_training_set pipeline.: nate.
Wed, Feb 19, 4:02 AM
themissinghlink added reviewers for D2071: Refactor generate_training_set pipeline.: schrockn, alangenfeld.
Wed, Feb 19, 4:02 AM
themissinghlink updated the summary of D2071: Refactor generate_training_set pipeline..
Wed, Feb 19, 3:52 AM
themissinghlink updated the diff for D2071: Refactor generate_training_set pipeline..
  • brought in LocalClient resource
Wed, Feb 19, 3:48 AM
themissinghlink added a comment to D2071: Refactor generate_training_set pipeline..

Done because I realized I need to make a core change.

Wed, Feb 19, 12:24 AM
themissinghlink removed reviewers for D2071: Refactor generate_training_set pipeline.: schrockn, nate, alangenfeld.
Wed, Feb 19, 12:24 AM
themissinghlink updated the diff for D2071: Refactor generate_training_set pipeline..
  • rebased with master to pick up buildkite changes
Wed, Feb 19, 12:05 AM
themissinghlink committed R1:0b590a18477a: Disable dask and gcp tests. (authored by themissinghlink).
Disable dask and gcp tests.
Wed, Feb 19, 12:01 AM
themissinghlink closed D2074: Disable dask and gcp tests..
Wed, Feb 19, 12:01 AM

Tue, Feb 18

themissinghlink updated the diff for D2074: Disable dask and gcp tests..
  • moved issues to the right locations
Tue, Feb 18, 11:49 PM
themissinghlink updated the diff for D2074: Disable dask and gcp tests..
  • add tracking issues and resolve feedback
Tue, Feb 18, 11:45 PM
themissinghlink added a reviewer for D2074: Disable dask and gcp tests.: Restricted Project.
Tue, Feb 18, 11:35 PM
themissinghlink updated the diff for D2074: Disable dask and gcp tests..
  • switched to commenting out and fixed issue with control flow logic
Tue, Feb 18, 11:30 PM
themissinghlink created D2074: Disable dask and gcp tests..
Tue, Feb 18, 11:24 PM
themissinghlink updated the summary of D2071: Refactor generate_training_set pipeline..
Tue, Feb 18, 11:02 PM
themissinghlink added reviewers for D2071: Refactor generate_training_set pipeline.: schrockn, nate, alangenfeld.
Tue, Feb 18, 11:01 PM
themissinghlink updated the diff for D2071: Refactor generate_training_set pipeline..
  • up
Tue, Feb 18, 10:50 PM
themissinghlink updated the diff for D2071: Refactor generate_training_set pipeline..
  • patched all gcs calls
Tue, Feb 18, 10:49 PM
themissinghlink updated the diff for D2071: Refactor generate_training_set pipeline..
  • forgot to make black
Tue, Feb 18, 10:33 PM
themissinghlink updated the diff for D2071: Refactor generate_training_set pipeline..
  • rebased and fixed mypy issue
Tue, Feb 18, 10:30 PM
themissinghlink retitled D2071: Refactor generate_training_set pipeline. from Refactor pipeline to real generate_training_set pipeline. to Refactor generate_training_set pipeline..
Tue, Feb 18, 10:26 PM
themissinghlink created D2071: Refactor generate_training_set pipeline..
Tue, Feb 18, 10:25 PM

Mon, Feb 17

themissinghlink added a comment to D2060: make dagster pandas input/output schema more flexible.

Re: naming. I'm not married to serialization_options; however, kwargs seems a bit confusing, right? Here is an example. What about read_csv_kwargs and to_csv_kwargs, just so we are being hella explicit about what's happening here?

Mon, Feb 17, 11:39 PM
themissinghlink added a comment to D2060: make dagster pandas input/output schema more flexible.

The problem with strongly typing the config is that while, yes, you get to catch errors early, you also create a really brittle API. The pandas API changes pretty frequently and new kwargs are added all the time. What happens if the user uses a version of read_csv that differs from the version we made a strongly typed config of? Or worse, what if a version of pandas makes a backward-incompatible change? Our current version is the least opinionated about implementation details, and if the user has a pin to a specific version of pandas and wants to be opinionated, they can make their own. Right now I feel more people would make their own rather than use the default, which feels wrong.
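
The tradeoff in the comment above can be sketched as follows. To stay dependency-free, this uses the standard library's `csv.reader` as a stand-in for `pandas.read_csv`. The function names, the `ALLOWED_OPTIONS` allow-list, and the `read_csv_kwargs` parameter are assumptions made for illustration, not the schema from D2060.

```python
import csv
import io

# Strict approach: only options listed here are accepted (hypothetical allow-list).
ALLOWED_OPTIONS = {"delimiter", "quotechar"}


def read_rows_strict(text, options):
    """Reject any option the schema author did not anticipate."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError("unsupported options: {}".format(sorted(unknown)))
    return list(csv.reader(io.StringIO(text), **options))


def read_rows_passthrough(text, read_csv_kwargs=None):
    """Forward whatever the caller supplies straight to the underlying reader."""
    return list(csv.reader(io.StringIO(text), **(read_csv_kwargs or {})))


data = "a, b\n1, 2\n"

# The pass-through version accepts an option the allow-list never mentioned.
rows = read_rows_passthrough(data, {"skipinitialspace": True})

# The strict version fails early, even though the reader supports the option.
try:
    read_rows_strict(data, {"skipinitialspace": True})
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
```

The strict version catches typos early but has to be updated whenever the underlying reader gains an option. The pass-through version defers all validation to the reader itself.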

Mon, Feb 17, 11:31 PM
themissinghlink updated the summary of D2060: make dagster pandas input/output schema more flexible.
Mon, Feb 17, 7:47 PM
themissinghlink added reviewers for D2060: make dagster pandas input/output schema more flexible: max, nate.
Mon, Feb 17, 7:46 PM
themissinghlink added a comment to D2037: Fix k8s tests post-release.

Just to clean this up, should we abandon this?

Mon, Feb 17, 7:45 PM
themissinghlink abandoned D2058: Enable persistence of dataframes to gcs.

Gonna abandon this in favor of a different approach, to discuss at a different time.

Mon, Feb 17, 7:45 PM
themissinghlink created D2060: make dagster pandas input/output schema more flexible.
Mon, Feb 17, 7:21 PM
themissinghlink added a comment to D2058: Enable persistence of dataframes to gcs.

Ok let me take a crack at an idea I have and put it up in a few! This actually might simplify things a lot.

Mon, Feb 17, 5:55 PM
themissinghlink added a comment to D2058: Enable persistence of dataframes to gcs.

That's a good point. Hmmm, we could have create_dagster_pandas_dataframe_type take in an input_hydration_config and an output_materialization_config as params. Then we could have these selectors live in their respective libraries. However, this means that pandas would have to be a dependency of gcp/aws/azure/....
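
A rough sketch of the idea in the comment above, written in plain Python so it runs without Dagster. `create_dataframe_type`, `DataFrameType`, and the loader and writer functions are hypothetical stand-ins. They do not reproduce the real signature of create_dagster_pandas_dataframe_type.

```python
from collections import namedtuple

# Hypothetical container for a dataframe type and its two pluggable hooks.
DataFrameType = namedtuple(
    "DataFrameType", "name input_hydration_config output_materialization_config"
)


def default_loader(config):
    """Default hook: describe a load from the local filesystem."""
    return {"source": "local", "path": config["path"]}


def default_writer(config, value):
    """Default hook: describe a write to the local filesystem."""
    return "local:{}".format(config["path"])


def create_dataframe_type(
    name, input_hydration_config=None, output_materialization_config=None
):
    """Build a type, falling back to the defaults when no hook is supplied."""
    return DataFrameType(
        name,
        input_hydration_config or default_loader,
        output_materialization_config or default_writer,
    )


# This hook would live in a storage-specific library rather than the core one.
def gcs_loader(config):
    return {
        "source": "gcs",
        "path": "gs://{}/{}".format(config["bucket"], config["key"]),
    }


gcs_frame = create_dataframe_type("GcsFrame", input_hydration_config=gcs_loader)
```

With this shape the core factory needs no knowledge of any storage backend. The cost, as the comment notes, is that a storage library providing a real dataframe loader would have to import pandas.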

Mon, Feb 17, 5:45 PM
themissinghlink updated the diff for D2058: Enable persistence of dataframes to gcs.
  • fixed final testing call site
Mon, Feb 17, 5:35 PM
themissinghlink added reviewers for D2058: Enable persistence of dataframes to gcs: max, schrockn, nate.
Mon, Feb 17, 5:18 PM
themissinghlink updated the diff for D2058: Enable persistence of dataframes to gcs.
  • made black fixes
Mon, Feb 17, 5:14 PM
themissinghlink updated the diff for D2058: Enable persistence of dataframes to gcs.
  • fixed tests call sites
Mon, Feb 17, 5:11 PM
themissinghlink updated the diff for D2058: Enable persistence of dataframes to gcs.
  • added google dependency to dagster pandas lib
Mon, Feb 17, 4:47 PM
themissinghlink updated the diff for D2058: Enable persistence of dataframes to gcs.
  • fixed lint bugs
Mon, Feb 17, 4:37 PM
themissinghlink created D2058: Enable persistence of dataframes to gcs.
Mon, Feb 17, 4:28 PM

Sat, Feb 15

themissinghlink accepted D2056: Actually display the type check description in Dagit..

This seems legit to me. Watched Max and did see it work in action.

Sat, Feb 15, 12:55 AM

Fri, Feb 14

themissinghlink accepted D2046: Fix code font families.

rubberstamp

Fri, Feb 14, 12:04 AM

Thu, Feb 13

themissinghlink abandoned D1951: added cleanup logic in intro tutorial tests.

abandoning in favor of tweaking tutorial to dump artifacts to a data directory. Will put up a new revision soon.

Thu, Feb 13, 6:37 PM
themissinghlink accepted D2038: Fix hardcoded dagster version in k8s test results.

Rubber stamp. Nick did something too; make sure this doesn’t conflict with that.

Thu, Feb 13, 1:21 AM
themissinghlink accepted D2037: Fix k8s tests post-release.

LGTM

Thu, Feb 13, 12:31 AM

Wed, Feb 12

themissinghlink committed R1:0c7274e7c1e4: Add input hydration and output materialization configs to custom dataframes. (authored by themissinghlink).
Add input hydration and output materialization configs to custom dataframes.
Wed, Feb 12, 11:52 PM
themissinghlink closed D2032: Add input hydration and output materialization configs to custom dataframes..
Wed, Feb 12, 11:52 PM
themissinghlink accepted D2029: Config migration guide and some improved error messages.

LGTM

Wed, Feb 12, 11:39 PM
themissinghlink updated the diff for D2032: Add input hydration and output materialization configs to custom dataframes..
  • use safe_tempfile_path instead of raw named temporary file
Wed, Feb 12, 11:36 PM
themissinghlink added a reviewer for D2032: Add input hydration and output materialization configs to custom dataframes.: schrockn.
Wed, Feb 12, 11:26 PM
themissinghlink updated the diff for D2032: Add input hydration and output materialization configs to custom dataframes..
  • got rid of type annotations to be python2 compliant
Wed, Feb 12, 11:23 PM
themissinghlink updated the summary of D2032: Add input hydration and output materialization configs to custom dataframes..
Wed, Feb 12, 11:21 PM
themissinghlink updated the diff for D2032: Add input hydration and output materialization configs to custom dataframes..
  • got rid of todo
Wed, Feb 12, 11:19 PM
themissinghlink created D2032: Add input hydration and output materialization configs to custom dataframes..
Wed, Feb 12, 11:18 PM
themissinghlink added a comment to D2029: Config migration guide and some improved error messages.

Since we are testing to make sure we are emitting human-readable error strings, we ought to encode these in tests as well. That is, the test should check that the failure raises a DagsterInvariantViolationError carrying the error message mentioned above.
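
One way to encode that kind of check, sketched with a hypothetical exception and validator so it runs standalone. `InvariantViolationError` stands in for DagsterInvariantViolationError, and `validate_config` and its message text are invented for illustration rather than taken from D2029.

```python
class InvariantViolationError(Exception):
    """Hypothetical stand-in for DagsterInvariantViolationError."""


def validate_config(config):
    """Toy validator that raises a human-readable error for a missing key."""
    if "solids" not in config:
        raise InvariantViolationError(
            "Config is missing the required 'solids' key."
        )
    return config


def raises_with_message(fn, exc_type, fragment):
    """Return True only if fn raises exc_type and its message contains fragment."""
    try:
        fn()
    except exc_type as exc:
        return fragment in str(exc)
    return False


# Pin both the exception type and the readable part of the message.
message_is_pinned = raises_with_message(
    lambda: validate_config({}), InvariantViolationError, "required 'solids' key"
)
```

Checking a fragment rather than the full string keeps the test from breaking on small wording changes while still guarding the part a user needs to read.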

Wed, Feb 12, 10:56 PM
themissinghlink added a comment to D2023: fixed mismatched lines.

We really really really need to do some sort of snapshot deploy that gets triggered iff there was a code change to the tutorial docs and include it with the buildkite thing. Currently, it's too easy to mess things up.

Wed, Feb 12, 10:09 PM
themissinghlink committed R1:40dd3c9d5853: fixed mismatched lines (authored by themissinghlink).
fixed mismatched lines
Wed, Feb 12, 10:07 PM
themissinghlink closed D2023: fixed mismatched lines.
Wed, Feb 12, 10:07 PM
themissinghlink added reviewers for D2023: fixed mismatched lines: alangenfeld, max.
Wed, Feb 12, 9:51 PM
themissinghlink updated the summary of D2023: fixed mismatched lines.
Wed, Feb 12, 9:51 PM
themissinghlink updated the summary of D2023: fixed mismatched lines.
Wed, Feb 12, 9:49 PM
themissinghlink updated the summary of D2023: fixed mismatched lines.
Wed, Feb 12, 9:47 PM
themissinghlink updated the summary of D2023: fixed mismatched lines.
Wed, Feb 12, 9:46 PM
themissinghlink updated the summary of D2023: fixed mismatched lines.
Wed, Feb 12, 9:44 PM
themissinghlink removed reviewers for D2023: fixed mismatched lines: alangenfeld, max.
Wed, Feb 12, 9:42 PM
themissinghlink updated the test plan for D2023: fixed mismatched lines.
Wed, Feb 12, 9:42 PM
themissinghlink created D2023: fixed mismatched lines.
Wed, Feb 12, 9:41 PM
themissinghlink committed R1:671650baf042: Dagster Pandas Guide Docs (authored by themissinghlink).
Dagster Pandas Guide Docs
Wed, Feb 12, 7:00 PM
themissinghlink closed D1964: Dagster Pandas Guide Docs.
Wed, Feb 12, 7:00 PM
themissinghlink abandoned D2013: add should_overwrite flags and tests for make_python_type_usable_as_dagster_type.

Abandoning because I was convinced this was not the right way to do this.

Wed, Feb 12, 6:47 PM
themissinghlink updated the diff for D1964: Dagster Pandas Guide Docs.
  • added last batch of fixes to the documentation
Wed, Feb 12, 6:46 PM