Page MenuHomePhabricator

Add Redshift resource
ClosedPublic

Authored by nate on Fri, Mar 13, 11:12 PM.

Details

Summary

We might consider load() / unload() methods, but fully representing the semantics of https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html in Python seems ambitious (and a moving target...)

Test Plan

unit

Diff Detail

Repository
R1 dagster
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

nate updated this revision to Diff 10525.Fri, Mar 13, 11:12 PM
nate created this revision.

up

nate edited the summary of this revision. (Show Details)Fri, Mar 13, 11:18 PM
nate added a reviewer: max.Sat, Mar 14, 4:40 AM
nate added reviewers: yuhan, catherinewu.
max added a comment.Mon, Mar 16, 9:26 PM

it might not be crazy just to have docs saying, you can do load/unload by running res.execute_query('COPY blah blah')

max requested changes to this revision.Mon, Mar 16, 9:44 PM

my only real issue here is about the handling of multiple queries where a query doesn't return any results -- it seems to me that the length of the result should be the same as the length of the list of queries -- otherwise how will we know which result corresponds to which query?

python_modules/libraries/dagster-aws/dagster_aws/redshift/resources.py
17

if users aren't intended to implement this, can it be _BaseRedshiftResource with a docstring explaining that

22

why a tuple?

56

Optional[bool]

60

might be nice to link via intersphinx here

65

should be List[Tuple] or List[Tuple[Any, ...]]

76

Is this metadata not sufficiently represented in the structured log entry? I would rather not introduce this kind of semistructured metadata in log lines

91

Optional[bool]

100

this is List[List[Tuple]] or List[List[Tuple[Any, ...]]]

118

shouldn't we still do results.append([])?

168

hm, seems weird not to be able to config this from env

190

is this a little misleading? if i set autocommit: false afaict we don't disable anything -- it's just like this being None

197

60?

200

maybe link to docs for acceptable values

214

should be a sphinx code-block -- literal include would be better yet as we could put it under test

This revision now requires changes to proceed.Mon, Mar 16, 9:44 PM
yuhan added a comment.Tue, Mar 17, 8:34 PM
In D2250#57526, @max wrote:

it might not be crazy just to have docs saying, you can do load/unload by running res.execute_query('COPY blah blah')

Agreed. I will even actually expect to do COPY command in res.execute_query(). I'd imagine it's a common status-qup for data engineers

nate updated this revision to Diff 10636.Tue, Mar 17, 9:16 PM
nate marked 14 inline comments as done.

up

python_modules/libraries/dagster-aws/dagster_aws/redshift/resources.py
17

cool, done

22

you mean vs. a list? don't feel strongly, just have used this convention elsewhere for extracting args from config

56

sg

60

cool done

76

yah agree, removed

118

yes we absolutely should, added

168

Yup! fortunately Alex just landed https://dagster.phacility.com/D2267 - updated to use IntSource

190

yes thanks for catching - I've updated the code to actually do what this text describes

197

lol. surprise!

200

cool, done!

214

thanks for flagging - yeah made this a code block for now, I'll do the literal include as a follow up

max accepted this revision.Tue, Mar 17, 9:56 PM
max added inline comments.
python_modules/libraries/dagster-aws/dagster_aws/redshift/resources.py
78

still have [redshift]

This revision is now accepted and ready to land.Tue, Mar 17, 9:56 PM
This revision was automatically updated to reflect the committed changes.