Page MenuHomeElementl

Provide more detailed SkipReasons
ClosedPublic

Authored by jordansanders on Apr 12 2021, 9:15 PM.

Details

Summary

Previously, when partitions were outside the schedule's start/end date
range, we'd always return the SkipReason:

"Partition selector did not return a partition. Make sure that the
timezone on your partition set matches your execution timezone."

Now, we provide a more specific SkipReason.

Test Plan

bk

Diff Detail

Repository
R1 dagster
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

python_modules/dagster/dagster/utils/partitions.py
264

This feels a little icky to me - that I'm returning either a SkipReason or a Partition and then re-throwing the SkipReason. I've chosen to do this because it's only inside of the partition selection fn that we have enough information to determine why a partition wasn't selected.

Harbormaster returned this revision to the author for changes because remote builds failed.Apr 12 2021, 9:33 PM
Harbormaster failed remote builds in B28734: Diff 35270!
python_modules/dagster/dagster/utils/partitions.py
245

this should be partition_set_def.get_partitions(context.scheduled_execution_time)

and we should re-use it on line 260 now that it's in the main path

python_modules/dagster/dagster/utils/partitions.py
245

I'm not sure that's the case. partition_set_def.get_partitions() == partition_set_def.get_partitions(context.scheduled_execution_time) only if context.scheduled_execution_time is None (and probably in some other cases too depending on what partition_fn is). But in general, partition_set_def.get_partitions() != partition_set_def.get_partitions(context.scheduled_execution_time) which is why originally, it was happening inside the if block. For example, using the data from my first test:

(Pdb) context.scheduled_execution_time
datetime.datetime(2021, 1, 1, 0, 0)
(Pdb) partition_set_def.get_partitions(context.scheduled_execution_time)
[]
(Pdb) partition_set_def.get_partitions()
[Partition(value=DateTime(2021, 2, 1, 0, 0, 0, tzinfo=Timezone('UTC')), name='2021-02'), Partition(value=DateTime(2021, 3, 1, 0, 0, 0, tzinfo=Timezone('UTC')), name='2021-03')]

The reason I'm not passing in context.scheduled_execution_time is because I want the superset of all possible partitions so I can use it to find the earliest and latest. I'm using it as a stand-in for the schedule's actual start_time and end_time since as far as I can tell, that information isn't easily accessible from my current scope.

  • Treat empty partition sets as "too early"
python_modules/dagster/dagster/utils/partitions.py
245

Daniel and I talked in person and decided that it's fine to assume that if partition_fn returns an empty list, it's because the time precedes the earliest partition.

dgibson requested changes to this revision.Apr 13 2021, 3:41 PM

this generally makes a ton of sense to me, I just had some more time to think about the exact copy to use for maximum clarify, left some thoughts inline.
@prha may have thoughts here too.

python_modules/dagster/dagster/core/definitions/partition.py
190–191

could you update the docblock here as well?

python_modules/dagster/dagster/utils/partitions.py
265

OK I had a bit more time since we talked IRL to think about the copy here and am realizing that "the start time hasn't occurred yet" is not totally right and would be super confusing in the most common time when it happens, when you just missed the fact that the day offset is 1 (it's more like "the schedule tick time corresponding to the start date hasn't occurred yet", but that is very confusing and hard to explain).

Maybe to address that, I think we should include both the scheduled execution time and the partition time in these messages so that it's a bit more clear why this might have happened (since the most likely reason for somebody to run into this is that they didn't realize that we offset the days by one and set the start date to today)

What do you think about something like

"Could not find a partition at time {partition_time.isoformat()} corresponding to the schedule tick at time {context.scheduled_execution_time.isoformat()}. The most likely reason for this is because that partition time comes before the start of the partition set".

281

this was not part of the original task, but could also be an opportunity here to yield a SkipReason in the same vein of those above? not essential

This revision now requires changes to proceed.Apr 13 2021, 3:41 PM

I think this makes sense. At first it was a little jarring to have the selector return a SkipReason, but I think we can get by if we have this documented reasonably.

The other thing is that I haven't really heard of people writing custom partition selectors. Mostly they're just using the built-in decorators.

python_modules/dagster/dagster/utils/partitions.py
265

What about:

f"The partition {partition_time.isoformat()} precedes the earliest schedule tick {context.scheduled_execution_time.isoformat()}. Verify your schedule's start_date is correct."

For our first test case, that'd look like:

The partition 2020-12-01T00:00:00+00:00 precedes the earliest scheduled tick 2021-01-01T00:00:00. Verify your schedule's start_date is correct.

(ignore the lack of timezone offset - it's just because the test's context.scheduled_execution_time isn't timezone aware)

And I can update the "too late" case as well - to something like:

f"The partition {partition_time.isoformat()} is after the last schedule tick {context.scheduled_execution_time.isoformat()}. Verify your schedule's end_date is correct."
dgibson requested changes to this revision.Apr 14 2021, 6:41 PM
dgibson added inline comments.
python_modules/dagster/dagster/core/definitions/partition.py
190–191

This comment still applies

python_modules/dagster/dagster/utils/partitions.py
265

I'm having a lot of trouble parsing the new copy tbh, I'm not sure what "the earliest schedule tick" is referring to (I know what the earliest *partition* means, but not what the earliest schedule tick is - the schedule ticks start as soon as you turn the schedule on). And the issue isn't so much that the partition is earlier than the tick (it's always earlier, there's a fixed offset) as that it's not in the partition set at all. Happy to workshop the copy some more over slack more but I think we should be sure to mention that it's trying to use a partition that does not exist (and ideally give you a clear sense as to why). Definitely hard to message clearly though since what's going on under the hood here is pretty confusing

This revision now requires changes to proceed.Apr 14 2021, 6:41 PM
python_modules/dagster/dagster/utils/partitions.py
265

So "schedule ticks" can happen before the schedule start time and after the schedule end time? I was trying to avoid using "partition set" because "your partition couldn't be found because it's not in the partition set" felt a little tautological. Do we think most users will understand what a partition set is?

How about something like this:

"Your partition (2020-12-01) precedes the earliest partition in the partition set (2021-01-01). Verify your schedules start_date is correct."

dgibson requested changes to this revision.Apr 15 2021, 3:14 PM

sorry for all the back and forth here, just looking to get the copy super clear since the early check is a common source of user confusion in #general

re: referencing the partition set, If users read the docs they should be pretty clear that there's a partition set under the hood: https://docs.dagster.io/concepts/partitions-schedules-sensors/schedules#defining-a-schedule (but not everybody will do that - we've talked about renaming it to daily_partitioned_schedule instead to ensure that it's clear that there's a partition set under the hood)

python_modules/dagster/dagster/utils/partitions.py
262–263

i wish we had access to the partition format string here so that we could use the expected partition name rather than the raw datetime, but we do not.

What we could do is use a different slightly prettier format string that just includes the hour? Like .strftime("%Y-%m-%d %H%z") ?

263

I like the text, but we still have the problem that context.scheduled_execution_time.isoformat() is not the beginning of the partition set (or even any value in the partition set neccesarily - for example, you could have a schedule that executes every day at 9AM and fills in the previous day's partition set. The scheduled execution time is 9AM (not a time in the partition set) and the partition set is midnight from the day before)

Here is an idea: If you do want to fetch the first partition for this error message, or even as a replacement for the 'empty partition list' check, you could do something that has some overlap with what you had in the very first version of this diff:

partitions_for_early_check = partition_set_def.get_partitions(None) # Intentionally None, just to fetch the 'un-mocked' set of partitions, needs to be separate from the 'real' partitions list since it might not include partitions in the future
first_partition = partitions_for_early_check[0] # use this for the 'too early' check
This revision now requires changes to proceed.Apr 15 2021, 3:14 PM

Explicitly check against the earliest possible partition.

I also simplified the final partition lookup. I think a lot of its complexity is obviated by checking the upper and lower bounds first. Now, once we're confident that the partition time isn't outside the partition set's upper and lower bounds, we can return a matching partition if it's int he set and our generic SkipReason if it's not.

python_modules/dagster/dagster/utils/partitions.py
262–263

In the absence of the actual partition format string, I think I prefer the more explicit format. That future proofs it against custom schedules with a finer/different grained fidelity than the hour.

dgibson added inline comments.
python_modules/dagster/dagster/utils/partitions.py
249

i'm worried about some crazy edge case where somebody defines an empty partition set - that might be frowned upon but probably should not fail with an IndexError

252–258

is there a particular reason to change this? Maybe slightly premature optimization, but sometimes the partition set can be large and its typically the last element in the set

This revision is now accepted and ready to land.Mon, Apr 19, 2:05 PM
python_modules/dagster/dagster/utils/partitions.py
249

Good call - I'll fix that.

252–258

I wonder how big of a list it'd have to be for the short circuiting to be noticeable...

I agree - probably premature optimization. But I can add it back if we'd like. I found the logic to be a little more complicated to reason through in an already kind of muddy method.

  • Guard against IndexError when there's an empty partition set
This revision was automatically updated to reflect the committed changes.