Add compatibility for JSON output of dbt>=0.19.0.
Closed, Public

Authored by bob on Feb 10 2021, 5:50 PM.

Details

Summary

Intended to resolve issue #3616

The JSON schema for dbt run results (among many other dbt Artifacts) has changed in dbt 0.19.0.

dagster-dbt currently fails when parsing the output from dbt run and dbt compile. This diff makes the fields that are missing in 0.19.0 optional, and should be compatible with dbt versions both before and after 0.19.0.
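
For illustration only, here is a minimal sketch of the "treat missing fields as optional" idea. ResultSketch and its from_dict helper are hypothetical stand-ins, not the actual classes in dagster-dbt's types.py; the field names error, fail, and skip are taken from the discussion further down.

```python
from typing import Any, Dict, NamedTuple, Optional


class ResultSketch(NamedTuple):
    """Hypothetical stand-in for a parsed dbt result (not the dagster-dbt class)."""

    error: Optional[str]
    fail: Optional[bool]
    skip: Optional[bool]

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "ResultSketch":
        # dict.get returns None for absent keys, so output that lacks
        # these fields parses without raising a KeyError.
        return cls(error=d.get("error"), fail=d.get("fail"), skip=d.get("skip"))
```

With this shape, an entry that lacks a field parses to None for that field instead of failing, which is also why data can silently go missing for newer dbt versions.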

To Do

  • Decide on how long dagster-dbt will support dbt <0.19.0. Please comment below with your thoughts.
  • Include new metadata fields from dbt 0.19.0 in the dagster-dbt Outputs and AssetMaterializations
Test Plan

buildkite

Diff Detail

Repository
R1 dagster
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed. Feb 10 2021, 6:06 PM
Harbormaster failed remote builds in B25552: Diff 31184!
bob edited the summary of this revision. (Show Details)
bob added reviewers: sandyryza, max.

With this change, will users on the later version of dbt be missing data from their AssetMaterializations that users on the earlier versions will have? If so, is that addressable? On the Github issue, I think someone mentioned that "unique_id" could be used as a replacement?

Yes, this change will result in missing data for later versions of dbt.

  1. In 0.18, a node object contains unique_id, as well as 10+ other fields. In 0.19, node has been replaced with unique_id, and the 10+ other fields are now omitted. With this change, parsing 0.19 output would result in node={} and no unique_id field.
  2. Also, in 0.18, the root JSON object has a field generated_at. In 0.19, this field has been moved into a metadata object. With this change, parsing 0.19 output would result in generated_at=None.
  3. Before this change, many of the fields were already Optional[] because the dbt JSON had no officially documented schema. Based on the docs for 0.19, a few of these fields (e.g. error, table, fail, warn, skip) are definitely removed.
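
As a rough sketch of how (1) and (2) could be handled, assuming the two layouts are exactly as described in the list above (the function names are hypothetical and are not part of this diff):

```python
from typing import Any, Dict, Optional


def extract_unique_id(result: Dict[str, Any]) -> Optional[str]:
    """Return unique_id from one result entry in either layout."""
    # 0.19 layout as described above: unique_id sits directly on the result.
    if "unique_id" in result:
        return result["unique_id"]
    # 0.18 layout: unique_id is nested inside the node object.
    return (result.get("node") or {}).get("unique_id")


def extract_generated_at(run_results: Dict[str, Any]) -> Optional[str]:
    """Return generated_at from the root (0.18) or from metadata (0.19)."""
    if "generated_at" in run_results:
        return run_results["generated_at"]
    return (run_results.get("metadata") or {}).get("generated_at")
```

Both helpers return None when the value is absent from both locations, which keeps the behavior in line with the Optional[] fields mentioned in (3).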

Next Steps

I can address (1) and (2) and will update this diff. As for (3), I am not so sure.


Other Discussion

This change will at least allow dagster-dbt to run successfully with dbt 0.19, so as to resolve issue #3616.

In an ideal world, dagster-dbt would be able to support dbt 0.X through 0.19, but I have yet to decide what X should be. In any case, if we assume that we want to support, say, dbt 0.17[1] through 0.19, then there would be quite a lot of changes to types.py, as well as to our testing for each version of dbt. In my opinion, such changes should go in a different diff (or a diff stacked on this one).

I currently feel like we should support dbt at least as far back as dbt 0.17 (released in June 2020), but I'd be happy to hear other people's thoughts on this matter. I will also have more information about how elegant/gnarly it will be to support dbt 0.17 through 0.19 as I tinker with it a bit more.

max added inline comments.
python_modules/libraries/dagster-dbt/setup.py
32

presumably this should be >0.17?

This revision is now accepted and ready to land. Feb 11 2021, 8:27 PM