-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Working solution serialization bug. #5874
Working solution serialization bug. #5874
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're converting all of these Exception object to strings in the next structured logging update anyway, because everything in the event has to be encoded into a protobuf message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
So it looks like #5385 has the exact same problem (passing a non-string to |
…utes. Before this change, the spec of various classes expected base_msg and msg params to be str's. This assumption did not always hold true. post_init hooks ensures the spec is obeyed.
e07829f
to
ef4eae9
Compare
Revised in light of the recent structured logging change. PR description revised, mostly expanded to describe the second bug. Employs the same subclass-as-trait paradigm to . And for good measure, I did check the new structured logging code on main still exhibits the bugs described in the linked issues before folding these changes into the code. |
7e59699
to
2dae7e7
Compare
class AdapterEventStringFunctor: | ||
def __post_init__(self): | ||
super().__post_init__() | ||
if not isinstance(self.base_msg, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to check if base_msg exists?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought about this. We're talking data classes so these objects must be instantiated with params. A little toy example:
from dataclasses import dataclass
@dataclass
class Item:
a:str
b:str
c:str
asdf = Item()
Try running this and it complains:
TypeError: init() missing 3 required positional arguments: 'a', 'b', and 'c'
Hence, if a dataclass exists, it's gotta have the params. And this functor should only be used in classes that have this param. I think that renders an existential check redundant? (if it's not, tell me)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, just one question. Also, do we need unit tests to validate this behavior?
Unsure about unit tests. Is it worth having tests to check that the intended spec holds for something like the structured logging? That feels extra to me, but I'm also open to writing some quick ones if it seems like a good idea from your POV! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
resolves #5436
resolves #5385
Description
Ho, boy! What an odyssey. This PR has some reworks to the adapter logging functions for dbt. These changes help us conform to the intended spec of the logging classes. The resulting behavior makes our log functions reminiscent of python's
print()
.reproducing the errors
5385
models/asdf.sql
:5436
asdf2.sql
:dbt --no-use-colors --log-format json run -s asdf2 | jq .msg
——————
dbt-bigquery==1.0.0
dbt-core==1.0.8
dbt-bigquery==1.1.0
dbt-core==1.1.2
dbt-bigquery==1.2.0
dbt-bigquery==1.3.0b2
dbt-core==1.3.0b2
dbt-core==1.2.1
Error 1: Serialization Woes
json.dumps
presumes that all elements within a dictionary can be simply rendered as a json string. But complex objects inside the argument will cause a serialization error. This happens precisely at line 209 of core/dbt/events/functions.Several classes have
msg
parameter. This spec does not always match reality: Some arguments are not strings. This leads to trouble whenlogger
member function calls are made. So, if we force the message given to logger functions to be a string, the spec matches reality and the error vanishes. Local proof of it working on development.This error is what the user OwenKephart graciously documented over in Bigquery issue 206 as a desired output for my test case model, designed to fail.
QED.
Error 2: Secret scrubbing implodes when provided a non-str type
Jinja
log
hands a type to Python which should but isn't always a string. Although, we have a functionscrub_secrets
which expects that.By making sure these
MacroEvent*
types are all in fact strings, we solve the current problem and hopefully, future ones.Checklist
changie new
to create a changelog entry