preserve empty strings in native env templating #2608

drewbanin · 2020-07-01T22:20:21Z

Do not merge this PR

Just wanted to call out another issue we're seeing with the Jinja NativeEnv implementation introduced in 0.17.1. This PR fixes a problem where the empty string '' is converted to None.

Example:

version: 2

models:
  - name: debug

    columns:
      - name: id
        tests:
          - accepted_values:
              values: ['', 'a']

Here, the test compiles to ....where id not in ('None', 'a').

I think the number of holes to plug in our current approach to native env rendering is exceeding the number of thumbs we have available to us. Let's regroup and figure out a long-term sustainable approach to type inference for yaml+jinja.

drewbanin · 2020-07-06T15:45:52Z

@cla-bot check

cla-bot · 2020-07-06T15:45:56Z

The cla-bot has been summoned, and re-checked this pull request!

beckjake

We should test this, of course...

drewbanin · 2020-07-06T19:28:23Z

Do we want to further restrict the set of types that dbt will convert to? The docs say that literal_eval will:

Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.

So, literal_eval will convert string values to tuples, lists, dicts and sets. I think it could be a good idea to restrict this set of destination types to non-container types. I do think that coercing string input to a container type is an awesome feature when used intentionally, but a very bad feature when used by accident. A string like "('abc', 'def')" is a pretty reasonable looking SQL fragment, but in the native env context, it would be interpreted as a Python tuple!

I'd be ok with documenting and supporting this behavior if we're happy living with it for the foreseeable future. If we have any reservations about something like this though, it could be a good idea to remove support today and consider explicit ways to support these container types in a native env in the future

beckjake · 2020-07-06T20:04:24Z

so, I know earlier I said that I thought it would be hard to add an as_bool that's the reverse of the default, but maybe I was wrong. What about something like this?

class NativeMarker(str):
    pass


def get_environment(...):
   ...
   env.filters['as_native'] = NativeMarker


def quoted_native_concat(nodes):
    head = list(islice(nodes, 2))

    if not head:   # this turns get_rendered('') into ''
        return ''

    if len(head) == 1:
        raw = head[0]
        if not isinstance(raw, NativeMarker):
            return str(raw)
    else:
        # turn multiple nodes into a string
        return "".join([str(v) for v in chain(head, nodes)])

    # if we got here, there was a single node marked with the 'as_native' filter.
    try:
        return literal_eval(raw)
    except (ValueError, SyntaxError, MemoryError):
        return raw

beckjake · 2020-07-06T20:06:24Z

That's kind of a reset to the 0.16 behavior, but it makes the native rendering sort of opt-in at the jinja level, with some caveats around the native code generator behaving slightly differently from the standard code generator. Maybe that's what we always wanted?

beckjake · 2020-07-06T20:09:36Z

You could pretty easily extend this with filters that assert the type, too. so we could implement an as_bool filter that ensures the result of literal_eval is a bool, etc... but this seems like a potential one-size-fits-all solution for the issue.

It does suck for anyone who was using enabled: "{{ var('my_var_value') }} and now has to tack an | as_native on the end, but maybe optimizing for them is less important than optimizing for everyone else?

drewbanin · 2020-07-07T01:18:16Z

cool, that approach is broadly aligned with how I was thinking about this too. There's a lot to like about the 0.17.0 behavior, but it has too many quirks to be a sane default. I just keep coming back to the case of key: '00'. Any implementation that evaluates this string out to 0 is just wrong. If the only answer there is making the native env opt-in/explicit, then so be it IMO.

We're going to need to write something up about this change to make sure folks understand why we changed the behavior in 0.17.0 and why we're changing it back in 0.17.1 (or does this make our next release 0.18.0??). I think most folks would agree that making native env rendering explicit is the right decision long-term, and if that's the case, then it will benefit everyone to make this change as soon as possible.

drewbanin · 2020-07-08T19:58:56Z

I confused myself with this one :)

Closing as this was replaced by #2618

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

preserve empty strings in native env templating #2608

preserve empty strings in native env templating #2608

drewbanin commented Jul 1, 2020

drewbanin commented Jul 6, 2020

cla-bot bot commented Jul 6, 2020

beckjake left a comment

drewbanin commented Jul 6, 2020

beckjake commented Jul 6, 2020

beckjake commented Jul 6, 2020

beckjake commented Jul 6, 2020

drewbanin commented Jul 7, 2020

drewbanin commented Jul 8, 2020

preserve empty strings in native env templating #2608

preserve empty strings in native env templating #2608

Conversation

drewbanin commented Jul 1, 2020

Do not merge this PR

drewbanin commented Jul 6, 2020

cla-bot bot commented Jul 6, 2020

beckjake left a comment

Choose a reason for hiding this comment

drewbanin commented Jul 6, 2020

beckjake commented Jul 6, 2020

beckjake commented Jul 6, 2020

beckjake commented Jul 6, 2020

drewbanin commented Jul 7, 2020

drewbanin commented Jul 8, 2020