Support for python "re" module for doing regex in jinja templates #1755

Closed
whisperstream opened this issue Sep 16, 2019 · 7 comments · Fixed by #2851
Labels
enhancement New feature or request good_first_issue Straightforward + self-contained changes, good for new contributors!

Comments

@whisperstream

Describe the feature

I have a case where I'd like to be able to use a regex in a jinja template to match a particular set of values. It would be handier to have access to python's re module than to rely on normal string manipulation and matching.

Describe alternatives you've considered

For now I'll probably use text manipulation with if, |replace and + to get the string into the right format, but it would be much handier to be able to get the values from regex groups.

Additional context

Not database specific, would apply to all jinja templates in the same way that datetime and pytz modules are made available.

Who will this benefit?

Anybody who wants to do more complex string manipulation and string checking in a more straightforward way than is possible today.

@whisperstream whisperstream added enhancement New feature or request triage labels Sep 16, 2019
@drewbanin drewbanin removed the triage label Sep 16, 2019
@drewbanin
Contributor

Hey @whisperstream! What kind of regexing are you planning on doing in the jinja context? I'm open to this idea, but I'd like to better understand what your use case is here

@whisperstream
Author

Sure, it's part of the same problem I've been trying to solve for better handling permissions in dbt models #1695.

So for this problem I have the following:

        {%- if row['granted_on'].lower() == 'role' and (row['name'].lower().startswith('dev_br') or  row['name'].lower().startswith('dev_raw')) -%}
            {%- set value = row['name'][4:].lower() -%}

whereas with re I could do:

        {% set matcher = re.match('^dev_((br|raw)_\w+)', row['name']|lower) %}
        {%- if matcher is not none -%}
            {%- set value = matcher.group(1) -%}

I think regex is pretty standard these days and so having a more concise way of expressing these conditions could be useful.
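For comparison, the two conditions above can be sketched in plain Python. This is only an illustration: the function names are hypothetical, the `granted_on` check is left out, and the sample role names are made up. It also shows that the two forms are not strictly equivalent, since the regex requires an underscore after `br` or `raw` while `startswith` does not.

```python
import re

# Same pattern as in the template snippet, written as a raw string.
PATTERN = re.compile(r"^dev_((br|raw)_\w+)")


def value_with_startswith(name):
    """Hypothetical helper mirroring the startswith/slice version."""
    lowered = name.lower()
    if lowered.startswith("dev_br") or lowered.startswith("dev_raw"):
        # Drop the leading "dev_" (4 characters).
        return lowered[4:]
    return None


def value_with_regex(name):
    """Hypothetical helper mirroring the re.match version."""
    matcher = PATTERN.match(name.lower())
    # group(1) is everything after "dev_" that the pattern matched.
    return matcher.group(1) if matcher is not None else None


if __name__ == "__main__":
    for sample in ["DEV_BR_SALES", "dev_raw_events", "prod_br_sales", "dev_brx"]:
        print(sample, value_with_startswith(sample), value_with_regex(sample))
```

Both helpers agree on names such as `DEV_BR_SALES`, but a name like `dev_brx` passes the `startswith` check and is rejected by the regex.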

@drewbanin
Contributor

drewbanin commented Sep 16, 2019

roger! That makes a lot of sense.

Is this something you'd be interested in submitting a PR for? You can find the code where modules are injected into the context here: https://github.com/fishtown-analytics/dbt/blob/f9bc7c56e512d424248c4c0eeaceab5a2cc04fd0/core/dbt/context/common.py#L344-L348

We limit the methods that are exported from the modules in the dbt context -- check out get_datetime_module_context or get_pytz_module_context in that same file for an example.
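A minimal sketch of what an allowlist-style export for `re` might look like, assuming it follows the same idea as the existing helpers. The function name `get_re_module_context` and the list of exported names are assumptions made for illustration; they are not taken from the dbt codebase, and the change that eventually closed this issue may look different.

```python
import re


def get_re_module_context():
    """Hypothetical helper: expose only selected functions from `re`."""
    # Assumed allowlist; anything not named here stays out of the context.
    context_exports = ["match", "search", "fullmatch", "findall", "sub", "split"]
    return {name: getattr(re, name) for name in context_exports}


if __name__ == "__main__":
    # Stand-in for the "modules" entry of a template context.
    modules = {"re": get_re_module_context()}
    matcher = modules["re"]["match"](r"^dev_((br|raw)_\w+)", "dev_br_sales")
    print(matcher.group(1))
```

Building a plain dict from an explicit list means a template only sees the names that were deliberately chosen, rather than every attribute of the module.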

@drewbanin drewbanin added the good_first_issue Straightforward + self-contained changes, good for new contributors! label Sep 16, 2019
@whisperstream
Author

whisperstream commented Sep 17, 2019

yeah I'm happy to give that PR a shot. @drewbanin which branch should I be working against?

@drewbanin
Contributor

Ah! Sorry, totally missed this. For posterity: use the default branch shown at https://github.com/fishtown-analytics/dbt -- that will always be the dev/ branch we're working on. In this case, it's dev/louisa-may-alcott!

@machupichu

@drewbanin I am also very interested in that feature! I am trying to tweak your star macro to add a regex argument that selects only a subset of columns based on this regex! I guess the PR hasn't been opened/merged, right?

@drewbanin
Contributor

hey @machupichu - that's right - we haven't yet made a code change that adds support for the re module.

Check out the discussion here for some background on why solving this problem in the general case is tough: #1997

I'm a-ok with adding specific support for the re module if we can make sure that it doesn't leak out attributes that would lead to any security issues. Check out this comment in particular for more info: #1997 (comment)

3 participants