[CT-1811] [Feature] Unix-style wildcard fqn selector method via fnmatch #6598
Comments
This is a cool idea @z3z1ma! And I'm pleasantly surprised by how short and sweet the implementation in #6599 looks. Questions for you and @jtcohen6:
@dbeatty10 my take!
Originally I would have liked to update the default
    for i, selector_part in enumerate(node_selector.split(".")):
        # if we hit a GLOB, then this node is selected
        if selector_part == SELECTOR_GLOB:
            return True
        elif flat_fqn[i] == selector_part:
            continue
        else:
            return False
This is not as flexible as the
As far as adding it elsewhere... My point above remains relevant on the
Proof of existing test cases passing with minimal changes.
The unix-style wildcard syntax is significantly simpler than regex. Furthermore, I cannot think of any selection criteria that could not be expressed with unix-style wildcard syntax, yet I can think of innumerable "weird" things and issues to troubleshoot for users if they try to use regular expressions. There is no need for matching groups, etc. There is more mileage here in simplicity. It is also worth mentioning
After caffeinating, I am thinking that even changing the qualified name selector (the default) to use fnmatch would not actually be a breaking change: once a glob is encountered, it works identically to today by selecting all resources that share the preceding segments. 😄
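The backward-compatibility argument can be sketched as follows. This is only an illustration, not the code from #6599: the helper `is_selected`, its length guard, its final `return True`, and the sample fqn are assumptions made for the example. It runs the same segment loop twice, once with plain equality and once with `fnmatchcase`, and compares the results.

```python
from fnmatch import fnmatchcase
from operator import eq

SELECTOR_GLOB = "*"


def is_selected(flat_fqn, node_selector, segment_matches):
    """Walk the selector segment by segment (hypothetical helper).

    segment_matches(name, pattern) decides whether one fqn segment
    satisfies one selector segment.
    """
    parts = node_selector.split(".")
    # Assumption: a selector longer than the fqn can never match.
    if len(flat_fqn) < len(parts):
        return False
    for i, selector_part in enumerate(parts):
        # A bare glob selects everything sharing the preceding segments.
        if selector_part == SELECTOR_GLOB:
            return True
        if not segment_matches(flat_fqn[i], selector_part):
            return False
    # Assumption: every selector segment matched, so the node is selected.
    return True


fqn = ["jaffle_shop", "staging", "stg_customer"]  # made-up fqn

# Selectors that are valid today: literal segments and a trailing glob.
existing = ["jaffle_shop", "jaffle_shop.staging", "jaffle_shop.staging.*", "jaffle_shop.marts"]
same = all(
    is_selected(fqn, s, eq) == is_selected(fqn, s, fnmatchcase) for s in existing
)

# A selector that only makes sense with wildcard matching.
new_selector = "jaffle_*.staging.*_customer"
new_exact = is_selected(fqn, new_selector, eq)
new_wild = is_selected(fqn, new_selector, fnmatchcase)

print(same, new_exact, new_wild)
```

Under these assumptions the two comparisons agree on every selector that contains no wildcard characters inside a segment, and only the fnmatch variant accepts the new pattern. A resource name that literally contains `?` or `[` would be the one place where behavior could differ.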
Awesome analysis @z3z1ma! Based on your findings, I'd be inclined to add your
@jtcohen6 you may have a different read here or things we haven't yet considered. I'll be interested to hear your thoughts and feedback.
Support added for all the "string matching" based selectors:
MetricSelectorMethod
FileSelectorMethod
PackageSelectorMethod
QualifiedNameSelectorMethod
SourceSelectorMethod
TagSelectorMethod
TestNameSelectorMethod
There is no logical purpose for fnmatch in any of the other methods, but the above can all be augmented. This is still a non-breaking change and net new functionality for all of these. Possibilities include:
Once @jtcohen6 reviews, I'll remove my Wildcard selector since the functionality is embedded in the default selector. I will rename the test case which comprises more advanced patterns,
@z3z1ma This is awesome!! You're right, I am pleasantly surprised with how naturally this change fits into the existing code, and the existing user experience (
Fully agree with:
I'm going to remove
Is this your first time submitting a feature request?
Describe the feature
Synopsis
Support a new selector method which leverages fnmatch from the Python stdlib. This function matches a string against a pattern that uses Unix wildcard syntax: * (any run of characters), ? (any single character), and range expressions such as [a-z] or [1-9]. Using this syntax against the model's fqn allows countless more creative --select uses.
It is entirely opt-in. No existing behavior is changed. You can use this by prefixing a selector with wildcard: like this:
wildcard:jaffle_*.staging.*_customer
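For readers unfamiliar with fnmatch, here is a minimal sketch of how such a pattern behaves. The fqns are made up, and comparing each fqn as a single dot-joined string is an assumption for illustration; the actual selector may walk the segments differently.

```python
from fnmatch import fnmatchcase

# Hypothetical fully qualified names, written as dot-joined strings.
fqns = [
    "jaffle_shop.staging.stg_customer",
    "jaffle_shop.staging.stg_orders",
    "jaffle_shop.marts.dim_customer",
    "jaffle_gym.staging.base_customer",
]

pattern = "jaffle_*.staging.*_customer"

# fnmatchcase is the case-sensitive variant; "*" matches any run of
# characters, including dots.
selected = [fqn for fqn in fqns if fnmatchcase(fqn, pattern)]
print(selected)

# "?" matches exactly one character; "[a-z]" and "[1-9]" match one
# character from the given range.
single_char = fnmatchcase("stg_orders_v1", "stg_orders_v?")
in_range = fnmatchcase("model_a1", "model_[a-z][1-9]")
out_of_range = fnmatchcase("model_a0", "model_[a-z][1-9]")
print(single_char, in_range, out_of_range)
```

Only the two staging models whose names end in _customer are selected here.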
Unlocks
This is a significant improvement to direct test selection. You can now test specific columns across an entire section of the graph.
One example:
dbt test -s 'wildcard:*salesforce*ccm_cloud_spend*'
This will run all tests that contain salesforce and ccm_cloud_spend, in that order, in the fqn. No tag wrangling required.
The same benefits apply to any workflow where users need to select something with more flexibility than what is currently possible. Some complex selectors may also be able to be simplified.
Run everything upstream of any model ending with _arr:
dbt run -s '+wildcard:*_arr'
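To make the combination of the + graph operator with a wildcard concrete, here is a toy sketch. The parent map, the model names, and the helper `select_with_ancestors` are all hypothetical and are not dbt's graph code; the point is only that the wildcard picks the starting nodes and the + then pulls in their ancestors.

```python
from fnmatch import fnmatchcase

# Hypothetical DAG: each model name maps to its direct parents.
parents = {
    "stg_subscriptions": [],
    "stg_invoices": [],
    "fct_arr": ["stg_subscriptions", "stg_invoices"],
    "dim_customers": ["stg_subscriptions"],
    "board_report": ["fct_arr"],
}


def select_with_ancestors(pattern, parents):
    """Return nodes matching the pattern plus everything upstream of them."""
    matched = {name for name in parents if fnmatchcase(name, pattern)}
    selected = set(matched)
    stack = list(matched)
    # Depth-first walk up the parent links.
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in selected:
                selected.add(parent)
                stack.append(parent)
    return selected


result = sorted(select_with_ancestors("*_arr", parents))
print(result)
```

In this toy graph the selection contains fct_arr and its two staging parents, while the downstream board_report and the unrelated dim_customers are left out.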
Conclusion
There are theoretically no downsides since it's a net new feature that is fully opt-in.
Describe alternatives you've considered
This is painful to write, but here:
What I have proposed is far more dynamic than anything I can conjure up.
Who will this benefit?
Everyone who needs more control over how they select their resources. It especially benefits direct test selection, which is really painful right now, but we could rattle off use cases all day given the flexibility of this selector.
Are you interested in contributing this feature?
Yes
Anything else?
dbt docs PR
dbt-labs/docs.getdbt.com#2702