String patterns #140
Comments
@AnthonyDGreen
Dim scheme = "http"
Dim rest = "www.microsoft.com"
Select Case input
    Case Match $"{scheme}://{rest}" When scheme = "ms-word"
    Case Match $"{user}@{domain}" When domain <> "outlook.com"
        Throw New NotSupportedException("All must use outlook.com!")
    Case Match $"{*}.docx"
    Case Match $"{CInt(id)}|{CDate(date)}|{description}|{CDbl(total)}"
        ' CSVs are never this easy to parse. Sooner or later 75 rows in somebody gets clever.
End Select
@AdamSpeight2008 In C#'s pattern matching, using a variable in the match clause with the same name as an outer-scope variable is a compiler error:
BTW, not to complain, but Scala/Kotlin/Swift et al. are fast becoming the new VB (not in BASIC syntax, but in the core VB values: expressiveness, sugar, aiding the programmer, etc.)
I want to note one hesitation: currently VB.NET supports two forms of patterns for strings:
Regex is arguably the most powerful parsing method, but I get tired of searching Google for the appropriate expression... And @AnthonyDGreen's proposal is much more powerful than the Like operator. I'd welcome an additional way to achieve this, and as it is simpler in my mind than Regex I feel it is in keeping with VB's stated goals of remaining approachable. Thoughts?
I'm not denying that this proposal would be very powerful; if it would originally have been available in the language it might have been an excellent choice. But once we have the Like syntax, I don't think it appropriate to add another syntax. That said, perhaps there's some way to incorporate string pattern matching into the existing Like syntax? As @AnthonyDGreen noted in the original post,
This builds on proposal #124 "Pattern Matching" with a pattern specifically designed for the decomposing (parsing) strings. The syntax is designed to as closely as practical mirror the interpolated string syntax introduced in VB14 for creating strings.
After 20 years of InStr, Mid, Left, and Right this really excites me. Even if I'm using IndexOf and Substring, this kind of parsing is still a frequently recurring task in my life.
This would compose with other patterns too: the "interpolations" bounded by { and } could contain other patterns. Right now I've only been able to figure out how to make it work with lazy matching and without backtracking, and it falls apart if two interpolations appear side by side (with no text between) since the first will eat up all the text.
Should the "alignment" part of an interpolation be usable to require/match a substring of fixed or minimal length?
Match $"{y,4}-{m}-{d}"
Maybe. It could help with the problem mentioned above.
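To make the lazy, no-backtracking behavior concrete, here is a rough sketch in today's VB. Everything in it is an assumption for illustration: the helper name TryMatchLazy, the representation of a pattern as literals between holes, and the reading of "alignment" as an exact width (the question above also allows a minimal one). It is not the proposed compiler behavior.

```vb
Imports System
Imports System.Collections.Generic

Module LazyMatchSketch

    ' Hypothetical helper. The pattern is hole, literal, hole, ..., hole.
    ' widths(i) > 0 forces hole i to be exactly that many characters;
    ' 0 means "stop at the first occurrence of the next literal" (lazy).
    ' There is no backtracking: the first candidate split is the only one tried.
    Function TryMatchLazy(input As String, literals As String(), widths As Integer(),
                          ByRef captures As List(Of String)) As Boolean
        captures = New List(Of String)()
        Dim pos As Integer = 0

        For i As Integer = 0 To literals.Length - 1
            Dim lit As String = literals(i)
            Dim litAt As Integer

            If widths(i) > 0 Then
                ' Fixed width: the literal must start exactly here.
                litAt = pos + widths(i)
                If litAt > input.Length OrElse
                   Not input.Substring(litAt).StartsWith(lit, StringComparison.Ordinal) Then
                    Return False
                End If
            Else
                ' Lazy: take everything up to the first occurrence of the literal.
                litAt = input.IndexOf(lit, pos, StringComparison.Ordinal)
                If litAt < 0 Then Return False
            End If

            captures.Add(input.Substring(pos, litAt - pos))
            pos = litAt + lit.Length
        Next

        ' The last hole takes whatever is left, subject to its width if it has one.
        Dim lastWidth As Integer = widths(widths.Length - 1)
        If lastWidth > 0 AndAlso input.Length - pos <> lastWidth Then Return False
        captures.Add(input.Substring(pos))
        Return True
    End Function

    Sub Main()
        Dim parts As List(Of String) = Nothing

        ' {scheme}://{rest}
        If TryMatchLazy("http://www.microsoft.com", {"://"}, {0, 0}, parts) Then
            Console.WriteLine(String.Join(" | ", parts))   ' http | www.microsoft.com
        End If

        ' {y,4}-{m}-{d}
        If TryMatchLazy("2017-07-04", {"-", "-"}, {4, 0, 0}, parts) Then
            Console.WriteLine(String.Join(" | ", parts))   ' 2017 | 07 | 04
        End If

        ' {y,4}{m,2}{d}: adjacent holes become splittable once widths are given
        If TryMatchLazy("20170704", {"", ""}, {4, 2, 0}, parts) Then
            Console.WriteLine(String.Join(" | ", parts))   ' 2017 | 07 | 04
        End If
    End Sub
End Module
```

Without widths, the adjacent-hole case has no sensible split point. In this sketch the last hole ends up with all the text; in the prototype described above the first one does. Either way a width is what removes the ambiguity.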
Is there anything at all that would make sense with the "format" part of an interpolation?
It seems hard since there's really no way to get that part to mean the same thing coming out as it does going in.
We need to find the sweet spot for productivity vs. power (as always). These be dangerous waters.
Can we do something to modernize the VB Like operator instead?
Not sure.
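For reference, a small sketch of what Like does today (default Option Compare Binary assumed): it tests a string against wildcards and yields only a Boolean, so none of the matched pieces can be bound to names the way the interpolation-style pattern would. The module name is just for illustration.

```vb
Imports System

Module LikeToday
    Sub Main()
        ' * matches zero or more characters
        Console.WriteLine("report.docx" Like "*.docx")             ' True
        Console.WriteLine("report.doc" Like "*.docx")              ' False

        ' # matches exactly one digit
        Console.WriteLine("2017-07-04" Like "####-##-##")          ' True

        ' The test succeeds, but the scheme and the rest are not captured.
        Console.WriteLine("http://www.microsoft.com" Like "*://*") ' True
    End Sub
End Module
```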
Why not full-blown regex literals like in Perl?
This has always been a great personal temptation for me. At the moment I don't think this is the right way to go. RegEx is about making extremely complicated things more terse (and cryptic). That's counter to our goals with VB of making things straight-forward and approachable. All but the very simplest of regexes (regexen?) quickly become arcane magic. One test of this proposal will be how much people still need to fall back to regex with it.
Does this pattern need built-in alternation?
It does seem natural to support alternation within this pattern as a way of describing optionality:
Any term that isn't matched in all cases may be null (maybe we should require a ? after the name then). The second Case is complete (I think), but the first does not exhaustively handle all permutations. I think the number of cases you'd need to write to represent all possibilities is 16 (could be wrong). I think the correct code is:
Is that really so much more readable than
"((<scheme>.+)://)?(<domain>*+)(:(<port>\d+))?(/(<path>.*))?(\?(<query>.+))?"
Well, yes, and infinitely easier to reason about (my brain froze several times writing it), but that's not the point.
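The expression above reads as shorthand rather than something to paste into Regex. A runnable .NET counterpart might look like the sketch below, assuming the (?<name>...) named-group syntax and character classes that stop each part at the next delimiter. The pattern, the sample URL, and the module name are mine, not part of the proposal.

```vb
Imports System
Imports System.Text.RegularExpressions

Module UrlRegexSketch
    ' Assumed pattern: every part except the domain is optional.
    Public Const UrlPattern As String = "^((?<scheme>[^:/?#]+)://)?(?<domain>[^:/?#]+)(:(?<port>\d+))?(/(?<path>[^?]*))?(\?(?<query>.+))?$"

    Sub Main()
        Dim m As Match = Regex.Match("https://example.com:8080/docs/index.html?x=1", UrlPattern)
        Console.WriteLine(m.Groups("scheme").Value)   ' https
        Console.WriteLine(m.Groups("domain").Value)   ' example.com
        Console.WriteLine(m.Groups("port").Value)     ' 8080
        Console.WriteLine(m.Groups("path").Value)     ' docs/index.html
        Console.WriteLine(m.Groups("query").Value)    ' x=1

        ' Optional parts that did not take part in the match report Success = False.
        Dim bare As Match = Regex.Match("example.com", UrlPattern)
        Console.WriteLine(bare.Groups("domain").Value)   ' example.com
        Console.WriteLine(bare.Groups("port").Success)   ' False
    End Sub
End Module
```

The Success check on each optional group is the regex-world equivalent of the "may be null" terms discussed above.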
Is there some way to support greedier matching or backtracking?
So far pattern functions as I envision them take the form <Function([ByRef p1 [, ByRef p2, ...]]) As Boolean>. Maybe there's some other form we could consider for strings (or maybe all enumerables?) that could let the match function ... too darn complicated.
This could be pretty hard to prototype and needs a lot of spec work.
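As a point of reference for that Boolean-with-ByRef shape, here is a hand-written function of roughly that form for the {user}@{domain} case. The name TryMatchUserAtDomain is hypothetical, and passing the input as an explicit first argument is my assumption; how the value being matched actually reaches a pattern function is not spelled out here.

```vb
Imports System

Module PatternFunctionSketch

    ' Hypothetical pattern function: returns True on a match and hands the
    ' captured pieces back through ByRef parameters.
    Function TryMatchUserAtDomain(input As String,
                                  ByRef user As String,
                                  ByRef domain As String) As Boolean
        user = Nothing
        domain = Nothing

        ' Lazy: split at the first "@"; no other split point is ever tried.
        Dim at As Integer = input.IndexOf("@"c)
        If at < 0 Then Return False

        user = input.Substring(0, at)
        domain = input.Substring(at + 1)
        Return True
    End Function

    Sub Main()
        Dim user As String = Nothing
        Dim domain As String = Nothing

        If TryMatchUserAtDomain("someone@outlook.com", user, domain) AndAlso
           domain <> "outlook.com" Then
            Throw New NotSupportedException("All must use outlook.com!")
        End If

        Console.WriteLine(user & " at " & domain)   ' someone at outlook.com
    End Sub
End Module
```

A function of this shape can only answer yes or no once, which is why it cannot offer the caller a second, greedier split if a later part of the pattern fails.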