String operators and builtin functions #38

bittrance · 2019-04-07T06:44:28Z

I'm planning a project which will have both rules (i.e. boolean expressions) and transformations (i.e. string expressions) in the same configuration entry, so it makes sense to use a library with support for both to get a unified syntax for these expressions. Evalexpr almost fills this description, but would need a bit more string support. Given that these functions are pretty standard, I thought it makes sense to contribute them upstream rather than keep them locally.

This PR introduces common operations for strings:

Identifier	Argument Amount	Description
downcase	1	Returns lower-case version of string
len	1	Return the character length of string argument
match	2	Returns true if first string argument matches regex in second
replace	3	Returns string with matches replaced by third argument
trim	1	Strips whitespace from start and end of string
upcase	1	Returns upper-case version of string

This PR adds a dependence on the regex crate. This should probably be optional, but I'm thinking it is so likely that a string user will want to use regexes that it might make more sense to try to make all string functions optional?

bittrance · 2019-04-07T07:19:08Z

Having thought a bit more, the question seems to boil down to one of type separarion. One strategy would be to support implicit type conversion (e.g. "foo" + 1). The other strategy would be to be very strict, perhaps by implementing a separate string concatenation operator (e.g. "foo" . "bar") and provide explicit type conversion functions (e.g. format("%s", 4.3)). Which would you prefer?

ISibboI · 2019-04-08T16:04:24Z

Hey bittrance,

thank you for your interest in this crate and for your contribution. I have a few remarks:

I want to introduce namespaces into this crate. This way, variables and functions can have more of a context. So I propose to put most of the functions you implemented into the str namespace. For now, this would just mean to add the prefix str:: to their names.
I would like to keep the names of builtin functions as close to the Rust names as possible. I propose the following names:
- str::to_uppercase
- len - This can actually be reused for different types, so it makes sense to keep it in the root namespace
- str::matches
- str::replace
- str::trim
- str::to_lowercase
I would make the dependency to the regex crate optional, because one of the selling-arguments of this crate is having no required dependencies. We should introduce the feature regex_support for the str::matches function. The other string functions that don't require regex should be non-optional in my opinion, because they don't depend on regex.
The crate already does implicit type conversion from int to float in all relevant operations, if not all arguments are int. So I think overloading the + operator to concatenate strings and implicitly convert to string if only one argument is a string is fine and fits well into the crate.

That's it so far. I will also take a look at the code.
Thank you very much for this pull request!

ISibboI

I looked at the code and I think it looks good. There are some minor things here and there.

And I see you did not implement the auto-conversion yet. I would accept the pull-request without, that's fine. But then we should definetly make an issue for the auto-conversion

src/error/display.rs

src/error/mod.rs

src/function/builtin.rs

src/error/mod.rs

ISibboI · 2019-04-08T16:36:10Z

(I just started a new job and moved to a new location, so my work on this is slowed down a bit. But as soon as I get internet at home, I will be able to care for this crate properly again.)

bittrance · 2019-04-09T07:47:26Z

I understand your plan to have namespaces and it would make sense for some use cases, particularly where the developer doesn't have control over the runtime list of available functions, but I'm not sure it will make much sense to my users. If I use evalexpr as a light-weight configuration templating language, I know that the type of the field is supposed to be a string and the list of functions will be well-known. A somewhat simplified configuration example from my project:

resource {
  name = replace_matches(system_name, "prod-us1-(.*)", "$1") + "-" + to_lowercase(service_name)
  filter = tag == "production" && level(status == "NORMAL", "5 min")
}

Prefixing the functions with str:: (or perhaps regex::) will just be verbose. Namespacing will also mean that the locally defined level function (as in levelling period) will either have to have a separate namespace (perhaps agg(regaiton):: or (the project name) barsa::) or reside in the top namespace, but either way the user will have to know where to find it.

I could add functions into namespaces as per your description and introduce re-exporting functions so that my project can make the string functions available in the root namespace as well. Does that make sense?

Regarding type conversions, I have come to realize that this should rather be discussed in terms of type safety (not just separation). In particular, I think that I would like to be able to validate that the resulting type of an expression is correct when I load it. This is a separate issue from string functions, so I will do some research and open a separate issue to explain my thinking.

Regarding regex method naming, note https://docs.rs/regex/1.1.5/regex/struct.Regex.html#using-the-stdstrpattern-methods-with-regex . Some digging points to rust-lang/rust#27721 (#![feature(pattern)]) and rust-lang/rust#56345 (#![feature(needle)] and friends) so it will be there eventually, I suppose. I'm fine with implementing regex::{matches,replace} or str::regex_{matches,replace} if you prefer to "track" stable Rust in this regard.

The rest of the comments are straight-forward changes that I will incorporate when I update the code pending resolution of these discussions.

ISibboI · 2019-04-12T16:42:28Z

Hey bittrance,

I understand your concern about namespaces being too verbose for your use case. I think we can easily resolve that by adding something like an include_all_namespaces flag to the Context. If it is set, having an expression x would evaluate to the value of x, or abc::x or some other namespace.
In case of conflicts, an error should be returned.

For the sake of moving this forward, we could just merge the pull request without the namespacing changes, so you name all your methods without str:: prefixes, and I introduce that later on.
This leads to another breaking change, but I think that's fine.

About type conversions, I think we can also merge this pull request without the auto-conversion feature into strings. Can all be added separately. I think we should not overload this pull request.

For the regex naming, I think there should be a differentiation between calling for example the matches method with a regex, or with a simple string pattern. Since string patterns might also contain control-characters of regexes, which might then be confusing to the user. I think I would rather have separate methods for regex/non-regex operations. So for now I guess regex_{matches,replace}.

…), trim, upcase. New dependency regex.

…e(regex), trim, upcase. New dependency regex.

Also simplify their error handling.

bittrance · 2019-04-12T21:48:23Z

I went with your original suggestion to namespace with str:: as it is going to be a while until this becomes a problem for me and I'm confident we can find a good solution in a separate PR.

I hid the parts dependent on regex behind a feature flagn (excluding the error variant and its support which are not dependent on anything from the actual crate). Not sure feature flagging individual match arms is best practice, but I think that problem should be solved in the more general context of namespaces and modularized builtins.

I force-pushed the branch in order to make the regex dependency optional in all commits so as to not break future bisection.

ISibboI · 2019-04-13T16:02:07Z

Okay that sounds good.

For feature flagging individual match arms: I'm fine with that. Right now there are so little builtins that this list is easy to comprehend.

ISibboI · 2019-04-13T16:02:59Z

Thank you for your contribution! You seem to write really good code.

Relates to #38

Support addition and comparison of strings.

a9c4530

ISibboI requested changes Apr 8, 2019

View reviewed changes

src/error/display.rs Outdated Show resolved Hide resolved

src/error/mod.rs Outdated Show resolved Hide resolved

src/function/builtin.rs Outdated Show resolved Hide resolved

src/function/builtin.rs Outdated Show resolved Hide resolved

src/error/mod.rs Outdated Show resolved Hide resolved

ISibboI added enhancement New feature or request feature:regex Related to the optional regex feature of this crate labels Apr 8, 2019

ISibboI added this to the 4.2.0 milestone Apr 8, 2019

bittrance added 5 commits April 12, 2019 23:35

Builtin string functions - downcase, len, match(regex), replace(regex…

be54931

…), trim, upcase. New dependency regex.

fixup! Builtin string functions - downcase, len, match(regex), replac…

f6c8689

…e(regex), trim, upcase. New dependency regex.

Rename matches,replace -> regex_{matches,replace}.

7c6b6d4

Also simplify their error handling.

Rename builtin upcase,downcase -> to_uppercase, to_lowercase.

d72334c

Namespace string builtins into "str::".

dbf3949

bittrance force-pushed the string-operators branch from bf2bd51 to dbf3949 Compare April 12, 2019 21:48

ISibboI merged commit 2f7d1c2 into ISibboI:master Apr 13, 2019

ISibboI added a commit that referenced this pull request Apr 13, 2019

Improve documentation of builtin functions

83386c0

Relates to #38

ISibboI self-assigned this May 28, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

String operators and builtin functions #38

String operators and builtin functions #38

bittrance commented Apr 7, 2019

bittrance commented Apr 7, 2019

ISibboI commented Apr 8, 2019

ISibboI left a comment

ISibboI commented Apr 8, 2019

bittrance commented Apr 9, 2019

ISibboI commented Apr 12, 2019

bittrance commented Apr 12, 2019

ISibboI commented Apr 13, 2019

ISibboI commented Apr 13, 2019

String operators and builtin functions #38

String operators and builtin functions #38

Conversation

bittrance commented Apr 7, 2019

bittrance commented Apr 7, 2019

ISibboI commented Apr 8, 2019

ISibboI left a comment

Choose a reason for hiding this comment

ISibboI commented Apr 8, 2019

bittrance commented Apr 9, 2019

ISibboI commented Apr 12, 2019

bittrance commented Apr 12, 2019

ISibboI commented Apr 13, 2019

ISibboI commented Apr 13, 2019