builtin: simplify splint_nth methods #21563
5689bad
to
b30c2e2
Compare
Have you measured a performance difference? |
None, that I could think of. I was just curious, since I observed several surprising results yesterday, while testing c9e6a12 . |
Oh yeah I see. It really is good to double check. I also had some surprises in the recent days where I had the same thoughts as you put into yet another PR
|
Co-authored-by: JalonSolov <JalonSolov@gmail.com>
(the program above is produced by mechanically transforming a bit the existing tests for the functions + adding a runner to make it easier to run with different iteration counts) |
I think it is due to this construct:
- is_delim := i + delim.len <= s.len && s.substr(i, i + delim.len) == delim
vs
+ if s[i..i + delim.len] or { break } == delim {
|
The old one generated:
bool is_delim = (int)(i + delim.len) <= s.len &&
    string__eq(string_substr(s, i, (int)(i + delim.len)), delim);
The new one generated:
_result_string _t5 = string_substr_with_check(s, i, (int)(i + delim.len));
if (_t5.is_error) {
    IError err = _t5.err;
    break;
}
if (string__eq(*(string*)&_t5.data, delim)) {
|
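For anyone following along, here is a minimal sketch of the two loop shapes being compared. The function names `count_with_guard` and `count_with_or_block` are hypothetical and not part of vlib; they only count delimiter hits instead of building the result array. Both should visit the same positions. The difference is that the first does an explicit bounds check before slicing, while the second relies on the checked slice returning an error at the end of the string.

```v
// Hypothetical helpers for illustration only; they are not part of vlib.

// Explicit bounds check first, then a plain slice comparison.
fn count_with_guard(s string, delim string) int {
	mut n := 0
	for i := 0; i <= s.len; i++ {
		if i + delim.len <= s.len && s[i..i + delim.len] == delim {
			n++
		}
	}
	return n
}

// Checked slice: the `or` block leaves the loop once the range runs past the end.
fn count_with_or_block(s string, delim string) int {
	mut n := 0
	for i := 0; i <= s.len; i++ {
		if s[i..i + delim.len] or { break } == delim {
			n++
		}
	}
	return n
}

fn main() {
	s := 'a,b,,c'
	assert count_with_guard(s, ',') == 3
	assert count_with_or_block(s, ',') == 3
	// A delimiter that never matches: both variants stop cleanly.
	assert count_with_guard('abc', '--') == 0
	assert count_with_or_block('abc', '--') == 0
}
```

The results are the same either way; only the generated C differs, which is what the timing comparison above is about.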
with:
diff --git vlib/builtin/string.v vlib/builtin/string.v
index bca5674e3..c684d7bc2 100644
--- vlib/builtin/string.v
+++ vlib/builtin/string.v
@@ -953,7 +953,7 @@ pub fn (s string) split_nth(delim string, nth int) []string {
mut start := 0
// Add up to `nth` segments left of every occurrence of the delimiter.
for i := 0; i <= s.len; i++ {
- if s[i..i + delim.len] or { break } == delim {
+ if i + delim.len <= s.len && s[i..i + delim.len] == delim {
if nth > 0 && res.len == nth - 1 {
break
} |
Also adding this patch speeds things up further (by reducing allocations):
diff --git vlib/builtin/string.v vlib/builtin/string.v
index bca5674e3..9cd997cce 100644
--- vlib/builtin/string.v
+++ vlib/builtin/string.v
@@ -927,7 +927,7 @@ pub fn (s string) split_nth(delim string, nth int) []string {
0 {
for i, ch in s {
if nth > 0 && res.len == nth - 1 {
- res << s[i..]
+ res << unsafe { s.substr_unsafe(i, s.len - 1) }
break
}
res << ch.ascii_str()
@@ -941,7 +941,7 @@ pub fn (s string) split_nth(delim string, nth int) []string {
if nth > 0 && res.len == nth - 1 {
break
}
- res << s.substr(start, i)
+ res << unsafe { s.substr_unsafe(start, i) }
start = i + 1
}
}
@@ -953,18 +953,18 @@ pub fn (s string) split_nth(delim string, nth int) []string {
mut start := 0
// Add up to `nth` segments left of every occurrence of the delimiter.
for i := 0; i <= s.len; i++ {
- if s[i..i + delim.len] or { break } == delim {
+ if i + delim.len <= s.len && unsafe { s.substr_unsafe(i, i + delim.len) } == delim {
if nth > 0 && res.len == nth - 1 {
break
}
- res << s.substr(start, i)
+ res << unsafe { s.substr_unsafe(start, i) }
i += delim.len
start = i
}
}
// Then add the remaining part of the string as the last segment.
if nth < 1 || res.len < nth {
- res << s[start..]
+ res << unsafe { s.substr_unsafe(start, s.len - 1) }
}
}
}
@@ -984,7 +984,7 @@ pub fn (s string) rsplit_nth(delim string, nth int) []string {
0 {
for i := s.len - 1; i >= 0; i-- {
if nth > 0 && res.len == nth - 1 {
- res << s[..i + 1]
+ res << unsafe { s.substr_unsafe(0, i + 1) }
break
}
res << s[i].ascii_str()
@@ -998,29 +998,30 @@ pub fn (s string) rsplit_nth(delim string, nth int) []string {
if nth > 0 && res.len == nth - 1 {
break
}
- res << s[i + 1..rbound]
+ res << unsafe { s.substr_unsafe(i + 1, rbound) }
rbound = i
}
}
if nth < 1 || res.len < nth {
- res << s[..rbound]
+ res << unsafe { s.substr_unsafe(0, rbound) }
}
}
else {
mut rbound := s.len
for i := s.len - 1; i >= 0; i-- {
- is_delim := i - delim.len >= 0 && s[i - delim.len..i] == delim
+ is_delim := i - delim.len >= 0
+ && unsafe { s.substr_unsafe(i - delim.len, i) } == delim
if is_delim {
if nth > 0 && res.len == nth - 1 {
break
}
- res << s[i..rbound]
+ res << unsafe { s.substr_unsafe(i, rbound) }
i -= delim.len
rbound = i
}
}
if nth < 1 || res.len < nth {
- res << s[..rbound]
+ res << unsafe { s.substr_unsafe(0, rbound) }
}
}
}
but it may be out of scope for this PR, given its title. |
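As a rough illustration of why the patch above reduces allocations, the sketch below compares `substr` with `substr_unsafe` on a small string. It assumes, as the diff itself implies, that `substr_unsafe` takes an exclusive end index and returns a string that points into the original buffer instead of copying it. The variable names are only for this example.

```v
fn main() {
	s := 'alpha,beta'
	// `substr` allocates a new buffer and copies the bytes of the range.
	copied := s.substr(0, 5)
	// `substr_unsafe` is assumed to return a view into the buffer of `s`, with no copy.
	view := unsafe { s.substr_unsafe(0, 5) }
	assert copied == 'alpha'
	assert view == 'alpha'
	// The view starts at the same address as `s`; the copy lives elsewhere.
	assert view.str == s.str
	assert copied.str != s.str
}
```

Since the view shares memory with `s`, it is only appropriate when the source string outlives the result or the result gets copied later anyway.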
Awesome analysis, thanks for sharing the results in that way! Please feel free to push the changes. |
Thanks, I will in a moment 🙇🏻‍♂️. |
9842175
to
6209c09
Compare
The intermediate string will not be used for anything else, and will be copied anyway by the |
… fix the new allocation in the loop of split_nth
863f7a5
to
0fbb825
Compare
* master:
  cgen: fix array fixed initialization on struct from call (vlang#21568)
  testing: implement a separate `-show-asserts` option, for cleaner test output (`-stats` still works, and still shows both the compilation stats and the asserts) (vlang#21578)
  toml: fix `@[toml: ]`, support `@[skip]` (vlang#21571)
  os: fix debugger_present() for non Windows OSes (vlang#21573)
  builtin: simplify splint_nth methods (vlang#21563)
  net.http: change default http.Server listening address to :9009, to avoid conflicts with tools, that start their own http servers on 8080 like bytehound (vlang#21570)
  os: make minior improvement to C function semantics and related code (vlang#21565)
  io: cleanup prefix_and_suffix/1 util function (vlang#21562)
  os: remove mut declarions for unchanged vars in `os_nix.c.v` (vlang#21564)
  builtin: reduce allocations in s.index_kmp/1 and s.replace/2 (vlang#21561)
  ci: shorten path used for unix domain socket tests (to fit in Windows path limits)
Simplifies the functions for better readability and maintainability.