Unexpected behavior using `str_sub` when having bigger or smaller start/end values than the minimum/maximum length of the 'subsetted' string
#547
Comments
Could you please rework your reproducible example to use the reprex package? That makes it easier to see both the input and the output, formatted in such a way that I can easily re-run in a local session.
Hello Hadley, thanks for checking this. I rewrote the report using reprex. Hopefully it makes sense.
Required package and test string:
Example 1: shows expected behavior (5 letters before and after properly extracted)
Example 2: truncation after first "M", shows unexpected behavior
I would consider the
Example 3 and 4: same unexpected behavior but with truncation at the 3rd or 4th element of the string
The results of the examples above seem to be related to the
Example 5: shows expected behavior when truncation happens after the 4th element of the string
This seems to be because in this case, the
Simplification of what I define as unexpected
Based on these code tests, it seems like a negative input in the
Therefore, for the following code examples
I would expect the same result as:
Session info
Looks like the problem is that I've failed to document what happens with negative integers: they count back from the right-hand side of the string. This might not be the most intuitive behaviour for your use case, but it's useful in general, and anyway is too late to change now.
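To illustrate the point about negative integers, here is a minimal sketch. The string `"MEGUSTAJUGAR"` is an assumption made for illustration only; it is not necessarily the `string_test` from the original report.

```r
library(stringr)

# Hypothetical 12-character test string (an assumption, not the reporter's data)
x <- "MEGUSTAJUGAR"

# Positive indices count from the left-hand side
before <- str_sub(x, 3, 7)     # characters 3 to 7

# Negative indices count back from the right-hand side
after <- str_sub(x, -5, -1)    # the last five characters

# Mixing the two: start = -3 resolves to position 10, end = 1 is position 1.
# The start lies after the end, so the result is an empty string.
mixed <- str_sub(x, -3, 1)

print(c(before, after, mixed))
```

This is why computing a start as `site - 5` returns `""` once the value drops below zero: the negative start is read as an offset from the end of the string, not as "before the beginning".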
* Better documentation for `start` and `end`. Fixes #547
* Add test for empty strings
* Check `value` length and add test
Dear tidyverse team,
I think I have found an unexpected behavior in `str_sub` that I want to report, because I didn't find anything like this in the issue section.
Imagine we have the following string:
I want to be able to define a truncation site based on a substring (i.e., `"JUGAR"` in my example), and use that information to get the 5 letters before and after the truncation site. In this case, the truncation site would be before the first `"J"`, so I would expect the 5 letters after the truncation to be `"JUGAR"` and the 5 letters before the truncation to be `"GUSTA"`. This works properly in the 1st example, but it doesn't when the truncation site is closer to the beginning of `string_test`.
Hopefully I can illustrate this better with the two examples below.
Example 1: shows expected behavior (5 letters before and after properly extracted)
Nevertheless, when the 'truncation site' is just at `start == 2` of `string_test`, I get an empty result, instead of the expected behavior of getting the letter at position `start == 1`. See the example code:
Example 2: truncation after first "M", shows unexpected behavior
As you can see, I get `""` instead of `"M"`, which is the only letter before the 'truncation site'. I would expect to get `"M"` if it is the only letter before my 'truncation site'.
I would define this as unexpected behavior, but please let me know if I am missing something.
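One possible way to get the result described above is to clamp the computed start at 1, so it never becomes zero or negative. The sketch below is only an illustration: `chars_before` is a hypothetical helper, not part of stringr, and the input strings are assumptions rather than the `string_test` from this report.

```r
library(stringr)

# Hypothetical helper (not a stringr function): return up to `n` characters
# immediately before the first occurrence of `site` in `x`.
chars_before <- function(x, site, n = 5) {
  # str_locate() returns a matrix with "start" and "end" columns
  site_start <- unname(str_locate(x, fixed(site))[1, "start"])
  # No match, or the site is at the very beginning: nothing precedes it
  if (is.na(site_start) || site_start == 1) {
    return("")
  }
  # Clamp at 1 so a short prefix is not read as an offset from the end
  from <- max(site_start - n, 1)
  str_sub(x, from, site_start - 1)
}

print(chars_before("MEGUSTAJUGAR", "JUGAR"))  # site starts at position 8
print(chars_before("MJUGAR", "JUGAR"))        # site starts at position 2
```

With the clamp in place, the first call returns the five letters before the site and the second returns the single letter `"M"`, which matches the behavior expected in Example 2.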
Thank you very much in advance for taking the time to check this. I will be very happy to receive your feedback on this.
Best wishes,
Miguel
Session info: