Fix path completion when UTF8 char is occurring before match characters. #9227

dhoegh · 2014-12-02T21:02:45Z

Small fix to #8838 so that completion works again on "β $dir_space\\space". cc @tkelman, @ivarne

ivarne · 2014-12-02T21:28:47Z

Looks good to me.

stevengj · 2014-12-03T17:51:07Z

base/REPLCompletions.jl

@@ -188,7 +188,7 @@ function completions(string, pos)
    inc_tag = Base.incomplete_tag(parse(partial , raise=false))
    if inc_tag in [:cmd, :string]
        m = match(r"[\t\n\r\"'`@\$><=;|&\{]| (?!\\)", reverse(partial))
-        startpos = length(partial) - (m == nothing ? 1 : m.offset) + 2
+        startpos = nextind(partial, nextind(partial, length(partial) - m.offset))


Don't you have to handle the m == nothing case?

See also my comment. It's not clear to me why the original code is incorrect, since m.offset is the offset of an ASCII character (a single code unit).

I do not handle `nothing` because the regex should be guaranteed to match, given the `if` at line 189.
Regarding `nextind`:

julia> s = "α C:/"
julia> start = match(r" ", reverse(s)).offset
julia> s[length(s) - start]
'α'

So I need to step past the UTF-xx char. It could be changed to:

nextind(partial, length(partial) - m.offset) + 1

because I know the match char is ASCII.

Doesn't s[endof(s) - start + 1] work fine? Hmm, no, that's not right either because endof gives the last character index, not the last byte index. Seems like it should be s[nextind(s,endof(s)) - start]

The real problem here seems to be the use of length instead of endof, which is wrong because offset is a byte index not a character index. e.g. your patch gives the wrong result 'β' for s="αβ C:/αβ"; start = match(r" ",reverse(s)).offset; s[nextind(s, nextind(s, length(s) - start))]
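To make the character-count versus byte-index distinction concrete, here is a small sketch. It assumes Julia 1.x, where `endof` from this thread is spelled `lastindex`; the variable names are only illustrative and are not part of the patch.

```julia
# Sketch for Julia 1.x; `lastindex` plays the role of `endof` in this thread.
s = "αβ C:/αβ"

nchars   = length(s)      # number of characters
nbytes   = ncodeunits(s)  # number of UTF-8 code units (bytes)
lastchar = lastindex(s)   # byte index at which the last character starts

# The `offset` of a regex match is a byte index into the searched string,
# here the reversed string.
offset = match(r" ", reverse(s)).offset

# Combining a character count (`length`) with a byte offset lands on the
# wrong character, as described in the comment above.
wrong = s[nextind(s, nextind(s, length(s) - offset))]

println((nchars, nbytes, lastchar, offset, wrong))
```

Each of the two-byte characters `α` and `β` adds one to `length(s)` but two to the byte count, so the two quantities drift apart as soon as the string contains non-ASCII text.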

It seems like searching strings in reverse is common enough and so error-prone that we should export a reverseind(s,i) = nextind(s,endof(s)) - i function from Base, and document it along with reverse. Although the implementation would be more complicated if i is not the index of an ASCII character.

@stevengj thank you for the pointer to endof; it was wrong to use length. But it still seems like it should be start = nextind(s, endof(s) - match(r" ",reverse(s)).offset) + 1 to get the substring starting at C.

julia> s = "αβ C:/αβ"
julia> start = nextind(s, endof(s) - match(r" ", reverse(s)).offset) + 1
julia> s[start:end]
"C:/αβ"

I have updated my comment

nextind(s,endof(s)) - offset gives the index of the matched (ASCII) character in s. If you want the index of the character after that, you can use nextind(s,endof(s)) - offset + 1 because the matched character is ASCII.

Your code does not work if the character before the matched character is ASCII:

julia> s = "αβd C:/αβ";
julia> start = nextind(s, endof(s) - match(r" ", reverse(s)).offset) + 1;
julia> s[start:end]
" C:/αβ"

(Notice that the matched character is included in this case, whereas it was not included for s="αβ C:/αβ".)
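As a check on the formula proposed in this review, the sketch below applies it to both test strings. It assumes Julia 1.x (`lastindex` in place of `endof`); `start_after` is a hypothetical helper introduced only for illustration and is not code from the PR. It also compares the result against `reverseind`, the Base function proposed later in this thread.

```julia
# Hypothetical helper (not from the PR): byte index in `s` just past an
# ASCII character that was matched at byte `offset` in `reverse(s)`.
start_after(s, offset) = nextind(s, lastindex(s)) - offset + 1

# One case where the character before the space is non-ASCII, one where it is ASCII.
for (s, expected) in [("αβ C:/αβ", "C:/αβ"), ("αβd C:/αβ", "C:/αβ")]
    offset = match(r" ", reverse(s)).offset
    start = start_after(s, offset)
    @assert s[start:end] == expected
    # `reverseind` maps an index in reverse(s) back to s; adding 1 steps past
    # the matched ASCII space.
    @assert reverseind(s, offset) + 1 == start
end
```

The arithmetic stays entirely in byte indices, so it avoids the O(n) character-index conversions, and the final `+ 1` is only valid because the matched character is a single byte.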

@stevengj I have found a solution: it converts to a character index and then back:

julia> s = "αβ C:/αβ";
julia> start = chr2ind(s, length(s) - ind2chr(reverse(s), match(r" ", reverse(s)).offset) + 2);
julia> s[start:end]
"C:/αβ"

I have updated it.
I forgot to reverse the string.

My solution is faster: ind2chr and chr2ind are O(n) operations. Why don't you want to use nextind(s,endof(s)) - offset + 1?

dhoegh · 2014-12-04T13:55:32Z

@stevengj I have updated it with your proposal and added a UTF-xx string to the path. Sorry, I did not realize that your solution worked; I was pretty worn out when I looked at it yesterday.

stevengj · 2014-12-04T15:06:51Z

LGTM, thanks!

dhoegh · 2014-12-04T15:13:12Z

It should be backported @JuliaBackports.

(cherry picked from commit 9b3a83b) ref: #9227

ivarne · 2014-12-06T21:52:08Z

Backported in 967b37a

ivarne added the REPL and regression labels Dec 2, 2014

stevengj reviewed Dec 3, 2014

Fix path completion when UTF8 char is occurring before match characters.

9b3a83b

dhoegh force-pushed the fix_path_completion_unicode branch from 864ef0d to 9b3a83b Compare December 4, 2014 13:51

stevengj added a commit that referenced this pull request Dec 4, 2014

Merge pull request #9227 from dhoegh/fix_path_completion_unicode

87e9ee1

Fix path completion when UTF8 char is occurring before match characters.

stevengj merged commit 87e9ee1 into JuliaLang:master Dec 4, 2014

ivarne added the backport pending label Dec 4, 2014

stevengj mentioned this pull request Dec 4, 2014

add reverseind(s, i): convert indices in reverse(s) to indices in s #9249

Merged

dhoegh mentioned this pull request Dec 5, 2014

Backport #9227 "Fix path completion when UTF8 char is occurring before match characters." #9253

Closed

tkelman mentioned this pull request Dec 6, 2014

Fix #9209 #9210

Merged

ivarne pushed a commit that referenced this pull request Dec 6, 2014

Fix path completion when UTF8 char is occurring before match characters.

967b37a

(cherry picked from commit 9b3a83b) ref: #9227

ivarne removed the backport pending label Dec 6, 2014

dhoegh deleted the fix_path_completion_unicode branch January 13, 2015 08:13
