Add initial pipeline completion support #530

renkun-ken · 2021-01-21T04:40:12Z

What problem did you solve?

Closes #323

This PR implements a simple function getPipelineCompletionItems() based on extendSelection so that the beginning symbol of the pipeline expression is extracted and its names() stored in globalenv.json is used to provide live completion items in user session.

Note that for subsequent pipeline expressions such as mutate() and summarize(), new variables may be introduced. REditorSupport/languageserver#324 should be helpful in this case to provide token completion for these newly introduced variables even if the code is not executed but those symbols already appear in the code.

(If you have)Screenshot

(If you do not have screenshot) How can I check this pull request?

In the following example, tbl1 is created and saved to disk. The language server should be able to capture all these symbols in tbl1 if they appear in the code rather than comments. After storing tbl1 to disk, we could comment out the tbl1 code and let session watcher completion provider work to capture tbl1 in the pipeline expression and provide its names in the whole expression.

We could test the completions in each mutate() and select() call.

library(dplyr)
library(tibble)

# tbl1 <- tibble(
#   Name = c("Product1", "Product2", "Product3"),
#   Quality = c("Good", "Good", "Best"),
#   Score = c(8, 9, 10),
#   Type = c("B", "B", "A")
# )
# saveRDS(tbl1, "tbl1.rds")

tbl1 <- readRDS("tbl1.rds")

# typical pipeline without assignment
tbl1 %>%
  mutate(Score_mean = mean(Score)) %>%
  select(Score)

# typical pipeline with assignment
tbl2 <- tbl1 %>%
  mutate(Score_mean = mean(Score)) %>%
  select(Score)

# one-lined pipeline
tpl3 <- tbl1 %>% mutate(Name2 = Name) %>% select(Score)

# mixed case
tpl3 <- tbl1 %>% mutate(Name2 = Name) %>% 
  select(Score)

renkun-ken · 2021-01-21T05:18:52Z

getBracketCompletionItems and getPipelineCompletionItems may generate duplicate completion items. Looks like we need to remove those duplicates.

renkun-ken · 2021-01-21T05:25:38Z

Also this does not handle the following

tbl2 <- tbl1 %>%
  select(Name, Score)

Looks like we need to extract the symbol before the pipe operator or after the assignment operator.

ManuelHentschel · 2021-01-22T15:13:31Z

The example seems to work nicely for me (small typo: l is missing in # saveRDS(tb1...).

Have you considered checking the function name to determine if the column names are suggested? As far as I'm aware, the pipeline operator can also be used for other functions like lapply() which do not automatically attach the dataframe (and hence don't make the column names usable by themselves). On the other hand, functions like mutate, select etc. can also be called with normal syntax, e.g. tbl2 <- select(tbl1, Score), which does not trigger the current pipeline-completion.

The current approach is probably fine for most users, though, and still a significant improvement!

renkun-ken · 2021-01-24T10:21:37Z

I change the algorithm to searching the pipe token before any pipe operator like %>%, |> in the first non-empty code line detected by extendSelection() so that the following cases can be handled properly:

# typical pipeline without assignment
tbl1 %>%
  mutate(Score_mean = mean(Score)) %>%
  select(Score)

# typical pipeline with assignment
tbl2 <- tbl1 %>%
  mutate(Score_mean = mean(Score)) %>%
  select(Score)

# one-lined pipeline
tpl3 <- tbl1 %>% mutate(Name2 = Name) %>% select(Score)

# mixed case
tpl3 <- tbl1 %>% mutate(Name2 = Name) %>% 
  select(Score)

renkun-ken · 2021-01-24T10:30:09Z

Have you considered checking the function name to determine if the column names are suggested?

I guess we don't really need such accurate completions by detecting known NSE functions as RStudio does not seem to do this either. It's hard to know the accurate results unless we do more to make sure if these calls are from the expected namespaces (dplyr::filter() or stats::filter()?)

ManuelHentschel · 2021-01-24T20:28:20Z

I guess we don't really need such accurate completions by detecting known NSE functions as RStudio does not seem to do this either. It's hard to know the accurate results unless we do more to make sure if these calls are from the expected namespaces (dplyr::filter() or stats::filter()?)

Sounds good :)

danielbasso · 2021-01-25T15:38:08Z

Used this feature a lot this weekend, and it is already way better than not having anything.

getBracketCompletionItems and getPipelineCompletionItems may generate duplicate completion items. Looks like we need to remove those duplicates.

This is the only "problem" I encountered so far. Sometimes the autocomplete can become a lot messy by the sheer number of duplicates:

I personally wouldn't mind don't having the 'abc' suggestions, just the object ones (blue outlined solid), even if this means that sometimes variables created by mutate() or not assigned dataframes won't show up. I think it would become cleaner.

Aside from that, it works great. The autocomplete doesn't need to be perfect, just need to help, as now it is already doing.

renkun-ken · 2021-01-25T16:59:04Z

@danielbasso Thanks for testing!

In fact, these duplicates come from different sources. The "abc" items come from token completion in languageserver which means these tokens appear in your code (not necessarily executed) while blue-box items combe from vscode-R pipeline completion which detects the symbol being piped and find its names() from the global environment of user session provided by session watcher.

Currently, it seems we don't have a way to remove duplicates of completion items from difference sources, but I guess I could add an option to languageserver so that you could turn off token completion?

danielbasso · 2021-01-26T20:42:53Z

Currently, it seems we don't have a way to remove duplicates of completion items from difference sources, but I guess I could add an option to languageserver so that you could turn off token completion?

@renkun-ken

Well, if it doesn't take much of your time, this would be nice! I just don't want to cause extra work for you.

The token completion from languageserver is cool, but sometimes it kinda gets out of control with 100+ suggestions. In my previous example, 8 lines of code produced 17 suggestions (Even "c" is suggested hahaha).

Add initial pipeline completion support

3f9cfe8

renkun-ken mentioned this pull request Jan 21, 2021

Expand Session watcher to help auto complete with NSE (tidyverse syntax) #323

Closed

Update completion triggers

bcb667a

renkun-ken marked this pull request as draft January 21, 2021 16:29

Improve getPipelineCompletionItems

1213594

renkun-ken marked this pull request as ready for review January 24, 2021 10:17

renkun-ken requested a review from ManuelHentschel January 24, 2021 14:47

Update getWordRangeAtPosition not work with backtick

8a7fe4f

ManuelHentschel approved these changes Jan 24, 2021

View reviewed changes

renkun-ken merged commit 2d378dd into REditorSupport:master Jan 25, 2021

renkun-ken mentioned this pull request Jan 27, 2021

Improve token completion REditorSupport/languageserver#372

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add initial pipeline completion support #530

Add initial pipeline completion support #530

renkun-ken commented Jan 21, 2021 •

edited

Loading

renkun-ken commented Jan 21, 2021

renkun-ken commented Jan 21, 2021

ManuelHentschel commented Jan 22, 2021

renkun-ken commented Jan 24, 2021

renkun-ken commented Jan 24, 2021

ManuelHentschel commented Jan 24, 2021

danielbasso commented Jan 25, 2021 •

edited

Loading

renkun-ken commented Jan 25, 2021

danielbasso commented Jan 26, 2021

Add initial pipeline completion support #530

Add initial pipeline completion support #530

Conversation

renkun-ken commented Jan 21, 2021 • edited Loading

renkun-ken commented Jan 21, 2021

renkun-ken commented Jan 21, 2021

ManuelHentschel commented Jan 22, 2021

renkun-ken commented Jan 24, 2021

renkun-ken commented Jan 24, 2021

ManuelHentschel commented Jan 24, 2021

danielbasso commented Jan 25, 2021 • edited Loading

renkun-ken commented Jan 25, 2021

danielbasso commented Jan 26, 2021

renkun-ken commented Jan 21, 2021 •

edited

Loading

danielbasso commented Jan 25, 2021 •

edited

Loading