pickerInput is slow to render large amounts of data #184

Open
hlendway opened this issue May 20, 2019 · 42 comments

@hlendway

hlendway commented May 20, 2019

Hello,
I'm using pickerInput in my Shiny apps because of its nice built-in features, i.e. live-search and the actions-box to select/deselect all options. Some of my inputs hold a large amount of data, ~10,000 options. This may seem extreme, but it's manageable with the search features. I'm trying to reduce the load time of my app, and I've found that pickerInputs with large amounts of data are the biggest issue. Compared to a selectizeInput, a pickerInput takes about 12 times as long to load. I understand that the additional features (live-search and actions-box) may add load time, but if I remove them it still takes 10 times as long (in both cases I use virtual scrolling for the pickerInput).

I've attached screenshots of the profile run in R.
Below is the sample app with a pickerInput and selectizeInput you can run to see the performance difference.

I know my case may be extreme, but it would be nice if the performance were closer to that of selectizeInput. I'm not sure whether there's anything in the code that could be optimized. Thanks!

library(shiny)
library(shinydashboard)
library(tidyverse)
library(babynames)
library(shinyjs)
library(shinyWidgets)

names <- babynames::babynames %>% 
  distinct(name) %>% 
  top_n(10000,name) %>% 
  arrange(name) %>% 
  pull(name)

ui <-  dashboardPage(skin = "black",
                     dashboardHeader(
                       title="Test App"
                     ),
                     dashboardSidebar(sidebarMenu(id = "sidebar",
                                                  menuItem("Tab One", tabName = "tabOne", icon = icon("heartbeat")),
                                                  htmlOutput("namesPickerSelect"),
                                                  htmlOutput("namesSelectizeSelect")
                     )),
                     dashboardBody(
                       useShinyjs(),
                       # `functions` lists the JS functions defined in `text`; shinyjs >= 2.0.0 requires it
                       extendShinyjs(text = "shinyjs.resetClick = function() { Shiny.onInputChange('.clientValue-plotly_click-month_select', 'null'); }",
                                     functions = c("resetClick")),
                       tabItems(
                         tabItem(tabName = "tabOne"
                         )
                       )
                     )
)


server <- function(input, output, session) {

  output$namesPickerSelect <- renderUI({

    pickerInput(
      inputId = "namesSelect",
      label = "Names Picker Input :",
      choices = names,
      options = list(
        #`actions-box` = TRUE,
        #`live-search` = TRUE,
        `virtualScroll` = 10,
        size = 10
      ),
      multiple = TRUE
    )

  })

  output$namesSelectizeSelect <- renderUI({

    selectizeInput(
      inputId = "namesSelect2",
      label = "Names Selectize Input :",
      choices = names,
      options = list(
        maxOptions  = 10000
      ),
      multiple = TRUE
    )

  })
  
}

# Run the application 
shinyApp(ui = ui, server = server)

Run WITHOUT actions-box = TRUE & live-search = TRUE:
[screenshot: ProfileWithOUTSearchAndSelectAllEnabled]
Run WITH actions-box = TRUE & live-search = TRUE:
[screenshot: ProfileWithSearchAndSelectAllEnabled]

@pvictor
Member

pvictor commented Jun 5, 2019

Mmmh thanks for reporting that, it's indeed an issue with the R code (updatePickerInput is also very slow). I will work on that.

@hlendway
Author

hlendway commented Jun 5, 2019

That would be excellent if it could be improved. Let me know how I can help, thanks!

@shibahead

I agree. I find pickerInput very useful, but I have the same problem. Please improve this.
selectizeInput has a server option. I wonder whether pickerInput could offer the same option.
I am sorry that my English is so bad.

@daattali
Contributor

Just to confirm, I've tested this with the underlying JavaScript library (bootstrap-select v1.13.0), and it doesn't seem to be slow when using the JavaScript library directly (example). So this does suggest that there may be a performance issue with the R implementation.

@lalitsc12

Hi Daattali,
Sorry for my ignorance, but can you show how to implement this in Shiny?

@pvictor
Member

pvictor commented Sep 13, 2019

It is the (internal) R function pickerSelectOptions() that is slow; it is used by both pickerInput() and updatePickerInput().

pickerSelectOptions <- function(choices, selected = NULL, choicesOpt = NULL, maxOptGroup = NULL) {

So the HTML tag generation is slow, but the time is spent in the R code, not in the HTML itself:

library(shinyWidgets)
choices <- sample.int(1e6, 1e5) # 100,000 choices

system.time({
  mypicker <- pickerInput(
    inputId = "id",
    label = "Label :",
    choices = choices,
    multiple = TRUE
  )
}) # 11.78 sec

@Sbirch556

Has anyone been using any type of workaround for this issue?

@trafficonese

I played a bit with the pickerSelectOptions function and tried to optimize it.
I made a gist with all function versions and some benchmarking.

This is the result with 10,000 choices, with res0 being the original function.

Unit: milliseconds
 expr      min       lq     mean   median       uq       max neval
 res0 649.1555 732.6629 922.2550 851.3989 991.6766 2086.0715    20
 res1 423.4223 503.8548 620.7318 629.1364 700.6460  926.4494    20
 res2 455.5857 542.1930 669.5109 644.3501 757.3877 1245.7816    20
 res3 395.1565 515.5664 575.9230 582.7990 645.5065  710.5663    20
 res4 314.8682 392.8538 474.6054 470.6431 525.3321  737.5034    20

Result with 100,000 choices (I just compared res4 to res0):

Unit: seconds
 expr      min       lq     mean   median       uq      max neval
 res0 6.831709 7.201631 8.163757 7.315676 8.443628 14.86792    20
 res4 3.741216 4.081858 4.316816 4.197558 4.392406  6.41836    20

It's not very much, but at least a tiny bit faster. :)

@pvictor
Member

pvictor commented Sep 30, 2019

I've updated the function to do the same thing as selectInput if choicesOpt = NULL; this should improve performance.
Thanks @trafficonese for your benchmark, I'll look into the fourth option. Is the main difference the dropNulls implementation?

@trafficonese

trafficonese commented Sep 30, 2019

Not only,

  • I removed all :: function calls
  • cumsum(l) is calculated twice, so I saved it in a vector and re-used it
  • created a vector names(choices) outside the lapply part
  • changed matrix(data = c(c(1, cumsum(l)[-length(l)] + 1), cumsum(l)), ncol = 2) to matrix(data = c(1, cs[-length(l)] + 1, cs), ncol = 2). You use c() twice, but I don't think it's necessary.
  • If the label is NULL you don't need to call both HTML() and htmlEscape(), since htmlEscape is expensive. So if label is NULL I just use HTML(NULL)
  • If choicesOpt is empty I have another if-else condition, which only sets value, label and selected
  • changed choice %in% selected to any(choice == selected)
  • faster dropNulls function, as you spotted already
  • removed the return() and wrapped the whole lapply in tagList

And you could also change sapply(choices, length) to unlist(lapply(choices, length)), which I found to be faster although it's actually 2 function calls. I have also read that you should avoid using sapply.

I think that was it :)
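
As a minimal sketch of the cumsum point in the list above (not the package's actual code): the vector `l` of group sizes below is a made-up example, and `cs` follows the name used in the comment. It shows that computing the cumulative sum once gives the same start/end index matrix as the original nested expression.

```r
# Hypothetical group sizes: three option groups with 3, 2 and 4 choices.
l <- c(3L, 2L, 4L)

# Original form: cumsum(l) is evaluated twice and c() is nested.
m_orig <- matrix(data = c(c(1, cumsum(l)[-length(l)] + 1), cumsum(l)), ncol = 2)

# Suggested form: compute the cumulative sum once and reuse it.
cs <- cumsum(l)
m_new <- matrix(data = c(1, cs[-length(l)] + 1, cs), ncol = 2)

# Column 1 holds the first index of each group, column 2 the last index.
print(m_new)

# Both forms build the same matrix.
stopifnot(identical(m_orig, m_new))
```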

@lalitsc12

when will it be updated in the widgets ?

@pvictor
Member

pvictor commented Sep 30, 2019

@lalitsc12 you can install from GitHub to try it out.

@trafficonese thanks for the details, some thoughts on each of your points, in order:

  • (::) I also removed all ::
  • (cumsum) good point
  • (names(choices)) OK; another solution would be to use mapply, but I don't think it makes a difference in performance
  • (matrix) OK
  • (label) label is never NULL, since choices is always a named list
  • (empty choicesOpt) I use another function in that case (latest version on GitHub)
  • (%in%) the difference is negligible, I think
  • (dropNulls) I have to look into your dropNulls function
  • (tagList) OK

@trafficonese

Great!
And yes, keep choice %in% selected, as I think it's way faster anyway in most cases.

@hlendway
Author

hlendway commented Oct 1, 2019

This looks like a great improvement; I can't wait to implement it in my apps. Thank you for your work on this!
Compared to the profile in my original post:
Run WITHOUT actions-box = TRUE & live-search = TRUE: 2720 vs. 560
[screenshot]
Run WITH actions-box = TRUE & live-search = TRUE: 3100 vs. 600
[screenshot]

@trafficonese

@pvictor - Another thing which is much faster.
Instead of
sapply(choices, length)
you can just use
lengths(choices)

But I think most of the time savings will come from the new function selectOptions.

@pvictor
Member

pvictor commented Oct 3, 2019

If I'm not mistaken, lengths is from R 3.2.0, and I want to keep R 3.1.0 as the minimum required version.

@trafficonese

Indeed, lengths came with 3.2.0
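
For illustration only, here is a small sketch of the alternatives discussed above for counting the elements in each group. The `choices` list is a made-up example. `vapply()` does not depend on R 3.2.0, so it is one possible way to avoid `sapply()` while keeping the older minimum version; whether it is measurably faster on real inputs is not shown here.

```r
# Hypothetical grouped choices.
choices <- list(a = c("x", "y", "z"), b = "w", c = c("u", "v"))

n_sapply <- sapply(choices, length)                 # original form
n_unlist <- unlist(lapply(choices, length))         # suggested alternative
n_vapply <- vapply(choices, length, FUN.VALUE = integer(1L))  # type-stable variant

# lengths() needs R >= 3.2.0, so only use it when available.
if (getRversion() >= "3.2.0") {
  n_lengths <- lengths(choices)
  stopifnot(identical(n_lengths, n_vapply))
}

# All variants return the same named integer vector.
stopifnot(identical(n_sapply, n_vapply), identical(n_unlist, n_vapply))
```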

@swnydick

swnydick commented Oct 30, 2019

I ran into this problem with 100,000 options and no choicesOpt. The "mapply" underlying "selectOptions" is very slow (it's essentially running a "for" loop under the hood). It would be much faster to flag list options, run code on just those options, run different code on the non-list options, and then paste everything together.

Also - note that "vapply" is faster than "sapply" and is more reliable because you can specify the type of the function's return value. The code below (which worked on a few test examples, although I didn't test it thoroughly) took 0.36 seconds with 100,000 choices, where the original function took 37.8 seconds. If you can get away from using mapply/lapply, the calls will be much faster.

selectOptions <- function(choices,
                          selected = NULL){
  
  # assumes `choices` is named (the names become the option/optgroup labels)
  # initial vector to store output character strings
  html           <- vector("character", length(choices))
  
  # indicating where to update list elements
  is_list_choice <- vapply(choices, is.list, logical(1L))
  
  # apply function ON list choices and add back to html
  if(any(is_list_choice)){
    list_choices   <- choices[is_list_choice]
    list_html      <- sprintf(
      fmt = '<optgroup label="%s">\n%s\n</optgroup>',
      htmltools::htmlEscape(text      = names(list_choices),
                            attribute = TRUE),
      vapply(list_choices, selectOptions, character(1L), selected = selected)
    )
    html[is_list_choice] <- list_html
  } 
  
  # run on just vector choices and put back into html
  if(any(!is_list_choice)){
    vec_choices <- choices[!is_list_choice]
    vec_html    <- sprintf(
      fmt = '<option value="%s"%s>%s</option>',
      htmltools::htmlEscape(text      = vec_choices,
                            attribute = TRUE),
      c("", " selected")[(vec_choices %in% selected) + 1],
      htmltools::htmlEscape(names(vec_choices))
    )
    html[!is_list_choice] <- vec_html
  }
  
  # paste everything together
  htmltools::HTML(paste(html, collapse = "\n"))
}
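
To make the vectorization idea concrete, here is a minimal base-R sketch. It is a simplified comparison, not the package's code: `values`, `labels` and `selected` are made-up data, and HTML escaping is left out on the assumption that the example strings contain no special characters. It builds the same `<option>` strings once per element and once with a single `sprintf()` call.

```r
# Hypothetical flat choices: option values and their display labels.
values   <- paste0("id", 1:5)
labels   <- paste("Name", 1:5)
selected <- c("id2", "id4")

# Element-by-element: one function call per choice.
opts_loop <- vapply(seq_along(values), function(i) {
  sprintf('<option value="%s"%s>%s</option>',
          values[i],
          if (values[i] %in% selected) " selected" else "",
          labels[i])
}, character(1L))

# Vectorized: a single sprintf() call handles all choices at once.
# (values %in% selected) + 1 indexes "" for FALSE and " selected" for TRUE.
opts_vec <- sprintf('<option value="%s"%s>%s</option>',
                    values,
                    c("", " selected")[(values %in% selected) + 1],
                    labels)

# Both approaches produce the same strings.
stopifnot(identical(opts_loop, opts_vec))
```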

@tanrahman234

Hi! I'm facing this issue as well: pickerInput is slow to render large amounts of data. I have a vector of ~50K records. Has any solution been implemented yet?
I see a lot of great work in this thread on improving performance, but I can't figure out how to implement it. Can somebody guide me, please?

@tylerlittlefield

tylerlittlefield commented Jan 10, 2020

I am struggling with this as well with ~20,000+ choices. I am thinking of using updateSelectizeInput based on advice from Joe Cheng:

library(shiny)
library(dplyr)

baby_names <- babynames::babynames %>% 
  distinct(name) %>%
  .[["name"]] %>% 
  sort()

ui <- fluidPage(
  selectInput("babyname", "Baby Name", multiple = TRUE, choices = character(0))
)

server <- function(input, output, session) {
  updateSelectizeInput(session, "babyname", choices = baby_names, server = TRUE)
}

shinyApp(ui, server)

This is out of my depth, but I wonder if existing code from updateSelectizeInput could be used in updatePickerInput.


Update: When profiling my shiny app, it looked like paste and capture.output were taking a bit of time under the hood. I wonder if captureOutput from R.utils might increase the speed a bit. Some pretty interesting discussions here and here.

@trafficonese

trafficonese commented Feb 13, 2020

  • Nice solution from @swnydick, which is consistently around 10 times faster for me with 500, 5,000 and 50,000 choices and gives identical results.

  • And using captureOutput as suggested by @tyluRp speeds up updatePickerInput quite a bit, and the gain grows with the number of choices (exponentially, or perhaps just quadratically), as explained in the details of captureOutput:

    This method imitates capture.output with the major difference that it captures strings via a raw connection rather than via internal strings. The latter becomes exponentially slow for large outputs [1,2].

    For me it's about 4 times faster with 5,000 choices and about 40 times faster with 50,000 choices.

  • And another thing that might not speed-up updatePickerInput by a lot, but I think there is unnecessary code there in the following 3 lines:

    options = NULL

    options is set to NULL, then it is checked if options is not NULL, and then it is removed from the list with dropNulls.

  • And I'm not sure if the following lapply is necessary:

    options <- lapply(options, function(x) {

    It also works without it for me.
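
As a rough illustration of the raw-connection idea quoted above, here is a base-R sketch. `capture_raw()` is a hypothetical helper written for this example, not the R.utils implementation; it diverts output into a raw connection and converts the bytes to a string once at the end. No timing claims are made for it here.

```r
# Hypothetical helper: capture printed output through a raw connection.
capture_raw <- function(expr) {
  con <- rawConnection(raw(0L), open = "w")
  on.exit(close(con))            # always release the connection
  sink(con)                      # divert output into the raw connection
  # `expr` is a promise, so it is evaluated here, while output is diverted;
  # `finally` restores normal output even if `expr` fails.
  tryCatch(expr, finally = sink())
  rawToChar(rawConnectionValue(con))
}

x <- capture_raw(print(1:3))

# capture.output() returns one string per line, without trailing newlines.
y <- paste(capture.output(print(1:3)), collapse = "\n")

# The raw-connection version keeps the final newline; otherwise they match.
stopifnot(identical(x, paste0(y, "\n")))
```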

@lalitsc12

Has this finally been implemented?

@lalitsc12

Hello,
Is there a way to display only 100 choices, but still let the user search the full set of options?

Thanks
Lalit

@tylerlittlefield

@lalitsc12 You should take a look at shiny::selectizeInput, it allows for server side processing.

@lalitsc12

Hello,
Thank you for your reply. Will this allow me to limit the display in the dropdown to 100 options, but still allow searching the bigger list? Right now my issue is that it takes a long time to update the pickerInput with 1,000,000 options to choose from. So I am looking for an option that shows only 100 in the dropdown list, but can still search all 1,000,000.

Thanks
Lalit

@tylerlittlefield

@lalitsc12 That's correct, you can choose to render only 100 values but then search to render more values. It drastically improves performance.

@lalitsc12

@tyluRp
I have been trying for the last couple of days, but I have not succeeded in speeding up the loading of the options. Can you please post an example of how to achieve this?
Thanks
Lalit

@pvictor
Member

pvictor commented Nov 13, 2020

Here's an example with server-side selectizeInput:

library(shiny)

# choices: 97,310 names
baby_names <- sort(unique(babynames::babynames$name))

# ui
ui <- fluidPage(
  selectizeInput(
    inputId = "ID", 
    label = "Select Something",
    choices = NULL,
    selected = 1
  )
)
# server
server <- function(input, output, session) {
  updateSelectizeInput(
    session = session, 
    inputId = "ID",
    choices = baby_names, 
    server = TRUE
  )
}
# app
shinyApp(ui = ui, server = server)

There's no equivalent with pickerInput, but PRs are welcome.

Victor

@hlendway
Copy link
Author

hlendway commented Aug 31, 2021

@pvictor - do you think you will implement the change suggested by @swnydick (#184 (comment)) at some point? Based on the feedback, it sounds like this really speeds things up, and this remains an outstanding issue for me. I'd be happy to submit the change in a pull request if that helps. I'm not sure whether others have created their own workaround with that code or whether others could still use this fix. Thanks!

@jgsarm-rb

(In reply to @swnydick's comment of Oct 30, 2019, quoted in full in the original post, including the vectorized selectOptions code shown above.)

Hi there! I am running into the same problem, and it looks like the suggestions here have not been implemented yet. Could you please provide a walkthrough on how to implement this?

@swnydick

@jgsarm-rb There are effectively three ways of implementing and testing a change like this:

  1. The "hack" solution in R is to use unlockBinding on the package environment, assign the function to the package environment, and then use lockBinding to lock the package environment again. This isn't recommended because you're effectively patching a package that is already installed, but it can be a nice way of quickly testing a change without having to fork or modify the package itself.
  2. The "temporary" solution is to fork the repo, make the update in your own repo, and then install your version of the package. This is temporary because if the original package has any updates, they will no longer match the forked copy and you will have to sync the fork with the original repo. Moreover, if you update your package from github or CRAN, it will no longer have your change.
  3. The "permanent" solution is to make this change in the repo. This can either be from the developers themselves or by making a pull request to update the repo.
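
The first ("hack") option can be sketched on a toy environment. Everything here is illustrative: `pkg_env` stands in for a package namespace, and the two `selectOptions` functions are placeholders. For a real installed package, one would presumably pass `asNamespace("shinyWidgets")` as the environment, or use `utils::assignInNamespace()`; that is an assumption and is not run here.

```r
# Toy stand-in for a package namespace (hypothetical, not shinyWidgets itself).
pkg_env <- new.env()
assign("selectOptions", function(choices) "slow", envir = pkg_env)
lockBinding("selectOptions", pkg_env)   # package bindings are locked like this

# Placeholder for the replacement function.
patched <- function(choices) "fast"

# While the binding is locked, assignment fails with an error.
attempt <- tryCatch({
  assign("selectOptions", patched, envir = pkg_env)
  "assigned"
}, error = function(e) "locked")

# Unlock, patch, and lock again.
unlockBinding("selectOptions", pkg_env)
assign("selectOptions", patched, envir = pkg_env)
lockBinding("selectOptions", pkg_env)

# The environment now serves the patched function.
stopifnot(identical(pkg_env$selectOptions("a"), "fast"))
```

Such a patch lasts only for the current R session, which is one reason it suits quick testing rather than deployment.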

@jgsarm-rb

@swnydick cool! So I simply have to fork pickerInput's repo here, replace pickerSelectOptions with your own version (selectOptions), and use that repo to install the modified package in my R installation. Did I get that right?

@swnydick

@jgsarm-rb yes that should work, unless there have been any other changes under-the-hood to the package since the original comments that make the package not work anymore with this change.

@GitHunter0

(In reply to @hlendway's question of Aug 31, 2021, asking @pvictor whether the change suggested by @swnydick will be implemented.)

I was wondering the same thing. pickerInput is so good but it is making my apps worryingly slow...

@jgsarm-rb

I did what @swnydick told me to do. It reduced the 50-second processing time to just a fraction of a second. Now the only thing left to speed up is the rendering of the actual dropdown menu. From a whopping 1:58 to render 100k choices, it's now down to 50s.

@GitHunter0

@jgsarm-rb that's great news! I hope the dropdown menu part soon gets optimized as well.

If @pvictor finds it suitable, would you be willing to submit a PR of what you accomplished so far?

@pvictor
Member

pvictor commented Jan 14, 2022

Sure, you can submit a PR with those improvements; just be sure that the results are the same as before.

Moreover, the latest pre-release of bootstrap-select has a source option to add data in JavaScript, which would avoid going through the HTML markup. The downside is that it doesn't work with Bootstrap 3.

@swnydick

I submitted a pull request for the issue. Thanks @pvictor!

@pvictor
Member

pvictor commented Jan 18, 2022

If you need a fast select menu, here's an experiment: https://github.com/dreamRs/shinyvs
Feedback welcome!

@jgsarm-rb

@pvictor Holy cow Victor! I tried this and it works fine. I have to put this into bigger tests with our applications and will report if I encounter any bugs. How do I submit feedback? Posting it in this thread will deviate from the topic at hand.

@pvictor
Member

pvictor commented Jan 18, 2022

Thanks for testing @jgsarm-rb ! You can try the discussion feature in the other repo : https://github.com/dreamRs/shinyvs/discussions

@GitHunter0

(In reply to @pvictor's shinyvs experiment: https://github.com/dreamRs/shinyvs)

@pvictor, this is great, man!
Is there an example of how to customize choices and labels with HTML?
PS: multiple = TRUE is not working for me; I will open an issue there later.

Sipkovandam added a commit to Sipkovandam/shinyWidgets that referenced this issue Apr 16, 2023
in accordance with dreamRs#184 comment by user swnydick