Performance #4

lrennels · 2021-11-03T17:53:27Z

There should be places where we can substantially improve performance. Ideally we would keep the flexibility in countries and time while also speeding things up. Some steps might be:

remove use of unique
remove use of Query looping functions like @filter
run the Profiler and check for type stability
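
A minimal sketch of what the last step could look like, using only the standard library. `LooseParams`, `TypedParams`, and `first_name` are hypothetical names made up for illustration, not anything from this repo. The point is only that an untyped field makes the inferred return type `Any`, which `@code_warntype` flags and which tends to show up as dynamic dispatch in a profile.

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL
using Profile

# Hypothetical parameter containers (not from this package).
struct LooseParams
    country_names                   # untyped field, so its type is Any
end

struct TypedParams
    country_names::Vector{String}   # concrete field type
end

first_name(p) = first(p.country_names)

loose = LooseParams(["USA", "CAN"])
typed = TypedParams(["USA", "CAN"])

# Prints the inferred types; non-concrete ones (here Any) are highlighted.
@code_warntype first_name(loose)
@code_warntype first_name(typed)

# Collect samples for a hot loop; inspect them afterwards with Profile.print().
Profile.clear()
@profile for _ in 1:1_000_000
    first_name(typed)
end
```

If the profile points at a function, running `@code_warntype` on that call is usually the next thing I would try.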

For example, it may be faster to move away from DataFrames and Query and use lower-level indexing instead, e.g. replacing

subset = g_ssp_datasets[ssp_dict_key] |>
    @filter(_.year == gettime(t) && _.country in p.country_names) |>
    DataFrame

with something like (pseudocode)

df = g_ssp_datasets[ssp_dict_key]
yr_idxs = df.year .== gettime(t) # BitVector (note the broadcast .==)
country_idxs = in.(df.country, Ref(p.country_names)) # BitVector row mask; indexin would return positions in p.country_names instead

and then use those to pare down to the subset, or are those also slow?
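
To make the mask idea concrete, here is a rough, dependency-free sketch with plain column vectors standing in for the dataset. All of the data and the names `years`, `countries`, `gdp`, `country_names`, and `target_year` are made up for illustration; `country_names` plays the role of `p.country_names` and `target_year` the role of `gettime(t)`. One thing to note is that `indexin(a, b)` returns, for each element of `a`, its position in `b` (or `nothing`), so it is not directly a row mask; the sketch uses a broadcast `in` instead. Whether this beats the Query version would still need to be benchmarked.

```julia
# Hypothetical stand-in for one SSP dataset, stored as plain column vectors.
years     = [2020, 2020, 2020, 2025, 2025, 2025]
countries = ["USA", "CAN", "MEX", "USA", "CAN", "MEX"]
gdp       = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

country_names = ["USA", "MEX"]   # plays the role of p.country_names
target_year   = 2025             # plays the role of gettime(t)

# Element-wise comparisons need the dotted (broadcast) forms.
yr_mask      = years .== target_year               # BitVector
country_mask = in.(countries, Ref(country_names))  # BitVector
row_mask     = yr_mask .& country_mask

subset_gdp = gdp[row_mask]

# Optional: build the row lookup once, so each timestep does a Dict lookup
# instead of rescanning every row.
rows_by_year = Dict{Int, Vector{Int}}()
for (i, y) in enumerate(years)
    country_mask[i] && push!(get!(rows_by_year, y, Int[]), i)
end
cached_gdp = gdp[rows_by_year[target_year]]
```

The `rows_by_year` cache only pays off if the country list does not change between timesteps; if it does, the masks have to be recomputed.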

lrennels added the enhancement label (New feature or request) on Aug 8, 2022
