Performance: avoid creation of new Quantity-objects in getOutputValues #462

PavelBal · 2020-12-22T13:44:29Z

In one particular scenario I have identified a bottleneck in the performance of retrieving the output values of simulated results.

I am simulating a steady state of the model. For this, I first retrieve all quantities that are described by an ODE: first all molecules, then all parameters, which I filter down to those that are state variables:

  quantities <- getAllMoleculesMatching("Organism|**", container = simulation)
  # Get all state variable parameters
  allParams <- getAllParametersMatching("**", container = simulation)
  # ... extract param$isStateVariable ...

This already takes almost 4 minutes, and the majority of time is spent on creating the quantities, in particular in creating the quantities' formula objects:

OSPSuite-R/R/quantity.R, lines 93 to 97 in 44d8471:

  formula <- private$wrapExtensionMethod(QUANTITY_EXTENSIONS, "GetFormula")
  private$.formula <- Formula$new(formula)
  if (self$isTable) {
    private$.formula <- TableFormula$new(formula)
  }

Once all quantities are identified (let's call them allQuantities), they are used as outputs in the simulation. Simulating and getting the results takes 53 seconds, of which only 20 seconds are for the simulation itself - the other 30 or so seconds are spent on getting the results. Once again, much of the time is spent on creating Quantity-objects, including here:

OSPSuite-R/R/utilities-simulation-results.R, lines 81 to 85 in 44d8471:

  for (path in paths) {
    quantity <- getQuantity(path, simulationResults$simulation, stopIfNotFound = stopIfNotFound)
    metaData[[path]] <- list(unit = quantity$unit, dimension = quantity$dimension)
    values[[path]] <- simulationResults$getValuesByPath(path, individualIds, stopIfNotFound)
  }

However, creating Quantity-objects is not necessary if a list of quantities is provided instead of paths. By re-writing the method I could improve the time from 53 to 40 seconds - which is not negligible if the scenario is performed multiple times (e.g. in a parameter identification).
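A minimal sketch of that idea, not the actual change from the PR: the function accepts either paths or already-created quantity-like objects and only performs the expensive lookup for paths. `lookupQuantity`, `collectMetaData`, and the `lookups` counter are hypothetical names, and plain lists stand in for real Quantity objects.

```r
# Counter kept in an environment so the lookup function can update it in place.
lookups <- new.env()
lookups$n <- 0

# Hypothetical stand-in for an expensive call such as getQuantity():
# builds a quantity-like list from a path and counts how often it runs.
lookupQuantity <- function(path) {
  lookups$n <- lookups$n + 1
  list(path = path, unit = "mg/l", dimension = "Concentration (mass)")
}

# Accepts a character vector of paths OR a list of quantity-like objects.
# Objects that already exist are used as they are; only paths trigger a lookup.
collectMetaData <- function(quantitiesOrPaths) {
  metaData <- list()
  for (item in quantitiesOrPaths) {
    quantity <- if (is.character(item)) lookupQuantity(item) else item
    metaData[[quantity$path]] <- list(unit = quantity$unit, dimension = quantity$dimension)
  }
  metaData
}
```

Called with paths, this performs one lookup per path; called with a list of objects that were created earlier (as with allQuantities above), it performs none.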

I will submit a PR.


msevestre · 2020-12-22T13:55:36Z

> This already takes almost 4 minutes

That I find very surprising. How large is your model?
Also getAllParametersMatching(**) looks like you want all parameters. Don't we have a more optimized function for this?

PavelBal · 2020-12-22T14:18:56Z

> That I find very surprising. How large is your model?

Yes, this is crazy. Well, the model is quite large - the exported pkml is 50 MB...

getAllMoleculesMatching("Organism|**", container = simulation) takes 20 seconds, with quite some time spent on creating/accessing the formulas (returns 3800 elements):

And getAllParametersMatching(**) is ridiculously slow: it takes 3.6 minutes (returns 37500 elements)!

Retrieving only state variable parameters would be a great improvement, but I did not find how to do it directly.

msevestre · 2020-12-22T14:30:04Z

40k parameters lol
What I mean is getAllParametersMatching(**) will try to match every single path by regex. I think it would be faster to load them all and then filter.

How do I read this perf graph? I see the number of times a method is called but I don't see the overall time.
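A rough sketch of the "load everything, then filter in R" idea, applied to the state-variable case from the first comment. The list below is mock data standing in for whatever a bulk-loading call would return; only the `isStateVariable` field mentioned earlier in the thread and a `path` field are modeled, and the paths are made up.

```r
# Mock parameters (hypothetical data), standing in for real parameter objects.
allParams <- list(
  list(path = "Organism|Liver|Volume", isStateVariable = FALSE),
  list(path = "Organism|Liver|Drug|Amount bound", isStateVariable = TRUE),
  list(path = "Organism|Kidney|GFR", isStateVariable = FALSE)
)

# vapply guarantees a logical vector with one entry per parameter.
isState <- vapply(allParams, function(param) isTRUE(param$isStateVariable), logical(1))

# Keep only the state variables and collect their paths.
stateVariableParams <- allParams[isState]
statePaths <- vapply(stateVariableParams, function(param) param$path, character(1))
```

This only shows the filter itself. If creating the R objects is the expensive part, as the profile suggests, the filter would have to run before those objects are built to save any time.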

PavelBal · 2020-12-22T14:37:28Z

> What I mean is getAllParametersMatching(**) will try to match every single path by regex. I think it would be faster to load them all and then filter

But matching is performed in .NET, right?

OSPSuite-R/R/utilities-entity.R, lines 105 to 122 in 44d8471:

  getAllEntitiesMatching <- function(paths, container, entityType, method = NULL) {
    # Test for correct inputs
    validateIsOfType(container, c(Simulation, Container, Molecule))
    validateIsString(paths)
    validateIsString(method, nullAllowed = TRUE)
    className <- entityType$classname
    if (length(which(names(AllMatchingMethod) == className)) == 0) {
      stop(messages$errorWrongType("entityType", className, names(AllMatchingMethod)))
    }
    task <- getContainerTask()
    method <- method %||% AllMatchingMethod[[className]]
    findEntitiesByPath <- function(path) {
      toObjectType(rClr::clrCall(task, method, container$ref, enc2utf8(path)), entityType)
    }
    return(unify(findEntitiesByPath, paths))

This is fast, definitely not the bottleneck.

> How do I read this perf graph? I see the number of times a method is called but I don't see the overall time

Those are the ms spent on each line. What I am trying to show here is that a great amount of time is spent on formula-related tasks. For each .NET quantity, an R Quantity is created, and "GetFormula" and "self$isTable" (which calls self$formula$isTable) take about 45 seconds each (in total).
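For a coarser view than per-line profiler output, each phase can be timed separately with base R's `system.time`. This is only a sketch: `runSimulationPhase` and `collectResultsPhase` are hypothetical stand-ins that sleep briefly, not functions from the package.

```r
# Hypothetical stand-ins for the two phases discussed in this issue.
runSimulationPhase <- function() {
  Sys.sleep(0.02)
  seq_len(1000)
}
collectResultsPhase <- function(raw) {
  Sys.sleep(0.01)
  cumsum(raw)
}

# system.time evaluates the expression in the calling environment,
# so the assignments inside it remain visible afterwards.
simulationTime <- system.time(raw <- runSimulationPhase())
resultsTime <- system.time(results <- collectResultsPhase(raw))

# Wall-clock seconds per phase.
timings <- c(
  simulation = simulationTime[["elapsed"]],
  results = resultsTime[["elapsed"]]
)
print(round(timings, 3))
```

Comparing the two elapsed values gives the kind of split quoted in the first comment (simulation versus getting the results) without needing to read a profile.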

msevestre · 2020-12-22T14:42:07Z

self$isTable is 45s ?? WHAT? We are accessing a boolean variable here

msevestre · 2020-12-22T14:43:48Z

There is clearly some stuff to explore. self$isTable just calls self$formula$isTable. This could be cached. However, it seems to be called only once anyway. Maybe the call in .NET takes time (I don't understand how).
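A sketch of what such caching could look like, using a plain closure instead of the package's R6 classes. `makeQuantity` and `slowFetch` are hypothetical, with `slowFetch` standing in for the call into .NET. As noted above, this would not help if the value really is read only once per object.

```r
# Builds a quantity-like object whose isTable() fetches the value on first
# use and returns the stored value afterwards.
makeQuantity <- function(fetchIsTable) {
  cached <- NULL
  list(
    isTable = function() {
      # is.null (rather than a truthiness check) so that a cached FALSE
      # is not fetched again.
      if (is.null(cached)) {
        cached <<- fetchIsTable()
      }
      cached
    }
  )
}

# Hypothetical slow call that counts its invocations.
calls <- new.env()
calls$n <- 0
slowFetch <- function() {
  calls$n <- calls$n + 1
  FALSE
}

quantity <- makeQuantity(slowFetch)
quantity$isTable()
quantity$isTable()
calls$n  # 1: the second access used the cached value
```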

PavelBal mentioned this issue Dec 22, 2020

Do not create new quantity objects if utilities-simulation-results if… #463

Merged

msevestre closed this as completed in ea404b0 Jan 5, 2021

PavelBal mentioned this issue May 10, 2021

Using hash for performance improvement #517

Closed
