Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Condition Number from get_summary() Appears Inaccurate #128

Open
billdenney opened this issue Dec 25, 2018 · 5 comments
Open

Condition Number from get_summary() Appears Inaccurate #128

billdenney opened this issue Dec 25, 2018 · 5 comments
Labels
Milestone

Comments

@billdenney
Copy link
Contributor

I have a model with the following eigenvalues, and the condition number is reported as 1.8 by get_summary() while it should be 42.5:

NONMEM output:

 ************************************************************************************************************************
 ********************                                                                                ********************
 ********************               FIRST ORDER CONDITIONAL ESTIMATION WITH INTERACTION              ********************
 ********************                      EIGENVALUES OF COR MATRIX OF ESTIMATE                     ********************
 ********************                                                                                ********************
 ************************************************************************************************************************


             1         2         3         4         5         6         7         8         9        10        11        12
             13        14        15        16        17        18        19        20        21        22        23        24
            25

         8.65E-02  1.33E-01  1.92E-01  2.31E-01  2.72E-01  3.00E-01  3.39E-01  4.20E-01  5.69E-01  5.81E-01  6.53E-01  7.52E-01
          7.61E-01  9.33E-01  9.54E-01  9.99E-01  1.12E+00  1.18E+00  1.28E+00  1.48E+00  1.61E+00  1.72E+00  2.19E+00  2.57E+00
         3.68E+00
@marianklose
Copy link

I can also confirm a substantial difference in the condition number reported by xpose::get_summary() and PsN sumo or Piranas output.

Similar picture like reported by @billdenney: The condition number in xpose::get_summary() is way smaller (1.6) than PsN's sumo (521).

That's the xpose function which is calculating the condition number:

sum_condn <- function(model, software, rounding) {
if (software == 'nonmem') {
x <- model %>%
dplyr::filter(.$subroutine == 'lst') %>%
dplyr::slice(which(stringr::str_detect(.$code, stringr::fixed('EIGENVALUES OF COR'))) + 4)
if (nrow(x) == 0) return(sum_tpl('condn', 'na'))
x %>%
dplyr::group_by_at(.vars = 'problem') %>%
tidyr::nest() %>%
dplyr::ungroup() %>%
dplyr::mutate(subprob = 0,
label = 'condn',
value = purrr::map_chr(.$data, function(x) {
stringr::str_trim(x$code, side = 'both') %>%
stringr::str_split(pattern = '\\s+') %>%
purrr::flatten_chr() %>%
as.numeric() %>%
{max(.)/min(.)} %>%
round(digits = rounding) %>%
as.character()})) %>%
dplyr::select(dplyr::one_of('problem', 'subprob', 'label', 'value'))
}
}

Here is my .lst output:

 ************************************************************************************************************************
 ********************                                                                                ********************
 ********************               FIRST ORDER CONDITIONAL ESTIMATION WITH INTERACTION              ********************
 ********************                      EIGENVALUES OF COR MATRIX OF ESTIMATE                     ********************
 ********************                                                                                ********************
 ************************************************************************************************************************
 

             1         2         3         4         5         6         7         8         9        10        11        12
             13        14        15        16        17        18        19        20        21
 
         7.61E-03  4.44E-02  4.82E-02  6.23E-02  1.12E-01  2.08E-01  2.69E-01  3.57E-01  3.93E-01  5.83E-01  5.96E-01  8.08E-01
          9.21E-01  9.59E-01  1.04E+00  1.32E+00  1.50E+00  2.20E+00  2.52E+00  3.08E+00  3.97E+00
 

Largest eigenvalue is 3.97E+00, smallest is 7.61E-03, so the condition number should be 3.97/0.00761 = 521.68 (in line with PsN's sumo). The sum_condn and subsequently xpose::get_summary() report

> sum_condn(model = xpdb$code, software = software, rounding = rounding)
# A tibble: 1 × 4
  problem subprob label value 
    <int>   <dbl> <chr> <chr> 
1       1       0 condn 1.6154

I've just checked the first part of the sum_cond function with my model example and I get this:

> x <- model %>% 
+     dplyr::filter(.$subroutine == 'lst') %>% 
+     dplyr::slice(which(stringr::str_detect(.$code, stringr::fixed('EIGENVALUES OF COR'))) + 4)
> x
# A tibble: 1 × 5
  problem level subroutine code                          comment
    <int> <int> <chr>      <chr>                         <chr>  
1       1    31 lst        " 13 14 15 16 17 18 19 20 21" ""     
> x$code
[1] " 13 14 15 16 17 18 19 20 21"

My educated guess would be at this point that the function works for a small number of parameters but not for a larger number (where we have a linebreak within the eigenvalue report). In the function we have

dplyr::slice(which(stringr::str_detect(.$code, stringr::fixed('EIGENVALUES OF COR'))) + 4)

and the + 4 will only work if we have no linebreak between index (1 2 3 4 5 ...) and Eigenvalues. This is how it looks:

image

In this case the eigenvalues are also not limited to one line. To fix this we would need to detect the index of "EIGENVALUES OF COR", detect the index of something like "Elapsed finaloutput time" and then calculate the eigenvalue lines. Something like this could work:

# Condition number
sum_condn <- function(model, software, rounding) {
  if (software == 'nonmem') {
    x <- model %>% 
      dplyr::filter(.$subroutine == 'lst')

    eigenvalue_index <- which(stringr::str_detect(x$code, stringr::fixed('EIGENVALUES OF COR')))
    finaloutput_index <- which(stringr::str_detect(x$code, stringr::fixed('Elapsed finaloutput time')))
    start <- eigenvalue_index + 3
    end <- finaloutput_index
    n_lines <- end - start
    n_index_lines <- n_lines/2
    eigenvalue_lines <- seq(from = start + n_index_lines, to = end - 1, by = 1)
    
    x <- lst %>% 
      dplyr::slice(as.integer(eigenvalue_lines))
    
    if (nrow(x) == 0) return(sum_tpl('condn', 'na'))
    
    x %>% 
      dplyr::group_by_at(.vars = 'problem') %>% 
      tidyr::nest() %>% 
      dplyr::ungroup() %>% 
      dplyr::mutate(subprob = 0,
                    label = 'condn',
                    value = purrr::map_chr(.$data, function(x) {
                      stringr::str_trim(x$code, side = 'both') %>%  
                        stringr::str_split(pattern = '\\s+') %>% 
                        purrr::flatten_chr() %>% 
                        as.numeric() %>% 
                        {max(.)/min(.)} %>% 
                        round(digits = rounding) %>% 
                        as.character()})) %>% 
      dplyr::select(dplyr::one_of('problem', 'subprob', 'label', 'value'))
  }
}

# test function
sum_condn(model = model, software = 'nonmem', rounding = 3)

which now returns correctly:

> sum_condn(model = model, software = 'nonmem', rounding = 3)
# A tibble: 1 × 4
  problem subprob label value  
    <int>   <dbl> <chr> <chr>  
1       1       0 condn 521.682

But I have realised that not each and every model has a Elapsed finaloutput time directly after the EIGENVALUE section. So this is still something to think about. For xpdb_ex_pk$code we have:

image

Meaning that my code proposal would fail. Any ideas how to solve that issue?

@bguiastr
Copy link
Collaborator

bguiastr commented Jun 2, 2024

@billdenney & @marianklose thank you for looking into this, indeed xpose was only looking at the first row of eigen value which lead to wrong condition numbers for larger models. I pushed a fix you can install the dev version with devtools::install_github('UUPharmacometrics/xpose') and check if it works with your examples. If all is okay I will publish the new release.

@billdenney
Copy link
Contributor Author

I don't have an example that I can easily test.

@marianklose
Copy link

Hi @bguiastr, thanks for putting work into this. I've just had a look into two examples with multiple lines of eigenvalues and in both cases it now calculates the condition number correctly. For me everything is okay now. Thanks a lot!

@JoannaPeng
Copy link

Hi @bguiastr, the same issue was reported back in 2018, but it seems that the condition number reported by summary() is still incorrect. It was reported as 1.1, while it should be 31.2:

NONMEM output:
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants