Error processing ko: object 'ko' not found and WARNINGS? + more than two group columns? + The 'method' column in the 'daa_results_df' data frame contains more than one method? #135

Open
marwa38 opened this issue Jan 27, 2025 · 4 comments
Labels
bug Something isn't working

marwa38 commented Jan 27, 2025

Hello,
I am getting the error "Error processing ko: object 'ko' not found" together with a bunch of warnings, as shown below. Any advice?
I also got the message "There are more than two 'group' columns". I am using Kruskal-Wallis to compare three groups: the pyloric, middle and distal gut regions.
That message was accompanied by the error "The 'method' column in the 'daa_results_df' data frame contains more than one method", although I had already selected a single method earlier.
I have attached the required files in a zip folder.

Unzip dataset.zip first.

metadata <- read_delim("metadata.tsv", delim = "\t", escape_double = FALSE, trim_ws = TRUE) 
metadata_VMV <- metadata %>% filter(Regime == "VMV")

kegg_abundance <- ko2kegg_abundance("abundance_data.tsv")
sample_ids_VMV <- metadata_VMV$sample.ID 
kegg_VMV <- kegg_abundance[, colnames(kegg_abundance) %in% sample_ids_VMV]

daa_results_df_VMV <- pathway_daa(abundance = kegg_VMV,
                              metadata = metadata_VMV,
                              group = "Region",
                              daa_method = "ALDEx2", select = NULL, reference = NULL)
# Filter on the same data frame that is being subset (daa_results_df_VMV)
daa_sub_method_results_df_VMV <- daa_results_df_VMV[daa_results_df_VMV$method == "ALDEx2_Kruskal-Wallace test", ]

daa_sub_method_results_df_VMV <- readRDS("daa_sub_method_results_df_VMV.rds")
daa_annotated_sub_method_results_df_VMV <- pathway_annotation(pathway = "KO", 
+                                                               daa_results_df = daa_sub_method_results_df_VMV, 
+                                                               ko_to_kegg = TRUE)
[2025-01-27 10:37:14] INFO: Starting KEGG annotation process
[2025-01-27 10:37:15] ERROR: Error processing ko00471: object 'ko00471' not found
  [>]  86% | Elapsed:  2s | ETA:  0s
  105/122 (69 B/s/sec) | Processing ko00473[2025-01-27 10:37:16] ERROR: Error processing ko00473: object 'ko00473' not found
  [>]  87% | Elapsed:  3s | ETA:  0s
  [>]  88% | Elapsed:  3s | ETA:  0sko04210
  [>]  89% | Elapsed:  3s | ETA:  0sko00240
  [>]  89% | Elapsed:  3s | ETA:  0sko04011
  [>]  90% | Elapsed:  3s | ETA:  0sko04070
  [>]  91% | Elapsed:  3s | ETA:  0sko04310
  [>]  92% | Elapsed:  3s | ETA:  0sko03420
  [>]  93% | Elapsed:  3s | ETA:  0sko00770
  [>]  93% | Elapsed:  3s | ETA:  0sko00670
  [>]  94% | Elapsed:  3s | ETA:  0sko04112
  [>]  95% | Elapsed:  3s | ETA:  0sko05340
  [>]  96% | Elapsed:  3s | ETA:  0sko00564
  [>]  97% | Elapsed:  3s | ETA:  0sko00680
  [>]  98% | Elapsed:  3s | ETA:  0sko00562
  [>]  98% | Elapsed:  3s | ETA:  0sko03030
  [>]  99% | Elapsed:  3s | ETA:  0sko00561
  [=] 100% | Elapsed:  3s | ETA:  0sko00250
  122/122 (47 B/s/sec) | Processing ko00740
[2025-01-27 10:37:16] INFO: KEGG annotation process completed
Warning messages:
1: In doTryCatch(return(expr), name, parentenv, handler) :
  restarting interrupted promise evaluation
2: In doTryCatch(return(expr), name, parentenv, handler) :
  restarting interrupted promise evaluation
3: In doTryCatch(return(expr), name, parentenv, handler) :
  restarting interrupted promise evaluation
4: In doTryCatch(return(expr), name, parentenv, handler) :
  restarting interrupted promise evaluation

kegg_VMV<- readRDS("kegg_VMV.rds")
metadata_VMV<- readRDS("metadata_VMV.rds")

> p_VMV <- pathway_errorbar(abundance = kegg_VMV, 
+                           daa_results_df = daa_annotated_sub_method_results_df_VMV, 
+                           Group = metadata_VMV$Region, 
+                           p_values_threshold = 0.05, 
+                           order = "pathway_class", select = NULL, 
+                           ko_to_kegg = TRUE, p_value_bar = TRUE, 
+                           colors = NULL, x_lab = "pathway_name")
There are more than two 'group' columns in the 'daa_results_df' data frame. As a result, it is not possible to compute the log2 fold values. The 'p_value_bar' has been automatically set to FALSE.
Error in pathway_errorbar(abundance = kegg_VMV, daa_results_df = daa_annotated_sub_method_results_df_VMV,  : 
  The 'method' column in the 'daa_results_df' data frame contains more than one method. Please filter it to contain only one method.

> sessionInfo()
R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8    

time zone: Europe/Paris
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ALDEx2_1.28.0         zCompositions_1.5.0-4 truncnorm_1.0-9       NADA_1.6-1.1          survival_3.8-3       
 [6] MASS_7.3-64           patchwork_1.3.0       ggprism_1.0.5         lubridate_1.9.4       forcats_1.0.0        
[11] stringr_1.5.1         dplyr_1.1.4           purrr_1.0.2           tidyr_1.3.1           ggplot2_3.5.1        
[16] tidyverse_2.0.0       tibble_3.2.1          ggpicrust2_1.7.4      readr_2.1.5          

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1            farver_2.1.2                Biostrings_2.74.1           fastmap_1.2.0              
 [5] digest_0.6.37               timechange_0.3.0            lifecycle_1.0.4             KEGGREST_1.46.0            
 [9] magrittr_2.0.3              compiler_4.4.2              progress_1.2.3              rlang_1.1.4                
[13] tools_4.4.2                 utf8_1.2.4                  yaml_2.3.10                 knitr_1.49                 
[17] prettyunits_1.2.0           S4Arrays_1.6.0              curl_6.2.0                  bit_4.5.0.1                
[21] DelayedArray_0.32.0         aplot_0.2.4                 abind_1.4-8                 BiocParallel_1.40.0        
[25] withr_3.0.2                 BiocGenerics_0.52.0         grid_4.4.2                  ggh4x_0.3.0                
[29] stats4_4.4.2                multtest_2.62.0             colorspace_2.1-1            scales_1.3.0               
[33] SummarizedExperiment_1.36.0 cli_3.6.3                   rmarkdown_2.29              crayon_1.5.3               
[37] generics_0.1.3              RcppParallel_5.1.9          rstudioapi_0.17.1           httr_1.4.7                 
[41] tzdb_0.4.0                  zlibbioc_1.52.0             splines_4.4.2               parallel_4.4.2             
[45] ggplotify_0.1.2             XVector_0.46.0              matrixStats_1.5.0           yulab.utils_0.1.9          
[49] vctrs_0.6.5                 Matrix_1.7-1                jsonlite_1.8.9              gridGraphics_0.5-1         
[53] IRanges_2.40.1              hms_1.1.3                   S4Vectors_0.44.0            bit64_4.5.2                
[57] glue_1.8.0                  codetools_0.2-20            stringi_1.8.4               gtable_0.3.6               
[61] GenomeInfoDb_1.42.1         GenomicRanges_1.58.0        UCSC.utils_1.2.0            RcppZiggurat_0.1.6         
[65] munsell_0.5.1               pillar_1.10.1               htmltools_0.5.8.1           GenomeInfoDbData_1.2.13    
[69] R6_2.5.1                    vroom_1.6.5                 evaluate_1.0.3              lattice_0.22-6             
[73] Biobase_2.66.0              png_0.1-8                   Rfast_2.1.3                 ggfun_0.1.8                
[77] Rcpp_1.0.13-1               SparseArray_1.6.0           xfun_0.49                   fs_1.6.5                   
[81] MatrixGenerics_1.18.1       pkgconfig_2.0.3  

marwa38 commented Jan 27, 2025

datasets.zip

@PeterUgbanu

Hello @marwa38 ,

The "There are more than two 'group' columns" message from the pathway_errorbar function appears because the function only works for two-group comparisons. So you could just do a two-group comparison.
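
A minimal sketch of restricting the data to two groups before rerunning the analysis. It uses toy stand-ins for metadata_VMV and kegg_VMV; the Region labels ("pyloric", "middle", "distal"), the toy values and the names metadata_toy, abundance_toy and keep_regions are assumptions for illustration, not taken from the attached data.

```r
# Toy stand-ins for metadata_VMV and kegg_VMV (all values are made up).
metadata_toy <- data.frame(
  sample.ID = paste0("S", 1:6),
  Region = rep(c("pyloric", "middle", "distal"), each = 2),
  stringsAsFactors = FALSE
)
abundance_toy <- matrix(
  1:12, nrow = 2,
  dimnames = list(c("ko00010", "ko00020"), metadata_toy$sample.ID)
)

# Keep only the two regions to compare (labels are assumed).
keep_regions <- c("pyloric", "distal")
metadata_pair <- metadata_toy[metadata_toy$Region %in% keep_regions, ]

# Subset the abundance columns in the same order as the metadata rows.
abundance_pair <- abundance_toy[, metadata_pair$sample.ID, drop = FALSE]

# Sanity checks: exactly two groups, and samples line up.
stopifnot(length(unique(metadata_pair$Region)) == 2)
stopifnot(identical(colnames(abundance_pair), metadata_pair$sample.ID))
```

The subsetted objects could then be passed to pathway_daa(abundance = ..., metadata = ..., group = "Region", daa_method = "ALDEx2") as in the original post. With two groups the method labels in the result will probably differ from the Kruskal-Wallis one, so it is worth checking unique(result$method) before filtering.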

As for "Error processing ko: object 'ko' not found": I'm not sure which mapping files ggpicrust2 uses in the ko2kegg_abundance function (I haven't checked). Some KEGG pathways have been updated, so when the KOs are requested through KEGGREST (which ggpicrust2 uses), this error is returned if the pathway cannot be found.
The only part I find disturbing is that ggpicrust2 discards all KOs for which the corresponding pathway could not be found.

marwa38 commented Jan 30, 2025

Thanks so much @PeterUgbanu

marwa38 commented Feb 7, 2025

I wonder whether it is possible to plot more than 30 pathways when there are only a few more than that?

pathway_errorbar(abundance = metacyc_middle,
                 daa_results_df = metacyc_daa_annotated_results_df_middle,
                 Group = metadata_middle$Regime, ko_to_kegg = FALSE,
                 p_values_threshold = 0.05, order = "group",
                 select = NULL, p_value_bar = TRUE, 
                 colors = c("MMV" = "#C71585", "VMV" = "#008080"),
                 x_lab = "description")
