DoHMRF fails at some cases #162

Open
Chengwei94 opened this issue Dec 13, 2021 · 4 comments
Comments

@Chengwei94

Hi,

doHMRF fails in certain cases when I run it. The failures seem random: for example, it fails to cluster when my k is 11 but works when my k is 12. This is the code that I am running:

HMRF_spatial_genes <- doHMRF(st.giotto,
  expression_values = "scaled",
  spatial_genes = my_spatial_genes, k = num_clusters,
  spatial_network_name = "spatial_network",
  dimensions_to_use = 1:15,
  betas = c(40, 0, 1),
  numinit = 100,
  seed = 103,
  output_folder = file.path(output_path, paste0(num_clusters, "test_HMRF"))
)

It seems that some dampening factors fail to be calculated, and Giotto doesn't account for that scenario. This is the error that is shown:

[1] "Loaded smfishHmrf"
[1] 0.001972318
[1] 0.001972318
[1] 0.001972318
[1] 0.001972318
[1] 0.001972318
[1] 0.001972318
[1] 0.001972318
[1] 0.001972318
[1] 0.001972318
[1] 0.001972318
[1] 0.001972318
[1] "dampen factor 0.245633644062211"
[1] "dampen factor 0.562996811658807"
[1] "dampen factor 0.270811092578588"
[1] "dampen factor 0.420116889117133"
[1] "dampen factor 0.329172575598602"
[1] "dampen factor 0.329172575598602"
[1] "dampen factor 0.792193051639228"
[1] "dampen factor 0.362912764597459"
[1] "dampen factor 0.270811092578588"
Error in cov(y[lclust[[i]], ]) :
supply both 'x' and 'y' or a matrix-like 'x'
Execution halted

My sessioninfo is attached:

R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_SG.UTF-8 LC_NUMERIC=C LC_TIME=en_SG.UTF-8 LC_COLLATE=en_SG.UTF-8
 [5] LC_MONETARY=en_SG.UTF-8 LC_MESSAGES=en_SG.UTF-8 LC_PAPER=en_SG.UTF-8 LC_NAME=C
 [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_SG.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods base

other attached packages:
 [1] Matrix_1.3-4 patchwork_1.1.1 stringr_1.4.0 ggplot2_3.3.5
 [5] BayesSpace_1.1.4 SingleCellExperiment_1.12.0 SummarizedExperiment_1.20.0 Biobase_2.50.0
 [9] GenomicRanges_1.42.0 GenomeInfoDb_1.26.7 IRanges_2.24.1 S4Vectors_0.28.1
[13] BiocGenerics_0.36.1 MatrixGenerics_1.2.1 matrixStats_0.61.0 Giotto_1.1.0

loaded via a namespace (and not attached):
  [1] utf8_1.2.2 reticulate_1.22 tidyselect_1.1.1 RSQLite_2.2.7 htmlwidgets_1.5.4
  [6] FactoMineR_2.4 grid_4.0.3 BiocParallel_1.24.1 Rtsne_0.15 miscTools_0.6-26
 [11] munsell_0.5.0 codetools_0.2-18 ica_1.0-2 DT_0.20 statmod_1.4.36
 [16] scran_1.18.7 xgboost_1.4.1.1 future_1.22.1 miniUI_0.1.1.1 withr_2.4.2
 [21] colorspace_2.0-2 leaps_3.1 Seurat_4.0.4 ROCR_1.0-11 tensor_1.5
 [26] pbmcapply_1.5.0 listenv_0.8.0 Rdpack_2.1.2 labeling_0.4.2 GenomeInfoDbData_1.2.4
 [31] polyclip_1.10-0 farver_2.1.0 bit64_4.0.5 rhdf5_2.34.0 rprojroot_2.0.2
 [36] coda_0.19-4 parallelly_1.28.1 vctrs_0.3.8 generics_0.1.1 BiocFileCache_1.14.0
 [41] R6_2.5.1 ggbeeswarm_0.6.0 rsvd_1.0.5 locfit_1.5-9.4 hdf5r_1.3.3
 [46] RcppZiggurat_0.1.6 bitops_1.0-7 rhdf5filters_1.2.1 spatstat.utils_2.2-0 cachem_1.0.6
 [51] DelayedArray_0.16.3 assertthat_0.2.1 promises_1.2.0.1 scales_1.1.1 beeswarm_0.4.0
 [56] gtable_0.3.0 beachmat_2.6.4 globals_0.14.0 goftest_1.2-3 sandwich_3.0-1
 [61] rlang_0.4.11 scatterplot3d_0.3-41 splines_4.0.3 lazyeval_0.2.2 spatstat.geom_2.2-2
 [66] yaml_2.2.1 reshape2_1.4.4 abind_1.4-5 Rfast_2.0.3 httpuv_1.6.3
 [71] tools_4.0.3 ellipsis_0.3.2 spatstat.core_2.3-0 RColorBrewer_1.1-2 ggridges_0.5.3
 [76] Rcpp_1.0.7 plyr_1.8.6 sparseMatrixStats_1.2.1 zlibbioc_1.36.0 purrr_0.3.4
 [81] RCurl_1.98-1.5 rpart_4.1-15 deldir_1.0-6 pbapply_1.5-0 viridis_0.6.1
 [86] cowplot_1.1.1 zoo_1.8-9 SeuratObject_4.0.2 ggrepel_0.9.1 cluster_2.1.2
 [91] fs_1.5.2 here_1.0.1 magrittr_2.0.1 data.table_1.14.2 scattermore_0.7
 [96] lmtest_0.9-39 RANN_2.6.1 fitdistrplus_1.1-6 mime_0.12 xtable_1.8-4
[101] mclust_5.4.7 gridExtra_2.3 compiler_4.0.3 scater_1.18.6 tibble_3.1.4
[106] KernSmooth_2.23-20 crayon_1.4.2 htmltools_0.5.2 mgcv_1.8-36 later_1.3.0
[111] Formula_1.2-4 tidyr_1.1.4 DBI_1.1.1 dbplyr_2.1.1 MASS_7.3-54
[116] rappdirs_0.3.3 rbibutils_2.2.1 igraph_1.2.7 pkgconfig_2.0.3 flashClust_1.01-2
[121] plotly_4.9.4.1 scuttle_1.0.4 spatstat.sparse_2.0-0 vipor_0.4.5 dqrng_0.3.0
[126] XVector_0.30.0 digest_0.6.28 pracma_2.3.6 sctransform_0.3.2 RcppAnnoy_0.0.19
[131] spatstat.data_2.1-0 leiden_0.3.9 uwot_0.1.11 edgeR_3.34.1 DelayedMatrixStats_1.12.3
[136] maxLik_1.5-2 curl_4.3.2 shiny_1.7.0 lifecycle_1.0.1 nlme_3.1-153
[141] jsonlite_1.7.2 Rhdf5lib_1.12.1 BiocNeighbors_1.8.2 viridisLite_0.4.0 limma_3.46.0
[146] fansi_0.5.0 pillar_1.6.3 lattice_0.20-45 fastmap_1.1.0 httr_1.4.2
[151] survival_3.2-13 glue_1.4.2 smfishHmrf_0.1 png_0.1-7 bluster_1.0.0
[156] bit_4.0.4 stringi_1.7.4 blob_1.2.2 BiocSingular_1.6.0 DirichletReg_0.7-1
[161] memoise_2.0.1 dplyr_1.0.7 irlba_2.3.3 future.apply_1.8.1

 


@bernard2012
Contributor

bernard2012 commented Dec 13, 2021

Hi,
I think you should not be using principal components as input to HMRF:
dimensions_to_use = 1:15,
The input is already specified as "scaled", which is what should be used.
I also recommend trying a range of betas rather than putting all your faith in a single one:
betas = c(40, 0, 1),
could perhaps be changed to:
betas = c(0, 10, 4)
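
For reference, here is a minimal sketch of how the two betas settings above differ. It assumes the three numbers mean c(start_beta, beta_increment, num_betas), as described in the Giotto doHMRF documentation; expand_betas is a hypothetical helper for illustration and is not part of Giotto.

```r
# Hypothetical helper (not part of Giotto): expand a betas triple,
# assuming the meaning c(start_beta, beta_increment, num_betas).
expand_betas <- function(betas) {
  stopifnot(length(betas) == 3, betas[3] >= 1)
  seq(from = betas[1], by = betas[2], length.out = betas[3])
}

single_beta <- expand_betas(c(40, 0, 1))    # a single beta value
several_betas <- expand_betas(c(0, 10, 4))  # four beta values, 10 apart
print(single_beta)
print(several_betas)
```

Under that assumption, the original setting tests only one beta, while the suggested setting tests four.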
We have recently committed updates to HMRF, which I highly encourage you to check out:

https://bitbucket.org/qzhudfci/smfishhmrf-r/src/master/TRANSITION.md

These changes can be seen if you pull the master branch:
install_github("RubD/Giotto")

Contact me further if you have more questions: qian_zhu@dfci.harvard.edu
Best,

@Chengwei94
Author

Hi,

I have tried the new API and still have the same problem. The problem lies in initHMRF_V2:

[1] "dampen factor 0.620703984853835"
[1] "dampen factor 0.917062481403862"
[1] "dampen factor 0.651739184096526"
[1] "dampen factor 0.684326143301353"
[1] "dampen factor 0.917062481403862"
[1] "dampen factor 0.83180270422119"
[1] "dampen factor 0.71854245046642"
[1] "dampen factor 0.83180270422119"
[1] "dampen factor 0.83180270422119"
[1] "dampen factor 0.754469572989741"
[1] "dampen factor 0.651739184096526"
Error in cov(y[lclust[[i]], ]) :
supply both 'x' and 'y' or a matrix-like 'x'
Calls: giotto_visium -> initHMRF_V2 -> cov -> cov

@bernard2012
Contributor

Here are more things you could try to locate the problem:

  • Can you try a different seed in the initHMRF step?
  • How many genes are you using?
  • There may be some very highly correlated genes in your gene list. Maybe try keeping one of them and removing the others.
  • k may be set too high, resulting in some initial clusters being very small. This could also be due to outlier genes in your gene list, which may form their own very small cluster.
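
The checks in this list can be sketched on synthetic data as follows. This is only an illustration under stated assumptions: the matrix y, the 0.95 correlation cutoff, and the use of stats::kmeans as a stand-in for the HMRF initialization are all hypothetical and are not taken from Giotto or smfishHmrf.

```r
# Synthetic stand-in for the data being clustered (rows = observations,
# columns = genes); names and sizes are illustrative assumptions.
set.seed(1)
y <- matrix(rnorm(200 * 6), nrow = 200,
            dimnames = list(NULL, paste0("gene", 1:6)))
y[, "gene2"] <- y[, "gene1"] + rnorm(200, sd = 0.01)  # near-duplicate of gene1

# 1. Flag genes that are almost perfectly correlated with an earlier gene.
gene_cor <- cor(y)
gene_cor[upper.tri(gene_cor, diag = TRUE)] <- 0
high_pairs <- which(abs(gene_cor) > 0.95, arr.ind = TRUE)
redundant_genes <- unique(rownames(gene_cor)[high_pairs[, "row"]])
print(redundant_genes)

# 2. Add one extreme outlier row, then compare k-means cluster sizes across
#    seeds. A cluster with fewer than two members has no usable covariance.
y_out <- y
y_out[1, ] <- y_out[1, ] + 50
k <- 5
for (s in c(100, 101, 102)) {
  set.seed(s)
  sizes <- table(kmeans(y_out, centers = k, nstart = 10)$cluster)
  cat("seed", s, "smallest cluster size:", min(sizes), "\n")
}
```

If the smallest cluster size stays at 1 regardless of the seed, lowering k or removing the outlier is probably more useful than changing the seed.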

@pacificma
Contributor

> Hi,
>
> I have tried the new API and still have the same problem. The problem lies in the init_V2_hmrf
>
> [1] "dampen factor 0.620703984853835"
> [1] "dampen factor 0.917062481403862"
> [1] "dampen factor 0.651739184096526"
> [1] "dampen factor 0.684326143301353"
> [1] "dampen factor 0.917062481403862"
> [1] "dampen factor 0.83180270422119"
> [1] "dampen factor 0.71854245046642"
> [1] "dampen factor 0.83180270422119"
> [1] "dampen factor 0.83180270422119"
> [1] "dampen factor 0.754469572989741"
> [1] "dampen factor 0.651739184096526"
> Error in cov(y[lclust[[i]], ]) :
> supply both 'x' and 'y' or a matrix-like 'x'
> Calls: giotto_visium -> initHMRF_V2 -> cov -> cov

It seems the error comes from calculating the covariance of a single observation, which means that cluster contained only one member. This is likely caused by a K that is large compared to the size of the data. Can you provide some more information about your data, i.e. the size of my_spatial_genes and the dimensions of your dataset (number of cells and number of genes)?
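
A minimal base-R sketch of this failure mode, using a toy matrix. The names y and lclust are borrowed from the error message; the data and cluster membership are made up for illustration and are not the smfishHmrf internals.

```r
# Toy matrix: 5 observations (rows) by 3 features (columns).
y <- matrix(as.numeric(1:15), nrow = 5)

# Hypothetical cluster membership: the second cluster has a single member.
lclust <- list(c(1, 2, 3, 5), 4)

# Try cov() on each cluster subset and record "ok" or the error message.
msgs <- sapply(lclust, function(idx) {
  tryCatch({
    cov(y[idx, ])
    "ok"
  }, error = function(e) conditionMessage(e))
})
print(msgs)

# Selecting a single row drops the result to a plain vector, which cov()
# rejects when no second argument is supplied.
print(is.matrix(y[lclust[[2]], ]))                # FALSE
print(is.matrix(y[lclust[[2]], , drop = FALSE]))  # TRUE

# Guard: report clusters that are too small for a covariance estimate.
too_small <- which(lengths(lclust) < 2)
print(too_small)
```

Keeping the matrix shape with drop = FALSE would avoid this particular error, but a one-member cluster still cannot give a meaningful covariance estimate, so checking cluster sizes first is the safer approach.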
