Error in H5Fopen(file): Unable to open HDF5 file when using subsetArchRProject to create new arrow files #248

Closed
JohnGenome opened this issue Jul 17, 2020 · 28 comments
Labels: bug (Something isn't working), help wanted (Extra attention is needed)

@JohnGenome

Describe the bug
When I used subsetArchRProject to create new arrow files, an error occurred, as shown in the attached KI.txt.

Copying Arrow Files...
Error in .safelapply(seq_along(inArrows), function(x) { :
Error Found Iteration 1 :
        [1] "Error in H5Fopen(file) : HDF5. File accessibilty. Unable to open file.\n"
        <simpleError in H5Fopen(file): HDF5. File accessibilty. Unable to open file.>

KI.txt
The error also occurs when I use only one core. I would be happy to get your comments on this. Thanks.

Copying ArchRProject to new outputDirectory : /asnas/KIR2
Copying Arrow Files...
Error in H5Fopen(file) : HDF5. File accessibilty. Unable to open file.

Session Info
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /software/biosoft/software/python/anaconda3-python3-2018/lib/libblas.so.3.6.0
LAPACK: /software/biosoft/software/python/anaconda3-python3-2018/lib/liblapack.so.3.6.0

locale:
[1] C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] ArchR_0.9.5 magrittr_1.5
[3] rhdf5_2.28.0 Matrix_1.2-17
[5] data.table_1.12.6 SummarizedExperiment_1.14.0
[7] DelayedArray_0.10.0 BiocParallel_1.18.0
[9] matrixStats_0.55.0 Biobase_2.44.0
[11] GenomicRanges_1.36.0 GenomeInfoDb_1.20.0
[13] IRanges_2.18.1 S4Vectors_0.22.0
[15] BiocGenerics_0.30.0 ggplot2_3.2.1

loaded via a namespace (and not attached):
[1] Rcpp_1.0.2 pillar_1.4.4
[3] compiler_3.6.1 XVector_0.24.0
[5] tools_3.6.1 bitops_1.0-6
[7] zlibbioc_1.30.0 BSgenome_1.52.0
[9] lifecycle_0.2.0 tibble_3.0.1
[11] gtable_0.3.0 lattice_0.20-38
[13] pkgconfig_2.0.3 rlang_0.4.6
[15] GenomeInfoDbData_1.2.1 stringr_1.4.0
[17] rtracklayer_1.44.2 withr_2.1.2
[19] dplyr_0.8.3 Biostrings_2.52.0
[21] vctrs_0.2.4 grid_3.6.1
[23] tidyselect_1.0.0 glue_1.4.1
[25] R6_2.4.0 BSgenome.Hsapiens.UCSC.hg19_1.4.0
[27] XML_3.98-1.20 Rhdf5lib_1.6.0
[29] purrr_0.3.3 GenomicAlignments_1.20.1
[31] Rsamtools_2.0.0 scales_1.0.0
[33] ellipsis_0.3.0 assertthat_0.2.1
[35] colorspace_1.4-1 stringi_1.4.3
[37] RCurl_1.95-4.12 lazyeval_0.2.2
[39] munsell_0.5.0 crayon_1.3.4
[41] Cairo_1.5-10

@JohnGenome JohnGenome added the bug Something isn't working label Jul 17, 2020
@hhvu0102

Hi,
I had a similar problem when I ran createArrowFiles(). My input was a fragments.tsv.gz file, and the genome was rn6. The attached file is my log file:
ArchR-createArrows-49244e836d09-Date-2020-07-31_Time-12-16-05.log
The error that stood out to me was:

simpleError in H5Fopen(file, native = native): HDF5. File accessibilty. Unable to open file.

I tried processing H5 files in other contexts and it was ok.

I hope you guys can help me solve this problem. Thanks a lot!

@likelet

likelet commented Aug 3, 2020

I encountered the exact same error with addDoubletScores. Here are my commands and the error:

ArrowFiles <- createArrowFiles(
  inputFiles = inputFiles,
  sampleNames = names(inputFiles),
  filterTSS = 4, #Dont set this too high because you can always increase later
  filterFrags = 1000, 
  addTileMat = TRUE,
  addGeneScoreMat = TRUE
)

doubScores <- addDoubletScores(
  input = ArrowFiles,
  k = 10, #Refers to how many cells near a "pseudo-doublet" to count.
  knnMethod = "UMAP", #Refers to the embedding to use for nearest neighbor search.
  LSIMethod = 1
)

and my errors

ArchR logging to : ArchRLogs/ArchR-addDoubletScores-4ce451c38440-Date-2020-08-03_Time-09-52-07.log
If there is an issue, please report to github with logFile!
2020-08-03 09:52:07 : Batch Execution w/ safelapply!, 0 mins elapsed.
Error in H5Fopen(file, "H5F_ACC_RDONLY", native = native) : 
  HDF5. File accessibilty. Unable to open file.

My env is :

R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /disk/soft/R-3.5.0/lib/libRblas.so
LAPACK: /disk/soft/R-3.5.0/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BSgenome_1.50.0             ArchR_0.9.5                 magrittr_1.5                rhdf5_2.26.2                Matrix_1.2-16               data.table_1.12.0          
 [7] SummarizedExperiment_1.12.0 DelayedArray_0.8.0          BiocParallel_1.14.2         matrixStats_0.54.0          Biobase_2.40.0              ggplot2_3.1.0              
[13] rtracklayer_1.42.2          Biostrings_2.48.0           XVector_0.20.0              GenomicRanges_1.34.0        GenomeInfoDb_1.18.2         IRanges_2.16.0             
[19] S4Vectors_0.20.1            BiocGenerics_0.28.0        

loaded via a namespace (and not attached):
  [1] Seurat_3.2.0             Rtsne_0.15               colorspace_1.4-1         deldir_0.1-28            ggridges_0.5.1           rstudioapi_0.9.0         spatstat.data_1.4-3     
  [8] leiden_0.3.1             listenv_0.7.0            npsurv_0.4-0             ggrepel_0.8.0            codetools_0.2-16         splines_3.5.0            lsei_1.2-0              
 [15] polyclip_1.10-0          jsonlite_1.6             Cairo_1.5-12.2           Rsamtools_1.34.1         ica_1.0-2                cluster_2.1.0            png_0.1-7               
 [22] uwot_0.1.8               shiny_1.2.0              sctransform_0.2.0        compiler_3.5.0           httr_1.4.0               assertthat_0.2.1         lazyeval_0.2.2          
 [29] later_0.8.0              htmltools_0.3.6          tools_3.5.0              rsvd_1.0.1               igraph_1.2.4             gtable_0.3.0             glue_1.3.1              
 [36] GenomeInfoDbData_1.1.0   RANN_2.6.1               reshape2_1.4.3           dplyr_0.8.0.1            Rcpp_1.0.1               spatstat_1.64-1          gdata_2.18.0            
 [43] ape_5.3                  nlme_3.1-141             lmtest_0.9-37            stringr_1.4.0            globals_0.12.4           mime_0.6                 miniUI_0.1.1.1          
 [50] irlba_2.3.3              gtools_3.8.1             XML_3.98-1.19            goftest_1.2-2            future_1.14.0            MASS_7.3-51.1            zlibbioc_1.26.0         
 [57] zoo_1.8-4                scales_1.0.0             promises_1.0.1           spatstat.utils_1.17-0    RColorBrewer_1.1-2       reticulate_1.11.1        pbapply_1.4-0           
 [64] gridExtra_2.3            rpart_4.1-15             stringi_1.4.3            caTools_1.17.1.2         rlang_0.4.0              pkgconfig_2.0.2          bitops_1.0-6            
 [71] lattice_0.20-38          Rhdf5lib_1.4.3           ROCR_1.0-7               purrr_0.3.2              tensor_1.5               GenomicAlignments_1.16.0 patchwork_1.0.1         
 [78] htmlwidgets_1.3          cowplot_0.9.4            tidyselect_0.2.5         RcppAnnoy_0.0.13         plyr_1.8.5               R6_2.4.0                 gplots_3.0.1.1          
 [85] withr_2.1.2              pillar_1.3.1             mgcv_1.8-27              fitdistrplus_1.0-14      survival_2.43-3          abind_1.4-5              RCurl_1.95-4.12         
 [92] tibble_2.1.1             future.apply_1.3.0       crayon_1.3.4             KernSmooth_2.23-16       plotly_4.9.0             grid_3.5.0               digest_0.6.18           
 [99] xtable_1.8-3             tidyr_0.8.3              httpuv_1.5.0             munsell_0.5.0            viridisLite_0.3.0       

@vimarin

vimarin commented Sep 14, 2020

Hi girls and guys, I am rather new to R and ArchR.
I got the same issue when trying to create the Arrow file.
In practice, how did you handle it? Which line of code fixed it?

Thanks in advance and nice day

@rdalbanus

+1 to this when running createArrowFiles in parallel. No issues when running on a single thread.

@jgranja24
Contributor

I have been looking into this more; see pachterlab/sleuth#120. I still don't really know how best to diagnose this issue, beyond recommending that you install a more recent version of rhdf5. The versions I see in this thread are rhdf5_2.28.0, rhdf5_2.30.1, and rhdf5_2.26.2. I currently use rhdf5_2.30.1, and we are trying to put together a stable packrat for users to download for these types of issues. The current Bioconductor version is rhdf5_2.32.4, which may be more stable with respect to these issues. Sorry I am not more helpful at this time.

@jgranja24
Contributor

jgranja24 commented Oct 27, 2020

If any of you have known corrupt ArrowFiles, can you send one of them to archr.devs@gmail.com? I can look into recovering these files or finding a way to stabilize this type of issue. Sorry for the trouble.

@jgranja24
Contributor

I think the issue is related to HDF5 file locking. ArchR disables file locking to speed up these computations in parallel. I wonder if these issues are related to how your operating system handles the file locking procedure. I was able to reproduce this error, but found that just disabling file locking (which ArchR tries to do) worked.

This is based on this type of error --

HDF5. File accessibilty. Unable to open file.

To confirm that this is the type of error you are seeing, try --

h5ls(getArrowFiles(ArchRProj)[1])

This should still list the contents, indicating that the file is not corrupted.

Hope this helps

Jeff
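
The h5ls check above can be wrapped so that a failed open is reported instead of stopping the session. This is only a minimal sketch: it assumes rhdf5 is installed, and `probe_h5` is a hypothetical helper name, not part of ArchR or rhdf5.

```r
# Hypothetical helper: try to list an HDF5/Arrow file and report the outcome.
probe_h5 <- function(path) {
  if (!file.exists(path)) {
    return(list(ok = FALSE, message = "file does not exist"))
  }
  if (!requireNamespace("rhdf5", quietly = TRUE)) {
    return(list(ok = NA, message = "rhdf5 is not installed"))
  }
  tryCatch({
    contents <- rhdf5::h5ls(path)  # returns a data.frame of objects in the file
    list(ok = TRUE, message = sprintf("%d objects listed", nrow(contents)))
  }, error = function(e) {
    # Keep the HDF5 error text so it can be compared with the messages in this thread.
    list(ok = FALSE, message = conditionMessage(e))
  })
}

# Example on a path that does not exist. In an ArchR session, pass an
# Arrow file path instead, e.g. getArrowFiles(ArchRProj)[1].
res <- probe_h5(file.path(tempdir(), "missing.arrow"))
print(res$message)
```

If the listing succeeds but ArchR functions still fail to open the same file, that would be consistent with a locking problem rather than a corrupted file, as described above.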

@jgranja24
Contributor

jgranja24 commented Oct 29, 2020

For createArrowFiles, if you disable subthreading (i.e. subThreading = FALSE), I imagine you will still be able to parallelize Arrow file creation. This just prevents the additional level of parallelization.
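
A call with subthreading disabled might look roughly like the sketch below. The argument names are the ones used in createArrowFiles() calls posted in this thread; the input path, the genome, and the thread count are placeholders, and the call only runs if ArchR is installed and the input file exists.

```r
# Placeholder input; replace with real fragment files.
inputFiles <- c(test = "/path/to/fragments.tsv.gz")

arrow_args <- list(
  inputFiles = inputFiles,
  sampleNames = names(inputFiles),
  addTileMat = TRUE,
  addGeneScoreMat = TRUE,
  subThreading = FALSE  # skip the additional (nested) level of parallelization
)

# Guarded so the sketch is harmless when ArchR or the input is missing.
if (requireNamespace("ArchR", quietly = TRUE) && all(file.exists(inputFiles))) {
  library(ArchR)
  addArchRGenome("hg38")        # placeholder genome
  addArchRThreads(threads = 2)  # placeholder thread count
  ArrowFiles <- do.call(createArrowFiles, arrow_args)
}
```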

@rcorces rcorces closed this as completed Dec 31, 2020
@leeanapeters

Hi, I know this issue is closed, but I am also experiencing it when trying to run getMarkerFeatures. I disabled subthreading as above but am still getting the error. Any help would be much appreciated!

Thanks

@rcorces
Collaborator

rcorces commented Jan 5, 2021

Hi @leeanapeters - Sorry you're having trouble with this. The H5 errors are really hard to track down and we (and many other software developers) are still trying to figure this out. The best I can say is that HDF5 errors are often sporadic and environment specific. If you've tried threads = 1 and disabling subthreading and that doesn't work, it's possible that your HDF5 file got corrupted at some point (through no fault of your own). You can try re-running the analysis from a previous project save and see if that resolves it. Unfortunately, these errors are just not very reproducible and we have not been able to track down any underlying problems.

@xiaosuyu1997

xiaosuyu1997 commented Mar 3, 2021

I guess this issue happens when users run ArchR on an HPC cluster with a shared file system (e.g. I ran into this problem on Lustre, which is configured to disable flock for performance reasons).
There is a workaround in the current release (v1.0.1):

  • set the environment variables HDF5_USE_FILE_LOCKING=FALSE and RHDF5_USE_FILE_LOCKING=FALSE
  • when calling createArrowFiles, pass the "subThreading = F" argument

I think there is still a minor problem in v1.0.1:

  • As the rhdf5 documentation states, calling h5enableFileLocking() changes the environment variable RHDF5_USE_FILE_LOCKING for the rest of the current R session. So even if the user sets RHDF5_USE_FILE_LOCKING=FALSE outside R, the h5enableFileLocking() call inside createArrowFiles overrides the intended behavior. When the user does not pass "subThreading = F" to createArrowFiles(), Arrow files are created correctly, but later function calls that read or write Arrow files fail with the "HDF5. File accessibilty. Unable to open file." error. For example, I ran into this error when calling addDoubletScores.

Many thanks!
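
The two environment variables named in the workaround above can also be set from inside R. The sketch below uses only base R; `disable_hdf5_locking` is a hypothetical helper name. Calling it a second time after createArrowFiles() is my assumption, based on the reset behavior described above, not a documented ArchR procedure.

```r
# Hypothetical helper: set both variables named in the workaround above
# and return their current values.
disable_hdf5_locking <- function() {
  Sys.setenv(
    HDF5_USE_FILE_LOCKING = "FALSE",
    RHDF5_USE_FILE_LOCKING = "FALSE"
  )
  invisible(Sys.getenv(c("HDF5_USE_FILE_LOCKING", "RHDF5_USE_FILE_LOCKING")))
}

# Call once before any Arrow file is touched ...
state <- disable_hdf5_locking()
print(state)

# ... and again after any step suspected of changing the variables
# (for example, createArrowFiles() run without subThreading = FALSE).
state <- disable_hdf5_locking()
```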

@rcorces rcorces reopened this Mar 3, 2021
@rcorces rcorces added the help wanted Extra attention is needed label Mar 3, 2021
@rcorces
Collaborator

rcorces commented Mar 3, 2021

thanks @xiaosuyu1997 - flagging this for @jgranja24

@tianchen2019

Thank you so much! I encountered the same problem and solved it with your solution.

@janeshen91

janeshen91 commented Oct 22, 2021

Just to clarify: if the Arrow files have already been created without subThreading = F, then I can't access the HDF5 files unless I start over with the createArrowFiles() step?

Currently, I'm trying to use an ArchR project created by someone else (I have read permission on their Arrow files), and I am getting the error

"Error in H5Fopen(file) : HDF5. File accessibility. Unable to open file.\n"

I've tried to use h5disableFileLocking() to set the environment variables, but I still get the same error. I'm using the function getGroupBW(), but I've been seeing this error with other functions too.

Thanks in advance for your help

@kirakoko

kirakoko commented Dec 6, 2021

Hey everyone, I am sorry to bother you, but since many people have had this issue I would like to share my error message and ask for help, as I am also new to computational work...

inputFiles <- c('test'= "/outs/atac_fragments.tsv.gz")
inputFiles
test
"/outs/atac_fragments.tsv.gz"
export HDF5_USE_FILE_LOCKING=FALSE, RHDF5_USE_FILE_LOCKING=FALSE
Error: unexpected symbol in "export HDF5_USE_FILE_LOCKING"
HDF5_USE_FILE_LOCKING=FALSE
HDF5_USE_FILE_LOCKING
[1] FALSE
addArchRGenome('hg38')
Setting default genome to Hg38.
addArchRThreads(threads = 1)
Setting default number of Parallel threads to 1.
ArrowFiles <- createArrowFiles(
  inputFiles = inputFiles,
  sampleNames = inputFiles,
  minTSS = 4, #Dont set this too high because you can always increase later
  minFrags = 1000,
  addTileMat = TRUE,
  addGeneScoreMat = TRUE,
  subThreading = FALSE
)
Using GeneAnnotation set by addArchRGenome(Hg38)!
Using GeneAnnotation set by addArchRGenome(Hg38)!
ArchR logging to : ArchRLogs/ArchR-createArrows-47ed44daa662-Date-2021-12-06_Time-11-46-50.log
If there is an issue, please report to github with logFile!
2021-12-06 11:46:50 : Batch Execution w/ safelapply!, 0 mins elapsed.
2021-12-06 11:46:50 : (/outs/atac_fragments.tsv.gz : 1 of 1) Reading In Fragments from inputFiles (readMethod = tabix), 0.001 mins elapsed.
2021-12-06 11:46:50 : (/outs/atac_fragments.tsv.gz : 1 of 1) Tabix Bed To Temporary File, 0.001 mins elapsed.

2021-12-06 11:46:50 : ERROR Found in .tabixToTmp for (/outs/atac_fragments.tsv.gz : 1 of 1)
LogFile = ArchRLogs/ArchR-createArrows-47ed44daa662-Date-2021-12-06_Time-11-46-50.log

<simpleError in H5Fcreate(file): HDF5. File accessibility. Unable to open file.>


2021-12-06 11:46:50 : createArrowFiles has encountered an error, checking if any ArrowFiles completed..
ArchR logging successful to : ArchRLogs/ArchR-createArrows-47ed44daa662-Date-2021-12-06_Time-11-46-50.log

If someone has a solution, I would appreciate it! Thank you.
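
A note on the session above: `export ...` is shell syntax, so R rejects it, and the following `HDF5_USE_FILE_LOCKING=FALSE` only creates an ordinary R variable. Neither line changes the environment that the HDF5 library sees. A small base-R demonstration of the difference:

```r
# Start from a clean state for the demonstration.
Sys.unsetenv("HDF5_USE_FILE_LOCKING")

# This creates an R variable; the process environment is unchanged.
HDF5_USE_FILE_LOCKING <- FALSE
before <- Sys.getenv("HDF5_USE_FILE_LOCKING", unset = NA)
print(before)  # NA: the environment variable is still unset

# This sets the environment variables for the current R process.
Sys.setenv(HDF5_USE_FILE_LOCKING = "FALSE", RHDF5_USE_FILE_LOCKING = "FALSE")
after <- Sys.getenv("HDF5_USE_FILE_LOCKING", unset = NA)
print(after)   # "FALSE"
```

Whether this resolves the H5Fcreate error is not certain. A failure to create a file may also have other causes, such as an output directory that is not writable.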

@rcorces
Collaborator

rcorces commented Mar 16, 2022

@xiaosuyu1997 - I know that your solution is about a year old and I'm sorry that we haven't implemented it yet. I'm trying to tidy up the current development branch (release_1.0.2) to create a stable release and I would like to adequately address this.

Could you clarify your solution?

From what I understand, the primary problem is that ArchR defaults to using h5disableFileLocking() and h5enableFileLocking() which cause problems on certain systems. In createArrowFiles() this can be avoided by setting subThreading = FALSE in the params. But in other functions, h5disableFileLocking() and h5enableFileLocking() are used without an option to set subThreading = FALSE.

A solution would be to set the default value for subThreading to be FALSE and to add the subThreading param to any function that would potentially call h5disableFileLocking() and h5enableFileLocking() and prevent users from calling these functions if subThreading = FALSE. Does that sound right?

Would we need to additionally set default values (FALSE) for the environment variables HDF5_USE_FILE_LOCKING and RHDF5_USE_FILE_LOCKING? This seems like it would also be important given that the default is to use file locking.
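
The proposal above could be sketched as a small guard that toggles locking only when subThreading is TRUE. Everything here is hypothetical and is not ArchR's implementation: `with_optional_unlock` is an invented name, and the `disable`/`enable` arguments stand in for h5disableFileLocking() and h5enableFileLocking() so that the sketch runs without rhdf5.

```r
# Hypothetical guard: only toggle HDF5 file locking when subThreading is TRUE.
# `disable` and `enable` are injected stand-ins for the rhdf5 toggles.
with_optional_unlock <- function(subThreading, work, disable, enable) {
  if (!isTRUE(subThreading)) {
    return(work())  # leave the locking state untouched
  }
  disable()
  on.exit(enable(), add = TRUE)  # restore even if work() fails
  work()
}

# Record which toggles were called, using stand-in functions.
calls <- character(0)
fake_disable <- function() calls <<- c(calls, "disable")
fake_enable  <- function() calls <<- c(calls, "enable")

result <- with_optional_unlock(FALSE, function() 42, fake_disable, fake_enable)
print(calls)  # no toggles recorded when subThreading = FALSE

with_optional_unlock(TRUE, function() 42, fake_disable, fake_enable)
print(calls)  # "disable" then "enable"
```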

@wfma

wfma commented Mar 16, 2022

Sorry to chime in, but I am also having this issue intermittently while running ArchR in Dockerized RStudio (Rocker Project). The issue seems to go away if I completely restart the R session or remove the loaded R project.

I hit the issue while plotting a genome track:

temp <- plotBrowserTrack(
  ArchRProj = project,
  groupBy = "Clusters",
  geneSymbol = "NOX4", # example gene
  upstream = 50000,
  downstream = 50000
)

Error code:

Setting default number of Parallel threads to 1.
ArchR logging to : ArchRLogs/ArchR-plotBrowserTrack-12af62a197-Date-2022-03-16_Time-20-20-04.log
If there is an issue, please report to github with logFile!
2022-03-16 20:20:05 : Validating Region, 0.01 mins elapsed.
GRanges object with 1 range and 2 metadata columns:
      seqnames            ranges strand |     gene_id      symbol
         <Rle>         <IRanges>  <Rle> | <character> <character>
  [1]    chr11 89324356-89589611      - |       50507        NOX4
  -------
  seqinfo: 24 sequences from hg38 genome
2022-03-16 20:20:05 : Adding Bulk Tracks (1 of 1), 0.013 mins elapsed.

************************************************************
2022-03-16 20:20:05 : ERROR Found in .groupRegionSumArrows for scA 
LogFile = ArchRLogs/ArchR-plotBrowserTrack-12af62a197-Date-2022-03-16_Time-20-20-04.log

<simpleError in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'subsetByOverlaps': HDF5. File accessibility. Unable to open file.>

************************************************************

Warning: Error in .logError: Exiting See Error Above
  184: stop
  183: .logError
  182: value[[3L]]
  181: tryCatchOne
  180: tryCatchList
  179: tryCatch
  178: FUN
  177: lapply
  176: .safelapply
  173: .groupRegionSumArrows
  172: .bulkTracks
  171: FUN
  170: lapply
  169: plotBrowserTrack
  168: renderPlot [/home/rstudio/Documents/My Drive/PlaqView_Master/PlaqView_ATAC/app.R#575]
  166: func
  126: drawPlot
  112: <reactive:plotObj>
   96: drawReactive
   83: renderFunc
   82: output$genometrack
    1: runApp

@xiaosuyu1997

xiaosuyu1997 commented Mar 19, 2022

@rcorces
Sorry for late reply.

A solution would be to set the default value for subThreading to be FALSE and to add the subThreading param to any function that would potentially call h5disableFileLocking() and h5enableFileLocking() and prevent users from calling these functions if subThreading = FALSE. Does that sound right?

I think cases where subThreading has to be FALSE are rare (defaulting to TRUE may be better for efficiency?). The most important point may be consistent use of locking within one run (environment variable -> createArrowFiles -> later function calls that use these ArrowFiles); changing it midway may not be a good idea. The documentation should warn about this, especially for network file system users, who may run into these problems.
Checking the environment variables (requiring HDF5_USE_FILE_LOCKING=TRUE + RHDF5_USE_FILE_LOCKING=TRUE + subThreading=TRUE to enable file locking) may be a better idea, because we always want the environment variables to take precedence.
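
One way to check the consistency concern above is to record the locking-related environment variables when the Arrow files are created and compare them later in the run. This is a base-R sketch with hypothetical helper names (`locking_snapshot`, `locking_changed`), not something ArchR provides.

```r
# Hypothetical helpers: snapshot the locking-related environment variables
# and report whether they changed between two points in a run.
locking_vars <- c("HDF5_USE_FILE_LOCKING", "RHDF5_USE_FILE_LOCKING")

locking_snapshot <- function() {
  Sys.getenv(locking_vars, unset = NA)
}

locking_changed <- function(before, after = locking_snapshot()) {
  !identical(before, after)
}

Sys.setenv(HDF5_USE_FILE_LOCKING = "FALSE", RHDF5_USE_FILE_LOCKING = "FALSE")
at_creation <- locking_snapshot()  # e.g. taken just before createArrowFiles()

# Simulate a later step that resets one of the variables.
Sys.unsetenv("RHDF5_USE_FILE_LOCKING")
changed <- locking_changed(at_creation)
if (changed) {
  message("HDF5 file-locking settings changed since the Arrow files were created")
}
```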

@BeyondTheProof

BeyondTheProof commented Apr 22, 2022

I've had the same error when trying to call addGroupCoverages. I've managed to resolve this error by creating a copy of my project and setting threads=1.

@VDD58

VDD58 commented Jul 6, 2022

Hi,
I also have the same HDF5 problem; I am using the human genome.
As previously suggested, I have tried starting from an earlier saved project and setting threads = 1 and/or subThreading = FALSE.
The problem is that I still get the same error message, which is the following:
Error in h5ls(ArrowFile): HDF5. Object header. Can't get value.
Traceback:

  1. addGeneIntegrationMatrix(ArchRProj = proj2.1_integr, useMatrix = "GeneScoreMatrix",
    . matrixName = "GeneIntegrationMatrix", reducedDims = "IterativeLSI",
    . seRNA = seRNA, addToArrow = TRUE, force = TRUE, groupRNA = "differentiation",
    . nameCell = "predictedCell", nameGroup = "predictedGroup",
    . nameScore = "predictedScore", threads = 1)
  2. .safelapply(seq_along(allSamples), function(y) {
    . sample <- allSamples[y]
    . prefix <- sprintf("%s (%s of %s)", sample, y, length(ArrowFiles))
    . .logDiffTime(sprintf("%s Getting GeneIntegrationMatrix From TempFiles!",
    . prefix), tstart, verbose = verbose, logFile = logFile)
    . sampleIF <- lapply(seq_along(h5list), function(x) {
    . if (any(h5list[[x]]$group == paste0("/", sample))) {
    . integrationFiles[x]
    . }
    . else {
    . NULL
    . }
    . }) %>% unlist
    . sampleMat <- lapply(seq_along(sampleIF), function(x) {
    . cellNames <- .h5read(sampleIF[x], paste0(sample, "/cellNames"))
    . mat <- sparseMatrix(i = .h5read(sampleIF[x], paste0(sample,
    . "/i"))[, 1], j = as.vector(Rle(.h5read(sampleIF[x],
    . paste0(sample, "/jValues"))[, 1], .h5read(sampleIF[x],
    . paste0(sample, "/jLengths"))[, 1])), x = .h5read(sampleIF[x],
    . paste0(sample, "/x"))[, 1], dims = c(nrow(featureDF),
    . length(cellNames)))
    . colnames(mat) <- cellNames
    . mat
    . }) %>% Reduce("cbind", .)
    . sampleMat@x <- exp(sampleMat@x) - 1
    . sampleMat <- .normalizeCols(sampleMat, scaleTo = scaleTo)
    . sampleMat <- drop0(sampleMat)
    . rownames(sampleMat) <- paste0(featureDF$name)
    . sampleMat <- sampleMat[, ArchRProj$cellNames[BiocGenerics::which(ArchRProj$Sample ==
    . sample)], drop = FALSE]
    . o <- .createArrowGroup(ArrowFile = ArrowFiles[sample], group = matrixName,
    . force = force)
    . o <- .initializeMat(ArrowFile = ArrowFiles[sample], Group = matrixName,
    . Class = "double", Units = "NormCounts", cellNames = colnames(sampleMat),
    . params = dfParams, featureDF = featureDF, force = force)
    . o <- h5write(obj = dfAll[colnames(sampleMat), "predictionScore"],
    . file = ArrowFiles[sample], name = paste0(matrixName,
    . "/Info/predictionScore"))
    . o <- h5write(obj = dfAll[colnames(sampleMat), "predictedGroup"],
    . file = ArrowFiles[sample], name = paste0(matrixName,
    . "/Info/predictedGroup"))
    . o <- h5write(obj = dfAll[colnames(sampleMat), "predictedCell"],
    . file = ArrowFiles[sample], name = paste0(matrixName,
    . "/Info/predictedCell"))
    . .logDiffTime(sprintf("%s Adding GeneIntegrationMatrix to ArrowFile!",
    . prefix), tstart, verbose = verbose, logFile = logFile)
    . for (z in seq_along(allChr)) {
    . chrz <- allChr[z]
    . .logDiffTime(sprintf("Adding GeneIntegrationMatrix to %s for Chr (%s of %s)!",
    . sample, z, length(allChr)), tstart, verbose = FALSE,
    . logFile = logFile)
    . idz <- BiocGenerics::which(featureDF$seqnames %bcin%
    . chrz)
    . matz <- sampleMat[idz, , drop = FALSE]
    . stopifnot(identical(paste0(featureDF$name[idz]), paste0(rownames(matz))))
    . o <- .addMatToArrow(mat = matz, ArrowFile = ArrowFiles[sample],
    . Group = paste0(matrixName, "/", chrz), binarize = FALSE,
    . addColSums = TRUE, addRowSums = TRUE, addRowVarsLog2 = TRUE,
    . logFile = logFile)
    . rm(matz)
    . if (z%%3 == 0 | z == length(allChr)) {
    . gc()
    . }
    . }
    . 0
    . }, threads = threads)
  3. lapply(...)
  4. FUN(X[[i]], ...)
  5. .createArrowGroup(ArrowFile = ArrowFiles[sample], group = matrixName,
    . force = force)
  6. .summarizeArrowContent(ArrowFile)
  7. h5ls(ArrowFile)

I am definitely running out of ideas so if anybody has a suggestion i would be glad.

ArchR-addGeneIntegrationMatrix-19443462eecb-Date-2022-07-05_Time-19-32-06.log

       ___      .______        ______  __    __  .______      
      /   \     |   _  \      /      ||  |  |  | |   _  \     
     /  ^  \    |  |_)  |    |  ,----'|  |__|  | |  |_)  |    
    /  /_\  \   |      /     |  |     |   __   | |      /     
   /  _____  \  |  |\  \\___ |  `----.|  |  |  | |  |\  \\___.
  /__/     \__\ | _| `._____| \______||__|  |__| | _| `._____|

Logging With ArchR!

Start Time : 2022-07-05 19:32:06

------- ArchR Info

ArchRThreads = 16
ArchRGenome = Hg19

------- System Info

Computer OS = unix
Total Cores = 28

------- Session Info

R version 4.0.1 (2020-06-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /n/app/openblas/0.2.19/lib/libopenblas_core2p-r0.2.19.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] gtable_0.3.0 SingleCellExperiment_1.12.0
[3] pheatmap_1.0.12 ArchR_1.0.1
[5] magrittr_2.0.3 rhdf5_2.34.0
[7] Matrix_1.4-1 data.table_1.14.2
[9] SummarizedExperiment_1.20.0 Biobase_2.50.0
[11] GenomicRanges_1.42.0 GenomeInfoDb_1.26.7
[13] IRanges_2.24.1 S4Vectors_0.28.1
[15] BiocGenerics_0.36.1 MatrixGenerics_1.2.1
[17] matrixStats_0.62.0 ggplot2_3.3.6
[19] sp_1.5-0 SeuratObject_4.1.0
[21] Seurat_4.1.1

loaded via a namespace (and not attached):
[1] uuid_1.1-0 plyr_1.8.7
[3] igraph_1.3.2 repr_1.1.4
[5] lazyeval_0.2.2 splines_4.0.1
[7] BiocParallel_1.24.1 listenv_0.8.0
[9] scattermore_0.8 digest_0.6.29
[11] htmltools_0.5.2 fansi_1.0.3
[13] BSgenome_1.58.0 tensor_1.5
[15] cluster_2.1.0 ROCR_1.0-11
[17] Biostrings_2.58.0 globals_0.15.1
[19] spatstat.sparse_2.1-1 colorspace_2.0-3
[21] ggrepel_0.9.1 dplyr_1.0.9
[23] crayon_1.5.1 RCurl_1.98-1.7
[25] jsonlite_1.8.0 progressr_0.10.1
[27] spatstat.data_2.2-0 survival_3.1-12
[29] zoo_1.8-10 glue_1.6.2
[31] polyclip_1.10-0 zlibbioc_1.36.0
[33] XVector_0.30.0 leiden_0.4.2
[35] DelayedArray_0.16.3 Rhdf5lib_1.12.1
[37] future.apply_1.9.0 abind_1.4-5
[39] scales_1.2.0 DBI_1.1.3
[41] spatstat.random_2.2-0 miniUI_0.1.1.1
[43] Rcpp_1.0.8.3 viridisLite_0.4.0
[45] xtable_1.8-4 reticulate_1.25
[47] spatstat.core_2.4-4 htmlwidgets_1.5.4
[49] httr_1.4.3 RColorBrewer_1.1-3
[51] ellipsis_0.3.2 ica_1.0-2
[53] farver_2.1.0 XML_3.99-0.10
[55] pkgconfig_2.0.3 uwot_0.1.11
[57] deldir_1.0-6 utf8_1.2.2
[59] labeling_0.4.2 tidyselect_1.1.2
[61] rlang_1.0.3 reshape2_1.4.4
[63] later_1.3.0 munsell_0.5.0
[65] tools_4.0.1 cli_3.3.0
[67] generics_0.1.2 ggridges_0.5.3
[69] evaluate_0.15 stringr_1.4.0
[71] fastmap_1.1.0 goftest_1.2-3
[73] fitdistrplus_1.1-8 purrr_0.3.4
[75] RANN_2.6.1 pbapply_1.5-0
[77] future_1.26.1 nlme_3.1-148
[79] mime_0.12 compiler_4.0.1
[81] plotly_4.10.0 png_0.1-7
[83] spatstat.utils_2.3-1 tibble_3.1.7
[85] stringi_1.7.6 RSpectra_0.16-1
[87] rgeos_0.5-9 lattice_0.20-41
[89] IRdisplay_1.1 vctrs_0.4.1
[91] pillar_1.7.0 lifecycle_1.0.1
[93] rhdf5filters_1.2.1 spatstat.geom_2.4-0
[95] lmtest_0.9-40 RcppAnnoy_0.0.19
[97] cowplot_1.1.1 bitops_1.0-7
[99] irlba_2.3.5 rtracklayer_1.50.0
[101] httpuv_1.6.5 patchwork_1.1.1
[103] R6_2.5.1 promises_1.2.0.1
[105] KernSmooth_2.23-17 gridExtra_2.3
[107] parallelly_1.32.0 codetools_0.2-16
[109] gtools_3.9.2.2 MASS_7.3-51.6
[111] assertthat_0.2.1 withr_2.5.0
[113] GenomicAlignments_1.26.0 Rsamtools_2.6.0
[115] sctransform_0.3.3 GenomeInfoDbData_1.2.4
[117] mgcv_1.8-31 grid_4.0.1
[119] rpart_4.1-15 IRkernel_1.3.0.9000
[121] tidyr_1.2.0 BSgenome.Hsapiens.UCSC.hg19_1.4.3
[123] Cairo_1.5-15 Rtsne_0.16
[125] pbdZMQ_0.3-7 shiny_1.7.1
[127] base64enc_0.1-3

------- Log Info

2022-07-05 19:32:07 : Running Seurat's Integration Stuart* et al 2019, 0.011 mins elapsed.

2022-07-05 19:32:07 : Input-Parameters, Class = list

Input-Parameters$: length = 1

1 function (name)
2 .Internal(args(name))

Input-Parameters$ArchRProj: length = 1

Input-Parameters$useMatrix: length = 1
[1] "GeneScoreMatrix"

Input-Parameters$matrixName: length = 1
[1] "GeneIntegrationMatrix"

Input-Parameters$reducedDims: length = 1
[1] "IterativeLSI"

Input-Parameters$seRNA: length = 33477
class: SingleCellExperiment
dim: 6 3129
metadata(0):
assays(4): counts logcounts logcounts_raw scaledval
rownames(6): TSPAN6 TNMD ... C1orf112 FGR
rowData names(18): is_feature_control is_feature_control_MT ...
end_position feature_id
colnames(3129):
VEH_5D_11929_10J10_11202017_blood_plate2_SingleCellB11_S176_rep1_NA
BCAT_BCAT11929_8J10_Blood_SingleCell_F6_S332_rep1_NA ...
DBZ_5D_83146_10F0_11162017_blood_plate2_SingleCellH1_S281_rep1_NA
BCAT83146_8F1_DBZ_5D_111517_blood_plate2_C9_S154_rep1_NA
colData names(118): sampleName Mouse ... NKT_44negNK1_1neg_Th_score_new
TTT
reducedDimNames(3): pagoda_largeVIS pagoda_tSNE
progenitortSNE_no_gmpcmp
altExpNames(0):

Input-Parameters$groupATAC: length = 0
NULL

Input-Parameters$groupRNA: length = 1
[1] "differentiation"

Input-Parameters$groupList: length = 0
NULL

Input-Parameters$sampleCellsATAC: length = 1
[1] 10000

Input-Parameters$sampleCellsRNA: length = 1
[1] 10000

Input-Parameters$embeddingATAC: length = 0
NULL

Input-Parameters$embeddingRNA: length = 0
NULL

Input-Parameters$dimsToUse: length = 30
[1] 1 2 3 4 5 6

Input-Parameters$scaleDims: length = 0
NULL

Input-Parameters$corCutOff: length = 1
[1] 0.75

Input-Parameters$plotUMAP: length = 1
[1] TRUE

Input-Parameters$nGenes: length = 1
[1] 2000

Input-Parameters$useImputation: length = 1
[1] TRUE

Input-Parameters$reduction: length = 1
[1] "cca"

Input-Parameters$addToArrow: length = 1
[1] TRUE

Input-Parameters$scaleTo: length = 1
[1] 10000

Input-Parameters$genesUse: length = 0
NULL

Input-Parameters$nameCell: length = 1
[1] "predictedCell"

Input-Parameters$nameGroup: length = 1
[1] "predictedGroup"

Input-Parameters$nameScore: length = 1
[1] "predictedScore"

Input-Parameters$threads: length = 1
[1] 1

Input-Parameters$verbose: length = 1
[1] TRUE

Input-Parameters$force: length = 1
[1] TRUE

Input-Parameters$logFile: length = 1
[1] "ArchRLogs/ArchR-addGeneIntegrationMatrix-19443462eecb-Date-2022-07-05_Time-19-32-06.log"

2022-07-05 19:32:07 : Checking ATAC Input, 0.017 mins elapsed.
2022-07-05 19:32:07 : Checking RNA Input, 0.017 mins elapsed.
2022-07-05 19:32:11 : Found 20976 overlapping gene names from gene scores and rna matrix!, 0.084 mins elapsed.
2022-07-05 19:32:11 : Creating Integration Blocks, 0.085 mins elapsed.
2022-07-05 19:32:11 : Prepping Interation Data, 0.087 mins elapsed.
2022-07-05 19:32:13 : Computing Integration in 3 Integration Blocks!, 0 mins elapsed.
2022-07-05 19:32:13 : Block (1 of 3) : Computing Integration, 0 mins elapsed.
2022-07-05 19:32:13 : Block (1 of 3) : Identifying Variable Genes, 0.011 mins elapsed.
2022-07-05 19:32:16 : Block (1 of 3) : Getting GeneScoreMatrix, 0.054 mins elapsed.

2022-07-05 19:32:36 : GeneScoreMat-Block-1, Class = dgCMatrix
GeneScoreMat-Block-1: nRows = 2000, nCols = 10848
GeneScoreMat-Block-1: NonZeroEntries = 8004207, EntryRange = [ 0.016 , 462.342 ]
5 x 5 sparse Matrix of class "dgCMatrix"
BCAT10B0#AAACGAATCCTTACGC-1 BCAT10C10#TGAGTCAAGACCAATA-1
TTLL10 . 0.726
TNFRSF18 . 0.879
PLCH2 . .
LINC00982 . .
ARHGEF16 . .
BCAT10C10#AACGTACTCGGTCGAC-1 BCAT7F0#AAATGCCTCGGATGTT-1
TTLL10 . .
TNFRSF18 . .
PLCH2 1.061 .
LINC00982 . .
ARHGEF16 . .
BCAT10B0#TGCTCGTCAAGTGGCA-1
TTLL10 .
TNFRSF18 .
PLCH2 .
LINC00982 .
ARHGEF16 .

2022-07-05 19:32:36 : Block (1 of 3) : Imputing GeneScoreMatrix, 0.385 mins elapsed.

2022-07-05 19:32:36 : addImputeWeights Input-Parameters, Class = list

addImputeWeights Input-Parameters$ArchRProj: length = 1

addImputeWeights Input-Parameters$reducedDims: length = 1
[1] "IterativeLSI"

addImputeWeights Input-Parameters$dimsToUse: length = 30
[1] 1 2 3 4 5 6

addImputeWeights Input-Parameters$scaleDims: length = 0
NULL

addImputeWeights Input-Parameters$corCutOff: length = 1
[1] 0.75

addImputeWeights Input-Parameters$td: length = 1
[1] 3

addImputeWeights Input-Parameters$ka: length = 1
[1] 4

addImputeWeights Input-Parameters$sampleCells: length = 1
[1] 5000

addImputeWeights Input-Parameters$nRep: length = 1
[1] 2

addImputeWeights Input-Parameters$k: length = 1
[1] 15

addImputeWeights Input-Parameters$epsilon: length = 1
[1] 1

addImputeWeights Input-Parameters$useHdf5: length = 1
[1] TRUE

addImputeWeights Input-Parameters$randomSuffix: length = 1
[1] TRUE

addImputeWeights Input-Parameters$threads: length = 1
[1] 1

addImputeWeights Input-Parameters$seed: length = 1
[1] 1

addImputeWeights Input-Parameters$verbose: length = 1
[1] TRUE

addImputeWeights Input-Parameters$logFile: length = 1
[1] "ArchRLogs/ArchR-addGeneIntegrationMatrix-19443462eecb-Date-2022-07-05_Time-19-32-06.log"

2022-07-05 19:32:36 : Computing Impute Weights Using Magic (Cell 2018), 0 mins elapsed.
2022-07-05 19:32:36 : Computing Partial Diffusion Matrix with Magic (1 of 2), 0 mins elapsed.
2022-07-05 19:32:52 : Computing Partial Diffusion Matrix with Magic (2 of 2), 0.259 mins elapsed.
2022-07-05 19:33:07 : Completed Getting Magic Weights!, 0.511 mins elapsed.

2022-07-05 19:33:07 : imputeMatrix Input-Parameters, Class = list

imputeMatrix Input-Parameters$mat: nRows = 2000, nCols = 10848
imputeMatrix Input-Parameters$mat: NonZeroEntries = 8004207, EntryRange = [ 0.016 , 462.342 ]
5 x 5 sparse Matrix of class "dgCMatrix"
BCAT10B0#AAACGAATCCTTACGC-1 BCAT10C10#TGAGTCAAGACCAATA-1
TTLL10 . 0.726
TNFRSF18 . 0.879
PLCH2 . .
LINC00982 . .
ARHGEF16 . .
BCAT10C10#AACGTACTCGGTCGAC-1 BCAT7F0#AAATGCCTCGGATGTT-1
TTLL10 . .
TNFRSF18 . .
PLCH2 1.061 .
LINC00982 . .
ARHGEF16 . .
BCAT10B0#TGCTCGTCAAGTGGCA-1
TTLL10 .
TNFRSF18 .
PLCH2 .
LINC00982 .
ARHGEF16 .

imputeMatrix Input-Parameters$threads: length = 1
[1] 16

imputeMatrix Input-Parameters$verbose: length = 1
[1] FALSE

imputeMatrix Input-Parameters$logFile: length = 1
[1] "ArchRLogs/ArchR-addGeneIntegrationMatrix-19443462eecb-Date-2022-07-05_Time-19-32-06.log"

2022-07-05 19:33:07 : mat, Class = dgCMatrix
mat: nRows = 2000, nCols = 10848
mat: NonZeroEntries = 8004207, EntryRange = [ 0.016 , 462.342 ]
5 x 5 sparse Matrix of class "dgCMatrix"
BCAT10B0#AAACGAATCCTTACGC-1 BCAT10C10#TGAGTCAAGACCAATA-1
TTLL10 . 0.726
TNFRSF18 . 0.879
PLCH2 . .
LINC00982 . .
ARHGEF16 . .
BCAT10C10#AACGTACTCGGTCGAC-1 BCAT7F0#AAATGCCTCGGATGTT-1
TTLL10 . .
TNFRSF18 . .
PLCH2 1.061 .
LINC00982 . .
ARHGEF16 . .
BCAT10B0#TGCTCGTCAAGTGGCA-1
TTLL10 .
TNFRSF18 .
PLCH2 .
LINC00982 .
ARHGEF16 .

2022-07-05 19:33:07 : weightList, Class = SimpleList

weightList$w1: length = 1
[1] "/n/data2/dfci/pedonc/knoechel/Vale/scATAC/2.0/ImputeWeights/Impute-Weights-19446893b521-Date-2022-07-05_Time-19-32-36-Rep-1"

weightList$w2: length = 1
[1] "/n/data2/dfci/pedonc/knoechel/Vale/scATAC/2.0/ImputeWeights/Impute-Weights-19446893b521-Date-2022-07-05_Time-19-32-36-Rep-2"

2022-07-05 19:33:07 : Imputing Matrix (1 of 2), 0 mins elapsed.

2022-07-05 19:33:07 :

2022-07-05 19:33:07 :

2022-07-05 19:33:07 :
2022-07-05 19:33:45 : Imputing Matrix (2 of 2), 0.626 mins elapsed.

2022-07-05 19:33:45 :

2022-07-05 19:33:45 :

2022-07-05 19:33:45 :
2022-07-05 19:34:22 : Finished Imputing Matrix, 1.255 mins elapsed.

2022-07-05 19:34:22 : GeneScoreMat-Block-Impute-1, Class = dgeMatrix
GeneScoreMat-Block-Impute-1: nRows = 2000, nCols = 10848
GeneScoreMat-Block-Impute-1: NonZeroEntries = 21696000, EntryRange = [ 0 , 19.8315855307255 ]
5 x 5 Matrix of class "dgeMatrix"
BCAT10B0#AAACGAATCCTTACGC-1 BCAT10C10#TGAGTCAAGACCAATA-1
TTLL10 0.1346371 0.16619306
TNFRSF18 0.2657720 0.30042963
PLCH2 0.2724369 0.33988475
LINC00982 0.1314875 0.08061962
ARHGEF16 0.2374111 0.24831891
BCAT10C10#AACGTACTCGGTCGAC-1 BCAT7F0#AAATGCCTCGGATGTT-1
TTLL10 0.17520193 0.2287382
TNFRSF18 0.31564023 0.2718751
PLCH2 0.37930800 0.3206190
LINC00982 0.06279794 0.2258322
ARHGEF16 0.25227268 0.2474777
BCAT10B0#TGCTCGTCAAGTGGCA-1
TTLL10 0.1203257
TNFRSF18 0.2395978
PLCH2 0.2672784
LINC00982 0.1396059
ARHGEF16 0.2382568

2022-07-05 19:34:28 : Block (1 of 3) : Seurat FindTransferAnchors, 2.257 mins elapsed.

2022-07-05 19:35:42 : transferAnchors-1, Class = character

transferAnchors-1: length = 1
[1] "An AnchorSet object containing 473 anchors between the reference and query Seurat objects. \n This can be used as input to TransferData."

2022-07-05 19:35:42 : rDSub-1, Class = matrix

2022-07-05 19:35:42 : rDSub-1, Class = array
LSI1 LSI2 LSI3 LSI4
BCAT10B0#AAACGAATCCTTACGC-1 -5.021738 -0.09775941 0.3872247 1.15782233
BCAT10C10#TGAGTCAAGACCAATA-1 -5.145515 -0.30401617 -0.2900463 0.39661796
BCAT10C10#AACGTACTCGGTCGAC-1 -5.073399 -0.73500688 -0.3454685 0.10401208
BCAT7F0#AAATGCCTCGGATGTT-1 -4.420402 0.39895481 0.1813360 0.03934758
BCAT10B0#TGCTCGTCAAGTGGCA-1 -5.160597 -0.31431289 0.3277308 0.90400553
LSI5
BCAT10B0#AAACGAATCCTTACGC-1 0.0001318576
BCAT10C10#TGAGTCAAGACCAATA-1 -0.1009195361
BCAT10C10#AACGTACTCGGTCGAC-1 0.3050791993
BCAT7F0#AAATGCCTCGGATGTT-1 0.9578369118
BCAT10B0#TGCTCGTCAAGTGGCA-1 0.3869502650

rDSub-1: nRows = 10848, nCols = 30

2022-07-05 19:35:42 : Block (1 of 3) : Seurat TransferData Cell Group Labels, 3.48 mins elapsed.
2022-07-05 19:35:44 : Block (1 of 3) : Seurat TransferData Cell Names Labels, 3.519 mins elapsed.
2022-07-05 19:35:54 : Block (1 of 3) : Seurat TransferData GeneMatrix, 3.682 mins elapsed.
2022-07-05 19:36:15 : Block (1 of 3) : Saving TransferAnchors Joint CCA, 4.037 mins elapsed.
2022-07-05 19:36:16 : Block (1 of 3) : Transferring Paired RNA to Temp File, 4.062 mins elapsed.
2022-07-05 19:37:09 : Block (1 of 3) : Completed Integration, 4.936 mins elapsed.
2022-07-05 19:37:10 : Block (2 of 3) : Computing Integration, 4.957 mins elapsed.
2022-07-05 19:37:11 : Block (2 of 3) : Identifying Variable Genes, 4.966 mins elapsed.
2022-07-05 19:37:13 : Block (2 of 3) : Getting GeneScoreMatrix, 5.006 mins elapsed.

2022-07-05 19:37:29 : GeneScoreMat-Block-2, Class = dgCMatrix
GeneScoreMat-Block-2: nRows = 2000, nCols = 10848
GeneScoreMat-Block-2: NonZeroEntries = 7993789, EntryRange = [ 0.014 , 157.556 ]
5 x 5 sparse Matrix of class "dgCMatrix"
BCAT10C0#TTACCGCGTGGATTTC-1 BCAT10B0#GTCACCTGTATGGGTG-1
TTLL10 1.84 .
TNFRSF18 . 2.138
PLCH2 . .
LINC00982 . 1.787
ARHGEF16 . 0.155
BCAT10B0#ACGTGGCTCTTCCACG-1 BCAT10B0#CATGTTTCAGGTAGCA-1
TTLL10 . .
TNFRSF18 . .
PLCH2 . .
LINC00982 . .
ARHGEF16 0.214 .
BCAT10C10#CCAGATATCCCAGCAG-1
TTLL10 .
TNFRSF18 0.340
PLCH2 .
LINC00982 0.155
ARHGEF16 1.328

2022-07-05 19:37:29 : Block (2 of 3) : Imputing GeneScoreMatrix, 5.273 mins elapsed.

2022-07-05 19:37:29 : addImputeWeights Input-Parameters, Class = list

addImputeWeights Input-Parameters$ArchRProj: length = 1

addImputeWeights Input-Parameters$reducedDims: length = 1
[1] "IterativeLSI"

addImputeWeights Input-Parameters$dimsToUse: length = 30
[1] 1 2 3 4 5 6

addImputeWeights Input-Parameters$scaleDims: length = 0
NULL

addImputeWeights Input-Parameters$corCutOff: length = 1
[1] 0.75

addImputeWeights Input-Parameters$td: length = 1
[1] 3

addImputeWeights Input-Parameters$ka: length = 1
[1] 4

addImputeWeights Input-Parameters$sampleCells: length = 1
[1] 5000

addImputeWeights Input-Parameters$nRep: length = 1
[1] 2

addImputeWeights Input-Parameters$k: length = 1
[1] 15

addImputeWeights Input-Parameters$epsilon: length = 1
[1] 1

addImputeWeights Input-Parameters$useHdf5: length = 1
[1] TRUE

addImputeWeights Input-Parameters$randomSuffix: length = 1
[1] TRUE

addImputeWeights Input-Parameters$threads: length = 1
[1] 1

addImputeWeights Input-Parameters$seed: length = 1
[1] 1

addImputeWeights Input-Parameters$verbose: length = 1
[1] TRUE

addImputeWeights Input-Parameters$logFile: length = 1
[1] "ArchRLogs/ArchR-addGeneIntegrationMatrix-19443462eecb-Date-2022-07-05_Time-19-32-06.log"

2022-07-05 19:37:29 : Computing Impute Weights Using Magic (Cell 2018), 0 mins elapsed.
2022-07-05 19:37:29 : Computing Partial Diffusion Matrix with Magic (1 of 2), 0.001 mins elapsed.
2022-07-05 19:37:43 : Computing Partial Diffusion Matrix with Magic (2 of 2), 0.235 mins elapsed.
2022-07-05 19:37:58 : Completed Getting Magic Weights!, 0.485 mins elapsed.

2022-07-05 19:37:58 : imputeMatrix Input-Parameters, Class = list

imputeMatrix Input-Parameters$mat: nRows = 2000, nCols = 10848
imputeMatrix Input-Parameters$mat: NonZeroEntries = 7993789, EntryRange = [ 0.014 , 157.556 ]
5 x 5 sparse Matrix of class "dgCMatrix"
BCAT10C0#TTACCGCGTGGATTTC-1 BCAT10B0#GTCACCTGTATGGGTG-1
TTLL10 1.84 .
TNFRSF18 . 2.138
PLCH2 . .
LINC00982 . 1.787
ARHGEF16 . 0.155
BCAT10B0#ACGTGGCTCTTCCACG-1 BCAT10B0#CATGTTTCAGGTAGCA-1
TTLL10 . .
TNFRSF18 . .
PLCH2 . .
LINC00982 . .
ARHGEF16 0.214 .
BCAT10C10#CCAGATATCCCAGCAG-1
TTLL10 .
TNFRSF18 0.340
PLCH2 .
LINC00982 0.155
ARHGEF16 1.328

imputeMatrix Input-Parameters$threads: length = 1
[1] 16

imputeMatrix Input-Parameters$verbose: length = 1
[1] FALSE

imputeMatrix Input-Parameters$logFile: length = 1
[1] "ArchRLogs/ArchR-addGeneIntegrationMatrix-19443462eecb-Date-2022-07-05_Time-19-32-06.log"

2022-07-05 19:37:59 : mat, Class = dgCMatrix
mat: nRows = 2000, nCols = 10848
mat: NonZeroEntries = 7993789, EntryRange = [ 0.014 , 157.556 ]
5 x 5 sparse Matrix of class "dgCMatrix"
BCAT10C0#TTACCGCGTGGATTTC-1 BCAT10B0#GTCACCTGTATGGGTG-1
TTLL10 1.84 .
TNFRSF18 . 2.138
PLCH2 . .
LINC00982 . 1.787
ARHGEF16 . 0.155
BCAT10B0#ACGTGGCTCTTCCACG-1 BCAT10B0#CATGTTTCAGGTAGCA-1
TTLL10 . .
TNFRSF18 . .
PLCH2 . .
LINC00982 . .
ARHGEF16 0.214 .
BCAT10C10#CCAGATATCCCAGCAG-1
TTLL10 .
TNFRSF18 0.340
PLCH2 .
LINC00982 0.155
ARHGEF16 1.328

2022-07-05 19:37:59 : weightList, Class = SimpleList

weightList$w1: length = 1
[1] "/n/data2/dfci/pedonc/knoechel/Vale/scATAC/2.0/ImputeWeights/Impute-Weights-19444cb06125-Date-2022-07-05_Time-19-37-29-Rep-1"

weightList$w2: length = 1
[1] "/n/data2/dfci/pedonc/knoechel/Vale/scATAC/2.0/ImputeWeights/Impute-Weights-19444cb06125-Date-2022-07-05_Time-19-37-29-Rep-2"

2022-07-05 19:37:59 : Imputing Matrix (1 of 2), 0 mins elapsed.

2022-07-05 19:37:59 :

2022-07-05 19:37:59 :

2022-07-05 19:37:59 :
2022-07-05 19:38:36 : Imputing Matrix (2 of 2), 0.619 mins elapsed.

2022-07-05 19:38:36 :

2022-07-05 19:38:36 :

2022-07-05 19:38:36 :
2022-07-05 19:39:14 : Finished Imputing Matrix, 1.26 mins elapsed.

2022-07-05 19:39:14 : GeneScoreMat-Block-Impute-2, Class = dgeMatrix
GeneScoreMat-Block-Impute-2: nRows = 2000, nCols = 10848
GeneScoreMat-Block-Impute-2: NonZeroEntries = 21696000, EntryRange = [ 0 , 7.44260509597085 ]
5 x 5 Matrix of class "dgeMatrix"
BCAT10C0#TTACCGCGTGGATTTC-1 BCAT10B0#GTCACCTGTATGGGTG-1
TTLL10 0.2210199 0.1446547
TNFRSF18 0.3220487 0.2628406
PLCH2 0.6557544 0.2646934
LINC00982 0.1030445 0.1554096
ARHGEF16 0.6191359 0.2422558
BCAT10B0#ACGTGGCTCTTCCACG-1 BCAT10B0#CATGTTTCAGGTAGCA-1
TTLL10 0.1462962 0.1236322
TNFRSF18 0.2948222 0.2886919
PLCH2 0.3396610 0.2503875
LINC00982 0.1614462 0.1248387
ARHGEF16 0.2458269 0.2386034
BCAT10C10#CCAGATATCCCAGCAG-1
TTLL10 0.1546347
TNFRSF18 0.2934461
PLCH2 0.3051085
LINC00982 0.1170512
ARHGEF16 0.2313758

2022-07-05 19:39:21 : Block (2 of 3) : Seurat FindTransferAnchors, 7.129 mins elapsed.

2022-07-05 19:40:18 : transferAnchors-2, Class = character

transferAnchors-2: length = 1
[1] "An AnchorSet object containing 457 anchors between the reference and query Seurat objects. \n This can be used as input to TransferData."

2022-07-05 19:40:18 : rDSub-2, Class = matrix

2022-07-05 19:40:18 : rDSub-2, Class = array
LSI1 LSI2 LSI3 LSI4
BCAT10C0#TTACCGCGTGGATTTC-1 -4.963100 -0.7988335 0.01041248 -0.07223824
BCAT10B0#GTCACCTGTATGGGTG-1 -5.146598 -0.2388545 0.30016091 0.70423159
BCAT10B0#ACGTGGCTCTTCCACG-1 -5.109084 -0.2674287 0.94245151 0.46584431
BCAT10B0#CATGTTTCAGGTAGCA-1 -5.086913 -0.2312653 0.34954247 1.05357144
BCAT10C10#CCAGATATCCCAGCAG-1 -5.152116 -0.1109624 0.29572124 0.76973124
LSI5
BCAT10C0#TTACCGCGTGGATTTC-1 0.50060023
BCAT10B0#GTCACCTGTATGGGTG-1 -0.11692034
BCAT10B0#ACGTGGCTCTTCCACG-1 -0.04637475
BCAT10B0#CATGTTTCAGGTAGCA-1 0.06103473
BCAT10C10#CCAGATATCCCAGCAG-1 -0.36072181

rDSub-2: nRows = 10848, nCols = 30

2022-07-05 19:40:18 : Block (2 of 3) : Seurat TransferData Cell Group Labels, 8.092 mins elapsed.
2022-07-05 19:40:21 : Block (2 of 3) : Seurat TransferData Cell Names Labels, 8.129 mins elapsed.
2022-07-05 19:40:29 : Block (2 of 3) : Seurat TransferData GeneMatrix, 8.273 mins elapsed.
2022-07-05 19:40:47 : Block (2 of 3) : Saving TransferAnchors Joint CCA, 8.57 mins elapsed.
2022-07-05 19:40:48 : Block (2 of 3) : Transferring Paired RNA to Temp File, 8.594 mins elapsed.
2022-07-05 19:41:38 : Block (2 of 3) : Completed Integration, 9.421 mins elapsed.
2022-07-05 19:41:39 : Block (3 of 3) : Computing Integration, 9.44 mins elapsed.
2022-07-05 19:41:40 : Block (3 of 3) : Identifying Variable Genes, 9.449 mins elapsed.
2022-07-05 19:41:42 : Block (3 of 3) : Getting GeneScoreMatrix, 9.492 mins elapsed.

2022-07-05 19:41:57 : GeneScoreMat-Block-3, Class = dgCMatrix
GeneScoreMat-Block-3: nRows = 2000, nCols = 10848
GeneScoreMat-Block-3: NonZeroEntries = 7961066, EntryRange = [ 0.016 , 120.224 ]
5 x 5 sparse Matrix of class "dgCMatrix"
BCAT10F0#TGACTCCTCCTCAAGA-1 BCAT10C10#GGTACCGTCAGGTCTA-1
TTLL10 . 0.481
TNFRSF18 . 0.898
PLCH2 0.394 .
LINC00982 0.764 .
ARHGEF16 . 0.628
BCAT10F0#TGACTCCTCCCTATTA-1 BCAT10B0#TTCGGTCGTCAATCCA-1
TTLL10 . 1.017
TNFRSF18 . .
PLCH2 . .
LINC00982 . .
ARHGEF16 . .
BCAT10B0#TCGAGCGCAACCTCCT-1
TTLL10 .
TNFRSF18 .
PLCH2 .
LINC00982 .
ARHGEF16 .

2022-07-05 19:41:57 : Block (3 of 3) : Imputing GeneScoreMatrix, 9.734 mins elapsed.

2022-07-05 19:41:57 : addImputeWeights Input-Parameters, Class = list

addImputeWeights Input-Parameters$ArchRProj: length = 1

addImputeWeights Input-Parameters$reducedDims: length = 1
[1] "IterativeLSI"

addImputeWeights Input-Parameters$dimsToUse: length = 30
[1] 1 2 3 4 5 6

addImputeWeights Input-Parameters$scaleDims: length = 0
NULL

addImputeWeights Input-Parameters$corCutOff: length = 1
[1] 0.75

addImputeWeights Input-Parameters$td: length = 1
[1] 3

addImputeWeights Input-Parameters$ka: length = 1
[1] 4

addImputeWeights Input-Parameters$sampleCells: length = 1
[1] 5000

addImputeWeights Input-Parameters$nRep: length = 1
[1] 2

addImputeWeights Input-Parameters$k: length = 1
[1] 15

addImputeWeights Input-Parameters$epsilon: length = 1
[1] 1

addImputeWeights Input-Parameters$useHdf5: length = 1
[1] TRUE

addImputeWeights Input-Parameters$randomSuffix: length = 1
[1] TRUE

addImputeWeights Input-Parameters$threads: length = 1
[1] 1

addImputeWeights Input-Parameters$seed: length = 1
[1] 1

addImputeWeights Input-Parameters$verbose: length = 1
[1] TRUE

addImputeWeights Input-Parameters$logFile: length = 1
[1] "ArchRLogs/ArchR-addGeneIntegrationMatrix-19443462eecb-Date-2022-07-05_Time-19-32-06.log"

2022-07-05 19:41:57 : Computing Impute Weights Using Magic (Cell 2018), 0 mins elapsed.
2022-07-05 19:41:57 : Computing Partial Diffusion Matrix with Magic (1 of 2), 0.001 mins elapsed.
2022-07-05 19:42:12 : Computing Partial Diffusion Matrix with Magic (2 of 2), 0.254 mins elapsed.
2022-07-05 19:42:31 : Completed Getting Magic Weights!, 0.566 mins elapsed.

2022-07-05 19:42:31 : imputeMatrix Input-Parameters, Class = list

imputeMatrix Input-Parameters$mat: nRows = 2000, nCols = 10848
imputeMatrix Input-Parameters$mat: NonZeroEntries = 7961066, EntryRange = [ 0.016 , 120.224 ]
5 x 5 sparse Matrix of class "dgCMatrix"
BCAT10F0#TGACTCCTCCTCAAGA-1 BCAT10C10#GGTACCGTCAGGTCTA-1
TTLL10 . 0.481
TNFRSF18 . 0.898
PLCH2 0.394 .
LINC00982 0.764 .
ARHGEF16 . 0.628
BCAT10F0#TGACTCCTCCCTATTA-1 BCAT10B0#TTCGGTCGTCAATCCA-1
TTLL10 . 1.017
TNFRSF18 . .
PLCH2 . .
LINC00982 . .
ARHGEF16 . .
BCAT10B0#TCGAGCGCAACCTCCT-1
TTLL10 .
TNFRSF18 .
PLCH2 .
LINC00982 .
ARHGEF16 .

imputeMatrix Input-Parameters$threads: length = 1
[1] 16

imputeMatrix Input-Parameters$verbose: length = 1
[1] FALSE

imputeMatrix Input-Parameters$logFile: length = 1
[1] "ArchRLogs/ArchR-addGeneIntegrationMatrix-19443462eecb-Date-2022-07-05_Time-19-32-06.log"

2022-07-05 19:42:31 : mat, Class = dgCMatrix
mat: nRows = 2000, nCols = 10848
mat: NonZeroEntries = 7961066, EntryRange = [ 0.016 , 120.224 ]
5 x 5 sparse Matrix of class "dgCMatrix"
BCAT10F0#TGACTCCTCCTCAAGA-1 BCAT10C10#GGTACCGTCAGGTCTA-1
TTLL10 . 0.481
TNFRSF18 . 0.898
PLCH2 0.394 .
LINC00982 0.764 .
ARHGEF16 . 0.628
BCAT10F0#TGACTCCTCCCTATTA-1 BCAT10B0#TTCGGTCGTCAATCCA-1
TTLL10 . 1.017
TNFRSF18 . .
PLCH2 . .
LINC00982 . .
ARHGEF16 . .
BCAT10B0#TCGAGCGCAACCTCCT-1
TTLL10 .
TNFRSF18 .
PLCH2 .
LINC00982 .
ARHGEF16 .

2022-07-05 19:42:31 : weightList, Class = SimpleList

weightList$w1: length = 1
[1] "/n/data2/dfci/pedonc/knoechel/Vale/scATAC/2.0/ImputeWeights/Impute-Weights-19442ed3c811-Date-2022-07-05_Time-19-41-57-Rep-1"

weightList$w2: length = 1
[1] "/n/data2/dfci/pedonc/knoechel/Vale/scATAC/2.0/ImputeWeights/Impute-Weights-19442ed3c811-Date-2022-07-05_Time-19-41-57-Rep-2"

2022-07-05 19:42:31 : Imputing Matrix (1 of 2), 0 mins elapsed.

2022-07-05 19:42:31 :

2022-07-05 19:42:31 :

2022-07-05 19:42:32 :
2022-07-05 19:43:28 : Imputing Matrix (2 of 2), 0.944 mins elapsed.

2022-07-05 19:43:28 :

2022-07-05 19:43:28 :

2022-07-05 19:43:29 :
2022-07-05 19:44:18 : Finished Imputing Matrix, 1.779 mins elapsed.

2022-07-05 19:44:18 : GeneScoreMat-Block-Impute-3, Class = dgeMatrix
GeneScoreMat-Block-Impute-3: nRows = 2000, nCols = 10848
GeneScoreMat-Block-Impute-3: NonZeroEntries = 21696000, EntryRange = [ 0 , 5.97785203694152 ]
5 x 5 Matrix of class "dgeMatrix"
BCAT10F0#TGACTCCTCCTCAAGA-1 BCAT10C10#GGTACCGTCAGGTCTA-1
TTLL10 0.1558917 0.17292252
TNFRSF18 0.2645825 0.32055777
PLCH2 0.2939904 0.36761450
LINC00982 0.1861021 0.07818258
ARHGEF16 0.1920293 0.27583055
BCAT10F0#TGACTCCTCCCTATTA-1 BCAT10B0#TTCGGTCGTCAATCCA-1
TTLL10 0.1338464 0.1599414
TNFRSF18 0.1661903 0.2491711
PLCH2 0.2546585 0.3023310
LINC00982 0.1563450 0.1510510
ARHGEF16 0.2032253 0.2839722
BCAT10B0#TCGAGCGCAACCTCCT-1
TTLL10 0.1198278
TNFRSF18 0.1899804
PLCH2 0.2625219
LINC00982 0.1172837
ARHGEF16 0.2568719

2022-07-05 19:44:25 : Block (3 of 3) : Seurat FindTransferAnchors, 12.201 mins elapsed.

2022-07-05 19:45:49 : transferAnchors-3, Class = character

transferAnchors-3: length = 1
[1] "An AnchorSet object containing 446 anchors between the reference and query Seurat objects. \n This can be used as input to TransferData."

2022-07-05 19:45:49 : rDSub-3, Class = matrix

2022-07-05 19:45:49 : rDSub-3, Class = array
LSI1 LSI2 LSI3 LSI4
BCAT10F0#TGACTCCTCCTCAAGA-1 -5.100551 1.0426396 0.0543997 -0.453805392
BCAT10C10#GGTACCGTCAGGTCTA-1 -5.148028 -0.4997842 -0.1644659 0.253280036
BCAT10F0#TGACTCCTCCCTATTA-1 -4.944103 1.3597347 -0.8129402 -0.007404994
BCAT10B0#TTCGGTCGTCAATCCA-1 -5.146995 -0.1611272 0.1788771 0.745286492
BCAT10B0#TCGAGCGCAACCTCCT-1 -4.933004 0.3243187 -0.2483456 1.462553790
LSI5
BCAT10F0#TGACTCCTCCTCAAGA-1 -0.074836034
BCAT10C10#GGTACCGTCAGGTCTA-1 -0.001555628
BCAT10F0#TGACTCCTCCCTATTA-1 0.313881174
BCAT10B0#TTCGGTCGTCAATCCA-1 -0.217239078
BCAT10B0#TCGAGCGCAACCTCCT-1 0.504175375

rDSub-3: nRows = 10848, nCols = 30

2022-07-05 19:45:49 : Block (3 of 3) : Seurat TransferData Cell Group Labels, 13.606 mins elapsed.
2022-07-05 19:45:52 : Block (3 of 3) : Seurat TransferData Cell Names Labels, 13.648 mins elapsed.
2022-07-05 19:46:00 : Block (3 of 3) : Seurat TransferData GeneMatrix, 13.792 mins elapsed.
2022-07-05 19:46:18 : Block (3 of 3) : Saving TransferAnchors Joint CCA, 14.079 mins elapsed.
2022-07-05 19:46:19 : Block (3 of 3) : Transferring Paired RNA to Temp File, 14.098 mins elapsed.
2022-07-05 19:47:05 : Block (3 of 3) : Completed Integration, 14.863 mins elapsed.
2022-07-05 19:47:06 : Block (1 of 3) : Plotting Joint UMAP, 14.883 mins elapsed.
2022-07-05 19:47:45 : Block (2 of 3) : Plotting Joint UMAP, 15.532 mins elapsed.
2022-07-05 19:49:15 : Block (3 of 3) : Plotting Joint UMAP, 17.041 mins elapsed.
2022-07-05 19:50:33 : Transferring Data to ArrowFiles, 18.339 mins elapsed.
2022-07-05 19:50:33 : BCAT10B0 (1 of 5) Getting GeneIntegrationMatrix From TempFiles!, 18.342 mins elapsed.
2022-07-05 19:51:48 : BCAT10B0 (1 of 5) Adding GeneIntegrationMatrix to ArrowFile!, 19.588 mins elapsed.
2022-07-05 19:51:48 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (1 of 23)!, 19.588 mins elapsed.
2022-07-05 19:51:55 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (2 of 23)!, 19.703 mins elapsed.
2022-07-05 19:51:58 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (3 of 23)!, 19.76 mins elapsed.
2022-07-05 19:52:03 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (4 of 23)!, 19.839 mins elapsed.
2022-07-05 19:52:07 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (5 of 23)!, 19.901 mins elapsed.
2022-07-05 19:52:09 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (6 of 23)!, 19.938 mins elapsed.
2022-07-05 19:52:13 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (7 of 23)!, 19.997 mins elapsed.
2022-07-05 19:52:15 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (8 of 23)!, 20.043 mins elapsed.
2022-07-05 19:52:19 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (9 of 23)!, 20.097 mins elapsed.
2022-07-05 19:52:23 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (10 of 23)!, 20.173 mins elapsed.
2022-07-05 19:52:25 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (11 of 23)!, 20.208 mins elapsed.
2022-07-05 19:52:30 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (12 of 23)!, 20.282 mins elapsed.
2022-07-05 19:52:35 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (13 of 23)!, 20.369 mins elapsed.
2022-07-05 19:52:37 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (14 of 23)!, 20.41 mins elapsed.
2022-07-05 19:52:39 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (15 of 23)!, 20.441 mins elapsed.
2022-07-05 19:52:43 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (16 of 23)!, 20.496 mins elapsed.
2022-07-05 19:52:46 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (17 of 23)!, 20.559 mins elapsed.
2022-07-05 19:52:49 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (18 of 23)!, 20.608 mins elapsed.
2022-07-05 19:52:53 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (19 of 23)!, 20.676 mins elapsed.
2022-07-05 19:52:57 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (20 of 23)!, 20.734 mins elapsed.
2022-07-05 19:53:00 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (21 of 23)!, 20.791 mins elapsed.
2022-07-05 19:53:04 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (22 of 23)!, 20.85 mins elapsed.
2022-07-05 19:53:07 : Adding GeneIntegrationMatrix to BCAT10B0 for Chr (23 of 23)!, 20.9 mins elapsed.
2022-07-05 19:53:10 : BCAT10C10 (2 of 5) Getting GeneIntegrationMatrix From TempFiles!, 20.962 mins elapsed.
2022-07-05 19:54:10 : BCAT10C10 (2 of 5) Adding GeneIntegrationMatrix to ArrowFile!, 21.958 mins elapsed.
2022-07-05 19:54:10 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (1 of 23)!, 21.958 mins elapsed.
2022-07-05 19:54:16 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (2 of 23)!, 22.053 mins elapsed.
2022-07-05 19:54:18 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (3 of 23)!, 22.095 mins elapsed.
2022-07-05 19:54:22 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (4 of 23)!, 22.16 mins elapsed.
2022-07-05 19:54:28 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (5 of 23)!, 22.261 mins elapsed.
2022-07-05 19:54:39 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (6 of 23)!, 22.445 mins elapsed.
2022-07-05 19:54:54 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (7 of 23)!, 22.691 mins elapsed.
2022-07-05 19:54:58 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (8 of 23)!, 22.761 mins elapsed.
2022-07-05 19:55:04 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (9 of 23)!, 22.854 mins elapsed.
2022-07-05 19:55:29 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (10 of 23)!, 23.267 mins elapsed.
2022-07-05 19:55:41 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (11 of 23)!, 23.464 mins elapsed.
2022-07-05 19:55:47 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (12 of 23)!, 23.57 mins elapsed.
2022-07-05 19:55:55 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (13 of 23)!, 23.708 mins elapsed.
2022-07-05 19:55:59 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (14 of 23)!, 23.776 mins elapsed.
2022-07-05 19:56:03 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (15 of 23)!, 23.83 mins elapsed.
2022-07-05 19:56:06 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (16 of 23)!, 23.886 mins elapsed.
2022-07-05 19:56:09 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (17 of 23)!, 23.941 mins elapsed.
2022-07-05 19:56:12 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (18 of 23)!, 23.983 mins elapsed.
2022-07-05 19:56:15 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (19 of 23)!, 24.039 mins elapsed.
2022-07-05 19:56:18 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (20 of 23)!, 24.086 mins elapsed.
2022-07-05 19:56:21 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (21 of 23)!, 24.131 mins elapsed.
2022-07-05 19:56:24 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (22 of 23)!, 24.181 mins elapsed.
2022-07-05 19:56:26 : Adding GeneIntegrationMatrix to BCAT10C10 for Chr (23 of 23)!, 24.221 mins elapsed.
2022-07-05 19:56:29 : BCAT10C0 (3 of 5) Getting GeneIntegrationMatrix From TempFiles!, 24.273 mins elapsed.
2022-07-05 19:57:23 : BCAT10C0 (3 of 5) Adding GeneIntegrationMatrix to ArrowFile!, 25.165 mins elapsed.
2022-07-05 19:57:23 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (1 of 23)!, 25.165 mins elapsed.
2022-07-05 19:57:27 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (2 of 23)!, 25.238 mins elapsed.
2022-07-05 19:57:29 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (3 of 23)!, 25.273 mins elapsed.
2022-07-05 19:57:33 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (4 of 23)!, 25.329 mins elapsed.
2022-07-05 19:57:35 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (5 of 23)!, 25.37 mins elapsed.
2022-07-05 19:57:37 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (6 of 23)!, 25.396 mins elapsed.
2022-07-05 19:57:39 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (7 of 23)!, 25.441 mins elapsed.
2022-07-05 19:57:41 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (8 of 23)!, 25.473 mins elapsed.
2022-07-05 19:57:43 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (9 of 23)!, 25.508 mins elapsed.
2022-07-05 19:57:46 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (10 of 23)!, 25.561 mins elapsed.
2022-07-05 19:57:48 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (11 of 23)!, 25.587 mins elapsed.
2022-07-05 19:57:51 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (12 of 23)!, 25.634 mins elapsed.
2022-07-05 19:57:54 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (13 of 23)!, 25.694 mins elapsed.
2022-07-05 19:57:57 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (14 of 23)!, 25.73 mins elapsed.
2022-07-05 19:57:58 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (15 of 23)!, 25.755 mins elapsed.
2022-07-05 19:58:01 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (16 of 23)!, 25.796 mins elapsed.
2022-07-05 19:58:03 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (17 of 23)!, 25.839 mins elapsed.
2022-07-05 19:58:05 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (18 of 23)!, 25.874 mins elapsed.
2022-07-05 19:58:08 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (19 of 23)!, 25.927 mins elapsed.
2022-07-05 19:58:11 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (20 of 23)!, 25.966 mins elapsed.
2022-07-05 19:58:13 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (21 of 23)!, 26.005 mins elapsed.
2022-07-05 19:58:16 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (22 of 23)!, 26.05 mins elapsed.
2022-07-05 19:58:18 : Adding GeneIntegrationMatrix to BCAT10C0 for Chr (23 of 23)!, 26.084 mins elapsed.
2022-07-05 19:58:21 : BCAT7F0 (4 of 5) Getting GeneIntegrationMatrix From TempFiles!, 26.129 mins elapsed.
2022-07-05 19:59:12 : BCAT7F0 (4 of 5) Adding GeneIntegrationMatrix to ArrowFile!, 26.99 mins elapsed.
2022-07-05 19:59:12 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (1 of 23)!, 26.991 mins elapsed.
2022-07-05 19:59:16 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (2 of 23)!, 27.052 mins elapsed.
2022-07-05 19:59:18 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (3 of 23)!, 27.085 mins elapsed.
2022-07-05 19:59:21 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (4 of 23)!, 27.138 mins elapsed.
2022-07-05 19:59:23 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (5 of 23)!, 27.175 mins elapsed.
2022-07-05 19:59:25 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (6 of 23)!, 27.2 mins elapsed.
2022-07-05 19:59:27 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (7 of 23)!, 27.243 mins elapsed.
2022-07-05 19:59:29 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (8 of 23)!, 27.273 mins elapsed.
2022-07-05 19:59:31 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (9 of 23)!, 27.307 mins elapsed.
2022-07-05 19:59:34 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (10 of 23)!, 27.358 mins elapsed.
2022-07-05 19:59:36 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (11 of 23)!, 27.382 mins elapsed.
2022-07-05 19:59:38 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (12 of 23)!, 27.424 mins elapsed.
2022-07-05 19:59:41 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (13 of 23)!, 27.479 mins elapsed.
2022-07-05 19:59:43 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (14 of 23)!, 27.507 mins elapsed.
2022-07-05 19:59:45 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (15 of 23)!, 27.53 mins elapsed.
2022-07-05 19:59:47 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (16 of 23)!, 27.571 mins elapsed.
2022-07-05 19:59:49 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (17 of 23)!, 27.61 mins elapsed.
2022-07-05 19:59:51 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (18 of 23)!, 27.641 mins elapsed.
2022-07-05 19:59:54 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (19 of 23)!, 27.687 mins elapsed.
2022-07-05 19:59:56 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (20 of 23)!, 27.727 mins elapsed.
2022-07-05 19:59:58 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (21 of 23)!, 27.76 mins elapsed.
2022-07-05 20:00:01 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (22 of 23)!, 27.807 mins elapsed.
2022-07-05 20:00:03 : Adding GeneIntegrationMatrix to BCAT7F0 for Chr (23 of 23)!, 27.84 mins elapsed.
2022-07-05 20:00:06 : BCAT10F0 (5 of 5) Getting GeneIntegrationMatrix From TempFiles!, 27.883 mins elapsed.

@rcorces
Collaborator

rcorces commented Jul 6, 2022

@VDD58 - I don't have a solution for your problem, but we are hoping that an upcoming release will fix many of these HDF5 problems. That said, I don't think the error you are reporting is the same as the one discussed in this issue.

Also, in the future, please upload your log file rather than pasting the full text into the issue.

@VDD58

VDD58 commented Jul 6, 2022

Thank you so much for responding!

@rcorces
Collaborator

rcorces commented Aug 10, 2022

I believe this has now been addressed properly on the dev branch via 55f0923#diff-bab2afba66a9980d18aaaa80450eaeff5c80a12eb8a943c0680d7b71b06b7e5e

There is a more stable way of handling file locking that will be implemented in release_1.0.3 and described in an upcoming release of the manual.

@rcorces rcorces closed this as completed Aug 10, 2022
@mdanb

mdanb commented Jan 19, 2023

@rcorces I installed the dev version (devtools::install_github("GreenleafLab/ArchR", ref="dev", repos = BiocManager::repositories())), but I still get the same error when I run createArrowFiles.

@rcorces
Collaborator

rcorces commented Jan 19, 2023

@mdanb - I'll need more information to actually help. The dev version includes a new way of preventing HDF5 accessibility errors. You would likely need to disable subthreading by running addArchRLocking(locking = TRUE) prior to running createArrowFiles(). In the future, please always upload your ArchR log file when posting to GitHub.

@mdanb

mdanb commented Jan 21, 2023

@rcorces I'm good. It turns out the cause was that my sampleNames contained a /, which threw things off.
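
For anyone who hits the same thing, a minimal base-R sketch of checking sample names for a / before passing them to createArrowFiles. The sample names below are made up for illustration, and replacing / with _ is just one possible choice. Presumably the slash causes trouble because the sample name ends up in the Arrow file path, but that is an assumption rather than something confirmed in this thread.

```r
# Hypothetical sample names; the second one contains a "/".
sampleNames <- c("BCAT10C0", "patient1/rep2")

# Flag names that contain a forward slash.
bad <- grepl("/", sampleNames, fixed = TRUE)
if (any(bad)) {
  message("Sample names containing '/': ",
          paste(sampleNames[bad], collapse = ", "))
}

# Replace each "/" with "_" (the replacement character is an arbitrary choice).
cleanNames <- gsub("/", "_", sampleNames, fixed = TRUE)

# Make sure the cleaned names are still unique before using them.
stopifnot(!anyDuplicated(cleanNames))
```

The cleaned vector can then be used in place of the original sampleNames.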

@gingerii

> @mdanb - I'll need more information to actually help. The dev version includes a new way of preventing HDF5 accessibility errors. You would likely need to disable subthreading by running addArchRLocking(locking = TRUE) prior to running createArrowFiles(). In the future, please always upload your ArchR log file when posting to GitHub.

Just wanted to confirm: is there currently no workaround for this issue when running downstream analysis? I am using ArrowFiles created by someone else and would rather not re-process all of them, but I will if there is no other way.

@jgarces02

Hi, none of the above worked for me. The only thing that let me continue was setting copyArrows = FALSE:

archro <- ArchRProject(arrowfiles, outputDirectory = "archr_test", threads = 1, copyArrows = FALSE)
