Trying to read a h5mu object, a few errors popping up #9

mikelove · 2024-09-26T17:50:07Z

I'm working with a scCRISPR dataset from another group. I can open it with muon in Python but not with MuData (1.8.0) in R. Some details below. Thanks for taking a look!

wget -O inference_mudata.h5mu "https://dl.dropboxusercontent.com/scl/fi/u3hyg4gq9pfttpf6amlv5/inference_mudata.h5mu?rlkey=fa908coboty72rgqsg2bcndfu&st=985x9zhh&dl=1"

In Python:

>>> import muon as md
>>> crispr_mu = md.read_h5mu('inference_mudata.h5mu')
>>> crispr_mu
MuData object with n_obs × n_vars = 32471 × 25184
  obs:	'cov1', 'batch'
  uns:	'pairs_to_test', 'test_results'
  3 modalities
    gene:	32471 x 24731
      obs:	'batch', 'cov1', 'batch_number', 'n_counts', 'log1p_n_genes_by_counts', 'total_gene_umis', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes', 'total_counts_mt', 'log1p_total_counts_mt', 'percent_mito', 'total_counts_ribo', 'log1p_total_counts_ribo', 'pct_counts_ribo', 'num_expressed_genes', 'doublet_scores', 'predicted_doublets', 'doublet_info'
      var:	'symbol', 'mt', 'ribo', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'n_cells', 'gene_chr', 'gene_start', 'gene_end'
    guide:	32471 x 441
      obs:	'batch', 'cov1', 'num_expressed_guides', 'batch_number', 'total_guide_umis'
      var:	'guide_id', 'sgRNA_ID', 'sgRNA_sequences', 'Target_name', 'chr', 'start', 'end', 'Set', 'intended_target_name', 'intended_target_chr', 'intended_target_start', 'intended_target_end', 'sequence', 'targeting'
      uns:	'capture_method', 'moi'
      layers:	'guide_assignment'
    hashing:	32471 x 12
      obs:	'batch', 'cov1', 'cluster_id', 'hto_type', 'hto_type_split'

In R:

> rhdf5::h5ls("~/Downloads/inference_mudata.h5mu", recursive = FALSE)
  group   name     otype dclass dim
0     /    mod H5I_GROUP
1     /    obs H5I_GROUP
2     /   obsm H5I_GROUP
3     / obsmap H5I_GROUP
4     /   obsp H5I_GROUP
5     /    uns H5I_GROUP
6     /    var H5I_GROUP
7     /   varm H5I_GROUP
8     / varmap H5I_GROUP
9     /   varp H5I_GROUP

I then tried to load the file with MuData (after renaming it to test.h5mu). The first error I hit is:

 > dat <- readH5MU("test.h5mu")
 Error: Error in h5checktype(). The provided H5Identifier is not a dataset identifier.

This was coming from roughly here (I'm looking at 1.8.0 code):

https://github.com/ilia-kats/MuData/blob/master/R/read_h5mu.R#L60

This fails on this dataset because H5Dclose(col) expects a dataset identifier, but here col is a group, not a dataset:

H5Iget_type(col)
 [1] "H5I_GROUP"

As a quick workaround, I commented out the H5Dclose call.

The next error (which may not be relevant if my simple step is misguided):

 > dat <- readH5MU("test.h5mu")
 Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  :
   row names supplied are of the wrong length

This comes from the first call to read_dataframe. It runs do.call(data.frame, args = col_list), but at that point col_list looks like this:

 Browse[1]> str(col_list)
 List of 3
  $ cov1     : NULL
  $ batch    : NULL
  $ row.names: chr [1:32471(1d)] "CAGTAACCACTCTGTC_0" "CCTCTGACATCTATGG_0" "TATCTCAAGTTAACGA_0" "CTTACCGTCAGTGTTG_0" ...
 Browse[1]> columnorder
 [1] "cov1"  "batch"
 Browse[1]> group
 HDF5 GROUP
         name /obs
     filename

     name       otype dclass   dim
 0 _index H5I_DATASET STRING 32471
 1 batch  H5I_GROUP
 2 cov1   H5I_GROUP

So col_list cannot be coerced to a data.frame: cov1 and batch are stored as groups rather than datasets, and both come back as NULL.
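One possible reading of the two errors, sketched in Python: if batch and cov1 are categorical columns written in the newer anndata layout (a group holding `categories` and `codes`), then a reader has to check each column's node type and decode groups instead of treating every column as a dataset. This is only an assumption about this file, not a statement about what MuData 1.8.0 or the GitHub master does. The sketch uses nested dicts as a stand-in for HDF5 groups so that it runs without any HDF5 library; `obs_group`, `is_group`, `read_column`, `read_dataframe`, and the category labels are all hypothetical.

```python
# Stand-in for the /obs group shown above. Dicts play the role of HDF5 groups
# and lists play the role of HDF5 datasets. The category labels are invented.
obs_group = {
    "_index": ["CAGTAACCACTCTGTC_0", "CCTCTGACATCTATGG_0", "TATCTCAAGTTAACGA_0"],
    "batch": {"categories": ["batch1", "batch2"], "codes": [0, 1, 0]},
    "cov1": {"categories": ["a", "b"], "codes": [1, -1, 0]},
}


def is_group(node):
    """Stand-in for an HDF5 type check (with h5py: isinstance(node, h5py.Group))."""
    return isinstance(node, dict)


def read_column(node):
    """Read one column, whether it is a plain dataset or a categorical group."""
    if is_group(node):
        # Assumed categorical layout: integer codes index into categories,
        # and a code of -1 marks a missing value.
        categories = node["categories"]
        return [categories[c] if c >= 0 else None for c in node["codes"]]
    return list(node)


def read_dataframe(group, column_order):
    """Return (index, columns), checking that every column matches the index length."""
    index = list(group["_index"])
    columns = {name: read_column(group[name]) for name in column_order}
    for name, values in columns.items():
        if len(values) != len(index):
            raise ValueError(f"column {name!r} has {len(values)} rows, expected {len(index)}")
    return index, columns


index, columns = read_dataframe(obs_group, ["cov1", "batch"])
print(columns)
```

With a type-aware `read_column`, no column is left empty, so the row names and the columns have matching lengths when the table is assembled.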


ilia-kats · 2024-10-08T09:44:06Z

Thanks for providing the sample file. This works for me with the current GitHub master. We haven't had a Bioc release in a while. @gtca can you make a new release?

mikelove · 2025-02-10T12:24:23Z

Any update on this? We are looking to produce MuData objects in a consortium and cross-language support would be ideal.

Again, I tried to debug above but I just don't know enough about the codebase to figure out the error.

ilia-kats · 2025-02-12T10:41:54Z

We are trying to add me as a maintainer of the Bioc package so we can get a new release out. Please be patient.
