hicPCA AttributeError : 'list' object has no attribute 'real' #655

Closed
shiyi-pan opened this issue Jan 10, 2021 · 18 comments
Comments

@shiyi-pan

Hi, I used hicPCA to analyze my Hi-C data. Here is my command:

hicPCA -m hic_corrected.h5 --outputFileName pca1.bw pca2.bw --format bigwig --pearsonMatrix pearson.h5 --method dist_norm --obsexpMatrix obs_exp --extraTrack NN1138.Chrosome.gene.sorted.bed

My BED file looks like this:

Chr01 150057 150620 NN01g00001.1 0 - 150057 150620 0 2 169,80, 0,483,
Chr01 238606 249703 NN01g00002.1 0 + 238606 249703 0 3 2160,251,31, 0,9261,11066,
Chr01 258601 264467 NN01g00003.1 0 - 258601 264467 0 9 330,51,307,119,65,180,211,48,141, 0,756,855,1195,1400,2106,5207,5605,5725,
Chr01 264993 267366 NN01g00004.1 0 + 264993 267366 0 4 34,65,55,290, 0,474,775,2083,

and I got this error:

Traceback (most recent call last):
  File "/ds3512/home/panyp/ruanjian/python36/bin/hicPCA", line 7, in <module>
    main()
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicexplorer/hicPCA.py", line 334, in main
    vecs_list = correlateEigenvectorWithGeneTrack(ma, vecs_list, args.extraTrack)
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicexplorer/hicPCA.py", line 175, in correlateEigenvectorWithGeneTrack
    _correlation = pearsonr(eigenvector[bin_id[0]:bin_id[1]].real,
AttributeError: 'list' object has no attribute 'real'

Could you help me fix this error? Thank you very much.
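
For readers who hit the same message: the traceback suggests that `eigenvector` is a plain Python list at that point, and a slice of a list is again a list, which has no `.real` attribute (individual numbers and NumPy arrays do). The sketch below reproduces the message with made-up values. It is an illustration only, not the HiCExplorer code and not the fix that was applied upstream.

```python
# Made-up eigenvector values; the real ones come from hicPCA.
eigenvector = [0.12 + 0j, -0.40 + 0j, 0.33 + 0j]

try:
    eigenvector[0:2].real  # slicing a list returns another list
except AttributeError as err:
    message = str(err)

# Taking the real part element by element works on a plain list.
real_parts = [complex(value).real for value in eigenvector[0:2]]

print(message)
print(real_parts)
```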

@LeilyR
Collaborator

LeilyR commented Jan 11, 2021

Which version are you using?

@shiyi-pan
Author

Thank you for your reply, LeilyR. The version I am using is 3.4.1.

@LeilyR
Collaborator

LeilyR commented Jan 11, 2021

Did you get this warning message: "Number of fields in BED file is not standard. Assuming bed6."?
Are you sure that the chromosome names in your matrix are the same as in the BED file, e.g. Chr01?

@joachimwolff
Collaborator

Thank you for your reply, LeilyR. The version I am using is 3.4.1.

Please update to HiCExplorer version 3.6.

@shiyi-pan
Author

Thank you both for your replies, LeilyR and joachimwolff. There was no warning message like the one you described. Because hic_matrix.h5 is a binary file, I can't inspect it directly, but the chromosome names in my reference genome use the "Chr01" format, so I think the names in the matrix are the same. I will try updating to HiCExplorer version 3.6. Thank you again.

@joachimwolff
Collaborator

Please check the output of hicInfo -m yourmatrix.h5. It will tell you what chromosome names you used to create your Hi-C interaction matrix.

Your chromosome names do not follow either of the two usual standards: UCSC (chromosome 1 is named chr1; see UCSC) or Ensembl (chromosome 1 is named 1). Please change your data to one of the two standards.
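
One quick way to confirm that both files agree on chromosome names is to compare the first column of the BED file with the names that hicInfo reports. The sketch below does this in plain Python. The function name `bed_chromosomes_missing_from_matrix` and the sample values are illustrative assumptions, not part of HiCExplorer.

```python
def bed_chromosomes_missing_from_matrix(bed_lines, matrix_chromosomes):
    """Return BED chromosome names that are absent from the matrix."""
    known = set(matrix_chromosomes)
    missing = set()
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments, and BED header lines.
        if not fields or line.startswith("#") or fields[0] in ("track", "browser"):
            continue
        if fields[0] not in known:
            missing.add(fields[0])
    return sorted(missing)


# Sample values only: names as hicInfo would list them, plus two BED rows.
matrix_chromosomes = ["Chr01", "Chr02", "Scaffold_1"]
bed_lines = [
    "Chr01 150057 150620 NN01g00001.1 0 -",
    "chr02 1000 2000 made_up_gene 0 +",  # lower-case 'chr' does not match 'Chr02'
]
missing = bed_chromosomes_missing_from_matrix(bed_lines, matrix_chromosomes)
print(missing)
```

An empty list means every chromosome named in the BED file also exists in the matrix. The comparison is case-sensitive on purpose, since "chr02" and "Chr02" are different names to the tools.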

@joachimwolff
Collaborator

@shiyi-pan Please check whether the fix on our develop branch solves your issue.

Best,

Joachim

@shiyi-pan
Author

Sorry for replying so late. I tried to download the latest HiCExplorer and failed many times.
Now I have upgraded HiCExplorer, and it doesn't fix the problem.
Here is my matrix information; it seems normal.

Matrix information file. Created with HiCExplorer's hicInfo version 3.6

File: hic_corrected.h5
Size: 96,060
Bin_length: 10000
Sum of matrix: 34285422.90995298
Chromosomes:length: Chr01: 57180813 bp; Chr02: 49974985 bp; Chr03: 47135343 bp; Chr04: 50772388 bp; Chr05: 40763724 bp; Chr06: 49765926 bp; Chr07: 44477389 bp; Chr08: 47918604 bp; Chr09: 48783406 bp; Chr10: 52870952 bp; Chr11: 39316363 bp; Chr12: 40945015 bp; Chr13: 45260557 bp; Chr14: 49694925 bp; Chr15: 52150659 bp; Chr16: 37044301 bp; Chr17: 41996393 bp; Chr18: 58227433 bp; Chr19: 49303783 bp; Chr20: 47338249 bp; Scaffold_1: 1376331 bp; Scaffold_10: 79343 bp; Scaffold_100: 37068 bp; Scaffold_101: 42738 bp; Scaffold_102: 26844 bp; Scaffold_103: 228794 bp; Scaffold_104: 32224 bp; Scaffold_105: 64192 bp; Scaffold_106: 18086 bp; Scaffold_107: 24298 bp; Scaffold_108: 32567 bp; Scaffold_109: 26501 bp; Scaffold_11: 68995 bp; Scaffold_110: 30230 bp; Scaffold_111: 32186 bp; Scaffold_112: 29075 bp; Scaffold_113: 44707 bp; Scaffold_114: 31011 bp; Scaffold_115: 53279 bp; Scaffold_116: 50853 bp; Scaffold_117: 32689 bp; Scaffold_118: 22740 bp; Scaffold_119: 43728 bp; Scaffold_12: 82648 bp; Scaffold_120: 29361 bp; Scaffold_121: 77151 bp; Scaffold_122: 24454 bp; Scaffold_123: 27786 bp; Scaffold_124: 24495 bp; Scaffold_125: 25183 bp; Scaffold_126: 72182 bp; Scaffold_127: 53744 bp; Scaffold_128: 26293 bp; Scaffold_129: 37480 bp; Scaffold_13: 100901 bp; Scaffold_130: 26616 bp; Scaffold_131: 47216 bp; Scaffold_132: 25511 bp; Scaffold_133: 38574 bp; Scaffold_134: 39819 bp; Scaffold_135: 31471 bp; Scaffold_136: 68622 bp; Scaffold_137: 35934 bp; Scaffold_138: 27115 bp; Scaffold_139: 30448 bp; Scaffold_14: 49877 bp; Scaffold_140: 103730 bp; Scaffold_141: 25958 bp; Scaffold_142: 74686 bp; Scaffold_143: 34985 bp; Scaffold_144: 48180 bp; Scaffold_145: 145388 bp; Scaffold_146: 34480 bp; Scaffold_147: 84396 bp; Scaffold_148: 95665 bp; Scaffold_149: 45280 bp; Scaffold_15: 65928 bp; Scaffold_150: 35715 bp; Scaffold_151: 31258 bp; Scaffold_152: 28092 bp; Scaffold_153: 27805 bp; Scaffold_154: 27766 bp; Scaffold_155: 24449 bp; Scaffold_156: 20304 bp; Scaffold_157: 109821 bp; Scaffold_158: 21654 
bp; Scaffold_159: 24454 bp; Scaffold_16: 34092 bp; Scaffold_17: 23083 bp; Scaffold_18: 85037 bp; Scaffold_19: 52024 bp; Scaffold_2: 926866 bp; Scaffold_20: 37941 bp; Scaffold_21: 77000 bp; Scaffold_22: 28139 bp; Scaffold_23: 27054 bp; Scaffold_24: 24918 bp; Scaffold_25: 29551 bp; Scaffold_26: 29820 bp; Scaffold_27: 21547 bp; Scaffold_28: 31000 bp; Scaffold_29: 60459 bp; Scaffold_3: 54977 bp; Scaffold_30: 14929 bp; Scaffold_31: 12941 bp; Scaffold_32: 12410 bp; Scaffold_33: 12045 bp; Scaffold_34: 10478 bp; Scaffold_35: 9797 bp; Scaffold_36: 7715 bp; Scaffold_37: 4901 bp; Scaffold_38: 4038 bp; Scaffold_39: 5000 bp; Scaffold_4: 28354 bp; Scaffold_40: 25000 bp; Scaffold_41: 50000 bp; Scaffold_42: 50000 bp; Scaffold_43: 50000 bp; Scaffold_44: 50000 bp; Scaffold_45: 25000 bp; Scaffold_46: 25000 bp; Scaffold_47: 25000 bp; Scaffold_48: 25000 bp; Scaffold_49: 25000 bp; Scaffold_5: 29883 bp; Scaffold_50: 25000 bp; Scaffold_51: 25000 bp; Scaffold_52: 25000 bp; Scaffold_53: 5000 bp; Scaffold_54: 25000 bp; Scaffold_55: 50000 bp; Scaffold_56: 34282 bp; Scaffold_57: 47162 bp; Scaffold_58: 25997 bp; Scaffold_59: 19013 bp; Scaffold_6: 40904 bp; Scaffold_60: 25693 bp; Scaffold_61: 84331 bp; Scaffold_62: 37296 bp; Scaffold_63: 46469 bp; Scaffold_64: 27074 bp; Scaffold_65: 17918 bp; Scaffold_66: 55362 bp; Scaffold_67: 37606 bp; Scaffold_68: 33523 bp; Scaffold_69: 44903 bp; Scaffold_7: 32595 bp; Scaffold_70: 28561 bp; Scaffold_71: 36620 bp; Scaffold_72: 60085 bp; Scaffold_73: 105810 bp; Scaffold_74: 23713 bp; Scaffold_75: 31008 bp; Scaffold_76: 26196 bp; Scaffold_77: 23870 bp; Scaffold_78: 22335 bp; Scaffold_79: 45524 bp; Scaffold_8: 51638 bp; Scaffold_80: 45152 bp; Scaffold_81: 35854 bp; Scaffold_82: 33673 bp; Scaffold_83: 69659 bp; Scaffold_84: 96453 bp; Scaffold_85: 47017 bp; Scaffold_86: 30893 bp; Scaffold_87: 25513 bp; Scaffold_88: 23061 bp; Scaffold_89: 66364 bp; Scaffold_9: 50400 bp; Scaffold_90: 66484 bp; Scaffold_91: 44568 bp; Scaffold_92: 24224 bp; Scaffold_93: 45891 bp; 
Scaffold_94: 95094 bp; Scaffold_95: 53584 bp; Scaffold_96: 27959 bp; Scaffold_97: 37118 bp; Scaffold_98: 37684 bp; Scaffold_99: 62049 bp;
Non-zero elements: 108,045,174
Minimum (non zero): 2.0425253050862239e-07
Maximum: 713.8334980460619
NaN bins: 0

Here is my BED file:

NN1138.Chrosome.gene.sorted.bed.zip

@LeilyR
Collaborator

LeilyR commented Jan 25, 2021

Are you using the develop branch? Please use the develop branch; we have fixed it there. You can run git clone -b develop <repo-url> and then python setup.py install in your conda environment.

@shiyi-pan
Author

Hi, sorry for the late reply. I have installed the develop branch, and it still doesn't work. Is my BED file wrong, or is something else wrong? Thank you very much, LeilyR and Joachim.

@joachimwolff
Collaborator

Can you specify what "it doesn't work" means?

@shiyi-pan
Author

Hi, thank you for your reply. I downloaded HiCExplorer with the command:
git clone -b develop https://github.com/deeptools/HiCExplorer.git
and installed it.
Then I ran the command:
python /ds3512/home/panyp/ruanjian/python36/bin/hicPCA -m hic_corrected.h5 --outputFileName pca1.bw pca2.bw --format bigwig --pearsonMatrix pearson.h5 --method dist_norm --obsexpMatrix obs_exp --extraTrack NN1138.gene.sorted.bed

and got this error:
ERROR:hicmatrix.HiCMatrix:Index error
Traceback (most recent call last):
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicmatrix/HiCMatrix.py", line 262, in getRegionBinRange
    endbin = sorted(self.interval_trees[chrname][endpos:endpos + 1])[0].data
IndexError: list index out of range
Traceback (most recent call last):
  File "/ds3512/home/panyp/ruanjian/python36/bin/hicPCA", line 7, in <module>
    main()
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicexplorer/hicPCA.py", line 338, in main
    vecs_list = correlateEigenvectorWithGeneTrack(ma, vecs_list, args.extraTrack)
  File "/ds3512/home/panyp/ruanjian/python36/lib/python3.6/site-packages/hicexplorer/hicPCA.py", line 162, in correlateEigenvectorWithGeneTrack
    gene_occurrence[bin_id[1]] += 1
TypeError: 'NoneType' object is not subscriptable

Here is my BED file:

NN1138.Chrosome.gene.sorted.bed.zip
The hicInfo output for hic_corrected.h5 is the same as in my earlier comment.

Could you help me fix this problem? Thank you very much.
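
The two tracebacks in this comment are most likely linked: the logged "Index error" indicates that the bin lookup failed for one interval, and the later TypeError is what indexing the resulting `None` would produce. The sketch below mimics that pattern with a hypothetical `lookup_bin_range` function and made-up bins and genes. It is not the hicmatrix implementation; it only shows how skipping failed lookups surfaces the offending intervals instead of crashing.

```python
def lookup_bin_range(bins, start, end):
    """Hypothetical stand-in: return (first_bin, last_bin), or None on failure."""
    try:
        first = [i for i, (s, e) in enumerate(bins) if s <= start < e][0]
        last = [i for i, (s, e) in enumerate(bins) if s <= end < e][0]
    except IndexError:
        return None  # the position is not covered by any bin
    return first, last


bins = [(0, 10000), (10000, 20000), (20000, 25000)]  # made-up 10 kb bins
genes = [("geneA", 500, 12000), ("geneB", 21000, 25001)]  # geneB ends past the last bin

gene_occurrence = [0] * len(bins)
skipped = []
for name, start, end in genes:
    bin_id = lookup_bin_range(bins, start, end)
    if bin_id is None:
        skipped.append(name)  # report the interval instead of indexing None
        continue
    gene_occurrence[bin_id[1]] += 1

print(gene_occurrence, skipped)
```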

@14stutzmanav

14stutzmanav commented Mar 16, 2021

Hi all,

Is there a way to troubleshoot the error message, "TypeError: 'NoneType' object is not subscriptable"? I am also encountering this message when I use hicPCA and specify a gene track as the extra track.

Thank you!

@LeilyR
Collaborator

LeilyR commented Mar 17, 2021

Could you please tell us which version you are using and which command line you are trying to run?

@LeilyR
Collaborator

LeilyR commented Mar 17, 2021

@shiyi-pan The error you get is caused by a gene coordinate that exceeds the length of the chromosome (IndexError: list index out of range). Please check that your genes come from the same genome version as the reference you used to map your Hi-C data. In the future we will add a check that reports a clearer message.
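
A check like this can be scripted before running hicPCA. The sketch below flags BED rows whose end coordinate exceeds the chromosome length, or whose chromosome is unknown. The function name `intervals_beyond_chromosome_end` and the two made-up genes are assumptions for illustration; the Chr01 length is the one reported by hicInfo earlier in this thread.

```python
def intervals_beyond_chromosome_end(bed_lines, chrom_sizes):
    """Yield (chrom, start, end, length) for BED rows that do not fit."""
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments, and BED header lines.
        if len(fields) < 3 or line.startswith("#") or fields[0] in ("track", "browser"):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        length = chrom_sizes.get(chrom)
        # BED end coordinates are exclusive, so end == length is still valid.
        if length is None or end > length:
            yield chrom, start, end, length


chrom_sizes = {"Chr01": 57180813}  # length taken from the hicInfo output
bed_lines = [
    "Chr01 150057 150620 NN01g00001.1 0 -",      # fits
    "Chr01 57180000 57180900 made_up_gene 0 +",  # ends past the chromosome
    "Chr99 10 20 made_up_gene2 0 +",             # chromosome not in the sizes
]
problems = list(intervals_beyond_chromosome_end(bed_lines, chrom_sizes))
print(problems)
```

In practice `chrom_sizes` could be filled from a two-column chrom.sizes file, and `bed_lines` from the open BED file. A length of `None` in the output marks a chromosome name that is missing from the sizes.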

@shiyi-pan
Author

Thank you for your reply. My HiCExplorer version is 3.6, and here are my commands:
gff3ToGenePred NN1138.gene.gff3 NN1138.gene.GenePred
genePredToBed NN1138.gene.GenePred NN1138.gene.bed
sort -k1,1 -k2,2n NN1138.gene.bed > NN1138.gene.sorted.bed
python hicPCA -m hic_corrected.h5 --outputFileName pca1.bw pca2.bw --format bigwig --pearsonMatrix pearson.h5 --method dist_norm --obsexpMatrix obs_exp --extraTrack NN1138.gene.sorted.bed

I have checked the gene coordinates, and they do not exceed the chromosome lengths.
Here are my NN1138.gene.sorted.bed and NN1138-2.v1.0.chrom.sizes files:
bed.zip

@LeilyR
Collaborator

LeilyR commented Mar 18, 2021

I can already see that you have:
Chr01 57147361 57173638
while your Chr01 length is 57173637.
I have not checked the rest of your chromosomes. There may be more intervals that exceed the chromosome length.

Also, when using --obsexpMatrix, add the extension to your matrix name (obs_exp.h5). This should not cause the error you reported, but you won't get the matrix otherwise.

@shiyi-pan
Author

Thank you for your reply.
I'm sorry for my mistake. I had changed the NN1138-2.v1.0.chrom.sizes file to test my script. The true length of Chr01 is 57180813. Is there any disagreement between the NN1138-2.v1.0.chrom.sizes file and the NN1138.gene.sorted.bed file?
Thank you again.
