Exit: Error: Directory Hierarchy of rawdata '/home/genome/Caulobacter/HiC_data/' is not correct. No '.fastq(.gz)' files detected #168

Closed
xiaoaozqd opened this issue Jul 27, 2018 · 8 comments

Comments

@xiaoaozqd

Dear developers,
When I run HiC-Pro (HiC-Pro_2.10.0) on Linux (3.10.0-693.21.1.el7.x86_64 #1 x86_64 GNU/Linux) with the command "HiC-Pro -i /home/genome/Caulobacter/HiC_data/ -o /home/genome/Caulobacter/HiC_data_out -c /home/genome/Caulobacter/config-hicpro.txt", I get the error "Exit: Error: Directory Hierarchy of rawdata '/home/genome/Caulobacter/HiC_data/' is not correct. No '.fastq(.gz)' files detected".
However, the fastq files are in that directory:
"ll /home/genome/Caulobacter/HiC_data/
total 9965776
-rw-rw-r-- 1 zengqd zengqd 5102474162 Jul 27 20:12 SRR824846_1.fastq
-rw-rw-r-- 1 zengqd zengqd 5102474162 Jul 27 20:12 SRR824846_2.fastq
"
My config-hicpro.txt is:
cat config-hicpro.txt

Please change the variable settings below if necessary

#########################################################################

Paths and Settings - Do not edit !

#########################################################################

TMP_DIR = tmp
LOGS_DIR = logs
BOWTIE2_OUTPUT_DIR = bowtie_results
MAPC_OUTPUT = hic_results
RAW_DIR = rawdata

#######################################################################

SYSTEM AND SCHEDULER - Start Editing Here !!

#######################################################################
N_CPU = 40
LOGFILE = hicpro.log

JOB_NAME =
JOB_MEM =
JOB_WALLTIME =
JOB_QUEUE =
JOB_MAIL =

#########################################################################

Data

#########################################################################

PAIR1_EXT = _R1
PAIR2_EXT = _R2

#######################################################################

Alignment options

#######################################################################

FORMAT = phred33
MIN_MAPQ = 0

BOWTIE2_IDX_PATH = /home/genome/Caulobacter
BOWTIE2_GLOBAL_OPTIONS = --very-sensitive -L 30 --score-min L,-0.6,-0.2 --end-to-end --reorder
BOWTIE2_LOCAL_OPTIONS = --very-sensitive -L 20 --score-min L,-0.6,-0.2 --end-to-end --reorder

#######################################################################

Annotation files

#######################################################################

REFERENCE_GENOME = bacteria
GENOME_SIZE = bacteria.size
CAPTURE_TARGET =

#######################################################################

Allele specific analysis

#######################################################################

ALLELE_SPECIFIC_SNP =

#######################################################################

Digestion Hi-C

#######################################################################

GENOME_FRAGMENT = /home/genome/Caulobacter/bacteria_dpnii.bed
LIGATION_SITE = AAGCTAGCTT
MIN_FRAG_SIZE =
MAX_FRAG_SIZE =
MIN_INSERT_SIZE =
MAX_INSERT_SIZE =

#######################################################################

Hi-C processing

#######################################################################

MIN_CIS_DIST =
GET_ALL_INTERACTION_CLASSES = 1
GET_PROCESS_SAM = 0
RM_SINGLETON = 1
RM_MULTI = 1
RM_DUP = 1

#######################################################################

Contact Maps

#######################################################################

#BIN_SIZE = 20000 40000 100000 150000 500000 1000000
BIN_SIZE = 100000
MATRIX_FORMAT = upper

#######################################################################

Normalization

#######################################################################
MAX_ITER = 100
FILTER_LOW_COUNT_PERC = 0.02
FILTER_HIGH_COUNT_PERC = 0
EPS = 0.1

Could you kindly help me fix this problem?

@nservant
Owner

Hi
Your input directory is not organized correctly.
You need one folder per sample, i.e.
HiC_data
++ sample1
++ ++ SRR824846_1.fastq
++ ++ SRR824846_2.fastq

You will also need to update the config file with
PAIR1_EXT = _1
PAIR2_EXT = _2

Also, please check that the index files are at
/home/genome/Caulobacter/bacteria*

Best
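
For reference, the one-folder-per-sample layout described above could be produced with a small shell helper. This is only a minimal sketch: the function name `make_sample_dir` and the sample folder name are illustrative and not part of HiC-Pro, and it assumes the reads sit directly in the raw data directory with a `.fastq` or `.fastq.gz` suffix.

```shell
# Sketch (hypothetical helper, not part of HiC-Pro):
# move loose FASTQ files into a per-sample subfolder.
# Usage: make_sample_dir RAW_DIR SAMPLE_NAME
make_sample_dir() {
    raw_dir=$1
    sample=$2
    mkdir -p "$raw_dir/$sample" || return 1
    for f in "$raw_dir"/*.fastq "$raw_dir"/*.fastq.gz; do
        # A pattern that matched nothing stays literal; skip it.
        [ -f "$f" ] || continue
        mv "$f" "$raw_dir/$sample/"
    done
}
```

In this thread it would be called as `make_sample_dir /home/genome/Caulobacter/HiC_data sample1`, after which `-i` still points at `/home/genome/Caulobacter/HiC_data/`. The `PAIR1_EXT` and `PAIR2_EXT` change above is still needed.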

@xiaoaozqd
Author

Dear nservant,
Thank you so much for your help!
It worked!
Best wishes,
Qingdong

@nservant
Owner

nservant commented Mar 11, 2019 via email

@ViriatoII

ViriatoII commented Jul 9, 2019

The same happens to me, but the fastq files are there:

bin/HiC-Pro -i /gpfs/project/projects/qggp/Potatodenovo/data/HiC/sample1 -o /gpfs/project/projects/qggp/Potatodenovo/results/10x/HiC-pro -c config-hicpro.txt -p

ll /gpfs/project/projects/qggp/Potatodenovo/data/HiC/sample1
-rwxrwx--- 1 riesd QGGP 46656175972 May 4 04:19 HiC_R1.fastq
-rwxrwx--- 1 riesd QGGP 46640555968 May 4 04:34 HiC_R2.fastq

@porpheria

porpheria commented Jul 10, 2019 via email

@nservant
Owner

Please run HiC-Pro with the parent folder as input (-i), not the sample folder itself:
bin/HiC-Pro -i /gpfs/project/projects/qggp/Potatodenovo/data/HiC/ -o /gpfs/project/projects/qggp/Potatodenovo/results/10x/HiC-pro -c config-hicpro.txt -p
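
A quick way to see whether a given `-i` path has the expected shape is to count the FASTQ files in each of its subfolders. The helper below is a rough sketch: `check_raw_layout` is a hypothetical name, and the check only mirrors the layout described in this thread (sample folders one level below the input path, reads named `.fastq` or `.fastq.gz`). It is not HiC-Pro's own validation.

```shell
# Sketch (hypothetical helper, not HiC-Pro's own check):
# print how many FASTQ files each sample subfolder of RAW_DIR holds.
# Returns non-zero if no subfolder contains a .fastq/.fastq.gz file.
check_raw_layout() {
    raw_dir=$1
    found=0
    for d in "$raw_dir"/*/; do
        # No subfolders: the pattern stays literal and is skipped.
        [ -d "$d" ] || continue
        n=0
        for f in "$d"*.fastq "$d"*.fastq.gz; do
            if [ -f "$f" ]; then
                n=$((n + 1))
            fi
        done
        echo "$(basename "$d"): $n fastq file(s)"
        found=$((found + n))
    done
    [ "$found" -gt 0 ]
}
```

Pointed at the parent folder, it prints one line per sample folder. Pointed at a sample folder such as `.../HiC/sample1`, it finds no subfolders and returns a non-zero status, which matches the situation reported above.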

@wuxiaopei0509

I have encountered the same issue. I have only one sample, and I set up the reads directory like this:
02Hi-C_reads
++ Sample1
++ ++ Hi-C_R1.fq.gz
++ ++ Hi-C_R2.fq.gz
but I still get the error "Exit: Error: Directory Hierarchy of rawdata '~/02Hi-C_reads/' is not correct. No '.fastq(.gz)' files detected". Why does this happen?
Thank you!

@JeffreyMaurer

This is the first result when I Google the issue. I think the issue for wuxiaopei0509 is that HiC-Pro doesn't accept gzipped fastq files.

I could be wrong, but I didn't find anything contradictory in the documentation.
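
Another possibility, going only by the wording of the error message ('.fastq(.gz)'), is that the files are matched by suffix and that `.fq.gz` is not recognized. This is an assumption and is not confirmed anywhere in this thread. If that is the cause, renaming the files may be enough. The sketch below uses a hypothetical helper name, `rename_fq_to_fastq`, and never overwrites an existing file.

```shell
# Sketch (hypothetical helper; assumes the suffix is what matters):
# rename *.fq / *.fq.gz in one sample folder to *.fastq / *.fastq.gz.
rename_fq_to_fastq() {
    sample_dir=$1
    for f in "$sample_dir"/*.fq "$sample_dir"/*.fq.gz; do
        # A pattern that matched nothing stays literal; skip it.
        [ -f "$f" ] || continue
        case $f in
            *.fq.gz) new=${f%.fq.gz}.fastq.gz ;;
            *)       new=${f%.fq}.fastq ;;
        esac
        # Never overwrite an existing file.
        [ -e "$new" ] || mv "$f" "$new"
    done
}
```

For the layout above this would be `rename_fq_to_fastq ~/02Hi-C_reads/Sample1`, turning `Hi-C_R1.fq.gz` into `Hi-C_R1.fastq.gz`.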
