nf-core · nservant · Apr 30, 2019 · Apr 28, 2019 · Apr 30, 2019 · Apr 30, 2019
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
@@ -6,7 +6,9 @@ We try to manage the required tasks for nf-core/hic using GitHub issues, you pro
 
 However, don't be put off by this template - other more general issues and suggestions are welcome! Contributions to the code are even more welcome ;)
 
-> If you need help using or modifying nf-core/hic then the best place to go is the Gitter chatroom where you can ask us questions directly: https://gitter.im/nf-core/Lobby
+> If you need help using or modifying nf-core/hic then the best place to ask is on the pipeline channel on [Slack](https://nf-core-invite.herokuapp.com/).
+
+
 
 ## Contribution workflow
 If you'd like to write some code for nf-core/hic, the standard workflow
@@ -42,4 +44,4 @@ If there are any failures then the automated tests fail.
 These tests are run both with the latest available version of Nextflow and also the minimum required version that is stated in the pipeline code.
 
 ## Getting help
-For further information/help, please consult the [nf-core/hic documentation](https://github.com/nf-core/hic#documentation) and don't hesitate to get in touch on [Gitter](https://gitter.im/nf-core/Lobby)
+For further information/help, please consult the [nf-core/hic documentation](https://github.com/nf-core/hic#documentation) and don't hesitate to get in touch on the pipeline channel on [Slack](https://nf-core-invite.herokuapp.com/).
diff --git a/.github/markdownlint.yml b/.github/markdownlint.yml
@@ -0,0 +1,9 @@
+# Markdownlint configuration file
+default: true
+line-length: false
+no-multiple-blanks: 0
+blanks-around-headers: false
+blanks-around-lists: false
+header-increment: false
+no-duplicate-header:
+    siblings_only: true
diff --git a/.gitignore b/.gitignore
@@ -4,3 +4,4 @@ data/
 results/
 .DS_Store
 tests/test_data
+*.pyc
diff --git a/.travis.yml b/.travis.yml
@@ -13,6 +13,7 @@ before_install:
   # Pull the docker image first so the test doesn't wait for this
   - docker pull nfcore/hic:dev
   # Fake the tag locally so that the pipeline runs properly
+  # Looks weird when this is :dev to :dev, but makes sense when testing code for a release (:dev to :1.0.1)
   - docker tag nfcore/hic:dev nfcore/hic:dev
 
 install:
@@ -25,12 +26,17 @@ install:
   - pip install nf-core
   # Reset
   - mkdir ${TRAVIS_BUILD_DIR}/tests && cd ${TRAVIS_BUILD_DIR}/tests
+  # Install markdownlint-cli
+  - sudo apt-get install npm && npm install -g markdownlint-cli
 
 env:
   - NXF_VER='0.32.0' # Specify a minimum NF version that should be tested and work
+  - NXF_VER='' # Plus: get the latest NF version and check that it works
 
 script:
   # Lint the pipeline code
   - nf-core lint ${TRAVIS_BUILD_DIR}
+  # Lint the documentation
+  - markdownlint ${TRAVIS_BUILD_DIR} -c ${TRAVIS_BUILD_DIR}/.github/markdownlint.yml
   # Run the pipeline with the test profile
   - nextflow run ${TRAVIS_BUILD_DIR} -profile test,docker
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,14 +2,15 @@
 
 ## v1.0dev - 2019-04-09
 
-	First version of nf-core-hic pipeline which is a Nextflow implementation of the HiC-Pro pipeline [https://github.com/nservant/HiC-Pro].
-	Note that all HiC-Pro functionalities are not yet all implemented. The current version is designed for protocols based on restriction enzyme digestion.
-
-	In summary, this version allows :
-	* Automatic detection and generation of annotation files based on igenomes if not provided.
-	* Two-steps alignment of raw sequencing reads
-	* Reads filtering and detection of valid interaction products
-	* Generation of raw contact matrices for a set of resolutions
-	* Normalization of the contact maps using the ICE algorithm
-	* Generation of cooler file for visualization on higlass [https://higlass.io/]
-	* Quality report based on HiC-Pro MultiQC module
+First version of nf-core-hic pipeline which is a Nextflow implementation of the [HiC-Pro pipeline](https://github.com/nservant/HiC-Pro/).
+Note that not all HiC-Pro functionalities are implemented yet. The current version is designed for protocols based on restriction enzyme digestion.
+
+In summary, this version allows :
+
+* Automatic detection and generation of annotation files based on igenomes if not provided.
+* Two-steps alignment of raw sequencing reads
+* Reads filtering and detection of valid interaction products
+* Generation of raw contact matrices for a set of resolutions
+* Normalization of the contact maps using the ICE algorithm
+* Generation of cooler file for visualization on [higlass](https://higlass.io/)
+* Quality report based on HiC-Pro MultiQC module
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
@@ -34,7 +34,7 @@ This Code of Conduct applies both within project spaces and in public spaces whe
 
 ## Enforcement
 
-Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team on the [Gitter channel](https://gitter.im/nf-core/Lobby). The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team on [Slack](https://nf-core-invite.herokuapp.com/). The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
 
 Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
 

diff --git a/README.md b/README.md
@@ -1,8 +1,8 @@
-# ![nf-core/hic](docs/images/nfcore-hic_logo.png)
+# nf-core/hic
 
-**Analysis of Chromosome Conformation Capture data (Hi-C)**
+**Analysis of Chromosome Conformation Capture data (Hi-C)**.
 
-[![Build Status](https://travis-ci.org/nf-core/hic.svg?branch=master)](https://travis-ci.org/nf-core/hic)
+[![Build Status](https://travis-ci.com/nf-core/hic.svg?branch=master)](https://travis-ci.com/nf-core/hic)
 [![Nextflow](https://img.shields.io/badge/nextflow-%E2%89%A50.32.0-brightgreen.svg)](https://www.nextflow.io/)
 
 [![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg)](http://bioconda.github.io/)

diff --git a/assets/email_template.txt b/assets/email_template.txt
@@ -17,23 +17,6 @@ ${errorReport}
 } %>
 
 
-<% if (!success){
-    out << """####################################################
-## nf-core/hic execution completed unsuccessfully! ##
-####################################################
-The exit status of the task that caused the workflow execution to fail was: $exitStatus.
-The full error message was:
-
-${errorReport}
-"""
-} else {
-    out << "## nf-core/hic execution completed successfully! ##"
-}
-%>
-
-
-
-
 The workflow was completed at $dateComplete (duration: $duration)
 
 The command used to launch the workflow was as follows:

diff --git a/assets/multiqc_config.yaml b/assets/multiqc_config.yaml
@@ -0,0 +1,9 @@
+report_comment: >
+    This report has been generated by the <a href="https://github.com/nf-core/hic" target="_blank">nf-core/hic</a>
+    analysis pipeline. For information about how to interpret these results, please see the
+    <a href="https://github.com/nf-core/hic" target="_blank">documentation</a>.
+report_section_order:
+    nf-core/hic-software-versions:
+        order: -1000
+
+export_plots: true
diff --git a/assets/sendmail_template.txt b/assets/sendmail_template.txt
@@ -1,11 +1,36 @@
 To: $email
 Subject: $subject
 Mime-Version: 1.0
-Content-Type: multipart/related;boundary="nfmimeboundary"
+Content-Type: multipart/related;boundary="nfcoremimeboundary"
 
---nfmimeboundary
+--nfcoremimeboundary
 Content-Type: text/html; charset=utf-8
 
 $email_html
 
---nfmimeboundary--
+<%
+if (mqcFile){
+def mqcFileObj = new File("$mqcFile")
+if (mqcFileObj.length() < mqcMaxSize){
+out << """
+--nfcoremimeboundary
+Content-Type: text/html; name=\"multiqc_report\"
+Content-Transfer-Encoding: base64
+Content-ID: <mqcreport>
+Content-Disposition: attachment; filename=\"${mqcFileObj.getName()}\"
+
+${mqcFileObj.
+  bytes.
+  encodeBase64().
+  toString().
+  tokenize( '\n' )*.
+  toList()*.
+  collate( 76 )*.
+  collect { it.join() }.
+  flatten().
+  join( '\n' )}
+"""
+}}
+%>
+
+--nfcoremimeboundary--
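
The Groovy chain in the template above base64-encodes the MultiQC report and re-wraps the encoded text into 76-character lines before attaching it. The same idea can be sketched in Python. This is a minimal illustration only: the function name `wrap_base64` and the sample payload are assumptions, not part of the pipeline.

```python
import base64


def wrap_base64(data, width=76):
    """Base64-encode bytes and split the result into lines of at most `width` characters."""
    encoded = base64.b64encode(data).decode('ascii')
    # Slice the encoded string into fixed-width chunks, one per line.
    return '\n'.join(encoded[i:i + width] for i in range(0, len(encoded), width))


# Hypothetical payload standing in for the bytes of the report file.
wrapped = wrap_base64(b'x' * 100)
```

Decoding the wrapped text with `base64.b64decode` gives back the original bytes, since the decoder skips the inserted newlines by default.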
diff --git a/bin/scrape_software_versions.py b/bin/scrape_software_versions.py
@@ -3,14 +3,22 @@
 from collections import OrderedDict
 import re
 
-# TODO nf-core: Add additional regexes for new tools in process get_software_versions
+# Add additional regexes for new tools in process get_software_versions
 regexes = {
     'nf-core/hic': ['v_pipeline.txt', r"(\S+)"],
     'Nextflow': ['v_nextflow.txt', r"(\S+)"],
+    'Bowtie2': ['v_bowtie2.txt', r"Bowtie2 v(\S+)"],
+    'Python': ['v_python.txt', r"Python v(\S+)"],
+    'Samtools': ['v_samtools.txt', r"Samtools v(\S+)"],
+    'MultiQC': ['v_multiqc.txt', r"multiqc, version (\S+)"],
 }
 results = OrderedDict()
 results['nf-core/hic'] = '<span style="color:#999999;\">N/A</span>'
 results['Nextflow'] = '<span style="color:#999999;\">N/A</span>'
+results['Bowtie2'] = '<span style="color:#999999;\">N/A</span>'
+results['Python'] = '<span style="color:#999999;\">N/A</span>'
+results['Samtools'] = '<span style="color:#999999;\">N/A</span>'
+results['MultiQC'] = '<span style="color:#999999;\">N/A</span>'
 
 # Search each file using its regex
 for k, v in regexes.items():
@@ -20,9 +28,14 @@
         if match:
             results[k] = "v{}".format(match.group(1))
 
+# Remove software set to false in results
+for k in list(results):
+    if not results[k]:
+        del(results[k])
+
 # Dump to YAML
 print ('''
-id: 'nf-core/hic-software-versions'
+id: 'software_versions'
 section_name: 'nf-core/hic Software Versions'
 section_href: 'https://github.com/nf-core/hic'
 plot_type: 'html'
@@ -31,5 +44,10 @@
     <dl class="dl-horizontal">
 ''')
 for k,v in results.items():
-    print("        <dt>{}</dt><dd>{}</dd>".format(k,v))
+    print("        <dt>{}</dt><dd><samp>{}</samp></dd>".format(k,v))
 print ("    </dl>")
+
+# Write out the software versions as a tab-separated file (despite the .csv extension):
+with open('software_versions.csv', 'w') as f:
+    for k,v in results.items():
+        f.write("{}\t{}\n".format(k,v))
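
The scraping step in this script (one regex per tool, applied to that tool's version file) can be tried in isolation. The sketch below is an illustration under stated assumptions: the helper names `scrape_versions` and `drop_missing` are hypothetical, in-memory strings stand in for the `v_*.txt` files, and missing tools are marked with `None` rather than the HTML placeholder used in the script.

```python
from collections import OrderedDict
import re


def scrape_versions(regexes, file_contents):
    """Map each tool to 'v<version>' if its regex matches, otherwise to None."""
    results = OrderedDict((tool, None) for tool in regexes)
    for tool, (filename, pattern) in regexes.items():
        text = file_contents.get(filename)
        if text is None:
            continue  # no version file was produced for this tool
        match = re.search(pattern, text)
        if match:
            results[tool] = "v{}".format(match.group(1))
    return results


def drop_missing(results):
    """Remove tools without a detected version."""
    # Iterate over a snapshot of the keys so that deleting entries is safe.
    for tool in list(results):
        if not results[tool]:
            del results[tool]
    return results


# Two patterns taken from the diff; the file contents are made up for the example.
regexes = OrderedDict([
    ('MultiQC', ['v_multiqc.txt', r"multiqc, version (\S+)"]),
    ('Samtools', ['v_samtools.txt', r"Samtools v(\S+)"]),
])
contents = {'v_multiqc.txt': 'multiqc, version 1.7\n'}
versions = drop_missing(scrape_versions(regexes, contents))
```

Here only MultiQC survives the filter, because no Samtools version text was supplied.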
diff --git a/conf/awsbatch.config b/conf/awsbatch.config
@@ -1,10 +1,15 @@
 /*
  * -------------------------------------------------
- *  Nextflow config file for AWS Batch
+ *  Nextflow config file for running on AWS batch
  * -------------------------------------------------
- * Imported under the 'awsbatch' Nextflow profile in nextflow.config
- * Uses docker for software depedencies automagically, so not specified here.
+ * Base config needed for running with -profile awsbatch
  */
+params {
+  config_profile_name = 'AWSBATCH'
+  config_profile_description = 'AWSBATCH Cloud Profile'
+  config_profile_contact = 'Alexander Peltzer (@apeltzer)'
+  config_profile_url = 'https://aws.amazon.com/de/batch/'
+}
 
 aws.region = params.awsregion
 process.executor = 'awsbatch'

diff --git a/conf/base.config b/conf/base.config
@@ -1,6 +1,6 @@
 /*
  * -------------------------------------------------
- *  Nextflow base config file
+ *  nf-core/hic Nextflow base config file
  * -------------------------------------------------
  * A 'blank slate' config file, appropriate for general
  * use on most high performance compute environments.
@@ -11,21 +11,20 @@
 
 process {
 
-  container = process.container
-
-  cpus = { check_max( 2, 'cpus' ) }
+  // Check the defaults for all processes
+  cpus = { check_max( 1 * task.attempt, 'cpus' ) }
   memory = { check_max( 8.GB * task.attempt, 'memory' ) }
   time = { check_max( 2.h * task.attempt, 'time' ) }
 
-  errorStrategy = { task.exitStatus in [1,143,137,104,134,139] ? 'retry' : 'terminate' }
+  errorStrategy = { task.exitStatus in [143,137,104,134,139] ? 'retry' : 'finish' }
   maxRetries = 1
   maxErrors = '-1'
 
   // Process-specific resource requirements
   withName:makeBowtie2Index {
      cpus = { check_max( 1, 'cpus' ) }
      memory = { check_max( 10.GB * task.attempt, 'memory' ) }
-     time = { check_max( 12.h * task.attempt, 'time' ) } 
+     time = { check_max( 12.h * task.attempt, 'time' ) }
   }
   withName:bowtie2_end_to_end {
     cpus = { check_max( 4, 'cpus' ) }

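The resource settings in `conf/base.config` combine two ideas: requests grow with `task.attempt`, and selected exit codes trigger a retry instead of ending the run. The Python model below is a rough sketch of that behaviour, not the Nextflow implementation. The real `check_max` helper is not shown in this diff, so the `min`-based version, the function names and the default limit here are all assumptions.

```python
# Exit codes that lead to a retry in the updated errorStrategy.
RETRY_EXIT_CODES = {143, 137, 104, 134, 139}


def check_max(requested, limit):
    """Assumed behaviour: cap a resource request at the configured maximum."""
    return min(requested, limit)


def error_strategy(exit_status):
    """Retry on the listed exit codes, otherwise let running tasks finish."""
    return 'retry' if exit_status in RETRY_EXIT_CODES else 'finish'


def memory_gb(attempt, base_gb=8, max_gb=128):
    """Memory request in GB for a given attempt; max_gb is a hypothetical limit."""
    return check_max(base_gb * attempt, max_gb)
```

With these definitions, a task killed with exit code 137 would be retried with a doubled memory request, while exit code 1 no longer triggers a retry. Under a 4 GB cap such as the one in `conf/test.config`, the request stays at 4 GB regardless of the attempt number.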
diff --git a/conf/igenomes.config b/conf/igenomes.config
@@ -60,7 +60,7 @@ params {
     }
     'Gm01' {
       fasta   = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/WholeGenomeFasta/genome.fa"
-      bowtie2 = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/Bowtie2Index/genome"                                                                                                                 
+      bowtie2 = "${params.igenomes_base}/Glycine_max/Ensembl/Gm01/Sequence/Bowtie2Index/genome"
     }
     'Mmul_1' {
       fasta   = "${params.igenomes_base}/Macaca_mulatta/Ensembl/Mmul_1/Sequence/WholeGenomeFasta/genome.fa"
@@ -96,7 +96,7 @@ params {
     }
     'AGPv3' {
       fasta   = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/WholeGenomeFasta/genome.fa"
-      bowtie2 = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/Bowtie2Index/genome"                                     
+      bowtie2 = "${params.igenomes_base}/Zea_mays/Ensembl/AGPv3/Sequence/Bowtie2Index/genome"
     }
   }
 }
diff --git a/conf/test.config b/conf/test.config
@@ -16,7 +16,7 @@ params {
   max_cpus = 2
   max_memory = 4.GB
   max_time = 1.h
-  
+
   // Input data
   readPaths = [
     ['SRR4292758_00', ['https://github.com/nf-core/test-datasets/raw/hic/data/SRR4292758_00_R1.fastq.gz', 'https://github.com/nf-core/test-datasets/raw/hic/data/SRR4292758_00_R2.fastq.gz']]
@@ -31,4 +31,3 @@ params {
   // Options
   skip_cool = true
 }
-
diff --git a/docs/configuration/local.md b/docs/configuration/local.md
@@ -10,6 +10,7 @@ Nextflow has [excellent integration](https://www.nextflow.io/docs/latest/docker.
 First, install docker on your system: [Docker Installation Instructions](https://docs.docker.com/engine/installation/)
 
 Then, simply run the analysis pipeline:
+
 ```bash
 nextflow run nf-core/hic -profile docker --genome '<genome ID>'
 ```

diff --git a/docs/configuration/reference_genomes.md b/docs/configuration/reference_genomes.md
@@ -39,11 +39,12 @@ Multiple reference index types are held together with consistent structure for m
 We have put a copy of iGenomes up onto AWS S3 hosting and this pipeline is configured to use this by default.
 The hosting fees for AWS iGenomes are currently kindly funded by a grant from Amazon.
 The pipeline will automatically download the required reference files when you run the pipeline.
-For more information about the AWS iGenomes, see https://ewels.github.io/AWS-iGenomes/
+For more information about the AWS iGenomes, see [AWS-iGenomes](https://ewels.github.io/AWS-iGenomes/)
 
 Downloading the files takes time and bandwidth, so we recommend making a local copy of the iGenomes resource.
 Once downloaded, you can customise the variable `params.igenomes_base` in your custom configuration file to point to the reference location.
 For example:
+
 ```nextflow
 params.igenomes_base = '/path/to/data/igenomes/'
 ```
diff --git a/docs/installation.md b/docs/installation.md
@@ -74,7 +74,7 @@ Be warned of two important points about this default configuration:
 #### 3.1) Software deps: Docker
 First, install docker on your system: [Docker Installation Instructions](https://docs.docker.com/engine/installation/)
 
-Then, running the pipeline with the option `-profile docker` tells Nextflow to enable Docker for this run. An image containing all of the software requirements will be automatically fetched and used from dockerhub (https://hub.docker.com/r/nfcore/hic).
+Then, running the pipeline with the option `-profile docker` tells Nextflow to enable Docker for this run. An image containing all of the software requirements will be automatically fetched and used from [dockerhub](https://hub.docker.com/r/nfcore/hic).
 
 #### 3.1) Software deps: Singularity
 If you're not able to use Docker then [Singularity](http://singularity.lbl.gov/) is a great alternative.

diff --git a/docs/output.md b/docs/output.md
@@ -26,7 +26,7 @@ Singletons are discarded, and multi-hits are filtered according to the configura
 Note that if the `--dnase` mode is activated, HiC-Pro will skip the second mapping step.
 
 **Output directory: `results/mapping`**
-                                                                                                                                                                                                            
+
 * `*bwt2pairs.bam` - final BAM file with aligned paired data
 * `*.pairstat` - mapping statistics
 
@@ -50,7 +50,7 @@ Invalid pairs are classified as follow:
 * Dangling end, i.e. unligated fragments (both reads mapped on the same restriction fragment)
 * Self circles, i.e. fragments ligated on themselves (both reads mapped on the same restriction fragment in inverted orientation)
 * Religation, i.e. ligation of juxtaposed fragments
-* Filtered pairs, i.e. any pairs that do not match the filtering criteria on inserts size, restriction fragments size 
+* Filtered pairs, i.e. any pairs that do not match the filtering criteria on inserts size, restriction fragments size
 * Dumped pairs, i.e. any pairs for which we were not able to reconstruct the ligation product.
 
 Only valid pairs involving two different restriction fragments are used to build the contact maps.
@@ -59,12 +59,12 @@ Duplicated valid pairs associated to PCR artefacts are discarded (see `--rm_dup`
 In case of Hi-C protocols that do not require a restriction enzyme such as DNase Hi-C or micro Hi-C, the assignment to a restriction is not possible (see `--dnase`).
 Short range interactions that are likely to be spurious ligation products can thus be discarded using the `--min_cis_dist` parameter.
 
-* `*.validPairs` - List of valid ligation products 
+* `*.validPairs` - List of valid ligation products
 * `*RSstat` - Statistics of the number of read pairs falling in each category
 
 The validPairs are stored using a simple tab-delimited text format ;
 
-```
+```bash
 read name / chr_reads1 / pos_reads1 / strand_reads1 / chr_reads2 / pos_reads2 / strand_reads2 / fragment_size / res frag name R1 / res frag R2 / mapping qual R1 / mapping qual R2 [/ allele_specific_tag]
 ```
 
@@ -102,7 +102,7 @@ A contact map is defined by :
 
 Based on the observation that a contact map is symmetric and usually sparse, only non-zero values are stored for half of the matrix. The user can specify whether the 'upper', 'lower' or 'complete' matrix is stored. The 'asis' option stores the contacts as they are observed in the valid pairs files.
 
-```
+```bash
    A   B   10
    A   C   23
    B   C   24
@@ -124,4 +124,4 @@ The pipeline has special steps which allow the software versions used to be repo
 * `Project_multiqc_data/`
   * Directory containing parsed statistics from the different tools used in the pipeline
 
-For more information about how to use MultiQC reports, see http://multiqc.info
+For more information about how to use MultiQC reports, see [http://multiqc.info](http://multiqc.info)