Feat/cicd 1 #2

Merged
merged 21 commits into from
Jul 21, 2020
21 commits
eec6499
Update .travis.yml
taoliu Jul 1, 2020
60e684a
Merge pull request #2 from taoliu/patch-travis-release-to-conda
taoliu Jul 2, 2020
bc7da3b
update python package version number
taoliu Jul 15, 2020
d3944bd
Update parameters in macs2: --keep-dup all -f BAMPE -g hs/mm
DongqingSun Jul 16, 2020
9373410
Generate all temp files in output directory
DongqingSun Jul 16, 2020
23b1f10
Modify .gitignore
DongqingSun Jul 16, 2020
bd527f4
Merge branch 'master' of github.com:taoliu/MAESTRO
taoliu Jul 16, 2020
058f979
Merge branch 'master' into feat/support_gzipped_files
taoliu Jul 16, 2020
5aa9b56
Support fragment and bam as input
DongqingSun Jul 17, 2020
a89da72
add universal_open function to scATAC_utility.py to support both unco…
taoliu Jul 17, 2020
b37ae68
testing data for scatac-peakcount
taoliu Jul 17, 2020
78a9e66
upload test.sh
taoliu Jul 18, 2020
d73524c
import gzip
taoliu Jul 18, 2020
eafec17
gzip open with rt mode
taoliu Jul 19, 2020
08f32bc
add more info in test.sh
taoliu Jul 19, 2020
4ac7361
Add the standard output file; let test.sh check if the result is cons…
taoliu Jul 19, 2020
3909842
Support bam file and fragment file as input, and provide an option to…
DongqingSun Jul 20, 2020
4cdd370
Merge pull request #1 from taoliu/feat/support_gzipped_files
DongqingSun96 Jul 20, 2020
3927f63
Update HTML report to adapt to different cell-type annotation methods…
DongqingSun Jul 20, 2020
addfd06
Update MAESTRO/scATAC_utility.py to support gzip
DongqingSun Jul 20, 2020
4a7400d
Set r-base =3.6.1, add Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS=t…
DongqingSun Jul 21, 2020
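
Commits a89da72 and eafec17 mention a `universal_open` function in `scATAC_utility.py` and opening gzip files in `rt` mode, but the function body is not part of the diff shown here. A minimal sketch of the idea follows. It assumes compression is detected from the `.gz` suffix alone, which is a guess and may differ from the merged helper:

```python
import gzip


def universal_open(path, mode="rt"):
    """Open a plain or gzip-compressed text file (hypothetical sketch).

    Assumption: a file is treated as gzip only if its name ends in '.gz'.
    """
    if str(path).endswith(".gz"):
        # gzip.open defaults to binary mode; 'rt' yields str lines,
        # matching what the built-in open() returns for text files.
        return gzip.open(path, mode)
    return open(path, mode)
```

With `rt` mode both branches return a file object that yields `str` lines, so downstream parsing of fragment or peak files does not need to care which kind of file it received.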
1 change: 1 addition & 0 deletions .gitignore
@@ -13,3 +13,4 @@ dist/*
 *.tar.bz2
 *.dylib
 *.tmp
+*.egg-info/*
50 changes: 49 additions & 1 deletion .travis.yml
@@ -11,6 +11,7 @@ jobs:
cache:
directories:
- ~/miniconda
- ~/Data

before_install:
# Set conda path info
@@ -29,6 +30,49 @@ before_install:
wget https://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -O $HOME/download/miniconda.sh;
fi;
fi;
# download test data for scatac
- TEST_DATA_PATH=$HOME/Data;
- TEST_DATA_SCATAC_PATH=$TEST_DATA_PATH/atac_pbmc_500_v1_fastqs_sampling;
- TEST_DATA_GIGGLE_PATH=$TEST_DATA_PATH/giggle.all;
- TEST_DATA_SCATAC_REFERENCE_PATH=$TEST_DATA_PATH/Refdata_scATAC_MAESTRO_GRCh38_1.1.0;
- if [[ -d $TEST_DATA_PATH ]]; then
echo "Directory Data already exists";
else
mkdir -p $TEST_DATA_PATH;
fi;
- if [[ -d $TEST_DATA_SCATAC_PATH ]]; then
echo "scATAC test data already available from cache";
else
cd $TEST_DATA_PATH;
echo "downloading scatac test data";
wget http://cistrome.org/~chenfei/MAESTRO/atac_pbmc_500_v1_fastqs_sampling.tar.gz;
echo "decompressing scatac test data";
tar -xvzf atac_pbmc_500_v1_fastqs_sampling.tar.gz;
rm atac_pbmc_500_v1_fastqs_sampling.tar.gz;
cd ../;
fi;
- if [[ -d $TEST_DATA_GIGGLE_PATH ]]; then
echo "scATAC giggle index data already available from cache";
else
cd $TEST_DATA_PATH;
echo "downloading giggle index";
wget http://cistrome.org/~chenfei/MAESTRO/giggle.all.tar.gz;
echo "decompressing giggle index";
tar -xvzf giggle.all.tar.gz;
rm giggle.all.tar.gz;
cd ../;
fi;
- if [[ -d $TEST_DATA_SCATAC_REFERENCE_PATH ]]; then
echo "scATAC reference genome data already available from cache";
else
cd $TEST_DATA_PATH;
echo "downloading reference genome";
wget http://cistrome.org/~chenfei/MAESTRO/Refdata_scATAC_MAESTRO_GRCh38_1.1.0.tar.gz;
echo "decompressing reference genome";
tar -xvzf Refdata_scATAC_MAESTRO_GRCh38_1.1.0.tar.gz;
rm Refdata_scATAC_MAESTRO_GRCh38_1.1.0.tar.gz;
cd ../;
fi;
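
Each of the three download steps above follows the same pattern: reuse the directory if the Travis cache restored it, otherwise download, unpack and delete the archive. The pattern can be sketched in Python as below. The `fetch` callback is a hypothetical stand-in for the `wget` + `tar` + `rm` commands and is not part of this PR:

```python
import os


def ensure_dataset(data_root, name, fetch):
    """Fetch dataset `name` into `data_root` unless it is already cached.

    `fetch(data_root, name)` is a hypothetical callback that downloads and
    unpacks the archive. Returns True if a fetch happened, False otherwise.
    """
    os.makedirs(data_root, exist_ok=True)  # same effect as: mkdir -p $TEST_DATA_PATH
    target = os.path.join(data_root, name)
    if os.path.isdir(target):              # same test as: [[ -d $TEST_DATA_..._PATH ]]
        return False
    fetch(data_root, name)
    return True
```

Because `~/Data` is listed under `cache: directories`, the fetch branch should only run on the first build or after the cache is cleared.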

# install the package and dependencies:
install:
@@ -84,6 +128,10 @@ script:
- MAESTRO -v
- R -e "library(MAESTRO);library(Seurat)"
- R -e "library(org.Hs.eg.db);library(org.Mm.eg.db)"
# test python utility scripts
- cd test
- bash test.sh
- cd ..
# We also need more testing here!

# the following codes will upload when all the above is successful and
@@ -102,7 +150,7 @@ after_success:
   - mamba install anaconda-client
   - |
     # Only upload builds from tags
-    if [[ $TRAVIS_PULL_REQUEST == false && $TRAVIS_REPO_SLUG == "liulab-dfci/MAESTRO"
+    if [[ $TRAVIS_PULL_REQUEST == false && $TRAVIS_REPO_SLUG == "dongqingsun/MAESTRO"
           && $TRAVIS_BRANCH == $TRAVIS_TAG && $TRAVIS_TAG != '' ]]; then
       export ANACONDA_API_TOKEN=$CONDA_UPLOAD_TOKEN
       anaconda upload bld-dir/**/PACKAGENAME-*.tar.bz2
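
The upload guard in `after_success` combines four conditions: the build is not a pull request, it runs in the expected repository, the branch name equals the tag name (which is how Travis reports tag builds), and the tag is non-empty. A sketch of the same predicate in Python, for illustration only; the function name `should_upload` is made up here:

```python
def should_upload(env, repo_slug):
    """Return True only for a tagged, non-PR build of `repo_slug`.

    `env` is a mapping of Travis environment variables (all values are strings).
    """
    tag = env.get("TRAVIS_TAG", "")
    return (
        env.get("TRAVIS_PULL_REQUEST") == "false"
        and env.get("TRAVIS_REPO_SLUG") == repo_slug
        and env.get("TRAVIS_BRANCH") == tag
        and tag != ""
    )
```

Changing the slug from `liulab-dfci/MAESTRO` to `dongqingsun/MAESTRO` therefore moves the upload from builds of the upstream repository to builds of the fork.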
17 changes: 17 additions & 0 deletions MAESTRO.egg-info/PKG-INFO
@@ -0,0 +1,17 @@
Metadata-Version: 1.1
Name: MAESTRO
Version: 1.2.0
Summary: MAESTRO (Model-based AnalysEs of Single-cell Transcriptome and RegulOme) is a comprehensive single-cell RNA-seq and ATAC-seq analysis suite built using snakemake.
Home-page: https://github.com/chenfeiwang/MAESTRO
Author: Chenfei Wang, Dongqing Sun
Author-email: UNKNOWN
License: GPL-3.0
Description: UNKNOWN
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GPL License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
27 changes: 27 additions & 0 deletions MAESTRO.egg-info/SOURCES.txt
@@ -0,0 +1,27 @@
README.md
setup.py
MAESTRO/MAESTRO
MAESTRO/MAESTRO_PipeInit.py
MAESTRO/__init__.py
MAESTRO/integrate_HTMLReport.py
MAESTRO/scATAC_10x_BarcodeCorrect.py
MAESTRO/scATAC_10x_PeakCount.py
MAESTRO/scATAC_BamAddTag.py
MAESTRO/scATAC_FragmentCorrect.py
MAESTRO/scATAC_FragmentGenerate.py
MAESTRO/scATAC_Genescore.py
MAESTRO/scATAC_H5Process.py
MAESTRO/scATAC_HTMLReport.py
MAESTRO/scATAC_QC.py
MAESTRO/scATAC_microfluidic_PeakCount.py
MAESTRO/scATAC_microfluidic_QC.py
MAESTRO/scATAC_sci_BarcodeExtract.py
MAESTRO/scATAC_utility.py
MAESTRO/scRNA_AnalysisPipeline.py
MAESTRO/scRNA_HTMLReport.py
MAESTRO/scRNA_QC.py
MAESTRO/scRNA_utility.py
MAESTRO.egg-info/PKG-INFO
MAESTRO.egg-info/SOURCES.txt
MAESTRO.egg-info/dependency_links.txt
MAESTRO.egg-info/top_level.txt
1 change: 1 addition & 0 deletions MAESTRO.egg-info/dependency_links.txt
@@ -0,0 +1 @@

1 change: 1 addition & 0 deletions MAESTRO.egg-info/top_level.txt
@@ -0,0 +1 @@
MAESTRO
7 changes: 5 additions & 2 deletions MAESTRO/MAESTRO
@@ -5,14 +5,15 @@
 # @Last Modified by: Dongqing Sun
 # @Last Modified time: 2020-02-28 13:58:10

-version = "1.1.0"
+version = "1.2.1"

 import logging
 import sys, os
 import shutil
 import argparse as ap

 from MAESTRO.MAESTRO_PipeInit import *
+from MAESTRO.MAESTRO_ParameterValidate import *
 from MAESTRO.scATAC_H5Process import *
 from MAESTRO.scATAC_Genescore import genescore_parser, genescore
 from MAESTRO.scATAC_10x_PeakCount import peakcount_parser, peakcount
@@ -47,17 +48,19 @@ def main():

     scrna_analysis_parser(subparsers)

-    logging.basicConfig(format="%(message)s", level=logging.INFO, stream=sys.stderr)
+    logging.basicConfig(format="%(levelname)s: %(message)s", stream=sys.stderr)

     args = parser.parse_args()

     if args.version:
         print(version)
         exit(0)
     elif args.subcommand == "scatac-init":
+        scatac_validator(args)
         scatac_config(args)

     elif args.subcommand == "scrna-init":
+        scrna_validator(args)
         scrna_config(args)

     elif args.subcommand == "integrate-init":
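
The two `logging.basicConfig` lines above differ in two ways: the new format prefixes each message with its level name, and dropping `level=logging.INFO` leaves the root logger at its default level of `WARNING`, so `logging.info(...)` calls are no longer printed. The sketch below reproduces both effects on an isolated demo logger (the `render` helper and logger names are made up for this example) so that it does not touch the root logger:

```python
import io
import logging


def render(fmt, level, emit):
    """Return the text a fresh logger with this format and level would write."""
    buf = io.StringIO()
    logger = logging.getLogger("maestro-demo-%d" % id(buf))  # isolated demo logger
    logger.propagate = False
    handler = logging.StreamHandler(buf)
    handler.setFormatter(logging.Formatter(fmt))
    logger.addHandler(handler)
    logger.setLevel(level)
    emit(logger)
    return buf.getvalue()


def emit(logger):
    logger.info("starting")
    logger.error("--bam is required.")


old = render("%(message)s", logging.INFO, emit)                    # pre-PR settings
new = render("%(levelname)s: %(message)s", logging.WARNING, emit)  # post-PR settings
print(repr(old))  # 'starting\n--bam is required.\n'
print(repr(new))  # 'ERROR: --bam is required.\n'
```

The `logging.error` messages emitted by the new validators are thus shown with an `ERROR:` prefix, while informational messages are suppressed unless a level is configured elsewhere.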
138 changes: 138 additions & 0 deletions MAESTRO/MAESTRO_ParameterValidate.py
@@ -0,0 +1,138 @@
# -*- coding: utf-8 -*-
# @Author: Dongqing Sun
# @E-mail: Dongqingsun96@gmail.com
# @Date: 2020-07-19 17:20:32
# @Last Modified by: Dongqing Sun
# @Last Modified time: 2020-07-20 13:49:05


import sys
import os
import re
import logging
from argparse import ArgumentError


def scatac_validator(args):
    """
    Validate parameters from scatac-init argument parsers.
    """
    if args.platform == "10x-genomics":
        if args.format == "fastq":
            if args.fastq_dir == "":
                logging.error("--fastq-dir is required. Please specify the directory where fastq files are stored!")
                exit(1)
            if args.fastq_prefix == "":
                logging.error("--fastq-prefix is required. Please provide the sample name of fastq files!")
                exit(1)
            if args.fasta == "":
                logging.error("--fasta is required if fastq files are provided!")
                exit(1)
            if args.whitelist == "":
                logging.error("--whitelist is required for 10x-genomics data!")
                exit(1)
        if args.format == "bam":
            if args.bam == "":
                logging.error("--bam is required. Please provide the bam file with CB tag!")
                exit(1)
        if args.format == "fragments":
            if args.frag == "":
                logging.error("--frag is required. Please provide the fragment file generated by CellRanger ATAC!")
                exit(1)

    if args.platform == "sci-ATAC-seq":
        if args.format == "fastq":
            if args.fastq_dir == "":
                logging.error("--fastq-dir is required. Please specify the directory where fastq files are stored!")
                exit(1)
            if args.fastq_prefix == "":
                logging.error("--fastq-prefix is required. Please provide the sample name of fastq files!")
                exit(1)
            if args.fasta == "":
                logging.error("--fasta is required if fastq files are provided!")
                exit(1)
        if args.format == "bam":
            if args.bam == "":
                logging.error("--bam is required. Please provide the bam file with CB tag!")
                exit(1)
        if args.format == "fragments":
            logging.error("Format of 'fragments' is supported only when the platform is '10x-genomics'.")
            exit(1)

    if args.platform == "microfluidic":
        if args.format == "fastq":
            if args.fastq_dir == "":
                logging.error("--fastq-dir is required. Please specify the directory where fastq files are stored!")
                exit(1)
            if args.fasta == "":
                logging.error("--fasta is required if fastq files are provided!")
                exit(1)
        if args.format == "bam":
            logging.error("Format of 'bam' is supported when the platform is '10x-genomics' or 'sci-ATAC-seq'.")
            exit(1)
        if args.format == "fragments":
            logging.error("Format of 'fragments' is supported only when the platform is '10x-genomics'.")
            exit(1)

    # Accept either a built-in signature name or a path to a custom signature file.
    if args.signature not in ['human.immune.CIBERSORT', 'mouse.brain.ALLEN', 'mouse.all.facs.TabulaMuris', 'mouse.all.droplet.TabulaMuris']:
        if os.path.exists(args.signature):
            pass
        else:
            logging.error("Please specify the signature built in MAESTRO or provide customized signature file. See --signature help for more details!")
            exit(1)
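
`scatac_validator` repeats one pattern for every platform and format: check that each required option is non-empty, or reject the combination outright. As a possible alternative (not what this PR implements), the same rules could be held in a lookup table. In the sketch below, the table is my reading of the checks above, the name `missing_options` is made up, and problems are returned as a list instead of calling `exit(1)`:

```python
from argparse import Namespace

# Required options per (platform, format), transcribed from the checks above.
REQUIRED = {
    ("10x-genomics", "fastq"): ["fastq_dir", "fastq_prefix", "fasta", "whitelist"],
    ("10x-genomics", "bam"): ["bam"],
    ("10x-genomics", "fragments"): ["frag"],
    ("sci-ATAC-seq", "fastq"): ["fastq_dir", "fastq_prefix", "fasta"],
    ("sci-ATAC-seq", "bam"): ["bam"],
    ("microfluidic", "fastq"): ["fastq_dir", "fasta"],
}


def missing_options(args):
    """Return a list of problems; an empty list means the arguments pass."""
    key = (args.platform, args.format)
    if key not in REQUIRED:
        return ["format '%s' is not supported for platform '%s'"
                % (args.format, args.platform)]
    return ["--%s is required" % name.replace("_", "-")
            for name in REQUIRED[key]
            if getattr(args, name, "") == ""]


args = Namespace(platform="10x-genomics", format="bam", bam="")
print(missing_options(args))  # ['--bam is required']
```

One behavioural difference to keep in mind: the validator in the PR silently accepts an unknown platform, whereas this sketch reports any combination missing from the table.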


def scrna_validator(args):
    """
    Validate parameters from scrna-init argument parsers.
    """

    if args.platform == "10x-genomics":
        if args.fastq_dir == "":
            logging.error("--fastq-dir is required. Please specify the directory where fastq files are stored!")
            exit(1)
        if args.fastq_prefix == "":
            logging.error("--fastq-prefix is required. Please provide the sample name of fastq files!")
            exit(1)
        if args.whitelist == "":
            logging.error("--whitelist is required for 10x-genomics data!")
            exit(1)

    if args.platform == "Dropseq":
        if args.fastq_dir == "":
            logging.error("--fastq-dir is required for Dropseq data. Please specify the directory where fastq files are stored!")
            exit(1)
        if args.fastq_barcode == "":
            logging.error("--fastq-barcode is required for Dropseq data. Please specify the barcode fastq file!")
            exit(1)
        if args.fastq_transcript == "":
            logging.error("--fastq-transcript is required for Dropseq data. Please specify the transcript fastq file!")
            exit(1)
        if args.whitelist == "":
            logging.error("--whitelist is required for Dropseq data. Please provide the barcode whitelist.")
            exit(1)

    if args.platform == "Smartseq2":
        if args.fastq_dir == "":
            logging.error("--fastq-dir is required. Please specify the directory where fastq files are stored!")
            exit(1)
        if args.rsem == "":
            logging.error("--rsem is required. Please provide the prefix of transcript references for RSEM. See --rsem help for more details.")
            exit(1)

    if args.lisamode == "local":
        if args.lisaenv == "":
            logging.error("--lisaenv is required when lisamode is 'local'. Please specify the name of LISA environment!")
            exit(1)
        if args.condadir == "":
            logging.error("--condadir is required when lisamode is 'local'. Please specify the directory where miniconda or anaconda is installed!")
            exit(1)

    # Accept either a built-in signature name or a path to a custom signature file.
    if args.signature not in ['human.immune.CIBERSORT', 'mouse.brain.ALLEN', 'mouse.all.facs.TabulaMuris', 'mouse.all.droplet.TabulaMuris']:
        if os.path.exists(args.signature):
            pass
        else:
            logging.error("Please specify the signature built in MAESTRO or provide customized signature file. See --signature help for more details!")
            exit(1)
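
Both validators end with the same signature check: the value is accepted if it is one of the built-in names or if it is a path that exists. Since the check is duplicated, it could be factored into a small helper. The sketch below is one way to do that and is not part of the PR; `signature_ok` and the `exists` parameter are hypothetical names, and only two of the built-in names are listed, for illustration:

```python
import os


def signature_ok(signature, builtin, exists=os.path.exists):
    """True if `signature` is a built-in name or a path to an existing file.

    `exists` is injectable so the check can be tested without touching disk.
    """
    return signature in builtin or exists(signature)


# Two of the built-in names from the list above, for illustration only.
BUILTIN = {"human.immune.CIBERSORT", "mouse.all.facs.TabulaMuris"}

print(signature_ok("human.immune.CIBERSORT", BUILTIN))                 # True
print(signature_ok("/no/such/file", BUILTIN, exists=lambda p: False))  # False
```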

