restore an option to do per-lane stats for QC #85

Open
golobor opened this issue Jan 19, 2018 · 7 comments

Comments

@golobor
Member

golobor commented Jan 19, 2018

No description provided.

@sergpolly
Member

sergpolly commented Jan 19, 2018

In other words, go from this:

process merge_chunks_into_runs {
    tag "library:${library} run:${run}"
    storeDir getIntermediateDir('pairsam_run')
 
    input:
    set val(library), val(run), file(pairsam_chunks) from LIB_RUN_GROUP_PAIRSAMS
     
    output:
    set library, run, "${library}.${run}.pairsam.${suffix}" into LIB_RUN_PAIRSAMS
 
    script:
    // can we replace this part with just the "else" branch, so that pairsamtools merge will take care of it?
    if( isSingleFile(pairsam_chunks) )
        """
        ln -s \"\$(readlink -f ${pairsam_chunks})\" ${library}.${run}.pairsam.${suffix}
        """
    else
        """
        mkdir ./tmp4sort
        pairsamtools merge ${pairsam_chunks} --nproc ${task.cpus} -o ${library}.${run}.pairsam.${suffix} --tmpdir ./tmp4sort
        rm -rf ./tmp4sort
        """
}

to something like this:

process merge_chunks_into_runs {
    tag "library:${library} run:${run}"
    storeDir getIntermediateDir('pairsam_run')
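    // NOTE: publishDir only publishes files declared in the output block,
    // so the .stats file also has to be declared there (e.g. as an optional output).
    publishDir path:'.', mode:"copy", saveAs: {
      if( it.endsWith('.stats' ))
        return getOutDir("stats_lane") +"/${library}.${run}.stats"
    }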

    input:
    set val(library), val(run), file(pairsam_chunks) from LIB_RUN_GROUP_PAIRSAMS
     
    output:
    set library, run, "${library}.${run}.pairsam.${suffix}" into LIB_RUN_PAIRSAMS
 
    script:

    optional_stats_cmd = ( params.get('lane_stats','false').toBoolean() ? 
        "pairsamtools stats -o ${library}.${run}.stats ${library}.${run}.pairsam.${suffix}" : "" )

    // can we replace this part with just the "else" branch, so that pairsamtools merge will take care of it?
    if( isSingleFile(pairsam_chunks) )
        """
        ln -s \"\$(readlink -f ${pairsam_chunks})\" ${library}.${run}.pairsam.${suffix}
        ${optional_stats_cmd}
        """
    else
        """
        mkdir ./tmp4sort
        pairsamtools merge ${pairsam_chunks} --nproc ${task.cpus} -o ${library}.${run}.pairsam.${suffix} --tmpdir ./tmp4sort
        ${optional_stats_cmd}
        rm -rf ./tmp4sort
        """
}

Something like that, just done more carefully.
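
As a standalone illustration of the toggle in the snippet above, here is a minimal plain-Groovy sketch. It assumes an ordinary Map can stand in for Nextflow's `params`; the helper name `buildStatsCmd` and the example values are hypothetical and not part of the pipeline.

```groovy
// Sketch only: mirrors the optional_stats_cmd ternary outside of Nextflow.
// `params` here is a plain Map standing in for Nextflow's params object (assumption).
String buildStatsCmd(Map params, String library, String run, String suffix) {
    // toString() first, so both the String 'true' and the Boolean true are accepted
    boolean enabled = params.get('lane_stats', 'false').toString().toBoolean()
    // An empty string leaves a blank line in the generated task script, which bash ignores
    return enabled ?
        "pairsamtools stats -o ${library}.${run}.stats ${library}.${run}.pairsam.${suffix}" :
        ""
}

// Hypothetical values, for illustration only
println buildStatsCmd([lane_stats: 'true'], 'lib1', 'lane1', 'gz')
```

With `lane_stats` absent, the helper returns an empty string, so the rendered script is the same as the current one.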

@golobor
Member Author

golobor commented Jan 19, 2018

yes, this seems legit to me. I wouldn't even know how to make it more careful than this 👍

@golobor
Member Author

golobor commented Jan 19, 2018

with one exception - what does this line do?
optional_stats_cmd = params['map'].get('drop_sam','false').toBoolean()

@sergpolly
Member

A copy-paste leftover.

@sergpolly
Member

I've edited the comment to remove it.
I'll test this whenever I can...

@mimakaev

mimakaev commented Feb 13, 2020

Hi guys. I just realized that not only the per-lane stats are missing, but the per-library stats as well. Is that intentional? I find it very limiting not to have stats for replicates.

EDIT: found them in the pairs folder. Maybe we should put them in a separate folder? Or is that too difficult?

@Phlya
Member

Phlya commented Feb 13, 2020

Is there really any reason why the code from above is not merged? Should this be done together with the stats format change? open2c/pairtools#79
