Merge branch 'hotfix/v1.8.1'
dbolotin committed Jun 29, 2016
2 parents bf71575 + 086575a commit 1803550
Showing 20 changed files with 543 additions and 106 deletions.
13 changes: 13 additions & 0 deletions CHANGELOG
@@ -1,4 +1,17 @@

MiXCR 1.8.1 (29 Jun 2016)
========================

-- Revert quality filtering/mapping algorithm to MiXCR 1.7.x
-- Added different quality aggregation algorithms for assemble (Average, Min, Max, MiniMax), selected with an option
such as `-OqualityAggregationType=Max`
-- Added `clonesDiff` action to calculate descriptive statistics of the difference between two samples
-- Fixed wrong anchor point positions in Macaca mulatta IGL reference imported from IMGT
-- MiXCR returns exit code 1 if the program terminated with an error
-- Automatic correction of -OvParameters.geneFeatureToAlign in align in some cases (e.g. IMGT
reference and rna-seq parameters)


MiXCR 1.8 (21 Jun 2016)
========================

13 changes: 8 additions & 5 deletions doc/assemble.rst
@@ -133,16 +133,19 @@ Other global parameters are:
+=================================+=================+==========================================================================================+
| ``minimalClonalSequenceLength`` | ``12`` | Minimal length of clonal sequence |
+---------------------------------+-----------------+------------------------------------------------------------------------------------------+
| ``minimalMeanQuality`` | ``32`` | Minimal value of mean quality to consider sequence as a "good" one. If mean sequence |
| | | quality is lower than ``minimalMeanQuality``, then that sequence will be deferred for |
| | | further processing by mapper. |
| ``qualityAggregationType`` | ``Max`` | Algorithm used for aggregation of total clonal sequence quality. Possible values: |
| | | ``Max`` (maximal quality across all reads for each position), |
| | | ``Min`` (minimal quality across all reads for each position), |
| | | ``Average`` (average quality across all reads for each position), |
| | | ``MiniMax`` (all letters have the same quality, which is the maximum over all reads of |
| | | the minimal quality of the clonal sequence in each read). |
+---------------------------------+-----------------+------------------------------------------------------------------------------------------+
| ``minimalQuality`` | ``20`` | Minimal allowed quality of each nucleotide of aggregated clone. If at least one |
| ``minimalQuality`` | ``0`` | Minimal allowed quality of each nucleotide of aggregated clone. If at least one |
| | | nucleotide in the aggregated clone has quality lower than ``minimalQuality``, this clone |
| | | will be dropped (remember that qualities of reads are summed when assembling core |
| | | clonotypes). |
+---------------------------------+-----------------+------------------------------------------------------------------------------------------+
| ``badQualityThreshold`` | ``18`` | Minimal value of sequencing quality score: nucleotides with lower quality are |
| ``badQualityThreshold`` | ``20`` | Minimal value of sequencing quality score: nucleotides with lower quality are |
| | | considered as "bad". If sequence contains at least one "bad" nucleotide, it will be |
| | | deferred at initial assembling stage, for further processing by mapper. |
+---------------------------------+-----------------+------------------------------------------------------------------------------------------+
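The four aggregation modes described in the table above can be illustrated with a small, self-contained Java sketch. It is not the milib ``QualityAggregator`` implementation: the class name ``QualityAggregationSketch``, its method names, and the round-down behaviour of ``average`` are assumptions made for illustration, and every read is assumed to cover the clonal sequence with the same length.

```java
import java.util.Arrays;

// Hypothetical illustration of the four modes; not the milib QualityAggregator API.
// reads[r][i] is the quality of position i in read r; all reads have equal length.
final class QualityAggregationSketch {

    // Max: maximal quality across all reads for each position.
    static byte[] max(byte[][] reads) {
        byte[] out = reads[0].clone();
        for (byte[] read : reads)
            for (int i = 0; i < out.length; i++)
                out[i] = (byte) Math.max(out[i], read[i]);
        return out;
    }

    // Min: minimal quality across all reads for each position.
    static byte[] min(byte[][] reads) {
        byte[] out = reads[0].clone();
        for (byte[] read : reads)
            for (int i = 0; i < out.length; i++)
                out[i] = (byte) Math.min(out[i], read[i]);
        return out;
    }

    // Average: mean quality across all reads for each position
    // (integer division, i.e. rounding down, is an assumption of this sketch).
    static byte[] average(byte[][] reads) {
        int[] sums = new int[reads[0].length];
        for (byte[] read : reads)
            for (int i = 0; i < sums.length; i++)
                sums[i] += read[i];
        byte[] out = new byte[sums.length];
        for (int i = 0; i < out.length; i++)
            out[i] = (byte) (sums[i] / reads.length);
        return out;
    }

    // MiniMax: take the minimal quality of each read, then the maximum of
    // those minima, and assign that single value to every position.
    static byte[] miniMax(byte[][] reads) {
        byte best = Byte.MIN_VALUE;
        for (byte[] read : reads) {
            byte readMin = Byte.MAX_VALUE;
            for (byte q : read)
                readMin = (byte) Math.min(readMin, q);
            best = (byte) Math.max(best, readMin);
        }
        byte[] out = new byte[reads[0].length];
        Arrays.fill(out, best);
        return out;
    }
}
```

For two reads with qualities ``{30, 10, 20}`` and ``{20, 40, 25}``, this sketch gives ``{30, 40, 25}`` for Max, ``{20, 10, 20}`` for Min, ``{25, 25, 22}`` for Average, and ``{20, 20, 20}`` for MiniMax.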
11 changes: 9 additions & 2 deletions importFromIMGT.sh
@@ -18,6 +18,9 @@ do
esac
done

echo ${mixcr}
${mixcr} -v

type wget >/dev/null 2>&1 || { echo >&2 "This script requires \"wget\". Try \"brew install wget\" or \"apt-get install wget\"." ; exit 1; }
type pup >/dev/null 2>&1 || { echo >&2 "This script requires \"pup\". Try \"brew install https://raw.githubusercontent.com/EricChiang/pup/master/pup.rb\" or \"go get github.com/ericchiang/pup\"." ; exit 1; }
type xmllint >/dev/null 2>&1 || { echo >&2 "This script requires \"xmllint\". Try \"sudo apt-get install libxml2-utils\"." ; exit 1; }
@@ -97,11 +100,15 @@ do
# Workaround for *** IMGT malformed files
if [[ "$(echo ${species} | tr '[:upper:]' '[:lower:]')" == mus* ]] && [[ "$locus" == TR[AD] ]]; then
comm="$comm -p imgt_a1"
echo "Special paramenters for Mouse TRA/D genes activated."
echo "Special parameters for Mouse TRA/D genes activated."
fi
if [[ "$(echo ${species} | tr '[:upper:]' '[:lower:]')" == rat* ]] && [[ "$locus" == IG[HK] ]]; then
comm="$comm -p imgt_a2"
echo "Special paramenters for Rat IGH/K genes activated."
echo "Special parameters for Rat IGH/K genes activated."
fi
if [[ "$(echo ${species} | tr '[:upper:]' '[:lower:]')" == maca* ]] && [[ "$locus" == IGL ]]; then
comm="$comm -p imgt_a3"
echo "Special parameters for Macaca IGL genes activated."
fi

# Output file info on the last iteration
22 changes: 14 additions & 8 deletions mixcr
@@ -65,17 +65,9 @@ case $os in
;;
esac

mixcr=${dir}/mixcr

mixcrArgs=()
javaArgs=()

if [[ $# -eq 1 ]] && [[ $(echo $1 | tr '[:upper:]' '[:lower:]') == "importfromimgt" ]]; then
echo "Starting importFromIMGT.sh script"
${dir}/importFromIMGT.sh -mixcr ${mixcr} || exit 1
exit 0
fi

needXmxXms=true
otherJar=""

@@ -104,6 +96,19 @@ do
esac
done

mixcr=${dir}/mixcr

if [[ ! -z ${otherJar} ]];
then
mixcr="${mixcr} -V ${otherJar}"
fi

if [[ $(echo ${mixcrArgs[0]} | tr '[:upper:]' '[:lower:]') == "importfromimgt" ]]; then
echo "Starting importFromIMGT.sh script"
${dir}/importFromIMGT.sh -mixcr "${mixcr}" || exit 1
exit 0
fi

if [[ ${needXmxXms} == true ]]
then
targetXmx=12000
@@ -155,3 +160,4 @@ then
fi

$java -Dmixcr.path=$dir -Dmixcr.command=mixcr -XX:+AggressiveOpts "${javaArgs[@]}" -jar $jar "${mixcrArgs[@]}"
exit $?
4 changes: 2 additions & 2 deletions pom.xml
@@ -32,7 +32,7 @@

<groupId>com.milaboratory</groupId>
<artifactId>mixcr</artifactId>
<version>1.8</version>
<version>1.8.1</version>
<packaging>jar</packaging>
<name>MiXCR</name>

@@ -44,7 +44,7 @@

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<milib.version>1.4</milib.version>
<milib.version>1.5-SNAPSHOT</milib.version>
</properties>

<dependencies>
@@ -30,10 +30,11 @@


import com.milaboratory.core.Range;
import com.milaboratory.core.merger.MergerParameters;
import com.milaboratory.core.sequence.NSequenceWithQuality;
import com.milaboratory.core.sequence.NucleotideSequence;
import com.milaboratory.core.sequence.SequenceQuality;
import com.milaboratory.core.sequence.quality.QualityAggregationType;
import com.milaboratory.core.sequence.quality.QualityAggregator;
import com.milaboratory.mixcr.basictypes.ClonalSequence;
import com.milaboratory.mixcr.basictypes.VDJCAlignments;
import com.milaboratory.mixcr.basictypes.VDJCHit;
@@ -42,33 +43,34 @@
import gnu.trove.iterator.TObjectFloatIterator;
import gnu.trove.map.hash.TObjectFloatHashMap;

import java.util.Arrays;
import java.util.EnumMap;

public final class CloneAccumulator {
final EnumMap<GeneType, TObjectFloatHashMap<AlleleId>> geneScores = new EnumMap<>(GeneType.class);
private ClonalSequence sequence;
final byte[] quality;
final QualityAggregator aggregator;
long count = 0, countMapped = 0;
private volatile int cloneIndex = -1;
final Range[] nRegions;

public CloneAccumulator(ClonalSequence sequence, Range[] nRegions) {
public CloneAccumulator(ClonalSequence sequence, Range[] nRegions, QualityAggregationType qualityAggregationType) {
this.sequence = sequence;
this.nRegions = nRegions;
this.quality = sequence.getConcatenated().getQuality().asArray();
this.aggregator = qualityAggregationType.create(sequence.getConcatenated().size());
//this.quality = sequence.getConcatenated().getQuality().asArray();
}

public ClonalSequence getSequence() {
return sequence;
}

public void rebuildClonalSequence() {
SequenceQuality newQuality = aggregator.getQuality();
final NSequenceWithQuality[] updated = new NSequenceWithQuality[sequence.size()];
int pointer = 0;
for (int i = 0; i < updated.length; i++) {
final NucleotideSequence s = this.sequence.get(i).getSequence();
updated[i] = new NSequenceWithQuality(s, new SequenceQuality(Arrays.copyOfRange(quality, pointer, pointer + s.size())));
updated[i] = new NSequenceWithQuality(s, newQuality.getRange(pointer, pointer + s.size()));
pointer += s.size();
}
sequence = new ClonalSequence(updated);
@@ -156,18 +158,20 @@ public synchronized void accumulate(ClonalSequence data, VDJCAlignments alignmen
}
}

int pointer = 0;
for (NSequenceWithQuality p : data) {
for (int i = 0; i < p.size(); ++i) {
final SequenceQuality q = p.getQuality();
if (quality[pointer] != MergerParameters.DEFAULT_MAX_QUALITY_VALUE)
if (quality[pointer] + q.value(i) > MergerParameters.DEFAULT_MAX_QUALITY_VALUE)
quality[pointer] = MergerParameters.DEFAULT_MAX_QUALITY_VALUE;
else
quality[pointer] += q.value(i);
++pointer;
}
}
aggregator.aggregate(data.getConcatenated().getQuality());

//int pointer = 0;
//for (NSequenceWithQuality p : data) {
// for (int i = 0; i < p.size(); ++i) {
// final SequenceQuality q = p.getQuality();
// if (quality[pointer] != MergerParameters.DEFAULT_MAX_QUALITY_VALUE)
// if (quality[pointer] + q.value(i) > MergerParameters.DEFAULT_MAX_QUALITY_VALUE)
// quality[pointer] = MergerParameters.DEFAULT_MAX_QUALITY_VALUE;
// else
// quality[pointer] += q.value(i);
// ++pointer;
// }
//}
} else ++countMapped;
}
}
18 changes: 10 additions & 8 deletions src/main/java/com/milaboratory/mixcr/assembler/CloneAssembler.java
@@ -338,7 +338,7 @@ public void process(VDJCAlignments input) {
return;
}

if (target.getConcatenated().getQuality().meanValue() < parameters.getMinimalMeanQuality()) {
if (badPoints > 0) {
// Has some number of bad points but not greater than maxBadPointsToMap
log(new AssemblerEvent(input.getAlignmentsIndex(), input.getReadId(), AssemblerEvent.DEFERRED));
onAlignmentDeferred(input);
@@ -535,7 +535,8 @@ synchronized CloneAccumulator accumulate(ClonalSequence sequence, VDJCAlignments
VJCSignature vjcSignature = extractSignature(alignments);
CloneAccumulator acc = accumulators.get(vjcSignature);
if (acc == null) {
acc = new CloneAccumulator(sequence, extractNRegions(sequence, alignments));
acc = new CloneAccumulator(sequence, extractNRegions(sequence, alignments),
parameters.getQualityAggregationType());
accumulators.put(vjcSignature, acc);
acc.setCloneIndex(cloneIndexGenerator.incrementAndGet());
onNewCloneCreated(acc);
@@ -583,13 +584,14 @@ public List<CloneAccumulator> build() {
if (acc == null)
continue;

for (byte b : acc.quality)
if (b < parameters.minimalQuality) {
onCloneDropped(acc);
continue out;
}

acc.rebuildClonalSequence();

if (acc.getSequence().getConcatenated().getQuality().minValue() <
parameters.minimalQuality) {
onCloneDropped(acc);
continue out;
}

result.add(acc);
}

@@ -32,6 +32,7 @@
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.milaboratory.core.sequence.quality.QualityAggregationType;
import com.milaboratory.mixcr.reference.GeneFeature;

import java.util.Arrays;
@@ -44,6 +45,7 @@ public final class CloneAssemblerParameters implements java.io.Serializable {
private static final int MAX_MAPPING_REGION = 1000;
GeneFeature[] assemblingFeatures;
int minimalClonalSequenceLength;
QualityAggregationType qualityAggregationType;
CloneClusteringParameters cloneClusteringParameters;
CloneFactoryParameters cloneFactoryParameters;
boolean separateByV, separateByJ, separateByC;
@@ -54,12 +56,12 @@ public final class CloneAssemblerParameters implements java.io.Serializable {
String mappingThreshold;
@JsonIgnore
long variants;
byte minimalMeanQuality;
byte minimalQuality;

@JsonCreator
public CloneAssemblerParameters(@JsonProperty("assemblingFeatures") GeneFeature[] assemblingFeatures,
@JsonProperty("minimalClonalSequenceLength") int minimalClonalSequenceLength,
@JsonProperty("qualityAggregationType") QualityAggregationType qualityAggregationType,
@JsonProperty("cloneClusteringParameters") CloneClusteringParameters cloneClusteringParameters,
@JsonProperty("cloneFactoryParameters") CloneFactoryParameters cloneFactoryParameters,
@JsonProperty("separateByV") boolean separateByV,
@@ -70,10 +72,10 @@ public CloneAssemblerParameters(@JsonProperty("assemblingFeatures") GeneFeature[
@JsonProperty("badQualityThreshold") byte badQualityThreshold,
@JsonProperty("maxBadPointsPercent") double maxBadPointsPercent,
@JsonProperty("mappingThreshold") String mappingThreshold,
@JsonProperty("minimalMeanQuality") byte minimalMeanQuality,
@JsonProperty("minimalQuality") byte minimalQuality) {
this.assemblingFeatures = assemblingFeatures;
this.minimalClonalSequenceLength = minimalClonalSequenceLength;
this.qualityAggregationType = qualityAggregationType;
this.cloneClusteringParameters = cloneClusteringParameters;
this.cloneFactoryParameters = cloneFactoryParameters;
this.separateByV = separateByV;
@@ -84,7 +86,6 @@ public CloneAssemblerParameters(@JsonProperty("assemblingFeatures") GeneFeature[
this.badQualityThreshold = badQualityThreshold;
this.maxBadPointsPercent = maxBadPointsPercent;
this.mappingThreshold = mappingThreshold;
this.minimalMeanQuality = minimalMeanQuality;
this.minimalQuality = minimalQuality;
updateVariants();
}
@@ -127,6 +128,15 @@ public int getMinimalClonalSequenceLength() {
return minimalClonalSequenceLength;
}

public QualityAggregationType getQualityAggregationType() {
return qualityAggregationType;
}

public CloneAssemblerParameters setQualityAggregationType(QualityAggregationType qualityAggregationType) {
this.qualityAggregationType = qualityAggregationType;
return this;
}

public CloneFactoryParameters getCloneFactoryParameters() {
return cloneFactoryParameters;
}
@@ -167,15 +177,6 @@ public String getMappingThreshold() {
return mappingThreshold;
}

public byte getMinimalMeanQuality() {
return minimalMeanQuality;
}

public CloneAssemblerParameters setMinimalMeanQuality(byte minimalMeanQuality) {
this.minimalMeanQuality = minimalMeanQuality;
return this;
}

public void setMappingThreshold(String mappingThreshold) {
this.mappingThreshold = mappingThreshold;
updateVariants();
@@ -246,10 +247,11 @@ public boolean isClusteringEnabled() {
@Override
public CloneAssemblerParameters clone() {
return new CloneAssemblerParameters(assemblingFeatures.clone(), minimalClonalSequenceLength,
qualityAggregationType,
cloneClusteringParameters == null ? null : cloneClusteringParameters.clone(),
cloneFactoryParameters.clone(), separateByV, separateByJ, separateByC,
maximalPreClusteringRatio, addReadsCountOnClustering, badQualityThreshold, maxBadPointsPercent,
mappingThreshold, minimalMeanQuality, minimalQuality);
mappingThreshold, minimalQuality);
}

@Override
@@ -260,6 +262,7 @@ public boolean equals(Object o) {
CloneAssemblerParameters that = (CloneAssemblerParameters) o;

if (minimalClonalSequenceLength != that.minimalClonalSequenceLength) return false;
if (qualityAggregationType != that.qualityAggregationType) return false;
if (separateByV != that.separateByV) return false;
if (separateByJ != that.separateByJ) return false;
if (separateByC != that.separateByC) return false;
@@ -274,8 +277,6 @@ public boolean equals(Object o) {
return false;
if (!cloneFactoryParameters.equals(that.cloneFactoryParameters))
return false;
if (minimalMeanQuality != that.minimalMeanQuality)
return false;
if (minimalQuality != that.minimalQuality)
return false;
return true;
@@ -287,6 +288,7 @@ public int hashCode() {
long temp;
result = Arrays.hashCode(assemblingFeatures);
result = 31 * result + minimalClonalSequenceLength;
result = 31 * result + (qualityAggregationType != null ? qualityAggregationType.hashCode() : 0);
result = 31 * result + (cloneClusteringParameters != null ? cloneClusteringParameters.hashCode() : 0);
result = 31 * result + (cloneFactoryParameters != null ? cloneFactoryParameters.hashCode() : 0);
result = 31 * result + (separateByV ? 1 : 0);
@@ -299,7 +301,6 @@ public int hashCode() {
temp = Double.doubleToLongBits(maxBadPointsPercent);
result = 31 * result + (int) (temp ^ (temp >>> 32));
result = 31 * result + (int) (variants ^ (variants >>> 32));
result = 31 * result + (int) (minimalMeanQuality ^ (minimalMeanQuality >>> 32));
result = 31 * result + (int) (minimalQuality ^ (minimalQuality >>> 32));
return result;
}