maxentpy is a python wrapper for MaxEntScan to calculate splice site strength.
It contains two functions. score5
is adapt from MaxEntScan::score5ss to score 5' splice sites. score3
is adapt from MaxEntScan::score3ss to score 3' splice sites. They only use Maximum Entropy Model to score.
- Cython
- msgpack-python
>>> from maxentpy import maxent # use normal version of maxent
>>> maxent.score5('cagGTAAGT') # 3 bases in exon and 6 bases in intron
10.858313101356437
>>> maxent.score3('ttccaaacgaacttttgtAGgga') # 20 bases in the intron and 3 base in the exon
2.8867730651152104
>>> from maxentpy.maxent import load_matrix5, load_matrix3 # preloading matrix will speed up
>>> timeit maxent.score5('cagGTAAGT')
10 loops, best of 3: 23.5 ms per loop
>>> matrix5 = load_matrix5()
>>> timeit maxent.score5('cagGTAAGT', matrix=matrix5)
100000 loops, best of 3: 3.27 µs per loop
>>> timeit maxent.score3('ttccaaacgaacttttgtAGgga')
1 loop, best of 3: 259 ms per loop
>>> matrix3 = load_matrix3()
>>> timeit maxent.score3('ttccaaacgaacttttgtAGgga', matrix=matrix3)
10000 loops, best of 3: 103 µs per loop
>>> from maxentpy import maxent_fast # fast version could further speed up
>>> timeit maxent_fast.score5('cagGTAAGT')
100 loops, best of 3: 5.04 ms per loop
>>> timeit maxent_fast.score3('ttccaaacgaacttttgtAGgga')
100 loops, best of 3: 9.3 ms per loop
>>> from maxentpy.maxent_fast import load_matrix # support preloading matrix
>>> matrix5 = load_matrix(5)
>>> timeit maxent_fast.score5('cagGTAAGT', matrix=matrix5)
100000 loops, best of 3: 3.61 µs per loop
>>> matrix3 = load_matrix(3)
>>> timeit maxent_fast.score3('ttccaaacgaacttttgtAGgga', matrix=matrix3)
100000 loops, best of 3: 7.76 µs per loop
score5 | maxentpy.maxent | maxentpy.maxent_fast |
---|---|---|
without matrix | 23.5 ms | 5.04 ms |
with matrix | 3.27 µs | 3.61 µs |
score3 | maxentpy.maxent | maxentpy.maxent_fast |
---|---|---|
without matrix | 259 ms | 9.3 ms |
with matrix | 103 µs | 7.76 µs |
Yeo G, Burge CB. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. Journal of Computational Biology. 2004, 11:377-94.
The original algorithm and perl scripts are under license described in http://genes.mit.edu/burgelab/maxent/download/READTHIS. The python version of maxent is under the MIT License.