YASA (Yet Another Sequence Aligner)

YASA was created to align long, relatively similar sequences. It may also work well on less similar sequences, but that has not been tested yet.

Install

pip install --upgrade git+https://github.com/riklopfer/YASA/

Basic Usage

Start an interactive Python prompt.

Import the module

import yasa

Define source and target lists

source = "this is a test of the beam aligner".split()
target = "that was a test of the bean aligner".split()

Create the aligner and perform the alignment. heap_size is the total number of paths to consider at a time. beam_width is the maximum allowed cost distance from the best path. For example, if the best path has a cost of 10 and beam_width=5, any path with cost > 15 will be pruned.

# create the aligner
aligner = yasa.LevinshteinAligner(heap_size=50, beam_width=5)
# do the alignment
word_alignment = aligner.align(source, target)
# pretty print
print(word_alignment)
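The pruning rule described above can be illustrated with a small standalone sketch. This is not YASA's internal code: prune_paths is a hypothetical helper, and representing each path by its cost alone is an assumption made to keep the example short. It only shows how heap_size and beam_width would interact under the description given here.

```python
import heapq


def prune_paths(costs, heap_size, beam_width=None):
    """Hypothetical helper, not part of YASA.

    Keeps the path costs that lie within beam_width of the best (lowest)
    cost, then keeps at most heap_size of the cheapest survivors.
    """
    if not costs:
        return []
    best = min(costs)
    if beam_width is not None:
        # Drop any path whose cost exceeds best + beam_width.
        costs = [c for c in costs if c <= best + beam_width]
    # Keep only the heap_size cheapest paths, in ascending order.
    return heapq.nsmallest(heap_size, costs)


# Best cost is 10 and beam_width is 5, so costs above 15 are pruned.
print(prune_paths([10, 12, 15, 16, 21], heap_size=50, beam_width=5))
```

With beam_width left as None, only heap_size limits the number of paths kept in this sketch.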

Iterate over source-target pairs in the alignment

for src, tgt in word_alignment:
    print("SRC: '{}' TGT: '{}'".format(src, tgt))
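Because the alignment is iterated as (src, tgt) pairs, it is easy to filter for the positions where the two sequences disagree. The sketch below assumes only that iteration yields two-item pairs, as in the loop above. mismatched_pairs is a hypothetical helper, and zip is used as a stand-in for a real alignment so that the snippet runs without YASA installed. How YASA represents insertions and deletions is not covered here.

```python
def mismatched_pairs(pairs):
    """Hypothetical helper: return the (src, tgt) pairs whose sides differ."""
    return [(src, tgt) for src, tgt in pairs if src != tgt]


source = "this is a test of the beam aligner".split()
target = "that was a test of the bean aligner".split()

# Stand-in for a real alignment: a simple position-by-position pairing.
# With YASA installed, the result of aligner.align(source, target) could
# be passed instead, assuming it yields (src, tgt) pairs as shown above.
stand_in_alignment = list(zip(source, target))

for src, tgt in mismatched_pairs(stand_in_alignment):
    print("SRC: '{}' TGT: '{}'".format(src, tgt))
```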

If we alter the input to be more poorly aligned, we can use the nested aligner to get a "better" alignment. If beam_width is omitted, paths are not pruned according to that metric. This is fine so long as heap_size is reasonable.

regular_aligner = yasa.LevinshteinAligner(heap_size=50)
nested_aligner = yasa.NestedLevinshteinAligner(heap_size=50)

source = "this is a test of the beam aligner".split() * 2
target = "that was a test of the bean".split() * 2

print(regular_aligner.align(source, target))
print(nested_aligner.align(source, target))
