Commit 71780a6

anshul-2010, pre-commit-ci[bot], and tianyizheng02 authored and committed
Edit Distance Algorithm for String Matching (TheAlgorithms#10571)
* Edit Distance Algorithm for String Matching
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Apply suggestions from code review
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Update edit_distance.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tianyi Zheng <tianyizheng02@gmail.com>
1 parent 1d7e9f8 commit 71780a6

1 file changed: +32 -0 lines changed

strings/edit_distance.py

+32
@@ -0,0 +1,32 @@
+def edit_distance(source: str, target: str) -> int:
+    """
+    Edit distance is a string metric, i.e., a way of quantifying how dissimilar two
+    strings are to one another. It is the minimum number of single-character
+    operations required to transform one string into the other.
+
+    This implementation assumes that every operation (insertion, deletion and
+    substitution) has a cost of 1.
+
+    Args:
+        source: the initial string, from which the edit distance is measured
+        target: the target string, obtained by applying the operations to source
+
+    >>> edit_distance("GATTIC", "GALTIC")
+    1
+    """
+    # Base cases: an empty string is transformed by inserting or deleting
+    # every character of the other string.
+    if len(source) == 0:
+        return len(target)
+    elif len(target) == 0:
+        return len(source)
+
+    # Substitution cost: 0 if the last characters already match, otherwise 1
+    delta = int(source[-1] != target[-1])
+    return min(
+        edit_distance(source[:-1], target[:-1]) + delta,  # substitute or match
+        edit_distance(source, target[:-1]) + 1,  # insert last char of target
+        edit_distance(source[:-1], target) + 1,  # delete last char of source
+    )
+
+
+if __name__ == "__main__":
+    print(edit_distance("ATCGCTG", "TAGCTAA"))  # Answer is 4
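
The recursive function in this commit calls itself three times per step and revisits the same prefix pairs repeatedly, so its running time grows exponentially with the string lengths. As a rough sketch of an alternative, the same recurrence can be filled in bottom-up, one row at a time. The name `edit_distance_dp` is hypothetical and is not part of this commit; the unit costs for insertion, deletion and substitution are carried over from the docstring above.

```python
def edit_distance_dp(source: str, target: str) -> int:
    """Hypothetical bottom-up variant (not part of the commit), unit costs."""
    # previous[j] is the distance between the source prefix processed so far
    # and target[:j]. For the empty source prefix that is j insertions.
    previous = list(range(len(target) + 1))

    for i, source_char in enumerate(source, start=1):
        # Turning source[:i] into the empty string takes i deletions.
        current = [i]
        for j, target_char in enumerate(target, start=1):
            delta = int(source_char != target_char)
            current.append(
                min(
                    previous[j - 1] + delta,  # substitution (or match)
                    current[j - 1] + 1,  # insertion
                    previous[j] + 1,  # deletion
                )
            )
        previous = current

    return previous[-1]


if __name__ == "__main__":
    print(edit_distance_dp("GATTIC", "GALTIC"))  # 1
    print(edit_distance_dp("ATCGCTG", "TAGCTAA"))  # 4
```

This sketch does work proportional to `len(source) * len(target)` and keeps only two rows in memory, at the cost of being less direct a transcription of the recurrence than the recursive version.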
