add lcs problem to DP

aaronoah · aaronoah · commit 737083f3adcf · 2019-01-13T14:29:58.000+08:00
diff --git a/.nvmrc b/.nvmrc
@@ -0,0 +1 @@
+10.13.0
diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md
@@ -48,6 +48,8 @@
     * [String Search](searching/string-matching.md)
 * [Dynamic Programming][dynamic]
     * [Overview][dynamic]
+    * [Fibonacci Numbers](dynamic-programming/fibonacci-numbers.md)
+    * [LCS problem](dynamic-programming/lcs-problem.md)
 * [Greedy Algorithms][greedy]
     * [Overview][greedy]
 
diff --git a/docs/dynamic-programming/fibonacci-numbers.md b/docs/dynamic-programming/fibonacci-numbers.md
@@ -46,4 +46,20 @@ FIBONACCI(n)
 </code>
 </pre>
 
-Therefore, FIBONACCI(k) only takes **one** recursion, &forall; k &isin; n; and all memoized calls use &Theta;(1) time
+Therefore, FIBONACCI(k) recurses only **once** &forall; k &isin; {1, ..., n}, and every memoized call takes &Theta;(1) time. Thus, the time complexity is &Omicron;(n), and the space complexity is also &Omicron;(n) for the memo table and the recursion stack.
+
+What if we want to avoid recursion? We can use extra linear space to store the values computed so far and fill that table iteratively from the smallest sub-problem upward, so that FIB(k - 1) and FIB(k - 2) are already available when computing FIB(k). Replacing _recursion_ with _iteration_ in this way keeps the same time complexity and is called **bottom-up DP**.
+
+<pre>
+<code>
+FIBONACCI(n)
+  fib := {}
+  for k in 1...n
+    if k &les; 2
+      f = 1
+    else
+      f = fib[k - 1] + fib[k - 2]
+    fib[k] = f
+  return fib[n]
+</code>
+</pre>
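The bottom-up pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than part of the commit; the function name `fibonacci` and the `ValueError` for non-positive input are assumptions added for the example.

```python
def fibonacci(n: int) -> int:
    """Return FIB(n) bottom-up, with FIB(1) = FIB(2) = 1 as in the pseudocode."""
    if n < 1:
        # Assumption: the pseudocode does not say what to do for n < 1.
        raise ValueError("n must be a positive integer")

    fib = {}  # table of already-computed values, keyed by k
    for k in range(1, n + 1):
        if k <= 2:
            f = 1
        else:
            # Both smaller sub-problems were stored in earlier iterations.
            f = fib[k - 1] + fib[k - 2]
        fib[k] = f
    return fib[n]


if __name__ == "__main__":
    print(fibonacci(10))
```

Since each step reads only the two previous entries, the table could be replaced by two variables to bring the extra space down to constant.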
diff --git a/docs/dynamic-programming/lcs-problem.md b/docs/dynamic-programming/lcs-problem.md
@@ -1,3 +1,54 @@
 # LCS (longest-common-subsequence) problem
 
-[longest-common-subsequence](https://en.wikipedia.org/wiki/Longest_common_subsequence_problem) differs from problem of finding the [longest common substring](https://en.wikipedia.org/wiki/Longest_common_substring_problem); and it has wide applications such as [diff utility](https://en.wikipedia.org/wiki/Diff_utility) and [bioinformatics](https://en.wikipedia.org/wiki/Bioinformatics)
+The [longest common subsequence](https://en.wikipedia.org/wiki/Longest_common_subsequence_problem) problem differs from the problem of finding the [longest common substring](https://en.wikipedia.org/wiki/Longest_common_substring_problem). It has wide applications, such as the [diff utility](https://en.wikipedia.org/wiki/Diff_utility) and [bioinformatics](https://en.wikipedia.org/wiki/Bioinformatics).
+
+The problem is defined as follows: given two strings, for example _abcgf_ and _achfe_, find the longest subsequence that appears in both of them. A subsequence consists of characters that keep their relative order within the string but are not necessarily contiguous. Strings like _ac_, _bc_, _bf_, and _abf_ are all subsequences of the string _abcgf_.
+
+Among all the subsequences of the strings _abcgf_ and _achfe_, the common ones are _a_, _c_, _f_, _ac_, _af_, _cf_, and _acf_; the longest common subsequence is _acf_.
+
+## Analysis of this problem
+
+Define two sequences X[0..m-1] and Y[0..n-1], and let L(X[0..m-1], Y[0..n-1]) be the length of the LCS of the two sequences X and Y. The LCS problem can be solved with dynamic programming because it has two important properties:
+
+- Optimal Substructure
+
+  If an optimal solution to a problem can be constructed from optimal solutions to its sub-problems, then we say the problem has optimal substructure.
+
+  In the above example, sequence X is _abcgf_ and sequence Y is _achfe_. In general, if the last characters of X and Y match, that character belongs to the LCS and we only need the LCS of the preceding characters: L(X[0..m-1], Y[0..n-1]) = 1 + L(X[0..m-2], Y[0..n-2]).
+
+  If the last characters of X and Y do not match (as with _f_ and _e_ here), we take the larger of the two results obtained by dropping the last character of either sequence: L(X[0..m-1], Y[0..n-1]) = max(L(X[0..m-2], Y[0..n-1]), L(X[0..m-1], Y[0..n-2])).
+
+- Overlapping Subproblems
+
+  The recursion for finding common subsequences makes the same function calls repeatedly; these are called overlapping subproblems.
+
+  In the above example, finding the LCS of X and Y is broken down into finding L(_abcgf_, _achf_) and L(_abcg_, _achfe_), and the next level of recursion for both of them needs the same subproblem: L(_abcg_, _achf_). Memoizing such results reduces the computing cost significantly (from exponential time to polynomial time).
+
+Together, these two properties give the following formula for solving the problem with dynamic programming:
+
+<figure style="text-align: center">
+  <img src="../images/lcs.png" />
+  <figcaption>Figure 1. LCS Formula</figcaption>
+</figure>
+
+## Pseudocode Solution
+
+X of size m, Y of size n
+
+```
+LCS(X, Y)
+  initialize a 2-D array L of size m+1 by n+1
+
+  for i in 0..m
+    for j in 0..n
+      if i == 0 or j == 0
+        L[i][j] = 0
+      else if X[i - 1] == Y[j - 1]
+        L[i][j] = 1 + L[i - 1][j - 1]
+      else
+        L[i][j] = max(L[i][j - 1], L[i - 1][j])
+
+  return L[m][n]
+```
+
+The time and the space complexity are both &Omicron;(n &times; m).
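The table-filling pseudocode above can be sketched in Python as follows. This is a hedged illustration, not part of the commit: the names `lcs_table`, `lcs_length`, and `lcs` are hypothetical, and the backtracking step that recovers the subsequence itself is an addition that the document does not describe.

```python
def lcs_table(x: str, y: str) -> list[list[int]]:
    """Build table L where L[i][j] is the LCS length of x[:i] and y[:j]."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay 0: an empty prefix has an empty LCS.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = 1 + table[i - 1][j - 1]
            else:
                table[i][j] = max(table[i][j - 1], table[i - 1][j])
    return table


def lcs_length(x: str, y: str) -> int:
    """Return the length of the LCS, mirroring `return L[m][n]`."""
    return lcs_table(x, y)[len(x)][len(y)]


def lcs(x: str, y: str) -> str:
    """Recover one LCS by walking back from L[m][n] (added for illustration)."""
    table = lcs_table(x, y)
    i, j = len(x), len(y)
    chars = []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            # A matching character is part of the LCS; move diagonally.
            chars.append(x[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(chars))


if __name__ == "__main__":
    print(lcs_length("abcgf", "achfe"), lcs("abcgf", "achfe"))
```

When several subsequences share the maximal length, the walk-back returns only one of them; which one depends on how ties are broken in the `elif` branch.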
diff --git a/docs/dynamic-programming/overview.md b/docs/dynamic-programming/overview.md
@@ -8,19 +8,13 @@ In contrast to the paradigm of [DnC][DnC] that sub-problems are independent, dyn
 
 The typical [dynamic programming][dynamic] algorithm is developed in steps follow:
 
-1. Characterize the _optimal_ sub-structures along with possible moves.
+1. Characterize the _optimal_ sub-structures along with possible moves. (Think of it as finding a DAG of sub-problems that contains the solution path.)
 2. Define the recurrence relations of sub-problems.
 3. Compute recursively or iteratively in a _bottom-up_ fashion or _top-down_ with [_memoization_](https://en.wikipedia.org/wiki/Memoization) fashion.
 4. Construct an overall _optimal_ solution or combining solutions of sub-problems.
 
 _Noted that the word **programming** does not stand for **computer programming** but a tabulation method that was invented by [R. Bellman](https://en.wikipedia.org/wiki/Richard_E._Bellman)_.
 
-## Principles of Dynamic Programming
-
-
-
-## Classic Knapsack Problem
-
 ## Comparing with [Greedy Algorithms][greedy]
 
 It is often confusing to determine if the programming logic is built on [dynamic programming][dynamic] or [greedy algorithms][greedy]. Both are adopted in favor of tackling _optimization problem_, and [dynamic programming][dynamic] seeks and combines the previous solutions for sub-problems to yield the final result while [greedy algorithms][greedy] chooses locally optimal solution in each run and not guarantee to have the optimal result.
diff --git a/docs/images/lcs.png b/docs/images/lcs.png
diff --git a/docs/searching/hash-table.md b/docs/searching/hash-table.md
@@ -186,9 +186,6 @@ Overall, a killer solution is by using the concept of [Hash Table][hash-table].
 
 ### Java 8 HashMap
 
-> _reference_:
-> http://www.nagarro.com/de/perspectives/post/24/performance-improvement-for-hashmap-in-java-8
-
 Hash collisions in hash table data structures have significant impact on performance of LOOKUP operation as to increase the [worst-case](../asymptotic-analysis.md) running time from &Omicron;(1) to &Omicron;(n).
 
 To improve upon that, Java8 spec of HashMap implementations requires the buckets containing colliding keys should store entries in a balanced tree instead of linked list. Hence, the searching operation takes no more than &Omicron;(log(n)) in general.
diff --git a/docs/sorting/shell-sort.md b/docs/sorting/shell-sort.md
@@ -56,8 +56,6 @@ In 1986, [Prof. Robert Sedgewick](https://en.wikipedia.org/wiki/Robert_Sedgewick
 
 ## Additional Resources
 
-1. Computer Algorithms: Shell Sort, http://www.stoimen.com/blog/2012/02/27/computer-algorithms-shell-sort/
+1. Fastest gap sequence for shell sort? https://stackoverflow.com/questions/2539545/fastest-gap-sequence-for-shell-sort
 
-2. Fastest gap sequence for shell sort? https://stackoverflow.com/questions/2539545/fastest-gap-sequence-for-shell-sort
-
-3. Analysis of Shell Sort and Related Algorithms, R. Sedgewick, Princeton U. http://www.cs.princeton.edu/~rs/shell/paperF.pdf
+2. Analysis of Shell Sort and Related Algorithms, R. Sedgewick, Princeton U. http://www.cs.princeton.edu/~rs/shell/paperF.pdf