diff --git a/0.4/404.html b/0.4/404.html new file mode 100644 index 0000000..b5b0de4 --- /dev/null +++ b/0.4/404.html @@ -0,0 +1,571 @@ + + + +
+This page presents the internals and design decisions of the kebbie package.
The Oracle is the main class of the package.
+ +It is the class that iterates over the dataset, introduces artificial typos, and calls the given Corrector with the noisy text. It then scores the results against the expected text and returns the aggregated metrics.
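The loop described above can be sketched roughly as follows. This is a minimal illustration, not kebbie's actual code: `evaluate`, `add_typos`, and `correct` are hypothetical names, and word-level accuracy stands in for whatever metrics the package really aggregates.

```python
from typing import Callable, Dict, Iterable


def evaluate(
    sentences: Iterable[str],
    add_typos: Callable[[str], str],
    correct: Callable[[str], str],
) -> Dict[str, float]:
    """Toy evaluation loop: corrupt each word, correct it, score it."""
    total = 0
    hits = 0
    for sentence in sentences:
        for expected in sentence.split():
            noisy = add_typos(expected)   # introduce artificial typos
            predicted = correct(noisy)    # call the corrector on the noisy text
            if predicted == expected:     # score against the known clean word
                hits += 1
            total += 1
    # Aggregate the per-word results into a single set of metrics.
    return {"n_words": total, "accuracy": hits / total if total else 0.0}
```

Passing the noise function and the corrector as plain callables keeps the sketch independent of any particular corrector implementation.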
+Performance
+The task is embarrassingly parallel: each sentence can be tested independently of the others. The Oracle leverages multiprocessing to run the tests as fast as possible.
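As an illustration of why independent sentences parallelize so easily, here is a small sketch using the standard library's `concurrent.futures`. The names `score_sentence` and `score_all` are hypothetical, the per-sentence work is a placeholder, and the way kebbie actually organizes its worker processes may differ.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import List


def score_sentence(sentence: str) -> int:
    """Hypothetical per-sentence work: here, just count the words."""
    return len(sentence.split())


def score_all(
    sentences: List[str], workers: int = 2, use_processes: bool = True
) -> List[int]:
    """Score every sentence independently and return results in input order."""
    # Threads are offered only as a lightweight fallback for quick experiments.
    pool_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with pool_cls(max_workers=workers) as pool:
        # Executor.map yields results in input order, whichever worker finishes first.
        return list(pool.map(score_sentence, sentences))
```

When using processes, call `score_all` from under an `if __name__ == "__main__":` guard, since some platforms start workers by re-importing the main module.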
+Reproducibility
+Although the Oracle runs in parallel, the evaluation is entirely reproducible and deterministic. Running the same evaluation twice (with the same Corrector and the same parameters) should give you exactly the same results.
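One common way to keep a parallel run deterministic is to give each sentence its own random generator, seeded from the run seed and the sentence index, so the outcome does not depend on which worker handles which sentence or in what order. The sketch below assumes that approach; it is not confirmed to be what kebbie does, and `sentence_seed` and `noisy_copy` are hypothetical helpers.

```python
import hashlib
import random


def sentence_seed(base_seed: int, index: int) -> int:
    """Derive a stable per-sentence seed from the run seed and the sentence index."""
    digest = hashlib.sha256(f"{base_seed}:{index}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def noisy_copy(sentence: str, base_seed: int, index: int) -> str:
    """Toy noise: swap two adjacent characters chosen by a sentence-local RNG."""
    rng = random.Random(sentence_seed(base_seed, index))
    if len(sentence) < 2:
        return sentence
    i = rng.randrange(len(sentence) - 1)
    chars = list(sentence)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)
```

Because no generator is shared between sentences, processing them in a different order, or on a different number of workers, yields the same noisy text for each one.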
+If you follow the flow of the data, this is what it looks like:
+ +The NoiseModel is the class responsible for introducing artificial typos into clean text.
+This is done in two steps:
+Info
+The keystrokes are generated by using two Gaussian distributions (over the X-axis and the Y-axis), centered on the middle of the intended key.
+In the end, the output is a noisy version of the word, along with the corresponding keystroke coordinates.
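The keystroke sampling described above can be sketched as two independent Gaussian draws, one per axis, centered on the middle of the intended key. The function name, the coordinate units, and the standard deviations below are assumptions made for illustration; the source does not give the values kebbie uses.

```python
import random
from typing import Tuple


def sample_keystroke(
    key_center: Tuple[float, float],
    sigma_x: float,
    sigma_y: float,
    rng: random.Random,
) -> Tuple[float, float]:
    """Draw one tap position around the center of the intended key."""
    cx, cy = key_center
    # One Gaussian per axis, each centered on the middle of the key.
    return rng.gauss(cx, sigma_x), rng.gauss(cy, sigma_y)


# Example: a key centered at (50, 120) with hypothetical spreads of 8 and 12 units.
rng = random.Random(0)
tap_x, tap_y = sample_keystroke((50.0, 120.0), 8.0, 12.0, rng)
```

A tap that lands outside the intended key's bounds would register as a neighboring key, which is one way such a model can produce realistic substitution typos.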