
Add keras example API for an autoencoder model #834


Merged

merged 11 commits on Mar 12, 2020
4 changes: 4 additions & 0 deletions examples/README.md
@@ -12,6 +12,10 @@

- [License plate reader](tensorflow/license-plate-reader): deploy a YOLOv3 model (and others) to identify license plates in real time.

## Keras

- [Denoisify text documents](keras/document-denoiser): deploy an autoencoder model to clean text document images of noise.

## PyTorch

- [Iris classification](pytorch/iris-classifier): deploy a model to classify iris flowers.
46 changes: 46 additions & 0 deletions examples/keras/document-denoiser/README.md
@@ -0,0 +1,46 @@
# Clean Dirty Documents w/ Autoencoders

This example model cleans text documents of anything that isn't text (i.e. noise): coffee stains, wear artifacts, and so on. You can inspect the notebook used to train the model [here](trainer.ipynb).

Here's a collage of input texts and predictions.

![Imgur](https://i.imgur.com/M4Mjz2l.jpg)

*Figure 1 - The dirty documents are on the left side and the cleaned ones are on the right*
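
The denoiser is an autoencoder that maps a noisy grayscale scan to its clean counterpart. The exact architecture and training procedure are defined in [trainer.ipynb](trainer.ipynb); the sketch below is only an illustrative stand-in showing what a convolutional variant can look like in Keras, with assumed layer counts, filter sizes, and loss.

```python
# Illustrative sketch only -- the real architecture is defined in trainer.ipynb.
# Layer counts, filter sizes, and the loss below are assumptions.
from tensorflow.keras import layers, models


def build_denoiser(height=260, width=540):
    # input dimensions match the example's resize shape of 540x260 (width x height)
    inp = layers.Input(shape=(height, width, 1))  # single-channel (grayscale) input

    # encoder: compress the noisy scan
    x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D((2, 2), padding="same")(x)
    x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
    encoded = layers.MaxPooling2D((2, 2), padding="same")(x)

    # decoder: reconstruct the clean scan
    x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(encoded)
    x = layers.UpSampling2D((2, 2))(x)
    x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
    x = layers.UpSampling2D((2, 2))(x)
    out = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model
```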

## Sample Prediction

### Prediction

Once this model is deployed, get the API endpoint by running `cortex get document-denoiser`.

Now let's take a sample image like this one.

![Imgur](https://i.imgur.com/JJLfFxB.png)

Export the endpoint and the image's URL by running:
```bash
export ENDPOINT=your-api-endpoint
export IMAGE_URL=https://i.imgur.com/JJLfFxB.png
```

Then run the following piped commands:
```bash
curl "${ENDPOINT}" -X POST -H "Content-Type: application/json" -d '{"url":"'${IMAGE_URL}'"}' |
sed 's/"//g' |
base64 -d > prediction.png
```

Once this has run, a `prediction.png` file is saved to disk. Here's the result.

![Imgur](https://i.imgur.com/PRB2oS8.png)

As can be seen, the text document has been cleaned of noise. Success!
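
If you'd rather not use `curl`, the same request can be made from Python. This is a minimal sketch that mirrors the shell pipeline above; the endpoint and image URL are placeholders for the values exported earlier.

```python
# Minimal Python equivalent of the curl | sed | base64 pipeline above.
import base64

import requests

endpoint = "your-api-endpoint"                 # same value as $ENDPOINT
image_url = "https://i.imgur.com/JJLfFxB.png"  # same value as $IMAGE_URL

response = requests.post(endpoint, json={"url": image_url})
response.raise_for_status()

# the API returns the cleaned image as a base64-encoded PNG string
encoded_png = response.text.strip('"')
with open("prediction.png", "wb") as f:
    f.write(base64.b64decode(encoded_png))
```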

---

Here's a short list of URLs for other text-document images that can be cleaned with this model. Export any of these links to the `IMAGE_URL` variable and rerun the commands above:

* https://i.imgur.com/6COQ46f.png
* https://i.imgur.com/alLI83b.png
* https://i.imgur.com/QVoSTuu.png
11 changes: 11 additions & 0 deletions examples/keras/document-denoiser/cortex.yaml
@@ -0,0 +1,11 @@
# WARNING: you are on the master branch, please refer to the examples on the branch that matches your `cortex version`

- name: document-denoiser
predictor:
type: python
path: predictor.py
config:
model: s3://cortex-examples/keras/document-denoiser
resize_shape: [540, 260]
compute:
cpu: 1
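
The `config` section above is handed to the predictor's constructor, so `predictor.py` receives it as a plain Python dictionary (shown here for reference; the delivery itself is handled by Cortex):

```python
# What PythonPredictor.__init__ receives as `config`, per the cortex.yaml above.
config = {
    "model": "s3://cortex-examples/keras/document-denoiser",
    "resize_shape": [540, 260],
}
```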
80 changes: 80 additions & 0 deletions examples/keras/document-denoiser/predictor.py
@@ -0,0 +1,80 @@
# WARNING: you are on the master branch, please refer to the examples on the branch that matches your `cortex version`

import boto3, base64, cv2, re, os, requests
import numpy as np
from tensorflow.keras.models import load_model


def get_url_image(url_image):
"""
    Fetch the image at the given URL and return it as a grayscale numpy array.
"""
resp = requests.get(url_image, stream=True).raw
image = np.asarray(bytearray(resp.read()), dtype="uint8")
image = cv2.imdecode(image, cv2.IMREAD_GRAYSCALE)
return image


def image_to_png_nparray(image):
"""
    Convert a numpy image to a PNG-encoded numpy buffer.
"""
is_success, im_buf_arr = cv2.imencode(".png", image)
return im_buf_arr


def image_to_png_bytes(image):
"""
    Convert a numpy image to PNG-encoded bytes.
"""
buf = image_to_png_nparray(image)
byte_im = buf.tobytes()
return byte_im


class PythonPredictor:
def __init__(self, config):
# download the model
model_path = config["model"]
model_name = "model.h5"
bucket, key = re.match("s3://(.+?)/(.+)", model_path).groups()
s3 = boto3.client("s3")
s3.download_file(bucket, os.path.join(key, model_name), model_name)

# load the model
self.model = load_model(model_name)

# resize shape (width, height)
self.resize_shape = tuple(config["resize_shape"])

def predict(self, payload):
# download image
img_url = payload["url"]
image = get_url_image(img_url)
resized = cv2.resize(image, self.resize_shape)

# prediction
pred = self.make_prediction(resized)

# image represented in bytes
byte_im = image_to_png_bytes(pred)

# encode image
image_enc = base64.b64encode(byte_im).decode("utf-8")

return image_enc

    def make_prediction(self, img):
        """
        Run the denoiser on a grayscale image and return the cleaned image as uint8.
        """
        # scale pixels to [0, 1] and add batch/channel dims -> shape (1, H, W, 1)
        processed = img / 255.0
        processed = np.expand_dims(processed, 0)
        processed = np.expand_dims(processed, 3)

        # run the autoencoder, then drop the channel and batch dims again
        pred = self.model.predict(processed)
        pred = np.squeeze(pred, 3)
        pred = np.squeeze(pred, 0)

        # rescale to [0, 255], clip, and convert to uint8 for PNG encoding
        out_img = pred * 255
        out_img[out_img > 255.0] = 255.0
        out_img = out_img.astype(np.uint8)
        return out_img
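
For reference, here is a hypothetical way to exercise the predictor locally before deploying. It is not part of the example and assumes the requirements are installed, AWS credentials that can read the `cortex-examples` bucket, and that `predictor.py` is importable.

```python
# Hypothetical local smoke test -- not part of the deployed example.
import base64

from predictor import PythonPredictor

config = {
    "model": "s3://cortex-examples/keras/document-denoiser",
    "resize_shape": [540, 260],
}

predictor = PythonPredictor(config)
encoded_png = predictor.predict({"url": "https://i.imgur.com/JJLfFxB.png"})

# the prediction is a base64 string; decode and save it to inspect the result
with open("local_prediction.png", "wb") as f:
    f.write(base64.b64decode(encoded_png))
```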
4 changes: 4 additions & 0 deletions examples/keras/document-denoiser/requirements.txt
@@ -0,0 +1,4 @@
numpy==1.18.0
requests==2.22.0
opencv-python==4.1.2.30
keras==2.3.1
3 changes: 3 additions & 0 deletions examples/keras/document-denoiser/sample.json
@@ -0,0 +1,3 @@
{
"url": "https://i.imgur.com/JJLfFxB.png"
}