This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
Clojure example for fixed label-width captcha recognition (#13769)
* Clojure example for fixed label-width captcha recognition
* Update evaluation
* Better training and inference (w/ cleanup)
* Captcha generation for testing
* Make simple test work
* Add test and update README
* Add missing consts file
* Follow comments
1 parent 9fc5382, commit d3bd5e7
Showing 10 changed files with 522 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
/.lein-* | ||
/.nrepl-port | ||
images/* |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
# Captcha | ||
|
||
This is the Clojure version of the [captcha recognition](https://github.com/xlvector/learning-dl/tree/master/mxnet/ocr) | ||
example by xlvector, and it mirrors the R captcha example. It can be used as an | ||
example of multi-label training: each captcha, such as the one below, is treated as an | ||
image with 4 labels (one per digit), and a CNN is trained over the data set. | ||
|
||
![captcha example](captcha_example.png) | ||
|
||
## Installation | ||
|
||
Before you run this example, make sure that you have the Clojure package | ||
installed: run `lein install` in the main Clojure package directory. | ||
Then you can run `lein install` in this directory. | ||
|
||
## Usage | ||
|
||
### Training | ||
|
||
First, the OCR model needs to be trained on [labeled data](https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/R/data/captcha_example.zip). | ||
Training can be started with the following command: | ||
``` | ||
$ lein train [:cpu|:gpu] [num-devices] | ||
``` | ||
This downloads the training/evaluation data using the `get_data.sh` script | ||
before starting training. | ||
|
||
You may encounter out-of-memory issues when training with `:gpu` on Ubuntu | ||
Linux (18.04). In that case, running `lein train` without arguments (training on one CPU) may resolve the issue. | ||
|
||
The training runs for 10 iterations by default and saves the model files with the | ||
prefix `ocr-`. The model achieved an exact-match accuracy of ~0.954 on the | ||
training data and ~0.628 on the validation data. | ||
|
||
### Inference | ||
|
||
Once the model has been saved, it can be used for prediction. This can be done | ||
by running: | ||
``` | ||
$ lein infer | ||
INFO MXNetJVM: Try loading mxnet-scala from native path. | ||
INFO MXNetJVM: Try loading mxnet-scala-linux-x86_64-gpu from native path. | ||
INFO MXNetJVM: Try loading mxnet-scala-linux-x86_64-cpu from native path. | ||
WARN MXNetJVM: MXNet Scala native library not found in path. Copying native library from the archive. Consider installing the library somewhere in the path (for Windows: PATH, for Linux: LD_LIBRARY_PATH), or specifying by Java cmd option -Djava.library.path=[lib path]. | ||
WARN org.apache.mxnet.DataDesc: Found Undefined Layout, will use default index 0 for batch axis | ||
INFO org.apache.mxnet.infer.Predictor: Latency increased due to batchSize mismatch 8 vs 1 | ||
WARN org.apache.mxnet.DataDesc: Found Undefined Layout, will use default index 0 for batch axis | ||
WARN org.apache.mxnet.DataDesc: Found Undefined Layout, will use default index 0 for batch axis | ||
CAPTCHA output: 6643 | ||
INFO org.apache.mxnet.util.NativeLibraryLoader: Deleting /tmp/mxnet6045308279291774865/libmxnet.so | ||
INFO org.apache.mxnet.util.NativeLibraryLoader: Deleting /tmp/mxnet6045308279291774865/mxnet-scala | ||
INFO org.apache.mxnet.util.NativeLibraryLoader: Deleting /tmp/mxnet6045308279291774865 | ||
``` | ||
The model runs on `captcha_example.png` by default. | ||
|
||
It can be run on other generated captcha images as well. The script | ||
`gen_captcha.py` generates one random captcha image of length 4 per run. | ||
Before running the Python script, you will need to install the [captcha](https://pypi.org/project/captcha/) | ||
library using `pip3 install --user captcha`. The captcha images are written | ||
to the `images/` folder, and prediction can then be run with, for example, | ||
`lein infer images/7534.png`. |
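The README above reports an exact-match accuracy, which is stricter than per-digit accuracy: a captcha only counts as correct when all 4 digits match. The evaluation code from the training namespace is not shown on this page, so the following is only a minimal Python sketch of that idea under the assumption that labels and predictions are lists of 4 digits; the function names are hypothetical and not part of the commit.

```python
from typing import Sequence


def exact_match_accuracy(labels: Sequence[Sequence[int]],
                         predictions: Sequence[Sequence[int]]) -> float:
    """Fraction of captchas whose predicted digits all equal the labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        return 0.0
    hits = sum(1 for y, p in zip(labels, predictions) if list(y) == list(p))
    return hits / len(labels)


def per_digit_accuracy(labels: Sequence[Sequence[int]],
                       predictions: Sequence[Sequence[int]]) -> float:
    """Fraction of individual digit positions predicted correctly."""
    total = sum(len(y) for y in labels)
    if total == 0:
        return 0.0
    hits = sum(a == b
               for y, p in zip(labels, predictions)
               for a, b in zip(y, p))
    return hits / total


if __name__ == "__main__":
    labels = [[6, 6, 4, 3], [7, 5, 3, 4]]
    predictions = [[6, 6, 4, 3], [7, 5, 3, 9]]  # one wrong digit in the second captcha
    print(exact_match_accuracy(labels, predictions))  # 0.5
    print(per_digit_accuracy(labels, predictions))    # 0.875
```

A single wrong digit makes the whole captcha count as a miss, which is one reason the exact-match figure can sit well below the per-digit figure.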
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
#!/usr/bin/env python3 | ||
|
||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
from captcha.image import ImageCaptcha | ||
import os | ||
import random | ||
|
||
length = 4 | ||
width = 160 | ||
height = 60 | ||
IMAGE_DIR = "images" | ||
|
||
|
||
def random_text(): | ||
    return ''.join(str(random.randint(0, 9)) | ||
                   for _ in range(length)) | ||
|
||
|
||
if __name__ == '__main__': | ||
    image = ImageCaptcha(width=width, height=height) | ||
    captcha_text = random_text() | ||
    if not os.path.exists(IMAGE_DIR): | ||
        os.makedirs(IMAGE_DIR) | ||
    image.write(captcha_text, os.path.join(IMAGE_DIR, captcha_text + ".png")) |
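Because `gen_captcha.py` names each image after its text (for example `images/7534.png`), the ground-truth label can be recovered from the file name when checking a prediction. The helper below is a hypothetical sketch of that convention, not code from the commit; it only uses the Python standard library.

```python
import os


def label_from_path(path: str, length: int = 4) -> list:
    """Recover the digit labels from a file name such as images/7534.png."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    # Accept only ASCII digits so that int() below cannot fail.
    if len(stem) != length or any(ch not in "0123456789" for ch in stem):
        raise ValueError(f"expected a {length}-digit file name, got {stem!r}")
    return [int(ch) for ch in stem]


if __name__ == "__main__":
    print(label_from_path("images/7534.png"))  # [7, 5, 3, 4]
```

Comparing this list with the digits printed after `CAPTCHA output:` gives a quick manual check of a single prediction.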
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
#!/bin/bash | ||
|
||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
set -evx | ||
|
||
EXAMPLE_ROOT=$(cd "$(dirname "$0")"; pwd) | ||
|
||
data_path=$EXAMPLE_ROOT | ||
|
||
if [ ! -f "$data_path/captcha_example.zip" ]; then | ||
    wget https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/R/data/captcha_example.zip -P "$data_path" | ||
fi | ||
|
||
if [ ! -f "$data_path/captcha_example/captcha_train.rec" ]; then | ||
    unzip "$data_path/captcha_example.zip" -d "$data_path" | ||
fi |
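The shell script above is idempotent: it downloads the archive only if the zip file is missing and unpacks it only if `captcha_train.rec` is missing. The same two checks can be sketched in Python with the standard library. The function name `ensure_data` and the injectable `fetch` parameter are assumptions made for illustration (the latter so the logic can be exercised without network access); the URL and file names are taken from the script.

```python
import os
import urllib.request
import zipfile

DATA_URL = ("https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com"
            "/R/data/captcha_example.zip")


def ensure_data(data_path, fetch=urllib.request.urlretrieve):
    """Download and unpack the captcha archive only when pieces are missing."""
    archive = os.path.join(data_path, "captcha_example.zip")
    marker = os.path.join(data_path, "captcha_example", "captcha_train.rec")
    # Step 1: fetch the archive unless it is already on disk.
    if not os.path.isfile(archive):
        fetch(DATA_URL, archive)
    # Step 2: unpack unless the training record file already exists.
    if not os.path.isfile(marker):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_path)
    return marker
```

Calling `ensure_data` a second time on the same directory performs neither step, which mirrors how rerunning `get_data.sh` skips the download and the unzip.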
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
;; | ||
;; Licensed to the Apache Software Foundation (ASF) under one or more | ||
;; contributor license agreements. See the NOTICE file distributed with | ||
;; this work for additional information regarding copyright ownership. | ||
;; The ASF licenses this file to You under the Apache License, Version 2.0 | ||
;; (the "License"); you may not use this file except in compliance with | ||
;; the License. You may obtain a copy of the License at | ||
;; | ||
;; http://www.apache.org/licenses/LICENSE-2.0 | ||
;; | ||
;; Unless required by applicable law or agreed to in writing, software | ||
;; distributed under the License is distributed on an "AS IS" BASIS, | ||
;; WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
;; See the License for the specific language governing permissions and | ||
;; limitations under the License. | ||
;; | ||
|
||
(defproject captcha "0.1.0-SNAPSHOT" | ||
  :description "Captcha recognition via multi-label classification" | ||
  :plugins [[lein-cljfmt "0.5.7"]] | ||
  :dependencies [[org.clojure/clojure "1.9.0"] | ||
                 [org.apache.mxnet.contrib.clojure/clojure-mxnet "1.5.0-SNAPSHOT"]] | ||
  :main ^:skip-aot captcha.train-ocr | ||
  :profiles {:train {:main captcha.train-ocr} | ||
             :infer {:main captcha.infer-ocr} | ||
             :uberjar {:aot :all}} | ||
  :aliases {"train" ["with-profile" "train" "run"] | ||
            "infer" ["with-profile" "infer" "run"]}) |
27 changes: 27 additions & 0 deletions
contrib/clojure-package/examples/captcha/src/captcha/consts.clj
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
;; | ||
;; Licensed to the Apache Software Foundation (ASF) under one or more | ||
;; contributor license agreements. See the NOTICE file distributed with | ||
;; this work for additional information regarding copyright ownership. | ||
;; The ASF licenses this file to You under the Apache License, Version 2.0 | ||
;; (the "License"); you may not use this file except in compliance with | ||
;; the License. You may obtain a copy of the License at | ||
;; | ||
;; http://www.apache.org/licenses/LICENSE-2.0 | ||
;; | ||
;; Unless required by applicable law or agreed to in writing, software | ||
;; distributed under the License is distributed on an "AS IS" BASIS, | ||
;; WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
;; See the License for the specific language governing permissions and | ||
;; limitations under the License. | ||
;; | ||
|
||
(ns captcha.consts) | ||
|
||
(def batch-size 8) | ||
(def channels 3) | ||
(def height 30) | ||
(def width 80) | ||
(def data-shape [channels height width]) | ||
(def num-labels 10) | ||
(def label-width 4) | ||
(def model-prefix "ocr") |
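The constants in `consts.clj` fix the tensor shapes used for training and inference. As a rough illustration, the Python sketch below recomputes those shapes and relates them to the 160x60 images produced by `gen_captcha.py`. The numbers are copied from the two files; the function names are hypothetical.

```python
# Values copied from consts.clj (model input) and gen_captcha.py (generated images).
BATCH_SIZE, CHANNELS, HEIGHT, WIDTH = 8, 3, 30, 80
NUM_LABELS, LABEL_WIDTH = 10, 4
GEN_WIDTH, GEN_HEIGHT = 160, 60


def data_batch_shape():
    """Shape of one input batch in NCHW layout."""
    return (BATCH_SIZE, CHANNELS, HEIGHT, WIDTH)


def label_batch_shape():
    """Shape of one label batch: one digit per position."""
    return (BATCH_SIZE, LABEL_WIDTH)


def resize_factors():
    """Downscale factors (width, height) from generated image to model input."""
    return (GEN_WIDTH / WIDTH, GEN_HEIGHT / HEIGHT)


def num_possible_captchas():
    """Number of distinct label sequences of length LABEL_WIDTH."""
    return NUM_LABELS ** LABEL_WIDTH


if __name__ == "__main__":
    print(data_batch_shape())       # (8, 3, 30, 80)
    print(label_batch_shape())      # (8, 4)
    print(resize_factors())         # (2.0, 2.0)
    print(num_possible_captchas())  # 10000
```

Both resize factors are 2.0, so scaling a generated image down to the model's input size preserves its aspect ratio.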
56 changes: 56 additions & 0 deletions
contrib/clojure-package/examples/captcha/src/captcha/infer_ocr.clj
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
;; | ||
;; Licensed to the Apache Software Foundation (ASF) under one or more | ||
;; contributor license agreements. See the NOTICE file distributed with | ||
;; this work for additional information regarding copyright ownership. | ||
;; The ASF licenses this file to You under the Apache License, Version 2.0 | ||
;; (the "License"); you may not use this file except in compliance with | ||
;; the License. You may obtain a copy of the License at | ||
;; | ||
;; http://www.apache.org/licenses/LICENSE-2.0 | ||
;; | ||
;; Unless required by applicable law or agreed to in writing, software | ||
;; distributed under the License is distributed on an "AS IS" BASIS, | ||
;; WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
;; See the License for the specific language governing permissions and | ||
;; limitations under the License. | ||
;; | ||
|
||
(ns captcha.infer-ocr | ||
  (:require [captcha.consts :refer :all] | ||
            [org.apache.clojure-mxnet.dtype :as dtype] | ||
            [org.apache.clojure-mxnet.infer :as infer] | ||
            [org.apache.clojure-mxnet.layout :as layout] | ||
            [org.apache.clojure-mxnet.ndarray :as ndarray])) | ||
|
||
(defn create-predictor | ||
  [] | ||
  (let [data-desc {:name "data" | ||
                   :shape [batch-size channels height width] | ||
                   :layout layout/NCHW | ||
                   :dtype dtype/FLOAT32} | ||
        label-desc {:name "label" | ||
                    :shape [batch-size label-width] | ||
                    :layout layout/NT | ||
                    :dtype dtype/FLOAT32} | ||
        factory (infer/model-factory model-prefix | ||
                                     [data-desc label-desc])] | ||
    (infer/create-predictor factory))) | ||
|
||
(defn -main | ||
  [& args] | ||
  (let [[filename] args | ||
        image-fname (or filename "captcha_example.png") | ||
        image-ndarray (-> image-fname | ||
                          infer/load-image-from-file | ||
                          (infer/reshape-image width height) | ||
                          (infer/buffered-image-to-pixels [channels height width]) | ||
                          (ndarray/expand-dims 0)) | ||
        label-ndarray (ndarray/zeros [1 label-width]) | ||
        predictor (create-predictor) | ||
        predictions (-> (infer/predict-with-ndarray | ||
                         predictor | ||
                         [image-ndarray label-ndarray]) | ||
                        first | ||
                        (ndarray/argmax 1) | ||
                        ndarray/->vec)] | ||
    (println "CAPTCHA output:" (apply str (mapv int predictions))))) |
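The last step of `-main` above takes an argmax along axis 1 of the first output and joins the resulting integers into a string. The network definition is not shown on this page, so the sketch below assumes the output for a single image is laid out as one row of 10 scores per digit position (4 rows in total). It reproduces the decoding step in plain Python; the names `argmax` and `decode` are hypothetical.

```python
from typing import List, Sequence


def argmax(row: Sequence[float]) -> int:
    """Index of the largest score in one row (first index wins on ties)."""
    return max(range(len(row)), key=row.__getitem__)


def decode(scores: Sequence[Sequence[float]]) -> str:
    """Turn a (label-width x num-labels) score matrix into a digit string."""
    return "".join(str(argmax(row)) for row in scores)


def one_hot_like(digit: int, num_labels: int = 10) -> List[float]:
    """Toy score row that peaks at the given digit, for demonstration only."""
    return [0.9 if i == digit else 0.01 for i in range(num_labels)]


if __name__ == "__main__":
    scores = [one_hot_like(d) for d in (6, 6, 4, 3)]
    print("CAPTCHA output:", decode(scores))  # CAPTCHA output: 6643
```

Each row is decoded independently, which matches the multi-label framing of one classification per digit position.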