cs386

Dependencies

pip install -r requirments.txt

preprocess

Read the dataset and split it into a training set and a validation set.

python preprocess.py
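As a rough illustration of this kind of split, here is a minimal sketch. It is not the repository's preprocess.py. It assumes the layout dataset/<class_name>/<file> and a per-class random split. The names split_dataset and write_split, the val_ratio of 0.2, and the seed are all hypothetical.

```python
import json
import os
import random


def split_dataset(root, val_ratio=0.2, seed=0):
    """Split every class folder under `root` into train and validation file lists.

    Assumed layout: root/<class_name>/<file>. The ratio and seed are
    illustrative defaults, not values taken from the repository.
    """
    rng = random.Random(seed)
    train, val = {}, {}
    for dir_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, dir_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files that sit next to the class folders
        filenames = sorted(os.listdir(class_dir))
        rng.shuffle(filenames)
        n_val = int(len(filenames) * val_ratio)
        val[dir_name] = filenames[:n_val]
        train[dir_name] = filenames[n_val:]
    return train, val


def write_split(train, val, out_dir="."):
    """Write the two mappings as train.json and val.json in `out_dir`."""
    for name, split in (("train.json", train), ("val.json", val)):
        with open(os.path.join(out_dir, name), "w") as f:
            json.dump(split, f)
```

Under these assumptions, calling `write_split(*split_dataset("dataset"))` would produce two files that map each class directory name to a list of file names.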

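Running preprocess.py produces train.json and val.json, which list the file paths of the training and validation sets. Each file maps a class directory name to the list of file names in that class, and can be read like this: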

import json
import os

# Load the mapping from class directory name to the list of file names.
with open("train.json", "r") as f:
    train_dataset = json.load(f)

for dir_name, filename_list in train_dataset.items():
    # The directory name is the class label.
    for filename in filename_list:
        file_dir = os.path.join("dataset", dir_name, filename)
        print(file_dir)
# For the validation set, use the same code with val.json.
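If both splits are needed, the same loop can be wrapped in a small helper. This is only a sketch. The function name load_samples and its root parameter are hypothetical and do not come from the repository.

```python
import json
import os


def load_samples(json_path, root="dataset"):
    """Return a list of (file_path, label) pairs read from a split file.

    `json_path` is expected to map each class directory name to a list
    of file names, as train.json and val.json do.
    """
    with open(json_path, "r") as f:
        split = json.load(f)
    samples = []
    for dir_name, filename_list in split.items():
        for filename in filename_list:
            # The directory name doubles as the class label.
            samples.append((os.path.join(root, dir_name, filename), dir_name))
    return samples
```

For example, `load_samples("train.json")` and `load_samples("val.json")` would return the two lists of path and label pairs.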

train

python train.py kwargs/oracle/mobilenetv2/defaults.py

Baseline: 97.36

real-world preprocessing

Use the function binarize(GrayImage, c=5) to transform a real-world grayscale image into a binarized image. The input is a grayscale image and the output is the binarized image. The parameter c is the constant used by adaptive threshold binarization; a larger c reduces more noise but can destroy the shape of the characters.

The preprocessing has three steps: first, apply a median blur to the original grayscale image; second, apply adaptive threshold binarization; third, apply a morphological opening to reduce snow noise, followed by a closing to link broken strokes.
