OCR_paper

Papers in the field of OCR (continually updated)

Document Analysis with Multi-modal Large Language Models

A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding
MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations
DOCLLM: A LAYOUT-AWARE GENERATIVE LANGUAGE MODEL FOR MULTIMODAL DOCUMENT UNDERSTANDING

Text Detection

ADNet: Rethinking the Shrunk Polygon-Based Approach in Scene Text Detection (ADNet) (TMM)
CBNet: A Plug-and-Play Network for Segmentation-based Scene Text Detection (CBNet)
Zoom Text Detector
UNITS: UNSUPERVISED INTERMEDIATE TRAINING STAGE FOR SCENE TEXT DETECTION (ICME 2022)
Vision-Language Pre-Training for Boosting Scene Text Detectors (SSL for text detection, CVPR 2022)
Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection (Transformer-based)
Kernel Proposal Network for Arbitrary Shape Text Detection (KPN)
Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion (DBNet++)
FAST: Searching for a Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation (code)

Text Recognition

End-to-End Text Recognition (Text Spotting)

DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Text Spotting
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting (CVPR 2023)
Text Spotting Transformers (Transformer detects control points)
PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text (PANNet for Text Spotting)
DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting (single-point text spotting)
SPTS: Single-Point Text Spotting (single-point text spotting)

Document Layout Analysis

Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks (DOC2GRAPH)

Font Generation && Style Transfer

Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator

OCR Post-Processing (spell check)

General and Domain Adaptive Chinese Spelling Check with Error Consistent Pretraining

Paragraph Recognition

LexiconNet: An End-to-End Handwritten Paragraph Text Recognition System

Mathematical Expression Recognition

CoMER: Modeling Coverage for Transformer-based Handwritten Mathematical Expression Recognition
When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition

Table Related

Revisiting Table Detection Datasets for Visually Rich Documents

milely/OCR_paper