Restrict Text Recognization to Digit Only #1876

jidalii · 2025-02-28T16:17:35Z


Hi, I am trying to do the text recognition of a table with digits only. Sometimes, the digits are recognized as English characters. I wonder whether there is a way to restrict the scope of the text recognition to digits only. Thanks!


felixdittrich92 · 2025-03-04T13:08:14Z

Maintainer

Hi @jidalii 👋,

Unfortunately, we don’t have built-in logic for blacklisting or whitelisting characters yet.
There’s already a related ticket: #988.

I’ve tried several approaches, including:

Logits masking: Hard avoidance of blacklisted characters by setting their indices to -inf in the logits before applying softmax.
Logits scaling: Applying a penalty by multiplying the logits of blacklisted characters by a penalty factor before softmax.
Filtering after softmax: Replacing blacklisted characters with the next most probable character.
Refined iteration: Computing softmax -> masking blacklisted character indices -> recomputing the last feature representation (from the head weights) and passing the masked features through the final linear layer.
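To make the first two approaches concrete, here is a minimal pure-Python sketch on a single decoding step. It is not docTR code: the toy vocabulary, the logit values, and the helper names (`mask_logits`, `penalize_logits`, `best_char`) are all assumptions made for illustration. The penalty variant divides positive logits and multiplies negative ones, which is one common way to make a multiplicative penalty always lower the score; the thread does not say which variant was tried.

```python
import math

VOCAB = "0123456789O"  # toy vocabulary (assumption): ten digits plus the letter "O"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    if peak == -math.inf:
        raise ValueError("every class is masked")
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_logits(logits, blocked):
    """Hard masking: blocked indices become -inf, so softmax gives them probability 0."""
    return [-math.inf if i in blocked else x for i, x in enumerate(logits)]


def penalize_logits(logits, blocked, penalty=2.0):
    """Soft penalty: lower blocked logits without removing them entirely."""
    def shrink(x):
        # Dividing a positive logit or multiplying a negative one both reduce it.
        return x / penalty if x > 0 else x * penalty
    return [shrink(x) if i in blocked else x for i, x in enumerate(logits)]


def best_char(logits):
    """Return the most probable character and its probability."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return VOCAB[idx], probs[idx]


blocked = {VOCAB.index("O")}
logits = [2.0] + [0.1] * 9 + [2.5]  # made-up scores: "O" narrowly beats "0"

print(best_char(logits)[0])                            # O
print(best_char(mask_logits(logits, blocked))[0])      # 0
print(best_char(penalize_logits(logits, blocked))[0])  # 0
```

Hard masking guarantees the blocked character can never be emitted, while the penalty only makes it less likely, so a sufficiently confident model could still output it.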

However, I wasn’t really satisfied with the results. In some cases, the model started predicting seemingly random characters to fill the restricted positions (e.g., when "a" was blacklisted, instead of choosing the closer "á," it predicted "R").

Maybe @frgfm has some ideas I haven’t tested yet? 🤗

In general, I think the simplest approach would be to introduce a RecognitionLogitsProcessor, allowing users to manipulate the logit vector before softmax. However, this would require some understanding of the process, so I’m not sure how many users would take advantage of it.
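As a rough idea of what such a hook could look like, here is a hypothetical sketch. Neither `DigitsOnlyLogitsProcessor` nor `greedy_decode` exists in docTR; both are invented here, and the toy setup assumes one logit per vocabulary character (real recognition heads usually carry an extra blank or end-of-sequence class that would need to stay unmasked).

```python
import math
from typing import Callable, List, Optional

Logits = List[List[float]]  # shape: (sequence_length, vocab_size)


class DigitsOnlyLogitsProcessor:
    """Hypothetical processor that masks every character outside `allowed`."""

    def __init__(self, vocab: str, allowed: str) -> None:
        self.keep = [ch in allowed for ch in vocab]
        if not any(self.keep):
            raise ValueError("no allowed character is part of the vocabulary")

    def __call__(self, logits: Logits) -> Logits:
        # Set disallowed classes to -inf at every time step.
        return [
            [x if keep else -math.inf for x, keep in zip(step, self.keep)]
            for step in logits
        ]


def greedy_decode(
    logits: Logits,
    vocab: str,
    processor: Optional[Callable[[Logits], Logits]] = None,
) -> str:
    """Pick the highest-scoring character per step (argmax is unchanged by softmax)."""
    if processor is not None:
        logits = processor(logits)
    return "".join(
        vocab[max(range(len(step)), key=step.__getitem__)] for step in logits
    )


vocab = "0123456789lO"


def step(best: str, second: str) -> List[float]:
    """Build a made-up logit row with a clear winner and a runner-up."""
    row = [0.0] * len(vocab)
    row[vocab.index(best)] = 3.0
    row[vocab.index(second)] = 2.0
    return row


logits = [step("l", "1"), step("O", "0"), step("7", "l")]
print(greedy_decode(logits, vocab))                                                  # lO7
print(greedy_decode(logits, vocab, DigitsOnlyLogitsProcessor(vocab, "0123456789")))  # 107
```

Keeping the processor as a plain callable leaves the output size untouched, so the decoding step does not need to know that any characters were restricted.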

5 replies

frgfm Mar 5, 2025
Maintainer

Hey there 👋

I think logits masking would be the best approach here. One could argue that it does extra computation for nothing, but for decoding it is much easier than changing the number of possible output characters.

In some regards, we could think about a LogitsProcessor in the same spirit as structured generation for LLMs. But I think that would require heavy refactoring 😅

felixdittrich92 Mar 5, 2025
Maintainer

A LogitsProcessor should not take too much effort; we already have the base class:

doctr/doctr/models/recognition/core.py

Line 39 in 225210c

class RecognitionPostProcessor(NestedObject):

and all our recognition post-processors inherit from it; this is where we pass the logits and compute the softmax :)

Regarding computation, I totally agree. But if I remember correctly, with masking before softmax the model replaced the blacklisted character with "random" characters because the remaining probabilities are too close to each other.
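The "too close" effect can be illustrated with made-up numbers. In the sketch below (an assumption-laden toy, not measured model output), the blacklisted "a" dominates the step and the leftover scores are nearly flat, so the winner after masking is decided by a tiny margin and need not be the visually closest character.

```python
import math


def masked_probs(logits, blocked):
    """Softmax over the characters that remain after removing the blocked ones."""
    kept = {ch: x for ch, x in logits.items() if ch not in blocked}
    peak = max(kept.values())
    exps = {ch: math.exp(x - peak) for ch, x in kept.items()}
    total = sum(exps.values())
    return {ch: e / total for ch, e in exps.items()}


# Made-up logits: the model is confident about "a"; everything else is near-uniform.
logits = {"a": 9.0, "á": 1.02, "R": 1.05, "b": 1.00}

probs = masked_probs(logits, blocked={"a"})
ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
winner, runner_up = ranked[0], ranked[1]
margin = winner[1] - runner_up[1]

print(f"winner={winner[0]!r} runner_up={runner_up[0]!r} margin={margin:.3f}")
```

Here "R" edges out "á" by roughly one percentage point, which is the kind of near-tie that makes the replacement look random.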

felixdittrich92 Mar 5, 2025
Maintainer

However, you have to "know what you are doing" with a LogitsProcessor. I think not many people would use it, or it would need extensive explanation and/or help.

Something like ocr_predictor(.. whitelist=VOCABS["french"] + VOCABS["german"]), for example, seems more user-friendly, but it would require a more robust solution/logic on our end 😅
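One piece such a whitelist option would need is a translation from a character string to the class indices to mask. A minimal sketch follows, assuming a hypothetical helper `whitelist_to_blocked_indices` and a stand-in `TOY_VOCABS` dictionary in place of docTR's `VOCABS`; the `whitelist` argument itself is only proposed in this thread and does not exist.

```python
import string

# Stand-in for docTR's VOCABS mapping (assumption, kept tiny for illustration).
TOY_VOCABS = {
    "digits": string.digits,
    "ascii_letters": string.ascii_letters,
}


def whitelist_to_blocked_indices(model_vocab: str, whitelist: str) -> set:
    """Return the indices of all model characters that are NOT whitelisted."""
    allowed = set(whitelist)
    unknown = allowed - set(model_vocab)
    if unknown:
        # A whitelisted character the model cannot predict is likely a user error.
        raise ValueError(f"characters not in model vocab: {sorted(unknown)}")
    return {i for i, ch in enumerate(model_vocab) if ch not in allowed}


model_vocab = TOY_VOCABS["digits"] + TOY_VOCABS["ascii_letters"]
blocked = whitelist_to_blocked_indices(model_vocab, TOY_VOCABS["digits"])
print(len(blocked))  # 52
```

The resulting index set could then be fed to whichever masking strategy is chosen, so the user only ever deals with characters rather than logit positions.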

Answer selected by jidalii

jidalii Mar 19, 2025
Author

Thank you so much for your suggestions! I will definitely try the approach of adding an extra LogitsProcessor layer.

felixdittrich92 Mar 19, 2025
Maintainer

It's on the roadmap to find a hopefully more user friendly solution 👍
CC @SiddhantBahuguna
