Code for ALBEF: a new vision-language pre-training method
Updated Sep 20, 2022 - Python
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
Quality-Aware Image-Text Alignment for Opinion-Unaware Image Quality Assessment
A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.
A server powering LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.
An Interactive Game-based Vision Planning benchmark
This project is a FastAPI-based web application designed to analyze Cambridge IELTS PDFs (Books 1–18) for the most and least repeated words. It can handle both regular text-based PDFs and scanned image-based PDFs by converting them to images and extracting text using OCR (Optical Character Recognition).
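The word-frequency step described above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the `word_frequencies` helper is hypothetical, and the OCR step (typically done with libraries such as pdf2image and pytesseract) is only indicated in a comment.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=5):
    """Count words case-insensitively and return the most and least common."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    counts = Counter(words)
    most = counts.most_common(top_n)
    # Slice from the tail of the descending list to get the rarest words.
    least = counts.most_common()[:-top_n - 1:-1]
    return most, least

# For scanned PDFs, the text would first be recovered via OCR
# (e.g. rendering pages to images, then running an OCR engine on them);
# here we just use a plain string in its place.
sample = "The test is a test of the reading test"
most, least = word_frequencies(sample, top_n=2)
```

Here `most` ranks words by descending count, while `least` holds the rarest words found in the text.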
A caption generator using LAVIS and Argos Translate
The first public Vietnamese visual linguistic foundation model(s)
lmmtoolkit is a toolkit for Multi-Modal Learning
Some Python scripts to load Vietnamese visual linguistic data
Text-Image-Text is a bidirectional system that enables seamless retrieval of images based on text descriptions, and vice versa. It leverages state-of-the-art language and vision models to bridge the gap between textual and visual representations.
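The bidirectional retrieval idea can be sketched as nearest-neighbor search over a shared embedding space: encode both modalities into vectors, then rank one side by cosine similarity to a query from the other. The `cosine` and `retrieve` helpers below and the toy 3-d vectors are illustrative stand-ins for real encoder outputs, not the repository's API.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(query_vec, gallery):
    """Rank (name, vector) gallery items by similarity to the query vector."""
    return sorted(gallery, key=lambda item: cosine(query_vec, item[1]), reverse=True)

# Toy embeddings standing in for real image-encoder outputs.
images = [("dog.jpg", [0.9, 0.1, 0.0]), ("cat.jpg", [0.1, 0.9, 0.0])]
# Toy embedding standing in for an encoded text query like "a photo of a dog".
text_query = [1.0, 0.0, 0.1]
best = retrieve(text_query, images)[0][0]  # → "dog.jpg"
```

Because the same similarity function works in both directions, retrieving text for an image query uses `retrieve` unchanged with the roles of the two modalities swapped.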
The official code for the paper "Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking", ACM Multimedia 2019 Oral
Scan text from an image and convert it into speech/audio in a desired language.