okvqa

Here are 4 public repositories matching this topic...

MILVLG / prophet

Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

pytorch visual-question-answering multimodal-deep-learning gpt-3 prompt-engineering okvqa a-okvqa

Updated May 23, 2023
Python

ellenzhuwang / implicit_vkood

An end-to-end multimodal framework incorporating explicit knowledge graphs and OOD detection (NeurIPS 2023).

knowledge-graph vqa visual-question-answering multimodal vision-and-language multimodal-deep-learning ood-detection implicit-differentiation image-text-retrieval okvqa neurips-2023

Updated Sep 4, 2024
Python

ParadoxZW / prophet

Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

pytorch multimodal-learning visual-question-answering gpt-3 prompt-engineering okvqa a-okvqa

Updated May 12, 2023
Python

Violet is a Python-based library designed for generating Arabic image captions. The pipeline leverages state-of-the-art transformer models, providing an easy-to-use interface for researchers and developers working on tasks such as image captioning and visual question answering (VQA).

transformers python3 pytorch vqa image-captioning vqav2 okvqa

Updated Jan 3, 2025
Jupyter Notebook
