This repository contains code for the paper 'TM-PATHVQA: 90000+ Textless Multilingual Questions for Medical Visual Question Answering', accepted at Interspeech 2024.
The paper introduces a novel VQA dataset for healthcare and medical diagnostics. Current text-based VQA systems are of limited use in scenarios where hands-free interaction and accessibility are crucial while performing tasks. A speech-based VQA system offers a better mode of interaction, allowing information to be accessed while other tasks are carried out simultaneously. To this end, this work implements a speech-based VQA system by introducing the Textless Multilingual Pathological VQA (TM-PathVQA) dataset, an expansion of the PathVQA dataset.
The dataset can be accessed via the following link:
Requirements:
- Python>=3.8
- torch>1.6
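The requirements above can be installed in a fresh virtual environment, for example as sketched below. The exact package pins and any additional dependencies the repository may declare (e.g. in a requirements.txt) are assumptions; check the repository for the authoritative list.

```shell
# Minimal sketch of environment setup matching the stated requirements.
# Package versions beyond Python>=3.8 and torch>1.6 are assumptions.
python3 -m venv .venv
source .venv/bin/activate
pip install "torch>1.6"
```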