aquorio15/path_vqa

Path_VQA

This repository contains the code for the paper 'TM-PATHVQA: 90000+ Textless Multilingual Questions for Medical Visual Question Answering', which has been accepted at Interspeech 2024.

Overview

The paper explores a novel VQA dataset for healthcare and medical diagnostics. Current text-based VQA systems are of limited use in scenarios where hands-free interaction and accessibility are crucial. A speech-based VQA system may offer a better means of interaction, since information can be accessed while other tasks are being performed. To this end, this work implements a speech-based VQA system by introducing the Textless Multilingual Pathological VQA (TM-PathVQA) dataset, an expansion of the PathVQA dataset. The dataset can be accessed at the following link:

Installation

Prerequisites

  • Python>=3.8
  • torch>1.6
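
The prerequisites above can be checked before running any code. The following is a minimal sketch, not part of this repository: the helper names `release_tuple` and `check_prerequisites` are hypothetical, and the check assumes that `torch>1.6` is meant in the pip sense, so that 1.6.0 itself does not qualify.

```python
import sys


def release_tuple(version_text):
    """Parse '2.1.0+cu118' into (2, 1): numeric release parts, trailing zeros dropped."""
    parts = []
    # Drop any local build tag such as '+cu118', then read the dotted numbers.
    for piece in version_text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Treat '1.6.0' and '1.6' as the same release.
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def check_prerequisites(python_version=None, torch_version=None):
    """Return a list of problems; an empty list means both prerequisites are met.

    Both arguments are optional overrides (hypothetical, for testing);
    by default the running interpreter and the installed torch are inspected.
    """
    problems = []
    if python_version is None:
        python_version = tuple(sys.version_info[:3])
    if tuple(python_version) < (3, 8):
        found = ".".join(str(part) for part in python_version)
        problems.append(f"Python >= 3.8 is required, found {found}")
    if torch_version is None:
        try:
            # importlib.metadata is only available on Python 3.8+.
            from importlib import metadata
            torch_version = metadata.version("torch")
        except Exception:
            # Covers both a missing importlib.metadata and a missing torch.
            problems.append("torch is not installed")
            return problems
    if not release_tuple(torch_version) > (1, 6):
        problems.append(f"torch > 1.6 is required, found {torch_version}")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    if issues:
        for issue in issues:
            print("Problem:", issue)
    else:
        print("All prerequisites satisfied.")
```

Run as a script, it prints one line per unmet prerequisite. Reading the torch version from package metadata avoids importing torch just to inspect its version.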
