We have created a Japanese visual question answering (VQA) dataset using Yahoo! Crowdsourcing, based on images from the Visual Genome dataset. Our dataset is designed to be comparable to the free-form QA portion of the Visual Genome dataset. It contains 793,664 Japanese QA pairs over 99,208 images, with exactly eight QA pairs per image.
The annotations are stored in a single JSON file. The data format is a subset of the Visual Genome v1.2 format.
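As a quick orientation to the format, here is a minimal Python sketch for iterating over the QA pairs. It assumes the annotation file is named `question_answers.json` and follows the Visual Genome v1.2 question-answers schema (a list of per-image records, each with an `id` and a `qas` list of objects carrying `qa_id`, `image_id`, `question`, and `answer`); both the filename and the field names are assumptions drawn from the Visual Genome format, not guarantees about this release.

```python
import json

# Minimal sketch: load the annotation file and flatten it into
# (image_id, question, answer) triples. The filename and field
# names are assumed from the Visual Genome v1.2 schema.
with open("question_answers.json", encoding="utf-8") as f:
    entries = json.load(f)  # one record per image

qa_pairs = [
    (qa["image_id"], qa["question"], qa["answer"])
    for entry in entries
    for qa in entry["qas"]
]

print(f"{len(entries)} images, {len(qa_pairs)} QA pairs")
# Expected per the counts above: 99,208 images, 793,664 QA pairs
```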
This dataset is released under the Creative Commons Attribution 4.0 License (CC BY 4.0).
If you use this dataset, please cite the following paper:

@InProceedings{C18-1163,
author = "Shimizu, Nobuyuki
and Rong, Na
and Miyazaki, Takashi",
title = "Visual Question Answering Dataset for Bilingual Image Understanding: A Study of Cross-Lingual Transfer Using Attention Maps",
booktitle = "Proceedings of the 27th International Conference on Computational Linguistics",
year = "2018",
publisher = "Association for Computational Linguistics",
pages = "1918--1928",
location = "Santa Fe, New Mexico, USA",
url = "http://aclweb.org/anthology/C18-1163"
}