This is the code for our paper N24News: A New Dataset for Multimodal News Classification, which has been accepted by the 13th Conference on Language Resources and Evaluation (LREC 2022).
The build_fully_dataset.py script downloads the complete version of N24News. Place nytimes_dataset.json in the same directory as the script, then run the script.
Once it finishes, the images are stored in the 'images' folder, and each image's file name corresponds to the 'image_id' field in the JSON file.
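As a rough illustration of how the downloaded images can be matched back to their records, here is a minimal sketch. It is not part of the repository. It assumes that nytimes_dataset.json holds a JSON array of objects, each with an 'image_id' field, and that each image file is named after its 'image_id' with some file extension. The helper names find_image and pair_records_with_images are hypothetical.

```python
import glob
import json
from pathlib import Path


def find_image(images_dir, image_id):
    """Return the first file in images_dir named '<image_id>.<ext>', or None."""
    # The extension is not known in advance (assumption), so match any.
    pattern = glob.escape(str(image_id)) + ".*"
    matches = sorted(Path(images_dir).glob(pattern))
    return matches[0] if matches else None


def pair_records_with_images(json_path, images_dir="images"):
    """Yield (record, image_path) for every record whose image file exists."""
    with open(json_path, encoding="utf-8") as f:
        records = json.load(f)  # assumed layout: a JSON array of objects
    for record in records:
        image_path = find_image(images_dir, record["image_id"])
        if image_path is not None:
            yield record, image_path
```

Calling list(pair_records_with_images("nytimes_dataset.json")) from the directory that contains the 'images' folder would then return one (record, image_path) pair per downloaded image. Records whose image is missing are skipped rather than raising an error.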
N24News is also available for direct download from Google Drive: https://drive.google.com/file/d/1OS1fXwZ1Vsj70lEQajccyssxQRYp5X9D/view?usp=share_link
It is available from Baidu Cloud as well: https://pan.baidu.com/s/1wb6-9IrydjAi03_P3l0u9w (extraction code: qaib)
If you use N24News in your work, please cite our paper with the following BibTeX entry.
@InProceedings{wang-EtAl:2022:LREC3,
author = {Wang, Zhen and Shan, Xu and Zhang, Xiangxie and Yang, Jie},
title = {N24News: A New Dataset for Multimodal News Classification},
booktitle = {Proceedings of the Language Resources and Evaluation Conference},
month = {June},
year = {2022},
address = {Marseille, France},
publisher = {European Language Resources Association},
pages = {6768--6775},
url = {https://aclanthology.org/2022.lrec-1.729}
}