Implementation of our paper "Object-aware Multimodal Named Entity Recognition in Social Media Posts with Adversarial Learning", published in IEEE Transactions on Multimedia. The implementation is built on NCRF++.
The overall architecture of our model. The model uses detected objects as auxiliary context for understanding entities in sentences. Adversarial learning bridges the modality gap between textual and visual features, and a gated bilinear attention network aligns entities with related visual objects and fuses the multimodal information (a rough sketch of both components is shown below).
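For illustration only, here is a minimal PyTorch sketch of the two components named above: a gradient-reversal layer (the standard trick for adversarial feature alignment) and a gated bilinear attention fusion. The dimensions, gating form, and module names are our assumptions for this sketch, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates gradients in the backward pass,
    so a modality discriminator trained on top pushes text and visual
    features toward a shared space.
    Usage: shared = GradReverse.apply(features, 1.0)"""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class GatedBilinearAttention(nn.Module):
    """Aligns each word with detected objects via a bilinear score, then
    gates how much visual context is mixed into the word representation."""
    def __init__(self, text_dim, obj_dim):
        super().__init__()
        self.bilinear = nn.Parameter(torch.empty(text_dim, obj_dim))
        nn.init.xavier_uniform_(self.bilinear)
        self.obj_proj = nn.Linear(obj_dim, text_dim)
        self.gate = nn.Linear(2 * text_dim, text_dim)

    def forward(self, words, objects):
        # words: (batch, seq_len, text_dim); objects: (batch, n_obj, obj_dim)
        scores = torch.einsum("bst,to,bno->bsn", words, self.bilinear, objects)
        attn = F.softmax(scores, dim=-1)            # word-to-object attention
        visual = self.obj_proj(attn @ objects)      # (batch, seq_len, text_dim)
        g = torch.sigmoid(self.gate(torch.cat([words, visual], dim=-1)))
        return words + g * visual                   # gated multimodal fusion
```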
- python >= 3.6
- pytorch >= 1.1.0
- NCRF++
- You can download the multimodal dataset from twitter2015
- We use pretrained GloVe embeddings to initialize our model, which can be downloaded here
- We preprocess the images and extract object features with Mask R-CNN. The preprocessed object feature data can be downloaded here (see the sketch after this list for the general procedure)
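The released feature files were produced by the authors' own pipeline, but a rough sketch of per-object feature extraction with a pretrained torchvision Mask R-CNN might look like the following; the model variant, FPN level, and pooling are assumptions for illustration.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.ops import roi_align

model = maskrcnn_resnet50_fpn(pretrained=True).eval()

@torch.no_grad()
def object_features(image, top_k=4):
    """image: float tensor (3, H, W) with values in [0, 1]."""
    boxes = model([image])[0]["boxes"][:top_k]       # top detected object boxes
    # Re-run the backbone to grab the stride-4 FPN map; for brevity this
    # skips the detector's internal resizing/normalization.
    fmap = model.backbone(image.unsqueeze(0))["0"]
    feats = roi_align(fmap, [boxes], output_size=7, spatial_scale=1.0 / 4)
    return feats.mean(dim=(2, 3))                    # (top_k, 256) object vectors
```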
Set the `status` attribute in `demo.train.config` to `train` or `decode`, and then run:

```
python main.py --config demo.train.config
```
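NCRF++ reads plain key=value config files, with `#` used to comment out items. For reference, the relevant entries in `demo.train.config` look roughly like this; the embedding path is a placeholder for wherever you saved the GloVe file:

```
### set status=decode to tag new data with a trained model
status=train
### point word_emb_dir at the downloaded GloVe embeddings (placeholder path)
word_emb_dir=data/glove.6B.100d.txt
```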
If you find this repo helpful, please cite the following:
```bibtex
@article{zheng2020object,
  title={Object-aware Multimodal Named Entity Recognition in Social Media Posts with Adversarial Learning},
  author={Zheng, Changmeng and Wu, Zhiwei and Wang, Tao and Cai, Yi and Li, Qing},
  journal={IEEE Transactions on Multimedia},
  year={2020},
  publisher={IEEE}
}
```