This project produces large-scale MLLM data using a data engine. It includes:
- Documents: arXiv source data, with PNG images and the corresponding text;
- Screenshots: screenshots along with the text detected in them;
- Persons: famous people, identified as world knowledge.
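
As a rough illustration of how the three data types listed above could share one record format, here is a minimal sketch. The `Sample` class, its field names, the category strings, and the JSON Lines output are all assumptions made for this example; they are not taken from the project's actual code or data layout.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical category names, mirroring the three data types listed above.
CATEGORIES = ("document", "screenshot", "person")


@dataclass
class Sample:
    """One image-text pair (hypothetical schema, not the project's own)."""
    category: str    # one of CATEGORIES
    image_path: str  # path to the PNG image or screenshot
    text: str        # paired text: source text, detected text, or a person's name

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def to_jsonl(samples):
    """Serialize samples as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(s), ensure_ascii=False) for s in samples)


# Placeholder values for illustration only.
example = [
    Sample("document", "doc/page_001.png", "E = mc^2"),
    Sample("screenshot", "shots/0001.png", "Settings"),
]
lines = to_jsonl(example).split("\n")
```

Keeping every data type in one flat record makes it easy to mix them in a single training file, at the cost of losing type-specific fields such as bounding boxes for detected text.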
Test with LaTeX:
All of them use the brand-new Android search epidemic.