Project conducted during the King's College Prompting Hackathon
To what extent can LLMs be useful for multi-modal knowledge acquisition and inference?
- Prior work has leveraged text-only LLMs for knowledge extraction and knowledge graph (KG) completion (overview here).
- We would like to extend such approaches to multi-modal knowledge, covering not only text and images but also audio, video, haptics, etc.
- The goal is to test the ability of multi-modal LLMs such as GPT-4 (as well as others) to construct and complete a multi-modal KG in the context of the MuseIT project (https://www.muse-it.eu/); see the sketch after this list.
- It would be particularly interesting to explore the capabilities of LLMs for multi-modal reasoning and inference.
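
As a starting point, here is a minimal sketch of what one such probe could look like, assuming the OpenAI Python SDK (v1.x) with an `OPENAI_API_KEY` set in the environment; the model name `gpt-4o`, the prompt wording, and the example image URL and caption are illustrative assumptions, not artifacts of the project:

```python
# Sketch: prompt a multi-modal LLM to propose KG triples for an image + caption.
# Assumes the OpenAI Python SDK (v1.x); model and inputs are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "You are populating a multi-modal knowledge graph. "
    "Given the image and its caption, return RDF-style triples as JSON, "
    'e.g. [{"subject": ..., "predicate": ..., "object": ...}]. '
    "Only state facts supported by the inputs."
)

def extract_triples(image_url: str, caption: str) -> str:
    """Ask a vision-capable model for candidate (subject, predicate, object) triples."""
    response = client.chat.completions.create(
        model="gpt-4o",  # any multi-modal chat model would do here
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": f"{PROMPT}\n\nCaption: {caption}"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Hypothetical example inputs; real experiments would use MuseIT assets.
    print(extract_triples(
        "https://example.org/artwork.jpg",
        "A marble bust of a Roman emperor, 2nd century AD.",
    ))
```

The same prompting pattern could plausibly be reused for the completion and inference experiments, e.g. by also passing the KG's existing triples in the prompt and asking the model to propose missing links.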