Summary
Hallucination detection is a challenging task for large language models (LLMs), and existing studies rely heavily on powerful closed-source LLMs such as GPT-4. In this paper, we propose HaluAgent, an autonomous LLM-based agent framework that enables relatively small LLMs (e.g., Baichuan2-Chat 7B) to actively select suitable tools for detecting multiple types of hallucinations, such as those in text, code, and mathematical expressions.
Key Findings
HaluAgent integrates the LLM, a multi-functional toolbox, and a fine-grained three-stage detection framework together with a memory mechanism (see the sketch after this list).
HaluAgent leverages existing Chinese and English datasets to synthesize detection trajectories for fine-tuning, enabling bilingual hallucination detection.
Extensive experiments demonstrate that HaluAgent can perform hallucination detection on various types of tasks and datasets, achieving performance comparable to or even higher than GPT-4 without tool enhancements.
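To make the three-stage design concrete, below is a minimal sketch of a HaluAgent-style detection loop: segment the response into checkable claims, verify each claim with an LLM-selected tool, then reflect over the recorded evidence. The `call_llm` helper, the tool stubs, the prompts, and the `Memory` class are illustrative assumptions for this sketch, not the paper's actual interfaces.

```python
# Illustrative sketch of a HaluAgent-style detection loop.
# All names, prompts, and stubs below are assumptions made for this sketch.

from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


def call_llm(prompt: str) -> str:
    """Placeholder for the small detector LLM (e.g., a fine-tuned 7B chat model)."""
    raise NotImplementedError("plug in your model call here")


def search_check(claim: str) -> Tuple[bool, str]:
    """Stub: verify a factual claim against a search engine or knowledge base."""
    raise NotImplementedError


def code_check(claim: str) -> Tuple[bool, str]:
    """Stub: execute or statically check a code snippet."""
    raise NotImplementedError


def math_check(claim: str) -> Tuple[bool, str]:
    """Stub: evaluate a mathematical expression with a calculator."""
    raise NotImplementedError


# Hypothetical toolbox mapping hallucination types to verifier tools.
TOOLBOX: Dict[str, Callable[[str], Tuple[bool, str]]] = {
    "text": search_check,
    "code": code_check,
    "math": math_check,
}


@dataclass
class Memory:
    """Keeps intermediate verification results for the final reflection step."""
    records: List[dict] = field(default_factory=list)


def detect_hallucination(response: str) -> bool:
    memory = Memory()

    # Stage 1: segment the response into independently checkable claims.
    claims = [c for c in call_llm(f"Split into atomic claims:\n{response}").splitlines() if c]

    # Stage 2: let the LLM choose a tool for each claim, then verify it.
    for claim in claims:
        tool = call_llm(f"Choose one of {list(TOOLBOX)} to verify: {claim}").strip()
        ok, evidence = TOOLBOX.get(tool, search_check)(claim)
        memory.records.append({"claim": claim, "tool": tool, "ok": ok, "evidence": evidence})

    # Stage 3: reflect over all recorded evidence and return the final verdict.
    verdict = call_llm(f"Given these checks, does the response contain hallucinations?\n{memory.records}")
    return verdict.strip().lower().startswith("yes")
```

In this reading, the memory is what lets the reflection stage revise or confirm per-claim verdicts using all the evidence gathered so far, rather than judging each claim in isolation.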
Implementation Guidance
Develop a multi-functional toolbox for HaluAgent to detect various types of hallucinations.
Design a fine-grained three-stage detection framework and integrate it with a memory mechanism.
Fine-tune HaluAgent on detection trajectories synthesized from existing Chinese and English datasets to enable bilingual detection (see the data-synthesis sketch below).
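As a rough illustration of trajectory synthesis, the snippet below converts labeled (question, answer, hallucination label) examples, together with recorded tool calls, into chat-style training examples written as JSONL. The field names and message schema are assumptions for this sketch and may differ from HaluAgent's actual trajectory format.

```python
# Hedged sketch: turning labeled examples into tool-use trajectories for
# supervised fine-tuning. The schema below is an assumption, not the
# paper's exact format.
import json


def build_trajectory(question: str, answer: str, is_hallucinated: bool,
                     tool_calls: list) -> dict:
    """Serialize one detection episode as a chat-style training example."""
    messages = [
        {"role": "user", "content": f"Detect hallucinations.\nQ: {question}\nA: {answer}"},
    ]
    # Record each intermediate tool decision as an assistant/tool exchange,
    # so the small model learns both tool selection and result interpretation.
    for call in tool_calls:
        messages.append({"role": "assistant",
                         "content": f"Verify '{call['claim']}' with {call['tool']}."})
        messages.append({"role": "tool", "content": call["result"]})
    messages.append({"role": "assistant",
                     "content": "Hallucinated." if is_hallucinated else "Not hallucinated."})
    return {"messages": messages}


def export(dataset: list, path: str) -> None:
    """Write one trajectory per line; ensure_ascii=False preserves Chinese text."""
    with open(path, "w", encoding="utf-8") as f:
        for example in dataset:
            f.write(json.dumps(build_trajectory(**example), ensure_ascii=False) + "\n")
```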
Reference
Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector