Researchers at the University of Oxford have developed a novel method to detect when a large language model (LLM) is likely to "hallucinate", that is, to invent facts that sound plausible but are imaginary [3]. The method, grounded in statistics, estimates uncertainty from the variation between multiple sampled outputs, computing it at the level of meaning rather than sequences of words [4]. This helps identify when an LLM is uncertain about the actual meaning of an answer, not just its phrasing.
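For illustration, the sketch below shows one way such meaning-level uncertainty could be estimated: sample several answers to the same question, group the samples into clusters that share a meaning, and measure the entropy over those clusters. The function names (`cluster_by_meaning`, `meaning_entropy`) and the trivial string-match stand-in for a real semantic-equivalence check are assumptions made for this sketch, not the researchers' actual implementation.

```python
# Minimal sketch of meaning-level uncertainty estimation.
# Assumptions (not from the paper): `same_meaning` is a placeholder for a real
# semantic-equivalence check; here it is a naive normalised string comparison.
import math
from collections import Counter
from typing import Callable, List


def cluster_by_meaning(answers: List[str],
                       same_meaning: Callable[[str, str], bool]) -> List[int]:
    """Assign each answer to the first cluster whose representative it shares a
    meaning with; otherwise start a new cluster."""
    reps: List[str] = []
    labels: List[int] = []
    for ans in answers:
        for i, rep in enumerate(reps):
            if same_meaning(ans, rep):
                labels.append(i)
                break
        else:
            reps.append(ans)
            labels.append(len(reps) - 1)
    return labels


def meaning_entropy(answers: List[str],
                    same_meaning: Callable[[str, str], bool]) -> float:
    """Entropy over meaning clusters: low when samples agree on one meaning,
    high when they scatter across many meanings (a possible hallucination flag)."""
    labels = cluster_by_meaning(answers, same_meaning)
    counts = Counter(labels)
    total = sum(counts.values())
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


if __name__ == "__main__":
    # Toy stand-in for a semantic-equivalence check (illustrative only).
    naive_same_meaning = lambda a, b: a.strip().lower() == b.strip().lower()

    consistent = ["Paris", "paris", "Paris"]    # one meaning  -> entropy 0.0
    scattered = ["Paris", "Lyon", "Marseille"]  # three meanings -> high entropy
    print(meaning_entropy(consistent, naive_same_meaning))
    print(meaning_entropy(scattered, naive_same_meaning))
```

In this toy setup, consistent samples collapse into a single cluster and yield zero entropy, while divergent samples spread across clusters and yield high entropy, signalling that the model may be uncertain about the answer's meaning.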
The method was tested on six LLMs, including GPT-4 and LLaMA 2. It proved effective at identifying when a question was likely to be answered incorrectly across a range of datasets and tasks, without requiring task-specific data.
When LLMs hallucinate in critical fields such as healthcare, law, and finance, the consequences can be serious [4]: inaccurate or fabricated information can lead to misdiagnosis, faulty legal advice, or financial losses. The advance in detecting these hallucinations demonstrated by the University of Oxford researchers is therefore an important step toward ensuring the reliability and trustworthiness of AI-generated information in critical applications.