Large language models (LLMs) excel at natural language processing tasks such as understanding and generating text, and they can handle a wide range of them thanks to training on immense amounts of data [4]. Their capabilities include question answering, summarization, and even creative content generation. LLMs have become popular in recent years, especially in the context of generative AI applications.
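As a concrete illustration of those capabilities, the sketch below runs two of them (summarization and open-ended generation) through the Hugging Face `transformers` pipeline API. The model names are illustrative choices; any sufficiently capable model could be substituted.

```python
# A minimal sketch of common LLM tasks via the Hugging Face pipeline API.
# Model names here are illustrative assumptions, not recommendations.
from transformers import pipeline

# Summarization: condense a passage into a shorter one.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
article = (
    "Large language models are trained on vast text corpora and can "
    "answer questions, summarize documents, and draft new content."
)
print(summarizer(article, max_length=30, min_length=10)[0]["summary_text"])

# Open-ended generation: the same model family can produce creative text.
generator = pipeline("text-generation", model="gpt2")
print(generator("Once upon a time,", max_new_tokens=20)[0]["generated_text"])
```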
LLMs often produce incorrect answers because of their probabilistic nature, limitations in their training data, and sensitivity to prompting. Rather than retrieving verified facts, they sample each token from a probability distribution (illustrated in the sketch below), and their training data may contain false or biased information [4]. Additionally, the way a question is phrased can influence the response.
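The following toy example shows why this sampling step makes output non-deterministic: the model assigns scores (logits) to candidate next tokens, converts them to a distribution with a temperature-scaled softmax, and draws one. The tokens and logit values here are made up for demonstration.

```python
# Toy demonstration of temperature sampling from next-token logits.
# The logits are hypothetical; a real model would compute them.
import numpy as np

rng = np.random.default_rng(0)

tokens = ["Paris", "London", "Berlin", "Rome"]
logits = np.array([3.2, 1.1, 0.9, 0.4])  # made-up model scores

def sample_next_token(logits, temperature=1.0):
    """Softmax with temperature, then draw one token index."""
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # subtract max for stability
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Repeated sampling on the same "prompt" can yield different answers,
# and higher temperatures make less likely (possibly wrong) tokens
# more probable.
for temperature in (0.2, 1.0, 2.0):
    picks = [tokens[sample_next_token(logits, temperature)] for _ in range(10)]
    print(f"T={temperature}: {picks}")
```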
Researchers improve LLMs' factual accuracy in several ways: fine-tuning on well-encoded facts, selecting fine-tuning data strategically, focusing on well-known facts, applying regularization to counteract attention imbalance, using curriculum learning strategies, and generating synthetic data for efficient knowledge extraction.
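To make the first of these strategies concrete, here is a minimal sketch of supervised fine-tuning on a small set of curated, well-known fact-style QA pairs. The model choice, the toy dataset, and the hyperparameters are all illustrative assumptions, not a recipe from any specific paper.

```python
# Minimal sketch: fine-tune a causal LM on curated fact-style QA pairs.
# Model, data, and hyperparameters are placeholder assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for any causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Curated, well-known facts: the kind of data these strategies favor.
facts = [
    "Q: What is the capital of France? A: Paris.",
    "Q: What is the chemical symbol for gold? A: Au.",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    for text in facts:
        batch = tokenizer(text, return_tensors="pt")
        # For causal LM fine-tuning, labels are the input ids themselves.
        outputs = model(**batch, labels=batch["input_ids"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    print(f"epoch {epoch}: loss={outputs.loss.item():.3f}")
```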