AI's growing computational demands raise concerns about cost, environmental impact, and the privatization of AI research. The financial and computational cost of training modern AI models is beyond the reach of many academic institutions, so cutting-edge AI research is increasingly concentrated in well-funded private corporations. The energy these computations consume also adds to AI's growing environmental footprint.
Meta's approach to AI model training is multi-token prediction: instead of predicting only the next word in a sequence, as in traditional training, the model is tasked with forecasting several future words at once. The approach promises better performance and drastically reduced training times, which could make advanced AI more accessible and sustainable.
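To make the difference between the two training setups concrete, here is a minimal sketch in plain Python. It is illustrative only: the function names, the toy sentence, and the idea of one output head per future offset with the per-offset cross-entropy terms summed are assumptions made for this example, not details confirmed by the text above.

```python
import math


def multi_token_targets(tokens, n_future):
    """For each position t, return the next n_future tokens as the target.

    With n_future=1 this reduces to ordinary next-token prediction.
    Positions without n_future tokens after them are skipped.
    """
    targets = []
    for t in range(len(tokens) - n_future):
        targets.append(tokens[t + 1 : t + 1 + n_future])
    return targets


def multi_token_loss(head_probs, target):
    """Sum one cross-entropy term per future offset.

    head_probs[k] is a dict mapping token -> probability, standing in for
    the output of a hypothetical k-th prediction head (an assumption of
    this sketch). target[k] is the true token k+1 steps ahead.
    """
    return sum(-math.log(head_probs[k][tok]) for k, tok in enumerate(target))


tokens = ["the", "cat", "sat", "on", "the", "mat"]

# Traditional setup: one target token per position.
next_token = multi_token_targets(tokens, 1)

# Multi-token setup: two target tokens per position.
two_ahead = multi_token_targets(tokens, 2)

# Toy probabilities for the first position ("the" -> "cat", "sat").
loss = multi_token_loss([{"cat": 0.5}, {"sat": 0.25}], two_ahead[0])
```

In this toy setup each position contributes two supervised predictions instead of one, which is one way to picture how a model might extract more training signal from the same text.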
Multi-token prediction improves AI performance by training the model to predict several future tokens at once rather than only the next token [4]. This improves sample efficiency, helps the model capture long-range dependencies, and encourages richer internal representations, leading to better results on tasks such as code generation and natural language understanding. It also enables faster inference and promotes algorithmic reasoning capabilities.