Scaling laws in machine learning describe the predictable improvement in model performance as the size of the training dataset grows. They help determine the balance between model size and data size, and they have been observed across many model types and datasets. In practice, they are used to understand trade-offs, predict performance at larger scales, and compare learning algorithms.
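For concreteness, one commonly used functional form (an assumption here rather than something stated above, following the power-law fits reported in the empirical scaling-law literature) models test error as a power law in the dataset size N, flattening at an irreducible floor:

E(N) ≈ a · N^(−b) + E_∞, with a, b > 0,

so that, roughly, doubling the data multiplies the reducible part of the error by 2^(−b).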
Scaling laws help balance model size against data size by characterizing how error rates fall as the dataset grows. Because they predict how quickly performance improves with more data and larger models, they support informed decisions about resource allocation. However, they treat all training examples as equally valuable, which is a poor assumption for noisy datasets. To address this, researchers have developed methods that estimate how much each individual data point contributes to the model's performance, enabling more efficient use of data and compute.
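As a concrete, hypothetical illustration of the "predict performance from a scaling law" use case: the sketch below fits a power law to a handful of made-up (dataset size, validation error) measurements and extrapolates to a larger dataset. The numbers and the functional form are assumptions for demonstration, not results from any particular study.

```python
# Illustrative sketch: fit a power law error(n) ~ a * n**(-b) to observed
# (dataset size, validation error) pairs, then extrapolate to more data.
# The measurements and constants below are made up for demonstration.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b):
    return a * n ** (-b)

# Hypothetical measurements: validation error at increasing dataset sizes.
sizes = np.array([1e4, 3e4, 1e5, 3e5, 1e6])
errors = np.array([0.30, 0.24, 0.19, 0.15, 0.12])

(a, b), _ = curve_fit(power_law, sizes, errors, p0=(1.0, 0.1))

# Predict the error we would expect with 10x more data than we have.
print(f"fit: error(n) ≈ {a:.3f} * n^(-{b:.3f})")
print(f"predicted error at n=1e7: {power_law(1e7, a, b):.3f}")
```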
One method built around marginal contributions is Marginal Contribution Feature Importance (MCI), which scores features rather than individual training examples. MCI assigns each feature an importance score based on its contribution to the model's performance, defined as the feature's maximal marginal contribution over subsets of the remaining features, and it satisfies a set of axioms capturing the properties expected of a feature importance function.
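As I read the MCI definition, a feature's score is its maximal marginal contribution over subsets of the remaining features, MCI(f) = max_S [v(S ∪ {f}) − v(S)] for an evaluation function v. The sketch below is a brute-force illustration of that idea on a toy problem; the dataset, model, and cross-validated-accuracy value function are stand-ins of mine, not the authors' implementation.

```python
# Minimal, brute-force sketch of a marginal-contribution score in the
# spirit of MCI: a feature's score is its largest performance gain over
# any subset of the other features.
from itertools import combinations

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))                      # toy feature matrix
y = (X[:, 0] + X[:, 1] * X[:, 2] > 0).astype(int)  # toy labels

def value(feature_subset):
    """Evaluation function v(S): cross-validated accuracy using subset S."""
    if not feature_subset:
        return max(y.mean(), 1 - y.mean())         # majority-class baseline
    model = LogisticRegression(max_iter=1000)
    return cross_val_score(model, X[:, list(feature_subset)], y, cv=3).mean()

def mci_score(feature, all_features):
    """max over subsets S of the other features of v(S ∪ {feature}) - v(S)."""
    others = [f for f in all_features if f != feature]
    best = -np.inf
    for r in range(len(others) + 1):
        for subset in combinations(others, r):
            gain = value(set(subset) | {feature}) - value(set(subset))
            best = max(best, gain)
    return best

features = list(range(X.shape[1]))
for f in features:
    print(f"feature {f}: MCI-style score {mci_score(f, features):.3f}")
```

The exhaustive loop over subsets is only workable for a handful of features; it is meant to make the definition concrete, not to be efficient.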