Apple collects AI training data by using its web crawler, Applebot, to scrape publicly available information from the internet. The company also licenses data from third-party sources, such as Shutterstock and Photobucket, and creates its own training data. Additionally, Apple uses user-generated data, but says it does not use private personal data or user interactions to train its foundation models.
Artists criticize Apple's data transparency because they feel the company has not been forthcoming about the sources of the training data behind Apple Intelligence, its suite of AI features. They argue that this lack of transparency raises concerns about copyright infringement and about whether artists' work is being used ethically to train the underlying models.
In 2023, creatives took legal action against AI companies, filing over a dozen lawsuits that accuse them of using copyrighted works without consent. In 2024, major music labels sued the AI music generators Suno and Udio for copyright infringement. Across these cases, artists, authors, and musicians accuse generative AI companies of profiting from their work without compensation.