Not defending Apple here, but everyone with a vested interest in AI is doing it. Nobody is asking permission or respecting copyright in this race to the bottom.
I know a lot of people are downvoting your comment, but I want you to know they are downvoting the idea that companies treat public content like public property.
You shouldn't be downvoted for pointing that out.
It's a problem with how we categorise content as either private or public without regard to copyright.
It seems copyright is for big companies like Disney, but a YouTube creator isn't afforded the same protection for their creation. They are merely providing "content", not intellectual property.
If you read the article you find this was a dataset from a nonprofit, available to anyone. The nonprofit used captions from a set of YouTube videos.
“Most of the Pile’s datasets are accessible and open for anyone on the internet with enough space and computing power to access them.”
That anyone included a lot of other big names in tech, not just Apple.
Also, I wasn’t aware that Apple had its own AI. I thought they were licensing stuff from others like OpenAI. I guess maybe this is research for some unannounced project?
Most children don't (sick burn against the Grimm Brothers). I mean, fuck Apple and all of these companies, but they're hoovering data from a publicly available resource using totally legal means.
I know I'm snowballing here, but overreacting to this headline could end up supporting those who argue against web crawlers, plane-tracking bots, and the completely legal actions of Aaron Swartz that the Feds tried to use to crucify him.
Once again, fuck Apple, but the real villain in this scenario is either Google for allowing companies to train their AI models on their content, or the content creators who are still using YouTube.
Since I can't fault anyone who is trying to make a living by exploiting Google, I guess I'll just add "fuck Google" to the pile.