tomBitonti
Adventurer
They were compensated for allowing specific use of their content. The question is not whether there was any compensation. The question is whether they were compensated for their content being used to train an AI.

The artists were compensated by Pinterest and the other sites in the form of free hosting. I don't know how you could miss that by now without actively making an effort. Coincidentally, that same effort was on display when you were presented with an example of humans doing the exact thing you faulted an AI for, only to dismiss it as a meme to avoid the logical conflict. As an archive, Common Crawl doesn't do anything that requires compensation, because "The Common Crawl dataset includes copyrighted work and is distributed from the US under fair use claims. Researchers in other countries have made use of techniques such as shuffling sentences or referencing the common crawl dataset to work around copyright law in other legal jurisdictions", but it also follows robots.txt directives.
For someone who doesn't "care to speculate on how the courts will rule", you've shown no hesitancy in declaring theft, stealing, unethical behavior, and so on, while taking stances throughout the thread as if that speculation had already been done and decided in the manner most supportive of your position.
I imagine that Common Crawl doesn't train AIs to create art. Common Crawl's use seems to be very specific, e.g., archiving and indexing, which are non-controversial uses.
TomB