Not as bad as the AI-generated articles showing up in search results. Some websites I get driven to make absolutely no sense, despite a lot of words being written about all kinds of topics.
I'm looking forward to the day when "certified human content" is a thing, and that's all search engines allow you to see.
I can't wait for that. I get the feeling it's gonna get real messy before we figure out solutions to all the problems caused by AI-generated content.
I mean yeah, there's already plenty of human-generated misinformation and shit, but it seems to me (not an expert) like AI is capable of fucking with society on a whole new scale.
The big difference is that high-quality human-generated content is often backed by reputation and a history of quality content, and is frequently reviewed by experts in the field (very common for medical articles).
But AI has none of that. It's 100% quantity over quality, and that's just internet pollution as far as I'm concerned.
We really do have to figure something out, though.
Eventually all content will just be AI generated on the fly. No need to keep dumb content on precious storage that could be used to increase model size.
Yeah, a lot of repair sites come up with pages that have just hundreds of Q&As, but oftentimes they don't make sense or aren't even related to the topic! Once you realize how much time was wasted on these garbage sites, you don't even feel motivated to keep looking for answers.
Or, you know, we go back to the time when the news media had real gatekeepers and not just any random jackass could churn out some bullshit copy and broadcast it to the world, let alone have it get published by their local paper.
It's nice that the Internet has democratized access to a national or even global audience, but let's not pretend for a moment that it hasn't caused a ton of problems in the process such that now many people have no idea of what to believe while others believe whatever they want.
It's still pretty easy to tell the difference. You have to have a pretty low level of media literacy to not be able to easily spot it. Unfortunately we already know that most people don't have a clue when it comes to mass media, and even if they did, we also know that people tend to believe whatever reinforces their priors.
For now, just like it was easy to identify AI art by the fucked up hands for a few months before that was mostly ironed out. AI really doesn't need to get that much "smarter" to start fooling people in their native tongue, it just needs to be able to string the right words together more often. And there are a few billion guinea pigs out there to test on.
I mean, they would have started appearing in there from the first moment that someone created one and hosted it somewhere, no? So it's already been a thing for a couple years now, I believe.
Why would they not? There’s no way for such a system to know it’s AI generated unless there’s some metadata that makes it obvious. And even if it was, who’s to say the user wouldn’t want to see them in the results?
This is a nothing issue. It’s not like this is being generated in response to a search, it’s something that already existed being returned as a result because there is presumably something that links it to the search.
There’s no way for such a system to know it’s AI generated unless there’s some metadata that makes it obvious.
I agree with your comment but just want to point out that AI-generated images actually often do contain metadata, usually describing the model and prompt used.
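For anyone curious what checking for that metadata might look like, here is a rough sketch using only the Python standard library. It walks a PNG's chunks and collects the `tEXt` entries. The keyword names `parameters` and `prompt` are assumptions about what some generation tools write, not a standard, and `make_chunk`, `read_png_text_chunks`, and `looks_ai_labeled` are hypothetical helper names made up for this example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build one PNG chunk: length, type, body, CRC (used for the demo below)."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


def read_png_text_chunks(data: bytes) -> dict:
    """Return keyword -> text for every uncompressed tEXt chunk in a PNG."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk starts with a 4-byte big-endian length and a 4-byte type.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt body is: keyword, a NUL separator, then Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 8 + length + 4  # header + body + CRC
    return found


def looks_ai_labeled(data: bytes, keys=("parameters", "prompt")) -> bool:
    """True if any assumed generator keyword is present (keys are a guess)."""
    return any(key in read_png_text_chunks(data) for key in keys)


# Demo on a synthetic, pixel-less PNG skeleton built in memory.
demo = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", b"\x00" * 13)
    + make_chunk(b"tEXt", b"parameters\x00a cat, Steps: 20")
    + make_chunk(b"IEND", b"")
)
print(read_png_text_chunks(demo))
```

Worth keeping in mind that a check like this only catches images that still carry the tags. Metadata is easy to strip, and re-encoding or screenshotting an image drops it, so a missing tag proves nothing either way.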
That's fine for looking up cat pictures or porn, but many people are searching for information contained in images, and that is a problem. What if you were looking for a graph, a map, a blueprint, etc.? How do you discern the real from the fake? What if you click through and the image seems to come from a legit source that is also generated?
It's time to start talking about "memetic effluent." In the same way corporations polluted our physical world, they're polluting our memetic world. AI spewing garbage data is just the most obvious way, but corporations have been toxifying our memetic space for generations.
This memetic effluent will make sorting through data harder and harder over the years. But the oil and tobacco industries undermined science and democracy for decades with their own memetic effluent in order to protect their business. Advertising is its own effluent that distorts and destroys language. Jerry Rubin said it in 1970,
"How can I tell you 'I love you' after hearing 'cars love Shell?'"
While physical effluent destroys our physical environment, making living in the world harder, memetic effluent destroys meaning and makes thinking about and comprehending the world harder. Both are the garbage side effects of the perpetuation of capitalism.
This example of poisoning the data well is just too obvious to ignore, but there are so many others.
It's interesting, because the idea is basically that knowledge and ideas should be constructive, so as not to pollute the sum of human knowledge.
So that raises the question, what is the constructive conclusion to "memetic effluent"? Without one, is the concept itself an example of such effluent?
I don't think that's the implication here. Following the metaphor, pottery and arrow points have been waste products for a while. Prior to the industrial revolution, and specifically prior to the chemical revolution, industrial waste streams weren't as major a problem (ignoring cholera for a bit). It's been the development of selling chemicals for profit and the extensive use of petroleum that's really caused massive problems threatening humanity as a whole.
The implication then is that people should be responsible for their memes. Corporations are inherently irresponsible because there exist economic incentives to externalize costs, be that environmental or informational. AI garbage as a waste stream would be fine if the data was clearly labeled as such. Unfortunately at least some AI garbage is intended to be deceptive. There exists an economic incentive to produce AI garbage that is hard to distinguish from human output. Since AI garbage can be produced at an industrial scale, there's a massive waste data stream that's able to overload the systems we've built to parse and organize data.
There are probably a lot more implications here, but "what are we doing with our information world" is something worth thinking about before we make it completely unusable.
This feels like the precursor to the information Apocalypse referenced in the comic Transmetropolitan.
AI-generated images often contain model and prompt metadata, so in fact it could potentially tell the difference. Not that that should necessarily mean the image should be excluded.
This isn't really a realistic answer, since the issue is that these images aren't labeled as being AI generated, and constantly mixing generative content into everything we consume risks blurring reality for a lot of people.
Personally, I would prefer to see as little AI content as possible when searching for images unless that's the kind of image I am looking for, and I would like those images to be labeled as such whenever possible.
I wonder what will happen in the future as AIs get trained on AI-generated images that they got from the internet. Would the generated images start to degrade, or have some kind of distinct style pop out?
Yeah, something like that. I imagine it would be something like a JPEG, which degrades as you keep converting it over and over. But I'm not sure what the AI-generated images would look like.
Not really. Check Midjourney v6 generated images. I found many images that look indistinguishable from real images. So I don't see why image generation should get worse. What matters is the dataset and only the dataset. It doesn't matter if the model is trained on AI images, as long as the dataset is good.
Just wanted to point out that the Pinterest examples are conflating two distinct issues: low-quality results polluting our searches (in that they are visibly AI-generated) and images that are not "true" but very convincing.
The first one (search results quality) should theoretically be Google's main job, except that they've never been great at it with images. Better quality results should get closer to the top as the algorithm and some manual editing do their job; crappy images (including bad AI ones) should move towards the bottom.
The latter issue ("reality" of the result) is the one I find more concerning. As AI-generated results get better and harder to tell from reality, how would we know that the search results for anything aren't a convincing spoof just coughed up by an AI? But I'm not sure this is a search-engine or even an Internet-specific issue. The internet is clearly more efficient in spreading information quickly, but any video seen on TV or image quoted in a scientific article has to be viewed much more skeptically now.
There can be 2 major difficulties with tracking to origin.
Time. It can take a good amount of time to find the true origin of something. And you don't have the time to trace back to the true origin of everything you see and hear. So you will tend to choose the "source" you most agree with, introducing bias to your "origin".
And the question of "Is the 'origin' I found the real source?" This is sometimes referred to as Facts by Common Knowledge or the Wikipedia effect. And as AI gets better and better, original source material is going to become harder to access and harder to verify unless you can lay your hands on a real piece of paper that says it's so.
So it appears at this point in time, there is no simple solution like "provenance" and "find the origin".
The Google AI that pre-loads the results query isn't able to distinguish real photos from fake AI-generated photos. So there's no way to filter out all the trash, because we've made generative AI just good enough to snooker search AI.
Why bother looking anything up when you can just make it fresh right now? I figured this might start happening once I heard of people using ChatGPT as a search replacement (tell me lies, pretty pretty lies).