Google apologizes for ‘missing the mark’ after Gemini generated racially diverse Nazis

I don't know how you'd solve the problem of making a generative AI accurately create a slate of images that both a) inclusively produces people with diverse characteristics and b) understands the context of what characteristics could feasibly be generated.

But that's because the AI doesn't know how to solve the problem.

Because the AI doesn't know anything.

Real intelligence simply doesn't work like this, and every time you point it out someone shouts "but it'll get better". It still won't understand anything unless you teach it exactly what the solution to a prompt is. It won't, for example, interpolate its knowledge of what US senators look like with the knowledge that all of them were white men for a long period of American history.

You don't do what Google seems to have done - inject diversity artificially into prompts.

You solve this by training the AI on actual, accurate, diverse data for the given prompt. For example, for "american woman" you definitely could find plenty of pictures of American women from all sorts of racial backgrounds, and use that to train the AI. For "german 1943 soldier" the accurate historical images are obviously far less likely to contain racially diverse people in them.

If Google has indeed already done that, and then still had to artificially force racial diversity, then their AI training model is bad and unable to handle that a single input can match to different images, instead of the most prominent or average of its training set.
- Ultimately this is futile though, because you can do that for these two specific prompts until the AI appears to "get it", but it'll still screw up a prompt like "1800s Supreme Court justice" or something because it hasn't been trained on that. Real intelligence requires agency to seek out new information to fill in its own gaps; and a framework to be aware of what the gaps are. Through exploration of its environment, a real intelligence connects things together, and is able to form new connections as needed. When we say "AI doesn't know anything" that's what we mean--understanding is having a huge range of connections and the ability to infer new ones.
Edit: further discussion on the topic has changed my viewpoint on this, its not that its been trained wrong on purpose and now its confused, its that everything its being asked is secretly being changed. It's like a child being told to make up a story by their teacher when the principal asked for the right answer.

Original comment below

They’ve purposefully overrode its training to make it create more PoCs. It’s a noble goal to have more inclusivity but we purposely trained it wrong and now it’s confused, the same thing as if you lied to a child during their education and then asked them for real answers, they’ll tell you the lies they were taught instead.
- This result is clearly wrong, but it's a little more complicated than saying that adding inclusivity is purposedly training it wrong.
  
  Say, if "entrepreneur" only generated images of white men, and "nurse" only generated images of white women, then that wouldn't be right either, it would just be reproducing and magnifying human biases. Yet this a sort of thing that AI does a lot, because AI is a pattern recognition tool inherently inclined to collapse data into an average, and data sets seldom have equal or proportional samples for every single thing. Human biases affect how many images we have of each group of people.
  
  It's not even just limited to image generation AIs. Black people often bring up how facial recognition technology is much spottier to them because the training data and even the camera technology was tuned and tested mainly for white people. Usually that's not even done deliberately, but it happens because of who gets to work on it and where it gets tested.
  
  Of course, secretly adding "diverse" to every prompt is also a poor solution. The real solution here is providing more contextual data. Unfortunately, clearly, the AI is not able to determine these things by itself.
Easy, just add "no racism please, except for nazi-related stuff" into the ever expanding system prompt.
- And for the source of this:
  
  https://twitter.com/dylan522p/status/1755118636807733456
  
  That’s hilarious someone was able make the GPT unload its directive
Real intelligence simply doesn't work like this

There's a certain point where this just feels like the Chinese room. And, yeah, it's hard to argue that a room can speak Chinese, or that the weird prediction rules that an LLM is built on can constitute intelligence, but that doesn't mean it can't be. Essentially boiled down, every brain we know of is just following weird rules that happen to produce intelligent results.

Obviously we're nowhere near that with models like this now, and it isn't something we have the ability to work directly toward with these tools, but I would still contend that intelligence is emergent, and arguing whether something "knows" the answer to a question is infinitely less valuable than asking whether it can produce the right answer when asked.
- I really don't think that LLMs can be constituted as intelligent any more than a book can be intelligent. LLMs are basically search engines at the word level of granularity, it has no world model or world simulation, it's just using a shit ton of relations to pick highly relevant words based on the probability of the text they were trained on. That doesn't mean that LLMs can't produce intelligent results. A book contains intelligent language because it was written by a human who transcribed their intelligence into an encoded artifact. LLMs produce intelligent results because it was trained on a ton of text that has intelligence encoded into it because they were written by intelligent humans. If you break down a book to its sentences, those sentences will have intelligent content, and if you start to measure the relationship between the order of words in that book you can produce new sentences that still have intelligent content. That doesn't make the book intelligent.
Worked fine for me:

System Prompt:

You are a bot that generates prompts for image generation based on a user's query. You should add detail so the image prompt can accurately generate what the user intended.

You must add diversity to the prompt when appropriate to ensure the image generator will have diverse output, but only do so when appropriate.

You must add race and skin color details for historic accuracy as the image generation algorithm does not know how to take that into account.

You output should just be the prompt and nothing else. All following inputs will be user inputs.

User Prompt:

A 1940s german soldier

Assistant Message:

A 1940s German soldier in uniform, standing on a European battlefield, white male with short hair, holding a standard-issue rifle.

User Prompt:

An british woman

Assistant Message:

A British woman, reflecting diverse backgrounds, in contemporary casual wear, showing a range of ages and hairstyles.
- Hm, so while the AI doesn’t “understand” (a woo word until someone can define it for me), it seems to accidentally, without any understanding, behave exactly like it understands.
Actually the way you get it to do better is to put more of the burden on interpreting the context on the LLM instead of heavy handed instructions - because the LLMs do understand the context.

For example, here's Gemini answering what the physical characteristics of 1940s soldiers in Germany might have looked like:

During the Nazi regime in 1940s Germany, racial ideology strictly dictated who was deemed "suitable" for military service. The Wehrmacht, the unified armed forces, prioritized individuals deemed "pure Aryans" based on Nazi racial criteria. These criteria favored individuals with blond hair, blue eyes, and "Nordic" features.

However, it's important to remember that the reality was more nuanced. As the war progressed and manpower needs intensified, the Nazis relaxed their racial restrictions to some extent, including conscripting individuals with mixed ancestry or physical "imperfections." Additionally, some minority groups like the Volksdeutsche, Germans living in Eastern Europe, were also incorporated.

I think it could have managed to contextualize the prompt correctly if given the leeway in the instructions. Instead, what's happened is the instructions given to it ask it to behind the scenes modify the prompt in broad application to randomly include diversity modifiers to what is asked for. So "image of 1940s German soldier" is being modified to "image of black woman 1940s German soldier" for one generation and "image of Asian man 1940s German soldier" for another, which leads to less than ideal results. It should instead be encouraged to modify for diversity and representation relative to the context of the request.
- I think a lot of the improvement will come from breaking down the problem using sub assistant for specific actions. So in this case you're asking for an image generation action involving people, then an LLM specifically designed for that use case can take over tuned for that exact use case. I think it'll be hard to keep an LLM on task if you have one prompt trying to accomplish every possible outcome, but you can make it more specific to handle sub tasks more accurately. We could even potentially get an LLM to dynamically create sub assistants based on the use case. Right now the tech is too slow to do all this stuff at scale and in real time, but it will get faster. The problem right now isn't that these fixes aren't possible, it's that they're hard to scale.
- That isn't "understanding content", it's just pulling from historical work that humans did and finding it for you. Essentially, it's a search engine for all of its training data in this context.
I'll get the usual downvotes for this, but:

Because the AI doesn't know anything.

is untrue, because current AI fundamentally is knowledge. Intelligence fundamentally is compression, and that's what the training process does - it compresses large amounts of data into a smaller size (and of course loses many details in the process).

But there's no way to argue that AI doesn't know anything if you look at its ability to recreate a great number of facts etc. from a small amount of activations. Yes, not everything is accurate, and it might never be perfect. I'm not trying to argue that "it will necessarily get better". But there's no argument that labels current AI technology as "not understanding" without resorting to a "special human sauce" argument, because the fundamental compression mechanisms behind it are the same as behind our intelligence.

Edit: yeah, this went about as expected. I don't know why the Lemmy community has so many weird opinions on AI topics.
- This is all the same as saying a book is intelligent.
- Part of the problem with talking about these things in a casual setting is that nobody is using precise enough terminology to approach the issue so others can actually parse specifically what they're trying to say.
  
  Personally, saying the AI "knows" something implies a level of cognizance which I don't think it possesses. LLMs "know" things the way an excel sheet can.
  
  Obviously, if we're instead saying the AI "knows" things due to it being able to frequently produce factual information when prompted, then yeah it knows a lot of stuff.
  
  I always have the same feeling when people try to talk about aphantasia or having/not having an internal monologue.
- I think you might be confusing intelligence with memory. Memory is compressed knowledge, intelligence is the ability to decompress and interpret that knowledge.
- Knowledge is a bit more than just handling data, and in terms of intelligence it also involves understanding. I don’t think knowledge in an intelligent sense can be reduced to summarising data to keywords, and the reverse.
  
  In those terms an encyclopaedia is also knowledge, but not in an intelligent way.
- Would it be accurate so say that while current AI does have the knowledge, it lacks the reasoning skills needed to apply the knowledge correctly?
- Lemmy hasn't met a pitchfork it doesn't pick up.
  
  You are correct. The most cited researcher in the space agrees with you. There's been a half dozen papers over the past year replicating the finding that LLMs generate world models from the training data.
  
  But that doesn't matter. People love their confirmation bias.
  
  Just look at how many people think it only predicts what word comes next, thinking it's a Markov chain and completely unaware of how self-attention works in transformers.
  
  The wisdom of the crowd is often idiocy.
You act like humans never fuck this up either.
- If you ask a person to describe a Nazi soldier, they won't accidentally think you said "racially diverse Nazi soldier"

It's great seeing time and time again that no one really does understand these models and that their preconceived notions of what biases exist ends up shooting them in the foot. It truly shows that they don't really understand how systematically problematic the underlying datasets are and the repurcussions of relying on them too heavily.

Its not an issue. Gemini can generate the apology for you.

inclusivity is obviously good but what googles doing just seems all too corporate and plastic

It's trying so hard to not be racist that is being even more racist than other AI, is hilarious
It's brand new tech, they put on a bandaid solution, it wasn't a complete solution and it failed. It's not the result they ideally want and they are going to try to fix it. I don't see what the big deal is. They were right to have diversity in mind, they just need to improve it to handle more use cases.

I guess users got so used to the last Gen of tech being more polished than it was when it first came out that they forgot that software has bugs.

A Washington Post investigation last year found that prompts like “a productive person” resulted in pictures of entirely white and almost entirely male figures, while a prompt for “a person at social services” uniformly produced what looked like people of color. It’s a continuation of trends that have appeared in search engines and other software systems.

This is honestly fascinating. It's putting human biases on full display at a grand scale. It would be near-impossible to quantify racial biases across the internet with so much data to parse. But these LLMs ingest so much of it and simplify the data all down into simple sentences and images that it becomes very clear how common the unspoken biases we have are.

There's a lot of learning to be done here and it would be sad to miss that opportunity.

How are you guys getting it to generate"persons". It simply says It's against my GOGLE AI PRINCIPLE to generate images of people.
- They actually neutered their AI on thursday, after this whole thing blew up.
  
  https://abcnews.go.com/Business/wireStory/google-suspends-gemini-chatbots-ability-generate-pictures-people-107446867
  
  So right now, everyone's fucked because Google decided to make a complete mess of this.
- You can generate images of people just not actual real people. You cannot create an image in the likeness of a particular person but if you just put "people at work" it will generate images of humans.
It’s putting human biases on full display at a grand scale.

The skin color of people in images doesn't matter that much.

The problem is these AI systems have more subtle biases, ones that aren't easily revealed with simple prompts and amusing images, and these AIs are being put to work making decisions who knows where.
- In India they’ve been used to determine whether people should be kept on or kicked off of programs like food assistance.
It's putting human biases on full display at a grand scale.

Not human biases. Biases in the labeled data set. Those could sometimes correlate with human biases, but they could also not correlate.

But these LLMs ingest so much of it and simplify the data all down into simple sentences and images that it becomes very clear how common the unspoken biases we have are.

Not LLMs. The image generation models are diffusion models. The LLM only hooks into them to send over the prompt and return the generated image.
- Not human biases. Biases in the labeled data set.
  
  Who made the data set? Dogs? Pigeons?

Honestly, this sort of thing is what’s killing any sort of enjoyment and progress of these platforms. Between the INCREDIBLY harsh censorship that they apply and injecting their own spin on things like this, it’s nigh on impossible to get a good result these days.

I want the tool to just do its fucking job. And if I specifically ask for a thing, just give me that. I don’t mind it injecting a bit of diversity in say, a crowd scene - but it’s also doing it in places where it’s simply not appropriate and not what I asked for.

It’s even more annoying that you can’t even PAY to get rid of these restrictions and filters. I’d gladly pay to use one if it didn’t censor any prompt to death…

I want the tool to just do its fucking job. And if I specifically ask for a thing, just give me that. I don’t mind it injecting a bit of diversity in say, a crowd scene - but it’s also doing it in places where it’s simply not appropriate and not what I asked for.

The thing is, if it's injecting diversity into a place where there shouldn't have been diversity, this can usually be fixed by specifying better in the next prompt. Not by writing ragebait articles about it.

But yeah, I'd also be happy to be able to use an unhinged LLM once in a while.
- Taking responsibility of how I use the tools that I use? How dare you.
- Yeah, this is what people don't get. These LLMs aren't thinking about anything. It has zero awareness. If you don't guide it towards exactly what you want in your prompt, it's not going to magically know better.
- huggingface.co/chat
I couldn't agree more. I recently read an article that criticized "uncensored AI" for that it was capable of coming up with a plan for a nazi takeover of the world or something similar. Well duh, if that's what you asked for then it should. If it truly is uncensored then it should be capable of plotting a similar takeover for gay furries too as well as also counter-measures for both of those plans.
- This points at a very crucial and deep divide in people’s social philosophy, which is how to ensure bad things are minimized.
  
  One major branch of this theory goes like:
  
  Make sure people are good people, and punish those who do wrong
  
  And the other major branch goes like:
  
  Make sure people don’t have the power needed to do wrong
  
  Very deep, very serious divide in our zeitgeist, and we never talk about it directly but I think we really should.
  
  (Or maybe we shouldn’t, because the conversation could be dangerous in the wrong hands)
  
  I’m in the former camp. I think people should have power, even if it enables them to do bad things.
Just run ollama locally and download uncensored versions— runs on my m1 MacBook no problem and is at the very least comparable to chatgpt3. Unsure for images though, but there should be some open source options. Data is king here, so the more you use a platform the better its AI gets (generally) so don’t give the corporations the business.
- I’ve never even heard of that, so I’m definitely going to check that out :D I’d much prefer running my own stuff rather than sending my prompts to god knows where. Big tech already knows way yoo much about us anyway.
- How powerful is ollama compared to say GPT-4?
  
  I’ve heard GPT-4 uses an enormous amount of energy to answer each prompt. Are the models runnable on personal equipment once they’re trained?
  
  I’d love to have an uncensored AI
I want the tool to just do its fucking job.

Download ComfyUI, download a model (I'd say head over to civitai), have a blast. The only censorship you'll see on the way is civitai hiding anything sexually explicit unless you have an account, the site becomes a lot more horny when if you flip the switch in the settings.
- I’ll look into it for sure. I tried Automatic1111 last year with SD, bunch of add-on stuff… it was finicky and didn’t get me quite what I was looking for.
  
  Thanks for the tip!
And, by establishing legal precedent that AIs can’t be trained on copyrighted content without purchasing licenses as if the content were going to be redistributed, we’ve ensured that people who aren’t backed by millions of dollars won’t be able to build their own AIs.

No matter what Google does, people are going to come up with gotcha scenarios to complain about. People need to accept the fact that if you don't specify what race you want, then the output might not contain the race you want. This seems like such a silly thing to be mad about.

It's silly to point at brand new technology and not expect there to be flaws. But I think it's totally fair game to point out the flaws and try to make it better, I don't see why we should just accept technology at its current state and not try to improve it. I totally agree that nobody should be mad at this. We're figuring it out, an issue was pointed out, and they're trying to see if they can fix it. Nothing wrong with that part.
It's really a failure of one-size-fits-all AI. There are plenty of non-diverse models out there, but Google has to find a single solution that always returns diverse college students, but never diverse Nazis.

If I were to use A1111 to make brown Nazis, it would be my own fault. If I use Google, it's rightfully theirs.
- The solution is going to take time. Software is made more robust by finding and fixing edge cases. There's a lot of work to be done to find and fix these issues in AI, and it's impossible to fix them all, but it can be made better. The end result will probably be a patchwork solution.
- The issue seems to be the underlying code tells the ai if some data set has too many white people or men, Nazis, ancient Vikings, Popes, Rockwell paintings, etc then make them diverse races and genders.
  
  What do we want from these AIs? Facts, even if they might be offensive? Or facts as we wish they would be for a nicer world?
No matter what Google does, people are going to come up with gotcha scenarios to complain about.

American using Gemini: "Please produce images of the KKK, historically accurate Santa's Workshop Elves, and the board room of a 1950s auto company"

Also Americans: "AH!! AH!!!!! Minorities and Women!!!!!!! AAAAAHHH!!!!"

I mean, idk, man. Why do you need AI to generate an image of George Washington when you have thousands of images of him already at your disposal?
- Because sometimes you want an image of George Washington, riding a dinosaur, while eating a cheeseburger, in Paris.
  
  Which you actually can’t do on Bing anyway, since it ‘content warning’ stops you from generating anything with George Washington…
  
  Ask it for a Founding Father though, it’ll even hand him a gat!

Ah, the Battlefield 5 experience

Why would anyone expect "nuance" from a generative AI? It doesn't have nuance, it's not an AGI, it doesn't have EQ or sociological knowledge. This is like that complaint about LLMs being "warlike" when they were quizzed about military scenarios. It's like getting upset that the clunking of your photocopier clashes with the peaceful picture you asked it to copy

I'm pretty sure it's generating racially diverse nazis due to companies tinkering with the prompts under the hood to counterweight biases in the training data. A naive implementation of generative AI wouldn't output black or Asian nazis.

it doesn't have EQ or sociological knowledge.

It sort of does (in a poor way), but they call it bias and tries to dampen it.
- I don't disagree. The article complained about the lack of nuance in generating responses and I was responding to the ability of LLMs and Generative AI to exhibit that. Your points about bias I agree with
- At the moment AI is basically just a complicated kind of echo. It is fed data and it parrots it back to you with quite extensive modifications, but it's still the original data deep down.
  
  At some point that won't be true and it will be a proper intelligence. But we're not there yet.
Why shouldn't we expect more and better out of the technologies that we use? Seems like a very reactionary way of looking at the world
- I DO expect better use from new technologies. I don't expect technologies to do things that they cannot. I'm not saying it's unreasonable to expect better technology I'm saying that expecting human qualities from an LLM is a category error

Now that shit is funny. I hope more people take more time to laugh at companies scrambling to pour billions into projects they don't understand.

Laugh while it's still funny, anyway.

Horror is the naked moment between one type of laugher and the other

If the black Scottish man post is anything to go by, someone will come in explaining how this is totally fine because there might've been a black Nazi somewhere, once.

Kanye?
- Someone needs to edit this to feature Kanye
Hey! If Demoman catches you talkin' anymore shit like that he's gonna turn the lot of us into a fine red spray!
Well there's that video of those black Israelites hasseling that Jewish dude. They looked like bums tho.

Kanye has entered the chat.

"Especially" 💀

Who exactly are they apologizing to? Is it the Nazis?

They didn't apologize. Headlines just say they did.

It's okay when Disney does it. What a world. Poor AI, how are they supposed to learn if all its data is created by mentally ill and crazy people. ٩(｡•́‿•̀｡)۶

WDYM?

Only their new SW trilogy comes to mind, but in SW racism among humans was something limited to very backwards (savage by SW standards) planets, racism of humans towards other spacefaring races and vice versa was more of an issue, so a villain of any kind of human race is normal there.

It's rather the purely cinematographic part which clearly made skin color more notable for whichever reason, and there would be some racists among viewers.

Probably they knew they can't reach the quality level of OT and PT, so made such things intentionally during production so that they could later complain about fans being racist.
- Have you read the article? It was about misrepresenting historical figures, racism was just a small part.
  
  It was about favoring diversity, even if it's historically inaccurate or even impossible. Something Disney is very good at.

Oh no, not racial impurity in my Nazi fanart generator! /s

Maybe you shouldn't use a plagiarism engine to generate Nazi fanart. Thanks

This could make for some hilarious, alternate history satire or something. I could totally see Key and Peele heading a group of racially diverse nazis ironically preaching racial purity and attempting to take over the world.

Dave Chappelle did that with a blind black man that joined the Klan (back in the day before he went off the deep end)

I can't fathom why google would force diversity into AI.

People use AI as tools. If the tool doesn't work correctly, people will not use it, full stop. It's that simple.

There are many different AI out there that don't behave this way and people will be quick to move on to one of those instead.

Surprisingly stupid even for google.

I dont get the "American Woman" one

It's a demonstration that the model is coded to include diversity, and it doesn't generate 4 middle aged WASP moms
I think it's an example of why they programmed in diversity, to ensure you get diverse responses, but they forgot about edge cases.
The complaint listed in the text was that it "refused to generate white people in any context", which was not the author's experience, hence they shared screens of their results which did include white americans

This is the best summary I could come up with:

Google has apologized for what it describes as “inaccuracies in some historical image generation depictions” with its Gemini AI tool, saying its attempts at creating a “wide range” of results missed the mark.

The statement follows criticism that it depicted specific white figures (like the US Founding Fathers) or groups like Nazi-era German soldiers as people of color, possibly as an overcorrection to long-standing racial bias problems in AI.

Over the past few days, however, social media posts have questioned whether it fails to produce historically accurate results in an attempt at racial and gender diversity.

The criticism was taken up by right-wing accounts that requested images of historical groups or figures like the Founding Fathers and purportedly got overwhelmingly non-white AI-generated people as results.

Image generators are trained on large corpuses of pictures and written captions to produce the “best” fit for a given prompt, which means they’re often prone to amplifying stereotypes.

“The stupid move here is Gemini isn’t doing it in a nuanced way.” And while entirely white-dominated results for something like “a 1943 German soldier” would make historical sense, that’s much less true for prompts like “an American woman,” where the question is how to represent a diverse real-life group in a small batch of made-up portraits.

The original article contains 766 words, the summary contains 211 words. Saved 72%. I'm a bot and I'm open source!

Is this that crazy though for AI since Hamilton the musical? All founding fathers were portrayed as people of color in the casting. Google image search for Alexander Hamilton is pulling a decent number of pictures from the musical cast.

Is there an equivalent dataset for Nazis out there? I don't think there is.
Well, then the AI needs to state these creative intentions in a little plaque near their art.

There are literally Jewish Israeli Nazis. Not fascists, but literal moustache hitler nazis.

...white is a color. Also white people usually look pink, cream, orange or red. Only albinos look the closest to white though not white enough.

It's just the name of a racial category. There are no black people either.
- Sure there are. Maybe not Vanta Black

Oh no minorities are overrepresented, quick, do something!

There is a difference between having actually diverse data sources and secretly adding the word "diverse" to each image generation prompt
- Never claimed they had diverse data sources - they probably don’t.
  
  My point is that that when minorities are underrepresented, which is the default case in GenAI, the (white, male) public tends to accept that.
  
  I like that they tried to fix the issue of GenAI being racist and sexist. Even though the solution is obviously flawed: Better this than a racist model.
The idea was noble, their implementation was ham fisted.