[Paper] Creativity Has Left the Chat: The Price of Debiasing Language Models
Abstract: Large Language Models (LLMs) have revolutionized natural language processing but can exhibit biases and may generate toxic content. While alignment techniques like Reinforcement Learning from Human Feedback (RLHF) reduce these issues, their impact on creativity, defined as syntactic and semantic diversity, remains unexplored. We investigate the unintended consequences of RLHF on the creativity of LLMs through three experiments focusing on the Llama-2 series. Our findings reveal that aligned models exhibit lower entropy in token predictions, form distinct clusters in the embedding space, and gravitate towards "attractor states", indicating limited output diversity. These findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation. The trade-off between consistency and creativity in aligned models should be carefully considered when selecting the appropriate model for a given application. We also discuss the importance of prompt engineering in harnessing the creative potential of base models.
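To make the entropy result concrete, here is a minimal sketch, not the paper's exact pipeline, of how one might compare mean next-token entropy between a base and an RLHF-aligned Llama-2 checkpoint using Hugging Face transformers. The prompt and the helper function are illustrative assumptions (the gated Llama-2 weights also require access approval on Hugging Face); lower mean entropy for the aligned model would be consistent with the reduced diversity reported above.

```python
# Illustrative sketch (assumed setup, not the paper's code): compare the
# average entropy of next-token distributions for a base vs. an
# RLHF-aligned Llama-2 model on the same prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def mean_next_token_entropy(model_name: str, prompt: str) -> float:
    """Average entropy (in nats) of the model's next-token distributions."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

    # Entropy of the predictive distribution at each position.
    log_probs = torch.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)
    return entropy.mean().item()

# Hypothetical comparison; checkpoint names are the public HF identifiers.
prompt = "Write a tagline for a new coffee shop:"
for name in ("meta-llama/Llama-2-7b-hf", "meta-llama/Llama-2-7b-chat-hf"):
    print(f"{name}: {mean_next_token_entropy(name, prompt):.3f} nats")
```

A lower value for the chat model on probes like this would mirror the paper's entropy finding; the embedding-cluster and attractor-state experiments could be probed in a similar spirit by embedding many sampled completions and measuring how tightly they cluster.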
Lay summary (by Llama 3 70B with a few edits): AI chatbots that can understand and generate human-like language have become remarkably capable, but sometimes they can be biased or even mean. To fix this, researchers developed a technique called Reinforcement Learning from Human Feedback (RLHF). RLHF is like a training program that teaches the chatbot what's right and wrong by giving it feedback on its responses. For example, if the chatbot says something biased or offensive, the feedback tells it that's not okay and encourages it to come up with a better response. This training helps the chatbot learn what kinds of responses are appropriate and respectful. However, our research showed that RLHF has an unintended consequence: it makes the chatbot less creative.
When we looked at a chatbot trained with RLHF, we found that it repeated itself more often and came up with fewer new ideas. This is because the training encourages the chatbot to stick to what it knows is safe and acceptable rather than taking risks and trying new things. As a result, its responses become less diverse and less creative. This is a problem because companies use these chatbots to generate ideas for ads and marketing campaigns, and a chatbot that isn't creative won't produce good ones. We also found that the chatbot's responses started to cluster together in certain patterns, as if it were stuck in a rut. This is not what we want from a creative AI, so we need to be careful when choosing which chatbot to use for a job and how to phrase our questions to get the most creative answers. We also need to find ways to balance the need for respectful, appropriate responses with the need for creativity and diversity.