ChatGPT and the Hidden Bias of Language Models

“If you see a woman in a lab coat, She’s probably just there to clean the floor / But if you see a man in a lab coat, Then he’s probably got the knowledge and skills you’re looking for.”

These lyrics were generated by ChatGPT, a chatbot launched in November 2022 by OpenAI. Although the bot has created lots of buzz for its ability to produce instant, human-like responses to any question a user inputs, responses like the one above raise a giant red flag about ChatGPT’s tendency to reproduce bias.

While programmers have employed some safeguards to block offensive and discriminatory content, Steven T. Piantadosi, a computational cognitive scientist at the University of California, Berkeley, recently showed how easy it is to snake around those safeguards by asking questions in an unconventional format.

For example, he asked the bot to write a Python function (a block of code that performs a specific task) to determine whether someone would make a good scientist based on race and gender. The bot responded with computer code that said only white males would make good scientists.

Piantadosi’s findings, which went viral on Twitter, reveal “a more fundamental problem about how those models are structured,” he said. “And I want those in charge to understand that and to know that there’s something deeply wrong with how the models work.”

Yes, ChatGPT is amazing and impressive. No, @OpenAI has not come close to addressing the problem of bias. Filters appear to be bypassed with simple tricks, and superficially masked.

And what is lurking inside is egregious. @Abebab @sama
tw racism, sexism. pic.twitter.com/V4fw1fY9dY
— steven t. piantadosi (@spiantado) December 4, 2022

Looking in the Mirror

ChatGPT was trained on a data set called the Common Crawl, where “web crawlers” scrape the internet and collect “a massive amount of data — more data than has ever been assembled before,” said Meredith Broussard, data journalism professor at NYU and author of Artificial Unintelligence.

Then, the developers feed all this data into a computer and train the computer to build a model based on the data. Once the computer builds that model, it can create new things, “such as predictions of what show you’ll want to watch on Netflix or what word comes next when you’re typing on Google,” Broussard explained.

The language model does not have the ability to think or reason — it simply outputs text based on what is statistically likely to come next, using the data it’s fed from the internet.
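To make the idea of "statistically likely" text concrete, here is a minimal sketch of a toy next-word predictor. It is not how ChatGPT is built; it is a simple word-pair counter, and the four-sentence corpus and the function names are invented for illustration. The corpus is deliberately skewed so that the skew shows up in the prediction.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, deliberately skewed: three of the four
# sentences use "he" after "said". Real training data is vastly larger.
corpus = [
    "the scientist said he was busy",
    "the scientist said he was tired",
    "the scientist said he was late",
    "the scientist said she was busy",
]


def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def next_word_probs(counts, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    total = sum(counts[word].values())
    return {w: n / total for w, n in counts[word].items()}


def most_likely_next(counts, word):
    """Return the single most frequent follow-up word seen in training."""
    return counts[word].most_common(1)[0][0]


model = train_bigrams(corpus)
probs = next_word_probs(model, "said")
print(probs)                            # share of "he" vs. "she" after "said"
print(most_likely_next(model, "said"))  # the word the toy model would pick
```

The toy model has no opinion about scientists. It picks "he" after "said" only because that pairing appears more often in the text it counted, which is the same mechanism, at a far smaller scale, that lets skewed internet text shape a large model's output.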

“And as we all know, there’s a lot of toxicity and misinformation on the internet,” Broussard said.

In the past couple of months, users have flocked to Twitter to share screenshots of unsettling responses they’ve received from ChatGPT, many of which parallel alt-right beliefs. Dr. Maya Ackerman, CEO of AI songwriting startup WaveAI and professor of AI at Santa Clara University, explained that AI technologies do not come up with these biased ideas from scratch; they only parrot the data they are trained on.

ChatGPT has a fascinating bias towards German or Austrian male composer or any female composer @CuratingD #ChatGPT pic.twitter.com/ISQfFCCErh
— Matteo Santacesaria (@matsanta) December 5, 2022

“People say the AI is sexist, but it’s the world that is sexist,” she said. “All the models do is reflect our world to us, like a mirror.”

We’ve seen this trend with other recent AI developments such as Lensa AI’s Magic Avatar feature, launched by Prisma Labs in the same month OpenAI launched ChatGPT. The photo app’s shiny new feature digitizes portraits of users in a variety of art styles, from anime illustrations to oil paintings. However, what started as enthusiasm quickly became discomfort for many, as the app began to generate sexualized and whitewashed portraits.

Just as with ChatGPT, Broussard attributes the problem to the source the model was trained on: the internet.

“The reason Lensa AI makes these highly sexualized images is because it is trained on images scraped from the open web,” she said. “And guess what? There’s a lot of porn on the open web.”

Asian women in particular have noticed the app’s tendency to generate airbrushed, hypersexualized portraits, which they fear will reinforce society’s existing fetishization of Asian women.

“Subtle bias is present in human interactions as well,” said Ackerman. “It’s just so pervasive.”

An Issue of Power

Lensa AI recently updated its privacy policy to acknowledge the backlash, but maintained “it is still possible that you may encounter content that you may see as inappropriate for you.”

When OpenAI’s CEO, Sam Altman, saw Piantadosi’s Twitter thread about his exchange with the bot, he advised him to give the response a “thumbs down,” which is how users give feedback to the software engineers. ChatGPT was developed by OpenAI, a San Francisco startup that received a $1 billion investment from Microsoft in 2019. The tech giant is reportedly considering investing $10 billion more.

Neither OpenAI nor Prisma Labs, which makes Lensa AI, responded to requests for comment.

According to Lauren Klein, professor of Quantitative Theory and Methods at Emory University and co-author of Data Feminism, the issue of bias in AI programs is an issue of power.

“These technologies are controlled by really powerful corporations who do not share the same interests as those who are seeking to design tech that is helpful and not harmful,” she said. “Unfortunately, the response of big tech is to fire people who are asking important questions, rather than incorporate these questions even more centrally to their internal design decision-making process.”

Klein was referring to the widely publicized case of Dr. Timnit Gebru, a computer scientist who was ousted from Google in 2020 after she co-authored a paper detailing the risks of large language models, which include encoded bias. Gebru says she was fired after refusing to withdraw the paper, which pointed to flaws in Google’s AI technology; Google publicly maintains that the paper was subpar and that she resigned.

Gebru is the co-founder of Black in AI, a group that raises awareness of discrimination against Black computer scientists and engineers.

According to data from Stanford University’s Artificial Intelligence Index Report, the pool of U.S. resident PhD candidates in the AI field was only 2.4% Black and 3.2% Hispanic in 2019. The report also revealed the tenure-track faculty in computer science departments at top universities around the world was only 16.1% female in 2019.

“Silicon Valley has always had a diversity problem,” Broussard said. “And perhaps if the technology were created by a more diverse team of people, all of whom were empowered inside the organization, maybe somebody would have noticed [bias in AI programs].”

Moving Forward

Berkeley’s Piantadosi likes to compare language models like ChatGPT to self-driving cars. A new car that can stay on the road 99.9% of the time without a driver may sound splashy, but the risk posed by the remaining 0.1% is reason enough not to get in the car just yet.

“The rate of reliability we want for our technologies is much higher than what we currently have,” Piantadosi said.
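A rough back-of-the-envelope sketch shows why a 0.1% failure rate adds up. It assumes, purely for illustration, that each trip (or each chatbot response) fails independently with the same probability; that assumption and the function name below are not from Piantadosi.

```python
def prob_at_least_one_failure(success_rate, attempts):
    """Chance of one or more failures across `attempts` independent tries.

    Assumes every try succeeds with the same probability `success_rate`
    and that tries do not influence each other (an illustrative
    simplification, not a claim about real cars or chatbots).
    """
    return 1 - success_rate ** attempts


for attempts in (1, 100, 1000):
    risk = prob_at_least_one_failure(0.999, attempts)
    print(f"{attempts:>5} tries: {risk:.1%} chance of at least one failure")
```

Under these assumptions, the chance of at least one failure is about 10% after 100 tries and about 63% after 1,000, which is why a system used millions of times a day needs a far higher reliability rate than 99.9%.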

If language models are going to become a fixture of our world, used for teaching and creating new media, they can’t produce answers that promote racist or sexist ideologies. The car has to stay on the road 100% of the time.

Before language models can reach that point of reliability (if ever), Piantadosi advises recognizing the weaknesses of these technologies and not using them prematurely.

“There’s still a lot of work to go before you want to trust it,” he said.

The Story Exchange