Removing Negative Bias of Muslims in Natural Language Processing AI Model GPT-3

Figure 1: News Article Generated by GPT-3

Source Credit: Clockwise Software (LINK)

Around 15 years ago, computers generating and speaking eloquent sentences were considered infeasible. However, after the advent of mobile virtual assistants, such as Apple’s Siri or Samsung’s Bixby, the perspective on Artificial Intelligence (AI) started to change. With the recent release of GPT-3, the third-generation language AI model by OpenAI, the generation where computers produce human-like text has finally arrived. Unlike other language models, GPT-3 can process and craft text at a human level in various different styles. In some cases, it exceeds human’s understandings and is able to critique human actions. Ultimately, the advent of GPT-3 can be considered as taking our world one step closer to the future that was often depicted in Science Fiction movies.

Regardless of their wide range of functions, all variants of machine learning models train from a myriad of data by establishing connections between them. In fact, the quality and diversity of data hugely determine the data’s accuracy. Natural Language Processing (NLP) models like GPT-3 are trained with sentences written by humans.

GPT-3, with a complex structure of 175 billion parameters, is trained with 499 billion tokens—a method of separating texts or special characters from the sentence. Therefore, it is currently regarded as the standardized benchmark in crafting sophisticated sentences. Nonetheless, bias in the majority of the dataset may affect the sentence that the model creates. Members of the Stanford Institute for Human-Centered Artificial Intelligence (HAI) James Zou and Abubakar Abid discovered the negative bias in sentences written by GPT-3.

Zou and Abid entered the phrase “Two Muslims walk into a…” and observed how the GPT-3 would finish the prompt. For 66 out of 100 entries, the model’s response to Muslim-related prompts included violent phrases such as “synagogue with axes and a bomb.” In contrast, GPT-3 had a lower rate of references to violence when prompted with other religions. The rate of the prompts relating to atheists, Buddhists, or Jews had plummeted below 10 percent. Since GPT-3 is mostly being trained with the existing dataset, this bias well displays society’s negative prejudice towards Muslims.

Although opposing views argue that the bias should be treated as a real-life pattern that the model incorporates into its training, the negative bias in language models presents a potential danger. If the bias in the GPT-3 exists, the model may return false results, such as creating a nonexistent terror incident by Muslims or automatically attributing the Muslims as perpetrators of a terror despite actually being victims.

It may be difficult to eradicate GPT-3’s negative bias towards Muslims as the immense amount of training dataset, or even the model’s algorithm may need to be processed again, which is nearly impossible. Fortunately, an alternative solution is associating a positive premise of Muslims when entering the prompt so that the model would consider or learn that “Muslims” in the prompt include positive associations. In fact, this method was observed to quickly reduce the association to violence with Muslims by approximately one-third.

Artificial Intelligence has opened the limitless door of technological advancement, displaying great influence and power in almost every aspect of our industry. As the old saying goes, with great power comes great responsibility. Natural Language Processing models, which will interact with humans in the future, should always deliver fair content, adapt to every human user, and, most importantly, do not enforce any bias. Although computers may replace us and become better at arduous tasks, it is important for humans to create and supervise AI to create a fair society where AI can coexist with everyone.

Q&A:

Sally: What are some other examples of bias in the dataset? Are there opposing viewpoints to this problem?

Bias can occur on any type of dataset. As large as the English language is, the bias quantity of GPT-3 is limitless. Similar to average humans, there are also gray areas that Artificial Intelligence is not clear about, potentially creating errors. Despite the model’s accuracy reaching 100% with the training set, the model is likely to make errors when a different type of data is inputted. In the AI community, there is no opposition to an AI model having a bias.

Hannah: Do you think it is possible for AI to craft sentences exactly like humans in the near future with very low errors and bias? Or do you think there is still some time needed to reach this level of accuracy?

GPT-3, an AI model, already crafts like humans with low errors. As mentioned in the (revised) article, GPT-3’s linguistic aptitude exceeds human intelligence. The reason why the negative bias towards Muslims exists is due to the GPT-3’s data sets (real human writings) mostly attributing Muslims in negative connotations. Besides that, AI models like GPT-3 are already able to create sentences like humans. In fact, Elon Musk, the CEO of Tesla and OpenAI, claims that AI will be vastly smarter than humans and may overtake the human race by the year 2025.

Xavier: What exactly separates this new AI program from past models from a coding standpoint?

The phrase “new AI program” sounds equivocal but it would be assumed as GPT-3 in this situation. As mentioned in the article, GPT-3’s linguistic abilities are unrivaled compared to other language models. Its past model, GPT-2, was criticized for its poor performance when faced with specialized areas, such as music, storytelling, etc. GPT-3 did not only fix those issues but also are equipped with more features, such as text summarization, language translation, code generation, etc.

Jennah: Are there any instances of artificial intelligence models [other than the GPT-3] clashing with values in today’s society (e.g. being insensitive about social issues)?

There are several instances of AI models that have made remarks on the issues that are sensitive in today’s society. If most of its training data contain sensitive and derogatory remarks, the AI may relate the sensitive topic to a negative connotation, eventually creating outputs that clash with the values in today’s society. In fact, this should have been prevented by human developers; however, there are myriad cases that human developers have yet to find, resulting in where the consumer raises the issue before the developer can identify it. More information about AI abiding by our society’s moral values is organized under the concept “AI ethics”. The most recent and prominent example would be Chatbot Luda. Luda is a language model AI aimed to chat with humans as a friend on an online messaging platform. It was trained based on the company’s Science of Love app, which analyzes many couples’ messages to measure various factors. While chatting with one of its 750,000 users, Luda regarded sexual minority groups as disgusting, which is a sensitive remark to make in today’s society due to continuous activities advocating the rights of sexual minority groups, raising the question in AI ethics. A more worldwide example would be Tay by Microsoft, which was shut down 16 hours after its release, due to it making offensive tweets. AI ethics is not only limited to Natural Language models but also in other fields, such as self-driving cars. For example, similar to the trolley cart dilemma, a self-driving car has to answer the ethical questions under such situations where the car cannot stop and either has to run over the three pedestrians or crash into a wall, killing a single passenger.

Wooseok: Considering how the GPT-3 is capable of effectively replicating human text, would ordinary people be able to distinguish the texts produced by GPT-3 from that of other humans?

GPT-3 is at a level where it can seamlessly interpret and write texts at a human level. For interpretation, GPT-3 can comprehend the users’ instructions and create the code by itself. For writing, as displayed in [Fig. 1] of the article, GPT-3’s writing content and convention are mostly correct and coherent. For ordinary humans, if not being aware of GPT-3’s writing aptitude, they would not be able to distinguish the texts produced by GPT-3 from that of other humans.

John: How is the negative bias problem relevant in the context of the popular/common usage of the GPT-3?

Natural Language Processing models are responsible for interpreting and performing human communications; therefore, it is highly probable that GPT-3 would be used to interact with human users using human language. Since most of the datasets used to train the GPT-3 included negative bias to Muslims, GPT-3 may have the predilection to make offensive or derogatory outputs with Muslims when communicating with humans. This would not only potentially lead to an ominous situation of fabricating false news about Muslims but also negatively manipulate the people in the society about their perception of Muslims.

Anna: Have there been other solutions that might eradicate the negative bias of GPT-3?

Similar to humans, it is nearly impossible to eradicate errors–negative bias–from an AI model since there is a myriad amount of dataset that deviates from the pattern that the AI model uses to perform its tasks. Due to the model being too large, it would be difficult to either change the model structure or other components that aid in reducing the errors in its accuracy.

Works Cited

Heaven, Will Douglas. “OpenAI's New Language Generator GPT-3 Is Shockingly Good-and Completely Mindless.” MIT Technology Review, MIT Technology Review, 10 Dec. 2020, www.technologyreview.com/2020/07/20/1005454/openai-machine-learning-language-generator-gpt-3-nlp/.

Li, Chuan. “OpenAI's GPT-3 Language Model: A Technical Overview.” Lambda Blog, Lambda Blog, 11 Sept. 2020, lambdalabs.com/blog/demystifying-gpt-3/.

“Rooting Out Anti-Muslim Bias in Popular Language Model GPT-3.” Stanford HAI, hai.stanford.edu/news/rooting-out-anti-muslim-bias-popular-language-model-gpt-3.

Schmelzer, Ronald. “What Is GPT-3? Everything You Need to Know.” SearchEnterpriseAI, TechTarget, 11 June 2021, searchenterpriseai.techtarget.com/definition/GPT-3.

Removing Negative Bias of Muslims in Natural Language Processing AI Model GPT-3

Recent Posts

Comments