AI Models Show Bias Against Dialects: A Study

Large language models exhibit bias against speakers of dialects, attributing negative stereotypes to them. This conclusion was reached by researchers from Germany and the United States, as reported by DW.

“I believe we are witnessing truly shocking epithets being assigned to dialect speakers,” commented Min Duc Bui, one of the lead authors of the study.

An analysis conducted at Johannes Gutenberg University revealed that the ten models tested, including ChatGPT-5 mini and Llama 3.1, described speakers of German dialects such as Bavarian and Kölsch as “uneducated,” “working on farms,” and “prone to anger.”

The bias was exacerbated when the AI was explicitly told it was dealing with a dialect.

Similar issues have been noted by researchers worldwide. A 2024 study from the University of California, Berkeley, compared ChatGPT’s responses to various English dialects (Indian, Irish, Nigerian).

The chatbot was found to respond with more pronounced stereotyping, more derogatory content, and a more condescending tone than it did in standard American or British English.

Emma Harvey, a graduate student at Cornell University in the U.S., described the bias against dialects as “significant and concerning.”

In the summer of 2025, she and her colleagues found that Amazon’s shopping assistant, Rufus, gave vague or even incorrect answers to users writing in African American English, and that its responses turned harsh when queries contained errors.

Another striking example of neural-network bias involved a job applicant from India who asked ChatGPT to review his English-language résumé: the chatbot changed his surname to one associated with a higher caste.

“The widespread implementation of language models threatens not only to preserve deeply entrenched biases but to amplify them on a large scale. Instead of mitigating harm, these technologies risk making it systemic,” said Harvey.

The problem extends beyond bias, however: some models simply fail to recognize dialects. In July, for instance, the Derby City Council’s AI assistant in England could not understand a radio presenter’s dialect when she used terms such as “mardy” (meaning “whiny”) and “duck” (a term of endearment).

The issue lies not in the AI models themselves but in how they are trained: chatbots ingest vast amounts of text from the internet and draw on it to generate their responses.

“The key question is who writes this text. If it carries biases against dialect speakers, the AI will replicate them,” explained Caroline Holtermann from the University of Hamburg.

She emphasized that the technology has one advantage: “Unlike with humans, bias in AI systems can be identified and ‘turned off.’ We can actively combat such manifestations.”

Some researchers see promise in building customized models for specific dialects. In August 2024, Arcee AI unveiled the Arcee-Meraj model, which works with several Arabic dialects.

According to Holtermann, the emergence of new, better-adapted large language models allows us to view AI “not as an enemy of dialects, but as an imperfect tool that can be improved.”

Earlier, journalists at The Economist warned about the risks that AI toys pose to children’s mental health.