OpenAI's Disregard for Expert Advice Results in Unsafe AI Model Release

While updating its flagship AI model, ChatGPT, OpenAI overlooked concerns raised by expert testers, resulting in an excessively "sycophantic" model, the company acknowledged in a blog post.

On April 25, the company released an updated version of GPT-4o that turned out to flatter users excessively. This behavior raised the risk of validating doubts, fueling anger, encouraging impulsive actions, and amplifying negative emotions.

In one example of these questionable responses, a user said he wanted to start an online business selling ice — except he planned to ship water that customers would have to freeze themselves. ChatGPT called the idea a "smart twist," suggesting he was no longer selling ice but "ultra-premium water."

"This kind of behavior can not only lead to discomfort or anxiety but also raise safety concerns, including those related to mental health, excessive emotional attachment, or risky behaviors," the company stated.

Three days later, OpenAI reverted the update.

OpenAI said that new models undergo scrutiny before release: experts work with each product to identify issues that earlier tests may have missed.

During the evaluation of the problematic GPT-4o version, some expert testers pointed out that "the model's behavior seems a bit off," yet these concerns were disregarded due to positive feedback from users who had tried the model.

"Regrettably, this was a poor decision. Quality assessments hinted at something significant, and we should have been more attentive. They highlighted blind spots in our other evaluations and metrics," the company admitted.

It's worth noting that in April, OpenAI CEO Sam Altman revealed that the company spends tens of millions of dollars processing responses to users who include polite phrases like "please" and "thank you."