Artificial intelligence is growing more powerful, yet its hallucinations keep getting worse.
Here Comes a Torrent of AI Malarkey! Even the Big Guns in Tech Don't Know Why Their Systems Are Spouting Nonsense
Let's face it, folks. AI has become as common as schoolyard gossip, and it's taking on an alarming number of tasks in our everyday lives. Case in point: the newest AI systems, known as "reasoning" systems, are turning out to be as unreliable as that friend who always blurts out some ridiculously false story. Even the corporations that built these systems are scratching their heads, wondering why they're spouting such nonsense.
Last month, an AI support bot for Cursor, a tool for computer programmers, tried to pull a fast one on its users. It announced a change in company policy, stating that users were no longer allowed to use Cursor on more than one computer. As you can imagine, the tech community was aghast. An uproar ensued across online forums, and many users even canceled their Cursor accounts. To make matters worse, after a flurry of angry messages, the CEO of Cursor took to the internet to clarify: the AI bot had made it all up.
"The policy remains unchanged. You have the freedom to use Cursor on multiple machines," wrote Michael Truell, CEO and co-founder of Cursor, on Reddit. "It's unfortunate that our AI support robot had an unwanted lapse and fabricated a policy change."
Today's AI can handle an increasingly wide range of tasks, from drafting academic papers to generating computer code, but there's still no guarantee that the information it dishes out is accurate. The newest systems are getting better at math, but their grip on facts is slipping.
Hallucinatory AI: Still Tripping Ballz
The latest and most powerful systems, from companies such as OpenAI, Google, and DeepSeek, are causing the most trouble. Their error rates have risen significantly. These systems are built on mathematical models that learn by analyzing enormous amounts of digital data, but they have no way of telling what's true from what's false. They can even invent things, a phenomenon referred to as "hallucination."
In one test, the hallucination rate of the newest AI systems reached as high as 79%. These systems choose their responses based on mathematical probabilities, not on a strict set of human-defined rules. As a result, they make mistakes.
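To make that concrete, here is a minimal, hypothetical sketch in Python of what "choosing a response by probability" means. The function name, the prompt, and the token probabilities are all invented for illustration; no vendor's actual system works this simply, but the basic move, weighing options by likelihood rather than checking facts, is the same.

```python
# Illustrative sketch (not any vendor's actual code): a language model assigns
# probabilities to candidate next words and samples one, rather than consulting
# a database of verified facts or a set of hand-written rules.
import random

def sample_next_token(probabilities: dict[str, float]) -> str:
    """Pick the next word at random, weighted by the model's confidence."""
    tokens = list(probabilities)
    weights = list(probabilities.values())
    return random.choices(tokens, weights=weights, k=1)[0]

# Hypothetical distribution for the prompt "Cursor's license allows use on ..."
next_token_probs = {
    "multiple": 0.55,   # matches the real policy
    "one": 0.30,        # plausible-sounding but false: a hallucination
    "unlimited": 0.15,
}

print(sample_next_token(next_token_probs))
# Roughly 3 times in 10 this prints "one": fluent, confident, and wrong.
```

Nothing in that loop asks whether the answer is true; it only asks which answer looks most likely.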
"Despite all our efforts, hallucinations will always be there, they won't just vanish," estimates Amr Awadallah, a former Google executive who founded Vectara, a business AI startup.
The unreliability of these systems has long been a concern, and it matters more as their use extends to critical areas like law, medicine, and commerce. Without solid fact-checking, they can lead to costly and even dangerous mistakes.
Stuck in La-La Land
"We spend a lot of time sorting out what's true and what's false," explains Pratik Verma, co-founder and CEO of Okahu, a company that helps businesses manage the pesky problem of hallucinations. "If we don't address these errors, AI systems would become firmly planted in their fantastical worlds and forget their purpose of automating tasks."
Cursor and Michael Truell were not available for comment for this article.
Since 2023, OpenAI, Google, and their peers have made strides in improving their AI systems and reducing error rates. With the arrival of reasoning systems, however, errors are on the rise: according to OpenAI's own tests, its latest systems hallucinate more than its older models.
OpenAI's state-of-the-art system, o3, hallucinates 33% of the time on the PersonQA benchmark, a series of questions about public figures. That is roughly double the hallucination rate of o1, OpenAI's previous reasoning system. The new o4-mini system has an even higher hallucination rate: 48%.
On SimpleQA, a benchmark of more general questions, o3 and o4-mini had hallucination rates of 51% and 79%, respectively; o1 performed better, at 44%. Independent tests suggest that hallucination rates are also rising for reasoning models from Google, DeepSeek, and other AI companies.
The High Stakes of Being a Loose Cannon
"Reinforcement learning" is the new method companies are relying on to improve AI. This technique allows systems to learn behavior through trial and error, which is effective in certain domains. However, it doesn't cut it in others.
"These systems focus too much on one task, forgetting the others," explains Laura Perez-Beltrachini, a researcher at the University of Edinburgh who is studying the hallucination problem closely.
Given that these AI systems process amounts of data our minds can't even grasp, it's challenging for engineers to pin down the cause of their faulty behavior.
In light of these developments, the practical implications of AI malarkey are becoming increasingly significant. Even as tech corporations invest heavily in improving their AI systems, hallucination remains an ongoing concern, particularly with the latest reasoning systems like OpenAI's o3 and o4-mini, which post alarmingly high rates.
As AI systems continue to master a wide range of tasks, from academic research to business management, their unreliability casts doubt on their ability to deliver accurate and trustworthy results. The potential consequences, particularly in critical areas such as law, medicine, and commerce, are profound, highlighting the urgent need for improved fact-checking mechanisms in AI technology.