As generative artificial intelligence (AI) adoption accelerates, there are concerns that companies are ignoring the technology’s tendency to make inaccurate statements. While the industry is committed to addressing the problem, AI hallucinations could lead customers to lose trust in chatbots.
Google’s [GOOGL] new generative AI feature has a problem: it keeps churning out inaccurate responses to queries posed by users.
When asked how to stop cheese from sliding off a pizza base, AI Overview reportedly suggested applying glue. Another user was advised to “eat at least one small rock per day” for nutritional purposes.
Hallucination isn’t a problem exclusive to AI Overview; it is something all generative AI large language models (LLMs) do. The problem is so bad that Elon Musk generally avoids using generative AI tools at either SpaceX or Starlink.
“I’ll ask it questions about the Fermi Paradox, about rocket engine design, about electrochemistry. And so far, the AI has been terrible at all those questions,” Musk told the Milken Institute conference last month.
Patronus, a start-up founded by former Meta [META] Reality Labs employees, raised $17m last month to try to eliminate AI hallucinations. However, the issue won’t be disappearing anytime soon.
Google Scales Back Following Mishaps
AI Overview’s embarrassing errors aren’t a new phenomenon at Google. Back in February, the tech giant paused its Gemini AI image generator after the tool was found to be generating images of historical figures as people of colour.
Following backlash over the past couple of weeks, Google announced last Thursday that it’s scaling back AI Overview’s capabilities as well. This includes limiting responses for queries where there isn’t enough data available to make accurate statements.
The company’s head of search, Liz Reid, defended AI Overview in a blog post last week, arguing that its responses “generally don’t ‘hallucinate’ or make things up in the ways that other LLM products might”. Rather, the mishaps were more down to “misinterpreting a nuance of language on the web”.
Even if AI Overview isn’t making things up per se, it is responding to search queries with false information. This and the problems faced by the Gemini image generator raise the question of whether generative AI is moving too fast.
“Google’s not doing itself any favours by racing ahead with a clearly flawed model training strategy,” Rebecca Wettemann, CEO and Principal Analyst at Valoir, told OPTO.
Chatbot Makers Risk Losing Enterprise Customers
The hallucination rate can vary from LLM to LLM. Research by Vectara, a generative AI platform co-founded by a former Google executive, found that OpenAI’s ChatGPT had the lowest hallucination rate at 3%, while Meta’s Llama had a 5.1% rate and the Claude 2 chatbot from Amazon-backed [AMZN] Anthropic had an 8.5% rate.
As LLMs advance, these rates should keep falling. Until then, the average person will need to put up with AI hallucinations. However, if the problem persists, it could lead to serious headaches for enterprise customers of chatbot makers.
“Organisations are looking for trusted partners to help them navigate the complexity of AI,” said Wettemann. If they can’t trust the technology, then it may be easier to walk away from it.
Wettemann added that “we’re going to continue to see AI ‘accidents’ across the market for some time”, but investors may still buy into the generative AI hype regardless. The litmus test for generative AI stocks will be in “what enterprises and end consumers are willing to pay for and use”. Ultimately, the winners will be the companies whose LLMs have been trained most rigorously on the highest-quality, unbiased data, as this will reduce the likelihood of hallucinations occurring.
Google Share Price Continues to Power On
Investors seem unfazed by AI Overview’s hallucination problem. The Google share price hit an all-time high of $178.77 on 20 May and is up 3.5% in the past month through 3 June and up 24% year-to-date.
As well as holding Google shares outright, another way to gain exposure to the stock is through thematic ETFs.
The Invesco AI and Next Gen Software ETF [IGPT] has Google as its biggest holding, with a weighting of 10.6% as of 31 May. The fund is up 28.9% in the past year and up 16.9% year-to-date.
The Roundhill Generative AI and Technology ETF [CHAT] has Google as its third-biggest holding, with a weighting of 5.4% as of 3 June. The fund is up 23.5% in the past year and up 12.9% year-to-date.
Disclaimer
Past performance is not a reliable indicator of future results.
CMC Markets is an execution-only service provider. The material (whether or not it states any opinions) is for general information purposes only, and does not take into account your personal circumstances or objectives. Nothing in this material is (or should be considered to be) financial, investment or other advice on which reliance should be placed. No opinion given in the material constitutes a recommendation by CMC Markets or the author that any particular investment, security, transaction or investment strategy is suitable for any specific person.
The material has not been prepared in accordance with legal requirements designed to promote the independence of investment research. Although we are not specifically prevented from dealing before providing this material, we do not seek to take advantage of the material prior to its dissemination.
CMC Markets does not endorse or offer opinion on the trading strategies used by the author. Their trading strategies do not guarantee any return and CMC Markets shall not be held responsible for any loss that you may incur, either directly or indirectly, arising from any investment based on any information contained herein.
*Tax treatment depends on individual circumstances and can change or may differ in a jurisdiction other than the UK.