Chris Horn: AI is 90% marketing, 10% reality, and its true business impact has yet to be proven

  • 📰 IrishTimes

AI large language models can give an illusion of intelligence but in fact the technology is inherently limited in high-level reasoning

The Grade School Math 8K (GSM8K) suite has become a popular benchmark for AI large language models (LLMs). The suite contains 8,500 problems like the one above, divided into problems used to train an LLM and the test problems it must then solve. GPT-4o, the latest LLM from OpenAI, the maker of ChatGPT, has scored 92.5 per cent on the GSM8K suite, while Google’s LLM Gemini 1.5 Pro scored 91.7 per cent.

Even more intriguingly, dropping or adding clauses had a significant impact on the performance of the LLMs. For example, removing the clause specifying a call price reduction after 10 minutes in the test problem above, or adding a new clause giving a 5 per cent discount for calls costing more than $10, frequently changed the accuracy of the results.

The researchers concluded: “Ultimately, our work underscores significant limitations in the ability of LLMs to perform genuine mathematical reasoning. The high variance in LLM performance on different versions of the same question, their substantial drop in performance with a minor increase in difficulty, and their sensitivity to inconsequential information indicate that their reasoning is fragile. It may resemble sophisticated pattern matching more than true logical reasoning.”

 
