Chris Horn: AI is 90% marketing, 10% reality, and its true business impact has yet to be proven

  • 📰 IrishTimes

AI large language models can give an illusion of intelligence but in fact the technology is inherently limited in high-level reasoning

The Grade School Math 8K (GSM8K) suite has become a popular benchmark for various AI large language models (LLMs). The suite contains 8,500 problems like the one above, divided into problems used to train an LLM and the test problems to be solved. GPT-4o, the latest LLM from OpenAI, the maker of ChatGPT, has scored 92.5 per cent on the GSM8K suite, while Google’s LLM Gemini 1.5 Pro scored 91.7 per cent.

Even more intriguingly, dropping or adding clauses had a significant impact on the performance of the LLMs. For example, removing the clause specifying a call price reduction after 10 minutes in the test problem above, or adding a new clause giving a 5 per cent discount for calls costing more than $10, frequently changed the accuracy of the results.

The researchers concluded: “Ultimately, our work underscores significant limitations in the ability of LLMs to perform genuine mathematical reasoning. The high variance in LLM performance on different versions of the same question, their substantial drop in performance with a minor increase in difficulty, and their sensitivity to inconsequential information indicate that their reasoning is fragile. It may resemble sophisticated pattern matching more than true logical reasoning.”

 
