📚 "LLMs Will Always Hallucinate, and We Need to Live With This" by Sourav Banerjee and the team. 🧠 This is a foundational paper - if you're an AI / LLM practitioner or champion (or naysayer), this is worth reading - many of you will know this but the evidence is vital. 🔍 Summary: As Large Language Models become more ubiquitous across domains, it becomes important to examine their inherent limitations critically (hence my work on the AI Trust / Verisimilitude Paradox). 🤖 This work argues that hallucinations in language models are not just occasional errors but an inevitable feature of these systems. 🎭 The researchers demonstrate that hallucinations stem from the fundamental mathematical and logical structure of LLMs. It is, therefore, impossible to eliminate them through architectural improvements, dataset enhancements, or fact-checking mechanisms. 🧮 As I have said before - we are playing a game of error minimization, so we need to understand risk and risk mitigation. 🎯 There is still utility in LLMs, but they need to be handled and managed with care. ⚠️ We can save you time, money and help you safely navigate the ‘Age of AI’ #AI Risk Guy Digital Human Assistants Paul Edginton Ricky Sydney https://lnkd.in/gRPDET6x
Christopher Foster-McBride’s Post
More Relevant Posts
-
Exciting new research challenges our understanding of AI language models! 🧠💡
A new paper titled "LLMs Will Always Hallucinate, and We Need to Live With This" argues that hallucinations in large language models aren't just bugs - they're features deeply rooted in the mathematical foundations of these systems. https://lnkd.in/di5_9DGx
Key takeaways:
- Hallucinations are inevitable in LLMs
- This stems from fundamental computational theory principles
- We can't eliminate them through better data or architectures
But here's the inspiring part: recognizing limitations is the first step to innovation. This research opens doors to:
- New approaches for managing AI outputs
- Enhanced human-AI collaboration models
- Fresh perspectives on machine intelligence
As we push the boundaries of AI, let's embrace these challenges as opportunities for growth and discovery. The future of AI isn't about perfection - it's about understanding and leveraging its unique capabilities alongside human insight.
What are your thoughts on this research? How might it shape the future of AI applications in your field?
#AIResearch #MachineLearning #FutureOfTech #InnovationMindset
-
Hi everyone! Large Language Models (LLMs) will inevitably hallucinate, and this is a reality we must accept.
Hallucinations in LLMs are not just mistakes but an inherent property: they arise from undecidable problems in the training and usage process. The conclusion of this paper (https://lnkd.in/eBHwuMPm) is that the complete eradication of hallucinations is not feasible because of problems inherent in the foundations of LLMs. No amount of adjustment or fact-checking can entirely address this issue; it is a fundamental constraint of the current LLM methodology.
The authors use computational theory and Gödel's incompleteness theorems to explain hallucinations. They argue that the structure of LLMs inherently leads to some inputs causing the model to generate false or nonsensical information.
Gödel's incompleteness theorems:
First theorem: any consistent formal system powerful enough to encode arithmetic contains statements that are true but unprovable within the system.
Second theorem: such a system cannot prove its own consistency.
In my view, LLMs are likely to hallucinate far less than humans. Moreover, it wouldn't be beneficial for LLMs to merely echo their training data; we expect them to reason independently (at some point in the future, at least). I believe that hallucinations may represent the nascent stages of robust self-reasoning, which isn't necessarily negative.
Currently, it's true that LLMs experience hallucinations, but this is merely the present stage of AI development. The forthcoming wave of AI will focus on logical reasoning, equipped with the capability for mechanized thought. It must be sufficiently robust to address AI safety concerns by consistently and reliably processing software and intricate data.
#AI #LLM #Datascience #LLMhallucinations
-
🚨 Hallucinations in Large Language Models (LLMs): A Feature, Not a Bug? 🚨
As AI continues to revolutionize various sectors, it's crucial to understand the limitations of one of its most powerful tools: Large Language Models (LLMs). A new paper, https://lnkd.in/gvBCS_GK, argues that hallucinations - where AI models generate false or nonsensical information - are not just occasional glitches but a fundamental feature of these systems. 🤔
Key Insights:
1 - Hallucinations Are Inevitable: The researchers argue that hallucinations are deeply embedded in the mathematical and computational foundations of LLMs. This means that no amount of data cleaning, architectural tweaks, or even fact-checking can fully eliminate these errors.
2 - Why Is This the Case? 🧠 The authors draw on Gödel's Incompleteness Theorems and computational theory to explain that LLMs face inherently undecidable problems - like the Halting Problem. As a result, there will always be some inputs that cause the model to go off the rails, generating inaccurate or entirely fabricated outputs.
3 - Every Stage Has a Risk of Hallucination (see the sketch below):
-> Training data is always incomplete or outdated.
-> Retrieving the correct information is inherently probabilistic and sometimes faulty.
-> Understanding user intent is an undecidable problem in computational terms.
-> The generation process itself is unpredictable.
4 - What Does This Mean for Us? 🚀 For those of us working with AI, this means understanding and accepting these limitations. LLMs are powerful tools, but they are not oracles. We need to approach them with caution, awareness, and strategies to handle their inherent flaws.
5 - The Path Forward: The conversation should now focus on making these models more robust and developing methods to identify and mitigate hallucinations when they occur. Despite their limitations, LLMs have incredible potential - but only when used responsibly.
Final Thought: Hallucinations may not be "solvable," but they are manageable. As we continue to push the boundaries of AI, let's do so with an understanding of its strengths and its flaws.
💬 What are your thoughts on hallucinations in AI? Do you see them as a fundamental limitation or an opportunity for further innovation?
#AI #MachineLearning #ArtificialIntelligence #DeepLearning #LLMs #TechInnovation #FutureOfAI #ResponsibleAI
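A quick back-of-the-envelope illustration of why the stage-wise risks in point 3 compound (the per-stage error rates below are made-up numbers for illustration, not figures from the paper): if each stage can fail independently with a small probability, the chance that at least one stage fails is one minus the product of the per-stage success probabilities.

```python
# Back-of-the-envelope illustration (made-up per-stage error rates):
# even small, independent per-stage error probabilities compound into a
# non-trivial chance that at least one stage of the pipeline goes wrong.

from math import prod

stage_error = {
    "training data gaps": 0.02,
    "retrieval": 0.03,
    "intent classification": 0.01,
    "generation": 0.05,
}

p_at_least_one_error = 1 - prod(1 - p for p in stage_error.values())
print(f"P(at least one stage errs) = {p_at_least_one_error:.3f}")  # 0.106
```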
-
AI SCEPTICISM POST (again)
-----
As you all know, I value AI as an interesting tool that greatly helps with several types of tasks. But I bristle when I hear claims like "AI will revolutionize X" and, especially, "we will cure X with AI". Undoubtedly it's a great instrument, but still a hundred times less valuable than Excel 😁
My greatest concern is accuracy. Since the main design feature of LLMs is approximation and prediction, herein lies their main vulnerability. I came across quite an interesting article today - https://lnkd.in/evW458U2. It investigates "hallucinations" in LLMs, and even though it might be a bit controversial, it still makes good points.
NB! This doesn't mean that LLMs are bad or anything like that. The idea is that AI, like any tool, has its strengths and its limitations, and it's crucial to know them to find the right applications for it.
-
Over the winter break we released a new paper. The work was led by Federico Castagna, in collaboration with Simon Parsons and Isabel Sassoon, PhD. This paper introduces Critical-Questions-of-Thought (CQoT), a novel method to enhance Large Language Models' (LLMs) reasoning by using critical questions from argumentation theory at test time, thus giving the models "more time to think". The approach prompts the LLM to create a step-by-step reasoning plan, then uses critical questions based on Toulmin's model of argument to check each step. If the answers to these questions are mostly positive, the LLM provides a final answer; otherwise, it iterates. CQoT significantly improves LLM performance on reasoning and math tasks compared to baseline and Chain-of-Thought (CoT) approaches, and it allows open-source models to compete with, and sometimes surpass, proprietary LLMs. #research #llms #explainableAI #GenAI You can read the paper here: https://lnkd.in/e7uxpjCV
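For readers who want a feel for the loop described above, here is a minimal sketch of a CQoT-style test-time procedure. It assumes a generic `generate(prompt)` callable wrapping whatever LLM API you use, and the specific critical questions, pass threshold, and iteration cap are illustrative placeholders rather than the paper's exact protocol.

```python
# Minimal sketch of a CQoT-style test-time loop (illustrative only; the
# critical questions and threshold below are placeholders, not the paper's).

from typing import Callable

CRITICAL_QUESTIONS = [
    "Does this step follow logically from the previous steps and the premises?",
    "Is the evidence or calculation used in this step actually correct?",
    "Are there unstated assumptions that could make this step fail?",
]

def cqot_answer(task: str,
                generate: Callable[[str], str],
                max_iterations: int = 3,
                pass_ratio: float = 0.8) -> str:
    """Ask the model to plan, check each step with critical questions, and
    only answer once enough checks come back positive."""
    feedback = ""
    for _ in range(max_iterations):
        plan = generate(f"Task: {task}\n{feedback}"
                        "Write a numbered step-by-step reasoning plan.")
        steps = [s for s in plan.splitlines() if s.strip()]

        # Pose each critical question about each step; count "yes" verdicts.
        verdicts = []
        for step in steps:
            for question in CRITICAL_QUESTIONS:
                reply = generate(f"Step: {step}\nQuestion: {question}\n"
                                 "Answer strictly 'yes' or 'no'.")
                verdicts.append(reply.strip().lower().startswith("yes"))

        if verdicts and sum(verdicts) / len(verdicts) >= pass_ratio:
            return generate(f"Task: {task}\nValidated plan:\n{plan}\n"
                            "Give the final answer.")
        feedback = "The previous plan failed some critical checks; revise it.\n"

    # Fall back to a best-effort answer if no plan passes within the budget.
    return generate(f"Task: {task}\nGive your best final answer.")
```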
-
ARE LLMS ANY GOOD FOR FORECASTING? arxiv.org/abs/2406.16964
Large language models (LLMs) are being applied to time series tasks, particularly time series forecasting. But are language models actually useful for time series? After a series of ablation studies on three recent and popular LLM-based time series forecasting methods, researchers find that removing the LLM component or replacing it with a basic attention layer does not degrade the forecasting results - in most cases the results even improve. They also find that, despite their significant computational cost, pretrained LLMs do no better than models trained from scratch, do not represent the sequential dependencies in time series, and do not help in few-shot settings. Additionally, the researchers explore time series encoders and show that patching and attention structures perform similarly to state-of-the-art LLM-based forecasters.
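As a rough sketch of the kind of ablation described (my own illustration in PyTorch, not the authors' code): swap the pretrained LLM backbone of a forecaster for a single self-attention layer over patched inputs and compare the results. The layer sizes and patch length here are arbitrary.

```python
# Rough sketch of the ablation idea (my own illustration, not the paper's code):
# use a single self-attention layer over patched series as the "no-LLM" baseline
# and compare its forecasts against the full LLM-backed pipeline.

import torch
import torch.nn as nn

class AttentionOnlyForecaster(nn.Module):
    """Patch the series, run one attention layer, project to the horizon."""
    def __init__(self, patch_len: int = 16, d_model: int = 64, horizon: int = 24):
        super().__init__()
        self.patch_len = patch_len
        self.embed = nn.Linear(patch_len, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Linear(d_model, horizon)

    def forward(self, series: torch.Tensor) -> torch.Tensor:
        # series: (batch, length), with length divisible by patch_len
        patches = series.unfold(1, self.patch_len, self.patch_len)  # (B, n_patches, patch_len)
        tokens = self.embed(patches)
        attended, _ = self.attn(tokens, tokens, tokens)
        return self.head(attended[:, -1])  # forecast from the last patch token

model = AttentionOnlyForecaster()
history = torch.randn(8, 96)   # 8 series, 96 past observations each
forecast = model(history)      # (8, 24) point forecasts
print(forecast.shape)
```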
-
Language Models and Gödel's Incompleteness Theorem.
A folklore explanation of Large Language Model hallucinations is that they happen because of the probabilistic generative mechanism of LLMs, based on generating the next token according to its probability. But this new paper argues that there is a deeper and unavoidable reason: certain types of hallucinations occur due to Gödel's famous incompleteness theorem, which establishes the existence of statements that can be neither proved nor disproved. The main conclusion is conveniently placed right in the title of the paper.
Abstract: As Large Language Models become more ubiquitous across domains, it becomes important to examine their inherent limitations critically. This work argues that hallucinations in language models are not just occasional errors but an inevitable feature of these systems. We demonstrate that hallucinations stem from the fundamental mathematical and logical structure of LLMs. It is, therefore, impossible to eliminate them through architectural improvements, dataset enhancements, or fact-checking mechanisms. Our analysis draws on computational theory and Gödel's First Incompleteness Theorem, which references the undecidability of problems like the Halting, Emptiness, and Acceptance Problems. We demonstrate that every stage of the LLM process - from training data compilation to fact retrieval, intent classification, and text generation - will have a non-zero probability of producing hallucinations. This work introduces the concept of Structural Hallucination as an intrinsic nature of these systems. By establishing the mathematical certainty of hallucinations, we challenge the prevailing notion that they can be fully mitigated.
PS: this post is for information sharing only, and is not an endorsement of this research, due to the incompleteness of my own knowledge and understanding!
#ai #agi #LLM
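The undecidability the abstract leans on is easiest to see through the classic Halting Problem diagonalization argument. The sketch below is a textbook illustration, not code from the paper; the `halts` oracle is hypothetical by construction.

```python
# Classic diagonalization sketch of why the Halting Problem is undecidable
# (a textbook illustration of the kind of undecidability the paper leans on,
# not code from the paper). Suppose a perfect oracle `halts(func, arg)` existed:

def halts(func, arg) -> bool:
    """Hypothetical oracle: returns True iff func(arg) eventually halts."""
    raise NotImplementedError("No total, always-correct version of this can exist.")

def diagonal(func):
    # Do the opposite of whatever the oracle predicts about func run on itself.
    if halts(func, func):
        while True:       # loop forever if the oracle says "halts"
            pass
    return "halted"        # halt if the oracle says "loops forever"

# Feeding `diagonal` to itself yields a contradiction either way:
# if halts(diagonal, diagonal) is True, diagonal(diagonal) loops forever;
# if it is False, diagonal(diagonal) halts. So no such oracle exists, and any
# property of an LLM pipeline that reduces to it cannot be decided either.
```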
-
Hallucinations are part of all models, including LLMs, since they are all based on probabilities. If you take care with the attention mechanisms and keep in mind that everything is probabilities over data, the technology is powerful when used under the right conditions. Thanks for sharing the article Igor Halperin 💡👍🏻 #llms #hallucinations
-
Getting RAG-level results without paying for it?
Retrieval Augmented Generation (RAG) has become a popular technique to improve accuracy and reduce hallucinations in Large Language Models (LLMs). However, RAG comes with a significant computational cost and may not always be necessary, as it can introduce irrelevant information.
A new study conducted a comprehensive analysis of 35 adaptive retrieval methods, including 8 recent approaches and 27 uncertainty estimation techniques, across 6 datasets, using 10 metrics for QA performance, self-knowledge, and efficiency.
The findings were surprising: uncertainty estimation techniques often outperform complex pipelines in terms of efficiency and self-knowledge, while maintaining comparable QA performance. This means that by leveraging the intrinsic knowledge of LLMs and using uncertainty estimation to decide when to retrieve external information, we might be able to achieve results similar to RAG at a fraction of the computational cost.
The study provides valuable insights into the trade-offs between performance, self-knowledge, and efficiency in adaptive retrieval methods. Sometimes, it seems, simple and straightforward is all you need.
↓ Liked this post? Join my newsletter with 50k+ readers that breaks down all you need to know about the latest LLM research: llmwatch.com 💡
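A minimal sketch of the uncertainty-gated retrieval idea (not the study's code): answer from the model's own knowledge first, and only pay for retrieval when the model's token-level uncertainty looks high. The helper names `generate_with_logprobs` and `retrieve`, and the threshold value, are placeholders for your own stack.

```python
# Minimal sketch of uncertainty-gated ("adaptive") retrieval, not the study's
# code. generate_with_logprobs and retrieve are placeholders you would wire
# to your own LLM API and retriever; the threshold is an arbitrary example.

from typing import Callable, List, Tuple

def adaptive_answer(question: str,
                    generate_with_logprobs: Callable[[str], Tuple[str, List[float]]],
                    retrieve: Callable[[str], str],
                    uncertainty_threshold: float = 1.5) -> str:
    # First pass: answer without any retrieved context.
    draft, logprobs = generate_with_logprobs(question)

    # Mean negative log-probability of the generated tokens as a cheap
    # uncertainty estimate (lower = more confident).
    uncertainty = -sum(logprobs) / max(len(logprobs), 1)

    if uncertainty <= uncertainty_threshold:
        return draft  # the model looks confident; skip the retrieval cost

    # Otherwise pay for retrieval and regenerate with grounding context.
    context = retrieve(question)
    grounded, _ = generate_with_logprobs(f"Context:\n{context}\n\nQuestion: {question}")
    return grounded
```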
-
Introducing FACTS Grounding: A New Benchmark for Evaluating LLM Factuality
Large language models (LLMs) are revolutionizing information access but face challenges with factual accuracy. Enter FACTS Grounding, a groundbreaking benchmark designed to evaluate LLMs' ability to provide factually grounded, detailed responses based on input documents - minimizing hallucinations and improving trust.
Key Highlights:
✔️ Benchmark Dataset: 1,719 examples spanning domains like tech, law, medicine, and more. Tasks include summarization, Q&A, and rewriting - all requiring precise grounding in source materials.
✔️ FACTS Leaderboard: Hosted on Kaggle, this tracks LLM progress in grounding accuracy. Initial results are in, with ongoing updates as the field advances. 📈
✔️ Robust Evaluation: Automated scoring with multiple frontier LLM judges (Gemini 1.5 Pro, GPT-4o, Claude 3.5 Sonnet) ensures objectivity and alignment with human evaluations. 🤝
✔️ Open Participation: The public dataset is now available for researchers and developers to benchmark their models. 🌟
FACTS Grounding aims to drive industry-wide progress, ensuring LLMs become more reliable tools for real-world applications.
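To make the judging setup concrete, here is an illustrative sketch of multi-judge grounding evaluation in the spirit of what is described above; the judge prompt and simple vote-averaging are my assumptions, not the exact FACTS Grounding protocol.

```python
# Illustrative sketch of multi-judge grounding evaluation; the prompt wording
# and the vote-averaging below are my assumptions, not the benchmark's exact
# protocol.

from typing import Callable, List

JUDGE_PROMPT = (
    "Source document:\n{document}\n\n"
    "Model response:\n{response}\n\n"
    "Is every factual claim in the response supported by the source document? "
    "Reply with exactly GROUNDED or UNGROUNDED."
)

def grounding_score(document: str,
                    response: str,
                    judges: List[Callable[[str], str]]) -> float:
    """Fraction of judge models that rate the response as grounded."""
    votes = []
    for judge in judges:
        verdict = judge(JUDGE_PROMPT.format(document=document, response=response))
        votes.append(verdict.strip().upper().startswith("GROUNDED"))
    return sum(votes) / len(votes)

# Usage: pass one callable per judge model (thin wrappers around whichever
# LLM APIs you use) and average grounding_score over your evaluation set to
# get a leaderboard-style grounding rate.
```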
-
More from this author
-
Reforming outpatient services in Australia
Christopher Foster-McBride 3y
Backlogs and Waiting Times: How can Australia reboot surgery and increase capacity in a COVID-19 world?
Christopher Foster-McBride 4y
How can Australia’s Emergency Departments meet the needs and care requirements of older patients?
Christopher Foster-McBride 5y
CEO Advantage Podcast - Company Director, Board Chair, Innovator
5mo Carmel Crouch, this is what we were talking about on Saturday. It’s not a case of “bad product” but a case of “no one is perfect”, or in this case, no thing…