🧠 AI Gets Smarter by Knowing When to Shut Up
Can AI models stay silent when unsure? New research explores this question. Plus, the latest updates on AI tools: Mercury redefines text generation, Sesame AI's CSM delivers lifelike speech, CL1 pioneers "living computers," and Tencent's Yuanbao leads in AI chat in China.
Hey there!
Recent research from Johns Hopkins University explores a new way to make AI models more reliable: having them assess their own confidence before responding. This selective answering mechanism lets a model reply only when it is sufficiently certain, significantly reducing errors, especially in high-stakes situations.
This approach complements an emerging AI inference strategy called test-time scaling. Unlike traditional models with fixed computational limits, test-time scaling allows AI to dynamically adjust its computing power based on the complexity of a task.
Advanced AI systems, including OpenAI's o1 series, DeepSeek-R1, the s1 models (with Fei-Fei Li and Percy Liang among the authors), and Meta's Llama 3.2 variants, already use test-time scaling. The growing adoption of these techniques signals their importance in making AI smarter, safer, and more practical for real-world applications.
By combining selective answering with test-time scaling, AI models can reason more deeply while knowing when to remain silent. This is a major step forward in building powerful and trustworthy AI systems.
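The core idea of selective answering can be sketched in a few lines: pair each answer with a confidence score and abstain below a threshold. This is a minimal illustration, not the paper's method; the stand-in model, the example questions, and the 0.8 threshold are all assumptions for demonstration.

```python
# Minimal sketch of selective answering: the model emits a confidence
# score with each answer, and we abstain below a chosen threshold.
# `fake_model` and the 0.8 threshold are illustrative assumptions.

def fake_model(question: str) -> tuple[str, float]:
    """Stand-in for an LLM that returns (answer, confidence)."""
    known = {"capital of France?": ("Paris", 0.97)}
    return known.get(question, ("unsure", 0.30))

def selective_answer(question: str, threshold: float = 0.8) -> str:
    answer, confidence = fake_model(question)
    # Respond only when the model is sufficiently certain;
    # otherwise stay silent rather than risk a wrong answer.
    if confidence >= threshold:
        return answer
    return "I don't know."

print(selective_answer("capital of France?"))    # confident, so it answers
print(selective_answer("capital of Atlantis?"))  # uncertain, so it abstains
```

In practice the threshold trades off coverage against accuracy: raising it makes the model answer less often but err less when it does.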
Now, let's get to the news!
Updates on Generative AI Tools
1. Inception Labs Introduces Mercury
Mercury redefines text generation by adapting diffusion-based techniques, traditionally used for image synthesis, to language modeling. This breakthrough approach delivers unprecedented speed and efficiency, potentially reducing inference costs and enabling real‑time applications that were previously out of reach.
Features:
Diffusion-based text synthesis
Up to 10× faster processing
Hybrid method merging image-generation techniques with text modeling
2. Sesame's Conversational Speech Model (CSM)
CSM elevates text-to-speech technology by generating natural, human-like speech that captures authentic tone, rhythm, and emotion. This advancement enhances interactive applications, making voice outputs more engaging and reliable across diverse real‑time scenarios.
Features:
Natural voice presence with real intonation and pauses
Multimodal processing for real‑time adaptation
Advanced tokenization separating semantic and acoustic features
Context‑aware prosody adjustments
Efficient training with reduced memory overhead
3. Microsoft's Dragon Copilot
Dragon Copilot marks a major leap for healthcare AI by integrating advanced voice dictation and ambient listening. By automating clinical documentation and task management, it promises to alleviate administrative burdens, allowing clinicians to focus more on patient care and improving overall healthcare efficiency.
Features:
Unified voice experience (dictation and ambient listening)
Automated note-taking and task automation
Instant access to trusted medical data
Built-in healthcare-specific security and compliance measures
4. Cortical Labs Launches the CL1: The First "Living Computer"
CL1 pioneers the fusion of biology and silicon by integrating live neuron‑cultivated cells with digital computing. This innovative "living computer" offers adaptive intelligence and energy efficiency that could transform research in drug discovery, personalized medicine, and robotics.
Features:
Hybrid integration of biological neurons with silicon computing
Built‑in life‑support system for neural cell sustainability
Programmable bi‑directional stimulation interface with Python API
Energy‑efficient design powering a 30‑unit rack
Global access via “Wetware‑as‑a‑Service”
5. Tencent Introduces Yuanbao
Yuanbao signals Tencent's strategic move into advanced AI chatbots. Leveraging its trillion‑parameter Hunyuan model, Yuanbao delivers sophisticated document analysis, Q&A, and multimodal content generation that can integrate into Tencent's extensive digital ecosystem, potentially setting new standards in user engagement. Recently, Yuanbao surpassed DeepSeek to become China's most downloaded iPhone app, emerging as the market favorite.
Features:
Powered by Tencent's cutting‑edge Hunyuan model
Multifunctional capabilities for document analysis, summarization, and content generation
Seamless ecosystem integration (e.g., WeChat)
Supports both text and image generation
Other News
Google SpeciesNet: Google DeepMind has launched an AI model called SpeciesNet designed for wildlife identification.
Meta Mind-Reading AI: Meta is developing AI that can decode thoughts into text from brain activity.
Cohere Aya Vision: Cohere's Aya Vision AI model analyzes images in 23 languages with less computing power.
Claude 3.7 Sonnet in ElevenLabs: Anthropic Claude 3.7 Sonnet is now available in ElevenLabs Conversational AI.
Microsoft Copilot UI Revamp: Microsoft has redesigned the user interface for Copilot.
Google AI Mode in Search: Google tests "AI Mode" in Search, allowing complex questions and follow-ups.
OpenAI Simplified AI Agents: OpenAI just introduced a complete platform for building AI agents that perform real-world tasks instead of just chatting.
Curated Gems
We've just published a new set of docs in our Prompt Engineering Guide for 5 chain-of-thought-inspired prompting techniques you've probably missed:
Chain-of-Code: combines code execution and language-based reasoning, merging the strengths of Chain-of-Thought (CoT) and Program of Thoughts (PoT) by using a mix of executable code for precise operations and language-based simulation for ambiguous reasoning.
Chain-of-Density: enhances text summarization by iteratively refining summaries, integrating missing details while maintaining a fixed length through controlled compression and abstraction, ensuring conciseness and informativeness.
Chain-of-Dictionary: enhances multilingual machine translation by incorporating external dictionary entries into the prompt, helping LLMs translate rare or low-frequency words more accurately, especially in low-resource languages.
Chain-of-Draft: optimizes LLM reasoning by generating concise, information-dense outputs, significantly reducing token usage while maintaining or improving accuracy, making it a more efficient alternative to Chain-of-Thought (CoT) prompting.
Chain-of-Knowledge: improves reasoning in LLMs by structuring knowledge representation and verification using evidence triples and explanation hints, reducing hallucinations common in Chain-of-Thought (CoT) prompting.
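To make one of these concrete, here is a rough sketch of how a Chain-of-Dictionary prompt might be assembled: dictionary hints for rare words are prepended to the translation request. The template, function name, and example entries below are hypothetical illustrations, not taken from the guide.

```python
# Illustrative sketch of Chain-of-Dictionary prompting: external
# dictionary entries for rare words are injected into the prompt
# before the translation instruction. All names here are assumptions.

def build_cod_prompt(sentence: str, target_lang: str,
                     dictionary: dict[str, str]) -> str:
    # Format each dictionary entry as a hint line for the model.
    entries = "\n".join(f"- {src} -> {tgt}" for src, tgt in dictionary.items())
    return (
        f"Dictionary hints for rare words:\n{entries}\n\n"
        f"Using the hints above, translate into {target_lang}:\n{sentence}"
    )

prompt = build_cod_prompt(
    "The kudzu overran the fence.",
    "French",
    {"kudzu": "kudzu (plante grimpante)", "overran": "a envahi"},
)
print(prompt)
```

The same pattern generalizes to the other techniques: each one shapes the prompt (or the iteration loop around it) to steer the model's intermediate reasoning.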
Check them out and let us know what you think!
Our AI Security Masterclass Now Features 9 Top Experts
Time is running out. In just 2 days, we kick off our 6-week Masterclass on AI Security, where you'll learn from leading experts in Generative AI, Cybersecurity, and AI Red Teaming.
And there's more: we've added four new live guest speakers, bringing the total to nine AI security specialists who will share their cutting-edge insights and hands-on expertise with you.
Sander Schulhoff: CEO of Learn Prompting, creator of HackAPrompt, and leader of AI security workshops at Microsoft, OpenAI, Deloitte, Dropbox, and Stanford.
Jason Haddix: Former CISO at Ubisoft, Head of Security at Bugcrowd, and a top-ranked bug bounty hacker, with extensive experience in penetration testing and AI security.
Richard Lundeen: Principal Software Engineering Lead at Microsoft’s AI Red Team, developing PyRIT, a foundational AI security framework.
Sandy Dunn: Cybersecurity leader with over 20 years of experience, project lead for the OWASP Top 10 Risks for LLM Applications, and an adjunct professor in cybersecurity.
Joseph Thacker: Principal AI Engineer at AppOmni, top AI security researcher, and winner of Google Bard's LLM bug bounty competition.
Donato Capitella: Offensive security expert and AI researcher with over 300,000 YouTube learners, teaching how to build and break AI systems.
Akshat Parikh: Elite bug bounty hacker, ranked in the top 21 in JP Morgan’s Bug Bounty Hall of Fame, and AI security researcher backed by OpenAI, Microsoft, and DeepMind researchers.
Pliny the Prompter: Well-known AI jailbreaker, specializing in bypassing major AI model defenses.
Johann Rehberger: Former Microsoft Azure Red Team leader, known for pioneering techniques like ASCII Smuggling and AI-powered C2 attacks.
Final spots are available. Sign up today!
Thanks for reading! Check our blog for more content about prompting and practical Generative AI!