Subscribe to continue reading
Become a paid subscriber to get access to the rest of this post and other exclusive content.
Become a paid subscriber to get access to the rest of this post and other exclusive content.
While large language models (LLMs) have mastered the art of processing text and images, they remain largely confined to the digital realm. Moving from generating code to folding laundry requires a fundamental shift in how AI perceives the world. Microsoft is attempting to bridge this gap with Rho-alpha (⍴ɑ), a new robotics foundation model designed to bring adaptivity to physical tasks.
Rho-alpha falls under the category of Vision-Language-Action (VLA) models. These systems ingest visual data and natural language commands to output robot arm actions. However, standard VLAs often struggle with precision tasks where vision is obstructed or insufficient, such as manipulating a slippery object or inserting a plug behind a desk. Rho-alpha addresses this by integrating tactile sensing directly into its decision-making process, a capability Microsoft refers to as “VLA+.”

Lasso Security has discovered significant prompt injection vulnerabilities in BrowseSafe, a new open-source tool from Perplexity designed to protect AI browsers against prompt injection attacks. Despite marketing that promised developers could “immediately harden their systems,” Lasso’s red team achieved a 36% bypass rate using standard encoding techniques. The findings show that relying on a single model for security can create dangerous blind spots, leaving agentic browsers vulnerable to hijacking.
Become a paid subscriber to get access to the rest of this post and other exclusive content.
This article is part of our coverage of the latest in AI research.
Researchers at Ubiquant have proposed a new deep learning architecture that improves the ability of AI models to solve complex reasoning tasks. Their architecture, the Universal Reasoning Model (URM), refines the Universal Transformer (UT) framework used by other research teams to tackle difficult benchmarks such as ARC-AGI and Sudoku.
While recent models like the Hierarchical Reasoning Model (HRM) and Tiny Recursive Model (TRM) have highlighted the potential of recurrent architectures, the Ubiquant team identified key areas where these models could be optimized. Their resulting approach substantially improves reasoning performance compared to these existing small reasoning models, achieving best-in-class results on reasoning benchmarks.
By Purusoth Mahendran
Most engineering teams build systems that work today, but the best teams build systems that survive orders of magnitude growth. The difference becomes apparent when transaction volume shifts from millions to billions, rigid workflows give way to conversational interfaces, and batch processing evolves into real-time intelligence.
The gap between these approaches isn’t about writing better code; it’s about understanding that software architecture must account for operational reality, data quality constraints, and inevitable business evolution. Real scalability depends on architecture, data quality, and organizational design.
Google has just released Gemini 3 Flash, a lightweight, efficient tool optimized for speed and low latency, capable of delivering performance comparable to the larger Gemini 3 Pro at a fraction of the cost. Google brands it as the democratization of frontier intelligence. On the surface, Gemini 3 Flash appears to be a standard upgrade in the race for efficient AI: a smaller, faster model distilled from its larger sibling.
However, a closer look at independent benchmarks and leaked architectural details suggests that Gemini 3 Flash is not simply a small model. We are likely looking at a massive, trillion-parameter architecture behaving like a lightweight agent through extreme sparsity, a design choice that brings unprecedented power but introduces specific tradeoffs in token efficiency and reliability. (Lots of speculation incoming.)

Nvidia has released the Nemotron 3, a family of open source language models designed for reasoning and multi-agent tasks. Available in Nano, Super, and Ultra sizes, the models feature a hybrid mixture-of-experts (MoE) architecture that delivers high throughput and a massive 1-million-token context window.
Unlike typical open-weight releases, Nvidia has open-sourced the entire development stack, including training data, recipes, and reinforcement learning environments. As an affordable and easy-to-use model, Nemotron 3 might redefine the model landscape and provide Nvidia the chance to crown itself as the king of open-source AI.