The key to solving complex reasoning isn't stacking more transformer layers, but refining the "thought process" through efficient recurrent loops.
Most systems break at 100x growth. Real scalability depends on architecture, data quality, and organizational design, not just writing better code.
Google revealed little about its Gemini 3 Flash model, leaving us to speculate about what is going on under the hood.
As the industry shifts from chatbots to multi-agent workflows, Nvidia's Nemotron 3 offers a blueprint for efficient, long-context reasoning.
AI labs are racing to overtake each other on key industry benchmarks. But this intense race has stripped the benchmarks of most of their value.
WALT abstracts away the chaos of dynamic layouts, allowing AI to focus on high-level planning instead of low-level clicks.
The verified solution achieves 54% accuracy on the semi-private test set, outperforming Gemini 3 Deep Think at less than half the cost.
OpenAI’s problem is not that it no longer has the best model, but the widespread perception that it has fallen behind.
Reinforcement learning from verifiable rewards (RLVR) ushered in a new generation of reasoning models. Now, researchers are looking beyond RLVR to create the next breakthrough in AI.
An indirect prompt injection turns the AI agent in Google's Antigravity IDE into an insider threat, bypassing security controls to steal credentials.