LLM training optimization
Training large language models usually requires a cluster of GPUs. FlashOptim changes the math, enabling full-parameter training on fewer accelerators.
Sparse attention
As AI agents take on longer tasks, the LLM KV cache has become a major memory bottleneck. Discover how sparse attention techniques are freeing up GPU memory.
Semantic chaining attack
Semantic Chaining exploits the fragmented safety architecture of multimodal models, bypassing filters by hiding prohibited intent within a sequence of benign edits.
LLM context management
RePo, Sakana AI’s new technique, solves the "needle in a haystack" problem by allowing LLMs to organize their own memory.
Digital supply chain
Stop reacting to compliance violations and start preventing them. See how AI empowers organizations to turn regulatory discipline into an engine for innovation and growth.
Recursive language model
Brute-forcing larger context windows is hitting a mathematical wall. Here is how MIT’s new framework solves "context rot" to process 10 million tokens and beyond.
Robot with tactile sensing
Microsoft’s Rho-Alpha upgrades Vision-Language-Action models with tactile data to bridge the gap between semantic reasoning and low-level motor control.
Prompt injection
Lasso Security compromised Perplexity’s BrowseSafe guardrail model for AI browsers, showing that "out-of-the-box" tools fail to stop prompt injection attacks.
Continual learning
By treating language modeling as a continual learning problem, the TTT-E2E architecture achieves the accuracy of full-attention Transformers on 128k context tasks while matching the speed of linear models.
Token generation vs. embeddings
Meta’s VL-JEPA outperforms massive vision-language models on world modeling tasks by learning to predict "thought vectors" instead of text tokens.