Grok 3
Grok 3 storms the AI scene with strong benchmark results. Here's everything to know about this new large language model (LLM) and large reasoning model (LRM) from xAI.
LLM ensemble
LLM ensembles combine the outputs of multiple models to improve response quality. Mixture-of-agents (MoA), a more advanced technique, takes ensembles to the next level.
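The core ensemble idea can be shown in a toy sketch: query several models and aggregate their answers. The stand-in functions below are hypothetical placeholders for real LLM calls, and majority voting is just one simple aggregation strategy (MoA instead feeds candidate answers to an aggregator model):

```python
from collections import Counter

# Stand-in "models": in practice each would be a call to a different LLM.
def model_a(question):
    return "Paris"

def model_b(question):
    return "Paris"

def model_c(question):
    return "Lyon"

def ensemble_answer(question, models):
    # Simple ensemble: ask every model, then take the majority vote.
    answers = [m(question) for m in models]
    return Counter(answers).most_common(1)[0][0]

print(ensemble_answer("What is the capital of France?", [model_a, model_b, model_c]))
# majority of the three stand-in models answers "Paris"
```

Voting works for short factual answers; for open-ended generation, ensembles typically use a judge or aggregator model instead of exact-match counting.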
DeepSeek R1
There is a lot of hype and confusion around DeepSeek-R1. Here is what you need to know about how this reasoning model works and what makes it special.
OpenAI ChatGPT
OpenAI's o3-mini is faster, cheaper, and smarter than o1. It is also a bid to reclaim dominance amid the rising threat of DeepSeek.
Robot solving Rubik's cube
OpenAI o1 and o3 are very effective at math, coding, and reasoning tasks. But they are not the only models that can reason.
multi-modal language model
GPT-4 Vision is an impressive model that can create new user experiences. Fortunately, there are open-source alternatives. But they come with caveats.
baby llama llm compression
Large language models (LLMs) require huge memory and computational resources. LLM compression techniques make models more compact and able to run on memory-constrained devices.
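One common compression technique is quantization, which stores weights at lower precision. A minimal NumPy sketch of symmetric 8-bit quantization (the array sizes and random weights are illustrative, not from any real model):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)  # stand-in "pretrained" weights

# Symmetric 8-bit quantization: keep int8 values plus a single float scale.
scale = np.abs(w).max() / 127
q = np.round(w / scale).astype(np.int8)       # 4x smaller than float32 storage

# Dequantize at inference time; rounding error is bounded by scale / 2.
w_hat = q.astype(np.float32) * scale
print(float(np.abs(w - w_hat).max()))
```

Real LLM quantization schemes (per-channel scales, group-wise quantization, 4-bit formats) build on this same store-low-precision, rescale-at-inference idea.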
3D gradient descent
Gradient descent is the main technique for training machine learning and deep learning models. Read all about it.
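The technique fits in a few lines: repeatedly step the parameters against the gradient of the loss. A minimal sketch for a 1-D quadratic loss f(w) = (w - 3)^2, with an illustrative learning rate and step count:

```python
def gradient_descent(lr=0.1, steps=100, w0=0.0):
    """Minimize f(w) = (w - 3)^2, whose gradient is f'(w) = 2 * (w - 3)."""
    w = w0
    for _ in range(steps):
        grad = 2 * (w - 3)  # analytic gradient of the loss at w
        w -= lr * grad      # step in the direction that decreases the loss
    return w

print(gradient_descent())  # converges toward the minimum at w = 3
```

In deep learning the gradient is computed by backpropagation over millions of parameters rather than by hand, but the update rule is the same.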
swirls abstract data
Everything to know about LLM fine-tuning: supervised fine-tuning, reinforcement learning from human feedback (RLHF), and parameter-efficient fine-tuning (PEFT).
vector abstract background
Low-rank adaptation (LoRA) is a technique that cuts the cost of fine-tuning large language models (LLMs) to a fraction of what full fine-tuning requires.
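The savings come from freezing the pretrained weight matrix W and learning only a low-rank update B @ A. A minimal NumPy sketch under illustrative dimensions (a real LoRA setup applies this to attention weight matrices and trains A and B with backprop):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4                 # rank r is much smaller than d

W = rng.normal(size=(d_out, d_in))         # frozen pretrained weights
A = rng.normal(size=(r, d_in)) * 0.01      # trainable low-rank factor
B = np.zeros((d_out, r))                   # zero-initialized, so W is unchanged at start

def lora_forward(x):
    # Effective weight is W + B @ A, but only A and B are trained.
    return W @ x + B @ (A @ x)

x = rng.normal(size=(d_in,))
# At initialization the LoRA branch contributes nothing:
assert np.allclose(lora_forward(x), W @ x)

# Full fine-tuning would update d_out * d_in = 4096 parameters;
# LoRA updates only r * (d_in + d_out) = 512.
print(A.size + B.size, "trainable vs", W.size, "frozen")
```

After training, B @ A can be merged into W, so LoRA adds no inference latency.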