№ 01 · The pillar

Drip.
Long reads with labs you can play.

One idea, given the time and the interactive surface area it deserves. Drip pieces are essays you read, not chapters you skim — designed to leave you with a working mental model by the last paragraph.

Featured

Core Concepts

LoRA & QLoRA

Fine-tune massive LLMs on consumer hardware. Learn about Low-Rank Adaptation and 4-bit Quantization.

Tokenization

Before AI can read, it must chop. Learn how text is broken down into the fundamental atoms of meaning.

LLM Sampling

How do LLMs decide what to say next? Explore greedy vs. probabilistic sampling and log probabilities.

Context Engineering

Beyond the prompt: Curate the perfect information to feed your LLM's limited attention span.

Prompt Engineering

Learn how Zero-Shot, Few-Shot, and Chain-of-Thought prompting steer LLM probabilities.

KV Cache (Inference)

Why doesn't ChatGPT re-read your whole chat every time it types a word? Memory optimization explained.

Naive Bayes

Predicting the future by assuming simplicity. Learn how this probabilistic algorithm combines Bayes' Theorem with the assumption that features are independent of one another to classify data.

Random Forest

Strength in numbers. See how an ensemble of diverse decision trees can vote to make robust predictions.

Support Vector Machines

The classic algorithm that finds the widest possible street between two classes of data.

Recommender Systems

From collaborative filtering to matrix factorization: how Netflix knows what you want before you do.

Architectures

Agents & RAG

Multi-MCP Architecture

New Research: 10 connected MCP servers = ~8K tokens of schemas loaded before your prompt starts. Many-small > one-big, and how to lazy-load.

Eval-Driven Development

New Research: Prompts are code. The four-stage CI pipeline (local dev → PR check → deploy gate → production monitor) that replaces saturated benchmarks.

Verifying AI Code

New Research: 66% of developers said the same thing — AI code that is almost right. The verification cascade and the four error classes it catches.

Harness Engineering

New Research: 60% of all LLM errors are rate limits, not model errors. The five harness layers that decide reliability.

Agentic Context Engineering

New Research: Why your agent forgets the rules by turn 15 — and the four operations (Write, Select, Compress, Isolate) production teams converged on.

Agentic ETL

New Research: What 1,200+ production deployments taught us about putting LLM agents into the extract-transform-load loop — and the two-layer sandwich that survives contact with real data.

Build: A Multi-MCP Router

Build Along: 80 lines, no dependencies. Three small MCP servers, the tool-bloat problem they create together, and the router that fixes it. Companion code to Multi-MCP Architecture.

Build: An Eval Harness

Build Along: the smallest honest eval suite — dataset, system under test, scorers, and a CI gate that exits non-zero below threshold. Companion code to Eval-Driven Development.

Build: An Agentic ETL Pipeline

Build Along: 70 lines that show the two-layer sandwich — a fuzzy LLM transform wrapped in deterministic validation on both sides. Companion code to Agentic ETL.

Build: An AI Code Verifier

Build Along: the generate → verify → feed-failure-back loop in 60 lines. The model writes; the verifier decides. Companion code to Verifying AI Code.

HyPA-RAG (Legal AI)

New Research: A hybrid, parameter-adaptive RAG system designed specifically for high-stakes legal applications.

Agentic RAG

When RAG gets smart. Learn how adding an autonomous agent loop enables multi-hop reasoning and self-correction.

Agentic Hybrid RAG

New Research: Combining GraphRAG and VectorRAG with an autonomous router for scientific literature review.

Agentic Design Patterns

Google Cloud Architecture: From simple prompts to complex multi-agent systems.

RAG (Retrieval-Augmented)

Give AI an open-book test. Connect LLMs to external knowledge bases for accurate answers.

Advanced RAG Techniques

Go beyond basic vector search with Reranking, Hybrid Search, and Query Expansion for production-grade accuracy.

The Future of Agentic AI

Research Deep Dive: The case for Small Language Models (SLMs) replacing monolithic LLMs in agentic systems.

Latest Research

In queue

Three Agent Papers, April 2026

New Research: Hyperagents (Meta FAIR), Recursive Language Models (MIT), and GMPO (Microsoft / ICLR). Three architectural moves from a single month.

Preview Drafting

AI Overthinking

New Research: When models think too much, they often talk themselves out of the correct answer.

Latent Reasoning (Coconut)

New Research: What if LLMs didn't have to 'think' in words? Explore reasoning directly in continuous latent space.

Qwen3 (Unified Thinking)

New Research: A single model that can dynamically switch between fast responses and deep reasoning modes.

DeepSeekMath (GRPO)

New Research: How a 7B model approached GPT-4 math performance by ditching the RL 'Critic' model.

Kimi K2 Thinking

New Research: An open-source thinking agent that interleaves reasoning with tool use (300+ steps).

DeepSeek-OCR

New Research: Compressing long documents into highly efficient 2D visual tokens instead of text.

CoT Monitoring

New Research: Can AI models learn to hide their dangerous thoughts from safety monitors?

Transformer Sensitivity

New Research: Why are Transformers so robust? They naturally learn 'low sensitivity' functions.

Coherence (Segmentation)

New Research: An unsupervised method that uses 'sticky' keywords to find topic boundaries.

SFT vs. RL Generalization

New Research: Does Supervised Fine-Tuning just memorize while RL actually learns rules?

Natural Language Autoencoders

New Research: Anthropic's 2026 method for translating Claude's internal activations directly into human-readable English.
