Writing

Two kinds of writing: topics taken apart from first principles, and build logs from real projects. Newest first — filter by what you're after.

Representation of Meaning: Chunking + Embedding in RAG

To retrieve text by *meaning* instead of *keywords*, we cut documents into chunks and turn each chunk into a vector whose position was deliberately trained so that "close in space" means "close in meaning.

13 min

RAG Implementation

One line: Build retrieval as hybrid (dense + lexical) → fuse → rerank → generate, prove every choice on a 50–200 pair labeled set, and treat the latency budget as the constraint that vetoes half the "advanced" tricks.

23 min

RAG Theory

One-line summary: RAG doesn't make a language model know more — it slips the right text into the prompt at query time so the model can answer from a source instead of from memory.

5 min