April 7, 2026 · Engineering · Unbound Force · 7 minute read
Every developer using AI coding tools knows the feeling: you spend an hour building context with an agent — explaining the architecture, the naming conventions, the reasons behind a particular design decision — and then the session ends. The next session starts blank. The agent has no memory of what you discussed, no access to the decisions you made, no awareness of the other repositories in your organization.
Andrej Karpathy described this as the “lobotomy on restart” problem. It is not a model limitation — it is an infrastructure problem. The model is capable of reasoning over context. It lacks the infrastructure to accumulate, organize, and retrieve that context across sessions.
Both Karpathy and the Dewey project address this problem. Both reject the standard RAG approach (chunk documents, embed into a vector database, retrieve by similarity). Both bet on markdown as the native format. But they take fundamentally different paths to get there.
Karpathy’s “LLM Knowledge Base” treats the LLM itself as a research librarian. The system works in three stages (VentureBeat coverage):
1. Ingest. Raw materials — research papers, GitHub repositories, datasets, web articles — are dumped into a raw/ directory. Karpathy uses the Obsidian Web Clipper to convert web content to markdown, including images for vision model reference.
2. Compile. This is the core innovation. Instead of indexing the files, the LLM reads them and writes a structured wiki. It generates summaries, identifies key concepts, authors encyclopedia-style articles, and creates backlinks between related ideas. The LLM is not searching existing text — it is synthesizing new text from existing sources.
3. Lint. The system is not static. Karpathy describes running “health checks” where the LLM scans the wiki for inconsistencies, missing data, or new connections. The wiki heals itself over time as the LLM identifies gaps and fills them.
The result is a curated, human-readable knowledge base where every claim traces back to a specific markdown file. No embeddings, no vector database, no black box. The LLM navigates via summaries and index files — structured text, not mathematical similarity.
At a scale of roughly 100 articles and 400,000 words, this works well. The LLM’s context window is large enough to hold the index and navigate to the relevant article. As Karpathy notes, “the fancy RAG infrastructure often introduces more latency and retrieval noise than it solves” at this scale.
Dewey takes the opposite approach. Instead of having the LLM organize knowledge, Dewey indexes existing repositories and makes them searchable through the Model Context Protocol (MCP).
No LLM involvement in indexing. Dewey reads markdown files, extracts blocks, computes embeddings with a local model (IBM Granite, running via Ollama), and stores everything in a SQLite database. The LLM never touches the indexing process — it consumes the search results.
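The block-extraction step can be sketched in a few lines. This is a simplified illustration, not Dewey's actual extractor: here blocks are approximated as runs of non-empty lines separated by blank lines, while the real implementation may use richer rules (headings, lists, code fences) before embedding each block and storing it in SQLite.

```go
package main

import (
	"fmt"
	"strings"
)

// extractBlocks splits a markdown page into blocks, approximated here as
// chunks separated by blank lines. Each block would then be embedded and
// stored as a row in the SQLite index.
func extractBlocks(page string) []string {
	var blocks []string
	for _, chunk := range strings.Split(page, "\n\n") {
		if trimmed := strings.TrimSpace(chunk); trimmed != "" {
			blocks = append(blocks, trimmed)
		}
	}
	return blocks
}

func main() {
	page := "# Title\n\nFirst paragraph.\n\nSecond paragraph."
	fmt.Println(len(extractBlocks(page))) // heading plus two paragraphs: 3 blocks
}
```

The key property is that this step is purely mechanical: no model call happens until a block's text is handed to the local embedding model.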
Four source types. Dewey indexes content from four kinds of sources, including Go source code, where it uses go/ast to extract function signatures, CLI commands, MCP tool registrations, and package documentation.

Multi-repo by default. When uf init configures Dewey, it scans ../ for sibling repositories and generates a multi-repo source configuration. A single Dewey instance indexes your entire organization's workspace, not just the current project.
MCP protocol. Dewey exposes 40 tools over MCP’s stdio JSON-RPC protocol. Any MCP-compatible client (OpenCode, Claude Code, Cursor, or custom agents) can search across the entire index. Semantic search, structured graph traversal, page retrieval, block-level queries — all available as standard tool calls.
At a scale of 1,000+ pages, 10,000+ blocks, and 5+ repositories, Dewey’s embedding-based retrieval becomes essential. A context window large enough to hold an index of 100 articles is not large enough to hold an index of 10,000 blocks across 5 repos. Embeddings solve the needle-in-a-haystack problem that structured navigation cannot.
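The retrieval core of that embedding approach is cosine similarity between a query vector and every indexed block vector. A minimal sketch, using toy 3-dimensional vectors where real embedding models produce hundreds of dimensions:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns the indices of the k block vectors most similar to the
// query vector: the needle-in-a-haystack step that an index too large
// for any context window still supports.
func topK(query []float64, blocks [][]float64, k int) []int {
	idx := make([]int, len(blocks))
	for i := range idx {
		idx[i] = i
	}
	sort.Slice(idx, func(i, j int) bool {
		return cosine(query, blocks[idx[i]]) > cosine(query, blocks[idx[j]])
	})
	if k > len(idx) {
		k = len(idx)
	}
	return idx[:k]
}

func main() {
	blocks := [][]float64{{1, 0, 0}, {0.9, 0.1, 0}, {0, 1, 0}}
	fmt.Println(topK([]float64{1, 0, 0}, blocks, 2)) // indices of the 2 nearest blocks
}
```

A brute-force scan like this is fine at the 10,000-block scale; the vectors live in SQLite, not a dedicated vector database.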
| Dimension | Karpathy (AI Librarian) | Dewey (Searchable Index) |
|---|---|---|
| Core metaphor | Librarian who reads and writes | Index that catalogs and retrieves |
| Who organizes | The LLM (active synthesis) | The indexer (passive extraction) |
| Search method | Summaries + index navigation | Semantic search + graph traversal |
| Embeddings | None — structured text only | Local embeddings (IBM Granite) |
| Scale target | ~100 articles, ~400K words | 1,000+ pages, 10K+ blocks, 5+ repos |
| Multi-repo | Manual (copy files to raw/) | Automatic (scans sibling repos) |
| Self-healing | Yes — LLM linting passes | No — indexes reflect source reality |
| Protocol | Custom scripts | MCP (Model Context Protocol) |
| Multi-agent | Single-agent workflow | Multi-agent (Replicator coordination) |
| Local-only | Yes (Obsidian + local LLM) | Yes (SQLite + Ollama) |
| Code awareness | No — markdown only | Yes — Go AST parsing (function signatures, CLI commands) |
| LLM cost | High — compilation requires many tokens | Low — indexing is mechanical, no LLM tokens |
These approaches are not competing — they solve different parts of the same problem.
The compiled wiki becomes a Dewey source. If you use Karpathy’s approach to maintain a curated research wiki, that wiki is a folder of markdown files. Dewey indexes markdown files. Point a Dewey disk source at your compiled wiki and every agent in your swarm can search it.
Dewey’s store_learning is a primitive compilation step. When the /unleash pipeline completes a task, its retrospective step stores a narrative learning in Dewey via store_learning. These learnings accumulate over sessions and surface via semantic search in future sessions. This is not full LLM-driven compilation (it is a single learning per session, not a synthesized wiki), but it serves the same purpose: the system remembers what it learned.
The ideal system could use both. Karpathy’s LLM-as-librarian for active knowledge synthesis on curated research. Dewey for cross-repo semantic search on the full organizational knowledge base. The compiled wiki feeds into the searchable index.
Dewey could add autonomous synthesis. A dewey compile command that reads indexed content and generates summary articles — turning passive indexing into active knowledge curation. Karpathy’s linting concept (scan for inconsistencies, fill gaps) could make Dewey’s index self-improving.
Dewey could add contamination separation. Obsidian co-creator Steph Ango suggested keeping personal vaults clean and letting agents work in a “messy vault.” Dewey could separate agent-generated learnings from human-authored documentation, promoting only validated content.
Karpathy’s approach could benefit from MCP standardization. The “hacky collection of scripts” Karpathy describes could be exposed as MCP tools, making the compiled wiki accessible to any MCP-compatible agent — not just the specific LLM session that created it.
Karpathy’s approach could benefit from code awareness. Dewey’s Go AST parsing extracts function signatures, CLI commands, and MCP tool registrations from source code — structured knowledge that markdown compilation alone cannot capture.
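The standard library makes this kind of extraction straightforward. A minimal sketch of pulling top-level function names out of Go source with go/parser and go/ast (a simplified version of what full signature extraction involves):

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// funcNames parses Go source text and returns the names of its
// top-level functions. A real indexer would also record parameter and
// result types, doc comments, and receiver types.
func funcNames(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			names = append(names, fn.Name.Name)
		}
	}
	return names, nil
}

func main() {
	src := `package demo

func Search(query string) []string { return nil }

func Index(path string) error { return nil }
`
	names, err := funcNames(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // [Search Index]
}
```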
Use Karpathy’s approach if: your knowledge base is a curated research collection on the order of 100 articles, you want the LLM to actively synthesize, cross-link, and lint articles, and you work in a single-agent workflow.

Use Dewey if: you need semantic search across multiple repositories at the 1,000+ page scale, you want code-aware indexing that spends no LLM tokens, and your agents speak MCP.

Use both if: you want a curated, self-healing wiki for research and a cross-repo searchable index for everything else, with the compiled wiki feeding into the index.
To try Dewey:
```shell
brew install unbound-force/tap/unbound-force
uf setup
```

uf setup installs Dewey and configures multi-repo sources automatically. See the Quick Start guide for detailed installation, or the knowledge retrieval guide for source configuration and web source templates.
To try Karpathy’s approach, see his X post for the architecture and the VentureBeat article for a detailed breakdown.