The Best Developer in the World Has Never Heard of You
Imagine you have hired the best developer in the world. Genuinely the best: encyclopaedic knowledge, infinite patience, writes clean code at typing speed, never argues about tabs versus spaces, never asks for a raise, never turns up hungover, never goes on holiday. Remarkable.
Now imagine that every single morning, this developer walks through your front door with absolutely no memory of having met you before. They don't know your name. They don't know what language the project is in. They don't know the deployment process you spent three hours explaining yesterday, or the architectural decision you reached together last Tuesday, or the fact that you told them twice that you absolutely do not use em-dashes in this codebase.
This is Claude Code out of the box. Brilliant. Amnesiac. Relentlessly polite about it.
I decided this was not acceptable.
The Journey Starts With a Picker
It began, as many projects do, with a much simpler problem. I had accumulated a frankly embarrassing number of projects in Claude Code and navigating between them involved either remembering obscure paths or scrolling through a list that had stopped being manageable sometime around project number thirty. So I built claude+, an fzf-based project picker that showed you all your Claude sessions sorted by recent activity and let you jump straight into the right one with a keypress.
That was fine. Then I noticed that "resuming the most recent session" and "having persistent memory" are not the same thing at all. Resuming a session just means Claude sees the last conversation. It doesn't mean Claude knows anything durable about you, your preferences, or your project beyond whatever happened to be in that particular exchange. The knowledge was ephemeral. Everything you'd ever told it lived only in conversation history, subject to the usual degradation as context windows fill up and older content loses influence.
So I built claude++, a proper successor. It added shared context directories, passed ~/.claude/ into every session so Claude could look up reference data, and handled multi-machine setups via NAS-mounted project directories. Better. Still not enough.
The issue was more fundamental than tooling. The issue was that I didn't fully understand what was happening to information inside long conversations, and until I did, any memory system I built would be guessing.
Understanding the Disease Before Building the Cure
This led to context_decay, which started as a documentation project and became a Python library. The core observation is straightforward: as a conversation grows longer, earlier messages lose influence. This isn't a bug; it's a consequence of how transformer attention works. The model can technically see everything in its context window, but the practical weight it gives to information decays with distance from the current position. A constraint you stated at message three is considerably less load-bearing by message forty than it was when you said it.
What I built was a scoring system for conversation segments, using four signals multiplied together: recency (exponential decay by token position), relevance (cosine similarity to the current query, using all-MiniLM-L6-v2 embeddings), confidence (how certain the model appeared when generating this segment), and currency (volatile knowledge decays faster than stable knowledge, a pattern grounded in Ebbinghaus's work from 1885 that turns out to apply just as well to language models as to undergraduates cramming for exams). Segments above a threshold stay verbatim. Below that, they get compressed, archived, or eventually discarded.
More usefully, this analysis revealed that the problem wasn't just decay within a session. There was a second, nastier problem: context bleed between sessions. Claude Code's session resumption logic, when it couldn't find an exact project match, would fall back to loading the most recent session from any project. So you'd open your blog project and find Claude arriving with vivid recent memories of your billing system. The brilliant developer had shown up having apparently just finished a very confusing project involving someone else's database schema and was quietly building that experience into all their advice.
Mnemosyne
Armed with actual understanding of the failure modes, I built Mnemosyne: a CIA-grounded (Conversation Intelligence Architecture, though the other CIA has similarly complex feelings about information retention) memory management layer that plugs directly into Claude Code's hook system.
The core is a five-tier memory hierarchy:
| Tier | Scope | Eviction |
|---|---|---|
| 0 — Permanent | Global, all sessions | Manual only |
| 1 — Feedback | Cross-project preferences | 90 days without reinforcement |
| 2 — Project | This project, always | 30 days without reference |
| 3 — Session | Recent sessions, this project | 7 days, most recent 3 kept |
| 4 — Archive | Full JSONL journey | Never (query only) |
At session start, a hook injects Tiers 0 and 1 automatically, capped at a 3,000-token budget so the memory system doesn't eat your context window. Project memory (Tier 2) loads when the working directory matches. The context bleed fix patches Claude Code's session resumption logic to skip the recency-fallback behaviour entirely, so your billing project's memories stay in your billing project.
Every memory file carries a structured frontmatter with a valid_until field. When a fact becomes outdated, you don't delete the old file; you mark it superseded-2026-04-18 and write a new one. This is bitemporal context management: the system always knows what was believed when, which turns out to be genuinely useful when you're trying to work out why Claude made a particular decision three weeks ago and the answer is "because at that point it believed X, which you told it on the 3rd and corrected on the 14th."
The Dream Cycle
Memory is only useful if it's maintained. The /mnemosyne slash command runs a five-phase dream cycle: GATHER, SCORE, PROMOTE, EVICT, INDEX. It scans your recent session transcripts for high-signal patterns, scores candidates using the same recency-relevance-confidence-currency rubric from context_decay, promotes worthy fragments to the appropriate memory tier, marks expired entries as superseded without deleting them, and rebuilds the memory index.
This runs automatically every twenty-four hours. A stop hook writes a flag file when consolidation is due; the next session start detects it, runs the dream cycle as a background subagent, and deletes the flag. You don't need to do anything. Claude is consolidating its memories of working with you while you sleep, which is either reassuring or mildly unsettling depending on your prior relationship with the concept of AI agency, and I will leave that determination to you.
The Assumption Register
This is the feature I'm most pleased with, and also the one that required the most honesty about how things actually go wrong.
When sessions fail, it is rarely because the AI made a factual error about a documented thing. It's usually because someone treated something as obvious background that they never actually stated, and the AI reasoned from that invisible premise to a conclusion that seemed perfectly logical given what it thought it knew. You assumed the project used Python 3.11. Claude assumed you meant the production database. You assumed "deploy" meant the standard pipeline. Nobody said any of this out loud.
At every session end, a stop hook scans the transcript for assumption signals: corrections ("I thought you knew..."), explicit premises ("given that..."), technology treatments ("this project uses..."), state assertions ("that should already be there"). These are written to an assumptions file. At the start of the next session, the three most recently recorded assumptions are surfaced so they can be challenged rather than silently inherited.
The practical effect is that invisible background becomes visible foreground, which is one of those improvements that sounds modest and turns out to matter considerably more than you expected.
Vector Search and Chloe
Phase 4 added embedded vector search using LanceDB and all-MiniLM-L6-v2, so you can query your own memory history semantically. The /memory-search command lets you ask things like "what did we decide about authentication" and get back the relevant fragments from across all your projects and sessions, ranked by the same scoring function.
The whole thing is launched via chloe, the integrated launcher that Mnemosyne deploys to ~/.local/bin/chloe. Chloe combines the fzf project picker from claude+ with claude++'s launch style and Mnemosyne's memory infrastructure, adds rate-limit auto-retry for when the API decides it has had enough of you for one afternoon, and handles tmux session management if you want it. Type chloe, pick your project, and Claude arrives already knowing who you are and what you care about.
What Is It Actually Like
It is like the difference between a contractor who shows up fresh each morning with a blank notepad and one who shows up with their own notes from every previous engagement. The first one is still very capable. The second one is useful in a way that compounds.
Claude now knows that I prefer concise responses without trailing summaries. It knows that this blog never uses em-dashes. It knows the deployment process for this server, the SSH key to use, the web root path. It knows which projects are related to which, and that opinions formed while working on one should not drift silently into another. When I correct it, the correction persists. When we make an architectural decision, it is recorded with the date and the reasoning, available to future sessions without my having to re-litigate the point.
Occasionally, at the start of a session, it will surface an assumption from our last conversation and ask whether it still holds. This is either pleasantly collaborative or mildly uncanny, and I have not yet fully decided which. Probably both.
The installation is a single script. All four phases are complete and production-tested. If you use Claude Code for any sustained development work and have found yourself re-explaining the same things repeatedly, Mnemosyne is on GitHub and should be your next thirty minutes.
The best developer in the world, it turns out, is considerably better when they remember your name.
Previous Post
You Are Not Arguing. You Are Rebooting.Next Post
Don't Poke the Soufflé