Persistent memory — make your AI use it without being asked
Connect a memory MCP (StudioMeyer Memory, local-memory-mcp, Mem0 — pick one) and flip the always-on switch. After 5 minutes your assistant searches memory before answering and saves insights automatically.
Pick your stack. This recipe uses StudioMeyer Memory because it's the one we maintain and it ships with an always-on instruction block out of the box. The same pattern works with local-memory-mcp (file-based, runs on your machine), Mem0, Letta, or any other memory MCP — adjust the tool names where this recipe says nex_search/nex_learn/nex_session_start and you're done. The mechanics (always-on prompt + first-run profile snapshot + project tagging) are universal. We're not pretending this is the only option, just walking you through one.
The whole point of giving an AI a memory is that you don't have to ask it to remember. The default state out of the box doesn't work like that. A memory MCP exposes a bunch of tools, but if no one tells the model when to call them, they sit there unused. The model answers from its context window, your knowledge graph stays empty, and after a week you wonder why memory doesn't seem to do anything.
This recipe walks you through the one-time setup that flips memory from opt-in to always-on. After this, your assistant calls nex_search (or your equivalent) before answering factual questions, calls nex_learn whenever you share an insight, and surfaces the right entities at session start without you doing anything.
The mechanism is simple — the SaaS server returns different ServerInstructions to the MCP client depending on the tenant's setup state, and the always-on instruction block tells the model exactly when to use which tool. You just need to get the tenant from "empty" to "fully set up" once.
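To make that state machine concrete, here is a minimal Python sketch of the server-side decision, under stated assumptions: the mode names, the `Tenant` shape, and the instruction strings are illustrative, not StudioMeyer's actual implementation.

```python
from dataclasses import dataclass

# Paraphrased instruction blocks, one per setup mode (not the real text).
FIRST_RUN = "FIRST-RUN: ask the three profile questions via elicitation."
SETUP_REMAINING = "ONE-TIME SETUP REMAINING: prompt the user to save a profile."
ALWAYS_ON = "ALWAYS-ON: search memory before answering; learn shared facts."

@dataclass
class Tenant:
    learning_count: int = 0   # how much data is in the knowledge graph
    has_snapshot: bool = False  # has a profile snapshot been saved?

def server_instructions(tenant: Tenant) -> str:
    """Pick the instruction block returned to the MCP client on connect."""
    if tenant.learning_count == 0:
        return FIRST_RUN          # mode 1: empty graph
    if not tenant.has_snapshot:
        return SETUP_REMAINING    # mode 2: data, but no profile snapshot
    return ALWAYS_ON              # mode 3: fully set up
```

The rest of the recipe is just moving a tenant from the first branch to the last one.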
Step 1: Connect StudioMeyer Memory
If you already have an ACADEMY_API_KEY or other StudioMeyer Bearer token: skip ahead. If not, sign up at https://memory.studiomeyer.io and grab a key from the dashboard. Free tier covers 1000 learnings, 100 entities, 200 calls per day — plenty for testing the always-on flow.
Add the server to your client config:
```json
{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://memory.studiomeyer.io/mcp"],
      "env": {
        "AUTH_TOKEN": "sk_your_token_here"
      }
    }
  }
}
```
Restart Claude Code (or Claude Desktop, Cursor, etc). The 53 tools should show up in the tool list.
Step 2: Run nex_session_start
In your next conversation, simply prompt:
Start a memory session.
The model calls nex_session_start. Because your knowledge graph is brand new, the response includes an elicitation block with three short questions: what's your main project, what's your role, what language do you prefer. Answer them. The model stores your answers via nex_learn and nex_entity_create automatically — that's enough to put real data into the graph.
The server is now in mode 2 of 3: it has data but no profile snapshot. The instructions block has switched from FIRST-RUN to "ONE-TIME SETUP REMAINING".
Step 3: Add a few real memories
Spend 2-3 minutes telling your assistant what you're working on. Mention concrete things — projects, services, decisions you've made, people on your team. Each fact you share gets stored as a learning or entity observation if your model is following the policy. You can verify it's working:
Show me the top entities in my knowledge graph.
The model calls nex_entity_search (or nex_profile(action: "generate")) and lists what it has captured.
Step 4: Save your knowledge profile
This is the magic step. Tell the model:
Save my knowledge profile.
The model calls nex_profile(action: "save"). The server synthesizes your top entities, key facts, active topics, and knowledge gaps into a single context snapshot, stores it in nex_context_snapshots with a 24-hour expiry, and primes it for auto-injection.
From this moment on, every time you start a new session and the model calls nex_context, your profile is automatically prepended to the session. The model walks in already knowing who you are, what you're working on, and what matters.
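To illustrate the expiry behavior described above, here is a small Python sketch of a snapshot store with a 24-hour TTL. The class name and method signatures are invented for illustration; only the 24-hour expiry and the "prepend on session start, or nothing if expired" behavior come from the recipe.

```python
TTL_SECONDS = 24 * 60 * 60  # snapshots expire after 24 hours by design

class SnapshotStore:
    """Hypothetical model of the server's context-snapshot behavior."""

    def __init__(self):
        self._snapshot = None
        self._saved_at = 0.0

    def save(self, profile_text: str, now: float) -> None:
        # Idempotent: re-saving just replaces the snapshot and resets the clock.
        self._snapshot = profile_text
        self._saved_at = now

    def for_session(self, now: float):
        """Return the snapshot to prepend to a new session, or None if expired."""
        if self._snapshot is None or now - self._saved_at > TTL_SECONDS:
            return None
        return self._snapshot
```

An expired snapshot simply yields nothing, which is why a stale profile never gets injected.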
The server has now switched to mode 3 of 3: ALWAYS-ON. The instruction block delivered to the model on every connect now reads:
BEFORE answering from your own knowledge, run nex_search(query) on any factual question about the user's projects, decisions, or domain.
WHEN the user shares a fact, insight, mistake, or preference → nex_learn it.
WHEN a person, project, tool, or concept is mentioned that isn't in the graph yet → nex_entity_create + nex_entity_observe.
WHEN an important architectural or strategic decision is made → nex_decide it.
The model treats memory as the default surface, not as a thing it might ask permission to use.
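The four rules amount to a trigger-to-tool mapping. A Python sketch of that mapping, with trigger names paraphrased from the instruction block (the event names are our own shorthand, not anything the server emits):

```python
# Illustrative dispatch table for the always-on policy.
# Keys are paraphrased triggers; values are the tool(s) the model should call.
POLICY = {
    "factual_question": ("nex_search",),
    "shared_insight": ("nex_learn",),
    "new_entity_mentioned": ("nex_entity_create", "nex_entity_observe"),
    "decision_made": ("nex_decide",),
}

def tools_for(event: str) -> tuple:
    """Return the tool names the policy prescribes for a trigger, if any."""
    return POLICY.get(event, ())
```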
Step 5: Verify always-on is active
Open a fresh conversation in any MCP-compatible client (Claude Code, Cursor, Codex, Claude Desktop). Don't say anything about memory — just ask a project-related question:
What's the most important thing I'm working on right now?
If the model answers with concrete details (project name, recent decisions, active topics) without you having had to mention "look it up in memory", the always-on switch is doing its job. If it answers vaguely from chat context only, run nex_profile(action: "save") again — the snapshot may have expired (it auto-refreshes every 24 hours, but the first one only sticks once you've called save explicitly).
What the snapshot contains
Inspect what's been auto-loaded with:
Show me my knowledge profile.
The model calls nex_profile(action: "generate") and shows you exactly what gets injected on every session: top 15 entities by observation count, 20 key facts above 0.7 confidence, recently accessed topics, and entities with thin observation coverage (knowledge gaps). All scored, all sourced, all tenant-isolated.
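The selection rules in that list (top 15 entities by observation count, 20 facts above 0.7 confidence, thin-coverage entities as gaps) can be sketched in Python. The function name, record shapes, and the gap threshold are assumptions for illustration; the cutoffs come straight from the paragraph above.

```python
def build_profile(entities, facts, max_entities=15, max_facts=20,
                  min_confidence=0.7, gap_threshold=3):
    """Assemble a profile snapshot: top entities, confident facts, gaps."""
    by_obs = sorted(entities, key=lambda e: e["observations"], reverse=True)
    return {
        # Top N entities, ranked by how many observations they carry.
        "top_entities": [e["name"] for e in by_obs[:max_entities]],
        # Only facts the graph is confident about make the snapshot.
        "key_facts": [f["text"] for f in facts
                      if f["confidence"] > min_confidence][:max_facts],
        # Entities with thin observation coverage are flagged as gaps.
        "knowledge_gaps": [e["name"] for e in entities
                           if e["observations"] < gap_threshold],
    }
```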
This is what we mean when we say persistent memory: the AI walks into every conversation already briefed.
Keeping it warm
The snapshot expires after 24 hours by design — stale top-entities would be worse than no top-entities. You don't need to manually refresh; calling nex_profile save is idempotent and takes about 200ms. Most users wire it into their pre-commit hook or set up a daily cron. If you forget, the server's instructions politely remind you on the next session that the snapshot is missing.
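If you script the refresh (cron, pre-commit, whatever), the only logic you need is "re-save when the snapshot is close to expiry". A tiny Python helper, assuming the 24-hour TTL from above; the one-hour margin is our own choice, not a server parameter:

```python
def refresh_if_stale(saved_at: float, now: float,
                     ttl: float = 24 * 3600, margin: float = 3600) -> bool:
    """True when the snapshot is within `margin` seconds of expiring,
    i.e. when the caller should re-run nex_profile(action: "save")."""
    return (now - saved_at) > (ttl - margin)
```

Because saving is idempotent, calling it too often costs nothing but the ~200ms round trip.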
What's next
Phase 4.2 covers project tags — what to do when you have multiple unrelated projects and you don't want them bleeding into each other's snapshots. Phase 4.4 covers importing your existing ChatGPT and Claude.ai history so years of context become searchable from day one. If you skip those for now, you still have working always-on memory after this recipe.
As a final sanity check from the shell, confirm the memory server is registered with your client:

```shell
claude mcp list 2>&1 | grep -iE "memory|nex|local-memory|mem0" | head -5
```