Memory is how the agent remembers things between turns and between sessions. Each memory is a markdown file with frontmatter; an auto-generated MEMORY.md index gets injected into the system prompt so the agent always has an overview without paying the full token cost of every memory.
The system lives in `a-mini/memory/`:
- `types.py` — `MEMORY_TYPES` constant and format guidance.
- `store.py` — save / load / delete / search and `MEMORY.md` rebuilding.
- `scan.py` — `MemoryHeader`, age and freshness helpers.
- `context.py` — `get_memory_context()` for system-prompt injection, with truncation and AI search.
- `tools.py` — the four memory tools.
Two scopes
Every memory is either user-scoped or project-scoped:
| Scope | Path | Use for |
|---|---|---|
| `user` | `~/.nano_claude/memory/` | Things that follow you across all projects — your role, preferences, coding style. |
| `project` | `./.nano_claude/memory/` (cwd) | Things specific to the current repo — feature decisions, deadlines, domain knowledge. |
Project scope wins when both scopes have a memory with the same name.
Four types
`types.py` defines four memory types. They're labels that tell the model what to do with the content; the agent picks an appropriate type when it saves.
| Type | Use for |
|---|---|
| `user` | Your role, preferences, background. |
| `feedback` | How you want the model to behave. |
| `project` | Ongoing work, decisions, deadlines. |
| `reference` | Pointers to external resources. |
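Given the table above, the constant plausibly has a shape like this sketch; the guidance strings are paraphrased assumptions, not the real `types.py` contents:

```python
# Hypothetical shape of MEMORY_TYPES; real guidance text may differ.
MEMORY_TYPES: dict[str, str] = {
    "user": "Your role, preferences, background.",
    "feedback": "How you want the model to behave.",
    "project": "Ongoing work, decisions, deadlines.",
    "reference": "Pointers to external resources.",
}
```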
File format
A single memory file:
```markdown
---
name: coding style
description: Python formatting preferences
type: feedback
created: 2026-04-02
---
Prefer 4-space indentation and full type hints in all Python code.

**Why:** user explicitly stated this preference.
**How to apply:** apply to every Python file written or edited.
```
Frontmatter fields: `name`, `description`, `type`, `created`. An optional `updated` field is bumped on save. The body is freeform markdown — the agent reads the whole thing when relevance is high enough.
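A small parser for this format can be sketched as follows; `parse_memory` is a hypothetical helper, not the real `scan.py` code, and it assumes well-formed frontmatter:

```python
def parse_memory(text: str) -> tuple[dict[str, str], str]:
    """Split a memory file into (frontmatter dict, markdown body)."""
    # The file is: '---\n' + frontmatter + '---\n' + body.
    _, fm, body = text.split("---\n", 2)
    fields: dict[str, str] = {}
    for line in fm.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields, body.strip()
```

A real implementation would want a YAML parser and error handling for malformed files; the sketch only shows the split between header and body.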
The MEMORY.md index
Whenever a memory is saved or deleted, `store.py` rebuilds `MEMORY.md` (≤ 200 lines / 25 KB) — a flat listing of every memory's frontmatter plus a one-line summary. `context.py` injects this index into the system prompt on every turn.
The full memory contents are NOT injected — only the index. The agent calls MemorySearch or MemoryList to read the body when it decides a memory is relevant. This keeps the per-turn cost bounded.
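The rebuild under those caps might look like this sketch; the entry fields and the character-based size cap are simplifying assumptions, not the actual `store.py` logic:

```python
# Caps taken from the documented limits; names are illustrative.
MAX_LINES, MAX_CHARS = 200, 25_000

def render_index(memories: list[dict]) -> str:
    """One line per memory: [type/scope] name: description."""
    lines: list[str] = []
    for m in memories:
        if len(lines) + 1 > MAX_LINES:
            break  # line cap reached; remaining memories are omitted
        lines.append(f"[{m['type']}/{m['scope']}] {m['name']}: {m['description']}")
    # Approximate the 25 KB cap with a character slice.
    return "\n".join(lines)[:MAX_CHARS]
```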
The four tools
- `MemorySave(name, type, description, content, scope)` — create or update a memory. `scope` is `user` or `project`; `name` becomes the filename slug.
- `MemoryDelete(name, scope)` — remove a memory.
- `MemorySearch(query, scope?, use_ai=false, max_results=10)` — keyword or AI-ranked search. With `use_ai=true`, the agent ranks results by semantic relevance.
- `MemoryList(scope?)` — list all memories with age and metadata.
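The name-to-filename step could be as simple as the sketch below; the actual normalization rule in `store.py` is not specified here, so `slugify` is an assumption:

```python
import re

def slugify(name: str) -> str:
    """Turn a memory name into a filename slug (illustrative rule)."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return f"{slug}.md"
```

Because the slug is derived from the name, saving with an existing name writes the same file, which is what makes `MemorySave` an upsert.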
Inside a session you also have slash-command shortcuts:
```
/memory           # list all
/memory python    # search by keyword
```
When the agent reads / writes
By default the agent reads MEMORY.md (the index) on every turn — it's part of the system prompt. The agent decides to call MemorySearch or MemoryList when:
- The user asks about a previous decision ("how did we handle auth last time?").
- The agent is about to make a decision similar to one recorded in memory.
- The conversation references something that might be in user preferences ("write idiomatic code" — check if there's a coding-style memory).
The agent calls MemorySave when:
- The user says "remember this" or "from now on, do X".
- The user explicitly states a preference.
- The agent makes an architecture decision worth recording.
The agent should NOT save:
- Transient information (the current open file, the last error message).
- Anything sensitive (credentials, internal docs the user pasted).
- Duplicates of an existing memory — `MemorySave` updates by name, so deduplication is automatic, but the agent should reuse existing names.
Staleness
`scan.py` computes age from the `created` (or `updated`) timestamp. Memories older than 1 day get a "stale" note in `/memory` listings — a reminder to review and update, or delete if no longer relevant.
The freshness check is just a hint; the agent doesn't auto-delete stale memories. That's the user's call.
AI-ranked search
Plain `MemorySearch` is a keyword grep. With `use_ai=true`, the model ranks candidate memories against the query and returns the top N. Useful when the user's query doesn't share words with the memory ("how should I structure my code?" → matches the coding-style memory even though "structure" isn't in it).
The cost: an extra LLM call per search. Use sparingly.
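The plain keyword path can be sketched as a term-overlap scorer over index lines; `keyword_search` is illustrative, not the real implementation, and AI ranking would replace the scoring step with an LLM call:

```python
def keyword_search(query: str, entries: list[str], max_results: int = 10) -> list[str]:
    """Rank index lines by how many query terms they contain (case-insensitive)."""
    terms = query.lower().split()
    scored = []
    for entry in entries:
        hits = sum(term in entry.lower() for term in terms)
        if hits:  # drop entries with no overlap at all
            scored.append((hits, entry))
    scored.sort(key=lambda pair: -pair[0])  # most hits first
    return [entry for _, entry in scored[:max_results]]
```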
Example interaction
```
You: Remember that I prefer 4-space indentation and type hints in all Python code.

AI:  [calls MemorySave(name="coding style", type="feedback", scope="user",
             description="Python formatting preferences",
             content="Prefer 4-space indent and type hints in all Python code.")]
     Memory saved: coding style [feedback/user]

You: /memory

     [feedback/user] coding style (today): Python formatting preferences

You: write me a Python class for a stack

AI:  [reads MEMORY.md, sees coding-style memory, applies 4-space + type hints]

     class Stack:
         def __init__(self) -> None:
             self._items: list[Any] = []
         ...
```
The agent didn't call MemorySearch — the index summary was enough to know the constraint. That's the design: the index is cheap, the body is on-demand.
In Vibe Studio
Vibe Studio workspaces have memory under .nano_claude/memory/ (project scope) and ~/.nano_claude/memory/ (user scope). The agent reads both at every turn. Project memory ships in the workspace zip when you deploy — useful if you want the deployed agent to remember the same constraints, less useful if memory contains anything sensitive.
If you want memory to stay local-only, add .nano_claude/memory/ to .gitignore and exclude it from the deploy package.
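The version-control half of that is a one-line `.gitignore` entry; whether the deploy packager also honors `.gitignore` is an assumption, so the zip may need a separate exclusion:

```
# keep local-only memory out of version control
.nano_claude/memory/
```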
Limits
- `MEMORY.md` is capped at ~25 KB / 200 lines so the system prompt doesn't grow without bound.
- Individual memory bodies have no hard limit, but the agent will only load a few via `MemorySearch` per turn.
- Memory types are fixed (`user`, `feedback`, `project`, `reference`) — adding a new type would require modifying `types.py` and the index renderer.