Documentation

Memory

Long-running context the agent saves and retrieves across conversations.

Memory is how the agent remembers things between turns and between sessions. Each memory is a markdown file with frontmatter; an auto-generated MEMORY.md index gets injected into the system prompt so the agent always has an overview without paying the full token cost of every memory.

The system lives in a-mini/memory/:

  • types.py — MEMORY_TYPES constant and format guidance.
  • store.py — save / load / delete / search and MEMORY.md rebuilding.
  • scan.py — MemoryHeader, age and freshness helpers.
  • context.py — get_memory_context() for system-prompt injection, with truncation and AI search.
  • tools.py — the four memory tools.

Two scopes

Every memory is either user-scoped or project-scoped:

| Scope | Path | Use for |
| --- | --- | --- |
| user | ~/.nano_claude/memory/ | Things that follow you across all projects — your role, preferences, coding style. |
| project | ./.nano_claude/memory/ (cwd) | Things specific to the current repo — feature decisions, deadlines, domain knowledge. |

Project scope wins when both scopes have a memory with the same name.

Four types

types.py defines four memory types. They are labels that tell the model how to treat the content; the agent picks an appropriate type when it saves.

| Type | Use for |
| --- | --- |
| user | Your role, preferences, background. |
| feedback | How you want the model to behave. |
| project | Ongoing work, decisions, deadlines. |
| reference | Pointers to external resources. |
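
A plausible shape for the constant (the real types.py also carries format guidance per type and may structure this differently):

```python
# Assumed shape of the MEMORY_TYPES constant; illustrative only.
MEMORY_TYPES: dict[str, str] = {
    "user": "Your role, preferences, background.",
    "feedback": "How you want the model to behave.",
    "project": "Ongoing work, decisions, deadlines.",
    "reference": "Pointers to external resources.",
}

def validate_type(memory_type: str) -> str:
    """Reject anything outside the four fixed types."""
    if memory_type not in MEMORY_TYPES:
        raise ValueError(f"unknown memory type: {memory_type!r}")
    return memory_type
```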

File format

A single memory file:

---
name: coding style
description: Python formatting preferences
type: feedback
created: 2026-04-02
---
Prefer 4-space indentation and full type hints in all Python code.
**Why:** user explicitly stated this preference.
**How to apply:** apply to every Python file written or edited.

Frontmatter fields: name, description, type, created. Optional updated is bumped on save. The body is freeform markdown — the agent reads the whole thing when relevance is high enough.
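
Loading a file means splitting the frontmatter from the body. A minimal parser sketch, assuming the simple key: value frontmatter shown above; the actual loader in store.py may use a YAML library and stricter validation:

```python
import re

def parse_memory(text: str) -> tuple[dict[str, str], str]:
    """Split a memory file into (frontmatter fields, markdown body)."""
    match = re.match(r"---\n(.*?)\n---\n(.*)", text, re.DOTALL)
    if not match:  # no frontmatter: treat the whole file as body
        return {}, text
    fields: dict[str, str] = {}
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields, match.group(2)
```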

The MEMORY.md index

Whenever a memory is saved or deleted, store.py rebuilds MEMORY.md (≤ 200 lines / 25 KB) — a flat listing of every memory's frontmatter plus a one-line summary. context.py injects this index into the system prompt on every turn.

The full memory contents are NOT injected — only the index. The agent calls MemorySearch or MemoryList to read the body when it decides a memory is relevant. This keeps the per-turn cost bounded.
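
The cap logic can be illustrated as follows; the function name and exact line format are assumptions, not the real renderer in store.py:

```python
def render_index(entries: list[dict[str, str]],
                 max_lines: int = 200, max_bytes: int = 25_000) -> str:
    """Render a flat MEMORY.md-style index, enforcing both caps."""
    lines = [
        f"- [{e['type']}/{e['scope']}] {e['name']}: {e['description']}"
        for e in entries
    ]
    text = "\n".join(lines[:max_lines])
    # Trim whole lines until the byte budget is met.
    while len(text.encode("utf-8")) > max_bytes and "\n" in text:
        text = text.rsplit("\n", 1)[0]
    return text
```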

The four tools

MemorySave(name, type, description, content, scope) — create or update a memory. scope is user or project. name becomes the filename slug.

MemoryDelete(name, scope) — remove a memory.

MemorySearch(query, scope?, use_ai=false, max_results=10) — keyword or AI-ranked search. With use_ai=true, the agent ranks results by semantic relevance.

MemoryList(scope?) — list all memories with age and metadata.
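
The name-to-filename mapping presumably slugifies the name. A hypothetical rule (lowercase, runs of non-alphanumerics collapsed to hyphens); tools.py may normalize names differently:

```python
import re

def slugify(name: str) -> str:
    """Hypothetical slug rule: 'coding style' -> 'coding-style'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
```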

Inside a session you also have slash-command shortcuts:

/memory                  # list all
/memory python           # search by keyword

When the agent reads / writes

By default the agent reads MEMORY.md (the index) on every turn — it's part of the system prompt. The agent decides to call MemorySearch or MemoryList when:

  • The user asks about a previous decision ("how did we handle auth last time?").
  • The agent is about to make a decision similar to one recorded in memory.
  • The conversation references something that might be in user preferences ("write idiomatic code" — check if there's a coding-style memory).

The agent calls MemorySave when:

  • The user says "remember this" or "from now on, do X".
  • The user explicitly states a preference.
  • The agent makes an architecture decision worth recording.

The agent should NOT save:

  • Transient information (the current open file, the last error message).
  • Anything sensitive (credentials, internal docs the user pasted).
  • Duplicates of an existing memory — MemorySave updates by name, so this is automatic, but the agent should reuse existing names.

Staleness

scan.py computes age from the created (or updated) timestamp. Memories older than 1 day get a "stale" note in /memory listings — a reminder to review and update or delete if no longer relevant.

The freshness check is just a hint; the agent doesn't auto-delete stale memories. That's the user's call.

AI-ranked search

Plain MemorySearch is keyword grep. With use_ai=true, the model ranks candidate memories against the query and returns the top N. Useful when the user's query doesn't share words with the memory ("how should I structure my code?" → matches the coding style memory even though "structure" isn't in it).

The cost: an extra LLM call per search. Use sparingly.
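
The keyword path is roughly a term-count ranking over each memory's name and body. The sketch below (names illustrative, not the real MemorySearch implementation) also reproduces the failure mode described above: a query that shares no words with a memory scores zero:

```python
def keyword_search(query: str, memories: dict[str, str],
                   max_results: int = 10) -> list[str]:
    """Rank memory names by how often the query terms appear."""
    terms = query.lower().split()
    scored: list[tuple[int, str]] = []
    for name, body in memories.items():
        haystack = f"{name} {body}".lower()
        score = sum(haystack.count(t) for t in terms)
        if score:
            scored.append((score, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [name for _, name in scored[:max_results]]
```

The last assertion in a quick check is the interesting one: "structure" matches nothing by keyword, which is exactly the gap use_ai=true is meant to cover.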

Example interaction

You: Remember that I prefer 4-space indentation and type hints in all Python code.
AI: [calls MemorySave(name="coding style", type="feedback", scope="user",
                       description="Python formatting preferences",
                       content="Prefer 4-space indent and type hints in all Python code.")]
    Memory saved: coding style [feedback/user]

You: /memory
  [feedback/user] coding style (today): Python formatting preferences

You: write me a Python class for a stack
AI: [reads MEMORY.md, sees coding-style memory, applies 4-space + type hints]
    class Stack:
        def __init__(self) -> None:
            self._items: list[Any] = []
        ...

The agent didn't call MemorySearch — the index summary was enough to know the constraint. That's the design: the index is cheap, the body is on-demand.

In Vibe Studio

Vibe Studio workspaces have memory under .nano_claude/memory/ (project scope) and ~/.nano_claude/memory/ (user scope). The agent reads both at every turn. Project memory ships in the workspace zip when you deploy — useful if you want the deployed agent to remember the same constraints, less useful if memory contains anything sensitive.

If you want memory to stay local-only, add .nano_claude/memory/ to .gitignore and exclude it from the deploy package.

Limits

  • MEMORY.md is capped at ~25 KB / 200 lines so the system prompt doesn't grow without bound.
  • Individual memory bodies have no hard limit, but the agent will only load a few via MemorySearch per turn.
  • Memory types are fixed (user, feedback, project, reference) — adding a new type would require modifying types.py and the index renderer.