pi-lcm-memory
Persistent cross-session semantic memory for Pi: a hybrid (FTS5 + vector) recall layer on top of pi-lcm.
Package details
Install pi-lcm-memory from npm and Pi will load the resources declared by the package manifest.
```
$ pi install npm:pi-lcm-memory
```
- Package: pi-lcm-memory
- Version: 1.0.1
- Published: Apr 30, 2026
- Downloads: not available
- Author: sharkone
- License: MIT
- Types: extension
- Size: 188.2 KB
- Dependencies: 3 dependencies · 5 peers
Pi manifest JSON

```json
{
  "extensions": [
    "./index.ts"
  ]
}
```

Security note
Pi packages can execute code and influence agent behavior. Review the source before installing third-party packages.
README
# 🧠 pi-lcm-memory
Persistent, cross-session semantic memory for Pi.
Never lose context. Every session remembered, every thought retrievable
by meaning, not just keywords. Fully local. No external APIs.
Built as an additive layer on top of pi-lcm.
## ✨ What it does
When you open Pi in a project you've worked in before, pi-lcm-memory:
- 📋 Briefs you with a session-start primer of recent work
- 🔍 Recalls past messages and summaries via hybrid semantic + lexical search
- ⚡ Auto-injects relevant context when you say things like "remember earlier…"
- 🔄 Indexes silently in the background, with no latency on your turns
All embeddings live in the same SQLite file pi-lcm already manages. No duplication, no sync, no external services.
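The "remember earlier…" trigger can be pictured as a cheap regex gate over each user turn. This is an illustrative sketch; the pattern and function name are assumptions, not pi-lcm-memory's actual source:

```typescript
// Illustrative heuristic auto-recall gate: a regex test on each user turn.
// The exact pattern and names here are assumptions, not the extension's code.
const RECALL_HINTS = /\b(remember|earlier|previously|like last time)\b/i;

// Decide whether a user turn should trigger a background memory lookup.
function shouldAutoRecall(userTurn: string): boolean {
  return RECALL_HINTS.test(userTurn);
}
```

When the gate fires, the extension runs a recall query and injects the top hits (capped by the `autoRecallTokenBudget` setting) into the turn's context.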
## 🏗️ Architecture
```
┌─────────────────────────── Pi Session ───────────────────────────┐
│                                                                  │
│  ┌─────────────┐  message_end    ┌────────────────────────────┐  │
│  │ pi-lcm      │ ──────────────► │ pi-lcm-memory              │  │
│  │             │                 │                            │  │
│  │ messages    │ ◄─ read-only ── │ Indexer (hook + sweep)     │  │
│  │ summaries   │                 │            │               │  │
│  │ FTS5 index  │                 │            ▼               │  │
│  └─────────────┘                 │ Worker thread              │  │
│         │                        │ (ONNX / Transformers.js)   │  │
│         │ shared SQLite          └────────────┬───────────────┘  │
│         ▼                                     ▼                  │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ ~/.pi/agent/lcm/<hash>.db                               │    │
│  │                                                         │    │
│  │ messages  ◄────────────────  memory_index (join)        │    │
│  │ summaries ◄────────────────  memory_vec (sqlite-vec)    │    │
│  │            memory_meta (kv + events)                    │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                  │
│  session_start ─────────────►  Primer + auto-recall              │
│  user turn ─────────────────►  Heuristic recall injection        │
│  lcm_recall / lcm_similar ──►  Retriever (FTS5 + vec → RRF)      │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```
Both extensions are independent Pi peers; pi-lcm-memory never patches pi-lcm. It only adds three tables (`memory_vec`, `memory_index`, `memory_meta`) to the existing per-project SQLite database.
## 🚀 Quick start
```
pi install npm:pi-lcm          # if not already installed
pi install npm:pi-lcm-memory
pi                             # open Pi as normal
```
First session in a project with existing pi-lcm history:
- ⬇️ Downloads the embedding model (`Xenova/bge-small-en-v1.5`, ~33 MB, once per machine)
- ⚙️ Backfills embeddings for all existing messages + summaries in batches of 32
- 📋 Renders a session-start primer with recent topics
- 🔄 From now on, every new message is embedded in the background
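The batched backfill step can be sketched as a simple chunking generator (illustrative only; the real indexer hands each batch to the embedding worker and commits it in one transaction):

```typescript
// Illustrative chunking for the batched backfill described above.
// Batch size 32 matches the documented default; names are assumptions.
function* batches<T>(rows: T[], size = 32): Generator<T[]> {
  for (let i = 0; i < rows.length; i += size) {
    yield rows.slice(i, i + size); // one batch per worker round-trip
  }
}
```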
## 📊 What it adds on top of pi-lcm
| | pi-lcm | pi-lcm-memory |
|---|---|---|
| Per-message storage | ✅ SQLite | shared (no duplication) |
| FTS5 lexical search | ✅ `lcm_grep` | reused |
| DAG summaries (D0/D1/D2…) | ✅ | reused |
| Cross-session recall within project | ✅ | reused |
| Dense vector index | ❌ | ✅ sqlite-vec virtual table |
| Hybrid semantic + lexical retrieval | ❌ | ✅ `lcm_recall` |
| "More like this" navigation | ❌ | ✅ `lcm_similar` |
| Session-start memory primer | ❌ | ✅ |
| Heuristic auto-recall | ❌ | ✅ |
| Settings panel | ❌ | ✅ (mirrors pi-lcm UX) |
## 🛠️ Agent tools
### `lcm_recall`

Hybrid (FTS5 + vector) search across all sessions in this project.

```
lcm_recall(query, k?, mode?, sessionFilter?, after?, before?)
```
| param | default | description |
|---|---|---|
| `query` | – | Natural-language or keyword query |
| `k` | `10` | Number of results |
| `mode` | `hybrid` | `hybrid` · `lexical` · `semantic` |
| `sessionFilter` | – | Restrict to a single conversation UUID |
| `after` / `before` | – | ISO 8601 date bounds |
### `lcm_similar`

Find messages semantically close to a known one; great for "show me more like this".

```
lcm_similar(messageId, k?)
```
💡 Use `lcm_grep` for exact strings, `lcm_recall` for concepts and paraphrases, and `lcm_expand(summary_id)` to drill into any summary returned by recall.
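For illustration, the recall parameters above can be modeled as a small TypeScript type. The names mirror the parameter table, but the concrete types and the helper are assumptions, not the extension's actual definitions:

```typescript
// Hypothetical shape of an lcm_recall call, inferred from the parameter
// table above; the extension's real types may differ.
type RecallMode = "hybrid" | "lexical" | "semantic";

interface RecallParams {
  query: string;          // required: natural-language or keyword query
  k?: number;             // number of results, default 10
  mode?: RecallMode;      // default "hybrid"
  sessionFilter?: string; // restrict to a single conversation UUID
  after?: string;         // ISO 8601 lower bound
  before?: string;        // ISO 8601 upper bound
}

// Apply the documented defaults; explicitly passed values win.
function withDefaults(p: RecallParams) {
  return { k: 10, mode: "hybrid" as RecallMode, ...p };
}
```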
## 💬 Slash commands
| Command | What it does |
|---|---|
| `/memory stats` | counts, model, dimensions, DB size |
| `/memory status` | sweep cycles, busy flag, last error, current interval |
| `/memory search <query>` | ad-hoc recall (same as `lcm_recall`) |
| `/memory reindex` | wipe all embeddings and re-embed everything |
| `/memory settings` | open the interactive settings panel |

The embedding model and the retrieval hyperparameters (`rrfK`, `lexMult`, `semMult`) are changed via `/memory settings`.
## ⚙️ Settings
Stored under the `lcm-memory` key in pi-lcm's settings files.
Resolution order: env vars → project → global → defaults.
| Key | Default | Description |
|---|---|---|
| `enabled` | `true` | Master switch. Auto-disables if pi-lcm is disabled. |
| `embeddingModel` | `Xenova/bge-small-en-v1.5` | Any Transformers.js feature-extraction model. |
| `embeddingQuantize` | `q8` | `auto` / `fp32` / `fp16` / `q8` / `int8` / `q4` |
| `indexMessages` | `true` | Embed user/assistant turns. |
| `indexSummaries` | `true` | Embed pi-lcm DAG summaries. |
| `skipToolIO` | `true` | Skip tool call/result content (FTS5 still covers these). |
| `primer` | `true` | Show session-start briefing. |
| `primerTopK` | `5` | Number of recent topics in the primer. |
| `autoRecall` | `heuristic` | `off` / `heuristic` / `always` |
| `autoRecallTopK` | `5` | Hits injected on auto-recall. |
| `autoRecallTokenBudget` | `600` | Hard token cap on the injected recall block. |
| `recallDefaultTopK` | `10` | Default `k` for `lcm_recall`. |
| `rrfK` | `20` | Reciprocal Rank Fusion constant (sweep-tuned). |
| `lexMult` | `4` | FTS5 candidate breadth multiplier (sweep-tuned). |
| `semMult` | `16` | Vector candidate breadth multiplier (sweep-tuned). |
| `sweepIntervalMs` | `30000` | Base sweep period (backs off ×2 up to 5 min when idle). |
| `modelCacheDir` | `null` | Override the model weight cache directory. |
| `debugMode` | `false` | Verbose notifications. |
Env overrides: `PI_LCM_MEMORY_ENABLED`, `PI_LCM_MEMORY_DB_DIR`, `PI_LCM_MEMORY_MODEL`,
`PI_LCM_MEMORY_QUANTIZE`, `PI_LCM_MEMORY_SWEEP_MS`, `PI_LCM_MEMORY_DEBUG`
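The documented resolution order (env vars over project over global over defaults) boils down to a layered merge. A minimal sketch, assuming plain key-value settings objects; the real loader's key names and parsing may differ:

```typescript
// Layered settings merge mirroring the documented precedence:
// env vars > project > global > defaults. Illustrative, not the real loader.
type Settings = Record<string, unknown>;

function resolveSettings(
  defaults: Settings,
  globalCfg: Settings,
  projectCfg: Settings,
  envCfg: Settings,
): Settings {
  // Later spreads win, so envCfg has the highest precedence.
  return { ...defaults, ...globalCfg, ...projectCfg, ...envCfg };
}
```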
## ⚡ Performance
Measured on Apple Silicon (M-class), default model Xenova/bge-small-en-v1.5 q8, 8 ORT threads:
| Metric | Value |
|---|---|
| Backfill throughput | ~1,500–2,000 messages/sec |
| Hook latency (p50) | ~3.4 ms |
| Sweep throughput | ~262 rows/sec |
| Recall latency | ~12 ms |
| Model download (once) | ~33 MB |
| DB growth per message | ~2 KB at 384 dims |
| 100k messages | ≈ 80 MB index |
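As a rough sanity check, the "100k messages" row is consistent with q8 quantization (~1 byte per dimension at 384 dims) plus a few hundred bytes of per-row overhead. The overhead figure below is a guess for illustration, not a measurement:

```typescript
// Back-of-envelope index growth. Assumes q8 (~1 byte/dim) and a guessed
// ~400 bytes of per-row SQLite/metadata overhead; treat as a rough guide.
function estimateIndexBytes(
  messages: number,
  dims = 384,
  bytesPerDim = 1,
  overheadPerRow = 400,
): number {
  return messages * (dims * bytesPerDim + overheadPerRow);
}

estimateIndexBytes(100_000); // ≈ 78 MB, in line with the table above
```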
All embedding work runs in a dedicated worker thread, so the Pi TUI is never blocked. The main thread is idle between turns.
## 🔬 How it works
**Ingestion**: two concurrent paths keep the index fresh:
- Hook path: `message_end` → embed in worker → `INSERT OR IGNORE`
- Sweep path: every 30 s (with adaptive backoff), scan for un-indexed pi-lcm rows and process them in batches of 32

**Retrieval**: `lcm_recall(query)`:
1. Run FTS5 BM25 over `messages` + `summaries` → ranked list
2. Run sqlite-vec kNN over `memory_vec` → ranked list
3. Merge the two lists with Reciprocal Rank Fusion, using the configured `rrfK`

**Primer**: at session start, render up to 5 recent D≥1 summaries into a `## Project memory` block (≤300 tokens). Shows a one-line notification to the user (`[memory] N prior sessions; last on DATE`) and injects the full block into Claude's context on the first turn.

**Auto-recall**: a regex listener on each user turn (`/remember|earlier|previously|like last time|.../i`) injects a `## Recall` block into the current turn's system context.

**Worker thread**: `src/embeddings/worker.mjs` owns the Transformers.js pipeline. ORT is configured with `intraOpNumThreads = cpus()-1` (max 8), and embeddings come back to the main thread as zero-copy `ArrayBuffer` transfers.
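The fusion step above can be sketched in a few lines. This is a minimal Reciprocal Rank Fusion over ranked id lists, not the extension's source; `k` plays the role of the `rrfK` setting:

```typescript
// Minimal Reciprocal Rank Fusion over ranked id lists (a sketch of the
// retrieval merge described above, not the extension's actual code).
// k corresponds to the rrfK setting (default 20).
function rrfMerge(rankings: string[][], k = 20): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      // Each list contributes 1 / (k + rank + 1) for every id it ranks.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// An id ranked well by both FTS5 and the vector index outranks one that
// only a single list ranks highly:
rrfMerge([["a", "b", "c"], ["b", "c", "d"]]); // "b" fuses to the top
```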
## 🐛 Debugging
Set `PI_LCM_MEMORY_TRACE=1` before launching Pi to write a side-channel trace log:

```
PI_LCM_MEMORY_TRACE=1 pi
# → /tmp/pi-lcm-memory.<pid>.trace.log

PI_LCM_MEMORY_TRACE=/path/to/log pi   # explicit path
```
Both the main thread and the embedder worker write to the same file with pid/src markers. The log is written with `fs.writeSync`, so it survives main-thread freezes; it's the right tool when the TUI hangs and the in-DB diagnostics ring can't be written.
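A freeze-proof trace writer of this kind can be sketched as follows. The function name and log-line format are illustrative, not the package's actual implementation; the point is that `fs.writeSync` bypasses Node's async I/O queue, so a line reaches disk even if the event loop hangs on the next tick:

```typescript
import { openSync, writeSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Illustrative freeze-proof trace writer. writeSync is synchronous, so the
// line is on disk before the call returns, even if the event loop later hangs.
function openTrace(envValue: string | undefined): (src: string, msg: string) => void {
  if (!envValue) return () => {}; // tracing disabled
  const path = envValue === "1"
    ? join(tmpdir(), `pi-lcm-memory.${process.pid}.trace.log`) // default path scheme
    : envValue; // explicit path, e.g. PI_LCM_MEMORY_TRACE=/path/to/log
  const fd = openSync(path, "a");
  return (src, msg) => {
    writeSync(fd, `${Date.now()} pid=${process.pid} src=${src} ${msg}\n`);
  };
}

// Each thread would create its own writer against the same path:
const trace = openTrace(process.env.PI_LCM_MEMORY_TRACE);
trace("main", "indexer sweep started");
```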
## 🧑‍💻 Local dev
```
git clone git@github.com:sharkone/pi-lcm-memory.git
cd pi-lcm-memory
npm install
npm test             # 91 vitest tests, ~500 ms
npm run typecheck    # tsc --noEmit
npm run bench        # perf + quality benchmarks (needs a live pi-lcm DB)
pi -e ./index.ts     # load local extension into Pi
```
⚠️ `test/worker.live.test.ts` downloads ~33 MB of model weights. It is skipped by default; enable it with `PI_LCM_MEMORY_LIVE_TEST=1`.
## 📄 License
MIT © sharkone