# 🧠 Semantic Memory

Locentra OS doesn’t forget.

Unlike stateless LLMs that treat every prompt as an isolated request, Locentra introduces a persistent, vector-based memory layer that enables:

- Long-term semantic context
- Prompt recall and reuse
- Adaptive domain learning over time
## 🧬 How It Works

Locentra’s memory system embeds every relevant prompt with a SentenceTransformer, stores the resulting embedding in a PostgreSQL database, and performs cosine similarity search over those vectors at runtime.
## 📊 Memory Pipeline
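End to end, the pipeline is: embed the prompt, store it, search stored vectors on the next query, and inject matches as pre-context. A minimal runnable sketch of that flow is below; it uses a deterministic stand-in embedder (random unit vectors seeded by a checksum, with no real semantics) in place of the actual SentenceTransformer, and an in-process list in place of PostgreSQL, so `embed`, `remember`, and `recall` are illustrative names, not Locentra's API:

```python
import zlib

import numpy as np

def embed(text, dim=32):
    """Stand-in embedder: deterministic unit vector per text.
    The real pipeline uses a SentenceTransformer such as all-MiniLM-L6-v2."""
    rng = np.random.default_rng(zlib.crc32(text.encode("utf-8")))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

memory = []  # stand-in for the PostgreSQL-backed store

def remember(prompt):
    """Vectorize a prompt and persist it."""
    memory.append({"prompt": prompt, "vector": embed(prompt)})

def recall(prompt, top_k=5, threshold=0.82):
    """Cosine-similarity search over stored vectors.
    All vectors are unit-norm, so a dot product is the cosine score."""
    q = embed(prompt)
    scored = sorted(((float(e["vector"] @ q), e["prompt"]) for e in memory),
                    reverse=True)
    return [(s, p) for s, p in scored[:top_k] if s >= threshold]

remember("How do Solana nodes work?")
remember("What is a Solana cluster?")
# Stand-in embeddings carry no semantics, so the threshold is disabled
# for this demo; real MiniLM embeddings would score related prompts highly.
print(recall("What's a validator on Solana?", top_k=2, threshold=-1.0))
```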
## 🧾 Memory Schema

Memory entries are stored via SQLAlchemy. Each memory row includes:

- `prompt` (text)
- `vector` (embedding array)
- `tags` (optional metadata)
- `created_at` (timestamp)
- `score` (for future relevance weighting)
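As a sketch, the fields above could map to a SQLAlchemy model like the following. This is a hypothetical reconstruction, not Locentra's actual model: the class and table names are assumptions, and SQLite stands in for PostgreSQL so the snippet runs anywhere:

```python
from datetime import datetime, timezone

from sqlalchemy import JSON, Column, DateTime, Float, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class MemoryEntry(Base):
    """Hypothetical model matching the fields listed above."""
    __tablename__ = "memory_entries"

    id = Column(Integer, primary_key=True)
    prompt = Column(String, nullable=False)       # raw prompt text
    vector = Column(JSON, nullable=False)         # embedding as a list of floats
    tags = Column(JSON, nullable=True)            # optional metadata
    created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc))
    score = Column(Float, default=0.0)            # future relevance weighting

# SQLite stand-in for the demo; Locentra itself uses PostgreSQL.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(MemoryEntry(prompt="What is a Solana cluster?",
                            vector=[0.1, 0.2, 0.3], tags=["solana"]))
    session.commit()
    stored = session.query(MemoryEntry).one()
    prompt_text, vector = stored.prompt, stored.vector

print(prompt_text, vector)
```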
## 🔍 Similarity Search Logic

1. Each new prompt is vectorized (e.g., with MiniLM)
2. A cosine similarity score is computed against all existing vectors
3. Top-N similar entries are retrieved
4. The similarity threshold is configurable (e.g., ≥ 0.82)

Returned results are:

- Injected as pre-context to improve model inference
- Logged in the session
- Optionally passed to PromptOptimizer for rewrite proposals
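The scoring step above can be sketched with NumPy. `top_k_similar` is a hypothetical helper, not Locentra's actual API; it assumes embeddings arrive as a matrix with one row per stored prompt:

```python
import numpy as np

def top_k_similar(query_vec, memory_vecs, k=5, threshold=0.82):
    """Return (index, score) pairs for the k most similar stored vectors
    whose cosine similarity clears the configured threshold."""
    q = query_vec / np.linalg.norm(query_vec)
    m = memory_vecs / np.linalg.norm(memory_vecs, axis=1, keepdims=True)
    scores = m @ q                        # cosine similarity per stored vector
    order = np.argsort(scores)[::-1][:k]  # best-first, capped at top-k
    return [(int(i), float(scores[i])) for i in order if scores[i] >= threshold]

# Toy 3-dimensional vectors; real MiniLM embeddings have 384 dimensions.
memory = np.array([[1.0, 0.0, 0.0],
                   [0.9, 0.1, 0.0],
                   [0.0, 1.0, 0.0]])
print(top_k_similar(np.array([1.0, 0.05, 0.0]), memory, k=2))
```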
## ⚙️ Configurable Parameters

Adjust memory behavior via `.env` or `core/config.py`:

| Parameter | Description | Default |
| --- | --- | --- |
| `MEMORY_ENABLED` | Enable/disable semantic memory | `true` |
| `MEMORY_SEARCH_TOP_K` | Number of similar entries to retrieve | `5` |
| `MEMORY_SIMILARITY_THRESHOLD` | Minimum cosine similarity to be valid | `0.82` |
| `MEMORY_MODEL` | SentenceTransformer model used | `all-MiniLM-L6-v2` |
> 🔁 You can swap in your own embedding model, or integrate FAISS for high-speed ANN retrieval.
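In `.env` form, the documented defaults look like this (a sketch; key names come from the table above):

```ini
MEMORY_ENABLED=true
MEMORY_SEARCH_TOP_K=5
MEMORY_SIMILARITY_THRESHOLD=0.82
MEMORY_MODEL=all-MiniLM-L6-v2
```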
## 🧪 Example Usage Flow

1. User queries: *"What’s a validator on Solana?"*
2. Locentra vectorizes it and searches memory.
3. Matches found:
   - "How do Solana nodes work?"
   - "What is a Solana cluster?"
4. Both are injected into the system prompt as pre-context.
5. The final output is context-aware, rich, and precise.
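The injection step (4) amounts to prepending the retrieved prompts to the system prompt. A minimal sketch, assuming a simple template; the real formatting lives inside Locentra's engine and `build_system_prompt` is a hypothetical name:

```python
def build_system_prompt(base_instructions, similar_prompts):
    """Assemble a system prompt with retrieved memories as pre-context."""
    if not similar_prompts:
        return base_instructions  # nothing recalled: plain system prompt
    context = "\n".join(f"- {p}" for p in similar_prompts)
    return f"{base_instructions}\n\nRelated prior prompts:\n{context}"

prompt = build_system_prompt(
    "You are Locentra, a domain-aware assistant.",
    ["How do Solana nodes work?", "What is a Solana cluster?"],
)
print(prompt)
```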
## 🛠 Developer Tools

### ➕ Vectorize a Prompt

### 🧼 Clean Up Old Entries (Planned)

### 🔍 Memory Introspection (Planned)

Displays:

- Vector stats
- Prompt metadata
- Similarity cluster graphs (future)
## 🔩 Extending the Memory Engine

Want more control?

- Override similarity scoring in `backend/data/vectorizer.py`
- Replace cosine with `dot_product`, `euclidean`, or `manhattan`
- Or hook in FAISS, Annoy, or ScaNN

You can also tag, score, and prioritize memory entries for fine-tuning selection or RLHF.
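Swapping the metric comes down to replacing one scoring function. A sketch of the three alternatives named above, written so that a higher score always means more similar (distance metrics are negated); the `METRICS` table is illustrative, not Locentra's interface:

```python
import numpy as np

# Hypothetical drop-in scoring functions; higher score = more similar,
# so the two distance metrics are negated.
METRICS = {
    "dot_product": lambda a, b: float(a @ b),
    "euclidean":   lambda a, b: -float(np.linalg.norm(a - b)),
    "manhattan":   lambda a, b: -float(np.abs(a - b).sum()),
}

a = np.array([1.0, 0.0])
b = np.array([0.5, 0.5])
scores = {name: fn(a, b) for name, fn in METRICS.items()}
print(scores)
```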
## 🧠 Why It Matters

The Locentra memory system enables:

- Persistent learning
- Prompt conditioning
- Vector-based recall
- Knowledge consolidation

All of it: local-first, privacy-respecting, and modular.