# 🏗 System Architecture
Locentra OS is built as a modular LLM operating system, where inference, memory, feedback, and agents are decoupled but deeply integrated. Every subsystem can evolve independently—yet all contribute to a shared intelligence layer.
This isn’t just a chatbot backend. It’s an autonomous AI runtime.
## 🔧 Component Stack (Layered View)
## 🧠 Key System Components
### 1. Frontend (`web/`)

- Built with React, Vite, and Tailwind
- Connects directly to Solana wallets (Phantom, Backpack)
- Displays:
  - Model responses
  - Vector memory hits
  - Agent evaluations
  - Live training state
- The wallet's auth signature is passed to the backend for secure, token-gated access (a verification sketch follows)
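Purely as an illustration, here is a minimal sketch of how a backend could verify the wallet's auth signature. The `verify_wallet_signature` helper is hypothetical (not part of the Locentra codebase); it assumes Solana's Ed25519 signatures, checked with PyNaCl and base58.

```python
# Hypothetical sketch: server-side verification of a Solana wallet signature.
# Assumes the frontend sends the wallet's base58 public key, the message it
# signed, and the base58-encoded Ed25519 signature. Requires: pynacl, base58.
import base58
from nacl.signing import VerifyKey
from nacl.exceptions import BadSignatureError

def verify_wallet_signature(pubkey_b58: str, message: str, signature_b58: str) -> bool:
    """Return True if the signature is a valid Ed25519 signature of `message`."""
    try:
        verify_key = VerifyKey(base58.b58decode(pubkey_b58))
        verify_key.verify(message.encode("utf-8"), base58.b58decode(signature_b58))
        return True
    except BadSignatureError:
        return False
```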
### 2. Backend (`backend/`)

- Built on FastAPI
- Exposes (a sample call follows this list):
  - `/api/llm/query`, `/api/llm/train`
  - `/api/system/logs`, `/api/user/create`
- Manages:
  - Auth middleware
  - API keys & session scope
  - Core registry + configuration lifecycle
- Integrates:
  - CLI usage
  - Agent triggers
  - Semantic embedding + injection
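To make the API surface concrete, here is a hedged example of calling the query endpoint; the host, auth header, and payload shape are assumptions for the sketch, not documented contracts.

```python
# Hypothetical sketch: calling the query endpoint over HTTP.
# The payload shape and auth header are assumptions, not a documented contract.
import requests

resp = requests.post(
    "http://localhost:8000/api/llm/query",
    headers={"Authorization": "Bearer <your-api-key>"},  # placeholder key
    json={"prompt": "Summarize the Locentra architecture."},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```

The other routes would presumably accept similar authenticated requests; consult the route definitions in `backend/api/` for the actual schemas.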
### 3. Model Engine (`models/`)

- Powered by HuggingFace Transformers and SentenceTransformers
- Supports:
  - Falcon
  - Mistral
  - GPT-J
  - LLaMA
- Handles:
  - On-device inference
  - Real-time fine-tuning
  - Adapter logic (e.g., `adapter.py` for LoRA/hybrid layers; see the sketch below)
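To make the adapter idea concrete, here is a minimal sketch of attaching a LoRA adapter to a causal LM with the `peft` library; the base model name, target modules, and hyperparameters are illustrative assumptions, not values taken from `adapter.py`.

```python
# Hypothetical sketch: wrapping a base model with a LoRA adapter via peft.
# Model name and LoRA hyperparameters are illustrative, not Locentra defaults.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
lora_cfg = LoraConfig(
    r=8,                                  # low-rank dimension
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Because only the low-rank adapter matrices are updated during fine-tuning, this style of adapter keeps real-time training cheap enough to run alongside serving.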
## 🧬 Semantic Memory System

Found in: `backend/data/`, `backend/db/`

- Every prompt is embedded via a SentenceTransformer
- Embeddings are stored as vectors in PostgreSQL alongside metadata
- Top-K search via cosine similarity retrieves related history (see the retrieval sketch below)
- Matches are injected into the prompt context before generation
- Memory entries can be queried, rewritten, scored, or tagged for training
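A minimal sketch of the embed-and-retrieve step, assuming PostgreSQL with the pgvector extension and an `embeddings` table; the table name, column names, and embedding model are assumptions, not taken from `backend/db/`.

```python
# Hypothetical sketch: embed a prompt and retrieve the top-K similar entries.
# Assumes PostgreSQL with the pgvector extension and an `embeddings` table;
# schema and model choice are illustrative. Requires: sentence-transformers, psycopg2.
import psycopg2
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def top_k_matches(prompt: str, k: int = 5) -> list[str]:
    vec = model.encode(prompt).tolist()
    with psycopg2.connect("dbname=locentra") as conn, conn.cursor() as cur:
        # `<=>` is pgvector's cosine-distance operator; smaller means more similar.
        cur.execute(
            "SELECT prompt_text FROM embeddings ORDER BY embedding <=> %s::vector LIMIT %s",
            (str(vec), k),
        )
        return [row[0] for row in cur.fetchall()]
```

The retrieved texts would then be prepended to the prompt, which matches the injection step described above.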
## 🤖 Autonomous Agent System

Found in: `backend/agents/`

Locentra runs feedback-aware agents that self-correct and self-train:

- **AutoTrainer**: detects low-score prompts → triggers fine-tuning
- **FeedbackLoop**: logs user edits + re-queries → queues them for learning
- **PromptOptimizer**: rewrites confusing queries → enhances clarity

Agents operate asynchronously and are granted scoped access to:

- Vector memory
- Raw prompt history
- LLM inference pipeline
- Output evaluation metrics
- Registry and analytics
Custom agents can be added as well; a hypothetical sketch of what one might look like follows.
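Purely as an illustration, here is a hedged sketch of a custom agent. The `BaseAgent` class and every method name are invented for this example; the real extension API lives in `backend/agents/` and may differ.

```python
# Hypothetical sketch of a custom agent. `BaseAgent` and the memory methods
# are invented for illustration; the real extension API lives in
# backend/agents/ and may differ.
import asyncio

class BaseAgent:  # stand-in for Locentra's real agent base class
    async def run(self) -> None:
        raise NotImplementedError

class StaleMemoryPruner(BaseAgent):
    """Example agent: periodically drops low-score vector memory entries."""

    def __init__(self, memory, min_score: float = 0.2, interval_s: int = 600):
        self.memory = memory          # scoped handle to vector memory (assumed API)
        self.min_score = min_score
        self.interval_s = interval_s

    async def run(self) -> None:
        while True:
            for entry in self.memory.all_entries():   # assumed memory API
                if entry.score < self.min_score:
                    self.memory.delete(entry.id)
            await asyncio.sleep(self.interval_s)
```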
## ⚙️ Infrastructure & Deployment

- Docker for stack orchestration
- NGINX for reverse proxy / TLS
- `.env` for runtime config (`MODEL_NAME`, `TRAINING_EPOCHS`, etc.; a loader sketch follows below)
- Uvicorn + Gunicorn for production-grade ASGI execution
Fully containerized: Locentra can run on bare metal, in local development, or on Kubernetes-based infrastructure.
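As a small illustration of the `.env`-driven config mentioned above, a hedged Python loader; `MODEL_NAME` and `TRAINING_EPOCHS` are the variables named in the list, while the default values are invented.

```python
# Hypothetical sketch: reading .env-driven runtime config. Variable names match
# those mentioned above; defaults are illustrative. Requires: python-dotenv.
import os
from dotenv import load_dotenv

load_dotenv()  # loads variables from a local .env file into the environment

MODEL_NAME = os.getenv("MODEL_NAME", "mistralai/Mistral-7B-v0.1")
TRAINING_EPOCHS = int(os.getenv("TRAINING_EPOCHS", "3"))

print(f"model={MODEL_NAME} epochs={TRAINING_EPOCHS}")
```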
## 📂 Directory Highlights

| Path | Purpose |
| --- | --- |
| `backend/api/` | FastAPI routes |
| `backend/models/` | Model adapter, trainer, infer, loader |
| `backend/agents/` | Agent logic and lifecycle |
| `backend/data/` | Embedding + tokenizer logic |
| `backend/services/` | User service, memory handler, analytics |
| `backend/db/` | Schema, ORM, SQLAlchemy interface |
| `web/` | Frontend React app |
| `cli/` | Developer tools (training, querying) |
## 🔄 System Lifecycle
This is LLM orchestration at runtime—autonomous, memory-aware, and open.