A private agent stack.
Five layers. One boundary.
Every customer request enters at the top, descends through five layers of controlled execution, and returns from inside the customer's own compute boundary.
Where work enters the system
Customer and employee requests reach Laava through the channels you already operate. Email, voice, internal chat, and self-service portals are first-class: agents treat the channel as part of the request rather than abstracting it away.
Each connector preserves the original artifact and threads identity through every downstream step.
- channels.email: Inbound tickets, replies, case updates
- channels.voice: Live calls, summaries, transcripts
- channels.workspace: Slack, Teams, self-service portal
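As a rough illustration of what "preserves the original artifact and threads identity" could look like, here is a minimal sketch of a connector envelope. The class names, field names, and the `wrap_email` helper are all hypothetical and chosen for illustration; they are not Laava's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib
import uuid


@dataclass(frozen=True)
class Identity:
    # Hypothetical identity record; field names are assumptions.
    subject: str   # e.g. an email address or employee ID
    channel: str   # "email", "voice", or "workspace"


@dataclass(frozen=True)
class InboundWork:
    identity: Identity
    raw_artifact: bytes    # the original message, kept byte-for-byte
    artifact_sha256: str   # fingerprint so later steps can detect changes
    work_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    received_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def wrap_email(sender: str, raw_message: bytes) -> InboundWork:
    """Wrap an inbound email without altering the original bytes."""
    return InboundWork(
        identity=Identity(subject=sender, channel="email"),
        raw_artifact=raw_message,
        artifact_sha256=hashlib.sha256(raw_message).hexdigest(),
    )
```

Because the envelope is frozen, downstream steps can read the identity and artifact but cannot silently rewrite them.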
The single front door
One Agent API receives every piece of work. It authenticates the source, normalizes the payload, and emits a typed event onto the bus. Nothing downstream runs without a signed, traced event.
Reviews and handoffs route back through here too — escalations don't bypass the API.
# inbound
POST /api/v1/work
  → event.work.received
  → event.work.routed(agent=service)

# review path
POST /api/v1/review/:id
  → event.review.approved | rejected
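One way to picture a "signed, traced event" is an HMAC over a canonical JSON body that carries a trace ID. This is a sketch under stated assumptions: the signing scheme, the `SIGNING_KEY` constant, and the `emit_event` / `verify_event` helpers are hypothetical, and the page does not say how Laava actually signs events.

```python
import hashlib
import hmac
import json
import uuid

# Assumption: a shared secret held by the Agent API. Demo value only.
SIGNING_KEY = b"demo-key-not-for-production"


def _canonical(body: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte representation.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def emit_event(event_type: str, payload: dict, key: bytes = SIGNING_KEY) -> dict:
    """Build an event with a trace ID and an HMAC-SHA256 signature."""
    body = {
        "type": event_type,
        "trace_id": str(uuid.uuid4()),
        "payload": payload,
    }
    signature = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


def verify_event(event: dict, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the signature over everything except the signature field."""
    body = {k: v for k, v in event.items() if k != "signature"}
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, event.get("signature", ""))
```

A downstream consumer would call `verify_event` before acting, which is one way to enforce that nothing runs on an unsigned or altered event.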
Specialist agents, one customer outcome
The runtime orchestrates specialist agents, not one mega-prompt. A Service agent owns the customer promise; a Knowledge agent retrieves source-backed context; a Backoffice agent checks invoices, usage, and policy.
They call tools from a typed registry. Every CRM update, every refund, every ticket close is a function with a signature, not a free-text instruction.
# tool registry: typed, traced, reversible
tools.crm.update_case(id, fields)
tools.knowledge.search(query, top_k=8)
tools.invoice.lookup(account_id)
tools.review.request(work_id, reason)
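The idea of "a function with a signature" can be sketched as a small registry that checks arguments against type annotations before dispatching. The `ToolRegistry` class and the placeholder body of `knowledge_search` are hypothetical; only the tool name and the `top_k=8` default come from the listing above.

```python
import inspect
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical registry: tools are plain functions with annotations."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable] = {}

    def register(self, name: str):
        def decorator(fn: Callable) -> Callable:
            self._tools[name] = fn
            return fn
        return decorator

    def call(self, name: str, **kwargs):
        fn = self._tools[name]        # KeyError for unknown tools
        sig = inspect.signature(fn)
        bound = sig.bind(**kwargs)    # TypeError on missing or extra arguments
        bound.apply_defaults()
        for arg, value in bound.arguments.items():
            expected = sig.parameters[arg].annotation
            if expected is inspect.Parameter.empty:
                continue              # unannotated parameters are not checked
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name}: {arg} must be {expected.__name__}")
        return fn(*bound.args, **bound.kwargs)


tools = ToolRegistry()


@tools.register("knowledge.search")
def knowledge_search(query: str, top_k: int = 8) -> list:
    # Placeholder body; a real tool would query the knowledge store.
    return [f"result {i} for {query!r}" for i in range(top_k)]
```

An agent would then invoke `tools.call("knowledge.search", query="refund policy")`, and a malformed call fails before any side effect happens.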
Operational memory and retrieval
Conversations, work items, and knowledge chunks live in Postgres with pgvector. Redis handles queues, sessions, and real-time coordination. A separate trace store captures every decision, tool call, and approval.
Immutable, queryable, exportable. This is what makes the system auditable, not just observable.
- pg + pgvector: Conversations, work items, knowledge chunks, vector search
- redis: Queues, sessions, real-time coordination
- trace.store: Decisions, approvals, tool calls, audit trail
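To make "immutable, queryable, exportable" concrete, here is a minimal in-memory sketch of an append-only trace log. The hash chaining is an assumption used to illustrate tamper evidence; the page does not describe how the real trace store enforces immutability, and the `TraceStore` class and its record fields are hypothetical.

```python
import hashlib
import json
from typing import Dict, List


class TraceStore:
    """Hypothetical append-only log; each record links to the previous hash."""

    GENESIS = "0" * 64

    def __init__(self) -> None:
        self._entries: List[Dict] = []

    @staticmethod
    def _digest(record: Dict) -> str:
        # Hash every field except the hash itself, in a stable key order.
        body = {k: v for k, v in record.items() if k != "hash"}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, kind: str, detail: Dict) -> Dict:
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        record = {"seq": len(self._entries), "kind": kind, "detail": detail, "prev": prev}
        record["hash"] = self._digest(record)
        self._entries.append(record)
        return record

    def query(self, kind: str) -> List[Dict]:
        return [r for r in self._entries if r["kind"] == kind]

    def verify(self) -> bool:
        """Return False if any record was altered or reordered after the fact."""
        prev = self.GENESIS
        for record in self._entries:
            if record["prev"] != prev or record["hash"] != self._digest(record):
                return False
            prev = record["hash"]
        return True

    def export_jsonl(self) -> str:
        return "\n".join(json.dumps(r, sort_keys=True) for r in self._entries)
```

An auditor holding the exported lines could recompute the chain independently, which is the difference between an audit trail and an ordinary log.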
Inside the boundary
Everything above runs on Laava Box — a private NVIDIA GPU cluster, customer-controlled. Model services (reasoning, reranking, STT, TTS, embeddings) all execute locally.
Scale to your workload: 1, 2, 4, or 8 cards per box. The network, runtime, and data perimeter sit under your control. Not ours.
- nvidia ×1–8: Sized to your workload; customer-controlled private compute
- model.services: Reasoning · reranking · STT · TTS · embeddings
- boundary: Network, runtime, and data perimeter; customer-owned
Nothing leaves the boundary
Responses retrace the path. Tool outputs flow back through the runtime, the API publishes a completion event, and the connector delivers the answer to its original channel.
Reviewers get notified inside the boundary. Auditors get an exportable trace. The customer sees a single accountable outcome.
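The return path described above can be sketched as a dispatcher that hands a completion event to the connector for its originating channel. The event shape, the `event.work.completed` name, and the `CONNECTORS` table are hypothetical; the in-memory `outbox` stands in for real delivery.

```python
from typing import Callable, Dict, List, Tuple

# Stand-in for real delivery: (channel, recipient, text) tuples.
outbox: List[Tuple[str, str, str]] = []


def _make_connector(channel: str) -> Callable[[str, str], None]:
    def send(recipient: str, text: str) -> None:
        outbox.append((channel, recipient, text))
    return send


# Channel names mirror the connectors listed earlier on the page.
CONNECTORS: Dict[str, Callable[[str, str], None]] = {
    name: _make_connector(name) for name in ("email", "voice", "workspace")
}


def deliver_completion(event: dict) -> None:
    """Route the answer back to the channel the work arrived on."""
    origin = event["origin"]
    connector = CONNECTORS.get(origin["channel"])
    if connector is None:
        raise ValueError(f"no connector for channel {origin['channel']!r}")
    connector(origin["subject"], event["payload"]["answer"])
```

Keeping the origin on the event is what lets the answer land in the same channel without the agents needing to know how delivery works.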
AI that integrates.
Systems that endure.
Want this mapped to your channels, data, and workflows? We'll turn this diagram into a concrete implementation plan for your stack.