A private agent stack.
Five layers. One boundary.

Every customer request enters at the top, descends through five layers of controlled execution, and returns from inside the customer's own compute boundary.


01 Connectors

Where work enters the system

Customer and employee requests reach Laava through the channels you already operate. Email, voice, internal chat, and self-service portals are first-class: agents know which channel a request came from and respond in a way that fits it.

Each connector preserves the original artifact and threads identity through every downstream step.

channels.email      Inbound tickets, replies, case updates
channels.voice      Live calls, summaries, transcripts
channels.workspace  Slack, Teams, self-service portal
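
As a rough illustration of "preserve the original artifact and thread identity", a connector could wrap each inbound message in an envelope like the one below. This is a minimal sketch, not the Laava SDK's actual type: `WorkEnvelope`, `from_email`, and every field name are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid


@dataclass(frozen=True)
class WorkEnvelope:
    """Hypothetical wrapper a connector could emit; not the Laava SDK's real type."""
    channel: str          # e.g. "channels.email"
    identity: str         # who sent the request, as resolved by the connector
    raw_artifact: bytes   # the original message, kept byte-for-byte
    work_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    received_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def from_email(sender: str, raw_message: bytes) -> WorkEnvelope:
    """Wrap an inbound email without altering the original bytes."""
    return WorkEnvelope(channel="channels.email", identity=sender, raw_artifact=raw_message)


envelope = from_email("ana@example.com", b"Subject: Refund\r\n\r\nPlease check invoice 1042.")
```

Because the dataclass is frozen, downstream steps can read the identity and artifact but cannot overwrite them in place.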

02 API layer

The single front door

One Agent API receives every piece of work. It authenticates the source, normalizes the payload, and emits a typed event onto the bus. Nothing downstream runs without a signed, traced event.

Reviews and handoffs route back through here too — escalations don't bypass the API.

# inbound
POST /api/v1/work
  → event.work.received
  → event.work.routed(agent=service)

# review path
POST /api/v1/review/:id
  → event.review.approved | rejected
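
One way to picture a "signed, traced event" is an event body that carries a trace id and an HMAC over its canonical JSON form. The sketch below assumes HMAC-SHA256 with a shared secret; the page does not say which signing scheme Laava uses, and `emit_event`, `verify_event`, and `SIGNING_KEY` are hypothetical names.

```python
import hashlib
import hmac
import json
import uuid

SIGNING_KEY = b"demo-key-not-for-production"  # assumption: a secret held by the API layer


def _canonical(body: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def emit_event(event_type: str, payload: dict, key: bytes = SIGNING_KEY) -> dict:
    """Build a typed event with a trace id and an HMAC-SHA256 signature."""
    body = {"type": event_type, "trace_id": str(uuid.uuid4()), "payload": payload}
    signature = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


def verify_event(event: dict, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the signature over everything except the signature field."""
    body = {k: v for k, v in event.items() if k != "signature"}
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, event.get("signature", ""))


event = emit_event("event.work.received", {"channel": "channels.email", "subject": "Refund"})
tampered = {**event, "payload": {"channel": "channels.email", "subject": "Refund x2"}}
```

A downstream consumer would call `verify_event` before acting, so an event whose payload was altered after signing is rejected.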

03 Runtime: Laava SDK

Specialist agents, one customer outcome

The runtime orchestrates specialist agents, not one mega-prompt. A Service agent owns the customer promise; a Knowledge agent retrieves source-backed context; a Backoffice agent checks invoices, usage, and policy.

They call tools from a typed registry. Every CRM update, every refund, every ticket close is a function with a signature — not free-text instruction.

# tool registry — typed, traced, reversible
tools.crm.update_case(id, fields)
tools.knowledge.search(query, top_k=8)
tools.invoice.lookup(account_id)
tools.review.request(work_id, reason)
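
To make "a function with a signature, not free-text instruction" concrete, here is a minimal sketch of a typed registry in plain Python. It is an assumption about how such a registry could work, not the Laava SDK's implementation: the `tool` decorator, `REGISTRY`, and `call_tool` are hypothetical, and the search body is a stand-in.

```python
import inspect
from typing import Any, Callable

REGISTRY: dict[str, Callable[..., Any]] = {}


def tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Register a function under a dotted tool name."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        REGISTRY[name] = fn
        return fn
    return register


@tool("tools.knowledge.search")
def knowledge_search(query: str, top_k: int = 8) -> list[str]:
    # Stand-in body; a real tool would query the knowledge store.
    return [f"result {i} for {query!r}" for i in range(top_k)]


def call_tool(name: str, **kwargs: Any) -> Any:
    """Validate arguments against the tool's signature before running it."""
    fn = REGISTRY[name]
    bound = inspect.signature(fn).bind(**kwargs)  # TypeError on unknown or missing args
    bound.apply_defaults()
    for arg, value in bound.arguments.items():
        expected = fn.__annotations__.get(arg)
        # Only simple class annotations (str, int, ...) are checked in this sketch.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(f"{name}: {arg} must be {expected.__name__}")
    return fn(*bound.args, **bound.kwargs)


results = call_tool("tools.knowledge.search", query="refund policy")
```

An agent that emits `top_k="3"` or an argument the tool does not declare gets a `TypeError` instead of a silently wrong call.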

04 Database + state

Operational memory and retrieval

Conversations, work items, and knowledge chunks live in Postgres with pgvector. Redis handles queues, sessions, and real-time coordination. A separate trace store captures every decision, tool call, and approval.

The trace store is immutable, queryable, and exportable. This is what makes the system auditable, not just observable.

pg + pgvector  Conversations, work items, knowledge chunks, vector search
redis          Queues, sessions, real-time coordination
trace.store    Decisions, approvals, tool calls, audit trail
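
One common way to make a trace log tamper-evident is to chain each record to the previous one by hash. The page does not describe how the trace store enforces immutability, so the sketch below is an assumed technique (an in-memory, hash-chained, append-only log) and `TraceStore` is a hypothetical class, not the product's storage engine.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class TraceStore:
    """In-memory, append-only trace log; each record stores the hash of the one before it."""

    def __init__(self) -> None:
        self._records: list[dict] = []

    def append(self, kind: str, detail: dict) -> dict:
        prev_hash = self._records[-1]["hash"] if self._records else GENESIS
        body = {"seq": len(self._records), "kind": kind, "detail": detail, "prev": prev_hash}
        record = {**body, "hash": _digest(body)}
        self._records.append(record)
        return record

    def verify(self) -> bool:
        """Return False if any record was edited or the chain was broken."""
        prev_hash = GENESIS
        for record in self._records:
            body = {k: v for k, v in record.items() if k != "hash"}
            if record["prev"] != prev_hash or record["hash"] != _digest(body):
                return False
            prev_hash = record["hash"]
        return True

    def export(self) -> str:
        """Serialize the log as JSON Lines for an auditor."""
        return "\n".join(json.dumps(r, sort_keys=True) for r in self._records)


store = TraceStore()
store.append("tool_call", {"tool": "tools.invoice.lookup", "account_id": "A-17"})
store.append("approval", {"work_id": "w-1", "reviewer": "ops"})
```

Editing any earlier record changes its hash, so `verify()` fails for that record and the exported trail can be checked independently.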

05 Infrastructure

Inside the boundary

Everything above runs on Laava Box — a private NVIDIA GPU cluster, customer-controlled. Model services (reasoning, reranking, STT, TTS, embeddings) all execute locally.

Scale to your workload: 1, 2, 4, or 8 cards per box. The network, runtime, and data perimeter sit under your control. Not ours.

nvidia ×1–8     Sized to your workload; customer-controlled private compute
model.services  Reasoning · reranking · STT · TTS · embeddings
boundary        Network, runtime, and data perimeter; customer-owned

06 Return path

Nothing leaves the boundary

Responses retrace the path. Tool outputs flow back through the runtime, the API publishes a completion event, and the connector delivers the answer to its original channel.

Reviewers get notified inside the boundary. Auditors get an exportable trace. The customer sees a single accountable outcome.
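
The return path can be sketched as a completion event followed by delivery on the channel the work arrived on. This is an illustration under assumptions: the event name `event.work.completed`, the `DELIVERERS` table, and `complete_work` are hypothetical and do not appear in the Laava API shown above.

```python
from typing import Callable

# Hypothetical per-channel delivery functions; real connectors would send mail, post to chat, etc.
DELIVERERS: dict[str, Callable[[str, str], str]] = {
    "channels.email": lambda identity, answer: f"email to {identity}: {answer}",
    "channels.workspace": lambda identity, answer: f"chat message to {identity}: {answer}",
}


def complete_work(work: dict, answer: str) -> tuple[dict, str]:
    """Build a completion event, then deliver on the originating channel."""
    event = {
        "type": "event.work.completed",  # assumed event name
        "work_id": work["work_id"],
        "trace_id": work["trace_id"],    # same trace id as the inbound event
    }
    deliver = DELIVERERS[work["channel"]]  # KeyError if the originating channel is unknown
    return event, deliver(work["identity"], answer)


work = {"work_id": "w-1", "trace_id": "t-1", "channel": "channels.email", "identity": "ana@example.com"}
event, delivery = complete_work(work, "Your refund was approved.")
```

Carrying the channel and identity on the work item from the start is what lets the answer go back the way the request came in.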

07 Outcome

AI that integrates.
Systems that endure.

Want this mapped to your channels, data, and workflows? We'll turn this diagram into a concrete implementation plan for your stack.

Talk to an engineer
