00 Architecture

A private agent stack.
Five layers. One boundary.

Every customer request enters at the top, descends through five layers of controlled execution, and returns from inside the customer's own compute boundary.

01 Connectors

Where work enters the system

Customer and employee requests reach Laava through the channels you already operate. Email, voice, internal chat, and self-service portals are all first-class, and agents take the originating channel into account instead of treating every request as generic text.

Each connector preserves the original artifact and threads identity through every downstream step.

  • channels.email: Inbound tickets, replies, case updates
  • channels.voice: Live calls, summaries, transcripts
  • channels.workspace: Slack, Teams, self-service portal
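
As a rough illustration of "preserve the original artifact and thread identity", a connector could wrap each inbound message in an immutable envelope. This is a minimal sketch, not Laava's actual schema: the class name `InboundEnvelope`, its fields, and the helper `from_email` are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4


@dataclass(frozen=True)
class InboundEnvelope:
    """Hypothetical wrapper a connector might emit (assumed shape, not Laava's schema)."""
    channel: str          # e.g. "email", "voice", "workspace"
    sender_id: str        # identity carried through every downstream step
    raw_artifact: bytes   # the original message, stored unmodified
    work_id: str = field(default_factory=lambda: uuid4().hex)
    received_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def from_email(sender: str, raw_message: bytes) -> InboundEnvelope:
    # The connector wraps the artifact; it does not rewrite or summarize it.
    return InboundEnvelope(channel="email", sender_id=sender, raw_artifact=raw_message)
```

Because the dataclass is frozen, later layers can read the sender and the raw bytes but cannot overwrite them in place.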
02 API layer

The single front door

One Agent API receives every piece of work. It authenticates the source, normalizes the payload, and emits a typed event onto the bus. Nothing downstream runs without a signed, traced event.

Reviews and handoffs route back through here too — escalations don't bypass the API.

# inbound
POST /api/v1/work
   event.work.received
   event.work.routed(agent=service)

# review path
POST /api/v1/review/:id
   event.review.approved | rejected
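
One way to picture "a signed, traced event" is an HMAC over the event's canonical JSON form. The sketch below is an assumption about how such signing could work, not a description of the Agent API: the key handling, the field names, and the functions `emit_event` and `verify_event` are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Optional
from uuid import uuid4

# Assumption: a shared secret held by the API layer. Demo value only.
SIGNING_KEY = b"demo-key-not-for-production"


def _canonical(body: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte string to sign.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def emit_event(event_type: str, payload: dict, trace_id: Optional[str] = None) -> dict:
    """Build a typed event (e.g. "event.work.received") and attach a signature."""
    body = {
        "type": event_type,
        "trace_id": trace_id or uuid4().hex,
        "payload": payload,
    }
    signature = hmac.new(SIGNING_KEY, _canonical(body), hashlib.sha256).hexdigest()
    return {**body, "signature": signature}


def verify_event(event: dict) -> bool:
    """Recompute the signature; a downstream step would refuse events that fail."""
    body = {k: v for k, v in event.items() if k != "signature"}
    expected = hmac.new(SIGNING_KEY, _canonical(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, event.get("signature", ""))
```

Any change to the type, trace ID, or payload after signing makes `verify_event` return `False`, which is the property the "nothing runs without a signed event" rule relies on.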
03 Runtime: Laava SDK

Specialist agents, one customer outcome

The runtime orchestrates specialist agents, not one mega-prompt. A Service agent owns the customer promise; a Knowledge agent retrieves source-backed context; a Backoffice agent checks invoices, usage, and policy.

They call tools from a typed registry. Every CRM update, every refund, every ticket close is a function with a signature, not a free-text instruction.

# tool registry — typed, traced, reversible
tools.crm.update_case(id, fields)
tools.knowledge.search(query, top_k=8)
tools.invoice.lookup(account_id)
tools.review.request(work_id, reason)
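
To make "a function with a signature" concrete, here is a minimal sketch of a typed registry in Python. It is not the Laava SDK: the `tool` decorator, `call_tool`, and the placeholder `search` body are hypothetical, and only the name `knowledge.search` and its `top_k=8` default come from the listing above.

```python
import inspect
from typing import Any, Callable, Dict

_REGISTRY: Dict[str, Callable[..., Any]] = {}


def tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Register a function under a dotted name such as "knowledge.search"."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        _REGISTRY[name] = fn
        return fn
    return decorator


@tool("knowledge.search")
def search(query: str, top_k: int = 8) -> list:
    # Placeholder body; a real tool would query the knowledge store.
    return [f"result {i} for {query!r}" for i in range(top_k)]


def call_tool(name: str, **kwargs: Any) -> Any:
    """Reject calls whose arguments do not match the tool's declared signature."""
    fn = _REGISTRY[name]
    sig = inspect.signature(fn)
    bound = sig.bind(**kwargs)  # raises TypeError on unknown or missing arguments
    bound.apply_defaults()
    for arg_name, value in bound.arguments.items():
        expected = sig.parameters[arg_name].annotation
        # Only plain classes (str, int, ...) are checked in this sketch.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(f"{name}: {arg_name} must be {expected.__name__}")
    return fn(**bound.arguments)
```

An agent that produces `call_tool("knowledge.search", query="refund policy")` gets results; one that invents an argument or passes the wrong type fails before the tool body runs.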
04 Database + state

Operational memory and retrieval

Conversations, work items, and knowledge chunks live in Postgres with pgvector. Redis handles queues, sessions, and real-time coordination. A separate trace store captures every decision, tool call, and approval.

Immutable, queryable, exportable. This is what makes the system auditable, not just observable.

  • pg + pgvector: Conversations, work items, knowledge chunks, vector search
  • redis: Queues, sessions, real-time coordination
  • trace.store: Decisions, approvals, tool calls, audit trail
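
The "immutable, queryable, exportable" properties can be sketched with an append-only log. This in-memory version is purely illustrative and assumes nothing about how the real trace store is built: `TraceRecord`, `TraceStore`, and the JSON Lines export format are hypothetical choices.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TraceRecord:
    work_id: str
    kind: str      # "decision", "tool_call", or "approval"
    detail: str


class TraceStore:
    """Append-only in-memory log; deliberately has no update or delete method."""

    def __init__(self) -> None:
        self._records: List[TraceRecord] = []

    def append(self, record: TraceRecord) -> None:
        self._records.append(record)

    def query(self, work_id: str) -> Tuple[TraceRecord, ...]:
        # Return a tuple so callers cannot mutate the underlying log.
        return tuple(r for r in self._records if r.work_id == work_id)

    def export_jsonl(self) -> str:
        # One JSON object per line, in the order the records were written.
        return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in self._records)
```

An auditor-facing export is then a single call, and the history for one work item is a filtered read of the same log.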
05 Infrastructure

Inside the boundary

Everything above runs on Laava Box — a private NVIDIA GPU cluster, customer-controlled. Model services (reasoning, reranking, STT, TTS, embeddings) all execute locally.

Scale to your workload: 1, 2, 4, or 8 cards per box. The network, runtime, and data perimeter sit under your control. Not ours.

  • nvidia ×1–8: Sized to your workload; customer-controlled private compute
  • model.services: Reasoning · reranking · STT · TTS · embeddings
  • boundary: Network, runtime, and data perimeter; customer-owned
06 Return path

Nothing leaves the boundary

Responses retrace the path. Tool outputs flow back through the runtime, the API publishes a completion event, and the connector delivers the answer to its original channel.

Reviewers get notified inside the boundary. Auditors get an exportable trace. The customer sees a single accountable outcome.
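
The return path can be pictured as a lookup from the work item's original channel to a delivery function. The sketch below is an assumption for illustration only: the event name `event.work.completed`, the `DELIVERERS` table, and the `complete` function are hypothetical, and the `delivered` list stands in for real channel delivery.

```python
from typing import Callable, Dict, List

delivered: List[str] = []  # stand-in for real email or chat delivery

# Hypothetical per-channel delivery functions, keyed by the channel the work arrived on.
DELIVERERS: Dict[str, Callable[[str, str], None]] = {
    "email": lambda recipient, text: delivered.append(f"email to {recipient}: {text}"),
    "workspace": lambda recipient, text: delivered.append(f"chat to {recipient}: {text}"),
}


def complete(work: dict, answer: str) -> dict:
    """Build a completion event and deliver the answer on the original channel."""
    event = {"type": "event.work.completed", "work_id": work["work_id"]}
    DELIVERERS[work["channel"]](work["sender_id"], answer)
    return event
```

A request that arrived by email is answered by email; an unknown channel raises `KeyError` instead of silently choosing another route.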

07 Outcome

AI that integrates.
Systems that endure.

Want this mapped to your channels, data, and workflows? We'll turn this diagram into a concrete implementation plan for your stack.

Talk to an engineer
Architecture — The Private Laava Agent Stack | Laava