Self-Hosting an LLM Agent Manager: Building a Local 'Cowork' Alternative with Matrix and Docker
2026-02-25
9 min read

Build a privacy-first self-hosted LLM agent manager using Matrix for notifications and Docker for safe isolation. Step-by-step guide for developers.

Stop trusting desktop agents with full disk access — build your own privacy-first LLM agent manager

Commercial desktop agent apps introduced in late 2025 and early 2026 (Anthropic's Cowork being the most notable) pushed convenience by granting broad filesystem and app access to autonomous agents. That model solves productivity problems — at the cost of privacy, auditability, and control. If you're a developer or sysadmin who needs autonomous developer agents but refuses to hand private repos, credentials, and production systems to a closed SaaS desktop app, this guide shows how to build a self-hosted LLM agent manager that keeps control where it belongs: on your infrastructure.

Why this matters in 2026

Two trends converged in 2025–2026 that make a self-hosted approach practical and necessary:

  • Open and efficient LLMs and runtimes matured for edge deployment, enabling teams to run capable models on-prem or in private cloud.
  • Concerns about desktop agents and centralized orchestration — amplified by the launch of commercial desktop copilots and the continuing consolidation and shutdowns of big vendor services — increased demand for private, auditable alternatives.

Building a manager that orchestrates tasks, logs, and permissions lets you get the productivity gains of autonomous agents without surrendering secrets or system control.

What you'll build — architecture overview

Goal: a minimal, privacy-first service that schedules and runs autonomous agents in isolated Docker containers, streams logs, and sends notifications and audit messages over Matrix. The system will include a tiny UI for authoring tasks and reviewing results.

Core components

  • Agent Manager: central coordinator, schedules jobs, enforces policies, stores metadata and audit logs (Postgres or SQLite for small installs).
  • Agent Runner(s): ephemeral Docker containers that execute steps, constrained with seccomp, resource limits, and user namespaces.
  • Matrix notifier: a bridge/bot that publishes task events, logs summaries, and approval requests to Matrix rooms.
  • Minimal UI: single-page app (static) or small server to create tasks, view logs, and manage permissions.
  • Reverse proxy and TLS: Traefik or Caddy for Let’s Encrypt / TLS and routing.
  • Optional model backend: local LLM runtime (containerized) or API proxy to a hosted model with fine-grained controls.
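
To make the manager's role concrete, here is a minimal sketch of the task record it might persist. The field names and the status values are illustrative assumptions, not a prescribed schema; the key point is that only non-sensitive metadata lives here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid

@dataclass
class Task:
    """Minimal, non-sensitive task metadata the manager persists."""
    summary: str
    user: str
    repo: str
    task_id: str = field(default_factory=lambda: f"T-{uuid.uuid4().hex[:8]}")
    # created -> running -> awaiting_approval -> done / failed
    status: str = "created"
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

task = Task(summary="Refactor service X", user="alice", repo="repo-123")
```

Secrets, tokens, and raw repo contents stay out of this record by design; the runner receives them through mounts or a vault at execution time.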

Key principles

  • Least privilege — agents get just the files and network they need.
  • Auditability — all actions are logged, immutable hashes stored, and key events pushed to Matrix rooms.
  • Isolated execution — Docker containers with no host network unless explicitly allowed.
  • Minimal trust surface — run models locally or through a vetted gateway; never bake secrets into agent images.

Privacy-first doesn't mean low-quality. With modern LLMs and careful isolation, you can get near commercial-level autonomy while keeping your data private and auditable.

Prerequisites and quick checklist

  • A VPS or server (x86/ARM) with Docker and Docker Compose (or Podman) — GPU optional for local LLMs.
  • Domain name and TLS (Traefik or Caddy recommended).
  • Matrix homeserver or account for the notifier (you can self-host Synapse/Dendrite or use a trusted provider).
  • PostgreSQL (recommended) or SQLite for small deployments.
  • Basic Git and container image build skills.

Practical deploy: Docker Compose starter

The following minimal docker-compose.yml shows the manager, agent-runner template, a Matrix bot, and Postgres. This is a starter — extend it for model backends, UI, and monitoring.

version: '3.8'
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: manager
      POSTGRES_PASSWORD: change-me
      POSTGRES_DB: agent_manager
    volumes:
      - ./data/postgres:/var/lib/postgresql/data

  manager:
    image: ghcr.io/yourorg/llm-agent-manager:latest
    environment:
      DATABASE_URL: postgres://manager:change-me@db:5432/agent_manager
      MATRIX_HOMESERVER: https://matrix.example.com
      MATRIX_ROOM_ID: '!agents:example.com'
      MATRIX_ACCESS_TOKEN: 'YOUR_MATRIX_TOKEN'
    ports:
      - '8080:8080'
    depends_on:
      - db
    volumes:
      - ./data/manager:/app/data

  matrix-bot:
    image: ghcr.io/yourorg/matrix-agent-bot:latest
    environment:
      MATRIX_HOMESERVER: 'https://matrix.example.com'
      MATRIX_ACCESS_TOKEN: 'YOUR_MATRIX_TOKEN'
      MATRIX_ROOM_ID: '!agents:example.com'
    depends_on:
      - manager

  runner:
    image: ghcr.io/yourorg/agent-runner:latest
    # runner is started by manager via docker socket or remote API
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./data/artifacts:/artifacts

Notes:

  • The manager schedules jobs and calls the Docker API to spin up ephemeral runners.
  • Store only non-sensitive metadata in the DB; never store raw secrets.
  • For production, avoid bind-mounting /var/run/docker.sock from untrusted services; use a remote Docker API with TLS and a least-privileged certificate, or use a dedicated runner service.
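
One way to keep the runner hardening consistent is to build the container options in a single pure function and hand them to the Docker API. The sketch below assumes the Docker SDK for Python (docker-py), whose `containers.run()` accepts these keyword arguments; the image name and host path are illustrative.

```python
def runner_options(task_id: str, repo_host_path: str) -> dict:
    """Build keyword arguments for docker-py's containers.run() that
    mirror the hardening flags used later in this guide. Keeping them
    in one function means every runner gets the same constraints."""
    return {
        "image": "ghcr.io/yourorg/agent-runner:latest",   # illustrative
        "user": "1000:1000",                  # non-root inside the container
        "read_only": True,                    # read-only rootfs
        "network_mode": "none",               # no egress unless policy grants it
        "mem_limit": "1g",
        "nano_cpus": 1_000_000_000,           # 1.0 CPU
        "tmpfs": {"/tmp": "rw,size=64m"},     # scratch space only
        "volumes": {repo_host_path: {"bind": "/work", "mode": "rw"}},
        "labels": {"agent.task_id": task_id}, # for audit correlation
        "remove": True,                       # ephemeral: clean up on exit
    }

opts = runner_options("T-1234", "/host/repos/repo-123")
```

The manager would then call something like `client.containers.run(command=..., **opts)`; because the options are a plain dict, policy checks and tests can inspect them before anything is launched.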

Matrix notifier: simple event flow

Use Matrix for human-readable notifications and approvals. Events to publish:

  • Task created (with an ID and metadata)
  • Agent started (link to logs)
  • Approval required (interactive request)
  • Task completed or failed (with artifacts and exit code)

Example JSON payload the manager sends to the Matrix bot (conceptual):

{
  "type": "task_started",
  "task_id": "T-1234",
  "user": "alice",
  "summary": "Refactor service X: rename API endpoints",
  "log_url": "https://manager.example.com/tasks/T-1234/logs"
}
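
A sketch of how the bot might translate that payload into a Matrix message event content dict. `m.notice` and `org.matrix.custom.html` are standard Matrix client-server API values; the exact body formatting is an assumption, and token/room handling is omitted.

```python
def to_matrix_content(event: dict) -> dict:
    """Translate a manager event into content for an m.room.message
    event. m.notice is used so other bots don't treat it as user chat."""
    body = (
        f"[{event['task_id']}] {event['type']}: {event['summary']} "
        f"(by {event['user']}) logs: {event['log_url']}"
    )
    return {
        "msgtype": "m.notice",
        "body": body,                         # plain-text fallback
        "format": "org.matrix.custom.html",
        "formatted_body": (
            f"<b>{event['task_id']}</b> {event['type']}: {event['summary']} "
            f"<a href=\"{event['log_url']}\">logs</a>"
        ),
    }

content = to_matrix_content({
    "type": "task_started",
    "task_id": "T-1234",
    "user": "alice",
    "summary": "Refactor service X: rename API endpoints",
    "log_url": "https://manager.example.com/tasks/T-1234/logs",
})
```

The bot would PUT this dict to the room's `/send/m.room.message` endpoint with its access token; keeping the translation pure makes it easy to test without a homeserver.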

Agent runner design and hardening

Each agent runs inside its own container built from a minimal image. Key hardening steps:

  • Run as a non-root user (USER 1000).
  • Use seccomp and read-only rootfs where possible.
  • Mount only the directories the agent needs (e.g., a single repo directory) as writable.
  • Block or proxy outbound network access by default; allow necessary destinations via an egress policy or proxy.
  • Use cgroup CPU/memory limits, and set restart policy to "no" for ephemeral jobs.
  • Use ephemeral volumes for task artifacts; copy artifacts to manager storage only after completion.

Example docker run flags you should apply programmatically:

docker run --rm \
  --user 1000:1000 \
  --read-only \
  --tmpfs /tmp:rw,size=64M \
  --memory=1g --cpus=1.0 \
  --security-opt seccomp=/path/to/seccomp.json \
  --network=none \
  -v /host/repos/repo-123:/work:rw \
  ghcr.io/yourorg/agent-image:latest /bin/sh -c 'run-agent'

Permissions and approval workflow

Implement a simple RBAC model in the manager:

  • Roles: admin, developer, auditor, guest.
  • Policy rules: who can create tasks, who can approve network access, who can mount production paths.
  • Approval flow: if an agent requests high-risk actions (write access to a prod repo, network egress, or secret access), the manager publishes an approval request to a designated Matrix room and pauses the job until an authorized user approves.
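
The pause-until-approved behavior can be sketched as a small gate in the manager. Action names, roles, and the in-memory pending map are illustrative assumptions; a real manager would persist pending requests and post them to Matrix.

```python
# High-risk actions that always require human approval (illustrative names)
HIGH_RISK = {"push_remote", "network_egress", "secret_access", "mount_prod"}

class ApprovalGate:
    """Pause high-risk actions until an authorized role approves."""
    APPROVERS = {"admin", "developer"}   # roles allowed to approve

    def __init__(self):
        self.pending = {}                # task_id -> requested action

    def request(self, task_id: str, action: str) -> bool:
        """Return True if the action may proceed immediately."""
        if action not in HIGH_RISK:
            return True
        # Real manager: persist this and publish an approval request to Matrix
        self.pending[task_id] = action
        return False

    def approve(self, task_id: str, approver_role: str) -> bool:
        """Approve a pending request; only authorized roles succeed."""
        if approver_role not in self.APPROVERS:
            return False
        return self.pending.pop(task_id, None) is not None

gate = ApprovalGate()
ok = gate.request("T-1234", "run_tests")        # low risk: proceeds
paused = gate.request("T-1234", "push_remote")  # high risk: job pauses
```

Auditors can see requests but not approve them, which keeps the review and audit roles cleanly separated.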

Logs, observability, and retention

Design logs for humans and machines:

  • Stream container stdout/stderr to the manager; keep a compressed, tamper-evident archive of logs per task (e.g., signed by manager key).
  • Publish short summaries and links via Matrix to avoid flooding rooms with raw logs.
  • For advanced setups, add Prometheus metrics and Grafana dashboards. Loki is a good option for log indexing.
  • Retention policy: keep logs for an audit window (30–90 days) and archive or delete based on policy.
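
Tamper evidence can be as simple as a hash chain over the log lines: each entry's hash covers the previous hash, so altering any line invalidates everything after it. This is a minimal sketch; a production manager would sign the final hash with its key when sealing the archive.

```python
import hashlib

def chain_logs(lines, prev_hash="0" * 64):
    """Build a tamper-evident SHA-256 hash chain over log lines.
    Each entry's hash covers the previous hash, so editing any line
    changes every subsequent hash."""
    entries = []
    for line in lines:
        h = hashlib.sha256((prev_hash + line).encode()).hexdigest()
        entries.append({"line": line, "hash": h})
        prev_hash = h
    return entries

a = chain_logs(["cloned repo", "tests passed"])
b = chain_logs(["cloned repo", "tests FAILED"])   # tampered second line
```

Verification is the same loop: recompute the chain and compare hashes; a mismatch pinpoints the first altered entry.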

Backups and upgrades

  • Back up your database daily and test restores.
  • Store artifacts in an object store (MinIO or S3) and replicate offsite if needed.
  • Automate manager and runner image updates with a canary process. Keep a forced-rollback plan and immutable tags for reproducible runs.

Example: autonomous refactor agent flow

  1. Developer creates a task in the UI: "Refactor X to new API contract" and selects repo and branch.
  2. Manager validates the request, checks policies, and creates task T-1234 in DB.
  3. Manager sends a Matrix message: "New refactor task T-1234 created by alice" with a link.
  4. Manager spins a runner container with read-write mount to a sandboxed clone of the repo and limited network access.
  5. Agent runs: edits files, runs tests, commits to a draft branch, and writes artifacts. All output streamed to manager logs.
  6. If the agent requests to push to remote (write to remote origin), manager creates an approval request in Matrix. A reviewer approves; the push occurs or is enacted by a trusted service account.
  7. Manager publishes final summary and artifact link to Matrix and archives logs.

Scaling paths: from a single server to Kubernetes

Start small with Docker Compose. When you need scale:

  • Move runners to Kubernetes as Jobs, constrained with Pod Security admission (the PodSecurityPolicy replacement) or OPA Gatekeeper policies.
  • Handle GPU scheduling with device plugins (NVIDIA or AMD ROCm) for local LLM acceleration.
  • Use a message queue (RabbitMQ/Redis Streams) between manager and runners for reliability.
  • Federate Matrix rooms across teams or sites for cross-site notifications.

Hardening checklist

  • Encrypt DB at rest and in transit. Rotate DB and Matrix tokens regularly.
  • Run manager and Matrix bot in separate accounts and namespaces.
  • Use HashiCorp Vault or similar for secrets; never store API keys in DB or in container images.
  • Restrict source repo clones to ephemeral sandboxes; never mount /home or other host sensitive paths.
  • Monitor for unexpected egress attempts. Reject tasks that request broad outbound access by default.
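
A default-deny egress check, as applied by a forward proxy in front of the runners, can be sketched like this. The allowlist contents are an illustrative per-task policy, not a recommendation.

```python
from urllib.parse import urlparse

# Per-task egress allowlist (illustrative)
ALLOWED_HOSTS = {"pypi.org", "files.pythonhosted.org"}

def egress_allowed(url: str, allowlist=ALLOWED_HOSTS) -> bool:
    """Default-deny egress check: only explicitly allowlisted hosts
    over HTTPS pass; everything else is rejected and logged."""
    p = urlparse(url)
    return p.scheme == "https" and p.hostname in allowlist

verdicts = [
    egress_allowed("https://pypi.org/simple/requests/"),   # allowed
    egress_allowed("http://pypi.org/simple/"),             # plaintext: rejected
    egress_allowed("https://evil.example.com/exfil"),      # not allowlisted
]
```

Rejections should be published to the Matrix audit room, since an agent probing unexpected destinations is exactly the signal you want humans to see.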

Looking ahead from early 2026, a few trends are clear:

  • Expectation: more teams will choose self-hosted or hybrid agent orchestration to meet privacy and compliance needs. Vendor desktop agents will exist, but enterprise adoption will favor auditable, self-hosted options.
  • Model landscape: efficient open weights and accelerators will keep improving. Running capable models on-premises will become cheaper and more common.
  • Interoperability: federated messaging protocols like Matrix will be the de facto notification layer for self-hosted workflows, because they provide user-managed rooms, bridges, and strong client ecosystem support.
  • Regulation and audits will increase demand for immutable audit trails and approved execution paths for autonomous agents.

Actionable takeaways

  • Start with a minimal manager + Matrix bot + runner pattern on a single server to validate workflows.
  • Enforce least privilege for mounts and network egress from the first day.
  • Use Matrix for approvals and short log summaries to keep humans in the loop without exposing sensitive data.
  • Keep secrets in a vault and never bake them into agent images or container environment variables that persist.
  • Design auditability from day one: sign and archive logs and artifacts for compliance.

Final notes and next steps

Self-hosting an LLM agent manager isn't trivial, but it's now practical for development teams who prioritize privacy and control. The approach in this article — isolating agents with Docker, using Matrix as the notification and approval layer, and enforcing least privilege — gives you a repeatable, auditable, and privacy-preserving alternative to closed desktop copilots.

Want a jump-start? Start a small PoC tonight: spin up a Matrix room, run a minimal manager on a disposable VPS, and test a simple agent that performs read-only analysis of a repo. Gradually add approvals, artifact storage, and stronger sandboxing as you validate the model and workflows.

Call to action

Clone the starter repo, deploy the Docker Compose stack, and join a Matrix room to discuss designs with other self-hosters. A curated starter layout (manager, matrix-bot, and agent-runner with secure defaults and a sample approval workflow) will accelerate your PoC.


Related Topics

#self-hosting #AI #devops