Engram — Self-Hosted Deployment Guide
Deploy Engram on your own infrastructure. A single `docker compose up` gets you a fully functional instance with the API server, background worker, dashboard, and database.
Prerequisites
- Docker Engine 24+ and Docker Compose v2
- A machine with at least 1 CPU core and 512 MB RAM (production: 2+ cores, 2+ GB recommended)
- Ports 8080 (API) and 3000 (dashboard) available, or a reverse proxy in front
- An LLM provider — either Engram-managed API keys (Anthropic + OpenAI) or a BYO endpoint (Ollama, Bedrock, Azure OpenAI)
Quick Start
- Create a directory and a `docker-compose.yml`:

```yaml
services:
  engram:
    image: engram/server:latest
    ports: ["8080:8080"]
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      DATABASE_URL: postgres://engram:changeme@postgres:5432/engram
      JWT_SECRET: replace-with-a-strong-random-string
      ANTHROPIC_API_KEY: sk-ant-...
      OPENAI_API_KEY: sk-...
    restart: unless-stopped

  engram-web:
    image: engram/dashboard:latest
    ports: ["3000:3000"]
    environment:
      # Must be reachable from the user's browser, not the Docker network
      NEXT_PUBLIC_API_URL: http://localhost:8080
    restart: unless-stopped

  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: engram
      POSTGRES_PASSWORD: changeme
      POSTGRES_DB: engram
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U engram"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped

volumes:
  pgdata:
```
- Start the stack:

```shell
docker compose up -d
```

- Open http://localhost:3000 in your browser. Create your first account and workspace.
The server binary runs both the API and the background worker in a single process by default. To run them separately (recommended for larger deployments), see Process Modes.
Configuration Reference
Required Environment Variables
| Variable | Description | Example |
|---|---|---|
| `DATABASE_URL` | PostgreSQL connection string. Must include a database with the pgvector extension. | `postgres://engram:secret@postgres:5432/engram` |
| `JWT_SECRET` | Secret key for signing JWT access and refresh tokens. Use a random string of at least 32 characters. | `a1b2c3d4e5...` |
| `ANTHROPIC_API_KEY` | Anthropic API key for AI completions (gap analysis, doc generation, validation). Required unless all workspaces use BYO LLM. | `sk-ant-api03-...` |
| `OPENAI_API_KEY` | OpenAI API key for text embeddings (semantic search, source mapping). Required unless all workspaces use BYO embeddings. | `sk-proj-...` |
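The `JWT_SECRET` should come from a cryptographically secure source rather than being typed by hand. One quick way to generate a suitable value, assuming `openssl` is installed:

```shell
# 36 random bytes, base64-encoded, yields a 48-character secret
# (comfortably above the 32-character minimum)
openssl rand -base64 36
```

Any CSPRNG works equally well; the important part is that the value is random, unique to this deployment, and kept out of version control.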
Optional Environment Variables
| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | Port the API server listens on. |
| `HOST` | `0.0.0.0` | Bind address. |
| `LOG_LEVEL` | `info` | Log verbosity: `error`, `warn`, `info`, `debug`, `trace`. |
| `LOG_FORMAT` | `json` | Log output format: `json` (production) or `pretty` (development). |
| `REDIS_URL` | (none) | Optional Redis URL for caching compiled outputs. Improves context-serving latency but is not required. |
| `GITHUB_APP_ID` | (none) | GitHub App ID for automated repo deployments. |
| `GITHUB_APP_PRIVATE_KEY` | (none) | GitHub App private key (PEM). Can also be loaded from a file via `GITHUB_APP_PRIVATE_KEY_PATH`. |
| `GITHUB_APP_WEBHOOK_SECRET` | (none) | Webhook secret for verifying GitHub App payloads. |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | (none) | OpenTelemetry collector endpoint for distributed tracing. |
| `METRICS_ENABLED` | `true` | Expose Prometheus metrics at `/metrics`. |
| `MODE` | `both` | Process mode: `api`, `worker`, or `both`. See Process Modes. |
| `CORS_ORIGINS` | `*` | Allowed CORS origins (comma-separated). Set to your dashboard URL in production. |
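Note that setting `REDIS_URL` does not start a Redis server for you. A minimal sketch of adding one to the quick-start compose file (the service name `redis` is an assumption; any reachable Redis instance works):

```yaml
services:
  redis:
    image: redis:7-alpine
    restart: unless-stopped

  engram:
    environment:
      REDIS_URL: redis://redis:6379
```

Shown as a fragment for clarity; in a single compose file you would add the `REDIS_URL` line to the existing `engram` service's `environment` block rather than redeclaring the service.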
Dashboard Environment Variables
| Variable | Default | Description |
|---|---|---|
| `NEXT_PUBLIC_API_URL` | `http://localhost:8080` | URL of the Engram API server, as reachable from the browser. |
Database Setup
Engram requires PostgreSQL 16 (or newer) with the pgvector extension for semantic search and embedding storage.
Using the Bundled PostgreSQL
The quick-start docker-compose.yml above includes a PostgreSQL container with pgvector pre-installed (pgvector/pgvector:pg16). This is the simplest option.
Using an External PostgreSQL Instance
If you prefer to use an existing PostgreSQL server:
- Install pgvector:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```

On managed services (AWS RDS, Google Cloud SQL, Azure), pgvector is available as a supported extension — enable it from your provider's console.

- Create the database and user:

```sql
CREATE USER engram WITH PASSWORD 'your-secure-password';
CREATE DATABASE engram OWNER engram;
\c engram
CREATE EXTENSION IF NOT EXISTS vector;
```

- Point `DATABASE_URL` at your external instance:

```
DATABASE_URL=postgres://engram:your-secure-password@db.example.com:5432/engram
```
Migrations
The server runs database migrations automatically on startup. No manual migration step is required.
LLM Provider Configuration
Engram needs an LLM for completions (gap analysis, document generation, validation) and an embedding model for semantic search. The provider is configured at the workspace level — each workspace can use a different provider.
Engram-Managed (Default)
Set ANTHROPIC_API_KEY and OPENAI_API_KEY as server environment variables. All workspaces use these keys by default. This is the simplest setup.
```
ANTHROPIC_API_KEY=sk-ant-api03-...   # completions (Claude)
OPENAI_API_KEY=sk-proj-...           # embeddings (text-embedding-3-small)
```
BYO — Own API Keys
Individual workspaces can override the server-level keys with their own credentials. Configure this in the dashboard under Workspace Settings > LLM Configuration, or via the API:
```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "anthropic",
      "model": "claude-sonnet-4-20250514",
      "api_key": "sk-ant-..."
    },
    "embedding": {
      "provider": "openai",
      "model": "text-embedding-3-small",
      "api_key": "sk-..."
    }
  }
}
```
BYO — Azure OpenAI
Use `base_url` to point at your Azure OpenAI deployment:

```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "openai",
      "model": "gpt-4o",
      "api_key": "your-azure-key",
      "base_url": "https://your-instance.openai.azure.com/openai/deployments/gpt-4o"
    },
    "embedding": {
      "provider": "openai",
      "model": "text-embedding-3-small",
      "api_key": "your-azure-key",
      "base_url": "https://your-instance.openai.azure.com/openai/deployments/text-embedding-3-small"
    }
  }
}
```
BYO — AWS Bedrock
Use the Bedrock provider with AWS credentials:
```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "bedrock",
      "model": "anthropic.claude-sonnet-4-20250514-v1:0",
      "region": "us-east-1"
    },
    "embedding": {
      "provider": "bedrock",
      "model": "amazon.titan-embed-text-v2:0",
      "region": "us-east-1"
    }
  }
}
```
AWS credentials are resolved from the standard chain: environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`), IAM role, or instance profile.
BYO — Ollama (Air-Gapped / Self-Hosted LLM)
For fully air-gapped deployments with no external API calls:
```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "ollama",
      "model": "llama3.1:70b",
      "base_url": "http://ollama:11434"
    },
    "embedding": {
      "provider": "ollama",
      "model": "nomic-embed-text",
      "base_url": "http://ollama:11434"
    }
  }
}
```
Add Ollama to your `docker-compose.yml`:

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama_data:/root/.ollama
    # GPU passthrough (NVIDIA):
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]

volumes:
  ollama_data:
```
Ollama exposes an OpenAI-compatible API, so any OpenAI-compatible inference server (vLLM, llama.cpp, etc.) works with the same config — just set the `base_url`.
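As an illustrative sketch, a workspace pointed at a vLLM server would reuse the `openai` provider with a custom `base_url`, the same pattern as the Azure example above. The hostname, port, and model names here are placeholders, not defaults:

```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "openai",
      "model": "meta-llama/Llama-3.1-70B-Instruct",
      "api_key": "token-if-your-server-requires-one",
      "base_url": "http://vllm:8000/v1"
    },
    "embedding": {
      "provider": "ollama",
      "model": "nomic-embed-text",
      "base_url": "http://ollama:11434"
    }
  }
}
```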
TLS / Reverse Proxy
In production, place a reverse proxy in front of Engram to handle TLS termination. Below is an nginx example.
nginx Configuration
```nginx
server {
    listen 443 ssl http2;
    server_name engram.example.com;

    ssl_certificate /etc/letsencrypt/live/engram.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/engram.example.com/privkey.pem;

    # API server
    location /api/ {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # SSE support (job progress streaming)
        proxy_buffering off;
        proxy_cache off;
        proxy_read_timeout 300s;
    }

    # Webhook endpoint
    location /webhooks/ {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Health and metrics
    location /health {
        proxy_pass http://127.0.0.1:8080;
    }

    location /metrics {
        proxy_pass http://127.0.0.1:8080;
        # Restrict to internal monitoring
        allow 10.0.0.0/8;
        deny all;
    }

    # Dashboard
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

server {
    listen 80;
    server_name engram.example.com;
    return 301 https://$server_name$request_uri;
}
```
Update the dashboard environment variable to match your public URL:

```
NEXT_PUBLIC_API_URL=https://engram.example.com
```

And restrict CORS origins on the server:

```
CORS_ORIGINS=https://engram.example.com
```
Process Modes
The server binary supports three modes, controlled by the `MODE` environment variable:

| Mode | Description | When to use |
|---|---|---|
| `both` (default) | Runs the API server and background worker in a single process. | Small to medium deployments. |
| `api` | Runs only the API server. Does not process background jobs. | Scale API and worker independently. |
| `worker` | Runs only the background worker. Does not serve HTTP requests. | Dedicated worker node for heavy AI jobs. |
For larger deployments, run separate containers:
```yaml
services:
  engram-api:
    image: engram/server:latest
    environment:
      MODE: api
      DATABASE_URL: postgres://engram:changeme@postgres:5432/engram
      # ... other env vars
    ports: ["8080:8080"]

  engram-worker:
    image: engram/server:latest
    environment:
      MODE: worker
      DATABASE_URL: postgres://engram:changeme@postgres:5432/engram
      ANTHROPIC_API_KEY: sk-ant-...
      OPENAI_API_KEY: sk-...
      # ... other env vars
```
The worker needs LLM API keys (or BYO config); the API server needs them only if workspaces use Engram-managed keys for compilation preview.
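With the services split, worker capacity can be scaled independently of the API. One sketch uses Compose's `deploy.replicas` field (honored by Docker Compose v2 outside Swarm) on the worker service from the example above:

```yaml
services:
  engram-worker:
    deploy:
      replicas: 3
```

Equivalently, `docker compose up -d --scale engram-worker=3` achieves the same without editing the file. Scaling works here because the worker publishes no fixed host ports; the API service, which binds port 8080, cannot be scaled this way on a single host.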
Backup and Restore
Database Backup
All Engram state lives in PostgreSQL. Back up the database regularly.
```shell
# Backup
docker exec engram-postgres pg_dump -U engram engram > engram_backup_$(date +%Y%m%d).sql

# Compressed backup
docker exec engram-postgres pg_dump -U engram -Fc engram > engram_backup_$(date +%Y%m%d).dump
```
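To automate this, a cron entry can wrap the compressed-backup command. This is a sketch: the container name follows the quick start, the backup path is an assumption, and note that `%` must be escaped as `\%` inside crontab lines:

```
# /etc/cron.d/engram-backup: nightly compressed backup at 02:30
30 2 * * * root docker exec engram-postgres pg_dump -U engram -Fc engram > /var/backups/engram_backup_$(date +\%Y\%m\%d).dump
```

Pair this with off-host copying and periodic restore tests; a backup that has never been restored is not yet a backup.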
Restore
```shell
# From SQL dump
docker exec -i engram-postgres psql -U engram engram < engram_backup_20260308.sql

# From compressed dump
docker exec -i engram-postgres pg_restore -U engram -d engram --clean engram_backup_20260308.dump
```
What to Back Up
- PostgreSQL data — all workspace config, documents, versions, source connections, compiled outputs, and job history.
- Environment file — your `.env` or `docker-compose.yml` with secrets (`JWT_SECRET`, API keys).
- GitHub App private key — if you configured a GitHub App for automated deployments.
There is no local filesystem state. Redis (if used) is a cache and does not need to be backed up.
Upgrading
- Pull the latest images:

```shell
docker compose pull
```

- Restart the stack:

```shell
docker compose up -d
```
Database migrations run automatically on startup. The server will not start serving requests until all migrations have completed.
Version Pinning
For production stability, pin to a specific version tag instead of `latest`:

```yaml
services:
  engram:
    image: engram/server:1.2.0
  engram-web:
    image: engram/dashboard:1.2.0
```
Rolling Back
If an upgrade causes issues:
- Stop the stack: `docker compose down`
- Restore the database from your pre-upgrade backup.
- Update `docker-compose.yml` to the previous version tag.
- Start: `docker compose up -d`
Database migrations are forward-only. Rolling back to a previous server version requires restoring a database backup taken before the upgrade.
Troubleshooting
Server fails to start with "connection refused" to PostgreSQL
The PostgreSQL container may not be ready yet. The healthcheck in the compose file and `depends_on` with `condition: service_healthy` handle this. If you are using an external database, verify the `DATABASE_URL` is correct and that the database accepts connections from the server's network.
"pgvector extension not found"
The `pgvector/pgvector:pg16` image includes the extension. If using an external database, run:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```
This requires superuser privileges. On managed services, enable the extension from your provider's console.
Context serving is slow (>10ms)
- Enable Redis caching by setting `REDIS_URL`.
- Verify the database has appropriate indexes (migrations create them automatically).
- Check that the server and database are in the same network/region.
AI pipeline jobs fail or timeout
- Verify LLM API keys are valid and have sufficient quota.
- Check server logs: `docker compose logs engram --tail 100`
- For BYO endpoints, verify the `base_url` is reachable from the server container.
- Ollama models must be pulled before use: `docker exec ollama ollama pull llama3.1:70b`
Dashboard cannot reach the API
The `NEXT_PUBLIC_API_URL` must be reachable from the user's browser, not from inside the Docker network. If you access the dashboard at https://engram.example.com, set:

```
NEXT_PUBLIC_API_URL=https://engram.example.com
```

Not `http://engram:8080` (that is the internal Docker hostname).
GitHub App webhook delivery fails
- Verify `GITHUB_APP_WEBHOOK_SECRET` matches the secret configured in your GitHub App settings.
- The webhook URL must be publicly reachable (e.g., https://engram.example.com/webhooks/github).
- Check the GitHub App's "Recent Deliveries" page for error details.
Checking server health
```shell
curl http://localhost:8080/health
```

Returns `200 OK` with version info when the server is running and the database connection is healthy.
Resource Requirements
| Component | CPU | RAM | Disk |
|---|---|---|---|
| engram server (API + worker) | 1 core minimum, 2+ recommended | 50 MB baseline, ~200 MB under load | Negligible (~20 MB binary) |
| engram dashboard | 0.5 core | 128 MB | ~100 MB (Next.js) |
| PostgreSQL | 1 core minimum, 2+ recommended | 256 MB minimum, 1+ GB recommended | Depends on data volume (typically <1 GB for most teams) |
| Redis (optional) | 0.5 core | 64 MB | Negligible |
The Rust server binary is statically linked (~20 MB), starts in under 100ms, and uses no garbage collector. Memory usage scales with concurrent connections and active background jobs, not with data volume.
For a team of 50 people with ~100 guideline documents: a single 2-core / 2 GB machine handles everything comfortably, including the database.
Production Checklist
- Set a strong, unique `JWT_SECRET` (at least 32 random characters)
- Use a dedicated PostgreSQL password (not the default `changeme`)
- Configure TLS termination via reverse proxy
- Set `CORS_ORIGINS` to your dashboard URL
- Set up automated database backups
- Pin Docker image versions (avoid `latest` in production)
- Configure log aggregation (`LOG_FORMAT=json`)
- Set up monitoring on `/health` and `/metrics` endpoints
- Restrict the `/metrics` endpoint to your monitoring network
- Test LLM connectivity (run a gap analysis after setup)