Engram — Self-Hosted Deployment Guide
Deploy Engram on your own infrastructure. A single `docker compose up` gets you a fully functional instance with the API server, background worker, dashboard, and database.
Prerequisites
- Docker Engine 24+ and Docker Compose v2
- A machine with at least 1 CPU core and 512 MB RAM (production: 2+ cores, 2+ GB recommended)
- Ports 8080 (API) and 3000 (dashboard) available, or a reverse proxy in front
- An LLM provider — either Engram-managed API keys (Anthropic + OpenAI) or a BYO endpoint (Ollama, Bedrock, Azure OpenAI)
Quick Start
- Create a directory and a `docker-compose.yml`:

```yaml
services:
  engram:
    image: engram/server:latest
    ports: ["8080:8080"]
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      DATABASE_URL: postgres://engram:changeme@postgres:5432/engram
      JWT_SECRET: replace-with-a-strong-random-string
      ANTHROPIC_API_KEY: sk-ant-...
      OPENAI_API_KEY: sk-...
    restart: unless-stopped

  engram-web:
    image: engram/dashboard:latest
    ports: ["3000:3000"]
    environment:
      # Must be reachable from the user's browser, not the Docker network
      NEXT_PUBLIC_API_URL: http://localhost:8080
    restart: unless-stopped

  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_USER: engram
      POSTGRES_PASSWORD: changeme
      POSTGRES_DB: engram
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U engram"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: unless-stopped

volumes:
  pgdata:
```
- Start the stack:

```shell
docker compose up -d
```

- Open http://localhost:3000 in your browser. Create your first account and workspace.
The server binary runs both the API and the background worker in a single process by default. To run them separately (recommended for larger deployments), see Process Modes.
Configuration Reference
Required Environment Variables
| Variable | Description | Example |
|---|---|---|
| `DATABASE_URL` | PostgreSQL connection string. Must include a database with the pgvector extension. | `postgres://engram:secret@postgres:5432/engram` |
| `JWT_SECRET` | Secret key for signing JWT access and refresh tokens. Use a random string of at least 32 characters. | `a1b2c3d4e5...` |
| `ANTHROPIC_API_KEY` | Anthropic API key for AI completions (gap analysis, doc generation, validation). Required unless all workspaces use BYO LLM. | `sk-ant-api03-...` |
| `OPENAI_API_KEY` | OpenAI API key for text embeddings (semantic search, source mapping). Required unless all workspaces use BYO embeddings. | `sk-proj-...` |
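The `JWT_SECRET` should come from a cryptographically secure source rather than being typed by hand. One quick way to generate a suitable value, assuming `openssl` is installed:

```shell
# 36 random bytes, base64-encoded, yields a 48-character secret
# (comfortably above the 32-character minimum)
openssl rand -base64 36
```

Any CSPRNG works equally well; the important part is that the value is random, unique to this deployment, and kept out of version control.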
Optional Environment Variables
| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | Port the API server listens on. |
| `HOST` | `0.0.0.0` | Bind address. |
| `LOG_LEVEL` | `info` | Log verbosity: `error`, `warn`, `info`, `debug`, `trace`. |
| `LOG_FORMAT` | `json` | Log output format: `json` (production) or `pretty` (development). |
| `REDIS_URL` | (none) | Optional Redis URL for caching compiled outputs. Improves context-serving latency but is not required. |
| `GITHUB_APP_ID` | (none) | GitHub App ID for automated repo deployments. |
| `GITHUB_APP_PRIVATE_KEY` | (none) | GitHub App private key (PEM). Can also be loaded from a file via `GITHUB_APP_PRIVATE_KEY_PATH`. |
| `GITHUB_APP_WEBHOOK_SECRET` | (none) | Webhook secret for verifying GitHub App payloads. |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | (none) | OpenTelemetry collector endpoint for distributed tracing. |
| `METRICS_ENABLED` | `true` | Expose Prometheus metrics at `/metrics`. |
| `MODE` | `both` | Process mode: `api`, `worker`, or `both`. See Process Modes. |
| `CORS_ORIGINS` | `*` | Allowed CORS origins (comma-separated). Set to your dashboard URL in production. |
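Note that setting `REDIS_URL` does not start a Redis server for you. A minimal sketch of adding one to the quick-start compose file (the service name `redis` is an assumption; any reachable Redis instance works):

```yaml
services:
  redis:
    image: redis:7-alpine
    restart: unless-stopped

  engram:
    environment:
      REDIS_URL: redis://redis:6379
```

Shown as a fragment for clarity; in a single compose file you would add the `REDIS_URL` line to the existing `engram` service's `environment` block rather than redeclaring the service.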
Dashboard Environment Variables
| Variable | Default | Description |
|---|---|---|
| `NEXT_PUBLIC_API_URL` | `http://localhost:8080` | URL of the Engram API server, as reachable from the browser. |
Database Setup
Engram requires PostgreSQL 16 (or newer) with the pgvector extension for semantic search and embedding storage.
Using the Bundled PostgreSQL
The quick-start docker-compose.yml above includes a PostgreSQL container with pgvector pre-installed (pgvector/pgvector:pg16). This is the simplest option.
Using an External PostgreSQL Instance
If you prefer to use an existing PostgreSQL server:
- Install pgvector:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```

On managed services (AWS RDS, Google Cloud SQL, Azure), pgvector is available as a supported extension — enable it from your provider's console.

- Create the database and user:

```sql
CREATE USER engram WITH PASSWORD 'your-secure-password';
CREATE DATABASE engram OWNER engram;
\c engram
CREATE EXTENSION IF NOT EXISTS vector;
```

- Point `DATABASE_URL` at your external instance:

```
DATABASE_URL=postgres://engram:your-secure-password@db.example.com:5432/engram
```
Migrations
The server runs database migrations automatically on startup. No manual migration step is required.
LLM Provider Configuration
Engram needs an LLM for completions (gap analysis, document generation, validation) and an embedding model for semantic search. The provider is configured at the workspace level — each workspace can use a different provider.
Engram-Managed (Default)
Set ANTHROPIC_API_KEY and OPENAI_API_KEY as server environment variables. All workspaces use these keys by default. This is the simplest setup.
```
ANTHROPIC_API_KEY=sk-ant-api03-...   # completions (Claude)
OPENAI_API_KEY=sk-proj-...           # embeddings (text-embedding-3-small)
```
BYO — Own API Keys
Individual workspaces can override the server-level keys with their own credentials. Configure this in the dashboard under Workspace Settings > LLM Configuration, or via the API:
```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "anthropic",
      "model": "claude-sonnet-4-20250514",
      "api_key": "sk-ant-..."
    },
    "embedding": {
      "provider": "openai",
      "model": "text-embedding-3-small",
      "api_key": "sk-..."
    }
  }
}
```
BYO — Azure OpenAI
Use `base_url` to point at your Azure OpenAI deployment:

```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "openai",
      "model": "gpt-4o",
      "api_key": "your-azure-key",
      "base_url": "https://your-instance.openai.azure.com/openai/deployments/gpt-4o"
    },
    "embedding": {
      "provider": "openai",
      "model": "text-embedding-3-small",
      "api_key": "your-azure-key",
      "base_url": "https://your-instance.openai.azure.com/openai/deployments/text-embedding-3-small"
    }
  }
}
```
BYO — AWS Bedrock
Use the Bedrock provider with AWS credentials:
```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "bedrock",
      "model": "anthropic.claude-sonnet-4-20250514-v1:0",
      "region": "us-east-1"
    },
    "embedding": {
      "provider": "bedrock",
      "model": "amazon.titan-embed-text-v2:0",
      "region": "us-east-1"
    }
  }
}
```
AWS credentials are resolved from the standard chain: environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`), IAM role, or instance profile.
BYO — Ollama (Air-Gapped / Self-Hosted LLM)
For fully air-gapped deployments with no external API calls:
```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "ollama",
      "model": "llama3.1:70b",
      "base_url": "http://ollama:11434"
    },
    "embedding": {
      "provider": "ollama",
      "model": "nomic-embed-text",
      "base_url": "http://ollama:11434"
    }
  }
}
```
Add Ollama to your `docker-compose.yml`:

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama_data:/root/.ollama
    # GPU passthrough (NVIDIA):
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]

volumes:
  ollama_data:
```
Ollama exposes an OpenAI-compatible API, so any OpenAI-compatible inference server (vLLM, llama.cpp, etc.) works with the same config — just set the `base_url`.
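As an illustrative sketch, a workspace pointed at a vLLM server would reuse the `openai` provider with a custom `base_url`, the same pattern as the Azure example above. The hostname, port, and model names here are placeholders, not defaults:

```json
{
  "llm": {
    "mode": "byo",
    "completion": {
      "provider": "openai",
      "model": "meta-llama/Llama-3.1-70B-Instruct",
      "api_key": "token-if-your-server-requires-one",
      "base_url": "http://vllm:8000/v1"
    },
    "embedding": {
      "provider": "ollama",
      "model": "nomic-embed-text",
      "base_url": "http://ollama:11434"
    }
  }
}
```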
TLS / Reverse Proxy
In production, place a reverse proxy in front of Engram to handle TLS termination. Below is an nginx example.
nginx Configuration
```nginx
server {
    listen 443 ssl http2;
    server_name engram.example.com;

    ssl_certificate /etc/letsencrypt/live/engram.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/engram.example.com/privkey.pem;

    # API server
    location /api/ {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # SSE support (job progress streaming)
        proxy_buffering off;
        proxy_cache off;
        proxy_read_timeout 300s;
    }

    # Webhook endpoint
    location /webhooks/ {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Health and metrics
    location /health {
        proxy_pass http://127.0.0.1:8080;
    }

    location /metrics {
        proxy_pass http://127.0.0.1:8080;
        # Restrict to internal monitoring
        allow 10.0.0.0/8;
        deny all;
    }

    # Dashboard
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

server {
    listen 80;
    server_name engram.example.com;
    return 301 https://$server_name$request_uri;
}
```
Update the dashboard environment variable to match your public URL:

```
NEXT_PUBLIC_API_URL=https://engram.example.com
```

And restrict CORS origins on the server:

```
CORS_ORIGINS=https://engram.example.com
```
Process Modes
The server binary supports three modes, controlled by the `MODE` environment variable:

| Mode | Description | When to use |
|---|---|---|
| `both` (default) | Runs the API server and background worker in a single process. | Small to medium deployments. |
| `api` | Runs only the API server. Does not process background jobs. | Scale API and worker independently. |
| `worker` | Runs only the background worker. Does not serve HTTP requests. | Dedicated worker node for heavy AI jobs. |
For larger deployments, run separate containers:
```yaml
services:
  engram-api:
    image: engram/server:latest
    environment:
      MODE: api
      DATABASE_URL: postgres://engram:changeme@postgres:5432/engram
      # ... other env vars
    ports: ["8080:8080"]

  engram-worker:
    image: engram/server:latest
    environment:
      MODE: worker
      DATABASE_URL: postgres://engram:changeme@postgres:5432/engram
      ANTHROPIC_API_KEY: sk-ant-...
      OPENAI_API_KEY: sk-...
      # ... other env vars
```
The worker needs LLM API keys (or BYO config); the API server needs them only if workspaces use Engram-managed keys for compilation preview.
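With the services split, worker capacity can be scaled independently of the API. One sketch uses Compose's `deploy.replicas` field (honored by Docker Compose v2 outside Swarm) on the worker service from the example above:

```yaml
services:
  engram-worker:
    deploy:
      replicas: 3
```

Equivalently, `docker compose up -d --scale engram-worker=3` achieves the same without editing the file. Scaling works here because the worker publishes no fixed host ports; the API service, which binds port 8080, cannot be scaled this way on a single host.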
Backup and Restore
Database Backup
All Engram state lives in PostgreSQL. Back up the database regularly.
```shell
# Backup
docker exec engram-postgres pg_dump -U engram engram > engram_backup_$(date +%Y%m%d).sql

# Compressed backup
docker exec engram-postgres pg_dump -U engram -Fc engram > engram_backup_$(date +%Y%m%d).dump
```
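To automate this, a cron entry can wrap the compressed-backup command. This is a sketch: the container name follows the quick start, the backup path is an assumption, and note that `%` must be escaped as `\%` inside crontab lines:

```
# /etc/cron.d/engram-backup: nightly compressed backup at 02:30
30 2 * * * root docker exec engram-postgres pg_dump -U engram -Fc engram > /var/backups/engram_backup_$(date +\%Y\%m\%d).dump
```

Pair this with off-host copying and periodic restore tests; a backup that has never been restored is not yet a backup.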
Restore
```shell
# From SQL dump
docker exec -i engram-postgres psql -U engram engram < engram_backup_20260308.sql

# From compressed dump
docker exec -i engram-postgres pg_restore -U engram -d engram --clean engram_backup_20260308.dump
```
What to Back Up
- PostgreSQL data — all workspace config, documents, versions, source connections, compiled outputs, and job history.
- Environment file — your `.env` or `docker-compose.yml` with secrets (`JWT_SECRET`, API keys).
- GitHub App private key — if you configured a GitHub App for automated deployments.
There is no local filesystem state. Redis (if used) is a cache and does not need to be backed up.
Upgrading
- Pull the latest images:

```shell
docker compose pull
```

- Restart the stack:

```shell
docker compose up -d
```
Database migrations run automatically on startup. The server will not start serving requests until all migrations have completed.
Version Pinning
For production stability, pin to a specific version tag instead of `latest`:

```yaml
services:
  engram:
    image: engram/server:1.2.0
  engram-web:
    image: engram/dashboard:1.2.0
```
Rolling Back
If an upgrade causes issues:
- Stop the stack: `docker compose down`
- Restore the database from your pre-upgrade backup.
- Update `docker-compose.yml` to the previous version tag.
- Start: `docker compose up -d`
Database migrations are forward-only. Rolling back to a previous server version requires restoring a database backup taken before the upgrade.
Troubleshooting
Server fails to start with "connection refused" to PostgreSQL
The PostgreSQL container may not be ready yet. The healthcheck in the compose file and `depends_on` with `condition: service_healthy` handle this. If you are using an external database, verify the `DATABASE_URL` is correct and that the database accepts connections from the server's network.
"pgvector extension not found"
The `pgvector/pgvector:pg16` image includes the extension. If using an external database, run:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
```
This requires superuser privileges. On managed services, enable the extension from your provider's console.
Context serving is slow (>10ms)
- Enable Redis caching by setting `REDIS_URL`.
- Verify the database has appropriate indexes (migrations create them automatically).
- Check that the server and database are in the same network/region.
AI pipeline jobs fail or timeout
- Verify LLM API keys are valid and have sufficient quota.
- Check server logs: `docker compose logs engram --tail 100`
- For BYO endpoints, verify the `base_url` is reachable from the server container.
- Ollama models must be pulled before use: `docker exec ollama ollama pull llama3.1:70b`
Dashboard cannot reach the API
The `NEXT_PUBLIC_API_URL` must be reachable from the user's browser, not from inside the Docker network. If you access the dashboard at https://engram.example.com, set:

```
NEXT_PUBLIC_API_URL=https://engram.example.com
```

Not `http://engram:8080` (that is the internal Docker hostname).
GitHub App webhook delivery fails
- Verify `GITHUB_APP_WEBHOOK_SECRET` matches the secret configured in your GitHub App settings.
- The webhook URL must be publicly reachable (e.g., https://engram.example.com/webhooks/github).
- Check the GitHub App's "Recent Deliveries" page for error details.
Checking server health
```shell
curl http://localhost:8080/health
```

Returns `200 OK` with version info when the server is running and the database connection is healthy.
Resource Requirements
| Component | CPU | RAM | Disk |
|---|---|---|---|
| engram server (API + worker) | 1 core minimum, 2+ recommended | 50 MB baseline, ~200 MB under load | Negligible (~20 MB binary) |
| engram dashboard | 0.5 core | 128 MB | ~100 MB (Next.js) |
| PostgreSQL | 1 core minimum, 2+ recommended | 256 MB minimum, 1+ GB recommended | Depends on data volume (typically <1 GB for most teams) |
| Redis (optional) | 0.5 core | 64 MB | Negligible |
The Rust server binary is statically linked (~20 MB), starts in under 100ms, and uses no garbage collector. Memory usage scales with concurrent connections and active background jobs, not with data volume.
For a team of 50 people with ~100 guideline documents: a single 2-core / 2 GB machine handles everything comfortably, including the database.
Production Checklist
- Set a strong, unique `JWT_SECRET` (at least 32 random characters)
- Use a dedicated PostgreSQL password (not the default `changeme`)
- Configure TLS termination via reverse proxy
- Set `CORS_ORIGINS` to your dashboard URL
- Set up automated database backups
- Pin Docker image versions (avoid `latest` in production)
- Configure log aggregation (`LOG_FORMAT=json`)
- Set up monitoring on `/health` and `/metrics` endpoints
- Restrict the `/metrics` endpoint to your monitoring network
- Test LLM connectivity (run a gap analysis after setup)