LiteLLM with PostgreSQL Backend

← Back to Index

IP: 192.168.1.XXX | Port: 4000 | Stack: dell/litellm/ | Database: PostgreSQL (172.X.X.X:5432)


Overview

LiteLLM production deployment with PostgreSQL backend for persistent storage of:

  • Virtual API keys with budgets
  • Spend tracking and cost analytics
  • User management and authentication
  • Team-based access control

Previous State: LiteLLM running with Redis caching only (in-memory/virtual keys lost on restart)
Current State: Full database-backed enterprise proxy with persistent storage


Access


Location

/opt/litellm/
├── docker-compose.yaml    # LiteLLM with database image
├── config.yaml            # Model configuration
├── data/                  # (if using SQLite fallback)
└── .env                   # Database credentials

Architecture

User Request → LiteLLM (192.168.1.XXX:4000)
                    │
                    ├──► PostgreSQL (172.X.X.X:5432) - Keys, Spend, Users
                    │
                    └──► Redis (172.X.X.X:6379) - Cache, Rate Limits
                    │
                    └──► LLM Providers (Ollama, MLX, Cloud APIs)

Docker Compose

services:
  litellm:
    image: ghcr.io/berriai/litellm-database:main-latest
    container_name: skip-dashboard-litellm
    restart: unless-stopped
    ports:
      - "4000:4000"
    environment:
      - LITELLM_MASTER_KEY=${LITELLM_MASTER_KEY}
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_HOST=172.X.X.X
      - REDIS_PORT=6379
      - ANTHROPIC_API_KEY=[REDACTED]
      - MOONSHOT_API_KEY=[REDACTED]
    volumes:
      - ./config.yaml:/app/config.yaml
    security_opt:
      - no-new-privileges:true

Infrastructure Integration

PostgreSQL (Existing)

  • Host: 172.X.X.X (litellm-postgres container)
  • Database: litellm
  • User: litellm
  • Connection: postgresql://litellm:***@172.X.X.X:5432/litellm

Redis (Existing)

  • Host: 172.X.X.X (redis-litellm container)
  • Purpose: Caching (10-minute TTL), rate limiting
  • Reuse: Shared with LiteLLM caching layer

Key Configuration Notes

Database Image

  • Uses litellm-database tag (not base litellm image)
  • Includes Prisma ORM for database migrations
  • Automatically applies 77+ migrations on startup

Redis Caching

  • TTL: 600 seconds (10 minutes)
  • Namespace: litellm.cache
  • Types: acompletion, aembedding

API Keys

  • Master Key: For admin operations (key generation, user management)
  • Virtual Keys: End-user keys with budgets and model restrictions

Configuration (config.yaml)

model_list:
  # Local Ollama (Dell Server)
  - model_name: qwen-dell
    litellm_params:
      model: ollama/qwen2.5:14b-instruct-q4_K_M
      api_base: http://192.168.1.XXX:11434
  
  # Mac Studio MLX
  - model_name: qwen-mac-mlx
    litellm_params:
      model: openai/qwen2.5-14b-mlx
      api_base: http://192.168.1.XXX:8001
      api_key: [REDACTED]
  
  # Cloud Providers
  - model_name: anthropic-fallback
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: [REDACTED]
  
  - model_name: kimi-k2.5
    litellm_params:
      model: moonshot/kimi-k2.5
      api_key: [REDACTED]
 
router_settings:
  routing_strategy: "usage-based-routing"
  allowed_fails: 3
  cooldown_time: 60
  num_retries: 2
 
litellm_settings:
  cache: true
  cache_params:
    type: redis
    host: 172.X.X.X
    port: 6379
    ttl: 600
    namespace: "litellm.cache"
 
general_settings:
  master_key: ${LITELLM_MASTER_KEY}
  track_cost: true
  track_spend: true

Database Features Enabled

Virtual API Keys ✅

# Generate key with budget
curl -X POST http://192.168.1.XXX:4000/key/generate \
  -H "Authorization: Bearer $MASTER_KEY" \
  -d '{
    "models": ["qwen-dell"],
    "max_budget": 10.0,
    "user_id": "test-user"
  }'

Spend Tracking ✅

  • Real-time cost tracking per request
  • User-level spend aggregation
  • Model-level cost breakdown
  • Budget enforcement

User Management ✅

  • User creation and management
  • Team-based access control
  • Role-based permissions

Available Models (11 Total)

ModelProviderType
qwen-dellOllama (Dell)Local
codellama-dellOllama (Dell)Local
mistral-dellOllama (Dell)Local
gemma2-dellOllama (Dell)Local
llama3.2-dellOllama (Dell)Local
qwen-mac-mlxMLX (Mac Studio)Local
phi4-mac-mlxMLX (Mac Studio)Local
deepseek-fallbackDeepSeekCloud (pending key)
anthropic-fallbackAnthropicCloud ✅
kimi-k2.5MoonshotCloud ✅
kimi-latestMoonshotCloud ✅

API Endpoints

EndpointMethodAuthDescription
/healthGETMaster KeyHealth check
/v1/modelsGETAny KeyList models
/v1/chat/completionsPOSTAny KeyChat completion
/key/generatePOSTMaster KeyCreate virtual key
/user/infoGETMaster KeyUser details

Migration from In-Memory

Previous State

  • Virtual keys lost on container restart
  • No spend tracking across restarts
  • No user management

Migration Steps

  1. ✅ Deployed PostgreSQL container
  2. ✅ Switched to litellm-database image
  3. ✅ Applied 77 Prisma migrations
  4. ✅ Configured DATABASE_URL environment variable
  5. ✅ Verified virtual key persistence

Security Considerations

Database Security

  • PostgreSQL user restricted to litellm database
  • Password stored in .env (chmod 600)
  • Network binding: Internal Docker network only
  • UFW firewall: Port 5432 restricted to 192.168.1.XXX/24

API Key Hierarchy

  1. Master Key - Admin operations only
  2. Virtual Keys - End-user keys with budgets
  3. Provider Keys - Never exposed to end users

Backup Strategy

Database Backup

# Backup PostgreSQL
docker exec litellm-postgres pg_dump -U litellm litellm > litellm-db-backup.sql
 
# Restore
docker exec -i litellm-postgres psql -U lithentik litellm < litellm-db-backup.sql

Configuration Backup

# Backup config and .env
tar czf litellm-config-backup.tar.gz config.yaml .env

Troubleshooting

Database Connection Errors

  • Check PostgreSQL is running: docker ps | grep postgres
  • Verify DATABASE_URL format
  • Check pg_hba.conf allows connections
  • Test: docker exec litellm-postgres psql -U litellm -c "SELECT 1"

Migration Failures

  • Check logs: docker logs skip-dashboard-litellm
  • Prisma may need manual intervention
  • Reset option: Drop and recreate database

Redis Connection Issues

  • Verify Redis running: docker ps | grep redis
  • Test: docker exec redis-litellm redis-cli ping

Performance

Database Connection Pool

  • Default: 10 connections
  • Configurable via environment variables

Cache Hit Rate

  • Check: /cache/ping endpoint
  • Expected: >80% for repeated queries

Query Latency

  • Typical: <50ms for cached responses
  • Database queries: <100ms


Deployment Date

2026-02-16 - Migrated from in-memory to PostgreSQL backend


Future Enhancements

Phase 2: Enterprise Features

  • Team-based billing
  • Custom model aliases
  • Request/response logging
  • Advanced rate limiting

Phase 3: Monitoring

  • Prometheus metrics export
  • Grafana dashboard for spend tracking
  • Alerting for budget thresholds
  • Cache hit rate monitoring

Phase 4: High Availability

  • PostgreSQL replication
  • Redis Sentinel
  • LiteLLM load balancing