Advanced LLM-MCP Configuration Guide

This guide provides detailed information on advanced configuration options for the LLM Platform Model Context Protocol (LLM-MCP) server. These configurations allow you to fine-tune performance, security, resource usage, and behavior for enterprise production environments.

Configuration Methods
Server Configuration
Security Configuration
Storage Configuration
Tool Registry Configuration
Resource Management
Logging and Monitoring
Caching Configuration
High Availability Configuration
Advanced gRPC Configuration
Environment-Specific Configurations
Complete Configuration Example

Configuration Methods

LLM-MCP supports multiple configuration methods, which are processed in the following order (later methods override earlier ones):

1. Default configuration: Built-in defaults
2. Configuration file: JSON or YAML file
3. Environment variables: System environment variables
4. Command line arguments: Passed when starting the server
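
The precedence order above can be pictured as a deep merge applied from the lowest-priority layer to the highest. The following is a minimal sketch under that assumption, not LLM-MCP's actual loader; `deepMerge` and `resolveConfig` are hypothetical names, and the sample values are illustrative only.

```javascript
// Hypothetical helpers; the real loader may merge differently.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Plain objects merge key by key; arrays and scalars are replaced wholesale.
function deepMerge(base, override) {
  const result = { ...base };
  for (const [key, value] of Object.entries(override)) {
    result[key] =
      isPlainObject(value) && isPlainObject(base[key])
        ? deepMerge(base[key], value)
        : value;
  }
  return result;
}

// Layers are passed lowest priority first, so later layers win.
function resolveConfig(defaults, fileConfig, envConfig, cliConfig) {
  return [defaults, fileConfig, envConfig, cliConfig].reduce(
    (merged, layer) => deepMerge(merged, layer),
    {}
  );
}

const resolved = resolveConfig(
  { server: { host: "0.0.0.0", port: 3001 } },   // built-in defaults
  { server: { port: 4000, compression: true } }, // configuration file
  { server: { port: 5000 } },                    // environment variables
  {}                                             // command line arguments
);
console.log(resolved.server); // port comes from the environment layer
```

In this sketch a higher-priority layer only replaces the keys it actually sets, so the `host` default and the file's `compression` setting survive the environment override of `port`.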

Configuration File

The recommended approach is to use a configuration file:

# Start LLM-MCP with a specific configuration file
node src/index.js --config /path/to/config.json

# Using the LLM-MCP CLI
llm-mcp start --config /path/to/config.json

Environment Variables

Environment variables can override file configuration:

# Set the server port
export LLM_MCP_SERVER_PORT=3001

# Set MongoDB URI
export LLM_MCP_STORAGE_MONGODB_URI=mongodb://username:password@host:port/database

# Start LLM-MCP (will use environment variables)
node src/index.js

Configuration Validation

LLM-MCP validates your configuration at startup. You can also manually validate a configuration file:

# Validate a configuration file
llm-mcp validate --config /path/to/config.json

# Validate and show expanded configuration with defaults
llm-mcp validate --config /path/to/config.json --verbose

Server Configuration

Advanced server configuration options allow you to control networking, concurrency, and request handling:

{
  "server": {
    "host": "0.0.0.0",             // Server bind address (0.0.0.0 for all interfaces)
    "port": 3001,                  // HTTP/REST API port
    "grpcPort": 3002,              // gRPC API port
    "socketPath": null,            // Unix socket path (alternative to TCP)
    "maxConnections": 1000,        // Maximum concurrent connections
    "requestTimeout": 30000,       // Global request timeout in milliseconds
    "connectionIdleTimeout": 60000, // Idle connection timeout
    "maxRequestSize": "10mb",      // Maximum request body size
    "compression": true,           // Enable HTTP compression
    "trustProxy": true,            // Trust X-Forwarded-* headers
    "cors": {                      // CORS configuration
      "enabled": true,
      "origin": ["*"],             // Allowed origins (use specific domains in production)
      "methods": ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
      "allowedHeaders": ["Content-Type", "Authorization"],
      "exposedHeaders": ["X-Request-ID"],
      "credentials": false,        // Allow cookies in cross-origin requests
      "maxAge": 86400              // CORS preflight cache time in seconds
    },
    "helmet": {                    // Security headers (Helmet configuration)
      "enabled": true,
      "contentSecurityPolicy": {
        "directives": {
          "default-src": ["'self'"],
          "script-src": ["'self'"]
        }
      },
      "hsts": {
        "maxAge": 15552000,        // 180 days
        "includeSubDomains": true
      }
    },
    "rateLimiting": {              // Global rate limiting
      "enabled": true,
      "windowMs": 60000,           // 1 minute
      "max": 100,                  // 100 requests per minute
      "standardHeaders": true,     // Return rate limit info in headers
      "skipSuccessfulRequests": false, // Count all requests
      "keyGenerator": "ip",        // Options: "ip", "user", "custom"
      "skip": {                    // Skip rate limiting for certain paths
        "paths": ["/health", "/metrics"],
        "ips": ["127.0.0.1"]
      }
    },
    "cluster": {                   // Node.js cluster configuration
      "enabled": true,
      "workers": "auto",           // "auto" or specific number
      "restartOnFailure": true     // Restart workers on crash
    }
  }
}
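
To make the `rateLimiting` options concrete, here is a minimal sketch of a fixed-window limiter keyed by client IP. The guide does not state which algorithm the server uses, so the fixed-window behaviour is an assumption, and `createRateLimiter` is a hypothetical name. Only `windowMs`, `max`, and `skip` are taken from the configuration above.

```javascript
// Hypothetical fixed-window limiter driven by the rateLimiting options.
// The clock is injectable so the window logic can be exercised in tests.
function createRateLimiter({ windowMs, max, skip = {} }, now = Date.now) {
  const windows = new Map(); // client IP -> { start, count }
  const skipPaths = new Set(skip.paths || []);
  const skipIps = new Set(skip.ips || []);

  return function allow({ ip, path }) {
    if (skipPaths.has(path) || skipIps.has(ip)) return true;
    const time = now();
    const entry = windows.get(ip);
    if (!entry || time - entry.start >= windowMs) {
      windows.set(ip, { start: time, count: 1 }); // open a new window
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}

const limiter = createRateLimiter({
  windowMs: 60000,
  max: 100,
  skip: { paths: ["/health", "/metrics"], ips: ["127.0.0.1"] },
});
console.log(limiter({ ip: "203.0.113.7", path: "/api/v1/tools" })); // true
```

The map in this sketch grows with the number of distinct clients; a production limiter would evict expired windows or keep the counters in a shared store such as Redis.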

Advanced Host Configuration

For production systems with multiple network interfaces:

{
  "server": {
    "interfaces": [
      {
        "type": "http",
        "host": "192.168.1.10",    // Internal network interface
        "port": 3001,
        "allowedIps": ["192.168.1.0/24"]
      },
      {
        "type": "grpc",
        "host": "10.0.0.5",        // Different interface for gRPC
        "port": 3002,
        "allowedIps": ["10.0.0.0/16"]
      }
    ],
    "preferIpv4": true            // Prefer IPv4 over IPv6
  }
}
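
Each interface's `allowedIps` list uses CIDR notation. As an illustration of what such a check involves, the sketch below tests an IPv4 address against a list of CIDR ranges. It is not the server's implementation: `ipv4ToInt`, `inCidr`, and `isAllowed` are hypothetical helpers, and IPv6 is deliberately out of scope.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const parts = ip.split(".");
  const valid =
    parts.length === 4 &&
    parts.every((part) => /^\d{1,3}$/.test(part) && Number(part) <= 255);
  if (!valid) throw new Error(`Invalid IPv4 address: ${ip}`);
  return parts.reduce((acc, part) => acc * 256 + Number(part), 0);
}

// True when `ip` falls inside `cidr`, for example "192.168.1.0/24".
// A bare address without a prefix length is treated as a /32.
function inCidr(ip, cidr) {
  const [range, bitsText = "32"] = cidr.split("/");
  const bits = Number(bitsText);
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) {
    throw new Error(`Invalid prefix length in ${cidr}`);
  }
  const blockSize = 2 ** (32 - bits); // number of addresses in the block
  return (
    Math.floor(ipv4ToInt(ip) / blockSize) ===
    Math.floor(ipv4ToInt(range) / blockSize)
  );
}

function isAllowed(ip, allowedIps) {
  return allowedIps.some((cidr) => inCidr(ip, cidr));
}

console.log(isAllowed("192.168.1.42", ["192.168.1.0/24"])); // true
console.log(isAllowed("192.168.2.1", ["192.168.1.0/24"]));  // false
```

The sketch uses division rather than bit shifts because JavaScript bitwise operators work on signed 32-bit values, which makes addresses above 127.255.255.255 awkward to compare.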

Security Configuration

Advanced security configuration options protect your LLM-MCP installation:

{
  "security": {
    "auth": {
      "type": "apiKey",                 // Authentication type: "apiKey", "jwt", "oauth2"
      "apiKey": {
        "header": "X-API-Key",          // API key header name
        "query": "api_key",             // API key query parameter (alternative)
        "keys": {                       // API key definitions
          "admin-key": {
            "name": "Admin Key",
            "roles": ["admin"],
            "expires": "2025-12-31T00:00:00Z"
          },
          "user-key-1": {
            "name": "User 1",
            "roles": ["user"],
            "permissions": ["tool:execute"],
            "rateLimit": {
              "max": 10,
              "period": "second"
            }
          }
        },
        "keyFile": "/path/to/keys.json" // Alternative: load keys from file
      },
      "jwt": {
        "secret": "your-jwt-secret",     // JWT secret (for HMAC algorithms)
        "publicKeyFile": "/path/to/public.pem", // RSA/ECDSA public key file
        "privateKeyFile": "/path/to/private.pem", // RSA/ECDSA private key file
        "algorithm": "HS256",            // Algorithm: HS256, RS256, ES256, etc.
        "audience": "llm-mcp-api",         // Expected audience
        "issuer": "auth.bluefly.ai",     // Expected issuer
        "expiresIn": "1h",               // Token expiration time
        "refreshExpiresIn": "7d",        // Refresh token expiration
        "clockTolerance": 30,            // Clock skew tolerance in seconds
        "userProperty": "user"           // Request property to store user info
      },
      "oauth2": {
        "provider": "keycloak",          // OAuth provider
        "clientId": "llm-mcp-server",      // OAuth client ID
        "clientSecret": "your-client-secret", // OAuth client secret
        "discovery": "https://auth.example.com/.well-known/openid-configuration",
        "scope": ["openid", "profile"],  // Requested scopes
        "rolesClaim": "resource_access.llm-mcp-server.roles", // JWT path for roles
        "rolesMapping": {                // Map provider roles to LLM-MCP roles
          "llm-mcp-admin": "admin",
          "llm-mcp-user": "user"
        }
      }
    },
    "rbac": {                          // Role-based access control
      "enabled": true,
      "roles": {                       // Role definitions
        "admin": {
          "description": "Administrator with full access",
          "permissions": ["*"]         // All permissions
        },
        "user": {
          "description": "Regular user",
          "permissions": [
            "tool:list",
            "tool:execute",
            "vector:store",
            "vector:query",
            "model:list"
          ]
        },
        "readonly": {
          "description": "Read-only user",
          "permissions": [
            "tool:list",
            "model:list",
            "vector:query"
          ]
        }
      },
      "defaultRole": "readonly"        // Default role for authenticated users without a role
    },
    "encryption": {                    // Data encryption
      "enabled": true,
      "keyFile": "/path/to/encryption-key.json", // Encryption key file
      "algorithm": "aes-256-gcm",      // Encryption algorithm
      "fields": ["vector:metadata.credentials"] // Fields to encrypt
    },
    "ipAllowList": ["192.168.0.0/16"], // Allowed IP ranges (CIDR notation)
    "ipBlockList": ["10.0.0.1"],       // Blocked IPs
    "requestValidation": {             // Request validation
      "enabled": true,
      "validateContentType": true,     // Validate Content-Type header
      "validateAccept": true,          // Validate Accept header
      "schemas": {                     // Custom JSON Schema validation
        "/api/v1/tools": {
          "post": {
            "$ref": "./schemas/tool-registration.json"
          }
        }
      }
    },
    "tls": {                           // TLS/SSL configuration
      "enabled": true,
      "certFile": "/path/to/cert.pem", // Certificate file
      "keyFile": "/path/to/key.pem",   // Private key file
      "caFile": "/path/to/ca.pem",     // CA certificate for client cert validation
      "requireClientCert": false,      // Require client certificates
      "ciphers": "HIGH:!aNULL:!MD5",   // Allowed cipher suites
      "minVersion": "TLSv1.2",         // Minimum TLS version
      "dhParam": "/path/to/dhparam.pem" // DH parameters for perfect forward secrecy
    },
    "auditLogging": {                  // Security audit logging
      "enabled": true,
      "logFile": "/var/log/llm-mcp/audit.log", // Audit log file
      "events": [                      // Events to audit
        "authentication",
        "authorization",
        "tool-registration",
        "tool-execution"
      ],
      "format": "json",                // Log format: json or text
      "rotation": {                    // Log rotation
        "size": "100m",                // Rotate at 100 MB
        "interval": "1d",              // Also rotate daily
        "maxFiles": 30                 // Keep 30 rotated files
      }
    }
  }
}
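
The `rbac` block maps roles to permission strings, with `"*"` granting everything and `defaultRole` covering principals that have no role. The sketch below shows one way such a check could be evaluated. It rests on several assumptions that the guide does not confirm: per-key `permissions` are added to (not intersected with) role permissions, `"*"` is the only wildcard, and disabling RBAC turns checks off. `effectivePermissions` and `can` are hypothetical names.

```javascript
// Subset of the rbac configuration shown above.
const rbac = {
  enabled: true,
  roles: {
    admin: { permissions: ["*"] },
    user: { permissions: ["tool:list", "tool:execute", "vector:store", "vector:query", "model:list"] },
    readonly: { permissions: ["tool:list", "model:list", "vector:query"] },
  },
  defaultRole: "readonly",
};

// Assumption: a principal's own permissions are unioned with its roles' permissions.
function effectivePermissions(principal, config) {
  const roles =
    principal.roles && principal.roles.length > 0
      ? principal.roles
      : [config.defaultRole];
  const granted = new Set(principal.permissions || []);
  for (const role of roles) {
    const definition = config.roles[role];
    if (definition) definition.permissions.forEach((p) => granted.add(p));
  }
  return granted;
}

function can(principal, permission, config) {
  if (!config.enabled) return true; // assumption: disabled RBAC skips checks
  const granted = effectivePermissions(principal, config);
  return granted.has("*") || granted.has(permission);
}

console.log(can({ roles: ["readonly"] }, "vector:store", rbac)); // false
console.log(can({ roles: ["admin"] }, "vector:store", rbac));    // true
```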

API Key Rotation

For secure API key rotation in production:

{
  "security": {
    "auth": {
      "type": "apiKey",
      "apiKey": {
        "rotationStrategy": {
          "enabled": true,
          "graceInterval": "7d",       // Accept old keys for 7 days after rotation
          "rotationInterval": "90d",   // Generate rotation warning after 90 days
          "notifications": {
            "email": "security@example.com", // Placeholder address
            "webhook": "https://example.com/api/key-rotation-hook"
          }
        }
      }
    }
  }
}
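
The two intervals in `rotationStrategy` imply a simple key lifecycle: a key is active until `rotationInterval` has passed, and once rotated it stays usable for `graceInterval`. The following sketch models that lifecycle. The `createdAt` and `rotatedAt` fields (millisecond timestamps), the status names, and the duration grammar (`s`, `m`, `h`, `d` suffixes) are assumptions made for illustration.

```javascript
const UNIT_MS = { s: 1000, m: 60000, h: 3600000, d: 86400000 };

// Parse strings such as "7d" or "90d" into milliseconds.
function parseDuration(text) {
  const match = /^(\d+)([smhd])$/.exec(text);
  if (!match) throw new Error(`Unsupported duration: ${text}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Classify a key. `rotatedAt` is set once a replacement key has been issued.
function keyStatus(key, strategy, now) {
  if (key.rotatedAt !== undefined) {
    const graceEnds = key.rotatedAt + parseDuration(strategy.graceInterval);
    return now < graceEnds ? "grace" : "expired";
  }
  const warnAt = key.createdAt + parseDuration(strategy.rotationInterval);
  return now >= warnAt ? "rotation-due" : "active";
}

const strategy = { graceInterval: "7d", rotationInterval: "90d" };
const DAY = UNIT_MS.d;
console.log(keyStatus({ createdAt: 0 }, strategy, 30 * DAY));                       // "active"
console.log(keyStatus({ createdAt: 0, rotatedAt: 100 * DAY }, strategy, 103 * DAY)); // "grace"
```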

Storage Configuration

Configure storage backends for metadata, vector data, and operational data:

{
  "storage": {
    "type": "composite",           // Storage type: "mongodb", "qdrant", "composite"
    
    "metadata": {                  // Metadata storage (tools, configurations, etc.)
      "type": "mongodb",
      "uri": "mongodb://username:password@host:port/database",
      "options": {
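        "useNewUrlParser": true,      // Legacy flag; has no effect in MongoDB Node.js driver 4.0 and later
        "useUnifiedTopology": true,   // Legacy flag; has no effect in MongoDB Node.js driver 4.0 and later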
        "maxPoolSize": 50,
        "connectTimeoutMS": 30000,
        "socketTimeoutMS": 45000,
        "serverSelectionTimeoutMS": 30000,
        "replicaSet": "rs0",
        "readPreference": "secondaryPreferred",
        "w": "majority",
        "wtimeoutMS": 10000,
        "authSource": "admin",
        "ssl": true,
        "retryWrites": true
      }
    },
    
    "vector": {                   // Vector storage configuration
      "type": "qdrant",           // Vector DB type: "mongodb", "qdrant", "redis", "milvus"
      
      "qdrant": {                 // Qdrant-specific configuration
        "uri": "http://qdrant:6333",
        "apiKey": "your-qdrant-api-key",
        "timeout": 30000,
        "vectors": {
          "dimensions": 1536,     // Default vector dimensions
          "distance": "Cosine",   // Distance function: Cosine, Euclid, Dot
          "onDisk": true,         // Store vectors on disk for larger datasets
          "optimizers": {
            "indexing": {
              "maxVectorsPerSegment": 20000, // Optimization parameters
              "memoryGB": 2
            }
          }
        },
        "collections": {         // Pre-defined collections
          "default": {
            "dimensions": 1536,
            "distance": "Cosine"
          },
          "images": {
            "dimensions": 512,
            "distance": "Cosine",
            "customParameters": {
              "m": 16,
              "efConstruction": 128,
              "ef": 64
            }
          }
        }
      },
      
      "mongodb": {               // MongoDB vector configuration
        "indexType": "hnsw",    // Index type: "hnsw" or "flat"
        "indexParams": {
          "m": 16,              // HNSW connections per layer
          "efConstruction": 64, // HNSW construction parameter
          "dimensions": 1536    // Vector dimensions
        }
      },
      
      "milvus": {               // Milvus configuration
        "uri": "localhost:19530",
        "username": "root",
        "password": "milvus",
        "database": "default",
        "ssl": false,
        "timeout": 10000
      }
    },
    
    "cache": {                  // Cache storage
      "type": "redis",          // Cache type: "memory", "redis"
      "redis": {
        "host": "redis",
        "port": 6379,
        "password": "your-redis-password",
        "db": 0,
        "prefix": "llm-mcp:",
        "tls": {
          "enabled": false,
          "ca": "/path/to/ca.pem"
        },
        "cluster": {
          "enabled": false,
          "nodes": [
            { "host": "redis-node-1", "port": 6379 },
            { "host": "redis-node-2", "port": 6379 },
            { "host": "redis-node-3", "port": 6379 }
          ]
        },
        "sentinel": {
          "enabled": false,
          "master": "mymaster",
          "sentinels": [
            { "host": "sentinel-1", "port": 26379 },
            { "host": "sentinel-2", "port": 26379 },
            { "host": "sentinel-3", "port": 26379 }
          ]
        },
        "keyPrefix": "cache:",
        "ttl": 3600              // Default TTL in seconds
      },
      "memory": {
        "maxSize": 1000,         // Maximum items in memory cache
        "ttl": 3600,             // Default TTL in seconds
        "checkInterval": 60      // Cleanup interval in seconds
      }
    },
    
    "queue": {                  // Queue storage for async operations
      "type": "redis",          // Queue type: "memory", "redis", "kafka"
      "redis": {
        "host": "redis",
        "port": 6379,
        "password": "your-redis-password",
        "db": 1,
        "keyPrefix": "queue:"
      },
      "kafka": {
        "clientId": "llm-mcp",
        "brokers": ["kafka-1:9092", "kafka-2:9092", "kafka-3:9092"],
        "ssl": {
          "enabled": true,
          "ca": "/path/to/ca.pem",
          "cert": "/path/to/cert.pem",
          "key": "/path/to/key.pem"
        },
        "sasl": {
          "mechanism": "plain",
          "username": "kafka-user",
          "password": "kafka-password"
        },
        "topics": {
          "toolExecution": "llm-mcp-tool-execution",
          "vectorOperations": "llm-mcp-vector-operations"
        }
      }
    },
    
    "backup": {                // Backup configuration
      "enabled": true,
      "schedule": "0 0 * * *", // Daily at midnight (cron format)
      "path": "/var/backups/llm-mcp",
      "retention": {
        "days": 30,            // Keep backups for 30 days
        "count": 10            // Keep at least 10 backups
      },
      "storage": {
        "type": "s3",          // Backup storage: "local", "s3"
        "s3": {
          "bucket": "llm-mcp-backups",
          "prefix": "production/",
          "region": "us-west-2",
          "credentials": {
            "accessKeyId": "your-access-key",
            "secretAccessKey": "your-secret-key"
          }
        }
      }
    }
  }
}
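
The backup `retention` settings combine two rules: keep backups for `days`, but never go below `count` backups. A minimal sketch of how those rules could interact is shown below. It assumes each backup record carries a `createdAt` millisecond timestamp; `selectExpiredBackups` is a hypothetical helper, not part of LLM-MCP.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the backups that may be deleted under the retention settings.
function selectExpiredBackups(backups, retention, now) {
  const newestFirst = [...backups].sort((a, b) => b.createdAt - a.createdAt);
  const cutoff = now - retention.days * DAY_MS;
  return newestFirst
    .slice(retention.count)                         // always keep the newest `count`
    .filter((backup) => backup.createdAt < cutoff); // of the rest, drop only old ones
}

// Twelve illustrative backups taken five days apart.
const now = 100 * DAY_MS;
const backups = Array.from({ length: 12 }, (_, i) => ({
  id: `b${i}`,
  createdAt: now - i * 5 * DAY_MS,
}));

const expired = selectExpiredBackups(backups, { days: 30, count: 10 }, now);
console.log(expired.map((backup) => backup.id)); // [ 'b10', 'b11' ]
```

With these sample values, eleven of the twelve backups fall outside neither rule until the count floor is exceeded: only the two oldest are both beyond the newest ten and older than 30 days.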

MongoDB Sharding Configuration

For large-scale deployments with MongoDB sharding:

{
  "storage": {
    "metadata": {
      "type": "mongodb",
      "uri": "mongodb://router1:27017,router2:27017/llm-mcp",
      "options": {
        "useNewUrlParser": true,
        "useUnifiedTopology": true
      },
      "sharding": {
        "enabled": true,
        "collections": {
          "tools": {
            "key": { "category": 1 }
          },
          "executions": {
            "key": { "createdAt": 1 },
            "timeseries": {
              "timeField": "createdAt",
              "metaField": "toolId",
              "granularity": "hours"
            }
          },
          "vectors": {
            "key": { "collection": "hashed" }
          }
        }
      }
    }
  }
}

Tool Registry Configuration

Configure how tools are registered, discovered, and executed:

{
  "toolRegistry": {
    "maxTools": 1000,             // Maximum number of registered tools
    "validation": {               // Tool validation settings
      "enabled": true,
      "strictSchema": true,       // Strict JSON Schema validation
      "validateFunctions": true,  // Validate tool functions
      "allowRemote": true         // Allow remote tool registration
    },
    "execution": {
      "timeout": 30000,           // Default execution timeout (ms)
      "maxConcurrent": 100,       // Maximum concurrent executions
      "retryStrategy": {          // Retry strategy for failed executions
        "attempts": 3,            // Maximum retry attempts
        "delay": 1000,            // Initial delay in ms
        "backoff": 2,             // Exponential backoff factor
        "maxDelay": 10000         // Maximum delay between retries
      },
      "sandbox": {                // Execution sandbox settings
        "enabled": true,          // Enable execution sandboxing
        "type": "vm",             // Sandbox type: "vm", "docker", "isolate"
        "resourceLimits": {       // Resource limits for tool execution
          "cpu": 1,               // CPU cores
          "memory": "512M",       // Memory limit
          "timeout": 30000        // Execution timeout (ms)
        }
      }
    },
    "discovery": {                // Tool discovery settings
      "providers": [              // Tool providers
        {
          "name": "local",        // Local tool provider
          "enabled": true
        },
        {
          "name": "bfllm",        // BFLLM tool provider
          "enabled": true,
          "url": "http://bfllm:3002/tools",
          "apiKey": "your-bfllm-api-key",
          "refresh": 60000,       // Refresh interval (ms)
          "categories": ["ai", "generation"]
        }
      ],
      "autoRegister": {          // Auto-register tools from providers
        "enabled": true,
        "interval": 300000,      // Check interval (ms)
        "onStartup": true        // Register on startup
      }
    },
    "plugins": {                 // Tool registry plugins
      "openapi": {               // OpenAPI tool generation
        "enabled": true,
        "specs": [
          {
            "url": "https://api.example.com/openapi.json",
            "auth": {
              "type": "bearer",
              "token": "your-api-token"
            },
            "toolIdPrefix": "example_api_",
            "includeTagsAsCategories": true
          }
        ]
      },
      "jsonrpc": {              // JSON-RPC tool generation
        "enabled": true,
        "servers": [
          {
            "url": "https://jsonrpc.example.com",
            "methods": ["method1", "method2"],
            "toolIdPrefix": "jsonrpc_"
          }
        ]
      },
      "graphql": {              // GraphQL tool generation
        "enabled": true,
        "endpoints": [
          {
            "url": "https://graphql.example.com",
            "schema": "/path/to/schema.graphql",
            "operations": ["query1", "query2"],
            "toolIdPrefix": "graphql_"
          }
        ]
      }
    },
    "rateLimit": {              // Tool-specific rate limiting
      "enabled": true,
      "defaultRules": {
        "perSecond": 10,        // Default limit per second
        "perMinute": 100,       // Default limit per minute
        "perHour": 1000         // Default limit per hour
      },
      "byCategory": {           // Category-specific limits
        "expensive": {
          "perSecond": 2,
          "perMinute": 20,
          "perHour": 100
        }
      },
      "byTool": {               // Tool-specific limits
        "openai_completion": {
          "perSecond": 5,
          "perMinute": 50,
          "perHour": 500
        }
      }
    }
  }
}
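
The `retryStrategy` values describe exponential backoff with a cap. The sketch below computes the resulting delays and wraps a task in a retry loop. It assumes `attempts` counts retries after the initial try, which the guide does not spell out; `retryDelays` and `executeWithRetry` are hypothetical names.

```javascript
// Delay before retry i (0-based) is delay * backoff^i, capped at maxDelay.
function retryDelays({ attempts, delay, backoff, maxDelay }) {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(delay * backoff ** i, maxDelay)
  );
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `task` once, then retries after each computed delay until it succeeds
// or the delays are exhausted. The sleep function is injectable for testing.
async function executeWithRetry(task, strategy, sleep = defaultSleep) {
  const delays = retryDelays(strategy);
  let lastError;
  for (let attempt = 0; attempt <= delays.length; attempt += 1) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < delays.length) await sleep(delays[attempt]);
    }
  }
  throw lastError;
}

const strategy = { attempts: 3, delay: 1000, backoff: 2, maxDelay: 10000 };
console.log(retryDelays(strategy)); // [ 1000, 2000, 4000 ]
```

With the configured values the cap is never reached; it would start to apply from the fifth retry, where 1000 * 2^4 = 16000 ms is clamped to 10000 ms.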

Advanced Tool Execution

For sophisticated tool execution control:

{
  "toolRegistry": {
    "execution": {
      "middleware": [            // Execution middleware
        "logging",               // Log all executions
        "validation",            // Validate inputs/outputs
        "rateLimit",             // Apply rate limiting
        "cache",                 // Apply caching
        "metrics"                // Collect metrics
      ],
      "hooks": {                 // Execution lifecycle hooks
        "beforeExecution": "logs/before-execution.js",
        "afterExecution": "logs/after-execution.js",
        "onError": "logs/on-error.js"
      },
      "strategies": {           // Execution strategies
        "batchSize": 10,         // Batch execution size
        "priorityQueue": {       // Priority queue settings
          "enabled": true,
          "levels": 3            // Priority levels
        },
        "loadBalancing": {       // Load balancing for distributed execution
          "enabled": true,
          "strategy": "round-robin" // Strategy: round-robin, least-connections
        }
      }
    }
  }
}
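
The `middleware` array is ordered, which suggests that each entry wraps the ones after it. The sketch below shows that pattern with two toy middleware functions. The `(context, next)` signature, the `compose` helper, and the shape of `context` are assumptions made for illustration; the guide does not document the real middleware interface.

```javascript
// Hypothetical middleware signature: async (context, next) => result.
// Middleware run in array order; the last one calls the tool handler.
function compose(middleware) {
  return async function run(context, handler) {
    const dispatch = async (index) => {
      if (index === middleware.length) return handler(context);
      return middleware[index](context, () => dispatch(index + 1));
    };
    return dispatch(0);
  };
}

// Toy "logging" middleware: records start and end of every execution.
const trace = [];
const logging = async (context, next) => {
  trace.push(`start ${context.toolId}`);
  try {
    return await next();
  } finally {
    trace.push(`end ${context.toolId}`);
  }
};

// Toy "cache" middleware: short-circuits when the same input was seen before.
const cacheStore = new Map();
const cache = async (context, next) => {
  const key = `${context.toolId}:${JSON.stringify(context.input)}`;
  if (cacheStore.has(key)) return cacheStore.get(key);
  const result = await next();
  cacheStore.set(key, result);
  return result;
};

const execute = compose([logging, cache]);
// Usage: await execute({ toolId: "double", input: { x: 21 } }, async (ctx) => ctx.input.x * 2);
```

Because `logging` sits before `cache`, cached executions are still logged. Reversing the order would hide cache hits from the log, which is why the position of each entry in the array matters.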

Resource Management

Configure how LLM-MCP manages system resources:

{
  "resources": {
    "cpu": {
      "limit": 80,                  // Maximum CPU usage percentage
      "target": 60,                 // Target CPU usage percentage
      "throttling": {               // CPU throttling configuration
        "enabled": true,
        "checkInterval": 5000,      // Check interval in ms
        "cooldown": 30000,          // Cooldown period after throttling
        "strategies": ["defer", "queue"] // Throttling strategies
      }
    },
    "memory": {
      "limit": "4G",                // Maximum memory usage
      "reserved": "1G",             // Reserved memory (always keep free)
      "gc": {                       // Garbage collection settings
        "heapThreshold": 80,        // Trigger GC at heap usage percentage
        "intervalMin": 30000,       // Minimum time between GC (ms)
        "forcedGC": true            // Allow forced GC when needed
      }
    },
    "io": {
      "maxConcurrentFileOperations": 100, // Max concurrent file operations
      "maxConcurrentNetworkRequests": 200, // Max concurrent network requests
      "disk": {
        "monitorUsage": true,       // Monitor disk usage
        "lowSpaceThreshold": 10,    // Low space threshold percentage
        "criticalSpaceThreshold": 5 // Critical space threshold percentage
      }
    },
    "limits": {                     // Resource limits by operation type
      "toolExecution": {
        "cpu": 2,                   // CPU cores
        "memory": "1G",             // Memory
        "timeout": 30000            // Timeout in ms
      },
      "vectorOperations": {
        "cpu": 4,                   // CPU cores
        "memory": "2G",             // Memory
        "timeout": 60000            // Timeout in ms
      }
    },
    "scaling": {                    // Automatic resource scaling
      "enabled": true,
      "metrics": ["cpu", "memory", "requestRate"],
      "checkInterval": 60000,       // Check interval in ms
      "scaleUpThreshold": 80,       // Scale up at resource usage percentage
      "scaleDownThreshold": 40,     // Scale down at resource usage percentage
      "cooldownUp": 300000,         // Cooldown after scale up (ms)
      "cooldownDown": 600000,       // Cooldown after scale down (ms)
      "minReplicas": 2,             // Minimum replicas
      "maxReplicas": 10             // Maximum replicas
    }
  }
}
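
The `scaling` thresholds and cooldowns can be read as a small decision function. The sketch below is one plausible interpretation, not the server's algorithm: it assumes a single usage percentage (for example the highest of the monitored metrics), scales by one replica at a time, and blocks any change while the cooldown of the previous action is running. `scalingDecision` and the `state` fields are hypothetical.

```javascript
// Returns the desired replica count for the current observation.
function scalingDecision(config, state, now) {
  const cooldown =
    state.lastDirection === "up" ? config.cooldownUp : config.cooldownDown;
  const coolingDown =
    Boolean(state.lastDirection) && now - state.lastScaledAt < cooldown;
  if (coolingDown) return state.replicas;

  if (state.usage >= config.scaleUpThreshold) {
    return Math.min(state.replicas + 1, config.maxReplicas);
  }
  if (state.usage <= config.scaleDownThreshold) {
    return Math.max(state.replicas - 1, config.minReplicas);
  }
  return state.replicas;
}

// Values copied from the scaling block above.
const scaling = {
  scaleUpThreshold: 80,
  scaleDownThreshold: 40,
  cooldownUp: 300000,
  cooldownDown: 600000,
  minReplicas: 2,
  maxReplicas: 10,
};

const idle = { replicas: 3, usage: 85, lastScaledAt: 0, lastDirection: null };
console.log(scalingDecision(scaling, idle, 1000000)); // 4
```

The gap between the two thresholds (40 to 80) acts as a dead band: usage inside it never triggers a change, which helps avoid oscillating between scale-up and scale-down.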

Advanced Thread Pool Configuration

For fine-grained control over thread pools:

{
  "resources": {
    "threadPools": {
      "default": {
        "min": 4,                 // Minimum threads
        "max": 32,                // Maximum threads
        "keepAlive": 60000        // Thread keep-alive time (ms)
      },
      "toolExecution": {
        "min": 8,
        "max": 64,
        "keepAlive": 30000,
        "queueSize": 1000         // Maximum queue size
      },
      "vectorOperations": {
        "min": 4,
        "max": 16,
        "keepAlive": 30000,
        "queueSize": 500
      },
      "backgroundTasks": {
        "min": 2,
        "max": 8,
        "keepAlive": 120000,
        "queueSize": 200
      }
    }
  }
}

Logging and Monitoring

Configure comprehensive logging and monitoring:

{
  "logging": {
    "level": "info",                   // Log level: trace, debug, info, warn, error, fatal
    "format": "json",                  // Log format: json, text
    "colorize": false,                 // Colorize logs
    "timestamp": true,                 // Include timestamp
    "source": true,                    // Include source location
    "requestId": true,                 // Include request ID
    "traceId": true,                   // Include trace ID
    "serializers": {                   // Custom serializers
      "req": "custom-serializers.js#requestSerializer",
      "res": "custom-serializers.js#responseSerializer",
      "err": "custom-serializers.js#errorSerializer"
    },
    "transports": [                    // Log transports
      {
        "type": "console",             // Console transport
        "level": "info"                // Transport-specific level
      },
      {
        "type": "file",                // File transport
        "level": "debug",
        "filename": "/var/log/llm-mcp/server.log",
        "maxsize": 10485760,           // 10 MB
        "maxFiles": 10,
        "tailable": true,
        "zippedArchive": true
      },
      {
        "type": "http",                // HTTP transport
        "level": "error",
        "host": "log-collector.example.com",
        "port": 8080,
        "path": "/logs",
        "auth": {
          "username": "logger",
          "password": "secret"
        },
        "ssl": true
      }
    ],
    "redact": [                        // Fields to redact from logs
      "req.headers.authorization",
      "req.body.password",
      "res.body.token",
      "*.credentials",
      "*.secret",
      "*.password"
    ]
  },
  "monitoring": {
    "metrics": {
      "enabled": true,
      "port": 9090,                    // Metrics server port
      "path": "/metrics",              // Metrics endpoint path
      "prefix": "llm_mcp_",              // Metrics name prefix (Prometheus metric names cannot contain hyphens)
      "defaultLabels": {               // Default labels for all metrics
        "environment": "production",
        "region": "us-west"
      },
      "collectors": [                  // Metric collectors
        "system",                      // System metrics (CPU, memory, etc.)
        "nodejs",                      // Node.js metrics
        "http",                        // HTTP metrics
        "tool",                        // Tool metrics
        "vector",                      // Vector metrics
        "grpc"                         // gRPC metrics
      ],
      "custom": [                      // Custom metrics
        {
          "type": "counter",
          "name": "tool_execution_total",
          "help": "Total number of tool executions",
          "labelNames": ["tool_id", "category", "status"]
        },
        {
          "type": "histogram",
          "name": "tool_execution_duration_seconds",
          "help": "Tool execution duration in seconds",
          "labelNames": ["tool_id", "category"],
          "buckets": [0.1, 0.5, 1, 2, 5, 10, 30]
        }
      ]
    },
    "tracing": {
      "enabled": true,
      "exporter": {
        "type": "jaeger",              // Tracing exporter: jaeger, zipkin, otlp
        "host": "jaeger",
        "port": 6832,
        "endpoint": "/api/traces"
      },
      "sampler": {
        "type": "probabilistic",       // Sampler type: always, never, probabilistic
        "rate": 0.1                    // Sampling rate (10%)
      },
      "propagation": ["b3", "w3c"],    // Trace context propagation formats
      "instrumentations": [            // What to instrument
        "http",
        "grpc",
        "mongodb",
        "redis",
        "kafka",
        "graphql"
      ]
    },
    "health": {
      "enabled": true,
      "port": 8080,
      "path": "/health",
      "liveness": {                    // Liveness probe
        "path": "/health/live",
        "failureThreshold": 3,
        "successThreshold": 1,
        "initialDelaySeconds": 30,
        "periodSeconds": 10,
        "timeoutSeconds": 5
      },
      "readiness": {                   // Readiness probe
        "path": "/health/ready",
        "checks": [                    // Readiness checks
          "storage",
          "toolRegistry",
          "cache",
          "queue"
        ],
        "failureThreshold": 3,
        "successThreshold": 1,
        "initialDelaySeconds": 60,
        "periodSeconds": 30,
        "timeoutSeconds": 10
      }
    },
    "alerts": {
      "enabled": true,
      "providers": [
        {
          "type": "email",
          "recipients": ["ops@example.com"],   // Placeholder address
          "sender": "llm-mcp@example.com",     // Placeholder address
          "smtp": {
            "host": "smtp.example.com",
            "port": 587,
            "secure": true,
            "auth": {
              "user": "llm-mcp@example.com",   // Placeholder address
              "pass": "smtp-password"
            }
          }
        },
        {
          "type": "webhook",
          "url": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX",
          "method": "POST",
          "headers": {
            "Content-Type": "application/json"
          }
        },
        {
          "type": "pagerduty",
          "routingKey": "your-pagerduty-routing-key",
          "severity": "critical"
        }
      ],
      "rules": [
        {
          "name": "HighErrorRate",
          "condition": "rate(llm_mcp_http_requests_total{status_code=~\"5..\"}[5m]) / rate(llm_mcp_http_requests_total[5m]) > 0.05",
          "duration": "5m",
          "severity": "critical",
          "description": "High error rate detected",
          "providers": ["email", "pagerduty"]
        },
        {
          "name": "HighCpuUsage",
          "condition": "avg(llm_mcp_system_cpu_usage) > 80",
          "duration": "10m",
          "severity": "warning",
          "description": "High CPU usage detected",
          "providers": ["email", "webhook"]
        }
      ]
    }
  }
}
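
The `redact` list names fields whose values must not reach the logs. The sketch below shows one way to apply such a list to a log entry before it is written. The matching rules are assumptions: a path without a wildcard must match exactly, and a leading `*.` matches that key under any parent at any depth. The server's real matcher may differ, and `redact` is a hypothetical function name.

```javascript
const REDACTED = "[REDACTED]";

// Returns a redacted deep copy of `entry`; the input is not modified.
function redact(entry, paths) {
  const exact = new Set(paths.filter((p) => !p.startsWith("*.")));
  const anywhere = new Set(
    paths.filter((p) => p.startsWith("*.")).map((p) => p.slice(2))
  );

  const walk = (value, trail) => {
    if (Array.isArray(value)) return value.map((item) => walk(item, trail));
    if (value === null || typeof value !== "object") return value;
    const copy = {};
    for (const [key, child] of Object.entries(value)) {
      const path = trail ? `${trail}.${key}` : key;
      // "*.key" needs at least one parent segment, so top-level keys are exempt.
      const hide = exact.has(path) || (trail !== "" && anywhere.has(key));
      copy[key] = hide ? REDACTED : walk(child, path);
    }
    return copy;
  };
  return walk(entry, "");
}

const entry = {
  req: {
    headers: { authorization: "Bearer abc", accept: "application/json" },
    body: { password: "hunter2", name: "demo" },
  },
  tool: { credentials: { apiKey: "k" } },
};
const safe = redact(entry, ["req.headers.authorization", "*.credentials", "*.password"]);
console.log(JSON.stringify(safe));
```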

Advanced Logging Patterns

For production-grade logging:

{
  "logging": {
    "patterns": {
      "request": "[method] [url] [status] [response-time]ms",
      "error": "[err.name]: [err.message]\n[err.stack]"
    },
    "rotation": {
      "enabled": true,
      "interval": "1d",              // Rotate daily
      "maxFiles": 30,                // Keep 30 files
      "maxSize": "100m"              // Or when file reaches 100 MB
    },
    "aggregation": {
      "enabled": true,
      "similarErrorsWindow": 300,    // Aggregate similar errors within 5 minutes
      "rateLimit": {
        "window": 60,                // 1 minute window
        "max": 100                   // Max 100 similar log entries per minute
      }
    },
    "sampling": {
      "enabled": true,
      "rules": [
        {
          "pattern": "GET /api/v1/health",
          "rate": 0.01               // Log only 1% of health check requests
        },
        {
          "pattern": "GET /api/v1/metrics",
          "rate": 0.05               // Log only 5% of metrics requests
        }
      ]
    }
  }
}
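
The sampling rules above keep only a fraction of high-volume requests in the log. A minimal sketch of how such a rule might be applied is shown below. The `shouldLog` function, the exact-match comparison against "METHOD path", and the choice to always log unmatched requests are assumptions for illustration, not the server's documented behavior.

```javascript
// Rules copied from the logging.sampling.rules example.
const samplingRules = [
  { pattern: "GET /api/v1/health", rate: 0.01 },
  { pattern: "GET /api/v1/metrics", rate: 0.05 },
];

// Hypothetical helper: returns true when the request should be logged.
// The random source is injectable so the decision can be tested.
function shouldLog(method, path, rules, random = Math.random) {
  const rule = rules.find((r) => r.pattern === `${method} ${path}`);
  if (!rule) return true; // assumption: unmatched requests are always logged
  return random() < rule.rate;
}

console.log(shouldLog("POST", "/api/v1/tools", samplingRules));
```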

Caching Configuration

Configure caching for improved performance:

{
  "cache": {
    "enabled": true,
    "defaultTTL": 300,              // Default TTL in seconds
    "strategies": {
      "toolRegistry": {             // Tool registry caching
        "enabled": true,
        "ttl": 3600                 // 1 hour
      },
      "toolExecution": {            // Tool execution result caching
        "enabled": true,
        "ttl": 300,                 // 5 minutes
        "ignoreParameters": ["timestamp", "random"], // Parameters to ignore in cache key
        "varyByUser": true,         // Vary cache by user
        "keyGenerator": "md5",      // Cache key generation algorithm
        "tools": {                  // Tool-specific cache settings
          "weather_api": {
            "ttl": 1800,            // 30 minutes
            "varyByParameters": ["location", "units"]
          },
          "static_data_lookup": {
            "ttl": 86400            // 24 hours
          }
        }
      },
      "modelRegistry": {            // Model registry caching
        "enabled": true,
        "ttl": 3600                 // 1 hour
      },
      "vectorSearch": {             // Vector search caching
        "enabled": true,
        "ttl": 300,                 // 5 minutes
        "maxVectorDimensions": 1536, // Maximum vector dimensions to cache
        "ignoreParameters": ["limit", "includeMetadata"]
      },
      "responses": {               // HTTP response caching
        "enabled": true,
        "routes": [
          {
            "pattern": "GET /api/v1/tools",
            "ttl": 60               // 1 minute
          },
          {
            "pattern": "GET /api/v1/models",
            "ttl": 300              // 5 minutes
          }
        ]
      }
    },
    "invalidation": {              // Cache invalidation settings
      "events": {                  // Events that trigger invalidation
        "toolUpdate": ["toolRegistry", "toolExecution"],
        "modelUpdate": ["modelRegistry"]
      },
      "patterns": [                // Invalidation patterns
        {
          "when": "POST /api/v1/tools",
          "invalidate": "toolRegistry"
        },
        {
          "when": "PUT /api/v1/tools/*",
          "invalidate": "toolRegistry"
        }
      ]
    },
    "storage": {                   // Cache storage configuration
      "type": "redis",             // Cache storage type: "memory", "redis"
      "redis": {
        "host": "redis",
        "port": 6379,
        "password": "your-redis-password",
        "db": 0,
        "keyPrefix": "llm-mcp:cache:"
      },
      "memory": {
        "maxSize": 1000,           // Maximum items in memory cache
        "maxSizeBytes": "100mb",   // Maximum memory usage
        "staleWhileRevalidate": true // Serve stale data while revalidating
      }
    },
    "compression": {              // Cache compression settings
      "enabled": true,
      "threshold": 1024,          // Compress entries larger than 1KB
      "algorithm": "gzip"         // Compression algorithm: gzip, deflate, brotli
    },
    "monitoring": {
      "enabled": true,
      "metrics": [                // Cache metrics to collect
        "hit_ratio",
        "miss_ratio",
        "size",
        "evictions"
      ]
    }
  }
}
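
The `ignoreParameters`, `varyByParameters`, `varyByUser`, and `keyGenerator` options together decide which requests share a cache entry. The sketch below shows one plausible way to derive a key from them. The `cacheKey` function and its payload layout are hypothetical; only `crypto.createHash` is a real Node.js API. MD5 is used here purely to shorten the key, not for security.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper, not LLM-MCP's key generator.
function cacheKey(toolName, params, userId, options) {
  const {
    ignoreParameters = [],
    varyByParameters, // when set, only these parameters affect the key
    varyByUser = false,
    keyPrefix = "llm-mcp:cache:",
  } = options;

  const names = Object.keys(params)
    .filter((name) => !ignoreParameters.includes(name))
    .filter((name) => !varyByParameters || varyByParameters.includes(name))
    .sort(); // stable order so {a, b} and {b, a} produce the same key

  const payload = JSON.stringify({
    tool: toolName,
    user: varyByUser ? userId : null,
    params: names.map((name) => [name, params[name]]),
  });
  return keyPrefix + crypto.createHash("md5").update(payload).digest("hex");
}

const opts = { ignoreParameters: ["timestamp", "random"], varyByUser: true };
console.log(cacheKey("weather_api", { location: "Oslo", timestamp: 1 }, "alice", opts));
```

With `varyByUser` enabled, two users issuing the same call get separate entries, which lowers the hit ratio but prevents one user's result from being served to another.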

Intelligent Caching

For adaptive caching strategies:

{
  "cache": {
    "adaptive": {
      "enabled": true,
      "analyzer": {
        "enabled": true,
        "sampleRate": 0.1,        // Analyze 10% of requests
        "minimumSamples": 100,    // Minimum samples before adapting
        "analysisInterval": 3600, // Analyze hourly (seconds)
        "metrics": ["hitRatio", "latency", "size"]
      },
      "strategies": {
        "ttlAdjustment": {
          "enabled": true,
          "minTTL": 60,           // Minimum TTL (seconds)
          "maxTTL": 86400,        // Maximum TTL (seconds)
          "adjustmentFactor": 1.5 // TTL adjustment factor
        },
        "prefetching": {
          "enabled": true,
          "threshold": 0.8,       // Prefetch when TTL is 80% expired
          "priorityQueue": true,  // Use priority queue for prefetching
          "maxConcurrent": 10     // Maximum concurrent prefetches
        }
      }
    }
  }
}
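
To make the prefetching threshold and TTL adjustment concrete, the sketch below shows one possible interpretation. The `shouldPrefetch` and `adjustTtl` functions, the entry shape, and the rule "multiply the TTL when the hit ratio is at or above a target, divide it otherwise" are all assumptions; the guide does not specify the actual algorithm.

```javascript
// Hypothetical: an entry becomes a prefetch candidate once the configured
// fraction of its TTL has elapsed, as long as it has not expired yet.
function shouldPrefetch(entry, nowMs, threshold) {
  const elapsedSeconds = (nowMs - entry.storedAtMs) / 1000;
  return elapsedSeconds >= entry.ttl * threshold && elapsedSeconds < entry.ttl;
}

// Hypothetical: grow the TTL for entries that are hit often, shrink it
// otherwise, and clamp the result to [minTTL, maxTTL].
function adjustTtl(ttl, hitRatio, settings, targetHitRatio = 0.5) {
  const { minTTL, maxTTL, adjustmentFactor } = settings;
  const next = hitRatio >= targetHitRatio ? ttl * adjustmentFactor : ttl / adjustmentFactor;
  return Math.min(maxTTL, Math.max(minTTL, Math.round(next)));
}

const settings = { minTTL: 60, maxTTL: 86400, adjustmentFactor: 1.5 };
console.log(adjustTtl(300, 0.9, settings)); // 450
```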

High Availability Configuration

Configure LLM-MCP for high availability and fault tolerance:

{
  "highAvailability": {
    "enabled": true,
    "mode": "active-active",      // HA mode: "active-active", "active-passive"
    "cluster": {
      "enabled": true,
      "discovery": {
        "type": "kubernetes",     // Discovery type: "kubernetes", "consul", "etcd", "manual"
        "kubernetes": {
          "namespace": "llm-mcp",
          "labelSelector": "app=llm-mcp-server",
          "podIP": true
        },
        "consul": {
          "host": "consul",
          "port": 8500,
          "serviceName": "llm-mcp",
          "aclToken": "your-consul-token"
        },
        "etcd": {
          "hosts": ["etcd-1:2379", "etcd-2:2379", "etcd-3:2379"],
          "ttl": 15,
          "prefix": "/llm-mcp/nodes"
        },
        "manual": {
          "nodes": [
            {"host": "llm-mcp-1", "port": 3001},
            {"host": "llm-mcp-2", "port": 3001},
            {"host": "llm-mcp-3", "port": 3001}
          ]
        },
        "refreshInterval": 30000  // Refresh interval in ms
      },
      "gossip": {
        "enabled": true,
        "port": 7946,
        "protocol": "tcp",
        "encryption": {
          "enabled": true,
          "key": "32-byte-encryption-key"
        },
        "joinTimeout": 5000
      },
      "membership": {
        "probeInterval": 1000,    // Probe interval in ms
        "probeTimeout": 3000,     // Probe timeout in ms
        "suspectTimeout": 5000,   // Suspect timeout in ms
        "failureDetector": {
          "type": "phi-accrual",  // Detector type: "phi-accrual", "simple"
          "threshold": 8,         // Phi threshold
          "maxSamples": 1000,     // Maximum samples
          "minStdDeviation": 50   // Minimum standard deviation
        }
      }
    },
    "stateReplication": {
      "enabled": true,
      "strategy": "activeSync",   // Replication strategy: "activeSync", "periodic"
      "activeSync": {
        "synchronous": false,     // Synchronous replication
        "batchSize": 100,         // Batch size
        "maxDelay": 1000          // Maximum delay in ms
      },
      "periodic": {
        "interval": 60000,        // Replication interval in ms
        "fullSync": false         // true = full sync, false = incremental
      },
      "conflict": {
        "resolution": "lww",      // Conflict resolution: "lww", "vector-clock"
        "mergeStrategy": "auto"   // Merge strategy: "auto", "manual"
      }
    },
    "loadBalancing": {
      "enabled": true,
      "strategy": "consistent-hash", // Strategy: "random", "round-robin", "least-conn", "consistent-hash"
      "healthCheck": {
        "enabled": true,
        "interval": 5000,         // Health check interval in ms
        "timeout": 2000,          // Health check timeout in ms
        "unhealthyThreshold": 3,  // Consecutive failures before a node is marked unhealthy
        "healthyThreshold": 2     // Consecutive successes before a node is marked healthy again
      },
      "affinity": {
        "enabled": true,
        "cookie": "llm-mcp-server", // Session affinity cookie
        "ttl": 3600               // Session affinity TTL in seconds
      }
    },
    "failover": {
      "enabled": true,
      "timeout": 10000,           // Failover timeout in ms
      "maxAttempts": 3,           // Maximum failover attempts
      "backoffFactor": 2,         // Backoff factor
      "strategy": "auto",         // Failover strategy: "auto", "manual"
      "splitBrain": {
        "prevention": "quorum",   // Split-brain prevention: "quorum", "priority"
        "quorum": {
          "min": 2                // Minimum quorum size
        },
        "priority": {
          "enabled": true,
          "rules": [              // Priority rules
            {"pattern": "llm-mcp-1", "priority": 100},
            {"pattern": "llm-mcp-2", "priority": 90},
            {"pattern": "llm-mcp-3", "priority": 80}
          ]
        }
      }
    },
    "persistence": {
      "enabled": true,
      "strategy": "periodic",     // Persistence strategy: "periodic", "onchange"
      "periodic": {
        "interval": 300000,       // Persistence interval in ms
        "includeTimestamp": true  // Include timestamp in persistence
      },
      "location": {
        "type": "shared-disk",    // Location type: "shared-disk", "s3"
        "path": "/var/lib/llm-mcp/ha"
      }
    }
  }
}
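
Quorum-based split-brain prevention relies on the fact that two disjoint partitions cannot both hold a majority of the cluster. The sketch below illustrates the check for the three-node example. The `majority` and `mayPromote` functions are hypothetical, and treating `quorum.min` as a lower bound alongside a strict majority is an assumption made for safety in this example.

```javascript
// Smallest number of nodes that forms a strict majority.
function majority(clusterSize) {
  return Math.floor(clusterSize / 2) + 1;
}

// Hypothetical check run by a node before it takes over during failover.
// reachableNodes counts the node itself plus every peer it can still reach.
function mayPromote(reachableNodes, clusterSize, configuredMin) {
  const required = Math.max(configuredMin, majority(clusterSize));
  return reachableNodes >= required;
}

console.log(mayPromote(2, 3, 2)); // true
console.log(mayPromote(1, 3, 2)); // false
```

For a three-node cluster, a minimum quorum of 2 matches the majority, so an isolated single node never promotes itself. For larger clusters a fixed minimum of 2 would be too low, which is why the sketch also enforces the majority.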

Kubernetes-Specific HA Configuration

For Kubernetes deployments:

{
  "highAvailability": {
    "kubernetes": {
      "enabled": true,
      "namespace": "llm-mcp",
      "resources": {
        "statefulSet": {
          "replicas": 3,
          "updateStrategy": "RollingUpdate",
          "podManagementPolicy": "OrderedReady"
        },
        "configMap": "llm-mcp-config",
        "secret": "llm-mcp-secret",
        "service": {
          "name": "llm-mcp",
          "ports": [
            {"name": "http", "port": 3001},
            {"name": "grpc", "port": 3002},
            {"name": "gossip", "port": 7946}
          ]
        },
        "podDisruptionBudget": {
          "minAvailable": 2       // Minimum available pods
        },
        "volumes": {
          "data": {
            "type": "persistentVolumeClaim",
            "claimName": "llm-mcp-data",
            "mountPath": "/var/lib/llm-mcp"
          }
        }
      },
      "serviceAccount": "llm-mcp-sa",
      "rbac": {
        "create": true,
        "rules": [
          {
            "apiGroups": [""],
            "resources": ["pods", "endpoints", "services"],
            "verbs": ["get", "list", "watch"]
          }
        ]
      }
    }
  }
}

Advanced gRPC Configuration

Configure the gRPC server for high-performance communication:

{
  "grpc": {
    "enabled": true,
    "port": 3002,
    "host": "0.0.0.0",
    "maxConcurrentStreams": 100,   // Maximum concurrent streams
    "keepalive": {
      "maxConnectionIdle": 300000, // Max idle time in ms
      "maxConnectionAge": 600000,  // Max connection age in ms
      "maxConnectionAgeGrace": 60000, // Grace period in ms
      "time": 7200000,            // Keepalive time in ms
      "timeout": 20000            // Keepalive timeout in ms
    },
    "channelOptions": {           // Channel options
      "grpc.max_send_message_length": 10485760, // 10 MB
      "grpc.max_receive_message_length": 10485760, // 10 MB
      "grpc.enable_channelz": 1,
      "grpc.enable_retries": 1,
      "grpc.service_config": "{\"loadBalancingConfig\": [{\"round_robin\":{}}]}"
    },
    "security": {
      "tls": {
        "enabled": true,
        "certFile": "/path/to/cert.pem",
        "keyFile": "/path/to/key.pem",
        "caFile": "/path/to/ca.pem",
        "requireClientCert": false
      },
      "auth": {
        "type": "apiKey",
        "metadataKey": "x-api-key"
      }
    },
    "compression": {
      "enabled": true,
      "algorithms": ["gzip", "deflate"],
      "defaultAlgorithm": "gzip"
    },
    "reflection": {
      "enabled": true             // Enable gRPC reflection
    },
    "healthCheck": {
      "enabled": true,            // Enable gRPC health checking
      "checkInterval": 10000      // Health check interval in ms
    },
    "interceptors": {
      "server": [                 // Server interceptors
        "logging",
        "authentication",
        "metrics",
        "rateLimit"
      ],
      "client": [                 // Client interceptors
        "logging",
        "retry",
        "metrics"
      ]
    },
    "services": {                 // Service-specific configuration
      "MCP": {
        "maxConcurrentCalls": 100,
        "timeout": 30000
      },
      "Vector": {
        "maxConcurrentCalls": 50,
        "timeout": 60000
      },
      "ToolRegistry": {
        "maxConcurrentCalls": 200,
        "timeout": 15000
      }
    }
  }
}
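
Because `grpc.service_config` is a JSON document embedded inside a JSON string, a misplaced brace in it is easy to miss when reviewing the outer file. A small pre-flight check along the following lines can catch that before the server starts. The `parseServiceConfig` function is a hypothetical helper, not part of LLM-MCP or of any gRPC library; it only checks that `loadBalancingConfig` is an array of single-key objects.

```javascript
// Hypothetical validator: returns the list of load-balancing policy names.
function parseServiceConfig(raw) {
  const config = JSON.parse(raw); // throws SyntaxError on malformed JSON
  const policies = config.loadBalancingConfig;
  if (!Array.isArray(policies)) {
    throw new TypeError("loadBalancingConfig must be an array");
  }
  return policies.map((entry) => {
    const names = Object.keys(entry);
    if (names.length !== 1) {
      throw new TypeError("each policy entry must have exactly one key");
    }
    return names[0];
  });
}

console.log(parseServiceConfig('{"loadBalancingConfig": [{"round_robin": {}}]}'));
```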

Environment-Specific Configurations

Define configurations for different environments:

{
  "environments": {
    "development": {
      "server": {
        "port": 3001,
        "cors": {
          "origin": ["*"]
        }
      },
      "logging": {
        "level": "debug",
        "format": "text",
        "colorize": true
      },
      "security": {
        "auth": {
          "type": "apiKey",
          "apiKey": {
            "keys": {
              "dev-key": {
                "roles": ["admin"]
              }
            }
          }
        }
      },
      "storage": {
        "metadata": {
          "type": "mongodb",
          "uri": "mongodb://localhost:27017/llm-mcp-dev"
        },
        "vector": {
          "type": "mongodb",
          "uri": "mongodb://localhost:27017/llm-mcp-dev"
        }
      }
    },
    "staging": {
      "server": {
        "port": 3001,
        "cors": {
          "origin": ["https://staging.example.com"]
        }
      },
      "logging": {
        "level": "info",
        "format": "json"
      },
      "security": {
        "auth": {
          "type": "jwt",
          "jwt": {
            "secret": "staging-jwt-secret",
            "algorithm": "HS256"
          }
        }
      },
      "storage": {
        "metadata": {
          "type": "mongodb",
          "uri": "mongodb://mongodb:27017/llm-mcp-staging"
        },
        "vector": {
          "type": "qdrant",
          "uri": "http://qdrant:6333"
        }
      }
    },
    "production": {
      "server": {
        "port": 3001,
        "cors": {
          "origin": ["https://app.example.com", "https://api.example.com"]
        },
        "cluster": {
          "enabled": true,
          "workers": "auto"
        }
      },
      "logging": {
        "level": "info",
        "format": "json",
        "transports": [
          {
            "type": "console",
            "level": "info"
          },
          {
            "type": "file",
            "level": "info",
            "filename": "/var/log/llm-mcp/server.log"
          }
        ]
      },
      "security": {
        "auth": {
          "type": "jwt",
          "jwt": {
            "algorithm": "RS256",
            "publicKeyFile": "/etc/llm-mcp/keys/public.pem",
            "privateKeyFile": "/etc/llm-mcp/keys/private.pem"
          }
        },
        "tls": {
          "enabled": true,
          "certFile": "/etc/llm-mcp/tls/cert.pem",
          "keyFile": "/etc/llm-mcp/tls/key.pem"
        }
      },
      "storage": {
        "metadata": {
          "type": "mongodb",
          "uri": "mongodb://mongodb-0.mongodb,mongodb-1.mongodb,mongodb-2.mongodb:27017/llm-mcp?replicaSet=rs0"
        },
        "vector": {
          "type": "qdrant",
          "uri": "http://qdrant:6333",
          "apiKey": "production-qdrant-api-key"
        }
      },
      "highAvailability": {
        "enabled": true,
        "mode": "active-active",
        "cluster": {
          "enabled": true,
          "discovery": {
            "type": "kubernetes"
          }
        }
      }
    }
  }
}
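
One way to read the `environments` block is as a set of overrides layered on top of the shared settings. The sketch below shows that layering with a recursive merge. The `resolveConfig` and `deepMerge` functions are hypothetical, and the rule that arrays and scalars replace the base value while nested objects merge key by key is an assumption, not documented LLM-MCP behavior.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Nested objects merge key by key; scalars and arrays replace the base value.
function deepMerge(base, override) {
  const result = { ...base };
  for (const [key, value] of Object.entries(override)) {
    result[key] =
      isPlainObject(value) && isPlainObject(base[key])
        ? deepMerge(base[key], value)
        : value;
  }
  return result;
}

// Hypothetical resolver: applies the block for envName on top of the
// top-level settings and drops the "environments" key from the result.
function resolveConfig(config, envName) {
  const { environments = {}, ...base } = config;
  return deepMerge(base, environments[envName] ?? {});
}

const resolved = resolveConfig(
  {
    server: { host: "0.0.0.0", port: 3001 },
    logging: { level: "info", format: "json" },
    environments: { development: { logging: { level: "debug", format: "text" } } },
  },
  "development"
);
console.log(resolved.logging.level, resolved.server.host);
```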

Complete Configuration Example

Below is a complete configuration example for a production LLM-MCP deployment:

{
  "appName": "llm-mcp",
  "version": "1.0.0",
  "environment": "production",
  
  "server": {
    "host": "0.0.0.0",
    "port": 3001,
    "grpcPort": 3002,
    "maxConnections": 1000,
    "requestTimeout": 30000,
    "maxRequestSize": "10mb",
    "compression": true,
    "trustProxy": true,
    "cors": {
      "enabled": true,
      "origin": ["https://app.example.com", "https://api.example.com"],
      "methods": ["GET", "POST", "PUT", "DELETE", "OPTIONS"],
      "allowedHeaders": ["Content-Type", "Authorization"],
      "exposedHeaders": ["X-Request-ID"],
      "credentials": false,
      "maxAge": 86400
    },
    "helmet": {
      "enabled": true,
      "contentSecurityPolicy": {
        "directives": {
          "default-src": ["'self'"],
          "script-src": ["'self'"]
        }
      },
      "hsts": {
        "maxAge": 15552000,
        "includeSubDomains": true
      }
    },
    "rateLimiting": {
      "enabled": true,
      "windowMs": 60000,
      "max": 100,
      "standardHeaders": true,
      "skip": {
        "paths": ["/health", "/metrics"],
        "ips": ["127.0.0.1"]
      }
    },
    "cluster": {
      "enabled": true,
      "workers": "auto",
      "restartOnFailure": true
    }
  },
  
  "security": {
    "auth": {
      "type": "jwt",
      "jwt": {
        "algorithm": "RS256",
        "publicKeyFile": "/etc/llm-mcp/keys/public.pem",
        "privateKeyFile": "/etc/llm-mcp/keys/private.pem",
        "audience": "llm-mcp-api",
        "issuer": "auth.example.com",
        "expiresIn": "1h",
        "clockTolerance": 30
      }
    },
    "rbac": {
      "enabled": true,
      "roles": {
        "admin": {
          "description": "Administrator with full access",
          "permissions": ["*"]
        },
        "user": {
          "description": "Regular user",
          "permissions": [
            "tool:list",
            "tool:execute",
            "vector:store",
            "vector:query",
            "model:list"
          ]
        },
        "readonly": {
          "description": "Read-only user",
          "permissions": [
            "tool:list",
            "model:list",
            "vector:query"
          ]
        }
      },
      "defaultRole": "readonly"
    },
    "encryption": {
      "enabled": true,
      "keyFile": "/etc/llm-mcp/keys/encryption-key.json",
      "algorithm": "aes-256-gcm",
      "fields": ["vector:metadata.credentials"]
    },
    "tls": {
      "enabled": true,
      "certFile": "/etc/llm-mcp/tls/cert.pem",
      "keyFile": "/etc/llm-mcp/tls/key.pem",
      "caFile": "/etc/llm-mcp/tls/ca.pem",
      "minVersion": "TLSv1.2"
    },
    "auditLogging": {
      "enabled": true,
      "logFile": "/var/log/llm-mcp/audit.log",
      "events": [
        "authentication",
        "authorization",
        "tool-registration",
        "tool-execution"
      ],
      "format": "json",
      "rotation": {
        "size": "100m",
        "interval": "1d",
        "maxFiles": 30
      }
    }
  },
  
  "storage": {
    "type": "composite",
    
    "metadata": {
      "type": "mongodb",
      "uri": "mongodb://mongodb-0.mongodb,mongodb-1.mongodb,mongodb-2.mongodb:27017/llm-mcp?replicaSet=rs0",
      "options": {
        "useNewUrlParser": true,
        "useUnifiedTopology": true,
        "maxPoolSize": 50,
        "connectTimeoutMS": 30000,
        "socketTimeoutMS": 45000,
        "serverSelectionTimeoutMS": 30000,
        "replicaSet": "rs0",
        "readPreference": "secondaryPreferred",
        "w": "majority",
        "wtimeoutMS": 10000,
        "ssl": true
      }
    },
    
    "vector": {
      "type": "qdrant",
      "qdrant": {
        "uri": "http://qdrant:6333",
        "apiKey": "your-qdrant-api-key",
        "timeout": 30000,
        "vectors": {
          "dimensions": 1536,
          "distance": "Cosine",
          "onDisk": true
        }
      }
    },
    
    "cache": {
      "type": "redis",
      "redis": {
        "host": "redis",
        "port": 6379,
        "password": "your-redis-password",
        "db": 0,
        "prefix": "llm-mcp:",
        "tls": {
          "enabled": true,
          "ca": "/etc/llm-mcp/tls/redis-ca.pem"
        },
        "cluster": {
          "enabled": true,
          "nodes": [
            { "host": "redis-0.redis", "port": 6379 },
            { "host": "redis-1.redis", "port": 6379 },
            { "host": "redis-2.redis", "port": 6379 }
          ]
        }
      }
    },
    
    "queue": {
      "type": "redis",
      "redis": {
        "host": "redis",
        "port": 6379,
        "password": "your-redis-password",
        "db": 1,
        "keyPrefix": "queue:"
      }
    },
    
    "backup": {
      "enabled": true,
      "schedule": "0 0 * * *",
      "path": "/var/backups/llm-mcp",
      "retention": {
        "days": 30,
        "count": 10
      },
      "storage": {
        "type": "s3",
        "s3": {
          "bucket": "llm-mcp-backups",
          "prefix": "production/",
          "region": "us-west-2"
        }
      }
    }
  },
  
  "toolRegistry": {
    "maxTools": 1000,
    "validation": {
      "enabled": true,
      "strictSchema": true,
      "validateFunctions": true,
      "allowRemote": true
    },
    "execution": {
      "timeout": 30000,
      "maxConcurrent": 100,
      "retryStrategy": {
        "attempts": 3,
        "delay": 1000,
        "backoff": 2,
        "maxDelay": 10000
      },
      "sandbox": {
        "enabled": true,
        "type": "vm",
        "resourceLimits": {
          "cpu": 1,
          "memory": "512M",
          "timeout": 30000
        }
      }
    },
    "discovery": {
      "providers": [
        {
          "name": "local",
          "enabled": true
        },
        {
          "name": "bfllm",
          "enabled": true,
          "url": "http://bfllm:3002/tools",
          "apiKey": "your-bfllm-api-key",
          "refresh": 60000
        }
      ],
      "autoRegister": {
        "enabled": true,
        "interval": 300000,
        "onStartup": true
      }
    },
    "rateLimit": {
      "enabled": true,
      "defaultRules": {
        "perSecond": 10,
        "perMinute": 100,
        "perHour": 1000
      },
      "byCategory": {
        "expensive": {
          "perSecond": 2,
          "perMinute": 20,
          "perHour": 100
        }
      }
    }
  },
  
  "cache": {
    "enabled": true,
    "defaultTTL": 300,
    "strategies": {
      "toolRegistry": {
        "enabled": true,
        "ttl": 3600
      },
      "toolExecution": {
        "enabled": true,
        "ttl": 300,
        "varyByUser": true
      },
      "modelRegistry": {
        "enabled": true,
        "ttl": 3600
      },
      "vectorSearch": {
        "enabled": true,
        "ttl": 300
      }
    },
    "storage": {
      "type": "redis",
      "redis": {
        "host": "redis",
        "port": 6379,
        "password": "your-redis-password",
        "db": 0,
        "keyPrefix": "llm-mcp:cache:"
      }
    },
    "compression": {
      "enabled": true,
      "threshold": 1024,
      "algorithm": "gzip"
    }
  },
  
  "logging": {
    "level": "info",
    "format": "json",
    "timestamp": true,
    "source": true,
    "requestId": true,
    "traceId": true,
    "transports": [
      {
        "type": "console",
        "level": "info"
      },
      {
        "type": "file",
        "level": "info",
        "filename": "/var/log/llm-mcp/server.log",
        "maxsize": 10485760,
        "maxFiles": 10,
        "tailable": true,
        "zippedArchive": true
      }
    ],
    "redact": [
      "req.headers.authorization",
      "req.body.password",
      "res.body.token",
      "*.credentials",
      "*.secret",
      "*.password"
    ]
  },
  
  "monitoring": {
    "metrics": {
      "enabled": true,
      "port": 9090,
      "path": "/metrics",
      "prefix": "llm-mcp_",
      "defaultLabels": {
        "environment": "production",
        "region": "us-west"
      },
      "collectors": [
        "system",
        "nodejs",
        "http",
        "tool",
        "vector",
        "grpc"
      ]
    },
    "tracing": {
      "enabled": true,
      "exporter": {
        "type": "jaeger",
        "host": "jaeger",
        "port": 6832
      },
      "sampler": {
        "type": "probabilistic",
        "rate": 0.1
      },
      "propagation": ["b3", "w3c"],
      "instrumentations": [
        "http",
        "grpc",
        "mongodb",
        "redis"
      ]
    },
    "health": {
      "enabled": true,
      "port": 8080,
      "path": "/health",
      "liveness": {
        "path": "/health/live",
        "failureThreshold": 3,
        "successThreshold": 1,
        "initialDelaySeconds": 30,
        "periodSeconds": 10,
        "timeoutSeconds": 5
      },
      "readiness": {
        "path": "/health/ready",
        "checks": [
          "storage",
          "toolRegistry",
          "cache",
          "queue"
        ],
        "failureThreshold": 3,
        "successThreshold": 1,
        "initialDelaySeconds": 60,
        "periodSeconds": 30,
        "timeoutSeconds": 10
      }
    },
    "alerts": {
      "enabled": true,
      "providers": [
        {
          "type": "email",
          "recipients": ["[email protected]"],
          "sender": "[email protected]"
        },
        {
          "type": "webhook",
          "url": "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
        },
        {
          "type": "pagerduty",
          "routingKey": "your-pagerduty-routing-key"
        }
      ],
      "rules": [
        {
          "name": "HighErrorRate",
          "condition": "rate(llm-mcp_http_requests_total{status_code=~\"5..\"}[5m]) / rate(llm-mcp_http_requests_total[5m]) > 0.05",
          "duration": "5m",
          "severity": "critical",
          "description": "High error rate detected",
          "providers": ["email", "pagerduty"]
        }
      ]
    }
  },
  
  "grpc": {
    "enabled": true,
    "port": 3002,
    "host": "0.0.0.0",
    "maxConcurrentStreams": 100,
    "security": {
      "tls": {
        "enabled": true,
        "certFile": "/etc/llm-mcp/tls/cert.pem",
        "keyFile": "/etc/llm-mcp/tls/key.pem",
        "caFile": "/etc/llm-mcp/tls/ca.pem"
      }
    },
    "compression": {
      "enabled": true,
      "algorithms": ["gzip"],
      "defaultAlgorithm": "gzip"
    },
    "reflection": {
      "enabled": true
    },
    "healthCheck": {
      "enabled": true
    }
  },
  
  "highAvailability": {
    "enabled": true,
    "mode": "active-active",
    "cluster": {
      "enabled": true,
      "discovery": {
        "type": "kubernetes",
        "kubernetes": {
          "namespace": "llm-mcp",
          "labelSelector": "app=llm-mcp-server"
        }
      }
    },
    "stateReplication": {
      "enabled": true,
      "strategy": "activeSync"
    },
    "loadBalancing": {
      "enabled": true,
      "strategy": "consistent-hash",
      "healthCheck": {
        "enabled": true,
        "interval": 5000,
        "timeout": 2000
      }
    },
    "failover": {
      "enabled": true,
      "timeout": 10000,
      "maxAttempts": 3,
      "strategy": "auto",
      "splitBrain": {
        "prevention": "quorum",
        "quorum": {
          "min": 2
        }
      }
    }
  },
  
  "resources": {
    "cpu": {
      "limit": 80,
      "target": 60,
      "throttling": {
        "enabled": true,
        "checkInterval": 5000
      }
    },
    "memory": {
      "limit": "4G",
      "reserved": "1G",
      "gc": {
        "heapThreshold": 80,
        "intervalMin": 30000
      }
    },
    "limits": {
      "toolExecution": {
        "cpu": 2,
        "memory": "1G",
        "timeout": 30000
      },
      "vectorOperations": {
        "cpu": 4,
        "memory": "2G",
        "timeout": 60000
      }
    }
  }
}
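
The `toolRegistry.execution.retryStrategy` values in the example above (`attempts`, `delay`, `backoff`, `maxDelay`) describe an exponential backoff. A minimal sketch of the resulting wait times follows. The `retryDelays` function is hypothetical, and it assumes that `attempts` counts retries and that each wait is `delay * backoff^n` capped at `maxDelay`.

```javascript
// Hypothetical helper: wait time in ms before each retry.
function retryDelays({ attempts, delay, backoff, maxDelay }) {
  const delays = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(maxDelay, delay * backoff ** i));
  }
  return delays;
}

// Values from the complete configuration example.
console.log(retryDelays({ attempts: 3, delay: 1000, backoff: 2, maxDelay: 10000 }));
// [ 1000, 2000, 4000 ]
```

Under these assumptions the three retries add up to 7 seconds of waiting, well inside the 30-second execution timeout, and the 10-second cap only takes effect from the fifth retry onward.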

Conclusion

This guide is a reference for configuring LLM-MCP in enterprise production environments. Tuning these parameters lets you balance performance, security, availability, and resource utilization for your specific deployment.

For more information:

See the API Reference for detailed API documentation
See the Architecture Overview to understand the system design
See the Troubleshooting Guide to resolve configuration issues

If you have questions about specific configuration options, please consult our community forum or contact our support team.
