LLM Gateway

API gateway for Large Language Model providers with request routing and caching.

Features

  • Multi-provider support for OpenAI, Anthropic, and LangChain
  • Request routing based on model availability
  • Caching layer with Redis support
  • Rate limiting per provider and endpoint
  • OpenTelemetry metrics and tracing
  • JWT authentication and API key management
  • Request validation and error handling
  • GraphQL and REST endpoints
  • OpenAPI 3.0 specification

Installation

npm install @bluefly/llm-gateway

Quick Start

import { createGateway } from '@bluefly/llm-gateway';

const gateway = createGateway({
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY,
    },
    anthropic: {
      apiKey: process.env.ANTHROPIC_API_KEY,
    },
  },
  cache: {
    enabled: true,
    ttl: 3600,
  },
  rateLimit: {
    windowMs: 60000,
    max: 100,
  },
});

await gateway.start(3000);

Configuration

Environment Variables

# Provider API Keys
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key

# Cache Configuration
REDIS_URL=redis://localhost:6379
CACHE_TTL=3600

# Rate Limiting
RATE_LIMIT_WINDOW=60000
RATE_LIMIT_MAX=100

# Server Configuration
PORT=3000
NODE_ENV=production
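
These values can come from the shell, a process manager, or an .env file. As a minimal sketch (assuming the dotenv package, which this guide does not otherwise mention), load them before creating the gateway:

// Hedged sketch: populate process.env from a local .env file via dotenv.
import 'dotenv/config';
import { createGateway } from '@bluefly/llm-gateway';

const gateway = createGateway({
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY },
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
  },
});

await gateway.start(Number(process.env.PORT) || 3000);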

Configuration File (gateway.config.js)

module.exports = {
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY,
      models: ['gpt-4', 'gpt-3.5-turbo'],
      timeout: 30000
    },
    anthropic: {
      apiKey: process.env.ANTHROPIC_API_KEY,
      models: ['claude-3-sonnet', 'claude-3-haiku'],
      timeout: 30000
    }
  },
  cache: {
    enabled: true,
    provider: 'redis',
    url: process.env.REDIS_URL,
    ttl: 3600
  },
  rateLimit: {
    windowMs: 60000,
    max: 100,
    skipSuccessfulRequests: false
  },
  monitoring: {
    enabled: true,
    endpoint: '/metrics'
  }
};
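
If createGateway accepts the same options object that gateway.config.js exports (an assumption; the docs above only show inline options), the file can be loaded and passed through directly:

// Hedged wiring sketch: reuse gateway.config.js as the createGateway options.
import { createGateway } from '@bluefly/llm-gateway';
import config from './gateway.config.js';

const gateway = createGateway(config);
await gateway.start(Number(process.env.PORT) || 3000);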

API Endpoints

REST API

# Chat completion
POST /api/v1/chat/completions
Content-Type: application/json

{
  "model": "gpt-4",
  "messages": [
    {"role": "user", "content": "Hello, world!"}
  ]
}

# Health check
GET /health

# Metrics (Prometheus format)
GET /metrics
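
For example, the chat completion endpoint can be exercised with curl (the Bearer token reflects the API-key auth used in the usage examples below):

# Send a chat completion request to a locally running gateway
curl -X POST http://localhost:3000/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello, world!"}]}'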

GraphQL API

mutation ChatCompletion($input: ChatInput!) {
  chat(input: $input) {
    id
    choices {
      message {
        role
        content
      }
    }
    usage {
      promptTokens
      completionTokens
      totalTokens
    }
  }
}
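
To call this mutation over HTTP, POST the query with a variables object. The /graphql path and the exact ChatInput fields are assumptions here; consult the gateway's GraphQL schema for the authoritative shape:

// Hedged sketch: POST the ChatCompletion mutation to an assumed /graphql path.
const mutation = `
  mutation ChatCompletion($input: ChatInput!) {
    chat(input: $input) {
      id
      choices { message { role content } }
      usage { promptTokens completionTokens totalTokens }
    }
  }
`;

const response = await fetch('http://localhost:3000/graphql', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your-api-key'
  },
  body: JSON.stringify({
    query: mutation,
    variables: {
      // Hypothetical ChatInput fields, mirroring the REST request body
      input: {
        model: 'gpt-4',
        messages: [{ role: 'user', content: 'Hello, world!' }]
      }
    }
  })
});

const { data } = await response.json();
console.log(data.chat.choices[0].message.content);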

Usage Examples

Basic Chat Completion

const response = await fetch('http://localhost:3000/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your-api-key'
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [
      { role: 'user', content: 'Explain quantum computing' }
    ]
  })
});

const result = await response.json();
console.log(result.choices[0].message.content);

Streaming Response

const response = await fetch('http://localhost:3000/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your-api-key'
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Tell me a story' }],
    stream: true
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  const chunk = decoder.decode(value);
  console.log(chunk);
}
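
The loop above prints raw chunks. If the gateway relays OpenAI-style server-sent events (an assumption; the wire format is not documented in this section), the read loop can instead accumulate data: lines and extract the streamed text:

// Sketch: replaces the read loop above. Assumes OpenAI-style SSE frames,
// i.e. `data: {json}` lines ending with a `data: [DONE]` sentinel.
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep a partial trailing line for the next chunk

  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice(6).trim();
    if (payload === '[DONE]') continue;
    const event = JSON.parse(payload);
    // The delta shape mirrors OpenAI's streaming responses (assumption).
    process.stdout.write(event.choices?.[0]?.delta?.content ?? '');
  }
}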

Development

Setup

# Clone repository
git clone https://gitlab.bluefly.io/llm/llm-gateway.git
cd llm-gateway

# Install dependencies
npm install

# Start development server
npm run dev

# Run tests
npm test

Scripts

# Development
npm run dev # Start with hot reload
npm run dev:debug # Start with debugging

# Building
npm run build # Build for production
npm run build:prod # Build with minification

# Testing
npm test # Run all tests
npm run test:watch # Run tests in watch mode
npm run test:coverage # Generate coverage report

# Linting
npm run lint # Check code style
npm run lint:fix # Fix code style issues

# Type checking
npm run typecheck # Check TypeScript types

Docker

# Build image
npm run docker:build

# Run container
docker run -p 3000:3000 -e OPENAI_API_KEY=your_key llm-gateway
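
To exercise the Redis cache locally, run a Redis container alongside the gateway and point REDIS_URL at it (a sketch; the network and container names are illustrative):

# Create a shared network and start Redis
docker network create llm-net
docker run -d --name redis --network llm-net redis:7

# Run the gateway on the same network
docker run -p 3000:3000 --network llm-net \
  -e OPENAI_API_KEY=your_key \
  -e REDIS_URL=redis://redis:6379 \
  llm-gateway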

Monitoring

Health Checks

# Basic health check
curl http://localhost:3000/health

# Detailed health with provider status
curl http://localhost:3000/health/detailed

Metrics

The gateway exposes Prometheus metrics at /metrics:

  • llm_gateway_requests_total - Total number of requests
  • llm_gateway_request_duration_seconds - Request duration histogram
  • llm_gateway_provider_errors_total - Provider error counts
  • llm_gateway_cache_hits_total - Cache hit/miss counts
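
Since the *_total metrics are counters and the duration metric is a histogram, typical dashboard queries wrap them in rate(). For example, in PromQL (assuming Prometheus scrapes the /metrics endpoint):

# Requests per second over the last 5 minutes
rate(llm_gateway_requests_total[5m])

# Approximate p95 request latency from the histogram buckets
histogram_quantile(0.95, rate(llm_gateway_request_duration_seconds_bucket[5m]))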

Logging

Structured logging with configurable levels:

# Log levels: error, warn, info, debug
LOG_LEVEL=info

# Log format: json, text
LOG_FORMAT=json

Deployment

Environment Variables

Required environment variables for production:

# Provider Configuration
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Cache Configuration
REDIS_URL=redis://localhost:6379

# Security
JWT_SECRET=your-jwt-secret
API_KEY=your-api-key

# Monitoring
OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:14268/api/traces

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: llm-gateway
  template:
    metadata:
      labels:
        app: llm-gateway
    spec:
      containers:
        - name: llm-gateway
          image: llm-gateway:latest
          ports:
            - containerPort: 3000
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: llm-secrets
                  key: openai-api-key
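
The /health endpoint shown under Monitoring maps naturally onto Kubernetes probes. A sketch of probe settings to add under the container spec (the thresholds are illustrative, not project defaults):

          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 15
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10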

Contributing

  1. Fork the repository on GitLab
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass
  6. Submit a merge request

Repository

https://gitlab.bluefly.io/llm/llm-gateway

License

MIT