# LLM Gateway
API gateway for Large Language Model providers with request routing and caching.
## Features
- Multi-provider support for OpenAI, Anthropic, and LangChain
- Request routing based on model availability
- Caching layer with Redis support
- Rate limiting per provider and endpoint
- OpenTelemetry metrics and tracing
- JWT authentication and API key management
- Request validation and error handling
- GraphQL and REST endpoints
- OpenAPI 3.0 specification
## Installation

```bash
npm install @bluefly/llm-gateway
```
## Quick Start

```javascript
import { createGateway } from '@bluefly/llm-gateway';

const gateway = createGateway({
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY,
    },
    anthropic: {
      apiKey: process.env.ANTHROPIC_API_KEY,
    },
  },
  cache: {
    enabled: true,
    ttl: 3600,
  },
  rateLimit: {
    windowMs: 60000,
    max: 100,
  },
});

await gateway.start(3000);
```
## Configuration

### Environment Variables

```bash
# Provider API Keys
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key

# Cache Configuration
REDIS_URL=redis://localhost:6379
CACHE_TTL=3600

# Rate Limiting
RATE_LIMIT_WINDOW=60000
RATE_LIMIT_MAX=100

# Server Configuration
PORT=3000
NODE_ENV=production
```
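If you assemble the gateway options from the environment yourself, note that the numeric variables arrive as strings. A minimal sketch, assuming the option names from the Quick Start above (the fallback values are illustrative, not documented defaults):

```javascript
// Hypothetical mapping of environment variables to gateway options.
// Option names follow the Quick Start example; defaults are illustrative.
const envConfig = {
  providers: {
    openai: { apiKey: process.env.OPENAI_API_KEY },
    anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
  },
  cache: {
    enabled: Boolean(process.env.REDIS_URL),
    ttl: parseInt(process.env.CACHE_TTL ?? '3600', 10),
  },
  rateLimit: {
    windowMs: parseInt(process.env.RATE_LIMIT_WINDOW ?? '60000', 10),
    max: parseInt(process.env.RATE_LIMIT_MAX ?? '100', 10),
  },
};
```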
### Configuration File (`gateway.config.js`)
```javascript
module.exports = {
  providers: {
    openai: {
      apiKey: process.env.OPENAI_API_KEY,
      models: ['gpt-4', 'gpt-3.5-turbo'],
      timeout: 30000
    },
    anthropic: {
      apiKey: process.env.ANTHROPIC_API_KEY,
      models: ['claude-3-sonnet', 'claude-3-haiku'],
      timeout: 30000
    }
  },
  cache: {
    enabled: true,
    provider: 'redis',
    url: process.env.REDIS_URL,
    ttl: 3600
  },
  rateLimit: {
    windowMs: 60000,
    max: 100,
    skipSuccessfulRequests: false
  },
  monitoring: {
    enabled: true,
    endpoint: '/metrics'
  }
};
```
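The file exports the same shape that `createGateway` accepts in the Quick Start, so one plausible way to wire it up (a sketch, assuming `createGateway` takes this object directly):

```javascript
import { createGateway } from '@bluefly/llm-gateway';
// Assumption: the config file's export matches the createGateway options shape.
import config from './gateway.config.js';

const gateway = createGateway(config);
await gateway.start(process.env.PORT ? Number(process.env.PORT) : 3000);
```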
## API Endpoints

### REST API

```bash
# Chat completion
POST /api/v1/chat/completions
Content-Type: application/json

{
  "model": "gpt-4",
  "messages": [
    {"role": "user", "content": "Hello, world!"}
  ]
}

# Health check
GET /health

# Metrics (Prometheus format)
GET /metrics
```
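For a quick end-to-end check from the shell, the completion endpoint can be exercised with curl (the Bearer token assumes API key auth is enabled, as in the usage examples below):

```bash
curl -X POST http://localhost:3000/api/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-key' \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello, world!"}]
  }'
```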
### GraphQL API

```graphql
mutation ChatCompletion($input: ChatInput!) {
  chat(input: $input) {
    id
    choices {
      message {
        role
        content
      }
    }
    usage {
      promptTokens
      completionTokens
      totalTokens
    }
  }
}
```
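Calling the mutation over HTTP looks like any other GraphQL request. A sketch, assuming the gateway serves GraphQL at `/graphql` (the path is not specified above) and that `ChatInput` mirrors the REST request body:

```javascript
const response = await fetch('http://localhost:3000/graphql', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your-api-key'
  },
  body: JSON.stringify({
    query: `mutation ChatCompletion($input: ChatInput!) {
      chat(input: $input) { id choices { message { role content } } }
    }`,
    // Assumption: ChatInput mirrors the REST chat/completions body.
    variables: {
      input: { model: 'gpt-4', messages: [{ role: 'user', content: 'Hello!' }] }
    }
  })
});

const { data } = await response.json();
console.log(data.chat.choices[0].message.content);
```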
## Usage Examples

### Basic Chat Completion

```javascript
const response = await fetch('http://localhost:3000/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your-api-key'
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [
      { role: 'user', content: 'Explain quantum computing' }
    ]
  })
});

const result = await response.json();
console.log(result.choices[0].message.content);
```
### Streaming Response

```javascript
const response = await fetch('http://localhost:3000/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer your-api-key'
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Tell me a story' }],
    stream: true
  })
});

const reader = response.body.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  const chunk = new TextDecoder().decode(value);
  console.log(chunk);
}
```
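If the gateway relays OpenAI-style server-sent events (an assumption; the wire format is not documented above), each chunk contains `data:` lines that can be parsed incrementally. The `handleChunk` helper below could replace the `console.log(chunk)` call in the loop:

```javascript
// Sketch: parse OpenAI-style SSE lines from the raw chunks above.
// Assumes events are lines of the form `data: {...}` ending with `data: [DONE]`.
const decoder = new TextDecoder();
let buffer = '';

function handleChunk(value) {
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep any partial line for the next chunk
  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const payload = line.slice(6);
    if (payload === '[DONE]') return;
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (delta) process.stdout.write(delta);
  }
}
```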
## Development

### Setup

```bash
# Clone repository
git clone https://gitlab.bluefly.io/llm/llm-gateway.git
cd llm-gateway

# Install dependencies
npm install

# Start development server
npm run dev

# Run tests
npm test
```
### Scripts

```bash
# Development
npm run dev             # Start with hot reload
npm run dev:debug       # Start with debugging

# Building
npm run build           # Build for production
npm run build:prod      # Build with minification

# Testing
npm test                # Run all tests
npm run test:watch      # Run tests in watch mode
npm run test:coverage   # Generate coverage report

# Linting
npm run lint            # Check code style
npm run lint:fix        # Fix code style issues

# Type checking
npm run typecheck       # Check TypeScript types
```
### Docker

```bash
# Build image
npm run docker:build

# Run container
docker run -p 3000:3000 -e OPENAI_API_KEY=your_key llm-gateway
```
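Since the cache layer expects Redis, running the gateway and Redis together with Compose is convenient. A sketch; the image name matches the build above, everything else is illustrative:

```yaml
# docker-compose.yml - illustrative; image name matches the docker:build output above.
services:
  gateway:
    image: llm-gateway:latest
    ports:
      - "3000:3000"
    environment:
      OPENAI_API_KEY: ${OPENAI_API_KEY}
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      REDIS_URL: redis://redis:6379
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
```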
## Monitoring

### Health Checks

```bash
# Basic health check
curl http://localhost:3000/health

# Detailed health with provider status
curl http://localhost:3000/health/detailed
```
### Metrics

The gateway exposes Prometheus metrics at `/metrics`:

- `llm_gateway_requests_total` - Total number of requests
- `llm_gateway_request_duration_seconds` - Request duration histogram
- `llm_gateway_provider_errors_total` - Provider error counts
- `llm_gateway_cache_hits_total` - Cache hit/miss counts
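These names plug straight into PromQL. Two example queries (the `_bucket` suffix is the standard Prometheus histogram convention; check the actual label set on `/metrics` before relying on specific labels):

```promql
# Request rate over the last 5 minutes
rate(llm_gateway_requests_total[5m])

# 95th percentile request latency from the duration histogram
histogram_quantile(0.95, sum by (le) (rate(llm_gateway_request_duration_seconds_bucket[5m])))
```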
### Logging

Structured logging with configurable levels:

```bash
# Log levels: error, warn, info, debug
LOG_LEVEL=info

# Log format: json, text
LOG_FORMAT=json
```
## Deployment

### Environment Variables

Required environment variables for production:

```bash
# Provider Configuration
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

# Cache Configuration
REDIS_URL=redis://localhost:6379

# Security
JWT_SECRET=your-jwt-secret
API_KEY=your-api-key

# Monitoring
OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:14268/api/traces
```
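In Kubernetes these values live in a Secret. The Deployment below reads `openai-api-key` from a Secret named `llm-secrets`, which can be created like this (other keys follow the same pattern):

```bash
kubectl create secret generic llm-secrets \
  --from-literal=openai-api-key=sk-...
```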
### Kubernetes

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-gateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: llm-gateway
  template:
    metadata:
      labels:
        app: llm-gateway
    spec:
      containers:
        - name: llm-gateway
          image: llm-gateway:latest
          ports:
            - containerPort: 3000
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: llm-secrets
                  key: openai-api-key
```
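To route traffic to the replicas you also need a Service; a minimal sketch matching the labels and port above (the external port is illustrative):

```yaml
# Illustrative Service exposing the Deployment above inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: llm-gateway
spec:
  selector:
    app: llm-gateway
  ports:
    - port: 80
      targetPort: 3000
```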
## Contributing
- Fork the repository on GitLab
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Ensure all tests pass
- Submit a merge request
## Repository

Source is hosted on GitLab: https://gitlab.bluefly.io/llm/llm-gateway
## License
MIT