LLM Platform Architecture Overview

System Overview

The LLM Platform is a multi-tier enterprise AI system built on Drupal 11 with TypeScript/Node.js microservices. The architecture separates concerns between data processing, AI operations, and user interfaces while maintaining security and scalability.

Core Architecture Principles

Microservices Design: NPM packages provide specialized AI services
Drupal Integration: Custom modules extend Drupal with AI capabilities
Security First: All AI operations include security scanning and compliance
Test-Driven Development: Minimum test coverage of 95% for NPM packages and 85% for Drupal modules
Multi-Provider Support: Abstract AI provider implementations for vendor flexibility

System Tiers

Tier 1: Foundation Services (NPM Packages)

┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│   apple-fm      │  │  llm-gateway    │  │     tddai       │
│ Apple Models    │  │ API Gateway     │  │ Test Framework  │
│ TypeScript SDK  │  │ Multi-Provider  │  │ AI Integration  │
└─────────────────┘  └─────────────────┘  └─────────────────┘
         │                     │                     │
         └─────────────────────┼─────────────────────┘
                               │
              ┌─────────────────┴─────────────────┐
              │         llm-gateway-sdk           │
              │     OpenAPI Generated SDK         │
              │    TypeScript Client Library      │
              └───────────────────────────────────┘

Production Ready:

apple-fm: TypeScript SDK for Apple Foundation Models with real Ollama integration
llm-gateway-sdk: OpenAPI-generated client with comprehensive API coverage
tddai: Working test framework with AI integration and multiple presets

In Development:

llm-gateway: Core API gateway works; advanced features are still in development
llm-ui: React components exist; the build still needs optimization

Tier 2: Drupal Integration Layer

┌──────────────────────────────────────────────────────────────┐
│                     Drupal 11 Core                          │
├──────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐   │
│  │     llm     │  │ai_provider_ │  │  api_normalization  │   │
│  │ Core Module │  │    apple    │  │     Framework       │   │
│  │Production   │  │ Complete    │  │    Ready            │   │
│  └─────────────┘  └─────────────┘  └─────────────────────┘   │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐   │
│  │gov_compliance│  │mcp_client_  │  │ recipe_onboarding   │   │
│  │  Framework  │  │   extras    │  │    Framework        │   │
│  │    Stage    │  │ Development │  │      Stage          │   │
│  └─────────────┘  └─────────────┘  └─────────────────────┘   │
└──────────────────────────────────────────────────────────────┘

Production Ready:

llm: Enterprise AI platform with circuit breaker patterns and intelligent caching
ai_provider_apple: Complete Apple Silicon optimization with 85% test coverage

Framework Stage:

api_normalization: 95/100 innovation score, Drupal.org ready
gov_compliance: Compliance framework structure
recipe_onboarding: Installation and setup automation

Tier 3: Deployment & Configuration

┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│  llm_platform   │  │ secure_drupal   │  │llm_platform_    │
│     Recipe      │  │     Recipe      │  │   manager       │
│   Optimized     │  │   Security      │  │     Theme       │
│   Deployment    │  │   Focused       │  │   Enterprise    │
└─────────────────┘  └─────────────────┘  └─────────────────┘

Data Flow Architecture

AI Request Processing Flow

User Request → Drupal Module → NPM Service → AI Provider → Response Processing → User Interface
     ↓              ↓              ↓            ↓              ↓              ↓
  Security    → Validation  → Rate Limiting → API Call → Content Safety → Caching
  Checking      & Auth        & Queuing       & Retry     Filtering       & Delivery

Key Components

Request Validation: Input sanitization and security scanning
Provider Abstraction: Unified interface for multiple AI providers
Queue Management: Asynchronous processing for long-running operations
Response Processing: Content safety filtering and format normalization
Caching Strategy: Multi-layer caching for performance optimization
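
The pre-provider part of the flow above can be sketched as a chain of stages, each of which either passes the request on or rejects it. This is a minimal illustration, not the platform's implementation: the AIRequest shape, the stage names, and the per-user counter are all assumptions made for the example.

```typescript
// Hypothetical request shape; the platform's real types are not shown in this document.
interface AIRequest {
  userId: string;
  prompt: string;
}

// A stage either returns the (possibly modified) request or throws to reject it.
type Stage = (req: AIRequest) => AIRequest;

// Validation: replace control characters with spaces and reject empty prompts.
const validate: Stage = (req) => {
  const prompt = req.prompt.replace(/[\u0000-\u001f]+/g, " ").trim();
  if (prompt.length === 0) throw new Error("empty prompt");
  return { ...req, prompt };
};

// Rate limiting: allow at most `limit` requests per user (no time window, for brevity).
function rateLimit(limit: number): Stage {
  const counts = new Map<string, number>();
  return (req) => {
    const used = (counts.get(req.userId) ?? 0) + 1;
    if (used > limit) throw new Error("rate limit exceeded");
    counts.set(req.userId, used);
    return req;
  };
}

// Run every stage in order; the result is what would be handed to the provider call.
function runStages(req: AIRequest, stages: Stage[]): AIRequest {
  return stages.reduce((current, stage) => stage(current), req);
}
```

Keeping each stage a plain function makes the order of checks explicit and lets individual stages be unit tested in isolation.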

Security Architecture

Multi-Layer Security Model

┌─────────────────────────────────────────────────────────────┐
│                    Application Layer                        │
│  • Input Validation  • CSRF Protection  • Rate Limiting    │
├─────────────────────────────────────────────────────────────┤
│                      Service Layer                         │
│  • API Key Management  • Circuit Breakers  • Audit Logs   │
├─────────────────────────────────────────────────────────────┤
│                      Data Layer                            │
│  • Field Encryption  • Secure Storage  • Access Controls  │
└─────────────────────────────────────────────────────────────┘

Security Features

Field-Level Encryption: Sensitive data encrypted at rest
API Key Rotation: Automated key management and rotation
Circuit Breaker Pattern: Prevents cascade failures
Audit Logging: Comprehensive security event tracking
Compliance Framework: GDPR, HIPAA, SOC2 compliance modules
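
As an illustration of the circuit breaker pattern listed above, the following sketch tracks consecutive failures and blocks calls while the breaker is open. It is a simplified model under stated assumptions: the class name, the threshold and cooldown parameters, and the injectable clock are hypothetical and do not describe the llm module's actual implementation.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number; // consecutive failures before opening
  private readonly cooldownMs: number;       // time to stay open before trial calls
  private readonly now: () => number;        // injectable clock, useful in tests

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Whether a call to the provider may be attempted right now.
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: let trial calls through
    }
    return this.state !== "open";
  }

  // A successful call closes the breaker and resets the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failed trial call, or too many consecutive failures, opens the breaker.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

A caller would check canRequest() before each provider call and report the outcome with recordSuccess() or recordFailure(), so a failing provider stops receiving traffic until the cooldown has passed.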

Integration Patterns

NPM Package → Drupal Module Integration

Service Container: NPM packages registered as Drupal services
Configuration Bridge: Drupal configuration mapped to NPM package settings
Event System: Drupal hooks trigger NPM package operations
Queue Integration: Long-running NPM operations processed via Drupal queues

Provider Abstraction Pattern

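/**
 * Contract that every AI provider implements.
 * Each method accepts an $options array and returns an AIResponse.
 */
interface AIProviderInterface {
    // Generate text from a prompt.
    public function generateText(string $prompt, array $options): AIResponse;
    // Generate an image from a prompt.
    public function generateImage(string $prompt, array $options): AIResponse;
    // Process audio input.
    public function processAudio(string $audio, array $options): AIResponse;
}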

Each provider (OpenAI, Anthropic, Apple, etc.) implements this interface, so calling code can switch providers without being changed.
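
On the Node.js side, the same idea could look roughly like the sketch below. It is a hypothetical TypeScript counterpart that mirrors only generateText from the PHP contract above; AIProvider, EchoProvider, and ProviderRegistry are illustrative names, and EchoProvider is a stub standing in for a real API client.

```typescript
// Hypothetical TypeScript counterpart of the PHP contract; names are illustrative.
interface AIResponse {
  provider: string;
  content: string;
}

interface AIProvider {
  readonly name: string;
  generateText(prompt: string, options?: Record<string, unknown>): Promise<AIResponse>;
}

// Stub provider that echoes the prompt, standing in for a real API client.
class EchoProvider implements AIProvider {
  readonly name: string;

  constructor(name: string) {
    this.name = name;
  }

  async generateText(prompt: string): Promise<AIResponse> {
    return { provider: this.name, content: `echo: ${prompt}` };
  }
}

class ProviderRegistry {
  private readonly providers = new Map<string, AIProvider>();
  private active: string | undefined;

  register(provider: AIProvider): void {
    this.providers.set(provider.name, provider);
    // The first registered provider becomes the default.
    if (this.active === undefined) this.active = provider.name;
  }

  // Switch the active provider by name; callers keep using current().
  use(name: string): void {
    if (!this.providers.has(name)) throw new Error(`unknown provider: ${name}`);
    this.active = name;
  }

  current(): AIProvider {
    const provider = this.active === undefined ? undefined : this.providers.get(this.active);
    if (provider === undefined) throw new Error("no provider registered");
    return provider;
  }
}
```

Because callers depend only on the AIProvider interface, switching from one provider to another is a single use() call rather than a code change.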

Performance Architecture

Caching Strategy

Response Caching: AI responses cached by content hash
Vector Caching: Embeddings cached for similarity searches
Configuration Caching: Provider settings cached for performance
Edge Caching: CDN integration for static AI-generated content
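
Response caching by content hash can be sketched as follows. The example is a simplified, in-memory illustration under several assumptions: FNV-1a stands in for whatever digest the platform actually uses (a production system would more likely use a cryptographic hash such as SHA-256 and a shared store such as Redis), and the whitespace normalization, TTL, and class names are hypothetical.

```typescript
// FNV-1a (32-bit) over UTF-16 code units: a small non-cryptographic hash,
// used here only to keep the sketch dependency-free.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

interface CacheEntry {
  keyText: string;   // kept so hash collisions can be detected
  response: string;
  expiresAt: number; // epoch milliseconds
}

class ResponseCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Requests with the same provider and whitespace-normalized prompt share a key.
  private static keyText(provider: string, prompt: string): string {
    return `${provider}\n${prompt.trim().replace(/\s+/g, " ")}`;
  }

  get(provider: string, prompt: string): string | undefined {
    const keyText = ResponseCache.keyText(provider, prompt);
    const entry = this.entries.get(fnv1a(keyText));
    if (entry === undefined || entry.keyText !== keyText) return undefined; // miss or collision
    if (entry.expiresAt <= this.now()) return undefined;                    // expired
    return entry.response;
  }

  set(provider: string, prompt: string, response: string): void {
    const keyText = ResponseCache.keyText(provider, prompt);
    this.entries.set(fnv1a(keyText), { keyText, response, expiresAt: this.now() + this.ttlMs });
  }
}
```

Including the provider in the key keeps responses from different providers apart, which matters once provider switching is in play.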

Scaling Considerations

Horizontal Scaling: NPM services can be deployed across multiple containers
Load Balancing: API gateway distributes requests across service instances
Database Optimization: Specialized tables for AI operations and logging
Queue Processing: Background workers handle resource-intensive operations

Development Environment

Local Development Stack

Docker Compose Environment:
├── Drupal 11 (Apache/PHP 8.3)
├── MySQL 8.0
├── Redis (Caching)
├── Node.js 18+ (NPM Services)
└── Testing Services (Jest, PHPUnit)

Required Dependencies

PHP Dependencies:

Drupal 11.x
Key module (API key management)
Redis module (performance caching)

Node.js Dependencies:

TypeScript 5.x
OpenAPI generators
Jest testing framework

Deployment Architecture

Production Environment

┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│   Load Balancer │  │   Web Servers   │  │   DB Cluster    │
│    (HAProxy)    │→ │  (Drupal 11)    │→ │   (MySQL 8)     │
└─────────────────┘  └─────────────────┘  └─────────────────┘
         │                     │                     │
         ↓                     ↓                     ↓
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│   CDN/Edge      │  │  NPM Services   │  │ Redis Cluster   │
│   (Cloudflare)  │  │  NPM Services   │  │ Redis Cluster   │
└─────────────────┘  └─────────────────┘  └─────────────────┘

Container Architecture

Each NPM package can be containerized and deployed independently:

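# Example: apple-fm service container
FROM node:18-alpine
WORKDIR /app
# Install production dependencies only, using the lockfile
COPY package*.json ./
RUN npm ci --omit=dev
# Copy the prebuilt output (dist/ must be built before the image)
COPY dist/ ./dist/
EXPOSE 3000
CMD ["node", "dist/index.js"]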

Monitoring & Observability

Health Monitoring

Service Health Checks: Each NPM service exposes health endpoints
Database Monitoring: Connection pooling and query performance
AI Provider Monitoring: Response times and error rates per provider
Queue Monitoring: Processing times and backlog tracking
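
A health endpoint typically folds the individual checks above into one overall status. The sketch below shows one possible aggregation, assuming three status levels and a hypothetical mapping to HTTP status codes; the type names and the rule that "degraded" still returns 200 are illustrative choices, not documented platform behavior.

```typescript
type Status = "up" | "degraded" | "down";

interface ComponentHealth {
  name: string;
  status: Status;
  latencyMs?: number;
}

interface HealthReport {
  status: Status;
  components: ComponentHealth[];
}

// The overall status is the worst status reported by any component.
function summarize(components: ComponentHealth[]): HealthReport {
  const rank: Record<Status, number> = { up: 0, degraded: 1, down: 2 };
  let worst: Status = "up";
  for (const component of components) {
    if (rank[component.status] > rank[worst]) worst = component.status;
  }
  return { status: worst, components };
}

// HTTP status a health endpoint could return for the report:
// only a fully "down" report takes the instance out of rotation.
function httpStatusFor(report: HealthReport): number {
  return report.status === "down" ? 503 : 200;
}
```

For example, a report with a healthy database and a slow AI provider would be summarized as "degraded" and still answer with 200, while any component marked "down" would yield 503.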

Logging Strategy

Application Logs → Structured JSON → Centralized Logging → Alerting
     ↓                    ↓               ↓              ↓
  Drupal Logs    →  Log Aggregation → Elasticsearch → PagerDuty
  NPM Logs       →     (Fluentd)     →    (ELK)     → Email/Slack
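
The "Structured JSON" step for the NPM services could be as small as the formatter sketched below, which emits one JSON object per line so that an aggregator can parse each entry. The field names, the log levels, and the list of redacted keys are assumptions for illustration; the document does not specify the platform's log schema.

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Keys whose values should never reach the log pipeline (an illustrative list).
const REDACTED_KEYS = new Set(["apiKey", "authorization", "password"]);

function formatLog(
  level: Level,
  service: string,
  message: string,
  context: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  // Copy the context, masking sensitive top-level values.
  const safeContext: Record<string, unknown> = {};
  for (const key of Object.keys(context)) {
    safeContext[key] = REDACTED_KEYS.has(key) ? "[redacted]" : context[key];
  }
  // JSON.stringify escapes newlines, so each entry stays on a single line.
  return JSON.stringify({
    timestamp: now.toISOString(),
    level,
    service,
    message,
    context: safeContext,
  });
}
```

A service would write the returned string to standard output, for example formatLog("error", "llm-gateway", "provider timeout", { provider: "openai" }), and leave shipping to the aggregation layer.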

Configuration Management

Environment-Based Configuration

# Development
ai_providers:
  openai:
    endpoint: "https://api.openai.com/v1"
    timeout: 30000
  apple:
    endpoint: "http://localhost:3001"
    timeout: 5000

# Production
ai_providers:
  openai:
    endpoint: "https://api.openai.com/v1"
    timeout: 10000
  apple:
    endpoint: "https://apple-fm.internal.cluster"
    timeout: 2000
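
A service could resolve these settings at startup along the lines of the sketch below. The endpoint and timeout values are copied from the configuration above; the function and type names are hypothetical, and the timeouts are assumed to be milliseconds, which the configuration does not state.

```typescript
type Environment = "development" | "production";

interface ProviderConfig {
  endpoint: string;
  timeout: number; // assumed to be milliseconds
}

// Values mirror the YAML above.
const config: Record<Environment, Record<string, ProviderConfig>> = {
  development: {
    openai: { endpoint: "https://api.openai.com/v1", timeout: 30000 },
    apple: { endpoint: "http://localhost:3001", timeout: 5000 },
  },
  production: {
    openai: { endpoint: "https://api.openai.com/v1", timeout: 10000 },
    apple: { endpoint: "https://apple-fm.internal.cluster", timeout: 2000 },
  },
};

function isEnvironment(value: string): value is Environment {
  return value === "development" || value === "production";
}

// Fail fast on unknown environments or providers instead of falling back silently.
function resolveProvider(env: string, provider: string): ProviderConfig {
  if (!isEnvironment(env)) throw new Error(`unknown environment: ${env}`);
  const settings = config[env][provider];
  if (settings === undefined) throw new Error(`unknown provider: ${provider}`);
  return settings;
}
```

Failing at startup on a misspelled environment name is usually preferable to silently running production traffic against development endpoints.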

Feature Flags

Provider Switching: Enable/disable AI providers without deployment
Feature Rollouts: Gradual feature enablement for user groups
Emergency Shutoffs: Quick disable of problematic features
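
One way to support both gradual rollouts and emergency shutoffs is a flag with an on/off switch plus a rollout percentage, as sketched below. This is an assumption about how such flags might work, not the platform's documented mechanism; the names are hypothetical, and the percentage-based bucketing is just one interpretation of "gradual enablement for user groups".

```typescript
interface Flag {
  enabled: boolean;       // emergency shutoff: false disables the feature for everyone
  rolloutPercent: number; // 0-100: share of users who see the feature
}

// Deterministic bucket in the range 0-99, so a user keeps the same answer across requests.
function bucket(flagName: string, userId: string): number {
  const text = `${flagName}:${userId}`;
  let hash = 0;
  for (let i = 0; i < text.length; i++) {
    hash = (Math.imul(hash, 31) + text.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

class FeatureFlags {
  private readonly flags = new Map<string, Flag>();

  set(name: string, flag: Flag): void {
    this.flags.set(name, flag);
  }

  isEnabled(name: string, userId: string): boolean {
    const flag = this.flags.get(name);
    if (flag === undefined || !flag.enabled) return false; // unknown or shut off
    return bucket(name, userId) < flag.rolloutPercent;
  }
}
```

Hashing the flag name together with the user ID means different features roll out to different subsets of users, rather than always exposing the same early group.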

Testing Architecture

Multi-Level Testing Strategy

Unit Tests: Individual function testing (Jest, PHPUnit)
Integration Tests: Service-to-service communication testing
End-to-End Tests: Full user workflow testing
Performance Tests: Load testing and response time validation
Security Tests: Vulnerability scanning and penetration testing

Test Coverage Requirements

NPM Packages: 95% code coverage minimum
Drupal Modules: 85% code coverage minimum
Integration Tests: All critical user paths covered
Performance Tests: Response time benchmarks established

Known Limitations & Constraints

Current Implementation Constraints

Apple Intelligence: Currently served through the Ollama integration; native Apple Intelligence support is pending
Provider Rate Limits: Each AI provider has different rate limiting strategies
Response Size Limits: Large content generation may require chunking
Cache Invalidation: Complex cache dependencies require careful management
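
The chunking mentioned under Response Size Limits could be done with a helper like the one sketched below, which splits text into pieces of at most a given number of characters and prefers to break at a space. The function name and the character-based limit are assumptions; real provider limits are often expressed in tokens, and the document does not say whether chunking applies to inputs, outputs, or both.

```typescript
// Split text into chunks of at most maxChars characters, breaking at spaces where possible.
function chunkText(text: string, maxChars: number): string[] {
  if (maxChars <= 0) throw new Error("maxChars must be positive");
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > maxChars) {
    // Prefer the last space inside the window; fall back to a hard cut for long words.
    let cut = rest.lastIndexOf(" ", maxChars);
    if (cut <= 0) cut = maxChars;
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

For instance, chunkText("alpha beta gamma delta", 11) yields two chunks, "alpha beta" and "gamma delta", while a single word longer than the limit is cut at exactly maxChars characters.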

Scalability Considerations

Database: AI operation logs can grow quickly and require an archiving strategy
File Storage: Generated content files need a distributed storage solution
Memory Usage: Vector embeddings can consume significant memory
API Costs: AI provider costs scale with usage, so cost monitoring is required

Future Architecture Evolution

Planned Enhancements

Native Apple Intelligence: Direct integration when APIs become available
Multi-Region Deployment: Geographic distribution for performance
Advanced Caching: Predictive pre-caching based on usage patterns
Enhanced Security: Zero-trust architecture implementation

This architecture document reflects the currently implemented state of the LLM Platform, with clear distinctions between production-ready components and those in development. All architectural decisions are based on actual working code rather than aspirational features.
