Maxine - Lightning Fast Service Registry

A minimal, high-performance service discovery and registry for microservices.

Features

Lightning Fast: In-memory storage with O(1) lookups, optimized heartbeat with periodic cleanup, pre-allocated response buffers, fast LCG PRNG for random selection
Simple API: Register, discover, heartbeat, and deregister services with support for service versioning
Automatic Cleanup: Removes expired services with efficient periodic cleanup (every 30 seconds)
Load Balancing: Round-robin, random, weighted-random, least-connections, consistent-hash, ip-hash, geo-aware, predictive, ai-driven, advanced-ml (synchronous deep learning), cost-aware, and power-of-two-choices strategies
Health Checks: /health endpoint returning service and node counts, active health monitoring for real-time status
Advanced Health Checks: Custom health check endpoints with proactive monitoring, configurable intervals, and health status integration with load balancing decisions
Circuit Breakers: Automatic failure detection and recovery to protect against cascading failures
Rate Limiting: Protect services from excessive requests with configurable limits
API Key Management: Generate, validate, and revoke API keys with per-key rate limiting for secure service access
Access Control Lists (ACLs): Fine-grained permissions for service discovery access
Service Intentions: Define allowed communication patterns between services
Service Dependencies: Manage service dependencies with cycle detection, graph visualization, and automatic dependency detection through call logging
Version Compatibility Checking: Define compatibility rules for service versions to prevent incompatible service interactions
Service Call Analytics: Real-time dashboard visualizing service communication patterns, call frequencies, and dependency graphs with interactive D3.js charts
Advanced Service Validation: Comprehensive schema validation for service registrations including metadata fields (tags, healthCheck, version, weight)
Chaos Engineering Tools: Built-in chaos testing with latency injection, failure simulation, and automated experiments for resilience validation
Metrics: /metrics endpoint with request counts, errors, uptime, and basic stats, including cache performance
OpenTelemetry Metrics: Comprehensive observability with Prometheus-compatible metrics for service registrations, discoveries, heartbeats, deregistrations, cache hits/misses, and total services/nodes
Advanced Rate Limiting: Distributed rate limiting with configurable limits per client IP to protect against excessive requests
Audit Logging: Comprehensive logging of all registry operations using Winston, including user actions, system events, and security incidents with log rotation and export capabilities
Persistence: Optional persistence to survive restarts with file-based, Redis, memory-mapped (mmap), shared memory (shm), PostgreSQL, MySQL, TiKV, or FoundationDB storage
Minimal Dependencies: Only essential packages for maximum performance
Lightning Mode: Dedicated mode for ultimate speed with core features: register, heartbeat, deregister, discover with round-robin/random load balancing, health, optimized for minimal overhead
HTTP/3 Support: Optional QUIC-based HTTP/3 server for ultra-low latency service discovery (enabled with HTTP3_ENABLED=true)
WebAssembly Support: Complete WebAssembly service registry for edge computing deployments with full API compatibility
Optimized Parsing: Fast JSON parsing with error handling
Event-Driven: Real-time events for service changes and notifications via WebSocket and MQTT
Federation: Connect multiple Maxine instances across datacenters for global service discovery (available in Lightning Mode)
Multi-Datacenter Support: Global service discovery with cross-datacenter replication and load balancing
Authentication/Authorization: Optional JWT-based auth for Lightning Mode to secure sensitive operations with Role-Based Access Control (RBAC)
Configuration Management: Dynamic configuration updates for services with versioning and event notifications
gRPC Support: High-performance gRPC API for service operations
Service Mesh Integration: Automatic Envoy, Istio, and Linkerd configuration generation for seamless service mesh deployment
Open Service Broker API Integration: Compatible with enterprise service catalogs for seamless integration with Kubernetes Service Catalog and other OSB implementations

Performance

Maxine delivers exceptional performance for service discovery operations:

Ultra-Fast Mode latency: average 1.87ms, P95 3.94ms for discovery requests (with advanced optimizations and AI-driven load balancing)
Ultra-Fast Mode throughput: 25,004+ requests per second under load (50 concurrent users, 5000 iterations)
Lightning Mode latency: average 4.91ms, P95 6.49ms for discovery requests (100 concurrent users, 1000 iterations)
Lightning Mode throughput: 20,136+ requests per second under load (100 concurrent users, 1000 iterations)
Optimizations:

Transport: QUIC/HTTP3 support for ultra-low latency; HTTP/1.1 in Ultra-Fast Mode (HTTP/2 disabled for lower latency)
Reduced overhead: OpenTelemetry tracing, Prometheus metrics, and other expensive operations disabled in Lightning Mode; Ultra-Fast Mode runs with minimal features for maximum speed; console.log removed from production code (24% throughput improvement)
Selection: fast LCG PRNG, binary search for weighted-random selection, synchronous load balancing in Ultra-Fast Mode, discovery calls ultraFastGetRandomNodeSync directly
Memory and caching: pre-allocated buffers, object pooling, adaptive caching, small LRU caches enabled in Ultra-Fast Mode
Bulk operations: SIMD-inspired fast bulk operations (fastMin, fastMax, fastSum, etc.) for load balancing calculations
Runtime: GC flags updated for Node.js 22 compatibility
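
The binary search for weighted-random selection mentioned above can be illustrated with a short sketch. This is not Maxine's implementation; it is a minimal Python example that assumes each node carries metadata.weight as in the registration payload. The helper names build_cumulative_weights and pick_weighted are hypothetical.

```python
import bisect
import itertools
import random


def build_cumulative_weights(nodes):
    """Return running totals of node weights (a missing weight counts as 1)."""
    weights = [node.get("metadata", {}).get("weight", 1) for node in nodes]
    return list(itertools.accumulate(weights))


def pick_weighted(nodes, cumulative, roll=None):
    """Pick a node with probability proportional to its weight.

    `roll` is a number in [0, total weight). It is a parameter only so the
    selection can be checked deterministically.
    """
    total = cumulative[-1]
    if roll is None:
        roll = random.random() * total
    # bisect_right returns the index of the first running total greater than roll.
    index = bisect.bisect_right(cumulative, roll)
    # Guard against floating-point rounding pushing roll up to the total.
    return nodes[min(index, len(nodes) - 1)]


nodes = [
    {"nodeId": "my-service:localhost:3000", "metadata": {"weight": 1}},
    {"nodeId": "my-service:localhost:3001", "metadata": {"weight": 3}},
]
cumulative = build_cumulative_weights(nodes)
chosen = pick_weighted(nodes, cumulative)
```

The running totals are built once per node list, so each selection costs one O(log n) search instead of a linear scan over the weights.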

Security

Maxine implements comprehensive security measures for production deployments:

Security Features

Input Validation: All API endpoints use Joi schema validation with sanitization
Rate Limiting: Distributed Redis-backed rate limiting to prevent abuse
Authentication: JWT-based authentication with role-based access control (RBAC)
API Keys: Secure API key management with configurable rate limits per key
Mutual TLS: mTLS support for encrypted service-to-service communication
Audit Logging: Comprehensive logging of all security events and operations
Dependency Security: All dependencies are regularly audited and updated

Security Best Practices

Enable authentication in production: AUTH_ENABLED=true
Use HTTPS/TLS for all communications
Configure rate limiting based on your traffic patterns
Regularly update dependencies and monitor for security advisories
Use API keys for service-to-service authentication
Enable audit logging for compliance requirements

Security Scanning

# Run security audit
npm audit

# Check for outdated dependencies
npm outdated

# Use ESLint for code quality
npm run lint

Quick Start

npm install
npm start

Maxine runs in Ultra-Fast Mode by default for maximum performance with core features only. For more features, set ULTRA_FAST_MODE=false and LIGHTNING_MODE=true.

Kubernetes Integration

Maxine provides comprehensive Kubernetes integration through a custom operator for declarative service registry management:

Custom Resource Definitions (CRDs)

ServiceRegistry: Declarative Maxine instance management with auto-scaling, persistence, and multi-cloud support
ServiceInstance: Automatic service registration and health monitoring for Kubernetes services
ServicePolicy: Advanced load balancing, circuit breakers, and AI optimization policies
ServiceMeshOperator: Automated Istio, Linkerd, and Envoy configuration generation
TrafficPolicy: Fine-grained traffic management with fault injection and canary deployments
ServiceEndpoint: Direct endpoint management with health checks and metadata

Installation

# Install CRDs
kubectl apply -f helm/maxine-operator/crds/

# Install operator
helm install maxine-operator helm/maxine-operator/

# Create a service registry
kubectl apply -f - <<EOF
apiVersion: maxine.io/v1
kind: ServiceRegistry
metadata:
  name: my-registry
spec:
  replicas: 3
  mode: lightning
  config:
    port: 8080
    persistenceEnabled: true
    aiOptimizationEnabled: true
    multiCloudEnabled: true
EOF

Features

Auto-scaling: Kubernetes HPA integration with custom metrics
Service Discovery: Automatic registration of Kubernetes services
AI Optimization: ML-driven load balancing and traffic optimization
Multi-cloud: Cross-cloud service discovery and failover
Service Mesh: Automated Istio/Linkerd configuration
Chaos Engineering: Built-in fault injection and resilience testing

Documentation

Advanced Load Balancing Tutorial - Comprehensive guide to load balancing strategies
WebSocket Events Tutorial - Real-time event streaming and monitoring
Monitoring and Alerting Guide - Production monitoring setup
Client SDKs - SDK documentation for multiple languages
Event Streaming - Event-driven architectures

Development

Code Quality

Maxine uses modern development tools for code quality and security:

# Run linting
npm run lint

# Auto-fix linting issues
npm run lint:fix

# Format code
npm run format

# Run tests
npm test

# Run load tests
npm run load-test

Development Tools

ESLint: Code linting with security rules
Prettier: Code formatting
Joi: Input validation and sanitization
Mocha: Testing framework
Istanbul/NYC: Code coverage

Persistence

Maxine supports optional persistence to maintain registry state across restarts:

File-based: Saves to registry.json in the working directory
Redis: Uses Redis for distributed storage
Memory-mapped (mmap): Zero-copy operations with memory-mapped files for ultra-fast persistence
Shared Memory (shm): In-memory shared buffer with file backing for maximum performance
PostgreSQL: Enterprise-grade SQL persistence with connection pooling and advanced querying
MySQL: High-performance MySQL persistence with optimized schemas and indexing
TiKV: Distributed key-value store with strong consistency and horizontal scaling
FoundationDB: Multi-model database with ACID transactions and fault tolerance

Enable with PERSISTENCE_ENABLED=true and set PERSISTENCE_TYPE to one of file, redis, mmap, shm, postgres, mysql, tikv, or foundationdb.

For Redis, configure REDIS_HOST, REDIS_PORT, REDIS_PASSWORD.

Distributed Caching

Maxine supports Redis-based distributed caching for service discovery results across multiple instances, reducing latency and improving scalability.

Enable with REDIS_CACHE_ENABLED=true and configure Redis settings. Discovery results for deterministic load balancing strategies (consistent-hash, ip-hash, geo-aware, etc.) are cached in Redis with a 30-second TTL, allowing multiple Maxine instances to share cached results and reduce registry load.
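
To make the 30-second TTL behaviour concrete, here is a minimal in-memory sketch. It stands in for Redis and is not Maxine's code: the TTLCache class, the cache_key layout, and the injectable clock are all assumptions made for illustration.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a Redis cache with per-entry expiry."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be checked without sleeping
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is past its TTL: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


def cache_key(service_name, strategy, client_ip=""):
    """Hypothetical key layout: one entry per service, strategy, and client."""
    return f"discover:{service_name}:{strategy}:{client_ip}"


cache = TTLCache(ttl_seconds=30.0)
key = cache_key("my-service", "ip-hash", "10.0.0.7")
cache.set(key, {"nodeId": "my-service:localhost:3000"})
```

Caching only makes sense for deterministic strategies because the same inputs always map to the same node; a cached round-robin or random result would pin every caller to one node until the entry expires.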

Federation

Maxine supports federation to connect multiple instances across datacenters for global service discovery, cross-datacenter replication, and load balancing.

Enable with FEDERATION_ENABLED=true and configure peers with FEDERATION_PEERS=http://peer1:8080,http://peer2:8080.

Additional options: FEDERATION_TIMEOUT (default 5000ms), FEDERATION_RETRY_ATTEMPTS (default 3).

In Lightning Mode, federated registries are queried automatically if a service is not found locally. Registrations and deregistrations are replicated across peers.
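
The local-first lookup with peer fallback described above might look roughly like the following sketch. It is an illustration only: parse_peers and discover_with_federation are hypothetical names, the lookups are passed in as plain callables rather than real HTTP clients, and retrying only on connection-style errors is an assumption.

```python
def parse_peers(raw):
    """Split a FEDERATION_PEERS-style value into a list of peer URLs."""
    return [peer.strip() for peer in raw.split(",") if peer.strip()]


def discover_with_federation(service_name, local_lookup, peer_lookup, peers,
                             retry_attempts=3):
    """Return a node from the local registry, else from the first peer that has one."""
    node = local_lookup(service_name)
    if node is not None:
        return node
    for peer in peers:
        for _ in range(retry_attempts):
            try:
                node = peer_lookup(peer, service_name)
            except OSError:
                continue  # transient failure: try this peer again
            if node is not None:
                return node
            break  # the peer answered but does not know the service
    return None


peers = parse_peers("http://peer1:8080,http://peer2:8080")
```

In a real client, peer_lookup would issue the discovery request against the peer with the FEDERATION_TIMEOUT applied.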

Multi-Cluster Auto-Failover

Maxine includes advanced multi-cluster failover capabilities for high availability:

Health Monitoring: Continuous health checks of federated registries every 30 seconds
Replication Lag Detection: Monitors replication lag between clusters with configurable thresholds
Automatic Failover: Automatically switches to healthy backup registries when primary fails
Region-Aware Failover: Prioritizes failover targets based on geographic proximity
Conflict Resolution: Handles service registration conflicts during failover scenarios

Failover Status Endpoint

GET /api/maxine/serviceops/federation/status

Returns comprehensive failover status including:

Current primary registry
Health status of all federated registries
Replication lag metrics
Failover priority rankings
Last health check timestamps

Failover Configuration

FEDERATION_PEERS: Comma-separated list of peer URLs with optional priority (e.g., peer1:http://host1:8080,peer2:http://host2:8080)
REPLICATION_LAG_THRESHOLD: Maximum acceptable replication lag in milliseconds (default: 5000ms)

Authentication (Lightning Mode)

Maxine supports optional JWT-based authentication in Lightning Mode to secure sensitive operations like backup/restore and tracing.

Enable with AUTH_ENABLED=true and configure:

JWT_SECRET: Secret key for JWT signing
JWT_EXPIRES_IN: Token expiration (default 1h)
ADMIN_USERNAME: Admin username (default admin)
ADMIN_PASSWORD_HASH: Bcrypt hash of admin password

Sign in via POST /signin to get a token, then include it in requests as Authorization: Bearer <token>.

OAuth2 Integration

Maxine supports OAuth2 authentication with Google for external user management.

Enable with OAUTH2_ENABLED=true and configure:

GOOGLE_CLIENT_ID: Google OAuth2 client ID
GOOGLE_CLIENT_SECRET: Google OAuth2 client secret

Redirect users to GET /auth/google to start the OAuth flow, then handle the callback at GET /auth/google/callback to receive a JWT token.

Role-Based Access Control (RBAC)

Maxine supports Role-Based Access Control with fine-grained permissions for different user roles.

Roles

admin: Full access to all operations
operator: Service management, configuration, monitoring, and advanced features
viewer: Read-only access to discovery, metrics, and health
service: Limited access for service registration, discovery, and heartbeat

Role Management Endpoints

GET /api/maxine/roles - List all roles and permissions (admin only)
GET /api/maxine/user/roles/:username - Get user role (admin only)
POST /api/maxine/user/roles - Set user role (admin only)

Demo Users

For testing, Maxine includes demo users with different roles:

admin/admin (admin role)
operator/operator (operator role)
viewer/viewer (viewer role)
service/service (service role)

API Key Management

Maxine supports API key-based authentication with configurable rate limiting for secure service access.

API Key Endpoints

POST /api/maxine/api-keys/generate - Generate a new API key (admin only)
```
{
  "serviceName": "my-service",
  "rateLimit": 1000
}
```
POST /api/maxine/api-keys/revoke - Revoke an API key (admin only)
```
{
  "apiKey": "your-api-key-here"
}
```
GET /api/maxine/api-keys - List all API keys (admin only)
POST /api/maxine/api-keys/validate - Validate an API key
```
{
  "apiKey": "your-api-key-here"
}
```

Using API Keys

Include the API key in requests using the X-API-Key header or apiKey query parameter:

curl -H "X-API-Key: your-api-key" "http://localhost:8080/discover?serviceName=my-service"

API keys are automatically rate limited based on their configured limits.
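
As a client-side illustration of the two ways to pass a key, the sketch below builds the URL and headers for a discovery call without sending anything. The function name discover_request and the base URL are assumptions; only the X-API-Key header and apiKey query parameter come from the text above.

```python
from urllib.parse import urlencode


def discover_request(base_url, service_name, api_key, use_header=True):
    """Return (url, headers) for a discovery call authenticated with an API key."""
    params = {"serviceName": service_name}
    headers = {}
    if use_header:
        headers["X-API-Key"] = api_key
    else:
        params["apiKey"] = api_key
    return f"{base_url}/discover?{urlencode(params)}", headers


url, headers = discover_request("http://localhost:8080", "my-service", "your-api-key")
```

The header form is generally preferable because query strings tend to end up in access logs and proxies.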

LDAP Authentication

Maxine supports LDAP/Active Directory authentication for enterprise environments.

Enable with LDAP_ENABLED=true and configure:

LDAP_URL: LDAP server URL (e.g., ldap://localhost:389)
LDAP_BASE_DN: Base DN for searches (e.g., dc=example,dc=com)
LDAP_BIND_USER: Bind user DN for authentication
LDAP_BIND_PASSWORD: Bind user password

When LDAP is enabled, the /signin endpoint will first attempt LDAP authentication, falling back to local users if LDAP fails.

SAML Authentication

Maxine supports SAML 2.0 authentication for enterprise single sign-on integration.

Enable with SAML_ENABLED=true and configure:

SAML_ENTRY_POINT: SAML identity provider entry point URL
SAML_ISSUER: SAML service provider issuer
SAML_CERT: SAML identity provider certificate (public key)
SAML_CALLBACK_URL: SAML callback URL (default: http://localhost:8080/auth/saml/callback)

Redirect users to GET /auth/saml to start the SAML authentication flow, then handle the callback at POST /auth/saml/callback to receive a JWT token.

Mutual TLS (mTLS) Support

Maxine supports Mutual TLS for encrypted and authenticated service-to-service communication in Lightning Mode.

Enable with MTLS_ENABLED=true and provide certificate paths:

SERVER_CERT_PATH: Path to server certificate (default: src/main/config/certs/server.crt)
SERVER_KEY_PATH: Path to server private key (default: src/main/config/certs/server.key)
CA_CERT_PATH: Path to CA certificate for client verification (default: src/main/config/certs/ca.crt)

To generate self-signed certificates for testing, run:

node src/main/config/certs/generate-certs.js

This creates CA, server, and client certificates. Use client.crt and client.key for client authentication.

Example curl with client cert:

curl --cert src/main/config/certs/client.crt --key src/main/config/certs/client.key --cacert src/main/config/certs/ca.crt https://localhost:8080/health

MQTT Integration (Lightning Mode)

Maxine supports optional MQTT integration for publishing real-time events to MQTT brokers.

Enable with MQTT_ENABLED=true and configure:

MQTT_BROKER: MQTT broker URL (default mqtt://localhost:1883)
MQTT_TOPIC: Base topic for events (default maxine/registry/events)

Events are published to topics like maxine/registry/events/service_registered, maxine/registry/events/circuit_open, etc. with QoS 1.

MQTT publishing is performed in the broadcast function for real-time event distribution.

OpenTelemetry Tracing

Maxine supports OpenTelemetry tracing for distributed observability. Traces are automatically generated for key operations like service registration, discovery, and deregistration.

Configure the Jaeger exporter with the JAEGER_ENDPOINT environment variable (default: http://localhost:14268/api/traces).

Tracing is enabled by default and provides detailed spans for:

Service registration/deregistration
Service discovery with load balancing
API request handling
Registry operations

Modes

Ultra-Fast Mode (default): Extreme performance with minimal features. Core operations only: register, heartbeat, deregister, discover. No logging, metrics, auth, WebSocket, MQTT, gRPC. Uses UDP for heartbeats for speed. Set ULTRA_FAST_MODE=true.
Lightning Mode: Fast, with more features than Ultra-Fast Mode. Core operations: register, heartbeat, deregister, discover with advanced load balancing, caching, health checks. Optional JWT auth for sensitive endpoints. Uses root-level API endpoints like /register, /discover. Set ULTRA_FAST_MODE=false LIGHTNING_MODE=true.
Full Mode: Comprehensive features including federation, tracing, ACLs, intentions, service blacklists, management UI, security, metrics, etc. Uses /api/* endpoints. Set LIGHTNING_MODE=false.

To run in full mode: LIGHTNING_MODE=false npm start

API

Lightning Mode

HTTP API

Register a Service

POST /register
Content-Type: application/json

{
   "serviceName": "my-service",
   "host": "localhost",
   "port": 3000,
   "metadata": {"version": "1.0", "weight": 1, "tags": ["web", "api"], "healthCheck": {"url": "/health", "interval": 30000, "timeout": 5000}}
}

Notes on metadata fields:

version: enables service versioning
weight: used for weighted-random load balancing (default 1)
tags: array of strings for service tagging and filtering
healthCheck: configures proactive health monitoring with url (default "/health"), interval (default 30000ms), and timeout (default 5000ms)

Response:

{
  "nodeId": "my-service:localhost:3000"
}
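
A client might assemble the registration body as in the sketch below. The helper names are hypothetical, and the serviceName:host:port shape of nodeId is inferred from the example response above rather than from a documented guarantee, so treat expected_node_id as an assumption.

```python
import json


def build_registration(service_name, host, port, version=None, weight=1, tags=()):
    """Build the JSON body for POST /register."""
    metadata = {"weight": weight, "tags": list(tags)}
    if version is not None:
        metadata["version"] = version
    return {
        "serviceName": service_name,
        "host": host,
        "port": port,
        "metadata": metadata,
    }


def expected_node_id(payload):
    """Assumed nodeId layout, inferred from the example response."""
    return f"{payload['serviceName']}:{payload['host']}:{payload['port']}"


payload = build_registration("my-service", "localhost", 3000,
                             version="1.0", tags=["web", "api"])
body = json.dumps(payload)  # send as the request body with Content-Type: application/json
```

Clients should prefer the nodeId returned by the registry over a locally derived one when sending heartbeats.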

Discover a Service

GET /discover?serviceName=my-service&loadBalancing=round-robin&version=1.0&tags=web,api

Load balancing options: round-robin (default), random, weighted-random, least-connections, weighted-least-connections, consistent-hash, ip-hash, geo-aware, least-response-time, health-score, predictive, ai-driven, advanced-ml, cost-aware, and power-of-two-choices. Custom load balancing strategies can be registered via plugins.

predictive: uses time-series trend analysis for optimal node selection
ai-driven: uses reinforcement learning for optimal routing
advanced-ml: uses machine learning with predictive analytics for intelligent load balancing
cost-aware: prefers lower-cost nodes, such as on-prem over cloud
power-of-two-choices: selects two random nodes and picks the one with fewer connections for better load distribution

Use the version parameter for service versioning and the tags parameter to filter services by tags (comma-separated).
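
The power-of-two-choices strategy described above is simple enough to sketch in a few lines. This is a generic illustration of the technique, not Maxine's code; the function name and the connections mapping (nodeId to active connection count) are assumptions.

```python
import random


def power_of_two_choices(node_ids, connections, rng=random):
    """Sample two distinct nodes and return the one with fewer active connections."""
    if not node_ids:
        return None
    if len(node_ids) == 1:
        return node_ids[0]
    first, second = rng.sample(node_ids, 2)
    if connections.get(first, 0) <= connections.get(second, 0):
        return first
    return second


node_ids = [
    "my-service:localhost:3000",
    "my-service:localhost:3001",
    "my-service:localhost:3002",
]
connections = {
    "my-service:localhost:3000": 1,
    "my-service:localhost:3001": 2,
    "my-service:localhost:3002": 9,
}
chosen = power_of_two_choices(node_ids, connections)
```

Comparing only two candidates avoids scanning every node per request while still steering traffic away from the busiest ones; with the counts above, the node with 9 connections always loses its comparison and is never selected.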

Response: Returns a service instance or 404 if not found.

Heartbeat

POST /heartbeat
Content-Type: application/json

{
  "nodeId": "my-service:localhost:3000"
}

Deregister a Service

DELETE /deregister
Content-Type: application/json

{
   "nodeId": "my-service:localhost:3000"
}

List All Services

GET /servers

Get Service Instances

GET /services/:serviceName

Returns all healthy instances of the specified service with their metadata and health status.

Response:

{
  "serviceName": "my-service",
  "instances": [
    {
      "nodeId": "my-service:localhost:3000",
      "address": "localhost:3000",
      "nodeName": "my-service:localhost:3000",
      "metadata": { "version": "1.0", "weight": 1 },
      "lastHeartbeat": 1640995200000,
      "healthy": true
    }
  ]
}

Health Check

GET /health

Returns the status, service count, and node count.

Node Health Check

GET /health/:nodeId

Returns detailed health status for a specific service instance.

Response:

{
  "nodeId": "my-service:localhost:3000",
  "serviceName": "my-service",
  "address": "localhost:3000",
  "nodeName": "my-service:localhost:3000",
  "metadata": { "version": "1.0" },
  "lastHeartbeat": 1640995200000,
  "timeSinceLastHeartbeat": 5000,
  "healthy": true,
  "responseTime": 150
}

Metrics

GET /metrics

Returns uptime, requests, errors, services, nodes, persistenceEnabled, persistenceType, wsConnections, eventsBroadcasted, cacheHits, cacheMisses.

Additionally, comprehensive Prometheus-compatible metrics are exposed on port 9464 at /metrics, including:

maxine_service_registrations_total: Total service registrations
maxine_service_discoveries_total{service_name, strategy}: Total service discoveries by service and strategy
maxine_service_heartbeats_total: Total heartbeats
maxine_service_deregistrations_total: Total deregistrations
maxine_cache_hits_total: Cache hits
maxine_cache_misses_total: Cache misses
maxine_redis_cache_hit_total: Redis distributed cache hits
maxine_redis_cache_miss_total: Redis distributed cache misses
maxine_services_active: Active services count
maxine_nodes_active: Active nodes count
maxine_circuit_breakers_open: Open circuit breakers count
maxine_response_time_seconds{operation}: Response time histogram for operations (register, discover, heartbeat, deregister)

Dashboard

GET /dashboard

Returns an advanced HTML dashboard with real-time metrics, charts, service topology, and event streaming for comprehensive monitoring. Features include:

Real-time stats updates via WebSocket
Interactive charts for node health and cache performance
Service and node status visualization
Recent events feed
Connection status indicators

Dependency Graph

GET /dependency-graph

Returns an interactive HTML page visualizing the service dependency graph using D3.js. Features include:

Force-directed graph layout
Click on nodes to view dependency impact (dependencies and dependents)
Cycle detection alerts
Export to JSON or SVG
Real-time updates (planned)

Heap Dump

GET /heapdump

Creates a heap snapshot file for memory profiling (requires heapdump module).

Backup Registry

GET /backup

Returns the current registry state as JSON (requires persistence enabled).

Restore Registry

POST /restore
Content-Type: application/json

{ ... registry data ... }

Restores registry from backup data (requires persistence enabled).

Start Trace

POST /trace/start
Content-Type: application/json

{
  "id": "trace-123",
  "operation": "discover"
}

Add Trace Event

POST /trace/event
Content-Type: application/json

{
  "id": "trace-123",
  "event": "node selected"
}

End Trace

POST /trace/end
Content-Type: application/json

{
  "id": "trace-123"
}

Get Trace

GET /trace/:id

Returns the trace data for the given id, including start time, duration, events, and status.

OpenTelemetry Integration: Maxine includes comprehensive OpenTelemetry tracing for all registry operations:

Service registration/deregistration
Service discovery with load balancing
Heartbeat operations
Federation queries
Configuration updates

Traces are automatically exported to Jaeger or Zipkin when configured. Set the JAEGER_ENDPOINT or ZIPKIN_ENDPOINT environment variable to enable trace export.

Get Service Versions

GET /versions?serviceName=my-service

Response:

{
  "serviceName": "my-service",
  "versions": ["1.0", "2.0", "default"]
}

DNS Service Discovery

Maxine supports DNS-based service discovery for compatibility with standard DNS clients.

Enable with DNS_ENABLED=true (default) and configure DNS_PORT (default 53).

Query SRV records for service discovery:

dig SRV _my-service._tcp.default.default.default @localhost

Or A records for direct IP resolution:

dig A my-service.default.default.default @localhost

This allows integration with DNS-aware applications and load balancers.

Get Anomalies

GET /anomalies

Returns detected anomalies in the service registry using statistical analysis and machine learning algorithms. Anomalies are prioritized by severity.

Anomaly Types:

high_circuit_failures: Excessive circuit breaker failures
no_healthy_nodes: Service has nodes but none are healthy
no_nodes: Service has no registered nodes
high_response_time: Response time exceeds 3 standard deviations from mean
response_time_trend: Significant increase in response times over time
stale_heartbeat: Node hasn't sent a heartbeat within the expected interval
high_error_rate: Error rate exceeds 10%
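
The high_response_time rule (more than 3 standard deviations above the mean) can be expressed as a short check. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the use of the population standard deviation over a fixed history window is a guess, not something the registry documents.

```python
import statistics


def is_high_response_time(history, latest, sigmas=3.0):
    """Return True if `latest` exceeds mean + sigmas * stdev of `history`."""
    if len(history) < 2:
        return False  # not enough samples to estimate spread
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)  # population standard deviation (assumed)
    return latest > mean + sigmas * stdev


history_ms = [100, 110, 90, 105, 95]
flagged = is_high_response_time(history_ms, 150)
```

For this history the mean is 100 ms and the standard deviation is about 7.07 ms, so the threshold sits near 121 ms and a 150 ms sample is flagged.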

Response:

{
  "anomalies": [
    {
      "serviceName": "my-service",
      "type": "high_response_time",
      "value": 2500,
      "threshold": 1500,
      "severity": "medium"
    },
    {
      "serviceName": "bad-service",
      "type": "no_healthy_nodes",
      "severity": "critical"
    }
  ]
}

Get Health Scores

GET /health-score?serviceName=my-service

Returns health scores (0-100, higher is better) for all healthy nodes in the service, based on response times, failure rates, and circuit breaker state.

Response:

{
  "serviceName": "my-service",
  "scores": {
    "my-service:localhost:3000": 85,
    "my-service:localhost:3001": 92
  }
}

Predict Service Health

GET /predict-health?serviceName=my-service&window=300000

Returns health predictions for service nodes using time-series analysis. The window parameter specifies the prediction time window in milliseconds (default: 300000ms / 5 minutes).

Response:

{
  "serviceName": "my-service",
  "predictions": {
    "my-service:localhost:3000": {
      "currentScore": 85,
      "predictedScore": 78,
      "trend": 2.5,
      "predictedResponseTime": 180
    }
  },
  "predictionWindow": 300000
}

Set Traffic Distribution (Canary Deployments)

POST /traffic/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "distribution": {"1.0": 80, "2.0": 20}
}
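
One way to picture how a distribution such as {"1.0": 80, "2.0": 20} splits requests between versions is the sketch below. It is an illustration of percentage-based routing in general, not Maxine's routing code; pick_version and its roll parameter are hypothetical.

```python
import random


def pick_version(distribution, roll=None):
    """Choose a version with probability proportional to its share.

    `roll` is a number in [0, sum of shares]; it is a parameter only so the
    choice can be checked deterministically.
    """
    if not distribution:
        raise ValueError("distribution must not be empty")
    versions = list(distribution)
    total = sum(distribution.values())
    if roll is None:
        roll = random.uniform(0, total)
    running = 0
    for version in versions:
        running += distribution[version]
        if roll < running:
            return version
    return versions[-1]  # roll landed exactly on the total


distribution = {"1.0": 80, "2.0": 20}
version = pick_version(distribution)
```

Over many requests, roughly 80% resolve to version 1.0 and 20% to the canary version 2.0.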

Promote Version (Blue-Green Deployment)

POST /version/promote
Content-Type: application/json

{
  "serviceName": "my-service",
  "version": "2.0"
}

Retire Version

POST /version/retire
Content-Type: application/json

{
  "serviceName": "my-service",
  "version": "1.0"
}

Shift Traffic Gradually

POST /traffic/shift
Content-Type: application/json

{
  "serviceName": "my-service",
  "fromVersion": "1.0",
  "toVersion": "2.0",
  "percentage": 10
}

Set Service Config

POST /api/maxine/serviceops/config/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "key": "timeout",
  "value": 5000,
  "namespace": "default",
  "region": "us-east",
  "zone": "zone1"
}

Get Service Config

GET /api/maxine/serviceops/config/get?serviceName=my-service&key=timeout&namespace=default&region=us-east&zone=zone1

Get All Service Configs

GET /api/maxine/serviceops/config/all?serviceName=my-service&namespace=default&region=us-east&zone=zone1

Watch Service Config Changes

GET /api/maxine/serviceops/config/watch?serviceName=my-service&namespace=default&region=us-east&zone=zone1

Returns Server-Sent Events for real-time config changes.

Delete Service Config

DELETE /api/maxine/serviceops/config/delete
Content-Type: application/json

{
  "serviceName": "my-service",
  "key": "timeout",
  "namespace": "default",
  "region": "us-east",
  "zone": "zone1"
}

Generate Envoy Config

GET /api/maxine/serviceops/envoy/config

Returns Envoy proxy configuration JSON based on registered services, suitable for service mesh integration. Includes enhanced observability with access logging, custom headers, and circuit breaker metrics.

Service Mesh Metrics

GET /api/maxine/serviceops/service-mesh/metrics

Returns comprehensive service mesh observability metrics including:

Configuration generations (Envoy, Istio, Linkerd)
Circuit breaker statistics
Retry attempt counts
Service health metrics
Active service and node counts

Generate Istio Config

GET /service-mesh/istio-config

Returns Istio VirtualService and DestinationRule configurations in JSON format for service mesh deployment.

Generate Linkerd Config

GET /service-mesh/linkerd-config

Returns Linkerd ServiceProfile configurations in JSON format for service mesh deployment, including retry budgets and route conditions.

Open Service Broker API

Maxine supports the Open Service Broker API for integration with enterprise service catalogs.

Get Catalog

GET /v2/catalog

Returns the service catalog in OSB format, listing all registered services and their versions as plans.

Response:

{
  "services": [
    {
      "id": "my-service",
      "name": "my-service",
      "description": "Service my-service",
      "bindable": false,
      "plans": [
        {
          "id": "my-service-1.0",
          "name": "1.0",
          "description": "Version 1.0 of my-service"
        }
      ]
    }
  ]
}

Get Circuit Breaker State

GET /circuit-breaker/:nodeId

Returns the circuit breaker state for the specified node, including state (closed/open/half-open), failure count, last failure timestamp, and next retry timestamp.
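
The closed/open/half-open states reported by this endpoint follow the usual circuit breaker pattern, sketched below. The class is a generic illustration rather than Maxine's implementation; the failure threshold, retry delay, and attribute names are assumptions.

```python
import time


class CircuitBreaker:
    """Generic three-state breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=5, retry_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.retry_after = retry_after
        self.clock = clock  # injectable so transitions can be checked without sleeping
        self.state = "closed"
        self.failures = 0
        self.next_retry = None

    def allow_request(self):
        if self.state == "open" and self.clock() >= self.next_retry:
            self.state = "half-open"  # let a trial request through
        return self.state != "open"

    def record_success(self):
        self.state = "closed"
        self.failures = 0
        self.next_retry = None

    def record_failure(self):
        self.failures += 1
        # A failed trial request, or too many failures, opens the circuit.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.next_retry = self.clock() + self.retry_after


breaker = CircuitBreaker()
```

While the breaker is open, a load balancer would skip the node; once the retry time passes, a single successful trial request closes the circuit again.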

Get Event History

GET /events?since=<timestamp>&limit=<number>

Returns recent events from the event history. Use since to get events after a specific timestamp (default 0), and limit to cap the number of events returned (default 100).

Add Service to Blacklist

POST /blacklist/add
Content-Type: application/json

{
  "serviceName": "bad-service"
}

Remove Service from Blacklist

DELETE /blacklist/remove
Content-Type: application/json

{
  "serviceName": "bad-service"
}

Get Blacklist

GET /blacklist

Returns the list of blacklisted services.

GraphQL API

GET /graphql
POST /graphql

Maxine provides a GraphQL API for flexible queries and mutations. The GraphQL playground is available at /graphql for testing queries.

Queries:

services: Get all registered services
service(serviceName: String!): Get a specific service
discover(serviceName: String!, ip: String, group: String, tags: [String], deployment: String, filter: String): Discover a service instance
healthScores(serviceName: String!): Get health scores for all nodes in a service

Mutations:

register(serviceName: String!, nodeName: String!, address: String!, metadata: String): Register a service
deregister(serviceName: String!, nodeName: String!): Deregister a service

Set Service Config

POST /config/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "key": "timeout",
  "value": 5000,
  "metadata": {"description": "Request timeout"}
}

Get Service Config

GET /config/get?serviceName=my-service&key=timeout

Get All Service Configs

GET /config/all?serviceName=my-service

Delete Service Config

DELETE /config/delete?serviceName=my-service&key=timeout

Record Response Time

POST /record-response-time
Content-Type: application/json

{
  "nodeId": "my-service:localhost:3000",
  "responseTime": 150
}

Records the response time for a node to enable predictive load balancing based on historical performance data.

Record Service Call

POST /record-call
Content-Type: application/json

{
  "callerService": "web-service",
  "calledService": "api-service"
}

Records a service call for automatic dependency detection. Services can report their outbound calls to enable auto-detection of service dependencies.

Add Service Dependency

POST /api/maxine/serviceops/dependency/add
Content-Type: application/json

{
  "serviceName": "my-service",
  "dependsOn": "dependent-service"
}

Remove Service Dependency

POST /api/maxine/serviceops/dependency/remove
Content-Type: application/json

{
  "serviceName": "my-service",
  "dependsOn": "dependent-service"
}

Get Service Dependencies

GET /api/maxine/serviceops/dependency/get?serviceName=my-service

Response:

{
  "serviceName": "my-service",
  "dependencies": ["dependent-service"]
}

Get Service Dependents

GET /api/maxine/serviceops/dependency/dependents?serviceName=my-service

Response:

{
  "serviceName": "my-service",
  "dependents": ["dependent-service"]
}

Get Dependency Graph

GET /api/maxine/serviceops/dependency/graph

Response:

{
  "my-service": ["dependent-service"],
  "another-service": ["my-service"]
}

Detect Circular Dependencies

GET /api/maxine/serviceops/dependency/cycles

Response:

{
  "cycles": [["service-a", "service-b", "service-a"]]
}
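
To make the response format concrete, here is a minimal sketch of cycle detection over a graph shaped like the Get Dependency Graph response. It uses a depth-first search and reports one cycle per back edge it finds. This illustrates the general technique and is not necessarily how Maxine implements the endpoint; the function name findCycles is hypothetical.

```javascript
// Sketch: depth-first search over { service: [dependencies...] }.
// Each cycle is reported as a path that starts and ends on the same service.
function findCycles(graph) {
  const cycles = [];
  const state = new Map(); // node -> 'visiting' | 'done'
  const path = [];

  function visit(node) {
    if (state.get(node) === 'done') return;
    if (state.get(node) === 'visiting') {
      // Back edge: the cycle is the current path from the first occurrence of node.
      cycles.push([...path.slice(path.indexOf(node)), node]);
      return;
    }
    state.set(node, 'visiting');
    path.push(node);
    for (const dep of graph[node] || []) visit(dep);
    path.pop();
    state.set(node, 'done');
  }

  for (const node of Object.keys(graph)) visit(node);
  return cycles;
}

console.log(findCycles({ 'service-a': ['service-b'], 'service-b': ['service-a'] }));
```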

Analyze Dependencies

POST /api/maxine/serviceops/dependency/analyze

Triggers automatic dependency analysis based on recorded service calls. Dependencies are inferred from call logs where services have called each other above the configured threshold within the time window.

Response:

{
  "success": true,
  "message": "Dependency analysis completed"
}
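
The inference described above can be sketched as follows. This is an illustration under stated assumptions, not Maxine's actual implementation: the names inferDependencies, threshold, windowMs, and the timestamp field on each record are hypothetical, and "above the configured threshold" is treated here as "at least threshold calls". The callerService and calledService fields match the /record-call payload.

```javascript
// Sketch: count calls per (caller, called) pair inside a time window
// and keep the pairs that reach the threshold.
function inferDependencies(calls, { threshold, windowMs, now = Date.now() }) {
  const counts = new Map(); // caller -> Map(called -> count)
  for (const { callerService, calledService, timestamp } of calls) {
    if (now - timestamp > windowMs) continue; // record is outside the window
    if (!counts.has(callerService)) counts.set(callerService, new Map());
    const perCaller = counts.get(callerService);
    perCaller.set(calledService, (perCaller.get(calledService) || 0) + 1);
  }

  const dependencies = {};
  for (const [caller, perCaller] of counts) {
    for (const [called, count] of perCaller) {
      if (count >= threshold) {
        if (!dependencies[caller]) dependencies[caller] = [];
        dependencies[caller].push(called);
      }
    }
  }
  return dependencies;
}
```

A pair that is called only occasionally, or only outside the window, is not reported as a dependency.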

Set Compatibility Rule

POST /api/maxine/serviceops/compatibility/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "version": "1.0",
  "compatibleVersions": ["1.0", "1.1", "^1.0.0"]
}

Get Compatibility Rules

GET /api/maxine/serviceops/compatibility/get?serviceName=my-service&version=1.0

Response:

{
  "serviceName": "my-service",
  "version": "1.0",
  "rules": ["1.0", "1.1", "^1.0.0"]
}

Check Compatibility

POST /api/maxine/serviceops/compatibility/check
Content-Type: application/json

{
  "serviceName": "my-service",
  "version": "1.0",
  "requiredVersion": "1.1"
}

Response:

{
  "serviceName": "my-service",
  "version": "1.0",
  "requiredVersion": "1.1",
  "compatible": true
}

Set ACL

POST /api/maxine/serviceops/acl/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "allow": ["service-a", "service-b"],
  "deny": ["service-c"]
}

Get ACL

GET /api/maxine/serviceops/acl/:serviceName

Response:

{
  "allow": ["service-a", "service-b"],
  "deny": ["service-c"]
}
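
The evaluation order for allow and deny lists is not spelled out here. One plausible reading, shown purely as a sketch, is that an explicit deny wins, and that an empty allow list permits any caller that is not denied. Both rules are assumptions, and the function name isAllowed is hypothetical; verify against the registry's behavior before relying on it.

```javascript
// Sketch of one possible evaluation order for an ACL shaped like the
// Get ACL response. Deny-first precedence is an assumption, not
// documented Maxine behavior.
function isAllowed(acl, caller) {
  const allow = acl.allow || [];
  const deny = acl.deny || [];
  if (deny.includes(caller)) return false; // assumed: explicit deny wins
  if (allow.length === 0) return true;     // assumed: empty allow list permits everyone else
  return allow.includes(caller);
}

const acl = { allow: ['service-a', 'service-b'], deny: ['service-c'] };
console.log(isAllowed(acl, 'service-a'), isAllowed(acl, 'service-c'));
// prints "true false"
```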

Set Intention

POST /api/maxine/serviceops/intention/set
Content-Type: application/json

{
  "source": "service-a",
  "destination": "service-b",
  "action": "allow"
}

Get Intention

GET /api/maxine/serviceops/intention/:source/:destination

Response:

{
  "source": "service-a",
  "destination": "service-b",
  "action": "allow"
}

Proxy to Service

GET /proxy/:serviceName/:path

Proxies requests to a discovered service instance. For example, /proxy/my-service/health will proxy to the health endpoint of a random instance of my-service.

Sign In (Authentication)

POST /signin
Content-Type: application/json

{
  "username": "admin",
  "password": "yourpassword"
}

Response:

{
  "token": "jwt-token-here"
}

Send the token in the Authorization header as Bearer <token> when calling protected endpoints such as /backup, /restore, and /trace/*.
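
A minimal client-side sketch of this flow is shown below. The helper names signIn and authHeaders and the base URL are assumptions for illustration; the code relies on the global fetch available in Node.js 18 and later.

```javascript
// Hypothetical helper: exchanges credentials for a JWT via POST /signin.
async function signIn(baseUrl, username, password) {
  const res = await fetch(new URL('/signin', baseUrl), {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ username, password }),
  });
  if (!res.ok) throw new Error(`Sign-in failed with status ${res.status}`);
  const { token } = await res.json();
  return token;
}

// Builds the header expected by protected endpoints.
function authHeaders(token) {
  return { Authorization: `Bearer ${token}` };
}

// Usage (requires a running registry):
// const token = await signIn('http://localhost:8080', 'admin', 'yourpassword');
// await fetch('http://localhost:8080/backup', { headers: authHeaders(token) });
```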

OAuth2 Authentication

Maxine supports OAuth2 with Google for external authentication.

Enable with OAUTH2_ENABLED=true and configure:

GOOGLE_CLIENT_ID: Google OAuth2 client ID
GOOGLE_CLIENT_SECRET: Google OAuth2 client secret
GOOGLE_CALLBACK_URL: Callback URL (default: http://localhost:8080/auth/google/callback)

Start OAuth flow: GET /auth/google

Callback: GET /auth/google/callback returns a JWT token.

Chaos Engineering

Maxine includes chaos engineering tools for resilience testing.

Inject Latency

POST /api/maxine/chaos/inject-latency
Content-Type: application/json

{
  "serviceName": "my-service",
  "delay": 1000
}

Inject Failure

POST /api/maxine/chaos/inject-failure
Content-Type: application/json

{
  "serviceName": "my-service",
  "rate": 0.1
}

Reset Chaos

POST /api/maxine/chaos/reset
Content-Type: application/json

{
  "serviceName": "my-service"
}

Get Chaos Status

GET /api/maxine/chaos/status

Get Scaling Recommendations

GET /api/maxine/serviceops/scaling/recommendations?serviceName=my-service

Returns intelligent scaling recommendations based on service metrics analysis. Analyzes response times, connection counts, and node health to suggest scale up/down actions.

Response:

{
  "serviceName": "my-service",
  "recommendations": [
    {
      "serviceName": "my-service",
      "action": "scale_up",
      "reason": "High response time (1500ms)",
      "confidence": 0.85,
      "metrics": {
        "totalNodes": 2,
        "healthyNodes": 2,
        "avgResponseTime": 1500,
        "avgConnectionsPerNode": 75
      },
      "recommendedInstances": 3
    }
  ],
  "timestamp": "2025-09-24T20:07:53.000Z"
}
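
A consumer of this response, for example an autoscaling script, may want to act only on high-confidence recommendations. The sketch below filters the documented response shape; the function name actionableRecommendations and the 0.8 cutoff are assumptions for illustration.

```javascript
// Sketch: keep recommendations at or above a confidence cutoff and
// reduce each one to the fields a scaler would need.
function actionableRecommendations(response, minConfidence = 0.8) {
  return (response.recommendations || [])
    .filter((rec) => rec.confidence >= minConfidence)
    .map(({ serviceName, action, recommendedInstances }) => ({
      serviceName,
      action,
      recommendedInstances,
    }));
}
```

Applied to the example response above, this yields a single scale_up entry for my-service with recommendedInstances set to 3.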

Refresh Token

POST /refresh-token
Content-Type: application/json

{
  "token": "current-jwt-token"
}

Response:

{
  "token": "new-jwt-token"
}

gRPC API

Maxine supports gRPC for high-performance service registration and discovery.

Default gRPC port: 50051

Available methods:

Register: Register a service instance
Discover: Discover a service instance with load balancing
Heartbeat: Send heartbeat for a service instance
Deregister: Deregister a service instance
WatchServices: Stream service updates (basic implementation)

Client SDKs can be generated from api-specs/maxine.proto.

WebSocket API

Maxine supports real-time event streaming via WebSocket for monitoring service changes.

Connect to WebSocket

ws://localhost:8080

Authentication

If authentication is enabled, clients must authenticate by sending an auth message containing a JWT token:

{
  "auth": "jwt-token-here"
}

Upon successful authentication, the server responds with {"type": "authenticated", "user": {...}}. If authentication fails, the connection is closed.

Role-based access: Certain subscriptions may require specific roles (e.g., admin for admin events).

Subscription and Filtering

Clients can subscribe to specific events by sending a JSON message:

{
  "subscribe": {
    "event": "service_registered",
    "serviceName": "my-service"
  }
}

Supported filter criteria:

event: Filter by event type (e.g., "service_registered", "circuit_open")
serviceName: Filter by service name
nodeId: Filter by node ID

To unsubscribe:

{
  "unsubscribe": true
}

To refresh token:

{
  "refresh_token": true
}

Response: {"type": "token_refreshed", "token": "new-token"}

If no filter is set, all events are received.
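
The filtering rules can be pictured with a small matching function. This is a sketch of the presumed semantics, not the server's code: it assumes every criterion present in the filter must equal the corresponding field of the event, with serviceName and nodeId read from the event's data object. The function name matchesFilter is hypothetical.

```javascript
// Sketch: decide whether an event message passes a subscription filter.
function matchesFilter(message, filter) {
  if (!filter) return true; // no filter set: all events are received
  const data = message.data || {};
  if (filter.event && message.event !== filter.event) return false;
  if (filter.serviceName && data.serviceName !== filter.serviceName) return false;
  if (filter.nodeId && data.nodeId !== filter.nodeId) return false;
  return true;
}
```

Under these assumptions, a serviceName filter would only match events whose data carries a serviceName field.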

Events

The server broadcasts the following events as JSON messages:

service_registered: When a new service instance is registered

{
  "event": "service_registered",
  "data": {
    "serviceName": "my-service",
    "nodeId": "my-service:localhost:3000"
  },
  "timestamp": 1640995200000
}

service_deregistered: When a service instance is deregistered

{
  "event": "service_deregistered",
  "data": {
    "nodeId": "my-service:localhost:3000"
  },
  "timestamp": 1640995200000
}

service_heartbeat: When a service instance sends a heartbeat

{
  "event": "service_heartbeat",
  "data": {
    "nodeId": "my-service:localhost:3000"
  },
  "timestamp": 1640995200000
}

service_unhealthy: When a service instance is removed due to expired heartbeat

{
  "event": "service_unhealthy",
  "data": {
    "nodeId": "my-service:localhost:3000"
  },
  "timestamp": 1640995200000
}

config_changed: When a service configuration is updated

{
  "event": "config_changed",
  "data": {
    "serviceName": "my-service",
    "key": "timeout",
    "value": 5000,
    "namespace": "default",
    "region": "us-east",
    "zone": "zone1"
  },
  "timestamp": 1640995200000
}

config_deleted: When a service configuration is deleted

{
  "event": "config_deleted",
  "data": {
    "serviceName": "my-service",
    "key": "timeout",
    "namespace": "default",
    "region": "us-east",
    "zone": "zone1"
  },
  "timestamp": 1640995200000
}
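
On the client side, incoming messages can be routed to handlers by name. The sketch below parses each JSON message and looks up a handler by its event field, falling back to the type field used by control replies such as authenticated and token_refreshed. The function name createDispatcher is hypothetical, and wiring it to a socket (for example with the ws package) is left to the caller.

```javascript
// Sketch: route raw WebSocket messages to per-event handlers.
function createDispatcher(handlers) {
  return function dispatch(raw) {
    const message = JSON.parse(raw);
    // Broadcast events carry "event"; control replies carry "type".
    const name = message.event || message.type;
    const handler = handlers[name];
    if (!handler) return false; // no handler registered for this message
    handler(message);
    return true;
  };
}

const dispatch = createDispatcher({
  service_registered: (m) => console.log('registered:', m.data.nodeId),
  service_unhealthy: (m) => console.log('unhealthy:', m.data.nodeId),
});

dispatch('{"event":"service_registered","data":{"serviceName":"my-service","nodeId":"my-service:localhost:3000"},"timestamp":1640995200000}');
// prints "registered: my-service:localhost:3000"
```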

Full Mode API

Full Mode provides additional endpoints for advanced features like federation, tracing, ACLs, intentions, and service blacklists. These are available under /api/maxine/serviceops/.

Add Federated Registry

POST /api/maxine/serviceops/federation/add
Content-Type: application/json

{
  "name": "remote-registry",
  "url": "http://remote-maxine:8080"
}

Remove Federated Registry

POST /api/maxine/serviceops/federation/remove
Content-Type: application/json

{
  "name": "remote-registry"
}

Get Federated Registries

GET /api/maxine/serviceops/federation

Start Trace

POST /api/maxine/serviceops/trace/start
Content-Type: application/json

{
  "operation": "discover",
  "id": "trace-123"
}

Add Trace Event

POST /api/maxine/serviceops/trace/event
Content-Type: application/json

{
  "id": "trace-123",
  "event": "node selected"
}

End Trace

POST /api/maxine/serviceops/trace/end
Content-Type: application/json

{
  "id": "trace-123"
}

Get Trace

GET /api/maxine/serviceops/trace/:id

Add Service to Blacklist

POST /api/maxine/serviceops/blacklist/service/add
Content-Type: application/json

{
  "serviceName": "bad-service"
}

Remove Service from Blacklist

POST /api/maxine/serviceops/blacklist/service/remove
Content-Type: application/json

{
  "serviceName": "bad-service"
}

Check if Service is Blacklisted

GET /api/maxine/serviceops/blacklist/service/:serviceName

Custom Load Balancing Plugins

Maxine supports custom load balancing strategies through a plugin system. You can register your own load balancing algorithms for specialized routing needs.

Registering a Custom Plugin

const serviceRegistry = global.serviceRegistry;

// Register a custom strategy
serviceRegistry.registerLBPlugin('my-custom-strategy', (nodes, context) => {
    // nodes: array of available service nodes
    // context: { clientIP, serviceName, tags }
    // Return the selected node

    // Example: select node with lowest CPU usage (assuming metadata has cpu field)
    let bestNode = null;
    let lowestCpu = Infinity;
    for (const node of nodes) {
        // Treat nodes without CPU data as fully loaded so they are never preferred
        const cpu = node.metadata?.cpu ?? Infinity;
        if (cpu < lowestCpu) {
            lowestCpu = cpu;
            bestNode = node;
        }
    }
    return bestNode || nodes[0];
});

Once registered, the strategy can be selected by name in discovery requests:

GET /discover?serviceName=my-service&loadBalancing=my-custom-strategy

Deep Learning Load Balancing

Maxine includes advanced deep learning capabilities for intelligent load balancing. Using TensorFlow.js, it trains neural networks on historical performance data to predict optimal service nodes.

Features

Neural Network Models: Feedforward neural networks trained on service metrics
Time Series Analysis: Advanced analysis including autocorrelation, trend detection, and seasonality
Predictive Analytics: Forecasts response times, error rates, and load patterns
Continuous Learning: Models update automatically with new performance data
Fallback Strategy: Falls back to time-series analysis if the deep learning model is unavailable

Usage

Use the advanced-ml strategy for deep learning-based load balancing:

GET /discover?serviceName=my-service&loadBalancing=advanced-ml

Model Training

Models are trained automatically on:

Response times
Success/failure rates
Load patterns
Historical trends

Training occurs every minute with recent performance data. Models are persisted to disk for continuity across restarts.

Metrics

Access model performance metrics:

const metrics = serviceRegistry.deepLearningService.getModelMetrics('my-service');
// Returns: { loss, mse, mae, accuracy, precision, recall, f1, ... }

Client SDKs

Maxine provides client SDKs for easy integration:

Swift: Lightning Mode API support with async/await for iOS/macOS/watchOS/tvOS
Kotlin: Lightning Mode API support with coroutines for Android
Python: Supports both Full Mode and Lightning Mode APIs, including WebSocket for real-time events
Go: Full Mode API support
Java: Full Mode API support
C#: Full Mode API support
Rust: Full Mode API support
Dart: Full Mode and Lightning Mode APIs with async/await support for Flutter and Dart applications
PHP: Full Mode and Lightning Mode APIs with caching support
Ruby: Full Mode and Lightning Mode APIs with WebSocket support
C++: High-performance C++ SDK for low-latency applications and game servers (complete with CMake build system and examples)

Client SDKs include caching, automatic retries, and support for all discovery strategies.

Architecture

Maxine maintains an in-memory registry of services and their instances. Services register with heartbeats, and expired services are automatically cleaned up. Discovery returns a healthy instance using various load balancing strategies.

Performance

Lightning Mode: Ultra-fast response times using raw Node.js HTTP server, O(1) lookups using optimized in-memory data structures with lightweight LRU caching (10k entries, 30s TTL), pre-allocated buffer responses, fast LCG PRNG for random selection, advanced load balancing strategies (round-robin, random, weighted-random, least-connections, consistent-hash, ip-hash, geo-aware, predictive), optimized request handling without deferred execution for minimal latency, stripped-down registry with only core features for minimal overhead, memory-mapped and shared memory persistence options
Full Mode: Comprehensive features with optimized caching, async operations, and JWT authentication
Minimal memory footprint with efficient data structures
Automatic cleanup prevents memory leaks with periodic sweeps (every 30 seconds)
Optimized routing: O(1) Map-based HTTP routing for ultra-fast request handling
Optimized heartbeat and discovery logic with parallel operations and async I/O
Active health checks for proactive service monitoring
Event-driven notifications for real-time updates
Load test results: 5,000 requests with 50 concurrent users in ~0.37s, average response time 3.53ms, 95th percentile 6.22ms, 100% success rate, 13k req/s
Load test target: 95th percentile < 10ms for 50 concurrent users (achieved)
Recent optimizations: Removed console.log statements from production code to reduce I/O overhead, implemented object pooling for response objects to reduce GC pressure, added service health prediction using time-series analysis, adaptive caching with access-based TTL, SIMD-inspired binary search for weighted random selection, fine-tuned GC settings, added CPU affinity, synchronous ultra-fast discovery, pre-allocated JSON buffers

License

MIT
