Maxine is a service discovery and naming server for microservices that removes the need to hard-wire URLs by hostname and port.

VrushankPatel/Maxine

Maxine - Lightning Fast Service Registry

A minimal, high-performance service discovery and registry for microservices.

Features

  • Lightning Fast: In-memory storage with O(1) lookups, optimized heartbeat with periodic cleanup, pre-allocated response buffers, fast LCG PRNG for random selection
  • Simple API: Register, discover, heartbeat, and deregister services with support for service versioning
  • Automatic Cleanup: Removes expired services with efficient periodic cleanup (every 30 seconds)
  • Load Balancing: Round-robin, random, weighted-random, least-connections, consistent-hash, ip-hash, geo-aware, predictive, ai-driven, advanced-ml (synchronous deep learning), cost-aware, and power-of-two-choices selection strategies
  • Health Checks: /health endpoint returning service and node counts, active health monitoring for real-time status
  • Advanced Health Checks: Custom health check endpoints with proactive monitoring, configurable intervals, and health status integration with load balancing decisions
  • Circuit Breakers: Automatic failure detection and recovery to protect against cascading failures
  • Rate Limiting: Protect services from excessive requests with configurable limits
  • API Key Management: Generate, validate, and revoke API keys with per-key rate limiting for secure service access
  • Access Control Lists (ACLs): Fine-grained permissions for service discovery access
  • Service Intentions: Define allowed communication patterns between services
  • Service Dependencies: Manage service dependencies with cycle detection, graph visualization, and automatic dependency detection through call logging
  • Version Compatibility Checking: Define compatibility rules for service versions to prevent incompatible service interactions
  • Service Call Analytics: Real-time dashboard visualizing service communication patterns, call frequencies, and dependency graphs with interactive D3.js charts
  • Advanced Service Validation: Comprehensive schema validation for service registrations including metadata fields (tags, healthCheck, version, weight)
  • Chaos Engineering Tools: Built-in chaos testing with latency injection, failure simulation, and automated experiments for resilience validation
  • Metrics: Basic /metrics endpoint with request counts, errors, uptime, and basic stats including cache performance metrics
  • OpenTelemetry Metrics: Comprehensive observability with Prometheus-compatible metrics for service registrations, discoveries, heartbeats, deregistrations, cache hits/misses, and total services/nodes
  • Advanced Rate Limiting: Distributed rate limiting with configurable limits per client IP to protect against excessive requests
  • Audit Logging: Comprehensive logging of all registry operations using Winston, including user actions, system events, and security incidents with log rotation and export capabilities
  • Persistence: Optional persistence to survive restarts with file-based, Redis, memory-mapped (mmap), or shared memory (shm) storage
  • Minimal Dependencies: Only essential packages for maximum performance
  • Lightning Mode: Dedicated low-overhead mode limited to core features: register, heartbeat, deregister, discover (with round-robin or random load balancing), and health
  • HTTP/3 Support: Optional QUIC-based HTTP/3 server for ultra-low latency service discovery (enabled with HTTP3_ENABLED=true)
  • WebAssembly Support: Complete WebAssembly service registry for edge computing deployments with full API compatibility
  • Optimized Parsing: Fast JSON parsing with error handling
  • Event-Driven: Real-time events for service changes and notifications via WebSocket and MQTT
  • Federation: Connect multiple Maxine instances across datacenters for global service discovery (available in Lightning Mode)
  • Multi-Datacenter Support: Global service discovery with cross-datacenter replication and load balancing
  • Authentication/Authorization: Optional JWT-based auth for Lightning Mode to secure sensitive operations with Role-Based Access Control (RBAC)
  • Configuration Management: Dynamic configuration updates for services with versioning and event notifications
  • gRPC Support: High-performance gRPC API for service operations
  • Service Mesh Integration: Automatic Envoy, Istio, and Linkerd configuration generation for seamless service mesh deployment
  • Open Service Broker API Integration: Compatible with enterprise service catalogs for seamless integration with Kubernetes Service Catalog and other OSB implementations
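
The automatic cleanup described in the feature list can be pictured as a periodic sweep over heartbeat timestamps. The sketch below is illustrative only: the function name, the in-memory dictionary, and the 30-second timeout are assumptions, not Maxine's actual implementation (the README states the sweep interval but not the heartbeat TTL).

```python
# Hypothetical timeout; the README does not state the heartbeat TTL.
HEARTBEAT_TIMEOUT_MS = 30_000


def sweep_expired(last_heartbeat, now_ms, timeout_ms=HEARTBEAT_TIMEOUT_MS):
    """Remove nodes whose last heartbeat is older than timeout_ms.

    last_heartbeat maps nodeId -> timestamp in milliseconds and is
    mutated in place. Returns the sorted list of removed node ids.
    """
    expired = sorted(
        node_id
        for node_id, seen in last_heartbeat.items()
        if now_ms - seen > timeout_ms
    )
    for node_id in expired:
        del last_heartbeat[node_id]
    return expired


registry = {
    "my-service:localhost:3000": 1_000,   # stale
    "my-service:localhost:3001": 40_000,  # recent
}
removed = sweep_expired(registry, now_ms=45_000)
print(removed)
```

Collecting the expired ids before deleting avoids mutating the dictionary while iterating over it.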

Performance

Maxine delivers exceptional performance for service discovery operations:

  • Ultra-Fast Mode latency: Average 1.87ms, P95 3.94ms for discovery requests (with advanced optimizations and AI-driven load balancing)
  • Ultra-Fast Mode throughput: 25,004+ requests per second under load (50 concurrent users, 5000 iterations)
  • Lightning Mode latency: Average 4.91ms, P95 6.49ms for discovery requests (100 concurrent users, 1000 iterations)
  • Lightning Mode throughput: 20,136+ requests per second under load (100 concurrent users, 1000 iterations)
  • Optimizations:
    • QUIC/HTTP3 support for ultra-low latency
    • HTTP/1.1 in Ultra-Fast Mode (HTTP/2 disabled for lower latency)
    • OpenTelemetry tracing and Prometheus metrics disabled in Lightning Mode
    • Ultra-Fast Mode with minimal features for maximum speed
    • Fast LCG PRNG, pre-allocated buffers, object pooling, adaptive caching
    • Binary search for weighted random selection
    • SIMD-inspired fast bulk operations (fastMin, fastMax, fastSum, etc.) for load balancing calculations
    • console.log removed from production code (24% throughput improvement)
    • Discovery uses ultraFastGetRandomNodeSync directly
    • Expensive operations disabled in Lightning Mode
    • Synchronous load balancing in Ultra-Fast Mode
    • GC flags updated for Node.js 22 compatibility
    • Small LRU caches enabled in Ultra-Fast Mode
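
Two of the optimizations named above, a fast LCG PRNG and binary search for weighted random selection, combine naturally. The following is a minimal sketch of the general technique, not Maxine's code: the class and function names are made up for illustration, and the LCG constants are the widely published Numerical Recipes values rather than whatever Maxine uses.

```python
from bisect import bisect_right
from itertools import accumulate


class Lcg:
    """Linear congruential generator (Numerical Recipes constants)."""

    def __init__(self, seed):
        self.state = seed & 0xFFFFFFFF

    def next_float(self):
        # state = (a * state + c) mod 2^32, then scaled into [0, 1)
        self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
        return self.state / 2**32


def build_cumulative(weights):
    """Prefix sums of the weights, computed once per node list."""
    return list(accumulate(weights))


def pick_weighted(nodes, cumulative, rng):
    """Pick a node with probability proportional to its weight.

    The binary search over the prefix sums costs O(log n) per pick.
    """
    target = rng.next_float() * cumulative[-1]
    return nodes[bisect_right(cumulative, target)]


nodes = ["my-service:localhost:3000", "my-service:localhost:3001"]
cumulative = build_cumulative([1, 3])
rng = Lcg(seed=42)
print(pick_weighted(nodes, cumulative, rng))
```

Precomputing the prefix sums moves the O(n) work out of the request path, which is why this pairing tends to show up in latency-sensitive selectors.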

Security

Maxine implements comprehensive security measures for production deployments:

Security Features

  • Input Validation: All API endpoints use Joi schema validation with sanitization
  • Rate Limiting: Distributed Redis-backed rate limiting to prevent abuse
  • Authentication: JWT-based authentication with role-based access control (RBAC)
  • API Keys: Secure API key management with configurable rate limits per key
  • Mutual TLS: mTLS support for encrypted service-to-service communication
  • Audit Logging: Comprehensive logging of all security events and operations
  • Dependency Security: All dependencies are regularly audited and updated

Security Best Practices

  • Enable authentication in production: AUTH_ENABLED=true
  • Use HTTPS/TLS for all communications
  • Configure rate limiting based on your traffic patterns
  • Regularly update dependencies and monitor for security advisories
  • Use API keys for service-to-service authentication
  • Enable audit logging for compliance requirements

Security Scanning

# Run security audit
npm audit

# Check for outdated dependencies
npm outdated

# Use ESLint for code quality
npm run lint

Quick Start

npm install
npm start

Maxine runs in Ultra-Fast Mode by default for maximum performance with core features only. For more features, set ULTRA_FAST_MODE=false and LIGHTNING_MODE=true.

Kubernetes Integration

Maxine provides comprehensive Kubernetes integration through a custom operator for declarative service registry management:

Custom Resource Definitions (CRDs)

  • ServiceRegistry: Declarative Maxine instance management with auto-scaling, persistence, and multi-cloud support
  • ServiceInstance: Automatic service registration and health monitoring for Kubernetes services
  • ServicePolicy: Advanced load balancing, circuit breakers, and AI optimization policies
  • ServiceMeshOperator: Automated Istio, Linkerd, and Envoy configuration generation
  • TrafficPolicy: Fine-grained traffic management with fault injection and canary deployments
  • ServiceEndpoint: Direct endpoint management with health checks and metadata

Installation

# Install CRDs
kubectl apply -f helm/maxine-operator/crds/

# Install operator
helm install maxine-operator helm/maxine-operator/

# Create a service registry
kubectl apply -f - <<EOF
apiVersion: maxine.io/v1
kind: ServiceRegistry
metadata:
  name: my-registry
spec:
  replicas: 3
  mode: lightning
  config:
    port: 8080
    persistenceEnabled: true
    aiOptimizationEnabled: true
    multiCloudEnabled: true
EOF

Features

  • Auto-scaling: Kubernetes HPA integration with custom metrics
  • Service Discovery: Automatic registration of Kubernetes services
  • AI Optimization: ML-driven load balancing and traffic optimization
  • Multi-cloud: Cross-cloud service discovery and failover
  • Service Mesh: Automated Istio/Linkerd configuration
  • Chaos Engineering: Built-in fault injection and resilience testing

Documentation

Development

Code Quality

Maxine uses modern development tools for code quality and security:

# Run linting
npm run lint

# Auto-fix linting issues
npm run lint:fix

# Format code
npm run format

# Run tests
npm test

# Run load tests
npm run load-test

Development Tools

  • ESLint: Code linting with security rules
  • Prettier: Code formatting
  • Joi: Input validation and sanitization
  • Mocha: Testing framework
  • Istanbul/NYC: Code coverage

Persistence

Maxine supports optional persistence to maintain registry state across restarts:

  • File-based: Saves to registry.json in the working directory
  • Redis: Uses Redis for distributed storage
  • Memory-mapped (mmap): Zero-copy operations with memory-mapped files for ultra-fast persistence
  • Shared Memory (shm): In-memory shared buffer with file backing for maximum performance
  • PostgreSQL: Enterprise-grade SQL persistence with connection pooling and advanced querying
  • MySQL: High-performance MySQL persistence with optimized schemas and indexing
  • TiKV: Distributed key-value store with strong consistency and horizontal scaling
  • FoundationDB: Multi-model database with ACID transactions and fault tolerance

Enable with PERSISTENCE_ENABLED=true and set PERSISTENCE_TYPE to one of file, redis, mmap, shm, postgres, mysql, tikv, or foundationdb.

For Redis, configure REDIS_HOST, REDIS_PORT, REDIS_PASSWORD.

Distributed Caching

Maxine supports Redis-based distributed caching for service discovery results across multiple instances, reducing latency and improving scalability.

Enable with REDIS_CACHE_ENABLED=true and configure Redis settings. Discovery results for deterministic load balancing strategies (consistent-hash, ip-hash, geo-aware, etc.) are cached in Redis with a 30-second TTL, allowing multiple Maxine instances to share cached results and reduce registry load.
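
The caching rule above (cache only deterministic strategies, expire after 30 seconds) can be sketched with an in-process stand-in for Redis. Everything here is hypothetical scaffolding: the class, the cache key layout, and the `resolve` callback are assumptions chosen to illustrate the idea, not Maxine's cache code.

```python
import time

# Strategies named in the README as deterministic (its list ends with "etc.").
DETERMINISTIC = {"consistent-hash", "ip-hash", "geo-aware"}


class TtlCache:
    """Tiny in-memory TTL cache standing in for Redis."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}

    def get(self, key):
        hit = self.entries.get(key)
        if hit is None:
            return None
        value, expires_at = hit
        if self.clock() >= expires_at:
            del self.entries[key]  # expired: drop and report a miss
            return None
        return value

    def put(self, key, value):
        self.entries[key] = (value, self.clock() + self.ttl)


def cached_discover(cache, service, strategy, client_key, resolve):
    """Return a node, consulting the cache only for deterministic strategies."""
    if strategy not in DETERMINISTIC:
        return resolve()
    key = (service, strategy, client_key)  # assumed key layout
    node = cache.get(key)
    if node is None:
        node = resolve()
        cache.put(key, node)
    return node
```

Non-deterministic strategies such as random or round-robin bypass the cache because a cached answer would pin every caller to one node for the TTL.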

Federation

Maxine supports federation to connect multiple instances across datacenters for global service discovery, cross-datacenter replication, and load balancing.

Enable with FEDERATION_ENABLED=true and configure peers with FEDERATION_PEERS=http://peer1:8080,http://peer2:8080.

Additional options: FEDERATION_TIMEOUT (default 5000ms), FEDERATION_RETRY_ATTEMPTS (default 3).

In Lightning Mode, federated registries are queried automatically if a service is not found locally. Registrations and deregistrations are replicated across peers.
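
The lookup order described above (local first, then federated peers, with a bounded number of retries) might look roughly like the sketch below. The function names and the injected `local_lookup` and `peer_lookup` callables are hypothetical, and treating "peer answered but does not know the service" as a reason to move on to the next peer is an assumption.

```python
def parse_peers(raw):
    """Split a FEDERATION_PEERS value into a list of peer URLs."""
    return [peer.strip() for peer in raw.split(",") if peer.strip()]


def discover_with_federation(service, local_lookup, peer_lookup, peers, retries=3):
    """Try the local registry, then each peer, retrying on connection errors.

    Returns (node, source), where source is "local" or the peer URL,
    or (None, None) if nobody knows the service.
    """
    node = local_lookup(service)
    if node is not None:
        return node, "local"
    for peer in peers:
        for _ in range(retries):
            try:
                node = peer_lookup(peer, service)
            except ConnectionError:
                continue  # transient failure: retry the same peer
            if node is not None:
                return node, peer
            break  # peer answered but has no such service
    return None, None


peers = parse_peers("http://peer1:8080,http://peer2:8080")
print(peers)
```

Injecting the lookups keeps the control flow testable without any network access.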

Multi-Cluster Auto-Failover

Maxine includes advanced multi-cluster failover capabilities for high availability:

  • Health Monitoring: Continuous health checks of federated registries every 30 seconds
  • Replication Lag Detection: Monitors replication lag between clusters with configurable thresholds
  • Automatic Failover: Automatically switches to healthy backup registries when primary fails
  • Region-Aware Failover: Prioritizes failover targets based on geographic proximity
  • Conflict Resolution: Handles service registration conflicts during failover scenarios

Failover Status Endpoint

GET /api/maxine/serviceops/federation/status

Returns comprehensive failover status including:

  • Current primary registry
  • Health status of all federated registries
  • Replication lag metrics
  • Failover priority rankings
  • Last health check timestamps

Failover Configuration

  • FEDERATION_PEERS: Comma-separated list of peer URLs with optional priority (e.g., peer1:http://host1:8080,peer2:http://host2:8080)
  • REPLICATION_LAG_THRESHOLD: Maximum acceptable replication lag in milliseconds (default: 5000ms)
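
To make the failover criteria concrete, here is one way a target could be chosen from peer status data. This is a sketch under stated assumptions: the `Peer` record, the convention that a lower priority number is preferred, and the tie-break on lag are all hypothetical; only the 5000ms default threshold comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    healthy: bool
    lag_ms: int
    priority: int  # assumption: lower number means preferred


def choose_failover(peers, lag_threshold_ms=5000):
    """Pick the preferred healthy peer whose replication lag is acceptable."""
    candidates = [
        peer for peer in peers
        if peer.healthy and peer.lag_ms <= lag_threshold_ms
    ]
    if not candidates:
        return None
    # Prefer priority first, then the smaller lag as a tie-breaker.
    return min(candidates, key=lambda peer: (peer.priority, peer.lag_ms))


status = [
    Peer("peer1", healthy=False, lag_ms=100, priority=1),
    Peer("peer2", healthy=True, lag_ms=9000, priority=2),
    Peer("peer3", healthy=True, lag_ms=1200, priority=3),
]
target = choose_failover(status)
print(target.name if target else "no eligible peer")
```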

Authentication (Lightning Mode)

Maxine supports optional JWT-based authentication in Lightning Mode to secure sensitive operations like backup/restore and tracing.

Enable with AUTH_ENABLED=true and configure:

  • JWT_SECRET: Secret key for JWT signing
  • JWT_EXPIRES_IN: Token expiration (default 1h)
  • ADMIN_USERNAME: Admin username (default admin)
  • ADMIN_PASSWORD_HASH: Bcrypt hash of admin password

Sign in via POST /signin to get a token, then include it in requests as Authorization: Bearer <token>.

OAuth2 Integration

Maxine supports OAuth2 authentication with Google for external user management.

Enable with OAUTH2_ENABLED=true and configure:

  • GOOGLE_CLIENT_ID: Google OAuth2 client ID
  • GOOGLE_CLIENT_SECRET: Google OAuth2 client secret

Redirect users to GET /auth/google to start the OAuth flow, then handle the callback at GET /auth/google/callback to receive a JWT token.

Role-Based Access Control (RBAC)

Maxine supports Role-Based Access Control with fine-grained permissions for different user roles.

Roles

  • admin: Full access to all operations
  • operator: Service management, configuration, monitoring, and advanced features
  • viewer: Read-only access to discovery, metrics, and health
  • service: Limited access for service registration, discovery, and heartbeat

Role Management Endpoints

  • GET /api/maxine/roles - List all roles and permissions (admin only)
  • GET /api/maxine/user/roles/:username - Get user role (admin only)
  • POST /api/maxine/user/roles - Set user role (admin only)

Demo Users

For testing, Maxine includes demo users with different roles:

  • admin/admin (admin role)
  • operator/operator (operator role)
  • viewer/viewer (viewer role)
  • service/service (service role)

API Key Management

Maxine supports API key-based authentication with configurable rate limiting for secure service access.

API Key Endpoints

  • POST /api/maxine/api-keys/generate - Generate a new API key (admin only)
    {
      "serviceName": "my-service",
      "rateLimit": 1000
    }
  • POST /api/maxine/api-keys/revoke - Revoke an API key (admin only)
    {
      "apiKey": "your-api-key-here"
    }
  • GET /api/maxine/api-keys - List all API keys (admin only)
  • POST /api/maxine/api-keys/validate - Validate an API key
    {
      "apiKey": "your-api-key-here"
    }

Using API Keys

Include the API key in requests using the X-API-Key header or apiKey query parameter:

curl -H "X-API-Key: your-api-key" "http://localhost:8080/discover?serviceName=my-service"

API keys are automatically rate limited based on their configured limits.
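
The same request can be prepared from Python's standard library. This sketch only builds the request object and does not send it; the helper name is made up, and the base URL and key are the placeholder values from the curl example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def discover_request(base_url, service_name, api_key):
    """Build (but do not send) a GET /discover request carrying an API key."""
    query = urlencode({"serviceName": service_name})
    return Request(
        f"{base_url}/discover?{query}",
        headers={"X-API-Key": api_key},
    )


req = discover_request("http://localhost:8080", "my-service", "your-api-key")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call against a running registry.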

LDAP Authentication

Maxine supports LDAP/Active Directory authentication for enterprise environments.

Enable with LDAP_ENABLED=true and configure:

  • LDAP_URL: LDAP server URL (e.g., ldap://localhost:389)
  • LDAP_BASE_DN: Base DN for searches (e.g., dc=example,dc=com)
  • LDAP_BIND_USER: Bind user DN for authentication
  • LDAP_BIND_PASSWORD: Bind user password

When LDAP is enabled, the /signin endpoint will first attempt LDAP authentication, falling back to local users if LDAP fails.

SAML Authentication

Maxine supports SAML 2.0 authentication for enterprise single sign-on integration.

Enable with SAML_ENABLED=true and configure:

  • SAML_ENTRY_POINT: SAML identity provider entry point URL
  • SAML_ISSUER: SAML service provider issuer
  • SAML_CERT: SAML identity provider certificate (public key)
  • SAML_CALLBACK_URL: SAML callback URL (default: http://localhost:8080/auth/saml/callback)

Redirect users to GET /auth/saml to start the SAML authentication flow, then handle the callback at POST /auth/saml/callback to receive a JWT token.

Mutual TLS (mTLS) Support

Maxine supports Mutual TLS for encrypted and authenticated service-to-service communication in Lightning Mode.

Enable with MTLS_ENABLED=true and provide certificate paths:

  • SERVER_CERT_PATH: Path to server certificate (default: src/main/config/certs/server.crt)
  • SERVER_KEY_PATH: Path to server private key (default: src/main/config/certs/server.key)
  • CA_CERT_PATH: Path to CA certificate for client verification (default: src/main/config/certs/ca.crt)

To generate self-signed certificates for testing, run:

node src/main/config/certs/generate-certs.js

This creates CA, server, and client certificates. Use client.crt and client.key for client authentication.

Example curl with client cert:

curl --cert src/main/config/certs/client.crt --key src/main/config/certs/client.key --cacert src/main/config/certs/ca.crt https://localhost:8080/health

MQTT Integration (Lightning Mode)

Maxine supports optional MQTT integration for publishing real-time events to MQTT brokers.

Enable with MQTT_ENABLED=true and configure:

  • MQTT_BROKER: MQTT broker URL (default mqtt://localhost:1883)
  • MQTT_TOPIC: Base topic for events (default maxine/registry/events)

Events are published to topics like maxine/registry/events/service_registered, maxine/registry/events/circuit_open, etc. with QoS 1.

Events are published to MQTT from the broadcast function, so subscribers receive them in real time.

OpenTelemetry Tracing

Maxine supports OpenTelemetry tracing for distributed observability. Traces are automatically generated for key operations like service registration, discovery, and deregistration.

Configure Jaeger exporter with JAEGER_ENDPOINT environment variable (default: http://localhost:14268/api/traces).

Tracing is enabled by default and provides detailed spans for:

  • Service registration/deregistration
  • Service discovery with load balancing
  • API request handling
  • Registry operations

Modes

  • Ultra-Fast Mode (default): Extreme performance with minimal features. Core operations only: register, heartbeat, deregister, discover. No logging, metrics, auth, WebSocket, MQTT, gRPC. Uses UDP for heartbeats for speed. Set ULTRA_FAST_MODE=true.
  • Lightning Mode: Adds features on top of the core operations while staying fast. Core operations: register, heartbeat, deregister, discover with advanced load balancing, caching, health checks. Optional JWT auth for sensitive endpoints. Uses root-level API endpoints like /register, /discover. Set ULTRA_FAST_MODE=false LIGHTNING_MODE=true.
  • Full Mode: Comprehensive features including federation, tracing, ACLs, intentions, service blacklists, management UI, security, metrics, etc. Uses /api/* endpoints. Set LIGHTNING_MODE=false.

To run in full mode: LIGHTNING_MODE=false npm start

API

Lightning Mode

HTTP API

Register a Service
POST /register
Content-Type: application/json

{
   "serviceName": "my-service",
   "host": "localhost",
   "port": 3000,
   "metadata": {"version": "1.0", "weight": 1, "tags": ["web", "api"], "healthCheck": {"url": "/health", "interval": 30000, "timeout": 5000}}
}

Note: version in metadata enables service versioning. weight in metadata is used for weighted-random load balancing (default 1). tags in metadata is an array of strings for service tagging and filtering. healthCheck in metadata configures proactive health monitoring with url (default "/health"), interval (default 30000ms), and timeout (default 5000ms).

Response:

{
  "nodeId": "my-service:localhost:3000"
}
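
The registration body shown above can be assembled programmatically. In this sketch the helper names are made up, and the `serviceName:host:port` layout of the node id is an inference from the example response, not a documented guarantee.

```python
import json


def build_registration(service_name, host, port, version=None, weight=1, tags=()):
    """Build the JSON-serializable body for POST /register."""
    metadata = {"weight": weight, "tags": list(tags)}
    if version is not None:
        metadata["version"] = version  # enables service versioning
    return {
        "serviceName": service_name,
        "host": host,
        "port": port,
        "metadata": metadata,
    }


def node_id_for(service_name, host, port):
    # Assumption: inferred from the example response "my-service:localhost:3000".
    return f"{service_name}:{host}:{port}"


payload = build_registration(
    "my-service", "localhost", 3000, version="1.0", tags=["web", "api"]
)
print(json.dumps(payload))
print(node_id_for("my-service", "localhost", 3000))
```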
Discover a Service
GET /discover?serviceName=my-service&loadBalancing=round-robin&version=1.0&tags=web,api

Load balancing options:

  • round-robin (default)
  • random
  • weighted-random
  • least-connections
  • weighted-least-connections
  • consistent-hash
  • ip-hash
  • geo-aware
  • least-response-time
  • health-score
  • predictive: uses time-series trend analysis for optimal node selection
  • ai-driven: uses reinforcement learning for optimal routing
  • advanced-ml: uses machine learning with predictive analytics for intelligent load balancing
  • cost-aware: prefers lower-cost nodes, such as on-prem over cloud
  • power-of-two-choices: selects two random nodes and picks the one with fewer connections for better load distribution

Custom load balancing strategies can be registered via plugins. Use the version parameter for service versioning. Use the tags parameter to filter services by tags (comma-separated).

Response: Returns a service instance or 404 if not found.
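
Of the strategies listed for this endpoint, power-of-two-choices is simple enough to sketch in a few lines: sample two distinct nodes and keep the one with fewer active connections. The function below illustrates the general technique; the function name and the connection-count dictionary are assumptions, not Maxine's internals.

```python
import random


def power_of_two_choices(nodes, connections, rng=random):
    """Sample two distinct nodes and return the less loaded one."""
    if not nodes:
        return None
    if len(nodes) == 1:
        return nodes[0]
    first, second = rng.sample(nodes, 2)
    if connections.get(first, 0) <= connections.get(second, 0):
        return first
    return second


nodes = ["node-a", "node-b", "node-c"]
active = {"node-a": 10, "node-b": 0, "node-c": 5}
print(power_of_two_choices(nodes, active, random.Random(1)))
```

Comparing only two candidates avoids scanning every node per request, yet the most loaded node can never win a comparison.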

Heartbeat
POST /heartbeat
Content-Type: application/json

{
  "nodeId": "my-service:localhost:3000"
}

Deregister a Service
DELETE /deregister
Content-Type: application/json

{
   "nodeId": "my-service:localhost:3000"
}

List All Services
GET /servers

Get Service Instances
GET /services/:serviceName

Returns all healthy instances of the specified service with their metadata and health status.

Response:

{
  "serviceName": "my-service",
  "instances": [
    {
      "nodeId": "my-service:localhost:3000",
      "address": "localhost:3000",
      "nodeName": "my-service:localhost:3000",
      "metadata": { "version": "1.0", "weight": 1 },
      "lastHeartbeat": 1640995200000,
      "healthy": true
    }
  ]
}

Health Check
GET /health

Returns the status, service count, and node count.

Node Health Check
GET /health/:nodeId

Returns detailed health status for a specific service instance.

Response:

{
  "nodeId": "my-service:localhost:3000",
  "serviceName": "my-service",
  "address": "localhost:3000",
  "nodeName": "my-service:localhost:3000",
  "metadata": { "version": "1.0" },
  "lastHeartbeat": 1640995200000,
  "timeSinceLastHeartbeat": 5000,
  "healthy": true,
  "responseTime": 150
}

Metrics
GET /metrics

Returns uptime, requests, errors, services, nodes, persistenceEnabled, persistenceType, wsConnections, eventsBroadcasted, cacheHits, cacheMisses.

Additionally, comprehensive Prometheus-compatible metrics are exposed on port 9464 at /metrics, including:

  • maxine_service_registrations_total: Total service registrations
  • maxine_service_discoveries_total{service_name, strategy}: Total service discoveries by service and strategy
  • maxine_service_heartbeats_total: Total heartbeats
  • maxine_service_deregistrations_total: Total deregistrations
  • maxine_cache_hits_total: Cache hits
  • maxine_cache_misses_total: Cache misses
  • maxine_redis_cache_hit_total: Redis distributed cache hits
  • maxine_redis_cache_miss_total: Redis distributed cache misses
  • maxine_services_active: Active services count
  • maxine_nodes_active: Active nodes count
  • maxine_circuit_breakers_open: Open circuit breakers count
  • maxine_response_time_seconds{operation}: Response time histogram for operations (register, discover, heartbeat, deregister)

Dashboard
GET /dashboard

Returns an advanced HTML dashboard with real-time metrics, charts, service topology, and event streaming for comprehensive monitoring. Features include:

  • Real-time stats updates via WebSocket
  • Interactive charts for node health and cache performance
  • Service and node status visualization
  • Recent events feed
  • Connection status indicators

Dependency Graph
GET /dependency-graph

Returns an interactive HTML page visualizing the service dependency graph using D3.js. Features include:

  • Force-directed graph layout
  • Click on nodes to view dependency impact (dependencies and dependents)
  • Cycle detection alerts
  • Export to JSON or SVG
  • Real-time updates (planned)

Heap Dump
GET /heapdump

Creates a heap snapshot file for memory profiling (requires heapdump module).

Backup Registry
GET /backup

Returns the current registry state as JSON (requires persistence enabled).

Restore Registry
POST /restore
Content-Type: application/json

{ ... registry data ... }

Restores registry from backup data (requires persistence enabled).

Start Trace
POST /trace/start
Content-Type: application/json

{
  "id": "trace-123",
  "operation": "discover"
}

Add Trace Event
POST /trace/event
Content-Type: application/json

{
  "id": "trace-123",
  "event": "node selected"
}

End Trace
POST /trace/end
Content-Type: application/json

{
  "id": "trace-123"
}

Get Trace
GET /trace/:id

Returns the trace data for the given id, including start time, duration, events, and status.

OpenTelemetry Integration: Maxine includes comprehensive OpenTelemetry tracing for all registry operations:

  • Service registration/deregistration
  • Service discovery with load balancing
  • Heartbeat operations
  • Federation queries
  • Configuration updates

Traces are automatically exported to Jaeger or Zipkin when configured. Set JAEGER_ENDPOINT or ZIPKIN_ENDPOINT environment variables to enable trace export.

Get Service Versions
GET /versions?serviceName=my-service

Response:

{
  "serviceName": "my-service",
  "versions": ["1.0", "2.0", "default"]
}

DNS Service Discovery

Maxine supports DNS-based service discovery for compatibility with standard DNS clients.

Enable with DNS_ENABLED=true (default) and configure DNS_PORT (default 53).

Query SRV records for service discovery:

dig SRV _my-service._tcp.default.default.default @localhost

Or A records for direct IP resolution:

dig A my-service.default.default.default @localhost

This allows integration with DNS-aware applications and load balancers.

Get Anomalies
GET /anomalies

Returns detected anomalies in the service registry using statistical analysis and machine learning algorithms. Anomalies are prioritized by severity.

Anomaly Types:

  • high_circuit_failures: Excessive circuit breaker failures
  • no_healthy_nodes: Service has nodes but none are healthy
  • no_nodes: Service has no registered nodes
  • high_response_time: Response time exceeds 3 standard deviations from mean
  • response_time_trend: Significant increase in response times over time
  • stale_heartbeat: Node hasn't sent heartbeat within expected interval
  • high_error_rate: Error rate exceeds 10%

Response:

{
  "anomalies": [
    {
      "serviceName": "my-service",
      "type": "high_response_time",
      "value": 2500,
      "threshold": 1500,
      "severity": "medium"
    },
    {
      "serviceName": "bad-service",
      "type": "no_healthy_nodes",
      "severity": "critical"
    }
  ]
}
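
The high_response_time rule described above (more than 3 standard deviations above the mean) can be sketched with the standard library. The function name and the choice of population standard deviation are assumptions, and the returned fields mirror only the type, value, and threshold keys seen in the example response.

```python
from statistics import mean, pstdev


def high_response_time(history, latest, sigmas=3.0):
    """Flag latest if it exceeds mean + sigmas * stddev of history.

    Returns an anomaly dict, or None when the sample looks normal
    or there is too little history to judge.
    """
    if len(history) < 2:
        return None
    threshold = mean(history) + sigmas * pstdev(history)
    if latest > threshold:
        return {
            "type": "high_response_time",
            "value": latest,
            "threshold": round(threshold, 1),
        }
    return None


history_ms = [100, 110, 90, 105, 95]
print(high_response_time(history_ms, latest=2500))
print(high_response_time(history_ms, latest=110))
```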
Get Health Scores
GET /health-score?serviceName=my-service

Returns health scores (0-100, higher better) for all healthy nodes in the service, based on response times, failure rates, and circuit breaker state.

Response:

{
  "serviceName": "my-service",
  "scores": {
    "my-service:localhost:3000": 85,
    "my-service:localhost:3001": 92
  }
}

Predict Service Health
GET /predict-health?serviceName=my-service&window=300000

Returns health predictions for service nodes using time-series analysis. The window parameter specifies the prediction time window in milliseconds (default: 300000ms / 5 minutes).

Response:

{
  "serviceName": "my-service",
  "predictions": {
    "my-service:localhost:3000": {
      "currentScore": 85,
      "predictedScore": 78,
      "trend": 2.5,
      "predictedResponseTime": 180
    }
  },
  "predictionWindow": 300000
}

Set Traffic Distribution (Canary Deployments)
POST /traffic/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "distribution": {"1.0": 80, "2.0": 20}
}
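
A distribution such as the one in the request above can be read as cumulative percentage bands. The sketch below shows how a request might be routed to a version given a draw in [0, 100); the function name is made up, and how Maxine actually applies the split is not documented here.

```python
def pick_version(distribution, draw):
    """Map a draw in [0, 100) to a version using cumulative percentages."""
    upper = 0
    for version, percent in distribution.items():
        upper += percent
        if draw < upper:
            return version
    raise ValueError("draw is outside the configured distribution")


distribution = {"1.0": 80, "2.0": 20}
print(pick_version(distribution, draw=42.0))
print(pick_version(distribution, draw=93.5))
```

In practice the draw would come from a random number generator, for example `random.uniform(0, 100)`.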
Promote Version (Blue-Green Deployment)
POST /version/promote
Content-Type: application/json

{
  "serviceName": "my-service",
  "version": "2.0"
}

Retire Version
POST /version/retire
Content-Type: application/json

{
  "serviceName": "my-service",
  "version": "1.0"
}

Shift Traffic Gradually
POST /traffic/shift
Content-Type: application/json

{
  "serviceName": "my-service",
  "fromVersion": "1.0",
  "toVersion": "2.0",
  "percentage": 10
}
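
One plausible reading of this request is "move 10 percentage points from version 1.0 to version 2.0". The sketch below implements that reading only; whether Maxine interprets percentage as points to move or as a target share is an assumption, and the function name is hypothetical.

```python
def shift_traffic(distribution, from_version, to_version, percentage):
    """Move up to `percentage` points from one version to another.

    Returns a new dict; the input distribution is left untouched.
    """
    moved = min(percentage, distribution.get(from_version, 0))
    updated = dict(distribution)
    updated[from_version] = updated.get(from_version, 0) - moved
    updated[to_version] = updated.get(to_version, 0) + moved
    return updated


before = {"1.0": 80, "2.0": 20}
after = shift_traffic(before, "1.0", "2.0", 10)
print(after)
```

Capping the moved amount at the source version's current share keeps the total at 100 and prevents negative shares.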
Set Service Config
POST /api/maxine/serviceops/config/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "key": "timeout",
  "value": 5000,
  "namespace": "default",
  "region": "us-east",
  "zone": "zone1"
}

Get Service Config
GET /api/maxine/serviceops/config/get?serviceName=my-service&key=timeout&namespace=default&region=us-east&zone=zone1

Get All Service Configs
GET /api/maxine/serviceops/config/all?serviceName=my-service&namespace=default&region=us-east&zone=zone1

Watch Service Config Changes
GET /api/maxine/serviceops/config/watch?serviceName=my-service&namespace=default&region=us-east&zone=zone1

Returns Server-Sent Events for real-time config changes.
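
A client consuming this stream needs to split it into events. The sketch below parses the standard Server-Sent Events line format (data fields terminated by a blank line); the sample stream and the JSON shape of each event are hypothetical, since the README does not show what Maxine emits.

```python
import json


def parse_sse(lines):
    """Collect the data payload of each event from an SSE line stream."""
    events, data = [], []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                events.append("\n".join(data))
                data = []
        elif line.startswith("data:"):
            value = line[5:]
            # The SSE format strips one optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (starting with ":") and other fields are ignored here.
    return events


# Hypothetical stream; the JSON shape of Maxine's config events is assumed.
stream = [
    ": keep-alive\n",
    'data: {"key": "timeout", "value": 5000}\n',
    "\n",
]
changes = [json.loads(payload) for payload in parse_sse(stream)]
print(changes)
```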

Delete Service Config
DELETE /api/maxine/serviceops/config/delete
Content-Type: application/json

{
  "serviceName": "my-service",
  "key": "timeout",
  "namespace": "default",
  "region": "us-east",
  "zone": "zone1"
}

Generate Envoy Config
GET /api/maxine/serviceops/envoy/config

Returns Envoy proxy configuration JSON based on registered services, suitable for service mesh integration. Includes enhanced observability with access logging, custom headers, and circuit breaker metrics.

Service Mesh Metrics
GET /api/maxine/serviceops/service-mesh/metrics

Returns comprehensive service mesh observability metrics including:

  • Configuration generations (Envoy, Istio, Linkerd)
  • Circuit breaker statistics
  • Retry attempt counts
  • Service health metrics
  • Active service and node counts

Generate Istio Config
GET /service-mesh/istio-config

Returns Istio VirtualService and DestinationRule configurations in JSON format for service mesh deployment.

Generate Linkerd Config
GET /service-mesh/linkerd-config

Returns Linkerd ServiceProfile configurations in JSON format for service mesh deployment, including retry budgets and route conditions.

Open Service Broker API

Maxine supports the Open Service Broker API for integration with enterprise service catalogs.

Get Catalog
GET /v2/catalog

Returns the service catalog in OSB format, listing all registered services and their versions as plans.

Response:

{
  "services": [
    {
      "id": "my-service",
      "name": "my-service",
      "description": "Service my-service",
      "bindable": false,
      "plans": [
        {
          "id": "my-service-1.0",
          "name": "1.0",
          "description": "Version 1.0 of my-service"
        }
      ]
    }
  ]
}

Get Circuit Breaker State
GET /circuit-breaker/:nodeId

Returns the circuit breaker state for the specified node, including state (closed/open/half-open), failure count, last failure timestamp, and next retry timestamp.
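
The three states and the fields returned by this endpoint correspond to the classic circuit breaker state machine. The sketch below shows that general pattern; the class, its thresholds, and the retry delay are assumptions and not Maxine's actual values or code.

```python
class CircuitBreaker:
    """Minimal closed / open / half-open state machine."""

    def __init__(self, failure_threshold=5, retry_after_ms=30_000):
        self.failure_threshold = failure_threshold  # hypothetical default
        self.retry_after_ms = retry_after_ms        # hypothetical default
        self.state = "closed"
        self.failures = 0
        self.last_failure = None
        self.next_retry = None

    def allow(self, now_ms):
        """Return True if a request may be sent to the node."""
        if self.state == "open" and now_ms >= self.next_retry:
            self.state = "half-open"  # let one trial request through
        return self.state != "open"

    def record_success(self):
        self.state = "closed"
        self.failures = 0
        self.next_retry = None

    def record_failure(self, now_ms):
        self.failures += 1
        self.last_failure = now_ms
        # A failed trial request, or too many failures, opens the circuit.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.next_retry = now_ms + self.retry_after_ms


breaker = CircuitBreaker(failure_threshold=2, retry_after_ms=1_000)
breaker.record_failure(now_ms=0)
breaker.record_failure(now_ms=10)
print(breaker.state, breaker.next_retry)
```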

Get Event History
GET /events?since=<timestamp>&limit=<number>

Returns recent events from the event history. Use since to get events after a specific timestamp (default 0), and limit to limit the number of events returned (default 100).

Add Service to Blacklist
POST /blacklist/add
Content-Type: application/json

{
  "serviceName": "bad-service"
}

Remove Service from Blacklist
DELETE /blacklist/remove
Content-Type: application/json

{
  "serviceName": "bad-service"
}

Get Blacklist
GET /blacklist

Returns the list of blacklisted services.

GraphQL API
GET /graphql
POST /graphql

Maxine provides a GraphQL API for flexible queries and mutations. The GraphQL playground is available at /graphql for testing queries.

Queries:

  • services: Get all registered services
  • service(serviceName: String!): Get a specific service
  • discover(serviceName: String!, ip: String, group: String, tags: [String], deployment: String, filter: String): Discover a service instance
  • healthScores(serviceName: String!): Get health scores for all nodes in a service

Mutations:

  • register(serviceName: String!, nodeName: String!, address: String!, metadata: String): Register a service
  • deregister(serviceName: String!, nodeName: String!): Deregister a service
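
GraphQL requests are usually sent as a JSON body containing a query string and a variables object. The sketch below builds such a body for the discover query. The selected fields (nodeName, address) are assumptions, since the schema's return types are not listed here; check the playground at /graphql for the actual fields.

```javascript
// Builds the JSON body for POST /graphql. The selection set is an assumption.
function buildDiscoverRequest(serviceName, tags = []) {
  return {
    query: `query Discover($serviceName: String!, $tags: [String]) {
  discover(serviceName: $serviceName, tags: $tags) {
    nodeName
    address
  }
}`,
    variables: { serviceName, tags },
  };
}

const body = JSON.stringify(buildDiscoverRequest("my-service", ["blue"]));
console.log(body);
// Send with, for example:
// fetch(baseUrl + "/graphql", { method: "POST",
//   headers: { "Content-Type": "application/json" }, body })
```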
Set Service Config
POST /config/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "key": "timeout",
  "value": 5000,
  "metadata": {"description": "Request timeout"}
}
Get Service Config
GET /config/get?serviceName=my-service&key=timeout
Get All Service Configs
GET /config/all?serviceName=my-service
Delete Service Config
DELETE /config/delete?serviceName=my-service&key=timeout
Record Response Time
POST /record-response-time
Content-Type: application/json

{
  "nodeId": "my-service:localhost:3000",
  "responseTime": 150
}

Records the response time for a node to enable predictive load balancing based on historical performance data.

Record Service Call
POST /record-call
Content-Type: application/json

{
  "callerService": "web-service",
  "calledService": "api-service"
}

Records a service call for automatic dependency detection. Services can report their outbound calls to enable auto-detection of service dependencies.

Add Service Dependency
POST /api/maxine/serviceops/dependency/add
Content-Type: application/json

{
  "serviceName": "my-service",
  "dependsOn": "dependent-service"
}
Remove Service Dependency
POST /api/maxine/serviceops/dependency/remove
Content-Type: application/json

{
  "serviceName": "my-service",
  "dependsOn": "dependent-service"
}
Get Service Dependencies
GET /api/maxine/serviceops/dependency/get?serviceName=my-service

Response:

{
  "serviceName": "my-service",
  "dependencies": ["dependent-service"]
}
Get Service Dependents
GET /api/maxine/serviceops/dependency/dependents?serviceName=my-service

Response:

{
  "serviceName": "my-service",
  "dependents": ["dependent-service"]
}
Get Dependency Graph
GET /api/maxine/serviceops/dependency/graph

Response:

{
  "my-service": ["dependent-service"],
  "another-service": ["my-service"]
}
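The graph maps each service to the services it depends on. Dependents are the reverse direction, so a client holding the graph response can derive them locally. A minimal sketch (the function name `invertGraph` is hypothetical):

```javascript
// Input: the dependency graph shape shown above (service -> services it depends on).
// Output: service -> services that depend on it.
function invertGraph(graph) {
  const dependents = {};
  for (const [service, deps] of Object.entries(graph)) {
    if (!dependents[service]) dependents[service] = [];
    for (const dep of deps) {
      if (!dependents[dep]) dependents[dep] = [];
      dependents[dep].push(service);
    }
  }
  return dependents;
}

const graph = {
  "my-service": ["dependent-service"],
  "another-service": ["my-service"],
};
const dependentsOf = invertGraph(graph);
console.log(dependentsOf["my-service"]); // the services that depend on my-service
```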
Detect Circular Dependencies
GET /api/maxine/serviceops/dependency/cycles

Response:

{
  "cycles": [["service-a", "service-b", "service-a"]]
}
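Each cycle is reported as a path that ends on the service it started from. One common way to find such paths is a depth-first search that records back edges, sketched below on the graph shape from Get Dependency Graph. This is an illustration rather than Maxine's algorithm, and it reports one path per back edge, not every elementary cycle.

```javascript
function findCycles(graph) {
  const WHITE = 0, GRAY = 1, BLACK = 2; // unvisited, on the current path, finished
  const color = new Map();
  const stack = [];
  const cycles = [];

  function visit(node) {
    color.set(node, GRAY);
    stack.push(node);
    for (const next of graph[node] || []) {
      const c = color.get(next) || WHITE;
      if (c === GRAY) {
        // Back edge: the cycle is the path from `next` to here, closed with `next`.
        cycles.push([...stack.slice(stack.indexOf(next)), next]);
      } else if (c === WHITE) {
        visit(next);
      }
    }
    stack.pop();
    color.set(node, BLACK);
  }

  for (const node of Object.keys(graph)) {
    if ((color.get(node) || WHITE) === WHITE) visit(node);
  }
  return cycles;
}

const cyclic = { "service-a": ["service-b"], "service-b": ["service-a"] };
console.log(JSON.stringify({ cycles: findCycles(cyclic) }));
```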
Analyze Dependencies
POST /api/maxine/serviceops/dependency/analyze

Triggers automatic dependency analysis based on recorded service calls. A dependency is inferred when the call log shows one service calling another more often than the configured threshold within the configured time window.

Response:

{
  "success": true,
  "message": "Dependency analysis completed"
}
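The inference rule can be sketched as a count over the call log. The log entry shape (the /record-call fields plus a millisecond timestamp), the function name, and the default threshold and window are assumptions for illustration.

```javascript
// calls: [{ callerService, calledService, timestamp }], timestamp in ms (assumed).
function inferDependencies(calls, { threshold = 2, windowMs = 60000, now = Date.now() } = {}) {
  const counts = new Map();
  for (const { callerService, calledService, timestamp } of calls) {
    if (now - timestamp > windowMs) continue; // outside the time window
    const key = JSON.stringify([callerService, calledService]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const graph = {};
  for (const [key, count] of counts) {
    if (count <= threshold) continue; // only pairs above the threshold count
    const [caller, called] = JSON.parse(key);
    (graph[caller] = graph[caller] || []).push(called);
  }
  return graph;
}

const callLog = [
  { callerService: "web-service", calledService: "api-service", timestamp: 1000 }, // too old
  { callerService: "web-service", calledService: "api-service", timestamp: 6000 },
  { callerService: "web-service", calledService: "api-service", timestamp: 7000 },
  { callerService: "web-service", calledService: "api-service", timestamp: 8000 },
  { callerService: "web-service", calledService: "db-service", timestamp: 9000 },
];
const inferred = inferDependencies(callLog, { threshold: 2, windowMs: 5000, now: 10000 });
console.log(inferred); // web-service depends on api-service only
```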
Set Compatibility Rule
POST /api/maxine/serviceops/compatibility/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "version": "1.0",
  "compatibleVersions": ["1.0", "1.1", "^1.0.0"]
}
Get Compatibility Rules
GET /api/maxine/serviceops/compatibility/get?serviceName=my-service&version=1.0

Response:

{
  "serviceName": "my-service",
  "version": "1.0",
  "rules": ["1.0", "1.1", "^1.0.0"]
}
Check Compatibility
POST /api/maxine/serviceops/compatibility/check
Content-Type: application/json

{
  "serviceName": "my-service",
  "version": "1.0",
  "requiredVersion": "1.1"
}

Response:

{
  "serviceName": "my-service",
  "version": "1.0",
  "requiredVersion": "1.1",
  "compatible": true
}
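The rule list mixes exact versions with caret ranges. A minimal sketch of how such a check could work follows; it is not Maxine's implementation. The caret handling is deliberately simplified (same major version and not lower than the base) and does not reproduce npm semver's special cases for 0.x versions or pre-release tags.

```javascript
// Parses "1", "1.0" or "1.0.0" into [major, minor, patch]; missing parts are 0.
function parseVersion(v) {
  const parts = v.split(".").map(Number);
  return [parts[0] || 0, parts[1] || 0, parts[2] || 0];
}

function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// A rule is either an exact version ("1.1") or a caret range ("^1.0.0").
function matchesRule(rule, version) {
  const v = parseVersion(version);
  if (rule.startsWith("^")) {
    const base = parseVersion(rule.slice(1));
    return v[0] === base[0] && compareVersions(v, base) >= 0;
  }
  return compareVersions(v, parseVersion(rule)) === 0;
}

function isCompatible(rules, requiredVersion) {
  return rules.some((rule) => matchesRule(rule, requiredVersion));
}

const rules = ["1.0", "1.1", "^1.0.0"];
console.log(isCompatible(rules, "1.1")); // true
console.log(isCompatible(rules, "2.0")); // false
```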
Set ACL
POST /api/maxine/serviceops/acl/set
Content-Type: application/json

{
  "serviceName": "my-service",
  "allow": ["service-a", "service-b"],
  "deny": ["service-c"]
}
Get ACL
GET /api/maxine/serviceops/acl/:serviceName

Response:

{
  "allow": ["service-a", "service-b"],
  "deny": ["service-c"]
}
Set Intention
POST /api/maxine/serviceops/intention/set
Content-Type: application/json

{
  "source": "service-a",
  "destination": "service-b",
  "action": "allow"
}
Get Intention
GET /api/maxine/serviceops/intention/:source/:destination

Response:

{
  "source": "service-a",
  "destination": "service-b",
  "action": "allow"
}
Proxy to Service
GET /proxy/:serviceName/:path

Proxies requests to a discovered service instance. For example, /proxy/my-service/health will proxy to the health endpoint of a random instance of my-service.

Sign In (Authentication)
POST /signin
Content-Type: application/json

{
  "username": "admin",
  "password": "yourpassword"
}

Response:

{
  "token": "jwt-token-here"
}

Send the token in the Authorization header as Bearer <token> when calling protected endpoints such as /backup, /restore, and /trace/*.

OAuth2 Authentication

Maxine supports OAuth2 with Google for external authentication.

Enable with OAUTH2_ENABLED=true and configure:

Start OAuth flow: GET /auth/google

Callback: GET /auth/google/callback returns a JWT token.

Chaos Engineering

Maxine includes chaos engineering tools for resilience testing.

Inject Latency
POST /api/maxine/chaos/inject-latency
Content-Type: application/json

{
  "serviceName": "my-service",
  "delay": 1000
}
Inject Failure
POST /api/maxine/chaos/inject-failure
Content-Type: application/json

{
  "serviceName": "my-service",
  "rate": 0.1
}
Reset Chaos
POST /api/maxine/chaos/reset
Content-Type: application/json

{
  "serviceName": "my-service"
}
Get Chaos Status
GET /api/maxine/chaos/status
Get Scaling Recommendations
GET /api/maxine/serviceops/scaling/recommendations?serviceName=my-service

Returns scaling recommendations based on service metrics. The analysis looks at response times, connection counts, and node health to suggest scale-up or scale-down actions.

Response:

{
  "serviceName": "my-service",
  "recommendations": [
    {
      "serviceName": "my-service",
      "action": "scale_up",
      "reason": "High response time (1500ms)",
      "confidence": 0.85,
      "metrics": {
        "totalNodes": 2,
        "healthyNodes": 2,
        "avgResponseTime": 1500,
        "avgConnectionsPerNode": 75
      },
      "recommendedInstances": 3
    }
  ],
  "timestamp": "2025-09-24T20:07:53.000Z"
}
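A consumer such as an autoscaler would typically act only on recommendations it trusts. The sketch below filters the response by confidence and computes the change in instance count; the function name and the 0.8 cut-off are assumptions.

```javascript
// Keeps recommendations at or above `minConfidence` and computes the instance delta.
function actionableRecommendations(response, minConfidence = 0.8) {
  return response.recommendations
    .filter((r) => r.confidence >= minConfidence)
    .map((r) => ({
      serviceName: r.serviceName,
      action: r.action,
      // Difference between the suggested and the current number of nodes.
      delta: r.recommendedInstances - r.metrics.totalNodes,
      reason: r.reason,
    }));
}

const sampleResponse = {
  serviceName: "my-service",
  recommendations: [
    {
      serviceName: "my-service",
      action: "scale_up",
      reason: "High response time (1500ms)",
      confidence: 0.85,
      metrics: { totalNodes: 2, healthyNodes: 2, avgResponseTime: 1500, avgConnectionsPerNode: 75 },
      recommendedInstances: 3,
    },
  ],
};
const actions = actionableRecommendations(sampleResponse);
console.log(actions); // one scale_up action adding one instance
```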
Refresh Token
POST /refresh-token
Content-Type: application/json

{
  "token": "current-jwt-token"
}

Response:

{
  "token": "new-jwt-token"
}
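
A client can decide when to call /refresh-token by reading the token's expiry. The sketch below decodes the JWT payload without verifying the signature and relies on the standard exp claim; that Maxine's tokens carry exp is an assumption, and the helper names and the 60-second margin are illustrative.

```javascript
// Decodes the payload (second segment) of a JWT. No signature check is done.
function decodeJwtPayload(token) {
  const segment = token.split(".")[1];
  return JSON.parse(Buffer.from(segment, "base64url").toString("utf8"));
}

// Seconds until expiry, based on the `exp` claim (seconds since the Unix epoch).
function secondsUntilExpiry(token, nowMs = Date.now()) {
  const { exp } = decodeJwtPayload(token);
  return exp - Math.floor(nowMs / 1000);
}

function shouldRefresh(token, nowMs = Date.now(), marginSeconds = 60) {
  return secondsUntilExpiry(token, nowMs) <= marginSeconds;
}

// Demo token built locally (unsigned), expiring at Unix time 2000.
const encodeSegment = (obj) => Buffer.from(JSON.stringify(obj)).toString("base64url");
const demoToken = `${encodeSegment({ alg: "none" })}.${encodeSegment({ exp: 2000 })}.`;
console.log(secondsUntilExpiry(demoToken, 1000000)); // 1000
console.log(shouldRefresh(demoToken, 1950000));      // true
```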

gRPC API

Maxine supports gRPC for high-performance service registration and discovery.

Default gRPC port: 50051

Available methods:

  • Register: Register a service instance
  • Discover: Discover a service instance with load balancing
  • Heartbeat: Send heartbeat for a service instance
  • Deregister: Deregister a service instance
  • WatchServices: Stream service updates (basic implementation)

Client SDKs can be generated from api-specs/maxine.proto.

WebSocket API

Maxine supports real-time event streaming via WebSocket for monitoring service changes.

Connect to WebSocket
ws://localhost:8080
Authentication

If authentication is enabled, clients must authenticate by sending an auth message containing a JWT token:

{
  "auth": "jwt-token-here"
}

Upon successful authentication, the server responds with {"type": "authenticated", "user": {...}}. If authentication fails, the connection is closed.

Role-based access: Certain subscriptions may require specific roles (e.g., admin for admin events).

Subscription and Filtering

Clients can subscribe to specific events by sending a JSON message:

{
  "subscribe": {
    "event": "service_registered",
    "serviceName": "my-service"
  }
}

Supported filter criteria:

  • event: Filter by event type (e.g., "service_registered", "circuit_open")
  • serviceName: Filter by service name
  • nodeId: Filter by node ID

To unsubscribe:

{
  "unsubscribe": true
}

To refresh token:

{
  "refresh_token": true
}

Response: {"type": "token_refreshed", "token": "new-token"}

If no filter is set, all events are received.
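
The filter semantics can be modelled with a small matching function. This is a sketch that assumes the criteria are combined with AND and that serviceName and nodeId are read from the event's data object; filtering itself happens on the server.

```javascript
// Returns true when `message` passes `filter`. A missing criterion matches anything.
function matchesFilter(message, filter) {
  if (!filter) return true; // no filter set: all events are received
  if (filter.event && filter.event !== message.event) return false;
  const data = message.data || {};
  if (filter.serviceName && filter.serviceName !== data.serviceName) return false;
  if (filter.nodeId && filter.nodeId !== data.nodeId) return false;
  return true;
}

const subscribeMessage = {
  subscribe: { event: "service_registered", serviceName: "my-service" },
};
const incoming = {
  event: "service_registered",
  data: { serviceName: "my-service", nodeId: "my-service:localhost:3000" },
  timestamp: 1640995200000,
};
console.log(JSON.stringify(subscribeMessage)); // payload to send over the socket
console.log(matchesFilter(incoming, subscribeMessage.subscribe)); // true
```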

Events

The server broadcasts the following events as JSON messages:

  • service_registered: When a new service instance is registered

    {
      "event": "service_registered",
      "data": {
        "serviceName": "my-service",
        "nodeId": "my-service:localhost:3000"
      },
      "timestamp": 1640995200000
    }
  • service_deregistered: When a service instance is deregistered

    {
      "event": "service_deregistered",
      "data": {
        "nodeId": "my-service:localhost:3000"
      },
      "timestamp": 1640995200000
    }
  • service_heartbeat: When a service instance sends a heartbeat

    {
      "event": "service_heartbeat",
      "data": {
        "nodeId": "my-service:localhost:3000"
      },
      "timestamp": 1640995200000
    }
  • service_unhealthy: When a service instance is removed due to expired heartbeat

    {
      "event": "service_unhealthy",
      "data": {
        "nodeId": "my-service:localhost:3000"
      },
      "timestamp": 1640995200000
    }
  • config_changed: When a service configuration is updated

    {
      "event": "config_changed",
      "data": {
        "serviceName": "my-service",
        "key": "timeout",
        "value": 5000,
        "namespace": "default",
        "region": "us-east",
        "zone": "zone1"
      },
      "timestamp": 1640995200000
    }
  • config_deleted: When a service configuration is deleted

    {
      "event": "config_deleted",
      "data": {
        "serviceName": "my-service",
        "key": "timeout",
        "namespace": "default",
        "region": "us-east",
        "zone": "zone1"
      },
      "timestamp": 1640995200000
    }
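
A client can keep a local view of live nodes by folding these messages into a map. A minimal sketch, with a hypothetical `applyEvent` helper; because only service_registered carries serviceName in the examples above, the other events are matched by nodeId.

```javascript
// Applies one broadcast message to a Map of nodeId -> { serviceName, lastSeen }.
function applyEvent(nodes, raw) {
  const { event, data, timestamp } = JSON.parse(raw);
  switch (event) {
    case "service_registered":
      nodes.set(data.nodeId, { serviceName: data.serviceName, lastSeen: timestamp });
      break;
    case "service_heartbeat": {
      const node = nodes.get(data.nodeId);
      if (node) node.lastSeen = timestamp;
      break;
    }
    case "service_deregistered":
    case "service_unhealthy":
      nodes.delete(data.nodeId);
      break;
    default:
      break; // config_changed, config_deleted and unknown events are ignored here
  }
  return nodes;
}

const liveNodes = new Map();
const nodeId = "my-service:localhost:3000";
applyEvent(liveNodes, JSON.stringify({
  event: "service_registered", data: { serviceName: "my-service", nodeId }, timestamp: 1640995200000,
}));
applyEvent(liveNodes, JSON.stringify({
  event: "service_heartbeat", data: { nodeId }, timestamp: 1640995205000,
}));
console.log(liveNodes.get(nodeId)); // serviceName "my-service", lastSeen 1640995205000
```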

Full Mode API

Full Mode provides additional endpoints for advanced features like federation, tracing, ACLs, intentions, and service blacklists. These are available under /api/maxine/serviceops/.

Add Federated Registry
POST /api/maxine/serviceops/federation/add
Content-Type: application/json

{
  "name": "remote-registry",
  "url": "http://remote-maxine:8080"
}
Remove Federated Registry
POST /api/maxine/serviceops/federation/remove
Content-Type: application/json

{
  "name": "remote-registry"
}
Get Federated Registries
GET /api/maxine/serviceops/federation
Start Trace
POST /api/maxine/serviceops/trace/start
Content-Type: application/json

{
  "operation": "discover",
  "id": "trace-123"
}
Add Trace Event
POST /api/maxine/serviceops/trace/event
Content-Type: application/json

{
  "id": "trace-123",
  "event": "node selected"
}
End Trace
POST /api/maxine/serviceops/trace/end
Content-Type: application/json

{
  "id": "trace-123"
}
Get Trace
GET /api/maxine/serviceops/trace/:id
Add Service to Blacklist
POST /api/maxine/serviceops/blacklist/service/add
Content-Type: application/json

{
  "serviceName": "bad-service"
}
Remove Service from Blacklist
POST /api/maxine/serviceops/blacklist/service/remove
Content-Type: application/json

{
  "serviceName": "bad-service"
}
Check if Service is Blacklisted
GET /api/maxine/serviceops/blacklist/service/:serviceName

Custom Load Balancing Plugins

Maxine supports custom load balancing strategies through a plugin system. You can register your own load balancing algorithms for specialized routing needs.

Registering a Custom Plugin

const serviceRegistry = global.serviceRegistry;

// Register a custom strategy
serviceRegistry.registerLBPlugin('my-custom-strategy', (nodes, context) => {
    // nodes: array of available service nodes
    // context: { clientIP, serviceName, tags }
    // Return the selected node

    // Example: select node with lowest CPU usage (assuming metadata has cpu field)
    let bestNode = null;
    let lowestCpu = Infinity;
    for (const node of nodes) {
        // Nodes without a cpu value are treated as worst case so they are never preferred
        const cpu = node.metadata?.cpu ?? Infinity;
        if (cpu < lowestCpu) {
            lowestCpu = cpu;
            bestNode = node;
        }
    }
    return bestNode || nodes[0];
});

// Now you can use 'my-custom-strategy' in discovery requests:
// GET /discover?serviceName=my-service&loadBalancing=my-custom-strategy
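
As a second illustration of the (nodes, context) signature, the sketch below keeps each client on the same node by hashing context.clientIP. The strategy name 'sticky-ip', the nodeName field on the sample nodes, and the FNV-1a helper are choices made for this example; it also assumes the node list is passed in a stable order.

```javascript
// FNV-1a 32-bit hash of a string.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// The same client IP maps to the same node while the node list is unchanged.
function stickyByClientIp(nodes, context) {
  if (nodes.length === 0) return null;
  const key = (context && context.clientIP) || "";
  return nodes[fnv1a(key) % nodes.length];
}

const sampleNodes = [{ nodeName: "a" }, { nodeName: "b" }, { nodeName: "c" }];
const firstPick = stickyByClientIp(sampleNodes, { clientIP: "10.0.0.7" });
const secondPick = stickyByClientIp(sampleNodes, { clientIP: "10.0.0.7" });
console.log(firstPick === secondPick); // true

// Registration, following the example above:
// global.serviceRegistry.registerLBPlugin('sticky-ip', stickyByClientIp);
```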

Deep Learning Load Balancing

Maxine includes advanced deep learning capabilities for intelligent load balancing. Using TensorFlow.js, it trains neural networks on historical performance data to predict optimal service nodes.

Features

  • Neural Network Models: Feedforward neural networks trained on service metrics
  • Time Series Analysis: Advanced analysis including autocorrelation, trend detection, and seasonality
  • Predictive Analytics: Forecasts response times, error rates, and load patterns
  • Continuous Learning: Models update automatically with new performance data
  • Fallback Strategy: Falls back to time-series analysis if the deep learning model is unavailable
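
To give a feel for the time-series side, the sketch below computes two of the quantities named above on a window of response times: a least-squares trend slope and a lag-k autocorrelation. These are textbook formulas shown for illustration, not Maxine's implementation, and the function names are hypothetical.

```javascript
// Least-squares slope of equally spaced samples (units per step).
function trendSlope(samples) {
  const n = samples.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((a, b) => a + b, 0) / n;
  let num = 0, den = 0;
  samples.forEach((y, x) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) ** 2;
  });
  return num / den;
}

// Lag-k autocorrelation; clearly positive values at lag k hint at a period of k samples.
function autocorrelation(samples, lag) {
  const n = samples.length;
  const mean = samples.reduce((a, b) => a + b, 0) / n;
  let num = 0, den = 0;
  for (let i = 0; i < n; i++) {
    den += (samples[i] - mean) ** 2;
    if (i + lag < n) num += (samples[i] - mean) * (samples[i + lag] - mean);
  }
  return den === 0 ? 0 : num / den;
}

// One-step forecast: last observation plus the fitted trend.
function forecastNext(samples) {
  return samples[samples.length - 1] + trendSlope(samples);
}

const responseTimes = [100, 110, 120, 130]; // ms, steadily rising
console.log(trendSlope(responseTimes));   // 10
console.log(forecastNext(responseTimes)); // 140
```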

Usage

Use the advanced-ml strategy for deep learning-based load balancing:

GET /discover?serviceName=my-service&loadBalancing=advanced-ml

Model Training

Models are trained automatically on:

  • Response times
  • Success/failure rates
  • Load patterns
  • Historical trends

Training occurs every minute with recent performance data. Models are persisted to disk for continuity across restarts.

Metrics

Access model performance metrics:

const metrics = serviceRegistry.deepLearningService.getModelMetrics('my-service');
// Returns: { loss, mse, mae, accuracy, precision, recall, f1, ... }

Client SDKs

Maxine provides client SDKs for easy integration:

  • Swift: Lightning Mode API support with async/await for iOS/macOS/watchOS/tvOS
  • Kotlin: Lightning Mode API support with coroutines for Android
  • Python: Supports both Full Mode and Lightning Mode APIs, including WebSocket for real-time events
  • Go: Full Mode API support
  • Java: Full Mode API support
  • C#: Full Mode API support
  • Rust: Full Mode API support
  • Dart: Full Mode and Lightning Mode APIs with async/await support for Flutter and Dart applications
  • PHP: Full Mode and Lightning Mode APIs with caching support
  • Ruby: Full Mode and Lightning Mode APIs with WebSocket support
  • C++: High-performance C++ SDK for low-latency applications and game servers (complete with CMake build system and examples)

Client SDKs include caching, automatic retries, and support for all discovery strategies.

Architecture

Maxine maintains an in-memory registry of services and their instances. Instances register and then send periodic heartbeats; instances whose heartbeats expire are cleaned up automatically. Discovery returns a healthy instance chosen by one of the load balancing strategies.
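
The register, heartbeat, cleanup, and discover cycle can be pictured with a small Map-based sketch. The class name, method signatures, and the 30-second timeout are assumptions for illustration; Maxine's actual registry has many more features.

```javascript
class MiniRegistry {
  constructor(timeoutMs = 30000) {
    this.timeoutMs = timeoutMs;
    // serviceName -> Map(nodeName -> { address, lastHeartbeat })
    this.services = new Map();
  }

  register(serviceName, nodeName, address, now = Date.now()) {
    if (!this.services.has(serviceName)) this.services.set(serviceName, new Map());
    this.services.get(serviceName).set(nodeName, { address, lastHeartbeat: now });
  }

  heartbeat(serviceName, nodeName, now = Date.now()) {
    const node = this.services.get(serviceName)?.get(nodeName);
    if (!node) return false;
    node.lastHeartbeat = now;
    return true;
  }

  // Removes nodes whose last heartbeat is older than the timeout.
  cleanup(now = Date.now()) {
    for (const [serviceName, nodes] of this.services) {
      for (const [nodeName, node] of nodes) {
        if (now - node.lastHeartbeat > this.timeoutMs) nodes.delete(nodeName);
      }
      if (nodes.size === 0) this.services.delete(serviceName);
    }
  }

  // Random selection among the remaining nodes.
  discover(serviceName) {
    const nodes = this.services.get(serviceName);
    if (!nodes || nodes.size === 0) return null;
    const list = [...nodes.values()];
    return list[Math.floor(Math.random() * list.length)];
  }
}

const registry = new MiniRegistry(30000);
registry.register("my-service", "node-1", "http://localhost:3000", 0);
registry.register("my-service", "node-2", "http://localhost:3001", 0);
registry.heartbeat("my-service", "node-1", 20000);
registry.cleanup(40000); // node-2 was last seen at 0, so it has expired
console.log(registry.discover("my-service").address); // http://localhost:3000
// A server would run cleanup periodically, e.g. setInterval(() => registry.cleanup(), 30000).
```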

Performance

  • Lightning Mode: Ultra-fast response times using raw Node.js HTTP server, O(1) lookups using optimized in-memory data structures with lightweight LRU caching (10k entries, 30s TTL), pre-allocated buffer responses, fast LCG PRNG for random selection, advanced load balancing strategies (round-robin, random, weighted-random, least-connections, consistent-hash, ip-hash, geo-aware, predictive), optimized request handling without deferred execution for minimal latency, stripped-down registry with only core features for minimal overhead, memory-mapped and shared memory persistence options
  • Full Mode: Comprehensive features with optimized caching, async operations, and JWT authentication
  • Minimal memory footprint with efficient data structures
  • Automatic cleanup prevents memory leaks with periodic sweeps (every 30 seconds)
  • Optimized routing: O(1) Map-based HTTP routing for ultra-fast request handling
  • Optimized heartbeat and discovery logic with parallel operations and async I/O
  • Active health checks for proactive service monitoring
  • Event-driven notifications for real-time updates
  • Load test results: 5,000 requests with 50 concurrent users in ~0.37s (13k req/s), average response time 3.53ms, 95th percentile 6.22ms, 100% success rate
  • Load test target: 95th percentile < 10ms for 50 concurrent users (achieved)
  • Recent optimizations: removed console.log statements from production code to reduce I/O overhead, implemented object pooling for response objects to reduce GC pressure, added service health prediction using time-series analysis, adaptive caching with access-based TTL, SIMD-inspired binary search for weighted random selection, fine-tuned GC settings, added CPU affinity, synchronous ultra-fast discovery, pre-allocated JSON buffers
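
Two of the techniques named above, an LCG-based PRNG and binary search for weighted random selection, can be combined as sketched below. The LCG constants are the widely used Numerical Recipes values and the weight field is an assumed node property; this illustrates the idea and is not Maxine's code.

```javascript
// Linear congruential generator returning floats in [0, 1).
function createLcg(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Precomputes prefix sums of the weights so each pick is one binary search.
function createWeightedPicker(nodes, random) {
  const prefix = [];
  let total = 0;
  for (const node of nodes) {
    total += node.weight;
    prefix.push(total);
  }
  return () => {
    const target = random() * total;
    // Find the first index whose prefix sum exceeds the target.
    let lo = 0, hi = prefix.length - 1;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (prefix[mid] > target) hi = mid; else lo = mid + 1;
    }
    return nodes[lo];
  };
}

const pick = createWeightedPicker(
  [{ name: "a", weight: 1 }, { name: "b", weight: 3 }],
  createLcg(42)
);
const pickCounts = { a: 0, b: 0 };
for (let i = 0; i < 10000; i++) pickCounts[pick().name] += 1;
console.log(pickCounts); // "b" should be chosen roughly three times as often as "a"
```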

License

MIT
