A minimal, high-performance service discovery and registry for microservices.
- Lightning Fast: In-memory storage with O(1) lookups, optimized heartbeat with periodic cleanup, pre-allocated response buffers, fast LCG PRNG for random selection
- Simple API: Register, discover, heartbeat, and deregister services with support for service versioning
- Automatic Cleanup: Removes expired services with efficient periodic cleanup (every 30 seconds)
- Load Balancing: Round-robin, random, weighted-random, least-connections, consistent-hash, ip-hash, geo-aware, predictive, ai-driven, advanced-ml (synchronous deep learning), cost-aware, power-of-two-choices selection for advanced load balancing
- Health Checks: /health endpoint returning service and node counts, active health monitoring for real-time status
- Advanced Health Checks: Custom health check endpoints with proactive monitoring, configurable intervals, and health status integration with load balancing decisions
- Circuit Breakers: Automatic failure detection and recovery to protect against cascading failures
- Rate Limiting: Protect services from excessive requests with configurable limits
- API Key Management: Generate, validate, and revoke API keys with per-key rate limiting for secure service access
- Access Control Lists (ACLs): Fine-grained permissions for service discovery access
- Service Intentions: Define allowed communication patterns between services
- Service Dependencies: Manage service dependencies with cycle detection, graph visualization, and automatic dependency detection through call logging
- Version Compatibility Checking: Define compatibility rules for service versions to prevent incompatible service interactions
- Service Call Analytics: Real-time dashboard visualizing service communication patterns, call frequencies, and dependency graphs with interactive D3.js charts
- Advanced Service Validation: Comprehensive schema validation for service registrations including metadata fields (tags, healthCheck, version, weight)
- Chaos Engineering Tools: Built-in chaos testing with latency injection, failure simulation, and automated experiments for resilience validation
- Metrics: Basic /metrics endpoint with request counts, errors, uptime, and basic stats including cache performance metrics
- OpenTelemetry Metrics: Comprehensive observability with Prometheus-compatible metrics for service registrations, discoveries, heartbeats, deregistrations, cache hits/misses, and total services/nodes
- Advanced Rate Limiting: Distributed rate limiting with configurable limits per client IP to protect against excessive requests
- Audit Logging: Comprehensive logging of all registry operations using Winston, including user actions, system events, and security incidents with log rotation and export capabilities
- Persistence: Optional persistence to survive restarts with file-based, Redis, memory-mapped (mmap), or shared memory (shm) storage
- Minimal Dependencies: Only essential packages for maximum performance
- Lightning Mode: Dedicated mode for ultimate speed with core features: register, heartbeat, deregister, discover with round-robin/random load balancing, health, optimized for minimal overhead
- HTTP/3 Support: Optional QUIC-based HTTP/3 server for ultra-low latency service discovery (enabled with HTTP3_ENABLED=true)
- WebAssembly Support: Complete WebAssembly service registry for edge computing deployments with full API compatibility
- Optimized Parsing: Fast JSON parsing with error handling
- Event-Driven: Real-time events for service changes and notifications via WebSocket and MQTT
- Federation: Connect multiple Maxine instances across datacenters for global service discovery (available in Lightning Mode)
- Multi-Datacenter Support: Global service discovery with cross-datacenter replication and load balancing
- Authentication/Authorization: Optional JWT-based auth for Lightning Mode to secure sensitive operations with Role-Based Access Control (RBAC)
- Configuration Management: Dynamic configuration updates for services with versioning and event notifications
- gRPC Support: High-performance gRPC API for service operations
- Service Mesh Integration: Automatic Envoy, Istio, and Linkerd configuration generation for seamless service mesh deployment
- Open Service Broker API Integration: Compatible with enterprise service catalogs for seamless integration with Kubernetes Service Catalog and other OSB implementations
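The register, heartbeat, and cleanup cycle listed above can be pictured with a small in-memory sketch. This is not Maxine's implementation: the class name `TinyRegistry`, the 90-second TTL, and the injected clock are assumptions made for illustration. Only the `serviceName:host:port` node ID shape and the idea of periodic cleanup come from this README.

```python
import time
from typing import Callable, Dict, List


class TinyRegistry:
    """Illustrative in-memory registry; names and TTL are assumptions."""

    def __init__(self, ttl_seconds: float = 90.0,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        # service name -> {node id -> time of last heartbeat}
        self.services: Dict[str, Dict[str, float]] = {}

    def register(self, service_name: str, host: str, port: int) -> str:
        node_id = f"{service_name}:{host}:{port}"
        self.services.setdefault(service_name, {})[node_id] = self.clock()
        return node_id

    def heartbeat(self, node_id: str) -> bool:
        """Refresh a node's timestamp; False if the node is unknown."""
        service_name = node_id.split(":", 1)[0]
        nodes = self.services.get(service_name, {})
        if node_id not in nodes:
            return False
        nodes[node_id] = self.clock()
        return True

    def cleanup(self) -> int:
        """Drop nodes whose last heartbeat is older than the TTL."""
        now = self.clock()
        removed = 0
        for service_name in list(self.services):
            nodes = self.services[service_name]
            expired = [n for n, seen in nodes.items() if now - seen > self.ttl]
            for node_id in expired:
                del nodes[node_id]
                removed += 1
            if not nodes:
                del self.services[service_name]
        return removed

    def nodes(self, service_name: str) -> List[str]:
        return sorted(self.services.get(service_name, {}))
```

In a long-running process, `cleanup()` would be called from a timer (the list above mentions a 30-second interval) rather than on every request, which keeps lookups cheap.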
Maxine delivers exceptional performance for service discovery operations:
- Ultra-Fast Mode: Average 1.87ms, P95 3.94ms for discovery requests (with advanced optimizations and AI-driven load balancing)
- Throughput: 25,004+ requests per second under load (50 concurrent users, 5000 iterations)
- Lightning Mode: Average 4.91ms, P95 6.49ms for discovery requests (100 concurrent users, 1000 iterations)
- Throughput: 20,136+ requests per second under load (100 concurrent users, 1000 iterations)
- Optimizations: QUIC/HTTP3 support for ultra-low latency, HTTP/1.1 for ultra-fast mode (disabled HTTP/2 for lower latency), disabled OpenTelemetry tracing and Prometheus metrics in Lightning Mode, ultra-fast mode with minimal features for maximum speed, fast LCG PRNG, pre-allocated buffers, object pooling, adaptive caching, binary search for weighted random selection, SIMD-inspired fast bulk operations (fastMin, fastMax, fastSum, etc.) for load balancing calculations, removed console.log from production code (24% throughput improvement), optimized discovery to use ultraFastGetRandomNodeSync directly, disabled expensive operations in lightning mode, synchronous load balancing for ultra-fast mode, updated GC flags for Node.js 22 compatibility, enabled small LRU caches in ultra-fast mode for better performance
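Two of the optimizations named above, the LCG PRNG and binary search for weighted random selection, can be sketched together. The constants below are the widely used Numerical Recipes LCG parameters, chosen here as an assumption; the README does not say which constants Maxine uses, and the function names are hypothetical.

```python
import bisect
import itertools
from typing import List, Sequence


class Lcg:
    """Linear congruential generator; constants are an assumption."""

    def __init__(self, seed: int = 1):
        self.state = seed & 0xFFFFFFFF

    def next_float(self) -> float:
        # state = (a * state + c) mod 2^32, scaled into [0, 1)
        self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
        return self.state / 2**32


def pick_weighted(nodes: Sequence[str], weights: Sequence[float], r: float) -> str:
    """Pick a node for r in [0, 1) by binary search over cumulative weights."""
    cumulative: List[float] = list(itertools.accumulate(weights))
    target = r * cumulative[-1]
    index = bisect.bisect_right(cumulative, target)
    # Guard against floating-point edge cases at the upper end.
    return nodes[min(index, len(nodes) - 1)]


if __name__ == "__main__":
    rng = Lcg(seed=42)
    print(pick_weighted(["a", "b", "c"], [1, 2, 1], rng.next_float()))
```

The cumulative array is rebuilt on every call here for clarity. A registry would more plausibly cache it per service and rebuild it only when nodes or weights change, so each selection costs one O(log n) search.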
Maxine implements comprehensive security measures for production deployments:
- Input Validation: All API endpoints use Joi schema validation with sanitization
- Rate Limiting: Distributed Redis-backed rate limiting to prevent abuse
- Authentication: JWT-based authentication with role-based access control (RBAC)
- API Keys: Secure API key management with configurable rate limits per key
- Mutual TLS: mTLS support for encrypted service-to-service communication
- Audit Logging: Comprehensive logging of all security events and operations
- Dependency Security: All dependencies are regularly audited and updated
- Enable authentication in production: AUTH_ENABLED=true
- Use HTTPS/TLS for all communications
- Configure rate limiting based on your traffic patterns
- Regularly update dependencies and monitor for security advisories
- Use API keys for service-to-service authentication
- Enable audit logging for compliance requirements
# Run security audit
npm audit
# Check for outdated dependencies
npm outdated
# Use ESLint for code quality
npm run lint
npm install
npm start
Maxine runs in Ultra-Fast Mode by default for maximum performance with core features only. For more features, set ULTRA_FAST_MODE=false and LIGHTNING_MODE=true.
Maxine provides comprehensive Kubernetes integration through a custom operator for declarative service registry management:
- ServiceRegistry: Declarative Maxine instance management with auto-scaling, persistence, and multi-cloud support
- ServiceInstance: Automatic service registration and health monitoring for Kubernetes services
- ServicePolicy: Advanced load balancing, circuit breakers, and AI optimization policies
- ServiceMeshOperator: Automated Istio, Linkerd, and Envoy configuration generation
- TrafficPolicy: Fine-grained traffic management with fault injection and canary deployments
- ServiceEndpoint: Direct endpoint management with health checks and metadata
# Install CRDs
kubectl apply -f helm/maxine-operator/crds/
# Install operator
helm install maxine-operator helm/maxine-operator/
# Create a service registry
kubectl apply -f - <<EOF
apiVersion: maxine.io/v1
kind: ServiceRegistry
metadata:
  name: my-registry
spec:
  replicas: 3
  mode: lightning
  config:
    port: 8080
    persistenceEnabled: true
    aiOptimizationEnabled: true
    multiCloudEnabled: true
EOF
- Auto-scaling: Kubernetes HPA integration with custom metrics
- Service Discovery: Automatic registration of Kubernetes services
- AI Optimization: ML-driven load balancing and traffic optimization
- Multi-cloud: Cross-cloud service discovery and failover
- Service Mesh: Automated Istio/Linkerd configuration
- Chaos Engineering: Built-in fault injection and resilience testing
- Advanced Load Balancing Tutorial - Comprehensive guide to load balancing strategies
- WebSocket Events Tutorial - Real-time event streaming and monitoring
- Monitoring and Alerting Guide - Production monitoring setup
- Client SDKs - SDK documentation for multiple languages
- Event Streaming - Event-driven architectures
Maxine uses modern development tools for code quality and security:
# Run linting
npm run lint
# Auto-fix linting issues
npm run lint:fix
# Format code
npm run format
# Run tests
npm test
# Run load tests
npm run load-test
- ESLint: Code linting with security rules
- Prettier: Code formatting
- Joi: Input validation and sanitization
- Mocha: Testing framework
- Istanbul/NYC: Code coverage
Maxine supports optional persistence to maintain registry state across restarts:
- File-based: Saves to registry.json in the working directory
- Redis: Uses Redis for distributed storage
- Memory-mapped (mmap): Zero-copy operations with memory-mapped files for ultra-fast persistence
- Shared Memory (shm): In-memory shared buffer with file backing for maximum performance
- PostgreSQL: Enterprise-grade SQL persistence with connection pooling and advanced querying
- MySQL: High-performance MySQL persistence with optimized schemas and indexing
- TiKV: Distributed key-value store with strong consistency and horizontal scaling
- FoundationDB: Multi-model database with ACID transactions and fault tolerance
Enable with PERSISTENCE_ENABLED=true and set PERSISTENCE_TYPE to one of file, redis, mmap, shm, postgres, mysql, tikv, or foundationdb.
For Redis, configure REDIS_HOST, REDIS_PORT, REDIS_PASSWORD.
Maxine supports Redis-based distributed caching for service discovery results across multiple instances, reducing latency and improving scalability.
Enable with REDIS_CACHE_ENABLED=true and configure Redis settings. Discovery results for deterministic load balancing strategies (consistent-hash, ip-hash, geo-aware, etc.) are cached in Redis with a 30-second TTL, allowing multiple Maxine instances to share cached results and reduce registry load.
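The caching rule described above (only deterministic strategies, 30-second TTL) can be illustrated with an in-process stand-in for Redis. The class name, the key layout `discover:<service>:<strategy>:<client>`, and the injected clock are assumptions for the sketch; the strategy set contains only the three strategies the text names explicitly.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Only the strategies named in the text; the README's "etc." is not expanded.
CACHEABLE_STRATEGIES = {"consistent-hash", "ip-hash", "geo-aware"}
TTL_SECONDS = 30.0


class DiscoveryCache:
    """Dict-backed stand-in for a shared Redis cache (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self.clock = clock
        self.entries: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def key(service_name: str, strategy: str, client_key: str) -> str:
        return f"discover:{service_name}:{strategy}:{client_key}"

    def put(self, service_name: str, strategy: str,
            client_key: str, node_id: str) -> None:
        if strategy not in CACHEABLE_STRATEGIES:
            return  # non-deterministic results are never cached
        expires_at = self.clock() + TTL_SECONDS
        self.entries[self.key(service_name, strategy, client_key)] = (expires_at, node_id)

    def get(self, service_name: str, strategy: str,
            client_key: str) -> Optional[str]:
        if strategy not in CACHEABLE_STRATEGIES:
            return None
        cache_key = self.key(service_name, strategy, client_key)
        entry = self.entries.get(cache_key)
        if entry is None:
            return None
        expires_at, node_id = entry
        if self.clock() >= expires_at:
            del self.entries[cache_key]  # expired: behave like a miss
            return None
        return node_id
```

Strategies such as random or round-robin are skipped because a cached answer would pin every caller to one node for the lifetime of the entry.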
Maxine supports federation to connect multiple instances across datacenters for global service discovery, cross-datacenter replication, and load balancing.
Enable with FEDERATION_ENABLED=true and configure peers with FEDERATION_PEERS=http://peer1:8080,http://peer2:8080.
Additional options: FEDERATION_TIMEOUT (default 5000ms), FEDERATION_RETRY_ATTEMPTS (default 3).
In Lightning Mode, federated registries are queried automatically if a service is not found locally. Registrations and deregistrations are replicated across peers.
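The local-first lookup with peer fallback described above can be sketched as follows. The lookup callables are hypothetical placeholders for whatever performs the actual local or HTTP query; the sketch shows only the ordering and the handling of unreachable peers.

```python
from typing import Callable, Iterable, Optional

# A lookup takes a service name and returns a node record or None.
Lookup = Callable[[str], Optional[dict]]


def discover_with_federation(service_name: str, local: Lookup,
                             peers: Iterable[Lookup]) -> Optional[dict]:
    """Try the local registry first, then each federated peer in order."""
    found = local(service_name)
    if found is not None:
        return found
    for peer in peers:
        try:
            found = peer(service_name)
        except OSError:
            continue  # peer unreachable or timed out: try the next one
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    local_db = {"my-service": {"nodeId": "my-service:localhost:3000"}}
    remote_db = {"billing": {"nodeId": "billing:host1:9000"}}
    print(discover_with_federation("billing", local_db.get, [remote_db.get]))
```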
Maxine includes advanced multi-cluster failover capabilities for high availability:
- Health Monitoring: Continuous health checks of federated registries every 30 seconds
- Replication Lag Detection: Monitors replication lag between clusters with configurable thresholds
- Automatic Failover: Automatically switches to healthy backup registries when primary fails
- Region-Aware Failover: Prioritizes failover targets based on geographic proximity
- Conflict Resolution: Handles service registration conflicts during failover scenarios
GET /api/maxine/serviceops/federation/status
Returns comprehensive failover status including:
- Current primary registry
- Health status of all federated registries
- Replication lag metrics
- Failover priority rankings
- Last health check timestamps
- FEDERATION_PEERS: Comma-separated list of peer URLs with optional priority (e.g., peer1:http://host1:8080,peer2:http://host2:8080)
- REPLICATION_LAG_THRESHOLD: Maximum acceptable replication lag in milliseconds (default: 5000ms)
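One way to combine health, replication lag, and priority when choosing a failover target is sketched below. The `PeerStatus` fields and the rule "lower priority number wins, ties broken by lag" are assumptions; only the 5000 ms default threshold comes from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional

REPLICATION_LAG_THRESHOLD_MS = 5000  # default stated above


@dataclass
class PeerStatus:
    name: str
    url: str
    priority: int   # assumption: lower number means preferred
    healthy: bool
    lag_ms: float


def choose_failover_target(
    peers: List[PeerStatus],
    max_lag_ms: float = REPLICATION_LAG_THRESHOLD_MS,
) -> Optional[PeerStatus]:
    """Return the preferred healthy peer whose lag is within the threshold."""
    candidates = [p for p in peers if p.healthy and p.lag_ms <= max_lag_ms]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (p.priority, p.lag_ms))
```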
Maxine supports optional JWT-based authentication in Lightning Mode to secure sensitive operations like backup/restore and tracing.
Enable with AUTH_ENABLED=true and configure:
- JWT_SECRET: Secret key for JWT signing
- JWT_EXPIRES_IN: Token expiration (default 1h)
- ADMIN_USERNAME: Admin username (default admin)
- ADMIN_PASSWORD_HASH: Bcrypt hash of admin password
Sign in via POST /signin to get a token, then include in requests as Authorization: Bearer <token>.
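A client-side sketch of the sign-in flow, using only the Python standard library. The base URL `http://localhost:8080` is an assumption taken from the curl examples in this README, and the helper names are hypothetical. The request body and the `token` response field follow the POST /signin example in the API reference.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: adjust to your deployment


def build_signin_request(username: str, password: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /signin request without sending it."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/signin",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_authorized_request(path: str, token: str,
                             base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request carrying the JWT as a Bearer token."""
    return urllib.request.Request(
        f"{base_url}{path}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def sign_in(username: str, password: str) -> str:
    """Send the sign-in request and return the token from the response."""
    with urllib.request.urlopen(build_signin_request(username, password)) as resp:
        return json.load(resp)["token"]


if __name__ == "__main__":
    token = sign_in("admin", "yourpassword")
    with urllib.request.urlopen(build_authorized_request("/backup", token)) as resp:
        print(resp.status)
```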
Maxine supports OAuth2 authentication with Google for external user management.
Enable with OAUTH2_ENABLED=true and configure:
- GOOGLE_CLIENT_ID: Google OAuth2 client ID
- GOOGLE_CLIENT_SECRET: Google OAuth2 client secret
Redirect users to GET /auth/google to start the OAuth flow, then handle the callback at GET /auth/google/callback to receive a JWT token.
Maxine supports Role-Based Access Control with fine-grained permissions for different user roles.
- admin: Full access to all operations
- operator: Service management, configuration, monitoring, and advanced features
- viewer: Read-only access to discovery, metrics, and health
- service: Limited access for service registration, discovery, and heartbeat
- GET /api/maxine/roles - List all roles and permissions (admin only)
- GET /api/maxine/user/roles/:username - Get user role (admin only)
- POST /api/maxine/user/roles - Set user role (admin only)
For testing, Maxine includes demo users with different roles:
- admin/admin (admin role)
- operator/operator (operator role)
- viewer/viewer (viewer role)
- service/service (service role)
Maxine supports API key-based authentication with configurable rate limiting for secure service access.
- POST /api/maxine/api-keys/generate - Generate a new API key (admin only)
{ "serviceName": "my-service", "rateLimit": 1000 }
- POST /api/maxine/api-keys/revoke - Revoke an API key (admin only)
{ "apiKey": "your-api-key-here" }
- GET /api/maxine/api-keys - List all API keys (admin only)
- POST /api/maxine/api-keys/validate - Validate an API key
{ "apiKey": "your-api-key-here" }
Include the API key in requests using the X-API-Key header or apiKey query parameter:
curl -H "X-API-Key: your-api-key" http://localhost:8080/discover?serviceName=my-service
API keys are automatically rate limited based on their configured limits.
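Per-key rate limiting can be pictured with a fixed-window counter. This is a sketch under assumptions: the README does not state which algorithm Maxine uses or what time unit `rateLimit` refers to, so the 60-second window and the class name are illustrative.

```python
import time
from typing import Callable, Dict, Tuple


class ApiKeyRateLimiter:
    """Fixed-window limiter keyed by API key (illustrative only)."""

    def __init__(self, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.limits: Dict[str, int] = {}
        # api key -> (window start time, requests counted in this window)
        self.counters: Dict[str, Tuple[float, int]] = {}

    def set_limit(self, api_key: str, limit: int) -> None:
        self.limits[api_key] = limit

    def allow(self, api_key: str) -> bool:
        limit = self.limits.get(api_key)
        if limit is None:
            return False  # unknown or revoked key
        now = self.clock()
        window_start, count = self.counters.get(api_key, (now, 0))
        if now - window_start >= self.window:
            window_start, count = now, 0  # start a fresh window
        if count >= limit:
            self.counters[api_key] = (window_start, count)
            return False
        self.counters[api_key] = (window_start, count + 1)
        return True
```

A fixed window is the simplest option but permits short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more state per key.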
Maxine supports LDAP/Active Directory authentication for enterprise environments.
Enable with LDAP_ENABLED=true and configure:
- LDAP_URL: LDAP server URL (e.g., ldap://localhost:389)
- LDAP_BASE_DN: Base DN for searches (e.g., dc=example,dc=com)
- LDAP_BIND_USER: Bind user DN for authentication
- LDAP_BIND_PASSWORD: Bind user password
When LDAP is enabled, the /signin endpoint will first attempt LDAP authentication, falling back to local users if LDAP fails.
Maxine supports SAML 2.0 authentication for enterprise single sign-on integration.
Enable with SAML_ENABLED=true and configure:
- SAML_ENTRY_POINT: SAML identity provider entry point URL
- SAML_ISSUER: SAML service provider issuer
- SAML_CERT: SAML identity provider certificate (public key)
- SAML_CALLBACK_URL: SAML callback URL (default: http://localhost:8080/auth/saml/callback)
Redirect users to GET /auth/saml to start the SAML authentication flow, then handle the callback at POST /auth/saml/callback to receive a JWT token.
Maxine supports Mutual TLS for encrypted and authenticated service-to-service communication in Lightning Mode.
Enable with MTLS_ENABLED=true and provide certificate paths:
- SERVER_CERT_PATH: Path to server certificate (default: src/main/config/certs/server.crt)
- SERVER_KEY_PATH: Path to server private key (default: src/main/config/certs/server.key)
- CA_CERT_PATH: Path to CA certificate for client verification (default: src/main/config/certs/ca.crt)
To generate self-signed certificates for testing, run:
node src/main/config/certs/generate-certs.js
This creates CA, server, and client certificates. Use client.crt and client.key for client authentication.
Example curl with client cert:
curl --cert src/main/config/certs/client.crt --key src/main/config/certs/client.key --cacert src/main/config/certs/ca.crt https://localhost:8080/health
Maxine supports optional MQTT integration for publishing real-time events to MQTT brokers.
Enable with MQTT_ENABLED=true and configure:
- MQTT_BROKER: MQTT broker URL (default mqtt://localhost:1883)
- MQTT_TOPIC: Base topic for events (default maxine/registry/events)
Events are published to topics like maxine/registry/events/service_registered, maxine/registry/events/circuit_open, etc. with QoS 1.
MQTT publishing is now enabled in the broadcast function for real-time event distribution.
Maxine supports OpenTelemetry tracing for distributed observability. Traces are automatically generated for key operations like service registration, discovery, and deregistration.
Configure Jaeger exporter with JAEGER_ENDPOINT environment variable (default: http://localhost:14268/api/traces).
Tracing is enabled by default and provides detailed spans for:
- Service registration/deregistration
- Service discovery with load balancing
- API request handling
- Registry operations
- Ultra-Fast Mode (default): Extreme performance with minimal features. Core operations only: register, heartbeat, deregister, discover. No logging, metrics, auth, WebSocket, MQTT, gRPC. Uses UDP for heartbeats for speed. Set ULTRA_FAST_MODE=true.
- Lightning Mode: Still fast, with more features than Ultra-Fast Mode. Core operations: register, heartbeat, deregister, discover with advanced load balancing, caching, health checks. Optional JWT auth for sensitive endpoints. Uses root-level API endpoints like /register, /discover. Set ULTRA_FAST_MODE=false LIGHTNING_MODE=true.
- Full Mode: Comprehensive features including federation, tracing, ACLs, intentions, service blacklists, management UI, security, metrics, etc. Uses /api/* endpoints. Set LIGHTNING_MODE=false.
To run in full mode: LIGHTNING_MODE=false npm start
POST /register
Content-Type: application/json
{
"serviceName": "my-service",
"host": "localhost",
"port": 3000,
"metadata": {"version": "1.0", "weight": 1, "tags": ["web", "api"], "healthCheck": {"url": "/health", "interval": 30000, "timeout": 5000}}
}
Note: version in metadata enables service versioning. weight in metadata is used for weighted-random load balancing (default 1). tags in metadata is an array of strings for service tagging and filtering. healthCheck in metadata configures proactive health monitoring with url (default "/health"), interval (default 30000ms), and timeout (default 5000ms).
Response:
{
"nodeId": "my-service:localhost:3000"
}
GET /discover?serviceName=my-service&loadBalancing=round-robin&version=1.0&tags=web,api
Load balancing options: round-robin (default), random, weighted-random, least-connections, weighted-least-connections, consistent-hash, ip-hash, geo-aware, least-response-time, health-score, predictive (uses time-series trend analysis for optimal node selection), ai-driven (uses reinforcement learning for optimal routing), advanced-ml (uses machine learning with predictive analytics for intelligent load balancing), cost-aware (prefers lower-cost nodes like on-prem over cloud), power-of-two-choices (selects two random nodes and picks the one with fewer connections for better load distribution). Custom load balancing strategies can be registered via plugins. Use the version parameter for service versioning. Use the tags parameter to filter services by tags (comma-separated).
Response: Returns a service instance or 404 if not found.
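Of the strategies listed above, power-of-two-choices is easy to show in a few lines. The sketch below follows the description in the text (sample two nodes, keep the one with fewer connections); the function name, the choice to sample two distinct nodes, and the tie-breaking rule are assumptions.

```python
import random
from typing import Dict, Optional, Sequence


def power_of_two_choices(nodes: Sequence[str], connections: Dict[str, int],
                         rng: Optional[random.Random] = None) -> Optional[str]:
    """Sample two distinct nodes and return the one with fewer connections."""
    rng = rng or random.Random()
    if not nodes:
        return None
    if len(nodes) == 1:
        return nodes[0]
    first, second = rng.sample(list(nodes), 2)
    # Ties go to the first sampled node (an arbitrary choice for this sketch).
    if connections.get(first, 0) <= connections.get(second, 0):
        return first
    return second


if __name__ == "__main__":
    active = {"my-service:localhost:3000": 12, "my-service:localhost:3001": 3}
    print(power_of_two_choices(list(active), active))
```

Compared with scanning every node for the minimum, this inspects only two counters per request, and the most loaded node can never be selected when at least two nodes exist and its count is strictly the highest.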
POST /heartbeat
Content-Type: application/json
{
"nodeId": "my-service:localhost:3000"
}
DELETE /deregister
Content-Type: application/json
{
"nodeId": "my-service:localhost:3000"
}
GET /servers
GET /services/:serviceName
Returns all healthy instances of the specified service with their metadata and health status.
Response:
{
"serviceName": "my-service",
"instances": [
{
"nodeId": "my-service:localhost:3000",
"address": "localhost:3000",
"nodeName": "my-service:localhost:3000",
"metadata": { "version": "1.0", "weight": 1 },
"lastHeartbeat": 1640995200000,
"healthy": true
}
]
}
GET /health
Returns status, services count, nodes count.
GET /health/:nodeId
Returns detailed health status for a specific service instance.
Response:
{
"nodeId": "my-service:localhost:3000",
"serviceName": "my-service",
"address": "localhost:3000",
"nodeName": "my-service:localhost:3000",
"metadata": { "version": "1.0" },
"lastHeartbeat": 1640995200000,
"timeSinceLastHeartbeat": 5000,
"healthy": true,
"responseTime": 150
}
GET /metrics
Returns uptime, requests, errors, services, nodes, persistenceEnabled, persistenceType, wsConnections, eventsBroadcasted, cacheHits, cacheMisses.
Additionally, comprehensive Prometheus-compatible metrics are exposed on port 9464 at /metrics, including:
- maxine_service_registrations_total: Total service registrations
- maxine_service_discoveries_total{service_name, strategy}: Total service discoveries by service and strategy
- maxine_service_heartbeats_total: Total heartbeats
- maxine_service_deregistrations_total: Total deregistrations
- maxine_cache_hits_total: Cache hits
- maxine_cache_misses_total: Cache misses
- maxine_redis_cache_hit_total: Redis distributed cache hits
- maxine_redis_cache_miss_total: Redis distributed cache misses
- maxine_services_active: Active services count
- maxine_nodes_active: Active nodes count
- maxine_circuit_breakers_open: Open circuit breakers count
- maxine_response_time_seconds{operation}: Response time histogram for operations (register, discover, heartbeat, deregister)
GET /dashboard
Returns an advanced HTML dashboard with real-time metrics, charts, service topology, and event streaming for comprehensive monitoring. Features include:
- Real-time stats updates via WebSocket
- Interactive charts for node health and cache performance
- Service and node status visualization
- Recent events feed
- Connection status indicators
GET /dependency-graph
Returns an interactive HTML page visualizing the service dependency graph using D3.js. Features include:
- Force-directed graph layout
- Click on nodes to view dependency impact (dependencies and dependents)
- Cycle detection alerts
- Export to JSON or SVG
- Real-time updates (planned)
GET /heapdump
Creates a heap snapshot file for memory profiling (requires heapdump module).
GET /backup
Returns the current registry state as JSON (requires persistence enabled).
POST /restore
Content-Type: application/json
{ ... registry data ... }
Restores registry from backup data (requires persistence enabled).
POST /trace/start
Content-Type: application/json
{
"id": "trace-123",
"operation": "discover"
}
POST /trace/event
Content-Type: application/json
{
"id": "trace-123",
"event": "node selected"
}
POST /trace/end
Content-Type: application/json
{
"id": "trace-123"
}
GET /trace/:id
Returns the trace data for the given id, including start time, duration, events, and status.
OpenTelemetry Integration: Maxine includes comprehensive OpenTelemetry tracing for all registry operations:
- Service registration/deregistration
- Service discovery with load balancing
- Heartbeat operations
- Federation queries
- Configuration updates
Traces are automatically exported to Jaeger or Zipkin when configured. Set JAEGER_ENDPOINT or ZIPKIN_ENDPOINT environment variables to enable trace export.
GET /versions?serviceName=my-service
Response:
{
"serviceName": "my-service",
"versions": ["1.0", "2.0", "default"]
}
Maxine supports DNS-based service discovery for compatibility with standard DNS clients.
Enable with DNS_ENABLED=true (default) and configure DNS_PORT (default 53).
Query SRV records for service discovery:
dig SRV _my-service._tcp.default.default.default @localhost
Or A records for direct IP resolution:
dig A my-service.default.default.default @localhost
This allows integration with DNS-aware applications and load balancers.
GET /anomalies
Returns detected anomalies in the service registry using statistical analysis and machine learning algorithms. Anomalies are prioritized by severity.
Anomaly Types:
- high_circuit_failures: Excessive circuit breaker failures
- no_healthy_nodes: Service has nodes but none are healthy
- no_nodes: Service has no registered nodes
- high_response_time: Response time exceeds 3 standard deviations from mean
- response_time_trend: Significant increase in response times over time
- stale_heartbeat: Node hasn't sent heartbeat within expected interval
- high_error_rate: Error rate exceeds 10%
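The high_response_time rule in the list above (more than 3 standard deviations from the mean) can be written out as a short check. The function name, the use of the population standard deviation, and the shape of the returned dictionary are assumptions for illustration; the README does not specify how the statistics are computed.

```python
import statistics
from typing import List, Optional


def high_response_time(history: List[float], latest: float) -> Optional[dict]:
    """Flag `latest` if it exceeds mean + 3 standard deviations of `history`."""
    if len(history) < 2:
        return None  # not enough samples to estimate spread
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    threshold = mean + 3 * stdev
    if latest > threshold:
        return {"type": "high_response_time", "value": latest, "threshold": threshold}
    return None


if __name__ == "__main__":
    samples = [100.0, 110.0, 90.0, 100.0, 100.0]
    print(high_response_time(samples, 150.0))
```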
Response:
{
"anomalies": [
{
"serviceName": "my-service",
"type": "high_response_time",
"value": 2500,
"threshold": 1500,
"severity": "medium"
},
{
"serviceName": "bad-service",
"type": "no_healthy_nodes",
"severity": "critical"
}
]
}
GET /health-score?serviceName=my-service
Returns health scores (0-100, higher is better) for all healthy nodes in the service, based on response times, failure rates, and circuit breaker state.
Response:
{
"serviceName": "my-service",
"scores": {
"my-service:localhost:3000": 85,
"my-service:localhost:3001": 92
}
}
GET /predict-health?serviceName=my-service&window=300000
Returns health predictions for service nodes using time-series analysis. The window parameter specifies the prediction time window in milliseconds (default: 300000ms / 5 minutes).
Response:
{
"serviceName": "my-service",
"predictions": {
"my-service:localhost:3000": {
"currentScore": 85,
"predictedScore": 78,
"trend": 2.5,
"predictedResponseTime": 180
}
},
"predictionWindow": 300000
}
POST /traffic/set
Content-Type: application/json
{
"serviceName": "my-service",
"distribution": {"1.0": 80, "2.0": 20}
}
POST /version/promote
Content-Type: application/json
{
"serviceName": "my-service",
"version": "2.0"
}
POST /version/retire
Content-Type: application/json
{
"serviceName": "my-service",
"version": "1.0"
}
POST /traffic/shift
Content-Type: application/json
{
"serviceName": "my-service",
"fromVersion": "1.0",
"toVersion": "2.0",
"percentage": 10
}
POST /api/maxine/serviceops/config/set
Content-Type: application/json
{
"serviceName": "my-service",
"key": "timeout",
"value": 5000,
"namespace": "default",
"region": "us-east",
"zone": "zone1"
}
GET /api/maxine/serviceops/config/get?serviceName=my-service&key=timeout&namespace=default&region=us-east&zone=zone1
GET /api/maxine/serviceops/config/all?serviceName=my-service&namespace=default&region=us-east&zone=zone1
GET /api/maxine/serviceops/config/watch?serviceName=my-service&namespace=default&region=us-east&zone=zone1
Returns Server-Sent Events for real-time config changes.
DELETE /api/maxine/serviceops/config/delete
Content-Type: application/json
{
"serviceName": "my-service",
"key": "timeout",
"namespace": "default",
"region": "us-east",
"zone": "zone1"
}
GET /api/maxine/serviceops/envoy/config
Returns Envoy proxy configuration JSON based on registered services, suitable for service mesh integration. Includes enhanced observability with access logging, custom headers, and circuit breaker metrics.
GET /api/maxine/serviceops/service-mesh/metrics
Returns comprehensive service mesh observability metrics including:
- Configuration generations (Envoy, Istio, Linkerd)
- Circuit breaker statistics
- Retry attempt counts
- Service health metrics
- Active service and node counts
GET /service-mesh/istio-config
Returns Istio VirtualService and DestinationRule configurations in JSON format for service mesh deployment.
GET /service-mesh/linkerd-config
Returns Linkerd ServiceProfile configurations in JSON format for service mesh deployment, including retry budgets and route conditions.
Maxine supports the Open Service Broker API for integration with enterprise service catalogs.
GET /v2/catalog
Returns the service catalog in OSB format, listing all registered services and their versions as plans.
Response:
{
"services": [
{
"id": "my-service",
"name": "my-service",
"description": "Service my-service",
"bindable": false,
"plans": [
{
"id": "my-service-1.0",
"name": "1.0",
"description": "Version 1.0 of my-service"
}
]
}
]
}
GET /circuit-breaker/:nodeId
Returns the circuit breaker state for the specified node, including state (closed/open/half-open), failure count, last failure timestamp, and next retry timestamp.
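The fields returned above map onto a conventional three-state circuit breaker. The sketch below is a generic version of that pattern, not Maxine's code: the failure threshold of 3, the 30-second retry delay, and the class and attribute names are all assumptions.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Generic closed/open/half-open breaker (illustrative only)."""

    def __init__(self, failure_threshold: int = 3,
                 retry_after_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.retry_after = retry_after_seconds
        self.clock = clock
        self.state = "closed"
        self.failure_count = 0
        self.last_failure: Optional[float] = None
        self.next_retry: Optional[float] = None

    def allow_request(self) -> bool:
        if self.state == "open":
            if self.next_retry is not None and self.clock() >= self.next_retry:
                self.state = "half-open"  # retry time reached: allow a probe
                return True
            return False
        return True

    def record_success(self) -> None:
        self.state = "closed"
        self.failure_count = 0
        self.next_retry = None

    def record_failure(self) -> None:
        now = self.clock()
        self.failure_count += 1
        self.last_failure = now
        # A failed probe, or too many failures while closed, opens the circuit.
        if self.state == "half-open" or self.failure_count >= self.failure_threshold:
            self.state = "open"
            self.next_retry = now + self.retry_after
```

A caller would check `allow_request()` before routing to the node and report the outcome with `record_success()` or `record_failure()`.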
GET /events?since=<timestamp>&limit=<number>
Returns recent events from the event history. Use since to get events after a specific timestamp (default 0), and limit to cap the number of events returned (default 100).
POST /blacklist/add
Content-Type: application/json
{
"serviceName": "bad-service"
}
DELETE /blacklist/remove
Content-Type: application/json
{
"serviceName": "bad-service"
}
GET /blacklist
Returns the list of blacklisted services.
GET /graphql
POST /graphql
Maxine provides a GraphQL API for flexible queries and mutations. The GraphQL playground is available at /graphql for testing queries.
Queries:
- services: Get all registered services
- service(serviceName: String!): Get a specific service
- discover(serviceName: String!, ip: String, group: String, tags: [String], deployment: String, filter: String): Discover a service instance
- healthScores(serviceName: String!): Get health scores for all nodes in a service
Mutations:
- register(serviceName: String!, nodeName: String!, address: String!, metadata: String): Register a service
- deregister(serviceName: String!, nodeName: String!): Deregister a service
POST /config/set
Content-Type: application/json
{
"serviceName": "my-service",
"key": "timeout",
"value": 5000,
"metadata": {"description": "Request timeout"}
}
GET /config/get?serviceName=my-service&key=timeout
GET /config/all?serviceName=my-service
DELETE /config/delete?serviceName=my-service&key=timeout
POST /record-response-time
Content-Type: application/json
{
"nodeId": "my-service:localhost:3000",
"responseTime": 150
}
Records the response time for a node to enable predictive load balancing based on historical performance data.
POST /record-call
Content-Type: application/json
{
"callerService": "web-service",
"calledService": "api-service"
}
Records a service call for automatic dependency detection. Services can report their outbound calls to enable auto-detection of service dependencies.
POST /api/maxine/serviceops/dependency/add
Content-Type: application/json
{
"serviceName": "my-service",
"dependsOn": "dependent-service"
}
POST /api/maxine/serviceops/dependency/remove
Content-Type: application/json
{
"serviceName": "my-service",
"dependsOn": "dependent-service"
}
GET /api/maxine/serviceops/dependency/get?serviceName=my-service
Response:
{
"serviceName": "my-service",
"dependencies": ["dependent-service"]
}
GET /api/maxine/serviceops/dependency/dependents?serviceName=my-service
Response:
{
"serviceName": "my-service",
"dependents": ["another-service"]
}
GET /api/maxine/serviceops/dependency/graph
Response:
{
"my-service": ["dependent-service"],
"another-service": ["my-service"]
}
GET /api/maxine/serviceops/dependency/cycles
Response:
{
"cycles": [["service-a", "service-b", "service-a"]]
}
POST /api/maxine/serviceops/dependency/analyze
Triggers automatic dependency analysis based on recorded service calls. Dependencies are inferred from call logs where services have called each other above the configured threshold within the time window.
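The inference step described above can be sketched as a count over recent call records. The one-hour window, the minimum of 5 calls, the inclusive comparison, and the function name are assumptions; the README only says that a threshold and a time window are configurable. The caller and called service names mirror the POST /record-call payload.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# (timestamp in seconds, callerService, calledService)
Call = Tuple[float, str, str]


def infer_dependencies(calls: Iterable[Call], now: float,
                       window_seconds: float = 3600.0,
                       min_calls: int = 5) -> Dict[str, List[str]]:
    """Return caller -> sorted list of services it called often enough recently."""
    counts = Counter(
        (caller, called)
        for timestamp, caller, called in calls
        if now - timestamp <= window_seconds and caller != called
    )
    graph: Dict[str, List[str]] = {}
    for (caller, called), seen in counts.items():
        if seen >= min_calls:
            graph.setdefault(caller, []).append(called)
    return {service: sorted(deps) for service, deps in graph.items()}


if __name__ == "__main__":
    log = [(float(t), "web-service", "api-service") for t in range(5)]
    print(infer_dependencies(log, now=10.0))
```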
Response:
{
"success": true,
"message": "Dependency analysis completed"
}
POST /api/maxine/serviceops/compatibility/set
Content-Type: application/json
{
"serviceName": "my-service",
"version": "1.0",
"compatibleVersions": ["1.0", "1.1", "^1.0.0"]
}
GET /api/maxine/serviceops/compatibility/get?serviceName=my-service&version=1.0
Response:
{
"serviceName": "my-service",
"version": "1.0",
"rules": ["1.0", "1.1", "^1.0.0"]
}
POST /api/maxine/serviceops/compatibility/check
Content-Type: application/json
{
"serviceName": "my-service",
"version": "1.0",
"requiredVersion": "1.1"
}
Response:
{
"serviceName": "my-service",
"version": "1.0",
"requiredVersion": "1.1",
"compatible": true
}
POST /api/maxine/serviceops/acl/set
Content-Type: application/json
{
"serviceName": "my-service",
"allow": ["service-a", "service-b"],
"deny": ["service-c"]
}
GET /api/maxine/serviceops/acl/:serviceName
Response:
{
"allow": ["service-a", "service-b"],
"deny": ["service-c"]
}
POST /api/maxine/serviceops/intention/set
Content-Type: application/json
{
"source": "service-a",
"destination": "service-b",
"action": "allow"
}GET /api/maxine/serviceops/intention/:source/:destinationResponse:
{
"source": "service-a",
"destination": "service-b",
"action": "allow"
}

GET /proxy/:serviceName/:path
Proxies requests to a discovered service instance. For example, /proxy/my-service/health will proxy to the health endpoint of a random instance of my-service.
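The path mapping described above can be pictured with a small sketch. The helper name `resolveProxyTarget`, the shape of the `instances` object, and the injectable `random` function are all assumptions made for illustration; this is not Maxine's actual proxy code.

```javascript
// Hypothetical helper: maps /proxy/:serviceName/:path onto one instance.
// `instances` is an assumed shape: { "my-service": ["http://host:port", ...] }.
function resolveProxyTarget(requestPath, instances, random = Math.random) {
  const match = /^\/proxy\/([^/]+)(\/.*)?$/.exec(requestPath);
  if (!match) return null; // not a proxy route
  const [, serviceName, rest = '/'] = match;
  const nodes = instances[serviceName];
  if (!nodes || nodes.length === 0) return null; // nothing to proxy to
  // Pick a random instance, as the description above suggests.
  const base = nodes[Math.floor(random() * nodes.length)];
  // Strip trailing slashes from the base so the joined URL has exactly one.
  return base.replace(/\/+$/, '') + rest;
}

const instances = {
  'my-service': ['http://10.0.0.1:3000', 'http://10.0.0.2:3000'],
};
console.log(resolveProxyTarget('/proxy/my-service/health', instances));
```

Passing the random source as a parameter keeps the selection testable; a real proxy would also forward the method, headers, and body.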
POST /signin
Content-Type: application/json
{
  "username": "admin",
  "password": "yourpassword"
}
Response:
{
  "token": "jwt-token-here"
}
Send the token in the Authorization header as Bearer <token> for protected endpoints such as /backup, /restore, and /trace/*.
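As a rough sketch of the sign-in flow above, the snippet below requests a token and builds the Authorization header. `BASE_URL`, the function names, and the use of GET for /backup are assumptions for illustration; it relies on the global `fetch` available in Node.js 18 and later.

```javascript
// Assumed local deployment; adjust to wherever the registry is running.
const BASE_URL = 'http://localhost:8080';

// Builds the header object for a protected endpoint from a JWT.
function authHeaders(token) {
  return { Authorization: `Bearer ${token}` };
}

// Posts credentials to /signin and resolves with the returned JWT.
async function signIn(username, password) {
  const res = await fetch(`${BASE_URL}/signin`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ username, password }),
  });
  if (!res.ok) throw new Error(`Sign-in failed with status ${res.status}`);
  const { token } = await res.json();
  return token;
}

// Example usage (not executed here; the HTTP method for /backup is assumed):
// const token = await signIn('admin', 'yourpassword');
// const res = await fetch(`${BASE_URL}/backup`, { headers: authHeaders(token) });
```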
Maxine supports OAuth2 with Google for external authentication.
Enable with OAUTH2_ENABLED=true and configure:
- GOOGLE_CLIENT_ID: Google OAuth2 client ID
- GOOGLE_CLIENT_SECRET: Google OAuth2 client secret
- GOOGLE_CALLBACK_URL: Callback URL (default: http://localhost:8080/auth/google/callback)
Start OAuth flow: GET /auth/google
Callback: GET /auth/google/callback returns JWT token.
Maxine includes chaos engineering tools for resilience testing.
POST /api/maxine/chaos/inject-latency
Content-Type: application/json
{
  "serviceName": "my-service",
  "delay": 1000
}

POST /api/maxine/chaos/inject-failure
Content-Type: application/json
{
  "serviceName": "my-service",
  "rate": 0.1
}

POST /api/maxine/chaos/reset
Content-Type: application/json
{
  "serviceName": "my-service"
}

GET /api/maxine/chaos/status

GET /api/maxine/serviceops/scaling/recommendations?serviceName=my-service
Returns scaling recommendations derived from service metrics. Response times, connection counts, and node health are analyzed to suggest scale-up or scale-down actions.
Response:
{
  "serviceName": "my-service",
  "recommendations": [
    {
      "serviceName": "my-service",
      "action": "scale_up",
      "reason": "High response time (1500ms)",
      "confidence": 0.85,
      "metrics": {
        "totalNodes": 2,
        "healthyNodes": 2,
        "avgResponseTime": 1500,
        "avgConnectionsPerNode": 75
      },
      "recommendedInstances": 3
    }
  ],
  "timestamp": "2025-09-24T20:07:53.000Z"
}

POST /refresh-token
Content-Type: application/json
{
  "token": "current-jwt-token"
}
Response:
{
  "token": "new-jwt-token"
}

Maxine supports gRPC for high-performance service registration and discovery.
Default gRPC port: 50051
Available methods:
- Register: Register a service instance
- Discover: Discover a service instance with load balancing
- Heartbeat: Send heartbeat for a service instance
- Deregister: Deregister a service instance
- WatchServices: Stream service updates (basic implementation)
Client SDKs can be generated from api-specs/maxine.proto.
Maxine supports real-time event streaming via WebSocket for monitoring service changes.
ws://localhost:8080
If authentication is enabled, clients must authenticate by sending an auth message with a JWT token:
{
  "auth": "jwt-token-here"
}
Upon successful authentication, the server responds with {"type": "authenticated", "user": {...}}. If authentication fails, the connection is closed.
Role-based access: Certain subscriptions may require specific roles (e.g., admin for admin events).
Clients can subscribe to specific events by sending a JSON message:
{
  "subscribe": {
    "event": "service_registered",
    "serviceName": "my-service"
  }
}
Supported filter criteria:
- event: Filter by event type (e.g., "service_registered", "circuit_open")
- serviceName: Filter by service name
- nodeId: Filter by node ID
To unsubscribe:
{
  "unsubscribe": true
}
To refresh the token:
{
  "refresh_token": true
}
Response: {"type": "token_refreshed", "token": "new-token"}
If no filter is set, all events are received.
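The client messages described above are plain JSON text frames, so they can be assembled with a few small helpers. The function names below are hypothetical; only the documented fields (auth, subscribe, unsubscribe, and the three filter criteria) are used.

```javascript
// Builds the authentication frame from a JWT.
function authMessage(token) {
  return JSON.stringify({ auth: token });
}

// Builds a subscribe frame, keeping only the documented filter criteria.
function subscribeMessage(filter = {}) {
  const allowed = ['event', 'serviceName', 'nodeId'];
  const subscribe = {};
  for (const key of allowed) {
    if (filter[key] !== undefined) subscribe[key] = filter[key];
  }
  return JSON.stringify({ subscribe });
}

// Builds the unsubscribe frame.
function unsubscribeMessage() {
  return JSON.stringify({ unsubscribe: true });
}

console.log(subscribeMessage({ event: 'service_registered', serviceName: 'my-service' }));
// prints {"subscribe":{"event":"service_registered","serviceName":"my-service"}}
```

With any WebSocket client, these strings would be passed to its `send` method once the connection is open.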
The server broadcasts the following events as JSON messages:
- service_registered: When a new service instance is registered
  { "event": "service_registered", "data": { "serviceName": "my-service", "nodeId": "my-service:localhost:3000" }, "timestamp": 1640995200000 }
- service_deregistered: When a service instance is deregistered
  { "event": "service_deregistered", "data": { "nodeId": "my-service:localhost:3000" }, "timestamp": 1640995200000 }
- service_heartbeat: When a service instance sends a heartbeat
  { "event": "service_heartbeat", "data": { "nodeId": "my-service:localhost:3000" }, "timestamp": 1640995200000 }
- service_unhealthy: When a service instance is removed due to expired heartbeat
  { "event": "service_unhealthy", "data": { "nodeId": "my-service:localhost:3000" }, "timestamp": 1640995200000 }
- config_changed: When a service configuration is updated
  { "event": "config_changed", "data": { "serviceName": "my-service", "key": "timeout", "value": 5000, "namespace": "default", "region": "us-east", "zone": "zone1" }, "timestamp": 1640995200000 }
- config_deleted: When a service configuration is deleted
  { "event": "config_deleted", "data": { "serviceName": "my-service", "key": "timeout", "namespace": "default", "region": "us-east", "zone": "zone1" }, "timestamp": 1640995200000 }
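One way a client might consume the events listed above is to mirror them into a local set of live node IDs. The event names and the `data.nodeId` field come from the examples; the `applyEvent` helper and the idea of tracking nodes in a `Set` are assumptions for illustration.

```javascript
// Applies one broadcast frame to a local Set of live node IDs.
function applyEvent(liveNodes, rawMessage) {
  let message;
  try {
    message = JSON.parse(rawMessage);
  } catch {
    return liveNodes; // ignore frames that are not valid JSON
  }
  if (!message || typeof message !== 'object') return liveNodes;
  const nodeId = message.data && message.data.nodeId;
  switch (message.event) {
    case 'service_registered':
    case 'service_heartbeat':
      if (nodeId) liveNodes.add(nodeId);
      break;
    case 'service_deregistered':
    case 'service_unhealthy':
      if (nodeId) liveNodes.delete(nodeId);
      break;
    default:
      break; // config events and non-event replies are not tracked here
  }
  return liveNodes;
}

const live = new Set();
applyEvent(live, '{"event":"service_registered","data":{"serviceName":"my-service","nodeId":"my-service:localhost:3000"},"timestamp":1640995200000}');
applyEvent(live, '{"event":"service_unhealthy","data":{"nodeId":"my-service:localhost:3000"},"timestamp":1640995200000}');
console.log(live.size); // 0
```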
Full Mode provides additional endpoints for advanced features like federation, tracing, ACLs, intentions, and service blacklists. These are available under /api/maxine/serviceops/.
POST /api/maxine/serviceops/federation/add
Content-Type: application/json
{
  "name": "remote-registry",
  "url": "http://remote-maxine:8080"
}

POST /api/maxine/serviceops/federation/remove
Content-Type: application/json
{
  "name": "remote-registry"
}

GET /api/maxine/serviceops/federation

POST /api/maxine/serviceops/trace/start
Content-Type: application/json
{
  "operation": "discover",
  "id": "trace-123"
}

POST /api/maxine/serviceops/trace/event
Content-Type: application/json
{
  "id": "trace-123",
  "event": "node selected"
}

POST /api/maxine/serviceops/trace/end
Content-Type: application/json
{
  "id": "trace-123"
}

GET /api/maxine/serviceops/trace/:id

POST /api/maxine/serviceops/blacklist/service/add
Content-Type: application/json
{
  "serviceName": "bad-service"
}

POST /api/maxine/serviceops/blacklist/service/remove
Content-Type: application/json
{
  "serviceName": "bad-service"
}

GET /api/maxine/serviceops/blacklist/service/:serviceName

Maxine supports custom load balancing strategies through a plugin system. You can register your own load balancing algorithms for specialized routing needs.
const serviceRegistry = global.serviceRegistry;

// Register a custom strategy
serviceRegistry.registerLBPlugin('my-custom-strategy', (nodes, context) => {
  // nodes: array of available service nodes
  // context: { clientIP, serviceName, tags }
  // Return the selected node.
  // Example: select the node with the lowest CPU usage (assumes metadata has a cpu field)
  let bestNode = null;
  let lowestCpu = Infinity;
  for (const node of nodes) {
    // Nodes that report no cpu value rank last instead of counting as 0% usage
    const cpu = node.metadata?.cpu ?? Infinity;
    if (cpu < lowestCpu) {
      lowestCpu = cpu;
      bestNode = node;
    }
  }
  // Fall back to the first node when no node reports CPU usage
  return bestNode || nodes[0];
});

// Now you can use 'my-custom-strategy' in discovery requests
GET /discover?serviceName=my-service&loadBalancing=my-custom-strategy

Maxine includes advanced deep learning capabilities for intelligent load balancing. Using TensorFlow.js, it trains neural networks on historical performance data to predict optimal service nodes.
- Neural Network Models: Feedforward neural networks trained on service metrics
- Time Series Analysis: Advanced analysis including autocorrelation, trend detection, and seasonality
- Predictive Analytics: Forecasts response times, error rates, and load patterns
- Continuous Learning: Models update automatically with new performance data
- Fallback Strategy: Falls back to time-series analysis if deep learning model unavailable
Use the advanced-ml strategy for deep learning-based load balancing:
GET /discover?serviceName=my-service&loadBalancing=advanced-ml
Models are trained automatically on:
- Response times
- Success/failure rates
- Load patterns
- Historical trends
Training occurs every minute with recent performance data. Models are persisted to disk for continuity across restarts.
Access model performance metrics:
const metrics = serviceRegistry.deepLearningService.getModelMetrics('my-service');
// Returns: { loss, mse, mae, accuracy, precision, recall, f1, ... }

Maxine provides client SDKs for easy integration:
- Swift: Lightning Mode API support with async/await for iOS/macOS/watchOS/tvOS
- Kotlin: Lightning Mode API support with coroutines for Android
- Python: Supports both Full Mode and Lightning Mode APIs, including WebSocket for real-time events
- Go: Full Mode API support
- Java: Full Mode API support
- C#: Full Mode API support
- Rust: Full Mode API support
- Dart: Full Mode and Lightning Mode APIs with async/await support for Flutter and Dart applications
- PHP: Full Mode and Lightning Mode APIs with caching support
- Ruby: Full Mode and Lightning Mode APIs with WebSocket support
- C++: High-performance C++ SDK for low-latency applications and game servers (complete with CMake build system and examples)
Client SDKs include caching, automatic retries, and support for all discovery strategies.
Maxine maintains an in-memory registry of services and their instances. Services register with heartbeats, and expired services are automatically cleaned up. Discovery returns a healthy instance using various load balancing strategies.
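The lifecycle described above (register, heartbeat, expiry, discovery) can be sketched in a few dozen lines. The `TinyRegistry` class, its method names, and the 30-second TTL are assumptions for illustration only; the sketch favors readability and does not reproduce Maxine's optimized data structures.

```javascript
// Minimal in-memory registry: Map lookups, heartbeat timestamps,
// a sweep that expires stale nodes, and round-robin discovery.
class TinyRegistry {
  constructor(ttlMs = 30000) {
    this.ttlMs = ttlMs;
    this.services = new Map(); // serviceName -> Map(nodeId -> lastSeen)
    this.cursors = new Map(); // serviceName -> round-robin counter
  }

  // Registering and heartbeating both refresh the node's timestamp.
  heartbeat(serviceName, nodeId, now = Date.now()) {
    if (!this.services.has(serviceName)) this.services.set(serviceName, new Map());
    this.services.get(serviceName).set(nodeId, now);
  }

  deregister(serviceName, nodeId) {
    const nodes = this.services.get(serviceName);
    if (nodes) nodes.delete(nodeId);
  }

  // Drops every node whose last heartbeat is older than the TTL.
  sweep(now = Date.now()) {
    for (const [serviceName, nodes] of this.services) {
      for (const [nodeId, lastSeen] of nodes) {
        if (now - lastSeen > this.ttlMs) nodes.delete(nodeId);
      }
      if (nodes.size === 0) this.services.delete(serviceName);
    }
  }

  // Round-robin over the nodes that are currently registered.
  discover(serviceName) {
    const nodes = this.services.get(serviceName);
    if (!nodes || nodes.size === 0) return null;
    const ids = [...nodes.keys()];
    const i = (this.cursors.get(serviceName) || 0) % ids.length;
    this.cursors.set(serviceName, i + 1);
    return ids[i];
  }
}

const registry = new TinyRegistry(30000);
registry.heartbeat('my-service', 'my-service:localhost:3000', 0);
registry.heartbeat('my-service', 'my-service:localhost:3001', 20000);
registry.sweep(40000); // the first node is 40s stale and gets removed
console.log(registry.discover('my-service')); // my-service:localhost:3001
```

In a running server the sweep would typically be driven by a timer, for example `setInterval(() => registry.sweep(), 30000)`.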
- Lightning Mode: Ultra-fast response times using raw Node.js HTTP server, O(1) lookups using optimized in-memory data structures with lightweight LRU caching (10k entries, 30s TTL), pre-allocated buffer responses, fast LCG PRNG for random selection, advanced load balancing strategies (round-robin, random, weighted-random, least-connections, consistent-hash, ip-hash, geo-aware, predictive), optimized request handling without deferred execution for minimal latency, stripped-down registry with only core features for minimal overhead, memory-mapped and shared memory persistence options
- Full Mode: Comprehensive features with optimized caching, async operations, and JWT authentication
- Minimal memory footprint with efficient data structures
- Automatic cleanup prevents memory leaks with periodic sweeps (every 30 seconds)
- Optimized routing: O(1) Map-based HTTP routing for ultra-fast request handling
- Optimized heartbeat and discovery logic with parallel operations and async I/O
- Active health checks for proactive service monitoring
- Event-driven notifications for real-time updates
- Load test results: 5,000 requests with 50 concurrent users in ~0.37s, average response time 3.53ms, 95th percentile 6.22ms, 100% success rate, 13k req/s
- Load test target: 95th percentile < 10ms for 50 concurrent users (achieved)
- Recent optimizations: Removed console.log statements from production code to reduce I/O overhead, implemented object pooling for response objects to reduce GC pressure, added service health prediction using time-series analysis, adaptive caching with access-based TTL, SIMD-inspired binary search for weighted random selection, fine-tuned GC settings, added CPU affinity, synchronous ultra-fast discovery, pre-allocated JSON buffers
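Two of the techniques named above, an LCG PRNG and binary search for weighted random selection, can be illustrated together. This is a plain textbook sketch, not Maxine's implementation: the function names, the `weight` field, and the choice of LCG constants (the common Numerical Recipes parameters) are assumptions.

```javascript
// Linear congruential generator returning floats in [0, 1).
function makeLcg(seed) {
  let state = seed >>> 0;
  return () => {
    // state = (a * state + c) mod 2^32, using 32-bit integer multiplication
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Precomputes cumulative weights once so each pick is a binary search.
function buildWeightedPicker(nodes) {
  const cumulative = [];
  let total = 0;
  for (const node of nodes) {
    total += node.weight;
    cumulative.push(total);
  }
  return (rand) => {
    const target = rand() * total;
    let lo = 0;
    let hi = cumulative.length - 1;
    // Find the first cumulative weight strictly greater than target.
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (cumulative[mid] > target) hi = mid;
      else lo = mid + 1;
    }
    return nodes[lo];
  };
}

const nodes = [
  { id: 'a', weight: 1 },
  { id: 'b', weight: 3 },
];
const pick = buildWeightedPicker(nodes);
const rand = makeLcg(42);
const counts = { a: 0, b: 0 };
for (let i = 0; i < 10000; i++) counts[pick(rand).id]++;
console.log(counts); // roughly a 1:3 split between a and b
```

Building the cumulative array costs O(n) once per node-list change, after which each selection is O(log n).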
MIT