Compute Node Service

A microservice for provisioning and managing compute nodes in the Pennsieve platform, providing serverless infrastructure for data processing workflows.

Overview

The Compute Node Service is responsible for:

- Creating and managing compute infrastructure for data processing
- Provisioning AWS Fargate containers for compute workloads
- Managing compute node lifecycle (create, read, delete operations)
- Integrating with the Pennsieve workflow management system

Architecture

The service consists of two main components:

1. Lambda Function (API Layer)

Location: lambda/service/
Language: Go
Purpose: Provides REST API endpoints for compute node management
Endpoints:
- POST /compute-nodes - Create a new compute node
- GET /compute-nodes - List all compute nodes
- GET /compute-nodes/{id} - Get specific compute node details
- DELETE /compute-nodes/{id} - Delete a compute node
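
The four endpoints above can be read as a small routing table. The following is a minimal sketch of that dispatch, not the service's actual handler code: the `Route` function and the operation names are hypothetical and exist only to illustrate how a method and path pair maps to one operation.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// Operation names are hypothetical labels for this sketch,
// not identifiers taken from the service.
const (
	OpCreate = "CreateComputeNode"
	OpList   = "ListComputeNodes"
	OpGet    = "GetComputeNode"
	OpDelete = "DeleteComputeNode"
)

// Route maps an HTTP method and path to an operation name and, for the
// single-node endpoints, the compute node ID taken from the path.
func Route(method, path string) (op string, id string, ok bool) {
	const base = "/compute-nodes"
	path = strings.TrimSuffix(path, "/")
	switch {
	case path == base && method == http.MethodPost:
		return OpCreate, "", true
	case path == base && method == http.MethodGet:
		return OpList, "", true
	case strings.HasPrefix(path, base+"/"):
		id = strings.TrimPrefix(path, base+"/")
		if id == "" || strings.Contains(id, "/") {
			return "", "", false
		}
		switch method {
		case http.MethodGet:
			return OpGet, id, true
		case http.MethodDelete:
			return OpDelete, id, true
		}
	}
	// Any other method or path is not one of the documented endpoints.
	return "", "", false
}

func main() {
	op, id, ok := Route("DELETE", "/compute-nodes/abc-123")
	fmt.Println(op, id, ok) // prints "DeleteComputeNode abc-123 true"
}
```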

2. Fargate Provisioner

Location: fargate/compute-node-provisioner/
Language: Go
Purpose: Handles the actual infrastructure provisioning using Terraform
Features:
- Terraform-based infrastructure as code
- Support for both CPU and GPU compute resources
- EFS volume mounting for data persistence
- S3 integration for artifact storage

Prerequisites

- Go 1.x or higher
- Docker and Docker Compose
- AWS CLI configured with appropriate credentials
- Terraform (for infrastructure deployment)
- Make utility

Development Setup

Clone the repository:

git clone https://github.com/Pennsieve/compute-node-service.git
cd compute-node-service

Set up environment variables:

cp fargate/compute-node-provisioner/env.dev.sample .env
# Edit .env with your configuration

Install dependencies:

cd lambda/service && go mod download
cd ../../fargate/compute-node-provisioner && go mod download

Building

Build Lambda Function

make package

This command:

- Builds the Lambda function for ARM64 architecture
- Creates a deployment package as a ZIP file
- Builds and pushes the Fargate provisioner Docker image

Build for Local Testing

docker-compose -f docker-compose.test.yml build

Testing

Run the test suite locally:

make test

For CI environment testing:

make test-ci

Deployment

Deploy Lambda Function

make publish

This will:

- Build the Lambda deployment package
- Upload to S3 bucket (pennsieve-cc-lambda-functions-use1)
- Build and push the Fargate provisioner container

Infrastructure Deployment

The service uses Terraform for infrastructure management:

cd terraform
terraform init
terraform plan -var="environment_name=dev" -var="image_tag=latest"
terraform apply -var="environment_name=dev" -var="image_tag=latest"
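
If the same plan and apply steps need to be driven from Go tooling, the argument list can be assembled programmatically. This is a sketch under that assumption only; `TerraformArgs` is a hypothetical helper, and the example prints the command lines rather than running Terraform.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// TerraformArgs builds the argument list for a terraform subcommand,
// emitting one -var=name=value flag per variable in sorted order so the
// resulting command line is deterministic.
func TerraformArgs(subcommand string, vars map[string]string) []string {
	names := make([]string, 0, len(vars))
	for name := range vars {
		names = append(names, name)
	}
	sort.Strings(names)

	args := []string{subcommand}
	for _, name := range names {
		args = append(args, fmt.Sprintf("-var=%s=%s", name, vars[name]))
	}
	return args
}

func main() {
	// Variable names and values are the ones used in the commands above.
	vars := map[string]string{"environment_name": "dev", "image_tag": "latest"}
	for _, sub := range []string{"plan", "apply"} {
		fmt.Println("terraform", strings.Join(TerraformArgs(sub, vars), " "))
	}
}
```

The returned slice could be passed to `exec.Command("terraform", args...)` from the standard `os/exec` package if execution is wanted.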

Configuration

Environment Variables

Lambda Service

- COMPUTE_NODES_TABLE - DynamoDB table name for storing compute node data
- AWS_REGION - AWS region for deployment

Fargate Provisioner

- COMPUTE_NODE_ID - Unique identifier for the compute node
- ACTION - Action to perform (CREATE/DELETE)
- ACCOUNT_UUID - Account unique identifier
- ACCOUNT_ID - AWS account ID
- ACCOUNT_TYPE - Type of account (e.g., workspace)
- ORG_ID - Organization identifier
- USER_ID - User identifier
- ENV - Environment (dev/staging/prod)
- NODE_NAME - Name of the compute node
- NODE_DESCRIPTION - Description of the compute node
- WM_TAG - Workflow manager Docker image tag
- NODE_IDENTIFIER - Unique node identifier (auto-generated)
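
A minimal sketch of how these variables might be loaded and checked at startup follows. It is not the provisioner's actual code: the `ProvisionerConfig` struct and `LoadConfig` function are hypothetical, and treating COMPUTE_NODE_ID as required is an assumption. The ACTION and ENV checks use only the values listed above.

```go
package main

import (
	"errors"
	"fmt"
)

// ProvisionerConfig is a hypothetical container for the variables above.
type ProvisionerConfig struct {
	ComputeNodeID   string
	Action          string
	AccountUUID     string
	AccountID       string
	AccountType     string
	OrgID           string
	UserID          string
	Env             string
	NodeName        string
	NodeDescription string
	WMTag           string
	NodeIdentifier  string
}

// LoadConfig reads each variable through lookup, which can be os.Getenv
// in a real process or a map-backed function in tests.
func LoadConfig(lookup func(string) string) (ProvisionerConfig, error) {
	cfg := ProvisionerConfig{
		ComputeNodeID:   lookup("COMPUTE_NODE_ID"),
		Action:          lookup("ACTION"),
		AccountUUID:     lookup("ACCOUNT_UUID"),
		AccountID:       lookup("ACCOUNT_ID"),
		AccountType:     lookup("ACCOUNT_TYPE"),
		OrgID:           lookup("ORG_ID"),
		UserID:          lookup("USER_ID"),
		Env:             lookup("ENV"),
		NodeName:        lookup("NODE_NAME"),
		NodeDescription: lookup("NODE_DESCRIPTION"),
		WMTag:           lookup("WM_TAG"),
		NodeIdentifier:  lookup("NODE_IDENTIFIER"),
	}
	// Assumption: the node ID must always be present.
	if cfg.ComputeNodeID == "" {
		return cfg, errors.New("COMPUTE_NODE_ID is required")
	}
	switch cfg.Action {
	case "CREATE", "DELETE":
	default:
		return cfg, fmt.Errorf("ACTION must be CREATE or DELETE, got %q", cfg.Action)
	}
	switch cfg.Env {
	case "dev", "staging", "prod":
	default:
		return cfg, fmt.Errorf("ENV must be dev, staging, or prod, got %q", cfg.Env)
	}
	return cfg, nil
}

func main() {
	// Sample values for illustration; pass os.Getenv to read the real environment.
	sample := map[string]string{
		"COMPUTE_NODE_ID": "node-123",
		"ACTION":          "CREATE",
		"ENV":             "dev",
		"NODE_NAME":       "my-compute-node",
	}
	cfg, err := LoadConfig(func(key string) string { return sample[key] })
	if err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Printf("%s node %s (%s) in %s\n", cfg.Action, cfg.ComputeNodeID, cfg.NodeName, cfg.Env)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the validation testable without touching the process environment.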

Infrastructure Components

AWS Resources Created

- ECS Fargate Tasks - For running compute workloads
- EFS File System - For persistent storage
- S3 Buckets - For artifact and data storage
- DynamoDB Table - For compute node metadata
- CloudWatch Logs - For monitoring and debugging
- IAM Roles - For service permissions
- Lambda Function - For API endpoints

GPU Support

The service includes GPU support through specialized ECS task definitions and capacity providers. GPU resources can be provisioned by specifying appropriate task requirements.

Monitoring

CloudWatch Metrics

The service publishes custom metrics to CloudWatch:

- Compute node creation/deletion events
- Task execution status
- Resource utilization

Logging

All components log to CloudWatch Logs:

- Lambda logs: /aws/lambda/compute-node-service-{env}
- Fargate logs: /ecs/compute-node-provisioner-{env}
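
The `{env}` placeholder is filled with the environment name. As a small illustration, the helpers below (hypothetical names, not part of the service) expand both patterns for the environments listed under ENV.

```go
package main

import "fmt"

// LambdaLogGroup returns the Lambda log group name for an environment.
func LambdaLogGroup(env string) string {
	return fmt.Sprintf("/aws/lambda/compute-node-service-%s", env)
}

// FargateLogGroup returns the Fargate provisioner log group name for an environment.
func FargateLogGroup(env string) string {
	return fmt.Sprintf("/ecs/compute-node-provisioner-%s", env)
}

func main() {
	for _, env := range []string{"dev", "staging", "prod"} {
		fmt.Println(LambdaLogGroup(env))
		fmt.Println(FargateLogGroup(env))
	}
}
```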

CI/CD

The service uses Jenkins for continuous integration and deployment:

- Test Stage: Runs automated tests
- Build & Push Stage: Builds artifacts and Docker images
- Deploy Stage: Deploys to target environment (dev/staging/prod)

The pipeline is triggered on pushes to the main branch.

Project Structure

compute-node-service/
├── lambda/
│   └── service/         # Lambda function code
│       ├── handler/     # HTTP request handlers
│       ├── models/      # Data models
│       ├── runner/      # Task execution logic
│       ├── store_dynamodb/ # DynamoDB integration
│       └── utils/       # Utility functions
├── fargate/
│   └── compute-node-provisioner/  # Fargate provisioner
│       ├── provisioner/ # Provisioning logic
│       ├── scripts/     # Helper scripts
│       └── terraform/   # Infrastructure as code
├── terraform/          # Service infrastructure
├── Makefile           # Build automation
├── Jenkinsfile        # CI/CD pipeline
└── docker-compose.test.yml  # Test environment

API Reference

Create Compute Node

POST /compute-nodes
Content-Type: application/json

{
  "name": "my-compute-node",
  "description": "Processing node for data analysis",
  "account_uuid": "uuid",
  "organization_id": "org-123",
  "user_id": "user-456"
}

Get Compute Nodes

GET /compute-nodes

Get Specific Compute Node

GET /compute-nodes/{id}

Delete Compute Node

DELETE /compute-nodes/{id}

Development Commands

# Run tests
make test

# Clean up test environment
make clean

# Build Lambda package
make package

# Publish Lambda to S3
make publish

# Tidy Go modules
make tidy

# View help
make help

Troubleshooting

Common Issues

- Lambda build fails: Ensure Go is installed and that GOARCH matches the ARM64 build target (arm64)
- Docker build errors: Check that the Docker daemon is running and that registry credentials are configured
- Terraform errors: Verify AWS credentials and permissions
- DynamoDB errors: Check that the table exists and has the expected indexes

Debug Mode

Enable debug logging by setting:

export DEBUG=true
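
How the service reads this flag is not documented here. As a hedged sketch, a Go program could interpret DEBUG with the standard `strconv.ParseBool` and switch to more verbose logging; the `DebugEnabled` helper is hypothetical.

```go
package main

import (
	"log"
	"os"
	"strconv"
)

// DebugEnabled reports whether value parses as a true boolean
// (for example "true" or "1"). Empty or unparseable values count as false.
func DebugEnabled(value string) bool {
	enabled, err := strconv.ParseBool(value)
	return err == nil && enabled
}

func main() {
	logger := log.New(os.Stderr, "", log.LstdFlags)
	if DebugEnabled(os.Getenv("DEBUG")) {
		// Include file and line information when debugging.
		logger.SetFlags(log.LstdFlags | log.Lshortfile)
		logger.Println("debug logging enabled")
	}
	logger.Println("service starting")
}
```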

Contributing

1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Support

For issues and questions:

- Create an issue in the GitHub repository
- Contact the Pennsieve platform team
