A microservice for provisioning and managing compute nodes in the Pennsieve platform, providing serverless infrastructure for data processing workflows.
The Compute Node Service is responsible for:
- Creating and managing compute infrastructure for data processing
- Provisioning AWS Fargate containers for compute workloads
- Managing compute node lifecycle (create, read, delete operations)
- Integrating with the Pennsieve workflow management system
The service consists of two main components:
- Location: `lambda/service/`
- Language: Go
- Purpose: Provides REST API endpoints for compute node management
- Endpoints:
  - `POST /compute-nodes` - Create a new compute node
  - `GET /compute-nodes` - List all compute nodes
  - `GET /compute-nodes/{id}` - Get specific compute node details
  - `DELETE /compute-nodes/{id}` - Delete a compute node
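
For illustration, here is a minimal sketch of how these routes could be dispatched, assuming an API Gateway HTTP API (v2) proxy integration with `aws-lambda-go`. The inline responses are placeholders; this is not the project's actual handler code, which lives under `lambda/service/handler/`.

```go
// Hypothetical routing sketch, not the project's actual handler code.
package main

import (
	"context"
	"net/http"
	"strings"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
)

func handler(ctx context.Context, req events.APIGatewayV2HTTPRequest) (events.APIGatewayV2HTTPResponse, error) {
	method := req.RequestContext.HTTP.Method
	path := req.RawPath

	switch {
	case method == http.MethodPost && path == "/compute-nodes":
		// Create a new compute node from the JSON body.
		return events.APIGatewayV2HTTPResponse{StatusCode: http.StatusCreated, Body: `{"status":"created"}`}, nil
	case method == http.MethodGet && path == "/compute-nodes":
		// List all compute nodes.
		return events.APIGatewayV2HTTPResponse{StatusCode: http.StatusOK, Body: `[]`}, nil
	case method == http.MethodGet && strings.HasPrefix(path, "/compute-nodes/"):
		// Get the compute node identified by the trailing {id} segment.
		return events.APIGatewayV2HTTPResponse{StatusCode: http.StatusOK, Body: `{}`}, nil
	case method == http.MethodDelete && strings.HasPrefix(path, "/compute-nodes/"):
		// Delete the compute node identified by {id}.
		return events.APIGatewayV2HTTPResponse{StatusCode: http.StatusAccepted}, nil
	default:
		return events.APIGatewayV2HTTPResponse{StatusCode: http.StatusNotFound}, nil
	}
}

func main() { lambda.Start(handler) }
```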
- Location: `fargate/compute-node-provisioner/`
- Language: Go
- Purpose: Handles the actual infrastructure provisioning using Terraform
- Features:
  - Terraform-based infrastructure as code
  - Support for both CPU and GPU compute resources
  - EFS volume mounting for data persistence
  - S3 integration for artifact storage
- Go 1.x or higher
- Docker and Docker Compose
- AWS CLI configured with appropriate credentials
- Terraform (for infrastructure deployment)
- Make utility
- Clone the repository:
```bash
git clone https://github.com/Pennsieve/compute-node-service.git
cd compute-node-service
```

- Set up environment variables:

```bash
cp fargate/compute-node-provisioner/env.dev.sample .env
# Edit .env with your configuration
```

- Install dependencies:

```bash
cd lambda/service && go mod download
cd ../../fargate/compute-node-provisioner && go mod download
```

Build the deployment artifacts with:

```bash
make package
```

This command:
- Builds the Lambda function for ARM64 architecture
- Creates a deployment package as a ZIP file
- Builds and pushes the Fargate provisioner Docker image
Build the test environment:

```bash
docker-compose -f docker-compose.test.yml build
```

Run the test suite locally:

```bash
make test
```

For CI environment testing:

```bash
make test-ci
```

To publish build artifacts, run:

```bash
make publish
```

This will:
- Build the Lambda deployment package
- Upload to the S3 bucket (`pennsieve-cc-lambda-functions-use1`)
- Build and push the Fargate provisioner container
The service uses Terraform for infrastructure management:
```bash
cd terraform
terraform init
terraform plan -var="environment_name=dev" -var="image_tag=latest"
terraform apply -var="environment_name=dev" -var="image_tag=latest"
```

Lambda environment variables:

- `COMPUTE_NODES_TABLE` - DynamoDB table name for storing compute node data
- `AWS_REGION` - AWS region for deployment
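
As an illustration of how compute node metadata might be written to the table named by `COMPUTE_NODES_TABLE`, here is a hedged sketch using `aws-sdk-go-v2`. The record shape and field names are assumptions, not the actual `lambda/service/store_dynamodb/` code.

```go
// Hypothetical sketch of persisting a compute node record to DynamoDB.
package main

import (
	"context"
	"log"
	"os"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/feature/dynamodb/attributevalue"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb"
)

// ComputeNode is an assumed record shape for illustration only.
type ComputeNode struct {
	ID          string `dynamodbav:"id"`
	Name        string `dynamodbav:"name"`
	Description string `dynamodbav:"description"`
	UserID      string `dynamodbav:"user_id"`
}

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := dynamodb.NewFromConfig(cfg)

	node := ComputeNode{ID: "node-123", Name: "my-compute-node", Description: "example", UserID: "user-456"}
	item, err := attributevalue.MarshalMap(node)
	if err != nil {
		log.Fatal(err)
	}

	// COMPUTE_NODES_TABLE is read from the environment, as described above.
	_, err = client.PutItem(ctx, &dynamodb.PutItemInput{
		TableName: aws.String(os.Getenv("COMPUTE_NODES_TABLE")),
		Item:      item,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```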
Fargate provisioner environment variables:

- `COMPUTE_NODE_ID` - Unique identifier for the compute node
- `ACTION` - Action to perform (CREATE/DELETE)
- `ACCOUNT_UUID` - Account unique identifier
- `ACCOUNT_ID` - AWS account ID
- `ACCOUNT_TYPE` - Type of account (e.g., workspace)
- `ORG_ID` - Organization identifier
- `USER_ID` - User identifier
- `ENV` - Environment (dev/staging/prod)
- `NODE_NAME` - Name of the compute node
- `NODE_DESCRIPTION` - Description of the compute node
- `WM_TAG` - Workflow manager Docker image tag
- `NODE_IDENTIFIER` - Unique node identifier (auto-generated)
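
A hedged sketch of how the provisioner might act on `ACTION`, reading its configuration from the environment and invoking Terraform via `os/exec`. The working directory and variable names are assumptions; the actual logic lives in `fargate/compute-node-provisioner/provisioner/`.

```go
// Hypothetical provisioner entrypoint sketch, not the project's actual code.
package main

import (
	"log"
	"os"
	"os/exec"
)

// runTerraform shells out to the terraform CLI in the given directory.
func runTerraform(dir string, args ...string) error {
	cmd := exec.Command("terraform", args...)
	cmd.Dir = dir
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	action := os.Getenv("ACTION") // CREATE or DELETE
	nodeID := os.Getenv("COMPUTE_NODE_ID")
	env := os.Getenv("ENV") // dev/staging/prod

	// Assumed location of the provisioner's Terraform module.
	tfDir := "terraform"

	if err := runTerraform(tfDir, "init", "-input=false"); err != nil {
		log.Fatal(err)
	}

	switch action {
	case "CREATE":
		if err := runTerraform(tfDir, "apply", "-auto-approve",
			"-var=compute_node_id="+nodeID, "-var=environment_name="+env); err != nil {
			log.Fatal(err)
		}
	case "DELETE":
		if err := runTerraform(tfDir, "destroy", "-auto-approve",
			"-var=compute_node_id="+nodeID, "-var=environment_name="+env); err != nil {
			log.Fatal(err)
		}
	default:
		log.Fatalf("unknown ACTION %q", action)
	}
}
```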
The service provisions and relies on the following AWS resources:

- ECS Fargate Tasks - For running compute workloads
- EFS File System - For persistent storage
- S3 Buckets - For artifact and data storage
- DynamoDB Table - For compute node metadata
- CloudWatch Logs - For monitoring and debugging
- IAM Roles - For service permissions
- Lambda Function - For API endpoints
The service includes GPU support through specialized ECS task definitions and capacity providers. GPU resources can be provisioned by specifying appropriate task requirements.
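As a hedged illustration of the "task requirements" part, the sketch below registers an ECS task definition whose container requests a GPU via `aws-sdk-go-v2`. The family, image, and count are placeholders and this is not the service's actual task definition.

```go
// Hypothetical sketch of a GPU resource requirement on an ECS container definition.
package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/ecs"
	"github.com/aws/aws-sdk-go-v2/service/ecs/types"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := ecs.NewFromConfig(cfg)

	// Family, image, and GPU count below are placeholders for illustration.
	_, err = client.RegisterTaskDefinition(ctx, &ecs.RegisterTaskDefinitionInput{
		Family: aws.String("compute-node-gpu"),
		ContainerDefinitions: []types.ContainerDefinition{{
			Name:  aws.String("worker"),
			Image: aws.String("example/worker:latest"),
			ResourceRequirements: []types.ResourceRequirement{{
				Type:  types.ResourceTypeGpu, // request one GPU for the container
				Value: aws.String("1"),
			}},
		}},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```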
The service publishes custom metrics to CloudWatch:
- Compute node creation/deletion events
- Task execution status
- Resource utilization
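
For illustration, a hedged sketch of emitting such a custom metric with `aws-sdk-go-v2`; the namespace, metric name, and dimensions are assumptions, not the service's actual metric schema.

```go
// Hypothetical sketch of publishing a compute node creation event as a custom metric.
package main

import (
	"context"
	"log"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/cloudwatch"
	"github.com/aws/aws-sdk-go-v2/service/cloudwatch/types"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := cloudwatch.NewFromConfig(cfg)

	_, err = client.PutMetricData(ctx, &cloudwatch.PutMetricDataInput{
		Namespace: aws.String("ComputeNodeService"), // assumed namespace
		MetricData: []types.MetricDatum{{
			MetricName: aws.String("ComputeNodeCreated"),
			Timestamp:  aws.Time(time.Now()),
			Value:      aws.Float64(1),
			Unit:       types.StandardUnitCount,
			Dimensions: []types.Dimension{{
				Name:  aws.String("Environment"),
				Value: aws.String("dev"),
			}},
		}},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```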
All components log to CloudWatch Logs:
- Lambda logs: `/aws/lambda/compute-node-service-{env}`
- Fargate logs: `/ecs/compute-node-provisioner-{env}`
The service uses Jenkins for continuous integration and deployment:
- Test Stage: Runs automated tests
- Build & Push Stage: Builds artifacts and Docker images
- Deploy Stage: Deploys to target environment (dev/staging/prod)
The pipeline is triggered on pushes to the main branch.
Project structure:

```
compute-node-service/
├── lambda/
│   └── service/                     # Lambda function code
│       ├── handler/                 # HTTP request handlers
│       ├── models/                  # Data models
│       ├── runner/                  # Task execution logic
│       ├── store_dynamodb/          # DynamoDB integration
│       └── utils/                   # Utility functions
├── fargate/
│   └── compute-node-provisioner/    # Fargate provisioner
│       ├── provisioner/             # Provisioning logic
│       ├── scripts/                 # Helper scripts
│       └── terraform/               # Infrastructure as code
├── terraform/                       # Service infrastructure
├── Makefile                         # Build automation
├── Jenkinsfile                      # CI/CD pipeline
└── docker-compose.test.yml          # Test environment
```
Example request to create a compute node:

```
POST /compute-nodes
Content-Type: application/json

{
  "name": "my-compute-node",
  "description": "Processing node for data analysis",
  "account_uuid": "uuid",
  "organization_id": "org-123",
  "user_id": "user-456"
}
```

The remaining endpoints:

```
GET /compute-nodes
GET /compute-nodes/{id}
DELETE /compute-nodes/{id}
```

Available make targets:

```bash
# Run tests
make test
# Clean up test environment
make clean
# Build Lambda package
make package
# Publish Lambda to S3
make publish
# Tidy Go modules
make tidy
# View help
make help
```

Common issues:

- Lambda build fails: Ensure Go is installed and GOARCH is set correctly
- Docker build errors: Check Docker daemon is running and credentials are configured
- Terraform errors: Verify AWS credentials and permissions
- DynamoDB errors: Check table exists and has proper indexes
Enable debug logging by setting:
```bash
export DEBUG=true
```

To contribute:

- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
For issues and questions:
- Create an issue in the GitHub repository
- Contact the Pennsieve platform team