# Add Hugging Face sentiment analysis example #25
**README.md** (new file, +260 lines)
# flash-sentiment

A Flash application demonstrating distributed GPU and CPU computing on Runpod's serverless infrastructure.

## About This Template

This project was generated using `flash init`. The `flash-sentiment` placeholder is automatically replaced with your actual project name during initialization.

## Quick Start

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```
### 2. Configure Environment

Create a `.env` file:

```bash
RUNPOD_API_KEY=your_api_key_here
```

Get your API key from [Runpod Settings](https://www.runpod.io/console/user/settings).
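The `flash` CLI presumably reads this key from the environment when deploying. If you want to sanity-check it from Python first, here is a minimal sketch, assuming `python-dotenv` (which may not be part of this template's `requirements.txt`):

```python
# Optional sanity check -- assumes python-dotenv (pip install python-dotenv),
# which is not necessarily listed in this template's requirements.txt.
import os

from dotenv import load_dotenv

load_dotenv()  # copies RUNPOD_API_KEY from .env into the process environment
assert os.getenv("RUNPOD_API_KEY"), "RUNPOD_API_KEY is not set"
```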
### 3. Run Locally

```bash
# Standard run
flash run

# Faster development: pre-provision endpoints (eliminates cold-start delays)
flash run --auto-provision
```

The server starts at **http://localhost:8000**.

With `--auto-provision`, all serverless endpoints deploy before testing begins. This speeds up development because endpoints are cached and reused across server restarts; subsequent runs skip deployment and start immediately.
### 4. Test the API

```bash
# Health check
curl http://localhost:8000/ping

# GPU worker
curl -X POST http://localhost:8000/gpu/hello \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello GPU!"}'

# CPU worker
curl -X POST http://localhost:8000/cpu/hello \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello CPU!"}'
```
Visit **http://localhost:8000/docs** for interactive API documentation.
## What This Demonstrates

### GPU Worker (`workers/gpu/`)

A simple GPU-based serverless function:
- Remote execution with the `@remote` decorator
- GPU resource configuration
- Automatic scaling (0-3 workers)
- No external dependencies required

```python
@remote(
    resource_config=LiveServerless(
        name="gpu_worker",
        gpus=[GpuGroup.ADA_24],  # RTX 4090
        workersMin=0,
        workersMax=3,
    )
)
async def gpu_hello(input_data: dict) -> dict:
    # Your GPU code here
    return {"status": "success", "message": "Hello from GPU!"}
```
### CPU Worker (`workers/cpu/`)

A simple CPU-based serverless function:
- CPU-only execution (no GPU overhead)
- `CpuLiveServerless` configuration
- Efficient for API endpoints
- Automatic scaling (0-5 workers)

```python
@remote(
    resource_config=CpuLiveServerless(
        name="cpu_worker",
        instanceIds=[CpuInstanceType.CPU3G_2_8],  # 2 vCPU, 8GB RAM
        workersMin=0,
        workersMax=5,
    )
)
async def cpu_hello(input_data: dict) -> dict:
    # Your CPU code here
    return {"status": "success", "message": "Hello from CPU!"}
```
## Project Structure

```
flash-sentiment/
├── main.py              # FastAPI application
├── workers/
│   ├── gpu/             # GPU worker
│   │   ├── __init__.py  # FastAPI router
│   │   └── endpoint.py  # @remote decorated function
│   └── cpu/             # CPU worker
│       ├── __init__.py  # FastAPI router
│       └── endpoint.py  # @remote decorated function
├── .env                 # Environment variables
├── requirements.txt     # Dependencies
└── README.md            # This file
```
## Key Concepts

### Remote Execution

The `@remote` decorator transparently executes functions on serverless infrastructure:
- Code runs locally during development
- Automatically deploys to Runpod when configured
- Handles serialization, dependencies, and resource management
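To make the call path concrete, here is a minimal sketch of how a FastAPI route might await the `gpu_hello` worker shown above. The router wiring is an assumption: `workers/gpu/__init__.py` is not shown in this diff, so treat the import path and route names as illustrative.

```python
# Hypothetical router wiring -- workers/gpu/__init__.py is not part of
# this diff, so the module path and names here are assumptions.
from fastapi import APIRouter

from workers.gpu.endpoint import gpu_hello

router = APIRouter(prefix="/gpu")


@router.post("/hello")
async def hello(payload: dict) -> dict:
    # Awaiting the @remote-decorated function runs it locally during
    # development, or dispatches it to the Runpod endpoint once deployed.
    return await gpu_hello(payload)
```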
### Resource Scaling

Both workers scale to zero when idle to minimize costs:
- **idleTimeout**: Minutes before scaling down (default: 5)
- **workersMin**: 0 = scales completely to zero
- **workersMax**: Maximum concurrent workers
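As a sketch, these knobs slot into the resource configs shown earlier. The `idleTimeout` keyword is inferred from the bullet list above and does not appear in this template's code, so verify the exact name and units against the runpod_flash documentation:

```python
# Sketch only: scaling parameters applied to the GPU worker config.
# idleTimeout is assumed from the README's description (minutes, default 5);
# confirm the parameter name in the runpod_flash docs before relying on it.
LiveServerless(
    name="gpu_worker",
    gpus=[GpuGroup.ADA_24],
    workersMin=0,    # scale all the way down to zero when idle
    workersMax=3,    # hard cap on concurrent workers (bounds cost)
    idleTimeout=5,   # minutes of inactivity before scaling down
)
```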
### GPU Types

Available GPU options for `LiveServerless`:
- `GpuGroup.ADA_24` - RTX 4090 (24GB)
- `GpuGroup.ADA_48_PRO` - RTX 6000 Ada, L40 (48GB)
- `GpuGroup.AMPERE_80` - A100 (80GB)
- `GpuGroup.ANY` - Any available GPU
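Since `gpus` takes a list, multiple groups can presumably be offered at once. Whether the scheduler treats the list as an ordered preference is an assumption, not behavior documented in this PR:

```python
# Assumption: listing several GpuGroup entries lets the scheduler pick any
# of them; only the single-entry form appears elsewhere in this template.
LiveServerless(
    name="gpu_worker",
    gpus=[GpuGroup.AMPERE_80, GpuGroup.ADA_48_PRO],  # large-memory cards only
    workersMin=0,
    workersMax=3,
)
```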
### CPU Types

Available CPU options for `CpuLiveServerless`:
- `CpuInstanceType.CPU3G_2_8` - 2 vCPU, 8GB RAM (General Purpose)
- `CpuInstanceType.CPU3C_4_8` - 4 vCPU, 8GB RAM (Compute Optimized)
- `CpuInstanceType.CPU5G_4_16` - 4 vCPU, 16GB RAM (Latest Gen)
- `CpuInstanceType.ANY` - Any available CPU
**main.py** (new file, +64 lines)
The file opens with these imports:

```python
import logging
import os
import sentiment  # noqa: F401
from sentiment import classify

from fastapi import FastAPI
```
A suggested change on lines +3 to +8 deduplicates the imports:

```diff
-import sentiment  # noqa: F401
-from sentiment import classify
-from fastapi import FastAPI
+from sentiment import classify
+from fastapi import FastAPI
```
**Copilot AI** commented on Feb 13, 2026:

> The `/` response omits the new `/classify` endpoint, so users won't discover the main sentiment-analysis functionality from the homepage payload. Add `/classify` to the returned endpoints (and ideally update the message/description to mention sentiment analysis).
| "message": "Flash Application", | |
| "docs": "/docs", | |
| "endpoints": {"gpu_hello": "/gpu/hello", "cpu_hello": "/cpu/hello"}, | |
| "message": "Flash Application - Sentiment Analysis", | |
| "docs": "/docs", | |
| "endpoints": { | |
| "gpu_hello": "/gpu/hello", | |
| "cpu_hello": "/cpu/hello", | |
| "classify": "/classify", | |
| }, |
**Copilot AI** commented on Feb 13, 2026:

> Imports are split across the file (`from pydantic import BaseModel` mid-file and additional `runpod_flash` imports at the bottom). This breaks the import pattern used across other examples and makes it easy to miss unused/duplicate code; move imports to the top and remove unused ones.
**Copilot AI** commented on Feb 13, 2026:

> The trailing `runpod_flash` imports and `cpu_config = LiveServerless(...)` block at the bottom are unused and duplicate the config in `sentiment.py`. Because they execute at import time, they add confusion and can cause accidental name collisions; remove this dead code.
The accompanying suggestion deletes the dead block:

```diff
 uvicorn.run(app, host=host, port=port)
-from runpod_flash import remote, LiveServerless, CpuInstanceType
-cpu_config = LiveServerless(
-    name="flash-ai-sentiment",
-    instanceIds=[CpuInstanceType.CPU3G_2_8],
-    workersMax=1,
-)
```
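For context, `sentiment.py` itself is not shown in this excerpt. Below is a minimal sketch of what its `classify` worker might look like, assuming the Hugging Face `transformers` pipeline and reusing the endpoint name from the dead block above; all names here are illustrative, not the PR's actual code:

```python
# Hypothetical sketch of sentiment.py -- the real file is not in this
# excerpt. Assumes a Hugging Face transformers pipeline and the
# runpod_flash decorators demonstrated in the README.
from runpod_flash import remote, CpuLiveServerless, CpuInstanceType


@remote(
    resource_config=CpuLiveServerless(
        name="flash-ai-sentiment",  # name borrowed from the dead config block
        instanceIds=[CpuInstanceType.CPU3G_2_8],  # 2 vCPU, 8GB RAM
        workersMin=0,
        workersMax=1,
    )
)
async def classify(input_data: dict) -> dict:
    # Import inside the worker so the heavy model dependency loads on the
    # serverless endpoint rather than at local import time.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")  # default DistilBERT SST-2
    result = classifier(input_data["text"])[0]
    return {"label": result["label"], "score": float(result["score"])}
```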
**Mothership endpoint configuration** (new file, +55 lines)
| """ | ||
| Mothership Endpoint Configuration | ||
|
|
||
| The mothership endpoint serves your FastAPI application routes. | ||
| It is automatically deployed as a CPU-optimized load-balanced endpoint. | ||
|
|
||
| To customize this configuration: | ||
| - Modify worker scaling: change workersMin and workersMax values | ||
| - Use GPU load balancer: import LiveLoadBalancer instead of CpuLiveLoadBalancer | ||
| - Change endpoint name: update the 'name' parameter | ||
|
|
||
| To disable mothership deployment: | ||
| - Delete this file, or | ||
| - Comment out the 'mothership' variable below | ||
|
|
||
| Documentation: https://docs.runpod.io/flash/mothership | ||
| """ | ||
|
|
||
| from runpod_flash import CpuLiveLoadBalancer | ||
|
|
||
| # Mothership endpoint configuration | ||
| # This serves your FastAPI app routes from main.py | ||
| mothership = CpuLiveLoadBalancer( | ||
| name="mothership", | ||
| workersMin=1, | ||
| workersMax=3, | ||
| ) | ||
|
Comment on lines
+23
to
+27
|
||
|
|
||
| # Examples of customization: | ||
|
|
||
| # Increase scaling for high traffic | ||
| # mothership = CpuLiveLoadBalancer( | ||
| # name="mothership", | ||
| # workersMin=2, | ||
| # workersMax=10, | ||
| # ) | ||
|
|
||
| # Use GPU-based load balancer instead of CPU | ||
| # (requires importing LiveLoadBalancer) | ||
| # from runpod_flash import LiveLoadBalancer | ||
| # mothership = LiveLoadBalancer( | ||
| # name="mothership", | ||
| # gpus=[GpuGroup.ANY], | ||
| # ) | ||
|
|
||
| # Custom endpoint name | ||
| # mothership = CpuLiveLoadBalancer( | ||
| # name="my-api-gateway", | ||
| # workersMin=1, | ||
| # workersMax=3, | ||
| # ) | ||
|
|
||
| # To disable mothership: | ||
| # - Delete this entire file, or | ||
| # - Comment out the 'mothership' variable above | ||
**Copilot AI** commented:

> The README reads like the generic `flash init` template and doesn't mention Hugging Face / sentiment analysis in the overview. Update the intro / "What This Demonstrates" sections to match the actual purpose of this example (Hugging Face sentiment classification).