Feat/dl unified api server#151
Conversation
- Enhanced BERT-based emotion classifier for journal entries - Multi-label emotion classification (28 emotions from GoEmotions) - Temperature scaling for calibrated predictions - Comprehensive emotion labels and descriptions - Standalone test script for validation - Configuration file with SAMO-specific optimizations - Error handling and logging improvements Files: - src/models/emotion_detection/samo_bert_emotion_classifier.py - src/models/emotion_detection/emotion_labels.py - configs/samo_emotion_detection_config.yaml - test_samo_emotion_detection_standalone.py This completes PR-3 of the surgical breakdown plan.
- Fixed NameError for classifier_dropout_prob and freeze_bert_layers - Updated constructor to use self.classifier_dropout_prob and self.freeze_bert_layers - Model now initializes correctly and passes standalone tests - Maintains 110M parameters with 66M frozen BERT layers Part of PR-3: Emotion Detection model completion
- Created FastAPI-based unified server integrating T5, Whisper, and BERT models - Individual endpoints: /summarize, /transcribe, /detect-emotions - Combined pipeline endpoint: /process-audio (transcription -> summary -> emotions) - Comprehensive error handling and validation - Health monitoring endpoint - CORS support for web applications - Request/response models with Pydantic validation - Background task cleanup for uploaded files Part of PR-4: Unified API Server implementation
- Added API configuration file (samo_api_config.yaml) with model settings - Created comprehensive test suite (test_unified_api_server.py) with mocked tests - Added startup script (start_api_server.py) with CLI arguments - Updated API requirements file with all necessary dependencies - Includes health checks, validation, error handling, and combined pipeline tests Part of PR-4: Unified API Server implementation
- Fixed emotion detection model loading (tuple unpacking issue) - Successfully tested all endpoints: /summarize, /transcribe, /detect-emotions, /process-audio - Combined pipeline working: transcription -> summarization -> emotion detection - All models loading correctly with proper error handling - API server running on http://localhost:8000 with full documentation Part of PR-4: Unified API Server - all endpoints functional and tested
WalkthroughIntroduces a FastAPI-based unified API server with endpoints for summarization, transcription, emotion detection, and a combined pipeline. Adds SAMO BERT emotion classifier and emotion label utilities. Provides configs for API and emotion detection, updates API dependencies to ML-centric stack, adds server start script, and comprehensive tests. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant C as Client
participant S as SAMOUnifiedAPIServer (FastAPI)
rect rgba(200,235,255,0.25)
note right of S: Model loading on startup
S->>S: _load_models() for T5, Whisper, BERT
S-->>S: Set availability flags per model
end
C->>S: POST /summarize {text, params}
alt summarizer available
S->>S: Run T5 summarization
S-->>C: {summary, lengths, time, model_info}
else unavailable
S-->>C: 503 Service Unavailable
end
C->>S: POST /emotions {text, threshold, top_k}
alt emotion model available
S->>S: Tokenize, forward, sigmoid, threshold/top-k
S-->>C: {emotions, probabilities, predictions, time, model_info}
else unavailable
S-->>C: 503 Service Unavailable
end
C->>S: POST /transcribe (audio multipart)
alt transcriber available & type allowed
S->>S: Save temp file
S->>S: Whisper transcription
S->>S: Cleanup temp file (background)
S-->>C: {text, language, confidence, duration, metrics}
else type unsupported
S-->>C: 400 Unsupported audio format
else transcriber unavailable
S-->>C: 503 Service Unavailable
end
C->>S: POST /process (audio multipart + options)
alt pipeline models available
S->>S: Transcribe
opt summary requested
S->>S: Summarize text
end
S->>S: Detect emotions
S-->>C: {transcription, summary?, emotions, total_time, steps}
else any model missing
S-->>C: 503 Service Unavailable
end
C->>S: GET /health
S-->>C: {status, timestamp, models_loaded, memory_usage}
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks (2 passed, 1 warning)❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Poem
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Here's the code health analysis summary for commits Analysis Summary
|
There was a problem hiding this comment.
Summary of Changes
Hello @uelkerd, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a new unified API server designed to centralize access to various deep learning models. It streamlines the interaction with summarization, transcription, and emotion detection capabilities, providing both individual model access and integrated processing pipelines. This change significantly updates the project's API infrastructure and underlying dependencies to support advanced natural language and audio processing.
Highlights
- New Unified API Server: Introduced a single FastAPI server to consolidate and expose various deep learning models, streamlining access to summarization, transcription, and emotion detection capabilities.
- Integrated Deep Learning Models: The server now integrates a T5 model for text summarization, a Whisper model for audio transcription, and a custom BERT-based model for emotion detection.
- Comprehensive Emotion Detection: A new BERT-based multi-label emotion classifier has been implemented, supporting 28 distinct emotion categories (GoEmotions dataset) with detailed configuration and utility functions for emotion labels.
- API Endpoints and Pipelines: Dedicated API endpoints are available for each model's functionality (summarize, transcribe, detect-emotions), alongside a powerful combined endpoint for end-to-end audio processing (transcription -> summarization -> emotion analysis).
- Dependency Overhaul: The API's dependencies have been significantly updated to align with the new deep learning stack, including
torch,transformers,openai-whisper, andscikit-learn, while removing older web framework and database-related packages. - Robust Testing Framework: New standalone test scripts for the emotion detection model and a comprehensive pytest suite for the unified API server have been added to ensure reliability and correct functionality.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Pull Request Overview
This PR introduces a comprehensive unified API server for the SAMO-DL system that integrates three core deep learning models: T5 summarization, Whisper transcription, and BERT emotion detection. The implementation provides both individual model endpoints and combined processing pipelines for journal entry analysis.
Key changes include:
- Implementation of a FastAPI-based unified server with REST endpoints
- Integration of all three SAMO models with proper error handling and validation
- Comprehensive test suite covering individual endpoints and combined processing workflows
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
src/models/unified_api_server.py |
Core FastAPI server implementation with model integration and endpoint definitions |
src/models/emotion_detection/samo_bert_emotion_classifier.py |
BERT-based emotion classifier with multi-label support and temperature scaling |
src/models/emotion_detection/emotion_labels.py |
GoEmotions emotion categories and utility functions |
tests/test_unified_api_server.py |
Comprehensive test suite for API endpoints and combined processing |
test_samo_emotion_detection_standalone.py |
Standalone tests for emotion detection model |
scripts/start_api_server.py |
Server startup script with configuration options |
dependencies/requirements-api.txt |
Updated API dependencies for the unified server |
configs/samo_emotion_detection_config.yaml |
Configuration for emotion detection model |
configs/samo_api_config.yaml |
Unified API server configuration |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
|
||
| try: | ||
| # Save uploaded file temporarily | ||
| temp_path = f"/tmp/{file.filename}" |
There was a problem hiding this comment.
Using /tmp directly with user-provided filenames creates a path traversal vulnerability. Filenames could contain '../' sequences to write outside the intended directory. Use tempfile.mkstemp() or tempfile.NamedTemporaryFile() instead.
| if not self.models["transcriber"]: | ||
| raise HTTPException(status_code=503, detail="Transcription model not available") | ||
|
|
||
| temp_path = f"/tmp/{file.filename}" |
There was a problem hiding this comment.
Same path traversal vulnerability as in the transcribe endpoint. Use secure temporary file creation methods instead of constructing paths with user input.
| # Configure CORS | ||
| self.app.add_middleware( | ||
| CORSMiddleware, | ||
| allow_origins=["*"], # Configure for production |
There was a problem hiding this comment.
Allowing all origins with wildcard '*' is a security risk in production. This should be configured with specific allowed origins or read from environment variables.
| self.freeze_bert_layers = config["freeze_bert_layers"] | ||
| self.temperature = nn.Parameter(torch.ones(1) * config["temperature"]) | ||
| self.class_weights = None | ||
| self.prediction_threshold = 0.6 # Updated from 0.5 to 0.6 based on calibration |
There was a problem hiding this comment.
The hardcoded threshold value should be configurable rather than embedded in the class. Consider making this a parameter that can be set during initialization or through configuration.
| summary = self.models["summarizer"].generate_summary( | ||
| transcription_result.text, | ||
| max_length=summary_max_length | ||
| ) | ||
|
|
||
| summary_response = SummarizationResponse( | ||
| summary=summary, | ||
| original_length=len(transcription_result.text), | ||
| summary_length=len(summary), | ||
| processing_time=0.0, # Would need to track separately |
There was a problem hiding this comment.
The processing_time is hardcoded to 0.0 with a comment indicating it should be tracked separately. This provides incorrect timing information to API consumers. Implement proper timing measurement for the summarization step.
| summary = self.models["summarizer"].generate_summary( | |
| transcription_result.text, | |
| max_length=summary_max_length | |
| ) | |
| summary_response = SummarizationResponse( | |
| summary=summary, | |
| original_length=len(transcription_result.text), | |
| summary_length=len(summary), | |
| processing_time=0.0, # Would need to track separately | |
| start_time = time.perf_counter() | |
| summary = self.models["summarizer"].generate_summary( | |
| transcription_result.text, | |
| max_length=summary_max_length | |
| ) | |
| end_time = time.perf_counter() | |
| summary_processing_time = end_time - start_time | |
| summary_response = SummarizationResponse( | |
| summary=summary, | |
| original_length=len(transcription_result.text), | |
| summary_length=len(summary), | |
| processing_time=summary_processing_time, |
| emotion_results = self.models["emotion_detector"].predict_emotions( | ||
| transcription_result.text, | ||
| threshold=emotion_threshold | ||
| ) | ||
|
|
||
| emotion_response = EmotionDetectionResponse( | ||
| emotions=emotion_results["emotions"][0] if emotion_results["emotions"] else [], | ||
| probabilities=emotion_results["probabilities"][0] if emotion_results["probabilities"] else [], | ||
| predictions=emotion_results["predictions"][0] if emotion_results["predictions"] else [], | ||
| processing_time=0.0, |
There was a problem hiding this comment.
Similar to the summarization step, the emotion detection processing_time is hardcoded to 0.0. This should measure the actual processing time for the emotion detection step.
| emotion_results = self.models["emotion_detector"].predict_emotions( | |
| transcription_result.text, | |
| threshold=emotion_threshold | |
| ) | |
| emotion_response = EmotionDetectionResponse( | |
| emotions=emotion_results["emotions"][0] if emotion_results["emotions"] else [], | |
| probabilities=emotion_results["probabilities"][0] if emotion_results["probabilities"] else [], | |
| predictions=emotion_results["predictions"][0] if emotion_results["predictions"] else [], | |
| processing_time=0.0, | |
| emotion_start_time = time.time() | |
| emotion_results = self.models["emotion_detector"].predict_emotions( | |
| transcription_result.text, | |
| threshold=emotion_threshold | |
| ) | |
| emotion_processing_time = time.time() - emotion_start_time | |
| emotion_response = EmotionDetectionResponse( | |
| emotions=emotion_results["emotions"][0] if emotion_results["emotions"] else [], | |
| probabilities=emotion_results["probabilities"][0] if emotion_results["probabilities"] else [], | |
| predictions=emotion_results["predictions"][0] if emotion_results["predictions"] else [], | |
| processing_time=emotion_processing_time, |
| return { | ||
| "input_ids": encoding["input_ids"].squeeze(0), | ||
| "attention_mask": encoding["attention_mask"].squeeze(0), | ||
| "token_type_ids": encoding.get("token_type_ids", torch.zeros_like(encoding["input_ids"])).squeeze(0), | ||
| "labels": label_tensor, | ||
| } |
There was a problem hiding this comment.
Creating zeros_like tensor as a fallback for every item could be inefficient. Consider checking if the tokenizer actually returns token_type_ids and handle the fallback at the model level instead.
| return { | |
| "input_ids": encoding["input_ids"].squeeze(0), | |
| "attention_mask": encoding["attention_mask"].squeeze(0), | |
| "token_type_ids": encoding.get("token_type_ids", torch.zeros_like(encoding["input_ids"])).squeeze(0), | |
| "labels": label_tensor, | |
| } | |
| item = { | |
| "input_ids": encoding["input_ids"].squeeze(0), | |
| "attention_mask": encoding["attention_mask"].squeeze(0), | |
| "labels": label_tensor, | |
| } | |
| if "token_type_ids" in encoding: | |
| item["token_type_ids"] = encoding["token_type_ids"].squeeze(0) | |
| return item |
There was a problem hiding this comment.
Code Review
This pull request introduces a unified API server, which is a significant and valuable addition. The code is generally well-structured with good separation of concerns. However, I've identified several critical issues related to configuration management, security, and correctness that need to be addressed before this can be considered production-ready. Specifically, the server configuration from YAML files is not being loaded, there's a security vulnerability in file handling, and the emotion detection model returns incorrect labels. I've left detailed comments with suggestions on how to fix these issues. Additionally, there are some improvements to be made in dependency management and testing. Great start on a complex feature!
| try: | ||
| logger.info("🚀 Starting SAMO Unified API Server") | ||
| logger.info(f"Host: {args.host}") | ||
| logger.info(f"Port: {args.port}") | ||
| logger.info(f"Workers: {args.workers}") | ||
| logger.info(f"Reload: {args.reload}") | ||
| logger.info(f"Config: {args.config}") | ||
|
|
||
| # Create and start server | ||
| server = SAMOUnifiedAPIServer() | ||
|
|
||
| logger.info("✅ Server initialized successfully") | ||
| logger.info("📖 API Documentation: http://localhost:8000/docs") | ||
| logger.info("🔄 ReDoc Documentation: http://localhost:8000/redoc") | ||
| logger.info("💚 Health Check: http://localhost:8000/health") | ||
|
|
||
| # Start the server | ||
| server.run(host=args.host, port=args.port) | ||
|
|
||
| except KeyboardInterrupt: | ||
| logger.info("🛑 Server shutdown requested by user") | ||
| except Exception as e: | ||
| logger.error(f"❌ Failed to start server: {e}") | ||
| sys.exit(1) |
There was a problem hiding this comment.
The script parses command-line arguments like --config, --workers, and --reload, but they are not actually used to configure or run the server. The configuration file path from --config is logged but never loaded, and other arguments are not passed to uvicorn. This is a critical issue as the server will always run with its hardcoded default settings, ignoring any configuration provided via the command line or YAML file.
The SAMOUnifiedAPIServer should be initialized with the loaded configuration, and the run method should accept and pass through the uvicorn-related arguments.
| def _load_models(self): | ||
| """Load all SAMO models.""" | ||
| try: | ||
| logger.info("Loading T5 Summarization Model...") | ||
| self.models["summarizer"] = create_t5_summarizer("t5-small") | ||
| logger.info("✅ T5 Summarization Model loaded") | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"❌ Failed to load T5 Summarization Model: {e}") | ||
| self.models["summarizer"] = None | ||
|
|
||
| try: | ||
| logger.info("Loading Whisper Transcription Model...") | ||
| self.models["transcriber"] = create_whisper_transcriber("base") | ||
| logger.info("✅ Whisper Transcription Model loaded") | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"❌ Failed to load Whisper Transcription Model: {e}") | ||
| self.models["transcriber"] = None | ||
|
|
||
| try: | ||
| logger.info("Loading BERT Emotion Detection Model...") | ||
| model, loss_fn = create_samo_bert_emotion_classifier() | ||
| self.models["emotion_detector"] = model | ||
| logger.info("✅ BERT Emotion Detection Model loaded") | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"❌ Failed to load BERT Emotion Detection Model: {e}") | ||
| self.models["emotion_detector"] = None |
There was a problem hiding this comment.
The models are being loaded with hardcoded parameters (e.g., create_t5_summarizer("t5-small")). This makes the server difficult to configure for different environments (development, staging, production) without changing the code. The server should be loading its configuration from the samo_api_config.yaml file.
I recommend modifying the SAMOUnifiedAPIServer's __init__ method to accept a configuration dictionary, which is then used within _load_models to initialize the models with the correct parameters from the config file.
| # SAMO Unified API Server Requirements | ||
| # Dependencies for running the unified API server with all models | ||
|
|
||
| # Core FastAPI and web framework | ||
| fastapi==0.104.1 | ||
| uvicorn[standard]==0.24.0 | ||
| pydantic==2.5.0 | ||
|
|
||
| # Machine Learning and Transformers | ||
| torch>=2.0.0 | ||
| transformers>=4.35.0 | ||
| datasets>=2.15.0 | ||
| accelerate>=0.24.0 | ||
|
|
||
| # Audio processing for Whisper | ||
| openai-whisper>=20231117 | ||
| pydub>=0.25.1 | ||
| librosa>=0.10.0 | ||
|
|
||
| # Scientific computing | ||
| numpy>=1.24.0 | ||
| scipy>=1.11.0 | ||
|
|
||
| # Data processing and ML | ||
| scikit-learn>=1.3.0 | ||
| pandas>=2.1.0 | ||
|
|
||
| # Configuration and utilities | ||
| pyyaml>=6.0 | ||
| python-multipart>=0.0.6 | ||
|
|
||
| # Development and testing | ||
| pytest>=7.4.0 | ||
| pytest-asyncio>=0.21.0 | ||
| httpx>=0.25.0 | ||
| pytest-mock>=3.12.0 | ||
|
|
||
| # Logging and monitoring | ||
| structlog>=23.2.0 | ||
|
|
||
| # Optional: GPU support (uncomment if needed) | ||
| # torch-audio>=2.0.0 # For better audio processing on GPU | ||
|
|
||
| # Optional: Model optimization | ||
| # onnxruntime>=1.16.0 # For ONNX model inference | ||
| # optimum>=1.14.0 # For optimized transformers No newline at end of file |
There was a problem hiding this comment.
The dependencies in this file are inconsistently pinned. Some packages have exact versions (e.g., fastapi==0.104.1), while others have open-ended ranges (e.g., torch>=2.0.0). For production applications, it's crucial to have a fully pinned set of dependencies to ensure reproducible builds and avoid unexpected issues from transitive dependency updates.
I recommend pinning all direct and transitive dependencies to specific versions. A good practice is to use a tool like pip-tools to manage a requirements.in file with your direct dependencies and generate a fully-pinned requirements.txt for deployment.
| emotions = [ | ||
| f"emotion_{i}" for i, p in enumerate(pred) if p > 0 | ||
| ] |
There was a problem hiding this comment.
The predict_emotions function is returning generic emotion labels like emotion_0, emotion_1. This is not user-friendly as API consumers won't know what these labels correspond to. The function should return the actual emotion names (e.g., 'joy', 'sadness').
You can achieve this by using the get_emotion_name utility from the emotion_labels module to map the index to the correct name.
emotions = [
get_emotion_name(i) for i, p in enumerate(pred) if p > 0
]| temp_path = f"/tmp/{file.filename}" | ||
| with open(temp_path, "wb") as buffer: | ||
| content = await file.read() | ||
| buffer.write(content) |
There was a problem hiding this comment.
Constructing a file path directly from a user-provided filename (f"/tmp/{file.filename}") is a security vulnerability known as Path Traversal. A malicious user could provide a filename like ../../etc/passwd to attempt to overwrite sensitive system files.
To fix this, you should use the tempfile module to create a secure temporary file. This ensures that the file is created with a random, non-guessable name in a secure temporary directory.
| temp_path = f"/tmp/{file.filename}" | |
| with open(temp_path, "wb") as buffer: | |
| content = await file.read() | |
| buffer.write(content) | |
| import tempfile | |
| with tempfile.NamedTemporaryFile(delete=False, suffix=Path(file.filename).suffix) as temp_file: | |
| content = await file.read() | |
| temp_file.write(content) | |
| temp_path = temp_file.name |
| def test_model_unavailable_errors(self, client): | ||
| """Test error handling when models are not available.""" | ||
| # Temporarily set models to None | ||
| original_models = client.app.state.models.copy() | ||
|
|
||
| try: | ||
| # Mock unavailable models | ||
| client.app.state.models = { | ||
| "summarizer": None, | ||
| "transcriber": None, | ||
| "emotion_detector": None | ||
| } | ||
|
|
||
| # Test summarization | ||
| response = client.post("/summarize", json={"text": "Test text"}) | ||
| assert response.status_code == 503 | ||
| assert "not available" in response.json()["detail"] | ||
|
|
||
| # Test emotion detection | ||
| response = client.post("/detect-emotions", json={"text": "Test text"}) | ||
| assert response.status_code == 503 | ||
| assert "not available" in response.json()["detail"] | ||
|
|
||
| finally: | ||
| # Restore original models | ||
| client.app.state.models = original_models |
There was a problem hiding this comment.
This test is intended to verify error handling when models are unavailable, but it has a flaw. It attempts to modify client.app.state.models, but the models are actually stored in api_server.models. Because the test patches the wrong object, it doesn't correctly simulate the model unavailability scenario and will pass vacuously while the underlying logic remains untested.
To fix this, the test function should take the api_server fixture as an argument and modify api_server.models directly.
def test_model_unavailable_errors(self, api_server, client):
"""Test error handling when models are not available."""
# Temporarily set models to None
original_models = api_server.models.copy()
try:
# Mock unavailable models
api_server.models = {
"summarizer": None,
"transcriber": None,
"emotion_detector": None
}
# Test summarization
response = client.post("/summarize", json={"text": "Test text"})
assert response.status_code == 503
assert "not available" in response.json()["detail"]
# Test emotion detection
response = client.post("/detect-emotions", json={"text": "Test text"})
assert response.status_code == 503
assert "not available" in response.json()["detail"]
finally:
# Restore original models
api_server.models = original_modelsThere was a problem hiding this comment.
Actionable comments posted: 13
🧹 Nitpick comments (30)
configs/samo_api_config.yaml (1)
86-86: Fix missing newline at end of fileYAML files should end with a newline character.
enable_docs: true enable_redoc: true reload_on_change: false +configs/samo_emotion_detection_config.yaml (1)
13-187: Remove trailing spaces throughout the fileThe file contains numerous trailing spaces that should be removed for consistency and to follow YAML best practices.
Apply this command to remove all trailing spaces:
sed -i 's/[[:space:]]*$//' configs/samo_emotion_detection_config.yamlsrc/models/emotion_detection/emotion_labels.py (3)
1-1: File has shebang but is not executableThe file starts with a shebang (
#!/usr/bin/env python3) but doesn't have executable permissions. Either remove the shebang or make the file executable.Either remove the shebang:
-#!/usr/bin/env python3 """ Emotion Labels for SAMO-DL Emotion DetectionOr make the file executable if it's intended to be run directly:
chmod +x src/models/emotion_detection/emotion_labels.py
180-183: Use exception chaining for better error contextWhen re-raising exceptions, use
fromto preserve the original exception context.try: return GOEMOTIONS_EMOTIONS.index(emotion.lower()) except ValueError: - raise ValueError(f"Emotion '{emotion}' not found in GoEmotions list") + raise ValueError(f"Emotion '{emotion}' not found in GoEmotions list") from None
337-337: Remove unnecessary f-string prefixThe string doesn't contain any placeholders, so the f-string prefix is not needed.
- print(f"\nEmotion descriptions:") + print("\nEmotion descriptions:")scripts/start_api_server.py (3)
1-1: File has shebang but is not executableThe script has a shebang line but lacks executable permissions.
Make the script executable since it's intended to be run directly:
chmod +x scripts/start_api_server.py
88-90: Use logging.exception for better error diagnosticsWhen catching exceptions, use
logging.exceptioninstead oflogging.errorto include the full traceback.except Exception as e: - logger.error(f"❌ Failed to start server: {e}") + logger.exception("❌ Failed to start server") sys.exit(1)
79-81: Documentation URLs assume localhostThe hardcoded localhost URLs in log messages won't be correct if the server is bound to a different host or accessed remotely.
- logger.info("📖 API Documentation: http://localhost:8000/docs") - logger.info("🔄 ReDoc Documentation: http://localhost:8000/redoc") - logger.info("💚 Health Check: http://localhost:8000/health") + base_url = f"http://{args.host if args.host != '0.0.0.0' else 'localhost'}:{args.port}" + logger.info(f"📖 API Documentation: {base_url}/docs") + logger.info(f"🔄 ReDoc Documentation: {base_url}/redoc") + logger.info(f"💚 Health Check: {base_url}/health")tests/test_unified_api_server.py (2)
1-1: Remove unnecessary shebang from test fileTest files are typically not executed directly but run through pytest, so the shebang is unnecessary.
-#!/usr/bin/env python3 """ Test Suite for SAMO Unified API Server
28-32: Consider adding cleanup in test fixturesThe API server fixture loads heavy ML models but doesn't have cleanup. Consider adding proper teardown to free resources.
@pytest.fixture def api_server(self): """Create API server instance for testing.""" server = SAMOUnifiedAPIServer() - return server + yield server + # Cleanup models to free memory + if hasattr(server, 'models'): + server.models.clear()test_samo_emotion_detection_standalone.py (6)
1-1: Shebang is unused (file not executable).Either make the file executable (
chmod +x) or drop the shebang to silence EXE001.-#!/usr/bin/env python3 +# This file can be run with: python test_samo_emotion_detection_standalone.py
13-15: Prefer robust import paths over sys.path manipulation.sys.path hacking is brittle. Consider packaging (editable install) or resolving repo root explicitly.
-sys.path.insert(0, str(Path(__file__).parent / "src")) +repo_root = Path(__file__).resolve().parent +src_dir = repo_root / "src" +if str(src_dir) not in sys.path: + sys.path.insert(0, str(src_dir))
70-85: Guard against label/probability misalignment and cast predictions to ints.Model returns float 0/1 predictions; also ensure probability vector length matches label list to avoid IndexErrors.
- emotions = results['emotions'][0] - probabilities = results['probabilities'][0] + emotions = results['emotions'][0] + probabilities = results['probabilities'][0] + preds = [int(p) for p in results.get('predictions', [[]])[0]] + assert len(probabilities) == len(all_emotions) == len(preds), "Label/probability length mismatch"
96-111: Enforce positive temperature.set_temperature accepts any float; negative or zero would break scaling. Clamp here (and/or inside the model).
- model.set_temperature(0.5) # Lower temperature = more confident + model.set_temperature(max(0.001, 0.5)) # Lower temperature = more confident @@ - model.set_temperature(2.0) # Higher temperature = less confident + model.set_temperature(max(0.001, 2.0)) # Higher temperature = less confident
155-167: Avoid blindexcept Exception; use logging.exception and keep stack trace.Catching broad exceptions in tests hides real failures.
- except Exception as e: - print(f"❌ Error testing emotion classifier: {e}") - import traceback - traceback.print_exc() - raise + except Exception: + import logging + logging.exception("❌ Error testing emotion classifier") + raise
200-202: Same here: avoid blind catch.Mirror the improvement above for performance test.
- except Exception as e: - print(f"❌ Error in performance test: {e}") + except Exception: + import logging + logging.exception("❌ Error in performance test")src/models/emotion_detection/samo_bert_emotion_classifier.py (6)
30-36: Library modules shouldn’t call logging.basicConfig or globally suppress warnings.Leave logging configuration to the application; warning suppression may hide important signals.
-# Configure logging -logging.basicConfig(level=logging.INFO) logger = logging.getLogger(__name__) - -# Suppress warnings for cleaner output -warnings.filterwarnings("ignore", category=UserWarning)
84-87: Inconsistent thresholds across module.Default prediction_threshold is 0.6, while evaluate_emotion_classifier uses 0.2, and API uses 0.5. Align these or let callers pass None to use model default.
- self.prediction_threshold = 0.6 # Updated from 0.5 to 0.6 based on calibration + self.prediction_threshold = 0.6 # Calibrated defaultAnd in evaluate_emotion_classifier (see separate comment) prefer
threshold: Optional[float] = Noneand fallback tomodel.prediction_threshold.
187-205: Type hint:thresholdshould be Optional[float]; provide a precise return type.PEP 484 warns on default None without Optional; also probabilities are List[List[float]], not List[float].
- def predict_emotions( + def predict_emotions( self, texts: Union[str, List[str]], - threshold: float = None, + threshold: Optional[float] = None, top_k: Optional[int] = None, batch_size: int = 32, - ) -> Dict[str, Union[List[str], List[float], List[List[int]]]]: + ) -> Dict[str, List]:Or define a TypedDict for stronger typing:
from typing import TypedDict class EmotionPredictionResult(TypedDict): emotions: List[List[str]] probabilities: List[List[float]] predictions: List[List[int]]and return EmotionPredictionResult.
270-273: Clamp temperature on setter.- def set_temperature(self, temperature: float) -> None: + def set_temperature(self, temperature: float) -> None: """Set temperature scaling parameter.""" - self.temperature.data.fill_(temperature) + self.temperature.data.fill_(max(1e-6, float(temperature))) logger.info(f"Set temperature to {temperature}")
431-436: Align evaluation threshold with model default.-def evaluate_emotion_classifier( +def evaluate_emotion_classifier( model: SAMOBERTEmotionClassifier, dataloader: DataLoader, device: torch.device, - threshold: float = 0.2, # Lowered from 0.5 to capture more predictions + threshold: Optional[float] = None, ) -> Dict[str, float]: @@ - model.eval() + model.eval() + if threshold is None: + threshold = getattr(model, "prediction_threshold", 0.5)
489-521: Main block: avoid blind catch; log stack trace properly.Use logger.exception to preserve traceback.
- except Exception as e: - print(f"❌ Error testing emotion classifier: {e}") - raise + except Exception: + logger.exception("❌ Error testing emotion classifier") + raisesrc/models/unified_api_server.py (8)
128-135: CORS: avoid wildcard origins with credentials in production.allow_origins=["*"] with allow_credentials=True can be problematic. Configure explicit origins via settings.
- allow_origins=["*"], # Configure for production + allow_origins=["http://localhost", "https://your.app"], # TODO: read from config allow_credentials=True,
209-212: Use logger.exception and preserve cause when converting to HTTPException.Provides stack trace and better debugging.
- except Exception as e: - logger.error(f"Summarization error: {e}") - raise HTTPException(status_code=500, detail=f"Summarization failed: {str(e)}") + except Exception as e: + logger.exception("Summarization error") + raise HTTPException(status_code=500, detail="Summarization failed") from e
224-227: Validate file size and sanitize name early.Add basic size limit and basename sanitization to reduce risk and improve UX.
- if not file.filename.lower().endswith(('.mp3', '.wav', '.m4a', '.ogg', '.flac')): + if not file.filename.lower().endswith(('.mp3', '.wav', '.m4a', '.ogg', '.flac')): raise HTTPException(status_code=400, detail="Unsupported audio format") + safe_name = Path(file.filename).name # strip any paths + # Optional: enforce max size via app server/nginx; here we can soft-check content-length header if provided.
257-260: Same logging/exception improvement for transcription.- except Exception as e: - logger.error(f"Transcription error: {e}") - raise HTTPException(status_code=500, detail=f"Transcription failed: {str(e)}") + except Exception as e: + logger.exception("Transcription error") + raise HTTPException(status_code=500, detail="Transcription failed") from e
348-355: Fix summary_length when summarizer is unavailable.Currently hard-codes 200 though an ellipsis is added; compute actual length.
- summary_response = SummarizationResponse( - summary=transcription_result.text[:200] + "...", + fallback_summary = transcription_result.text[:200] + "..." + summary_response = SummarizationResponse( + summary=fallback_summary, original_length=len(transcription_result.text), - summary_length=200, + summary_length=len(fallback_summary), processing_time=0.0, model_info={"error": "Summarization model not available"} )
397-400: Use logger.exception and preserve cause in combined pipeline error.- except Exception as e: - logger.error(f"Combined processing error: {e}") - raise HTTPException(status_code=500, detail=f"Processing failed: {str(e)}") + except Exception as e: + logger.exception("Combined processing error") + raise HTTPException(status_code=500, detail="Processing failed") from e
424-427: Binding to 0.0.0.0 exposes service publicly.Keep for container use, but consider making host configurable and default to localhost for dev.
- def run(self, host: str = "0.0.0.0", port: int = 8000): + def run(self, host: str = "127.0.0.1", port: int = 8000):
185-206: Avoid blocking the event loop with CPU-bound work; offload to threadpool.Model inference is synchronous and heavy; use
asyncio.to_threadorrun_in_threadpoolfor better concurrency.- summary = self.models["summarizer"].generate_summary( + from fastapi.concurrency import run_in_threadpool + summary = await run_in_threadpool(self.models["summarizer"].generate_summary, request.text, max_length=request.max_length, min_length=request.min_length, num_beams=request.num_beams )Apply similarly for
transcriber.transcribe(...)andemotion_detector.predict_emotions(...).Also applies to: 262-287, 294-396
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
configs/samo_api_config.yaml(1 hunks)configs/samo_emotion_detection_config.yaml(1 hunks)dependencies/requirements-api.txt(1 hunks)scripts/start_api_server.py(1 hunks)src/models/emotion_detection/emotion_labels.py(1 hunks)src/models/emotion_detection/samo_bert_emotion_classifier.py(1 hunks)src/models/unified_api_server.py(1 hunks)test_samo_emotion_detection_standalone.py(1 hunks)tests/test_unified_api_server.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
scripts/start_api_server.py (1)
src/models/unified_api_server.py (1)
SAMOUnifiedAPIServer(115-427)
tests/test_unified_api_server.py (1)
src/models/unified_api_server.py (1)
SAMOUnifiedAPIServer(115-427)
test_samo_emotion_detection_standalone.py (2)
src/models/emotion_detection/samo_bert_emotion_classifier.py (5)
create_samo_bert_emotion_classifier(388-428)count_parameters(275-277)count_frozen_parameters(279-281)predict_emotions(187-268)set_temperature(270-273)src/models/emotion_detection/emotion_labels.py (2)
get_all_emotions(270-277)get_emotion_description(244-254)
src/models/unified_api_server.py (3)
src/models/summarization/t5_summarizer.py (1)
T5SummarizationModel(120-348)src/models/voice_processing/whisper_transcriber.py (1)
WhisperTranscriber(181-438)src/models/emotion_detection/samo_bert_emotion_classifier.py (2)
create_samo_bert_emotion_classifier(388-428)SAMOBERTEmotionClassifier(38-281)
🪛 YAMLlint (1.37.1)
configs/samo_api_config.yaml
[error] 86-86: no new line character at the end of file
(new-line-at-end-of-file)
configs/samo_emotion_detection_config.yaml
[error] 13-13: trailing spaces
(trailing-spaces)
[error] 16-16: trailing spaces
(trailing-spaces)
[error] 19-19: trailing spaces
(trailing-spaces)
[error] 28-28: trailing spaces
(trailing-spaces)
[error] 31-31: trailing spaces
(trailing-spaces)
[error] 34-34: trailing spaces
(trailing-spaces)
[error] 43-43: trailing spaces
(trailing-spaces)
[error] 47-47: trailing spaces
(trailing-spaces)
[error] 51-51: trailing spaces
(trailing-spaces)
[error] 55-55: trailing spaces
(trailing-spaces)
[error] 66-66: trailing spaces
(trailing-spaces)
[error] 69-69: trailing spaces
(trailing-spaces)
[error] 83-83: trailing spaces
(trailing-spaces)
[error] 86-86: trailing spaces
(trailing-spaces)
[error] 96-96: trailing spaces
(trailing-spaces)
[error] 105-105: trailing spaces
(trailing-spaces)
[error] 108-108: trailing spaces
(trailing-spaces)
[error] 117-117: trailing spaces
(trailing-spaces)
[error] 121-121: trailing spaces
(trailing-spaces)
[error] 124-124: trailing spaces
(trailing-spaces)
[error] 132-132: trailing spaces
(trailing-spaces)
[error] 135-135: trailing spaces
(trailing-spaces)
[error] 138-138: trailing spaces
(trailing-spaces)
[error] 141-141: trailing spaces
(trailing-spaces)
[error] 150-150: trailing spaces
(trailing-spaces)
[error] 154-154: trailing spaces
(trailing-spaces)
[error] 163-163: trailing spaces
(trailing-spaces)
[error] 166-166: trailing spaces
(trailing-spaces)
[error] 169-169: trailing spaces
(trailing-spaces)
[error] 177-177: trailing spaces
(trailing-spaces)
[error] 180-180: trailing spaces
(trailing-spaces)
[error] 183-183: trailing spaces
(trailing-spaces)
🪛 Ruff (0.12.2)
scripts/start_api_server.py
1-1: Shebang is present but file is not executable
(EXE001)
25-25: Possible binding to all interfaces
(S104)
88-88: Do not catch blind exception: Exception
(BLE001)
89-89: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
tests/test_unified_api_server.py
1-1: Shebang is present but file is not executable
(EXE001)
test_samo_emotion_detection_standalone.py
1-1: Shebang is present but file is not executable
(EXE001)
200-200: Do not catch blind exception: Exception
(BLE001)
src/models/emotion_detection/samo_bert_emotion_classifier.py
1-1: Shebang is present but file is not executable
(EXE001)
190-190: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
src/models/emotion_detection/emotion_labels.py
1-1: Shebang is present but file is not executable
(EXE001)
183-183: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
183-183: Avoid specifying long messages outside the exception class
(TRY003)
202-202: Avoid specifying long messages outside the exception class
(TRY003)
337-337: f-string without any placeholders
Remove extraneous f prefix
(F541)
src/models/unified_api_server.py
1-1: Shebang is present but file is not executable
(EXE001)
153-153: Do not catch blind exception: Exception
(BLE001)
154-154: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
162-162: Do not catch blind exception: Exception
(BLE001)
163-163: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
172-172: Do not catch blind exception: Exception
(BLE001)
173-173: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
209-209: Do not catch blind exception: Exception
(BLE001)
210-210: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
211-211: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
211-211: Use explicit conversion flag
Replace with conversion flag
(RUF010)
216-216: Do not perform function call File in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
230-230: Probable insecure usage of temporary file or directory: "/tmp/"
(S108)
257-257: Do not catch blind exception: Exception
(BLE001)
258-258: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
259-259: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
259-259: Use explicit conversion flag
Replace with conversion flag
(RUF010)
289-289: Do not catch blind exception: Exception
(BLE001)
290-290: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
291-291: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
291-291: Use explicit conversion flag
Replace with conversion flag
(RUF010)
296-296: Do not perform function call File in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
309-309: Abstract raise to an inner function
(TRY301)
311-311: Probable insecure usage of temporary file or directory: "/tmp/"
(S108)
397-397: Do not catch blind exception: Exception
(BLE001)
398-398: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
399-399: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
399-399: Use explicit conversion flag
Replace with conversion flag
(RUF010)
424-424: Possible binding to all interfaces
(S104)
🔇 Additional comments (3)
dependencies/requirements-api.txt (1)
16-16: No change needed: openai-whisper uses date-based YYYYMMDD versioning
openai-whisper employs a date-based version scheme;>=20231117is valid.Likely an incorrect or invalid review comment.
configs/samo_emotion_detection_config.yaml (1)
11-12: Emotion count consistency verified The GOEMOTIONS_EMOTIONS list contains 28 entries (27 GoEmotions + neutral), matchingnum_emotions: 28.tests/test_unified_api_server.py (1)
157-204: Good test coverage for transcription endpointThe test properly mocks the Whisper transcriber and validates all response fields comprehensively.
| # Configuration file for the unified API server integrating T5, Whisper, and BERT models | ||
|
|
||
| server: | ||
| host: "0.0.0.0" |
There was a problem hiding this comment.
Security concern: API server binds to all interfaces by default
The server is configured to bind to 0.0.0.0 by default, which exposes it to all network interfaces. This could be a security risk in production environments.
Consider using 127.0.0.1 or localhost as the default for development, and require explicit configuration for production deployments:
server:
- host: "0.0.0.0"
+ host: "127.0.0.1" # Use 0.0.0.0 for production with proper security measures
port: 8000📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| host: "0.0.0.0" | |
| server: | |
| host: "127.0.0.1" # Use 0.0.0.0 for production with proper security measures | |
| port: 8000 |
🤖 Prompt for AI Agents
In configs/samo_api_config.yaml around line 5 the host is set to "0.0.0.0",
which binds the API server to all network interfaces; change the default to
"127.0.0.1" or "localhost" for development and update config docs to require an
explicit host (e.g., via env var or separate production config) for production
deployments, ensuring any deployment scripts or examples demonstrate how to
override the host to a public address when intentionally exposing the service.
| cors_origins: | ||
| - "http://localhost:3000" | ||
| - "http://localhost:8080" | ||
| - "https://your-frontend-domain.com" |
There was a problem hiding this comment.
Security: CORS origins should be environment-specific
The hardcoded CORS origins include development URLs and a placeholder domain that should be configured per environment.
These values should be loaded from environment variables or environment-specific config files:
cors_origins:
- - "http://localhost:3000"
- - "http://localhost:8080"
- - "https://your-frontend-domain.com"
+ # Configure via environment variables or separate config files
+ # Example: ${CORS_ORIGINS} or load from samo_api_config.{env}.yaml
+ - "${FRONTEND_URL}"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cors_origins: | |
| - "http://localhost:3000" | |
| - "http://localhost:8080" | |
| - "https://your-frontend-domain.com" | |
| cors_origins: | |
| # Configure via environment variables or separate config files | |
| # Example: ${CORS_ORIGINS} or load from samo_api_config.{env}.yaml | |
| - "${FRONTEND_URL}" |
🤖 Prompt for AI Agents
In configs/samo_api_config.yaml around lines 42 to 45, the CORS origins are
hardcoded to development and placeholder domains; change this so origins are
loaded from environment-specific configuration or an environment variable (e.g.,
SAMO_CORS_ORIGINS) instead of literals: remove the fixed entries and read a
comma-separated env var (or include an env-specific YAML file) and parse it into
the list of origins, with a sensible default only for local development (e.g.,
localhost) and no wildcard in production.
| trusted_hosts: [] | ||
| api_keys_required: false # Set to true for production |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Inconsistent security configuration
The configuration has api_keys_required: false with a comment suggesting it should be true for production, but there's no mechanism to enforce or validate API keys in the implementation.
🏁 Script executed:
#!/bin/bash
# Check if API key validation is implemented in the server
rg -n "api_key" --type=py
rg -n "APIKey|api_keys_required" --type=pyLength of output: 4688
Bind API key enforcement flag to actual checks
configs/samo_api_config.yaml lines 72–73 define api_keys_required but it isn’t referenced anywhere; enforcement in deployment/cloud-run/secure_api_server.py (require_api_key/verify_api_key) runs unconditionally. Wire this flag into the decorator logic (or remove the unused setting).
🤖 Prompt for AI Agents
In configs/samo_api_config.yaml lines 72–73 and
deployment/cloud-run/secure_api_server.py, the api_keys_required flag is defined
but not used; modify secure_api_server.py to read that config value and
conditionally enforce API key checks: load the config (or inject the flag) at
startup, then make require_api_key a decorator factory that becomes a no-op when
api_keys_required is false (or have verify_api_key immediately allow requests
when the flag is false). Ensure imports/initialization happen once at module
startup and add unit/quick manual test to verify that toggling api_keys_required
enables/disables enforcement.
| fastapi==0.104.1 | ||
| uvicorn[standard]==0.24.0 | ||
| pydantic==2.5.0 |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Consider pinning FastAPI ecosystem versions more strictly
Using exact version pins for critical API framework dependencies ensures reproducible builds and avoids unexpected breaking changes.
-fastapi==0.104.1
-uvicorn[standard]==0.24.0
-pydantic==2.5.0
+fastapi==0.104.1
+uvicorn[standard]==0.24.0.post1 # Latest patch
+pydantic==2.5.3 # Security and bug fixesConsider using a dependency management tool like Poetry or pip-tools to manage version locks separately from version constraints.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fastapi==0.104.1 | |
| uvicorn[standard]==0.24.0 | |
| pydantic==2.5.0 | |
| fastapi==0.104.1 | |
| uvicorn[standard]==0.24.0.post1 # Latest patch | |
| pydantic==2.5.3 # Security and bug fixes |
🤖 Prompt for AI Agents
In dependencies/requirements-api.txt around lines 5 to 7, the FastAPI ecosystem
isn’t fully and reproducibly pinned; fix this by adding explicit, exact version
pins for FastAPI’s direct runtime deps (e.g., starlette and anyio) alongside the
existing fastapi and pydantic lines, and switch to a lockfile workflow (use
Poetry or pip-tools) so you maintain a separate human-editable constraints file
(requirements.in or pyproject.toml) and generate a fully pinned
requirements.txt/requirements.lock for installs; update CI to install from the
generated lockfile to ensure reproducible builds.
| # Apply temperature scaling | ||
| logits = logits / self.temperature | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion
Temperature must be strictly positive; guard against 0/negatives.
Division by a non-positive temperature will explode or error.
- logits = logits / self.temperature
+ # Ensure strictly positive temperature for stability
+ safe_temp = torch.clamp(self.temperature, min=1e-6)
+ logits = logits / safe_temp📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Apply temperature scaling | |
| logits = logits / self.temperature | |
| # Apply temperature scaling | |
| # Ensure strictly positive temperature for stability | |
| safe_temp = torch.clamp(self.temperature, min=1e-6) | |
| logits = logits / safe_temp |
🤖 Prompt for AI Agents
In src/models/emotion_detection/samo_bert_emotion_classifier.py around lines 182
to 184, the code divides logits by self.temperature without validating
temperature; add a guard to ensure temperature is strictly positive before any
division. Validate and enforce this in the class initialization (raise a
ValueError with a clear message if temperature <= 0) or, if you prefer runtime
safety here, check if self.temperature > 0 and raise a ValueError (or clamp to a
small positive epsilon) before performing logits = logits / self.temperature so
division-by-zero or negative scaling cannot occur.
| temp_path = f"/tmp/{file.filename}" | ||
| with open(temp_path, "wb") as buffer: | ||
| content = await file.read() | ||
| buffer.write(content) | ||
|
|
There was a problem hiding this comment.
Insecure temp file handling; use NamedTemporaryFile and stream in chunks.
Avoid predictable paths in /tmp and unbounded reads.
- temp_path = f"/tmp/{file.filename}"
- with open(temp_path, "wb") as buffer:
- content = await file.read()
- buffer.write(content)
+ import tempfile
+ suffix = Path(file.filename).suffix
+ with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
+ temp_path = tmp.name
+ while True:
+ chunk = await file.read(1024 * 1024) # 1MB
+ if not chunk:
+ break
+ tmp.write(chunk)Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.12.2)
230-230: Probable insecure usage of temporary file or directory: "/tmp/"
(S108)
🤖 Prompt for AI Agents
In src/models/unified_api_server.py around lines 230 to 234, the code writes
uploaded files to a predictable /tmp path and uses a single unbounded await
file.read(), which is insecure; replace this with a securely created temporary
file via tempfile.NamedTemporaryFile (use delete=False so you can return the
path) opened in binary mode, stream the uploaded file in fixed-size chunks
(e.g., 64KB) by repeatedly awaiting file.read(chunk_size) and writing each chunk
to the temp file, ensure the temp file permissions are tightened (e.g.,
os.fchmod to 0o600) before closing, and avoid constructing paths from
file.filename so the saved path is the NamedTemporaryFile.name.
| start_time = time.time() | ||
| try: | ||
| results = self.models["emotion_detector"].predict_emotions( | ||
| request.text, | ||
| threshold=request.threshold, | ||
| top_k=request.top_k | ||
| ) | ||
|
|
||
| processing_time = time.time() - start_time | ||
|
|
||
| return EmotionDetectionResponse( | ||
| emotions=results["emotions"][0] if results["emotions"] else [], | ||
| probabilities=results["probabilities"][0] if results["probabilities"] else [], | ||
| predictions=results["predictions"][0] if results["predictions"] else [], | ||
| processing_time=processing_time, | ||
| model_info={ | ||
| "model_name": "SAMO BERT Emotion Classifier", | ||
| "num_emotions": 28, | ||
| "device": str(self.models["emotion_detector"].device) | ||
| } | ||
| ) |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Return real emotion names and integer predictions; current output leaks emotion_i placeholders and floats.
Map indices to names via emotion_labels and cast predictions to ints to match schema.
- results = self.models["emotion_detector"].predict_emotions(
+ results = self.models["emotion_detector"].predict_emotions(
request.text,
threshold=request.threshold,
top_k=request.top_k
)
@@
- return EmotionDetectionResponse(
- emotions=results["emotions"][0] if results["emotions"] else [],
- probabilities=results["probabilities"][0] if results["probabilities"] else [],
- predictions=results["predictions"][0] if results["predictions"] else [],
+ # Map to human-readable labels and ensure ints
+ all_labels = get_all_emotions()
+ probs = results["probabilities"][0] if results["probabilities"] else []
+ raw_preds = results["predictions"][0] if results["predictions"] else []
+ preds = [int(p) for p in raw_preds]
+ selected = [i for i, v in enumerate(preds) if v == 1]
+ labels = [all_labels[i] for i in selected] if all_labels else results.get("emotions", [[]])[0],
+ return EmotionDetectionResponse(
+ emotions=labels if isinstance(labels, list) else [],
+ probabilities=probs,
+ predictions=preds,
processing_time=processing_time,
model_info={
"model_name": "SAMO BERT Emotion Classifier",
"num_emotions": 28,
"device": str(self.models["emotion_detector"].device)
}
)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/models/unified_api_server.py around lines 267 to 287, the response
currently returns placeholder emotion keys like "emotion_i" and float prediction
values; replace those with real emotion names and integer predictions by mapping
result indices through the model's emotion_labels (e.g.
self.models["emotion_detector"].emotion_labels) and casting prediction indices
to int before returning; keep probabilities as floats, and preserve the current
empty-list guards (i.e., if results["emotions"] else []) so you map only when
data exists and avoid altering processing_time or model_info.
| temp_path = f"/tmp/{file.filename}" | ||
| with open(temp_path, "wb") as buffer: | ||
| content = await file.read() | ||
| buffer.write(content) | ||
|
|
There was a problem hiding this comment.
Repeat: secure temp file handling for combined pipeline.
- temp_path = f"/tmp/{file.filename}"
- with open(temp_path, "wb") as buffer:
- content = await file.read()
- buffer.write(content)
+ import tempfile
+ suffix = Path(file.filename).suffix
+ with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
+ temp_path = tmp.name
+ while True:
+ chunk = await file.read(1024 * 1024)
+ if not chunk:
+ break
+ tmp.write(chunk)Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.12.2)
311-311: Probable insecure usage of temporary file or directory: "/tmp/"
(S108)
🤖 Prompt for AI Agents
In src/models/unified_api_server.py around lines 311-315, the code writes
uploaded files to a predictable /tmp/<filename> path which is insecure; replace
this with a secure temporary file mechanism (use tempfile.NamedTemporaryFile or
tempfile.TemporaryDirectory to create an unpredictable temp path), write the
uploaded bytes into that temp file (ensuring binary mode), set restrictive
permissions (e.g. 0o600) if needed, handle errors and ensure the temp file is
removed after use (finally/try-except or context manager) to avoid leftover
files and TOCTOU issues.
| server = SAMOUnifiedAPIServer() | ||
|
|
||
| if __name__ == "__main__": | ||
| # Start the server | ||
| print("🚀 Starting SAMO Unified API Server") | ||
| print("=" * 50) | ||
| print("📖 API Documentation: http://localhost:8000/docs") | ||
| print("🔄 ReDoc Documentation: http://localhost:8000/redoc") | ||
| print("💚 Health Check: http://localhost:8000/health") | ||
| print("\nPress Ctrl+C to stop the server") | ||
| print("=" * 50) | ||
|
|
||
| server.run() No newline at end of file |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Avoid heavy side-effects at import time.
Constructing a global server eagerly loads large models on import, hurting test startup and CLIs. Expose a factory and let the launcher create it.
-# Global server instance
-server = SAMOUnifiedAPIServer()
+# Global server instance (created lazily)
+server: Optional[SAMOUnifiedAPIServer] = None
+def get_server() -> SAMOUnifiedAPIServer:
+ global server
+ if server is None:
+ server = SAMOUnifiedAPIServer()
+ return server
@@
- server.run()
+ get_server().run()And update external entrypoints/tests to call get_server().
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| server = SAMOUnifiedAPIServer() | |
| if __name__ == "__main__": | |
| # Start the server | |
| print("🚀 Starting SAMO Unified API Server") | |
| print("=" * 50) | |
| print("📖 API Documentation: http://localhost:8000/docs") | |
| print("🔄 ReDoc Documentation: http://localhost:8000/redoc") | |
| print("💚 Health Check: http://localhost:8000/health") | |
| print("\nPress Ctrl+C to stop the server") | |
| print("=" * 50) | |
| server.run() | |
| # Global server instance (created lazily) | |
| server: Optional[SAMOUnifiedAPIServer] = None | |
| def get_server() -> SAMOUnifiedAPIServer: | |
| global server | |
| if server is None: | |
| server = SAMOUnifiedAPIServer() | |
| return server | |
| if __name__ == "__main__": | |
| # Start the server | |
| print("🚀 Starting SAMO Unified API Server") | |
| print("=" * 50) | |
| print("📖 API Documentation: http://localhost:8000/docs") | |
| print("🔄 ReDoc Documentation: http://localhost:8000/redoc") | |
| print("💚 Health Check: http://localhost:8000/health") | |
| print("\nPress Ctrl+C to stop the server") | |
| print("=" * 50) | |
| get_server().run() |
🤖 Prompt for AI Agents
In src/models/unified_api_server.py around lines 431 to 443, avoid creating a
global SAMOUnifiedAPIServer() at import time because it eagerly loads large
models; refactor by removing the top-level instantiation and instead add a
get_server() factory function that constructs and returns a new
SAMOUnifiedAPIServer instance on demand, keep the if __name__ == "__main__"
block to call get_server() and then run() on the returned instance, and update
any external entrypoints/tests to call get_server() where they previously relied
on the global server.
| # Temporarily set models to None | ||
| original_models = client.app.state.models.copy() | ||
|
|
||
| try: | ||
| # Mock unavailable models | ||
| client.app.state.models = { | ||
| "summarizer": None, | ||
| "transcriber": None, | ||
| "emotion_detector": None | ||
| } | ||
|
|
||
| # Test summarization | ||
| response = client.post("/summarize", json={"text": "Test text"}) | ||
| assert response.status_code == 503 | ||
| assert "not available" in response.json()["detail"] | ||
|
|
||
| # Test emotion detection | ||
| response = client.post("/detect-emotions", json={"text": "Test text"}) | ||
| assert response.status_code == 503 | ||
| assert "not available" in response.json()["detail"] | ||
|
|
||
| finally: | ||
| # Restore original models | ||
| client.app.state.models = original_models | ||
|
|
There was a problem hiding this comment.
Test incorrectly accesses app.state.models
The test tries to access client.app.state.models but the server implementation stores models in self.models, not in app.state.
The test should mock the server's models dictionary directly:
def test_model_unavailable_errors(self, client):
"""Test error handling when models are not available."""
- # Temporarily set models to None
- original_models = client.app.state.models.copy()
+ # Get the server instance from the test client
+ # Note: TestClient doesn't expose the server instance directly
+ # This test needs to be restructured to properly mock the models
- try:
- # Mock unavailable models
- client.app.state.models = {
- "summarizer": None,
- "transcriber": None,
- "emotion_detector": None
- }
+ with patch.object(client.app, 'models', {
+ "summarizer": None,
+ "transcriber": None,
+ "emotion_detector": None
+ }):
# Test summarization
response = client.post("/summarize", json={"text": "Test text"})
assert response.status_code == 503
assert "not available" in response.json()["detail"]
# Test emotion detection
response = client.post("/detect-emotions", json={"text": "Test text"})
assert response.status_code == 503
assert "not available" in response.json()["detail"]
-
- finally:
- # Restore original models
- client.app.state.models = original_models📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Temporarily set models to None | |
| original_models = client.app.state.models.copy() | |
| try: | |
| # Mock unavailable models | |
| client.app.state.models = { | |
| "summarizer": None, | |
| "transcriber": None, | |
| "emotion_detector": None | |
| } | |
| # Test summarization | |
| response = client.post("/summarize", json={"text": "Test text"}) | |
| assert response.status_code == 503 | |
| assert "not available" in response.json()["detail"] | |
| # Test emotion detection | |
| response = client.post("/detect-emotions", json={"text": "Test text"}) | |
| assert response.status_code == 503 | |
| assert "not available" in response.json()["detail"] | |
| finally: | |
| # Restore original models | |
| client.app.state.models = original_models | |
| def test_model_unavailable_errors(self, client): | |
| """Test error handling when models are not available.""" | |
| # Get the server instance from the test client | |
| # Note: TestClient doesn't expose the server instance directly | |
| # This test needs to be restructured to properly mock the models | |
| with patch.object(client.app, 'models', { | |
| "summarizer": None, | |
| "transcriber": None, | |
| "emotion_detector": None | |
| }): | |
| # Test summarization | |
| response = client.post("/summarize", json={"text": "Test text"}) | |
| assert response.status_code == 503 | |
| assert "not available" in response.json()["detail"] | |
| # Test emotion detection | |
| response = client.post("/detect-emotions", json={"text": "Test text"}) | |
| assert response.status_code == 503 | |
| assert "not available" in response.json()["detail"] |
🤖 Prompt for AI Agents
In tests/test_unified_api_server.py around lines 274 to 298, the test
incorrectly manipulates client.app.state.models while the server stores models
on the server instance as self.models; update the test to copy, mock, and
restore the server's models dictionary directly by using client.app.models (or
the actual server instance attribute) instead of client.app.state.models so the
temporary None mocks affect the real code paths that check self.models.
Summary by CodeRabbit
New Features
Configuration
Chores
Tests