Skip to content

flexible api: explicit timeouts + no PII in errors#57

Open
uelkerd wants to merge 1 commit into
mainfrom
feature/flexible-api-timeouts
Open

flexible api: explicit timeouts + no PII in errors#57
uelkerd wants to merge 1 commit into
mainfrom
feature/flexible-api-timeouts

Conversation

@uelkerd
Copy link
Copy Markdown
Owner

@uelkerd uelkerd commented Aug 10, 2025

Summary by Sourcery

Add a Flask-based flexible emotion detection API server supporting serverless, endpoint, and local deployments with explicit timeouts and PII-safe error handling

New Features:

  • Implement a Flask server offering single (/predict) and batch (/predict_batch) emotion prediction endpoints along with health and documentation routes
  • Support serverless HuggingFace Inference API, paid inference endpoints, and local model deployments through a unified detector class

Enhancements:

  • Introduce environment-configurable timeouts and retry strategy for remote API calls
  • Redact input text in error logs to avoid exposing PII
  • Add secure default host binding and runtime security warnings for all-interfaces requests

Summary by CodeRabbit

  • New Features
    • Introduced a flexible emotion detection API server with multiple deployment options, including serverless, paid endpoints, and local models.
    • Added endpoints for health checks, single and batch emotion prediction, and API documentation.
    • Enhanced error handling and security warnings for safer deployment.
    • Provided structured JSON responses with emotion predictions and metadata.
    • Included example usage instructions and curl commands for easier testing.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Aug 10, 2025

Walkthrough

A new Flask-based API server for emotion detection has been implemented. It supports three HuggingFace deployment strategies—serverless inference API, paid inference endpoints, and local self-hosted models—configurable via environment variables. The server exposes endpoints for health checks, single and batch predictions, and API documentation, with robust error handling, logging, and security guidance.

Changes

Cohort / File(s) Change Summary
Flexible Emotion Detection API Server
deployment/flexible_api_server.py
Introduced a configurable Flask API server for emotion detection supporting HuggingFace serverless, endpoints, and local models. Added the FlexibleEmotionDetector class, multiple prediction methods, error handling, logging, and endpoints for health, single/batch prediction, and documentation.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant FlaskAPI
    participant FlexibleEmotionDetector
    participant HuggingFaceAPI
    participant LocalModel

    Client->>FlaskAPI: POST /predict (text)
    FlaskAPI->>FlexibleEmotionDetector: predict(text)
    alt Serverless/Endpoint
        FlexibleEmotionDetector->>HuggingFaceAPI: Request prediction
        HuggingFaceAPI-->>FlexibleEmotionDetector: Return prediction
    else Local
        FlexibleEmotionDetector->>LocalModel: Predict emotion
        LocalModel-->>FlexibleEmotionDetector: Return prediction
    end
    FlexibleEmotionDetector-->>FlaskAPI: Prediction result
    FlaskAPI-->>Client: JSON response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

A bunny hopped to code today,
And built an API the flexible way!
Emotions detected, with models three—
Cloud, endpoint, or run locally.
Secure and robust, with logs that delight,
This fluffy release is working just right!
🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/flexible-api-timeouts

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Aug 10, 2025

Reviewer's Guide

This PR introduces a new Flask-based emotion detection API server that supports serverless, endpoint, and local deployment strategies, with explicit timeout configuration, robust retry handling, and PII-safe error logging implemented within a single Python module.

Sequence diagram for /predict endpoint request flow

sequenceDiagram
    actor User
    participant FlaskApp as Flask App
    participant Detector as FlexibleEmotionDetector
    User->>FlaskApp: POST /predict {"text": "..."}
    FlaskApp->>Detector: predict(text)
    Detector->>Detector: Select deployment strategy
    alt Serverless
        Detector->>Detector: _predict_serverless(text)
    else Endpoint
        Detector->>Detector: _predict_endpoint(text)
    else Local
        Detector->>Detector: _predict_local(text)
    end
    Detector-->>FlaskApp: prediction result
    FlaskApp-->>User: JSON response
Loading

Sequence diagram for /predict_batch endpoint request flow

sequenceDiagram
    actor User
    participant FlaskApp as Flask App
    participant Detector as FlexibleEmotionDetector
    User->>FlaskApp: POST /predict_batch {"texts": ["...", ...]}
    FlaskApp->>Detector: predict(text) for each text
    Detector->>Detector: Select deployment strategy per text
    Detector-->>FlaskApp: prediction results (list)
    FlaskApp-->>User: JSON response with predictions
Loading

Class diagram for FlexibleEmotionDetector and DeploymentType

classDiagram
    class FlexibleEmotionDetector {
        - deployment_type: DeploymentType
        - model_name: str
        - hf_token: str
        - emotion_labels: List[str]
        - model: Any
        - tokenizer: Any
        - session: Any
        + __init__()
        + predict(text: str) Dict[str, Any]
        + get_status() Dict[str, Any]
        - _initialize()
        - _initialize_serverless()
        - _initialize_endpoint()
        - _initialize_local()
        - _predict_serverless(text: str) Dict[str, Any]
        - _predict_endpoint(text: str) Dict[str, Any]
        - _predict_local(text: str) Dict[str, Any]
        - _get_model_device_str() Optional[str]
    }
    class DeploymentType {
        SERVERLESS
        ENDPOINT
        LOCAL
    }
    FlexibleEmotionDetector --> DeploymentType
Loading

File-Level Changes

Change Details Files
Configurable request timeouts with user guidance
  • Added TIMEOUT_SECONDS environment variable for serverless and endpoint calls
  • Applied timeout parameter on session.post requests
  • Special handling of Timeout exceptions to return friendly error and suggestion
deployment/flexible_api_server.py
Redacted input text in logs to avoid PII leaks
  • Logged only a 20-character preview of user text on errors
  • Replaced raw text in exception handlers with redacted preview
deployment/flexible_api_server.py
Built in HTTP retry strategy for serverless API calls
  • Created requests.Session with HTTPAdapter and Retry
  • Configured retry on status codes [429,500,502,503,504] and specific HTTP methods
deployment/flexible_api_server.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @uelkerd, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've added a new Flask-based API server that provides a flexible solution for emotion detection. This server can seamlessly integrate with various HuggingFace deployment options, from cost-effective serverless solutions to high-performance dedicated endpoints, or even local self-hosted models. A key focus of this change was to enhance the API's robustness by implementing explicit timeouts for external requests and improving privacy by ensuring that sensitive user data is not inadvertently logged during errors.

Highlights

  • Flexible Emotion Detection API Server: I've introduced a new Flask API server designed for emotion detection. This server is highly flexible, supporting three distinct HuggingFace deployment strategies: the free Serverless Inference API, paid Inference Endpoints, and self-hosted local models. This allows for adaptable deployment based on performance and cost requirements.
  • Explicit Timeouts and PII Redaction: To improve reliability and prevent hanging requests, I've implemented explicit timeouts for all external API calls made to HuggingFace. Additionally, I've focused on privacy by ensuring that Personal Identifiable Information (PII) is not exposed in error logs, redacting sensitive input text when logging prediction failures.
  • Robust Error Handling and Cold Start Management: The new server includes comprehensive error handling, particularly for the HuggingFace Serverless API, which can experience 'cold starts'. The system now intelligently retries requests if a 503 (Service Unavailable) status is received, allowing the model time to load. This makes the API more resilient to transient issues.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@deepsource-io
Copy link
Copy Markdown
Contributor

deepsource-io Bot commented Aug 10, 2025

Here's the code health analysis summary for commits 70cb2f6..ae92171. View details on DeepSource ↗.

Analysis Summary

AnalyzerStatusSummaryLink
DeepSource Test coverage LogoTest coverage⚠️ Artifact not reportedTimed out: Artifact was never reportedView Check ↗
DeepSource Python LogoPython❌ Failure
❗ 51 occurences introduced
View Check ↗
DeepSource Terraform LogoTerraform✅ SuccessView Check ↗
DeepSource Secrets LogoSecrets✅ SuccessView Check ↗
DeepSource Shell LogoShell✅ SuccessView Check ↗
DeepSource Docker LogoDocker✅ SuccessView Check ↗

💡 If you’re a repository administrator, you can configure the quality gates from the settings.

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @uelkerd - I've reviewed your changes - here's some feedback:

Blocking issues:

  • time.sleep() call; did you mean to leave this in? (link)

General comments:

  • There’s a lot of duplicated logic in _predict_serverless and _predict_endpoint for parsing and formatting results—consider extracting a shared helper to reduce repetition.
  • This file mixes CLI startup, Flask routes, and deployment logic into one large script—splitting into modules (e.g. routes, model client, config) would improve readability and maintainability.
  • Configuration values like timeouts and retry settings are scattered across methods—centralizing them into a single config object or class would make adjustments and testing much easier.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- There’s a lot of duplicated logic in _predict_serverless and _predict_endpoint for parsing and formatting results—consider extracting a shared helper to reduce repetition.
- This file mixes CLI startup, Flask routes, and deployment logic into one large script—splitting into modules (e.g. routes, model client, config) would improve readability and maintainability.
- Configuration values like timeouts and retry settings are scattered across methods—centralizing them into a single config object or class would make adjustments and testing much easier.

## Individual Comments

### Comment 1
<location> `deployment/flexible_api_server.py:39` </location>
<code_context>
+    def __init__(self):
+        """Initialize based on environment configuration."""
+        self.deployment_type = DeploymentType(os.getenv('DEPLOYMENT_TYPE', 'serverless'))
+        self.model_name = os.getenv('MODEL_NAME', 'your-username/samo-dl-emotion-model')
+        self.hf_token = os.getenv('HF_TOKEN')
+
</code_context>

<issue_to_address>
Default model name is a placeholder and may cause runtime errors.

If MODEL_NAME is not set, loading the model may fail. Add a check to raise an explicit error when MODEL_NAME is missing or still set to the placeholder.
</issue_to_address>

### Comment 2
<location> `deployment/flexible_api_server.py:81` </location>
<code_context>
+        from requests.adapters import HTTPAdapter
+        from requests.packages.urllib3.util.retry import Retry
+
+        retry_strategy = Retry(
+            total=3,
+            backoff_factor=1,
+            status_forcelist=[429, 500, 502, 503, 504],
+            allowed_methods={"POST", "GET", "PUT", "PATCH"}  # Enable retries for these HTTP methods
+        )
+        adapter = HTTPAdapter(max_retries=retry_strategy)
</code_context>

<issue_to_address>
allowed_methods should be a list or tuple for compatibility.

Using a set for allowed_methods may lead to compatibility issues; please use a list or tuple instead.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        retry_strategy = Retry(
            total=3,
            backoff_factor=1,
            status_forcelist=[429, 500, 502, 503, 504],
            allowed_methods={"POST", "GET", "PUT", "PATCH"}  # Enable retries for these HTTP methods
        )
=======
        retry_strategy = Retry(
            total=3,
            backoff_factor=1,
            status_forcelist=[429, 500, 502, 503, 504],
            allowed_methods=["POST", "GET", "PUT", "PATCH"]  # Enable retries for these HTTP methods
        )
>>>>>>> REPLACE

</suggested_fix>

### Comment 3
<location> `deployment/flexible_api_server.py:294` </location>
<code_context>
+            predicted_class = predicted_class.cpu()
+
+            # Get emotion label
+            if hasattr(self.model.config, 'id2label'):
+                emotion = self.model.config.id2label[predicted_class.item()]
+            else:
+                emotion = self.emotion_labels[predicted_class.item()]
</code_context>

<issue_to_address>
Potential KeyError if id2label does not contain predicted_class index.

Use .get() with a default value or check if the index exists before accessing id2label to avoid a KeyError.
</issue_to_address>

### Comment 4
<location> `deployment/flexible_api_server.py:304` </location>
<code_context>
+
+            # Get all emotion probabilities
+            all_emotions = {}
+            for i, prob in enumerate(probabilities[0]):
+                if hasattr(self.model.config, 'id2label'):
+                    label = self.model.config.id2label[i]
+                else:
+                    label = self.emotion_labels[i] if i < len(self.emotion_labels) else f"emotion_{i}"
+                all_emotions[label] = prob.item()
+
</code_context>

<issue_to_address>
Possible index error if emotion_labels is shorter than model output.

Consider adding a warning log when the fallback label is used to aid in identifying configuration mismatches.

Suggested implementation:

```python
            # Get all emotion probabilities
            all_emotions = {}
            for i, prob in enumerate(probabilities[0]):
                if hasattr(self.model.config, 'id2label'):
                    label = self.model.config.id2label[i]
                else:
                    if i < len(self.emotion_labels):
                        label = self.emotion_labels[i]
                    else:
                        label = f"emotion_{i}"
                        logging.warning(
                            f"Emotion label index {i} exceeds emotion_labels length ({len(self.emotion_labels)}). "
                            f"Using fallback label '{label}'. This may indicate a configuration mismatch."
                        )
                all_emotions[label] = prob.item()

```

```python
import os
import time
import logging
from typing import Dict, List, Optional, Any
from enum import Enum

```
</issue_to_address>

### Comment 5
<location> `deployment/flexible_api_server.py:389` </location>
<code_context>
+    try:
+        data = request.get_json()
+
+        if not data or 'text' not in data:
+            return jsonify({'error': 'No text provided'}), 400
+
</code_context>

<issue_to_address>
Error response for missing 'text' is clear, but could include expected format.

Including the expected request format in the error response will make it easier for clients to fix their requests.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
        if not data or 'text' not in data:
            return jsonify({'error': 'No text provided'}), 400
=======
        if not data or 'text' not in data:
            return jsonify({
                'error': 'No text provided',
                'expected_format': {'text': '<your text here>'}
            }), 400
>>>>>>> REPLACE

</suggested_fix>

### Comment 6
<location> `deployment/flexible_api_server.py:564` </location>
<code_context>
+        print(f"   • Use {all_interfaces_env} only in production with firewall/proxy")
+        print("   • Never expose debug=True to external networks")
+
+    app.run(host=host, port=port, debug=debug)
</code_context>

<issue_to_address>
Flask debug mode should not be enabled in production.

Consider adding a check to prevent debug=True when the app is bound to all interfaces, to avoid accidental exposure.
</issue_to_address>

## Security Issues

### Issue 1
<location> `deployment/flexible_api_server.py:167` </location>

<issue_to_address>
**security (python.lang.best-practice.arbitrary-sleep):** time.sleep() call; did you mean to leave this in?

*Source: opengrep*
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

def __init__(self):
"""Initialize based on environment configuration."""
self.deployment_type = DeploymentType(os.getenv('DEPLOYMENT_TYPE', 'serverless'))
self.model_name = os.getenv('MODEL_NAME', 'your-username/samo-dl-emotion-model')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Default model name is a placeholder and may cause runtime errors.

If MODEL_NAME is not set, loading the model may fail. Add a check to raise an explicit error when MODEL_NAME is missing or still set to the placeholder.

Comment on lines +81 to +86
retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[429, 500, 502, 503, 504],
allowed_methods={"POST", "GET", "PUT", "PATCH"} # Enable retries for these HTTP methods
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: allowed_methods should be a list or tuple for compatibility.

Using a set for allowed_methods may lead to compatibility issues; please use a list or tuple instead.

Suggested change
retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[429, 500, 502, 503, 504],
allowed_methods={"POST", "GET", "PUT", "PATCH"} # Enable retries for these HTTP methods
)
retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[429, 500, 502, 503, 504],
allowed_methods=["POST", "GET", "PUT", "PATCH"] # Enable retries for these HTTP methods
)

Comment on lines +294 to +295
if hasattr(self.model.config, 'id2label'):
emotion = self.model.config.id2label[predicted_class.item()]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue: Potential KeyError if id2label does not contain predicted_class index.

Use .get() with a default value or check if the index exists before accessing id2label to avoid a KeyError.

Comment on lines +304 to +308
for i, prob in enumerate(probabilities[0]):
if hasattr(self.model.config, 'id2label'):
label = self.model.config.id2label[i]
else:
label = self.emotion_labels[i] if i < len(self.emotion_labels) else f"emotion_{i}"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Possible index error if emotion_labels is shorter than model output.

Consider adding a warning log when the fallback label is used to aid in identifying configuration mismatches.

Suggested implementation:

            # Get all emotion probabilities
            all_emotions = {}
            for i, prob in enumerate(probabilities[0]):
                if hasattr(self.model.config, 'id2label'):
                    label = self.model.config.id2label[i]
                else:
                    if i < len(self.emotion_labels):
                        label = self.emotion_labels[i]
                    else:
                        label = f"emotion_{i}"
                        logging.warning(
                            f"Emotion label index {i} exceeds emotion_labels length ({len(self.emotion_labels)}). "
                            f"Using fallback label '{label}'. This may indicate a configuration mismatch."
                        )
                all_emotions[label] = prob.item()
import os
import time
import logging
from typing import Dict, List, Optional, Any
from enum import Enum

Comment on lines +389 to +390
if not data or 'text' not in data:
return jsonify({'error': 'No text provided'}), 400
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Error response for missing 'text' is clear, but could include expected format.

Including the expected request format in the error response will make it easier for clients to fix their requests.

Suggested change
if not data or 'text' not in data:
return jsonify({'error': 'No text provided'}), 400
if not data or 'text' not in data:
return jsonify({
'error': 'No text provided',
'expected_format': {'text': '<your text here>'}
}), 400

print(f" • Use {all_interfaces_env} only in production with firewall/proxy")
print(" • Never expose debug=True to external networks")

app.run(host=host, port=port, debug=debug)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 suggestion (security): Flask debug mode should not be enabled in production.

Consider adding a check to prevent debug=True when the app is bound to all interfaces, to avoid accidental exposure.

if response.status_code == 503:
# Model is loading (cold start)
logger.info("🔄 Model loading, waiting...")
time.sleep(10) # Wait for model to load
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security (python.lang.best-practice.arbitrary-sleep): time.sleep() call; did you mean to leave this in?

Source: opengrep

"""Predict using local model."""
try:
# Tokenize input
inputs = self.tokenizer(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): We've found these issues:

Comment on lines +400 to +403
if 'error' in result:
return jsonify(result), 500
return jsonify(result)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): We've found these issues:

Suggested change
if 'error' in result:
return jsonify(result), 500
return jsonify(result)
return (jsonify(result), 500) if 'error' in result else jsonify(result)

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new flexible API server for emotion detection, supporting multiple deployment strategies. The implementation is well-structured, but there are several areas for improvement. Key issues include a critical security vulnerability where raw exception details are exposed to clients, and a high-severity performance issue in the batch prediction endpoint which processes requests sequentially instead of in a true batch. Additionally, there are opportunities to improve maintainability by reducing code duplication, and to increase robustness by adding retry logic and validating configuration more strictly. My review provides specific suggestions to address these points.

Comment on lines +404 to +405
except Exception as e:
return jsonify({'error': str(e)}), 500
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Exposing raw exception messages to the client is a security risk. It can leak sensitive information about the application's internal workings, such as library versions, file paths, or underlying error details. It's better to log the detailed exception for debugging and return a generic, user-friendly error message.

    except Exception as e:
        logger.error(f"An unexpected error occurred in /predict: {e}", exc_info=True)
        return jsonify({'error': 'An internal server error occurred.'}), 500

Comment on lines +425 to +429
results = []
for text in texts:
if text and text.strip():
result = detector.predict(text.strip())
results.append(result)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The batch prediction endpoint processes texts sequentially in a loop, calling detector.predict() for each text. This is highly inefficient and defeats the purpose of a batch endpoint, as it results in N separate predictions (and N API calls for serverless/endpoint modes) instead of a single batch operation. Both HuggingFace APIs and local transformer models are optimized for batch processing. You should add a predict_batch method to the FlexibleEmotionDetector class that is optimized for batching.

        texts_to_predict = [text.strip() for text in texts if text and text.strip()]
        if not texts_to_predict:
            return jsonify({'predictions': [], 'count': 0, 'deployment_type': detector.deployment_type.value})

        # Assumes a new `predict_batch` method is added to the detector for true batching
        results = detector.predict_batch(texts_to_predict)

def __init__(self):
"""Initialize based on environment configuration."""
self.deployment_type = DeploymentType(os.getenv('DEPLOYMENT_TYPE', 'serverless'))
self.model_name = os.getenv('MODEL_NAME', 'your-username/samo-dl-emotion-model')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The default MODEL_NAME is a placeholder value 'your-username/samo-dl-emotion-model'. If the MODEL_NAME environment variable is not set, the application will fail with a potentially confusing error from the HuggingFace library. It would be more robust to check for this placeholder and raise a clear ValueError if it's not been changed, improving the user experience on first run.

Suggested change
self.model_name = os.getenv('MODEL_NAME', 'your-username/samo-dl-emotion-model')
self.model_name = os.getenv('MODEL_NAME', 'your-username/samo-dl-emotion-model')
if 'your-username' in self.model_name:
raise ValueError("MODEL_NAME environment variable is not set or is using the default placeholder. Please set it to your HuggingFace model.")

Comment on lines +78 to +79
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Imports should be at the top of the file as per PEP 8 guidelines. Placing them inside a function or method can lead to circular import issues and makes it harder to see the file's dependencies at a glance.

self.headers = {"Authorization": f"Bearer {self.hf_token}"}

# Create session
self.session = requests.Session()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The requests.Session for the endpoint deployment type is initialized without a retry strategy. Inference Endpoints can also experience transient network issues or temporary unavailability. Adding a retry mechanism, similar to the one used for the serverless deployment, would make the client more resilient to these issues.

        self.session = requests.Session()
        from requests.adapters import HTTPAdapter
        from requests.packages.urllib3.util.retry import Retry

        retry_strategy = Retry(
            total=3,
            backoff_factor=1,
            status_forcelist=[429, 500, 502, 503, 504],
            allowed_methods={"POST", "GET"}
        )
        adapter = HTTPAdapter(max_retries=retry_strategy)
        self.session.mount("http://", adapter)
        self.session.mount("https://", adapter)

Comment on lines +164 to +173
if response.status_code == 503:
# Model is loading (cold start)
logger.info("🔄 Model loading, waiting...")
time.sleep(10) # Wait for model to load
response = self.session.post(
self.api_url,
headers=self.headers,
json=payload,
timeout=timeout
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The requests.post call is duplicated to handle the 503 (model loading) status code. This makes the code harder to maintain. You can refactor this into a loop to avoid repetition.

            for attempt in range(2):  # Try up to 2 times
                response = self.session.post(
                    self.api_url,
                    headers=self.headers,
                    json=payload,
                    timeout=timeout
                )
                if response.status_code != 503:
                    break  # Success or other error

                if attempt == 0:  # If it's the first attempt and model is loading
                    logger.info("🔄 Model loading, waiting...")
                    time.sleep(10)  # Wait for model to load

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (12)
deployment/flexible_api_server.py (12)

2-9: Docstring style fixes (D205, D212, D415) and trailing whitespace (W291).

Tighten the module docstring to meet Ruff pydocstyle rules and remove trailing whitespace.

-"""
-🚀 FLEXIBLE EMOTION DETECTION API SERVER
-========================================
-Supports multiple HuggingFace deployment strategies:
-- Serverless Inference API (free)
-- Inference Endpoints (paid)  
-- Self-hosted (local)
-"""
+"""Flexible emotion detection API server.
+
+Supports multiple HuggingFace deployment strategies:
+- Serverless Inference API (free)
+- Inference Endpoints (paid)
+- Self-hosted (local)
+"""

14-14: Use built-in generics and remove unused imports (UP006, F401).

  • Drop unused List.
  • Prefer builtin generics over typing.Dict for Python 3.9+.
-from typing import Dict, List, Optional, Any
+from typing import Optional, Any

And update signatures to builtins:

-def predict(self, text: str) -> Dict[str, Any]:
+def predict(self, text: str) -> dict[str, Any]:
@@
-def _predict_serverless(self, text: str) -> Dict[str, Any]:
+def _predict_serverless(self, text: str) -> dict[str, Any]:
@@
-def _predict_endpoint(self, text: str) -> Dict[str, Any]:
+def _predict_endpoint(self, text: str) -> dict[str, Any]:
@@
-def _predict_local(self, text: str) -> Dict[str, Any]:
+def _predict_local(self, text: str) -> dict[str, Any]:
@@
-def get_status(self) -> Dict[str, Any]:
+def get_status(self) -> dict[str, Any]:

Also applies to: 131-131, 150-150, 214-214, 261-261, 343-343


28-32: Add docstring to public Enum (D101).

Minor: document DeploymentType values for clarity and lint compliance.

 class DeploymentType(Enum):
+    """Supported deployment strategies for emotion detector."""
     SERVERLESS = "serverless"
     ENDPOINT = "endpoint"
     LOCAL = "local"

55-67: Add return annotation to initializer and handle unknown types explicitly (ANN202).

Small cleanups; also good to raise if an unsupported type appears (defensive programming).

-    def _initialize(self):
+    def _initialize(self) -> None:
@@
-        elif self.deployment_type == DeploymentType.LOCAL:
+        elif self.deployment_type == DeploymentType.LOCAL:
             self._initialize_local()
+        else:
+            raise ValueError(f"Unsupported deployment type: {self.deployment_type!r}")

93-108: Mirror retry config for endpoints (consistency) and add missing return annotation (ANN202).

Endpoint init currently lacks retry configuration. Consider reusing the same Session adapter strategy for consistent behavior, and annotate return type.

-    def _initialize_endpoint(self):
+    def _initialize_endpoint(self) -> None:
@@
-        # Create session
-        self.session = requests.Session()
+        # Create session with retry strategy similar to serverless
+        from requests.adapters import HTTPAdapter
+        from urllib3.util.retry import Retry
+        retry_strategy = Retry(
+            total=3,
+            backoff_factor=1,
+            status_forcelist=[429, 500, 502, 503, 504],
+            allowed_methods={"POST", "GET", "PUT", "PATCH"},
+            respect_retry_after_header=True,
+        )
+        adapter = HTTPAdapter(max_retries=retry_strategy)
+        self.session = requests.Session()
+        self.session.mount("http://", adapter)
+        self.session.mount("https://", adapter)

109-131: Add return annotation to local initializer (ANN202).

Nit: annotate and trim whitespace (W291 noted elsewhere).

-    def _initialize_local(self):
+    def _initialize_local(self) -> None:

293-301: Fix confidence indexing (minor correctness).

Indexing with a tensor returns a 1D tensor; use the scalar index for clarity.

-            # Get confidence
-            confidence = probabilities[0][predicted_class].item()
+            # Get confidence
+            predicted_idx = predicted_class.item()
+            confidence = probabilities[0, predicted_idx].item()

343-355: Compute a more accurate readiness flag.

Currently always True; reflect real readiness across modes.

     def get_status(self) -> dict[str, Any]:
         """Get detector status information."""
-        return {
+        ready = False
+        if self.deployment_type == DeploymentType.LOCAL:
+            ready = bool(self.model and self.tokenizer)
+        else:
+            ready = bool(self.session)
+        return {
             "deployment_type": self.deployment_type.value,
             "model_name": self.model_name,
             "emotion_labels": self.emotion_labels,
-            "ready": True,
+            "ready": ready,
             "config": {
                 "serverless_api": self.api_url if hasattr(self, 'api_url') else None,
                 "endpoint_url": self.endpoint_url if hasattr(self, 'endpoint_url') else None,
                 "local_device": self._get_model_device_str() if self.model else None,
             }
         }

366-366: Add return type annotations for Flask route functions (ANN201).

Use Flask’s ResponseReturnValue for accurate typing.

-from flask import Flask, request, jsonify
+from flask import Flask, request, jsonify
+from flask.typing import ResponseReturnValue
@@
-def health_check():
+def health_check() -> ResponseReturnValue:
@@
-def predict_emotion():
+def predict_emotion() -> ResponseReturnValue:
@@
-def predict_batch():
+def predict_batch() -> ResponseReturnValue:
@@
-def home():
+def home() -> ResponseReturnValue:

Also applies to: 379-379, 408-408, 441-441


193-196: Optional: avoid echoing full input text in successful responses.

While the PR focuses on errors, echoing request text can still propagate PII to downstream logs/clients. Consider omitting or replacing with metadata (length/hash).

-                    "text": text,
+                    # "text": text,  # Consider omitting to avoid PII propagation

Also applies to: 239-243, 315-319


482-564: Replace prints with logger or silence lints (T201); handle S104 intentionally.

If Ruff T201 is enforced, convert prints to logger.info for consistency. Otherwise, add noqa to specific lines. Also explicitly mark the all-interfaces constant line as intentional to satisfy S104.

-if __name__ == '__main__':
-    print("🌐 Starting Flexible Emotion Detection API...")
-    print("=" * 60)
+if __name__ == '__main__':
+    logger.info("🌐 Starting Flexible Emotion Detection API...")
+    logger.info("=" * 60)
@@
-        print(f"📋 Deployment Type: {status['deployment_type'].upper()}")
-        print(f"🤖 Model: {status['model_name']}")
-        print(f"🎭 Emotions: {len(status['emotion_labels'])} classes")
+        logger.info("📋 Deployment Type: %s", status['deployment_type'].upper())
+        logger.info("🤖 Model: %s", status['model_name'])
+        logger.info("🎭 Emotions: %d classes", len(status['emotion_labels']))
@@
-    ALL_INTERFACES = '0.0.0.0'
+    ALL_INTERFACES = '0.0.0.0'  # noqa: S104
@@
-    print(f"\n🚀 Server starting on {server_url}")
+    logger.info("🚀 Server starting on %s", server_url)

If you prefer prints for CLI UX, consider adding # noqa: T201 on those lines instead.


62-62: Trim trailing whitespace flagged by Ruff (W291).

Minor; remove trailing spaces on the noted lines.

Also applies to: 465-465, 547-547, 158-159, 169-170, 266-269

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 70cb2f6 and ae92171.

📒 Files selected for processing (1)
  • deployment/flexible_api_server.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
deployment/flexible_api_server.py

2-9: 1 blank line required between summary line and description

(D205)


2-9: Multi-line docstring summary should start at the first line

Remove whitespace after opening quotes

(D212)


2-9: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)


7-7: Trailing whitespace

Remove trailing whitespace

(W291)


14-14: typing.List imported but unused

Remove unused import: typing.List

(F401)


28-28: Missing docstring in public class

(D101)


36-36: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


55-55: Missing return type annotation for private function _initialize

Add return type annotation: None

(ANN202)


62-62: Trailing whitespace

Remove trailing whitespace

(W291)


68-68: Missing return type annotation for private function _initialize_serverless

Add return type annotation: None

(ANN202)


93-93: Missing return type annotation for private function _initialize_endpoint

Add return type annotation: None

(ANN202)


109-109: Missing return type annotation for private function _initialize_local

Add return type annotation: None

(ANN202)


131-131: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


150-150: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


158-158: Trailing whitespace

Remove trailing whitespace

(W291)


159-159: Trailing whitespace

Remove trailing whitespace

(W291)


169-169: Trailing whitespace

Remove trailing whitespace

(W291)


170-170: Trailing whitespace

Remove trailing whitespace

(W291)


214-214: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


261-261: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


266-266: Trailing whitespace

Remove trailing whitespace

(W291)


267-267: Trailing whitespace

Remove trailing whitespace

(W291)


268-268: Trailing whitespace

Remove trailing whitespace

(W291)


269-269: Trailing whitespace

Remove trailing whitespace

(W291)


343-343: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


366-366: Missing return type annotation for public function health_check

(ANN201)


379-379: Missing return type annotation for public function predict_emotion

(ANN201)


408-408: Missing return type annotation for public function predict_batch

(ANN201)


441-441: Missing return type annotation for public function home

(ANN201)


465-465: Trailing whitespace

Remove trailing whitespace

(W291)


483-483: print found

Remove print

(T201)


484-484: print found

Remove print

(T201)


488-488: print found

Remove print

(T201)


489-489: print found

Remove print

(T201)


490-490: print found

Remove print

(T201)


493-493: print found

Remove print

(T201)


494-494: print found

Remove print

(T201)


496-496: print found

Remove print

(T201)


497-497: print found

Remove print

(T201)


499-499: print found

Remove print

(T201)


500-500: print found

Remove print

(T201)


502-502: print found

Remove print

(T201)


503-503: print found

Remove print

(T201)


504-504: print found

Remove print

(T201)


505-505: print found

Remove print

(T201)


506-506: print found

Remove print

(T201)


508-508: print found

Remove print

(T201)


513-513: Possible binding to all interfaces

(S104)


528-528: print found

Remove print

(T201)


529-529: print found

Remove print

(T201)


530-530: print found

Remove print

(T201)


531-531: print found

Remove print

(T201)


536-536: print found

Remove print

(T201)


538-538: print found

Remove print

(T201)


539-539: print found

Remove print

(T201)


540-540: print found

Remove print

(T201)


541-541: print found

Remove print

(T201)


547-547: Trailing whitespace

Remove trailing whitespace

(W291)


551-551: print found

Remove print

(T201)


552-552: print found

Remove print

(T201)


553-553: print found

Remove print

(T201)


554-554: print found

Remove print

(T201)


558-558: print found

Remove print

(T201)


559-559: print found

Remove print

(T201)


561-561: print found

Remove print

(T201)


562-562: print found

Remove print

(T201)

Comment on lines +36 to +41
def __init__(self):
"""Initialize based on environment configuration."""
self.deployment_type = DeploymentType(os.getenv('DEPLOYMENT_TYPE', 'serverless'))
self.model_name = os.getenv('MODEL_NAME', 'your-username/samo-dl-emotion-model')
self.hf_token = os.getenv('HF_TOKEN')

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Harden DEPLOYMENT_TYPE parsing; add return annotation (ANN204).

Current Enum casting is case-sensitive and may raise on unexpected values. Normalize and fall back safely. Also annotate init.

-    def __init__(self):
+    def __init__(self) -> None:
         """Initialize based on environment configuration."""
-        self.deployment_type = DeploymentType(os.getenv('DEPLOYMENT_TYPE', 'serverless'))
+        raw = os.getenv('DEPLOYMENT_TYPE', 'serverless')
+        normalized = (raw or 'serverless').strip().lower()
+        try:
+            self.deployment_type = DeploymentType(normalized)
+        except ValueError:
+            logger.warning("Unknown DEPLOYMENT_TYPE=%r; falling back to 'serverless'", raw)
+            self.deployment_type = DeploymentType.SERVERLESS
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def __init__(self):
"""Initialize based on environment configuration."""
self.deployment_type = DeploymentType(os.getenv('DEPLOYMENT_TYPE', 'serverless'))
self.model_name = os.getenv('MODEL_NAME', 'your-username/samo-dl-emotion-model')
self.hf_token = os.getenv('HF_TOKEN')
def __init__(self) -> None:
"""Initialize based on environment configuration."""
raw = os.getenv('DEPLOYMENT_TYPE', 'serverless')
normalized = (raw or 'serverless').strip().lower()
try:
self.deployment_type = DeploymentType(normalized)
except ValueError:
logger.warning("Unknown DEPLOYMENT_TYPE=%r; falling back to 'serverless'", raw)
self.deployment_type = DeploymentType.SERVERLESS
self.model_name = os.getenv('MODEL_NAME', 'your-username/samo-dl-emotion-model')
self.hf_token = os.getenv('HF_TOKEN')
🧰 Tools
🪛 Ruff (0.12.2)

36-36: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)

🤖 Prompt for AI Agents
In deployment/flexible_api_server.py around lines 36 to 41, the __init__ method
lacks a return type annotation and the DeploymentType enum casting is
case-sensitive, which can raise errors on unexpected values. Add a return type
annotation of None to __init__. Normalize the DEPLOYMENT_TYPE environment
variable to lowercase before casting and implement a safe fallback to a default
DeploymentType value if the provided value is invalid or unexpected.

Comment on lines +68 to +91
def _initialize_serverless(self):
"""Initialize serverless inference API."""
if not self.hf_token:
raise ValueError("HF_TOKEN environment variable required for serverless API")

self.api_url = f"https://api-inference.huggingface.co/models/{self.model_name}"
self.headers = {"Authorization": f"Bearer {self.hf_token}"}

# Create session with retry strategy
self.session = requests.Session()
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[429, 500, 502, 503, 504],
allowed_methods={"POST", "GET", "PUT", "PATCH"} # Enable retries for these HTTP methods
)
adapter = HTTPAdapter(max_retries=retry_strategy)
self.session.mount("http://", adapter)
self.session.mount("https://", adapter)

logger.info(f"📡 Serverless API: {self.api_url}")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Retry import path and respect Retry-After; trim trailing whitespace (UP, W291).

  • Prefer urllib3 import path (less brittle).
  • Respect Retry-After headers for 429/503.
  • Remove trailing whitespace.
-        from requests.adapters import HTTPAdapter
-        from requests.packages.urllib3.util.retry import Retry
+        from requests.adapters import HTTPAdapter
+        from urllib3.util.retry import Retry
@@
-        retry_strategy = Retry(
+        retry_strategy = Retry(
             total=3,
             backoff_factor=1,
             status_forcelist=[429, 500, 502, 503, 504],
-            allowed_methods={"POST", "GET", "PUT", "PATCH"}  # Enable retries for these HTTP methods
+            allowed_methods={"POST", "GET", "PUT", "PATCH"},
+            respect_retry_after_header=True,
         )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _initialize_serverless(self):
"""Initialize serverless inference API."""
if not self.hf_token:
raise ValueError("HF_TOKEN environment variable required for serverless API")
self.api_url = f"https://api-inference.huggingface.co/models/{self.model_name}"
self.headers = {"Authorization": f"Bearer {self.hf_token}"}
# Create session with retry strategy
self.session = requests.Session()
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry
retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[429, 500, 502, 503, 504],
allowed_methods={"POST", "GET", "PUT", "PATCH"} # Enable retries for these HTTP methods
)
adapter = HTTPAdapter(max_retries=retry_strategy)
self.session.mount("http://", adapter)
self.session.mount("https://", adapter)
logger.info(f"📡 Serverless API: {self.api_url}")
self.api_url = f"https://api-inference.huggingface.co/models/{self.model_name}"
self.headers = {"Authorization": f"Bearer {self.hf_token}"}
# Create session with retry strategy
self.session = requests.Session()
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[429, 500, 502, 503, 504],
allowed_methods={"POST", "GET", "PUT", "PATCH"},
respect_retry_after_header=True,
)
adapter = HTTPAdapter(max_retries=retry_strategy)
self.session.mount("http://", adapter)
self.session.mount("https://", adapter)
logger.info(f"📡 Serverless API: {self.api_url}")
🧰 Tools
🪛 Ruff (0.12.2)

68-68: Missing return type annotation for private function _initialize_serverless

Add return type annotation: None

(ANN202)

🤖 Prompt for AI Agents
In deployment/flexible_api_server.py around lines 68 to 91, update the import of
Retry to use the urllib3 package directly instead of requests.packages.urllib3
for better reliability. Modify the Retry configuration to set
respect_retry_after_header=True to honor Retry-After headers on 429 and 503
responses. Also, remove any trailing whitespace on these lines to comply with
style guidelines.

Comment on lines +141 to +148
except Exception as e:
# Log error with redacted text for debugging (avoid PII exposure in logs)
text_preview = f"{text[:20]}..." if len(text) > 20 else text
logger.error(f"❌ Prediction failed: {e} (input preview: {text_preview})")
return {
"error": str(e),
"deployment_type": self.deployment_type.value
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Sanitize error details returned to clients to avoid PII and internal leakage.

Returning str(e) exposes internal messages upstream (contrary to “no PII in errors”). Keep details in logs; return stable codes/messages to clients.

-        except Exception as e:
-            # Log error with redacted text for debugging (avoid PII exposure in logs)
+        except Exception as e:
+            # Log with redacted preview; do not expose details to clients
             text_preview = f"{text[:20]}..." if len(text) > 20 else text
             logger.error(f"❌ Prediction failed: {e} (input preview: {text_preview})")
             return {
-                "error": str(e),
+                "error": "Prediction failed. Please try again later.",
+                "error_code": "PREDICT_FAILURE",
                 "deployment_type": self.deployment_type.value
             }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except Exception as e:
# Log error with redacted text for debugging (avoid PII exposure in logs)
text_preview = f"{text[:20]}..." if len(text) > 20 else text
logger.error(f"❌ Prediction failed: {e} (input preview: {text_preview})")
return {
"error": str(e),
"deployment_type": self.deployment_type.value
}
except Exception as e:
# Log with redacted preview; do not expose details to clients
text_preview = f"{text[:20]}..." if len(text) > 20 else text
logger.error(f"❌ Prediction failed: {e} (input preview: {text_preview})")
return {
"error": "Prediction failed. Please try again later.",
"error_code": "PREDICT_FAILURE",
"deployment_type": self.deployment_type.value
}
🤖 Prompt for AI Agents
In deployment/flexible_api_server.py around lines 141 to 148, the error response
returns str(e) which may expose PII or internal details to clients. Modify the
return statement to send a generic, stable error message or code instead of
str(e), while keeping the detailed error logged internally. This ensures
sensitive information is not leaked upstream.

Comment on lines +150 to +213
def _predict_serverless(self, text: str) -> Dict[str, Any]:
"""Predict using serverless inference API."""
try:
payload = {"inputs": text}

# Add timeout and rate limit handling
timeout = int(os.getenv('TIMEOUT_SECONDS', '30'))
response = self.session.post(
self.api_url,
headers=self.headers,
json=payload,
timeout=timeout
)

if response.status_code == 503:
# Model is loading (cold start)
logger.info("🔄 Model loading, waiting...")
time.sleep(10) # Wait for model to load
response = self.session.post(
self.api_url,
headers=self.headers,
json=payload,
timeout=timeout
)

response.raise_for_status()
result = response.json()

# Convert HuggingFace format to our format
if isinstance(result, list) and len(result) > 0:
# Standard classification output
predictions = result[0] if isinstance(result[0], list) else result

# Find highest scoring emotion
best_prediction = max(predictions, key=lambda x: x['score'])

# Create emotion probabilities dict
all_emotions = {pred['label']: pred['score'] for pred in predictions}

return {
"emotion": best_prediction['label'],
"confidence": best_prediction['score'],
"all_emotions": all_emotions,
"text": text,
"deployment_type": "serverless",
"model": self.model_name
}
return {
"error": "Unexpected response format",
"raw_response": result,
"deployment_type": "serverless"
}
except requests.exceptions.Timeout:
return {
"error": "Request timeout (model may be cold starting)",
"suggestion": "Try again in a few seconds",
"deployment_type": "serverless"
}
except requests.exceptions.RequestException as e:
return {
"error": f"API request failed: {e}",
"deployment_type": "serverless"
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Harden serverless prediction error handling and keep errors generic.

  • Avoid returning raw exception strings to clients.
  • Optionally add a capped retry instead of a fixed 10s sleep on 503.
-            if response.status_code == 503:
+            if response.status_code == 503:
                 # Model is loading (cold start)
                 logger.info("🔄 Model loading, waiting...")
-                time.sleep(10)  # Wait for model to load
+                time.sleep(10)  # Wait for model to load (consider exponential backoff)
@@
-        except requests.exceptions.Timeout:
+        except requests.exceptions.Timeout:
             return {
-                "error": "Request timeout (model may be cold starting)",
+                "error": "Request timeout.",
                 "suggestion": "Try again in a few seconds",
                 "deployment_type": "serverless"
             }
-        except requests.exceptions.RequestException as e:
+        except requests.exceptions.RequestException as e:
+            logger.error("Serverless HTTP error: %s", e)
             return {
-                "error": f"API request failed: {e}",
+                "error": "Upstream API request failed.",
+                "error_code": "SERVERLESS_HTTP_ERROR",
                 "deployment_type": "serverless"
             }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _predict_serverless(self, text: str) -> Dict[str, Any]:
"""Predict using serverless inference API."""
try:
payload = {"inputs": text}
# Add timeout and rate limit handling
timeout = int(os.getenv('TIMEOUT_SECONDS', '30'))
response = self.session.post(
self.api_url,
headers=self.headers,
json=payload,
timeout=timeout
)
if response.status_code == 503:
# Model is loading (cold start)
logger.info("🔄 Model loading, waiting...")
time.sleep(10) # Wait for model to load
response = self.session.post(
self.api_url,
headers=self.headers,
json=payload,
timeout=timeout
)
response.raise_for_status()
result = response.json()
# Convert HuggingFace format to our format
if isinstance(result, list) and len(result) > 0:
# Standard classification output
predictions = result[0] if isinstance(result[0], list) else result
# Find highest scoring emotion
best_prediction = max(predictions, key=lambda x: x['score'])
# Create emotion probabilities dict
all_emotions = {pred['label']: pred['score'] for pred in predictions}
return {
"emotion": best_prediction['label'],
"confidence": best_prediction['score'],
"all_emotions": all_emotions,
"text": text,
"deployment_type": "serverless",
"model": self.model_name
}
return {
"error": "Unexpected response format",
"raw_response": result,
"deployment_type": "serverless"
}
except requests.exceptions.Timeout:
return {
"error": "Request timeout (model may be cold starting)",
"suggestion": "Try again in a few seconds",
"deployment_type": "serverless"
}
except requests.exceptions.RequestException as e:
return {
"error": f"API request failed: {e}",
"deployment_type": "serverless"
}
def _predict_serverless(self, text: str) -> Dict[str, Any]:
"""Predict using serverless inference API."""
try:
payload = {"inputs": text}
# Add timeout and rate limit handling
timeout = int(os.getenv('TIMEOUT_SECONDS', '30'))
response = self.session.post(
self.api_url,
headers=self.headers,
json=payload,
timeout=timeout
)
if response.status_code == 503:
# Model is loading (cold start)
logger.info("🔄 Model loading, waiting...")
time.sleep(10) # Wait for model to load (consider exponential backoff)
response = self.session.post(
self.api_url,
headers=self.headers,
json=payload,
timeout=timeout
)
response.raise_for_status()
result = response.json()
# Convert HuggingFace format to our format
if isinstance(result, list) and len(result) > 0:
# Standard classification output
predictions = result[0] if isinstance(result[0], list) else result
# Find highest scoring emotion
best_prediction = max(predictions, key=lambda x: x['score'])
# Create emotion probabilities dict
all_emotions = {pred['label']: pred['score'] for pred in predictions}
return {
"emotion": best_prediction['label'],
"confidence": best_prediction['score'],
"all_emotions": all_emotions,
"text": text,
"deployment_type": "serverless",
"model": self.model_name
}
return {
"error": "Unexpected response format",
"raw_response": result,
"deployment_type": "serverless"
}
except requests.exceptions.Timeout:
return {
"error": "Request timeout.",
"suggestion": "Try again in a few seconds",
"deployment_type": "serverless"
}
except requests.exceptions.RequestException as e:
logger.error("Serverless HTTP error: %s", e)
return {
"error": "Upstream API request failed.",
"error_code": "SERVERLESS_HTTP_ERROR",
"deployment_type": "serverless"
}
🧰 Tools
🪛 Ruff (0.12.2)

150-150: Use dict instead of Dict for type annotation

Replace with dict

(UP006)


158-158: Trailing whitespace

Remove trailing whitespace

(W291)


159-159: Trailing whitespace

Remove trailing whitespace

(W291)


169-169: Trailing whitespace

Remove trailing whitespace

(W291)


170-170: Trailing whitespace

Remove trailing whitespace

(W291)

🤖 Prompt for AI Agents
In deployment/flexible_api_server.py between lines 150 and 213, the serverless
prediction method currently returns raw exception messages to clients and uses a
fixed 10-second sleep on 503 errors. To fix this, replace raw exception strings
with generic error messages that do not expose internal details. Implement a
capped retry mechanism with a limited number of retries and incremental backoff
instead of a single fixed sleep to handle 503 responses more robustly.

Comment on lines +214 to +260
def _predict_endpoint(self, text: str) -> Dict[str, Any]:
"""Predict using inference endpoints."""
try:
payload = {"inputs": text}
timeout = int(os.getenv('TIMEOUT_SECONDS', '10'))

response = self.session.post(
self.endpoint_url,
headers=self.headers,
json=payload,
timeout=timeout
)

response.raise_for_status()
result = response.json()

# Same processing as serverless
if isinstance(result, list) and len(result) > 0:
predictions = result[0] if isinstance(result[0], list) else result
best_prediction = max(predictions, key=lambda x: x['score'])
all_emotions = {pred['label']: pred['score'] for pred in predictions}

return {
"emotion": best_prediction['label'],
"confidence": best_prediction['score'],
"all_emotions": all_emotions,
"text": text,
"deployment_type": "endpoint",
"model": self.model_name
}
return {
"error": "Unexpected response format",
"raw_response": result,
"deployment_type": "endpoint"
}
except requests.exceptions.Timeout:
return {
"error": "Request timeout (endpoint may be starting up)",
"suggestion": "Try again in a few seconds",
"deployment_type": "endpoint"
}
except requests.exceptions.RequestException as e:
return {
"error": f"Endpoint request failed: {e}",
"deployment_type": "endpoint"
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Mirror sanitized errors for endpoint backend; consider DRY refactor.

  • Keep client-facing error messages generic (no str(e)).
  • Optionally factor out shared HTTP prediction logic to reduce duplication.
-        except requests.exceptions.Timeout:
+        except requests.exceptions.Timeout:
             return {
-                "error": "Request timeout (endpoint may be starting up)",
+                "error": "Request timeout.",
                 "suggestion": "Try again in a few seconds",
                 "deployment_type": "endpoint"
             }
-        except requests.exceptions.RequestException as e:
+        except requests.exceptions.RequestException as e:
+            logger.error("Endpoint HTTP error: %s", e)
             return {
-                "error": f"Endpoint request failed: {e}",
+                "error": "Endpoint request failed.",
+                "error_code": "ENDPOINT_HTTP_ERROR",
                 "deployment_type": "endpoint"
             }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _predict_endpoint(self, text: str) -> Dict[str, Any]:
"""Predict using inference endpoints."""
try:
payload = {"inputs": text}
timeout = int(os.getenv('TIMEOUT_SECONDS', '10'))
response = self.session.post(
self.endpoint_url,
headers=self.headers,
json=payload,
timeout=timeout
)
response.raise_for_status()
result = response.json()
# Same processing as serverless
if isinstance(result, list) and len(result) > 0:
predictions = result[0] if isinstance(result[0], list) else result
best_prediction = max(predictions, key=lambda x: x['score'])
all_emotions = {pred['label']: pred['score'] for pred in predictions}
return {
"emotion": best_prediction['label'],
"confidence": best_prediction['score'],
"all_emotions": all_emotions,
"text": text,
"deployment_type": "endpoint",
"model": self.model_name
}
return {
"error": "Unexpected response format",
"raw_response": result,
"deployment_type": "endpoint"
}
except requests.exceptions.Timeout:
return {
"error": "Request timeout (endpoint may be starting up)",
"suggestion": "Try again in a few seconds",
"deployment_type": "endpoint"
}
except requests.exceptions.RequestException as e:
return {
"error": f"Endpoint request failed: {e}",
"deployment_type": "endpoint"
}
def _predict_endpoint(self, text: str) -> Dict[str, Any]:
"""Predict using inference endpoints."""
try:
payload = {"inputs": text}
timeout = int(os.getenv('TIMEOUT_SECONDS', '10'))
response = self.session.post(
self.endpoint_url,
headers=self.headers,
json=payload,
timeout=timeout
)
response.raise_for_status()
result = response.json()
# Same processing as serverless
if isinstance(result, list) and len(result) > 0:
predictions = result[0] if isinstance(result[0], list) else result
best_prediction = max(predictions, key=lambda x: x['score'])
all_emotions = {pred['label']: pred['score'] for pred in predictions}
return {
"emotion": best_prediction['label'],
"confidence": best_prediction['score'],
"all_emotions": all_emotions,
"text": text,
"deployment_type": "endpoint",
"model": self.model_name
}
return {
"error": "Unexpected response format",
"raw_response": result,
"deployment_type": "endpoint"
}
except requests.exceptions.Timeout:
return {
"error": "Request timeout.",
"suggestion": "Try again in a few seconds",
"deployment_type": "endpoint"
}
except requests.exceptions.RequestException as e:
logger.error("Endpoint HTTP error: %s", e)
return {
"error": "Endpoint request failed.",
"error_code": "ENDPOINT_HTTP_ERROR",
"deployment_type": "endpoint"
}
🧰 Tools
🪛 Ruff (0.12.2)

214-214: Use dict instead of Dict for type annotation

Replace with dict

(UP006)

🤖 Prompt for AI Agents
In deployment/flexible_api_server.py around lines 214 to 260, the error handling
for the endpoint backend currently exposes detailed exception messages to
clients, which should be sanitized to keep error messages generic. Update the
error responses to avoid including the string representation of exceptions and
instead provide general error messages. Additionally, consider refactoring the
HTTP prediction logic into a shared helper method to reduce code duplication
between different deployment types.

Comment on lines +378 to +406
@app.route('/predict', methods=['POST'])
def predict_emotion():
"""Predict emotion for given text."""
if detector is None:
return jsonify({
'error': 'Detector not initialized'
}), 503

try:
data = request.get_json()

if not data or 'text' not in data:
return jsonify({'error': 'No text provided'}), 400

text = data['text']
if not text.strip():
return jsonify({'error': 'Empty text provided'}), 400

# Make prediction
result = detector.predict(text)

# Return appropriate status code
if 'error' in result:
return jsonify(result), 500
return jsonify(result)

except Exception as e:
return jsonify({'error': str(e)}), 500

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verification: scan for potential PII in error responses.

Ensure no error JSON includes user-provided text or secrets. The current diffs remove str(e) from client responses; validate across the repo.

Also applies to: 407-439


🏁 Script executed:

#!/bin/bash
# Find any error responses that might include request text or variables named 'text'
# 1) Look for error dicts that also include "text"
rg -n -A1 -B1 $'jsonify\\(\\{[^\\)]*error[^\\}]*\\}\\)' | sed -n '1,200p'
rg -n $'\"text\"\\s*:' | sed -n '1,200p'

# 2) Ensure no direct exception strings are returned to clients
rg -n $'jsonify\\(\\{[^\\)]*str\\(e\\)' | sed -n '1,200p'

# 3) Check detector error returns for str(e)
rg -n $'\"error\"\\s*:\\s*str\\(e\\)'

Length of output: 32652


Replace direct exception messages in API responses

Multiple endpoints still return str(e) in their JSON error responses, which can expose internal details or sensitive information. Instead, return a generic error message and log the exception internally. For example, replace:

except Exception as e:
    return jsonify({'error': str(e)}), 500

with something like:

except Exception as e:
    logger.error("Prediction error", exc_info=e)
    return jsonify({'error': 'Internal server error'}), 500

Key locations to update (search for return jsonify({'error': str(e)}):

  • deployment/flexible_api_server.py: lines 405, 438
  • deployment/api_server.py: lines 53, 73
  • deployment/secure_api_server.py: lines 185, 434, 464, 500, 530, 574, 617, 634, 694
  • deployment/local/api_server.py: lines 214, 253, 299, 375
  • deployment/gcp/predict.py: line 123
  • deployment/cloud-run/minimal_api_server.py: line 73
  • deployment/cloud-run/onnx_api_server.py: lines 271, 304
  • scripts/deployment/deploy_locally.py: lines 156, 183
  • …and any other instances of str(e) in error responses.

Run a global grep for str(e) in JSONify calls to catch all occurrences, then swap to a generic message and ensure exceptions are logged with logger.error(..., exc_info=e).

🧰 Tools
🪛 Ruff (0.12.2)

379-379: Missing return type annotation for public function predict_emotion

(ANN201)

🤖 Prompt for AI Agents
In deployment/flexible_api_server.py around lines 378 to 406, the exception
handling in the predict_emotion endpoint returns the direct exception message in
the JSON response, which can expose sensitive internal details. To fix this,
replace the return statement inside the except block to log the exception using
logger.error with exc_info=e, and return a generic error message like {'error':
'Internal server error'} with a 500 status code instead of returning str(e).
Apply this pattern consistently to all similar exception handlers in the file
and other specified files.

Comment on lines +404 to +406
except Exception as e:
return jsonify({'error': str(e)}), 500

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Do not return raw exception messages to clients (no PII in errors).

Return a generic message and log details server-side.

-    except Exception as e:
-        return jsonify({'error': str(e)}), 500
+    except Exception as e:
+        logger.exception("Unhandled error in /predict: %s", e)
+        return jsonify({'error': 'Internal error. Please try again later.', 'error_code': 'PREDICT_INTERNAL_ERROR'}), 500
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except Exception as e:
return jsonify({'error': str(e)}), 500
except Exception as e:
logger.exception("Unhandled error in /predict: %s", e)
return jsonify({
'error': 'Internal error. Please try again later.',
'error_code': 'PREDICT_INTERNAL_ERROR'
}), 500
🤖 Prompt for AI Agents
In deployment/flexible_api_server.py around lines 404 to 406, avoid returning
raw exception messages to clients to prevent exposing PII. Instead, return a
generic error message like "Internal server error" in the JSON response, and log
the detailed exception message on the server side for debugging purposes.

Comment on lines +421 to +430
texts = data['texts']
if not isinstance(texts, list):
return jsonify({'error': 'Texts must be a list'}), 400

results = []
for text in texts:
if text and text.strip():
result = detector.predict(text.strip())
results.append(result)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Validate batch items and enforce a max batch size.

Prevent accidental large payloads and non-string items from triggering errors.

-        texts = data['texts']
+        texts = data['texts']
+        max_batch = int(os.getenv('MAX_BATCH_SIZE', '32'))
+        if len(texts) > max_batch:
+            return jsonify({'error': f'Batch size exceeds limit of {max_batch}'}), 400
@@
-        for text in texts:
-            if text and text.strip():
-                result = detector.predict(text.strip())
-                results.append(result)
+        for idx, text in enumerate(texts):
+            if not isinstance(text, str):
+                results.append({'error': 'Invalid item type; expected string', 'index': idx})
+                continue
+            stripped = text.strip()
+            if stripped:
+                results.append(detector.predict(stripped))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
texts = data['texts']
if not isinstance(texts, list):
return jsonify({'error': 'Texts must be a list'}), 400
results = []
for text in texts:
if text and text.strip():
result = detector.predict(text.strip())
results.append(result)
texts = data['texts']
max_batch = int(os.getenv('MAX_BATCH_SIZE', '32'))
if len(texts) > max_batch:
return jsonify({'error': f'Batch size exceeds limit of {max_batch}'}), 400
if not isinstance(texts, list):
return jsonify({'error': 'Texts must be a list'}), 400
results = []
for idx, text in enumerate(texts):
if not isinstance(text, str):
results.append({'error': 'Invalid item type; expected string', 'index': idx})
continue
stripped = text.strip()
if stripped:
results.append(detector.predict(stripped))
🤖 Prompt for AI Agents
In deployment/flexible_api_server.py around lines 421 to 430, the code currently
does not validate individual items in the 'texts' list or enforce a maximum
batch size. To fix this, add a check to ensure each item in 'texts' is a string
and skip or handle non-string items appropriately. Also, enforce a maximum batch
size limit by returning an error response if the list exceeds this limit to
prevent processing overly large payloads.

Comment on lines +437 to +439
except Exception as e:
return jsonify({'error': str(e)}), 500

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Sanitize batch endpoint errors and log server-side.

Same rationale as single predict.

-    except Exception as e:
-        return jsonify({'error': str(e)}), 500
+    except Exception as e:
+        logger.exception("Unhandled error in /predict_batch: %s", e)
+        return jsonify({'error': 'Internal error. Please try again later.', 'error_code': 'PREDICT_BATCH_INTERNAL_ERROR'}), 500
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
except Exception as e:
return jsonify({'error': str(e)}), 500
except Exception as e:
logger.exception("Unhandled error in /predict_batch: %s", e)
return jsonify({
'error': 'Internal error. Please try again later.',
'error_code': 'PREDICT_BATCH_INTERNAL_ERROR'
}), 500
🤖 Prompt for AI Agents
In deployment/flexible_api_server.py around lines 437 to 439, the batch endpoint
error handling currently returns the error message directly to the client
without sanitization or server-side logging. Modify the except block to log the
full exception details on the server side for debugging, and return a sanitized,
generic error message to the client in the JSON response to avoid exposing
sensitive information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants