Port and enhance unified ai api by uelkerd · Pull Request #73 · uelkerd/SAMO--DL

uelkerd · 2025-08-12T01:01:14Z

Implement core text and voice processing API endpoints and fix Swagger documentation.

This PR significantly expands the API's capabilities beyond basic emotion detection by integrating text summarization, journal analysis, and voice transcription features. It also resolves the persistent 500 error on the /docs endpoint by serving a custom, functional Swagger UI.

Summary by Sourcery

Port and enhance unified AI API by integrating text summarization, journal analysis, and voice transcription endpoints with lazy model loading, custom Swagger UI, and updated OpenAPI spec and integration tests.

New Features:

Add /api/summarize endpoint for text summarization
Add /api/analyze/journal endpoint for combined emotion detection and optional summarization
Add /api/transcribe and /api/transcribe_batch endpoints for single and batch audio transcription
Add /api/analyze/voice_journal endpoint to transcribe audio and perform full journal analysis

Bug Fixes:

Fix persistent 500 error on /docs by disabling default Swagger UI and registering a custom docs blueprint

Enhancements:

Implement lazy loading for T5 summarizer and Whisper transcriber to defer model imports and initialization

Documentation:

Extend openapi.yaml with new paths and request/response schemas for summarization, journal analysis, and transcription APIs

Tests:

Add integration tests for summarization, journal analysis, single and batch transcription endpoints

Co-authored-by: denizcan.uelker <denizcan.uelker@mercedes-benz.com>

cursor · 2025-08-12T01:01:15Z

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.

sourcery-ai · 2025-08-12T01:01:19Z

Reviewer's Guide

This PR refactors the secure API server to disable the default Swagger UI, register a custom docs blueprint, add lazy-loaded summarization and transcription services, expose new text/voice processing endpoints with corresponding API models, update the OpenAPI spec, and add integration tests for these extensions.

Sequence diagram for the new /api/analyze/voice_journal endpoint

sequenceDiagram
    actor User
    participant API_Server
    participant Transcriber
    participant JournalAnalyzer
    User->>API_Server: POST /api/analyze/voice_journal (audio file)
    API_Server->>Transcriber: transcribe(audio)
    Transcriber-->>API_Server: transcription result
    API_Server->>JournalAnalyzer: analyze_journal(transcription.text)
    JournalAnalyzer-->>API_Server: analysis result
    API_Server-->>User: return journal analysis response
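
A minimal sketch of the flow in the diagram, not the PR's actual handler: `analyze_voice_journal`, `Transcription`, `FakeTranscriber`, and `fake_analyze_journal` are hypothetical names introduced here. Only the order of calls (transcribe first, then analyze the transcription text) is taken from the diagram.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass
class Transcription:
    """Hypothetical result type; the diagram only relies on a `.text` attribute."""
    text: str


class Transcriber(Protocol):
    def transcribe(self, audio_path: str) -> Transcription: ...


def analyze_voice_journal(
    audio_path: str,
    transcriber: Transcriber,
    analyze_journal: Callable[[str], dict],
) -> dict:
    """Transcribe the audio, then run journal analysis on the transcribed text."""
    transcription = transcriber.transcribe(audio_path)
    return analyze_journal(transcription.text)


# Stand-ins so the sketch runs without Whisper or the emotion model.
class FakeTranscriber:
    def transcribe(self, audio_path: str) -> Transcription:
        return Transcription(text="I felt calm today.")


def fake_analyze_journal(text: str) -> dict:
    return {"emotion_analysis": {"emotions": []}, "summary": None, "input_chars": len(text)}


result = analyze_voice_journal("entry.wav", FakeTranscriber(), fake_analyze_journal)
```

Passing the transcriber and analyzer in as arguments keeps the chaining logic testable without loading either model.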

Class diagram for new and updated API models and services

classDiagram
    class SummarizeRequest {
      +string text
      +string model
    }
    class SummarizeResponse {
      +string summary
      +object meta
    }
    class JournalRequest {
      +string text
      +bool generate_summary
      +float emotion_threshold
    }
    class JournalResponse {
      +object emotion_analysis
      +object summary
      +float processing_time_ms
      +object pipeline_status
    }
    class TranscriptionResponse {
      +string text
      +string language
      +float confidence
      +float duration
      +string audio_quality
      +int word_count
      +float speaking_rate
    }
    class BatchTranscriptionResponse {
      +int total_files
      +array results
    }
    class T5SummarizationModel {
      +generate_summary(text, max_length, min_length)
    }
    class WhisperTranscriber {
      +transcribe(audio_path)
    }
    SummarizeRequest --> SummarizeResponse
    JournalRequest --> JournalResponse
    WhisperTranscriber --> TranscriptionResponse
    WhisperTranscriber --> BatchTranscriptionResponse
    T5SummarizationModel --> SummarizeResponse
    JournalResponse --> SummarizeResponse
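
As an illustration of the request models in the diagram, a JSON body for `/api/analyze/journal` might be built as below. This is a sketch under assumptions: the field names come from the `JournalRequest` class in the diagram, while the dataclass itself and the example values are illustrative and not taken from the PR.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class JournalRequest:
    """Mirrors the JournalRequest fields shown in the class diagram."""
    text: str
    generate_summary: bool
    emotion_threshold: float


# Example values are illustrative only.
request = JournalRequest(
    text="Long day, but I finished the report.",
    generate_summary=True,
    emotion_threshold=0.5,
)
payload = json.dumps(asdict(request))
```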

File-Level Changes

Change	Details	Files
Configure custom Swagger UI and disable default docs	Import and conditionally register a docs_blueprint with fallback to None; set OPENAPI_SPEC_PATH and OPENAPI_ALLOWED_DIR when unset; disable Flask-RESTX default Swagger docs to prevent 500 errors	`deployment/cloud-run/secure_api_server.py`
Add lazy-loading mechanisms for summarization and transcription models	Define ensure_summarizer_loaded with thread lock and dynamic import of T5SummarizationModel; define ensure_transcriber_loaded with thread lock and dynamic import of WhisperTranscriber; introduce global variables and locks for both models	`deployment/cloud-run/secure_api_server.py`
Implement new text and voice processing API endpoints	Define RESTX models for SummarizeRequest/Response and JournalRequest/Response; implement /summarize and /analyze/journal endpoints with sanitization, rate limiting, and error handling; add _save_upload helper and implement /transcribe and /transcribe_batch endpoints; implement /analyze/voice_journal endpoint by chaining transcription and journal analysis	`deployment/cloud-run/secure_api_server.py`
Expand OpenAPI specification with new paths and schemas	Add OpenAPI path entries for summarization, journal analysis, transcription, batch transcription, and voice journal analysis; define new components schemas: SummarizeRequest/Response, JournalRequest/Response, TranscriptionResponse, BatchTranscriptionResponse	`deployment/cloud-run/openapi.yaml`
Add integration tests for the extended API	Create pytest-based smoke tests for /api/summarize and /api/analyze/journal; add file-based tests for /api/transcribe and /api/transcribe_batch endpoints	`tests/integration/test_secure_api_extensions.py`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an issue from a review comment by replying to it. You can also reply to a review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull request title to generate a title at any time. You can also comment @sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in the pull request body to generate a PR summary at any time exactly where you want it. You can also comment @sourcery-ai summary on the pull request to (re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the pull request to resolve all Sourcery comments. Useful if you've already addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull request to dismiss all existing Sourcery reviews. Especially useful if you want to start fresh with a new review - don't forget to comment @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

coderabbitai · 2025-08-12T01:01:20Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✨ Finishing Touches

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch cursor/port-and-enhance-unified-ai-api-03a1

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

Visit our Status Page to check the current availability of CodeRabbit.
Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

gemini-code-assist

Summary of Changes

Hello @uelkerd, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the unified AI API by introducing new core endpoints for advanced text and voice processing. It expands the API's functionality beyond basic emotion detection to include text summarization, detailed journal analysis, and audio transcription. Additionally, it addresses and resolves a critical issue with the Swagger documentation, ensuring that the API's capabilities are properly exposed and accessible.

Highlights

Expanded AI API Capabilities: The API now includes new endpoints for text summarization, comprehensive journal analysis (combining emotion detection with summarization), and robust voice transcription capabilities for both single and batch audio files.
Improved API Documentation: A custom Swagger UI has been implemented and integrated, resolving the persistent 500 error on the /docs endpoint and providing functional API documentation.
Optimized Model Loading: Summarization and transcription models are now lazy-loaded, meaning they are only initialized when their respective endpoints are first called, optimizing server startup and resource utilization.
Comprehensive Integration Tests: New integration tests have been added to ensure the stability and correctness of the newly introduced text and voice processing API endpoints.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

deepsource-io · 2025-08-12T01:01:33Z

Here's the code health analysis summary for commits 69ddfe4..cb9c733. View details on DeepSource ↗.

Analysis Summary

Analyzer	Status	Summary	Link
Test coverage	⚠️ Artifact not reported	Timed out: Artifact was never reported	View Check ↗
Python	❌ Failure	❗ 100 occurrences introduced	View Check ↗
Terraform	✅ Success		View Check ↗
Secrets	✅ Success		View Check ↗
Shell	✅ Success		View Check ↗
Docker	✅ Success		View Check ↗

💡 If you’re a repository administrator, you can configure the quality gates from the settings.

gemini-code-assist

Code Review

This pull request introduces significant new functionality with text summarization and voice processing endpoints. The implementation is generally solid, with good use of lazy loading for models and a custom Swagger UI fix. My review focuses on improving API consistency, making error handling more robust, and addressing some placeholder values in the implementation. I've also added suggestions to enhance the OpenAPI specification for better client-side experience and a minor improvement in the new integration tests.

gemini-code-assist · 2025-08-12T01:02:31Z

+                    summary_results = {
+                        'summary': summary_text,
+                        'key_emotions': [e.get('emotion') for e in (emotion_results.get('emotions') or [])[:1]],
+                        'compression_ratio': 0.5,


The compression_ratio is hardcoded to 0.5. This seems to be a placeholder value. It should be calculated dynamically based on the length of the original text and the generated summary to be meaningful.

Suggested change

-                        'compression_ratio': 0.5,
+                        'compression_ratio': len(summary_text) / len(safe_text) if len(safe_text) > 0 else 0,
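
The suggested calculation can be pulled into a small helper, sketched below. The function name `compression_ratio` is hypothetical; it measures character counts as in the suggestion above, and a word- or token-based ratio would be an equally reasonable choice.

```python
def compression_ratio(original: str, summary: str) -> float:
    """Summary length divided by original length; 0.0 when the original is empty."""
    if not original:
        return 0.0
    return len(summary) / len(original)


# A 50-character summary of a 200-character text.
ratio = compression_ratio("a" * 200, "a" * 50)
```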

gemini-code-assist · 2025-08-12T01:02:31Z

+                        'summary': summary_text,
+                        'key_emotions': [e.get('emotion') for e in (emotion_results.get('emotions') or [])[:1]],
+                        'compression_ratio': 0.5,
+                        'emotional_tone': 'neutral'


The emotional_tone is hardcoded to 'neutral'. This appears to be a placeholder. This value should be derived from the emotion_analysis results, for example by selecting the top emotion or a combination of emotions.

Suggested change

-                        'emotional_tone': 'neutral'
+                        'emotional_tone': (emotion_results.get('emotions') or [{}])[0].get('emotion', 'neutral')
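
The suggestion above takes the first entry in `emotions`, which is the top emotion only if the list is already sorted. A sketch that selects the highest-scoring entry explicitly follows; the helper name `emotional_tone` and the `confidence` score key are assumptions, since the shape of each emotion entry is not shown in this thread.

```python
from typing import Any, Mapping


def emotional_tone(
    emotion_results: Mapping[str, Any],
    default: str = "neutral",
    score_key: str = "confidence",  # assumed field name for the per-emotion score
) -> str:
    """Return the label of the highest-scoring emotion, or the default if none exist."""
    emotions = emotion_results.get("emotions")
    if not emotions:
        return default
    top = max(emotions, key=lambda entry: entry.get(score_key, 0.0))
    return top.get("emotion", default)


tone = emotional_tone(
    {"emotions": [
        {"emotion": "joy", "confidence": 0.2},
        {"emotion": "sadness", "confidence": 0.7},
    ]}
)
```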

gemini-code-assist · 2025-08-12T01:02:32Z

+        summary:
+          type: object
+          nullable: true


The summary object within JournalResponse is defined with type: object without specifying its properties. This makes the API contract less clear for consumers. It's better to define the full structure of the summary object for clarity and to enable better client-side validation.

        summary:
          type: object
          nullable: true
          properties:
            summary:
              type: string
              description: The generated summary text.
            key_emotions:
              type: array
              items:
                type: string
              description: Key emotions detected in the text.
            compression_ratio:
              type: number
              format: float
              description: The ratio of summary length to original text length.
            emotional_tone:
              type: string
              description: The overall emotional tone of the summary.

gemini-code-assist · 2025-08-12T01:02:32Z

+          items:
+            type: object
+            properties:
+              index:
+                type: integer
+              success:
+                type: boolean
+              text:
+                type: string
+                nullable: true
+              language:
+                type: string
+                nullable: true
+              confidence:
+                type: number
+                format: float
+                nullable: true
+              error:
+                type: string
+                nullable: true


The schema for successful items in BatchTranscriptionResponse is inconsistent with TranscriptionResponse. It's missing fields like duration, audio_quality, word_count, and speaking_rate. To maintain consistency across the API, the batch response should provide the same level of detail for each successful transcription.

gemini-code-assist · 2025-08-12T01:02:32Z

+        except Exception as exc:
+            logger.warning("Summarizer unavailable: %s", exc)
+            _summarizer = None
+            return False


Catching a broad Exception can hide specific issues and make debugging harder. It's better to catch more specific exceptions, like ImportError if a dependency is missing, or RuntimeError for model loading issues. This also applies to the ensure_transcriber_loaded function.

Suggested change

-        except Exception as exc:
-            logger.warning("Summarizer unavailable: %s", exc)
-            _summarizer = None
-            return False
+        except (ImportError, RuntimeError) as exc:
+            logger.warning("Summarizer unavailable: %s", exc)
+            _summarizer = None
+            return False

gemini-code-assist · 2025-08-12T01:02:32Z

+                try:
+                    os.remove(audio_path)
+                except Exception:
+                    pass


The except Exception: pass in the finally block silently ignores any errors during file cleanup. This can lead to temporary files being left behind without any notification. It's better to log the exception. This same issue exists in the transcribe_batch and analyze_voice_journal endpoints.

Suggested change

-                try:
-                    os.remove(audio_path)
-                except Exception:
-                    pass
+                try:
+                    os.remove(audio_path)
+                except Exception as e:
+                    logger.warning(f"Failed to remove temporary file {audio_path}: {e}")
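
Because the same cleanup appears in three endpoints, it could be factored into one helper. The sketch below is one possible shape, with a hypothetical name `remove_temp_file`; it narrows the handler to `OSError`, which is what `os.remove` raises on failure, and treats an already-missing file as a successful cleanup.

```python
import logging
import os
import tempfile

logger = logging.getLogger(__name__)


def remove_temp_file(path: str) -> bool:
    """Delete a temporary file; log a warning instead of raising if deletion fails."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return True  # already gone, nothing left to clean up
    except OSError as exc:
        logger.warning("Failed to remove temporary file %s: %s", path, exc)
        return False


# Create a throwaway file, then remove it twice to show both paths.
fd, tmp_path = tempfile.mkstemp(suffix=".wav")
os.close(fd)
removed = remove_temp_file(tmp_path)
removed_again = remove_temp_file(tmp_path)
```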

gemini-code-assist · 2025-08-12T01:02:32Z

+                    temp_paths.append(p)
+                    try:
+                        r = _transcriber.transcribe(p)
+                        results.append({'index': idx, 'success': True, 'text': r.text, 'language': r.language, 'confidence': r.confidence})


The response for successful batch transcriptions is missing several fields that are available in the TranscriptionResult object and are returned by the single /transcribe endpoint (e.g., duration, audio_quality, word_count, speaking_rate). For consistency, these fields should be included in the batch response as well. This would also align the implementation with the suggested changes in openapi.yaml.

Suggested change

-                        results.append({'index': idx, 'success': True, 'text': r.text, 'language': r.language, 'confidence': r.confidence})
+                        results.append({
+                            'index': idx,
+                            'success': True,
+                            'text': r.text,
+                            'language': r.language,
+                            'confidence': r.confidence,
+                            'duration': r.duration,
+                            'audio_quality': r.audio_quality,
+                            'word_count': r.word_count,
+                            'speaking_rate': r.speaking_rate
+                        })
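
One way to keep the single and batch responses from drifting apart is to build both from a shared serializer. This is a sketch under assumptions: `TranscriptionResult` here is a stand-in dataclass with the fields named in this thread, and `serialize_transcription` and `batch_item` are hypothetical helpers, not functions from the PR.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class TranscriptionResult:
    """Stand-in carrying the fields listed for TranscriptionResponse."""
    text: str
    language: str
    confidence: float
    duration: float
    audio_quality: str
    word_count: int
    speaking_rate: float


RESULT_FIELDS = (
    "text", "language", "confidence", "duration",
    "audio_quality", "word_count", "speaking_rate",
)


def serialize_transcription(result: Any) -> Dict[str, Any]:
    """Fields returned for a single transcription."""
    return {name: getattr(result, name) for name in RESULT_FIELDS}


def batch_item(index: int, result: Any) -> Dict[str, Any]:
    """Batch entry: the single-response fields plus index and success flag."""
    return {"index": index, "success": True, **serialize_transcription(result)}


sample = TranscriptionResult("hello world", "en", 0.93, 1.2, "good", 2, 100.0)
item = batch_item(0, sample)
```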

gemini-code-assist · 2025-08-12T01:02:32Z

+                    try:
+                        os.remove(p)
+                    except Exception:
+                        pass


The except Exception: pass in the finally block silently ignores any errors during file cleanup. This can lead to temporary files being left behind without any notification. It's better to log the exception.

                    try:
                        os.remove(p)
                    except Exception as e:
                        logger.warning(f"Failed to remove temporary file {p}: {e}")

gemini-code-assist · 2025-08-12T01:02:32Z

+                try:
+                    os.remove(audio_path)
+                except Exception:
+                    pass


The except Exception: pass in the finally block silently ignores any errors during file cleanup. This can lead to temporary files being left behind without any notification. It's better to log the exception.

                try:
                    os.remove(audio_path)
                except Exception as e:
                    logger.warning(f"Failed to remove temporary file {audio_path}: {e}")

gemini-code-assist · 2025-08-12T01:02:32Z

+            try:
+                fh.close()
+            except Exception:
+                pass


The except Exception: pass silently ignores potential errors when closing file handles. While this is a test file, it's still good practice to handle exceptions explicitly, for example by logging them, to avoid hiding potential issues during test runs.

Suggested change

-            try:
-                fh.close()
-            except Exception:
-                pass
+            try:
+                fh.close()
+            except Exception as e:
+                print(f"Warning: failed to close file handle for {name}: {e}")
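
An alternative that avoids manual `close()` calls altogether is `contextlib.ExitStack`, which closes every registered handle when the block exits. The sketch below assumes the test opens several audio files before posting them; the helper name `open_all` and the file names are illustrative.

```python
import contextlib
import os
import tempfile
from typing import BinaryIO, List


def open_all(paths: List[str], stack: contextlib.ExitStack) -> List[BinaryIO]:
    """Open each path for reading and register it with the stack for cleanup."""
    return [stack.enter_context(open(path, "rb")) for path in paths]


with tempfile.TemporaryDirectory() as tmp_dir:
    # Write two tiny placeholder files to stand in for audio uploads.
    paths = []
    for name in ("a.wav", "b.wav"):
        path = os.path.join(tmp_dir, name)
        with open(path, "wb") as fh:
            fh.write(b"\x00")
        paths.append(path)

    with contextlib.ExitStack() as stack:
        handles = open_all(paths, stack)
        open_inside = [not handle.closed for handle in handles]
    # Leaving the ExitStack block closes every handle, even if an error occurred.
    closed_after = [handle.closed for handle in handles]
```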

Resolved issues in the following files with DeepSource Autofix: 1. tests/integration/conftest.py 2. tests/integration/test_secure_api_extensions.py 3. tests/unit/test_lazy_loaders.py

cursoragent and others added 4 commits August 12, 2025 00:52

API: add summarization and voice endpoints; lazy-load T5/Whisper; reg…

97f5bba

…ister custom docs blueprint; keep Swagger disabled

Docs: extend OpenAPI with summarize/journal/transcribe endpoints

87e19ee

Tests: add integration tests for summarize/journal and smoke tests fo…

8bb81cf

…r transcribe endpoints

Configure default OpenAPI spec path when not explicitly set

ad7a3e1

Co-authored-by: denizcan.uelker <denizcan.uelker@mercedes-benz.com>

gemini-code-assist Bot reviewed Aug 12, 2025


cursoragent and others added 7 commits August 12, 2025 01:03

API: add auth and monitoring endpoints with minimal secure tokens; do…

f7b9c5a

…cs blueprint defaults

Docs: add auth and monitoring endpoints to OpenAPI

b63c7ac

Tests: add pytest server fixture; expand integration tests; add unit …

d8213cb

…tests for lazy loaders and error paths

Docs: add API implementation strategy (single source of truth for thi…

49e0f01

…s branch)

Tests: add rate limit and monitoring schema integration tests

7f3ae59

Docs: add streaming decision and staging deployment prep

06cbe6b

Add rate-limited API server fixture for integration tests

3205c83

Co-authored-by: denizcan.uelker <denizcan.uelker@mercedes-benz.com>

uelkerd self-assigned this Aug 12, 2025

Port and enhance unified ai api

cb9c733
