Cld2labs/DocSummarization #2405

gopal-raj-suresh · 2026-01-23T18:17:52Z

Description

This PR introduces the DocSummarization blueprint, a GenAI-powered application for intelligent document summarization. The blueprint supports multiple document formats (PDF, TXT) and provides configurable summarization styles and lengths, making it suitable for enterprise document processing workflows.

Key Features:

PDF and text document summarization
Customizable summary length (short, medium, long)
Multiple summarization styles (executive, technical, bullet points)
Dual input modes (file upload or text paste)
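
As a rough illustration of how the length and style options listed above could feed into an LLM prompt, here is a minimal sketch. It is not taken from the blueprint's code: `LENGTH_HINTS`, `STYLE_HINTS`, `build_summary_prompt`, and the wording of each hint are all hypothetical, and the real backend may structure this differently.

```python
# Hypothetical sketch: these names and hint strings are assumptions,
# not the blueprint's actual implementation.
LENGTH_HINTS = {
    "short": "in about 3 sentences",
    "medium": "in about 2 paragraphs",
    "long": "in about 5 paragraphs",
}

STYLE_HINTS = {
    "executive": "Focus on decisions, risks, and outcomes for a non-technical reader.",
    "technical": "Preserve key technical details, figures, and terminology.",
    "bullet points": "Format the summary as a list of concise bullet points.",
}


def build_summary_prompt(text: str, length: str = "medium", style: str = "executive") -> str:
    """Combine the chosen length and style into a single summarization prompt."""
    if length not in LENGTH_HINTS:
        raise ValueError(f"unknown length: {length!r}")
    if style not in STYLE_HINTS:
        raise ValueError(f"unknown style: {style!r}")
    return (
        f"Summarize the following document {LENGTH_HINTS[length]}. "
        f"{STYLE_HINTS[style]}\n\n{text.strip()}"
    )
```

Rejecting unknown options up front keeps an invalid request from silently falling back to a default style, which would be easy to miss when reviewing summaries.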

Issues

n/a

Type of change

List the type of change like below. Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds new functionality)
Breaking change (fix or feature that would break existing design and interface)
Others (enhancement, documentation, validation, etc.)

Dependencies

No new repository-level dependencies.

All dependencies for the DocSummarization blueprint are listed in:

Backend: DocSummarization/backend/requirements.txt
Frontend: DocSummarization/frontend/package.json

Key technologies: FastAPI, React, OpenAI-compatible LLM integration, pypdf

Tests

Testing Instructions:

git clone https://github.com/cld2labs/GenAIExamples.git
cd GenAIExamples
git checkout cld2labs/doc-summarization
cd DocSummarization

github-actions · 2026-01-23T18:18:22Z

Dependency Review

The following issues were found:

❌ 4 vulnerable package(s)
✅ 0 package(s) with incompatible licenses
✅ 0 package(s) with invalid SPDX license definitions
⚠️ 3 package(s) with unknown licenses.

See the Details below.

Vulnerabilities

DocSummarization/backend/requirements.txt

Name	Version	Vulnerability	Severity
Pillow	10.2.0	Pillow buffer overflow vulnerability	high
python-multipart	0.0.6	python-multipart vulnerable to Content-Type Header ReDoS	high
python-multipart	0.0.6	Denial of service (DoS) via deformation `multipart/form-data` boundary	high
pypdf	6.1.1	pypdf possibly loops infinitely when reading DCT inline images without EOF marker	moderate
pypdf	6.1.1	pypdf can exhaust RAM via manipulated LZWDecode streams	moderate
pypdf	6.1.1	pypdf's LZWDecode streams be manipulated to exhaust RAM	moderate
pypdf	6.1.1	pypdf has possible long runtimes for missing /Root object with large /Size values	low
pypdf	6.1.1	pypdf has possible long runtimes for malformed startxref	low
requests	2.31.0	Requests `Session` object does not verify requests after making first request with verify=False	moderate
requests	2.31.0	Requests vulnerable to .netrc credentials leak via malicious URLs	moderate
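
For anyone triaging the findings above locally, the following is a small sketch that scans a requirements file for exact pins matching the flagged versions. It assumes the file uses `name==version` pins, which this report does not confirm; `FLAGGED` and `find_flagged_pins` are hypothetical names, and the version numbers are copied from the table above rather than from any advisory database.

```python
# Hypothetical helper: flags exact pins that match the versions reported above.
# Assumes "name==version" lines; other specifiers (>=, ~=) are ignored.
FLAGGED = {
    "pillow": "10.2.0",
    "python-multipart": "0.0.6",
    "pypdf": "6.1.1",
    "requests": "2.31.0",
}


def find_flagged_pins(requirements_text: str) -> list[tuple[str, str]]:
    """Return (package, version) pairs that match a flagged pin."""
    hits = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        # Normalize: strip extras like "pkg[extra]", lowercase, unify separators.
        name = name.split("[", 1)[0].strip().lower().replace("_", "-")
        version = version.split(";", 1)[0].strip()  # drop environment markers
        if FLAGGED.get(name) == version:
            hits.append((name, version))
    return hits
```

It could be run as `find_flagged_pins(open("DocSummarization/backend/requirements.txt").read())`. This only matches the exact versions listed; it does not determine which releases contain fixes, so the linked advisories remain the source of truth for upgrade targets.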

License Issues

DocSummarization/backend/requirements.txt

Package	Version	License	Issue Type
Pillow	10.2.0	Null	Unknown License
pypdf	6.1.1	Null	Unknown License

DocSummarization/frontend/package.json

Package	Version	License	Issue Type
lucide-react	^0.294.0	Null	Unknown License

Scanned Files

DocSummarization/backend/requirements.txt
DocSummarization/frontend/package.json

gopal-raj-suresh added 2 commits January 22, 2026 13:14

Add DocSummarization blueprint

9e4085d

Merge branch 'opea-project:main' into cld2labs/doc-summarization

4ea07ec

gopal-raj-suresh requested review from chensuyue, ftian1, lkk12014402, lvliang-intel, minmin-intel and rbrugaro as code owners January 23, 2026 18:17
