Add DOCX upload support with local text extraction #23

Open
ganeshmurthy wants to merge 1 commit into rh-ai-quickstart:main from ganeshmurthy:UPLOAD-DOCX

Conversation

@ganeshmurthy

Collaborator

Fix DOCX upload failures by extracting text locally before upload. LlamaStack cannot extract DOCX content server-side, so the frontend now uses python-docx to extract the text and uploads it as a .txt file.

  • Add python-docx dependency
  • Add local_extractors.py module for DOCX text extraction
  • Auto-detect and extract .docx files in upload flow
  • PDF/TXT files still handled server-side (unchanged)

Fixes "Could not extract content from data_url properly" error.
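
For reviewers, the flow described above can be sketched roughly as follows. This is a minimal illustration, not the contents of local_extractors.py: the names `is_docx`, `extract_docx_text`, and `prepare_upload` are hypothetical, and it assumes that only body paragraphs need to be extracted (tables, headers, and footers are skipped).

```python
import io
from pathlib import PurePath


def is_docx(filename: str) -> bool:
    """Return True when the filename ends in .docx (case-insensitive)."""
    return PurePath(filename).suffix.lower() == ".docx"


def extract_docx_text(data: bytes) -> str:
    """Extract body paragraph text from DOCX bytes using python-docx."""
    # Imported lazily so this module still loads if python-docx is missing.
    from docx import Document

    document = Document(io.BytesIO(data))
    # Assumption: non-empty body paragraphs are enough for this sketch.
    return "\n".join(p.text for p in document.paragraphs if p.text.strip())


def prepare_upload(filename: str, data: bytes) -> tuple[str, bytes]:
    """Return the (filename, payload) pair to send to the server."""
    if is_docx(filename):
        text = extract_docx_text(data)
        txt_name = PurePath(filename).with_suffix(".txt").name
        return txt_name, text.encode("utf-8")
    # PDF/TXT files pass through unchanged and are handled server-side.
    return filename, data
```

With this shape, `prepare_upload("report.docx", raw_bytes)` would yield a `report.txt` payload of UTF-8 text, while PDF and TXT uploads are returned untouched.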

@ganeshmurthy ganeshmurthy requested a review from sauagarwa June 15, 2026 14:22