fix: feishu block type detection, table rendering & image download #1
Open
dabuddha wants to merge 1 commit into joeseesun:main from
- Replace unreliable `block_type` number matching with key-based detection (`detect_block_kind`), fixing bullet lists rendered as empty code blocks and images being completely lost
- Add table rendering support (`render_table` + `render_cell_content`) for Table and TableCell blocks, which were previously unhandled
- Add image download via the Feishu drive API (`drive:drive:readonly` permission required), with graceful fallback to the `feishu-image://` protocol
- All changes are backward compatible with default parameter values
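
To make the key-based detection concrete, here is a minimal sketch of the idea, not the code from this PR. The function name `detect_block_kind` comes from the description above; the `KNOWN_KINDS` tuple, the `"unknown"` return value, and every key other than `"image"`, `"table"`, and `"bullet"` are assumptions made for illustration.

```python
# Hypothetical sketch of key-based block detection.
# Only "image", "table", and "bullet" are named in the PR description;
# the remaining keys and the "unknown" fallback are assumptions.
KNOWN_KINDS = ("image", "table", "table_cell", "bullet", "ordered", "code", "text")


def detect_block_kind(block: dict) -> str:
    """Return the first known payload key present in the block."""
    for kind in KNOWN_KINDS:
        if kind in block:
            return kind
    return "unknown"


# The numeric block_type is ignored, so a mismatched number does not matter.
blocks = [
    {"block_type": 12, "bullet": {"elements": []}},
    {"block_type": 27, "image": {"token": "abc"}},
    {"block_type": 999, "mystery": {}},
]
kinds = [detect_block_kind(b) for b in blocks]
print(kinds)
```

Because the lookup never reads `block_type`, the same block is classified identically whether the API reports a bullet as `10` or `12`.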
Summary
Fixes three critical issues in `fetch_feishu.py` that cause most document content to be lost or corrupted:
- The script matches hardcoded `block_type` numbers (e.g., `10=Bullet`, `17=Image`) that don't match the actual API responses (where `12=Bullet`, `27=Image`). This causes bullet lists to render as empty code blocks and images to be completely lost.
- Table (`block_type=31`) and TableCell (`block_type=32`) blocks fall through to the `else` branch, producing garbled text instead of proper Markdown tables.
- Images are referenced as `feishu-image://{token}`, a custom protocol that no Markdown renderer can display.

Changes
- `detect_block_kind(block)`: Detects block type by checking actual data keys (`"image"`, `"table"`, `"bullet"`, etc.) instead of relying on `block_type` numbers. This is robust against API version differences.
- `render_table()` + `render_cell_content()`: Renders Feishu Table blocks as proper Markdown tables, including images inside table cells.
- `download_image()` + `download_all_images()`: Downloads document images via the Feishu drive API (`/drive/v1/medias/{token}/download`) and saves them locally. Markdown references use relative local paths.
- `fetch_feishu_doc(save_dir=None)`: New optional `save_dir` parameter. When provided, images are downloaded to `{save_dir}/{title}_images/`.

Backward Compatibility
- `--json` mode behavior is unchanged (no image download)
- Images that are not downloaded keep the `feishu-image://` protocol

Additional Permission Required
Image download requires the Feishu app to have the `drive:drive:readonly` permission enabled. Without it, images will use the fallback `feishu-image://` format (same as before).

Test plan
- Run `python3 scripts/fetch_feishu.py <feishu_url>` on a document with tables and images
- Tables render as Markdown tables (`| col1 | col2 |`)
- Images are saved to the `~/Downloads/{title}_images/` directory
- `--json` mode still works without downloading images
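
For reviewers who want to see roughly what the table check should produce, the sketch below renders a small table, including the image fallback inside a cell. It is an illustration under stated assumptions, not the PR's implementation: the function names match the description, but the input shape (rows of cells, each cell a list of `{"text": ...}` or `{"image": {"token": ...}}` items) and the `local_images` mapping are hypothetical simplifications of the real Feishu block structure.

```python
# Hypothetical sketch: the input shape and the local_images mapping are
# assumptions for illustration, not the Feishu API's actual block layout.
def render_cell_content(items, local_images=None):
    """Render one cell; images use a local path if known, else the fallback URL."""
    local_images = local_images or {}
    parts = []
    for item in items:
        if "image" in item:
            token = item["image"]["token"]
            target = local_images.get(token, f"feishu-image://{token}")
            parts.append(f"![image]({target})")
        elif "text" in item:
            # Escape pipes and flatten newlines so the cell stays on one row.
            parts.append(item["text"].replace("|", "\\|").replace("\n", " "))
    return " ".join(parts)


def render_table(rows, local_images=None):
    """Render rows as a Markdown table, treating the first row as the header."""
    if not rows:
        return ""
    rendered = [[render_cell_content(c, local_images) for c in row] for row in rows]
    width = max(len(r) for r in rendered)
    # Pad short rows so every row has the same number of columns.
    rendered = [r + [""] * (width - len(r)) for r in rendered]
    header, *body = rendered
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join(["---"] * width) + " |",
    ]
    lines += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(lines)


rows = [
    [[{"text": "col1"}], [{"text": "col2"}]],
    [[{"text": "a|b"}], [{"image": {"token": "tok123"}}]],
]
markdown = render_table(rows)
print(markdown)
```

Passing a mapping such as `{"tok123": "doc_images/tok123.png"}` as `local_images` swaps the fallback URL for a relative local path, which mirrors the difference between running with and without the `drive:drive:readonly` permission.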