
Error: CHUNK SIZE exceeded #614

@enddywu

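1️⃣ Describe the problem
While ingesting a document into the knowledge base, indexing fails because a chunk's token length (1372) exceeds chunk_token_size (1200). How can I increase chunk_token_size?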

wledgebases/kb_a39451d8cf6715564c387cc3d47b1e85/upload/new/3_Layer_depo__1775436088404.xlsx
04-06 08:45:32 INFO init.py:1028: 2026-04-06 08:45:32,702 - lightrag - INFO - Processing d-id: file_aa0769
04-06 08:45:32 WARNING init.py:1028: 2026-04-06 08:45:32,717 - lightrag - WARNING - Chunk split_by_character exceeds token limit: len=1372 limit=1200
04-06 08:45:32 ERROR init.py:1028: 2026-04-06 08:45:32,720 - lightrag - ERROR - Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/lightrag/lightrag.py", line 1848, in process_document
chunking_result = self.chunking_func(
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/lightrag/operate.py", line 121, in chunking_by_token_size
raise ChunkTokenLimitExceededError(
lightrag.exceptions.ChunkTokenLimitExceededError: Chunk token length 1372 exceeds chunk_token_size 1200. Preview: '|----------------------------------|----------------------------------|---------'

04-06 08:45:32 ERROR init.py:1028: 2026-04-06 08:45:32,721 - lightrag - ERROR - Failed to extract document 1/1: http://localhost:9000/knowledgebases/kb_a39451d8cf6715564c387cc3d47b1e85/upload/new/3_Layer_depo__1775436088404.xlsx
04-06 08:45:32 INFO init.py:1028: 2026-04-06 08:45:32,722 - lightrag - INFO - Enqueued document processing pipeline stopped
04-06 08:45:32 ERROR lightrag.py:409: Indexing failed for file_aa0769: LightRAG 实体关系抽取失败: file_id=file_aa0769, status=failed, error=Chunk token length 1372 exceeds chunk_token_size 1200. Preview: '|----------------------------------|----------------------------------|---------'
04-06 08:45:32 DEBUG base.py:770: Removed file file_aa0769 from processing queue
04-06 08:45:32 ERROR knowledge_router.py:595: Index failed for file_aa0769: LightRAG 实体关系抽取失败: file_id=file_aa0769, status=failed, error=Chunk token length 1372 exceeds chunk_token_size 1200. Preview: '|----------------------------------|----------------------------------|---------'
04-06 08:45:33 INFO: 172.18.0.11:33430 - "GET /api/knowledge/databases/kb_a39451d8cf6715564c387cc3d47b1e85 HTTP/1.1" 200 - 14ms
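For context on the traceback above: the warning `Chunk split_by_character exceeds token limit` suggests the document was split on a separator only, and one resulting piece (a wide spreadsheet table) came out at 1372 tokens against the limit of 1200. Besides raising the limit (the LightRAG README lists `chunk_token_size` as a constructor parameter with a default of 1200; whether this application exposes it is not shown in the logs), another option is to re-split oversized pieces by a token window instead of raising an error. Below is a minimal sketch of that fallback idea. It is not LightRAG's actual code: `split_with_fallback` is a hypothetical helper, and whitespace splitting stands in for a real tokenizer.

```python
def split_with_fallback(content, split_by_character, chunk_token_size, overlap=0):
    """Split on a separator, then re-split any oversized piece by token window.

    Hypothetical helper for illustration only. Tokens are approximated by
    whitespace-separated words, which is NOT how a real tokenizer counts.
    """
    if not 0 <= overlap < chunk_token_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_token_size")

    step = chunk_token_size - overlap
    chunks = []
    for piece in content.split(split_by_character):
        tokens = piece.split()
        if not tokens:
            continue  # skip empty pieces between consecutive separators
        if len(tokens) <= chunk_token_size:
            chunks.append(" ".join(tokens))
            continue
        # Oversized piece: slide a window of chunk_token_size tokens over it.
        for start in range(0, len(tokens), step):
            chunks.append(" ".join(tokens[start:start + chunk_token_size]))
            if start + chunk_token_size >= len(tokens):
                break  # the window already reached the end of the piece
    return chunks


# A 1372-token piece (as in the log) followed by a short one.
big_piece = " ".join(f"cell{i}" for i in range(1372))
chunks = split_with_fallback(big_piece + "\n\n" + "short piece", "\n\n", 1200, overlap=100)
sizes = [len(c.split()) for c in chunks]
```

With this approach no chunk exceeds the limit, at the cost of cutting a table in the middle of a row. For spreadsheets, splitting by rows before ingestion may preserve more meaning.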

2️⃣ Error logs

Please run the following commands and provide the relevant part of the logs:

# macOS / Linux
make logs

# Windows
docker logs --tail=100 api-dev
git rev-parse HEAD
Output of make logs:



3️⃣ Screenshots

#️⃣ Other relevant information

✅ If the problem is related to model calls, please try switching to another online model
