Optimize image caption flow to skip pre-success failed images and reduce redundant LLM calls #3

Merged
nbnbnd merged 4 commits into main from copilot/update-image-caption-logic on May 18, 2026
Copilot AI commented May 18, 2026

Image captioning currently retries every image in a message, including images that have repeatedly failed before. This change adds failure-aware skip logic: an image that failed before the latest successful caption in the same processing window is not retried.

  • Failure-state caching in ImageCacheManager

    • Added failure record storage (failure_cache) keyed by image hash.
    • Added APIs to record/query/clear failures and decide skip eligibility:
      • set_failed(image)
      • get_failed_timestamp(image)
      • clear_failed(image)
      • should_skip_failed_image(image, latest_success_timestamp, window_seconds)
    • Extended disk persistence format to store both caption cache and failure cache, with backward compatibility for existing cache files.
  • Caption generation control in ImageCaptionUtils

    • generate_image_caption(...) now accepts latest_success_timestamp.
    • Added a pre-call skip gate for known failed images whose failure precedes the latest success and falls within the configured window.
    • Failure paths (timeout, exception, empty result) now write a failure record; a successful caption clears the failure record.
    • Added centralized config parsing for the skip window (get_failed_image_skip_window_seconds).
  • Per-message success tracking in MessageUtils

    • outline_message_list(...) now tracks the latest successful image-caption timestamp while iterating message components.
    • That timestamp is passed into caption generation so earlier failed images in the same flow can be skipped after a success occurs.

```python
# simplified skip rule used by caption flow
if ImageCacheManager.should_skip_failed_image(
    image,
    latest_success_timestamp=latest_success_ts,
    window_seconds=skip_window_seconds,
):
    return None  # skip redundant LLM caption attempt
```

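
To make the failure-cache bullets above more concrete, here is a minimal sketch of how such a cache might behave. It is not the merged code: the class name `FailureAwareImageCache`, the `dumps`/`loads` helpers, the JSON layout, the SHA-256 keying of a string, and the optional `timestamp` argument on `set_failed` are all assumptions made for illustration. The skip rule shown (failure strictly before the latest success, and no further back than `window_seconds`) is one plausible reading of the description.

```python
import hashlib
import json
import time
from typing import Dict, Optional


class FailureAwareImageCache:
    """Illustrative stand-in for ImageCacheManager (hypothetical, not the PR's code)."""

    def __init__(self) -> None:
        self.caption_cache: Dict[str, str] = {}    # image hash -> caption
        self.failure_cache: Dict[str, float] = {}  # image hash -> last failure time

    @staticmethod
    def _key(image: str) -> str:
        # Assumption: an image is identified by hashing its URL or encoded content.
        return hashlib.sha256(image.encode("utf-8")).hexdigest()

    def set_failed(self, image: str, timestamp: Optional[float] = None) -> None:
        # The optional timestamp is an addition here so the sketch is testable.
        self.failure_cache[self._key(image)] = (
            time.time() if timestamp is None else timestamp
        )

    def get_failed_timestamp(self, image: str) -> Optional[float]:
        return self.failure_cache.get(self._key(image))

    def clear_failed(self, image: str) -> None:
        self.failure_cache.pop(self._key(image), None)

    def should_skip_failed_image(
        self,
        image: str,
        latest_success_timestamp: Optional[float],
        window_seconds: float,
    ) -> bool:
        failed_at = self.get_failed_timestamp(image)
        if failed_at is None or latest_success_timestamp is None:
            return False  # no failure on record, or no success yet: try the image
        # Assumed rule: skip only failures that precede the latest success
        # and lie within the configured window before it.
        return (
            failed_at < latest_success_timestamp
            and latest_success_timestamp - failed_at <= window_seconds
        )

    def dumps(self) -> str:
        # Assumed on-disk layout: both caches in one JSON object.
        return json.dumps(
            {"captions": self.caption_cache, "failures": self.failure_cache}
        )

    @classmethod
    def loads(cls, raw: str) -> "FailureAwareImageCache":
        data = json.loads(raw)
        cache = cls()
        if isinstance(data.get("captions"), dict):
            cache.caption_cache = dict(data["captions"])
            cache.failure_cache = dict(data.get("failures", {}))
        else:
            # Backward compatibility: an older file holding only hash -> caption.
            cache.caption_cache = dict(data)
        return cache
```

With this reading, a failure recorded at t=100 is skipped when the latest success is at t=130 and the window is 60 seconds, but is retried when no success has happened yet or when the success is far later than the window allows.
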
Original prompt

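Problem description

The current image captioning logic attempts a caption for every message that contains images, even when some of those images have failed captioning before. Images known to fail are therefore retried again and again, causing unnecessary LLM calls and wasted resources.

Requirements

Adjust the image captioning logic so that, when processing multiple images in one message:

  • When an image is captioned successfully, that success is recorded
  • From then on, among the images in that message sent before this success, those that previously failed captioning are skipped and not attempted again
  • This prevents images that keep failing from being retried over and over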

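Implementation plan

  1. Extend the ImageCacheManager cache management system:

    • Add a "failure record" feature that records images whose captioning failed
    • Support querying whether a given image has failed captioning before
    • Support clearing failure records
  2. Modify the ImageCaptionUtils.generate_image_caption() method:

    • Before attempting a caption, check whether the image is recorded as failed
    • If it is a failed image and its failure is close in time to the latest successful caption, skip captioning
  3. Modify the MessageUtils.outline_message_list() method:

    • Track the timestamp of the most recent successful image caption in the message
    • When processing subsequent images, skip any image that is a failed image and predates the most recent successful caption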
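
The three steps of this plan can be sketched together as a single loop. This is a simplified illustration under stated assumptions, not the repository's code: `caption_images`, `captioner`, `failed_at`, and `clock` are hypothetical names, images are represented as plain strings, and the failure store is an ordinary dict rather than the real ImageCacheManager.

```python
from typing import Callable, Dict, List, Optional, Tuple


def caption_images(
    images: List[str],
    captioner: Callable[[str], Optional[str]],  # hypothetical LLM caption call
    failed_at: Dict[str, float],                # image -> last failure timestamp
    clock: Callable[[], float],                 # injected so the flow is testable
    window_seconds: float = 3600.0,             # assumed default, not from the PR
) -> Tuple[Dict[str, str], List[str]]:
    """Caption images in order, skipping known failures once a success exists."""
    captions: Dict[str, str] = {}
    skipped: List[str] = []
    latest_success_ts: Optional[float] = None

    for image in images:
        ts = failed_at.get(image)
        # Pre-call gate: a recorded failure that precedes the latest success
        # (and is within the window) is not sent to the captioner again.
        if (
            ts is not None
            and latest_success_ts is not None
            and ts < latest_success_ts
            and latest_success_ts - ts <= window_seconds
        ):
            skipped.append(image)
            continue

        try:
            caption = captioner(image)
        except Exception:
            caption = None  # timeouts and errors are treated like an empty result

        if caption:
            captions[image] = caption
            failed_at.pop(image, None)   # a success clears the failure record
            latest_success_ts = clock()  # track the most recent success
        else:
            failed_at[image] = clock()   # record (or refresh) the failure

    return captions, skipped
```

Note that in this sketch the gate only takes effect after the first success in the loop, so a message whose images all fail is still attempted in full each time.
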

Expected outcome

  • Fewer repeated caption attempts on images known to fail
  • Fewer LLM API calls, saving resources
  • Better overall performance

The following is the prior conversation context from the user's chat exploration (may be truncated):

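User: Adjust the image recognition logic so that images that failed captioning before the most recent successful caption are not captioned again. This is to avoid images that keep failing to be captioned.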

Assistant: I'll help you with adjusting the image recognition logic in your repository. Let me first explore the codebase to understand the current implementation.

Assistant: Now let me look at the message handling and the call flow for image captioning:

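Assistant: I now understand your requirement. You want to adjust the image captioning logic so that, after the most recent successful caption, earlier images that failed captioning are skipped and not attempted again. This is to avoid certain images failing over and over.

Let me create a PR for you that implements this optimization: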

Assistant: I'm waiting for your confirmation to proceed with creating a pull request for this optimization.

This pull request was created from Copilot chat.

Copilot AI changed the title from "[WIP] Update image caption logic to skip failed attempts" to "Optimize image caption flow to skip pre-success failed images and reduce redundant LLM calls" May 18, 2026
Copilot finished work on behalf of nbnbnd May 18, 2026 04:29
Copilot AI requested a review from nbnbnd May 18, 2026 04:29
Copilot finished work on behalf of nbnbnd May 18, 2026 04:32
@nbnbnd nbnbnd marked this pull request as ready for review May 18, 2026 04:32
Copilot finished work on behalf of nbnbnd May 18, 2026 04:34
@nbnbnd nbnbnd merged commit 0c12bf7 into main May 18, 2026
2 checks passed