[WIP] feat(llm): migrate to official SDKs and add use_max_completion_tokens…#42

Open
stay-foolish-forever wants to merge 1 commit into alibaba:main from stay-foolish-forever:feat/llm-sdk-integration

@stay-foolish-forever
Contributor

Description

Migrate LLM client from custom HTTP implementation to official SDKs:

  • github.com/openai/openai-go v3.38.0
  • github.com/anthropics/anthropic-sdk-go v1.46.0

Key changes in this branch:

  1. SDK Migration (refactor)

    • Use SDK native streaming with RawJSON() for format preservation
    • ExtraBody support via option.WithJSONSet
    • Custom User-Agent headers via option.WithHeader
    • Retry logic delegated to SDK (option.WithMaxRetries)
    • StreamCompletion now accepts context.Context for cancellation
    • URL normalization reversed: endpoint suffixes are now stripped from the configured base URL, because the SDKs append them
    • Anthropic tool schema includes required field constraints
    • OpenAI streaming: include_usage enabled for token stats
    • Remove unused resolveUsage/probePath from usage_resolver.go
  2. Token Usage Fix

    • Compute totalTokensUsed as input+output instead of relying on API TotalTokens, which some providers report inconsistently
  3. use_max_completion_tokens Config Option (feat)

    • Add UseMaxCompletionTokens field to ResolvedEndpoint and ClientConfig
    • Support config file field: llm.use_max_completion_tokens
    • Support environment variable: OCR_USE_MAX_COMPLETION_TOKENS
    • Wire into CLI config set command for user toggle
    • Default to max_tokens for backward compatibility
  4. Test Coverage

    • Comprehensive unit tests for client.go (message conversion, utility functions, constructors, streaming response handling)
    • Unit tests for setConfigValue covering all config keys (30 cases)
  5. Documentation

    • Add llm.use_max_completion_tokens config key and OCR_USE_MAX_COMPLETION_TOKENS env var to both EN and ZH READMEs
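
To illustrate the Token Usage Fix above, here is a minimal sketch of summing input and output tokens rather than trusting a provider-reported total. The `Usage` type, its field names, and the `totalTokensUsed` helper are hypothetical and only stand in for whatever the client actually uses.

```go
package main

import "fmt"

// Usage is a hypothetical stand-in for the token counts a provider reports.
// The type and field names are assumptions, not taken from this PR.
type Usage struct {
	InputTokens  int64
	OutputTokens int64
	TotalTokens  int64 // provider-reported; may be missing or inconsistent
}

// totalTokensUsed ignores the provider-reported total and sums the parts.
func totalTokensUsed(u Usage) int64 {
	return u.InputTokens + u.OutputTokens
}

func main() {
	// A provider that leaves TotalTokens at zero still yields a usable figure.
	u := Usage{InputTokens: 120, OutputTokens: 30, TotalTokens: 0}
	fmt.Println(totalTokensUsed(u)) // prints 150
}
```

Summing the two counts keeps the figure consistent across providers, at the cost of ignoring anything a provider might fold into its own total.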

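The use_max_completion_tokens option described above could be wired roughly as follows. This is a sketch under assumptions, not the code in this branch: the helper names (`useMaxCompletionTokensFromEnv`, `tokenLimitField`, `buildRequestBody`) are hypothetical, and parsing the environment variable with `strconv.ParseBool` is a guess at the accepted values. Only the variable name OCR_USE_MAX_COMPLETION_TOKENS and the default of max_tokens come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strconv"
)

// useMaxCompletionTokensFromEnv reads OCR_USE_MAX_COMPLETION_TOKENS.
// Assumption: values are parsed with strconv.ParseBool, and an unset or
// unparsable value falls back to false, i.e. the max_tokens default.
func useMaxCompletionTokensFromEnv() bool {
	v, ok := os.LookupEnv("OCR_USE_MAX_COMPLETION_TOKENS")
	if !ok {
		return false
	}
	enabled, err := strconv.ParseBool(v)
	if err != nil {
		return false
	}
	return enabled
}

// tokenLimitField returns the request field that carries the output limit.
func tokenLimitField(useMaxCompletionTokens bool) string {
	if useMaxCompletionTokens {
		return "max_completion_tokens"
	}
	return "max_tokens"
}

// buildRequestBody is a hypothetical helper that places the limit under the
// selected field name and serializes the request as JSON.
func buildRequestBody(model string, limit int, useMaxCompletionTokens bool) ([]byte, error) {
	body := map[string]any{
		"model":                                 model,
		tokenLimitField(useMaxCompletionTokens): limit,
	}
	return json.Marshal(body)
}

func main() {
	body, err := buildRequestBody("example-model", 1024, useMaxCompletionTokensFromEnv())
	if err != nil {
		fmt.Fprintln(os.Stderr, "marshal failed:", err)
		os.Exit(1)
	}
	fmt.Println(string(body))
}
```

Keeping the field choice in one small function means the default stays max_tokens unless the flag is explicitly enabled, which matches the backward-compatibility goal stated above.
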
Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring (no functional changes)
  • Documentation update
  • CI / Build / Tooling

How Has This Been Tested?

  • make test passes locally
  • Manual testing (describe below)
    Tested against several LLM providers to confirm it functions correctly

Checklist

  • My code follows the project's coding style (go fmt, go vet)
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or my feature works
  • New and existing unit tests pass locally with my changes
  • I have updated the documentation accordingly (if applicable)
  • I have signed the CLA

Related Issues

Issue #30
