Skip to content

[Feature]: Support MCP Tools in Live Mode Testing #8

@hertznsk

Description

@hertznsk

Problem Statement

Skillforge's live mode testing does not support SKILLs that use MCP (Model Context Protocol) tools. When running tests in --mode live:

  1. The AI API is called correctly
  2. But MCP tools (e.g., search, getIssue, updateIssue) are not passed to the AI model
  3. SKILLs that depend on MCP tools cannot be tested - the AI cannot execute required tool calls
  4. The AI either returns error or incomplete output

This limits testing to only SKILLs with pure logic/LLM generation, excluding integration-based SKILLs from live testing.

Proposed Solution

The simple solution of adding tools parameter to LLM API calls is insufficient because skillforge does not handle tool calling workflows:

  1. Tools parameter alone - LLM may call tools, but skillforge cannot process tool calls
  2. Multi-turn required - Tool calling requires multiple requests (tool call → execute → result → final answer)
  3. Result extraction - Skillforge only extracts content, not tool_calls

What's needed (not implemented):

Changes required:

  1. Pass tools parameter to LLM API calls
  2. Implement tool calling response processing
  3. Handle multi-turn conversations for tool execution
  4. Add tool execution endpoint for MCP tools

Alternatives Considered

  1. Mock mode only - Works for basic output validation, but cannot test real integration scenarios
  2. Separate integration tests - Requires maintaining separate test infrastructure outside skillforge
  3. Manual testing - Prone to human error, not suitable for CI/CD

Feature Category

Testing

Impact

Nice to have

Additional Context

SKILL examples requiring MCP tools:

  • redmine-duplicate-checker - search, getIssue, updateIssue (Redmine)
  • github-checker - searchIssues, getIssue, createComment (GitHub)

Current state comparison:

Mode Tools Available Real Integration CI/CD Ready
Mock ❌ No ❌ No ✅ Yes
Live (current) ❌ No ❌ No ✅ Yes
Live (proposed) ✅ Yes ✅ Yes ✅ Yes

Impact:

  • Without this: Teams cannot CI/CD test SKILLs with MCP tools, integration bugs only caught in production
  • With this: Complete CI/CD for all SKILL types, early bug detection, single test framework

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions