bug: contains grader doc claims case-insensitive default but implementation is case-sensitive #1154

@christso

Objective

plugins/agentv-dev/skills/agentv-bench/agents/grader.md:42 documents:

contains | Check if response includes the value substring (case-insensitive by default)

The implementation at packages/core/src/evaluation/graders/assertions.ts:14-25 does:

export function runContainsAssertion(output: string, value: string): AssertionResult {
  const passed = output.includes(value);  // case-sensitive — no toLowerCase
  ...
}

contains-any (assertions.ts:28-45) and contains-all (assertions.ts:48-63) are also case-sensitive (raw .includes()). The icontains* family at assertions.ts:68-123 explicitly lowercases both sides — which only makes sense as a variant if the bare contains* functions are case-sensitive.

So grader.md:42 is both factually wrong and internally inconsistent with the icontains* entries at grader.md:45.
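The difference between the two families can be sketched as follows. This is a minimal illustration, not the code in assertions.ts: `containsSketch`, `icontainsSketch`, and the `SketchResult` shape are hypothetical stand-ins, and the real `AssertionResult` type may carry more fields than the `passed` and `score` assumed here.

```typescript
// Hypothetical result shape; the real AssertionResult may differ.
interface SketchResult {
  passed: boolean;
  score: number;
}

// Mirrors the raw .includes() behaviour described above: case-sensitive.
function containsSketch(output: string, value: string): SketchResult {
  const passed = output.includes(value);
  return { passed, score: passed ? 1 : 0 };
}

// Mirrors the icontains* approach: lowercase both sides, then compare.
function icontainsSketch(output: string, value: string): SketchResult {
  const passed = output.toLowerCase().includes(value.toLowerCase());
  return { passed, score: passed ? 1 : 0 };
}

console.log(containsSketch("Hello, world!", "hello").passed);  // false
console.log(icontainsSketch("Hello, world!", "hello").passed); // true
```

With the bare variant, "Hello, world!" does not match "hello", which is the behavior the doc entry currently misdescribes.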

Reproducer

tests:
  - id: t
    input: test
    assertions:
      - name: has_hello
        type: contains
        value: hello

Response "Hello, world!" → assertion fails. The auto-generated failure text "Output does not contain \"hello\"" comes from assertions.ts:20 and is a grep anchor for the case-sensitive branch.

Design latitude

  1. Fix the doc (recommended): update grader.md:42 to state that contains is case-sensitive by default, and direct users to icontains* for case-insensitive matching. This aligns with the existing icontains convention.
  2. Fix the implementation: make contains* case-insensitive by default (change assertions.ts:15, :32, :52). This is a breaking change; any eval relying on case-sensitive contains would start passing incorrectly.

Option 1 is the YAGNI path unless there's concrete evidence users expected case-insensitive behavior from bare contains. icontains* already covers the case-insensitive use case.

Acceptance signals

  • grader.md:42-44 accurately describes the case-sensitivity of contains, contains-any, and contains-all (case-sensitive if Option 1 is chosen).
  • Regression test in packages/core/test/evaluation/graders/ pinning the chosen behavior, e.g. for Option 1: expect(runContainsAssertion("Hello", "hello").score).toBe(0) and expect(runContainsAssertion("hello", "hello").score).toBe(1).
  • No other skill/doc file claims contains* is case-insensitive.
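The last signal could be checked mechanically with something like the sketch below. `findStaleContainsClaims` is a hypothetical helper, not an existing script in the repo; it takes the text of a doc file and flags lines that mention a bare contains* grader together with "case-insensitive". It is a heuristic: a line that describes icontains in terms of contains would also be flagged, so hits still need a manual look.

```typescript
// Hypothetical helper: one flagged doc line.
interface StaleClaim {
  lineNumber: number; // 1-based line number within the doc text
  text: string;
}

function findStaleContainsClaims(docText: string): StaleClaim[] {
  // Matches contains, contains-any, contains-all, but not icontains*
  // (the preceding character must not be a letter).
  const bareContains = /(^|[^a-z])contains(-any|-all)?\b/i;
  const claim = /case-insensitive/i;
  return docText
    .split("\n")
    .map((text, i) => ({ lineNumber: i + 1, text }))
    .filter(({ text }) => bareContains.test(text) && claim.test(text));
}

const sampleDoc = [
  "contains | Check if response includes the value substring (case-insensitive by default)",
  "icontains | Case-insensitive substring check",
  "equals | Exact match",
].join("\n");

console.log(findStaleContainsClaims(sampleDoc).map((c) => c.lineNumber)); // [ 1 ]
```

Reading the files from disk and walking the skills directory is left out on purpose, since the repo layout beyond the paths cited in this issue is not known here.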

Non-goals

  • equals (assertions.ts:196), starts-with (:126), ends-with (:140) are all case-sensitive and their grader.md:46-49 entries do not claim otherwise — explicitly out of scope.
  • regex case flags are handled via the flags parameter, so they are out of scope.
