Current Claude Opus 4.5 degradation is not being reflected

There are a few possibilities. Among them: the degradation is unevenly distributed (affecting some accounts/keys and not others), or the provider is gaming your tests while degrading elsewhere.

What is the solution to this? Perhaps randomized, distributed tests generated from first principles, rather than static tests that are publicly available as open source? I haven't looked into how you currently handle this adversarial possibility, if at all.
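To make the idea concrete, here is a minimal sketch of what a procedurally generated test could look like. Everything in it is an assumption for illustration, not a description of your current harness: the names `make_task`, `make_suite`, and `score` are hypothetical, and simple modular arithmetic stands in for whatever task family you would actually want to measure. The point is only that prompts and ground-truth answers are derived from a seed at run time, so there is no fixed public test set to recognize or special-case.

```python
import random


def make_task(rng: random.Random) -> tuple[str, int]:
    """Build one fresh arithmetic prompt and its ground-truth answer."""
    a = rng.randint(10, 999)
    b = rng.randint(10, 999)
    c = rng.randint(10, 999)  # modulus is always >= 10, so never zero
    op = rng.choice(["+", "*"])
    inner = a + b if op == "+" else a * b
    prompt = f"Compute ({a} {op} {b}) mod {c}. Reply with the integer only."
    return prompt, inner % c


def make_suite(seed: int, n: int = 20) -> list[tuple[str, int]]:
    """Derive a whole suite from one seed; the same seed reproduces the same suite."""
    rng = random.Random(seed)
    return [make_task(rng) for _ in range(n)]


def score(suite: list[tuple[str, int]], answers: list[int]) -> float:
    """Fraction of answers that match the generated ground truth."""
    correct = sum(1 for (_, expected), got in zip(suite, answers) if got == expected)
    return correct / len(suite)
```

In use, each run would draw a fresh seed (for example with `secrets.randbits(64)`), keep it private until the run finishes, and then publish it so anyone can regenerate the suite and audit the score. Running the same generator from several independent accounts/keys would also help surface the unevenly distributed case.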
