[bug] Classifier failures always fail open with no strict policy option

## What went wrong

The backend maps every classifier failure to a synthetic `M0` / `benign` verdict. This covers transport errors, non-2xx responses, malformed JSON, empty choices, and responses with no parseable MAD code.

That may be acceptable for availability-first deployments, but Adrian does not currently expose a strict or fail-closed option for users who want BLOCK/HITL mode to stop execution when classification cannot be completed safely.

There also appears to be a mismatch between the `Classifier` interface comment and the current implementation.

Relevant code:

- `backend/internal/engine/client.go`: `failOpen`
- `backend/internal/engine/engine.go`: `Classifier` interface comment
- `backend/internal/ws/handler.go`: `persistAndClassify`

## Reproduction steps

1. Configure Adrian policy mode as `block`, with M3/M4 in scope.
2. Make the classifier return malformed JSON, an empty choices array, or a response with no MAD code.
3. Send an SDK event that requires classification before tool execution.
4. Observe that the backend records a synthetic `M0` / `benign` verdict.
5. Observe that the SDK allows execution because the verdict is not in scope for blocking.
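
For step 2, a stub classifier is probably the easiest way to force each failure. The following is a minimal sketch, not part of Adrian. It assumes the classifier is reached over HTTP and returns an OpenAI-style body with a `choices` array; the failure labels, `stubHandler`, and `probe` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// failureBodies maps a failure label (hypothetical) to the body the stub returns.
// The JSON shape is an assumption: an OpenAI-style "choices" array.
var failureBodies = map[string]string{
	"malformed_json": `{"choices": [`,
	"empty_choices":  `{"choices": []}`,
	"no_mad_code":    `{"choices": [{"message": {"content": "no verdict here"}}]}`,
}

// stubHandler answers every request with HTTP 200 and the chosen failure body.
// Unknown labels fall back to the malformed JSON body.
func stubHandler(kind string) http.Handler {
	body, ok := failureBodies[kind]
	if !ok {
		body = failureBodies["malformed_json"]
	}
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_, _ = io.WriteString(w, body)
	})
}

// probe starts a throwaway stub, sends one request, and returns the status
// code and body that a classifier client would receive.
func probe(kind string) (int, string, error) {
	srv := httptest.NewServer(stubHandler(kind))
	defer srv.Close()

	resp, err := http.Post(srv.URL, "application/json", nil)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	raw, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(raw), err
}

func main() {
	for _, kind := range []string{"malformed_json", "empty_choices", "no_mad_code"} {
		status, body, err := probe(kind)
		fmt.Println(kind, status, body, err)
	}
}
```

For a long-lived stub, the same handler can be passed to `http.ListenAndServe` on a local port and the backend's classifier URL pointed at it. How that URL is configured is not covered in this issue.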

## Expected behaviour

Adrian should support an explicit strict or fail-closed policy for high-assurance deployments.

When strict mode is enabled:

- Classifier failure in `block` mode should halt execution.
- Classifier failure in `hitl` mode should queue or hold for review.
- The dashboard should still record enough reasoning to show that classification failed.

The default can remain fail-open if that is the intended availability posture.
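
The expected behaviour could be expressed as a small decision function. This is a sketch under assumptions, not the current implementation: `Mode`, `Action`, `Outcome`, and `onClassifierFailure` are hypothetical names, and the reasoning strings only mirror the style of the existing `failOpen` message.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode is the policy mode; the values match the modes named in this issue.
type Mode string

const (
	ModeAlert Mode = "alert"
	ModeHITL  Mode = "hitl"
	ModeBlock Mode = "block"
)

// Action is a hypothetical result the backend would hand to the SDK.
type Action string

const (
	ActionAllow Action = "allow"
	ActionHold  Action = "hold" // queue for human review
	ActionHalt  Action = "halt" // stop execution
)

// Outcome carries the action plus reasoning for the dashboard.
type Outcome struct {
	Action    Action
	Reasoning string
}

// errDemo stands in for a real classifier error in the examples.
var errDemo = errors.New("malformed JSON")

// onClassifierFailure decides what happens when classification cannot complete.
// With strict disabled, or in any mode other than hitl/block, it keeps the
// fail-open behaviour described in this issue.
func onClassifierFailure(mode Mode, strict bool, cause error) Outcome {
	if strict {
		reason := "classifier failure (fail-closed): " + cause.Error()
		switch mode {
		case ModeHITL:
			return Outcome{Action: ActionHold, Reasoning: reason}
		case ModeBlock:
			return Outcome{Action: ActionHalt, Reasoning: reason}
		}
	}
	return Outcome{Action: ActionAllow, Reasoning: "classifier failure (fail-open): " + cause.Error()}
}

func main() {
	for _, m := range []Mode{ModeAlert, ModeHITL, ModeBlock} {
		fmt.Println(m, onClassifierFailure(m, true, errDemo).Action)
	}
}
```

Keeping the failure reason in `Reasoning` in every branch means the dashboard can still show that classification failed, whichever action is taken.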

## Actual behaviour

Classifier failures currently become `M0` / `benign`.

In BLOCK mode, that verdict is treated as allow unless M0 is explicitly in scope.

## Environment

- Adrian version / commit: current `main`
- OS: not expected to be OS-specific
- Docker version: not required to reproduce
- GPU model: not required to reproduce

## Logs

<details>
<summary>Relevant code path</summary>

Classifier failures are converted to `M0` / `benign`:

```go
func (c *HTTPClient) failOpen(ctx context.Context, cause error, start time.Time) *Verdict {
    slog.WarnContext(ctx, "engine.classifier_failure_fail_open", "error", cause)
    return &Verdict{
        MADCode:        "M0",
        Classification: "benign",
        Reasoning:      "classifier failure (fail-open): " + cause.Error(),
        LatencyMS:      time.Since(start).Milliseconds(),
    }
}
```
</details>


## Suggested fix

Add an explicit strict or fail-closed policy option.

Possible shape:

- Add a server-side policy field such as `fail_closed_on_classifier_error`.
- Include it in the policy snapshot sent to the SDK.
- On classifier failure:
  - In `alert` mode, record the failure as today.
  - In `hitl` mode, hold or queue the action for review.
  - In `block` mode, send a blocking verdict or equivalent policy result.
- Add tests for classifier failure under `alert`, `hitl`, and `block` modes.
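
One possible shape for the policy field, as a sketch only. The `PolicySnapshot` struct and its `mode` and `in_scope` fields are hypothetical and do not reflect the real snapshot schema; only the `fail_closed_on_classifier_error` name comes from the suggestion above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PolicySnapshot is a hypothetical stand-in for the snapshot sent to the SDK.
type PolicySnapshot struct {
	Mode    string   `json:"mode"`
	InScope []string `json:"in_scope"`
	// FailClosedOnClassifierError is the proposed field. A snapshot that
	// omits it decodes to false, so the default stays fail-open.
	FailClosedOnClassifierError bool `json:"fail_closed_on_classifier_error"`
}

// parseSnapshot decodes a JSON snapshot into the struct above.
func parseSnapshot(raw string) (PolicySnapshot, error) {
	var p PolicySnapshot
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

func main() {
	legacy, _ := parseSnapshot(`{"mode":"block","in_scope":["M3","M4"]}`)
	strict, _ := parseSnapshot(`{"mode":"block","in_scope":["M3","M4"],"fail_closed_on_classifier_error":true}`)
	fmt.Println(legacy.FailClosedOnClassifierError, strict.FailClosedOnClassifierError) // false true
}
```

Using a plain boolean that defaults to `false` keeps existing deployments on today's behaviour unless they opt in.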

