bazaarvoice · loukratz-bv · Feb 19, 2026 · Feb 13, 2026 · Feb 16, 2026 · Feb 17, 2026
diff --git a/.gitignore b/.gitignore
@@ -225,4 +225,7 @@ ignored
 
 qdrant_storage
 
-.qtype-cache
+.qtype-cache
+
+# SSL certificates (combined bundle with corporate certs)
+certs/*.pem
diff --git a/.specify/memory/constitution.md b/.specify/memory/constitution.md
@@ -0,0 +1,164 @@
+<!--
+Sync Impact Report
+===================
+Version change: N/A → 1.0.0
+Modified principles: N/A (initial ratification)
+Added sections:
+  - Core Principles (5 principles)
+  - Technology Stack & Standards
+  - Development Workflow
+  - Governance
+Removed sections: N/A
+Templates requiring updates:
+  - .specify/templates/plan-template.md ✅ no changes needed
+    (Constitution Check section is dynamically filled at plan time)
+  - .specify/templates/spec-template.md ✅ no changes needed
+    (generic template, no constitution-specific references)
+  - .specify/templates/tasks-template.md ✅ no changes needed
+    (generic template, no constitution-specific references)
+  - .specify/templates/commands/ ✅ no command files present
+Follow-up TODOs: none
+-->
+
+# QType Constitution
+
+## Core Principles
+
+### I. Code Quality
+
+All code MUST pass the project's automated quality gates before
+merge. This is non-negotiable.
+
+- Every function, class, and module MUST have type hints and
+  docstrings.
+- All Python code MUST pass `ruff`, `ty`, and `isort` checks with
+  zero violations.
+- Logging MUST be used instead of print statements for all
+  debug, info, and error output.
+- Catch specific exceptions — bare `except` clauses are forbidden.
+- Tests MUST accompany new functionality; untested code is
+  incomplete code.
+
+**Rationale**: Automated enforcement eliminates style debates and
+catches defects early. Consistent quality makes the codebase
+navigable by any contributor.
+
+### II. Clean Design Patterns & Abstractions
+
+Architecture MUST follow the established layered dependency flow:
+CLI → Application → Interpreter → Semantic → DSL → Base.
+
+- Each layer MUST only import from layers below it; never upward
+  or sideways.
+- Classes and functions MUST have a single, clear responsibility.
+- Use composition over inheritance; prefer small, focused
+  interfaces.
+- Data structures MUST use Pydantic `BaseModel` for validation
+  and serialization.
+- Shared utilities belong in the Base layer; domain logic belongs
+  in its respective layer.
+
+**Rationale**: Strict layering prevents entanglement and makes each
+component independently testable and replaceable.
+
+### III. Concise, Minimal & Readable Code
+
+Code MUST communicate intent clearly with the least amount of
+syntax necessary.
+
+- Prefer explicit over implicit — but never verbose over concise.
+- Use f-strings, comprehensions, and pattern matching where they
+  improve clarity.
+- Lines MUST NOT exceed 79 characters unless breaking would harm
+  readability.
+- Functions SHOULD fit on a single screen (~30 lines); extract
+  when they grow beyond that.
+- Comments MUST explain *why*, not *what* — the code itself MUST
+  explain the what.
+- Remove dead code immediately; do not comment it out.
+
+**Rationale**: Code is read far more often than it is written.
+Minimalism reduces cognitive load and accelerates onboarding.
+
+### IV. Async-First Execution
+
+All I/O-bound operations MUST use asynchronous execution via
+`async`/`await`.
+
+- Network calls, file I/O, and external process invocations MUST
+  be async.
+- Use `asyncio` primitives (`gather`, semaphores, and, on
+  Python 3.11+, `TaskGroup`) for concurrency control.
+- CPU-bound work SHOULD remain synchronous unless it blocks an
+  event loop, in which case offload to a thread/process executor.
+- Async functions MUST NOT call blocking synchronous I/O without
+  wrapping in `asyncio.to_thread`.
+- Prefer structured concurrency (`TaskGroup`, Python 3.11+) over
+  bare `create_task` for proper error propagation.
+
+**Rationale**: QType orchestrates LLM calls, tool invocations, and
+document retrieval — all inherently I/O-bound. Async execution
+maximizes throughput without thread complexity.
+
+### V. Simplicity — YAGNI, DRY, No Over-Engineering
+
+Build only what is needed *right now*. Eliminate duplication.
+Resist speculative abstractions.
+
+- Do NOT add code, parameters, or abstractions for hypothetical
+  future requirements (YAGNI).
+- Extract shared logic into a single source of truth when it
+  appears in two or more places (DRY).
+- Prefer flat, straightforward control flow over clever patterns.
+- Optimization MUST be driven by measured bottlenecks, never by
+  assumption.
+- If a simpler solution solves the problem, it is the correct
+  solution — complexity MUST be justified.
+
+**Rationale**: Over-engineering is the leading cause of
+unmaintainable code. Simplicity keeps velocity high and defect
+rates low.
+
+## Technology Stack & Standards
+
+- **Language**: Python 3.10 — use all 3.10 features (union `|`
+  types, `match`/`case`, built-in generics).
+- **Package manager**: `uv` — all commands run via `uv run`.
+- **Linting**: `ruff` (line-length 79, target py310).
+- **Type checking**: `ty` with strict annotations.
+- **Import sorting**: `isort` with standard/third-party/local
+  grouping.
+- **Data models**: Pydantic `BaseModel` for all structured data.
+- **UI framework**: React + TypeScript with shadcn component
+  library (functional components, hooks only).
+- **Testing**: `pytest` via `uv run pytest`.
+
+## Development Workflow
+
+- All changes MUST be validated against `ruff`, `ty`, and `isort`
+  before committing.
+- Use feature branches; merge only after CI passes.
+- Code review MUST verify adherence to these principles.
+- Follow the project's copilot-instructions for detailed style
+  rules (`.github/copilot-instructions.md`).
+- Use `logging` for runtime diagnostics; structured logs
+  preferred.
+
+## Governance
+
+This constitution is the authoritative source of engineering
+standards for QType. It supersedes informal conventions and
+personal preferences.
+
+- **Amendments** require documentation of the change, rationale,
+  and a version bump to this file.
+- **Versioning** follows semantic versioning:
+  - MAJOR: principle removal or incompatible redefinition.
+  - MINOR: new principle or materially expanded guidance.
+  - PATCH: clarifications, wording, or typo fixes.
+- **Compliance** is verified during code review; every PR MUST
+  be checked against these principles.
+- **Complexity justification**: any deviation from Principle V
+  MUST be documented inline with a rationale.
+
+**Version**: 1.0.0 | **Ratified**: 2026-02-13 | **Last Amended**: 2026-02-13
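
Principle IV above can be illustrated with a minimal sketch. It assumes a hypothetical blocking helper, `blocking_fetch`, standing in for any synchronous I/O call; none of these names come from the QType codebase. Because `asyncio.TaskGroup` only exists from Python 3.11 onward, the sketch uses `gather` with a semaphore, which also runs on the Python 3.10 target named in the constitution.

```python
import asyncio
import time


def blocking_fetch(name: str) -> str:
    """Hypothetical stand-in for a blocking I/O call."""
    time.sleep(0.01)
    return f"result:{name}"


async def fetch(name: str, limit: asyncio.Semaphore) -> str:
    """Run the blocking call in a worker thread, bounded by `limit`."""
    async with limit:
        return await asyncio.to_thread(blocking_fetch, name)


async def fetch_all(names: list[str], max_concurrent: int = 2) -> list[str]:
    """Fetch every name concurrently; results keep the input order."""
    limit = asyncio.Semaphore(max_concurrent)
    return await asyncio.gather(*(fetch(n, limit) for n in names))


results = asyncio.run(fetch_all(["a", "b", "c"]))
```

The semaphore caps how many worker threads are busy at once, and `gather` returns results in the order the awaitables were passed, regardless of completion order.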
diff --git a/.vscode/launch.json b/.vscode/launch.json
@@ -11,7 +11,7 @@
             "cwd": "${workspaceFolder}",
             "justMyCode": false,
             "env": {
-                "PYTHONPATH": "${workspaceFolder}"
+                "PYTHONPATH": "${workspaceFolder}",
             },
             "envFile": "${workspaceFolder}/.env"
         },

diff --git a/docs/How To/Observability & Debugging/trace_calls_with_open_telemetry.md b/docs/How To/Observability & Debugging/trace_calls_with_open_telemetry.md
@@ -15,7 +15,7 @@ telemetry:
 
 - **telemetry**: Top-level application configuration for observability
 - **id**: Unique identifier for the telemetry sink
-- **provider**: Telemetry backend (`Phoenix` or `Langfuse`)
+- **provider**: Telemetry backend (`Phoenix`, `Arize`, or `Langfuse`)
 - **endpoint**: URL where OpenTelemetry traces are sent
 
 ### Starting Phoenix
@@ -31,17 +31,23 @@ Phoenix will start on `http://localhost:6006` where you can view traces and span
 ## Complete Example
 
 ```yaml
---8<-- "../examples/observability_debugging/trace_with_opentelemetry.qtype.yaml"
+--8<-- "../examples/observability_debugging/trace_with_phoenix.qtype.yaml"
 ```
 
 Run the example:
 
 ```bash
-qtype run examples/observability_debugging/trace_with_opentelemetry.qtype.yaml --text "I love this product!"
+qtype run examples/observability_debugging/trace_with_phoenix.qtype.yaml --text "I love this product!"
 ```
 
 Then open `http://localhost:6006` in your browser to see the traced execution.
 
+## Complete Example for Arize Cloud
+
+```yaml
+--8<-- "../examples/observability_debugging/trace_with_arize.qtype.yaml"
+```
+
 ## See Also
 
 - [Application Reference](../../components/Application.md)

diff --git a/docs/How To/Qtype Server/add_feedback_buttons.md b/docs/How To/Qtype Server/add_feedback_buttons.md
@@ -2,6 +2,8 @@
 
 Collect user feedback (thumbs, ratings, or categories) directly in the QType UI by adding a `feedback` block to your flow. Feedback submission requires `telemetry` to be enabled so QType can attach the feedback to traces/spans.
 
+> **Note**: Feedback is supported only for **Conversational** interfaces and for flows accessed via the **REST endpoint** tab in the UI. It is not available in the streaming UI for **Complete** interfaces.
+
 ### QType YAML
 
 ```yaml
@@ -23,6 +25,7 @@ telemetry:
 ### Explanation
 
 - **flows[].feedback**: Enables a feedback widget on the flow’s outputs in the UI.
+- **flows[].interface.type**: Must be `Conversational` for feedback to work in the streaming UI. For flows without an interface (or with a `Complete` interface), use the REST endpoint tab in the UI instead of the streaming tab.
 - **feedback.type**: Feedback widget type: `thumbs`, `rating`, or `category`.
 - **feedback.explanation**: If `true`, prompts the user for an optional text explanation along with their feedback.
 - **rating.scale**: For `rating` feedback, sets the maximum score (typically `5` or `10`).

diff --git a/examples/observability_debugging/trace_with_arize.qtype.yaml b/examples/observability_debugging/trace_with_arize.qtype.yaml
@@ -0,0 +1,56 @@
+id: arize_trace_example
+description: Example of tracing QType application calls with OpenTelemetry to Arize Cloud
+
+auths:
+  - id: arize-auth
+    type: api_key
+    api_key: ${ARIZE_API_KEY}
+
+  - id: bedrock-auth
+    type: aws
+    region: us-east-1
+
+models:
+  - type: Model
+    id: nova
+    provider: aws-bedrock
+    model_id: amazon.nova-lite-v1:0
+    auth: bedrock-auth
+    inference_params:
+      temperature: 0.7
+      max_tokens: 512
+
+flows:
+  - type: Flow
+    id: classify_text
+    feedback:
+      type: thumbs
+      explanation: true
+    variables:
+      - id: text
+        type: text
+      - id: response
+        type: text
+    inputs:
+      - text
+    outputs:
+      - response
+    steps:
+      - id: classify
+        type: LLMInference
+        model: nova
+        system_message: "Classify the following text as positive, negative, or neutral. Respond with only one word."
+        inputs:
+          - text
+        outputs:
+          - response
+
+telemetry:
+  id: arize-telemetry
+  provider: Arize
+  endpoint: https://otlp.arize.com/v1/traces
+  auth: arize-auth
+  args:
+    space_id: ${ARIZE_SPACE_ID}
+    project_name: qtype-classify-example
+
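
The Arize example above references `${ARIZE_API_KEY}` and `${ARIZE_SPACE_ID}`. How QType resolves these placeholders is not shown in this diff, so the following is only a hypothetical pre-flight check (the function name and regular expression are assumptions, not part of QType) that reports which placeholders have no value in a given environment mapping.

```python
import re

# Matches ${NAME} placeholders; the exact syntax QType accepts is an assumption.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(yaml_text: str, env: dict[str, str]) -> list[str]:
    """List placeholder names in `yaml_text` that `env` leaves unset or empty."""
    # dict.fromkeys de-duplicates while keeping first-seen order.
    names = dict.fromkeys(PLACEHOLDER.findall(yaml_text))
    return [name for name in names if not env.get(name)]


sample = "api_key: ${ARIZE_API_KEY}\nspace_id: ${ARIZE_SPACE_ID}\n"
missing = missing_env_vars(sample, {"ARIZE_API_KEY": "secret"})
```

In practice one might pass `dict(os.environ)` as `env` and the contents of the YAML file as `yaml_text` before invoking `qtype run`.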
diff --git a/...gging/trace_with_opentelemetry.qtype.yaml → ...y_debugging/trace_with_phoenix.qtype.yaml b/...gging/trace_with_opentelemetry.qtype.yaml → ...y_debugging/trace_with_phoenix.qtype.yaml
diff --git a/pyproject.toml b/pyproject.toml
@@ -32,7 +32,9 @@ Homepage = "https://github.com/bazaarvoice/qtype"
 [project.optional-dependencies]
 interpreter = [
     "aiostream>=0.7.1",
+    "arize>=8.0.0",
     "arize-phoenix-otel>=0.12.1",
+    "arize-otel>=0.11.0",
     "boto3>=1.34.0",
     "datasets>=4.4.1",
     "diskcache>=5.6.3",
@@ -101,7 +103,6 @@ docs = [
     "mkdocs>=1.5.0",
     "mkdocs-material>=9.0.0",
 ]
-
 [tool.uv]
 # Install dev dependencies by default when running uv sync
 default-groups = ["dev"]

diff --git a/qtype/dsl/model.py b/qtype/dsl/model.py
@@ -1003,7 +1003,7 @@ class TelemetrySink(StrictBaseModel):
     id: str = Field(
         ..., description="Unique ID of the telemetry sink configuration."
     )
-    provider: Literal["Phoenix", "Langfuse"] = "Phoenix"
+    provider: Literal["Phoenix", "Langfuse", "Arize"] = "Phoenix"
     auth: Reference[AuthProviderType] | str | None = Field(
         default=None,
         description="AuthorizationProvider used to authenticate telemetry data transmission.",
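
The `Literal` change above widens the set of accepted provider names while keeping `"Phoenix"` as the default. The real field is validated by Pydantic through `StrictBaseModel`; the sketch below is a standard-library approximation of the same membership check, with `check_provider` as a hypothetical helper name.

```python
from typing import Literal, get_args

# Mirrors the Literal in the diff; the helper below is illustrative only.
Provider = Literal["Phoenix", "Langfuse", "Arize"]


def check_provider(value: str = "Phoenix") -> str:
    """Return `value` if it is an accepted provider, else raise."""
    allowed = get_args(Provider)
    if value not in allowed:
        raise ValueError(f"provider must be one of {allowed}, got {value!r}")
    return value
```

The comparison is case-sensitive, so `"arize"` would be rejected while `"Arize"` is accepted.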

diff --git a/qtype/interpreter/base/base_step_executor.py b/qtype/interpreter/base/base_step_executor.py
@@ -406,11 +406,12 @@ async def _process_message_with_telemetry(
                                 "error": str(output_msg.error),
                             },
                         )
+
                     # Enrich with process_message span for feedback tracking
                     span_context = span.get_span_context()
                     output_msg = output_msg.with_telemetry_metadata(
-                        span_id=format(span_context.span_id, "016x"),
-                        trace_id=format(span_context.trace_id, "032x"),
+                        span_id=span_context.span_id,
+                        trace_id=span_context.trace_id,
                     )
                     yield output_msg
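
The hunk above stops hex-formatting the IDs and passes the raw integers from the span context instead. OpenTelemetry span IDs are 64-bit and trace IDs are 128-bit, which is why the removed code padded to 16 and 32 hex digits. Where (or whether) the conversion now happens is not visible in this diff; the sketch below only illustrates the two representations, using hypothetical helper names.

```python
def to_hex_ids(span_id: int, trace_id: int) -> tuple[str, str]:
    """Format integer IDs as zero-padded lowercase hex strings."""
    return format(span_id, "016x"), format(trace_id, "032x")


def from_hex_ids(span_hex: str, trace_hex: str) -> tuple[int, int]:
    """Parse hex strings back into the integer IDs."""
    return int(span_hex, 16), int(trace_hex, 16)


span_hex, trace_hex = to_hex_ids(0x1A2B, 0xABCDEF)
```

Any consumer that expects the hex form (for example, when attaching feedback to a trace) would need to apply a conversion like `to_hex_ids` somewhere downstream.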