[UX-Automation] Setup Infrastructure for AI-Driven UX Audits #103

Description

@BeSovereign

Summary

Implement the technical foundation that lets an LLM-powered agent control a browser and perform semantic UX audits. The infrastructure should wrap a headless browser with an AI automation layer (e.g., Stagehand or browser-use).

Technical Context

  • Base Directory: tests/ux-audit/ in the freeshard repository.
  • Recommended Stack: TypeScript with Stagehand (Playwright-based LLM automation) or Python with browser-use.
  • Goal: Create a reusable runner script that accepts a semantic instruction and returns a structured UX report.
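
To make the goal above concrete, here is a minimal sketch of what "accepts a semantic instruction and returns a structured UX report" could look like as a TypeScript contract. All type and function names (UxReport, AuditStep, Finding, AgentDriver, runAudit) are hypothetical and not taken from any library. The AI layer (Stagehand, browser-use, or something else) is deliberately hidden behind a plain function type so the sketch does not assume any particular API.

```typescript
// Hypothetical shapes for the runner's output; none of these names come from the issue.
type Severity = "info" | "minor" | "major";

interface AuditStep {
  action: string;          // what the agent did, in plain language
  url: string;             // page URL after the action
  screenshotPath?: string; // optional artifact captured for this step
}

interface Finding {
  severity: Severity;
  message: string;
}

interface UxReport {
  instruction: string;
  startedAt: string; // ISO 8601 timestamp
  steps: AuditStep[];
  findings: Finding[];
}

// The AI automation layer is abstracted as a function: instruction in, steps and findings out.
type AgentDriver = (
  instruction: string
) => Promise<{ steps: AuditStep[]; findings: Finding[] }>;

// Validates the instruction, delegates to the driver, and wraps the result in a report.
async function runAudit(
  instruction: string,
  driver: AgentDriver,
  now: Date = new Date()
): Promise<UxReport> {
  if (instruction.trim() === "") {
    throw new Error("instruction must not be empty");
  }
  const { steps, findings } = await driver(instruction);
  return { instruction, startedAt: now.toISOString(), steps, findings };
}
```

Keeping the driver as a parameter would let the runner be exercised with a fake agent in tests, without a browser or an API key.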

Implementation Details for the Agent

  1. Environment Setup:
    • Initialize tests/ux-audit/package.json with the dependencies for AI-driven automation (or the Python equivalent if browser-use is chosen).
    • Configure access to an LLM provider (OpenAI/Anthropic) via environment variables (e.g., OPENAI_API_KEY).
  2. The Audit Runner:
    • Create tests/ux-audit/runner.ts (or .py).
    • The runner should:
      • Launch a browser instance (Chromium preferred).
      • Pass a high-level goal to the AI agent.
      • Capture screenshots/logs for each step of the AI's journey.
      • Export the final result as reports/ux-audit-{{timestamp}}.md.
  3. Justfile Integration:
    • Add a command just audit-ui to trigger the default UX audit flow.
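
The environment and export steps above can be sketched as a few pure helpers. This is a sketch under assumptions, not a prescribed implementation: the names ReportData, requireApiKey, reportPath, and renderReport are hypothetical, the Markdown layout is illustrative, and the timestamp format (ISO 8601 with ":" and "." replaced by "-") is one possible reading of {{timestamp}}.

```typescript
// Hypothetical minimal report shape; field names are illustrative, not from the issue.
interface ReportData {
  instruction: string;
  steps: { action: string; url: string; screenshotPath?: string }[];
  findings: string[];
}

// Reads the provider key from an env-like object and fails fast when it is missing.
function requireApiKey(
  env: Record<string, string | undefined>,
  name: string = "OPENAI_API_KEY"
): string {
  const value = env[name];
  if (!value) throw new Error(`${name} is not set`);
  return value;
}

// Builds reports/ux-audit-<timestamp>.md; ":" and "." are replaced so the
// file name is valid on common file systems.
function reportPath(now: Date): string {
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  return `reports/ux-audit-${stamp}.md`;
}

// Renders the report as Markdown: a title, the instruction, numbered steps, then findings.
function renderReport(report: ReportData): string {
  const lines: string[] = [
    "# UX Audit Report",
    "",
    `Instruction: ${report.instruction}`,
    "",
    "## Steps",
    "",
  ];
  report.steps.forEach((step, i) => {
    const shot = step.screenshotPath ? ` (screenshot: ${step.screenshotPath})` : "";
    lines.push(`${i + 1}. ${step.action} at ${step.url}${shot}`);
  });
  lines.push("", "## Findings", "");
  if (report.findings.length === 0) {
    lines.push("No findings recorded.");
  } else {
    for (const finding of report.findings) lines.push(`- ${finding}`);
  }
  return lines.join("\n") + "\n";
}
```

In a Node-based runner, the rendered string could be written with fs.writeFileSync after creating the reports/ directory, and requireApiKey could be called with process.env. The just audit-ui recipe would then only need to invoke the runner script. The exact command depends on the chosen toolchain and is left open here.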

Open Technical Points / Decisions

  • Browser Choice: The runner should default to Chromium but allow an override via the AUDIT_BROWSER environment variable (e.g., AUDIT_BROWSER=firefox).
  • Artifact Storage: Decide on a retention policy for screenshots taken by the AI during the audit.
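
The browser override could be resolved with a small helper like the one below. It is only a sketch: resolveBrowser is a hypothetical name, and accepting webkit alongside chromium and firefox is an assumption based on the engine names Playwright exports. Whether the chosen AI layer supports non-Chromium engines is not settled by this issue.

```typescript
// Engine names mirror Playwright's exported launchers (chromium, firefox, webkit).
// Supporting webkit is an assumption; the issue only mentions Chromium and Firefox.
const SUPPORTED_BROWSERS = ["chromium", "firefox", "webkit"] as const;
type BrowserName = (typeof SUPPORTED_BROWSERS)[number];

// Returns the engine named by AUDIT_BROWSER, defaulting to Chromium when unset or blank.
function resolveBrowser(env: Record<string, string | undefined>): BrowserName {
  const raw = (env["AUDIT_BROWSER"] ?? "").trim().toLowerCase();
  if (raw === "") return "chromium";
  const match = SUPPORTED_BROWSERS.find((name) => name === raw);
  if (!match) {
    throw new Error(
      `Unsupported AUDIT_BROWSER "${raw}"; expected one of ${SUPPORTED_BROWSERS.join(", ")}`
    );
  }
  return match;
}
```

Rejecting unknown values instead of silently falling back to Chromium makes a typo in AUDIT_BROWSER visible at startup instead of producing an audit on the wrong engine.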

Acceptance Criteria

  • A script exists that can navigate to the landing page from a semantic (natural-language) instruction.
  • The script can handle dynamic redirects (e.g., to the login page).
  • The audit report is generated as a Markdown file.
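
One way to verify the redirect criterion is to compare the URL the runner requested with the URL the browser ended up on. The helper below is a sketch of that check. wasRedirectedToLogin is a hypothetical name, and the "/login" path is an assumption, since the issue does not say where the login page lives.

```typescript
// Returns true when the browser left the requested page and landed on the login page.
// The default "/login" path is an assumption made for illustration only.
function wasRedirectedToLogin(
  requestedUrl: string,
  finalUrl: string,
  loginPath: string = "/login"
): boolean {
  const requested = new URL(requestedUrl);
  const final = new URL(finalUrl);
  const movedAway =
    requested.origin !== final.origin || requested.pathname !== final.pathname;
  return movedAway && final.pathname.startsWith(loginPath);
}
```

With a Playwright-based runner, the final URL would typically come from page.url() after navigation settles. The report could then record the redirect as a step instead of treating it as a failure.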
