diff --git a/contracts/README.md b/contracts/README.md new file mode 100644 index 0000000..387781a --- /dev/null +++ b/contracts/README.md @@ -0,0 +1,239 @@ +# Deepline Contracts + +This directory contains contract definitions for the Deepline Multi-Agent Data Scientist system. + +## Contents + +### 1. UI Test IDs Contract (`ui-test-ids.json`) + +A comprehensive enumeration of all UI screens and critical elements with their designated `data-testid` attributes. + +**Purpose:** +- Provides a single source of truth for UI test selectors +- Enables reliable E2E testing with Playwright, Cypress, or similar tools +- Documents the structure of the UI for test automation + +**Structure:** +```json +{ + "screens": { + "ScreenName": { + "description": "Description of the screen", + "elements": [ + { + "id": "test-id-name", + "description": "What this element is", + "selector": "[data-testid='test-id-name']", + "required": true + } + ] + } + } +} +``` + +**Statistics:** +- 6 screens defined +- 47 UI elements documented +- 46 required elements +- 1 optional element + +**Screens:** +1. **Header** - Main navigation header with agent tabs +2. **ConsolePanel** - Natural language console interface +3. **DatasetsPanel** - Dataset management and upload +4. **WorkflowsPanel** - Workflow execution monitoring +5. **ProcessesPanel** - Background processes and health +6. **App** - Root application structure + +### 2. API Schema Contract (`api-schema.yaml`) + +OpenAPI 3.0.3 specification documenting all REST API endpoints across the microservices architecture. 
+ +**Purpose:** +- Defines request/response types for all API endpoints +- Enables API client generation and validation +- Documents the complete API surface for integration + +**Coverage:** +- **Master Orchestrator** (port 8000): 10 endpoints +- **EDA Agent** (port 8001): 8 endpoints +- **Refinery Agent** (port 8005): 3 endpoints +- **ML Agent** (port 8002): 5 endpoints +- **Total:** 26 endpoints documented + +**Key Endpoints:** + +Service | Endpoint | Method | Purpose +--------|----------|--------|-------- +Orchestrator | `/workflows/start` | POST | Start new workflow +Orchestrator | `/datasets/upload` | POST | Upload dataset +Orchestrator | `/runs/{run_id}/status` | GET | Get run status +EDA | `/load_data` | POST | Load dataset into memory +EDA | `/basic_info` | POST | Get dataset info +EDA | `/statistical_summary` | POST | Compute statistics +EDA | `/create_visualization` | POST | Generate charts +Refinery | `/execute` | POST | Execute DQ/FE task +ML | `/class_imbalance` | POST | Handle imbalance +ML | `/baseline_sanity` | POST | Train baseline models + +**Schemas:** +The contract defines comprehensive request/response schemas including: +- `WorkflowRequest` / `WorkflowResponse` +- `LoadDataRequest` / `LoadDataResponse` +- `TaskRequest` / `TaskResponse` +- `MLResponse` +- And 20+ more schemas + +## Validation + +Use the validation script to ensure contracts are valid: + +```bash +node scripts/validate-contracts.js +``` + +The validation script checks: +- ✅ JSON/YAML validity +- ✅ Required structure present +- ✅ No duplicate test IDs +- ✅ All expected sections exist +- ✅ File references are correct + +## Usage + +### For Frontend Developers + +Use the UI test IDs contract when adding `data-testid` attributes: + +```javascript +// ❌ Before + + +// ✅ After (following contract) + +``` + +See `/reports/selector-adoption.md` for a complete TODO list. 
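One way to keep component code aligned with the contract is to resolve each `data-testid` value from the contract rather than typing the string by hand. The sketch below is only an illustration: the `testId` helper and the inline `contract` object are hypothetical and are not part of this change. The object mirrors the structure of `ui-test-ids.json`, and a real project would presumably load that file instead.

```javascript
// Hypothetical helper (an assumption, not code from this repository).
// It resolves a test ID from an object shaped like contracts/ui-test-ids.json
// and fails loudly when the ID is not defined in the contract.
const contract = {
  screens: {
    ConsolePanel: {
      description: "Natural language console interface",
      elements: [
        { id: "console-prompt", selector: "[data-testid='console-prompt']", required: true },
        { id: "console-submit", selector: "[data-testid='console-submit']", required: true },
      ],
    },
  },
};

function testId(contractObj, screen, id) {
  const screenDef = contractObj.screens[screen];
  if (!screenDef) {
    throw new Error(`Unknown screen: ${screen}`);
  }
  const element = screenDef.elements.find((el) => el.id === id);
  if (!element) {
    throw new Error(`Test ID "${id}" is not defined for screen "${screen}"`);
  }
  return element.id;
}

// The returned string is the value to pass as the data-testid attribute.
console.log(testId(contract, "ConsolePanel", "console-submit")); // console-submit
```

Throwing on an unknown ID means a renamed or removed contract entry fails at build or test time instead of leaving a stale selector in the UI.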
+ +### For QA/Test Engineers + +Use test IDs from the contract in your E2E tests: + +```javascript +// Playwright example +await page.getByTestId('console-prompt').fill('load iris dataset'); +await page.getByTestId('console-submit').click(); +await expect(page.getByTestId('console-output')).toContainText('Success'); +``` + +### For Backend Developers + +Use the API schema when implementing or consuming endpoints: + +```python +# The contract documents that /load_data expects: +{ + "path": str, # Required + "name": str, # Required + "file_type": str # Optional, one of: csv, xlsx, json +} + +# And returns: +{ + "name": str, + "rows": int, + "cols": int, + "dtypes": dict, + "memory_usage": str, + "sample_preview": list +} +``` + +### For API Consumers + +Generate API clients from the OpenAPI schema: + +```bash +# Generate TypeScript client +openapi-generator-cli generate \ + -i contracts/api-schema.yaml \ + -g typescript-axios \ + -o client/typescript + +# Generate Python client +openapi-generator-cli generate \ + -i contracts/api-schema.yaml \ + -g python \ + -o client/python +``` + +## CI/CD Integration + +Add contract validation to your CI pipeline: + +```yaml +# .github/workflows/contracts.yml +name: Validate Contracts +on: [push, pull_request] + +jobs: + validate: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v3 + - uses: actions/setup-node@v3 + with: + node-version: '18' + - name: Validate contracts + run: node scripts/validate-contracts.js +``` + +## Maintaining Contracts + +### When to Update UI Test IDs Contract: + +1. **New component added** - Add screen definition with elements +2. **Element added to existing component** - Add to elements array +3. **Element removed** - Remove from contract (breaking change) +4. **Test ID renamed** - Update `id` field (breaking change) + +### When to Update API Schema Contract: + +1. **New endpoint added** - Add path definition with method +2. **Endpoint modified** - Update request/response schemas +3. 
**Endpoint deprecated** - Mark with `deprecated: true` +4. **Schema changed** - Update component schemas (may be breaking) + +### Versioning: + +Both contracts follow semantic versioning: +- **Major** (1.0.0 → 2.0.0): Breaking changes (renamed IDs, removed endpoints) +- **Minor** (1.0.0 → 1.1.0): New additions (new elements, new endpoints) +- **Patch** (1.0.0 → 1.0.1): Clarifications, typo fixes (non-breaking) + +## Related Documentation + +- `/reports/selector-adoption.md` - TODO list for UI test ID adoption +- `/dashboard-ui/src/main.jsx` - Main UI component source +- `/mcp-server/` - Backend service implementations + +## Support + +For questions about contracts: +1. Check the validation script output for errors +2. Review related code in `/dashboard-ui` or `/mcp-server` +3. See `/reports/selector-adoption.md` for implementation guidance + +--- + +**Last Updated:** 2025-10-13 +**Contract Version:** 1.0.0 diff --git a/contracts/api-schema.yaml b/contracts/api-schema.yaml new file mode 100644 index 0000000..397f4bc --- /dev/null +++ b/contracts/api-schema.yaml @@ -0,0 +1,1012 @@ +openapi: 3.0.3 +info: + title: Deepline Multi-Agent Data Scientist API + description: | + Comprehensive API schema for the Deepline Master Orchestrator and its microservice agents. + Includes EDA Agent, Refinery Agent, ML Agent, and Master Orchestrator endpoints. 
+ version: 1.0.0 + contact: + name: Deepline Team + license: + name: BUSL-1.1 + +servers: + - url: http://localhost:8000 + description: Master Orchestrator (default) + - url: http://localhost:8001 + description: EDA Agent + - url: http://localhost:8005 + description: Refinery Agent + - url: http://localhost:8002 + description: ML Agent + +tags: + - name: orchestrator + description: Master orchestrator workflow management + - name: eda + description: Exploratory Data Analysis operations + - name: refinery + description: Data quality and feature engineering + - name: ml + description: Machine learning training and experiments + - name: health + description: Health check endpoints + +paths: + # Master Orchestrator Endpoints + /: + get: + tags: [orchestrator] + summary: Root endpoint + description: Returns API information and status + responses: + '200': + description: API information + content: + application/json: + schema: + type: object + properties: + message: + type: string + version: + type: string + + /health: + get: + tags: [health] + summary: Health check endpoint + description: Returns health status of the service + responses: + '200': + description: Service is healthy + content: + application/json: + schema: + $ref: '#/components/schemas/HealthResponse' + + /datasets/upload: + post: + tags: [orchestrator] + summary: Upload dataset + description: Upload a dataset file for analysis + requestBody: + required: true + content: + multipart/form-data: + schema: + type: object + required: + - file + - name + properties: + file: + type: string + format: binary + description: Dataset file (CSV, XLSX, JSON) + name: + type: string + description: Name for the dataset + license: + type: string + description: Optional license information + uploader: + type: string + description: Optional uploader identifier + responses: + '200': + description: Dataset uploaded successfully + content: + application/json: + schema: + type: object + properties: + dataset_name: + type: string 
+ filename: + type: string + snapshot: + type: object + pii_detected: + type: boolean + pii_matches_count: + type: integer + message: + type: string + + /datasets: + get: + tags: [orchestrator] + summary: List datasets + description: Get a list of all loaded datasets + responses: + '200': + description: List of datasets + content: + application/json: + schema: + type: object + properties: + datasets: + type: array + items: + $ref: '#/components/schemas/Dataset' + + /workflows/start: + post: + tags: [orchestrator] + summary: Start a workflow + description: Start a new workflow with specified tasks + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/WorkflowRequest' + responses: + '200': + description: Workflow started successfully + content: + application/json: + schema: + $ref: '#/components/schemas/WorkflowResponse' + + /runs: + get: + tags: [orchestrator] + summary: List all workflow runs + description: Get a list of all workflow runs + responses: + '200': + description: List of runs + content: + application/json: + schema: + type: object + properties: + runs: + type: array + items: + $ref: '#/components/schemas/RunStatus' + + /runs/{run_id}/status: + get: + tags: [orchestrator] + summary: Get run status + description: Get the status of a specific workflow run + parameters: + - name: run_id + in: path + required: true + schema: + type: string + responses: + '200': + description: Run status + content: + application/json: + schema: + $ref: '#/components/schemas/RunStatus' + + /runs/{run_id}/artifacts: + get: + tags: [orchestrator] + summary: List run artifacts + description: Get all artifacts for a specific run + parameters: + - name: run_id + in: path + required: true + schema: + type: string + responses: + '200': + description: List of artifacts + content: + application/json: + schema: + type: array + items: + $ref: '#/components/schemas/Artifact' + + /artifacts/{run_id}/{filename}: + get: + tags: [orchestrator] 
+ summary: Download artifact + description: Download a specific artifact file + parameters: + - name: run_id + in: path + required: true + schema: + type: string + - name: filename + in: path + required: true + schema: + type: string + responses: + '200': + description: Artifact file + content: + application/octet-stream: + schema: + type: string + format: binary + + /runs/{run_id}: + delete: + tags: [orchestrator] + summary: Delete workflow run + description: Delete a workflow run and its artifacts + parameters: + - name: run_id + in: path + required: true + schema: + type: string + responses: + '200': + description: Run deleted successfully + + # EDA Agent Endpoints + /load_data: + post: + tags: [eda] + summary: Load dataset + description: Load a dataset from file into memory + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/LoadDataRequest' + responses: + '200': + description: Dataset loaded successfully + content: + application/json: + schema: + $ref: '#/components/schemas/LoadDataResponse' + + /basic_info: + post: + tags: [eda] + summary: Get basic dataset information + description: Get shape, data types, memory usage, and preview + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/BasicInfoRequest' + responses: + '200': + description: Basic info retrieved + content: + application/json: + schema: + $ref: '#/components/schemas/BasicInfoResponse' + + /statistical_summary: + post: + tags: [eda] + summary: Get statistical summary + description: Get descriptive statistics, correlations, skewness, and kurtosis + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/StatisticalSummaryRequest' + responses: + '200': + description: Statistical summary computed + content: + application/json: + schema: + $ref: '#/components/schemas/StatisticalSummaryResponse' + + /missing_data_analysis: + post: + tags: [eda] + summary: 
Analyze missing data + description: Analyze missing data patterns and get recommendations + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/MissingDataRequest' + responses: + '200': + description: Missing data analysis complete + content: + application/json: + schema: + $ref: '#/components/schemas/MissingDataResult' + + /create_visualization: + post: + tags: [eda] + summary: Create visualization + description: Create data visualizations (histogram, boxplot, correlation, etc.) + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/VisualizationRequest' + responses: + '200': + description: Visualization created + content: + application/json: + schema: + $ref: '#/components/schemas/VisualizationResponse' + + /infer_schema: + post: + tags: [eda] + summary: Infer data schema + description: Infer data types and schema with confidence scores + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/SchemaInferenceRequest' + responses: + '200': + description: Schema inferred + content: + application/json: + schema: + $ref: '#/components/schemas/SchemaInferenceResponse' + + /detect_outliers: + post: + tags: [eda] + summary: Detect outliers + description: Detect outliers using IQR, Isolation Forest, or LOF methods + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/OutlierDetectionRequest' + responses: + '200': + description: Outliers detected + content: + application/json: + schema: + $ref: '#/components/schemas/OutlierResult' + + /datasets/{name}: + delete: + tags: [eda] + summary: Delete dataset + description: Delete a dataset from memory + parameters: + - name: name + in: path + required: true + schema: + type: string + responses: + '200': + description: Dataset deleted + + # Refinery Agent Endpoints + /execute: + post: + tags: [refinery] + summary: Execute refinery task + 
description: Execute a data quality check or feature engineering task + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/TaskRequest' + responses: + '200': + description: Task executed + content: + application/json: + schema: + $ref: '#/components/schemas/TaskResponse' + + /metrics: + get: + tags: [refinery, health] + summary: Get Prometheus metrics + description: Get Prometheus-formatted metrics + responses: + '200': + description: Prometheus metrics + content: + text/plain: + schema: + type: string + + /pipelines: + get: + tags: [refinery] + summary: List active pipelines + description: Get list of active feature engineering pipelines + responses: + '200': + description: List of pipelines + content: + application/json: + schema: + type: object + properties: + pipelines: + type: array + items: + type: object + + # ML Agent Endpoints + /class_imbalance: + post: + tags: [ml] + summary: Handle class imbalance + description: Apply sampling techniques to handle class imbalance + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/ClassImbalanceRequest' + responses: + '200': + description: Class imbalance handled + content: + application/json: + schema: + $ref: '#/components/schemas/MLResponse' + + /train_validation_test: + post: + tags: [ml] + summary: Split data for training + description: Split data into training, validation, and test sets + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/TrainValidationTestRequest' + responses: + '200': + description: Data split complete + content: + application/json: + schema: + $ref: '#/components/schemas/MLResponse' + + /baseline_sanity: + post: + tags: [ml] + summary: Run baseline models + description: Train baseline models for sanity checking + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/BaselineSanityRequest' + 
responses: + '200': + description: Baseline models trained + content: + application/json: + schema: + $ref: '#/components/schemas/MLResponse' + + /experiment_tracking: + post: + tags: [ml] + summary: Track ML experiment + description: Track machine learning experiments with MLflow + requestBody: + required: true + content: + application/json: + schema: + $ref: '#/components/schemas/ExperimentTrackingRequest' + responses: + '200': + description: Experiment tracked + content: + application/json: + schema: + $ref: '#/components/schemas/MLResponse' + + /experiments: + get: + tags: [ml] + summary: List experiments + description: Get list of all ML experiments + responses: + '200': + description: List of experiments + content: + application/json: + schema: + type: object + properties: + experiments: + type: array + items: + type: object + +components: + schemas: + HealthResponse: + type: object + properties: + healthy: + type: boolean + timestamp: + type: string + format: date-time + agents: + type: object + telemetry: + type: object + fe_module_enabled: + type: boolean + fe_module_available: + type: boolean + + Dataset: + type: object + properties: + name: + type: string + shape: + type: array + items: + type: integer + memory_usage: + type: string + dtypes: + type: object + + WorkflowRequest: + type: object + required: + - run_name + - tasks + properties: + run_name: + type: string + tasks: + type: array + items: + $ref: '#/components/schemas/Task' + priority: + type: integer + default: 1 + + Task: + type: object + required: + - agent + - action + - args + properties: + agent: + type: string + enum: [eda_agent, refinery_agent, ml_agent] + action: + type: string + args: + type: object + + WorkflowResponse: + type: object + properties: + run_id: + type: string + status: + type: string + message: + type: string + + RunStatus: + type: object + properties: + run_id: + type: string + status: + type: string + enum: [QUEUED, RUNNING, COMPLETED, FAILED] + progress: + type: 
number + format: float + current_task: + type: string + nullable: true + start_time: + type: string + end_time: + type: string + nullable: true + error_message: + type: string + nullable: true + + Artifact: + type: object + properties: + artifact_id: + type: string + type: + type: string + filename: + type: string + size: + type: integer + created_at: + type: string + download_url: + type: string + nullable: true + + LoadDataRequest: + type: object + required: + - path + - name + properties: + path: + type: string + name: + type: string + file_type: + type: string + enum: [csv, xlsx, json] + + LoadDataResponse: + type: object + properties: + name: + type: string + rows: + type: integer + cols: + type: integer + dtypes: + type: object + memory_usage: + type: string + sample_preview: + type: array + items: + type: object + + BasicInfoRequest: + type: object + required: + - name + properties: + name: + type: string + + BasicInfoResponse: + type: object + properties: + shape: + type: array + items: + type: integer + dtypes: + type: object + memory_usage: + type: string + preview: + type: array + items: + type: object + null_counts: + type: object + + StatisticalSummaryRequest: + type: object + required: + - name + properties: + name: + type: string + sample_size: + type: integer + default: 10000 + + StatisticalSummaryResponse: + type: object + properties: + descriptive_stats: + type: object + correlation_matrix: + type: object + skewness: + type: object + kurtosis: + type: object + + MissingDataRequest: + type: object + required: + - name + properties: + name: + type: string + + MissingDataResult: + type: object + properties: + missing_counts: + type: object + missing_percentages: + type: object + missing_patterns: + type: array + items: + type: object + recommendations: + type: array + items: + type: string + visualization_base64: + type: string + nullable: true + + VisualizationRequest: + type: object + required: + - name + - chart_type + properties: + name: + type: 
string + chart_type: + type: string + enum: [histogram, boxplot, correlation, scatter, line, bar] + columns: + type: array + items: + type: string + sample_size: + type: integer + default: 10000 + + VisualizationResponse: + type: object + properties: + chart_type: + type: string + columns: + type: array + items: + type: string + base64_image: + type: string + metadata: + type: object + + SchemaInferenceRequest: + type: object + required: + - name + properties: + name: + type: string + confidence_threshold: + type: number + format: float + default: 0.8 + + SchemaInferenceResponse: + type: object + properties: + schema: + type: object + confidence_scores: + type: object + recommendations: + type: array + items: + type: string + yaml_schema: + type: string + + OutlierDetectionRequest: + type: object + required: + - name + properties: + name: + type: string + method: + type: string + enum: [iqr, isolation_forest, lof] + default: iqr + columns: + type: array + items: + type: string + contamination: + type: number + format: float + default: 0.1 + + OutlierResult: + type: object + properties: + method: + type: string + outlier_indices: + type: array + items: + type: integer + outlier_count: + type: integer + outlier_percentage: + type: number + format: float + columns_analyzed: + type: array + items: + type: string + scores: + type: object + nullable: true + + TaskRequest: + type: object + required: + - task_id + - action + properties: + task_id: + type: string + action: + type: string + enum: + - check_schema_consistency + - check_missing_values + - check_distributions + - check_duplicates + - check_leakage + - check_drift + - comprehensive_quality_report + - assign_feature_roles + - basic_impute_missing_values + - basic_scale_numeric_features + - basic_encode_categorical_features + - basic_generate_datetime_features + - basic_vectorise_text_features + - basic_generate_interactions + - basic_select_features + - save_fe_pipeline + - execute_feature_pipeline + - 
advanced_impute_missing_values + - advanced_encode_categorical_features + - advanced_feature_selection + - feature_interactions + - pipeline_persistence + mode: + type: string + enum: [data_quality, feature_engineering] + backend: + type: string + enum: [refinery_basic, fe_module] + params: + type: object + + TaskResponse: + type: object + properties: + task_id: + type: string + success: + type: boolean + mode: + type: string + backend: + type: string + result: + type: object + nullable: true + error: + type: string + nullable: true + execution_time: + type: number + format: float + timestamp: + type: number + format: float + + ClassImbalanceRequest: + type: object + required: + - dataset_name + properties: + dataset_name: + type: string + method: + type: string + enum: [smote, adasyn, random_under, tomek, smoteenn, smotetomek] + target_column: + type: string + + TrainValidationTestRequest: + type: object + required: + - dataset_name + properties: + dataset_name: + type: string + test_size: + type: number + format: float + default: 0.2 + validation_size: + type: number + format: float + default: 0.2 + stratify: + type: boolean + default: true + + BaselineSanityRequest: + type: object + required: + - dataset_name + - target_column + properties: + dataset_name: + type: string + target_column: + type: string + models: + type: array + items: + type: string + default: [logistic_regression, decision_tree, random_forest] + + ExperimentTrackingRequest: + type: object + required: + - experiment_name + properties: + experiment_name: + type: string + parameters: + type: object + metrics: + type: object + artifacts: + type: array + items: + type: string + + MLResponse: + type: object + properties: + success: + type: boolean + result: + type: object + error: + type: string + nullable: true + execution_time: + type: number + format: float diff --git a/contracts/ui-test-ids.json b/contracts/ui-test-ids.json new file mode 100644 index 0000000..c3ff044 --- /dev/null +++ 
b/contracts/ui-test-ids.json @@ -0,0 +1,320 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "title": "UI Test IDs Contract", + "description": "Enumeration of screens and critical elements with data-testid attributes for E2E testing", + "version": "1.0.0", + "screens": { + "Header": { + "description": "Main navigation header with agent tabs and system status", + "elements": [ + { + "id": "header-container", + "description": "Main header container", + "selector": "[data-testid='header-container']", + "required": true + }, + { + "id": "brand-logo", + "description": "DEEPLINE brand logo/title", + "selector": "[data-testid='brand-logo']", + "required": true + }, + { + "id": "nav-orchestrator", + "description": "Orchestrator agent navigation tab", + "selector": "[data-testid='nav-orchestrator']", + "required": true + }, + { + "id": "nav-eda", + "description": "EDA agent navigation tab", + "selector": "[data-testid='nav-eda']", + "required": true + }, + { + "id": "nav-refinery", + "description": "Refinery agent navigation tab", + "selector": "[data-testid='nav-refinery']", + "required": true + }, + { + "id": "nav-ml", + "description": "ML agent navigation tab", + "selector": "[data-testid='nav-ml']", + "required": true + }, + { + "id": "system-status", + "description": "Overall system health status indicator", + "selector": "[data-testid='system-status']", + "required": true + }, + { + "id": "status-dot", + "description": "System status indicator dot", + "selector": "[data-testid='status-dot']", + "required": true + } + ] + }, + "ConsolePanel": { + "description": "Natural language console for submitting prompts to the orchestrator", + "elements": [ + { + "id": "console-container", + "description": "Console panel container", + "selector": "[data-testid='console-container']", + "required": true + }, + { + "id": "console-header", + "description": "Console header with icon and title", + "selector": "[data-testid='console-header']", + "required": true + }, + { 
+ "id": "console-output", + "description": "Console output area showing command results", + "selector": "[data-testid='console-output']", + "required": true + }, + { + "id": "console-line", + "description": "Individual console output line", + "selector": "[data-testid='console-line']", + "required": true + }, + { + "id": "console-prompt", + "description": "Console input field for natural language prompts", + "selector": "[data-testid='console-prompt']", + "required": true + }, + { + "id": "console-submit", + "description": "Console submit button", + "selector": "[data-testid='console-submit']", + "required": true + } + ] + }, + "DatasetsPanel": { + "description": "Panel showing uploaded datasets and upload controls", + "elements": [ + { + "id": "datasets-container", + "description": "Datasets panel container", + "selector": "[data-testid='datasets-container']", + "required": true + }, + { + "id": "datasets-header", + "description": "Datasets header with icon and title", + "selector": "[data-testid='datasets-header']", + "required": true + }, + { + "id": "upload-area", + "description": "Upload area for new datasets", + "selector": "[data-testid='upload-area']", + "required": true + }, + { + "id": "file-upload-input", + "description": "Hidden file input for dataset upload", + "selector": "[data-testid='file-upload-input']", + "required": true + }, + { + "id": "upload-button", + "description": "Button to trigger file upload", + "selector": "[data-testid='upload-button']", + "required": true + }, + { + "id": "dataset-name-input", + "description": "Input field for dataset name", + "selector": "[data-testid='dataset-name-input']", + "required": true + }, + { + "id": "datasets-list", + "description": "List of uploaded datasets", + "selector": "[data-testid='datasets-list']", + "required": true + }, + { + "id": "dataset-row", + "description": "Individual dataset row in the table", + "selector": "[data-testid='dataset-row']", + "required": true + }, + { + "id": 
"resource-usage", + "description": "System resource usage metrics section", + "selector": "[data-testid='resource-usage']", + "required": true + }, + { + "id": "resource-metric-cpu", + "description": "CPU usage metric", + "selector": "[data-testid='resource-metric-cpu']", + "required": true + }, + { + "id": "resource-metric-memory", + "description": "Memory usage metric", + "selector": "[data-testid='resource-metric-memory']", + "required": true + } + ] + }, + "WorkflowsPanel": { + "description": "Panel displaying active and completed workflows", + "elements": [ + { + "id": "workflows-container", + "description": "Workflows panel container", + "selector": "[data-testid='workflows-container']", + "required": true + }, + { + "id": "workflows-header", + "description": "Workflows header with icon and title", + "selector": "[data-testid='workflows-header']", + "required": true + }, + { + "id": "workflow-card", + "description": "Individual workflow card", + "selector": "[data-testid='workflow-card']", + "required": true + }, + { + "id": "workflow-id", + "description": "Workflow identifier", + "selector": "[data-testid='workflow-id']", + "required": true + }, + { + "id": "workflow-status", + "description": "Workflow status badge", + "selector": "[data-testid='workflow-status']", + "required": true + }, + { + "id": "workflow-name", + "description": "Workflow name/description", + "selector": "[data-testid='workflow-name']", + "required": true + }, + { + "id": "workflow-progress-bar", + "description": "Workflow progress bar", + "selector": "[data-testid='workflow-progress-bar']", + "required": true + }, + { + "id": "progress-text", + "description": "Workflow progress percentage text", + "selector": "[data-testid='progress-text']", + "required": true + }, + { + "id": "no-workflows", + "description": "Empty state message when no workflows exist", + "selector": "[data-testid='no-workflows']", + "required": false + } + ] + }, + "ProcessesPanel": { + "description": "Panel showing 
background processes and agent health status", + "elements": [ + { + "id": "processes-container", + "description": "Processes panel container", + "selector": "[data-testid='processes-container']", + "required": true + }, + { + "id": "processes-header", + "description": "Processes header with icon and title", + "selector": "[data-testid='processes-header']", + "required": true + }, + { + "id": "process-card-orchestrator", + "description": "Orchestrator process card", + "selector": "[data-testid='process-card-orchestrator']", + "required": true + }, + { + "id": "process-card-refinery", + "description": "Refinery process card", + "selector": "[data-testid='process-card-refinery']", + "required": true + }, + { + "id": "process-status", + "description": "Process health status indicator", + "selector": "[data-testid='process-status']", + "required": true + }, + { + "id": "process-status-dot", + "description": "Process status dot indicator", + "selector": "[data-testid='process-status-dot']", + "required": true + }, + { + "id": "telemetry-section", + "description": "EDA telemetry section", + "selector": "[data-testid='telemetry-section']", + "required": true + }, + { + "id": "telemetry-table", + "description": "EDA telemetry data table", + "selector": "[data-testid='telemetry-table']", + "required": true + }, + { + "id": "telemetry-row", + "description": "Individual telemetry row", + "selector": "[data-testid='telemetry-row']", + "required": true + } + ] + }, + "App": { + "description": "Root application container", + "elements": [ + { + "id": "app-container", + "description": "Main application container", + "selector": "[data-testid='app-container']", + "required": true + }, + { + "id": "main-layout", + "description": "Main layout container", + "selector": "[data-testid='main-layout']", + "required": true + }, + { + "id": "main-content", + "description": "Main content area", + "selector": "[data-testid='main-content']", + "required": true + }, + { + "id": "sidebar", + 
"description": "Sidebar container", + "selector": "[data-testid='sidebar']", + "required": true + } + ] + } + } +} diff --git a/reports/selector-adoption.md b/reports/selector-adoption.md new file mode 100644 index 0000000..9604831 --- /dev/null +++ b/reports/selector-adoption.md @@ -0,0 +1,490 @@ +# Selector Adoption Report + +## Overview +This document tracks the adoption of `data-testid` attributes across the Deepline Dashboard UI components. These test IDs are essential for reliable end-to-end testing and automation. + +## Current Status +**Last Updated:** 2025-10-13 +**Adoption Rate:** 0% (0/47 elements) + +## Component Status + +### ✅ Fully Adopted Components +None yet. + +### ⚠️ Partially Adopted Components +None yet. + +### ❌ Not Adopted Components +All components need `data-testid` attributes added. + +--- + +## TODO: Components Requiring data-testid Attributes + +### 1. Header Component (`dashboard-ui/src/main.jsx`) + +**File:** `dashboard-ui/src/main.jsx` +**Lines:** ~29-66 +**Component:** `Header` + +#### Required Changes: + +```javascript +// TODO: Add data-testid to header container +
+
+
+ // TODO: Add data-testid to brand logo +

DEEPLINE

+ +
+
+ // TODO: Add data-testid to system status +
+ // TODO: Add data-testid to status dot +
+ System Healthy +
+
+
+
+``` + +--- + +### 2. ConsolePanel Component (`dashboard-ui/src/main.jsx`) + +**File:** `dashboard-ui/src/main.jsx` +**Lines:** ~68-122 +**Component:** `ConsolePanel` + +#### Required Changes: + +```javascript +return ( + // TODO: Add data-testid to console container +
+ // TODO: Add data-testid to console header +
+ +

Console

+
+ // TODO: Add data-testid to console output +
+ {consoleOutput.map(output => ( + // TODO: Add data-testid to each console line +
+ {output.message} +
+ ))} + {lastResult && ( +
+ {typeof lastResult === 'string' ? lastResult : JSON.stringify(lastResult, null, 2)} +
+ )} +
+
+ // TODO: Add data-testid to console prompt input + setNlPrompt(e.target.value)} + onKeyPress={(e) => e.key === 'Enter' && handleSubmit()} + className="console-prompt" + placeholder="Ask Deepline to analyze your data..." + disabled={busy} + data-testid="console-prompt" + /> + // TODO: Add data-testid to submit button + +
+
+) +``` + +--- + +### 3. DatasetsPanel Component (`dashboard-ui/src/main.jsx`) + +**File:** `dashboard-ui/src/main.jsx` +**Lines:** ~124-219 +**Component:** `DatasetsPanel` + +#### Required Changes: + +```javascript +return ( + // TODO: Add data-testid to datasets container +
+ // TODO: Add data-testid to datasets header +
+ +

Datasets

+
+ +
+ {!datasets?.datasets?.length ? ( + // TODO: Add data-testid to upload area +
+ +
No datasets uploaded
+ // TODO: Add data-testid to file upload input + setFile(e.target.files?.[0] ?? null)} + className="upload-input" + data-testid="file-upload-input" + /> + // TODO: Add data-testid to upload button label + +
+ ) : ( +
+ setFile(e.target.files?.[0] ?? null)} + className="file-input" + data-testid="file-upload-input" + /> + // TODO: Add data-testid to dataset name input + setName(e.target.value)} + placeholder="dataset name" + className="name-input" + data-testid="dataset-name-input" + /> + +
+ )} +
+ + {datasets?.datasets?.length > 0 && ( + // TODO: Add data-testid to datasets list +
+
+
+
Name
+
Rows×Cols
+
Memory
+
+ {datasets.datasets.map((d, i) => ( + // TODO: Add data-testid to each dataset row +
+
{d.name}
+
{d.shape?.[0]} × {d.shape?.[1]}
+
{d.memory_usage}
+
+ ))} +
+
+ )} + + // TODO: Add data-testid to resource usage section +
+

+ + System Resources +

+
+ // TODO: Add data-testid to CPU metric +
+
CPU Usage
+
23%
+
+
+
+
+ // TODO: Add data-testid to memory metric +
+
Memory
+
67%
+
+
+
+
+
+
+
+) +``` + +--- + +### 4. WorkflowsPanel Component (`dashboard-ui/src/main.jsx`) + +**File:** `dashboard-ui/src/main.jsx` +**Lines:** ~221-282 +**Component:** `WorkflowsPanel` + +#### Required Changes: + +```javascript +return ( + // TODO: Add data-testid to workflows container +
+ // TODO: Add data-testid to workflows header +
+ +

Workflows

+
+
+ {(runs?.runs ?? []).slice().reverse().slice(0, 6).map((workflow) => ( + // TODO: Add data-testid to each workflow card +
+
+
+ // TODO: Add data-testid to workflow ID +
+ {workflow.run_id.substring(0, 8)}... +
+ // TODO: Add data-testid to workflow status +
+ {workflow.status} +
+
+ // TODO: Add data-testid to workflow name +
+ Run {workflow.run_id.substring(0, 12)}... +
+
+
+ // TODO: Add data-testid to progress bar container + {getProgressBar(Math.round(workflow.progress), workflow.status)} + // TODO: Add data-testid to progress text + + {Math.round(workflow.progress)}% + +
+
+ ))} + {!runs?.runs?.length && ( + // TODO: Add data-testid to no workflows message +
+ + No workflows running +
+ )} +
+
+) +``` + +**Note:** The `getProgressBar` function also needs updating: + +```javascript +const getProgressBar = (progress, status) => { + const isRunning = status === 'RUNNING' + return ( + // TODO: Add data-testid to progress container +
+
+
+ ) +} +``` + +--- + +### 5. ProcessesPanel Component (`dashboard-ui/src/main.jsx`) + +**File:** `dashboard-ui/src/main.jsx` +**Lines:** ~284-361 +**Component:** `ProcessesPanel` + +#### Required Changes: + +```javascript +return ( + // TODO: Add data-testid to processes container +
+ // TODO: Add data-testid to processes header +
+ +

Background Processes

+
+
+ // TODO: Add data-testid to orchestrator process card +
+
+

Orchestrator

+ // TODO: Add data-testid to process status +
+ // TODO: Add data-testid to status dot +
+ + {!!orchestrator?.healthy ? 'healthy' : 'unhealthy'} + +
+
+
+
Agents: {Object.keys(orchestrator?.agents || {}).length}
+
+ Started: {orchestrator?.timestamp ? new Date(orchestrator.timestamp).toLocaleTimeString() : 'N/A'} +
+
+
+ + // TODO: Add data-testid to refinery process card +
+
+

Refinery

+
+
+ + {!!refinery?.healthy ? 'healthy' : 'unhealthy'} + +
+
+
+
FE Module: + {refinery?.fe_module_enabled ? 'enabled' : 'disabled'} +
+
Available: + {String(!!refinery?.fe_module_available)} +
+
+
+
+ + // TODO: Add data-testid to telemetry section +
+
+ +

EDA Telemetry

+
+ // TODO: Add data-testid to telemetry table +
+
+
Operation
+
Count
+
Errors
+
+ {edaOps.map((o) => ( + // TODO: Add data-testid to each telemetry row +
+
{o.op}
+
{o.count}
+
{o.errors}
+
+ ))} + {!edaOps.length && ( +
No telemetry data
+ )} +
+
+
+) +``` + +--- + +### 6. App Component (`dashboard-ui/src/main.jsx`) + +**File:** `dashboard-ui/src/main.jsx` +**Lines:** ~452-467 +**Component:** `App` + +#### Required Changes: + +```javascript +return ( + // TODO: Add data-testid to app container +
+
+ // TODO: Add data-testid to main layout +
+ // TODO: Add data-testid to main content area +
+ + +
+ // TODO: Add data-testid to sidebar +
+ + +
+
+
+) +``` + +--- + +## Summary of Changes Required + +| Component | File | Total Elements | Priority | +|-----------|------|----------------|----------| +| Header | `dashboard-ui/src/main.jsx` | 8 | High | +| ConsolePanel | `dashboard-ui/src/main.jsx` | 6 | High | +| DatasetsPanel | `dashboard-ui/src/main.jsx` | 11 | High | +| WorkflowsPanel | `dashboard-ui/src/main.jsx` | 9 | High | +| ProcessesPanel | `dashboard-ui/src/main.jsx` | 9 | Medium | +| App | `dashboard-ui/src/main.jsx` | 4 | High | +| **TOTAL** | | **47** | | + +--- + +## Implementation Guidelines + +1. **Naming Convention**: Use kebab-case for test IDs (e.g., `data-testid="console-submit"`) +2. **Uniqueness**: Ensure each test ID is unique within its screen/component +3. **Descriptiveness**: Test IDs should clearly describe the element's purpose +4. **Dynamic IDs**: For repeated elements (lists, cards), append index or unique identifier +5. **Consistency**: Follow the contract defined in `contracts/ui-test-ids.json` + +## Testing Recommendations + +After adding data-testid attributes: + +1. Validate all test IDs are present using the validation script +2. Write E2E tests using Playwright or Cypress targeting these test IDs +3. Ensure test IDs don't change unless the component is fundamentally restructured +4. Document any new test IDs in the contract file + +## Next Steps + +1. ✅ Contract authored in `contracts/ui-test-ids.json` +2. ⚠️ UI implementation pending - Add data-testid attributes to components +3. ⚠️ E2E test suite pending - Create tests using these test IDs +4. ⚠️ CI validation pending - Run validation script in CI/CD pipeline + +--- + +**Note:** This is a contract-only document. UI modifications should be made separately as part of the implementation phase. 
diff --git a/scripts/validate-contracts.js b/scripts/validate-contracts.js new file mode 100755 index 0000000..ee2cc65 --- /dev/null +++ b/scripts/validate-contracts.js @@ -0,0 +1,365 @@ +#!/usr/bin/env node +/** + * Contract Validation Script + * + * Validates that: + * 1. UI test ID contract (contracts/ui-test-ids.json) is valid JSON + * 2. API schema contract (contracts/api-schema.yaml) is valid YAML/OpenAPI + * 3. All referenced files exist + * 4. Contract structure follows expected format + * + * Exit codes: + * 0 - All validations passed + * 1 - Validation failures detected + */ + +const fs = require('fs'); +const path = require('path'); + +// Colors for terminal output +const colors = { + reset: '\x1b[0m', + red: '\x1b[31m', + green: '\x1b[32m', + yellow: '\x1b[33m', + blue: '\x1b[34m', + cyan: '\x1b[36m', +}; + +function log(message, color = colors.reset) { + console.log(`${color}${message}${colors.reset}`); +} + +function logError(message) { + log(`❌ ${message}`, colors.red); +} + +function logSuccess(message) { + log(`✅ ${message}`, colors.green); +} + +function logWarning(message) { + log(`⚠️ ${message}`, colors.yellow); +} + +function logInfo(message) { + log(`ℹ️ ${message}`, colors.cyan); +} + +function logSection(message) { + log(`\n${'='.repeat(60)}`, colors.blue); + log(message, colors.blue); + log('='.repeat(60), colors.blue); +} + +// Validation results +const validationResults = { + passed: 0, + failed: 0, + warnings: 0, + errors: [], +}; + +/** + * Validate UI Test IDs Contract + */ +function validateUITestIDsContract() { + logSection('Validating UI Test IDs Contract'); + + const contractPath = path.join(__dirname, '..', 'contracts', 'ui-test-ids.json'); + + // Check file exists + if (!fs.existsSync(contractPath)) { + logError(`Contract file not found: ${contractPath}`); + validationResults.failed++; + validationResults.errors.push('ui-test-ids.json not found'); + return false; + } + + logSuccess(`Contract file found: ${contractPath}`); + + // 
Parse JSON + let contract; + try { + const content = fs.readFileSync(contractPath, 'utf8'); + contract = JSON.parse(content); + logSuccess('Contract is valid JSON'); + validationResults.passed++; + } catch (error) { + logError(`Failed to parse JSON: ${error.message}`); + validationResults.failed++; + validationResults.errors.push('Invalid JSON in ui-test-ids.json'); + return false; + } + + // Validate structure + if (!contract.screens || typeof contract.screens !== 'object') { + logError('Contract missing required "screens" object'); + validationResults.failed++; + validationResults.errors.push('Invalid structure: missing screens object'); + return false; + } + + logSuccess(`Contract contains ${Object.keys(contract.screens).length} screen(s)`); + + // Validate each screen + let totalElements = 0; + let requiredElements = 0; + const duplicateIds = new Set(); + const allIds = new Set(); + + for (const [screenName, screen] of Object.entries(contract.screens)) { + if (!screen.elements || !Array.isArray(screen.elements)) { + logError(`Screen "${screenName}" missing elements array`); + validationResults.failed++; + validationResults.errors.push(`Invalid screen: ${screenName}`); + continue; + } + + logInfo(` Screen: ${screenName} (${screen.elements.length} elements)`); + totalElements += screen.elements.length; + + for (const element of screen.elements) { + if (!element.id) { + logWarning(` Element in ${screenName} missing "id" field`); + validationResults.warnings++; + continue; + } + + if (allIds.has(element.id)) { + duplicateIds.add(element.id); + } + allIds.add(element.id); + + if (element.required) { + requiredElements++; + } + + // Validate selector format + if (!element.selector || !element.selector.includes('data-testid')) { + logWarning(` Element "${element.id}" has invalid selector format`); + validationResults.warnings++; + } + } + } + + if (duplicateIds.size > 0) { + logWarning(`Found ${duplicateIds.size} duplicate test ID(s): 
${Array.from(duplicateIds).join(', ')}`); + validationResults.warnings += duplicateIds.size; + } + + logInfo(`Total elements: ${totalElements}`); + logInfo(`Required elements: ${requiredElements}`); + logInfo(`Optional elements: ${totalElements - requiredElements}`); + + validationResults.passed++; + return true; +} + +/** + * Validate API Schema Contract + */ +function validateAPISchemaContract() { + logSection('Validating API Schema Contract'); + + const contractPath = path.join(__dirname, '..', 'contracts', 'api-schema.yaml'); + + // Check file exists + if (!fs.existsSync(contractPath)) { + logError(`Contract file not found: ${contractPath}`); + validationResults.failed++; + validationResults.errors.push('api-schema.yaml not found'); + return false; + } + + logSuccess(`Contract file found: ${contractPath}`); + + // Basic YAML validation (check if it's readable and has expected structure) + let content; + try { + content = fs.readFileSync(contractPath, 'utf8'); + logSuccess('Contract file is readable'); + validationResults.passed++; + } catch (error) { + logError(`Failed to read file: ${error.message}`); + validationResults.failed++; + validationResults.errors.push('Failed to read api-schema.yaml'); + return false; + } + + // Basic structure validation (simple text search, no YAML parser dependency) + const requiredSections = [ + 'openapi:', + 'info:', + 'paths:', + 'components:', + 'schemas:', + ]; + + let missingSection = false; + for (const section of requiredSections) { + if (!content.includes(section)) { + logError(`Missing required section: ${section}`); + validationResults.failed++; + missingSection = true; + } + } + + if (!missingSection) { + logSuccess('All required OpenAPI sections present'); + validationResults.passed++; + } + + // Count endpoints + const pathMatches = content.match(/^\s{2}\/[^:]+:/gm) || []; + logInfo(`Found ${pathMatches.length} API endpoint(s)`); + + // Check for key endpoints + const keyEndpoints = [ + '/health', + '/datasets', + 
'/workflows/start', + '/load_data', + '/execute', + '/class_imbalance', + ]; + + for (const endpoint of keyEndpoints) { + if (content.includes(` ${endpoint}:`)) { + logInfo(` ✓ ${endpoint}`); + } else { + logWarning(` ? ${endpoint} not found`); + validationResults.warnings++; + } + } + + validationResults.passed++; + return true; +} + +/** + * Validate Directory Structure + */ +function validateDirectoryStructure() { + logSection('Validating Directory Structure'); + + const requiredDirs = [ + path.join(__dirname, '..', 'contracts'), + path.join(__dirname, '..', 'reports'), + path.join(__dirname, '..', 'scripts'), + ]; + + for (const dir of requiredDirs) { + if (fs.existsSync(dir)) { + logSuccess(`Directory exists: ${path.relative(path.join(__dirname, '..'), dir)}`); + validationResults.passed++; + } else { + logError(`Directory missing: ${path.relative(path.join(__dirname, '..'), dir)}`); + validationResults.failed++; + validationResults.errors.push(`Missing directory: ${path.basename(dir)}`); + } + } +} + +/** + * Validate Selector Adoption Report + */ +function validateSelectorAdoptionReport() { + logSection('Validating Selector Adoption Report'); + + const reportPath = path.join(__dirname, '..', 'reports', 'selector-adoption.md'); + + if (!fs.existsSync(reportPath)) { + logError(`Report file not found: ${reportPath}`); + validationResults.failed++; + validationResults.errors.push('selector-adoption.md not found'); + return false; + } + + logSuccess(`Report file found: ${reportPath}`); + + const content = fs.readFileSync(reportPath, 'utf8'); + + // Check for required sections + const requiredSections = [ + '# Selector Adoption Report', + '## TODO: Components Requiring data-testid Attributes', + 'dashboard-ui/src/main.jsx', + ]; + + let missingSection = false; + for (const section of requiredSections) { + if (!content.includes(section)) { + logWarning(`Report missing expected section: ${section}`); + validationResults.warnings++; + missingSection = true; + } + } + 
+ + if (!missingSection) { + logSuccess('All expected sections present in report'); + validationResults.passed++; + } + + // Count TODO items + const todoCount = (content.match(/\/\/ TODO:/g) || []).length; + logInfo(`Found ${todoCount} TODO comment(s) in report`); + + validationResults.passed++; + return true; +} + +/** + * Main validation function + */ +function main() { + log('\n╔═══════════════════════════════════════════════════════════╗', colors.cyan); + log('║ Contract Validation Script ║', colors.cyan); + log('║ Deepline Multi-Agent Data Scientist ║', colors.cyan); + log('╚═══════════════════════════════════════════════════════════╝\n', colors.cyan); + + // Run all validations + validateDirectoryStructure(); + validateUITestIDsContract(); + validateAPISchemaContract(); + validateSelectorAdoptionReport(); + + // Print summary + logSection('Validation Summary'); + + log(`Passed: ${validationResults.passed}`, colors.green); + log(`Failed: ${validationResults.failed}`, colors.red); + log(`Warnings: ${validationResults.warnings}`, colors.yellow); + + if (validationResults.errors.length > 0) { + log('\nErrors:', colors.red); + validationResults.errors.forEach(error => { + log(` • ${error}`, colors.red); + }); + } + + console.log(''); + + // The log helpers already prepend the status emoji, so messages omit it + if (validationResults.failed > 0) { + logError('Contract validation FAILED'); + process.exit(1); + } else if (validationResults.warnings > 0) { + logWarning('Contract validation PASSED with warnings'); + process.exit(0); + } else { + logSuccess('All contract validations PASSED'); + process.exit(0); + } +} + +// Run validation +if (require.main === module) { + main(); +} + +module.exports = { + validateUITestIDsContract, + validateAPISchemaContract, + validateDirectoryStructure, + validateSelectorAdoptionReport, +};
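The report's testing recommendations mention validating that all test IDs are present, while the script above reads only the contract files and the report, not the UI source. A minimal sketch of how such a check might look follows. `findMissingTestIds`, `escapeRegExp`, and the sample data are hypothetical names introduced for illustration and are not part of `scripts/validate-contracts.js`. The sketch assumes test IDs are written as static string attributes, so IDs built from template expressions would be reported as missing.

```javascript
// Sketch only: list contract test IDs that never appear in a UI source string.
// All names here are hypothetical, not part of scripts/validate-contracts.js.

/** Escape a string so it can be used literally inside a RegExp. */
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

/**
 * @param {object} contract - parsed contents of contracts/ui-test-ids.json
 * @param {string} sourceText - UI source code, e.g. dashboard-ui/src/main.jsx
 * @returns {{missingRequired: string[], missingOptional: string[]}}
 */
function findMissingTestIds(contract, sourceText) {
  const missingRequired = [];
  const missingOptional = [];

  for (const screen of Object.values(contract.screens || {})) {
    for (const element of screen.elements || []) {
      if (!element.id) continue;
      // Matches data-testid="id" or data-testid='id' (static attributes only).
      const pattern = new RegExp(
        `data-testid=["']${escapeRegExp(element.id)}["']`
      );
      if (pattern.test(sourceText)) continue;
      (element.required ? missingRequired : missingOptional).push(element.id);
    }
  }

  return { missingRequired, missingOptional };
}

// Usage example with an inline contract and source snippet (made-up data).
const sampleContract = {
  screens: {
    App: {
      elements: [
        { id: 'app-container', required: true },
        { id: 'sidebar', required: true },
        { id: 'no-workflows', required: false },
      ],
    },
  },
};
const sampleSource = '<div className="app" data-testid="app-container"></div>';
const result = findMissingTestIds(sampleContract, sampleSource);
console.log(result);
```

In a real run the contract would come from `JSON.parse(fs.readFileSync(...))` and the source from the JSX file. Treating missing required IDs as failures and missing optional IDs as warnings would mirror the pass/fail/warning split the script already uses.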