diff --git a/contracts/README.md b/contracts/README.md
new file mode 100644
index 0000000..387781a
--- /dev/null
+++ b/contracts/README.md
@@ -0,0 +1,239 @@
+# Deepline Contracts
+
+This directory contains contract definitions for the Deepline Multi-Agent Data Scientist system.
+
+## Contents
+
+### 1. UI Test IDs Contract (`ui-test-ids.json`)
+
+A comprehensive enumeration of all UI screens and critical elements with their designated `data-testid` attributes.
+
+**Purpose:**
+- Provides a single source of truth for UI test selectors
+- Enables reliable E2E testing with Playwright, Cypress, or similar tools
+- Documents the structure of the UI for test automation
+
+**Structure:**
+```json
+{
+  "screens": {
+    "ScreenName": {
+      "description": "Description of the screen",
+      "elements": [
+        {
+          "id": "test-id-name",
+          "description": "What this element is",
+          "selector": "[data-testid='test-id-name']",
+          "required": true
+        }
+      ]
+    }
+  }
+}
+```
+
+**Statistics:**
+- 6 screens defined
+- 47 UI elements documented
+- 46 required elements
+- 1 optional element
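+
+These figures can be recomputed from the contract file rather than maintained by hand. The following is a minimal sketch, assuming the file has already been parsed into an object; `summarizeContract` and `sampleContract` are hypothetical names used only for illustration and are not code from this repository.
+
+```javascript
+// Sketch only: summarizes a parsed ui-test-ids.json object.
+// `summarizeContract` is a hypothetical helper, not part of the repository.
+function summarizeContract(contract) {
+  const screens = Object.values(contract.screens);
+  const elements = screens.flatMap((screen) => screen.elements);
+  const required = elements.filter((el) => el.required).length;
+  return {
+    screens: screens.length,
+    elements: elements.length,
+    required,
+    optional: elements.length - required,
+  };
+}
+
+// Tiny inline sample in the contract's shape; real use would load the JSON file.
+const sampleContract = {
+  screens: {
+    WorkflowsPanel: {
+      description: 'Panel displaying active and completed workflows',
+      elements: [
+        { id: 'workflow-card', selector: "[data-testid='workflow-card']", required: true },
+        { id: 'no-workflows', selector: "[data-testid='no-workflows']", required: false },
+      ],
+    },
+  },
+};
+
+console.log(summarizeContract(sampleContract));
+```
+
+Run against the full `ui-test-ids.json` in this change, a helper like this reports 6 screens and 47 elements (46 required, 1 optional).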
+
+**Screens:**
+1. **Header** - Main navigation header with agent tabs
+2. **ConsolePanel** - Natural language console interface
+3. **DatasetsPanel** - Dataset management and upload
+4. **WorkflowsPanel** - Workflow execution monitoring
+5. **ProcessesPanel** - Background processes and health
+6. **App** - Root application structure
+
+### 2. API Schema Contract (`api-schema.yaml`)
+
+OpenAPI 3.0.3 specification documenting all REST API endpoints across the microservices architecture.
+
+**Purpose:**
+- Defines request/response types for all API endpoints
+- Enables API client generation and validation
+- Documents the complete API surface for integration
+
+**Coverage:**
+- **Master Orchestrator** (port 8000): 10 endpoints (including `/` and `/health`)
+- **EDA Agent** (port 8001): 8 endpoints
+- **Refinery Agent** (port 8005): 3 endpoints
+- **ML Agent** (port 8002): 5 endpoints
+- **Total:** 26 endpoints documented
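+
+Endpoint counts like these can be derived from the spec itself. Below is a rough sketch, assuming the YAML has already been parsed into a plain object by a YAML library of your choice; `countOperationsByTag` and `sampleSpec` are hypothetical names. It groups by each operation's first tag, so `/health` (tagged `health`) is counted separately from the orchestrator, and `/metrics` (tagged `refinery, health`) is counted under `refinery`.
+
+```javascript
+// Sketch only: counts OpenAPI operations per first tag.
+// `countOperationsByTag` is a hypothetical helper, not part of the repository.
+const HTTP_METHODS = ['get', 'put', 'post', 'delete', 'patch', 'options', 'head', 'trace'];
+
+function countOperationsByTag(spec) {
+  const counts = {};
+  for (const pathItem of Object.values(spec.paths)) {
+    for (const method of HTTP_METHODS) {
+      const operation = pathItem[method];
+      if (!operation) continue;
+      const tag = (operation.tags && operation.tags[0]) || 'untagged';
+      counts[tag] = (counts[tag] || 0) + 1;
+    }
+  }
+  return counts;
+}
+
+// Inline excerpt in the shape of api-schema.yaml after parsing.
+const sampleSpec = {
+  paths: {
+    '/workflows/start': { post: { tags: ['orchestrator'] } },
+    '/runs/{run_id}': { delete: { tags: ['orchestrator'] } },
+    '/load_data': { post: { tags: ['eda'] } },
+    '/metrics': { get: { tags: ['refinery', 'health'] } },
+  },
+};
+
+console.log(countOperationsByTag(sampleSpec));
+```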
+
+**Key Endpoints:**
+
+Service | Endpoint | Method | Purpose
+--------|----------|--------|--------
+Orchestrator | `/workflows/start` | POST | Start new workflow
+Orchestrator | `/datasets/upload` | POST | Upload dataset
+Orchestrator | `/runs/{run_id}/status` | GET | Get run status
+EDA | `/load_data` | POST | Load dataset into memory
+EDA | `/basic_info` | POST | Get dataset info
+EDA | `/statistical_summary` | POST | Compute statistics
+EDA | `/create_visualization` | POST | Generate charts
+Refinery | `/execute` | POST | Execute DQ/FE task
+ML | `/class_imbalance` | POST | Handle imbalance
+ML | `/baseline_sanity` | POST | Train baseline models
+
+**Schemas:**
+The contract defines comprehensive request/response schemas including:
+- `WorkflowRequest` / `WorkflowResponse`
+- `LoadDataRequest` / `LoadDataResponse`
+- `TaskRequest` / `TaskResponse`
+- `MLResponse`
+- 21 more schemas (28 in total)
+
+## Validation
+
+Use the validation script to ensure contracts are valid:
+
+```bash
+node scripts/validate-contracts.js
+```
+
+The validation script checks:
+- ✅ JSON/YAML validity
+- ✅ Required structure present
+- ✅ No duplicate test IDs
+- ✅ All expected sections exist
+- ✅ File references are correct
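+
+As an illustration of the duplicate-ID check, here is a minimal sketch. It is not the contents of `scripts/validate-contracts.js`, which this README does not show; `findDuplicateTestIds` and the sample data are assumptions made for the example.
+
+```javascript
+// Sketch only: reports test IDs that appear more than once across screens.
+// `findDuplicateTestIds` is a hypothetical helper, not the real validation script.
+function findDuplicateTestIds(contract) {
+  const firstSeenIn = new Map(); // id -> name of the screen where it first appeared
+  const duplicates = [];
+  for (const [screenName, screen] of Object.entries(contract.screens)) {
+    for (const element of screen.elements) {
+      if (firstSeenIn.has(element.id)) {
+        duplicates.push({ id: element.id, screens: [firstSeenIn.get(element.id), screenName] });
+      } else {
+        firstSeenIn.set(element.id, screenName);
+      }
+    }
+  }
+  return duplicates;
+}
+
+// Deliberately broken sample: `status-dot` is declared on two screens.
+const contractWithDuplicate = {
+  screens: {
+    Header: { elements: [{ id: 'status-dot' }, { id: 'brand-logo' }] },
+    ProcessesPanel: { elements: [{ id: 'status-dot' }] },
+  },
+};
+
+console.log(findDuplicateTestIds(contractWithDuplicate));
+```
+
+A non-empty result would be treated as a validation failure, since a duplicated ID makes a selector ambiguous.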
+
+## Usage
+
+### For Frontend Developers
+
+Use the UI test IDs contract when adding `data-testid` attributes:
+
+```javascript
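+// Illustrative only: `handleSubmit` and the button label are placeholders.
+// ❌ Before
+<button onClick={handleSubmit}>Run</button>
+
+// ✅ After (following contract)
+<button data-testid="console-submit" onClick={handleSubmit}>Run</button>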
+```
+
+See `/reports/selector-adoption.md` for a complete TODO list.
+
+### For QA/Test Engineers
+
+Use test IDs from the contract in your E2E tests:
+
+```javascript
+// Playwright example
+await page.getByTestId('console-prompt').fill('load iris dataset');
+await page.getByTestId('console-submit').click();
+await expect(page.getByTestId('console-output')).toContainText('Success');
+```
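+
+To catch typos in test IDs before a test runs, the IDs can be looked up in the contract instead of being typed as free strings. This is a small sketch under the assumption that the contract has been parsed into an object; `createTestIdLookup` and `testId` are hypothetical names, not an existing helper in this repository.
+
+```javascript
+// Sketch only: returns a function that accepts only IDs declared in the contract.
+// `createTestIdLookup` is a hypothetical helper.
+function createTestIdLookup(contract) {
+  const known = new Set(
+    Object.values(contract.screens).flatMap((screen) => screen.elements.map((el) => el.id))
+  );
+  return function testId(id) {
+    if (!known.has(id)) {
+      throw new Error(`Unknown test ID: ${id}`);
+    }
+    return id;
+  };
+}
+
+// Inline excerpt of the ConsolePanel screen from the contract.
+const consoleContract = {
+  screens: {
+    ConsolePanel: {
+      elements: [{ id: 'console-prompt' }, { id: 'console-submit' }, { id: 'console-output' }],
+    },
+  },
+};
+
+const testId = createTestIdLookup(consoleContract);
+console.log(testId('console-prompt'));
+// In a Playwright test this could be used as:
+//   await page.getByTestId(testId('console-prompt')).fill('load iris dataset');
+```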
+
+### For Backend Developers
+
+Use the API schema when implementing or consuming endpoints:
+
+```python
+# The contract documents that /load_data expects:
+{
+    "path": str,       # Required
+    "name": str,       # Required
+    "file_type": str   # Optional, one of: csv, xlsx, json
+}
+
+# And returns:
+{
+    "name": str,
+    "rows": int,
+    "cols": int,
+    "dtypes": dict,
+    "memory_usage": str,
+    "sample_preview": list
+}
+```
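+
+The same request rules can be checked on the client before a call is made. The following sketch mirrors the `LoadDataRequest` schema (required `path` and `name`, optional `file_type` limited to `csv`, `xlsx`, `json`); `validateLoadDataRequest` is a hypothetical function written for this example, and a schema-driven validator generated from the OpenAPI file would be the more robust choice.
+
+```javascript
+// Sketch only: hand-written check that mirrors the LoadDataRequest schema.
+// `validateLoadDataRequest` is a hypothetical helper.
+const FILE_TYPES = ['csv', 'xlsx', 'json'];
+
+function validateLoadDataRequest(body) {
+  const errors = [];
+  for (const field of ['path', 'name']) {
+    if (typeof body[field] !== 'string') {
+      errors.push(`${field} is required and must be a string`);
+    }
+  }
+  if (body.file_type !== undefined && !FILE_TYPES.includes(body.file_type)) {
+    errors.push(`file_type must be one of: ${FILE_TYPES.join(', ')}`);
+  }
+  return errors; // an empty array means the body matches the contract
+}
+
+console.log(validateLoadDataRequest({ path: 'data/iris.csv', name: 'iris' }));
+console.log(validateLoadDataRequest({ name: 'iris', file_type: 'parquet' }));
+```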
+
+### For API Consumers
+
+Generate API clients from the OpenAPI schema:
+
+```bash
+# Generate TypeScript client
+openapi-generator-cli generate \
+  -i contracts/api-schema.yaml \
+  -g typescript-axios \
+  -o client/typescript
+
+# Generate Python client
+openapi-generator-cli generate \
+  -i contracts/api-schema.yaml \
+  -g python \
+  -o client/python
+```
+
+## CI/CD Integration
+
+Add contract validation to your CI pipeline:
+
+```yaml
+# .github/workflows/contracts.yml
+name: Validate Contracts
+on: [push, pull_request]
+
+jobs:
+  validate:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+      - name: Validate contracts
+        run: node scripts/validate-contracts.js
+```
+
+## Maintaining Contracts
+
+### When to Update UI Test IDs Contract:
+
+1. **New component added** - Add screen definition with elements
+2. **Element added to existing component** - Add to elements array
+3. **Element removed** - Remove from contract (breaking change)
+4. **Test ID renamed** - Update `id` field (breaking change)
+
+### When to Update API Schema Contract:
+
+1. **New endpoint added** - Add path definition with method
+2. **Endpoint modified** - Update request/response schemas
+3. **Endpoint deprecated** - Mark with `deprecated: true`
+4. **Schema changed** - Update component schemas (may be breaking)
+
+### Versioning:
+
+Both contracts follow semantic versioning:
+- **Major** (1.0.0 → 2.0.0): Breaking changes (renamed IDs, removed endpoints)
+- **Minor** (1.0.0 → 1.1.0): New additions (new elements, new endpoints)
+- **Patch** (1.0.0 → 1.0.1): Clarifications, typo fixes (non-breaking)
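+
+For the UI contract, these rules can be approximated mechanically by comparing the sets of test IDs in two versions. The sketch below is an assumption about how such a check could look, not an existing tool in this repository; `collectIds` and `requiredBump` are hypothetical names. A renamed ID appears as one removal plus one addition, so it is classified as a major change.
+
+```javascript
+// Sketch only: suggests the minimum semantic-version bump between two
+// parsed versions of the UI test IDs contract. Hypothetical helpers.
+function collectIds(contract) {
+  return new Set(
+    Object.values(contract.screens).flatMap((screen) => screen.elements.map((el) => el.id))
+  );
+}
+
+function requiredBump(oldContract, newContract) {
+  const oldIds = collectIds(oldContract);
+  const newIds = collectIds(newContract);
+  const removed = [...oldIds].filter((id) => !newIds.has(id));
+  const added = [...newIds].filter((id) => !oldIds.has(id));
+  if (removed.length > 0) return 'major'; // removed or renamed IDs break tests
+  if (added.length > 0) return 'minor'; // new IDs are backwards compatible
+  return 'patch'; // same IDs: descriptions or typo fixes only
+}
+
+const makeContract = (ids) => ({
+  screens: { ConsolePanel: { elements: ids.map((id) => ({ id })) } },
+});
+
+const v1 = makeContract(['console-prompt', 'console-submit']);
+const withAddition = makeContract(['console-prompt', 'console-submit', 'console-output']);
+const withRename = makeContract(['console-prompt', 'console-send']);
+
+console.log(requiredBump(v1, withAddition));
+console.log(requiredBump(v1, withRename));
+```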
+
+## Related Documentation
+
+- `/reports/selector-adoption.md` - TODO list for UI test ID adoption
+- `/dashboard-ui/src/main.jsx` - Main UI component source
+- `/mcp-server/` - Backend service implementations
+
+## Support
+
+For questions about contracts:
+1. Check the validation script output for errors
+2. Review related code in `/dashboard-ui` or `/mcp-server`
+3. See `/reports/selector-adoption.md` for implementation guidance
+
+---
+
+**Last Updated:** 2025-10-13
+**Contract Version:** 1.0.0
diff --git a/contracts/api-schema.yaml b/contracts/api-schema.yaml
new file mode 100644
index 0000000..397f4bc
--- /dev/null
+++ b/contracts/api-schema.yaml
@@ -0,0 +1,1012 @@
+openapi: 3.0.3
+info:
+  title: Deepline Multi-Agent Data Scientist API
+  description: |
+    Comprehensive API schema for the Deepline Master Orchestrator and its microservice agents.
+    Includes EDA Agent, Refinery Agent, ML Agent, and Master Orchestrator endpoints.
+  version: 1.0.0
+  contact:
+    name: Deepline Team
+  license:
+    name: BUSL-1.1
+
+servers:
+  - url: http://localhost:8000
+    description: Master Orchestrator (default)
+  - url: http://localhost:8001
+    description: EDA Agent
+  - url: http://localhost:8005
+    description: Refinery Agent
+  - url: http://localhost:8002
+    description: ML Agent
+
+tags:
+  - name: orchestrator
+    description: Master orchestrator workflow management
+  - name: eda
+    description: Exploratory Data Analysis operations
+  - name: refinery
+    description: Data quality and feature engineering
+  - name: ml
+    description: Machine learning training and experiments
+  - name: health
+    description: Health check endpoints
+
+paths:
+  # Master Orchestrator Endpoints
+  /:
+    get:
+      tags: [orchestrator]
+      summary: Root endpoint
+      description: Returns API information and status
+      responses:
+        '200':
+          description: API information
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  message:
+                    type: string
+                  version:
+                    type: string
+
+  /health:
+    get:
+      tags: [health]
+      summary: Health check endpoint
+      description: Returns health status of the service
+      responses:
+        '200':
+          description: Service is healthy
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/HealthResponse'
+
+  /datasets/upload:
+    post:
+      tags: [orchestrator]
+      summary: Upload dataset
+      description: Upload a dataset file for analysis
+      requestBody:
+        required: true
+        content:
+          multipart/form-data:
+            schema:
+              type: object
+              required:
+                - file
+                - name
+              properties:
+                file:
+                  type: string
+                  format: binary
+                  description: Dataset file (CSV, XLSX, JSON)
+                name:
+                  type: string
+                  description: Name for the dataset
+                license:
+                  type: string
+                  description: Optional license information
+                uploader:
+                  type: string
+                  description: Optional uploader identifier
+      responses:
+        '200':
+          description: Dataset uploaded successfully
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  dataset_name:
+                    type: string
+                  filename:
+                    type: string
+                  snapshot:
+                    type: object
+                  pii_detected:
+                    type: boolean
+                  pii_matches_count:
+                    type: integer
+                  message:
+                    type: string
+
+  /datasets:
+    get:
+      tags: [orchestrator]
+      summary: List datasets
+      description: Get a list of all loaded datasets
+      responses:
+        '200':
+          description: List of datasets
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  datasets:
+                    type: array
+                    items:
+                      $ref: '#/components/schemas/Dataset'
+
+  /workflows/start:
+    post:
+      tags: [orchestrator]
+      summary: Start a workflow
+      description: Start a new workflow with specified tasks
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/WorkflowRequest'
+      responses:
+        '200':
+          description: Workflow started successfully
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/WorkflowResponse'
+
+  /runs:
+    get:
+      tags: [orchestrator]
+      summary: List all workflow runs
+      description: Get a list of all workflow runs
+      responses:
+        '200':
+          description: List of runs
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  runs:
+                    type: array
+                    items:
+                      $ref: '#/components/schemas/RunStatus'
+
+  /runs/{run_id}/status:
+    get:
+      tags: [orchestrator]
+      summary: Get run status
+      description: Get the status of a specific workflow run
+      parameters:
+        - name: run_id
+          in: path
+          required: true
+          schema:
+            type: string
+      responses:
+        '200':
+          description: Run status
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/RunStatus'
+
+  /runs/{run_id}/artifacts:
+    get:
+      tags: [orchestrator]
+      summary: List run artifacts
+      description: Get all artifacts for a specific run
+      parameters:
+        - name: run_id
+          in: path
+          required: true
+          schema:
+            type: string
+      responses:
+        '200':
+          description: List of artifacts
+          content:
+            application/json:
+              schema:
+                type: array
+                items:
+                  $ref: '#/components/schemas/Artifact'
+
+  /artifacts/{run_id}/{filename}:
+    get:
+      tags: [orchestrator]
+      summary: Download artifact
+      description: Download a specific artifact file
+      parameters:
+        - name: run_id
+          in: path
+          required: true
+          schema:
+            type: string
+        - name: filename
+          in: path
+          required: true
+          schema:
+            type: string
+      responses:
+        '200':
+          description: Artifact file
+          content:
+            application/octet-stream:
+              schema:
+                type: string
+                format: binary
+
+  /runs/{run_id}:
+    delete:
+      tags: [orchestrator]
+      summary: Delete workflow run
+      description: Delete a workflow run and its artifacts
+      parameters:
+        - name: run_id
+          in: path
+          required: true
+          schema:
+            type: string
+      responses:
+        '200':
+          description: Run deleted successfully
+
+  # EDA Agent Endpoints
+  /load_data:
+    post:
+      tags: [eda]
+      summary: Load dataset
+      description: Load a dataset from file into memory
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/LoadDataRequest'
+      responses:
+        '200':
+          description: Dataset loaded successfully
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/LoadDataResponse'
+
+  /basic_info:
+    post:
+      tags: [eda]
+      summary: Get basic dataset information
+      description: Get shape, data types, memory usage, and preview
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/BasicInfoRequest'
+      responses:
+        '200':
+          description: Basic info retrieved
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/BasicInfoResponse'
+
+  /statistical_summary:
+    post:
+      tags: [eda]
+      summary: Get statistical summary
+      description: Get descriptive statistics, correlations, skewness, and kurtosis
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/StatisticalSummaryRequest'
+      responses:
+        '200':
+          description: Statistical summary computed
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/StatisticalSummaryResponse'
+
+  /missing_data_analysis:
+    post:
+      tags: [eda]
+      summary: Analyze missing data
+      description: Analyze missing data patterns and get recommendations
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/MissingDataRequest'
+      responses:
+        '200':
+          description: Missing data analysis complete
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/MissingDataResult'
+
+  /create_visualization:
+    post:
+      tags: [eda]
+      summary: Create visualization
+      description: Create data visualizations (histogram, boxplot, correlation, etc.)
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/VisualizationRequest'
+      responses:
+        '200':
+          description: Visualization created
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/VisualizationResponse'
+
+  /infer_schema:
+    post:
+      tags: [eda]
+      summary: Infer data schema
+      description: Infer data types and schema with confidence scores
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/SchemaInferenceRequest'
+      responses:
+        '200':
+          description: Schema inferred
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/SchemaInferenceResponse'
+
+  /detect_outliers:
+    post:
+      tags: [eda]
+      summary: Detect outliers
+      description: Detect outliers using IQR, Isolation Forest, or LOF methods
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/OutlierDetectionRequest'
+      responses:
+        '200':
+          description: Outliers detected
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/OutlierResult'
+
+  /datasets/{name}:
+    delete:
+      tags: [eda]
+      summary: Delete dataset
+      description: Delete a dataset from memory
+      parameters:
+        - name: name
+          in: path
+          required: true
+          schema:
+            type: string
+      responses:
+        '200':
+          description: Dataset deleted
+
+  # Refinery Agent Endpoints
+  /execute:
+    post:
+      tags: [refinery]
+      summary: Execute refinery task
+      description: Execute a data quality check or feature engineering task
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/TaskRequest'
+      responses:
+        '200':
+          description: Task executed
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/TaskResponse'
+
+  /metrics:
+    get:
+      tags: [refinery, health]
+      summary: Get Prometheus metrics
+      description: Get Prometheus-formatted metrics
+      responses:
+        '200':
+          description: Prometheus metrics
+          content:
+            text/plain:
+              schema:
+                type: string
+
+  /pipelines:
+    get:
+      tags: [refinery]
+      summary: List active pipelines
+      description: Get list of active feature engineering pipelines
+      responses:
+        '200':
+          description: List of pipelines
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  pipelines:
+                    type: array
+                    items:
+                      type: object
+
+  # ML Agent Endpoints
+  /class_imbalance:
+    post:
+      tags: [ml]
+      summary: Handle class imbalance
+      description: Apply sampling techniques to handle class imbalance
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/ClassImbalanceRequest'
+      responses:
+        '200':
+          description: Class imbalance handled
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/MLResponse'
+
+  /train_validation_test:
+    post:
+      tags: [ml]
+      summary: Split data for training
+      description: Split data into training, validation, and test sets
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/TrainValidationTestRequest'
+      responses:
+        '200':
+          description: Data split complete
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/MLResponse'
+
+  /baseline_sanity:
+    post:
+      tags: [ml]
+      summary: Run baseline models
+      description: Train baseline models for sanity checking
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/BaselineSanityRequest'
+      responses:
+        '200':
+          description: Baseline models trained
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/MLResponse'
+
+  /experiment_tracking:
+    post:
+      tags: [ml]
+      summary: Track ML experiment
+      description: Track machine learning experiments with MLflow
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/ExperimentTrackingRequest'
+      responses:
+        '200':
+          description: Experiment tracked
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/MLResponse'
+
+  /experiments:
+    get:
+      tags: [ml]
+      summary: List experiments
+      description: Get list of all ML experiments
+      responses:
+        '200':
+          description: List of experiments
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  experiments:
+                    type: array
+                    items:
+                      type: object
+
+components:
+  schemas:
+    HealthResponse:
+      type: object
+      properties:
+        healthy:
+          type: boolean
+        timestamp:
+          type: string
+          format: date-time
+        agents:
+          type: object
+        telemetry:
+          type: object
+        fe_module_enabled:
+          type: boolean
+        fe_module_available:
+          type: boolean
+
+    Dataset:
+      type: object
+      properties:
+        name:
+          type: string
+        shape:
+          type: array
+          items:
+            type: integer
+        memory_usage:
+          type: string
+        dtypes:
+          type: object
+
+    WorkflowRequest:
+      type: object
+      required:
+        - run_name
+        - tasks
+      properties:
+        run_name:
+          type: string
+        tasks:
+          type: array
+          items:
+            $ref: '#/components/schemas/Task'
+        priority:
+          type: integer
+          default: 1
+
+    Task:
+      type: object
+      required:
+        - agent
+        - action
+        - args
+      properties:
+        agent:
+          type: string
+          enum: [eda_agent, refinery_agent, ml_agent]
+        action:
+          type: string
+        args:
+          type: object
+
+    WorkflowResponse:
+      type: object
+      properties:
+        run_id:
+          type: string
+        status:
+          type: string
+        message:
+          type: string
+
+    RunStatus:
+      type: object
+      properties:
+        run_id:
+          type: string
+        status:
+          type: string
+          enum: [QUEUED, RUNNING, COMPLETED, FAILED]
+        progress:
+          type: number
+          format: float
+        current_task:
+          type: string
+          nullable: true
+        start_time:
+          type: string
+        end_time:
+          type: string
+          nullable: true
+        error_message:
+          type: string
+          nullable: true
+
+    Artifact:
+      type: object
+      properties:
+        artifact_id:
+          type: string
+        type:
+          type: string
+        filename:
+          type: string
+        size:
+          type: integer
+        created_at:
+          type: string
+        download_url:
+          type: string
+          nullable: true
+
+    LoadDataRequest:
+      type: object
+      required:
+        - path
+        - name
+      properties:
+        path:
+          type: string
+        name:
+          type: string
+        file_type:
+          type: string
+          enum: [csv, xlsx, json]
+
+    LoadDataResponse:
+      type: object
+      properties:
+        name:
+          type: string
+        rows:
+          type: integer
+        cols:
+          type: integer
+        dtypes:
+          type: object
+        memory_usage:
+          type: string
+        sample_preview:
+          type: array
+          items:
+            type: object
+
+    BasicInfoRequest:
+      type: object
+      required:
+        - name
+      properties:
+        name:
+          type: string
+
+    BasicInfoResponse:
+      type: object
+      properties:
+        shape:
+          type: array
+          items:
+            type: integer
+        dtypes:
+          type: object
+        memory_usage:
+          type: string
+        preview:
+          type: array
+          items:
+            type: object
+        null_counts:
+          type: object
+
+    StatisticalSummaryRequest:
+      type: object
+      required:
+        - name
+      properties:
+        name:
+          type: string
+        sample_size:
+          type: integer
+          default: 10000
+
+    StatisticalSummaryResponse:
+      type: object
+      properties:
+        descriptive_stats:
+          type: object
+        correlation_matrix:
+          type: object
+        skewness:
+          type: object
+        kurtosis:
+          type: object
+
+    MissingDataRequest:
+      type: object
+      required:
+        - name
+      properties:
+        name:
+          type: string
+
+    MissingDataResult:
+      type: object
+      properties:
+        missing_counts:
+          type: object
+        missing_percentages:
+          type: object
+        missing_patterns:
+          type: array
+          items:
+            type: object
+        recommendations:
+          type: array
+          items:
+            type: string
+        visualization_base64:
+          type: string
+          nullable: true
+
+    VisualizationRequest:
+      type: object
+      required:
+        - name
+        - chart_type
+      properties:
+        name:
+          type: string
+        chart_type:
+          type: string
+          enum: [histogram, boxplot, correlation, scatter, line, bar]
+        columns:
+          type: array
+          items:
+            type: string
+        sample_size:
+          type: integer
+          default: 10000
+
+    VisualizationResponse:
+      type: object
+      properties:
+        chart_type:
+          type: string
+        columns:
+          type: array
+          items:
+            type: string
+        base64_image:
+          type: string
+        metadata:
+          type: object
+
+    SchemaInferenceRequest:
+      type: object
+      required:
+        - name
+      properties:
+        name:
+          type: string
+        confidence_threshold:
+          type: number
+          format: float
+          default: 0.8
+
+    SchemaInferenceResponse:
+      type: object
+      properties:
+        schema:
+          type: object
+        confidence_scores:
+          type: object
+        recommendations:
+          type: array
+          items:
+            type: string
+        yaml_schema:
+          type: string
+
+    OutlierDetectionRequest:
+      type: object
+      required:
+        - name
+      properties:
+        name:
+          type: string
+        method:
+          type: string
+          enum: [iqr, isolation_forest, lof]
+          default: iqr
+        columns:
+          type: array
+          items:
+            type: string
+        contamination:
+          type: number
+          format: float
+          default: 0.1
+
+    OutlierResult:
+      type: object
+      properties:
+        method:
+          type: string
+        outlier_indices:
+          type: array
+          items:
+            type: integer
+        outlier_count:
+          type: integer
+        outlier_percentage:
+          type: number
+          format: float
+        columns_analyzed:
+          type: array
+          items:
+            type: string
+        scores:
+          type: object
+          nullable: true
+
+    TaskRequest:
+      type: object
+      required:
+        - task_id
+        - action
+      properties:
+        task_id:
+          type: string
+        action:
+          type: string
+          enum:
+            - check_schema_consistency
+            - check_missing_values
+            - check_distributions
+            - check_duplicates
+            - check_leakage
+            - check_drift
+            - comprehensive_quality_report
+            - assign_feature_roles
+            - basic_impute_missing_values
+            - basic_scale_numeric_features
+            - basic_encode_categorical_features
+            - basic_generate_datetime_features
+            - basic_vectorise_text_features
+            - basic_generate_interactions
+            - basic_select_features
+            - save_fe_pipeline
+            - execute_feature_pipeline
+            - advanced_impute_missing_values
+            - advanced_encode_categorical_features
+            - advanced_feature_selection
+            - feature_interactions
+            - pipeline_persistence
+        mode:
+          type: string
+          enum: [data_quality, feature_engineering]
+        backend:
+          type: string
+          enum: [refinery_basic, fe_module]
+        params:
+          type: object
+
+    TaskResponse:
+      type: object
+      properties:
+        task_id:
+          type: string
+        success:
+          type: boolean
+        mode:
+          type: string
+        backend:
+          type: string
+        result:
+          type: object
+          nullable: true
+        error:
+          type: string
+          nullable: true
+        execution_time:
+          type: number
+          format: float
+        timestamp:
+          type: number
+          format: float
+
+    ClassImbalanceRequest:
+      type: object
+      required:
+        - dataset_name
+      properties:
+        dataset_name:
+          type: string
+        method:
+          type: string
+          enum: [smote, adasyn, random_under, tomek, smoteenn, smotetomek]
+        target_column:
+          type: string
+
+    TrainValidationTestRequest:
+      type: object
+      required:
+        - dataset_name
+      properties:
+        dataset_name:
+          type: string
+        test_size:
+          type: number
+          format: float
+          default: 0.2
+        validation_size:
+          type: number
+          format: float
+          default: 0.2
+        stratify:
+          type: boolean
+          default: true
+
+    BaselineSanityRequest:
+      type: object
+      required:
+        - dataset_name
+        - target_column
+      properties:
+        dataset_name:
+          type: string
+        target_column:
+          type: string
+        models:
+          type: array
+          items:
+            type: string
+          default: [logistic_regression, decision_tree, random_forest]
+
+    ExperimentTrackingRequest:
+      type: object
+      required:
+        - experiment_name
+      properties:
+        experiment_name:
+          type: string
+        parameters:
+          type: object
+        metrics:
+          type: object
+        artifacts:
+          type: array
+          items:
+            type: string
+
+    MLResponse:
+      type: object
+      properties:
+        success:
+          type: boolean
+        result:
+          type: object
+        error:
+          type: string
+          nullable: true
+        execution_time:
+          type: number
+          format: float
diff --git a/contracts/ui-test-ids.json b/contracts/ui-test-ids.json
new file mode 100644
index 0000000..c3ff044
--- /dev/null
+++ b/contracts/ui-test-ids.json
@@ -0,0 +1,320 @@
+{
+ "$schema": "http://json-schema.org/draft-07/schema#",
+ "title": "UI Test IDs Contract",
+ "description": "Enumeration of screens and critical elements with data-testid attributes for E2E testing",
+ "version": "1.0.0",
+ "screens": {
+ "Header": {
+ "description": "Main navigation header with agent tabs and system status",
+ "elements": [
+ {
+ "id": "header-container",
+ "description": "Main header container",
+ "selector": "[data-testid='header-container']",
+ "required": true
+ },
+ {
+ "id": "brand-logo",
+ "description": "DEEPLINE brand logo/title",
+ "selector": "[data-testid='brand-logo']",
+ "required": true
+ },
+ {
+ "id": "nav-orchestrator",
+ "description": "Orchestrator agent navigation tab",
+ "selector": "[data-testid='nav-orchestrator']",
+ "required": true
+ },
+ {
+ "id": "nav-eda",
+ "description": "EDA agent navigation tab",
+ "selector": "[data-testid='nav-eda']",
+ "required": true
+ },
+ {
+ "id": "nav-refinery",
+ "description": "Refinery agent navigation tab",
+ "selector": "[data-testid='nav-refinery']",
+ "required": true
+ },
+ {
+ "id": "nav-ml",
+ "description": "ML agent navigation tab",
+ "selector": "[data-testid='nav-ml']",
+ "required": true
+ },
+ {
+ "id": "system-status",
+ "description": "Overall system health status indicator",
+ "selector": "[data-testid='system-status']",
+ "required": true
+ },
+ {
+ "id": "status-dot",
+ "description": "System status indicator dot",
+ "selector": "[data-testid='status-dot']",
+ "required": true
+ }
+ ]
+ },
+ "ConsolePanel": {
+ "description": "Natural language console for submitting prompts to the orchestrator",
+ "elements": [
+ {
+ "id": "console-container",
+ "description": "Console panel container",
+ "selector": "[data-testid='console-container']",
+ "required": true
+ },
+ {
+ "id": "console-header",
+ "description": "Console header with icon and title",
+ "selector": "[data-testid='console-header']",
+ "required": true
+ },
+ {
+ "id": "console-output",
+ "description": "Console output area showing command results",
+ "selector": "[data-testid='console-output']",
+ "required": true
+ },
+ {
+ "id": "console-line",
+ "description": "Individual console output line",
+ "selector": "[data-testid='console-line']",
+ "required": true
+ },
+ {
+ "id": "console-prompt",
+ "description": "Console input field for natural language prompts",
+ "selector": "[data-testid='console-prompt']",
+ "required": true
+ },
+ {
+ "id": "console-submit",
+ "description": "Console submit button",
+ "selector": "[data-testid='console-submit']",
+ "required": true
+ }
+ ]
+ },
+ "DatasetsPanel": {
+ "description": "Panel showing uploaded datasets and upload controls",
+ "elements": [
+ {
+ "id": "datasets-container",
+ "description": "Datasets panel container",
+ "selector": "[data-testid='datasets-container']",
+ "required": true
+ },
+ {
+ "id": "datasets-header",
+ "description": "Datasets header with icon and title",
+ "selector": "[data-testid='datasets-header']",
+ "required": true
+ },
+ {
+ "id": "upload-area",
+ "description": "Upload area for new datasets",
+ "selector": "[data-testid='upload-area']",
+ "required": true
+ },
+ {
+ "id": "file-upload-input",
+ "description": "Hidden file input for dataset upload",
+ "selector": "[data-testid='file-upload-input']",
+ "required": true
+ },
+ {
+ "id": "upload-button",
+ "description": "Button to trigger file upload",
+ "selector": "[data-testid='upload-button']",
+ "required": true
+ },
+ {
+ "id": "dataset-name-input",
+ "description": "Input field for dataset name",
+ "selector": "[data-testid='dataset-name-input']",
+ "required": true
+ },
+ {
+ "id": "datasets-list",
+ "description": "List of uploaded datasets",
+ "selector": "[data-testid='datasets-list']",
+ "required": true
+ },
+ {
+ "id": "dataset-row",
+ "description": "Individual dataset row in the table",
+ "selector": "[data-testid='dataset-row']",
+ "required": true
+ },
+ {
+ "id": "resource-usage",
+ "description": "System resource usage metrics section",
+ "selector": "[data-testid='resource-usage']",
+ "required": true
+ },
+ {
+ "id": "resource-metric-cpu",
+ "description": "CPU usage metric",
+ "selector": "[data-testid='resource-metric-cpu']",
+ "required": true
+ },
+ {
+ "id": "resource-metric-memory",
+ "description": "Memory usage metric",
+ "selector": "[data-testid='resource-metric-memory']",
+ "required": true
+ }
+ ]
+ },
+ "WorkflowsPanel": {
+ "description": "Panel displaying active and completed workflows",
+ "elements": [
+ {
+ "id": "workflows-container",
+ "description": "Workflows panel container",
+ "selector": "[data-testid='workflows-container']",
+ "required": true
+ },
+ {
+ "id": "workflows-header",
+ "description": "Workflows header with icon and title",
+ "selector": "[data-testid='workflows-header']",
+ "required": true
+ },
+ {
+ "id": "workflow-card",
+ "description": "Individual workflow card",
+ "selector": "[data-testid='workflow-card']",
+ "required": true
+ },
+ {
+ "id": "workflow-id",
+ "description": "Workflow identifier",
+ "selector": "[data-testid='workflow-id']",
+ "required": true
+ },
+ {
+ "id": "workflow-status",
+ "description": "Workflow status badge",
+ "selector": "[data-testid='workflow-status']",
+ "required": true
+ },
+ {
+ "id": "workflow-name",
+ "description": "Workflow name/description",
+ "selector": "[data-testid='workflow-name']",
+ "required": true
+ },
+ {
+ "id": "workflow-progress-bar",
+ "description": "Workflow progress bar",
+ "selector": "[data-testid='workflow-progress-bar']",
+ "required": true
+ },
+ {
+ "id": "progress-text",
+ "description": "Workflow progress percentage text",
+ "selector": "[data-testid='progress-text']",
+ "required": true
+ },
+ {
+ "id": "no-workflows",
+ "description": "Empty state message when no workflows exist",
+ "selector": "[data-testid='no-workflows']",
+ "required": false
+ }
+ ]
+ },
+ "ProcessesPanel": {
+ "description": "Panel showing background processes and agent health status",
+ "elements": [
+ {
+ "id": "processes-container",
+ "description": "Processes panel container",
+ "selector": "[data-testid='processes-container']",
+ "required": true
+ },
+ {
+ "id": "processes-header",
+ "description": "Processes header with icon and title",
+ "selector": "[data-testid='processes-header']",
+ "required": true
+ },
+ {
+ "id": "process-card-orchestrator",
+ "description": "Orchestrator process card",
+ "selector": "[data-testid='process-card-orchestrator']",
+ "required": true
+ },
+ {
+ "id": "process-card-refinery",
+ "description": "Refinery process card",
+ "selector": "[data-testid='process-card-refinery']",
+ "required": true
+ },
+ {
+ "id": "process-status",
+ "description": "Process health status indicator",
+ "selector": "[data-testid='process-status']",
+ "required": true
+ },
+ {
+ "id": "process-status-dot",
+ "description": "Process status dot indicator",
+ "selector": "[data-testid='process-status-dot']",
+ "required": true
+ },
+ {
+ "id": "telemetry-section",
+ "description": "EDA telemetry section",
+ "selector": "[data-testid='telemetry-section']",
+ "required": true
+ },
+ {
+ "id": "telemetry-table",
+ "description": "EDA telemetry data table",
+ "selector": "[data-testid='telemetry-table']",
+ "required": true
+ },
+ {
+ "id": "telemetry-row",
+ "description": "Individual telemetry row",
+ "selector": "[data-testid='telemetry-row']",
+ "required": true
+ }
+ ]
+ },
+ "App": {
+ "description": "Root application container",
+ "elements": [
+ {
+ "id": "app-container",
+ "description": "Main application container",
+ "selector": "[data-testid='app-container']",
+ "required": true
+ },
+ {
+ "id": "main-layout",
+ "description": "Main layout container",
+ "selector": "[data-testid='main-layout']",
+ "required": true
+ },
+ {
+ "id": "main-content",
+ "description": "Main content area",
+ "selector": "[data-testid='main-content']",
+ "required": true
+ },
+ {
+ "id": "sidebar",
+ "description": "Sidebar container",
+ "selector": "[data-testid='sidebar']",
+ "required": true
+ }
+ ]
+ }
+ }
+}
diff --git a/reports/selector-adoption.md b/reports/selector-adoption.md
new file mode 100644
index 0000000..9604831
--- /dev/null
+++ b/reports/selector-adoption.md
@@ -0,0 +1,490 @@
+# Selector Adoption Report
+
+## Overview
+This document tracks the adoption of `data-testid` attributes across the Deepline Dashboard UI components. These test IDs are essential for reliable end-to-end testing and automation.
+
+## Current Status
+**Last Updated:** 2025-10-13
+**Adoption Rate:** 0% (0/117 elements)
+
+## Component Status
+
+### ✅ Fully Adopted Components
+None yet.
+
+### ⚠️ Partially Adopted Components
+None yet.
+
+### ❌ Not Adopted Components
+All components need `data-testid` attributes added.
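+
+The adoption rate can be estimated by checking which contract IDs already appear in the UI source. The sketch below is an illustration under simplifying assumptions: it only matches attributes written as string literals (`data-testid="..."` or `data-testid='...'`), and `adoptionRate` plus the sample inputs are hypothetical, not part of this repository.
+
+```javascript
+// Sketch only: estimates how many contract IDs already appear in a source string.
+// `adoptionRate` is a hypothetical helper; it ignores dynamically built attributes.
+function adoptionRate(contractIds, sourceText) {
+  const adopted = contractIds.filter(
+    (id) =>
+      sourceText.includes(`data-testid="${id}"`) || sourceText.includes(`data-testid='${id}'`)
+  );
+  const percent =
+    contractIds.length === 0 ? 0 : Math.round((adopted.length / contractIds.length) * 100);
+  return { adopted: adopted.length, total: contractIds.length, percent };
+}
+
+// Sample inputs: four IDs from the Header screen and a made-up source fragment.
+const headerIds = ['header-container', 'brand-logo', 'nav-eda', 'nav-ml'];
+const sampleSource = '<header data-testid="header-container"><h1>DEEPLINE</h1></header>';
+
+console.log(adoptionRate(headerIds, sampleSource));
+```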
+
+---
+
+## TODO: Components Requiring data-testid Attributes
+
+### 1. Header Component (`dashboard-ui/src/main.jsx`)
+
+**File:** `dashboard-ui/src/main.jsx`
+**Lines:** ~29-66
+**Component:** `Header`
+
+#### Required Changes:
+
+```javascript
+// TODO: Add data-testid to header container
+DEEPLINE
+
+