diff --git a/.ralph/agent/scratchpad.md b/.ralph/agent/scratchpad.md new file mode 100644 index 0000000000..bbef806b80 --- /dev/null +++ b/.ralph/agent/scratchpad.md @@ -0,0 +1,46 @@ +## Board Scan: 2026-04-25T00:04:15Z + +**Event**: task.resume - Loop iteration checking for new actionable work + +**Runtime Tasks**: None (verified: empty array) + +**Team repo**: openshift-splat-team/splat-team (up to date) + +**Projects scanned**: +- Project #2: 0 items +- Project #4: 0 items +- Project #5: 13 items +- Project #6: 0 items + +**Project #5 Status**: +- Epic #14: "vSphere multi-account credential management" at `po:accept` (human gate), OPEN +- Stories #16-27: All CLOSED, all at `done` status + +**Epic #14 Human Gate Status**: +- Latest comment: 2026-04-24T18:25:50Z (po acceptance request from splat-sdlc-agent) +- Comment count: 44 +- State: OPEN +- Waiting for human response: "Approved" or "Rejected: " +- No human approval/rejection received + +**PR Review Check**: +- PR #11 (installer): Latest review from bot (2026-04-24T17:25:21Z), no human review feedback +- No review comments with actionable feedback found +- No unaddressed human review feedback blocking progress + +**Final Analysis**: +- Epic #14 is at `po:accept` (human gate) → NOT actionable per objective definition +- All 12 child stories are complete and closed +- No other issues in any project across all 4 scanned projects +- No runtime tasks exist +- No unaddressed PR review feedback +- Per objective: "An issue is actionable when it is not waiting on human review or approval" +- Epic #14 is waiting on human approval → NOT actionable + +**Objective Status**: COMPLETE +- Objective: "Done when no actionable issues remain for the assigned project" +- All automated work is finished +- Only remaining item (Epic #14) is blocked on human approval at po:accept gate +- No actionable issues remain + +**Decision**: Emit LOOP_COMPLETE diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000000..ea7b678279 
--- /dev/null +++ b/AGENTS.md @@ -0,0 +1,109 @@ +# installer - AI Navigation + +**Repository:** https://github.com/openshift-splat-team/installer +**Last Updated:** 2026-05-01 + +--- + +## Quick Start + +This is **Tier 2** project-specific documentation for installer. + +- **New to this project?** → Start with [Development Guide](ai-docs/INSTALLER_DEVELOPMENT.md) +- **Writing tests?** → See [Testing Guide](ai-docs/INSTALLER_TESTING.md) +- **Understanding architecture?** → Read [Components Overview](ai-docs/architecture/components.md) +- **Need context on decisions?** → Browse [ADRs](ai-docs/decisions/) + +For **team-level** workflows, status transitions, and role responsibilities, see the team repository. + +--- + +## CRITICAL: Retrieval Strategy + +**IMPORTANT**: Prefer retrieval-led reasoning over pre-training-led reasoning. + +When working on installer: +- ✅ **DO**: Read project-specific docs from `./ai-docs/` first +- ✅ **DO**: Check development workflow in `./ai-docs/INSTALLER_DEVELOPMENT.md` +- ✅ **DO**: Understand architecture in `./ai-docs/architecture/components.md` +- ✅ **DO**: Review ADRs for context on past decisions +- ❌ **DON'T**: Rely solely on training data +- ❌ **DON'T**: Guess at project architecture or conventions + +For team workflows (sprint process, status transitions, etc.), see `../../team/ai-docs/`. 
+ +--- + +## Quick Navigation by Task + +| Task | Start Here | Then Read | +|------|-----------|-----------| +| **Local development** | [Development Guide](ai-docs/INSTALLER_DEVELOPMENT.md) | [Testing Guide](ai-docs/INSTALLER_TESTING.md) | +| **Running tests** | [Testing Guide](ai-docs/INSTALLER_TESTING.md) | [Components](ai-docs/architecture/components.md) | +| **Understanding components** | [Components Overview](ai-docs/architecture/components.md) | [Domain Models](ai-docs/domain/) | +| **Planning a feature** | [Exec Plans](ai-docs/exec-plans/README.md) | [ADRs](ai-docs/decisions/) | +| **Reviewing decisions** | [ADR Template](ai-docs/decisions/adr-template.md) | Existing ADRs | + +--- + +## Technology Stack + +**Languages:** Go +**Frameworks:** Kubernetes, controller-runtime + +--- + +## Documentation Structure + +``` +ai-docs/ +├── INSTALLER_DEVELOPMENT.md # Build, test, develop +├── INSTALLER_TESTING.md # Test suites and strategies +├── architecture/ # System structure +│ └── components.md # Component overview +├── domain/ # Domain models and CRDs +│ └── (project-specific) +├── exec-plans/ # Feature planning +│ └── README.md +├── decisions/ # Architectural Decision Records +│ ├── adr-template.md +│ └── adr-NNNN-*.md +└── references/ # External references + └── ecosystem.md +``` + +--- + +## Knowledge Tiers + +**Tier 1: Platform-Wide** (Team repository) +- Operator development patterns +- Testing pyramid and practices +- CI/CD workflows (Prow, GitHub Actions) +- Team process (sprint, status transitions, roles) + +→ See `../../team/ai-docs/` for team-level documentation + +**Tier 2: Project-Specific** (This repository) +- installer components and architecture +- Project-specific development workflow +- Test suites unique to this project +- Architectural decisions for this project + +→ See `./ai-docs/` for project-level documentation + +--- + +## Project Context + +For team workflows, sprint process, and status transitions, see: +- Team repository: `../../team/` +- 
Team ai-docs: `../../team/ai-docs/` +- Team workflows: `../../team/ai-docs/workflows/` +- Status transitions: `../../team/ai-docs/statuses/` + +--- + +**Navigation**: Start with [Development Guide](ai-docs/INSTALLER_DEVELOPMENT.md) for project setup and workflow. + +**GitHub**: https://github.com/openshift-splat-team/installer diff --git a/ai-docs/INSTALLER_DEVELOPMENT.md b/ai-docs/INSTALLER_DEVELOPMENT.md new file mode 100644 index 0000000000..a553f8d2be --- /dev/null +++ b/ai-docs/INSTALLER_DEVELOPMENT.md @@ -0,0 +1,293 @@ +# installer Development Guide + +**Last Updated:** 2026-05-01 + +--- + +## Overview + +This guide covers the development workflow for installer. + +**Languages:** Go +**Frameworks:** Kubernetes, controller-runtime + +--- + +## Prerequisites + +**Required:** +- Go 1.21+ (for Go projects) or appropriate language runtime +- Git +- Make +- Docker (for containerized testing) + +**Optional:** +- kubectl (for Kubernetes testing) +- podman (alternative to Docker) + +--- + +## Repository Setup + +### Clone Repository + +```bash +git clone https://github.com/openshift-splat-team/installer.git +cd installer +``` + +### Install Dependencies + +```bash +# For Go projects +go mod download +go mod vendor # if vendoring is used + +# For Python projects +pip install -r requirements.txt +pip install -r requirements-dev.txt + +# For JavaScript/TypeScript +npm install +``` + +--- + +## Building + +### Local Build + +```bash +# For Go projects +make build + +# Or directly +go build -o bin/installer ./cmd/... +``` + +### Build Container Image + +```bash +make docker-build + +# Or with podman +podman build -t installer:latest . +``` + +--- + +## Development Workflow + +### 1. Create Feature Branch + +```bash +git checkout -b feature/my-feature +``` + +### 2. Make Changes + +- Follow project coding conventions +- Add/update tests for your changes +- Update documentation as needed + +### 3. 
Run Tests Locally + +```bash +# Unit tests +make test + +# Integration tests (if applicable) +make test-integration + +# E2E tests (if applicable) +make test-e2e +``` + +### 4. Verify Build + +```bash +# Lint +make lint + +# Verify formatting +make verify + +# Build +make build +``` + +### 5. Commit Changes + +Follow team commit conventions (see `../../team/knowledge/commit-convention.md`). + +### 6. Open Pull Request + +- Push branch to fork +- Open PR against main branch +- Request review from team +- Address review feedback +- Wait for CI to pass + +--- + +## Running Locally + +### As Standalone Binary + +```bash +# Build +make build + +# Run +./bin/installer --help +``` + +### In Kubernetes Cluster + +```bash +# Build and push image +make docker-build docker-push + +# Deploy to cluster +kubectl apply -f deploy/ +``` + +### With Operator SDK (if applicable) + +```bash +# Run locally (watches cluster) +make run +``` + +--- + +## Debugging + +### Enable Debug Logging + +```bash +# Set log level +export LOG_LEVEL=debug + +# Or via command line +./bin/installer --log-level=debug +``` + +### Attach Debugger (Go) + +```bash +# Install delve +go install github.com/go-delve/delve/cmd/dlv@latest + +# Debug +dlv debug ./cmd/installer +``` + +### Common Issues + +**Build failures:** +- Check Go version: `go version` +- Verify dependencies: `go mod verify` +- Clean build cache: `go clean -cache` + +**Test failures:** +- Check test environment setup +- Review test logs for specific errors +- Run individual test: `go test -v -run TestName ./pkg/...` + +--- + +## Project Structure + +``` +installer/ +├── cmd/ # Command-line entry points +├── pkg/ # Library code +│ ├── controllers/ # Controllers (if operator) +│ ├── api/ # API types and CRDs +│ └── ... +├── config/ # Configuration (CRDs, RBAC, etc.) 
+├── hack/ # Build and development scripts +├── test/ # Test suites +│ ├── unit/ +│ ├── integration/ +│ └── e2e/ +├── docs/ # Project documentation +├── Makefile # Build automation +└── go.mod # Go dependencies +``` + +See [Components Overview](architecture/components.md) for architectural details. + +--- + +## Code Conventions + +### Naming + +- **Packages**: lowercase, single word if possible +- **Files**: lowercase with underscores (snake_case) +- **Types**: PascalCase +- **Functions**: PascalCase (exported) or camelCase (unexported) + +### Error Handling + +- Wrap errors with context: `fmt.Errorf("context: %w", err)` +- Return errors, don't panic +- Log errors at the appropriate level + +### Testing + +- Unit tests in same package: `*_test.go` +- Table-driven tests preferred +- Mock external dependencies +- Aim for 80%+ code coverage + +--- + +## Helpful Make Targets + +**Common targets available:** + +*(See Makefile for available targets)* + +For the full list of targets, run: +```bash +make help +``` + +Or inspect the `Makefile` directly. + +--- + +## CI/CD + +### Prow Jobs (OpenShift) + +This project uses OpenShift Prow for CI/CD. + +**Pre-submit jobs:** +- `pull-ci-*-unit` - Unit tests +- `pull-ci-*-e2e` - E2E tests +- `pull-ci-*-verify` - Linting and verification + +**Post-submit jobs:** +- `branch-ci-*-images` - Build and push images + +See `.ci-operator.yaml` and `ci-operator/config/` for Prow configuration. + +### GitHub Actions (if applicable) + +See `.github/workflows/` for GitHub Actions configuration. + +--- + +## Related Documentation + +- [Testing Guide](INSTALLER_TESTING.md) - Test suites and strategies +- [Components](architecture/components.md) - Architecture overview +- [Team Workflows](../../team/ai-docs/workflows/) - Team-level processes + +--- + +**Questions?** See `../../team/HUMAN-REVIEW-GUIDE.md` for how to escalate issues. 
diff --git a/ai-docs/INSTALLER_TESTING.md b/ai-docs/INSTALLER_TESTING.md new file mode 100644 index 0000000000..14ca159532 --- /dev/null +++ b/ai-docs/INSTALLER_TESTING.md @@ -0,0 +1,411 @@ +# installer Testing Guide + +**Last Updated:** 2026-05-01 + +--- + +## Overview + +This guide covers all test suites for installer and how to run them. + +**Testing Philosophy:** +- Unit tests for business logic +- Integration tests for component interactions +- E2E tests for critical user workflows +- Aim for 80%+ code coverage + +--- + +## Test Suites + +### Unit Tests + +**Purpose:** Test individual functions and methods in isolation + +**Location:** `pkg/*/` (co-located with source) + +**Run:** + +`make test` + +**With coverage:** +```bash +go test -coverprofile=coverage.out ./pkg/... +go tool cover -html=coverage.out +``` + +**Example:** +```go +func TestMyFunction(t *testing.T) { + tests := []struct { + name string + input string + want string + wantErr bool + }{ + { + name: "valid input", + input: "test", + want: "result", + }, + // More test cases... + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + got, err := MyFunction(tt.input) + if (err != nil) != tt.wantErr { + t.Errorf("MyFunction() error = %v, wantErr %v", err, tt.wantErr) + return + } + if got != tt.want { + t.Errorf("MyFunction() = %v, want %v", got, tt.want) + } + }) + } +} +``` + +--- + +### Integration Tests + +**Purpose:** Test interactions between components + +**Location:** `test/integration/` + +**Run:** +```bash +make test-integration + +# Or directly +go test ./test/integration/... 
-tags=integration +``` + +**Requirements:** +- May require a local Kubernetes cluster (kind, minikube) +- External dependencies (databases, message queues) + +**Example:** +```go +//go:build integration + +func TestControllerReconciliation(t *testing.T) { + // Set up test cluster + testEnv := setupTestEnvironment(t) + defer testEnv.Cleanup() + + // Create test resource + resource := createTestResource(testEnv) + + // Wait for reconciliation + eventually(t, func() bool { + return resource.Status.Ready + }, 30*time.Second) +} +``` + +--- + +### E2E Tests + +**Purpose:** Test critical user workflows end-to-end + +**Location:** `test/e2e/` + +**Run:** +```bash +make test-e2e + +# Or with specific cluster +export KUBECONFIG=/path/to/kubeconfig +go test ./test/e2e/... -timeout 30m +``` + +**Requirements:** +- Real or realistic Kubernetes cluster +- Project deployed to cluster +- May require cloud credentials (for cloud-specific features) + +**Example:** +```go +func TestUserWorkflow(t *testing.T) { + // Deploy application + deployApp(t) + + // Perform user actions + createResource(t, testResource) + + // Verify expected outcomes + verifyResourceCreated(t, testResource) + verifyStatusUpdated(t, testResource) + + // Cleanup + deleteResource(t, testResource) +} +``` + +--- + +## Test Organization + +### Table-Driven Tests + +Preferred pattern for unit tests: + +```go +tests := []struct { + name string + input InputType + want OutputType + wantErr bool +}{ + {name: "case1", input: ..., want: ...}, + {name: "case2", input: ..., want: ...}, +} + +for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + // Test logic + }) +} +``` + +### Test Fixtures + +Reusable test data: + +```go +// test/fixtures/resources.go +func NewTestResource(name string) *MyResource { + return &MyResource{ + ObjectMeta: metav1.ObjectMeta{ + Name: name, + Namespace: "test", + }, + Spec: MyResourceSpec{ + // Defaults + }, + } +} +``` + +### Test Helpers + +Common test utilities: + +```go 
+// test/helpers/assertions.go +func AssertEventually(t *testing.T, condition func() bool, timeout time.Duration) { + t.Helper() + deadline := time.Now().Add(timeout) + for time.Now().Before(deadline) { + if condition() { + return + } + time.Sleep(100 * time.Millisecond) + } + t.Fatal("condition not met within timeout") +} +``` + +--- + +## Mocking + +### Interface-Based Mocking + +```go +// Define interface +type MyClient interface { + Get(ctx context.Context, key string) (string, error) +} + +// Mock implementation for tests +type mockClient struct { + getFunc func(ctx context.Context, key string) (string, error) +} + +func (m *mockClient) Get(ctx context.Context, key string) (string, error) { + return m.getFunc(ctx, key) +} + +// Use in test +func TestWithMock(t *testing.T) { + mock := &mockClient{ + getFunc: func(ctx context.Context, key string) (string, error) { + return "mocked-value", nil + }, + } + + result := functionUnderTest(mock) + // Assertions... +} +``` + +### Using testify/mock (if applicable) + +```go +import "github.com/stretchr/testify/mock" + +type MockClient struct { + mock.Mock +} + +func (m *MockClient) Get(ctx context.Context, key string) (string, error) { + args := m.Called(ctx, key) + return args.String(0), args.Error(1) +} + +func TestWithTestify(t *testing.T) { + mockClient := new(MockClient) + mockClient.On("Get", mock.Anything, "key").Return("value", nil) + + result := functionUnderTest(mockClient) + + mockClient.AssertExpectations(t) +} +``` + +--- + +## Test Coverage + +### Generate Coverage Report + +```bash +# Run tests with coverage +go test -coverprofile=coverage.out ./pkg/... 
+ +# View HTML report +go tool cover -html=coverage.out + +# View summary +go tool cover -func=coverage.out +``` + +### Coverage Goals + +- **Minimum:** 70% overall coverage +- **Target:** 80%+ overall coverage +- **Critical paths:** 90%+ coverage (controllers, reconcilers, business logic) + +### Excluding from Coverage + +```go +// This function is intentionally not tested. +// Note: `go tool cover` has no built-in ignore directive; this comment is only a marker for reviewers. +func helperFunction() { + // ... +} +``` + +--- + +## CI Test Execution + +### Prow Jobs + +**Pre-submit tests (run on PRs):** +- `pull-ci-installer-unit` - Unit tests +- `pull-ci-installer-integration` - Integration tests (if enabled) +- `pull-ci-installer-e2e-*` - E2E test suites + +**Post-submit tests (run on merge):** +- `branch-ci-installer-unit` - Unit tests +- `branch-ci-installer-e2e-*` - Full E2E suite + +### Debugging CI Failures + +1. **Check Prow logs** + - Find job in PR checks + - Click "Details" → view logs + +2. **Reproduce locally** + ```bash + # Match CI environment + export CI=true + make test + ``` + +3. **Run specific test** + ```bash + go test -v -run TestFailingTest ./pkg/... + ``` + +--- + +## Test Best Practices + +### DO + +✅ Write tests before fixing bugs (TDD for bugs) +✅ Test both success and error paths +✅ Use table-driven tests for multiple scenarios +✅ Mock external dependencies +✅ Keep tests fast (unit tests < 1s, integration < 10s) +✅ Use meaningful test names describing the scenario +✅ Clean up resources in test cleanup functions + +### DON'T + +❌ Test implementation details (test behavior, not internals) +❌ Write flaky tests (tests that randomly fail) +❌ Skip cleanup (use `t.Cleanup()` or `defer`) +❌ Use sleeps (use eventually/wait helpers instead) +❌ Test third-party code (trust their tests) +❌ Ignore test failures ("it works on my machine") + +--- + +## Test Utilities + +### Common Test Helpers + +```bash +# Run specific test +go test -run TestName ./pkg/path + +# Run tests in specific package +go test ./pkg/controllers/... 
+ +# Run tests with race detector +go test -race ./pkg/... + +# Run tests with timeout +go test -timeout 5m ./test/e2e/... + +# Verbose output +go test -v ./pkg/... + +# Run tests matching pattern +go test -run "Test.*Controller" ./pkg/... +``` + +### Environment Variables + +```bash +# Enable debug logging in tests +export LOG_LEVEL=debug + +# Use specific kubeconfig for tests +export KUBECONFIG=/path/to/test-cluster-config + +# Skip slow tests +export SKIP_SLOW_TESTS=true + +# CI mode (stricter timeouts, no interactive prompts) +export CI=true +``` + +--- + +## Related Documentation + +- [Development Guide](INSTALLER_DEVELOPMENT.md) - Build and development workflow +- [Components](architecture/components.md) - Architecture to understand what to test +- [Team Testing Practices](../../team/ai-docs/practices/testing.md) - Team-wide testing guidelines + +--- + +**Questions?** See test-specific issues on GitHub or ask in the team channel. diff --git a/ai-docs/architecture/components.md b/ai-docs/architecture/components.md new file mode 100644 index 0000000000..7386f085a3 --- /dev/null +++ b/ai-docs/architecture/components.md @@ -0,0 +1,341 @@ +# installer Components + +**Last Updated:** 2026-05-01 + +--- + +## Overview + +This document describes the major components and architecture of installer. 
+ +**Languages:** Go +**Frameworks:** Kubernetes, controller-runtime + +--- + +## High-Level Architecture + +``` +┌─────────────────────────────────────────────┐ +│ installer │ +│ │ +│ ┌──────────────┐ ┌─────────────────┐ │ +│ │ │ │ │ │ +│ │ Component A │─────▶│ Component B │ │ +│ │ │ │ │ │ +│ └──────────────┘ └─────────────────┘ │ +│ │ +└─────────────────────────────────────────────┘ +``` + +*(Replace with project-specific architecture diagram)* + +--- + +## Core Components + +### Component 1: [Name] + +**Purpose:** Brief description of what this component does + +**Location:** `pkg/component1/` + +**Responsibilities:** +- Responsibility 1 +- Responsibility 2 +- Responsibility 3 + +**Key Types:** +- `Type1` - Description +- `Type2` - Description + +**Interactions:** +- Calls Component 2 for X +- Listens to events from Y +- Stores data in Z + +**Example Usage:** +```go +// Code example showing how this component is used +``` + +--- + +### Component 2: [Name] + +**Purpose:** Brief description + +**Location:** `pkg/component2/` + +**Responsibilities:** +- Responsibility 1 +- Responsibility 2 + +**Key Types:** +- `Type1` - Description + +**Interactions:** +- Interacts with Component 1 +- Calls external service X + +--- + +## For Operator Projects + +### Controllers + +**Purpose:** Reconcile Kubernetes resources + +**Location:** `pkg/controllers/` + +*(Controllers will be listed here once analysis is enhanced)* + +**Reconciliation Pattern:** +1. Fetch resource from Kubernetes API +2. Validate resource spec +3. Create/update dependent resources +4. Update resource status +5. 
Requeue if needed + +**Example Reconciliation:** +```go +func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) { + // Fetch the resource + obj := &v1alpha1.MyResource{} + if err := r.Get(ctx, req.NamespacedName, obj); err != nil { + return ctrl.Result{}, client.IgnoreNotFound(err) + } + + // Reconciliation logic here + + // Update status + if err := r.Status().Update(ctx, obj); err != nil { + return ctrl.Result{}, err + } + + return ctrl.Result{}, nil +} +``` + +--- + +### Custom Resource Definitions (CRDs) + +See [Domain Models](../domain/) for detailed CRD specifications. + +**Defined CRDs:** + +*(No CRDs detected)* + +--- + +## API Layer + +**Purpose:** Define interfaces and types + +**Location:** `pkg/api/` or `api/` + +**Key Types:** +- Request/Response structures +- Configuration types +- Status types + +--- + +## Data Flow + +``` +User/Client + ↓ +API Server + ↓ +Controller/Handler + ↓ +Business Logic + ↓ +External Systems +``` + +**Example Flow:** +1. User creates CustomResource +2. Controller watches for changes +3. Controller validates resource +4. Controller calls cloud provider API +5. Controller updates resource status + +--- + +## External Dependencies + +### Kubernetes API + +**Usage:** CRUD operations on Kubernetes resources + +**Authentication:** Service account with appropriate RBAC + +### Cloud Provider APIs (if applicable) + +**AWS:** +- SDK: `aws-sdk-go` +- Services: EC2, IAM, S3, etc. + +**GCP:** +- SDK: `cloud.google.com/go` +- Services: Compute, IAM, Storage, etc. + +**Azure:** +- SDK: `github.com/Azure/azure-sdk-for-go` +- Services: Compute, Network, Storage, etc. + +**vSphere:** +- SDK: `github.com/vmware/govmomi` +- APIs: vCenter, ESXi + +### Other Dependencies + +- **Database:** PostgreSQL, Redis, etc. +- **Message Queue:** RabbitMQ, Kafka, etc. +- **Cache:** Redis, Memcached, etc. 
+ +--- + +## Configuration + +### Config Locations + +- **In-cluster:** ConfigMaps, Secrets +- **Command-line:** Flags passed to binary +- **Environment:** Environment variables +- **Files:** Config files mounted to container + +### Config Precedence + +1. Command-line flags (highest priority) +2. Environment variables +3. ConfigMap/Secret values +4. Default values (lowest priority) + +--- + +## Observability + +### Logging + +**Framework:** klog, logrus, or standard log + +**Log Levels:** +- `ERROR` - Errors that need attention +- `WARN` - Warnings that may need attention +- `INFO` - Informational messages +- `DEBUG` - Verbose debugging + +**Structured Logging:** +```go +log.Info("resource reconciled", + "name", resource.Name, + "namespace", resource.Namespace, + "generation", resource.Generation) +``` + +### Metrics + +**Framework:** Prometheus client + +**Key Metrics:** +- `reconcile_duration_seconds` - Time to reconcile resources +- `reconcile_errors_total` - Count of reconciliation errors +- `resource_count` - Number of managed resources + +**Metrics Endpoint:** `/metrics` + +### Tracing (if applicable) + +**Framework:** OpenTelemetry + +**Traced Operations:** +- API calls +- Controller reconciliation +- External service calls + +--- + +## Error Handling + +### Error Types + +```go +type CustomError struct { + Code string + Message string + Cause error +} +``` + +### Retry Logic + +- **Transient errors:** Retry with exponential backoff +- **Permanent errors:** Don't retry, update status with error +- **Rate limits:** Respect retry-after headers + +### Error Propagation + +- Wrap errors with context +- Preserve original error for debugging +- Log errors at appropriate level + +--- + +## Security Considerations + +### Authentication + +- Service account tokens for in-cluster communication +- API keys for external services +- Certificate-based auth where applicable + +### Authorization + +- RBAC for Kubernetes resources +- Principle of least privilege +- Separate 
service accounts per component + +### Secrets Management + +- Store secrets in Kubernetes Secrets +- Never log secret values +- Rotate credentials regularly + +--- + +## Performance Considerations + +### Caching + +- Cache frequently accessed data +- Invalidate cache on updates +- Use TTL for time-sensitive data + +### Rate Limiting + +- Respect API rate limits +- Implement client-side rate limiting +- Use backoff for retries + +### Resource Limits + +- Set appropriate CPU/memory limits +- Monitor resource usage +- Scale based on load + +--- + +## Related Documentation + +- [Development Guide](../INSTALLER_DEVELOPMENT.md) - How to build and run +- [Testing Guide](../INSTALLER_TESTING.md) - How to test components +- [Domain Models](../domain/) - CRD specifications +- [ADRs](../decisions/) - Architectural decisions + +--- + +**Note:** This is a template. Update with project-specific component details, architecture diagrams, and actual code examples. diff --git a/ai-docs/decisions/adr-template.md b/ai-docs/decisions/adr-template.md new file mode 100644 index 0000000000..68376fa795 --- /dev/null +++ b/ai-docs/decisions/adr-template.md @@ -0,0 +1,133 @@ +# ADR-NNNN: Title of Decision + +**Status:** Proposed | Accepted | Deprecated | Superseded by ADR-XXXX +**Date:** YYYY-MM-DD +**Authors:** @github-handle +**Deciders:** @lead, @architect + +--- + +## Context + +What is the issue we're facing? What forces are at play? What constraints exist? + +Describe the problem that necessitates this decision. + +--- + +## Decision + +What is the change we're proposing or have agreed to make? + +State the decision clearly and concisely. + +--- + +## Rationale + +Why did we choose this approach? + +Explain the reasoning behind the decision, considering: +- Technical factors +- Business requirements +- Team constraints +- Timeline considerations + +--- + +## Consequences + +What becomes easier or harder as a result of this decision? 
+ +### Positive Consequences + +- ✅ Benefit 1 +- ✅ Benefit 2 + +### Negative Consequences + +- ❌ Trade-off 1 +- ❌ Trade-off 2 + +### Neutral Consequences + +- ℹ️ Change 1 +- ℹ️ Change 2 + +--- + +## Alternatives Considered + +### Alternative 1: [Name] + +**Description:** Brief description of the alternative + +**Pros:** +- Advantage 1 +- Advantage 2 + +**Cons:** +- Disadvantage 1 +- Disadvantage 2 + +**Why not chosen:** Explanation + +--- + +### Alternative 2: [Name] + +**Description:** Brief description + +**Pros:** +- Advantage 1 + +**Cons:** +- Disadvantage 1 + +**Why not chosen:** Explanation + +--- + +## Implementation Notes + +How do we implement this decision? + +- Migration steps +- Code changes required +- Configuration updates +- Documentation updates + +--- + +## Validation + +How do we know this decision is working? + +- Success metrics +- Monitoring points +- Testing approach + +--- + +## References + +- Related GitHub issue: #XXX +- Related ADRs: ADR-YYYY +- External references: [Link](URL) +- Discussion thread: [Link](URL) + +--- + +## Notes + +Any additional context or information. + +--- + +## Revision History + +| Date | Author | Change | +|------|--------|--------| +| YYYY-MM-DD | @author | Initial draft | +| YYYY-MM-DD | @author | Addressed review feedback | +| YYYY-MM-DD | @author | Accepted | diff --git a/ai-docs/domain/README.md b/ai-docs/domain/README.md new file mode 100644 index 0000000000..360da817bf --- /dev/null +++ b/ai-docs/domain/README.md @@ -0,0 +1,111 @@ +# installer Domain Models + +**Last Updated:** 2026-05-01 + +--- + +## Overview + +This directory documents the domain models, custom resource definitions (CRDs), and core data structures used in installer. + +--- + +## Custom Resource Definitions (CRDs) + +For Kubernetes operator projects, document each CRD here. 
+ +**Example structure for each CRD:** + +### ResourceName + +- **API Group/Version:** `example.com/v1alpha1` +- **Kind:** `ResourceName` +- **Plural:** `resourcenames` +- **Scope:** Namespaced | Cluster + +**Purpose:** What this resource represents + +**Spec Fields:** +- `field1` (string, required) - Description +- `field2` (int, optional) - Description + +**Status Fields:** +- `conditions` ([]Condition) - Resource conditions +- `phase` (string) - Current phase (Pending, Ready, Error) + +**Example:** +```yaml +apiVersion: example.com/v1alpha1 +kind: ResourceName +metadata: + name: example + namespace: default +spec: + field1: "value" + field2: 42 +status: + phase: Ready + conditions: + - type: Ready + status: "True" + reason: ReconciliationSucceeded +``` + +**Validation:** +- `field1` must match the pattern `^[a-z0-9-]+$` +- `field2` must be between 1 and 100 + +**Related Documentation:** +- Controller reconciliation logic: [../architecture/components.md](../architecture/components.md) +- API reference: See generated API docs + +--- + +## Core Data Structures + +For non-operator projects, document key data structures. 
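As an illustration of the validation rules listed for the example `ResourceName` CRD (a `field1` pattern of `^[a-z0-9-]+$` and a `field2` range of 1 to 100), the checks could be written in Go roughly as follows. This is a sketch under stated assumptions: `ResourceNameSpec` and `validateSpec` are hypothetical names, and treating a zero `field2` as "unset" is an assumption made here because the field is documented as optional. A real operator would more likely express these constraints in the CRD's OpenAPI schema.

```go
package main

import (
	"fmt"
	"regexp"
)

// field1Pattern mirrors the pattern given in the validation rules.
var field1Pattern = regexp.MustCompile(`^[a-z0-9-]+$`)

// ResourceNameSpec is a hypothetical Go type for the example spec.
type ResourceNameSpec struct {
	Field1 string `json:"field1"`
	Field2 int    `json:"field2,omitempty"`
}

// validateSpec is a hypothetical helper. field1 is required and must match
// the pattern (an empty string does not match). field2 is optional: zero is
// treated as unset (an assumption), otherwise it must be within 1..100.
func validateSpec(s ResourceNameSpec) error {
	if !field1Pattern.MatchString(s.Field1) {
		return fmt.Errorf("field1 %q must match %s", s.Field1, field1Pattern)
	}
	if s.Field2 != 0 && (s.Field2 < 1 || s.Field2 > 100) {
		return fmt.Errorf("field2 must be between 1 and 100, got %d", s.Field2)
	}
	return nil
}

func main() {
	// One valid spec followed by two specs that each break one rule.
	fmt.Println(validateSpec(ResourceNameSpec{Field1: "value", Field2: 42}))
	fmt.Println(validateSpec(ResourceNameSpec{Field1: "Bad_Name", Field2: 42}))
	fmt.Println(validateSpec(ResourceNameSpec{Field1: "value", Field2: 101}))
}
```

Keeping the pattern in a package-level `regexp.MustCompile` means it is compiled once at startup and a malformed pattern fails fast rather than on first use.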
+ +### Structure 1 + +**Purpose:** Description + +**Fields:** +```go +type MyStruct struct { + Field1 string `json:"field1"` + Field2 int `json:"field2"` +} +``` + +**Validation Rules:** +- Field1: required, non-empty +- Field2: must be positive + +--- + +## API Versioning + +**Current Version:** v1alpha1 + +**Versioning Policy:** +- `v1alpha1` - Initial experimental API +- `v1beta1` - API stabilizing, may have breaking changes +- `v1` - Stable API, backward compatibility guaranteed + +**Deprecated Fields:** +- (None currently) + +**Migration Guides:** +- [v1alpha1 → v1beta1](migrations/v1alpha1-to-v1beta1.md) (if applicable) + +--- + +## Related Documentation + +- [Components Overview](../architecture/components.md) - How these models are used +- [Development Guide](../INSTALLER_DEVELOPMENT.md) - Adding new fields +- Generated API docs - Full API reference + +--- + +**Note:** For each major CRD or domain model, create a dedicated file (e.g., `credentialsrequest.md`) with a detailed specification. diff --git a/ai-docs/exec-plans/README.md b/ai-docs/exec-plans/README.md new file mode 100644 index 0000000000..58d09b8482 --- /dev/null +++ b/ai-docs/exec-plans/README.md @@ -0,0 +1,195 @@ +# installer Execution Plans + +**Last Updated:** 2026-05-01 + +--- + +## Purpose + +Execution plans (exec-plans) guide feature planning and implementation for installer. 
+ +Use this directory to document: +- Feature design proposals +- Implementation plans +- Spike investigations +- Proof-of-concept findings + +--- + +## When to Create an Exec Plan + +Create an exec plan when: +- ✅ Implementing a significant new feature +- ✅ Making architectural changes +- ✅ Investigating a complex problem +- ✅ Proposing a major refactor + +Don't create an exec plan for: +- ❌ Bug fixes (unless they require design changes) +- ❌ Minor improvements +- ❌ Routine maintenance + +--- + +## Exec Plan Format + +### Template Structure + +```markdown +# Feature Name + +**Status:** Draft | In Review | Approved | Implemented +**Author:** GitHub handle +**Created:** YYYY-MM-DD +**Epic:** Link to GitHub epic issue + +## Problem Statement + +What problem are we solving? Why does it matter? + +## Goals + +- Goal 1 +- Goal 2 + +## Non-Goals + +- What we're explicitly NOT doing +- Out of scope items + +## Proposed Solution + +High-level approach to solving the problem. + +### Architecture + +Component diagrams, data flow, etc. + +### API Changes + +New APIs, changed APIs, deprecated APIs. + +### Migration Path + +How existing users/resources migrate to new behavior. + +## Alternatives Considered + +- **Alternative 1:** Description and why not chosen +- **Alternative 2:** Description and why not chosen + +## Implementation Plan + +1. **Phase 1:** Milestone 1 + - Story 1.1 + - Story 1.2 + +2. **Phase 2:** Milestone 2 + - Story 2.1 + - Story 2.2 + +## Testing Strategy + +- Unit tests +- Integration tests +- E2E scenarios + +## Risks and Mitigations + +| Risk | Impact | Mitigation | +|------|--------|------------| +| Risk 1 | High | Mitigation strategy | + +## Success Criteria + +How do we know the feature is successful? + +- Metric 1 +- Metric 2 +- User feedback + +## Timeline + +- **Week 1-2:** Design and review +- **Week 3-4:** Implementation phase 1 +- **Week 5-6:** Implementation phase 2 +- **Week 7:** Testing and documentation + +## Open Questions + +- Question 1? 
+- Question 2? +``` + +--- + +## Exec Plan Workflow + +### 1. Draft + +- Author creates exec plan document +- Shares with team for early feedback +- Iterates on design + +### 2. Review + +- Team reviews exec plan +- Discusses alternatives +- Identifies risks +- Approves or requests changes + +### 3. Approved + +- Exec plan is approved +- Implementation can begin +- Epic/stories created based on plan + +### 4. Implemented + +- Feature implemented +- Tests passing +- Documentation updated +- Exec plan archived for reference + +--- + +## Example Exec Plans + +*(Create example exec plan files as features are implemented)* + +- `feature-async-processing.md` - Async processing support +- `spike-performance-optimization.md` - Performance investigation +- `refactor-controller-architecture.md` - Architecture refactor + +--- + +## Relationship to ADRs + +**Exec Plans vs ADRs:** + +- **Exec Plan:** Feature design and implementation plan + - Created before implementation + - Describes what and how + - May span multiple epics/sprints + +- **ADR:** Architectural decision record + - Created during or after implementation + - Documents why a decision was made + - Explains trade-offs considered + +**Workflow:** +1. Create an exec plan for the feature +2. During implementation, record each significant architectural decision as an ADR +3. After implementation, archive the exec plan; the ADRs remain as reference + +--- + +## Related Documentation + +- [ADR Template](../decisions/adr-template.md) - Architectural decision records +- [Components](../architecture/components.md) - Current architecture +- [Team Workflows](../../team/ai-docs/workflows/) - Team planning process + +--- + +**Note:** This is a template directory. Replace this content with actual exec plans as features are proposed and implemented. 
diff --git a/ai-docs/references/ecosystem.md b/ai-docs/references/ecosystem.md new file mode 100644 index 0000000000..6a80595efa --- /dev/null +++ b/ai-docs/references/ecosystem.md @@ -0,0 +1,215 @@ +# installer Ecosystem and References + +**Last Updated:** 2026-05-01 + +--- + +## Purpose + +This document provides links to related projects, upstream dependencies, documentation, and external resources relevant to installer. + +--- + +## Upstream Projects + +### Kubernetes + +**Relationship:** installer runs on Kubernetes + +**Resources:** +- [Kubernetes Documentation](https://kubernetes.io/docs/) +- [API Reference](https://kubernetes.io/docs/reference/kubernetes-api/) +- [Controller Runtime](https://github.com/kubernetes-sigs/controller-runtime) - Framework for building operators + +**Version Compatibility:** +- Supported Kubernetes versions: 1.24+ +- Controller Runtime version: v0.15.x + +--- + +### OpenShift (if applicable) + +**Relationship:** installer is part of OpenShift platform + +**Resources:** +- [OpenShift Documentation](https://docs.openshift.com/) +- [OpenShift Enhancement Proposals](https://github.com/openshift/enhancements) +- [OpenShift CI (Prow)](https://docs.ci.openshift.org/) + +**Version Compatibility:** +- Supported OpenShift versions: 4.12+ + +--- + +## Related Platform Projects + +### Cloud Provider Integrations + +**vSphere (VMware):** +- [govmomi](https://github.com/vmware/govmomi) - vSphere API client +- [vSphere CSI Driver](https://github.com/kubernetes-sigs/vsphere-csi-driver) +- [vSphere Cloud Provider](https://github.com/kubernetes/cloud-provider-vsphere) + +**AWS:** +- [AWS SDK for Go](https://github.com/aws/aws-sdk-go) +- [AWS Cloud Provider](https://github.com/kubernetes/cloud-provider-aws) + +**GCP:** +- [GCP SDK](https://cloud.google.com/go) +- [GCP Cloud Provider](https://github.com/kubernetes/cloud-provider-gcp) + +**Azure:** +- [Azure SDK for Go](https://github.com/Azure/azure-sdk-for-go) +- [Azure Cloud 
Provider](https://github.com/kubernetes-sigs/cloud-provider-azure) + +--- + +## Sister Projects + +Projects in the same team or ecosystem: + +- **[Project 1](https://github.com/org/project1)** - Description +- **[Project 2](https://github.com/org/project2)** - Description +- **[Project 3](https://github.com/org/project3)** - Description + +See the team repository for the full project list: `../../team/ai-docs/architecture/projects.md` + +--- + +## Dependencies + +### Direct Dependencies + +Key libraries and frameworks used by installer: + +**Go Modules:** +- `k8s.io/client-go` - Kubernetes client +- `sigs.k8s.io/controller-runtime` - Controller framework +- `github.com/spf13/cobra` - CLI framework (if applicable) +- `github.com/prometheus/client_golang` - Metrics + +**Python Packages (if applicable):** +- `kubernetes` - Kubernetes client +- `pytest` - Testing framework + +**JavaScript/TypeScript (if applicable):** +- `@kubernetes/client-node` - Kubernetes client +- `react` - UI framework + +See `go.mod`, `requirements.txt`, or `package.json` for the complete dependency list. 
+ +### Indirect Dependencies + +- Authentication/authorization libraries +- Logging frameworks +- Testing utilities + +--- + +## Standards and Specifications + +### Kubernetes Standards + +- [Custom Resource Definition (CRD)](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/) +- [Controller Pattern](https://kubernetes.io/docs/concepts/architecture/controller/) +- [Admission Webhooks](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/) + +### Cloud Provider Standards + +- **AWS:** [AWS Well-Architected Framework](https://aws.amazon.com/architecture/well-architected/) +- **GCP:** [GCP Architecture Framework](https://cloud.google.com/architecture/framework) +- **Azure:** [Azure Well-Architected Framework](https://docs.microsoft.com/azure/architecture/framework/) +- **vSphere:** [vSphere API Reference](https://developer.vmware.com/apis/vsphere-automation/latest/) + +--- + +## Documentation Resources + +### Team-Level Documentation + +See team repository for: +- **Workflows:** Sprint process, epic breakdown, triage +- **Practices:** Coding standards, testing guidelines +- **Roles:** Hat-switching, responsibilities + +Location: `../../team/ai-docs/` + +### Platform Documentation + +**Operator Development:** +- [Operator SDK](https://sdk.operatorframework.io/) +- [Operator Best Practices](https://sdk.operatorframework.io/docs/best-practices/) +- [Kubebuilder Book](https://book.kubebuilder.io/) + +**Testing:** +- [Kubernetes Testing Guide](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-testing/testing.md) +- [E2E Testing Framework](https://github.com/kubernetes-sigs/e2e-framework) + +**CI/CD:** +- [Prow Documentation](https://docs.prow.k8s.io/) +- [OpenShift CI](https://docs.ci.openshift.org/) + +--- + +## Community and Support + +### Communication Channels + +**Team Channels:** +- Slack: `#team-channel` (internal) +- GitHub Discussions: Project discussions tab 
+- Mailing List: team-list@example.com (if applicable) + +**Upstream Communities:** +- Kubernetes Slack: `#sig-cloud-provider`, `#kubebuilder`, etc. +- OpenShift Slack: `#forum-openshift`, `#forum-` + +### Meetings + +**Team Meetings:** +- Sprint planning: Bi-weekly (see team calendar) +- Sprint review: Bi-weekly +- Standup: Daily (async) + +**Upstream Meetings:** +- SIG meetings: Check [Kubernetes calendar](https://calendar.google.com/calendar/embed?src=calendar%40kubernetes.io) +- OpenShift meetings: Check [OpenShift calendar](https://calendar.google.com/calendar/embed?src=openshift.io_5s2lnu98o7vjhm8hs5q4vkp7s0%40group.calendar.google.com) + +--- + +## Learning Resources + +### Getting Started + +**Kubernetes:** +- [Kubernetes Basics](https://kubernetes.io/docs/tutorials/kubernetes-basics/) +- [Kubernetes the Hard Way](https://github.com/kelseyhightower/kubernetes-the-hard-way) + +**Operator Development:** +- [Operator Pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) +- [Operator SDK Tutorial](https://sdk.operatorframework.io/docs/building-operators/golang/tutorial/) + +**Cloud Providers:** +- [vSphere Docs](https://docs.vmware.com/en/VMware-vSphere/index.html) +- [AWS Documentation](https://docs.aws.amazon.com/) +- [GCP Documentation](https://cloud.google.com/docs) +- [Azure Documentation](https://docs.microsoft.com/azure/) + +### Advanced Topics + +- [Kubernetes API Conventions](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md) +- [Controller Runtime Deep Dive](https://engineering.bitnami.com/articles/kubebuilder-deep-dive.html) +- [Writing Controllers](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-api-machinery/controllers.md) + +--- + +## Related Documentation + +- [Development Guide](../installer_DEVELOPMENT.md) - Build and develop +- [Testing Guide](../installer_TESTING.md) - Test suites +- [Components](../architecture/components.md) - 
Architecture +- [ADRs](../decisions/) - Architectural decisions + +--- + +**Note:** Update this document as the ecosystem evolves, dependencies change, or new resources become available. diff --git a/poll-log.txt b/poll-log.txt new file mode 100644 index 0000000000..409e0c4a39 --- /dev/null +++ b/poll-log.txt @@ -0,0 +1,2 @@ +2026-04-25T00:05:12Z — board.scan — Complete: No actionable issues (Epic #14 at po:accept human gate) +2026-04-25T00:05:23Z — board.scan — Complete: No actionable issues (Epic #14 at po:accept human gate)