[Schedules 4/8] Schedule execution logic — deploy, prompt, monitor, teardown #32

@sre-helmcode

Implementation Order: 4 of 8

Depends on: #31 (scheduler engine)
Feature: Schedules — Recurring automated tasks for AI agent teams

Summary

Implement the execution logic that runs when a schedule triggers. This is the function called by the scheduler engine for each due schedule. It handles the full lifecycle: deploy team → send prompt → wait for completion → record result → stop team.

Execution Flow

executeSchedule(schedule)
  ├── 1. Create ScheduleRun record (status: running)
  ├── 2. Update Schedule.Status = "running", Schedule.LastRunAt = now
  ├── 3. Deploy team on-demand (reuse existing deploy logic)
  │     └── Wait for all agents to be running
  ├── 4. Send prompt as chat message to the team
  ├── 5. Monitor execution:
  │     ├── Wait for team to finish processing
  │     └── Enforce timeout (see below)
  ├── 6. Record result:
  │     ├── Success → ScheduleRun.Status = "success"
  │     ├── Timeout → ScheduleRun.Status = "timeout"
  │     └── Error → ScheduleRun.Status = "failed", ScheduleRun.Error = err.Error()
  ├── 7. Stop team (teardown containers/pods)
  └── 8. Update Schedule.Status = "idle" (or "error" if failed)
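
Step 6 comes down to mapping whatever error steps 3 to 5 return onto a run status. The following is a minimal sketch of that mapping, not the project's actual code. It assumes the timeout is enforced with a context deadline, so a timed-out run shows up as `context.DeadlineExceeded` somewhere in the error chain. `classifyResult`, the `Status*` constants, and the example errors are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Run statuses named in the execution flow.
const (
	StatusSuccess = "success"
	StatusTimeout = "timeout"
	StatusFailed  = "failed"
)

// Example errors used only for the demonstration below.
var (
	errExampleTimeout = fmt.Errorf("wait for team: %w", context.DeadlineExceeded)
	errExampleDeploy  = errors.New("deploy failed")
)

// classifyResult maps the error from steps 3-5 to a run status and the
// message to store in ScheduleRun.Error (empty unless the run failed).
// Assumption: timeouts surface as context.DeadlineExceeded.
func classifyResult(err error) (status, message string) {
	switch {
	case err == nil:
		return StatusSuccess, ""
	case errors.Is(err, context.DeadlineExceeded):
		return StatusTimeout, ""
	default:
		return StatusFailed, err.Error()
	}
}

func main() {
	for _, err := range []error{nil, errExampleTimeout, errExampleDeploy} {
		status, msg := classifyResult(err)
		fmt.Printf("status=%s error=%q\n", status, msg)
	}
}
```

Using `errors.Is` rather than `==` keeps the timeout detectable when intermediate layers wrap the error with `%w`.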

Timeout Configuration

  • Default: 1 hour, configurable via the SCHEDULE_TIMEOUT env var (Go duration format, e.g. 1h, 30m, 2h)
  • If execution exceeds timeout:
    • Mark ScheduleRun as timeout
    • Force-stop the team
    • Log warning
  • User warning: The API should include the configured timeout in schedule responses so the frontend can display a warning during creation.
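
Reading the timeout could look roughly like the sketch below, which uses the standard `time.ParseDuration`. The function name `parseScheduleTimeout` is hypothetical. Rejecting invalid or non-positive values with an error, instead of silently falling back to the default, is an assumption that the issue does not specify.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// defaultScheduleTimeout is the 1 hour default described in the issue.
const defaultScheduleTimeout = time.Hour

// parseScheduleTimeout turns the raw SCHEDULE_TIMEOUT value into a duration.
// An empty value yields the default. Returning an error for bad input is an
// assumption; a fallback to the default would be an equally valid choice.
func parseScheduleTimeout(raw string) (time.Duration, error) {
	if raw == "" {
		return defaultScheduleTimeout, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid SCHEDULE_TIMEOUT %q: %w", raw, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("SCHEDULE_TIMEOUT must be positive, got %s", d)
	}
	return d, nil
}

func main() {
	timeout, err := parseScheduleTimeout(os.Getenv("SCHEDULE_TIMEOUT"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("schedule timeout:", timeout)
}
```

The resulting duration can be passed to `context.WithTimeout` for enforcement and echoed in schedule responses for the frontend warning.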

Team Deployment Link

  • Each ScheduleRun stores team_deployment_id — this links to the team's chat/activity history
  • The user can click on a run and see the full conversation that the agents had during that execution
  • This is essential for debugging automations
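
One possible shape for the run record is sketched below. Only the `team_deployment_id` field name comes from this issue. The struct name's other fields, their JSON tags, and the `decodeRun` helper are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// ScheduleRun is a hypothetical shape for the run record. Only the
// team_deployment_id field name is taken from the issue text.
type ScheduleRun struct {
	ID               string    `json:"id"`
	ScheduleID       string    `json:"schedule_id"`
	TeamDeploymentID string    `json:"team_deployment_id"` // links to chat/activity history
	Status           string    `json:"status"`
	Error            string    `json:"error,omitempty"`
	StartedAt        time.Time `json:"started_at"`
	FinishedAt       time.Time `json:"finished_at"`
}

// decodeRun parses a JSON run record, as a frontend-facing API might return it.
func decodeRun(raw string) (ScheduleRun, error) {
	var r ScheduleRun
	err := json.Unmarshal([]byte(raw), &r)
	return r, err
}

func main() {
	run, err := decodeRun(`{"id":"abc123","team_deployment_id":"dep-42","status":"success"}`)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("run %s -> deployment %s\n", run.ID, run.TeamDeploymentID)
}
```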

Concurrency Considerations

  • Each execution runs in its own goroutine (from the scheduler engine)
  • Multiple executions of the same schedule can run in parallel
  • Use the schedule ID + run ID to distinguish between concurrent executions
  • Team names for on-demand deployments should include the run ID to avoid conflicts (e.g., marketing-team-run-abc123)
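
A small sketch of the naming rule, with two overlapping runs of the same schedule started in separate goroutines. `runTeamName` is a hypothetical helper; the `-run-` separator follows the example above.

```go
package main

import (
	"fmt"
	"sync"
)

// runTeamName derives a per-run team name, e.g. "marketing-team-run-abc123",
// so concurrent runs of one schedule never share a team name.
func runTeamName(baseName, runID string) string {
	return fmt.Sprintf("%s-run-%s", baseName, runID)
}

func main() {
	runIDs := []string{"abc123", "def456"}
	names := make([]string, len(runIDs))

	var wg sync.WaitGroup
	for i, id := range runIDs {
		wg.Add(1)
		// Each goroutine writes only to its own slice index, so no lock is needed.
		go func(i int, id string) {
			defer wg.Done()
			names[i] = runTeamName("marketing-team", id)
		}(i, id)
	}
	wg.Wait()

	fmt.Println(names) // prints [marketing-team-run-abc123 marketing-team-run-def456]
}
```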

Error Handling

  • If team deploy fails → mark run as failed, don't attempt chat
  • If chat send fails → mark run as failed, attempt to stop team
  • If team stop fails → log error and mark run as failed, so orphaned containers are visible and can be cleaned up
  • Always attempt cleanup (stop team) even on errors
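
These rules fit Go's `defer` well. The sketch below is one way to arrange them, not the project's implementation. `TeamRuntime`, `runOnce`, `fakeRuntime`, and `demo` are hypothetical names, and it assumes `Stop` is safe to call on a team that was only partially deployed. `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// TeamRuntime is a hypothetical stand-in for the existing deploy and chat logic.
type TeamRuntime interface {
	Deploy(ctx context.Context, teamName string) error
	SendPrompt(ctx context.Context, teamName, prompt string) error
	Stop(ctx context.Context, teamName string) error
}

// runOnce deploys, sends the prompt, and always attempts teardown.
func runOnce(ctx context.Context, rt TeamRuntime, teamName, prompt string) (err error) {
	// Deferred first so cleanup runs on every path, including a failed deploy.
	defer func() {
		// Fresh context: the run's ctx may already be cancelled or timed out.
		stopCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
		defer cancel()
		if stopErr := rt.Stop(stopCtx, teamName); stopErr != nil {
			// A failed stop fails the run even if the work itself succeeded.
			err = errors.Join(err, fmt.Errorf("stop team: %w", stopErr))
		}
	}()

	if deployErr := rt.Deploy(ctx, teamName); deployErr != nil {
		return fmt.Errorf("deploy team: %w", deployErr) // chat is never attempted
	}
	if sendErr := rt.SendPrompt(ctx, teamName, prompt); sendErr != nil {
		return fmt.Errorf("send prompt: %w", sendErr)
	}
	return nil
}

// fakeRuntime records calls and fails at the configured step (demo only).
type fakeRuntime struct {
	failAt string // "deploy", "prompt", "stop", or "" for no failure
	calls  []string
}

func (f *fakeRuntime) step(name string) error {
	f.calls = append(f.calls, name)
	if f.failAt == name {
		return errors.New(name + " failed")
	}
	return nil
}

func (f *fakeRuntime) Deploy(ctx context.Context, team string) error { return f.step("deploy") }
func (f *fakeRuntime) SendPrompt(ctx context.Context, team, prompt string) error {
	return f.step("prompt")
}
func (f *fakeRuntime) Stop(ctx context.Context, team string) error { return f.step("stop") }

// demo runs one execution against the fake and reports which steps were called.
func demo(failAt string) ([]string, error) {
	rt := &fakeRuntime{failAt: failAt}
	err := runOnce(context.Background(), rt, "marketing-team-run-abc123", "weekly report")
	return rt.calls, err
}

func main() {
	for _, failAt := range []string{"", "deploy", "prompt", "stop"} {
		calls, err := demo(failAt)
		fmt.Printf("failAt=%q calls=%v err=%v\n", failAt, calls, err)
	}
}
```

Running teardown under a fresh context matters for the timeout path: if the stop call reused the expired run context, it would fail immediately and leave the team running.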

Acceptance Criteria

  • executeSchedule(schedule) function in internal/scheduler/
  • Creates ScheduleRun record at start
  • Deploys team on-demand using existing runtime logic
  • Sends prompt via existing chat mechanism
  • Enforces configurable timeout (SCHEDULE_TIMEOUT env var, default 1h)
  • Records run result (success/failed/timeout) with timestamps
  • Stops team after completion or timeout
  • Handles errors gracefully — always cleans up
  • Links run to team deployment for conversation viewing
  • Integration tests for the full execution lifecycle

Metadata

Labels: feature (New feature or request)
Status: Done