Merged
14 changes: 12 additions & 2 deletions .github/workflows/docker-build.yml
@@ -17,6 +17,13 @@ jobs:
- name: Checkout code
uses: actions/checkout@v4

- name: Prepare environment file for CI
run: |
if [ ! -f installer/.env ]; then
cp installer/.env.example installer/.env
echo "Created .env file from .env.example"
fi

- name: Validate docker-compose files
run: |
cd installer
@@ -32,7 +39,7 @@

strategy:
matrix:
service: [slackbot, lappy, startup-data-loader, file-uploader]
service: [slackbot, lap-detector, startup-data-loader, file-uploader]

steps:
- name: Checkout code
@@ -52,7 +59,10 @@
- name: Extract metadata
id: meta
run: |
echo -e "tags=${{ env.REGISTRY }}/${IMAGE_NAME_LOWER}/${{ matrix.service }}:latest\n${{ env.REGISTRY }}/${IMAGE_NAME_LOWER}/${{ matrix.service }}:${{ github.sha }}" >> $GITHUB_OUTPUT
echo "tags<<EOF" >> $GITHUB_OUTPUT
echo "${{ env.REGISTRY }}/${IMAGE_NAME_LOWER}/${{ matrix.service }}:latest" >> $GITHUB_OUTPUT
echo "${{ env.REGISTRY }}/${IMAGE_NAME_LOWER}/${{ matrix.service }}:${{ github.sha }}" >> $GITHUB_OUTPUT
echo "EOF" >> $GITHUB_OUTPUT

- name: Build and push Docker image
uses: docker/build-push-action@v5
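The heredoc change above replaces a single `echo -e` line: GitHub Actions does not interpret an embedded `\n` as a value separator in `$GITHUB_OUTPUT`, so multiline outputs need the `name<<DELIMITER` block syntax. A minimal sketch of the pattern, using a temp file in place of the runner-provided `$GITHUB_OUTPUT` and placeholder registry/image/sha values:

```shell
#!/bin/sh
# Simulate the multiline-output pattern from the workflow. GitHub Actions
# parses "name<<DELIMITER" blocks in $GITHUB_OUTPUT, so two image tags can
# travel as one multiline value. All variable values here are stand-ins.
GITHUB_OUTPUT=$(mktemp)
REGISTRY="ghcr.io"
IMAGE_NAME_LOWER="example-org/example-repo"
SERVICE="slackbot"
SHA="abc123"

{
  echo "tags<<EOF"
  echo "${REGISTRY}/${IMAGE_NAME_LOWER}/${SERVICE}:latest"
  echo "${REGISTRY}/${IMAGE_NAME_LOWER}/${SERVICE}:${SHA}"
  echo "EOF"
} >> "$GITHUB_OUTPUT"

cat "$GITHUB_OUTPUT"
```

Anything between the `tags<<EOF` line and the matching `EOF` line becomes the value of `steps.meta.outputs.tags`, which is exactly the shape `docker/build-push-action` expects for its newline-separated `tags` input.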
9 changes: 9 additions & 0 deletions .github/workflows/stack-smoke-test.yml
@@ -29,8 +29,17 @@ jobs:
- name: Ensure CI script is executable
run: chmod +x ./dev-utils/ci/stack-smoke-test.sh

- name: Prepare environment file for CI
run: |
if [ ! -f installer/.env ]; then
cp installer/.env.example installer/.env
echo "Created .env file from .env.example"
fi
ls -la installer/.env

- name: Run installer smoke test
env:
CI: "true"
DOCKER_BUILDKIT: "1"
COMPOSE_DOCKER_CLI_BUILD: "1"
run: ./dev-utils/ci/stack-smoke-test.sh
6 changes: 6 additions & 0 deletions .gitignore
@@ -203,7 +203,13 @@ installer/*.dbc
!installer/example.dbc

installer/slackbot/logs/*

installer/sandbox/prompt-guide.txt

wfr-telemetry
/installer/data-downloader/data
installer/slackbot/*.png
!installer/slackbot/lappy_test_image.png
installer/slackbot/*.jpg
installer/slackbot/*.jpeg

45 changes: 45 additions & 0 deletions installer/data-downloader/README.md
@@ -21,6 +21,51 @@ Both JSON files are shared through the `./data` directory so every service (fron
3. Open http://localhost:3000 to access the web UI, and keep the API running on http://localhost:8000 if you want to call it directly.

## Runtime behaviour
```mermaid
sequenceDiagram
participant Worker as periodic_worker.py
participant Service as DataDownloaderService
participant Scanner as server_scanner.py
participant Slicks as slicks library
participant InfluxDB as InfluxDB3
participant Storage as JSON Storage

Worker->>Service: run_full_scan(source="periodic")
Service->>Service: Sort seasons by year (newest first)

loop For each season (WFR25, WFR26)
Service->>Scanner: scan_runs(ScannerConfig{<br/>database: season.database,<br/>year: season.year})
Scanner->>Slicks: connect_influxdb3(url, token, db)
Scanner->>Slicks: scan_data_availability(start, end, table, bin_size)

loop Adaptive scanning (inside slicks)
Slicks->>InfluxDB: Try query_grouped_bins()<br/>(DATE_BIN + COUNT(*))
alt Success
InfluxDB-->>Slicks: Return bins with counts
else Failure (timeout/size)
Slicks->>Slicks: Binary subdivision
Slicks->>InfluxDB: query_exists_per_bin()<br/>(SELECT 1 LIMIT 1 per bin)
InfluxDB-->>Slicks: Return existence flags
end
end

Slicks-->>Scanner: ScanResult (windows)
Scanner-->>Service: List[dict] (formatted runs)

Service->>Service: fetch_unique_sensors(season.database)
Service->>Storage: runs_repos[season.name].merge_scanned_runs(runs)
Storage-->>Storage: Atomic write to runs_WFR25.json
Service->>Storage: sensors_repos[season.name].write_sensors(sensors)
Storage-->>Storage: Atomic write to sensors_WFR25.json

alt Season scan failed
Service->>Service: Log error, continue to next season
end
end

Service->>Storage: status_repo.mark_finish(success)
Storage-->>Storage: Update scanner_status.json
```

- `frontend` serves the compiled React bundle via nginx and now proxies `/api` requests (including `/api/scan` and `/api/scanner-status`) directly to the FastAPI container. When the UI is loaded from anything other than `localhost`, the client automatically falls back to relative `/api/...` calls so a single origin on a VPS still reaches the backend. Override `VITE_API_BASE_URL` if you want the UI to talk to a different host (for example when running `npm run dev` locally) and keep that host in `ALLOWED_ORIGINS`.
- `api` runs `uvicorn backend.app:app`, exposing
96 changes: 86 additions & 10 deletions installer/data-downloader/backend/app.py
@@ -4,6 +4,7 @@

from fastapi import BackgroundTasks, FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import HTMLResponse
from pydantic import BaseModel

from backend.config import get_settings
@@ -40,14 +41,19 @@ def healthcheck() -> dict:
return {"status": "ok"}


@app.get("/api/seasons")
def list_seasons() -> List[dict]:
return service.get_seasons()


@app.get("/api/runs")
def list_runs() -> dict:
return service.get_runs()
def list_runs(season: str | None = None) -> dict:
return service.get_runs(season=season)


@app.get("/api/sensors")
def list_sensors() -> dict:
return service.get_sensors()
def list_sensors(season: str | None = None) -> dict:
return service.get_sensors(season=season)


@app.get("/api/scanner-status")
@@ -56,10 +62,10 @@ def scanner_status() -> dict:


@app.post("/api/runs/{key}/note")
def save_note(key: str, payload: NotePayload) -> dict:
run = service.update_note(key, payload.note.strip())
def save_note(key: str, payload: NotePayload, season: str | None = None) -> dict:
run = service.update_note(key, payload.note.strip(), season=season)
if not run:
raise HTTPException(status_code=404, detail=f"Run {key} not found")
raise HTTPException(status_code=404, detail=f"Run {key} not found (season={season})")
return run


@@ -69,7 +75,77 @@ def trigger_scan(background_tasks: BackgroundTasks) -> dict:
return {"status": "scheduled"}


@app.post("/api/data/query")
def query_data(payload: DataQueryPayload) -> dict:
@app.post("/api/query")
def query_signal(payload: DataQueryPayload, season: str | None = None) -> dict:
limit = None if payload.no_limit else (payload.limit or 2000)
return service.query_signal_series(payload.signal, payload.start, payload.end, limit)
return service.query_signal_series(
payload.signal,
payload.start,
payload.end,
limit,
season=season
)


@app.get("/", response_class=HTMLResponse)
def index():
"""Simple status page for debugging."""
influx_status = "Unknown"
influx_color = "gray"
try:
service._log_influx_connectivity()
influx_status = "Connected"
influx_color = "green"
except Exception as e:
influx_status = f"Error: {e}"
influx_color = "red"

# Default to first season for overview
runs = service.get_runs()
sensors = service.get_sensors()
scanner_status = service.get_scanner_status()
seasons_list = service.get_seasons()
seasons_html = ", ".join([f"{s['name']} ({s['year']})" for s in seasons_list])

html = f"""
<!DOCTYPE html>
<html>
<head>
<title>DAQ Data Downloader Status</title>
<style>
body {{ font-family: sans-serif; max-width: 800px; margin: 2rem auto; line-height: 1.6; }}
h1 {{ border-bottom: 2px solid #eee; padding-bottom: 0.5rem; }}
.card {{ border: 1px solid #ddd; border-radius: 8px; padding: 1.5rem; margin-bottom: 1.5rem; }}
.status-ok {{ color: green; font-weight: bold; }}
.status-err {{ color: red; font-weight: bold; }}
code {{ background: #f4f4f4; padding: 2px 5px; border-radius: 4px; }}
</style>
</head>
<body>
<h1>DAQ Data Downloader Status</h1>

<div class="card">
<h2>System Status</h2>
<p><strong>InfluxDB Connection:</strong> <span style="color: {influx_color}">{influx_status}</span></p>
<p><strong>Scanner Status:</strong> {scanner_status.get('status', 'Unknown')} (Last run: {scanner_status.get('last_run', 'Never')})</p>
<p><strong>API Version:</strong> 1.1.0 (Multi-Season Support)</p>
</div>

<div class="card">
<h2>Active Config</h2>
<p><strong>Seasons Configured:</strong> {seasons_html}</p>
</div>

<div class="card">
<h2>Default Season Stats ({seasons_list[0]['name'] if seasons_list else 'None'})</h2>
<ul>
<li><strong>Runs Found:</strong> {len(runs.get('runs', []))}</li>
<li><strong>Sensors Found:</strong> {len(sensors.get('sensors', []))}</li>
</ul>
</div>

<p><a href="/docs">API Docs</a> | <a href="/api/seasons">JSON Seasons List</a> | <a href="http://localhost:3000">Frontend</a></p>
</body>
</html>
"""
return HTMLResponse(content=html)
56 changes: 53 additions & 3 deletions installer/data-downloader/backend/config.py
@@ -12,26 +12,76 @@ def _parse_origins(raw: str | None) -> List[str]:
return [origin.strip() for origin in raw.split(",") if origin.strip()]


class SeasonConfig(BaseModel):
name: str # e.g. "WFR25"
year: int # e.g. 2025
database: str # e.g. "WFR25"
color: str | None = None # e.g. "222 76 153"


def _parse_seasons(raw: str | None) -> List[SeasonConfig]:
"""Parse SEASONS env var: "WFR25:2025:222 76 153,WFR26:2026:..."."""
if not raw:
# Default fallback if not set
return [SeasonConfig(name="WFR25", year=2025, database="WFR25", color="#DE4C99")]

seasons = []
for part in raw.split(","):
part = part.strip()
if not part:
continue
try:
# Split into at most 3 parts: Name, Year, Color
parts = part.split(":", 2)
name = parts[0]

if len(parts) >= 2:
year = int(parts[1])
else:
# Malformed or simple format not supported purely by regex?
# Actually if just "WFR25", split gives ['WFR25']
# require at least year
continue

color = parts[2] if len(parts) > 2 else None

# Assume DB name matches Season Name
seasons.append(SeasonConfig(name=name, year=year, database=name, color=color))
except ValueError:
continue

if not seasons:
return [SeasonConfig(name="WFR25", year=2025, database="WFR25")]

# Sort by year descending (newest first)
seasons.sort(key=lambda s: s.year, reverse=True)
return seasons


class Settings(BaseModel):
"""Centralised configuration pulled from environment variables."""

data_dir: str = Field(default_factory=lambda: os.getenv("DATA_DIR", "./data"))

influx_host: str = Field(default_factory=lambda: os.getenv("INFLUX_HOST", "http://localhost:9000"))
influx_token: str = Field(default_factory=lambda: os.getenv("INFLUX_TOKEN", ""))
influx_database: str = Field(default_factory=lambda: os.getenv("INFLUX_DATABASE", "WFR25"))

# Global/Default Influx settings (used for connectivity check or default fallback)
influx_schema: str = Field(default_factory=lambda: os.getenv("INFLUX_SCHEMA", "iox"))
influx_table: str = Field(default_factory=lambda: os.getenv("INFLUX_TABLE", "WFR25"))

scanner_year: int = Field(default_factory=lambda: int(os.getenv("SCANNER_YEAR", "2025")))
scanner_bin: str = Field(default_factory=lambda: os.getenv("SCANNER_BIN", "hour")) # hour or day
seasons: List[SeasonConfig] = Field(default_factory=lambda: _parse_seasons(os.getenv("SEASONS")))

# Scanner settings common to all seasons (unless we want per-season granularity later)
scanner_bin: str = Field(default_factory=lambda: os.getenv("SCANNER_BIN", "hour"))
scanner_include_counts: bool = Field(default_factory=lambda: os.getenv("SCANNER_INCLUDE_COUNTS", "true").lower() == "true")
scanner_initial_chunk_days: int = Field(default_factory=lambda: int(os.getenv("SCANNER_INITIAL_CHUNK_DAYS", "31")))

sensor_window_days: int = Field(default_factory=lambda: int(os.getenv("SENSOR_WINDOW_DAYS", "7")))
sensor_lookback_days: int = Field(default_factory=lambda: int(os.getenv("SENSOR_LOOKBACK_DAYS", "30")))

periodic_interval_seconds: int = Field(default_factory=lambda: int(os.getenv("SCAN_INTERVAL_SECONDS", "3600")))
scan_daily_time: str | None = Field(default_factory=lambda: os.getenv("SCAN_DAILY_TIME"))

allowed_origins: List[str] = Field(default_factory=lambda: _parse_origins(os.getenv("ALLOWED_ORIGINS", "*")))

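The `_parse_seasons` helper added above accepts comma-separated `NAME:YEAR[:COLOR]` entries and sorts them newest-first. A minimal, self-contained sketch of that behaviour (a simplified stand-in, not the module itself — it omits the real function's default-season fallback, and it assumes, as the diff does, that the database name matches the season name):

```python
def parse_seasons(raw):
    """Parse "NAME:YEAR[:COLOR]" entries, comma-separated, newest-first."""
    seasons = []
    for part in (raw or "").split(","):
        part = part.strip()
        if not part:
            continue
        pieces = part.split(":", 2)      # at most: name, year, color
        if len(pieces) < 2:
            continue                     # a bare name carries no year; skip
        try:
            year = int(pieces[1])
        except ValueError:
            continue                     # non-numeric year; skip the entry
        color = pieces[2] if len(pieces) > 2 else None
        # Database name is assumed to match the season name
        seasons.append({"name": pieces[0], "year": year,
                        "database": pieces[0], "color": color})
    seasons.sort(key=lambda s: s["year"], reverse=True)
    return seasons

print(parse_seasons("WFR25:2025:222 76 153,WFR26:2026"))
```

Splitting with `split(":", 2)` is what lets a color value such as `222 76 153` keep its spaces: only the first two colons delimit fields.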
15 changes: 13 additions & 2 deletions installer/data-downloader/backend/influx_queries.py
@@ -14,7 +14,14 @@ def _normalize(dt: datetime) -> datetime:
return dt.astimezone(timezone.utc)


def fetch_signal_series(settings: Settings, signal: str, start: datetime, end: datetime, limit: int | None) -> dict:
def fetch_signal_series(
settings: Settings,
signal: str,
start: datetime,
end: datetime,
limit: int | None,
database: str | None = None
) -> dict:
start_dt = _normalize(start)
end_dt = _normalize(end)
if start_dt >= end_dt:
@@ -35,8 +42,11 @@ def fetch_signal_series(settings: Settings, signal: str, start: datetime, end: d
AND time <= TIMESTAMP '{end_dt.isoformat()}'
ORDER BY time{limit_clause}
"""

# Use provided database or fallback to default setting
target_db = database if database else settings.influx_database

with InfluxDBClient3(host=settings.influx_host, token=settings.influx_token, database=settings.influx_database) as client:
with InfluxDBClient3(host=settings.influx_host, token=settings.influx_token, database=target_db) as client:
tbl = client.query(sql)
points = []
for idx in range(tbl.num_rows):
@@ -56,6 +66,7 @@ def fetch_signal_series(settings: Settings, signal: str, start: datetime, end: d
"start": start_dt.isoformat(),
"end": end_dt.isoformat(),
"limit": limit,
"database": target_db,
"row_count": len(points),
"points": points,
"sql": " ".join(line.strip() for line in sql.strip().splitlines()),
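The `influx_queries.py` change threads an optional `database` argument through to the client so each season can query its own InfluxDB database, falling back to the configured default when none is given; the existing optional `LIMIT` handling works the same way. A tiny sketch of just those two decisions (the default value here is a placeholder for `settings.influx_database`):

```python
DEFAULT_DATABASE = "WFR25"   # stand-in for settings.influx_database

def resolve_query_target(limit, database=None):
    """Mirror the two small choices in fetch_signal_series: build an
    optional LIMIT clause, and pick the per-call database override
    with a fallback to the default setting."""
    limit_clause = f" LIMIT {limit}" if limit is not None else ""
    target_db = database if database else DEFAULT_DATABASE
    return target_db, limit_clause

print(resolve_query_target(2000, database="WFR26"))  # ('WFR26', ' LIMIT 2000')
```

Returning the chosen database in the response payload (the new `"database"` key in the diff) makes it easy for the frontend to confirm which season actually served the data.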