Closed
21 commits
54 changes: 42 additions & 12 deletions roles/telemetry_chargeback/README.md
@@ -33,31 +33,61 @@ The role uses the following variables to control the testing environment and exe
| Variable | Default Value | Description |
|----------|---------------|-------------|
| `openstack_cmd` | `openstack` | The command used to execute OpenStack CLI calls. This can be customized if the binary is not in the standard PATH. |
| `cloudkitty_debug` | `false` | Enable debug mode for CloudKitty database dumps. |
| `logs_dir_zuul` | `{{ ansible_env.HOME }}/ci-framework-data/logs` | Directory for log files. |
| `artifacts_dir_zuul` | `{{ ansible_env.HOME }}/ci-framework-data/artifacts` | Directory for generated artifacts and test output. |
| `cert_dir` | `{{ ansible_user_dir }}/ck-certs` | Directory for CloudKitty client certificates. |
| `local_cert_dir` | `{{ ansible_env.HOME }}/ci-framework-data/flush_certs` | Local directory for certificate extraction. |
| `cloudkitty_namespace` | `openstack` | Kubernetes namespace where CloudKitty is deployed. |

How It Works
------------

### Internal Variables (vars/main.yml)
The role executes the following workflow:

These variables are used internally by the role and typically do not need to be modified.
1. **CloudKitty Validation** - Enables the hashmap rating module and sets its priority to 100.
2. **Loki Environment Setup** - Extracts Loki route information and certificates from the OpenShift cluster.
3. **Admin Credentials** - Retrieves admin project ID and user ID for test data generation.
4. **Scenario Discovery** - Finds all `test_*.yml` scenario files in the scenario directory.
5. **Scenario Loop** - For each scenario file found (exposed as `{{ scenario_name }}`):
- Generates synthetic Loki log data based on the scenario configuration
- Calculates expected chargeback metrics from the generated data
- Loads the metrics for validation
6. **Cleanup** - Removes temporary certificate directories.

| Variable | Default Value | Description |
|----------|---------------|-------------|
| `logs_dir_zuul` | `/home/zuul/ci-framework-data/logs` | Remote directory for log files. |
| `artifacts_dir_zuul` | `/home/zuul/ci-framework-data/artifacts` | Directory for generated artifacts. |
| `cloudkitty_synth_script` | `{{ role_path }}/files/gen_synth_loki_data.py` | Path to the synthetic data generation script. |
| `cloudkitty_data_template` | `{{ role_path }}/templates/loki_data_templ.j2` | Path to the Jinja2 template for Loki data format. |
| `ck_data_config` | `{{ role_path }}/files/test_static.yml` | Path to the scenario configuration file. |
| `ck_output_file_local` | `{{ artifacts_dir_zuul }}/loki_synth_data.json` | Local path for generated synthetic data. |
| `ck_output_file_remote` | `{{ logs_dir_zuul }}/gen_loki_synth_data.log` | Remote destination for synthetic data. |
The role uses `{{ scenario_name }}` as the loop variable when processing multiple test scenarios, so the task output shows which scenario is currently running.

Scenario Configuration
----------------------
The synthetic data generation is controlled by a YAML configuration file (`files/test_static.yml`). This file defines:
Synthetic data generation is controlled by YAML configuration files in the `files/` directory. Any file matching the pattern `test_*.yml` is discovered and run automatically.

**Available scenarios:**
- `test_static.yml` - Static test scenario with predefined values
- `test_dyn_basic.yml` - Dynamic test scenario with variable values over time
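
The discovery rule can be pictured with a short Python sketch. This is an illustration only: the role performs discovery in Ansible, and `discover_scenarios` is a hypothetical helper that assumes discovery is a plain filename glob over one directory.

```python
import tempfile
from pathlib import Path


def discover_scenarios(files_dir: Path) -> list[str]:
    """Return the stem of every test_*.yml file in files_dir, sorted by name."""
    return sorted(p.stem for p in files_dir.glob("test_*.yml") if p.is_file())


# Build a throwaway directory holding two scenarios and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    files_dir = Path(tmp)
    for name in ("test_static.yml", "test_dyn_basic.yml", "notes.yml"):
        (files_dir / name).write_text("---\n", encoding="utf-8")
    found = discover_scenarios(files_dir)

print(found)  # ['test_dyn_basic', 'test_static']
```

Only the two `test_*.yml` files are returned; `notes.yml` does not match the pattern and is skipped.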

Each scenario file defines:

* **generation** - Time range configuration (days, step_seconds)
* **log_types** - List of log type definitions with name, type, unit, qty, price, groupby, and metadata
* **required_fields** - Fields required for validation
* **date_fields** - Date fields to add to groupby (week_of_the_year, day_of_the_year, month, year)
* **loki_stream** - Loki stream configuration (service name)
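
As a rough picture of that layout, the sketch below models a scenario as a Python dict and checks that the five documented sections are present. The section names come from the list above; every value, the `service` key under `loki_stream`, and the `missing_sections` helper are assumptions made for illustration, not content of the shipped scenario files.

```python
EXPECTED_SECTIONS = (
    "generation",
    "log_types",
    "required_fields",
    "date_fields",
    "loki_stream",
)


def missing_sections(scenario: dict) -> list[str]:
    """List the documented top-level sections that a scenario lacks."""
    return [key for key in EXPECTED_SECTIONS if key not in scenario]


# Hypothetical scenario content; real files may use different values.
scenario = {
    "generation": {"days": 1, "step_seconds": 3600},
    "log_types": [
        {
            "name": "example_metric",
            "type": "example_type",
            "unit": "instance",
            "qty": 1.0,
            "price": 0.5,
            "groupby": {},
            "metadata": {},
        }
    ],
    "required_fields": ["qty", "price"],
    "date_fields": ["week_of_the_year", "day_of_the_year", "month", "year"],
    "loki_stream": {"service": "cloudkitty"},  # key name is an assumption
}

print(missing_sections(scenario))  # []
```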

### Data Generation Script Options

The `gen_synth_loki_data.py` script supports the following options:

* `--tmpl` - Path to the Jinja2 template file (required)
* `-t, --test` - Path to the scenario YAML file (required)
* `-o, --output` - Path for the output JSON file (required)
* `-p, --project-id` - Optional project ID to override the scenario file value
* `-u, --user-id` - Optional user ID to override the scenario file value
* `--ascending` - Sort timestamps in ascending order (oldest first, newest last)
* `--descending` - Sort timestamps in descending order (newest first, oldest last) - **default**
* `--debug` - Enable debug logging

By default, the script generates data in descending order (newest timestamps first), which is the expected format for Loki ingestion.
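
The option list above could be expressed with `argparse` roughly as follows. This is a sketch under assumptions, not the parser from `gen_synth_loki_data.py`: the argument types, the `ascending` destination name, and the use of a mutually exclusive group for the two ordering flags are guesses that merely reproduce the documented behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (hypothetical)."""
    parser = argparse.ArgumentParser(description="Generate synthetic Loki data.")
    parser.add_argument("--tmpl", required=True, help="Jinja2 template file")
    parser.add_argument("-t", "--test", required=True, help="Scenario YAML file")
    parser.add_argument("-o", "--output", required=True, help="Output JSON file")
    parser.add_argument("-p", "--project-id", default=None)
    parser.add_argument("-u", "--user-id", default=None)
    # The two ordering flags share one destination; descending is the default.
    order = parser.add_mutually_exclusive_group()
    order.add_argument(
        "--ascending", dest="ascending", action="store_true", default=False
    )
    order.add_argument(
        "--descending", dest="ascending", action="store_false", default=False
    )
    parser.add_argument("--debug", action="store_true")
    return parser


required = ["--tmpl", "loki_data_templ.j2", "-t", "test_static.yml", "-o", "out.json"]
default_args = build_parser().parse_args(required)
ascending_args = build_parser().parse_args(required + ["--ascending"])

print(default_args.ascending, ascending_args.ascending)  # False True
```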

Dependencies
------------
This role has no direct hard dependencies on other Ansible roles.
28 changes: 28 additions & 0 deletions roles/telemetry_chargeback/defaults/main.yml
@@ -1,2 +1,30 @@
---
# OpenStack CLI command
openstack_cmd: "openstack"

# Directory paths
logs_dir_zuul: "{{ ansible_env.HOME }}/ci-framework-data/logs"
artifacts_dir_zuul: "{{ ansible_env.HOME }}/ci-framework-data/artifacts"
cert_dir: "{{ ansible_user_dir }}/ck-certs"
local_cert_dir: "{{ ansible_env.HOME }}/ci-framework-data/flush_certs"
remote_cert_dir: "osp-certs"

# Debug mode settings
cloudkitty_debug: false
cloudkitty_debug_dir: "{{ artifacts_dir_zuul + '/debug_ck_db' }}"

# Cloudkitty certificates and secrets
cert_secret_name: "cert-cloudkitty-client-internal"
client_secret: "secret/cloudkitty-lokistack-gateway-client-http"
ca_configmap: "cm/cloudkitty-lokistack-ca-bundle"

# LogQL Query
logql_query: "{{ loki_query | default('{service=\"cloudkitty\"}') }}"

# OpenShift/Kubernetes settings
cloudkitty_namespace: "openstack"
openstackpod: "openstackclient"

# Time window settings
lookback: 6
limit: 50
85 changes: 59 additions & 26 deletions roles/telemetry_chargeback/files/gen_db_summary.py
@@ -119,7 +119,7 @@ def _apply_mutate(qty: float, mutate: str) -> float:
return math.floor(qty)
elif mutate_upper == "NUMBOOL":
# If qty equals 0, leave it at 0. Else, set it to 1.
return 0.0 if qty == 0 else 1.0
return 0.0 if abs(qty) < 1e-9 else 1.0
elif mutate_upper == "NOTNUMBOOL":
# If qty equals 0, set it to 1. Else, set it to 0.
return 1.0 if qty == 0 else 0.0
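
In the hunk above, `NUMBOOL` gains a tolerance check while `NOTNUMBOOL` still compares with `qty == 0`. A minimal sketch of one shared zero test for both branches, assuming the same `1e-9` tolerance is wanted in each; `is_zero`, `numbool`, and `notnumbool` are hypothetical names, not functions from `gen_db_summary.py`:

```python
ZERO_TOLERANCE = 1e-9  # same threshold as the NUMBOOL change above


def is_zero(qty: float, tol: float = ZERO_TOLERANCE) -> bool:
    """Treat quantities within tol of zero as zero."""
    return abs(qty) < tol


def numbool(qty: float) -> float:
    """Zero stays 0.0; any other quantity becomes 1.0."""
    return 0.0 if is_zero(qty) else 1.0


def notnumbool(qty: float) -> float:
    """Zero becomes 1.0; any other quantity becomes 0.0."""
    return 1.0 if is_zero(qty) else 0.0


# A tiny rounding residue is classified the same way by both transforms.
print(numbool(1e-12), notnumbool(1e-12))  # 0.0 1.0
```

With a shared helper the two transforms stay exact complements of each other for every input.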
@@ -175,8 +175,9 @@ def _parse_numeric(value: Any, default: float = 0) -> float:

def aggregate_rates_by_type(
pairs: list[tuple[str, str]],
) -> tuple[dict, float]:
sums: defaultdict[str, float] = defaultdict(float)
) -> tuple[dict, float, dict]:
rate_sums: defaultdict[str, float] = defaultdict(float)
qty_sums: defaultdict[str, float] = defaultdict(float)
for _, log_str in pairs:
try:
entry = json.loads(log_str)
@@ -196,17 +197,26 @@ def aggregate_rates_by_type(
except (TypeError, ValueError):
continue

# Apply mutate transformation
# Track raw qty sum (before any transformation)
qty_sums[mtype] += qty

# Apply mutate transformation for rating calculation
qty_mutated = _apply_mutate(qty, mutate)

# Apply factor and offset
qty_rate = qty_mutated * factor + offset

# Calculate rate
sums[mtype] += qty_rate * price
by_types = {k: {"Rate": round(v, 4)} for k, v in sorted(sums.items())}
total = sum(sums.values())
return by_types, total
rate_sums[mtype] += qty_rate * price

by_types = {
k: {"Rate": round(v, 4)} for k, v in sorted(rate_sums.items())
}
qty_by_types = {
k: {"qty_sum": round(v, 4)} for k, v in sorted(qty_sums.items())
}
total = sum(rate_sums.values())
return by_types, total, qty_by_types
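
The dual bookkeeping introduced here (raw quantity sums next to rate sums) can be tried in isolation with the simplified sketch below. It is not the function from this file: the JSON keys `type`, `qty`, and `price` are assumed names, and the mutate, factor, and offset steps are left out, so the rate is simply `qty * price`.

```python
import json
from collections import defaultdict


def aggregate(pairs: list[tuple[str, str]]) -> tuple[dict, float, dict]:
    """Sum rate and raw qty per type from (timestamp, json_string) pairs."""
    rate_sums: defaultdict[str, float] = defaultdict(float)
    qty_sums: defaultdict[str, float] = defaultdict(float)
    for _, log_str in pairs:
        try:
            entry = json.loads(log_str)
            mtype = entry["type"]
            qty = float(entry["qty"])
            price = float(entry["price"])
        except (KeyError, TypeError, ValueError):
            continue  # skip malformed or incomplete entries
        qty_sums[mtype] += qty  # raw quantity, before any transformation
        rate_sums[mtype] += qty * price
    by_types = {k: {"Rate": round(v, 4)} for k, v in sorted(rate_sums.items())}
    qty_by_types = {k: {"qty_sum": round(v, 4)} for k, v in sorted(qty_sums.items())}
    return by_types, sum(rate_sums.values()), qty_by_types


pairs = [
    ("1", json.dumps({"type": "cpu", "qty": 2, "price": 0.5})),
    ("2", json.dumps({"type": "cpu", "qty": 3, "price": 0.5})),
    ("2", json.dumps({"type": "ram", "qty": 1, "price": 0.25})),
    ("3", "not json"),  # ignored
]
by_types, total, qty_by_types = aggregate(pairs)
print(by_types)      # {'cpu': {'Rate': 2.5}, 'ram': {'Rate': 0.25}}
print(qty_by_types)  # {'cpu': {'qty_sum': 5.0}, 'ram': {'qty_sum': 1.0}}
```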


def build_summary(pairs: list[tuple[str, str]]) -> dict[str, Any]:
@@ -237,17 +247,35 @@ def build_summary(pairs: list[tuple[str, str]]) -> dict[str, Any]:
empty = {"nanosec": None, "begin": None, "end": None}
time_block = {"begin_step": empty.copy(), "end_step": empty.copy()}

by_types, total_r = aggregate_rates_by_type(pairs)
# Get aggregated data by type
by_types, total_r, qty_by_types = aggregate_rates_by_type(pairs)

# Get overall time range for by_type entries
begin_time = first.get("start") if pairs else None
end_time = last.get("end") if pairs else None

# Build flat list of entries
rate_list = []
for type_name in sorted(by_types.keys()):
entry = {
"Begin": begin_time,
"End": end_time,
"Qty": qty_by_types.get(type_name, {}).get("qty_sum", 0.0),
"Rate": by_types[type_name]["Rate"],
"Type": type_name,
}
rate_list.append(entry)

return {
"time": time_block,
"data_log": {
"data_summary": {
"total_timesteps": n_ts,
"metrics_per_step": mps,
"log_count": log_count,
"total_rating": round(total_r, 4),
},
"rate": {
"by_types": by_types,
"total": {"Rating": round(total_r, 4)},
"by_type": {
"rate": rate_list,
},
}
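
The flattening step in this hunk can be exercised on its own. The sketch below repeats the loop from the diff as a standalone function; `build_rate_list` is a hypothetical name, and the begin and end values are made-up placeholders rather than real timestamps from a Loki dump.

```python
from typing import Any, Optional


def build_rate_list(
    by_types: dict[str, dict[str, float]],
    qty_by_types: dict[str, dict[str, float]],
    begin: Optional[str],
    end: Optional[str],
) -> list[dict[str, Any]]:
    """Flatten per-type rate and qty sums into one entry per type."""
    return [
        {
            "Begin": begin,
            "End": end,
            # A type without a recorded quantity falls back to 0.0.
            "Qty": qty_by_types.get(name, {}).get("qty_sum", 0.0),
            "Rate": by_types[name]["Rate"],
            "Type": name,
        }
        for name in sorted(by_types)
    ]


rate_list = build_rate_list(
    {"ram": {"Rate": 0.25}, "cpu": {"Rate": 2.5}},
    {"cpu": {"qty_sum": 5.0}},
    begin="begin-placeholder",
    end="end-placeholder",
)
print([entry["Type"] for entry in rate_list])  # ['cpu', 'ram']
```

Every entry carries the same overall time range, and entries are ordered by type name.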

@@ -267,7 +295,8 @@ def write_yaml(path: Path, doc: dict[str, Any]) -> None:
def main() -> None:
parser = argparse.ArgumentParser(
description=(
"Summarize Loki JSON log entries to YAML (time, data_log, rate)."
"Summarize Loki JSON log entries to YAML "
"(time, data_summary, by_type)."
),
)
parser.add_argument(
@@ -282,11 +311,20 @@ def main() -> None:
)
parser.add_argument(
"--debug",
action="store_true",
help=(
"Enable debug mode: write <stem>_diff.txt with one "
"[ts,log] JSON per line."
),
)
parser.add_argument(
"--debug_dir",
type=Path,
default=None,
metavar="DIR",
help=(
"If set, write <stem>_diff.txt with one [ts,log] JSON per line."
"Directory for debug output. If not specified, uses the "
"directory from -o output path."
),
)
args = parser.parse_args()
@@ -299,24 +337,19 @@ def main() -> None:
out_path = args.output or (args.json.parent / f"{stem}_total.yml")
pairs = extract_and_sort(args.json)

dbg = str(args.debug).strip() if args.debug is not None else ""
if dbg and dbg != ".":
args.debug.mkdir(parents=True, exist_ok=True)
dbg_file = args.debug / f"{args.json.stem}_diff.txt"
if args.debug:
# Determine debug directory: use --debug_dir if provided,
# otherwise use output directory
debug_dir = args.debug_dir if args.debug_dir else out_path.parent
debug_dir.mkdir(parents=True, exist_ok=True)
dbg_file = debug_dir / f"{args.json.stem}_diff.txt"
with dbg_file.open("w", encoding="utf-8") as f:
for ts, log_str in pairs:
print(json.dumps([ts, log_str], ensure_ascii=False), file=f)

doc = build_summary(pairs)
write_yaml(out_path, doc)

if doc["data_log"]["metrics_per_step"] == "ERROR":
per_ts = Counter(ts for ts, _ in pairs)
exp = next(iter(per_ts.values()), 0)
for ts in sorted(per_ts, key=int):
if per_ts[ts] != exp:
print(ts, per_ts[ts], file=sys.stdout)


if __name__ == "__main__":
main()
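
The debug-path rule from the last hunk (use `--debug_dir` when given, otherwise the directory of the output file) can be checked separately from the file I/O. A minimal sketch, assuming the rule is exactly as shown in the diff; `resolve_debug_file` is a hypothetical helper and the paths are examples only.

```python
from pathlib import Path
from typing import Optional


def resolve_debug_file(
    json_path: Path, out_path: Path, debug_dir: Optional[Path] = None
) -> Path:
    """Return where <stem>_diff.txt would be written, without creating it."""
    target_dir = debug_dir if debug_dir is not None else out_path.parent
    return target_dir / f"{json_path.stem}_diff.txt"


json_path = Path("/data/loki_synth_data.json")
out_path = Path("/artifacts/loki_synth_data_total.yml")

fallback = resolve_debug_file(json_path, out_path)
explicit = resolve_debug_file(json_path, out_path, Path("/artifacts/debug_ck_db"))
print(fallback.name, explicit.parent.name)  # loki_synth_data_diff.txt debug_ck_db
```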