114 changes: 98 additions & 16 deletions roles/telemetry_chargeback/README.md
@@ -1,11 +1,11 @@
telemetry_chargeback
=========

The **`telemetry_chargeback`** role tests the **RHOSO CloudKitty** feature. Tests that are not specific to CloudKitty (e.g., standard OpenStack deployment validation, basic networking) should be added to a common role.

The role performs two main functions:

1. **CloudKitty Validation** - Enables and configures the CloudKitty hashmap rating module, then validates its state.
2. **Synthetic Data Generation** - Generates synthetic Loki log data for testing chargeback scenarios using a Python script and Jinja2 template.
2. **Synthetic Data Generation & Analysis** - Generates synthetic Loki log data for testing chargeback scenarios and calculates metric totals. The role automatically discovers and processes every scenario file matching `test_*.yml` in the `files/` directory. For each scenario it runs five steps: generate synthetic data, compute syn-totals, ingest the data into Loki, flush the Loki ingester memory, and get the cost from the CloudKitty rating summary (using the begin/end values from syn-totals). Retrieval from Loki is part of the `load_loki_data` flow. After all scenarios have run, the role runs cleanup (`cleanup_ck.yml`) to remove the local flush cert directory.

Requirements
------------
@@ -15,48 +15,130 @@ It relies on the following being available on the target or control host:
* The **OpenStack CLI client** must be installed and configured with administrative credentials.
* Required Python libraries for the `openstack` CLI (e.g., `python3-openstackclient`).
* Connectivity to the OpenStack API endpoint.
* **Python 3** with the following libraries for synthetic data generation:
* **Python 3** with the following libraries for synthetic data generation and analysis:
* `PyYAML`
* `Jinja2`

It is expected to be run **after** a successful deployment and configuration of the following components:

* **OpenStack:** A functional OpenStack cloud (RHOSO) environment.
* **Cloudkitty:** The Cloudkitty service must be installed, configured, and running.
* **Loki / OpenShift (for ingest and flush):** When the ingest and flush tasks are used, the control host must have `oc` CLI access, and the Cloudkitty Loki stack (route, certificates, ingester) must be deployed. The role sets the Loki push/query URLs and extracts certificates via `setup_loki_env.yml`.

Role Variables
--------------
The role uses the following variables to control the testing environment and execution.

### User-Configurable Variables (defaults/main.yml)

These variables can be overridden when importing the role or set at the play level. Users can customize these based on their deployment environment and test requirements.

| Variable | Default Value | Description |
|----------|---------------|-------------|
| `openstack_cmd` | `openstack` | The command used to execute OpenStack CLI calls. This can be customized if the binary is not in the standard PATH. |
| `cloudkitty_debug` | `false` | Enable debug mode for the role. |
| `logs_dir_zuul` | `{{ ansible_env.HOME }}/ci-framework-data/logs` | Directory for log files. |
| `artifacts_dir_zuul` | `{{ ansible_env.HOME }}/ci-framework-data/artifacts` | Directory for generated artifacts. |
| `cert_dir` | `{{ ansible_user_dir }}/ck-certs` | Local directory for extracted ingest/query certs. |
| `local_cert_dir` | `{{ ansible_env.HOME }}/ci-framework-data/flush_certs` | Local directory for flush certs (removed by cleanup_ck.yml after the run). |
| `remote_cert_dir` | `osp-certs` | Directory inside the OpenStack pod for certs. |
| `cert_secret_name` | `cert-cloudkitty-client-internal` | OpenShift secret name for client certificates. |
| `client_secret` | `secret/cloudkitty-lokistack-gateway-client-http` | Secret for flush client certs. |
| `ca_configmap` | `cm/cloudkitty-lokistack-ca-bundle` | ConfigMap for CA bundle. |
| `logql_query` | `{service="cloudkitty"}` (overridable via `loki_query`) | LogQL query for Loki. |
| `cloudkitty_namespace` | `openstack` | OpenShift namespace for Cloudkitty/Loki resources. |
| `openstackpod` | `openstackclient` | OpenStack client pod name for exec/cp. |
| `lookback` | `6` | Number of days to look back when building the Loki query time range. |
| `limit` | `50` | Maximum number of results returned by a Loki query. |

**Example: Overriding variables when importing the role**
```yaml
- name: "Run chargeback tests"
  ansible.builtin.import_role:
    name: telemetry_chargeback
  vars:
    cloudkitty_namespace: "my-custom-namespace"
    lookback: 10
    cloudkitty_debug: true
```
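
To make the effect of `lookback` and `limit` concrete, the following is a minimal Python sketch of how a look-back window in days could be turned into the `query`, `start`, `end`, and `limit` parameters of a Loki range query, with timestamps in nanoseconds. The function name `loki_query_params` and the `now` argument are hypothetical, and the role's own tasks may build the range differently (for example from the `nanosec` values in the syn-totals file).

```python
from datetime import datetime, timedelta, timezone

NANOS_PER_SECOND = 1_000_000_000


def loki_query_params(query, lookback_days, limit, now=None):
    """Build range-query parameters covering the last `lookback_days` days.

    Hypothetical helper for illustration; not part of the role.
    """
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=lookback_days)

    def to_ns(moment):
        # Whole seconds since the epoch, expressed in nanoseconds.
        return int(moment.timestamp()) * NANOS_PER_SECOND

    return {
        "query": query,
        "start": str(to_ns(start)),
        "end": str(to_ns(end)),
        "limit": limit,
    }


# Role defaults: lookback=6, limit=50, query {service="cloudkitty"}.
fixed_now = datetime(2024, 1, 7, tzinfo=timezone.utc)
params = loki_query_params('{service="cloudkitty"}', 6, 50, now=fixed_now)
```

Passing a fixed `now` keeps the sketch deterministic; in a real run the current time would be used.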

### Internal Variables (vars/main.yml)
### Synthetic Data Scripts

These variables are used internally by the role and typically do not need to be modified.
These variables are used internally by the role and should not be modified. They use `role_path` for internal file/script references and define internal file naming conventions.

| Variable | Default Value | Description |
|----------|---------------|-------------|
| `logs_dir_zuul` | `/home/zuul/ci-framework-data/logs` | Remote directory for log files. |
| `artifacts_dir_zuul` | `/home/zuul/ci-framework-data/artifacts` | Directory for generated artifacts. |
| `cloudkitty_scenario_dir` | `{{ role_path }}/files` | Directory containing scenario files (`test_*.yml`). |
| `cloudkitty_synth_script` | `{{ role_path }}/files/gen_synth_loki_data.py` | Path to the synthetic data generation script. |
| `cloudkitty_data_template` | `{{ role_path }}/templates/loki_data_templ.j2` | Path to the Jinja2 template for Loki data format. |
| `ck_data_config` | `{{ role_path }}/files/test_static.yml` | Path to the scenario configuration file. |
| `ck_output_file_local` | `{{ artifacts_dir_zuul }}/loki_synth_data.json` | Local path for generated synthetic data. |
| `ck_output_file_remote` | `{{ logs_dir_zuul }}/gen_loki_synth_data.log` | Remote destination for synthetic data. |
| `cloudkitty_summary_script` | `{{ role_path }}/files/gen_db_summary.py` | Path to the summary script (gen_db_summary.py). |

**Note:** Loki push/query URLs are set dynamically in `setup_loki_env.yml` from the Cloudkitty Loki route.

### Scenario Result Dictionary

Instead of using separate file-suffix variables, the role builds a `scenario_result` dictionary for each scenario that carries all metadata through the pipeline:

```yaml
scenario_result:
  file_name: "test_static"              # scenario name
  synth_data_file: "<artifacts_dir>/test_static-synth_data.json"
  synth_totals_file: "<artifacts_dir>/test_static-synth_metrics_summary.yml"
  num_values: 12                        # number of generated log entries
  total_rate: 1.234                     # expected total rating
  synth_summary: { ... }                # full output from gen_db_summary.py
  loki_data_file: "<artifacts_dir>/test_static-loki_data.json"   # added after retrieval
  loki_totals_file: "<artifacts_dir>/test_static-loki_metrics_summary.yml"
  loki_summary: { ... }                 # added after retrieval
  ck_rating_by_type: { ... }            # added after CloudKitty query
  ck_rating_summary: { ... }            # added after CloudKitty query
```

This dictionary is built in `gen_synth_loki_data.yml` and progressively enriched by `retrieve_loki_data.yml` and `loki_rate.yml`. Comparisons in `run_test_scenarios.yml` use the dictionary values directly instead of diffing files.

### Synthetic Data Scripts

**gen_synth_loki_data.py** — Generates Loki-format JSON from a scenario YAML and template. The role invokes it with `-r` so that timestamps in the output are in **reverse** order (youngest first, oldest last). When run manually you can omit `-r` for chronological order (oldest first, youngest last).

| Option | Description |
|--------|--------------|
| `--tmpl` | Path to the Jinja2 template (e.g. `loki_data_templ.j2`). |
| `-t`, `--test` | Path to the scenario YAML (e.g. `test_dyn_basic.yml`). |
| `-o`, `--output` | Path to the output JSON file. |
| `-p`, `--project-id` | Optional; overrides `groupby.project_id` in every log entry. |
| `-u`, `--user-id` | Optional; overrides `groupby.user_id` in every log entry. |
| `-r`, `--reverse` | Reverse timestamp order in JSON output (youngest first, oldest last). |
| `--debug` | Enable debug logging. |
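
The effect of `-r` can be pictured with a short Python sketch. It assumes each log entry is a `[timestamp_ns, log_line]` pair with the timestamp given as a string of nanoseconds, as in Loki's stream `values` arrays. The function `order_entries` is hypothetical and only illustrates the two orderings; it is not the script's actual code.

```python
def order_entries(entries, reverse=False):
    """Sort [timestamp_ns, log_line] pairs by timestamp.

    reverse=False gives chronological order (oldest first);
    reverse=True gives the order produced with -r (newest first).
    """
    return sorted(entries, key=lambda entry: int(entry[0]), reverse=reverse)


entries = [
    ["1700000300000000000", "b"],
    ["1700000000000000000", "a"],
    ["1700000600000000000", "c"],
]

chronological = order_entries(entries)
newest_first = order_entries(entries, reverse=True)
```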

**gen_db_summary.py** — Parses Loki-style JSON (streams or `data.result`), sorts entries by timestamp, and writes a YAML summary. This script is invoked by the role for **both** synthetic totals (in `gen_synth_loki_data.yml`) and Loki-retrieved totals (in `retrieve_loki_data.yml`). It applies rate calculations with support for `factor`, `offset`, and `mutate` transformations.

| Option | Description |
|--------|--------------|
| `-j`, `--json` | Path to the input JSON file (required). |
| `-o`, `--output` | Path to the output YAML file (default: `<input_stem>_total.yml`). |
| `--debug` | Directory to write debug output (`<stem>_diff.txt` with one `[ts,log]` JSON per line). |
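
The two accepted input layouts can be illustrated with a small Python sketch. It assumes the push-style layout keeps its streams under a top-level `streams` key and the query-style layout keeps them under `data.result`, each stream carrying a `values` list of `[timestamp_ns, log_line]` pairs. The function `extract_entries` is hypothetical and is not taken from `gen_db_summary.py`.

```python
import json


def extract_entries(doc):
    """Return (timestamp_ns, log_line) tuples, oldest first, from either layout."""
    if "streams" in doc:
        # Push-style layout: {"streams": [...]}
        streams = doc["streams"]
    else:
        # Query-response layout: {"data": {"result": [...]}}
        streams = doc.get("data", {}).get("result", [])
    pairs = [
        (int(ts), line)
        for stream in streams
        for ts, line in stream.get("values", [])
    ]
    return sorted(pairs, key=lambda pair: pair[0])


push_doc = json.loads(
    '{"streams": [{"stream": {"service": "cloudkitty"},'
    ' "values": [["2", "second"], ["1", "first"]]}]}'
)
query_doc = json.loads(
    '{"data": {"result": [{"stream": {},'
    ' "values": [["5", "later"], ["3", "earlier"]]}]}}'
)

push_entries = extract_entries(push_doc)
query_entries = extract_entries(query_doc)
```

Normalizing both layouts to one list of pairs lets the same summary logic serve the synthetic totals and the Loki-retrieved totals.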

Output YAML structure:

* **time** — `begin_step` / `end_step`, each with `nanosec` (nanosecond timestamp), `begin`, `end` (ISO window strings from the log payload). The `nanosec` values are used for Loki query time range in `retrieve_loki_data.yml`.
* **data_log** — `total_timesteps`, `metrics_per_step`, `log_count`.
* **rate** — `by_types` (per-type `Rate` calculated as `Σ((qty_mutated * factor + offset) * price)`) and `total.Rating` (sum of all rates).
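
As a worked illustration of the rate formula, the Python sketch below sums `(qty_mutated * factor + offset) * price` per type and then totals the per-type rates. The entry layout, the function `summarize_rates`, and the defaults `factor = 1.0` and `offset = 0.0` are assumptions made for this example; the real script may source these values differently.

```python
from collections import defaultdict


def summarize_rates(entries):
    """Aggregate per-type rates and the overall rating.

    Each entry is assumed to be a dict with "type", "qty_mutated", "price",
    and optional "factor" / "offset" keys (hypothetical layout).
    """
    by_types = defaultdict(float)
    for entry in entries:
        factor = entry.get("factor", 1.0)   # assumed default
        offset = entry.get("offset", 0.0)   # assumed default
        by_types[entry["type"]] += (entry["qty_mutated"] * factor + offset) * entry["price"]
    return {
        "by_types": {name: {"Rate": rate} for name, rate in by_types.items()},
        "total": {"Rating": sum(by_types.values())},
    }


sample = [
    {"type": "a", "qty_mutated": 2, "price": 0.5},                               # (2*1 + 0) * 0.5 = 1.0
    {"type": "a", "qty_mutated": 4, "factor": 0.5, "offset": 1.0, "price": 2},   # (4*0.5 + 1) * 2 = 6.0
    {"type": "b", "qty_mutated": 1, "price": 3},                                 # (1*1 + 0) * 3 = 3.0
]
summary = summarize_rates(sample)
```

With these sample entries, type `a` contributes 7.0, type `b` contributes 3.0, and the total rating is 10.0.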

Scenario Configuration
----------------------
The synthetic data generation is controlled by a YAML configuration file (`files/test_static.yml`). This file defines:
The synthetic data generation is controlled by YAML configuration files in the `files/` directory. Any file matching `test_*.yml` will be automatically discovered and processed. Files whose names start with an underscore (e.g. `_test_*.yml`) are **not** discovered by the role; they can be used as reference or for manual runs.

Each scenario file defines:

* **generation** — Time range configuration (days, step_seconds).
* **log_types** — List of log type definitions. Each entry has **type** (identifier and value in output), unit, description, qty, price, groupby, and metadata. The **groupby** dict typically includes dimension keys (e.g. id, user_id, project_id, tenant_id); the generator merges **date_fields** into groupby at run time.
* **required_fields** — Top-level keys required for each log type (e.g. type, unit, qty, price, groupby, metadata).
* **date_fields** — Date field names to merge into groupby (week_of_the_year, day_of_the_year, month, year).
* **loki_stream** — Loki stream configuration (service name).

**groupby.id** should be consistent by metric type across scenario files so that the same type always uses the same id.

* **generation** - Time range configuration (days, step_seconds)
* **log_types** - List of log type definitions with name, type, unit, qty, price, groupby, and metadata
* **required_fields** - Fields required for validation
* **date_fields** - Date fields to add to groupby (week_of_the_year, day_of_the_year, month, year)
* **loki_stream** - Loki stream configuration (service name)
Scenario files matching `test_*.yml` in the `files/` directory are automatically discovered and processed. Files whose names start with an underscore are not auto-discovered.
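
The discovery rule can be sketched in Python as follows. The role itself does this in Ansible, so the function `discover_scenarios` is only a hypothetical illustration of the `test_*.yml` pattern. Because the pattern is anchored at the start of the file name, a name such as `_test_draft.yml` never matches.

```python
from fnmatch import fnmatchcase


def discover_scenarios(file_names):
    """Return the file names that match test_*.yml, sorted alphabetically."""
    return sorted(name for name in file_names if fnmatchcase(name, "test_*.yml"))


names = [
    "test_static.yml",
    "_test_draft.yml",          # leading underscore: not discovered
    "test_dyn_basic.yml",
    "gen_synth_loki_data.py",   # not a scenario file
]
discovered = discover_scenarios(names)
```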

Dependencies
------------
30 changes: 30 additions & 0 deletions roles/telemetry_chargeback/defaults/main.yml
@@ -1,2 +1,32 @@
---
# OpenStack CLI command
openstack_cmd: "openstack"

# Debug mode
cloudkitty_debug: false

# Directory paths
logs_dir_zuul: "{{ ansible_env.HOME }}/ci-framework-data/logs"
artifacts_dir_zuul: "{{ ansible_env.HOME }}/ci-framework-data/artifacts"
cert_dir: "{{ ansible_user_dir }}/ck-certs"
local_cert_dir: "{{ ansible_env.HOME }}/ci-framework-data/flush_certs"
remote_cert_dir: "osp-certs"

# Cloudkitty certificates and secrets
cert_secret_name: "cert-cloudkitty-client-internal"
client_secret: "secret/cloudkitty-lokistack-gateway-client-http"
ca_configmap: "cm/cloudkitty-lokistack-ca-bundle"

# LogQL Query
logql_query: "{{ loki_query | default('{service=\"cloudkitty\"}') }}"

# OpenShift/Kubernetes settings
cloudkitty_namespace: "openstack"
openstackpod: "openstackclient"

# Time window settings
lookback: 6
limit: 50

# List of test scenario files to run
cloudkitty_test_scenarios: []