diff --git a/roles/telemetry_chargeback/README.md b/roles/telemetry_chargeback/README.md index 352b58d2f..c6c567df8 100644 --- a/roles/telemetry_chargeback/README.md +++ b/roles/telemetry_chargeback/README.md @@ -1,11 +1,11 @@ telemetry_chargeback -========= + The **`telemetry_chargeback`** role is designed to test the **RHOSO Cloudkitty** feature. These tests are specific to the Cloudkitty feature. Tests that are not specific to this feature (e.g., standard OpenStack deployment validation, basic networking) should be added to a common role. The role performs two main functions: 1. **CloudKitty Validation** - Enables and configures the CloudKitty hashmap rating module, then validates its state. -2. **Synthetic Data Generation** - Generates synthetic Loki log data for testing chargeback scenarios using a Python script and Jinja2 template. +2. **Synthetic Data Generation & Analysis** - Generates synthetic Loki log data for testing chargeback scenarios and calculates metric totals. The role automatically discovers and processes all scenario files matching `test_*.yml` in the `files/` directory. For each scenario it runs: generate synthetic data, compute syn-totals, ingest to Loki, flush Loki ingester memory, and get cost via CloudKitty rating summary (using begin/end from syn-totals). Retrieve-from-Loki is included in the load_loki_data flow. After all scenarios, the role runs cleanup (`cleanup_ck.yml`) to remove the local flush cert directory. Requirements ------------ @@ -15,7 +15,7 @@ It relies on the following being available on the target or control host: * The **OpenStack CLI client** must be installed and configured with administrative credentials. * Required Python libraries for the `openstack` CLI (e.g., `python3-openstackclient`). * Connectivity to the OpenStack API endpoint. 
-* **Python 3** with the following libraries for synthetic data generation: +* **Python 3** with the following libraries for synthetic data generation and analysis: * `PyYAML` * `Jinja2` @@ -23,6 +23,7 @@ It is expected to be run **after** a successful deployment and configuration of * **OpenStack:** A functional OpenStack cloud (RHOSO) environment. * **Cloudkitty:** The Cloudkitty service must be installed, configured, and running. +* **Loki / OpenShift (for ingest and flush):** When using ingest and flush tasks, the control host must have `oc` CLI access, and the Cloudkitty Loki stack (route, certificates, ingester) must be deployed. The role sets Loki push/query URLs and extracts certificates via `setup_loki_env.yml`. Role Variables -------------- @@ -30,33 +31,114 @@ The role uses the following variables to control the testing environment and exe ### User-Configurable Variables (defaults/main.yml) +These variables can be overridden when importing the role or set at the play level. Users can customize these based on their deployment environment and test requirements. + | Variable | Default Value | Description | |----------|---------------|-------------| | `openstack_cmd` | `openstack` | The command used to execute OpenStack CLI calls. This can be customized if the binary is not in the standard PATH. | +| `cloudkitty_debug` | `false` | Enable debug mode for the role. | +| `logs_dir_zuul` | `{{ ansible_env.HOME }}/ci-framework-data/logs` | Directory for log files. | +| `artifacts_dir_zuul` | `{{ ansible_env.HOME }}/ci-framework-data/artifacts` | Directory for generated artifacts. | +| `cert_dir` | `{{ ansible_user_dir }}/ck-certs` | Local directory for extracted ingest/query certs. | +| `local_cert_dir` | `{{ ansible_env.HOME }}/ci-framework-data/flush_certs` | Local directory for flush certs (removed by cleanup_ck.yml after the run). | +| `remote_cert_dir` | `osp-certs` | Directory inside the OpenStack pod for certs. 
| +| `cert_secret_name` | `cert-cloudkitty-client-internal` | OpenShift secret name for client certificates. | +| `client_secret` | `secret/cloudkitty-lokistack-gateway-client-http` | Secret for flush client certs. | +| `ca_configmap` | `cm/cloudkitty-lokistack-ca-bundle` | ConfigMap for CA bundle. | +| `logql_query` | `{service="cloudkitty"}` (overridable via `loki_query`) | LogQL query for Loki. | +| `cloudkitty_namespace` | `openstack` | OpenShift namespace for Cloudkitty/Loki resources. | +| `openstackpod` | `openstackclient` | OpenStack client pod name for exec/cp. | +| `lookback` | `6` | Days lookback for Loki query time range. | +| `limit` | `50` | Limit for Loki query results. | + +**Example: Overriding variables when importing the role** +```yaml +- name: "Run chargeback tests" + ansible.builtin.import_role: + name: telemetry_chargeback + vars: + cloudkitty_namespace: "my-custom-namespace" + lookback: 10 + cloudkitty_debug: true +``` ### Internal Variables (vars/main.yml) -These variables are used internally by the role and typically do not need to be modified. +These variables are used internally by the role and should not be modified. They use `role_path` for internal file/script references and define internal file naming conventions. | Variable | Default Value | Description | |----------|---------------|-------------| -| `logs_dir_zuul` | `/home/zuul/ci-framework-data/logs` | Remote directory for log files. | -| `artifacts_dir_zuul` | `/home/zuul/ci-framework-data/artifacts` | Directory for generated artifacts. | +| `cloudkitty_scenario_dir` | `{{ role_path }}/files` | Directory containing scenario files (`test_*.yml`). | | `cloudkitty_synth_script` | `{{ role_path }}/files/gen_synth_loki_data.py` | Path to the synthetic data generation script. | | `cloudkitty_data_template` | `{{ role_path }}/templates/loki_data_templ.j2` | Path to the Jinja2 template for Loki data format. 
| -| `ck_data_config` | `{{ role_path }}/files/test_static.yml` | Path to the scenario configuration file. | -| `ck_output_file_local` | `{{ artifacts_dir_zuul }}/loki_synth_data.json` | Local path for generated synthetic data. | -| `ck_output_file_remote` | `{{ logs_dir_zuul }}/gen_loki_synth_data.log` | Remote destination for synthetic data. | +| `cloudkitty_summary_script` | `{{ role_path }}/files/gen_db_summary.py` | Path to the summary script (gen_db_summary.py). | + +**Note:** Loki push/query URLs are set dynamically in `setup_loki_env.yml` from the Cloudkitty Loki route. + +### Scenario Result Dictionary + +Instead of using separate file-suffix variables, the role builds a `scenario_result` dictionary for each scenario that carries all metadata through the pipeline: + +```yaml +scenario_result: + file_name: "test_static" # scenario name + synth_data_file: "/test_static-synth_data.json" + synth_totals_file: "/test_static-synth_metrics_summary.yml" + num_values: 12 # number of generated log entries + total_rate: 1.234 # expected total rating + synth_summary: { ... } # full output from gen_db_summary.py + loki_data_file: "/test_static-loki_data.json" # added after retrieval + loki_totals_file: "/test_static-loki_metrics_summary.yml" + loki_summary: { ... } # added after retrieval + ck_rating_by_type: { ... } # added after CloudKitty query + ck_rating_summary: { ... } # added after CloudKitty query +``` + +This dictionary is built in `gen_synth_loki_data.yml` and progressively enriched by `retrieve_loki_data.yml` and `loki_rate.yml`. Comparisons in `run_test_scenarios.yml` use the dictionary values directly instead of diffing files. + +### Synthetic Data Scripts + +**gen_synth_loki_data.py** — Generates Loki-format JSON from a scenario YAML and template. The role invokes it with `-r` so that timestamps in the output are in **reverse** order (youngest first, oldest last). When run manually you can omit `-r` for chronological order (oldest first, youngest last). 
+ +| Option | Description | +|--------|--------------| +| `--tmpl` | Path to the Jinja2 template (e.g. `loki_data_templ.j2`). | +| `-t`, `--test` | Path to the scenario YAML (e.g. `test_dyn_basic.yml`). | +| `-o`, `--output` | Path to the output JSON file. | +| `-p`, `--project-id` | Optional; overrides `groupby.project_id` in every log entry. | +| `-u`, `--user-id` | Optional; overrides `groupby.user_id` in every log entry. | +| `-r`, `--reverse` | Reverse timestamp order in JSON output (youngest first, oldest last). | +| `--debug` | Enable debug logging. | + +**gen_db_summary.py** — Parses Loki-style JSON (streams or `data.result`), sorts entries by timestamp, and writes a YAML summary. This script is invoked by the role for **both** synthetic totals (in `gen_synth_loki_data.yml`) and Loki-retrieved totals (in `retrieve_loki_data.yml`). It applies rate calculations with support for `factor`, `offset`, and `mutate` transformations. + +| Option | Description | +|--------|--------------| +| `-j`, `--json` | Path to the input JSON file (required). | +| `-o`, `--output` | Path to the output YAML file (default: `_total.yml`). | +| `--debug` | Directory to write debug output (`_diff.txt` with one `[ts,log]` JSON per line). | + +Output YAML structure: + +* **time** — `begin_step` / `end_step`, each with `nanosec` (nanosecond timestamp), `begin`, `end` (ISO window strings from the log payload). The `nanosec` values are used for Loki query time range in `retrieve_loki_data.yml`. +* **data_log** — `total_timesteps`, `metrics_per_step`, `log_count`. +* **rate** — `by_types` (per-type `Rate` calculated as `Σ((qty_mutated * factor + offset) * price)`) and `total.Rating` (sum of all rates). Scenario Configuration ---------------------- -The synthetic data generation is controlled by a YAML configuration file (`files/test_static.yml`). This file defines: +The synthetic data generation is controlled by YAML configuration files in the `files/` directory. 
Any file matching `test_*.yml` will be automatically discovered and processed. Files whose names start with an underscore (e.g. `_test_*.yml`) are **not** discovered by the role; they can be used as reference or for manual runs. + +Each scenario file defines: + +* **generation** — Time range configuration (days, step_seconds). +* **log_types** — List of log type definitions. Each entry has **type** (identifier and value in output), unit, description, qty, price, groupby, and metadata. The **groupby** dict typically includes dimension keys (e.g. id, user_id, project_id, tenant_id); the generator merges **date_fields** into groupby at run time. +* **required_fields** — Top-level keys required for each log type (e.g. type, unit, qty, price, groupby, metadata). +* **date_fields** — Date field names to merge into groupby (week_of_the_year, day_of_the_year, month, year). +* **loki_stream** — Loki stream configuration (service name). + +**groupby.id** should be consistent by metric type across scenario files so that the same type always uses the same id. -* **generation** - Time range configuration (days, step_seconds) -* **log_types** - List of log type definitions with name, type, unit, qty, price, groupby, and metadata -* **required_fields** - Fields required for validation -* **date_fields** - Date fields to add to groupby (week_of_the_year, day_of_the_year, month, year) -* **loki_stream** - Loki stream configuration (service name) +Scenario files matching `test_*.yml` in the `files/` directory are automatically discovered and processed. Files whose names start with an underscore are not auto-discovered. 
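
The `date_fields` merge described above can be sketched as follows, for anyone who wants to reproduce one log entry's `groupby` by hand. This is a minimal illustration, not the role's code: it reuses the calendar calls that appear in `gen_synth_loki_data.py` (`isocalendar()` and `tm_yday`), but the function name `merge_date_fields`, the sample `groupby` values, and the use of `dt.year` and `dt.month` for the `year` and `month` keys are assumptions.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def merge_date_fields(groupby: Dict[str, Any], nanoseconds: int) -> Dict[str, Any]:
    """Return a copy of groupby extended with date fields for one time step.

    Hypothetical helper: the key names come from the README's date_fields
    list; how the real generator fills them may differ in detail.
    """
    # Loki timestamps are nanoseconds since the epoch; convert to UTC datetime.
    dt = datetime.fromtimestamp(nanoseconds / 1_000_000_000, tz=timezone.utc)
    _iso_year, iso_week, _iso_weekday = dt.isocalendar()

    merged = dict(groupby)  # copy so the scenario definition is not mutated
    merged.update(
        {
            "week_of_the_year": iso_week,
            "day_of_the_year": dt.timetuple().tm_yday,
            "month": dt.month,
            "year": dt.year,  # assumption: calendar year, not ISO year
        }
    )
    return merged


# Sample values are illustrative only.
scenario_groupby = {"id": "example-id", "project_id": "example-project"}
entry_groupby = merge_date_fields(scenario_groupby, 1_700_000_000 * 1_000_000_000)
print(entry_groupby)
```

Copying `groupby` before updating it mirrors the `.copy()` calls in the generator, so the same scenario definition can be reused for every time step without date fields leaking from one step into the next.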
Dependencies ------------ diff --git a/roles/telemetry_chargeback/defaults/main.yml b/roles/telemetry_chargeback/defaults/main.yml index 64f07b7a1..1e64535ce 100644 --- a/roles/telemetry_chargeback/defaults/main.yml +++ b/roles/telemetry_chargeback/defaults/main.yml @@ -1,2 +1,32 @@ --- +# OpenStack CLI command openstack_cmd: "openstack" + +# Debug mode +cloudkitty_debug: false + +# Directory paths +logs_dir_zuul: "{{ ansible_env.HOME }}/ci-framework-data/logs" +artifacts_dir_zuul: "{{ ansible_env.HOME }}/ci-framework-data/artifacts" +cert_dir: "{{ ansible_user_dir }}/ck-certs" +local_cert_dir: "{{ ansible_env.HOME }}/ci-framework-data/flush_certs" +remote_cert_dir: "osp-certs" + +# Cloudkitty certificates and secrets +cert_secret_name: "cert-cloudkitty-client-internal" +client_secret: "secret/cloudkitty-lokistack-gateway-client-http" +ca_configmap: "cm/cloudkitty-lokistack-ca-bundle" + +# LogQL Query +logql_query: "{{ loki_query | default('{service=\"cloudkitty\"}') }}" + +# OpenShift/Kubernetes settings +cloudkitty_namespace: "openstack" +openstackpod: "openstackclient" + +# Time window settings +lookback: 6 +limit: 50 + +# List of test scenario files to run +cloudkitty_test_scenarios: [] diff --git a/roles/telemetry_chargeback/files/gen_synth_loki_data.py b/roles/telemetry_chargeback/files/gen_synth_loki_data.py index f05796e29..5918b7615 100755 --- a/roles/telemetry_chargeback/files/gen_synth_loki_data.py +++ b/roles/telemetry_chargeback/files/gen_synth_loki_data.py @@ -1,14 +1,50 @@ +# Python script to generate synthetic data """Generate synthetic Loki log data from a Jinja2 template.""" import logging import argparse import json +import sys import yaml from datetime import datetime, timezone, timedelta from pathlib import Path -from typing import Dict, Any +from typing import Dict, Any, List, Union from jinja2 import Environment +def _get_value_for_step( + values: List[Union[int, float]], + step_idx: int, + num_steps: int +) -> Union[int, float]: + """ + Get 
the appropriate value from a list based on the current step index. + + Values are distributed evenly across all steps. For example, if there are + 12 steps and 4 values, each value covers 3 steps: + - Steps 0-2: values[0] + - Steps 3-5: values[1] + - Steps 6-8: values[2] + - Steps 9-11: values[3] + + Args: + values: List of values to choose from. + step_idx: Current step index (0-based). + num_steps: Total number of steps. + + Returns: + The value corresponding to the current step. + """ + num_values = len(values) + if num_values == 1: + return values[0] + + # Calculate how many steps each value covers + steps_per_value = num_steps / num_values + # Determine which value index to use, clamping to valid range + value_idx = min(int(step_idx // steps_per_value), num_values - 1) + return values[value_idx] + + # --- Configure logging with a default level that can be changed --- logging.basicConfig( level=logging.INFO, @@ -73,7 +109,10 @@ def generate_loki_data( start_time: datetime, end_time: datetime, time_step_seconds: int, - config: Dict[str, Any] + config: Dict[str, Any], + project: Union[str, int, None] = None, + user: Union[str, int, None] = None, + reverse_timestamps: bool = False, ): """ Generate synthetic Loki log data by preparing a data list and rendering. @@ -85,6 +124,12 @@ def generate_loki_data( end_time (datetime): The end time for data generation. time_step_seconds (int): The duration of each log entry in seconds. config (Dict[str, Any]): Configuration dictionary loaded from file. + project: Optional value to inject as groupby.project in every + log entry in the output (overrides test_* file value when set). + user: Optional value to inject as groupby.user in every + log entry in the output (overrides test_* file value when set). + reverse_timestamps (bool): If True, reverse the order of timestamps + in the JSON output (newest first, oldest last). 
""" # Hardcoded constant for invalid timestamps invalid_timestamp = "INVALID_TIMESTAMP" @@ -175,39 +220,54 @@ def generate_loki_data( logger.error(f"Invalid log type configuration: {log_type_config}") raise ValueError("Each log type in log_types must be a dictionary") - log_type_name = log_type_config.get("name") - if not log_type_name: - logger.error("Each log type must have a 'name' field") - raise ValueError("Each log type must have a 'name' field") + # "type" is log-type identifier (dict key) and output value + type_key = log_type_config.get("type") + if not type_key: + logger.error("Each log type must have a 'type' field") + raise ValueError("Each log type must have a 'type' field") # Validate required fields - missing = [f for f in required_fields if f not in log_type_config] + # metadata is optional for generation; name is not a log-type field + required_for_item = [ + f for f in required_fields + if f not in ("name", "metadata") + ] + missing = [f for f in required_for_item if f not in log_type_config] if missing: logger.error( - f"Missing required fields in {log_type_name} config: {missing}" + f"Missing required fields in {type_key!r} config: {missing}" ) raise ValueError( - f"Missing required fields in {log_type_name}: {missing}" + f"Missing required fields in {type_key!r}: {missing}" ) # Build groupby from config groupby = log_type_config.get("groupby", {}) if not isinstance(groupby, dict): logger.error( - f"groupby must be a dictionary for {log_type_name}" + f"groupby must be a dictionary for {type_key!r}" ) raise ValueError( - f"groupby must be a dictionary for {log_type_name}" + f"groupby must be a dictionary for {type_key!r}" ) - log_types[log_type_name] = { - "type": log_type_config["type"], + # Ensure qty and price are lists for step-based distribution + qty_val = log_type_config["qty"] + price_val = log_type_config["price"] + qty_list = qty_val if isinstance(qty_val, list) else [qty_val] + price_list = price_val if isinstance(price_val, list) else 
[price_val] + + log_types[type_key] = { + "type": type_key, "unit": log_type_config["unit"], "description": log_type_config.get("description"), - "qty": log_type_config["qty"], - "price": log_type_config["price"], + "qty": qty_list, + "price": price_list, "groupby": groupby.copy(), - "metadata": log_type_config.get("metadata", {}) + "metadata": log_type_config.get("metadata", {}), + "factor": log_type_config.get("factor", 1), + "offset": log_type_config.get("offset", 0), + "mutate": log_type_config.get("mutate", "NONE") } # --- Step 3: Load template and render --- @@ -231,15 +291,21 @@ def tojson_preserve_order(obj): # --- Render the template in one pass with all the data --- logger.info("Rendering final output...") + if reverse_timestamps: + log_data_list.reverse() + logger.debug( + "Reversed timestamp order (newest first, oldest last)." + ) + + # Calculate total number of steps for value distribution + num_steps = len(log_data_list) + logger.debug(f"Total number of time steps: {num_steps}") + # Pre-calculate log types with date fields for each time step log_types_list = [] for idx, item in enumerate(log_data_list): - # For the last entry, use end_time to ensure it shows today's date - if idx == len(log_data_list) - 1: - dt = end_time - else: - epoch_seconds = item["nanoseconds"] / 1_000_000_000 - dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc) + epoch_seconds = item["nanoseconds"] / 1_000_000_000 + dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc) iso_year, iso_week, _ = dt.isocalendar() day_of_year = dt.timetuple().tm_yday @@ -267,6 +333,17 @@ def tojson_preserve_order(obj): log_type_with_dates = log_type_data.copy() log_type_with_dates["groupby"] = log_type_data["groupby"].copy() log_type_with_dates["groupby"].update(date_fields) + if project is not None: + log_type_with_dates["groupby"]["project"] = project + if user is not None: + log_type_with_dates["groupby"]["user"] = user + # Select qty and price based on step index distribution + 
log_type_with_dates["qty"] = _get_value_for_step( + log_type_data["qty"], idx, num_steps + ) + log_type_with_dates["price"] = _get_value_for_step( + log_type_data["price"], idx, num_steps + ) log_types_with_dates[log_type_name] = log_type_with_dates log_types_list.append(log_types_with_dates) @@ -296,8 +373,19 @@ def tojson_preserve_order(obj): ) except IOError as e: logger.error(f"Failed to write to output file '{output_path}': {e}") - except Exception as e: - logger.error(f"An unexpected error occurred during file write: {e}") + raise + + # --- Step 5: Validate that the output is valid JSON --- + try: + with output_path.open('r') as f_in: + json.load(f_in) + logger.info("Output file validated as valid JSON.") + except json.JSONDecodeError as e: + logger.error( + f"Output file is not valid JSON: {e}. " + f"Delete '{output_path}' and fix the template or data." + ) + sys.exit(1) def main(): @@ -324,8 +412,38 @@ def main(): required=True, help="Path to the output file." ) + parser.add_argument( + "-p", "--project-id", + type=str, + default=None, + metavar="ID", + help="Optional alphanumeric value to use as groupby.project in " + "every log entry in the output (overrides value from test file)." + ) + parser.add_argument( + "-u", "--user-id", + type=str, + default=None, + metavar="ID", + help="Optional alphanumeric value to use as groupby.user in " + "every log entry in the output (overrides value from test file)." + ) # --- Optional Utility Arguments --- + parser.add_argument( + "-s", "--scenario", + type=str, + default=None, + metavar="NAME", + help="Scenario name to add as a label in the Loki stream, " + "allowing per-scenario filtering on retrieval." + ) + parser.add_argument( + "-r", "--reverse", + action="store_true", + help="Reverse timestamp order in JSON output: newest first, " + "oldest last (default is oldest first, newest last)." 
+ ) parser.add_argument( "--debug", action="store_true", @@ -356,13 +474,21 @@ def main(): # Run the generator try: + if args.scenario: + loki_stream = config.get("loki_stream", {}) + loki_stream["scenario"] = args.scenario + config["loki_stream"] = loki_stream + generate_loki_data( template_path=Path(args.tmpl), output_path=Path(args.output), start_time=start_time_utc, end_time=end_time_utc, time_step_seconds=step_seconds, - config=config + config=config, + project=args.project_id, + user=args.user_id, + reverse_timestamps=args.reverse, ) except FileNotFoundError: logger.error( diff --git a/roles/telemetry_chargeback/tasks/chargeback_tests.yml b/roles/telemetry_chargeback/tasks/chargeback_tests.yml index 8519d7891..99ddcc44e 100644 --- a/roles/telemetry_chargeback/tasks/chargeback_tests.yml +++ b/roles/telemetry_chargeback/tasks/chargeback_tests.yml @@ -8,7 +8,9 @@ - name: "Find the current value of hashmap" ansible.builtin.shell: - cmd: "{{ openstack_cmd }} rating module get hashmap -c Priority -f csv | tail -n +2" + cmd: "set -o pipefail && {{ openstack_cmd }} rating module get hashmap -c Priority -f csv | tail -n +2" + args: + executable: /bin/bash register: get_hashmap_priority changed_when: false diff --git a/roles/telemetry_chargeback/tasks/cleanup_ck.yml b/roles/telemetry_chargeback/tasks/cleanup_ck.yml new file mode 100644 index 000000000..01407d155 --- /dev/null +++ b/roles/telemetry_chargeback/tasks/cleanup_ck.yml @@ -0,0 +1,5 @@ +--- +- name: "Cleanup local certificates" + ansible.builtin.file: + path: "{{ local_cert_dir }}" + state: absent diff --git a/roles/telemetry_chargeback/tasks/flush_loki_data.yml b/roles/telemetry_chargeback/tasks/flush_loki_data.yml new file mode 100644 index 000000000..6ec05419d --- /dev/null +++ b/roles/telemetry_chargeback/tasks/flush_loki_data.yml @@ -0,0 +1,52 @@ +--- +# Flush Loki Ingester Memory to Storage + +- name: "Flush execution inside OpenStack CLI" + block: + # create dir + - name: "Create directory inside 
OpenStack CLI" + ansible.builtin.command: + cmd: "oc exec -n {{ cloudkitty_namespace }} {{ openstackpod }} -- mkdir -p {{ remote_cert_dir }}" + changed_when: false + + # certs to Flush data to Loki + - name: "Create directory to extract certificates" + ansible.builtin.file: + path: "{{ local_cert_dir }}" + state: directory + mode: '0755' + + # copy all certs + - name: "Copy certificates to OpenStack CLI" + ansible.builtin.command: + cmd: "oc cp {{ local_cert_dir }}/. {{ cloudkitty_namespace }}/{{ openstackpod }}:{{ remote_cert_dir }}/" + changed_when: true + + # flush loki + - name: "Trigger Loki ingester flush" + ansible.builtin.command: + cmd: > + oc exec -n {{ cloudkitty_namespace }} {{ openstackpod }} -- + curl -v -X POST {{ ingester_flush_url }} + --cert {{ remote_cert_dir }}/tls.crt + --key {{ remote_cert_dir }}/tls.key + --cacert {{ remote_cert_dir }}/service-ca.crt + register: flush_response + changed_when: true + failed_when: flush_response.rc != 0 + + # Status + - name: "Verify flush status" + ansible.builtin.assert: + that: + - "'204' in flush_response.stderr or '200' in flush_response.stderr" + fail_msg: "Flush failed" + success_msg: "Ingester Memory Flushed successfully" + + rescue: + - name: "Debug failure output" + ansible.builtin.debug: + msg: + - "Failure" + - "Stdout: {{ flush_response.stdout | default('') }}" + - "Stderr: {{ flush_response.stderr | default('') }}" diff --git a/roles/telemetry_chargeback/tasks/gen_synth_loki_data.yml b/roles/telemetry_chargeback/tasks/gen_synth_loki_data.yml index 0b8d5880d..b4ebfd17d 100644 --- a/roles/telemetry_chargeback/tasks/gen_synth_loki_data.yml +++ b/roles/telemetry_chargeback/tasks/gen_synth_loki_data.yml @@ -1,39 +1,57 @@ --- -- name: Check for preexisting output file +- name: "Ensure artifacts directory exists for {{ item }}" + ansible.builtin.file: + path: "{{ artifacts_dir_zuul }}" + state: directory + mode: '0755' + +- name: "Set file paths for {{ item }}" + ansible.builtin.set_fact: + 
_synth_data_file: "{{ artifacts_dir_zuul }}/{{ item }}-synth_data.json" + _synth_totals_file: "{{ artifacts_dir_zuul }}/{{ item }}-synth_metrics_summary.yml" + _test_file: "{{ cloudkitty_scenario_dir }}/{{ item }}.yml" + +- name: "Check for preexisting output file" ansible.builtin.stat: - path: "{{ ck_output_file_local }}" + path: "{{ _synth_data_file }}" register: file_preexists -- name: TEST Generate Synthetic Data +- name: "Generate Synthetic Data for {{ item }}" ansible.builtin.command: cmd: > python3 "{{ cloudkitty_synth_script }}" + -r --tmpl "{{ cloudkitty_data_template }}" - -t "{{ ck_data_config }}" - -o "{{ ck_output_file_local }}" + -t "{{ _test_file }}" + -o "{{ _synth_data_file }}" + -s "{{ item }}" + {% if cloudkitty_project_id is defined and cloudkitty_project_id %} -p "{{ cloudkitty_project_id }}"{% endif %} register: script_output - when: not file_preexists.stat.exists | bool + when: not file_preexists.stat.exists | bool changed_when: script_output.rc == 0 -- name: Read the content of the file - ansible.builtin.slurp: - src: "{{ ck_output_file_local }}" - register: slurped_file - -- name: TEST Validate JSON format of synthetic data file - ansible.builtin.assert: - that: - # This filter will trigger a task failure if the string isn't valid JSON - - slurped_file.content | b64decode | from_json is defined - fail_msg: "The file does not contain valid JSON format." - success_msg: "JSON format validated successfully." 
+- name: "Generate chargeback rating from synthetic data file {{ item }}" + ansible.builtin.command: + cmd: > + python3 "{{ cloudkitty_summary_script }}" + -j "{{ _synth_data_file }}" + -o "{{ _synth_totals_file }}" + --debug "{{ cloudkitty_debug_dir }}" + register: synth_rating_info + when: not file_preexists.stat.exists | bool + changed_when: synth_rating_info.rc == 0 -- name: Print output_file_remote path - ansible.builtin.debug: - msg: "Synthetic data file: {{ ck_output_file_remote }}" +- name: "Load synthetic metrics summary" + ansible.builtin.include_vars: + file: "{{ _synth_totals_file }}" + name: _synth_summary -- name: Copy output file to remote host - ansible.builtin.copy: - src: "{{ ck_output_file_local }}" - dest: "{{ ck_output_file_remote }}" - mode: '0644' +- name: "Build scenario_result dict for {{ item }}" + ansible.builtin.set_fact: + scenario_result: + file_name: "{{ item }}" + synth_data_file: "{{ _synth_data_file }}" + synth_totals_file: "{{ _synth_totals_file }}" + num_values: "{{ _synth_summary.data_log.log_count }}" + total_rate: "{{ _synth_summary.rate.total.Rating }}" + synth_summary: "{{ _synth_summary }}" diff --git a/roles/telemetry_chargeback/tasks/ingest_loki_data.yml b/roles/telemetry_chargeback/tasks/ingest_loki_data.yml new file mode 100644 index 000000000..711a01b90 --- /dev/null +++ b/roles/telemetry_chargeback/tasks/ingest_loki_data.yml @@ -0,0 +1,42 @@ +--- +# Ingest data log to Loki that is generated from gen_synth_loki_data.yml + +- name: "Ingest data log to Loki via API" + block: + + - name: "Read log file content" + ansible.builtin.slurp: + src: "{{ scenario_result.synth_data_file }}" + register: log_file_content + + - name: "Push data to Loki" + ansible.builtin.uri: + url: "{{ loki_push_url }}" + method: POST + body: "{{ log_file_content['content'] | b64decode | from_json }}" + body_format: json + client_cert: "{{ cert_dir }}/tls.crt" + client_key: "{{ cert_dir }}/tls.key" + validate_certs: false + status_code: 204 + 
return_content: true + register: loki_response + ignore_errors: false + failed_when: loki_response.status != 204 + + # Success + - name: "Confirm ingestion success" + ansible.builtin.debug: + msg: "Ingestion Successful!" + + rescue: + # Rescue block + - name: "Debug failure" + ansible.builtin.debug: + msg: "{{ loki_response.status | default('N/A') }}" + + # Failure + - name: "Report ingestion failure" + ansible.builtin.fail: + msg: "Ingestion Failed" + ignore_errors: false diff --git a/roles/telemetry_chargeback/tasks/load_loki_data.yml b/roles/telemetry_chargeback/tasks/load_loki_data.yml new file mode 100644 index 000000000..a2a1e129f --- /dev/null +++ b/roles/telemetry_chargeback/tasks/load_loki_data.yml @@ -0,0 +1,12 @@ +--- +- name: "Ingest CloudKitty data log for {{ item }}" + ansible.builtin.include_tasks: + file: ingest_loki_data.yml + +- name: "Flush data to Loki storage for {{ item }}" + ansible.builtin.include_tasks: + file: flush_loki_data.yml + +- name: "Retrieve data log from Loki for {{ item }}" + ansible.builtin.include_tasks: + file: retrieve_loki_data.yml diff --git a/roles/telemetry_chargeback/tasks/loki_rate.yml b/roles/telemetry_chargeback/tasks/loki_rate.yml new file mode 100644 index 000000000..d5db3c2a4 --- /dev/null +++ b/roles/telemetry_chargeback/tasks/loki_rate.yml @@ -0,0 +1,39 @@ +--- +- name: "TEST Get Rate and Qty by type from CloudKitty {{ item }}" + ansible.builtin.command: + cmd: "{{ openstack_cmd }} --rating-api-version 2 rating summary get -f yaml -g type" + register: cost_totals_by_type + changed_when: false + failed_when: cost_totals_by_type.rc != 0 + +- name: "Print the rating by type {{ item }}" + ansible.builtin.debug: + var: cost_totals_by_type.stdout + +- name: "Save rating output for CI logs {{ item }}" + ansible.builtin.copy: + content: "{{ cost_totals_by_type.stdout }}" + dest: "{{ artifacts_dir_zuul }}/{{ scenario_result.file_name }}-rating.yml" + mode: '0644' + +- name: "TEST Get Rate and Qty Summary from CloudKitty 
{{ item }}" + ansible.builtin.command: + cmd: >- + {{ openstack_cmd }} --rating-api-version 2 rating summary get -f yaml + -b "{{ scenario_result.synth_summary.time.begin_step.begin }}" + -e "{{ scenario_result.synth_summary.time.end_step.end }}" + register: cost_totals_summary + changed_when: false + failed_when: cost_totals_summary.rc != 0 + +- name: "Print the rating summary {{ item }}" + ansible.builtin.debug: + var: cost_totals_summary.stdout + +- name: "Update scenario_result with CloudKitty rating for {{ item }}" + ansible.builtin.set_fact: + scenario_result: >- + {{ scenario_result | combine({ + 'ck_rating_by_type': cost_totals_by_type.stdout | from_yaml, + 'ck_rating_summary': cost_totals_summary.stdout | from_yaml + }) }} diff --git a/roles/telemetry_chargeback/tasks/main.yml b/roles/telemetry_chargeback/tasks/main.yml index 98a94b233..c8963bb4b 100644 --- a/roles/telemetry_chargeback/tasks/main.yml +++ b/roles/telemetry_chargeback/tasks/main.yml @@ -1,6 +1,65 @@ --- -- name: "Validate Chargeback Feature" +- name: "Validate Chargeback Feature deployed correctly" ansible.builtin.include_tasks: "chargeback_tests.yml" -- name: "Generate Synthetic Data" - ansible.builtin.include_tasks: "gen_synth_loki_data.yml" +- name: "Setup Loki Environment" + ansible.builtin.include_tasks: "setup_loki_env.yml" + +- name: "CloudKitty debug ON/OFF" + ansible.builtin.set_fact: + cloudkitty_debug_dir: "{{ (cloudkitty_debug | bool) | ternary(artifacts_dir_zuul + '/debug_ck_db', '') }}" + +- name: "Get admin project ID for CI" + ansible.builtin.command: + cmd: "{{ openstack_cmd }} project show admin -f value -c id" + register: get_admin_project_id + changed_when: false + failed_when: false + +- name: "Set admin project ID for CI" + ansible.builtin.set_fact: + cloudkitty_project_id: "{{ (get_admin_project_id.stdout | trim) | default('') }}" + +- name: "Get admin user ID for CI" + ansible.builtin.command: + cmd: "{{ openstack_cmd }} user show admin -f value -c id" + register: 
get_admin_user_id + changed_when: false + failed_when: false + +- name: "Set admin user ID for CI" + ansible.builtin.set_fact: + cloudkitty_user_id: "{{ (get_admin_user_id.stdout | trim) | default('') }}" + +- name: "Auto-discover test scenarios" + when: cloudkitty_test_scenarios | length == 0 + block: + - name: "Find test files" + ansible.builtin.find: + paths: "{{ cloudkitty_scenario_dir }}" + patterns: "test_*.yml" + register: found_files_raw + + - name: "Set scenario list from discovered files" + ansible.builtin.set_fact: + found_files: "{{ found_files_raw.files | map(attribute='path') | map('basename') | map('regex_replace', '\\.yml$', '') | list }}" + +- name: "Set scenario list from user-provided variable" + ansible.builtin.set_fact: + found_files: "{{ cloudkitty_test_scenarios }}" + when: cloudkitty_test_scenarios | length > 0 + +- name: "Run scenario file through workflow" + when: found_files | length > 0 + block: + - name: "Process and Loop if files exist" + ansible.builtin.include_tasks: run_test_scenarios.yml + loop: "{{ found_files }}" + + - name: "Cleanup after job run" + ansible.builtin.include_tasks: cleanup_ck.yml + + rescue: + - name: "Log failure" + ansible.builtin.debug: + msg: "Running test scenarios loop failed." 
diff --git a/roles/telemetry_chargeback/tasks/retrieve_loki_data.yml b/roles/telemetry_chargeback/tasks/retrieve_loki_data.yml new file mode 100644 index 000000000..1b90bc013 --- /dev/null +++ b/roles/telemetry_chargeback/tasks/retrieve_loki_data.yml @@ -0,0 +1,90 @@ +--- +- name: "Expected Count {{ item }}" + ansible.builtin.debug: + msg: "Input file has {{ scenario_result.num_values }} data entries that Loki has to return" + +- name: "Set Loki output paths and query for {{ item }}" + ansible.builtin.set_fact: + _loki_data_file: "{{ artifacts_dir_zuul }}/{{ scenario_result.file_name }}-loki_data.json" + _loki_totals_file: "{{ artifacts_dir_zuul }}/{{ scenario_result.file_name }}-loki_metrics_summary.yml" + _loki_scenario_query: '{service="cloudkitty", scenario="{{ scenario_result.file_name }}"}' + +# Query Loki +- name: "Retrieve Logs from Loki via API {{ item }}" + block: + - name: "Query Loki API" + ansible.builtin.uri: + url: "{{ loki_query_url }}?query={{ _loki_scenario_query | urlencode }}&start={{ scenario_result.synth_summary.time.begin_step.nanosec }}&limit={{ limit }}" + method: GET + client_cert: "{{ cert_dir }}/tls.crt" + client_key: "{{ cert_dir }}/tls.key" + ca_path: "{{ cert_dir }}/ca.crt" + validate_certs: false + return_content: true + body_format: json + register: loki_response + until: + - loki_response.status == 200 + - loki_response.json.status == 'success' + - loki_response.json.data.result | length > 0 + - (loki_response.json.data.result | map(attribute='values') | map('length') | sum) >= (scenario_result.num_values | int) + retries: 25 + delay: 60 + + - name: "Save Loki Data to JSON file" + ansible.builtin.copy: + content: "{{ loki_response.json | to_json }}" + dest: "{{ _loki_data_file }}" + mode: '0644' + + - name: "Verify Data Integrity {{ item }}" + vars: + actual_count: "{{ loki_response.json.data.result | map(attribute='values') | map('length') | sum }}" + ansible.builtin.assert: + that: + - loki_response.json.status == 'success' + - 
loki_response.json.data.result | length > 0 + - actual_count | int == (scenario_result.num_values | int) + fail_msg: >- + Query did not return all data entries. Expected + {{ scenario_result.num_values }} log entries, but Loki + only returned {{ actual_count }} + success_msg: >- + Query returned all data entries. Input file had + {{ scenario_result.num_values }} entries and Loki returned {{ actual_count }} + + rescue: + - name: "Debug failure" + ansible.builtin.debug: + msg: + - "Status: {{ loki_response.status | default('Unknown') }}" + - "Body: {{ loki_response.content | default('No Content') }}" + - "Msg: {{ loki_response.msg | default('Request failed') }}" + + - name: "Report Retrieval Failure" + ansible.builtin.fail: + msg: "Retrieval Failed" + +- name: "Generate chargeback stats from Loki-retrieved data file: {{ item }}" + ansible.builtin.command: + cmd: > + python3 "{{ cloudkitty_summary_script }}" + -j "{{ _loki_data_file }}" + -o "{{ _loki_totals_file }}" + --debug "{{ cloudkitty_debug_dir }}" + register: loki_retrieved_summary_info + changed_when: loki_retrieved_summary_info.rc == 0 + +- name: "Load Loki metrics summary" + ansible.builtin.include_vars: + file: "{{ _loki_totals_file }}" + name: _loki_summary + +- name: "Update scenario_result with Loki retrieval data for {{ item }}" + ansible.builtin.set_fact: + scenario_result: >- + {{ scenario_result | combine({ + 'loki_data_file': _loki_data_file, + 'loki_totals_file': _loki_totals_file, + 'loki_summary': _loki_summary + }) }} diff --git a/roles/telemetry_chargeback/tasks/run_test_scenarios.yml b/roles/telemetry_chargeback/tasks/run_test_scenarios.yml new file mode 100644 index 000000000..85388954c --- /dev/null +++ b/roles/telemetry_chargeback/tasks/run_test_scenarios.yml @@ -0,0 +1,54 @@ +--- +- name: "Generate Synthetic Data for each file: {{ item }}" + ansible.builtin.include_tasks: "gen_synth_loki_data.yml" + +- name: "Load data to Loki: {{ item }}" + ansible.builtin.include_tasks: 
"load_loki_data.yml" + +- name: "Get total rate from Loki: {{ item }}" + ansible.builtin.include_tasks: "loki_rate.yml" + +- name: "TEST Scenario data available for comparison {{ item }}" + ansible.builtin.assert: + that: + - scenario_result.synth_summary is defined + - scenario_result.loki_summary is defined + fail_msg: >- + FAILED! Missing summary data for scenario {{ item }}. + synth_summary defined: {{ scenario_result.synth_summary is defined }}, + loki_summary defined: {{ scenario_result.loki_summary is defined }} + success_msg: "PASSED! Scenario data available for {{ item }}" + +- name: "TEST Compare log counts - synthetic vs Loki {{ item }}" + ansible.builtin.assert: + that: + - scenario_result.synth_summary.data_log.log_count | int == scenario_result.loki_summary.data_log.log_count | int + fail_msg: >- + FAILED! {{ item }} - Log count mismatch: + synthetic={{ scenario_result.synth_summary.data_log.log_count }}, + loki={{ scenario_result.loki_summary.data_log.log_count }} + success_msg: >- + PASSED! {{ item }} - Log counts match: + {{ scenario_result.synth_summary.data_log.log_count }} + +- name: "TEST Compare total rate - synthetic vs Loki {{ item }}" + ansible.builtin.assert: + that: + - scenario_result.synth_summary.rate.total.Rating | float == scenario_result.loki_summary.rate.total.Rating | float + fail_msg: >- + FAILED! {{ item }} - Total rate mismatch: + synthetic={{ scenario_result.synth_summary.rate.total.Rating }}, + loki={{ scenario_result.loki_summary.rate.total.Rating }} + success_msg: >- + PASSED! {{ item }} - Total rates match: + {{ scenario_result.synth_summary.rate.total.Rating }} + +- name: "TEST Compare per-type rates - synthetic vs Loki {{ item }}" + ansible.builtin.assert: + that: + - scenario_result.synth_summary.rate.by_types == scenario_result.loki_summary.rate.by_types + fail_msg: >- + FAILED! 
{{ item }} - Per-type rates differ: + synthetic={{ scenario_result.synth_summary.rate.by_types }}, + loki={{ scenario_result.loki_summary.rate.by_types }} + success_msg: "PASSED! {{ item }} - Per-type rates are identical." diff --git a/roles/telemetry_chargeback/tasks/setup_loki_env.yml b/roles/telemetry_chargeback/tasks/setup_loki_env.yml new file mode 100644 index 000000000..e4a80250f --- /dev/null +++ b/roles/telemetry_chargeback/tasks/setup_loki_env.yml @@ -0,0 +1,56 @@ +--- +# Setup Loki Environment + +# Dynamic URLs +- name: "Get Loki Public Route Host" + ansible.builtin.command: + cmd: | + oc get route cloudkitty-lokistack -n {{ cloudkitty_namespace }} -o "jsonpath={.spec.host}" + register: loki_route + changed_when: false + +- name: "Set Loki URLs" + ansible.builtin.set_fact: + # Base URL + loki_base_url: "https://{{ loki_route.stdout }}" + + # Internal Flush URL (Service DNS: https://..svc:3100/flush) + ingester_flush_url: "https://cloudkitty-lokistack-ingester-http.{{ cloudkitty_namespace }}.svc:3100/flush" + +- name: "Set Derived Loki URLs" + ansible.builtin.set_fact: + loki_push_url: "{{ loki_base_url }}/api/logs/v1/cloudkitty/loki/api/v1/push" + loki_query_url: "{{ loki_base_url }}/api/logs/v1/cloudkitty/loki/api/v1/query_range" + +- name: "Debug URLs" + ansible.builtin.debug: + msg: + - "Loki Route: {{ loki_base_url }}" + - "Push URL: {{ loki_push_url }}" + - "Flush URL: {{ ingester_flush_url }}" + - "Query URL: {{ loki_query_url }}" + +# Certs to Ingest & Retrieve data to/from Loki +- name: "Ensure Local Certificate Directory Exists" + ansible.builtin.file: + path: "{{ cert_dir }}" + state: directory + mode: '0755' + +- name: "Extract Certificates from OpenShift Secret" + ansible.builtin.command: + cmd: | + oc extract secret/{{ cert_secret_name }} --to={{ cert_dir }} --confirm -n {{ cloudkitty_namespace }} + changed_when: true + +- name: "Extract Client Certificates" + ansible.builtin.command: + cmd: | + oc extract {{ client_secret }} --to={{
local_cert_dir }} --confirm -n {{ cloudkitty_namespace }} + changed_when: true + +- name: "Extract CA Bundle" + ansible.builtin.command: + cmd: | + oc extract {{ ca_configmap }} --to={{ local_cert_dir }} --confirm -n {{ cloudkitty_namespace }} + changed_when: true diff --git a/roles/telemetry_chargeback/templates/loki_data_templ.j2 b/roles/telemetry_chargeback/templates/loki_data_templ.j2 index b676f3013..a337f4976 100644 --- a/roles/telemetry_chargeback/templates/loki_data_templ.j2 +++ b/roles/telemetry_chargeback/templates/loki_data_templ.j2 @@ -1,4 +1,4 @@ -{"streams": [{ "stream": { "service": "{{ loki_stream.service }}" }, "values": [ +{"streams": [{ "stream": { {% for key, value in loki_stream.items() %}"{{ key }}": "{{ value }}"{% if not loop.last %}, {% endif %}{% endfor %} }, "values": [ {%- for item in log_data %} {%- set outer_idx = loop.index0 %} {%- set is_last_outer = loop.last %} @@ -13,7 +13,10 @@ "qty": entry_data.qty, "price": entry_data.price, "groupby": entry_data.groupby, - "metadata": entry_data.metadata + "metadata": entry_data.metadata, + "factor": entry_data.factor, + "offset": entry_data.offset, + "mutate": entry_data.mutate } -%} [ "{{ item.nanoseconds }}", diff --git a/roles/telemetry_chargeback/vars/main.yml b/roles/telemetry_chargeback/vars/main.yml index 5d7a47804..a4b01f1be 100644 --- a/roles/telemetry_chargeback/vars/main.yml +++ b/roles/telemetry_chargeback/vars/main.yml @@ -1,20 +1,6 @@ --- -logs_dir_zuul: "/home/zuul/ci-framework-data/logs" -artifacts_dir_zuul: "/home/zuul/ci-framework-data/artifacts" - -cloudkitty_synth_script: "{{ role_path }}/files/gen_synth_loki_data.py" -cloudkitty_data_template: "{{ role_path }}/templates/loki_data_templ.j2" -ck_data_config: "{{ role_path }}/files/test_static.yml" -ck_output_file_local: "{{ artifacts_dir_zuul }}/loki_synth_data.json" -ck_output_file_remote: "{{ logs_dir_zuul }}/gen_loki_synth_data.log" - # Scenario and script paths (using role_path) cloudkitty_scenario_dir: "{{ role_path 
}}/files" +cloudkitty_synth_script: "{{ role_path }}/files/gen_synth_loki_data.py" +cloudkitty_data_template: "{{ role_path }}/templates/loki_data_templ.j2" cloudkitty_summary_script: "{{ role_path }}/files/gen_db_summary.py" - -# File naming conventions (internal standardization) -cloudkitty_synth_data_suffix: "-synth_data.json" -cloudkitty_loki_data_suffix: "-loki_data.json" -cloudkitty_synth_totals_metrics_suffix: "-synth_metrics_summary.yml" -cloudkitty_loki_totals_metrics_suffix: "-loki_metrics_summary.yml" -cloudkitty_loki_totals_suffix: "-rating.yml"