4 changes: 4 additions & 0 deletions docs/administration/authentication.md
@@ -18,3 +18,7 @@ Eppo supports the following enterprise authentication options:
- OpenID Connect

Follow the guides linked above or reach out to your Eppo team if you would like one of these options configured for your users.

:::info SSO login flow
Eppo supports **SP-initiated** (Service Provider-initiated) SSO login only. Users must start the login flow from the Eppo login page (`eppo.cloud`), not from the identity provider's app dashboard. IdP-initiated login (clicking the Eppo tile in Okta, Azure AD, etc.) can result in a login loop and is not supported.
:::
11 changes: 11 additions & 0 deletions docs/data-management/connecting-dwh/bigquery.md
@@ -76,3 +76,14 @@ Now that you have a proper Service Account created for Eppo with adequate privil
### Updating Credentials

Credentials can be updated at any time within the Admin panel of the app.

### Rotating service accounts

When switching to a new service account:

1. **Grant the new service account the same IAM roles** as the existing one — at minimum, BigQuery Data Viewer on the datasets referenced in your definitions, plus BigQuery Data Editor on the `eppo_output` dataset.
2. **Upload the new service account key** in the Eppo Admin panel and click "Test Connection" to verify.
3. **Trigger a test refresh** on one experiment to confirm the pipeline runs end-to-end.
4. **Revoke the old service account key** only after verifying the new one works.

If you see permission errors after switching, the most common cause is missing IAM grants on the new service account.
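The required grants can also be applied with BigQuery's SQL DCL. The project, dataset, and service account names below are placeholders for illustration only; mirror the roles your existing account actually holds:

```sql
-- Illustrative only: replace the project, dataset, and service account names.
-- Read access on a dataset referenced in your definitions:
GRANT `roles/bigquery.dataViewer`
ON SCHEMA `my-project.analytics`
TO "serviceAccount:eppo-new@my-project.iam.gserviceaccount.com";

-- Write access on the output dataset:
GRANT `roles/bigquery.dataEditor`
ON SCHEMA `my-project.eppo_output`
TO "serviceAccount:eppo-new@my-project.iam.gserviceaccount.com";
```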
11 changes: 11 additions & 0 deletions docs/data-management/connecting-dwh/redshift.md
@@ -169,3 +169,14 @@ Now that you have a proper User created for Eppo with adequate privileges, you c
### Updating Credentials

Credentials can be updated at any time within the Admin panel of the app.

### Rotating service accounts

When switching to a new service account or database user:

1. **Grant the new user the same permissions** as the existing one — read access to all schemas and tables referenced in your definitions, plus write access to the `eppo_output` schema.
2. **Update the connection** in the Eppo Admin panel and click "Test Connection" to verify.
3. **Trigger a test refresh** on one experiment to confirm the pipeline runs end-to-end.
4. **Revoke old credentials** only after verifying the new account works.

If you see `Object does not exist or not authorized` errors after switching, the most common cause is missing grants on the new user.
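A sketch of the corresponding Redshift grants, with placeholder schema and user names (adjust to match the permissions your existing user holds):

```sql
-- Illustrative only: replace the schema and user names with your own.
-- Read access on a schema referenced in your definitions:
GRANT USAGE ON SCHEMA analytics TO eppo_user_new;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO eppo_user_new;

-- Write access on the output schema:
GRANT ALL ON SCHEMA eppo_output TO eppo_user_new;
GRANT ALL ON ALL TABLES IN SCHEMA eppo_output TO eppo_user_new;
```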
11 changes: 11 additions & 0 deletions docs/data-management/connecting-dwh/snowflake.md
@@ -148,3 +148,14 @@ MIIFJDBWBg...
### Updating Credentials

Credentials can be updated at any time within the Admin panel of the app.

### Rotating service accounts

When switching to a new service account (e.g., rotating credentials or migrating to a different account):

1. **Grant the new service account the same permissions** as the existing one. At minimum, the new account needs read access to all schemas and tables referenced in your Fact SQL and Assignment SQL definitions, plus write access to the `EPPO_OUTPUT` schema (or equivalent).
2. **Update the connection** in the Eppo Admin panel with the new credentials and click "Test Connection" to verify.
3. **Trigger a test refresh** on one experiment to confirm the pipeline runs successfully end-to-end with the new account.
4. **Revoke the old credentials** only after verifying the new account works correctly.

If you see `Object does not exist or not authorized` errors after switching, the most common cause is missing grants on the new service account. Mirror all grants from the old account before removing it.
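A sketch of the corresponding Snowflake grants, with placeholder warehouse, database, and role names (mirror whatever your existing role actually holds):

```sql
-- Illustrative only: replace the warehouse, database, and role names with your own.
GRANT USAGE ON WAREHOUSE eppo_wh TO ROLE eppo_role_new;
GRANT USAGE ON DATABASE analytics TO ROLE eppo_role_new;
GRANT USAGE ON ALL SCHEMAS IN DATABASE analytics TO ROLE eppo_role_new;
GRANT SELECT ON ALL TABLES IN DATABASE analytics TO ROLE eppo_role_new;

-- Write access on the output schema:
GRANT ALL ON SCHEMA analytics.EPPO_OUTPUT TO ROLE eppo_role_new;
```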
22 changes: 21 additions & 1 deletion docs/data-management/data-pipeline.md
@@ -24,12 +24,24 @@ Note that the y axis shows the compute time accrued by that task type. That is,

### Incremental refreshes

Eppo's scheduled jobs will run an incremental refresh that only scans recent data. By default, this lookback window will include data starting 48 hours before the last successful run (to change this time window, reach out to your Eppo contact or email support@geteppo.com). New metrics and metrics/facts with an updated definition will automatically be backfilled from the start of the experiment. Further, if a job fails on a given day, the next scheduled job will automatically go back and re-run metrics for that day.
Eppo's scheduled jobs will run an incremental refresh that only scans recent data. By default, this lookback window covers the **2 days** before the last successful run, snapped to the start of day (to change this time window, reach out to your Eppo contact or email support@geteppo.com). New metrics and metrics/facts with an updated definition will automatically be backfilled from the start of the experiment. Further, if a job fails on a given day, the next scheduled job will automatically go back and re-run metrics for that day.

You can also trigger a refresh in the UI by clicking "refresh experiment results" on the metric scorecard. This will compute new metrics from scratch and update existing metrics based on the incremental logic described above. If you'd instead like to force a full refresh and recompute all metrics from the start of the experiment, click "update now" under "results last updated".

![Data pipeline chart](/img/data-management/pipeline/refresh.png)

### When do I need a full refresh or backfill?

Not every data issue requires a full backfill. Use this decision tree to determine the right action:

- **Eppo's pipeline failed (e.g., warehouse timeout, permission error) but your underlying data is correct:** You generally do **not** need a backfill. The incremental lookback window (default 2 days) will automatically re-process the missed period on the next successful run. Verify the next scheduled run completes successfully.

- **Your upstream data was wrong and has now been corrected (e.g., a broken ETL was fixed, late-arriving data has landed):** You likely **do** need a full refresh to recompute metrics from the affected date. Trigger a full refresh from the experiment's results page ("update now" under "results last updated"), or use the API: `POST /api/v1/experiment-results/update/{experiment_id}`. The endpoint accepts a `lookback_date` query parameter (ISO 8601 format, e.g. `?lookback_date=2025-06-01T00:00:00Z`) to recompute results starting from a specific date instead of reprocessing the entire experiment, as well as `full_refresh=true` to force a non-incremental refresh.

- **You changed a metric definition or Fact SQL:** New and updated metric definitions are automatically backfilled from the start of the experiment on the next pipeline run (scheduled, triggered via the API, or triggered manually from the UI). No manual action is needed.

- **You're unsure whether data has been corrected upstream:** Before triggering a full refresh, confirm with your data team that the source tables now contain the correct data for the affected period. A full refresh against still-broken data will not help.
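As a rough sketch of the API option above, the refresh URL can be assembled as follows. The host, experiment ID, and auth header shown here are illustrative assumptions, not guaranteed values; consult your workspace's API documentation for the real ones.

```python
from urllib.parse import urlencode

# Assumptions for illustration: the host, experiment ID, and auth header
# below are placeholders, not documented values.
BASE = "https://eppo.cloud/api/v1"
experiment_id = 12345

params = {
    "lookback_date": "2025-06-01T00:00:00Z",  # recompute from this date
    "full_refresh": "true",                   # force a non-incremental refresh
}
url = f"{BASE}/experiment-results/update/{experiment_id}?{urlencode(params)}"

# Send the request with your API key, e.g.:
#   curl -X POST -H "X-Eppo-Token: $EPPO_API_KEY" "$url"
print(url)
```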



### Pipeline steps
@@ -91,6 +103,14 @@ As we’ve detailed, Eppo doesn’t export individual data from your warehouse.

If you have any question about our privacy practices, please reach out.

### Intermediate tables and views

Eppo creates intermediate tables and views in a dedicated schema (typically `EPPO_OUTPUT`) in your warehouse. Over time — especially in long-running workspaces with many experiments — these can accumulate into thousands of objects. This is expected behavior and does not affect experiment results.

To manage this, Eppo provides an **automatic warehouse table cleanup** setting. Navigate to **Admin → Settings → Experiment Schedule Settings** and enable **"Automatically clean up old warehouse tables"**. You configure a retention period (e.g., 90 days) — Eppo will then drop any `EPPO_OUTPUT` tables that haven't been updated within that window. The cleanup runs on the 1st of every month. By default, tables used by Explore charts and the Sample Size Calculator are preserved; you can opt in to cleaning those up as well with separate toggles.

Do not manually drop tables from the `EPPO_OUTPUT` schema — active experiments may depend on them. Use the built-in cleanup automation instead, which only removes tables outside the retention window.
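Before enabling the setting, you can gauge how many objects would fall outside a given retention window with an audit query like the one below (Snowflake syntax shown as an illustration; adapt the date function to your warehouse):

```sql
-- Counts EPPO_OUTPUT tables not updated in the last 90 days (Snowflake syntax).
SELECT COUNT(*) AS stale_tables
FROM information_schema.tables
WHERE table_schema = 'EPPO_OUTPUT'
  AND last_altered < DATEADD(day, -90, CURRENT_TIMESTAMP());
```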

## Clustered Analysis Pipeline

Clustered analysis experiments have a few more steps in the data pipeline to aggregate from the subentity level to the cluster level. See diagram below with additional steps highlighted.
@@ -6,6 +6,14 @@ For some experiments, subjects are assigned to a variant in one place, but are n

Eppo provides the ability to filter an assignment source by an [Entry Point](/statistics/sample-size-calculator/setup#creating-entry-points) (also known as a qualifying event) when configuring an experiment. This ensures that only the subjects assigned to that entry point are analyzed in the experiment, based on the logged events for that entry point. All decisions (inclusion into the experiment, time-framed metrics) are based on the timestamp of the entry point.

:::caution Entry points change when exposure starts, not just who is included
When you add an entry point filter, the **entry point timestamp replaces the assignment timestamp** as the start of each subject's analysis window. This means metric events are measured relative to when the subject triggered the entry point, not when they were originally assigned.

If you only want to filter which subjects are included (without shifting the analysis window), use an Assignment SQL filter or a targeting rule instead, or make sure the entry point and the assignment timestamps match.

The entry point entity must match the assignment entity for the join to work correctly.
:::
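As a simplified illustration of the window shift (table and column names below are hypothetical, not Eppo's actual pipeline SQL):

```sql
-- Hypothetical sketch: with an entry point filter, each subject's analysis
-- window starts at the entry point timestamp, not the assignment timestamp.
SELECT
  a.subject_id,
  a.variant,
  e.entry_ts AS analysis_start   -- replaces a.assigned_ts as the window start
FROM assignments a
JOIN entry_points e
  ON e.subject_id = a.subject_id -- entry point entity must match assignment entity
 AND e.entry_ts >= a.assigned_ts -- only entry events after assignment qualify
```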

First you’ll need both an assignment source and an entry point source configured. Then, when setting up an experiment, check the box marked “Filter assignments by entry points” in the **Logging & Experiment Key** section:

![Choose assignment SQL](/img/building-experiments/select-assignment-source.png)
2 changes: 1 addition & 1 deletion docs/experiment-analysis/configuration/index.md
@@ -20,7 +20,7 @@ On the side panel, you'll be prompted to enter some information about the experi
2. The [Entity](/data-management/definitions/entities) on which the experiment was randomized (user, device, workspace, etc.)
3. Which [Assignment Source](/data-management/definitions/assignment-sql) has assignment logs for the experiment
4. An optional [entry point](/statistics/sample-size-calculator/setup#what-is-an-entry-point) on which to filter experiment assignments. This will limit the experiment analysis to subjects (e.g., users) that hit the specified entry point. You can read more about filtering experiment assignments [here](/experiment-analysis/configuration/filter-assignments-by-entry-point).
5. The experiment key of interest. The drop-down will show flags created in Eppo as well as other experiment keys in the selected Assignment Source. If your experiment key does not show up in the drop-down you can also enter it manually.
5. The experiment key of interest. The drop-down will show flags created in Eppo as well as other experiment keys in the selected Assignment Source. If your experiment key does not show up in the drop-down you can also enter it manually. Note that experiment keys are **not required to be unique** across experiments — the same key can appear in multiple assignment sources or be reused over time. If you use the API to create or query experiments programmatically, ensure you account for this by also specifying the assignment source or date range to disambiguate.
6. For experiments randomized with Eppo's feature flags, you'll also specify the [Allocation](/feature-flagging/#allocations) you want to analyze (one flag can be used to run multiple experiments)
7. A hypothesis for the experiment. You can also add this later when creating an experiment [report](/experiment-analysis/reporting/experiment-reports)

27 changes: 16 additions & 11 deletions docs/experiment-analysis/diagnostics.md
@@ -30,15 +30,7 @@ Validity of experimental results crucially relies on proper randomization of sub

### Traffic imbalance diagnostic

The traffic imbalance diagnostic runs a test to see whether the randomization works as expected and the number of subjects assigned to each variation is as expected. This indicates that there is likely an issue with the randomization of subjects (e.g. a bug in the randomization code), which can invalidate the results of an experiment.

We run this traffic imbalance test by running a [Pearson’s chi-squared test](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test) with $\alpha = 0.001$ on active variations, using the assignment weights for each variant (default is equal split across variations), which we convert to probabilities. This is also known as the sample ratio mismatch test (SRM). We run the test at the more conservative $\alpha = 0.001$ level because this test is not sequentially valid; the more conservative significance level helps us avoid false positives.

Issues with the traffic allocations can come from many sources; here are some common ones we have seen:

- There is an issue with the logging assignments (note this could be introduced through latency)
- Traffic allocations are updated in the middle of an experiments; in general, try to avoid changing the traffic allocations during an experiment
- Assignments for one variant (e.g. the control cell) started before assignments to other variants
Eppo runs a test to check whether the number of subjects assigned to each variation matches the expected split (sample ratio mismatch, or SRM). When it doesn’t, there is likely an issue with randomization or assignment logging, which can invalidate experiment results. For a detailed explanation of the test, common causes, and a step-by-step troubleshooting flow, see [Sample Ratio Mismatch](/statistics/sample-ratio-mismatch).

![Example diagnostic for a traffic imbalance in the assignment data](/img/experiments/diagnostics/diagnostics_imbalance.png)
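For intuition, the core of the check can be sketched as a Pearson's chi-squared statistic compared against the critical value at $\alpha = 0.001$. The counts below are made up for illustration; Eppo's production test is more involved.

```python
# Sketch of a sample ratio mismatch (SRM) check. Counts are hypothetical.
def srm_statistic(observed_counts, expected_weights):
    """Pearson's chi-squared statistic for observed vs. expected assignment counts."""
    total = sum(observed_counts)
    weight_sum = sum(expected_weights)
    expected = [w / weight_sum * total for w in expected_weights]
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))

# Chi-squared critical value for alpha = 0.001 with 1 degree of freedom.
CRITICAL_VALUE = 10.828

stat = srm_statistic([5200, 4800], [1, 1])  # expected an equal 50/50 split
print(stat, stat > CRITICAL_VALUE)  # 16.0 True -> flagged as a traffic imbalance
```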

@@ -72,9 +64,22 @@ Data quality diagnostics check that experiment data matches what we would expect

### Pre-experiment data diagnostic

Eppo detects when pre-experiment metric averages differ significantly across variations for one or more metrics. Eppo will highlight the top metrics where we see this differentiation.
This issue is most often driven by the non-random assignment of users into variations within the experiment. Consult with your engineering team to diagnose potential issues with your randomization tool.
Eppo detects when pre-experiment metric averages differ significantly across variations and highlights the top metrics where this occurs. When the gap is too large to be plausibly due to chance, Eppo flags it so you can investigate before trusting experiment results.

Possible reasons include:

- Incorrectly specified experiment dates
- Iterating on a feature (e.g., rerunning the same split after a buggy build), so Treatment had different pre-experiment exposure than Control
- A gradual roll-out where the experiment start date is set to the full roll-out date
- Any [traffic imbalance](#traffic-imbalance-diagnostic) cause (assignment logging issues, latency, one variant starting before others)

For a detailed list of causes and a step-by-step diagnostic process, see [CUPED and significance](/guides/advanced-experimentation/cuped_and_significance#common-causes-for-pre-experiment-imbalance).



:::info
The pre-experiment data diagnostic is only run when CUPED is enabled. This setting can be changed in the Admin panel across all experiments, or on a per-experiment basis in the Configure tab under Statistical Analysis Plan.
:::

## Understanding diagnostic queries

Each diagnostic check includes a SQL query that you can copy and run directly in your warehouse to investigate further. However, there is an important caveat:

:::caution Diagnostic queries are approximations
The SQL queries shown in the diagnostic sidebar are **simplified approximations** of the full experiment pipeline. They do not apply [CUPED++](/statistics/cuped) variance reduction, [winsorization](/statistics/confidence-intervals/#estimating-lift), or mixed-assignment filtering. As a result, running these queries in your warehouse may produce numbers that differ from what Eppo displays on the experiment results page.

This is expected and does not indicate a bug. The diagnostic queries are designed to help you verify that data exists and joins correctly — not to reproduce the final experiment statistics.
:::
8 changes: 7 additions & 1 deletion docs/experiment-analysis/reading-results/global-lift.md
@@ -16,4 +16,10 @@ If your rollout plan would include additional users from the same audience that

:::

Global lift and coverage are currently only available for **sum** and **count** aggregation types. For details on how Global Lift is calculated, see [the Statistics section](/statistics/global-lift).
Global lift and coverage are available for **sum**, **count**, and **unique entity** (count distinct) aggregation types.

:::info Unique entity metrics and non-additivity
Unique entity (count distinct) metrics are **non-additive**: a user who converts in both the experiment population and the ineligible population is counted once in the global total, not twice. This can bias the extrapolation used in the Global Lift calculation. If you are comfortable with that approximation, reach out to Eppo support to have Global Lift enabled for unique entity metrics.
:::

For details on how Global Lift is calculated, see [the Statistics section](/statistics/global-lift).