From eb20605ca3afe91dc6fd36ec2fbd1433013280df Mon Sep 17 00:00:00 2001
From: David Miculit
Date: Tue, 28 Apr 2026 17:38:21 +0300
Subject: [PATCH 1/3] docs: add steps for local data ingestion

---
 docs/installation.md | 92 ++++++++++++++++++++++++++++++++++++--------
 docs/pulseload.md    |  2 +-
 2 files changed, 76 insertions(+), 18 deletions(-)

diff --git a/docs/installation.md b/docs/installation.md
index be35fe4cfc1..db1d1591b99 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -161,41 +161,99 @@ docker compose run -e PROJECTS_TO_INGEST=autoland backend celery -A treeherder w
 
 ### Manual ingestion
 
-`NOTE`: You have to include `--root-url https://community-tc.services.mozilla.com` in order to ingest from the [Taskcluster Community instance](https://community-tc.services.mozilla.com), otherwise, it will default to the Firefox CI.
+!!! note
+    You have to include `--root-url https://community-tc.services.mozilla.com` in order to ingest from the [Taskcluster Community instance](https://community-tc.services.mozilla.com), otherwise, it will default to the Firefox CI.
 
 Open a terminal window and run `docker compose up`. All following sections assume this step.
 
 #### Ingesting pushes
 
-`NOTE`: Only the push information will be ingested. Tasks
-associated with the pushes will not. This mode is useful to seed pushes so
-they are visible on the web interface and so you can easily copy and paste
-changesets from the web interface into subsequent commands to ingest all tasks.
+!!! note
+    Only the push information will be ingested. Tasks
+    associated with the pushes will not. This mode is useful to seed pushes so
+    they are visible on the web interface and so you can easily copy and paste
+    changesets from the web interface into subsequent commands to ingest all tasks.
 
-Ingest a single Mercurial push or the last N pushes:
+These steps should help you set up everything you need to ingest data locally while working on perfherder.
 
-```console
-docker compose exec backend ./manage.py ingest push -p autoland -r 63f8a47cfdf5
-docker compose exec backend ./manage.py ingest push -p mozilla-central --last-n-pushes 100
+#### Create a pulse guardian account and run
+
+```bash
+export PULSE_URL=amqp://USER:PASSWORD@pulse.mozilla.org:5671/?ssl=1
+pnpm install
 ```
 
-Ingest a single Github push or the last 10:
+#### Run each of these commands in a seperate window
 
-```console
+```bash
+docker compose up --build
+docker compose run -e PROJECTS_TO_INGEST=autoland backend celery -A treeherder worker --concurrency 1
+```
+
+---
+
+#### Run the db viewer
+
+You can use any database viewer of your choice `(e.g. dbeaver-ce, mysql workbench, etc.)`
+
+**Connect to the following while `docker compose up --build` from the previous step is running:**
+
+`Serverhost: localhost`
+`Port: 3306`
+`Database: treeherder`
+`Username: root`
+`No password`
+
+---
+
+#### Run the following in separate window while running above to do ingestion
+
+
+!!! note
+    These commands perform fetches for the data, they are run sequentially in the same window. The first command makes the second one run faster.
+
+#### Ingest push
+
+```bash
+docker compose exec backend ./manage.py ingest push -p autoland -r 1ee42a54a431acdd6cbe43b49de0237fe67eddd9
+```
+
+#### Ingest all the tasks, run celery to trigger the log parsing, and performance data ingestion
+
+> **Warning:** This command can take a long time to ingest and parse everything.
+
+```bash
+docker compose exec backend ./manage.py ingest push -p autoland -r 1ee42a54a431acdd6cbe43b49de0237fe67eddd9 -a --enable-eager-celery
+```
+
+`--enable-eager-celery` triggers the log parsing which is required to capture the `PERFHERDER_DATA` output.
+
+#### For ingesting multiple pushes
+
+```bash
+docker compose exec backend ./manage.py ingest push -p autoland --last-n-pushes 100
+```
+
+#### Ingest a single Github push or the last 10
+
+```bash
 docker compose exec backend ./manage.py ingest git-push -p servo-try -c 92fc94588f3b6987082923c0003012fd696b1a2d
 docker compose exec -e GITHUB_TOKEN= backend ./manage.py ingest git-pushes -p android-components
 ```
 
-`NOTE`: You can ingest all tasks for a push. Check the help output for the script to determine the
-parameters needed.
+!!! note
+    You can ingest all tasks for a push. Check the help output for the script to determine the
+    parameters needed.
 
-`NOTE`: If you make too many calls to the Github API you will start getting 403 messages because of the rate limit.
-To avoid this visit [your settings](https://github.com/settings/tokens) and set up `GITHUB_TOKEN`. You don't need
-to grant scopes for it.
+!!! note
+    If you make too many calls to the Github API you will start getting 403 messages because of the rate limit.
+    To avoid this visit [your settings](https://github.com/settings/tokens) and set up `GITHUB_TOKEN`. You don't need
+    to grant scopes for it.
 
 #### Ingesting Github PRs
 
-`NOTE`: This will only ingest the commits if there's an active Github PRs project. It will only ingest the commits.
+!!! note
+    This will only ingest the commits if there's an active Github PRs project. It will only ingest the commits.
 
 ```bash
 docker compose exec backend ./manage.py ingest pr --pr-url https://github.com/mozilla-mobile/android-components/pull/4821
diff --git a/docs/pulseload.md b/docs/pulseload.md
index 4be15fe2d9b..9bae88b7b23 100644
--- a/docs/pulseload.md
+++ b/docs/pulseload.md
@@ -19,7 +19,7 @@ export PROJECTS_TO_INGEST=autoland,try
 Visit [Pulse Guardian], sign in, and create a **Pulse User**. It will ask you to set a
 username and password. Remember these as you'll use them in the next step.
 This is recommended, because using the default value **MAY** cause you to miss some data,
-if it was already ingested by another user Unfortunately, **Pulse** doesn't support creating
+if it was already ingested by another user. Unfortunately, **Pulse** doesn't support creating
 queues with a guest account.
 
 If your **Pulse User** was username: `foo` and password: `bar`, your Pulse URL

From e61479588d015bad71dc4c3260b88bcaec1fc854 Mon Sep 17 00:00:00 2001
From: David Miculit
Date: Thu, 30 Apr 2026 12:09:16 +0300
Subject: [PATCH 2/3] fix: address comments for improvement

---
 docs/installation.md | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/docs/installation.md b/docs/installation.md
index db1d1591b99..2e46f0eb847 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -178,6 +178,9 @@ These steps should help you set up everything you need to ingest data locally wh
 
 #### Create a pulse guardian account and run
 
+!!! note
+    This step is optional and used if you want to ingest from pulse messages.
+
 ```bash
 export PULSE_URL=amqp://USER:PASSWORD@pulse.mozilla.org:5671/?ssl=1
 pnpm install
@@ -199,14 +202,14 @@ You can use any database viewer of your choice `(e.g. dbeaver-ce, mysql workbenc
 **Connect to the following while `docker compose up --build` from the previous step is running:**
 
 `Serverhost: localhost`
-`Port: 3306`
+`Port: 5432`
 `Database: treeherder`
-`Username: root`
-`No password`
+`Username: postgres`
+`Password: mozilla1234`
 
 ---
 
-#### Run the following in separate window while running above to do ingestion
+#### Run the following in a separate window while running above to do ingestion
 
 
 !!! note
@@ -234,6 +237,12 @@ docker compose exec backend ./manage.py ingest push -p autoland -r 1ee42a54a431a
 docker compose exec backend ./manage.py ingest push -p autoland --last-n-pushes 100
 ```
 
+#### Ingest a single task
+
+```bash
+docker-compose exec backend ./manage.py ingest task -p autoland -r 1ee42a54a431acdd6cbe43b49de0237fe67eddd9 --task-id --enable-eager-celery
+```
+
 #### Ingest a single Github push or the last 10
 
 ```bash

From b6fa2687d6726c91ec008490b1d1d2c67048bd31 Mon Sep 17 00:00:00 2001
From: David Miculit
Date: Wed, 6 May 2026 16:21:49 +0300
Subject: [PATCH 3/3] fix: change from docker-compose to docker compose

---
 docs/installation.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/installation.md b/docs/installation.md
index 2e46f0eb847..ca761e9266e 100644
--- a/docs/installation.md
+++ b/docs/installation.md
@@ -240,7 +240,7 @@ docker compose exec backend ./manage.py ingest push -p autoland --last-n-pushes
 #### Ingest a single task
 
 ```bash
-docker-compose exec backend ./manage.py ingest task -p autoland -r 1ee42a54a431acdd6cbe43b49de0237fe67eddd9 --task-id --enable-eager-celery
+docker compose exec backend ./manage.py ingest task -p autoland -r 1ee42a54a431acdd6cbe43b49de0237fe67eddd9 --task-id --enable-eager-celery
 ```
 
 #### Ingest a single Github push or the last 10