1 | 1 | # BigQuery |
2 | 2 | |
| 3 | +## Introduction |
| 4 | + |
| 5 | +This guide provides step-by-step instructions on how to connect SQLMesh to the BigQuery SQL engine. |
| 6 | + |
| 7 | +It will walk you through the steps of installing SQLMesh and BigQuery connection libraries locally, configuring the connection in SQLMesh, and running the [quickstart project](../../quick_start.md). |
| 8 | + |
| 9 | +## Prerequisites |
| 10 | + |
| 11 | +This guide assumes the following about the BigQuery project being used with SQLMesh: |
| 12 | + |
| 13 | +- The project already exists |
| 14 | +- Project [CLI/API access is enabled](https://cloud.google.com/endpoints/docs/openapi/enable-api) |
| 15 | +- Project [billing is configured](https://cloud.google.com/billing/docs/how-to/manage-billing-account) (i.e. it's not a sandbox project) |
| 16 | +- SQLMesh can authenticate using an account with permissions to execute commands against the project |
| 17 | + |
| 18 | +## Installation |
| 19 | + |
| 20 | +Follow the [quickstart installation guide](../../installation.md) up to the step that [installs SQLMesh](../../installation.md#install-sqlmesh-core). |
| 21 | + |
| 22 | +At that step, instead of installing just SQLMesh core, also install the BigQuery engine libraries: |
| 23 | + |
| 24 | +```bash |
| 25 | +> pip install "sqlmesh[bigquery]" |
| 26 | +``` |
| 27 | + |
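As a quick check after installation, the sketch below reports which of the expected Python modules cannot be found in the active environment. It assumes the `bigquery` extra pulls in the `google-cloud-bigquery` package, whose import name is `google.cloud.bigquery`. The helper name `missing_modules` is hypothetical and is not part of SQLMesh.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names from module_names that cannot be located for import."""
    missing = []
    for name in module_names:
        try:
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises ModuleNotFoundError (an ImportError) when a
            # parent package such as `google` is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed import names for SQLMesh and the BigQuery client library.
    required = ["sqlmesh", "google.cloud.bigquery"]
    print(missing_modules(required) or "All required modules were found")
```

An empty result suggests the libraries are importable. It does not verify authentication or project access.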
| 28 | +### Install Google Cloud SDK |
| 29 | + |
| 30 | +SQLMesh connects to BigQuery via the Python [`google-cloud-bigquery` library](https://pypi.org/project/google-cloud-bigquery/), which uses the [Google Cloud SDK `gcloud` tool](https://cloud.google.com/sdk/docs) for [authenticating with BigQuery](https://googleapis.dev/python/google-api-core/latest/auth.html). |
| 31 | + |
| 32 | +Follow these steps to install and configure the Google Cloud SDK on your computer: |
| 33 | + |
| 34 | +- Download the appropriate installer for your system from the [Google Cloud installation guide](https://cloud.google.com/sdk/docs/install) |
| 35 | +- Unpack the downloaded file with the `tar` command: |
| 36 | + |
| 37 | + ```bash |
| 38 | + > tar -xzvf google-cloud-cli-{SYSTEM_SPECIFIC_INFO}.tar.gz |
| 39 | + ``` |
| 40 | + |
| 41 | +- Run the installation script: |
| 42 | + |
| 43 | + ```bash |
| 44 | + > ./google-cloud-sdk/install.sh |
| 45 | + ``` |
| 46 | + |
| 47 | +- Reload your shell profile (e.g., for zsh): |
| 48 | + |
| 49 | + ```bash |
| 50 | + > source $HOME/.zshrc |
| 51 | + ``` |
| 52 | + |
| 53 | +- Run [`gcloud init` to set up authentication](https://cloud.google.com/sdk/gcloud/reference/init) |
| 54 | + |
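After the steps above, one way to confirm that the `gcloud` tool is reachable from your shell environment is to look it up on `PATH`. This is a minimal sketch using only the Python standard library. The helper name `find_gcloud` is hypothetical, and finding the executable does not prove that `gcloud init` has been completed.

```python
import shutil


def find_gcloud(executable="gcloud"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_gcloud()
    if path is None:
        print("gcloud was not found on PATH; reload your shell profile or re-run the installer")
    else:
        print(f"gcloud found at {path}")
```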
| 55 | +## Configuration |
| 56 | + |
| 57 | +### Configure SQLMesh for BigQuery |
| 58 | + |
| 59 | +Add the following gateway specification to your SQLMesh project's `config.yaml` file: |
| 60 | + |
| 61 | +```yaml |
| 62 | +bigquery: |
| 63 | +  connection: |
| 64 | +    type: bigquery |
| 65 | +    project: <your_project_id> |
| 66 | + |
| 67 | +default_gateway: bigquery |
| 68 | +``` |
| 69 | + |
| 70 | +This creates a gateway named `bigquery` and makes it your project's default gateway. |
| 71 | + |
| 72 | +It uses the [`oauth` authentication method](#authentication-methods), which does not specify a username or other information directly in the connection configuration. Other authentication methods are [described below](#authentication-methods). |
| 73 | + |
| 74 | +In BigQuery, navigate to the dashboard and select the BigQuery project your SQLMesh project will use. From the Google Cloud dashboard, use the arrow to open the pop-up menu: |
| 75 | + |
| 76 | + |
| 77 | + |
| 78 | +Now we can identify the project ID needed in the `config.yaml` gateway specification above. Select the project that you want to work with. The project ID to add to your YAML file is the ID shown for that project in the pop-up menu. |
| 79 | + |
| 80 | + |
| 81 | + |
| 82 | +This guide uses the Docs-Demo project, so the project ID for this example is `healthy-life-440919-s0`. |
| 83 | + |
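It is easy to copy the project name (such as Docs-Demo) instead of the project ID. The sketch below checks that a value at least has the shape of a project ID. It assumes Google Cloud's project ID format: 6 to 30 characters, using lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen. The helper name `looks_like_project_id` is hypothetical, and the check does not confirm that the project exists.

```python
import re

# Assumed format: a lowercase letter, then 4 to 28 lowercase letters, digits,
# or hyphens, then a final lowercase letter or digit (6 to 30 characters total).
PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def looks_like_project_id(value):
    """Return True if value has the shape of a Google Cloud project ID."""
    return PROJECT_ID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["healthy-life-440919-s0", "Docs-Demo", "<your_project_id>"]:
        print(candidate, looks_like_project_id(candidate))
```

The project ID from this guide passes the check, while the project name and the unfilled placeholder do not.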
| 84 | +## Usage |
| 85 | + |
| 86 | +### Test the connection |
| 87 | + |
| 88 | +Run the following command to verify that SQLMesh can connect to BigQuery: |
| 89 | + |
| 90 | +```bash |
| 91 | +> sqlmesh info |
| 92 | +``` |
| 93 | + |
| 94 | +The output will look something like this: |
| 95 | + |
| 96 | + |
| 97 | + |
| 98 | +- **Set quota project (optional)** |
| 99 | + |
| 100 | + You may see warnings like this when you run `sqlmesh info`: |
| 101 | + |
| 102 | +  |
| 103 | + |
| 104 | + You can avoid these warnings about quota projects by running: |
| 105 | + |
| 106 | + ```bash |
| 107 | + > gcloud auth application-default set-quota-project <your_project_id> |
| 108 | + > gcloud config set project <your_project_id> |
| 109 | + ``` |
| 110 | + |
| 111 | + |
| 112 | +### Create and run a plan |
| 113 | + |
| 114 | +We've verified our connection, so we're ready to create and execute a plan in BigQuery: |
| 115 | + |
| 116 | +```bash |
| 117 | +> sqlmesh plan |
| 118 | +``` |
| 119 | + |
| 120 | +### View results in BigQuery Console |
| 121 | + |
| 122 | +Let's confirm that our project models are as expected. |
| 123 | + |
| 124 | +First, navigate to the BigQuery Studio Console: |
| 125 | + |
| 126 | + |
| 127 | + |
| 128 | +Then use the left sidebar to find your project and the newly created models: |
| 129 | + |
| 130 | + |
| 131 | + |
| 132 | +We have confirmed that our SQLMesh project is running properly in BigQuery! |
| 133 | + |
3 | 134 | ## Local/Built-in Scheduler |
4 | 135 | |
5 | 136 | **Engine Adapter Type**: `bigquery` |
@@ -76,7 +207,7 @@ sqlmesh_airflow = SQLMeshAirflow( |
76 | 207 | ) |
77 | 208 | ``` |
78 | 209 | |
79 | | -## Connection Methods |
| 210 | +## Authentication Methods |
80 | 211 | - [oauth](https://google-auth.readthedocs.io/en/master/reference/google.auth.html#google.auth.default) (default) |
81 | 212 | - Related Credential Configuration: |
82 | 213 | - `scopes` (Optional) |
|