Commit fa039a5

mesmith027 and treysp authored
Big query tutorial (#3365)
Co-authored-by: Trey Spiller <treyspiller@gmail.com>
1 parent d711aa7 commit fa039a5

7 files changed

Lines changed: 132 additions & 1 deletion

docs/integrations/engines/bigquery.md

Lines changed: 132 additions & 1 deletion
@@ -1,5 +1,136 @@
11
# BigQuery
22

3+
## Introduction
4+
5+
This guide provides step-by-step instructions on how to connect SQLMesh to the BigQuery SQL engine.
6+
7+
It will walk you through the steps of installing SQLMesh and BigQuery connection libraries locally, configuring the connection in SQLMesh, and running the [quickstart project](../../quick_start.md).
8+
9+
## Prerequisites
10+
11+
This guide assumes the following about the BigQuery project being used with SQLMesh:
12+
13+
- The project already exists
14+
- Project [CLI/API access is enabled](https://cloud.google.com/endpoints/docs/openapi/enable-api)
15+
- Project [billing is configured](https://cloud.google.com/billing/docs/how-to/manage-billing-account) (i.e. it's not a sandbox project)
16+
- SQLMesh can authenticate using an account with permissions to execute commands against the project
17+
18+
## Installation
19+
20+
Follow the [quickstart installation guide](../../installation.md) up to the step that [installs SQLMesh](../../installation.md#install-sqlmesh-core).
21+
22+
Instead of installing just SQLMesh core, we will also include the BigQuery engine libraries:
23+
24+
```bash
> pip install "sqlmesh[bigquery]"
```
27+
28+
### Install Google Cloud SDK
29+
30+
SQLMesh connects to BigQuery via the Python [`google-cloud-bigquery` library](https://pypi.org/project/google-cloud-bigquery/), which uses the [Google Cloud SDK `gcloud` tool](https://cloud.google.com/sdk/docs) for [authenticating with BigQuery](https://googleapis.dev/python/google-api-core/latest/auth.html).
31+
32+
Follow these steps to install and configure the Google Cloud SDK on your computer:
33+
34+
- Download the appropriate installer for your system from the [Google Cloud installation guide](https://cloud.google.com/sdk/docs/install)
35+
- Unpack the downloaded file with the `tar` command:
36+
37+
```bash
> tar -xzvf google-cloud-cli-{SYSTEM_SPECIFIC_INFO}.tar.gz
```
40+
41+
- Run the installation script:
42+
43+
```bash
> ./google-cloud-sdk/install.sh
```
46+
47+
- Reload your shell profile (e.g., for zsh):
48+
49+
```bash
> source $HOME/.zshrc
```
52+
53+
- Run [`gcloud init` to set up authentication](https://cloud.google.com/sdk/gcloud/reference/init)
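
Before moving on to configuration, it can help to confirm that both command-line tools from the steps above are reachable from your shell. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and not part of SQLMesh or the Google Cloud SDK.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    # `gcloud` is installed by the Google Cloud SDK, `sqlmesh` by the pip command above.
    missing = missing_tools(["gcloud", "sqlmesh"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("gcloud and sqlmesh are both on PATH")
```

If `gcloud` is reported missing right after installation, reloading your shell profile as described above is usually the first thing to try.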
54+
55+
## Configuration
56+
57+
### Configure SQLMesh for BigQuery
58+
59+
Add the following gateway specification to your SQLMesh project's `config.yaml` file:
60+
61+
```yaml
gateways:
  bigquery:
    connection:
      type: bigquery
      project: <your_project_id>

default_gateway: bigquery
```
69+
70+
This creates a gateway named `bigquery` and makes it your project's default gateway.
71+
72+
It uses the [`oauth` authentication method](#authentication-methods), which does not specify a username or other information directly in the connection configuration. Other authentication methods are [described below](#authentication-methods).
73+
74+
Next, find the ID of the BigQuery project your SQLMesh project will use. From the Google Cloud dashboard, use the arrow to open the pop-up menu:
75+
76+
![BigQuery Dashboard](./bigquery/bigquery-1.png)
77+
78+
Now we can identify the project ID needed in the `config.yaml` gateway specification above. Select the project you want to work with. The project ID to add to your YAML file is the value under the ID label in the pop-up menu.
79+
80+
![BigQuery Dashboard: selecting your project](./bigquery/bigquery-2.png)
81+
82+
This guide uses the Docs-Demo project, so the project ID for this example is `healthy-life-440919-s0`.
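
A common slip is pasting the project's display name (here, Docs-Demo) into `config.yaml` instead of its ID. As a rough guard, the sketch below checks a string against Google Cloud's documented format for newly created project IDs: 6 to 30 characters, lowercase letters, digits, and hyphens only, starting with a letter and not ending with a hyphen. The function name `looks_like_project_id` is hypothetical, and the check says nothing about whether the project exists or whether you can access it.

```python
import re

# Format rule for newly created Google Cloud project IDs:
# 6-30 characters, lowercase letters, digits, and hyphens,
# starting with a letter and not ending with a hyphen.
_PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def looks_like_project_id(value: str) -> bool:
    """Return True if `value` matches the project ID format."""
    return _PROJECT_ID_RE.fullmatch(value) is not None


if __name__ == "__main__":
    print(looks_like_project_id("healthy-life-440919-s0"))  # True
    print(looks_like_project_id("Docs-Demo"))  # False: display name, not an ID
```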
83+
84+
## Usage
85+
86+
### Test the connection
87+
88+
Run the following command to verify that SQLMesh can connect to BigQuery:
89+
90+
```bash
> sqlmesh info
```
93+
94+
The output will look something like this:
95+
96+
![Terminal Output](./bigquery/bigquery-3.png)
97+
98+
- **Set quota project (optional)**
99+
100+
You may see warnings like this when you run `sqlmesh info`:
101+
102+
![Terminal Output with warnings](./bigquery/bigquery-4.png)
103+
104+
You can avoid these warnings about quota projects by running:
105+
106+
```bash
> gcloud auth application-default set-quota-project <your_project_id>
> gcloud config set project <your_project_id>
```
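
To see whether a quota project is already set, you can inspect the Application Default Credentials (ADC) file that `gcloud` maintains. The sketch below assumes the default ADC location on Linux and macOS (`~/.config/gcloud/application_default_credentials.json`) and that the quota project is stored under the `quota_project_id` key; the function name `read_quota_project` is hypothetical. On Windows, or if you have relocated your gcloud configuration, pass the path explicitly.

```python
import json
from pathlib import Path
from typing import Optional


def read_quota_project(adc_path: Optional[Path] = None) -> Optional[str]:
    """Return the quota project stored in the ADC file, or None if absent."""
    if adc_path is None:
        # Assumed default ADC location on Linux and macOS.
        adc_path = (
            Path.home() / ".config" / "gcloud" / "application_default_credentials.json"
        )
    if not adc_path.is_file():
        return None
    with adc_path.open(encoding="utf-8") as f:
        credentials = json.load(f)
    return credentials.get("quota_project_id")


if __name__ == "__main__":
    quota_project = read_quota_project()
    print(quota_project or "No quota project set in the ADC file")
```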
110+
111+
112+
### Create and run a plan
113+
114+
We've verified our connection, so we're ready to create and execute a plan in BigQuery:
115+
116+
```bash
> sqlmesh plan
```
119+
120+
### View results in BigQuery Console
121+
122+
Let's confirm that our project models are as expected.
123+
124+
First, navigate to the BigQuery Studio Console:
125+
126+
![Steps to the Studio](./bigquery/bigquery-5.png)
127+
128+
Then use the left sidebar to find your project and the newly created models:
129+
130+
![New Models](./bigquery/bigquery-6.png)
131+
132+
We have confirmed that our SQLMesh project is running properly in BigQuery!
133+
3134
## Local/Built-in Scheduler
4135
5136
**Engine Adapter Type**: `bigquery`
@@ -76,7 +207,7 @@ sqlmesh_airflow = SQLMeshAirflow(
76207
)
77208
```
78209

79-
## Connection Methods
210+
## Authentication Methods
80211
- [oauth](https://google-auth.readthedocs.io/en/master/reference/google.auth.html#google.auth.default) (default)
    - Related Credential Configuration:
        - `scopes` (Optional)