| 1 | +# Code generation for bigframes.bigquery |
| 2 | + |
| 3 | +This document describes code generation for the `bigframes.bigquery` modules. |
| 4 | +For detailed specifications on input and output types, refer to |
| 5 | +[Contributing to bigframes.bigquery](./bigframes-bigquery-contributing.md). |
| 6 | + |
| 7 | +## Overview |
| 8 | + |
| 9 | +The script at `packages/bigframes/scripts/generate_bigframes_bigquery.py` |
| 10 | +generates Python submodules for the `bigframes.bigquery` module. When run |
| 11 | +without any arguments, it iterates through all YAML files matching |
| 12 | +`packages/bigframes/scripts/data/sql-functions/**/*.yaml` to generate the code. |
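A minimal sketch of how such a recursive discovery step could look, using only the standard library. The helper name `find_yaml_files` is hypothetical and is not taken from the actual script; only the `**/*.yaml` pattern comes from the text above.

```python
from pathlib import Path


def find_yaml_files(root: Path) -> list[Path]:
    """Return every *.yaml file under root, at any depth, in sorted order.

    Hypothetical helper for illustration; the real script may differ.
    """
    # "**" matches zero or more directories, so files placed directly
    # in root are found as well as files in nested directories.
    return sorted(root.glob("**/*.yaml"))
```

Sorting the result is a design choice in this sketch: it makes the generation order independent of the file system's listing order.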
| 13 | + |
| 14 | +The script at `packages/bigframes/scripts/check_bigframes_bigquery.py` iterates |
| 15 | +through the same YAML files and checks that each function has been included |
| 16 | +in the `bigframes.bigquery` module. This check is needed because the |
| 17 | +`__init__.py` file must be updated manually. |
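The core of such a check can be sketched as a comparison between the function names found in the YAML files and the attributes exposed by a module. The function `find_missing_exports` and the stand-in namespace below are assumptions for illustration; how the real script reads names from the YAML files is not shown here.

```python
import types


def find_missing_exports(namespace, expected_names):
    """Return the expected names that namespace does not expose, sorted.

    Hypothetical helper; `namespace` can be any object with attributes,
    such as an imported module.
    """
    return sorted(name for name in expected_names if not hasattr(namespace, name))


# Stand-in for the imported module, used only for this example.
fake_module = types.SimpleNamespace(current_date=lambda: None)
missing = find_missing_exports(fake_module, ["current_date", "bit_count"])
```

In this example `missing` contains only `bit_count`, the name that the stand-in namespace does not define.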
| 18 | + |
| 19 | +## Generated code organization |
| 20 | + |
| 21 | +The `generate_bigframes_bigquery.py` script generates submodules of |
| 22 | +`bigframes.bigquery._operations`, with the full path reflecting the organization |
| 23 | +of the YAML files. For example, a YAML file at |
| 24 | +`packages/bigframes/scripts/data/sql-functions/aead.yaml` corresponds to a |
| 25 | +generated Python module at `bigframes.bigquery._operations.aead`. Likewise, |
| 26 | +`packages/bigframes/scripts/data/sql-functions/builtins/bit.yaml` corresponds |
| 27 | +to the `bigframes.bigquery._operations.builtins.bit` submodule. |
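The mapping from YAML path to module name described above could be sketched as follows. The function name `module_name_for_yaml` and the two constants are hypothetical; only the paths and module names in the examples come from this document.

```python
from pathlib import PurePosixPath

# Root directory and target package as quoted in the text above.
SQL_FUNCTIONS_ROOT = PurePosixPath("packages/bigframes/scripts/data/sql-functions")
OPERATIONS_PACKAGE = "bigframes.bigquery._operations"


def module_name_for_yaml(yaml_path: str) -> str:
    """Map a YAML file path to its dotted module name (illustrative sketch)."""
    relative = PurePosixPath(yaml_path).relative_to(SQL_FUNCTIONS_ROOT)
    # Drop the ".yaml" suffix, then join the remaining path parts with dots.
    parts = relative.with_suffix("").parts
    return ".".join((OPERATIONS_PACKAGE, *parts))
```

Under these assumptions, `packages/bigframes/scripts/data/sql-functions/builtins/bit.yaml` maps to `bigframes.bigquery._operations.builtins.bit`.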
| 28 | + |
| 29 | +## Generated module implementation |
| 30 | + |
| 31 | +Each generated module contains a Python definition for every function defined |
| 32 | +in the YAML file, including keyword arguments and docstrings. |
| 33 | + |
| 34 | +### Handling optional arguments |
| 35 | + |
| 36 | +When the user calls a Python function without specifying an optional |
| 37 | +argument, that argument is omitted from the SQL text. So that callers can still |
| 38 | +pass an explicit NULL (`None` in Python), the default value is the sentinel |
| 39 | +enum value `bigframes.core.sentinels.DEFAULT` rather than `None`. For |
| 40 | +example: |
| 41 | + |
| 42 | +```python |
| 43 | +import bigframes.core.sentinels |
| 44 | + |
| 45 | +def current_date( |
| 46 | +    time_zone_expression: str | bigframes.core.sentinels.Default = bigframes.core.sentinels.DEFAULT, |
| 47 | +): |
| 48 | +    ... |
| 49 | +``` |
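To illustrate why a sentinel is needed, here is a self-contained sketch that distinguishes "argument omitted" from "explicit NULL". The `Default` enum below is a local stand-in, not the real `bigframes.core.sentinels` class, and `render_current_date` is a hypothetical helper that renders SQL text directly, which is a simplification of what the generated code does.

```python
import enum
from typing import Union


class Default(enum.Enum):
    """Local stand-in for the sentinel type (assumption, not the real class)."""

    DEFAULT = enum.auto()


DEFAULT = Default.DEFAULT


def render_current_date(
    time_zone_expression: Union[str, None, Default] = DEFAULT,
) -> str:
    """Render a CURRENT_DATE call as SQL text (illustrative only)."""
    if time_zone_expression is DEFAULT:
        # Argument not supplied: omit it from the SQL text entirely.
        return "CURRENT_DATE()"
    if time_zone_expression is None:
        # Explicit None from the caller: pass NULL through to SQL.
        return "CURRENT_DATE(NULL)"
    # Escape backslashes and single quotes for a single-quoted SQL literal.
    escaped = time_zone_expression.replace("\\", "\\\\").replace("'", "\\'")
    return f"CURRENT_DATE('{escaped}')"
```

With `None` as the default, the first two branches could not be told apart, which is the problem the sentinel solves.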
| 50 | + |
| 51 | +### Input and output types |
| 52 | + |
| 53 | +Refer to the table in |
| 54 | +[Contributing to bigframes.bigquery](./bigframes-bigquery-contributing.md). |
| 55 | + |
| 56 | +### Internal bigframes operator |
| 57 | + |
| 58 | +Scalar functions should generate an expression using `GoogleSqlScalarOp`. |
| 59 | +This keeps the implementation of scalar SQL functions consistent. |
| 60 | + |
| 61 | +Aggregate, analytic, and table-valued functions currently require custom ops, |
| 62 | +so they are out of scope for this generator for now. |