
Commit c6b94fa

create a spec for code generation
1 parent 4c94ec3 commit c6b94fa

3 files changed

Lines changed: 98 additions & 3 deletions


.pre-commit-config.yaml

Lines changed: 3 additions & 3 deletions
@@ -16,16 +16,16 @@
 # See https://pre-commit.com/hooks.html for more hooks
 repos:
 - repo: https://github.com/pre-commit/pre-commit-hooks
-  rev: v4.0.1
+  rev: v6.0.0
   hooks:
   - id: trailing-whitespace
   - id: end-of-file-fixer
   - id: check-yaml
 - repo: https://github.com/psf/black
-  rev: 22.3.0
+  rev: 23.7.0
   hooks:
   - id: black
 - repo: https://github.com/pycqa/flake8
-  rev: 3.9.2 # version-scanner: ignore
+  rev: 6.1.0 # version-scanner: ignore
   hooks:
   - id: flake8
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Copyright 2026 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Sentinel values used throughout BigFrames."""

from __future__ import annotations

from enum import Enum


class Default(Enum):
    """Default values used throughout BigFrames.

    When a parameter is set to this, that parameter is explicitly omitted
    from the SQL text. This allows for NULL (None in Python) to be explicitly
    passed in to optional parameters.
    """

    token = 0


DEFAULT = Default.token
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
# Code generation for bigframes.bigquery

This document describes code generation for the `bigframes.bigquery` modules.
For detailed specifications on input and output types, refer to
[Contributing to bigframes.bigquery](./bigframes-bigquery-contributing.md).

## Overview

The script at `packages/bigframes/scripts/generate_bigframes_bigquery.py`
generates Python submodules for the `bigframes.bigquery` module. When run
without any arguments, it iterates through all YAML files at
`packages/bigframes/scripts/data/sql-functions/**/*.yaml` to generate the code.

The script at `packages/bigframes/scripts/check_bigframes_bigquery.py` iterates
through the same YAML files and checks that the functions have been included
in the `bigframes.bigquery` module, as the `__init__.py` file requires manual
updates.
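
A check of this kind can be sketched as a comparison between the expected function names and the names a module exposes. This is only an illustration: `missing_exports` is a hypothetical helper, the function names are assumed to have already been read from the YAML files (their schema is not described here), and the real `check_bigframes_bigquery.py` may work differently.

```python
from types import SimpleNamespace
from typing import Iterable, List


def missing_exports(expected_names: Iterable[str], module: object) -> List[str]:
    """Return the expected names that `module` does not expose, sorted."""
    return sorted(name for name in expected_names if not hasattr(module, name))


# Stand-in for the `bigframes.bigquery` module (hypothetical contents).
fake_module = SimpleNamespace(current_date=lambda: None)

# Names assumed to come from the YAML files.
result = missing_exports(["current_date", "bit_count"], fake_module)
print(result)
```

A non-empty result would indicate functions that still need to be added to `__init__.py` by hand.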

## Generated code organization

The `generate_bigframes_bigquery.py` script generates submodules of
`bigframes.bigquery._operations`, with the full path reflecting the organization
of the YAML files. For example, a YAML file at
`packages/bigframes/scripts/data/sql-functions/aead.yaml` corresponds to a
generated Python module at `bigframes.bigquery._operations.aead`. Likewise,
`packages/bigframes/scripts/data/sql-functions/builtins/bit.yaml` corresponds
to the `bigframes.bigquery._operations.builtins.bit` submodule.
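
The path-to-module mapping described above can be sketched as follows. The helper `module_name_for` is hypothetical and is not taken from the generator script; only the root directory and the module prefix come from this document.

```python
from pathlib import PurePosixPath

# Both constants are taken from the text above.
YAML_ROOT = PurePosixPath("packages/bigframes/scripts/data/sql-functions")
MODULE_PREFIX = "bigframes.bigquery._operations"


def module_name_for(yaml_path: str) -> str:
    """Map a YAML file path under YAML_ROOT to a dotted module name."""
    # relative_to raises ValueError if the path is not under YAML_ROOT.
    relative = PurePosixPath(yaml_path).relative_to(YAML_ROOT)
    # Drop the ".yaml" suffix, then join the remaining path parts with dots.
    parts = relative.with_suffix("").parts
    return ".".join((MODULE_PREFIX, *parts))


print(module_name_for("packages/bigframes/scripts/data/sql-functions/aead.yaml"))
print(module_name_for("packages/bigframes/scripts/data/sql-functions/builtins/bit.yaml"))
```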

## Generated module implementation

Each generated module has all functions defined in the YAML file converted to
the equivalent Python definition, including keyword arguments and docstrings.

### Handling optional arguments

When the user calls a Python function without specifying an optional
argument, that argument is omitted from the SQL text. To allow explicit
NULL values (None in Python) to be passed in, the default value of each
optional argument is the sentinel enum value
`bigframes.core.sentinels.DEFAULT`. For example:

```python
import bigframes.core.sentinels

def current_date(
    time_zone_expression: str | bigframes.core.sentinels.Default = bigframes.core.sentinels.DEFAULT,
):
    ...
```
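
To illustrate why the sentinel is distinct from `None`, here is a minimal, self-contained sketch. It defines a local stand-in for the `Default` enum rather than importing `bigframes`, and `current_date_sql` is a hypothetical helper that returns SQL text directly; the generated code is expected to build an expression instead, so treat this only as a picture of the three cases.

```python
from __future__ import annotations

from enum import Enum


class Default(Enum):
    """Local stand-in mirroring the sentinel enum in bigframes.core.sentinels."""

    token = 0


DEFAULT = Default.token


def current_date_sql(time_zone_expression: str | None | Default = DEFAULT) -> str:
    """Build SQL text for CURRENT_DATE (hypothetical helper, illustration only)."""
    if time_zone_expression is DEFAULT:
        # Argument not supplied: omit it from the SQL text entirely.
        return "CURRENT_DATE()"
    if time_zone_expression is None:
        # Explicit None: pass SQL NULL as the argument.
        return "CURRENT_DATE(NULL)"
    # Naive quoting for illustration; real code would need proper escaping.
    return f"CURRENT_DATE('{time_zone_expression}')"


print(current_date_sql())
print(current_date_sql(None))
print(current_date_sql("America/Los_Angeles"))
```

With a plain `None` default, the first two calls would be indistinguishable, which is the situation the sentinel avoids.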

### Input and output types

Refer to the table in
[Contributing to bigframes.bigquery](./bigframes-bigquery-contributing.md).

### Internal bigframes operator

Scalar functions should generate an expression using the `GoogleSqlScalarOp`.
This keeps the implementation of scalar SQL functions consistent.

Aggregate, analytic, and table-valued functions currently require custom ops. As
such, those functions are currently out of scope for this generator.
