Commit 9d0610e

External models default yaml (#2717)
1 parent c13626f commit 9d0610e

14 files changed, +53 -43 lines changed

docs/concepts/models/external_models.md

Lines changed: 13 additions & 13 deletions
@@ -10,15 +10,15 @@ SQLMesh stores external tables' column information as `EXTERNAL` models.

 `EXTERNAL` models consist solely of an external table's column information, so there is no query for SQLMesh to run.

-SQLMesh has no information about the data contained in the table represented by an `EXTERNAL` model. The table could be altered or have all its data deleted, and SQLMesh will not detect it. All SQLMesh knows about the table is that it contains the columns specified in the `EXTERNAL` model's `schema.yaml` file (more information below).
+SQLMesh has no information about the data contained in the table represented by an `EXTERNAL` model. The table could be altered or have all its data deleted, and SQLMesh will not detect it. All SQLMesh knows about the table is that it contains the columns specified in the `EXTERNAL` model's file (more information below).

 SQLMesh will not take any actions based on an `EXTERNAL` model - its actions are solely determined by the model whose query selects from the `EXTERNAL` model.

 The querying model's [`kind`](./model_kinds.md), [`cron`](./overview.md#cron), and previously loaded time intervals determine when SQLMesh will query the `EXTERNAL` model.

 ## Generating an external models schema file

-External models can be defined in the `schema.yaml` file in the SQLMesh project's root folder.
+External models can be defined in the `external_models.yaml` file in the SQLMesh project's root folder. The alternative name for this file is `schema.yaml`.

 You can create this file by either writing the YAML by hand or allowing SQLMesh to fetch information about external tables with the `create_external_models` CLI command.

@@ -38,13 +38,13 @@ FROM

 The following sections demonstrate how to create an external model containing `external_db.external_table`'s column information.

-All of a SQLMesh project's external models are defined in a single `schema.yaml` file, so the files created below might also include column information for other external models.
+All of a SQLMesh project's external models are defined in a single `external_models.yaml` file, so the files created below might also include column information for other external models.

 Alternatively, additional external models can also be defined in the [external_models/](#using-the-external_models-directory) folder.

 ### Writing YAML by hand

-This example demonstrates the structure of a `schema.yaml` file:
+This example demonstrates the structure of a `external_models.yaml` file:

 ```yaml
 - name: external_db.external_table
@@ -65,9 +65,9 @@ The file can be constructed by hand using a standard text editor or IDE.

 ### Using CLI

-Instead of creating the `schema.yaml` file manually, SQLMesh can generate it for you with the [create_external_models](../../reference/cli.md#create_external_models) CLI command.
+Instead of creating the `external_models.yaml` file manually, SQLMesh can generate it for you with the [create_external_models](../../reference/cli.md#create_external_models) CLI command.

-The command identifies all external tables referenced in your SQLMesh project, fetches their column information from the SQL engine's metadata, and then stores the information in the `schema.yaml` file.
+The command identifies all external tables referenced in your SQLMesh project, fetches their column information from the SQL engine's metadata, and then stores the information in the `external_models.yaml` file.

 If SQLMesh does not have access to an external table's metadata, the table will be omitted from the file and SQLMesh will issue a warning.

@@ -77,18 +77,18 @@ If SQLMesh does not have access to an external table's metadata, the table will

 Sometimes, SQLMesh cannot infer the structure of a model and you need to add it manually.

-However, since `sqlmesh create_external_models` replaces the `schema.yaml` file, any manual changes you made to that file will be overwritten.
+However, since `sqlmesh create_external_models` replaces the `external_models.yaml` file, any manual changes you made to that file will be overwritten.

 The solution is to create the manual model definition files in the `external_models/` directory, like so:

 ```
-schema.yaml
-external_models/another_schema.yaml
-external_models/yet_another_schema.yaml
+external_models.yaml
+external_models/more_external_models.yaml
+external_models/even_more_external_models.yaml
 ```

-Files in the `external_models` directory must be `.yaml` files that follow the same structure as the `schema.yaml` file.
+Files in the `external_models` directory must be `.yaml` files that follow the same structure as the `external_models.yaml` file.

-When SQLMesh loads the definitions, it will first load the models defined in `schema.yaml` followed by any models it can find in `external_models/*.yaml`.
+When SQLMesh loads the definitions, it will first load the models defined in `external_models.yaml` (or `schema.yaml`) and any models found in `external_models/*.yaml`.

-Therefore, you can use `sqlmesh create_external_models` to manage the `schema.yaml` file and then put any models that need to be defined manually inside the `external_models/` directory.
+Therefore, you can use `sqlmesh create_external_models` to manage the `external_models.yaml` file and then put any models that need to be defined manually inside the `external_models/` directory.

docs/concepts/tests.md

Lines changed: 1 addition & 1 deletion
@@ -474,7 +474,7 @@ It's not always possible to correctly interpret certain values in a unit test wi

 To avoid this ambiguity, SQLMesh needs to know the columns' types. By default, it will try to infer these types based on the model definitions, but they can also be explicitly specified:

-- in the [`schema.yaml`](models/external_models.md#generating-an-external-models-schema-file) file (for external models)
+- in the [`external_models.yaml`](models/external_models.md#generating-an-external-models-schema-file) file (for external models)
 - using the [`columns`](models/overview.md#columns) model property
 - using the [`columns`](#creating_tests) attribute of the unit test

docs/guides/table_migration.md

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ Consider an existing table named `my_schema.existing_table`. Migrating this tabl

 1. Ensure `my_schema.existing_table` is up to date (has ingested all available source data)
 2. Rename `my_schema.existing_table` to any other name, such as `my_schema.existing_table_historical`
-    - Optionally, enable column-level lineage for the table by making it an [`EXTERNAL` model](../concepts/models/model_kinds.md#external) and adding it to the project's `schema.yaml` file
+    - Optionally, enable column-level lineage for the table by making it an [`EXTERNAL` model](../concepts/models/model_kinds.md#external) and adding it to the project's `external_models.yaml` file
 3. Create a new incremental staging model named `my_schema.existing_table_staging` (see below for code)
 4. Create a new [`VIEW` model](../concepts/models/model_kinds.md#view) named `my_schema.existing_table` (see below for code)
 5. Run `sqlmesh plan` to create and backfill the models

sqlmesh/core/constants.py

Lines changed: 4 additions & 3 deletions
@@ -40,15 +40,16 @@
 """The default directory for log files."""

 AUDITS = "audits"
+CACHE = ".cache"
+EXTERNAL_MODELS = "external_models"
 MACROS = "macros"
 METRICS = "metrics"
 MODELS = "models"
-EXTERNAL_MODELS = "external_models"
 SEEDS = "seeds"
 TESTS = "tests"
-CACHE = ".cache"
-SCHEMA_YAML = "schema.yaml"

+EXTERNAL_MODELS_YAML = "external_models.yaml"
+EXTERNAL_MODELS_DEPRECATED_YAML = "schema.yaml"

 DEFAULT_SCHEMA = "default"

sqlmesh/core/context.py

Lines changed: 11 additions & 6 deletions
@@ -81,7 +81,7 @@
 from sqlmesh.core.plan import Plan, PlanBuilder
 from sqlmesh.core.reference import ReferenceGraph
 from sqlmesh.core.scheduler import Scheduler
-from sqlmesh.core.schema_loader import create_schema_file
+from sqlmesh.core.schema_loader import create_external_models_file
 from sqlmesh.core.selector import Selector
 from sqlmesh.core.snapshot import (
     DeployabilityIndex,
@@ -1598,17 +1598,22 @@ def rollback(self) -> None:

     @python_api_analytics
     def create_external_models(self) -> None:
-        """Create a schema file with all external models.
+        """Create a file to document the schema of external models.

-        The schema file contains all columns and types of external models, allowing for more robust
-        lineage, validation, and optimizations.
+        The external models file contains all columns and types of external models, allowing for more
+        robust lineage, validation, and optimizations.
         """
         if not self._models:
             self.load(update_schemas=False)

         for path, config in self.configs.items():
-            create_schema_file(
-                path=path / c.SCHEMA_YAML,
+            deprecated_yaml = path / c.EXTERNAL_MODELS_DEPRECATED_YAML
+
+            external_models_yaml = (
+                path / c.EXTERNAL_MODELS_YAML if not deprecated_yaml.exists() else deprecated_yaml
+            )
+            create_external_models_file(
+                path=external_models_yaml,
                 models=UniqueKeyDict(
                     "models",
                     {
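
The target-file choice in `create_external_models` can be read in isolation. The following is a minimal sketch, not SQLMesh code: `external_models_file_to_write` is a hypothetical helper name, and the two file names are copied from the constants changed in this commit. It illustrates that an existing `schema.yaml` keeps being rewritten in place, while a project with no `schema.yaml` gets `external_models.yaml`.

```python
from pathlib import Path

# File names copied from the constants in this commit.
EXTERNAL_MODELS_YAML = "external_models.yaml"
EXTERNAL_MODELS_DEPRECATED_YAML = "schema.yaml"


def external_models_file_to_write(project_root: Path) -> Path:
    """Hypothetical helper: pick the file the generator would create or replace."""
    deprecated_yaml = project_root / EXTERNAL_MODELS_DEPRECATED_YAML
    # An existing deprecated file wins, so older projects are not silently
    # left with two competing files after regenerating.
    if deprecated_yaml.exists():
        return deprecated_yaml
    return project_root / EXTERNAL_MODELS_YAML
```

Going by the diff alone, this write-side rule differs from the load-side rule in `sqlmesh/core/loader.py`, which prefers `external_models.yaml` when it exists. A project that contains both files would therefore appear to regenerate `schema.yaml` while loading `external_models.yaml`.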

sqlmesh/core/loader.py

Lines changed: 6 additions & 3 deletions
@@ -182,12 +182,15 @@ def _load_metrics(self) -> UniqueKeyDict[str, MetricMeta]:
     def _load_external_models(self) -> UniqueKeyDict[str, Model]:
         models: UniqueKeyDict[str, Model] = UniqueKeyDict("models")
         for context_path, config in self._context.configs.items():
-            schema_path = Path(context_path / c.SCHEMA_YAML)
+            external_models_yaml = Path(context_path / c.EXTERNAL_MODELS_YAML)
+            deprecated_yaml = Path(context_path / c.EXTERNAL_MODELS_DEPRECATED_YAML)
             external_models_path = context_path / c.EXTERNAL_MODELS

             paths_to_load = []
-            if schema_path.exists():
-                paths_to_load.append(schema_path)
+            if external_models_yaml.exists():
+                paths_to_load.append(external_models_yaml)
+            elif deprecated_yaml.exists():
+                paths_to_load.append(deprecated_yaml)

             if external_models_path.exists() and external_models_path.is_dir():
                 paths_to_load.extend(external_models_path.glob("*.yaml"))
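
The lookup order in this hunk can be exercised outside of SQLMesh. The sketch below is a standalone approximation under stated assumptions: `external_model_paths_to_load` is a hypothetical function name, the constants are copied from this commit, and the `sorted` call is an addition for deterministic output that the hunk itself does not make.

```python
from pathlib import Path
from typing import List

# Names copied from the constants in this commit.
EXTERNAL_MODELS = "external_models"
EXTERNAL_MODELS_YAML = "external_models.yaml"
EXTERNAL_MODELS_DEPRECATED_YAML = "schema.yaml"


def external_model_paths_to_load(project_root: Path) -> List[Path]:
    """Hypothetical helper: list the YAML files that would be loaded, in order."""
    paths: List[Path] = []

    external_models_yaml = project_root / EXTERNAL_MODELS_YAML
    deprecated_yaml = project_root / EXTERNAL_MODELS_DEPRECATED_YAML

    # Only one root-level file is loaded: the new name takes precedence,
    # and the deprecated name is a fallback.
    if external_models_yaml.exists():
        paths.append(external_models_yaml)
    elif deprecated_yaml.exists():
        paths.append(deprecated_yaml)

    # Every *.yaml file in the external_models/ directory is loaded as well.
    external_models_dir = project_root / EXTERNAL_MODELS
    if external_models_dir.exists() and external_models_dir.is_dir():
        # sorted() is an assumption of this sketch, not part of the hunk.
        paths.extend(sorted(external_models_dir.glob("*.yaml")))

    return paths
```

Because of the `elif`, a root-level `schema.yaml` is ignored as soon as `external_models.yaml` exists, so the two names are alternatives rather than files that get merged.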

sqlmesh/core/schema_loader.py

Lines changed: 2 additions & 2 deletions
@@ -16,15 +16,15 @@
 logger = logging.getLogger(__name__)


-def create_schema_file(
+def create_external_models_file(
     path: Path,
     models: UniqueKeyDict[str, Model],
     adapter: EngineAdapter,
     state_reader: StateReader,
     dialect: DialectType,
     max_workers: int = 1,
 ) -> None:
-    """Create or replace a YAML file with model schemas.
+    """Create or replace a YAML file with column and types of all columns in all external models.

     Args:
         path: The path to store the YAML file.

sqlmesh/core/test/definition.py

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ def setUp(self) -> None:
             _raise_error(
                 f"Failed to infer the data type of column '{col}' for '{name}'. This issue can be "
                 "mitigated by casting the column in the model definition, setting its type in "
-                "schema.yaml if it's an external model, setting the model's 'columns' property, "
+                "external_models.yaml if it's an external model, setting the model's 'columns' property, "
                 "or setting its 'columns' mapping in the test itself",
                 self.path,
             )

tests/core/test_context.py

Lines changed: 1 addition & 1 deletion
@@ -668,7 +668,7 @@ def test_load_external_models(copy_to_temp_path):

     assert len(external_model_names) > 0

-    # from schema.yaml in root dir
+    # from default external_models.yaml in root dir
     assert "raw.demographics" in external_model_names

     # from external_models/model1.yaml
