feat(catalog): add SQL catalog #693
Add a relational-database-backed SqlCatalog with a CatalogStore abstraction, built-in sqlpp23 stores for SQLite, PostgreSQL, and MySQL, and Java-compatible catalog tables.
| shell: pwsh | ||
| run: | | ||
| vcpkg install zlib:x64-windows nlohmann-json:x64-windows nanoarrow:x64-windows roaring:x64-windows cpr:x64-windows | ||
| vcpkg install zlib:x64-windows nlohmann-json:x64-windows nanoarrow:x64-windows roaring:x64-windows cpr:x64-windows sqlite3:x64-windows |
Would it be better to add a dedicated CI workflow for the SQL catalog? We could trigger it only when files related to the SQL catalog change, to reduce resource usage.
Agreed. I've added a dedicated CI workflow, but I haven't configured path filtering for SQL catalog-related changes yet. Defining a clean filter is a bit tricky, so I'd rather address that in a separate PR.
| option(ICEBERG_BUILD_REST "Build rest catalog client" ON) | ||
| option(ICEBERG_BUILD_REST_INTEGRATION_TESTS "Build rest catalog integration tests" OFF) | ||
| option(ICEBERG_BUILD_HIVE "Build hive (HMS) catalog client" OFF) | ||
| option(ICEBERG_BUILD_SQL_CATALOG "Build SQL catalog client" ON) |
Should we make it OFF by default? That would follow the pattern used by the REST catalog library.
The REST catalog is ON by default, but I agree that the SQL catalog should be OFF by default. Changed.
| const TableIdentifier& identifier, | ||
| const std::string& metadata_file_location) override; | ||
| #ifdef BUILD_SQLITE3_CONNECTOR |
Would it be better to always make these functions visible to users? We can check BUILD_SQLITE3_CONNECTOR and the other macros in sql_catalog.cc and return NotSupported if the corresponding connector was not built.
Agreed, done that way.
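For readers following this thread, here is a minimal sketch of the pattern discussed above: the declaration stays visible unconditionally, and the macro check moves into the implementation. The `Status` type, the `NotSupported` helper, and the `MakeSqliteCatalogStore` function are hypothetical stand-ins for illustration, not the PR's actual API. Only the `BUILD_SQLITE3_CONNECTOR` macro name comes from the diff.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-ins for the project's error-reporting types.
enum class ErrorKind { kOk, kNotSupported };

struct Status {
  ErrorKind kind;
  std::string message;
  bool ok() const { return kind == ErrorKind::kOk; }
};

// Hypothetical helper that builds a NotSupported status.
inline Status NotSupported(std::string msg) {
  return Status{ErrorKind::kNotSupported, std::move(msg)};
}

// Header side: declared unconditionally, so user code compiles
// regardless of which connectors were enabled at build time.
Status MakeSqliteCatalogStore(const std::string& uri);

// Source side: the macro check lives in the implementation file only.
Status MakeSqliteCatalogStore(const std::string& uri) {
#ifdef BUILD_SQLITE3_CONNECTOR
  // A real implementation would open the SQLite-backed store here.
  (void)uri;
  return Status{ErrorKind::kOk, ""};
#else
  // Connector not compiled in: report the failure at run time.
  (void)uri;
  return NotSupported("SQLite connector was not enabled in this build");
#endif
}
```

With this shape, callers get a runtime error they can handle instead of a compile or link failure when a connector is missing, at the cost of discovering the missing connector later.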
| mkdir build && cd build | ||
| cmake .. -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON | ||
| cmake .. -G Ninja -DCMAKE_EXPORT_COMPILE_COMMANDS=ON \ | ||
| -DICEBERG_BUILD_SQL_CATALOG=ON \ |
It would be nice to enhance this in the future by enabling specific modules only when their paths are affected.