[management] migration: convert SQLite-imported bool columns to native Postgres boolean (fixes #4326)#6109
Conversation
…e Postgres boolean
When a self-hosted netbird management server is migrated from SQLite to
Postgres with external tooling (e.g. pgloader), SQLite's INTEGER 0/1
representation of bool fields ends up as Postgres `numeric` rather than
`boolean`. The Go struct fields are typed `bool`, so the very first
attempt to update any account setting from the dashboard fails with:
failed to encode args[N]: unable to encode true into binary format
for numeric (OID 1700): cannot find encode plan
The current migration registry has no step that fixes column types,
which leaves affected operators with no in-process recovery path —
they have to apply manual ALTER TABLE statements (see the user-supplied
SQL in the issue). Multiple users have hit this on different settings
fields, so a per-column manual fix is not a viable answer.
This commit adds a generic, gorm-metadata-driven migration that walks
every bool-typed field on a given model, queries
information_schema.columns for the actual SQL type, and runs an
idempotent ALTER COLUMN ... TYPE boolean USING (COALESCE(col::numeric, 0) <> 0)
when the column is anything other than boolean. NULL → false, 0 → false,
1 → true. NOT NULL + DEFAULT false are re-applied so the post-migration
schema matches what gorm.AutoMigrate would have produced on a fresh
install.
The migration runs in the pre-AutoMigrate phase (so AutoMigrate's own
type-mismatch check does not trip on still-numeric columns) and is
registered for `types.Account`, which covers the embedded Settings and
ExtraSettings fields where the user-reported failure surfaced. It is a
strict no-op on non-Postgres engines (SQLite stores bools natively as
INTEGER, MySQL as TINYINT(1), neither needs conversion).
Tests against a real `postgres:16-alpine` testcontainer:
- NoOpOnSqlite — function returns nil, schema/data untouched on SQLite
- TableMissingIsNoOp — function returns nil when the table is absent
- AlreadyBooleanIsNoOp — function returns nil when columns are already
boolean (idempotency for fresh installs and second-run scenarios)
- ConvertsNumericColumnsAndPreservesData — core regression test for
issue netbirdio#4326. Synthesises the broken-schema state (numeric columns
where bool is expected), seeds NULL/0/1 rows, runs the migration,
asserts (a) data_type is now 'boolean' for every affected column,
(b) non-bool columns are untouched, (c) existing rows have correctly
converted values, (d) the user-reported failure mode (writing Go
bool=true via gorm) succeeds afterwards.
- IsIdempotent — running the migration twice in a row is safe.
All five tests pass in CI's existing postgres-container scaffolding.
Closes netbirdio#4326
📝 WalkthroughWalkthroughThis PR resolves a SQLite-to-PostgreSQL migration defect where boolean columns were incorrectly stored as numeric types. It adds ChangesPostgreSQL Boolean Column Conversion
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
management/server/migration/migration.go (1)
717-732: 💤 Low valueHardcoded
DEFAULT falseignoresgorm:"default:..."tags on the model.The migration unconditionally re-applies
DEFAULT falseandNOT NULLregardless of what the GORM tag actually declares. For all currently-affected fields ontypes.Account(which usedefault:false) this is fine, but if a future bool field is declaredgorm:"default:true", this helper would silently overwrite that default after a SQLite→Postgres conversion. For more robust fidelity withAutoMigrate, derive the default fromfield.HasDefaultValue/field.DefaultValueInterfaceinstead of hardcodingfalse.Also note that the
UPDATE ... WHERE %q IS NULLat line 729 is effectively a no-op given theCOALESCE(...::numeric, 0) <> 0USING expression already maps NULL→false during the type change — it's harmless but can be dropped if you tighten this up.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@management/server/migration/migration.go` around lines 717 - 732, The ALTER sequence hardcodes "DEFAULT false" and a NULL-fill UPDATE, which can overwrite model defaults; change the logic that builds the alters slice (around usingExpr, columnName, tableName) to determine the intended default from the GORM field metadata (use field.HasDefaultValue and field.DefaultValueInterface) and apply `ALTER TABLE ... SET DEFAULT <derived-default>` instead of `false`, and drop the redundant `UPDATE ... SET ... WHERE ... IS NULL` because the USING expression already maps NULLs; ensure the ALTER that sets NOT NULL is only added when the field actually requires NOT NULL per the model.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@management/server/store/store.go`:
- Around line 460-468: Update the migratePostAuto expansion to remove
types.Group (it has no bool fields) and instead include the eight types that do
require the Postgres bool-fix: nbpeer.Peer, types.SetupKey, types.User,
types.Policy, route.Route, routerTypes.NetworkRouter,
resourceTypes.NetworkResource, and domain.Domain; locate the slice/array where
migration.FixPostgresBoolColumns[types.Account] is being appended/invoked and
add corresponding migration.FixPostgresBoolColumns[T] entries for each of those
eight types while ensuring types.Group is not included.
---
Nitpick comments:
In `@management/server/migration/migration.go`:
- Around line 717-732: The ALTER sequence hardcodes "DEFAULT false" and a
NULL-fill UPDATE, which can overwrite model defaults; change the logic that
builds the alters slice (around usingExpr, columnName, tableName) to determine
the intended default from the GORM field metadata (use field.HasDefaultValue and
field.DefaultValueInterface) and apply `ALTER TABLE ... SET DEFAULT
<derived-default>` instead of `false`, and drop the redundant `UPDATE ... SET
... WHERE ... IS NULL` because the USING expression already maps NULLs; ensure
the ALTER that sets NOT NULL is only added when the field actually requires NOT
NULL per the model.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9f6788cb-0651-42d2-b374-d2f3e2a603d9
📒 Files selected for processing (3)
management/server/migration/migration.gomanagement/server/migration/migration_test.gomanagement/server/store/store.go
| // Issue #4326: SQLite -> Postgres migrations done with external | ||
| // tooling (e.g. pgloader) preserve SQLite's INTEGER 0/1 bool | ||
| // representation as Postgres `numeric`, which then breaks any | ||
| // settings update with "cannot find encode plan". Convert all | ||
| // gorm-mapped bool columns on `accounts` to native `boolean` | ||
| // before AutoMigrate runs (no-op on non-Postgres engines). | ||
| func(db *gorm.DB) error { | ||
| return migration.FixPostgresBoolColumns[types.Account](ctx, db) | ||
| }, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# List Go struct fields whose type is `bool` in management/server, with
# their containing struct name — useful to enumerate models that should
# also be passed to FixPostgresBoolColumns.
rg -nP --type=go -g 'management/server/**' -B 6 '^\s*\w+\s+bool\b' | \
awk '
/^[^-]+\.go-/ && /type [A-Z]\w+ struct/ { struct=$0 }
/:\s*\w+\s+bool\b/ { print struct; print " →", $0; print "" }
'Repository: netbirdio/netbird
Length of output: 43
🏁 Script executed:
# First, find the FixPostgresBoolColumns function to understand its signature
rg -n "FixPostgresBoolColumns" --type=go management/server/Repository: netbirdio/netbird
Length of output: 2354
🏁 Script executed:
# Search for struct definitions of the mentioned types to see their bool fields
rg -n "type Account struct" --type=go management/server/ -A 20Repository: netbirdio/netbird
Length of output: 2198
🏁 Script executed:
# Check what types are already in AutoMigrate
rg -n "AutoMigrate" --type=go management/server/store/ -B 5 -A 15Repository: netbirdio/netbird
Length of output: 4877
🏁 Script executed:
# Try a simpler approach to find bool fields
rg -n "\s+\w+\s+bool" --type=go management/server/store/ | head -30Repository: netbirdio/netbird
Length of output: 3334
🏁 Script executed:
# Find SetupKey struct with bool fields
rg -n "type SetupKey struct" --type=go management/server/ -A 30Repository: netbirdio/netbird
Length of output: 2478
🏁 Script executed:
# Find Peer struct with bool fields
rg -n "type Peer struct" --type=go management/server/ -A 30Repository: netbirdio/netbird
Length of output: 2650
🏁 Script executed:
# Find User struct with bool fields
rg -n "type User struct" --type=go management/server/ -A 30Repository: netbirdio/netbird
Length of output: 2469
🏁 Script executed:
# Find Group struct with bool fields
rg -n "type Group struct" --type=go management/server/ -A 20Repository: netbirdio/netbird
Length of output: 1666
🏁 Script executed:
# Find Policy struct with bool fields
rg -n "type Policy struct" --type=go management/server/ -A 20Repository: netbirdio/netbird
Length of output: 1433
🏁 Script executed:
# Find Route struct with bool fields
rg -n "type Route struct" --type=go management/server/ -A 20Repository: netbirdio/netbird
Length of output: 43
🏁 Script executed:
# Find NetworkRouter struct with bool fields
rg -n "type NetworkRouter struct" --type=go management/server/ -A 20Repository: netbirdio/netbird
Length of output: 1832
🏁 Script executed:
# Find NetworkResource struct with bool fields
rg -n "type NetworkResource struct" --type=go management/server/ -A 20Repository: netbirdio/netbird
Length of output: 1955
🏁 Script executed:
# Find Domain struct with bool fields
rg -n "type Domain struct" --type=go management/server/ -A 20Repository: netbirdio/netbird
Length of output: 43
🏁 Script executed:
# Find Route struct - search more broadly
rg -n "type Route struct" --type=go -A 25Repository: netbirdio/netbird
Length of output: 14158
🏁 Script executed:
# Find Domain struct - search more broadly
rg -n "type Domain struct" --type=go -A 25Repository: netbirdio/netbird
Length of output: 2763
🏁 Script executed:
# Get complete Peer struct definition
rg -n "type Peer struct" --type=go management/server/peer/ -A 60Repository: netbirdio/netbird
Length of output: 4678
🏁 Script executed:
# Check imports and the full AutoMigrate call around line 131-138
sed -n '1,150p' management/server/store/sql_store.go | head -150Repository: netbirdio/netbird
Length of output: 5345
🏁 Script executed:
# Verify Group struct has no bool fields by checking the entire file
cat management/server/types/group.goRepository: netbirdio/netbird
Length of output: 4214
Remove types.Group from the expansion; it has no bool columns.
The suggestion to expand the bool-column fix to additional tables is sound, but one type should be excluded: types.Group contains no bool fields and therefore doesn't need the migration. The remaining eight types—nbpeer.Peer, types.SetupKey, types.User, types.Policy, route.Route, routerTypes.NetworkRouter, resourceTypes.NetworkResource, and domain.Domain—all have persisted bool columns affected by the pgloader numeric conversion issue and should be added to migratePostAuto.
📋 Corrected expansion
func(db *gorm.DB) error {
return migration.FixPostgresBoolColumns[types.Account](ctx, db)
},
+ func(db *gorm.DB) error {
+ return migration.FixPostgresBoolColumns[nbpeer.Peer](ctx, db)
+ },
+ func(db *gorm.DB) error {
+ return migration.FixPostgresBoolColumns[types.SetupKey](ctx, db)
+ },
+ func(db *gorm.DB) error {
+ return migration.FixPostgresBoolColumns[types.User](ctx, db)
+ },
+ func(db *gorm.DB) error {
+ return migration.FixPostgresBoolColumns[types.Policy](ctx, db)
+ },
+ func(db *gorm.DB) error {
+ return migration.FixPostgresBoolColumns[route.Route](ctx, db)
+ },
+ func(db *gorm.DB) error {
+ return migration.FixPostgresBoolColumns[routerTypes.NetworkRouter](ctx, db)
+ },
+ func(db *gorm.DB) error {
+ return migration.FixPostgresBoolColumns[resourceTypes.NetworkResource](ctx, db)
+ },
+ func(db *gorm.DB) error {
+ return migration.FixPostgresBoolColumns[domain.Domain](ctx, db)
+ },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@management/server/store/store.go` around lines 460 - 468, Update the
migratePostAuto expansion to remove types.Group (it has no bool fields) and
instead include the eight types that do require the Postgres bool-fix:
nbpeer.Peer, types.SetupKey, types.User, types.Policy, route.Route,
routerTypes.NetworkRouter, resourceTypes.NetworkResource, and domain.Domain;
locate the slice/array where migration.FixPostgresBoolColumns[types.Account] is
being appended/invoked and add corresponding migration.FixPostgresBoolColumns[T]
entries for each of those eight types while ensuring types.Group is not
included.



Summary
Closes #4326.
When a self-hosted NetBird management server is migrated from SQLite to Postgres with external tooling (e.g.
pgloader), SQLite's INTEGER 0/1 representation ofboolfields lands in Postgres asnumericrather thanboolean. The Go struct fields are typedbool, so the very first attempt to update any account setting from the dashboard fails with:Multiple operators have hit this on different bool settings fields (reporter's original report, confirmation that "this behavior occurs on basically every setting related to the account settings"). Until now the only fix was running manual
ALTER TABLEstatements per affected column.Change
New migration helper
FixPostgresBoolColumns[T]that walks every gorm-mapped bool field on a model, inspectsinformation_schema.columnsfor the actual SQL type, and runs an idempotentALTER COLUMN ... TYPE boolean USING ...when the column is anything other thanboolean. Conversion semantics:NULL→false(matches Go bool zero value)0→falsetrueNOT NULL+DEFAULT falseare re-applied so the post-migration schema matches whatgorm.AutoMigratewould produce on a fresh installThe migration is registered for
types.AccountingetMigrationsPreAuto. Pre-AutoMigrate placement is required soAutoMigrate's own type-mismatch check does not trip on the still-numeric columns. It is a strict no-op on non-Postgres engines (SQLite stores bools natively as INTEGER, MySQL as TINYINT(1) — neither needs conversion).Tests
All tests run against a real
postgres:16-alpinetest container via the existingtestcontainers-goscaffolding:TestFixPostgresBoolColumns_NoOpOnSqliteTestFixPostgresBoolColumns_TableMissingIsNoOpTestFixPostgresBoolColumns_AlreadyBooleanIsNoOpTestFixPostgresBoolColumns_ConvertsNumericColumnsAndPreservesDataboolean, NULL→false, 0→false, 1→true, and the user-reported failure mode (writing Gobool=truevia gorm) succeeds afterwardsTestFixPostgresBoolColumns_IsIdempotentTest plan
go build ./management/...cleanAffected columns (concrete)
For
types.Accountthe migration covers the gorm-embedded Settings + ExtraSettings bool fields. At time of writing those are:The list is derived automatically from gorm metadata, so any future bool field added to
Settings/ExtraSettingsis covered without code changes.Use case
Real-world: any operator who migrated their self-hosted NetBird from SQLite to Postgres using a tool that maps SQLite's
INTEGERto Postgresnumericrather thanboolean. That includespgloaderwith default settings (the most common path), but also other custom dump/restore scripts. With this PR the management server self-heals on first start instead of returning HTTP 500 on every settings update.Summary by CodeRabbit
Release Notes