Skip to content

[management] migration: convert SQLite-imported bool columns to native Postgres boolean (fixes #4326)#6109

Open
MichaelUray wants to merge 1 commit intonetbirdio:mainfrom
MichaelUray:fix/postgres-bool-column-migration
Open

[management] migration: convert SQLite-imported bool columns to native Postgres boolean (fixes #4326)#6109
MichaelUray wants to merge 1 commit intonetbirdio:mainfrom
MichaelUray:fix/postgres-bool-column-migration

Conversation

@MichaelUray
Copy link
Copy Markdown
Contributor

@MichaelUray MichaelUray commented May 8, 2026

Summary

Closes #4326.

When a self-hosted NetBird management server is migrated from SQLite to Postgres with external tooling (e.g. pgloader), SQLite's INTEGER 0/1 representation of bool fields lands in Postgres as numeric rather than boolean. The Go struct fields are typed bool, so the very first attempt to update any account setting from the dashboard fails with:

failed to encode args[N]: unable to encode true into binary format for numeric (OID 1700): cannot find encode plan

Multiple operators have hit this on different bool settings fields (reporter's original report, confirmation that "this behavior occurs on basically every setting related to the account settings"). Until now the only fix was running manual ALTER TABLE statements per affected column.

Change

New migration helper FixPostgresBoolColumns[T] that walks every gorm-mapped bool field on a model, inspects information_schema.columns for the actual SQL type, and runs an idempotent ALTER COLUMN ... TYPE boolean USING ... when the column is anything other than boolean. Conversion semantics:

  • NULLfalse (matches Go bool zero value)
  • 0false
  • non-zero numeric → true
  • NOT NULL + DEFAULT false are re-applied so the post-migration schema matches what gorm.AutoMigrate would produce on a fresh install

The migration is registered for types.Account in getMigrationsPreAuto. Pre-AutoMigrate placement is required so AutoMigrate's own type-mismatch check does not trip on the still-numeric columns. It is a strict no-op on non-Postgres engines (SQLite stores bools natively as INTEGER, MySQL as TINYINT(1) — neither needs conversion).

Tests

All tests run against a real postgres:16-alpine test container via the existing testcontainers-go scaffolding:

Test Coverage
TestFixPostgresBoolColumns_NoOpOnSqlite SQLite path is unchanged
TestFixPostgresBoolColumns_TableMissingIsNoOp absent table → return nil, no error
TestFixPostgresBoolColumns_AlreadyBooleanIsNoOp already-boolean columns are skipped (fresh-install idempotency)
TestFixPostgresBoolColumns_ConvertsNumericColumnsAndPreservesData core regression test for #4326: synthesises the broken-schema state (numeric columns where bool is expected), seeds NULL/0/1 rows, runs the migration, asserts data_type is now boolean, NULL→false, 0→false, 1→true, and the user-reported failure mode (writing Go bool=true via gorm) succeeds afterwards
TestFixPostgresBoolColumns_IsIdempotent running the migration twice in a row is safe
$ NETBIRD_STORE_ENGINE=postgres go test ./management/server/migration/ -run TestFixPostgresBoolColumns -v -count=1
=== RUN   TestFixPostgresBoolColumns_NoOpOnSqlite
--- PASS: TestFixPostgresBoolColumns_NoOpOnSqlite (0.00s)
=== RUN   TestFixPostgresBoolColumns_TableMissingIsNoOp
--- PASS: TestFixPostgresBoolColumns_TableMissingIsNoOp (1.66s)
=== RUN   TestFixPostgresBoolColumns_AlreadyBooleanIsNoOp
--- PASS: TestFixPostgresBoolColumns_AlreadyBooleanIsNoOp (1.66s)
=== RUN   TestFixPostgresBoolColumns_ConvertsNumericColumnsAndPreservesData
time="2026-05-08T09:47:27Z" level=info msg="converting Postgres column bool_col_test_models.top_level_bool from numeric to boolean (issue #4326)"
time="2026-05-08T09:47:27Z" level=info msg="converting Postgres column bool_col_test_models.settings_lazy_connection_enabled from numeric to boolean (issue #4326)"
time="2026-05-08T09:47:27Z" level=info msg="converting Postgres column bool_col_test_models.settings_jwt_groups_enabled from numeric to boolean (issue #4326)"
--- PASS: TestFixPostgresBoolColumns_ConvertsNumericColumnsAndPreservesData (1.68s)
=== RUN   TestFixPostgresBoolColumns_IsIdempotent
--- PASS: TestFixPostgresBoolColumns_IsIdempotent (1.62s)
PASS
ok      github.com/netbirdio/netbird/management/server/migration        7.065s

Test plan

  • Unit + integration tests cover SQLite no-op, missing-table no-op, already-boolean no-op, the actual conversion + data preservation, and idempotency
  • Existing migration test suite still green (no regression)
  • go build ./management/... clean

Affected columns (concrete)

For types.Account the migration covers the gorm-embedded Settings + ExtraSettings bool fields. At time of writing those are:

settings_peer_login_expiration_enabled
settings_peer_inactivity_expiration_enabled
settings_regular_users_view_blocked
settings_groups_propagation_enabled
settings_jwt_groups_enabled
settings_routing_peer_dns_resolution_enabled
settings_peer_expose_enabled
settings_lazy_connection_enabled
settings_auto_update_always
settings_extra_peer_approval_enabled
settings_extra_user_approval_required

The list is derived automatically from gorm metadata, so any future bool field added to Settings/ExtraSettings is covered without code changes.

Use case

Real-world: any operator who migrated their self-hosted NetBird from SQLite to Postgres using a tool that maps SQLite's INTEGER to Postgres numeric rather than boolean. That includes pgloader with default settings (the most common path), but also other custom dump/restore scripts. With this PR the management server self-heals on first start instead of returning HTTP 500 on every settings update.

  • I added/updated documentation
  • Documentation is not needed

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Fixed PostgreSQL database schema issues where boolean columns were not using the proper native type. The correction automatically applies during database migration and is safe to run multiple times.

…e Postgres boolean

When a self-hosted netbird management server is migrated from SQLite to
Postgres with external tooling (e.g. pgloader), SQLite's INTEGER 0/1
representation of bool fields ends up as Postgres `numeric` rather than
`boolean`. The Go struct fields are typed `bool`, so the very first
attempt to update any account setting from the dashboard fails with:

    failed to encode args[N]: unable to encode true into binary format
    for numeric (OID 1700): cannot find encode plan

The current migration registry has no step that fixes column types,
which leaves affected operators with no in-process recovery path —
they have to apply manual ALTER TABLE statements (see the user-supplied
SQL in the issue). Multiple users have hit this on different settings
fields, so a per-column manual fix is not a viable answer.

This commit adds a generic, gorm-metadata-driven migration that walks
every bool-typed field on a given model, queries
information_schema.columns for the actual SQL type, and runs an
idempotent ALTER COLUMN ... TYPE boolean USING (COALESCE(col::numeric, 0) <> 0)
when the column is anything other than boolean. NULL → false, 0 → false,
1 → true. NOT NULL + DEFAULT false are re-applied so the post-migration
schema matches what gorm.AutoMigrate would have produced on a fresh
install.

The migration runs in the pre-AutoMigrate phase (so AutoMigrate's own
type-mismatch check does not trip on still-numeric columns) and is
registered for `types.Account`, which covers the embedded Settings and
ExtraSettings fields where the user-reported failure surfaced. It is a
strict no-op on non-Postgres engines (SQLite stores bools natively as
INTEGER, MySQL as TINYINT(1), neither needs conversion).

Tests against a real `postgres:16-alpine` testcontainer:

  - NoOpOnSqlite — function returns nil, schema/data untouched on SQLite
  - TableMissingIsNoOp — function returns nil when the table is absent
  - AlreadyBooleanIsNoOp — function returns nil when columns are already
    boolean (idempotency for fresh installs and second-run scenarios)
  - ConvertsNumericColumnsAndPreservesData — core regression test for
    issue netbirdio#4326. Synthesises the broken-schema state (numeric columns
    where bool is expected), seeds NULL/0/1 rows, runs the migration,
    asserts (a) data_type is now 'boolean' for every affected column,
    (b) non-bool columns are untouched, (c) existing rows have correctly
    converted values, (d) the user-reported failure mode (writing Go
    bool=true via gorm) succeeds afterwards.
  - IsIdempotent — running the migration twice in a row is safe.

All five tests pass in CI's existing postgres-container scaffolding.

Closes netbirdio#4326
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 8, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR resolves a SQLite-to-PostgreSQL migration defect where boolean columns were incorrectly stored as numeric types. It adds FixPostgresBoolColumns[T], a generic pre-auto migration function that safely converts affected columns to native boolean type while preserving existing data and applying correct defaults, with comprehensive Postgres-specific regression tests and integration into the account migration pipeline.

Changes

PostgreSQL Boolean Column Conversion

Layer / File(s) Summary
Function Signature & Type Inspection
management/server/migration/migration.go
Import reflect package; add FixPostgresBoolColumns[T any](ctx context.Context, db *gorm.DB) error function that scans bool-typed GORM fields, checks Postgres column types via information_schema, and converts numeric/NULL columns to boolean using USING cast logic with NULL→false mapping.
Test Model Setup
management/server/migration/migration_test.go
Define synthetic test models with boolean fields (including embedded settings) and custom TableName() to support manual schema inspection and conversion testing.
Test Scenarios
management/server/migration/migration_test.go
Five regression tests: no-op on SQLite with working inserts, no-op on missing table, no-op on already-boolean columns, core conversion with data preservation (NULL→false, non-bool columns unchanged), and idempotency across multiple runs.
Pre-Auto Migration Integration
management/server/store/store.go
Append FixPostgresBoolColumns[types.Account] call to getMigrationsPreAuto chain to run before AutoMigrate, fixing account boolean schema on legacy Postgres deployments.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A migration so clever and bright,
Makes booleans behave just right,
From numeric's curse to boolean's true grace,
PostgreSQL finds its proper place,
No more tangled types to fight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: converting SQLite-imported bool columns to native Postgres boolean, and references the issue it fixes (#4326).
Description check ✅ Passed The description comprehensively addresses all template requirements: changes are described, issue #4326 is linked, tests are documented, and documentation status is marked as not needed with justification.
Linked Issues check ✅ Passed The code changes fully address issue #4326: a migration helper FixPostgresBoolColumns[T] automates the exact conversion steps shown in the issue's SQL fix for all bool columns affected by SQLite-to-Postgres migrations.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing issue #4326: migration helper, comprehensive tests, and integration into the pre-AutoMigrate migration flow—no unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 8, 2026

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
management/server/migration/migration.go (1)

717-732: 💤 Low value

Hardcoded DEFAULT false ignores gorm:"default:..." tags on the model.

The migration unconditionally re-applies DEFAULT false and NOT NULL regardless of what the GORM tag actually declares. For all currently-affected fields on types.Account (which use default:false) this is fine, but if a future bool field is declared gorm:"default:true", this helper would silently overwrite that default after a SQLite→Postgres conversion. For more robust fidelity with AutoMigrate, derive the default from field.HasDefaultValue / field.DefaultValueInterface instead of hardcoding false.

Also note that the UPDATE ... WHERE %q IS NULL at line 729 is effectively a no-op given the COALESCE(...::numeric, 0) <> 0 USING expression already maps NULL→false during the type change — it's harmless but can be dropped if you tighten this up.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@management/server/migration/migration.go` around lines 717 - 732, The ALTER
sequence hardcodes "DEFAULT false" and a NULL-fill UPDATE, which can overwrite
model defaults; change the logic that builds the alters slice (around usingExpr,
columnName, tableName) to determine the intended default from the GORM field
metadata (use field.HasDefaultValue and field.DefaultValueInterface) and apply
`ALTER TABLE ... SET DEFAULT <derived-default>` instead of `false`, and drop the
redundant `UPDATE ... SET ... WHERE ... IS NULL` because the USING expression
already maps NULLs; ensure the ALTER that sets NOT NULL is only added when the
field actually requires NOT NULL per the model.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@management/server/store/store.go`:
- Around line 460-468: Update the migratePostAuto expansion to remove
types.Group (it has no bool fields) and instead include the eight types that do
require the Postgres bool-fix: nbpeer.Peer, types.SetupKey, types.User,
types.Policy, route.Route, routerTypes.NetworkRouter,
resourceTypes.NetworkResource, and domain.Domain; locate the slice/array where
migration.FixPostgresBoolColumns[types.Account] is being appended/invoked and
add corresponding migration.FixPostgresBoolColumns[T] entries for each of those
eight types while ensuring types.Group is not included.

---

Nitpick comments:
In `@management/server/migration/migration.go`:
- Around line 717-732: The ALTER sequence hardcodes "DEFAULT false" and a
NULL-fill UPDATE, which can overwrite model defaults; change the logic that
builds the alters slice (around usingExpr, columnName, tableName) to determine
the intended default from the GORM field metadata (use field.HasDefaultValue and
field.DefaultValueInterface) and apply `ALTER TABLE ... SET DEFAULT
<derived-default>` instead of `false`, and drop the redundant `UPDATE ... SET
... WHERE ... IS NULL` because the USING expression already maps NULLs; ensure
the ALTER that sets NOT NULL is only added when the field actually requires NOT
NULL per the model.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9f6788cb-0651-42d2-b374-d2f3e2a603d9

📥 Commits

Reviewing files that changed from the base of the PR and between 7da94a4 and d28e75f.

📒 Files selected for processing (3)
  • management/server/migration/migration.go
  • management/server/migration/migration_test.go
  • management/server/store/store.go

Comment on lines +460 to +468
// Issue #4326: SQLite -> Postgres migrations done with external
// tooling (e.g. pgloader) preserve SQLite's INTEGER 0/1 bool
// representation as Postgres `numeric`, which then breaks any
// settings update with "cannot find encode plan". Convert all
// gorm-mapped bool columns on `accounts` to native `boolean`
// before AutoMigrate runs (no-op on non-Postgres engines).
func(db *gorm.DB) error {
return migration.FixPostgresBoolColumns[types.Account](ctx, db)
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# List Go struct fields whose type is `bool` in management/server, with
# their containing struct name — useful to enumerate models that should
# also be passed to FixPostgresBoolColumns.
rg -nP --type=go -g 'management/server/**' -B 6 '^\s*\w+\s+bool\b' | \
  awk '
    /^[^-]+\.go-/ && /type [A-Z]\w+ struct/ { struct=$0 }
    /:\s*\w+\s+bool\b/ { print struct; print "  →", $0; print "" }
  '

Repository: netbirdio/netbird

Length of output: 43


🏁 Script executed:

# First, find the FixPostgresBoolColumns function to understand its signature
rg -n "FixPostgresBoolColumns" --type=go management/server/

Repository: netbirdio/netbird

Length of output: 2354


🏁 Script executed:

# Search for struct definitions of the mentioned types to see their bool fields
rg -n "type Account struct" --type=go management/server/ -A 20

Repository: netbirdio/netbird

Length of output: 2198


🏁 Script executed:

# Check what types are already in AutoMigrate
rg -n "AutoMigrate" --type=go management/server/store/ -B 5 -A 15

Repository: netbirdio/netbird

Length of output: 4877


🏁 Script executed:

# Try a simpler approach to find bool fields
rg -n "\s+\w+\s+bool" --type=go management/server/store/ | head -30

Repository: netbirdio/netbird

Length of output: 3334


🏁 Script executed:

# Find SetupKey struct with bool fields
rg -n "type SetupKey struct" --type=go management/server/ -A 30

Repository: netbirdio/netbird

Length of output: 2478


🏁 Script executed:

# Find Peer struct with bool fields
rg -n "type Peer struct" --type=go management/server/ -A 30

Repository: netbirdio/netbird

Length of output: 2650


🏁 Script executed:

# Find User struct with bool fields
rg -n "type User struct" --type=go management/server/ -A 30

Repository: netbirdio/netbird

Length of output: 2469


🏁 Script executed:

# Find Group struct with bool fields
rg -n "type Group struct" --type=go management/server/ -A 20

Repository: netbirdio/netbird

Length of output: 1666


🏁 Script executed:

# Find Policy struct with bool fields
rg -n "type Policy struct" --type=go management/server/ -A 20

Repository: netbirdio/netbird

Length of output: 1433


🏁 Script executed:

# Find Route struct with bool fields
rg -n "type Route struct" --type=go management/server/ -A 20

Repository: netbirdio/netbird

Length of output: 43


🏁 Script executed:

# Find NetworkRouter struct with bool fields
rg -n "type NetworkRouter struct" --type=go management/server/ -A 20

Repository: netbirdio/netbird

Length of output: 1832


🏁 Script executed:

# Find NetworkResource struct with bool fields
rg -n "type NetworkResource struct" --type=go management/server/ -A 20

Repository: netbirdio/netbird

Length of output: 1955


🏁 Script executed:

# Find Domain struct with bool fields
rg -n "type Domain struct" --type=go management/server/ -A 20

Repository: netbirdio/netbird

Length of output: 43


🏁 Script executed:

# Find Route struct - search more broadly
rg -n "type Route struct" --type=go -A 25

Repository: netbirdio/netbird

Length of output: 14158


🏁 Script executed:

# Find Domain struct - search more broadly
rg -n "type Domain struct" --type=go -A 25

Repository: netbirdio/netbird

Length of output: 2763


🏁 Script executed:

# Get complete Peer struct definition
rg -n "type Peer struct" --type=go management/server/peer/ -A 60

Repository: netbirdio/netbird

Length of output: 4678


🏁 Script executed:

# Check imports and the full AutoMigrate call around line 131-138
sed -n '1,150p' management/server/store/sql_store.go | head -150

Repository: netbirdio/netbird

Length of output: 5345


🏁 Script executed:

# Verify Group struct has no bool fields by checking the entire file
cat management/server/types/group.go

Repository: netbirdio/netbird

Length of output: 4214


Remove types.Group from the expansion; it has no bool columns.

The suggestion to expand the bool-column fix to additional tables is sound, but one type should be excluded: types.Group contains no bool fields and therefore doesn't need the migration. The remaining eight types—nbpeer.Peer, types.SetupKey, types.User, types.Policy, route.Route, routerTypes.NetworkRouter, resourceTypes.NetworkResource, and domain.Domain—all have persisted bool columns affected by the pgloader numeric conversion issue and should be added to migratePostAuto.

📋 Corrected expansion
 		func(db *gorm.DB) error {
 			return migration.FixPostgresBoolColumns[types.Account](ctx, db)
 		},
+		func(db *gorm.DB) error {
+			return migration.FixPostgresBoolColumns[nbpeer.Peer](ctx, db)
+		},
+		func(db *gorm.DB) error {
+			return migration.FixPostgresBoolColumns[types.SetupKey](ctx, db)
+		},
+		func(db *gorm.DB) error {
+			return migration.FixPostgresBoolColumns[types.User](ctx, db)
+		},
+		func(db *gorm.DB) error {
+			return migration.FixPostgresBoolColumns[types.Policy](ctx, db)
+		},
+		func(db *gorm.DB) error {
+			return migration.FixPostgresBoolColumns[route.Route](ctx, db)
+		},
+		func(db *gorm.DB) error {
+			return migration.FixPostgresBoolColumns[routerTypes.NetworkRouter](ctx, db)
+		},
+		func(db *gorm.DB) error {
+			return migration.FixPostgresBoolColumns[resourceTypes.NetworkResource](ctx, db)
+		},
+		func(db *gorm.DB) error {
+			return migration.FixPostgresBoolColumns[domain.Domain](ctx, db)
+		},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@management/server/store/store.go` around lines 460 - 468, Update the
migratePostAuto expansion to remove types.Group (it has no bool fields) and
instead include the eight types that do require the Postgres bool-fix:
nbpeer.Peer, types.SetupKey, types.User, types.Policy, route.Route,
routerTypes.NetworkRouter, resourceTypes.NetworkResource, and domain.Domain;
locate the slice/array where migration.FixPostgresBoolColumns[types.Account] is
being appended/invoked and add corresponding migration.FixPostgresBoolColumns[T]
entries for each of those eight types while ensuring types.Group is not
included.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant