feat/db-roles-rls — Database roles and per-user data isolation#33
Merged
- Creates user1_role (parser + webserver for user1 data) and webserver_role (read-all for admin queries; SET ROLE for scoped reads).
- Grants SELECT/INSERT/UPDATE on cml_data, cml_metadata, cml_stats to user1_role; SELECT-only to webserver_role.
- Grants SELECT on cml_data_1h and EXECUTE on update_cml_stats() to user1_role.
- Enables RLS on the three base tables and creates per-role isolation policies (user_id = 'user1').
- webserver_role gets a permissive (USING true) read-all policy; scoped reads are achieved via SET ROLE user1_role.

Backward-compatible: myuser (superuser) bypasses RLS, so the existing parser and webserver continue to work without changes until PR3 and PR5 wire up the new role credentials.

Note: cml_data_1h (TimescaleDB continuous aggregate / materialized view) does not support RLS at the DB level; application queries must always include WHERE user_id = ?.

Part of multi-user RLS rollout (issue #31).
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage diff, main vs. #33: coverage 73.18% on both (no change); files 22; lines 1965; hits 1438; misses 527.
Replace the initial per-user-role design with a cleaner approach:
- Rename user1_role → user1 (role name = user_id value = current_user).

This one change unlocks two improvements:
- Generic current_user RLS policies: a single policy per base table covers all users; no per-user policy is needed when onboarding new users.
- cml_data_1h_secure: a security_barrier view over the continuous aggregate using WHERE user_id = current_user. User roles query this view for fully DB-enforced per-user filtering with no application-level WHERE clause needed.

webserver_role retains direct access to cml_data_1h for admin/cross-user aggregate queries.

Also updates docs/multi-user-architecture.md to reflect the new role naming convention throughout.
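The two improvements described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's actual migration: the policy name is borrowed from the rollback notes elsewhere in this PR, only one base table is shown, and the GRANT at the end is an assumption about how access to the view is handed out.

```sql
-- Sketch only; the real statements live in init.sql / migration 004.

-- Generic policy: current_user is the session's role name, which by
-- convention equals the user_id stored in each row. For FOR ALL, the
-- USING expression is also applied as the write check when WITH CHECK
-- is omitted.
ALTER TABLE cml_metadata ENABLE ROW LEVEL SECURITY;

CREATE POLICY user_cml_metadata_policy ON cml_metadata
    FOR ALL
    USING (user_id = current_user);

-- The continuous aggregate cannot carry RLS, so per-user filtering
-- goes through a security_barrier view instead.
CREATE VIEW cml_data_1h_secure WITH (security_barrier) AS
    SELECT * FROM cml_data_1h
    WHERE user_id = current_user;

-- Assumption: user roles read the aggregate only through the view.
GRANT SELECT ON cml_data_1h_secure TO user1;
```

Because the predicate references current_user rather than a literal, onboarding a second user under this scheme would only require a new role whose name matches that user's user_id, plus the corresponding grants.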
TimescaleDB rejects ALTER TABLE ... ENABLE ROW LEVEL SECURITY on a hypertable that already has timescaledb.compress set.

init.sql (fresh install): Move ALTER TABLE cml_data ENABLE ROW LEVEL SECURITY to immediately before the compression block. The restriction is purely ordering: RLS must be enabled before the compress option is applied.

migration 004 (live DB): On an existing deployment (after migration 002) compression is already enabled. Apply the same decompress/recompress pattern used in migration 002: decompress all compressed chunks, ENABLE ROW LEVEL SECURITY, then re-compress chunks older than the policy threshold.

cml_metadata and cml_stats are plain tables and are unaffected.
…pression

TimescaleDB does not allow ENABLE ROW LEVEL SECURITY on a compressed hypertable, and compression cannot be set on an RLS-enabled table. These two features are mutually exclusive; no ordering of statements works around this.

Resolution:
- Remove ENABLE ROW LEVEL SECURITY from cml_data entirely (keep compression).
- Apply full RLS only to cml_metadata and cml_stats (plain tables, no compression restriction).
- Add cml_data_secure: a security_barrier view over cml_data with WHERE user_id = current_user and WITH CHECK OPTION, using the same pattern already established for cml_data_1h_secure. The security_barrier option keeps caller-supplied conditions from being evaluated ahead of the view's user_id filter, and WITH CHECK OPTION enforces the write path through the view.

Verified locally: fresh-volume docker compose up now completes without errors (PostgreSQL init process complete).
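A minimal sketch of the cml_data_secure view described in this commit, built from the definition given in the PR description. The GRANT and the smoke-test query are assumptions added for illustration.

```sql
-- Sketch only: cml_data keeps compression and has no RLS; isolation
-- for raw data comes from the view.
CREATE VIEW cml_data_secure WITH (security_barrier) AS
    SELECT * FROM cml_data
    WHERE user_id = current_user
    WITH CHECK OPTION;

-- Assumption: the user role reads and writes raw data through the view.
GRANT SELECT, INSERT, UPDATE ON cml_data_secure TO user1;

-- Smoke test from a privileged session.
SET ROLE user1;
SELECT DISTINCT user_id FROM cml_data_secure;  -- should list only 'user1'
RESET ROLE;
```

With WITH CHECK OPTION in place, an INSERT or UPDATE through the view that would produce a row whose user_id differs from current_user is rejected, since the new row would not be visible through the view.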
Change compress_segmentby from 'user_id, cml_id, sublink_id' to 'user_id, cml_id'.

With ~80% of CMLs having 2 sublinks and ~15% having 4, keeping sublinks in the same segment reduces the number of chunks that need to be decompressed per CML query by roughly 2-4x. Sublinks of the same CML share correlated RSL/TSL ranges, so they compress well together with no meaningful loss in compression ratio.

Add migration 005 to apply the change on live databases via a decompress -> alter compress options -> recompress cycle. Update migration 002 comment to cross-reference 005.
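The decompress -> alter -> recompress cycle mentioned for migration 005 might look roughly like the sketch below. It is not the migration itself: the '7 days' interval is a placeholder for whatever threshold the table's compression policy uses, and any existing compress_orderby setting is omitted because the PR text does not state it.

```sql
-- Sketch only, under the assumptions stated above.

-- 1. Decompress every chunk; the second argument (if_compressed)
--    skips chunks that are not compressed instead of raising an error.
SELECT decompress_chunk(c, true)
FROM show_chunks('cml_data') AS c;

-- 2. Apply the new segment-by columns. Restate compress_orderby here
--    if the table sets one.
ALTER TABLE cml_data SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'user_id, cml_id'
);

-- 3. Recompress chunks older than the policy threshold
--    (placeholder interval; second argument is if_not_compressed).
SELECT compress_chunk(c, true)
FROM show_chunks('cml_data', older_than => INTERVAL '7 days') AS c;
```

On a large live database the decompress step temporarily needs the disk space of the uncompressed data, so it is worth checking free space before running a cycle like this.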
…ctions
- Verify: \du user1_role -> user1; smoke-test uses user1 role and cml_data_secure view (cml_data direct gives all rows; no RLS on it); RLS check comment clarifies cml_data shows f intentionally
- Rollback: drop cml_data_secure/cml_data_1h_secure views; correct policy names to user_cml_metadata_policy / user_cml_stats_policy / webserver_cml_metadata_policy / webserver_cml_stats_policy; remove DISABLE RLS on cml_data (never enabled); fix user1_role -> user1 throughout REVOKE and DROP ROLE statements
- migration 004 header: step 4 description corrects 'cml_data, cml_metadata, cml_stats' -> 'cml_metadata and cml_stats'
feat/db-roles-rls — Database roles and per-user data isolation
What this PR does
Introduces PostgreSQL roles and DB-layer per-user isolation as Phase 2 of the multi-user rollout.
Roles
- user1: login role for parser + webserver (user1's data); role name intentionally matches the user_id value in the data tables, enabling generic current_user RLS policies without per-user policy changes at onboarding.
- webserver_role: read-all login role for admin queries; can SET ROLE user1 for DB-enforced scoped reads.

Row-Level Security
- Enabled on cml_metadata and cml_stats with a single generic FOR ALL … USING (user_id = current_user) policy per table.
- cml_data is excluded: TimescaleDB compression and RLS are mutually exclusive on the same hypertable. Per-user isolation for raw data is provided by security-barrier views instead.

Security-barrier views
- cml_data_secure: WITH (security_barrier) AS SELECT * FROM cml_data WHERE user_id = current_user WITH CHECK OPTION
- cml_data_1h_secure: same pattern over the continuous aggregate

Compression optimisation (migration 005)
- Removes sublink_id from compress_segmentby; new setting: 'user_id, cml_id'.

Migrations
- 004_add_roles_rls.sql
- 005_drop_sublink_from_segmentby.sql

Backward compatibility
- myuser (superuser) bypasses RLS; existing parser and webserver are unaffected until PR3 (feat/parser-user-id) and PR5 (feat/webserver-auth) wire up the new credentials.
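To make the role setup concrete, here is a hedged sketch of how the two roles and the scoped-read path could be wired together. The passwords are placeholders, the use of role membership to permit SET ROLE is an assumption (the PR text only says that webserver_role can SET ROLE user1), and the webserver policy is shown for one table only.

```sql
-- Sketch only; not the contents of 004_add_roles_rls.sql.
CREATE ROLE user1 LOGIN PASSWORD 'change-me';
CREATE ROLE webserver_role LOGIN PASSWORD 'change-me';

-- Assumption: membership is what allows webserver_role to SET ROLE user1.
GRANT user1 TO webserver_role;

GRANT SELECT, INSERT, UPDATE ON cml_metadata, cml_stats TO user1;
GRANT SELECT ON cml_metadata, cml_stats TO webserver_role;

-- Permissive read-all policy for admin queries.
CREATE POLICY webserver_cml_metadata_policy ON cml_metadata
    FOR SELECT TO webserver_role
    USING (true);

-- Scoped read from a webserver_role session: after SET ROLE the generic
-- user_id = current_user policy filters rows to user1's data.
SET ROLE user1;
SELECT current_user;                        -- user1
SELECT DISTINCT user_id FROM cml_metadata;
RESET ROLE;
```

Running the last block as myuser would not exercise the policies until SET ROLE takes effect, since superusers bypass RLS, which is the same property the backward-compatibility note relies on.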