Skip to content

Add audit log#214

Open
raulrpearson wants to merge 13 commits intoClaperCo:devfrom
raulrpearson:add-audit-log
Open

Add audit log#214
raulrpearson wants to merge 13 commits intoClaperCo:devfrom
raulrpearson:add-audit-log

Conversation

@raulrpearson
Copy link
Contributor

Summary

Audit.Log model

The PR adds a new Audit context, Log schema and associated migration. The model for the new log includes:

  • action: a tag like "user.login", "user.logout" or "system.restart". This name was picked to avoid a clash with "event", which we already use for the user generated ones.
  • resource_type and resource_id: an optional string tag referencing and entity like an event, quiz, etc. and its associated ID. This is implemented like this rather than a foreign key to keep things simple and open, as we're not sure what type of entity "actions" we'll want to track. Let me know if there's some easy way to create some kind of polymorphic association, I don't know how, which is why I'm suggesting a loose string and ID.
  • metadata: the general purpose map/jsonb blob.
  • user: association to user. Nullable in case we want logs not associated to a user (system).
  • No updated_at: as this is supposed to be an audit log, I suppose we're not supposed to edit entries, so no update timestamp necessary.

If we want to actually enforce no updates or deletes we could do something like this vibecoded snippet:

CREATE OR REPLACE FUNCTION prevent_update_delete()
RETURNS TRIGGER AS $$
BEGIN
    RAISE EXCEPTION 'Updates and deletes are not allowed on table %', TG_TABLE_NAME;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER no_update_trigger
    BEFORE UPDATE ON your_table_name
    EXECUTE FUNCTION prevent_update_delete();

CREATE TRIGGER no_delete_trigger
    BEFORE DELETE ON your_table_name
    EXECUTE FUNCTION prevent_update_delete();

Flop and Flop Phoenix

I ended up going with Flop for filtering, sorting and pagination because adding filtering and sorting on top of our Repo.paginate/2 was requiring a lot of wiring. I like how it turned out with the URL query string holding the current filtering parameters, so navigation and update is more robust. You can refresh or share the URL with a given query.

IMO could be good to rework existing live views with this approach. Right now admin events and users are not paginated, for example. Taking that a step further, is it wise to ever have a context module listing function not be parametrised with filtering, ordering and pagination? We currently have some complexity around this in the Events module, for example.

Phoenix components

In order to make Flop Phoenix components work out-of-the-box, I copy-pasted (with a couple small tweaks) the CoreComponents module generated with the latest mix phx.new. This duplicates approaches with our current ClaperWeb.Component.Input module. I'd be inclined to choose getting closer to the CoreComponents implementation.

I also created an Icons module just to reuse some SVG markup. I've been meaning to suggest a better alternative using Iconify, but something to discuss in the future, I think.

Client IP

The ClaperWeb.Helpers.ConnUtils module is an AI creation. It gives preference to the x-forwarded-for header. Do we want to use a more robust approach?

Review

Feel free to question any of the choices I made. Available to discuss on a call if you want. The default page size for the logs view is 50. If you want to test pagination, maybe populate the database programmatically or you can add the default_limit: 5 keyword to the Flop segment in Claper.Audit.Log so you don't need as many.

@alxlion
Copy link
Contributor

alxlion commented Feb 22, 2026

Thanks @raulrpearson ,

For the Flop lib adoption, I agree that existing admin views (events, users) would benefit from the same approach in the future.

Here are some concerns, you already talked about them but I think we have to be strict on security and data integrity.

Bugs

B1. Access.key! crash when log.user is nil

File: audit_log_live.html.heex:71
Severity: High — will crash in production when a user has been deleted

{get_in(log, [Access.key!(:user), Access.key!(:email)]) || "System"}

When a user is deleted (FK is on_delete: :nilify_all), log.user will be nil. Calling Access.key!(:email) on nil raises a KeyError — the || "System" fallback never executes.

Fix:

{if(log.user, do: log.user.email, else: gettext("System"))}

This also fixes a missing gettext() wrap — the :show view correctly uses gettext("System") on line 114, but the :index table does not.


B2. Flop.validate! in handle_event can crash the LiveView

File: audit_log_live.ex:56
Severity: Medium — malformed WebSocket params crash the process

Flop.validate!(unsigned_params, for: Audit.Log)

Unlike list_logs/1 which uses replace_invalid_params: true, this bang call raises Flop.InvalidParamsError on malformed client params. Since this handles user-supplied data from the WebSocket, it should use the non-bang Flop.validate/2 and handle the error gracefully.


Security

S1. ConnUtils.get_client_ip/1 trusts the leftmost X-Forwarded-For

File: lib/claper_web/helpers/conn_utils.ex:16
Severity: High — trivially spoofable, degrades audit data integrity

[ip | _] -> ip |> String.split(",") |> List.first() |> String.trim()

You already mentioned this one in your PR, and I think we should use a more secure approach to avoid spoofing.

Problems:

  • Trivially spoofable: any client can set X-Forwarded-For: 1.2.3.4. Behind a reverse proxy, the proxy appends the real IP, making the last (or second-to-last) entry trustworthy — not the first.
  • No input validation: the returned string is not validated as a valid IP address. A client could send X-Forwarded-For: <arbitrary string> and it gets stored verbatim in metadata.
  • No trusted proxy awareness: cannot reliably extract the real client IP without knowing which proxy hops are trusted.

Suggestions:

  • Use the remote_ip library which handles trusted proxy configuration properly.
  • At minimum, validate with :inet.parse_address/1 and take the rightmost unknown IP, not leftmost.

S2. No max_limit on Flop schema — unbounded page size

File: lib/claper/audit/log.ex@derive Flop.Schema block
Severity: Medium — admin-only, but still exploitable

Without max_limit, a crafted request can pass page_size=1000000 in URL params and force the DB to load an unbounded number of rows, potentially causing memory exhaustion.

Fix: Add max_limit: 100 (or desired value) to the @derive Flop.Schema options.


S3. Audit logging is synchronous in the auth path

File: lib/claper_web/controllers/user_auth.ex:36-39, 92-95
Severity: Medium — availability concern

Audit.log_action/3 calls Repo.insert() synchronously in the login/logout request pipeline. If the DB insert is slow or fails:

  • Login/logout is delayed.
  • The error tuple {:error, changeset} is silently discarded.

Suggestions:

  • Consider making audit logging async (Task.start/1 or an Oban job) so a failed audit insert never degrades authentication.
  • At minimum, add a Logger.warning on failure so silent data loss is detectable.

Code Bloat

CB1. CoreComponents — ~500 lines, only input/1 is used

File: lib/claper_web/components/core_components.ex
Severity: Medium — maintenance burden and architectural confusion

You already mentioned that, and the only component actually used by this PR is input/1 (needed by Flop Phoenix's filter_fields). The flash, button, header, table, list, and icon components are all unused.

The project already has ClaperWeb.Component.Input and other admin components. Shipping 500 lines of duplicate, unused components creates maintenance burden and confusion about which component system to use.

But I think it's temporary before migration?

Suggestion: Extract only the input/1 component that Flop Phoenix needs, or adapt the existing ClaperWeb.Component.Input module to satisfy Flop's requirements.


CB2. Icons module duplicates available Heroicons

File: lib/claper_web/components/icons.ex
Severity: Low

The CoreComponents module already includes an icon/1 component that renders Heroicons by name (e.g., <.icon name="hero-eye" />). The separate Icons module with hand-inlined SVGs for eye, arrow_up, and arrow_down is redundant.


i18n Issues

I1. Table column labels not wrapped in gettext()

File: audit_log_live.html.heex:66-85

The :col labels use bare English strings: "Timestamp", "Action", "User", "Details". These should be wrapped in gettext() for consistency with the rest of the template (which correctly uses gettext for pagination, headings, etc.).


I2. "System" fallback not in gettext() in index view

File: audit_log_live.html.heex:71

Already covered in B1 — the :show view correctly uses gettext("System") but the :index table does not.


Schema / Migration

SM1. Type mismatch: :integer in schema vs :bigint in migration

Files: lib/claper/audit/log.ex:27 and priv/repo/migrations/20260129165937_create_audit_logs.exs:8

The migration defines resource_id as :bigint but the Ecto schema declares :integer. In practice this works (Elixir integers are arbitrary precision), but the intent is misaligned. The project's other tables use standard integer/bigserial PKs. Recommend aligning both to :integer.


SM2. No validate_length on string fields

File: lib/claper/audit/log.ex:37-40

:action and :resource_type have validate_required but no length constraints. The DB columns are varchar(255), so Postgres will raise a raw error on overly long strings instead of a friendly changeset error. Add validate_length(:action, max: 255) and similarly for :resource_type.


SM3. Consider a composite index on (action, inserted_at)

File: priv/repo/migrations/20260129165937_create_audit_logs.exs

Filtering by action + sorting by timestamp is the primary query pattern in the UI. The current separate indexes may not combine efficiently. A composite index would help as the table grows.


UX / Accessibility

UX1. Eye icon link has no accessible text

File: audit_log_live.html.heex:82-84

The "view" link wraps only an SVG icon with no aria-label or <span class="sr-only">. Screen readers will announce an empty link. Add aria-label={gettext("View details")}.


UX2. Inconsistent timestamp formatting

Files: audit_log_live.html.heex:66 vs :show view line 121

The :index table renders raw NaiveDateTime ({log.inserted_at}), while the :show view formats it with Calendar.strftime and a UTC label. Should be consistent.

Minor / Informational

Item Location Note
list_action_types/0 full table scan audit.ex:76-83 SELECT DISTINCT action will degrade on large tables. Consider caching.
log_resource_action/5 has no nil-user clause audit.ex:49-63 Inconsistent with log_action/3 which handles nil users.
get_log!/1 uses two queries audit.ex:99-103 Repo.get! + Repo.preload could be a single query.
Test coverage gaps audit_test.exs No tests for filtering/pagination params, LiveView rendering, or access control enforcement.
@impl Phoenix.LiveView vs @impl true audit_log_live.ex Inconsistent with existing admin LiveViews which use @impl true.
Dep version pins ~> 0.26.3 mix.exs More restrictive than typical ~> 0.26 — may miss bugfix releases.
Nav active state admin.html.heex:236 Sidebar highlight doesn't persist on /admin/audit_logs/:id (existing pattern).
SVG icons missing aria-hidden="true" icons.ex Decorative icons should be hidden from screen readers.
No on_mount hook for admin live_session router.ex Existing gap — admin role revocation not enforced on active LiveView sessions. Not new to this PR.

Verdict

Must fix before merge: B1, B2, S1, S2

Strongly recommended: S3, CB1, I1, UX1, and the Nav active state

Nice to have: Everything else

Finally, the suggested prevent_update_delete() trigger is a good idea. Could be added in a follow-up migration.

@raulrpearson
Copy link
Contributor Author

Thanks, @alxlion, pretty epic review, I like it. I'll work on amendments and ping back for re-review when ready.

@raulrpearson
Copy link
Contributor Author

@alxlion, the PR is ready for another round of review. Here's is a summary of follow-ups:

Bugs

  • B1. Access.key! crash when log.user is nil. Not an issue since get_in/1 is a macro that expands to nil-safe nested access, so get_in(log.user.email) short-circuits gracefully when log.user is nil. See the docs.
  • B2. Flop.validate! in handle_event can crash the LiveView. Mitigated by passing replace_invalid_params: true to Flop.validate!, so malformed params are silently replaced with defaults rather than raising.

Security

  • S1. ConnUtils.get_client_ip/1 trusts the leftmost X-Forwarded-For. Switched to the remote_ip library. Trusted proxy IPs and headers are configurable via REMOTE_IP_PROXIES / REMOTE_IP_HEADERS env vars (see updated .env.sample).
  • S2. No max_limit on Flop schema. Flop has an implicit default max_limit but it's 1000, which is maybe too much, added max_limit: 100 to the @derive Flop.Schema block. Making it explicit also adds some peace of mind.
  • S3. Audit logging is synchronous in the auth path. I made it async. I also had to amend ConnCase and DataCase in tests so that the async calls are drained/awaited before tearing down database connection (I saw some error logs while testing, even though this wasn't causing test failures).

Code bloat

  • CB1. CoreComponents — ~500 lines, only input/1 is used. Intentional for now. I'd favour keeping the full module as a stepping stone toward migrating existing views to these components in future PRs. But let me know if you disagree and you prefer that I cull this module out.
  • CB2. Icons module duplicates available Heroicons. Same as CB1. In this case, I think the correct solution would be to use something like Iconify (with iconify_ex or as a Tailwind plugin). We're using a mix of icons, usually inlined, duplicated and cluttering the views and I think we should find a better approach. If you want me to go with inlining icons rather than this separate Icons module to match the current approach, let me know.

i18n issues

  • I1. Table column labels not wrapped in gettext(). Fixed.
  • I2. "System" fallback not in gettext() in index view. Fixed, although I fully removed the "System" string and now using "—" for consistency and to avoid the confusing of mixing a system nil-user action and an user action whose user was deleted.

Schema / Migration

  • SM1. Type mismatch: :integer in schema vs :bigint in migration. I disagree with this assessment. :bigint in the migration is correct (consistent with the project's :bigserial PKs, which resource_id may reference). Ecto's :integer type maps fine to a bigint column, so no change needed.
  • SM2. No validate_length on string fields. Good catch, this validation was added.
  • SM3. Consider a composite index on (action, inserted_at). After some cogitating with Claude, I refactored indexes to match the query patterns I expect we'll commonly have (descending inserted at, and that plus either filtered by action or user email). Let me know if you have any feedback.

UX / Accessibility

  • UX1. Eye icon link has no accessible text. Fixed by adding aria-label={gettext("View details for log #%{id}", id: log.id)}.
  • UX2. Inconsistent timestamp formatting. Fixed by refactoring and reusing formatting logic.

Minor / Informational

  • list_action_types/0 full table scan: I don't think caching is the right move here, the (action, inserted_at) index should allow Postgres to satisfy SELECT DISTINCT action ORDER BY action with an index-only scan, so it should be performant as the table grows. Let me know if you disagree.
  • log_resource_action/5 has no nil-user clause: I'd add one when the need arises. I don't think the inconsistency is a problem right now.
  • get_log!/1 uses two queries: fixed so that it uses a single query now.
  • prevent_update_delete() trigger: I'd leave this for a different PR? One thing to think through: if we deleted a user nilify_all on the FK would trigger an update on audit_logs, which the trigger would block. We'd need to account for that (e.g. excluding FK-only updates, or accepting that soft-deletes make the immutability guarantee workable already). Let me know what you think.
  • @impl Phoenix.LiveView vs @impl true: I'd favour keeping @impl Phoenix.LiveView and migrating other modules to this convention; it's more explicit about which behaviour is being implemented. Let me know if you disagree and I can switch it over.
  • Test coverage: I don't think we need to test pagination and filtering as this would be akin to testing Flop. Regarding authz, I agree, but this is a bit of a can of worms. The two main things I think we should address first is convening on the scope pattern present in the latest Phoenix template to guard context functions and then also addressing the on_mount hook issue mentioned below. After/with that work I'd be inclined to add tests for access control.
  • Dep version pins ~> 0.26.3: agree, loosened to ~> 0.26.
  • Nav active state: I like the idea but would do in a different PR, to implement for the whole admin live views group.
  • SVG icons missing aria-hidden="true": fixed.
  • No on_mount hook for admin live_session: agreed but would also address in a different PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants