Skip to content

feat(tcp): restore socket with UID#40

Open
kr3v wants to merge 5 commits into
live-4.2from
istio-halftcp
Open

feat(tcp): restore socket with UID#40
kr3v wants to merge 5 commits into
live-4.2from
istio-halftcp

Conversation

@kr3v
Copy link
Copy Markdown

@kr3v kr3v commented May 10, 2026

Istio has a bunch of iptables rules that match connections that have 'uid' at 1337 (istio internal uid). CRIU does not restore uid atm, just adding this to dump side, proto image and using the data on restore.

Note: setfsuid is process-wide call. But I believe CRIU is single-threaded at this stage (well, Claude says that), so it should not be a problem.
Note 2: the test seem to really work. Consider #41 (where the restore side is dropped). The test has failed - FAIL: socket-tcp-uid.c:167: post-restore: sk_uid is 0, want 1337 (inode 685980) (errno = 11 (Resource temporarily unavailable)).


istio iptables rules sample

# Generated by iptables-save v1.8.11 (nf_tables) on Sun Apr 26 19:15:21 2026
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:ISTIO_INBOUND - [0:0]
:ISTIO_IN_REDIRECT - [0:0]
:ISTIO_OUTPUT - [0:0]
:ISTIO_REDIRECT - [0:0]
-A PREROUTING -p tcp -j ISTIO_INBOUND
-A OUTPUT -j ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp -m tcp --dport 15008 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15090 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15021 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15020 -j RETURN
-A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
-A ISTIO_OUTPUT -s 127.0.0.6/32 -o lo -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -p tcp -m tcp ! --dport 15008 -m owner --uid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -p tcp -m tcp ! --dport 15008 -m owner --gid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -j ISTIO_REDIRECT
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
COMMIT
# Completed on Sun Apr 26 19:15:21 2026

Kimchi Summary

What changed

Preserve the kernel socket owner (sk_uid) across checkpoint/restore for INET sockets. During dump CRIU captures the uid from inet_diag; during restore it temporarily switches fsuid via setfsuid() around socket(2) so the kernel stamps the correct owner on the new socket.

Why

The kernel initializes sk_uid from current_fsuid() at socket creation time and exposes no interface to change it later. Without this, restored sockets are owned by the restore-time fsuid (typically root), breaking netfilter rules such as xt_owner --uid-owner that depend on the original socket owner.

Key changes

  • images/sk-inet.proto: added optional uid field (field 21) to InetSkEntry to persist the dumped fsuid.
  • criu/include/sk-inet.h: added uid member to struct inet_sk_desc.
  • criu/sk-inet.c:
    • capture uid from idiag_uid in inet_collect_one() and from st_uid for unconstrained sockets in gen_uncon_sk().
    • populate ie->uid and ie->has_uid in do_dump_one_inet_fd().
    • in open_inet_sk(), call setfsuid(ie->uid) before socket() and revert immediately after; warn if the fsuid could not be set.
  • test/zdtm/static/socket-tcp-uid.c (new): ZDTM regression test that creates a socket under a non-root fsuid, verifies the expected sk_uid via /proc/net/tcp before and after C/R, and checks connection integrity.
  • test/zdtm/static/Makefile & socket-tcp-uid.desc: wired the new test into the build with flavor h ns and flags suid nouser samens.

Impact

  • Netfilter rules using xt_owner --uid-owner now remain valid after restore for checkpointed TCP connections.
  • In non-user-namespaced containers (the common Kubernetes case) the dumped uid equals the container-local uid and restores transparently.
  • In user-namespaced containers dumped from the host namespace, the captured uid may be an unmapped host uid inside the container’s user namespace; setfsuid() will silently fail and a warning is logged so the mismatch is visible.

Kimchi Summary

What changed

Preserves per-socket ownership (sk_uid) for INET sockets across checkpoint/restore by saving the dump-time UID in the CRIU image and temporarily switching the process filesystem UID during socket re-creation.

Why

The kernel stamps sk_uid from current_fsuid() at socket creation and provides no interface to change it later. Downstream tooling such as xt_owner (--uid-owner) netfilter rules depends on this value, so restored sockets must be recreated under their original UIDs to avoid policy mismatches.

Key changes

  • images/sk-inet.proto: Added optional uid field (id 21) to inet_sk_entry.
  • criu/include/sk-inet.h: Added uid member to struct inet_sk_desc.
  • criu/sk-inet.c:
    • Captures socket UID in gen_uncon_sk() from p->stat.st_uid and in inet_collect_one() from m->idiag_uid.
    • Serializes ie.uid and sets ie.has_uid in do_dump_one_inet_fd().
    • In open_inet_sk(), calls setfsuid(ie->uid) before socket(2) and restores the old fsuid immediately afterward so the kernel stamps the correct sk_uid.
  • test/zdtm/static/: Added socket-tcp-uid ZDTM test and descriptor (socket-tcp-uid.c, socket-tcp-uid.desc); registered in Makefile. The test creates a TCP socket under a non-root UID and verifies via /proc/net/tcp that the UID is preserved before and after C/R.

Impact

  • Restored INET sockets now retain their original sk_uid, keeping firewall and network-ownership policies consistent.
  • In user-namespaced containers where CRIU dumps from the host namespace, the saved UID may be unmapped inside the container; setfsuid() will silently fail in that case and a warning is logged.

@kimchi-review
Copy link
Copy Markdown

kimchi-review Bot commented May 10, 2026

Kimchi Code Review

Property Value
Commit ff96f4f
Author @kr3v
Files changed 0
Review status Completed
Comments 1 (1 warning)
Duration 100s

Summary

📊 Review Score: 90/100 (overall code quality — 0 lowest, 100 highest)
⏱️ Estimated effort to review: 2/5 (1 = trivial, 5 = very complex)

🧪 Tests: yes — A new zdtm test socket-tcp-uid was added to the static test Makefile, indicating coverage for the uid preservation logic. The actual test source was not shown in the diff, but the build integration confirms a targeted test exists.

📝 Found 1 issue(s). See inline comments for details.

What to expect

Kimchi will analyze the changes in this pull request and post:

  • A summary of the overall changes
  • Inline comments on specific lines with findings categorized by issue type

The review typically completes within a few minutes. This comment will be updated once the review is ready.

Interact with Kimchi
  • @kimchi review — re-trigger a full review on the latest commit
  • @kimchi summary — regenerate the PR summary
  • @kimchi ignore — skip this PR (no review will be posted)
  • Reply to any inline comment to ask follow-up questions or request clarification
Configuration

Reviews are configured by your organization admin.
Review instructions, excluded directories, and severity thresholds can be adjusted per repository in the Kimchi dashboard.


Powered by Kimchi — AI-powered code review by CAST AI

Copy link
Copy Markdown

@kimchi-review kimchi-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📊 Review Score: 90/100 (overall code quality — 0 lowest, 100 highest)
⏱️ Estimated effort to review: 2/5 (1 = trivial, 5 = very complex)

🧪 Tests: yes — A new zdtm test socket-tcp-uid was added to the static test Makefile, indicating coverage for the uid preservation logic. The actual test source was not shown in the diff, but the build integration confirms a targeted test exists.

📝 Found 1 issue(s). See inline comments for details.

Comment thread criu/sk-inet.c Outdated
@kr3v kr3v marked this pull request as ready for review May 12, 2026 08:37
Comment thread criu/sk-inet.c
if (run_setsockcreatecon(fle->fe))
return -1;

/*
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if possible lets isolate this as a function as in other cases so its easier to merge form upstream when we need it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants