Skip to content

bpf, sockmap: Fix af_unix null-ptr-deref in proto update#10983

Closed
kernel-patches-daemon-bpf[bot] wants to merge 4 commits intobpf_basefrom
series/1051882=>bpf
Closed

bpf, sockmap: Fix af_unix null-ptr-deref in proto update#10983
kernel-patches-daemon-bpf[bot] wants to merge 4 commits intobpf_basefrom
series/1051882=>bpf

Conversation

@kernel-patches-daemon-bpf
Copy link
Copy Markdown

Pull request for series with
subject: bpf, sockmap: Fix af_unix null-ptr-deref in proto update
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: 2687c84
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: 05f7e89
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: 72c3950
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: a7423e6
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: a9aabb3
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: 677755d
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: 28c65a8
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: 28c65a8
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: c81e432
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: 1a7eb7a
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: 8552286
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

sock_map_sk_state_allowed() and sock_map_redirect_allowed() read af_unix
socket sk_state locklessly.

Use READ_ONCE(). Note that for sock_map_redirect_allowed() change affects
not only af_unix, but all non-TCP sockets (UDP, af_vsock).

Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

Upstream branch: 8419dbe
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882
version: 2

Instead of repeating the same (un)locking pattern, reuse
sock_map_sk_{acquire,release}(). This centralizes the code and makes it
easier to adapt sockmap to af_unix-specific locking.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
unix_stream_connect() sets sk_state (`WRITE_ONCE(sk->sk_state,
TCP_ESTABLISHED)`) _before_ it assigns a peer (`unix_peer(sk) = newsk`).
sk_state == TCP_ESTABLISHED makes sock_map_sk_state_allowed() believe that
socket is properly set up, which would include having a defined peer. IOW,
there's a window when unix_stream_bpf_update_proto() can be called on
socket which still has unix_peer(sk) == NULL.

          T0 bpf                            T1 connect
          ------                            ----------

                                WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED)
sock_map_sk_state_allowed(sk)
...
sk_pair = unix_peer(sk)
sock_hold(sk_pair)
                                sock_hold(newsk)
                                smp_mb__after_atomic()
                                unix_peer(sk) = newsk

BUG: kernel NULL pointer dereference, address: 0000000000000080
RIP: 0010:unix_stream_bpf_update_proto+0xa0/0x1b0
Call Trace:
  sock_map_link+0x564/0x8b0
  sock_map_update_common+0x6e/0x340
  sock_map_update_elem_sys+0x17d/0x240
  __sys_bpf+0x26db/0x3250
  __x64_sys_bpf+0x21/0x30
  do_syscall_64+0x6b/0x3a0
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

Initial idea was to move peer assignment _before_ the sk_state update[1],
but that involved an additional memory barrier, and changing the hot path
was rejected. Then a check during proto update was considered[2], but a
follow-up discussion[3] concluded the root cause is sockmap taking a wrong
lock.

Thus, teach sockmap about the af_unix-specific locking: instead of the
usual lock_sock() involving sock::sk_lock, af_unix protects critical
sections under unix_state_lock() operating on unix_sock::lock.

[1]: https://lore.kernel.org/netdev/ba5c50aa-1df4-40c2-ab33-a72022c5a32e@rbox.co/
[2]: https://lore.kernel.org/netdev/20240610174906.32921-1-kuniyu@amazon.com/
[3]: https://lore.kernel.org/netdev/7603c0e6-cd5b-452b-b710-73b64bd9de26@linux.dev/

This patch also happens to fix a deadlock that may occur when
bpf_iter_unix_seq_show()'s lock_sock_fast() takes the fast path and the
iter prog attempts to update a sockmap. Which ends up spinning at
sock_map_update_elem()'s bh_lock_sock():

WARNING: possible recursive locking detected
--------------------------------------------
test_progs/1393 is trying to acquire lock:
ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: sock_map_update_elem+0xdb/0x1f0

but task is already holding lock:
ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: __lock_sock_fast+0x37/0xe0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(slock-AF_UNIX);
  lock(slock-AF_UNIX);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

4 locks held by test_progs/1393:
 #0: ffff88814b59c790 (&p->lock){+.+.}-{4:4}, at: bpf_seq_read+0x59/0x10d0
 #1: ffff88811ec25fd8 (sk_lock-AF_UNIX){+.+.}-{0:0}, at: bpf_seq_read+0x42c/0x10d0
 #2: ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: __lock_sock_fast+0x37/0xe0
 #3: ffffffff85a6a7c0 (rcu_read_lock){....}-{1:3}, at: bpf_iter_run_prog+0x51d/0xb00

Call Trace:
 dump_stack_lvl+0x5d/0x80
 print_deadlock_bug.cold+0xc0/0xce
 __lock_acquire+0x130f/0x2590
 lock_acquire+0x14e/0x2b0
 _raw_spin_lock+0x30/0x40
 sock_map_update_elem+0xdb/0x1f0
 bpf_prog_2d0075e5d9b721cd_dump_unix+0x55/0x4f4
 bpf_iter_run_prog+0x5b9/0xb00
 bpf_iter_unix_seq_show+0x1f7/0x2e0
 bpf_seq_read+0x42c/0x10d0
 vfs_read+0x171/0xb20
 ksys_read+0xff/0x200
 do_syscall_64+0x6b/0x3a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Fixes: c638291 ("af_unix: Implement ->psock_update_sk_prot()")
Fixes: 2c860a4 ("bpf: af_unix: Implement BPF iterator for UNIX domain socket.")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Updating a sockmap from a unix iterator prog may lead to a deadlock.
Piggyback on the original selftest.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
@kernel-patches-daemon-bpf
Copy link
Copy Markdown
Author

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=1051882 expired. Closing PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant