Implement Unix domain sockets (AF_UNIX) #207

@sysheap

Description

AF_UNIX sockets are required by X11, D-Bus, Docker, systemd, and many language runtimes. They provide efficient IPC without going through the network stack.

Address family and types

  • AF_UNIX with SOCK_STREAM (connection-oriented, reliable) and SOCK_DGRAM (connectionless).
  • Socket addresses: struct sockaddr_un with sun_path — either a filesystem path (VFS node) or an abstract name (starts with \0, no VFS entry).
  • socketpair(AF_UNIX, SOCK_STREAM, 0, fds) — creates two already-connected sockets without going through the filesystem.
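The three address forms above can be told apart from the sun_path bytes alone. The following is a minimal sketch, not the kernel's actual code: `UnixAddr` and `parse_sun_path` are hypothetical names, and the function assumes the caller has already cut `sun_path` down to the length implied by `addrlen`.

```rust
/// How a `sockaddr_un` names a socket. All names here are illustrative.
#[derive(Debug, PartialEq, Eq)]
pub enum UnixAddr {
    /// No name at all (for example, sockets created by `socketpair`).
    Unnamed,
    /// Filesystem path; `bind` creates a VFS node at this path.
    Path(Vec<u8>),
    /// Abstract name (the bytes after the leading NUL); no VFS entry.
    Abstract(Vec<u8>),
}

/// Classify the `sun_path` bytes of a `sockaddr_un`.
/// Assumption: `sun_path` is the slice following `sun_family`, already
/// truncated to the length derived from the caller's `addrlen`.
pub fn parse_sun_path(sun_path: &[u8]) -> UnixAddr {
    match sun_path.split_first() {
        None => UnixAddr::Unnamed,
        // Leading NUL: abstract namespace. Every remaining byte is
        // significant, including embedded NULs.
        Some((&0, rest)) => UnixAddr::Abstract(rest.to_vec()),
        // Otherwise a path, terminated by the first NUL if there is one.
        Some(_) => {
            let end = sun_path
                .iter()
                .position(|&b| b == 0)
                .unwrap_or(sun_path.len());
            UnixAddr::Path(sun_path[..end].to_vec())
        }
    }
}
```

Keeping the abstract name as raw bytes rather than a string matters because abstract names are not NUL-terminated and may contain arbitrary bytes.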

Key syscalls (no new syscall numbers needed — existing dispatch handles them)

Syscall | AF_UNIX behavior
--- | ---
socket(AF_UNIX, ...) | Allocate an unbound Unix socket fd
bind(fd, addr, len) | For filesystem sockets: create a socket file in the VFS at sun_path; for abstract: register in an in-kernel abstract socket table
connect(fd, addr, len) | Find the listening socket at the address, enqueue connection request
accept(fd, ...) | Dequeue a pending connection, return new connected fd
send / recv / read / write | Transfer data through the in-kernel buffer
sendmsg / recvmsg | Required for ancillary data (SCM_RIGHTS, SCM_CREDENTIALS)
socketpair | Allocate two connected Unix stream sockets

In-kernel data structures

The two ends of a connected Unix stream socket share two byte buffers, one per direction. Each buffer is a VecDeque<u8> protected by a Spinlock, the same pattern as PipeInner in kernel/src/io/pipe.rs. Wakers are needed so that read can block until data arrives.
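A rough sketch of the two-buffer layout, written against the standard library so it can run outside the kernel. Everything here is an assumption for illustration: `std::sync::Mutex` stands in for the kernel's Spinlock, `StreamEnd`, `ReadResult` and `stream_pair` are hypothetical names, and `WouldBlock` marks the point where the real implementation would register a waker and sleep. Buffer capacity limits and EPIPE on write are left out.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// One direction of the stream: written by one end, read by the other.
#[derive(Default)]
struct ByteQueue {
    data: VecDeque<u8>,
    writer_closed: bool,
}

/// State shared by both ends: one queue per direction.
#[derive(Default)]
struct Shared {
    a_to_b: ByteQueue,
    b_to_a: ByteQueue,
}

#[derive(Clone, Copy)]
enum Side {
    A,
    B,
}

pub struct StreamEnd {
    shared: Arc<Mutex<Shared>>, // Mutex stands in for the kernel Spinlock
    side: Side,
}

#[derive(Debug, PartialEq)]
pub enum ReadResult {
    Data(usize),
    WouldBlock, // a real kernel would register a waker here and sleep
    Eof,
}

pub fn stream_pair() -> (StreamEnd, StreamEnd) {
    let shared = Arc::new(Mutex::new(Shared::default()));
    (
        StreamEnd { shared: Arc::clone(&shared), side: Side::A },
        StreamEnd { shared, side: Side::B },
    )
}

impl StreamEnd {
    /// Append to this end's outgoing queue (unbounded in this sketch).
    pub fn write(&self, buf: &[u8]) -> usize {
        let mut guard = self.shared.lock().unwrap();
        let queue = match self.side {
            Side::A => &mut guard.a_to_b,
            Side::B => &mut guard.b_to_a,
        };
        queue.data.extend(buf.iter().copied());
        buf.len()
    }

    /// Drain up to `buf.len()` bytes from this end's incoming queue.
    pub fn read(&self, buf: &mut [u8]) -> ReadResult {
        let mut guard = self.shared.lock().unwrap();
        let queue = match self.side {
            Side::A => &mut guard.b_to_a,
            Side::B => &mut guard.a_to_b,
        };
        if queue.data.is_empty() {
            return if queue.writer_closed {
                ReadResult::Eof
            } else {
                ReadResult::WouldBlock
            };
        }
        let n = buf.len().min(queue.data.len());
        for (dst, src) in buf.iter_mut().zip(queue.data.drain(..n)) {
            *dst = src;
        }
        ReadResult::Data(n)
    }
}

impl Drop for StreamEnd {
    /// Closing an end marks its outgoing queue so the peer sees EOF.
    fn drop(&mut self) {
        let mut guard = self.shared.lock().unwrap();
        match self.side {
            Side::A => guard.a_to_b.writer_closed = true,
            Side::B => guard.b_to_a.writer_closed = true,
        }
    }
}
```

Putting both queues behind one lock keeps the sketch short; one lock per direction would let the two directions proceed without contending.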

A Unix stream listener holds a queue of pending connections (each is a pair of connected socket buffers); connect enqueues, accept dequeues.
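The enqueue/dequeue relationship between connect and accept might look like the sketch below. It is only an illustration: `Connection`, `Listener` and `ConnectError` are hypothetical names, `Mutex` again stands in for the Spinlock, and the `backlog` limit is an assumption borrowed from the usual `listen` semantics rather than something this issue specifies. A blocking accept would sleep on a waker where this sketch returns `None`.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Placeholder for the shared state of one connected stream:
/// a byte buffer per direction.
#[derive(Default)]
pub struct Connection {
    pub client_to_server: Mutex<VecDeque<u8>>,
    pub server_to_client: Mutex<VecDeque<u8>>,
}

pub struct Listener {
    pending: Mutex<VecDeque<Arc<Connection>>>,
    backlog: usize, // assumed limit, as passed to listen()
}

#[derive(Debug, PartialEq)]
pub enum ConnectError {
    QueueFull,
}

impl Listener {
    pub fn new(backlog: usize) -> Self {
        Listener { pending: Mutex::new(VecDeque::new()), backlog }
    }

    /// connect(): build the connection state, enqueue one handle for the
    /// server, and hand the other handle back to the client.
    pub fn connect(&self) -> Result<Arc<Connection>, ConnectError> {
        let mut pending = self.pending.lock().unwrap();
        if pending.len() >= self.backlog {
            return Err(ConnectError::QueueFull);
        }
        let conn = Arc::new(Connection::default());
        pending.push_back(Arc::clone(&conn));
        Ok(conn)
    }

    /// accept(): dequeue the oldest pending connection, or `None` if
    /// nothing is waiting (where a blocking accept would sleep).
    pub fn accept(&self) -> Option<Arc<Connection>> {
        self.pending.lock().unwrap().pop_front()
    }
}
```

Because the client and the accepted socket hold the same `Arc<Connection>`, data the client writes before the server calls accept is already queued when the server starts reading.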

Abstract namespace: a global BTreeMap<Vec<u8>, Weak<UnixListener>> keyed by the abstract name, similar to the TCP port map in kernel/src/net/sockets.rs.
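One way the abstract table could behave is sketched here. The `Weak` values mean a name frees itself once the last strong reference to its listener is dropped. `AbstractNamespace`, `BindError`, and the placeholder `UnixListener` struct are assumptions made for the example, and `Mutex` replaces the kernel Spinlock.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex, Weak};

/// Placeholder for the real listener type.
pub struct UnixListener {
    pub name: Vec<u8>,
}

#[derive(Default)]
pub struct AbstractNamespace {
    table: Mutex<BTreeMap<Vec<u8>, Weak<UnixListener>>>,
}

#[derive(Debug, PartialEq)]
pub enum BindError {
    AddrInUse,
}

impl AbstractNamespace {
    /// bind(): claim `name` unless a live listener already owns it.
    pub fn bind(&self, name: &[u8]) -> Result<Arc<UnixListener>, BindError> {
        let mut table = self.table.lock().unwrap();
        if let Some(existing) = table.get(name) {
            if existing.upgrade().is_some() {
                return Err(BindError::AddrInUse);
            }
        }
        let listener = Arc::new(UnixListener { name: name.to_vec() });
        // Overwrites a stale entry whose listener has already been dropped.
        table.insert(name.to_vec(), Arc::downgrade(&listener));
        Ok(listener)
    }

    /// connect() lookup: return the live listener, pruning dead entries.
    pub fn lookup(&self, name: &[u8]) -> Option<Arc<UnixListener>> {
        let mut table = self.table.lock().unwrap();
        let upgraded = table.get(name).map(Weak::upgrade);
        match upgraded {
            Some(Some(listener)) => Some(listener),
            Some(None) => {
                table.remove(name);
                None
            }
            None => None,
        }
    }
}
```

With this shape no explicit unbind step is needed for abstract names: closing the listening fd drops the `Arc`, and the next bind or lookup treats the entry as free.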

New FileDescriptor variants:

  • UnboundUnixSocket — after socket(AF_UNIX, ...), before bind/connect
  • UnixListener(Arc<Spinlock<UnixListenerInner>>) — after bind + listen
  • UnixStream(Arc<Spinlock<UnixStreamInner>>) — connected stream socket

SCM_RIGHTS (file descriptor passing)

sendmsg with a cmsghdr where cmsg_level = SOL_SOCKET and cmsg_type = SCM_RIGHTS transfers an array of file descriptors from sender to receiver. On the kernel side:

  1. Sender: extract the fd numbers from the cmsg payload, look up the FileDescriptor objects, clone them, and attach to the message.
  2. Receiver: recvmsg installs the cloned FileDescriptor objects into the receiver's fd table and returns the new fd numbers in a cmsg.
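The two steps above can be sketched with a toy fd table. None of these types are the kernel's: `OpenFile` stands in for FileDescriptor, `FdTable` for the per-process table, and `Message` for whatever the stream buffer queues. The lowest-free-slot allocation policy is the conventional one and is assumed here.

```rust
use std::sync::Arc;

/// Stand-in for the kernel's open-file object.
#[derive(Debug)]
pub struct OpenFile {
    pub name: &'static str,
}

/// Toy per-process fd table: the index is the fd number.
#[derive(Default)]
pub struct FdTable {
    slots: Vec<Option<Arc<OpenFile>>>,
}

impl FdTable {
    /// Install into the lowest free slot and return the fd number.
    pub fn install(&mut self, file: Arc<OpenFile>) -> i32 {
        let idx = match self.slots.iter().position(Option::is_none) {
            Some(free) => free,
            None => {
                self.slots.push(None);
                self.slots.len() - 1
            }
        };
        self.slots[idx] = Some(file);
        idx as i32
    }

    pub fn get(&self, fd: i32) -> Option<Arc<OpenFile>> {
        if fd < 0 {
            return None;
        }
        self.slots.get(fd as usize)?.clone()
    }
}

/// A queued message: payload bytes plus the files in flight.
pub struct Message {
    pub data: Vec<u8>,
    pub rights: Vec<Arc<OpenFile>>,
}

#[derive(Debug, PartialEq)]
pub enum SendError {
    BadFd,
}

/// Step 1 (sender): resolve fd numbers to objects and clone them.
/// The whole send fails if any fd is invalid.
pub fn send_with_rights(
    sender: &FdTable,
    data: &[u8],
    fds: &[i32],
) -> Result<Message, SendError> {
    let rights = fds
        .iter()
        .map(|&fd| sender.get(fd).ok_or(SendError::BadFd))
        .collect::<Result<Vec<_>, _>>()?;
    Ok(Message { data: data.to_vec(), rights })
}

/// Step 2 (receiver): install the objects and report the new fd numbers,
/// which generally differ from the sender's.
pub fn recv_with_rights(receiver: &mut FdTable, msg: Message) -> (Vec<u8>, Vec<i32>) {
    let new_fds: Vec<i32> = msg
        .rights
        .into_iter()
        .map(|file| receiver.install(file))
        .collect();
    (msg.data, new_fds)
}
```

Holding `Arc` clones inside the queued message keeps the files open while they are in flight, even if the sender closes its own fds before the receiver calls recvmsg.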

SCM_CREDENTIALS

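cmsg_type = SCM_CREDENTIALS passes a struct ucred { pid, uid, gid }. The kernel does not trust the values the sender supplies: it checks them against the sending process's actual credentials, so an unprivileged sender cannot claim another identity (on Linux, only a sender with the relevant capability may supply a different pid, uid, or gid). Implement as a read from the sending process's credential fields.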
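A minimal sketch of one possible credential policy, under stated assumptions: `Process`, its single `privileged` flag, and `credentials_to_deliver` are hypothetical, and rejecting a mismatched claim with a permission error follows Linux's behaviour rather than anything this issue mandates. A real implementation would check per-field capabilities instead of one flag.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ucred {
    pub pid: i32,
    pub uid: u32,
    pub gid: u32,
}

/// Stand-in for the sending process's credential fields.
pub struct Process {
    pub cred: Ucred,
    /// Assumption: one flag instead of per-field capability checks.
    pub privileged: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CredError {
    PermissionDenied,
}

/// Decide which credentials the receiver will see.
/// `claimed` is what the sender put in its SCM_CREDENTIALS cmsg, if any.
pub fn credentials_to_deliver(
    sender: &Process,
    claimed: Option<Ucred>,
) -> Result<Ucred, CredError> {
    match claimed {
        // No cmsg supplied: read the real credentials from the process.
        None => Ok(sender.cred),
        // A truthful claim is accepted as-is.
        Some(c) if c == sender.cred => Ok(c),
        // Only a privileged sender may present different values.
        Some(c) if sender.privileged => Ok(c),
        Some(_) => Err(CredError::PermissionDenied),
    }
}
```

The simplest variant ignores `claimed` entirely and always returns `sender.cred`; the validation branch matters only if privileged senders need to present other identities.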

Implementation location

  • kernel/src/net/unix.rs (new file) — UnixListenerInner, UnixStreamInner, abstract namespace table
  • kernel/src/syscalls/net_ops.rs — extend do_socket, do_bind, do_connect, do_accept for AF_UNIX; add do_sendmsg, do_recvmsg
  • kernel/src/processes/fd_table.rs — new variants, read/write impls

Acceptance criteria

  1. Two processes communicate over a filesystem Unix socket (/tmp/test.sock): one binds/listens/accepts, the other connects; both send and receive a message.
  2. socketpair works: parent and child communicate over the pair after fork.
  3. SCM_RIGHTS: parent passes an open file fd to child via sendmsg/recvmsg; child reads the file through the received fd.

System tests in system-tests/src/tests/ should cover all three scenarios.
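For reference, the user-visible behaviour that scenarios 1 and 2 expect can be reproduced on an existing Unix host with Rust's standard library. This is a host-side sketch, not a test for this kernel's harness: it uses threads instead of separate processes to stay within std, and the function names are made up. Scenario 3 is omitted because stable std has no SCM_RIGHTS API.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::Path;
use std::thread;

/// Scenario 1: a server binds, listens and accepts on `path`; a client
/// connects; each side sends one message and receives the other's.
/// Returns (bytes the server received, bytes the client received).
pub fn filesystem_socket_round_trip(path: &Path) -> io::Result<(Vec<u8>, Vec<u8>)> {
    // A stale socket file from an earlier run would make bind fail.
    let _ = std::fs::remove_file(path);
    let listener = UnixListener::bind(path)?; // socket + bind + listen

    let server = thread::spawn(move || -> io::Result<Vec<u8>> {
        let (mut conn, _peer) = listener.accept()?;
        let mut buf = [0u8; 4];
        conn.read_exact(&mut buf)?;
        conn.write_all(b"pong")?;
        Ok(buf.to_vec())
    });

    let mut client = UnixStream::connect(path)?;
    client.write_all(b"ping")?;
    let mut reply = [0u8; 4];
    client.read_exact(&mut reply)?;

    let received = server.join().expect("server thread panicked")?;
    std::fs::remove_file(path)?; // the socket file outlives the listener
    Ok((received, reply.to_vec()))
}

/// Scenario 2: `UnixStream::pair` creates two already-connected sockets,
/// as `socketpair(AF_UNIX, SOCK_STREAM, 0, fds)` does.
pub fn socketpair_round_trip() -> io::Result<Vec<u8>> {
    let (mut a, mut b) = UnixStream::pair()?;
    a.write_all(b"hello")?;
    let mut buf = [0u8; 5];
    b.read_exact(&mut buf)?;
    Ok(buf.to_vec())
}
```

The explicit `remove_file` calls reflect a detail the kernel implementation shares: binding a filesystem socket creates a VFS node that persists after the socket is closed, so a second bind to the same path fails until the file is unlinked.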
