New drag and drop API #4571

Open
eira-fransham wants to merge 89 commits into rust-windowing:master from slint-ui:drag-n-drop

@eira-fransham

@eira-fransham eira-fransham commented May 19, 2026

This PR implements a new API for drag and drop, with a DataTransfer type which abstracts over the various clipboard/drag and drop APIs across different platforms. I built this on top of #2429 in order to ensure @SludgePhD gets credit if it gets merged, although admittedly I ended up removing pretty much all of their work while I was reworking the design.

This is being built in order to help support drag-and-drop work in Slint's winit backend. As part of that work, I did extensive research on how drag-and-drop and clipboard APIs are implemented across different platforms, and wrote a (still WIP) research document that can be found here.

Some platforms (Wayland, X11) always transfer bytes with a MIME type; other platforms have a set of standardised transferable types. However, all types that are supported cross-platform (images, RTF, HTML, plaintext, URIs/URI lists) are cleanly expressible using MIME types on all supported platforms. As part of this PR, I've written up a quick-and-dirty summary of which types are supported on different platforms.
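
To illustrate how the cross-platform set can be named by MIME type, here is a minimal sketch. The `CommonType` enum and its methods are hypothetical and are not the PR's actual `TypeHint` definition; `image/png` stands in for images in general, and the MIME strings are the commonly registered ones.

```rust
/// Hypothetical stand-in for a cross-platform type hint (not the PR's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CommonType {
    PlainText,
    Html,
    Rtf,
    UriList,
    Png,
}

impl CommonType {
    /// MIME type commonly used to advertise this kind of data.
    fn mime(self) -> &'static str {
        match self {
            CommonType::PlainText => "text/plain",
            CommonType::Html => "text/html",
            CommonType::Rtf => "text/rtf",
            CommonType::UriList => "text/uri-list",
            CommonType::Png => "image/png",
        }
    }

    /// Reverse lookup that ignores parameters such as `;charset=utf-8`.
    fn from_mime(mime: &str) -> Option<Self> {
        let essence = mime.split(';').next()?.trim().to_ascii_lowercase();
        match essence.as_str() {
            "text/plain" => Some(CommonType::PlainText),
            "text/html" => Some(CommonType::Html),
            // Both spellings are in use for RTF.
            "text/rtf" | "application/rtf" => Some(CommonType::Rtf),
            "text/uri-list" => Some(CommonType::UriList),
            "image/png" => Some(CommonType::Png),
            _ => None,
        }
    }
}

fn main() {
    let offered = ["text/plain;charset=utf-8", "text/uri-list", "audio/ogg"];
    for mime in offered {
        println!("{mime} -> {:?}", CommonType::from_mime(mime));
    }
    println!("HTML is advertised as {}", CommonType::Html.mime());
}
```

Types with no cross-platform equivalent map to `None` here; in the PR's design those would be reached through the platform-specific type instead.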

Design

The new API is inspired by the browser's DataTransfer API. The main complexity comes from supporting both the common set of capabilities (the types and traits in winit-core/src/data_transfer.rs) while also allowing a consumer to use the platform-specific APIs.

The design may look somewhat complex, and I am open to suggestions for simplifying it, but OS drag-and-drop/clipboard/etc APIs are just fundamentally complex. Unfortunately, there's going to be a lot of complexity here no matter the implementation. This article by a Wayland maintainer describes it as "arguably one of the most complicated parts of the core Wayland protocol", and from researching the design for other platforms it seems like Wayland actually has the simplest API.

In general, the API is designed around the idea that the user should be able to supply/read both cross-platform types and platform-specific types. Not all of the platform-specific types are fully implemented, although many of them are.

Current state

Both receiving and initiating a drag operation are implemented on Windows, Wayland and macOS. X11 supports receiving dropped data, but initiating a drag on X11 is not planned as part of this PR.

Example

The drag-and-drop example has been updated, so both sending and receiving drag operations are now shown.

macOS

Sending and receiving a dragging operation:

2026-06-03.17-28-00.mov

Dragging styled text (via the HTML clipboard type) into the notes app on macOS:

2026-06-03.17-38-15.mov

Wayland

Video_2026-06-04_17-29-06.mp4

Windows

2026-06-04.18-41-50.mp4

AI disclosure

Unsure if this is important to anyone reading this PR, but I thought I should mention just in case since I know that it's a common issue for OSS projects:

AI was NOT used for the development of the core winit API, nor the X11, Wayland, or macOS implementations, nor any other code, documentation, or comments contributed to this branch by me. It was also not used to write the PR description, nor any comments made by me. The Windows implementation was done by a colleague, so I do not know for sure if AI was used. I can ask him if the reviewer would like to know, but I can at least confirm that I've manually reviewed every line of his PR to my branch.

@kchibisov
Member

I haven't read very much into impl. details, but from what I've seen:

  1. I think operations for fetching should be on ActiveEventLoop. You cannot fetch the clipboard on a window without a loop on Wayland at all.
  2. The transfer/MIME stuff should work with the clipboard, because drag and drop and the system clipboard are the same. So once winit supports the clipboard, it should use a similar-ish system.
  3. We should keep in mind that on Wayland, if you get 2-3 MIME types, the data provided for each will be different, so picking what to load should be up to the user. Also, loading data from DnD/clipboard is an async operation, so the data should come back to the user as an event. So, something like ActiveEventLoop::request_data_transfer_data(DataTransferId) -> Result<()>, with a callback carrying data: Vec<u8>, mime: Mime on ApplicationTrait that fires later, once we are done reading the data back. Any blocking operation won't work, unfortunately.

In the current API, we have this URI list preloaded, but I think this is solved, we should only load with async-ish API once user asks, I think, if you want to follow Wayland.

If the API is not lazy, I'm not sure how it should be done, so if you can provide a tl;dr it would help (I haven't found clear wording in your research).

This article by a Wayland maintainer describes it as "arguably one of the most complicated parts of the core Wayland protocol"

Well, because it's a lot of work: you need to negotiate what you want to read (image/text/html/whatever), then both ends should do non-blocking writing/reading to an FD, which implies plugging that FD into an epoll-based event loop or something, and ensuring that you don't hang each other.

And the compositor does very little here actually; it just lets you exchange FDs with the other end, but all this negotiation + writing the right MIME type is on the client.

So for example, when something drags an object/pastes, you have a bunch of MIME types you can use to query data, e.g. image/text/audio. Then you ask for audio, and the other end should provide audio.

Then you want to initiate drag and drop, and you start dragging something, but this something is either text or image. Depending on what the other end picks (e.g. you drag from winit to firefox, and firefox picks one of the two MIME types you gave it), winit must reply with the right data. Note that you don't create buffers for both image and text beforehand; you only create them once you get the event saying what type of data the other end wants. It can also ask you for both, or ask you for something from time to time, as long as you have advertised something.
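
The source side of that negotiation can be sketched as follows. This is a toy model under stated assumptions: `LazySource`, `offer`, `advertised` and `on_request` are hypothetical names, and the FD plumbing is left out entirely. The point is only that MIME types are advertised up front while the bytes for each type are built when (and each time) the other end asks.

```rust
use std::cell::Cell;
use std::collections::BTreeMap;
use std::rc::Rc;

/// Hypothetical drag source: advertises MIME types up front and builds the
/// bytes for a type only when the other end requests it.
struct LazySource {
    producers: BTreeMap<&'static str, Box<dyn Fn() -> Vec<u8>>>,
}

impl LazySource {
    fn new() -> Self {
        Self { producers: BTreeMap::new() }
    }

    /// Register a type together with a closure that produces its data later.
    fn offer(&mut self, mime: &'static str, produce: impl Fn() -> Vec<u8> + 'static) {
        self.producers.insert(mime, Box::new(produce));
    }

    /// The list sent to the other end when the drag starts (sorted by name).
    fn advertised(&self) -> Vec<&'static str> {
        self.producers.keys().copied().collect()
    }

    /// Called whenever the other end asks for one type; it may ask repeatedly,
    /// for several types, or never.
    fn on_request(&self, mime: &str) -> Option<Vec<u8>> {
        self.producers.get(mime).map(|produce| produce())
    }
}

fn main() {
    let images_built = Rc::new(Cell::new(0));
    let counter = Rc::clone(&images_built);

    let mut source = LazySource::new();
    source.offer("text/plain", || b"hello".to_vec());
    source.offer("image/png", move || {
        counter.set(counter.get() + 1);
        vec![0x89, b'P', b'N', b'G'] // placeholder bytes, not a complete PNG
    });

    println!("advertised: {:?}", source.advertised());
    let text = source.on_request("text/plain");
    // The image buffer was never created because nobody asked for it.
    println!("text: {:?}, images built: {}", text, images_built.get());
}
```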

@eira-fransham
Author

eira-fransham commented May 21, 2026

@kchibisov Thanks for your response. I appreciate you clarifying the way that data transfer (i.e. clipboard and drag-and-drop) works, but it might be worth reading through the PR since I have an extensive doc comment explaining the exact concerns that you bring up and how the API addresses them. In particular:

  • The API already handles type multiplexing. That is not Wayland-specific, it's done on all platforms
  • The API is already lazy and asynchronous (in the general sense, not in the async-keyword sense). See the docs for fetch_data_transfer + the DataTransferResult event

The implementation in this PR only addresses drag-and-drop, but it is specifically designed to support clipboard operations in the future. Clipboard operations are planned in a follow-up PR, and only left unimplemented for now for two reasons:

  1. I wanted to focus on the new API rather than adding too much new implementation (see also the note about removing the Wayland implementation in the PR description). This is already going to be a reasonably-large PR and I didn't want to overload the winit team.
  2. The most-pressing concern from the side of Slint is drag-and-drop. Implementing the clipboard via a side-channel is a lot simpler and more self-contained than trying to do so for drag-and-drop.

The type hierarchy is like so:

  • DataTransfer - the offer of multiple typed views of some data (clipboard or drag-and-drop). None of the data is guaranteed to be resolved at this time. Corresponds to wl_data_offer, NSPasteboard, Windows.ApplicationModel.DataTransfer.DataPackage, etc.
  • TransferType + TypeHint - the platform-specific and cross-platform representation of a type, respectively. In most cases, TypeHint is enough. It covers the set of advertisable types which have some equivalent in the data transfer mechanism on every platform.
  • TypedData (could potentially do with a better name) - the data resulting from the resolved request for some specific type of a DataTransfer. Like with DataTransfer and TransferType, this can be interacted with generically or downcast to a platform-specific type.
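
A minimal sketch of the "use generically or downcast" idea, assuming a trait with an `as_any` escape hatch. The trait shape, `WaylandData` and `OpaqueData` are hypothetical illustrations and may not match the PR's real `TypedData` definition.

```rust
use std::any::Any;

/// Hypothetical generic view of resolved data.
trait TypedData {
    /// Cross-platform access: the raw bytes of the resolved type.
    fn bytes(&self) -> &[u8];
    /// Escape hatch for reaching the platform-specific type.
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical platform-specific payload carrying extra information.
struct WaylandData {
    mime: String,
    bytes: Vec<u8>,
}

/// Hypothetical payload from some other backend.
struct OpaqueData(Vec<u8>);

impl TypedData for WaylandData {
    fn bytes(&self) -> &[u8] {
        &self.bytes
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl TypedData for OpaqueData {
    fn bytes(&self) -> &[u8] {
        &self.0
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Works with any backend, but reports more when the downcast succeeds.
fn describe(data: &dyn TypedData) -> String {
    match data.as_any().downcast_ref::<WaylandData>() {
        Some(wayland) => format!("{} bytes of {}", wayland.bytes.len(), wayland.mime),
        None => format!("{} bytes of an unknown platform type", data.bytes().len()),
    }
}

fn main() {
    let wayland = WaylandData { mime: "text/plain".to_string(), bytes: b"hi".to_vec() };
    println!("{}", describe(&wayland));
    println!("{}", describe(&OpaqueData(vec![1, 2, 3])));
}
```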

The API flow as-implemented by this PR is like so:

  • Instead of receiving a parsed value (as in the design on master), the user just gets an ID.
  • The user can use that ID to request the types of the data transfer synchronously, and/or request a certain type of the data of the transfer asynchronously.
  • The user can accept or reject a drag operation by using that same DataTransferId. Even though a DataTransferId could also refer to clipboard data, in this case I figured it was better to just error out if the ID passed belongs to a data transfer that isn't part of a drag operation. Alternatively, a separate DragOperationId could be introduced to avoid the overloading, but I don't see a reason to do so.
  • When the fetch of some type in a data transfer has completed, a dyn TypedData is passed using the DataTransferResult event. The user may read this TypedData using the specified data format, with helpers for plaintext and URI lists since those are special-cased on some platforms and have platform-specific encoding (UTF-8 vs UTF-16) that users should not have to handle manually.
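
The flow above can be mocked in a few lines. Everything here is a hypothetical stand-in rather than winit's API: `TransferId` imitates `DataTransferId`, `Event::Fetched` imitates the `DataTransferResult` event, `MockLoop::fetch` imitates `fetch_data_transfer`, and `decode_text` imitates the plaintext helper that hides the UTF-8 vs UTF-16 difference.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the PR's `DataTransferId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TransferId(u32);

/// Hypothetical events; `Fetched` plays the role of `DataTransferResult`.
#[derive(Debug, PartialEq)]
enum Event {
    DragEntered(TransferId),
    Fetched { id: TransferId, mime: String, bytes: Vec<u8> },
}

/// Mock event loop holding what the remote application offers.
struct MockLoop {
    offered: Vec<(String, Vec<u8>)>,
    queue: VecDeque<Event>,
}

impl MockLoop {
    /// Synchronous: listing the offered types never waits on the other end.
    fn types(&self, _id: TransferId) -> Vec<&str> {
        self.offered.iter().map(|(mime, _)| mime.as_str()).collect()
    }

    /// Asynchronous: only queues the request; the bytes arrive as an event.
    fn fetch(&mut self, id: TransferId, mime: &str) -> Result<(), String> {
        let (_, bytes) = self
            .offered
            .iter()
            .find(|(m, _)| m.as_str() == mime)
            .ok_or_else(|| format!("type {mime} was not offered"))?;
        let event = Event::Fetched { id, mime: mime.to_string(), bytes: bytes.clone() };
        self.queue.push_back(event);
        Ok(())
    }
}

/// Decodes plaintext; `utf16le` models platforms whose native text is UTF-16.
fn decode_text(bytes: &[u8], utf16le: bool) -> Option<String> {
    if !utf16le {
        return String::from_utf8(bytes.to_vec()).ok();
    }
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16(&units).ok()
}

fn main() {
    let mut lp = MockLoop {
        offered: vec![("text/plain".to_string(), b"hello".to_vec())],
        queue: VecDeque::new(),
    };
    lp.queue.push_back(Event::DragEntered(TransferId(1)));

    while let Some(event) = lp.queue.pop_front() {
        match event {
            // Step 1: only an ID arrives; inspect the types, then request one.
            Event::DragEntered(id) => {
                if lp.types(id).contains(&"text/plain") {
                    lp.fetch(id, "text/plain").expect("type was offered");
                }
            }
            // Step 2: the resolved data comes back on a later loop iteration.
            Event::Fetched { bytes, .. } => {
                println!("dropped text: {:?}", decode_text(&bytes, false));
            }
        }
    }
}
```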

Note that we cannot just return a potentially-blocking io::Read impl when fetching a specific type, as at least on X11 the data is transferred as part of the event loop. TypedData must have the invariant that directly reading it on the event loop may stall the application but can never cause a deadlock. That's also why types can be fetched synchronously - all platforms support reading the types offered by another application without stalling the event loop.

Regarding moving fetch_data_transfer (+ the other data transfer related stuff) to the event loop: the event loop feels like the most-natural place for it, but X11 handles selection transfer on a per-window basis rather than per-event-loop, so putting it on the window was the lowest common denominator w.r.t. cross-platform use. I believe that X11 is the only platform that does it this way though, so I wouldn't be against putting it on the event loop instead.

@kchibisov
Member

Regarding moving fetch_data_transfer (+ the other data transfer related stuff) to the event loop: the event loop feels like the most-natural place for it, but X11 handles selection transfer on a per-window basis rather than per-event-loop, so putting it on the window was the lowest common denominator w.r.t. cross-platform use. I believe that X11 is the only platform that does it this way though, so I wouldn't be against putting it on the event loop instead.

But we can pass a WindowId to say which window the callback should be delivered to. The clipboard is the same: it cannot work without a window on Wayland, but the transfer itself is still event loop based. The window could be an implementation detail of a transfer/etc. For X11, people tended to use a hidden empty util window (for the clipboard for sure, but it's a bit slow).

Also, X11 is nearly dead, so there's no point in designing around it.

@eira-fransham
Author

Yeah, as of the latest few commits I’ve moved the whole drag-and-drop API to the event loop. Unfortunately, fetching still needs to be done asynchronously to support X11; there’s not really a way around that without running the risk of deadlocking the event loop. The user can always try to read the data immediately without waiting for the DataTransferResult event, and that’ll work on most platforms.

I’ve also updated the dnd example to show how to use the new API.

Member

@kchibisov kchibisov left a comment

Generally using an event will be fine, I guess. The API should certainly be async; sync won't work on Wayland either.

Comment thread winit/examples/dnd.rs Outdated
This commit also adds deadlock detection to X11,
since on that platform the event loop needs to be
polled in order for the selection to be received.
@eira-fransham eira-fransham marked this pull request as ready for review June 3, 2026 15:25
@eira-fransham eira-fransham requested a review from madsmtm as a code owner June 3, 2026 15:25
@eira-fransham eira-fransham changed the title WIP: New drag and drop API New drag and drop API Jun 3, 2026
@eira-fransham eira-fransham requested a review from kchibisov June 3, 2026 18:05
tronical and others added 20 commits June 4, 2026 11:33
Drop emitted DragDropped before computing the effect, so a target
rejection via set_valid_actions(none()) left the source seeing
DROPEFFECT_NONE while the target app had already seen DragDropped.
Compute the effect first and route DROPEFFECT_NONE to DragLeft.

Also trace effect_out on the source - cross-process drops have no
in-process target callback to observe it through.
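
The ordering fix described in this commit can be sketched roughly as below. The constants mirror the Win32 `DROPEFFECT_*` values, but `TargetEvent`, `on_drop` and the copy-before-move preference are hypothetical and are not taken from the actual change.

```rust
// Local copies of the Win32 DROPEFFECT bit values.
const DROPEFFECT_NONE: u32 = 0;
const DROPEFFECT_COPY: u32 = 1;
const DROPEFFECT_MOVE: u32 = 2;

/// Hypothetical event delivered to the drop target's application.
#[derive(Debug, PartialEq, Eq)]
enum TargetEvent {
    DragDropped,
    DragLeft,
}

/// Compute the effect first, then choose the event, so that source and
/// target always agree on whether a drop happened.
fn on_drop(allowed_by_source: u32, valid_actions: u32) -> (u32, TargetEvent) {
    let common = allowed_by_source & valid_actions;
    // Assumed preference order for this sketch: copy before move.
    let effect = if common & DROPEFFECT_COPY != 0 {
        DROPEFFECT_COPY
    } else if common & DROPEFFECT_MOVE != 0 {
        DROPEFFECT_MOVE
    } else {
        DROPEFFECT_NONE
    };
    let event = if effect == DROPEFFECT_NONE {
        TargetEvent::DragLeft
    } else {
        TargetEvent::DragDropped
    };
    (effect, event)
}

fn main() {
    let source_allows = DROPEFFECT_COPY | DROPEFFECT_MOVE;
    // The target rejected the drop: it is reported as a leave, not a drop.
    println!("{:?}", on_drop(source_allows, DROPEFFECT_NONE));
    println!("{:?}", on_drop(source_allows, DROPEFFECT_MOVE));
}
```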
Standard Arc pattern: Release on decrement publishes, but the dropping
thread needs an Acquire fence on the zero transition to see writes
another thread made through the object. AddRef drops to Relaxed - a
bump on a count you already reference doesn't synchronize anything.
Latent today since our COM objects stay in the STA.
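
For reference, the memory orderings named in this commit look like this on a bare counter. `RefCount` is a hypothetical stand-in for the COM object's reference count, not the code from the branch; it follows the same scheme that `std::sync::Arc` uses.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};

/// Hypothetical reference count using the orderings described above.
struct RefCount(AtomicU32);

impl RefCount {
    fn new() -> Self {
        RefCount(AtomicU32::new(1))
    }

    /// AddRef: the caller already holds a reference, so nothing needs to be
    /// synchronized and Relaxed is enough. Returns the new count.
    fn add_ref(&self) -> u32 {
        self.0.fetch_add(1, Ordering::Relaxed) + 1
    }

    /// Release: the decrement publishes this thread's writes. The thread that
    /// takes the count to zero then needs an Acquire fence so it observes
    /// writes made by other threads before it destroys the object.
    /// Returns the new count; the caller frees the object when it is 0.
    fn release(&self) -> u32 {
        let previous = self.0.fetch_sub(1, Ordering::Release);
        if previous == 1 {
            fence(Ordering::Acquire);
        }
        previous - 1
    }
}

fn main() {
    let count = RefCount::new();
    println!("after add_ref: {}", count.add_ref());
    println!("after release: {}", count.release());
    println!("after final release: {}", count.release());
}
```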
Plumb the icon through IDragSourceHelper::InitializeFromBitmap. The
shell composites in premultiplied alpha, so convert RGBA to
premultiplied BGRA in a top-down DIB or translucent edges get a halo.
Helper failures are non-fatal.
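
The pixel conversion this commit refers to can be sketched as below, assuming tightly packed 8-bit RGBA input with straight (non-premultiplied) alpha. The function name is hypothetical and the rounding choice is an assumption; the real code may round differently.

```rust
/// Converts straight-alpha RGBA pixels to premultiplied BGRA.
/// Trailing bytes that do not form a whole pixel are ignored.
fn rgba_to_premultiplied_bgra(rgba: &[u8]) -> Vec<u8> {
    rgba.chunks_exact(4)
        .flat_map(|px| {
            let alpha = px[3] as u32;
            // Scale a colour channel by alpha / 255, rounding to nearest.
            let premultiply = |channel: u8| ((channel as u32 * alpha + 127) / 255) as u8;
            // Swap red and blue while premultiplying; alpha itself is unchanged.
            [premultiply(px[2]), premultiply(px[1]), premultiply(px[0]), px[3]]
        })
        .collect()
}

fn main() {
    // One half-transparent red pixel followed by one opaque pixel.
    let rgba = [255, 0, 0, 128, 10, 20, 30, 255];
    println!("{:?}", rgba_to_premultiplied_bgra(&rgba));
}
```

Rows are left in their original order here. For a top-down DIB the row order already matches a typical RGBA buffer, which is conventionally signalled with a negative height in the bitmap header.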
The TYMED branch values in `duplicate_stgmedium` were each off by one
bit shift: TYMED_FILE was 1 (= HGLOBAL) instead of 2, GDI was 32
instead of 16, MFPICT 64 instead of 32, ENHMF 128 instead of 64. Latent
because the shell drag helper only uses HGLOBAL, but a HANDLE-bearing
SetData of any other tymed would have either fallen through or aliased
the wrong union arm. Use the named TYMED_* constants from windows-sys
so we can't drift again.
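
To make the off-by-one-shift concrete, here are the `TYMED` values as consecutive bit flags. The constants are local copies so the snippet runs on any platform; as the commit says, real code would import the named constants from windows-sys instead. `tymed_name` is a hypothetical helper, not the branch's `duplicate_stgmedium`.

```rust
// Local copies of the Windows SDK TYMED flags: one bit per storage medium.
const TYMED_HGLOBAL: u32 = 1 << 0; // 1
const TYMED_FILE: u32 = 1 << 1; // 2
const TYMED_ISTREAM: u32 = 1 << 2; // 4
const TYMED_ISTORAGE: u32 = 1 << 3; // 8
const TYMED_GDI: u32 = 1 << 4; // 16
const TYMED_MFPICT: u32 = 1 << 5; // 32
const TYMED_ENHMF: u32 = 1 << 6; // 64

/// Hypothetical helper: names the medium for a single TYMED flag.
/// Matching on named constants avoids hand-written numeric literals.
fn tymed_name(tymed: u32) -> Option<&'static str> {
    match tymed {
        TYMED_HGLOBAL => Some("HGLOBAL"),
        TYMED_FILE => Some("FILE"),
        TYMED_ISTREAM => Some("ISTREAM"),
        TYMED_ISTORAGE => Some("ISTORAGE"),
        TYMED_GDI => Some("GDI"),
        TYMED_MFPICT => Some("MFPICT"),
        TYMED_ENHMF => Some("ENHMF"),
        _ => None,
    }
}

fn main() {
    for tymed in [1u32, 2, 16, 32, 64, 128] {
        println!("{tymed:>3} -> {:?}", tymed_name(tymed));
    }
}
```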
win32: add support for starting drags
4 participants