WebGL2/wasm/Android: GLES buffer, texture, and fence fixes; GLES/Vulkan + egui robustness#353
WebGL2/wasm/Android: GLES buffer, texture, and fence fixes; GLES/Vulkan + egui robustness#353ohsalmeron wants to merge 8 commits into
Conversation
Multiple fixes and improvements across graphics backends and egui integration: - blade-egui: Replace raw vertex storage with a proper vertex input layout (pos/uv/color), add egui vertex VertexLayout and VertexFetchState, bind vertex buffers, add textures_to_free queue and sync() method, and move texture free logic into after_submit. Update WGSL shader to use vec2 inputs and simplify gamma conversion with mix. - GLES: PassEncoder now carries target_size; scissor rect Y is inverted to match GL window coordinates. Use integer enums for integer texture formats, set texture wrap to CLAMP_TO_EDGE by default. Pipeline now binds attribute locations (including an e_ prefixed name) and adds robust uniform-block lookup with fallbacks and logging of available blocks. EGL platform selection improved with error handling and fallback to default when surfaceless fails. - Vulkan: Make debug utils optional and only enable the extension when available; guard debug utils calls, conditionally create device debug utils, and log adapter inspection rejections. Ensure debug object naming is invoked only when extension is present. - Shader processing: Only replace module types when bindings were actually modified and ensure attributes have bindings assigned or assert appropriately. - Utility: Add BufferBelt::sync to flush active buffers to the GPU. These changes improve robustness, platform compatibility, and correctness of vertex/texture handling across backends.
Add WebAssembly-specific handling and performance fixes for GLES backend: - Introduce a separate raw_index buffer for ELEMENT_ARRAY_BUFFER on wasm and create/bind/upload/delete it alongside ARRAY_BUFFER to satisfy WebGL requirements. - Use raw_index when binding index buffers on wasm builds. - Avoid setting UNPACK_ROW_LENGTH when row length equals width (prevents Firefox defensive copy); restore UNPACK_ROW_LENGTH to 0 after operations. - On texture unit binding for wasm, explicitly unbind other texture targets when binding a target to avoid stale bindings. - Adjust client_wait_sync timeout handling for WebGL2: use 0 on wasm (blocking forbidden) and clamp non-wasm timeouts to MAX_TIMEOUT. - Improve shader texture binding: track assigned texture slots and set uniform once (avoid repeatedly querying uniform slot values). - Set canvas element size from config in web backend (lookup element with id "blade" and set width/height). These changes improve WebGL compatibility, avoid browser-specific performance pitfalls, and ensure correct buffer/texture behavior on wasm targets.
kvark
left a comment
There was a problem hiding this comment.
Looks to be very well thought, thank you for filing!
| target, | ||
| } => { | ||
| gl.active_texture(glow::TEXTURE0 + slot); | ||
| #[cfg(target_arch = "wasm32")] |
There was a problem hiding this comment.
can we turn this into a normal if condition, e.g. if cfg!(target_arch = "wasm32") {...}
| ptr::null_mut(), | ||
| &display_attributes, | ||
| ) | ||
| .unwrap(), |
There was a problem hiding this comment.
this was actually intentional: we don't expect it to fail, and we do want to see the error, not just skip over it
| ); | ||
| match d { | ||
| Ok(display) => (display, None), | ||
| Err(e) => { |
| #[cfg(target_arch = "wasm32")] | ||
| let timeout_ns_i32 = 0i32; | ||
| #[cfg(not(target_arch = "wasm32"))] | ||
| let timeout_ns_i32 = { |
There was a problem hiding this comment.
let's prefer run-time checks over compile time: more compilation code coverage
| Tf::R32Uint => (glow::R32UI, glow::RED, glow::UNSIGNED_INT), | ||
| Tf::Rg32Uint => (glow::RG32UI, glow::RG, glow::UNSIGNED_INT), | ||
| Tf::Rgba32Uint => (glow::RGBA32UI, glow::RGBA, glow::UNSIGNED_INT), | ||
| Tf::R32Uint => (glow::R32UI, glow::RED_INTEGER, glow::UNSIGNED_INT), |
|
|
||
| let location = attributes.len() as u32; | ||
| gl.bind_attrib_location(program, location, attrib_name); | ||
| gl.bind_attrib_location(program, location, &format!("e_{}", attrib_name)); |
There was a problem hiding this comment.
what's happening here? why are we binding it twice?
| gl.get_uniform_i32(program, location, &mut slots); | ||
| targets.push(slots[0] as u32); | ||
| let slot = *texture_slots | ||
| .entry(glsl_name.to_string()) |
There was a problem hiding this comment.
when would this happen? is this about using the same texture in multiple stages?
| let mut index_opt = gl.get_uniform_block_index(program, glsl_name); | ||
| if index_opt.is_none() { | ||
| if let Some(ref type_name) = sf.shader.module.types[var.ty].name | ||
| { |
| } | ||
| } | ||
| if index_opt.is_none() { | ||
| let type_name = format!("type_{}", var.ty.index()); |
| pub struct Buffer { | ||
| raw: glow::Buffer, | ||
| #[cfg(target_arch = "wasm32")] | ||
| raw_index: glow::Buffer, |
There was a problem hiding this comment.
This is really unfortunate. It makes all of the buffers on WebGL less efficient, just because there is one quirk in the spec.
What do you think of a counter-proposal: we just assert that the user isn't hitting this path on webgl but keep the implementation clean? that would lift the burden to the user.
Remove redundant unsafe blocks and clean up Metal API usage across the Metal backend. Changes touch command.rs, mod.rs, pipeline.rs and surface.rs: create_* and new() calls no longer wrapped in unsafe, autoreleasepool closures simplified, render layer configuration unwrapped, and vertex/attribute descriptor handling reformatted. The MTLCounterSampleBufferDescriptor sample count is still set inside an unsafe block where required. These edits improve safety/readability without changing runtime behavior.
Introduce sync_buffer_range to the ResourceDevice trait and add platform-specific implementations for GLES, Metal and Vulkan to support partial buffer updates. Handle Naga GLSL quirks in GLES pipeline by binding attributes with the 'e_' prefix and checking multiple names when looking up uniform block indices. Simplify EGL surfaceless display selection (use get_platform_display().unwrap()). Improve Vulkan diagnostics and robustness: add trace logging to CommandEncoder::start(), lower descriptor pool creation log to debug, log on pool destruction, and guard reset_descriptor_pool to recreate the base pool if empty before resetting.
|
Hi kvark, thanks for the review! Let's keep the buffer implementation clean, and I want to make sure we do this the right way. I've addressed all the minor comments: Regarding the WebGL2 index buffer constraint: I agree we shouldn't add overhead or complexity to the native Vulkan/Metal paths. My challenge is that because WebGL2 strictly forbids sharing the same buffer for ARRAY and ELEMENT_ARRAY, enforcing this on the user would force me to heavily restructure my engine's entire pipeline just to support WebGL/Firefox. Could I get a bit of time to dig deeper into the internals here? I want to explore if we can design a cleaner abstraction that satisfies the restrictive WebGL spec without punishing native buffer efficiency at all. I'm fully committed to getting this right for the upstream! |
Render Egui output as sRGB directly (swapchain is configured as Srgb/Unorm) by removing the linear_from_gamma conversion in the WGSL shader. Replace several compile-time #[cfg(target_arch = "wasm32")] branches with runtime cfg!(target_arch = "wasm32") checks in the GLES backend so the same codepath can be used across targets (ensure timeout_ns_i32 == 0 on wasm/WebGL2). Also make MAX_TIMEOUT unconditional and apply a minor formatting cleanup in pipeline.rs.
Under WebGL2, once a buffer is bound as a non-ELEMENT_ARRAY_BUFFER target, it cannot be bound as ELEMENT_ARRAY_BUFFER later. To resolve this cleanly without driver-level duplication hacks: 1. I added a `name` field to `BufferBeltDescriptor` to name the chunk allocations. 2. I refactored the GLES driver's `create_buffer` and buffer sync calls to check `desc.name` and automatically bind to either `ELEMENT_ARRAY_BUFFER` or `ARRAY_BUFFER` accordingly. 3. Named our separated belts in `blade-egui` as `"vertex"` and `"index"`. Split the temporary BufferBelt in blade-egui into separate vertex_belt and index_belt, updating creation, allocation, destroy, trim, sync and flush calls to use the appropriate belt for vertices, indices and image uploads. In blade-graphics (GLES backend) remove the wasm-specific dual-buffer handling (raw_index) and associated cfg branches: Buffer and BufferPart now contain a single raw buffer, DrawIndexedIndirect carries a BufferPart for the index buffer, and command/resource code binds/updates/deletes the unified buffer. This simplifies cross-platform buffer management and removes redundant ELEMENT_ARRAY_BUFFER handling for wasm.
| crate::Memory::External(_) => unimplemented!(), | ||
| }; | ||
|
|
||
| let target = if desc.name.contains("index") { |
There was a problem hiding this comment.
The name check is awkward here. All names should be non-semantical.
But really, do we have to check for this at all? WebGL forbids binding the buffer to 2 different bind points simultaneously. This code here is just for data initialization. We can unbind the buffer freely at the end of this function, and it's not bound anywhere else, so should be totally valid to always use ARRAY_BUFFER like the original code did.
Basically, I wonder if everything will be correct if we just not introduce the name at all. All we should need is that split between the index belt and the vertex belt.
| // assigning the storage. | ||
| gl.tex_parameter_i32(target, glow::TEXTURE_MIN_FILTER, glow::NEAREST as i32); | ||
| gl.tex_parameter_i32(target, glow::TEXTURE_MAG_FILTER, glow::NEAREST as i32); | ||
| gl.tex_parameter_i32(target, glow::TEXTURE_WRAP_S, glow::CLAMP_TO_EDGE as i32); |
| .unwrap() | ||
| .dyn_into::<web_sys::HtmlCanvasElement>() | ||
| .unwrap(); | ||
| canvas.set_width(config.size.width); |
There was a problem hiding this comment.
This looks weird to me. It would be analogous if on native we'd try to resize the window here. The canvas is providing us with a surface, but we should not own it here.
| } | ||
| } | ||
|
|
||
| fn sync_buffer_range(&self, buffer: super::Buffer, offset: u64, size: u64) { |
There was a problem hiding this comment.
should we just change the signature of the original sync_buffer method? It's not relevant to other backends anyway, and the user can get the size of the buffer by doing buffer.size() shall they need to sync all of it.
| crate::ShaderBinding::Plain { size } => { | ||
| if let Some(index) = gl.get_uniform_block_index(program, glsl_name) | ||
| { | ||
| // Naga may emit the uniform block name as the variable name, |
There was a problem hiding this comment.
this looks suss to me. Naga should always expose to us explicitly what the generated names are, given that the client may need to know them. We shouldn't ever guess.
|
|
||
| let location = attributes.len() as u32; | ||
| gl.bind_attrib_location(program, location, attrib_name); | ||
| // Naga's GLSL backend prefixes entry point arguments with 'e_' |
There was a problem hiding this comment.
Similarly suspicious. Naga's "e_" may be an implementation detail, not a part of the contract.
| }; | ||
| timeout_ns.min(MAX_TIMEOUT) as i32 | ||
| }; | ||
| //TODO: https://github.com/grovesNL/glow/issues/287 |
There was a problem hiding this comment.
this comment is still useful, shouldn't remove it
| 0i32 | ||
| } else { | ||
| timeout_ms as u64 * 1_000_000 | ||
| let timeout_ns = if timeout_ms == !0 { |
There was a problem hiding this comment.
is this a no-op today? we are still doing min(MAX_TIMEOUT) after that, so 1_000_000 * u32::MAX is behaving the same as !0
| //Note: we always assume rendering to linear color space, | ||
| // but Egui wants to blend in gamma space, see | ||
| // https://github.com/emilk/egui/pull/2071 | ||
| //Note: we output sRGB directly because our swapchain is configured as Srgb |
There was a problem hiding this comment.
This looks wrong to me, and it's changing behavior for everybody, not just WebGL.
The user decides how its fragment colors are going to be presented here -
blade/blade-graphics/src/lib.rs
Lines 1356 to 1362 in 0d69342
The color space is Linear by default in Blade - this is a convention, by which the user is expected to produce linear colors from their shaders.
Add R8Uint and R16Uint to TextureFormat enum with mappings for Vulkan, Metal, and GLES backends GuiPainter::after_submit now takes &Context and trims vertex/index belts to cap memory growth Run cargo fmt
Problem
Running Blade on WebGL2 / wasm hit spec and browser constraints: the same
WebGLBuffercannot be bound as bothARRAY_BUFFERandELEMENT_ARRAY_BUFFER;clientWaitSynccannot block on the main thread; texture unit / unpack state could produce wrong sampling or extra copies (e.g. Firefox). On GLES, legacy drivers needed safer uniform-block lookup, correct scissor/wrap/format handling, and explicit buffer sync. blade-egui needed a proper vertex layout/fetch path and deferred texture destruction after submit. Vulkan debug utils were assumed present.What changed
client_wait_synctimeout0on wasm (non-wasm clamped toMAX_CLIENT_WAIT_TIMEOUT_WEBGL), texture target unbind hygiene,UNPACK_ROW_LENGTHonly when needed (restored to 0), shader texture slots tracked and set once.CLAMP_TO_EDGE, attribute location binding (incl.e_prefix), uniform-block reflection fallbacks + logging,BufferBelt::sync, EGL platform fallback when surfaceless init fails.sync()after submit.VK_EXT_debug_utilsonly when the extension exists; guard naming calls.Testing
wasm32-unknown-unknown— Chrome / Firefox WebGL2Note
The web backend sets the canvas pixel size from config (element id
"blade"in our tree). Happy to make that id/config upstream-friendly if you want a different default.