WebGL2/wasm/Android: GLES buffer, texture, and fence fixes; GLES/Vulkan + egui robustness by ohsalmeron · Pull Request #353 · kvark/blade

ohsalmeron · 2026-05-12T03:18:07Z

Problem

Running Blade on WebGL2 / wasm hit spec and browser constraints: the same WebGLBuffer cannot be bound as both ARRAY_BUFFER and ELEMENT_ARRAY_BUFFER; clientWaitSync cannot block on the main thread; texture unit / unpack state could produce wrong sampling or extra copies (e.g. Firefox). On GLES, legacy drivers needed safer uniform-block lookup, correct scissor/wrap/format handling, and explicit buffer sync. blade-egui needed a proper vertex layout/fetch path and deferred texture destruction after submit. Vulkan debug utils were assumed present.

What changed

wasm / WebGL2: separate index buffer path, client_wait_sync timeout 0 on wasm (non-wasm clamped to MAX_CLIENT_WAIT_TIMEOUT_WEBGL), texture target unbind hygiene, UNPACK_ROW_LENGTH only when needed (restored to 0), shader texture slots tracked and set once.
GLES: scissor Y flip for GL coordinates, integer internal formats, default CLAMP_TO_EDGE, attribute location binding (incl. e_ prefix), uniform-block reflection fallbacks + logging, BufferBelt::sync, EGL platform fallback when surfaceless init fails.
blade-egui: vertex input layout + fetch, WGSL vertex inputs / gamma tweak, textures-to-free queue + sync() after submit.
Vulkan: enable and use VK_EXT_debug_utils only when the extension exists; guard naming calls.

Testing

wasm32-unknown-unknown — Chrome / Firefox WebGL2
Android GLES (e.g. older Mali) if you have a device
[X ] Native Vulkan smoke (adapter without debug utils)

Note

The web backend sets the canvas pixel size from config (element id "blade" in our tree). Happy to make that id/config upstream-friendly if you want a different default.

Multiple fixes and improvements across graphics backends and egui integration: - blade-egui: Replace raw vertex storage with a proper vertex input layout (pos/uv/color), add egui vertex VertexLayout and VertexFetchState, bind vertex buffers, add textures_to_free queue and sync() method, and move texture free logic into after_submit. Update WGSL shader to use vec2 inputs and simplify gamma conversion with mix. - GLES: PassEncoder now carries target_size; scissor rect Y is inverted to match GL window coordinates. Use integer enums for integer texture formats, set texture wrap to CLAMP_TO_EDGE by default. Pipeline now binds attribute locations (including an e_ prefixed name) and adds robust uniform-block lookup with fallbacks and logging of available blocks. EGL platform selection improved with error handling and fallback to default when surfaceless fails. - Vulkan: Make debug utils optional and only enable the extension when available; guard debug utils calls, conditionally create device debug utils, and log adapter inspection rejections. Ensure debug object naming is invoked only when extension is present. - Shader processing: Only replace module types when bindings were actually modified and ensure attributes have bindings assigned or assert appropriately. - Utility: Add BufferBelt::sync to flush active buffers to the GPU. These changes improve robustness, platform compatibility, and correctness of vertex/texture handling across backends.

Add WebAssembly-specific handling and performance fixes for GLES backend: - Introduce a separate raw_index buffer for ELEMENT_ARRAY_BUFFER on wasm and create/bind/upload/delete it alongside ARRAY_BUFFER to satisfy WebGL requirements. - Use raw_index when binding index buffers on wasm builds. - Avoid setting UNPACK_ROW_LENGTH when row length equals width (prevents Firefox defensive copy); restore UNPACK_ROW_LENGTH to 0 after operations. - On texture unit binding for wasm, explicitly unbind other texture targets when binding a target to avoid stale bindings. - Adjust client_wait_sync timeout handling for WebGL2: use 0 on wasm (blocking forbidden) and clamp non-wasm timeouts to MAX_TIMEOUT. - Improve shader texture binding: track assigned texture slots and set uniform once (avoid repeatedly querying uniform slot values). - Set canvas element size from config in web backend (lookup element with id "blade" and set width/height). These changes improve WebGL compatibility, avoid browser-specific performance pitfalls, and ensure correct buffer/texture behavior on wasm targets.

kvark

Looks to be very well thought, thank you for filing!

kvark · 2026-05-14T04:50:04Z

                    target,
                } => {
                    gl.active_texture(glow::TEXTURE0 + slot);
+                    #[cfg(target_arch = "wasm32")]


can we turn this into a normal if condition, e.g. if cfg!(target_arch = "wasm32") {...}

kvark · 2026-05-14T04:51:05Z

-                                ptr::null_mut(),
-                                &display_attributes,
-                            )
-                            .unwrap(),


this was actually intentional: we don't expect it to fail, and we do want to see the error, not just skip over it

kvark · 2026-05-14T04:51:42Z

+                    );
+                    match d {
+                        Ok(display) => (display, None),
+                        Err(e) => {


why would this fail?

kvark · 2026-05-14T04:52:36Z

+        #[cfg(target_arch = "wasm32")]
+        let timeout_ns_i32 = 0i32;
+        #[cfg(not(target_arch = "wasm32"))]
+        let timeout_ns_i32 = {


let's prefer run-time checks over compile time: more compilation code coverage

kvark · 2026-05-14T04:53:14Z

-        Tf::R32Uint => (glow::R32UI, glow::RED, glow::UNSIGNED_INT),
-        Tf::Rg32Uint => (glow::RG32UI, glow::RG, glow::UNSIGNED_INT),
-        Tf::Rgba32Uint => (glow::RGBA32UI, glow::RGBA, glow::UNSIGNED_INT),
+        Tf::R32Uint => (glow::R32UI, glow::RED_INTEGER, glow::UNSIGNED_INT),


nice catch!

kvark · 2026-05-14T04:54:11Z

+
+                    let location = attributes.len() as u32;
+                    gl.bind_attrib_location(program, location, attrib_name);
+                    gl.bind_attrib_location(program, location, &format!("e_{}", attrib_name));


what's happening here? why are we binding it twice?

kvark · 2026-05-14T04:55:22Z

-                                    gl.get_uniform_i32(program, location, &mut slots);
-                                    targets.push(slots[0] as u32);
+                                    let slot = *texture_slots
+                                        .entry(glsl_name.to_string())


when would this happen? is this about using the same texture in multiple stages?

kvark · 2026-05-14T04:55:55Z

+                                let mut index_opt = gl.get_uniform_block_index(program, glsl_name);
+                                if index_opt.is_none() {
+                                    if let Some(ref type_name) = sf.shader.module.types[var.ty].name
+                                    {


nit: not formatted?

kvark · 2026-05-14T04:56:19Z

+                                    }
+                                }
+                                if index_opt.is_none() {
+                                    let type_name = format!("type_{}", var.ty.index());


why so many paths to bind?

kvark · 2026-05-14T05:02:59Z

 pub struct Buffer {
    raw: glow::Buffer,
+    #[cfg(target_arch = "wasm32")]
+    raw_index: glow::Buffer,


This is really unfortunate. It makes all of the buffers on WebGL less efficient, just because there is one quirk in the spec.
What do you think of a counter-proposal: we just assert that the user isn't hitting this path on webgl but keep the implementation clean? that would lift the burden to the user.

Remove redundant unsafe blocks and clean up Metal API usage across the Metal backend. Changes touch command.rs, mod.rs, pipeline.rs and surface.rs: create_* and new() calls no longer wrapped in unsafe, autoreleasepool closures simplified, render layer configuration unwrapped, and vertex/attribute descriptor handling reformatted. The MTLCounterSampleBufferDescriptor sample count is still set inside an unsafe block where required. These edits improve safety/readability without changing runtime behavior.

Introduce sync_buffer_range to the ResourceDevice trait and add platform-specific implementations for GLES, Metal and Vulkan to support partial buffer updates. Handle Naga GLSL quirks in GLES pipeline by binding attributes with the 'e_' prefix and checking multiple names when looking up uniform block indices. Simplify EGL surfaceless display selection (use get_platform_display().unwrap()). Improve Vulkan diagnostics and robustness: add trace logging to CommandEncoder::start(), lower descriptor pool creation log to debug, log on pool destruction, and guard reset_descriptor_pool to recreate the base pool if empty before resetting.

ohsalmeron · 2026-05-20T22:01:07Z

Hi kvark, thanks for the review!

Let's keep the buffer implementation clean, and I want to make sure we do this the right way.

I've addressed all the minor comments:
-Switched compile-time #[cfg] blocks to run-time cfg! for WASM to ensure full compilation coverage.
-Updated the EGL fallback to use .unwrap() cleanly as intended.
-Fixed formatting nits. (Note on the e_ prefix and uniform block variants: Naga GLSL transpilation sometimes adds these depending on the config. Binding both is a defensive check to ensure robust matching across different browsers/drivers).

Regarding the WebGL2 index buffer constraint: I agree we shouldn't add overhead or complexity to the native Vulkan/Metal paths. My challenge is that because WebGL2 strictly forbids sharing the same buffer for ARRAY and ELEMENT_ARRAY, enforcing this on the user would force me to heavily restructure my engine's entire pipeline just to support WebGL/Firefox.

Could I get a bit of time to dig deeper into the internals here? I want to explore if we can design a cleaner abstraction that satisfies the restrictive WebGL spec without punishing native buffer efficiency at all.

I'm fully committed to getting this right for the upstream!

Render Egui output as sRGB directly (swapchain is configured as Srgb/Unorm) by removing the linear_from_gamma conversion in the WGSL shader. Replace several compile-time #[cfg(target_arch = "wasm32")] branches with runtime cfg!(target_arch = "wasm32") checks in the GLES backend so the same codepath can be used across targets (ensure timeout_ns_i32 == 0 on wasm/WebGL2). Also make MAX_TIMEOUT unconditional and apply a minor formatting cleanup in pipeline.rs.

Under WebGL2, once a buffer is bound as a non-ELEMENT_ARRAY_BUFFER target, it cannot be bound as ELEMENT_ARRAY_BUFFER later. To resolve this cleanly without driver-level duplication hacks: 1. I added a `name` field to `BufferBeltDescriptor` to name the chunk allocations. 2. I refactored the GLES driver's `create_buffer` and buffer sync calls to check `desc.name` and automatically bind to either `ELEMENT_ARRAY_BUFFER` or `ARRAY_BUFFER` accordingly. 3. Named our separated belts in `blade-egui` as `"vertex"` and `"index"`. Split the temporary BufferBelt in blade-egui into separate vertex_belt and index_belt, updating creation, allocation, destroy, trim, sync and flush calls to use the appropriate belt for vertices, indices and image uploads. In blade-graphics (GLES backend) remove the wasm-specific dual-buffer handling (raw_index) and associated cfg branches: Buffer and BufferPart now contain a single raw buffer, DrawIndexedIndirect carries a BufferPart for the index buffer, and command/resource code binds/updates/deletes the unified buffer. This simplifies cross-platform buffer management and removes redundant ELEMENT_ARRAY_BUFFER handling for wasm.

kvark · 2026-05-21T06:31:12Z

            crate::Memory::External(_) => unimplemented!(),
        };

+        let target = if desc.name.contains("index") {


The name check is awkward here. All names should be non-semantical.
But really, do we have to check for this at all? WebGL forbids binding the buffer to 2 different bind points simultaneously. This code here is just for data initialization. We can unbind the buffer freely at the end of this function, and it's not bound anywhere else, so should be totally valid to always use ARRAY_BUFFER like the original code did.
Basically, I wonder if everything will be correct if we just not introduce the name at all. All we should need is that split between the index belt and the vertex belt.

kvark · 2026-05-21T06:32:37Z

                // assigning the storage.
                gl.tex_parameter_i32(target, glow::TEXTURE_MIN_FILTER, glow::NEAREST as i32);
                gl.tex_parameter_i32(target, glow::TEXTURE_MAG_FILTER, glow::NEAREST as i32);
+                gl.tex_parameter_i32(target, glow::TEXTURE_WRAP_S, glow::CLAMP_TO_EDGE as i32);


Is this not defined by default?

kvark · 2026-05-21T06:34:08Z

+            .unwrap()
+            .dyn_into::<web_sys::HtmlCanvasElement>()
+            .unwrap();
+        canvas.set_width(config.size.width);


This looks weird to me. It would be analogous if on native we'd try to resize the window here. The canvas is providing us with a surface, but we should not own it here.

kvark · 2026-05-21T06:37:58Z

+        }
+    }
+
+    fn sync_buffer_range(&self, buffer: super::Buffer, offset: u64, size: u64) {


should we just change the signature of the original sync_buffer method? It's not relevant to other backends anyway, and the user can get the size of the buffer by doing buffer.size() shall they need to sync all of it.

kvark · 2026-05-21T06:39:13Z

                            crate::ShaderBinding::Plain { size } => {
-                                if let Some(index) = gl.get_uniform_block_index(program, glsl_name)
-                                {
+                                // Naga may emit the uniform block name as the variable name,


this looks suss to me. Naga should always expose to us explicitly what the generated names are, given that the client may need to know them. We shouldn't ever guess.

kvark · 2026-05-21T06:39:57Z

+
+                    let location = attributes.len() as u32;
+                    gl.bind_attrib_location(program, location, attrib_name);
+                    // Naga's GLSL backend prefixes entry point arguments with 'e_'


Similarly suspicious. Naga's "e_" may be an implementation detail, not a part of the contract.

kvark · 2026-05-21T06:41:04Z

+            };
+            timeout_ns.min(MAX_TIMEOUT) as i32
        };
-        //TODO: https://github.com/grovesNL/glow/issues/287


this comment is still useful, shouldn't remove it

kvark · 2026-05-21T06:42:14Z

+            0i32
        } else {
-            timeout_ms as u64 * 1_000_000
+            let timeout_ns = if timeout_ms == !0 {


is this a no-op today? we are still doing min(MAX_TIMEOUT) after that, so 1_000_000 * u32::MAX is behaving the same as !0

kvark · 2026-05-21T06:50:17Z

-    //Note: we always assume rendering to linear color space,
-    // but Egui wants to blend in gamma space, see
-    // https://github.com/emilk/egui/pull/2071
+    //Note: we output sRGB directly because our swapchain is configured as Srgb 


This looks wrong to me, and it's changing behavior for everybody, not just WebGL.
The user decides how its fragment colors are going to be presented here -

blade/blade-graphics/src/lib.rs

Lines 1356 to 1362 in 0d69342

/// The color space that render output colors are expected to be in.

///

/// This will affect the surface format returned by the `Context`.

///

/// For example, if the display expects sRGB space and we render

/// in `ColorSpace::Linear` space, the returned format will be sRGB.

pub color_space: ColorSpace,

The color space is Linear by default in Blade - this is a convention, by which the user is expected to produce linear colors from their shaders.

Add R8Uint and R16Uint to TextureFormat enum with mappings for Vulkan, Metal, and GLES backends GuiPainter::after_submit now takes &Context and trims vertex/index belts to cap memory growth Run cargo fmt

ohsalmeron added 2 commits May 11, 2026 14:59

kvark reviewed May 14, 2026

View reviewed changes

ohsalmeron added 2 commits May 14, 2026 16:36

ohsalmeron added 2 commits May 20, 2026 22:02

kvark reviewed May 21, 2026

View reviewed changes

ohsalmeron added 2 commits May 24, 2026 20:30

Add R8/R16Uint texture formats, fix GUI belt memory trim

376a349

Add R8Uint and R16Uint to TextureFormat enum with mappings for Vulkan, Metal, and GLES backends GuiPainter::after_submit now takes &Context and trims vertex/index belts to cap memory growth Run cargo fmt

min chunk size increased

1b20c3f

	/// The color space that render output colors are expected to be in.
	///
	/// This will affect the surface format returned by the `Context`.
	///
	/// For example, if the display expects sRGB space and we render
	/// in `ColorSpace::Linear` space, the returned format will be sRGB.
	pub color_space: ColorSpace,

Conversation

ohsalmeron commented May 12, 2026

Problem

What changed

Testing

Note

Uh oh!

kvark left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ohsalmeron commented May 20, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants