[wasm-split] Remove unnecessary global exports #8832
Globals and tables can have initializers that contain other globals. Currently we just scan them as uses. For example, if global $g is used in both the primary and the secondary and its initializer is `(global.get $h)`, $h is also marked as "used" in both modules.

But we move a module item to a secondary module only when that item is used exclusively by that module. So if a global is used in both the primary and the secondary, it stays in the primary and is exported to the secondary.

In the current code, because $g is marked as used in both modules, its initializer is walked in both modules, so $h is also marked as used in both. Because $g does not move to the secondary and is only imported there, the secondary doesn't need $h. But because $h is marked as "used", the secondary module imports it unnecessarily. The multi-split case is similar.

The same applies to table initializers. The difference between the two is that a global initializer can refer to another global, so we need a worklist to compute the transitive closure.

This PR fixes it by figuring out who the "owner" is for each global and table, and marking an initializer dependency as "used" in a secondary module only when that module is the sole user. Otherwise it is marked as "used" in the primary.

This does not meaningfully change computation time, and it reduces the primary module size by around 0.3% for new acx_gallery and essentials and by around 1% for old acx_gallery.
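The ownership rule described above can be sketched as a small fixpoint computation. This is a simplified illustration, not the code in this PR: the `Global` struct, the `ModuleId` numbering (0 for the primary, 1 and up for secondaries), and the `computeOwners` function are all hypothetical names, and only globals (not tables) are modeled.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: module 0 is the primary, 1..N are secondaries.
using ModuleId = int;
constexpr ModuleId Primary = 0;

// Hypothetical stand-in for a wasm global: only the names read by
// global.get inside its initializer are recorded.
struct Global {
  std::vector<std::string> initDeps;
};

// directUsers maps each global to the modules whose code references it
// directly. The result maps every reachable global to its owner: the sole
// secondary user if there is exactly one user, and the primary otherwise.
std::map<std::string, ModuleId>
computeOwners(const std::map<std::string, Global>& globals,
              const std::map<std::string, std::set<ModuleId>>& directUsers) {
  // users = direct users plus the owners of globals whose initializers
  // refer to this global. Sets only grow, so the loop below terminates.
  std::map<std::string, std::set<ModuleId>> users = directUsers;
  auto ownerOf = [&](const std::string& name) -> ModuleId {
    const std::set<ModuleId>& u = users.at(name);
    return u.size() == 1 ? *u.begin() : Primary;
  };

  std::queue<std::string> worklist;
  for (const auto& entry : directUsers) {
    worklist.push(entry.first);
  }
  while (!worklist.empty()) {
    std::string name = worklist.front();
    worklist.pop();
    auto it = globals.find(name);
    if (it == globals.end()) {
      continue; // Not a defined global in this model, so no initializer.
    }
    // The initializer is charged to the owner only, not to every user.
    ModuleId owner = ownerOf(name);
    for (const std::string& dep : it->second.initDeps) {
      if (users[dep].insert(owner).second) {
        // dep gained a user, so its owner may have changed; revisit it.
        worklist.push(dep);
      }
    }
  }

  std::map<std::string, ModuleId> owners;
  for (const auto& entry : users) {
    owners[entry.first] = ownerOf(entry.first);
  }
  return owners;
}
```

In this model, if $g is used directly by both the primary and secondary 1 and its initializer reads $h, both $g and $h are owned by the primary, and secondary 1 would only need to import $g. If $g is used only by secondary 1, both globals are owned by that secondary.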
// Scan table initializers into their owning modules. If a table is used by a
// single secondary module, its initializer dependencies belong to that
// secondary module. Otherwise, they belong to the primary module.
Does this handle the case where the same global is used both in a moved table initializer and in some other location that prevents it from being moved? It looks like the code might handle this, but the comment suggests it does not.
It would be good to add tests for this kind of case if we don't have them already.
It looks like I can't create that test, though:
binaryen/src/wasm/wasm-validator.cpp
Lines 5240 to 5247 in 8b4abbe
But yes, because UsedNames::globals is managed separately, a global that is used in two different places will be pinned to the primary module. I'll rephrase the comment: b490f2a
Oh right, I guess only imported globals can be referenced from table initializers.
Co-authored-by: Thomas Lively <tlively123@gmail.com>
if (UsedNames* owner = getOwner(name, &UsedNames::globals)) {
  for (auto* get : FindAll<GlobalGet>(global->init).list) {
    if (owner->globals.insert(get->name).second) {
      worklist.push(get->name);
    }
  }
}
It looks like a name may end up in the primaryUsed map as well as a secondaryUsed map. This can happen if the visitation order is such that we think a secondary module owns the name until later we discover that there is another use so the primary module should own the name. Is this a problem? Should we ensure that each name only ends up in at most one map?