This repository was archived by the owner on Sep 27, 2022. It is now read-only.

Handle inline transclusion differently in plaintext extraction #41

@appledora

Description

In GitLab by @geohci on Aug 30, 2022, 24:21

Example: for the en:Cabbage article, the second paragraph of the plaintext, when transclusions are skipped, is:

A cabbage generally weighs between .

This is because the HTML is actually:

<p id="mwHg">A cabbage generally weighs between <span about="#mwt15" typeof="mw:Transclusion" data-mw='{"parts":[{"template":{"target":{"wt":"convert","href":"./Template:Convert"},"params":{"1":{"wt":"500"},"2":{"wt":"to"},"3":{"wt":"1000"},"4":{"wt":"g"},"5":{"wt":"lbs"},"sigfig":{"wt":"1"}},"i":0}}]}' id="mwHw">500 to 1,000 grams (1 to 2</span><span typeof="mw:Entity" about="#mwt15"> </span><span about="#mwt15">lb)</span>.

and the wikitext is:

A cabbage generally weighs between {{convert|500|to|1000|g|lbs|sigfig=1}}.

Maybe we can have an option that only excludes transclusion when it happens inside certain types of elements instead of being the parent element?
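
One possible reading of this suggestion, sketched below with only the Python standard library: keep the text of a transclusion when its element is inline (such as the `span` elements inside the paragraph above) and skip it only when the transcluded element is itself block-level. This is not the library's implementation. The names `InlineAwareExtractor`, `extract_plaintext`, and the `BLOCK_TAGS` set are hypothetical, and the sketch assumes well-formed HTML in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Assumption: which tags count as "block-level" is a hypothetical choice.
BLOCK_TAGS = {"table", "div", "ul", "ol", "dl", "figure", "section", "blockquote"}
# Void elements never get an end tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "link", "meta", "input", "wbr"}


class InlineAwareExtractor(HTMLParser):
    """Collect text, skipping transclusions only when they are block-level."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.transclusion_ids = set()  # `about` ids seen on mw:Transclusion nodes
        self.skip_depth = 0            # > 0 while inside a skipped block element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth += 1  # nested element inside a skipped block
            return
        attr_map = dict(attrs)
        about = attr_map.get("about")
        typeof = attr_map.get("typeof") or ""
        if about and "mw:Transclusion" in typeof.split():
            self.transclusion_ids.add(about)
        # Sibling nodes of one transclusion share its `about` id.
        if about in self.transclusion_ids and tag in BLOCK_TAGS:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def extract_plaintext(html):
    """Return the plaintext of `html`, keeping inline transclusions."""
    parser = InlineAwareExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)
```

With this approach, the Cabbage paragraph above would keep the converted measurement in its plaintext, while a transcluded `table` (an infobox, for example) would still be dropped.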
