PR into master from dev/olga/Add-typst-fotmat-for-math#403

Draft
OlgaRedozubova wants to merge 215 commits into master from dev/olga/Add-typst-fotmat-for-math

@OlgaRedozubova
Contributor

branch: dev/olga/Add-typst-fotmat-for-math

Related issue: 18398

OlgaRedozubova and others added 30 commits January 29, 2026 19:04
…rue, so MathML data is not generated even though the renderer expects it.
…ects mjx-container inside full-width math blocks, to fix the centering issue for numbered equations without affecting other layouts.
…e(2) was too loose. Strengthened to length > 10 plus a regex check that the speech contains fraction, over, or divided — validating the SRE output is semantically meaningful
Implements a MathML AST visitor that emits native Typst math syntax,
following the same pattern as SerializedAsciiVisitor. Handles all 19
MathML node types (mi, mo, mn, mtext, mfrac, msup, msub, msubsup,
msqrt, mroot, mover, munder, munderover, mspace, mtable, mrow, mtr,
mpadded, menclose) with 180+ Unicode-to-Typst symbol mappings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix spacing between adjacent Typst tokens (needSpaceBefore/needSpaceAfter
  handle single-char identifiers, numbers, and dotted symbol paths)
- Add inter-node spacing in visitInferredMrowNode and mrow handler
- Collapse double spaces in toTypstML output
- Fix lr() for \left..\right: skip delimiter mo children, handle invisible
  delimiters without lr() to avoid malformed Typst syntax
- Fix \binom: detect TeXAtom OPEN/CLOSE wrapping zero-linethickness mfrac
- Fix \oint: add mstyle handler that skips operator-internal spacing while
  preserving user spacing (\, \quad)
- Fix \underline: flip overline→underline in munder context
- Fix \cancel direction: updiagonalstrike→cancel(), downdiagonalstrike→cancel(inverted)
- Fix \sin/\cos/\log: skip upright() wrapping for built-in Typst math operators
- Map Unicode minus \u2212 to ASCII - for natural Typst output
- Add mtext handler for single known symbols (e.g. ∮→integral.cont)
- Move test data to tests/_data/_typst/data.js (119 test cases)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
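
The token-spacing fix in the commit above can be pictured with a small sketch. This is not the PR's needSpaceBefore/needSpaceAfter code: `joinTypstTokens` is a hypothetical helper, and it assumes a space is needed only where two alphanumeric characters would otherwise touch, since Typst math reads adjacent letters such as `ab` as one multi-letter name.

```typescript
// Hypothetical helper: joins already-serialized Typst tokens.
// Assumption: a separating space is required only when the previous output
// ends in a letter/digit and the next token starts with one.
function joinTypstTokens(tokens: readonly string[]): string {
  let out = "";
  for (const tok of tokens) {
    if (tok === "") continue; // skip empty fragments
    const prevChar = out.slice(-1);
    const needsSpace =
      out !== "" && /[A-Za-z0-9]/.test(prevChar) && /^[A-Za-z0-9]/.test(tok);
    out += needsSpace ? ` ${tok}` : tok;
  }
  // Collapse runs of spaces, as the commit describes for toTypstML output.
  return out.replace(/ {2,}/g, " ");
}
```

With this rule a dotted symbol path followed by an identifier, for example `integral.cont` then `x`, stays two separate tokens, while `x`, `+`, `2` is emitted without extra spaces.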
…bols, color, boxed, cases alignment

- Add mphantom handler (drops invisible content, no Typst equivalent)
- Strip equation numbers from mlabeledtr rows in align/gather environments
- Fix \choose producing frac instead of binom (linethickness number vs string)
- Add 13 negated relation symbols (equiv.not, lt.eq.not, sim.not, etc.)
- Fix \log_2 spacing by adding msub/msup to needSpaceAfter parent checks
- Handle \textbf in mtext via mathvariant font wrapping
- Handle \color via #text(fill: ...) wrapper in mstyle, filter _inherit_ sentinel
- Handle \boxed via #box(stroke: ...) in menclose box notation
- Fix \bcancel to use #true instead of true for Typst syntax
- Fix cases environment to use & for cell alignment within rows
- Add 8 new test cases (133 total), all passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…spacing

- Add lr(size: #<em>) for \big, \Big, \bigg, \Bigg delimiters by detecting
  sized TeXAtom OPEN/CLOSE pairs in visitInferredMrowNode
- Use accent(content, symbol) for non-shorthand accents like \overleftarrow
  instead of invalid arrow.l(content) syntax
- Add -tex-mathit → italic() font mapping for \mathit
- Fix \int\limits spacing by adding munderover/munder/mover to
  needSpaceAfter parent exclusion list
- Add 7 new test cases (140 total)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…etection

- Add quoted strings (") to spacing separator pattern so identifiers get
  space before adjacent text like x"and" → x "and"
- Fix mstyle operator-internal spacing detection: check for TeXAtom ancestor
  instead of texClass OP, so user \, spacing is preserved in expressions
  containing integrals while \oint internal spacing is still skipped
- Add 2 new test cases (142 total)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eil, dif

- Fix \operatorname detection: remove !mathvariant check since MathJax provides
  defaults; rely on texClass=OP to distinguish from \mathrm
- \operatorname{name} → op("name"), \operatorname*{name} → op("name", limits: #true)
- \mathbb{R} → RR (doubled-letter shorthand for single uppercase letters)
- \mathrm{d} → dif (differential operator optimization)
- \mathbf{v} → upright(bold(v)) (LaTeX \mathbf is upright bold)
- \left|x\right| → norm(x), \left\lfloor → floor(), \left\lceil → ceil()
- \aleph → alef, non-breaking space → space.nobreak

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
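
As a rough illustration of the \mathbb rule listed above, the sketch below doubles a single uppercase letter. The function name `mathbbToTypst` is hypothetical, and the `bb(...)` fallback for any other content is an assumption about a reasonable default, not something this commit message states.

```typescript
// Hypothetical helper illustrating the doubled-letter shorthand.
function mathbbToTypst(content: string): string {
  // Single uppercase ASCII letter: R -> RR, N -> NN, ...
  if (/^[A-Z]$/.test(content)) {
    return content + content;
  }
  // Assumed fallback: wrap anything else in Typst's bb() variant function.
  return `bb(${content})`;
}
```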
OlgaRedozubova and others added 30 commits March 2, 2026 18:22
- Add getChildText() helper using TextNode.getText() API
- Rewrite getNodeText() to use TextNode.getText() instead of .text
- Replace 4 (node.childNodes[0] as any)?.text in isThousandSepComma
- Replace as-any in getNodeTypstSymbol, needsSpaceAfter, matchBraceAnnotation
- Replace .properties[] with .getProperty() in table-handlers (4 places)
- Remove unnecessary (node as any) cast in index.ts visitor

Zero as-any casts remain in the module.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace ~30 instances of `node: any` and untyped function parameters
across all serialized-typst modules with proper MathJax types:
- TreeNode (base Node) for utilities needing only kind/childNodes
- MmlNode for handlers needing texClass/attributes/isInferred
- ITypstSerializer for serialize parameters
- HandlerFn for handler factory return types
Replace direct .properties access with getProperty()/setProperty() API.
Replace .text access on TextNode with getChildText() helper.
Use `as MmlNode` casts only where children need MmlNode-specific features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…lNode casts

Extend MmlNode interface with `childNodes: MmlNode[]` and `parent: MmlNode`
to match AbstractMmlNode runtime behavior. This narrows the inherited Node[]
type to MmlNode[], eliminating ~15 `as MmlNode` casts across all modules.

All nodes in a MathJax MathML tree are MmlNode instances — the extension
simply makes the type system reflect reality. TreeNode is no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
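
The narrowing described in this commit relies on an interface being allowed to redeclare an inherited property with a more specific type. A minimal sketch, using hypothetical stand-in interfaces rather than the real MathJax declarations:

```typescript
// Hypothetical stand-in for the library's base node type.
interface BaseNodeSketch {
  kind: string;
  childNodes: BaseNodeSketch[];
}

// Hypothetical stand-in for MmlNode: childNodes is redeclared with the
// narrower element type, so children expose texClass without a cast.
interface MmlNodeSketch extends BaseNodeSketch {
  texClass: number;
  childNodes: MmlNodeSketch[];
}

// Walks the tree depth-first; note there is no `as MmlNodeSketch` anywhere.
function collectTexClasses(node: MmlNodeSketch): number[] {
  const result = [node.texClass];
  for (const child of node.childNodes) {
    result.push(...collectTexClasses(child));
  }
  return result;
}
```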
… MmlNode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…emove section comments

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace 22 handler factories with direct HandlerFn constants
  (contextual typing preserves full type safety)
- Move handleAll to common.ts, remove handlerApi indirection
- handlers.ts: 1208 → 1135 lines

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…in mtext

- buildLimitBase: use placeholder '""' inside limits() wrapper instead
  of early return, so \underset{a}{} correctly produces limits("")_a
- mtext: remove backslash escaping that doubled backslashes in text
  content (e.g. "x \\geq 0" → "x \geq 0")

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace raw '\uXXXX' escapes with readable named constants (FUNC_APPLY,
MINUS_SIGN, LEFT_FLOOR, INTEGRAL_SIGN, etc.) across handlers.ts,
bracket-utils.ts, and index.ts for improved code readability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…terals

- Add HandlerKind type to types.ts: Record<HandlerKind, HandlerFn> catches
  typos in handler keys and ensures all 22 handlers are registered
- Fix matchBraceAnnotation to use extracted kind/base variables instead of
  raw regex groups m[1]/m[2]
- Replace ~30 string concatenations with template literals for readability

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
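
The benefit of typing the handler table as `Record<HandlerKind, HandlerFn>` can be seen in a reduced sketch. The three-member union and the handler bodies below are illustrative assumptions; the PR registers 22 handlers.

```typescript
// Reduced, hypothetical versions of the PR's HandlerKind / HandlerFn types.
type SketchHandlerKind = "mi" | "mn" | "mo";
type SketchHandlerFn = (text: string) => string;

// Omitting a key or misspelling one (e.g. "mm") is a compile-time error,
// because Record<K, V> requires exactly the keys in K.
const sketchHandlers: Record<SketchHandlerKind, SketchHandlerFn> = {
  mi: (text) => text,
  mn: (text) => text,
  // Maps Unicode minus to ASCII "-", as an earlier commit describes.
  mo: (text) => (text === "\u2212" ? "-" : text),
};

function runSketchHandler(kind: SketchHandlerKind, text: string): string {
  return sketchHandlers[kind](text);
}
```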
…cess

Add 7 attribute interfaces (FontAttrs, FracAttrs, MoAttrs, SpaceAttrs,
PaddedAttrs, EncloseAttrs, StyleAttrs) to types.ts. Replace
Record<string, any> with a single `as T` cast inside getAttrs, giving
all 11 call sites full type checking and autocomplete.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…l-handlers

Decompose the 1137-line handlers.ts into focused submodules:
- token-handlers.ts: mi, mo, mn, mtext, mspace (~250 lines)
- script-handlers.ts: mfrac, msup, msub, msubsup, msqrt, mroot, mover, munder, munderover, mmultiscripts (~350 lines)
- structural-handlers.ts: mrow, mpadded, mphantom, menclose, mstyle (~280 lines)
- handlers.ts: dispatch-only (~30 lines)

Also move getAttrs to common.ts and SHALLOW_TREE_MAX_DEPTH to consts.ts
as shared utilities.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace all 14 raw getProperty() calls with getProp<T>(node, key)
across 4 files (structural-handlers, token-handlers, bracket-utils,
table-handlers). Accepts nullable nodes for convenience, returns
T | undefined. Mirrors the getAttrs<T> pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
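
A typed property accessor of the shape described above might look like the following sketch. The `PropertySource` interface is a hypothetical stand-in for a node exposing `getProperty()`; the real `getProp<T>` in the PR may differ in detail.

```typescript
// Hypothetical minimal view of a node that exposes getProperty().
interface PropertySource {
  getProperty(key: string): unknown;
}

// Accepts nullable nodes for convenience and returns T | undefined.
// The single `as T` cast is confined to this helper.
function getProp<T>(
  node: PropertySource | null | undefined,
  key: string
): T | undefined {
  if (node == null) return undefined;
  const value = node.getProperty(key);
  return value === undefined ? undefined : (value as T);
}
```

Call sites then read `getProp<boolean>(node, "someFlag")` (the key here is only an example) and handle `undefined` explicitly instead of casting.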
…elpers

Convert ~25 string concatenations to template literals for readability.
Extract reusable helpers: buildFigureTag, buildAutoTagWithLabel,
labelSuffix, joinRows, AUTO_TAG_ENTRY. Decompose long template literals
into intermediate variables for clarity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ions

Extract 5 independent pattern-detection blocks from the 220-line while loop
into focused functions (tryBigDelimiterPattern, tryBareDelimiterPattern,
tryIdotsintPattern, tryThousandSepPattern, isTaggedEqnArray) with a shared
PatternResult interface. Add serializeRange helper. Replace string
concatenations with template literals. Use getAttrs<T> instead of raw
getAllAttributes().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…literals in script-handlers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… in escape-utils

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…typed catch

- Replace explicit `any` with `unknown`/`MathNode` in index.ts
- Replace 5 unsafe `as string` casts with `String()` in table-handlers.ts
- Add `isHandlerKind` type guard in handlers.ts
- Type all 15 catch clauses with `: unknown` across 6 files
- Add Readonly/ReadonlySet/ReadonlyMap to 23 lookup tables across 8 files
- Fix boolean coercion in isFirstChild/isLastChild, let→const, remove unnecessary ?.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- mmultiscripts: use tr:/br: (top-right/bottom-right) instead of t:/b:
  (top/bottom) for post-scripts in attach() — fixes subscript/superscript
  placement when prescripts are present
- mo handler: add munder/mover to inScript check to prevent unwanted
  spacing around operators inside these contexts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bugs:
- Remove duplicate U+22A5 (⊥) mapping: keep "perp" in relations, drop "bot" from misc
- Fix displaystyle truthy check: use === true instead of truthy coercion
- Remove duplicate entries for prec/succ/product.co in typst-symbol-map

Quality:
- buildMatrix: form augmentStr without trailing ", " instead of fragile slice(0, -2)
- hasScriptAncestor: add ANCESTOR_MAX_DEPTH limit for consistency
- needsParens: simplify to s.length > 1
- visitTeXAtomNode: replace regex match with trim()
- childNodeMml: remove unused _space and _nl parameters

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ndling strategy

- Prevent parameter mutation in buildTaggedEqnArray/buildUntaggedEqnArray by
  cloning the rows array before modification
- Replace single depth counter in scanExpression with per-bracket-type counters
  (parenDepth, bracketDepth, braceDepth) to avoid cross-type mismatches
- Add JSDoc to handle() documenting the two-tier error handling strategy

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
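
To show why per-bracket-type counters matter, here is a simplified sketch. `hasTopLevelComma` is a hypothetical function, not the PR's scanExpression, and it ignores quoted strings. With a single shared counter, the `]` in `(a], b` would cancel the `(` and the comma would wrongly look top-level.

```typescript
// Hypothetical simplified scanner: reports whether a comma occurs outside
// every kind of bracket. Each bracket type has its own depth counter.
function hasTopLevelComma(expr: string): boolean {
  let parenDepth = 0;
  let bracketDepth = 0;
  let braceDepth = 0;
  for (let i = 0; i < expr.length; i++) {
    const ch = expr.charAt(i);
    if (ch === "(") parenDepth++;
    else if (ch === ")") parenDepth = Math.max(0, parenDepth - 1);
    else if (ch === "[") bracketDepth++;
    else if (ch === "]") bracketDepth = Math.max(0, bracketDepth - 1);
    else if (ch === "{") braceDepth++;
    else if (ch === "}") braceDepth = Math.max(0, braceDepth - 1);
    else if (ch === ",") {
      const isTopLevel =
        parenDepth === 0 && bracketDepth === 0 && braceDepth === 0;
      if (isTopLevel) return true;
    }
  }
  return false;
}
```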
- Replace 5-file module structure with actual 12-file layout (types, consts,
  token/script/structural/table-handlers, escape-utils, bracket-utils)
- Fix mmultiscripts positions: t:/b: → tr:/br: for post-scripts
- Update file references: serializeTagContent → table-handlers.ts,
  mrow handler → structural-handlers.ts, map names → OPEN/CLOSE_BRACKETS
- Document per-bracket-type depth counters in escape scanner
- Add hasScriptAncestor depth limit, visitInferredMrowNode decomposition
- Remove deleted node-utils.ts from spec

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
handlers.ts: replace Readonly<Record> with `as const satisfies`, build
HANDLER_KIND_SET from Object.keys for cast-free runtime guard, use
handleAll directly instead of defaultHandler wrapper, simplify JSDoc.

index.ts: extract BigDelimInfo interface, cache getBigDelimInfo result,
add appendScripts helper to deduplicate sub/sup serialization in
tryBareDelimiterPattern and tryIdotsintPattern, simplify BARE_DELIM_PAIRS
lookup, remove redundant try/catch from childNodeMml, use constructor
shorthand, cache node.childNodes in visitInferredMrowNode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
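
The `as const satisfies` pattern with a key set derived from `Object.keys` can be sketched as follows. All names are hypothetical and the table is reduced to two entries; the sketch assumes TypeScript 4.9 or later, where `satisfies` is available.

```typescript
type GuardHandlerFn = (text: string) => string;

// `satisfies` checks every value against GuardHandlerFn while keeping the
// literal key types, so the key union can be derived from the object itself.
const GUARD_HANDLERS = {
  mi: (text: string) => text,
  mn: (text: string) => text,
} as const satisfies Record<string, GuardHandlerFn>;

type GuardHandlerKind = keyof typeof GUARD_HANDLERS;

// Built from the object's own keys, so it cannot drift from the table.
const GUARD_KIND_SET: ReadonlySet<string> = new Set(
  Object.keys(GUARD_HANDLERS)
);

// Runtime guard with no casts: Set.has() already returns a boolean.
function isGuardHandlerKind(kind: string): kind is GuardHandlerKind {
  return GUARD_KIND_SET.has(kind);
}

// Assumed fallback for illustration: unknown kinds pass the text through.
function serializeOrFallback(kind: string, text: string): string {
  return isGuardHandlerKind(kind) ? GUARD_HANDLERS[kind](text) : text;
}
```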
- BRACE_ANNOTATION_RE: greedy (.+) → lazy ([\s\S]+?), drop s flag
- addLimitsParam: regex replace → endsWith/slice, remove RE_TRAILING_PAREN
- matchBraceAnnotation: escape base content (m[2]) with
  typstPlaceholder(escapeContentSeparators(...)) to match annotation
- buildLimitBase: precompute baseEscaped to DRY 3 call sites
- mmultiscripts attach(): escape all part values and base through
  unified esc() pipeline for consistent separator protection
- isStretchyBase: remove redundant MathNode type annotation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
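
The `endsWith`/`slice` replacement for the trailing-paren regex can be illustrated with a short sketch. The behaviour for input that does not end in `)` is an assumption made here (the input is returned unchanged); the commit message does not say what the PR does in that case.

```typescript
// Sketch of inserting `limits: #true` before the closing paren of a call
// such as op("lim"), without a regular expression.
function addLimitsParam(call: string): string {
  if (!call.endsWith(")")) {
    return call; // assumed: leave non-call input untouched
  }
  return `${call.slice(0, -1)}, limits: #true)`;
}
```

For example, `op("lim")` becomes `op("lim", limits: #true)`, the form an earlier commit gives for \operatorname*.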
…ackground

- mrow hasTableChild: shallow search through inferredMrow to catch
  mtable wrapped by MathJax in an inferred node
- menclose cancel: add escapeUnbalancedParens to prevent unbalanced )
  from closing the cancel() call prematurely
- mstyle: add mathbackground support (feature parity with mpadded),
  including combination with mathcolor
- StyleAttrs: add mathbackground property

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…arsing

- RE_CONTENT_SPECIAL: add [ ] to escaped chars for Typst content blocks
- extractTagFromConditionCell: last \tag{} wins (LaTeX behavior),
  module-level RE_TAG_EXTRACT_G with lastIndex reset, early return
  after mtext processing
- buildNumcasesGrid: cleanup double spaces after RE_TAG_STRIP
- buildMatrix: columnlines/rowlines use .trim().split(/\s+/) for
  robust whitespace handling
- isNumcasesTable: document best-effort heuristic limitation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
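
The "last \tag{} wins" rule with a module-level global regex can be sketched as below. The pattern and the function name `extractLastTag` are hypothetical simplifications (no nested braces); the point is the `lastIndex` reset, without which a shared `g` regex would resume from wherever a previous call stopped.

```typescript
// Hypothetical module-level pattern; the `g` flag makes exec() stateful.
const RE_TAG_SKETCH_G = /\\tag\{([^}]*)\}/g;

function extractLastTag(text: string): string | undefined {
  RE_TAG_SKETCH_G.lastIndex = 0; // reset shared state before scanning
  let last: string | undefined;
  let m: RegExpExecArray | null;
  while ((m = RE_TAG_SKETCH_G.exec(text)) !== null) {
    last = m[1]; // later matches overwrite earlier ones
  }
  return last;
}
```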
- Replace try/catch in needsSpaceBefore/After with explicit null checks
- Extract shared helpers: escapeTypstString, normalizeOperatorName,
  isWordLikeToken, isInScriptContext, singleTypst, withContextSpaces
- Add SCRIPT_PARENT_KINDS set to centralize script-context detection
- Decompose mo into trySerialize* sub-handlers and needsDisambiguatingSpaceAfter
- Fix escaping in op("...") and mi font-wrapping string literals
- Convert MSPACE_WIDTH_MAP from Record to ReadonlyMap for consistency
- Rename atr → attrs, normalize mspace width with trim()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…g handling

- Split ScanOptions into DetectOptions | TransformOptions union to prevent
  mixing detect-mode with transform-specific flags at the type level
- Add isDetectMode type guard, extract all opts into local booleans
- Check colon-spacing against source expr[i-1] instead of transformed result
- Unclosed quote now consumes to end of expression instead of leaking separators
- Rename escapeUnbalancedClose → escapeUnbalancedCloseParen for precision
- Replace magic string 'found' with SEPARATOR_FOUND constant
- Use isTopLevel (all depths === 0) instead of summed depth check
- Fix hasTopLevelSeparators comment: "top level" not "parenthesis depth 0"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
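
A discriminated union along the lines described above might be sketched like this. The field names (`mode`, `escapeCommas`, `escapeSemicolons`), the backslash escaping, and the paren-only depth tracking are illustrative assumptions; only the names isDetectMode and SEPARATOR_FOUND come from the commit message.

```typescript
interface SketchDetectOptions {
  mode: "detect";
}
interface SketchTransformOptions {
  mode: "transform";
  escapeCommas: boolean;
  escapeSemicolons: boolean;
}
// Transform-only flags cannot be passed together with detect mode.
type SketchScanOptions = SketchDetectOptions | SketchTransformOptions;

const SEPARATOR_FOUND = "found";

function isDetectMode(o: SketchScanOptions): o is SketchDetectOptions {
  return o.mode === "detect";
}

function scanSeparators(expr: string, opts: SketchScanOptions): string {
  // Extract all options into local booleans up front.
  const detect = isDetectMode(opts);
  const escapeCommas = !isDetectMode(opts) && opts.escapeCommas;
  const escapeSemicolons = !isDetectMode(opts) && opts.escapeSemicolons;
  let depth = 0;
  let out = "";
  for (let i = 0; i < expr.length; i++) {
    const ch = expr.charAt(i);
    if (ch === "(") depth++;
    else if (ch === ")" && depth > 0) depth--;
    const isSeparator = depth === 0 && (ch === "," || ch === ";");
    if (isSeparator && detect) return SEPARATOR_FOUND;
    const shouldEscape =
      (ch === "," && escapeCommas) || (ch === ";" && escapeSemicolons);
    out += isSeparator && shouldEscape ? `\\${ch}` : ch;
  }
  return detect ? "" : out;
}
```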
- Remove try/catch from isThousandSepComma, use explicit bounds checks
- Simplify isFirstChild/isLastChild by removing redundant null checks
- Use indexOf instead of findIndex for getSiblingIndex
- Add defensive ?? [] in handleAll for childNodes
- Clarify JSDoc: addToTypstData mutates, addSpaceToTypstData skips
  uninitialized typst_inline, needsParens is a simple heuristic,
  getNodeText is non-recursive
- Shorten initTypstData to arrow expression

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract findUnpairedIndices for shared strict stack pairing algorithm,
  used by both replaceUnpairedBrackets and markUnpairedBrackets
- Decompose replaceUnpairedBrackets into scanBracketTokens and
  findUnpairedIndices for easier testing and maintenance
- Extract skipQuotedString helper to deduplicate string-skipping logic
- Convert delimiterToTypst from switch to DELIMITER_LITERAL_MAP lookup
- Extract FLATTENABLE_CONTAINER_KINDS set and shouldFlattenNode helper
- Rename escapeLrOpenDelimiter → escapeLrDelimiter (handles both open
  and close delimiters); rename lrOpenEscapeMap → LR_DELIMITER_ESCAPE_MAP
- Use node.childNodes ?? [] in treeContainsMo and walk

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
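
The strict stack pairing mentioned in this commit can be approximated by the sketch below. It reuses the names scanBracketTokens and findUnpairedIndices from the commit message, but the bodies, the `BracketToken` shape, and the handling of quoted strings (ignored here) are assumptions. "Strict" is taken to mean that a closer pairs only with the opener currently on top of the stack.

```typescript
interface BracketToken {
  index: number;
  char: string;
}

const CLOSE_TO_OPEN: ReadonlyMap<string, string> = new Map([
  [")", "("],
  ["]", "["],
  ["}", "{"],
]);

// Collects every bracket character together with its position.
function scanBracketTokens(expr: string): BracketToken[] {
  const tokens: BracketToken[] = [];
  for (let i = 0; i < expr.length; i++) {
    const ch = expr.charAt(i);
    if ("()[]{}".includes(ch)) tokens.push({ index: i, char: ch });
  }
  return tokens;
}

// Returns the source indices of brackets that have no strict partner.
function findUnpairedIndices(tokens: readonly BracketToken[]): number[] {
  const stack: BracketToken[] = [];
  const unpaired: number[] = [];
  for (const tok of tokens) {
    const expectedOpen = CLOSE_TO_OPEN.get(tok.char);
    if (expectedOpen === undefined) {
      stack.push(tok); // opener
      continue;
    }
    const top = stack[stack.length - 1];
    if (top !== undefined && top.char === expectedOpen) {
      stack.pop(); // matched pair
    } else {
      unpaired.push(tok.index); // closer without a matching opener on top
    }
  }
  for (const leftover of stack) unpaired.push(leftover.index);
  return unpaired.sort((a, b) => a - b);
}
```

Separating the scan from the pairing step lets both a replacing and a marking pass share the same index list, which is the reuse the commit describes.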