PR into master from dev/olga/Add-typst-fotmat-for-math #403
Draft
OlgaRedozubova wants to merge 215 commits into master from dev/olga/Add-typst-fotmat-for-math
…rue, so MathML data is not generated even though the renderer expects it.
…ects mjx-container inside full-width math blocks. This fixes the centering issue for numbered equations without affecting other layouts.
…s with SRE versions
…e(2) was too loose. Strengthened to length > 10 plus a regex check that the speech contains fraction, over, or divided, which validates that the SRE output is semantically meaningful.
Implements a MathML AST visitor that emits native Typst math syntax, following the same pattern as SerializedAsciiVisitor. Handles all 19 MathML node types (mi, mo, mn, mtext, mfrac, msup, msub, msubsup, msqrt, mroot, mover, munder, munderover, mspace, mtable, mrow, mtr, mpadded, menclose) with 180+ Unicode-to-Typst symbol mappings.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
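A minimal sketch of the visitor idea described above, not the PR's actual code. `SketchNode`, `sketchHandlers`, and `toTypstSketch` are hypothetical names, and the node shape is a simplified stand-in for MathJax's MathML nodes; only a handful of node kinds are covered.

```typescript
// Hypothetical, simplified node shape; the real visitor walks MathJax MathML nodes.
interface SketchNode {
  kind: string;
  text?: string;
  children?: SketchNode[];
}

type SketchHandler = (node: SketchNode, visit: (n: SketchNode) => string) => string;

// Wrap multi-character operands in parentheses so a script binds to the whole operand.
const group = (s: string): string => (s.length > 1 ? `(${s})` : s);

// One handler per node kind, keyed by the MathML element name.
const sketchHandlers: Record<string, SketchHandler> = {
  mi: (n) => n.text ?? "",
  mn: (n) => n.text ?? "",
  mrow: (n, visit) => (n.children ?? []).map((c) => visit(c)).join(" "),
  mfrac: (n, visit) => {
    const [num, den] = (n.children ?? []).map((c) => visit(c));
    return `frac(${num ?? ""}, ${den ?? ""})`;
  },
  msup: (n, visit) => {
    const [base, sup] = (n.children ?? []).map((c) => visit(c));
    return `${base ?? ""}^${group(sup ?? "")}`;
  },
  msqrt: (n, visit) => `sqrt(${(n.children ?? []).map((c) => visit(c)).join(" ")})`,
};

function toTypstSketch(node: SketchNode): string {
  const handler = sketchHandlers[node.kind];
  // Unknown kinds fall back to serializing their children in order.
  if (!handler) return (node.children ?? []).map((c) => toTypstSketch(c)).join(" ");
  return handler(node, toTypstSketch);
}
```

For a tree representing `\frac{x^2}{\sqrt{y}}`, this sketch yields `frac(x^2, sqrt(y))`.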
- Fix spacing between adjacent Typst tokens (needSpaceBefore/needSpaceAfter handle single-char identifiers, numbers, and dotted symbol paths)
- Add inter-node spacing in visitInferredMrowNode and mrow handler
- Collapse double spaces in toTypstML output
- Fix lr() for \left..\right: skip delimiter mo children, handle invisible delimiters without lr() to avoid malformed Typst syntax
- Fix \binom: detect TeXAtom OPEN/CLOSE wrapping zero-linethickness mfrac
- Fix \oint: add mstyle handler that skips operator-internal spacing while preserving user spacing (\, \quad)
- Fix \underline: flip overline→underline in munder context
- Fix \cancel direction: updiagonalstrike→cancel(), downdiagonalstrike→cancel(inverted)
- Fix \sin/\cos/\log: skip upright() wrapping for built-in Typst math operators
- Map Unicode minus \u2212 to ASCII - for natural Typst output
- Add mtext handler for single known symbols (e.g. ∮→integral.cont)
- Move test data to tests/_data/_typst/data.js (119 test cases)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…bols, color, boxed, cases alignment
- Add mphantom handler (drops invisible content, no Typst equivalent)
- Strip equation numbers from mlabeledtr rows in align/gather environments
- Fix \choose producing frac instead of binom (linethickness number vs string)
- Add 13 negated relation symbols (equiv.not, lt.eq.not, sim.not, etc.)
- Fix \log_2 spacing by adding msub/msup to needSpaceAfter parent checks
- Handle \textbf in mtext via mathvariant font wrapping
- Handle \color via #text(fill: ...) wrapper in mstyle, filter _inherit_ sentinel
- Handle \boxed via #box(stroke: ...) in menclose box notation
- Fix \bcancel to use #true instead of true for Typst syntax
- Fix cases environment to use & for cell alignment within rows
- Add 8 new test cases (133 total), all passing
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…spacing
- Add lr(size: #<em>) for \big, \Big, \bigg, \Bigg delimiters by detecting sized TeXAtom OPEN/CLOSE pairs in visitInferredMrowNode
- Use accent(content, symbol) for non-shorthand accents like \overleftarrow instead of invalid arrow.l(content) syntax
- Add -tex-mathit → italic() font mapping for \mathit
- Fix \int\limits spacing by adding munderover/munder/mover to needSpaceAfter parent exclusion list
- Add 7 new test cases (140 total)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…etection
- Add quoted strings (") to spacing separator pattern so identifiers get
a space before adjacent text, e.g. x"and" → x "and"
- Fix mstyle operator-internal spacing detection: check for TeXAtom ancestor
instead of texClass OP, so user \, spacing is preserved in expressions
containing integrals while \oint internal spacing is still skipped
- Add 2 new test cases (142 total)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eil, dif
- Fix \operatorname detection: remove !mathvariant check since MathJax provides
defaults; rely on texClass=OP to distinguish from \mathrm
- \operatorname{name} → op("name"), \operatorname*{name} → op("name", limits: #true)
- \mathbb{R} → RR (doubled-letter shorthand for single uppercase letters)
- \mathrm{d} → dif (differential operator optimization)
- \mathbf{v} → upright(bold(v)) (LaTeX \mathbf is upright bold)
- \left|x\right| → norm(x), \left\lfloor → floor(), \left\lceil → ceil()
- \aleph → alef, non-breaking space → space.nobreak
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add getChildText() helper using TextNode.getText() API
- Rewrite getNodeText() to use TextNode.getText() instead of .text
- Replace 4 (node.childNodes[0] as any)?.text in isThousandSepComma
- Replace as-any in getNodeTypstSymbol, needsSpaceAfter, matchBraceAnnotation
- Replace .properties[] with .getProperty() in table-handlers (4 places)
- Remove unnecessary (node as any) cast in index.ts visitor
Zero as-any casts remain in the module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace ~30 instances of `node: any` and untyped function parameters across all serialized-typst modules with proper MathJax types:
- TreeNode (base Node) for utilities needing only kind/childNodes
- MmlNode for handlers needing texClass/attributes/isInferred
- ITypstSerializer for serialize parameters
- HandlerFn for handler factory return types
Replace direct .properties access with getProperty()/setProperty() API. Replace .text access on TextNode with getChildText() helper. Use `as MmlNode` casts only where children need MmlNode-specific features.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…lNode casts
Extend MmlNode interface with `childNodes: MmlNode[]` and `parent: MmlNode` to match AbstractMmlNode runtime behavior. This narrows the inherited Node[] type to MmlNode[], eliminating ~15 `as MmlNode` casts across all modules. All nodes in a MathJax MathML tree are MmlNode instances, so the extension simply makes the type system reflect reality. TreeNode is no longer needed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
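The narrowing described above can be illustrated with stand-in types. `BaseNode` and `RichNode` are hypothetical substitutes for MathJax's Node and MmlNode, and the nullable `parent` is an assumption of this sketch rather than something the commit states.

```typescript
// Stand-in for the base tree node type, whose children are typed loosely.
interface BaseNode {
  kind: string;
  childNodes: BaseNode[];
  parent: BaseNode | null;
}

// Narrowed view: re-declaring childNodes/parent with the richer type is legal
// because RichNode[] is assignable to BaseNode[].
interface RichNode extends BaseNode {
  texClass: number;
  childNodes: RichNode[];
  parent: RichNode | null;
}

// Traversal code can read texClass on every child without `as RichNode` casts.
function maxTexClass(node: RichNode): number {
  return node.childNodes.reduce(
    (max, child) => Math.max(max, maxTexClass(child)),
    node.texClass,
  );
}
```

The trade-off is that the type now asserts something the base declaration does not guarantee, so it is only sound if every node in the tree really is the richer type, which is what the commit message claims for MathJax MathML trees.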
… MmlNode Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…emove section comments Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace 22 handler factories with direct HandlerFn constants (contextual typing preserves full type safety)
- Move handleAll to common.ts, remove handlerApi indirection
- handlers.ts: 1208 → 1135 lines
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…in mtext
- buildLimitBase: use placeholder '""' inside limits() wrapper instead
of early return, so \underset{a}{} correctly produces limits("")_a
- mtext: remove backslash escaping that doubled backslashes in text
content (e.g. "x \\geq 0" → "x \geq 0")
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace raw '\uXXXX' escapes with readable named constants (FUNC_APPLY, MINUS_SIGN, LEFT_FLOOR, INTEGRAL_SIGN, etc.) across handlers.ts, bracket-utils.ts, and index.ts for improved code readability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…terals
- Add HandlerKind type to types.ts: Record<HandlerKind, HandlerFn> catches typos in handler keys and ensures all 22 handlers are registered
- Fix matchBraceAnnotation to use extracted kind/base variables instead of raw regex groups m[1]/m[2]
- Replace ~30 string concatenations with template literals for readability
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cess
Add 7 attribute interfaces (FontAttrs, FracAttrs, MoAttrs, SpaceAttrs, PaddedAttrs, EncloseAttrs, StyleAttrs) to types.ts. Replace Record<string, any> with a single `as T` cast inside getAttrs, giving all 11 call sites full type checking and autocomplete.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…l-handlers
Decompose the 1137-line handlers.ts into focused submodules:
- token-handlers.ts: mi, mo, mn, mtext, mspace (~250 lines)
- script-handlers.ts: mfrac, msup, msub, msubsup, msqrt, mroot, mover, munder, munderover, mmultiscripts (~350 lines)
- structural-handlers.ts: mrow, mpadded, mphantom, menclose, mstyle (~280 lines)
- handlers.ts: dispatch-only (~30 lines)
Also move getAttrs to common.ts and SHALLOW_TREE_MAX_DEPTH to consts.ts as shared utilities.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace all 14 raw getProperty() calls with getProp<T>(node, key) across 4 files (structural-handlers, token-handlers, bracket-utils, table-handlers). Accepts nullable nodes for convenience, returns T | undefined. Mirrors the getAttrs<T> pattern.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…elpers
Convert ~25 string concatenations to template literals for readability. Extract reusable helpers: buildFigureTag, buildAutoTagWithLabel, labelSuffix, joinRows, AUTO_TAG_ENTRY. Decompose long template literals into intermediate variables for clarity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ions
Extract 5 independent pattern-detection blocks from the 220-line while loop into focused functions (tryBigDelimiterPattern, tryBareDelimiterPattern, tryIdotsintPattern, tryThousandSepPattern, isTaggedEqnArray) with a shared PatternResult interface. Add serializeRange helper. Replace string concatenations with template literals. Use getAttrs<T> instead of raw getAllAttributes().
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…literals in script-handlers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… in escape-utils Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…typed catch
- Replace explicit `any` with `unknown`/`MathNode` in index.ts
- Replace 5 unsafe `as string` casts with `String()` in table-handlers.ts
- Add `isHandlerKind` type guard in handlers.ts
- Type all 15 catch clauses with `: unknown` across 6 files
- Add Readonly/ReadonlySet/ReadonlyMap to 23 lookup tables across 8 files
- Fix boolean coercion in isFirstChild/isLastChild, let→const, remove unnecessary ?.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- mmultiscripts: use tr:/br: (top-right/bottom-right) instead of t:/b: (top/bottom) for post-scripts in attach(), which fixes subscript/superscript placement when prescripts are present
- mo handler: add munder/mover to inScript check to prevent unwanted spacing around operators inside these contexts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
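As an illustration of the tr:/br: placement above, here is a small sketch that builds a Typst `attach()` call from pre- and post-scripts. `ScriptParts` and `buildAttachSketch` are hypothetical names, and the PR's actual handler also escapes its arguments, which this sketch omits.

```typescript
// Hypothetical container for already-serialized script strings.
interface ScriptParts {
  preSup?: string;
  preSub?: string;
  postSup?: string;
  postSub?: string;
}

function buildAttachSketch(base: string, parts: ScriptParts): string {
  const args: string[] = [base];
  // Prescripts go to the left corners of the base.
  if (parts.preSup) args.push(`tl: ${parts.preSup}`);
  if (parts.preSub) args.push(`bl: ${parts.preSub}`);
  // Postscripts use the right corners (tr:/br:) rather than t:/b:.
  if (parts.postSup) args.push(`tr: ${parts.postSup}`);
  if (parts.postSub) args.push(`br: ${parts.postSub}`);
  return `attach(${args.join(", ")})`;
}
```

For example, `buildAttachSketch("X", { preSub: "a", postSup: "b" })` returns `attach(X, bl: a, tr: b)`.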
Bugs:
- Remove duplicate U+22A5 (⊥) mapping: keep "perp" in relations, drop "bot" from misc
- Fix displaystyle truthy check: use === true instead of truthy coercion
- Remove duplicate entries for prec/succ/product.co in typst-symbol-map
Quality:
- buildMatrix: form augmentStr without trailing ", " instead of fragile slice(0, -2)
- hasScriptAncestor: add ANCESTOR_MAX_DEPTH limit for consistency
- needsParens: simplify to s.length > 1
- visitTeXAtomNode: replace regex match with trim()
- childNodeMml: remove unused _space and _nl parameters
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ndling strategy
- Prevent parameter mutation in buildTaggedEqnArray/buildUntaggedEqnArray by cloning the rows array before modification
- Replace single depth counter in scanExpression with per-bracket-type counters (parenDepth, bracketDepth, braceDepth) to avoid cross-type mismatches
- Add JSDoc to handle() documenting the two-tier error handling strategy
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
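To show why per-bracket-type counters matter, here is a rough sketch of a top-level separator check. `hasTopLevelSeparatorSketch` is a hypothetical helper and is much simpler than the PR's scanExpression (no quoted-string handling, for instance).

```typescript
// Returns true if `sep` occurs outside every (), [] and {} pair.
function hasTopLevelSeparatorSketch(expr: string, sep: string): boolean {
  let parenDepth = 0;
  let bracketDepth = 0;
  let braceDepth = 0;
  for (let i = 0; i < expr.length; i++) {
    const ch = expr.charAt(i);
    switch (ch) {
      case "(": parenDepth++; break;
      case "[": bracketDepth++; break;
      case "{": braceDepth++; break;
      // A closer only ever decrements its own counter, so a stray "]"
      // cannot cancel a depth level opened by "(".
      case ")": if (parenDepth > 0) parenDepth--; break;
      case "]": if (bracketDepth > 0) bracketDepth--; break;
      case "}": if (braceDepth > 0) braceDepth--; break;
    }
    const isTopLevel = parenDepth === 0 && bracketDepth === 0 && braceDepth === 0;
    if (ch === sep && isTopLevel) return true;
  }
  return false;
}
```

With a single shared counter, the input `(a], b)` would drop to depth 0 at the stray `]` and the comma would be misread as top-level; with separate counters the comma stays inside the open parenthesis.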
- Replace 5-file module structure with actual 12-file layout (types, consts, token/script/structural/table-handlers, escape-utils, bracket-utils)
- Fix mmultiscripts positions: t:/b: → tr:/br: for post-scripts
- Update file references: serializeTagContent → table-handlers.ts, mrow handler → structural-handlers.ts, map names → OPEN/CLOSE_BRACKETS
- Document per-bracket-type depth counters in escape scanner
- Add hasScriptAncestor depth limit, visitInferredMrowNode decomposition
- Remove deleted node-utils.ts from spec
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
handlers.ts: replace Readonly<Record> with `as const satisfies`, build HANDLER_KIND_SET from Object.keys for cast-free runtime guard, use handleAll directly instead of defaultHandler wrapper, simplify JSDoc.
index.ts: extract BigDelimInfo interface, cache getBigDelimInfo result, add appendScripts helper to deduplicate sub/sup serialization in tryBareDelimiterPattern and tryIdotsintPattern, simplify BARE_DELIM_PAIRS lookup, remove redundant try/catch from childNodeMml, use constructor shorthand, cache node.childNodes in visitInferredMrowNode.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
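The `as const satisfies` plus Object.keys pattern mentioned for handlers.ts might look roughly like the following. All names here (`SKETCH_HANDLERS`, `SketchHandlerFn`, `isSketchHandlerKind`, `dispatchSketch`) are hypothetical, the handler signature is simplified to strings, and `satisfies` needs TypeScript 4.9 or later.

```typescript
type SketchHandlerFn = (text: string) => string;

// `satisfies` checks every value against SketchHandlerFn while `as const`
// keeps the literal key set, so keyof typeof yields the exact kinds.
const SKETCH_HANDLERS = {
  mi: (text: string) => text,
  mn: (text: string) => text,
  mtext: (text: string) => `"${text}"`,
} as const satisfies Record<string, SketchHandlerFn>;

type SketchHandlerKind = keyof typeof SKETCH_HANDLERS;

// The runtime set is derived from the same object, so it cannot drift from the type.
const SKETCH_KIND_SET: ReadonlySet<string> = new Set(Object.keys(SKETCH_HANDLERS));

// Type guard with no cast: the predicate narrows `kind` for the caller.
function isSketchHandlerKind(kind: string): kind is SketchHandlerKind {
  return SKETCH_KIND_SET.has(kind);
}

function dispatchSketch(kind: string, text: string): string {
  // Unknown kinds pass the text through unchanged (an assumption of this sketch).
  return isSketchHandlerKind(kind) ? SKETCH_HANDLERS[kind](text) : text;
}
```

Deriving both the union type and the runtime set from one object literal means adding a handler is a single-line change with no separate list to keep in sync.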
- BRACE_ANNOTATION_RE: greedy (.+) → lazy ([\s\S]+?), drop s flag
- addLimitsParam: regex replace → endsWith/slice, remove RE_TRAILING_PAREN
- matchBraceAnnotation: escape base content (m[2]) with typstPlaceholder(escapeContentSeparators(...)) to match annotation
- buildLimitBase: precompute baseEscaped to DRY 3 call sites
- mmultiscripts attach(): escape all part values and base through unified esc() pipeline for consistent separator protection
- isStretchyBase: remove redundant MathNode type annotation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
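The endsWith/slice approach for addLimitsParam could be sketched as below. `addLimitsParamSketch` is a hypothetical stand-in; returning the input unchanged when there is no trailing parenthesis is an assumption, since the log does not say what the real function does in that case.

```typescript
// Append `limits: #true` as the last argument of a serialized call such as op("name").
function addLimitsParamSketch(call: string): string {
  if (!call.endsWith(")")) return call; // assumption: non-call input is left alone
  // Drop the final ")" and re-close after the extra named argument.
  return `${call.slice(0, -1)}, limits: #true)`;
}
```

For example, `addLimitsParamSketch('op("name")')` returns `op("name", limits: #true)`, matching the \operatorname* form listed earlier in this log.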
…ackground
- mrow hasTableChild: shallow search through inferredMrow to catch mtable wrapped by MathJax in an inferred node
- menclose cancel: add escapeUnbalancedParens to prevent unbalanced ) from closing the cancel() call prematurely
- mstyle: add mathbackground support (feature parity with mpadded), including combination with mathcolor
- StyleAttrs: add mathbackground property
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…arsing
- RE_CONTENT_SPECIAL: add [ ] to escaped chars for Typst content blocks
- extractTagFromConditionCell: last \tag{} wins (LaTeX behavior),
module-level RE_TAG_EXTRACT_G with lastIndex reset, early return
after mtext processing
- buildNumcasesGrid: cleanup double spaces after RE_TAG_STRIP
- buildMatrix: columnlines/rowlines use .trim().split(/\s+/) for
robust whitespace handling
- isNumcasesTable: document best-effort heuristic limitation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
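Two of the points above lend themselves to a short sketch: the "last \tag{} wins" rule with a shared global regex, and whitespace-robust splitting of columnlines/rowlines. `RE_TAG_SKETCH_G`, `lastTagSketch`, and `splitLinesAttrSketch` are hypothetical; the module's actual RE_TAG_EXTRACT_G pattern is not shown in this log.

```typescript
// Hypothetical pattern for \tag{...}; kept at module level and reused across calls.
const RE_TAG_SKETCH_G = /\\tag\{([^}]*)\}/g;

function lastTagSketch(text: string): string | null {
  // A /g regex remembers lastIndex between calls, so reset it before scanning.
  RE_TAG_SKETCH_G.lastIndex = 0;
  let last: string | null = null;
  let m: RegExpExecArray | null;
  while ((m = RE_TAG_SKETCH_G.exec(text)) !== null) {
    last = m[1] ?? null; // later matches overwrite earlier ones
  }
  return last;
}

function splitLinesAttrSketch(attr: string): string[] {
  const trimmed = attr.trim();
  // Splitting on runs of whitespace avoids empty entries from doubled or padded spaces.
  return trimmed === "" ? [] : trimmed.split(/\s+/);
}
```

Without the `lastIndex` reset, a second call on a shorter string could start scanning past its end and miss the tag entirely.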
- Replace try/catch in needsSpaceBefore/After with explicit null checks
- Extract shared helpers: escapeTypstString, normalizeOperatorName,
isWordLikeToken, isInScriptContext, singleTypst, withContextSpaces
- Add SCRIPT_PARENT_KINDS set to centralize script-context detection
- Decompose mo into trySerialize* sub-handlers and needsDisambiguatingSpaceAfter
- Fix escaping in op("...") and mi font-wrapping string literals
- Convert MSPACE_WIDTH_MAP from Record to ReadonlyMap for consistency
- Rename atr → attrs, normalize mspace width with trim()
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…g handling
- Split ScanOptions into DetectOptions | TransformOptions union to prevent mixing detect-mode with transform-specific flags at the type level
- Add isDetectMode type guard, extract all opts into local booleans
- Check colon-spacing against source expr[i-1] instead of transformed result
- Unclosed quote now consumes to end of expression instead of leaking separators
- Rename escapeUnbalancedClose → escapeUnbalancedCloseParen for precision
- Replace magic string 'found' with SEPARATOR_FOUND constant
- Use isTopLevel (all depths === 0) instead of summed depth check
- Fix hasTopLevelSeparators comment: "top level" not "parenthesis depth 0"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
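A rough sketch of the detect/transform split follows. The `mode` discriminant, the `escapeCommas` flag, and every name suffixed with `Sketch` are assumptions for illustration; the log does not show how the real DetectOptions and TransformOptions are shaped.

```typescript
interface DetectOptionsSketch {
  mode: "detect";
}
interface TransformOptionsSketch {
  mode: "transform";
  escapeCommas: boolean; // transform-only flag; not expressible in detect mode
}
type ScanOptionsSketch = DetectOptionsSketch | TransformOptionsSketch;

const SEPARATOR_FOUND_SKETCH = "found";

function isDetectModeSketch(opts: ScanOptionsSketch): opts is DetectOptionsSketch {
  return opts.mode === "detect";
}

function scanSketch(expr: string, opts: ScanOptionsSketch): string {
  // Extract options into local booleans once, up front.
  const detect = isDetectModeSketch(opts);
  const escapeCommas = !isDetectModeSketch(opts) && opts.escapeCommas;
  let depth = 0;
  let out = "";
  for (let i = 0; i < expr.length; i++) {
    const ch = expr.charAt(i);
    if (ch === "(") depth++;
    else if (ch === ")" && depth > 0) depth--;
    if (ch === "," && depth === 0) {
      if (detect) return SEPARATOR_FOUND_SKETCH; // detect mode stops at the first hit
      out += escapeCommas ? "\\," : ch;
    } else {
      out += ch;
    }
  }
  return detect ? "" : out;
}
```

Because `escapeCommas` exists only on the transform variant, a call such as `scanSketch(expr, { mode: "detect", escapeCommas: true })` is rejected by the compiler, which is the kind of mixing the union is meant to prevent.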
- Remove try/catch from isThousandSepComma, use explicit bounds checks
- Simplify isFirstChild/isLastChild by removing redundant null checks
- Use indexOf instead of findIndex for getSiblingIndex
- Add defensive ?? [] in handleAll for childNodes
- Clarify JSDoc: addToTypstData mutates, addSpaceToTypstData skips uninitialized typst_inline, needsParens is a simple heuristic, getNodeText is non-recursive
- Shorten initTypstData to arrow expression
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract findUnpairedIndices for shared strict stack pairing algorithm, used by both replaceUnpairedBrackets and markUnpairedBrackets
- Decompose replaceUnpairedBrackets into scanBracketTokens and findUnpairedIndices for easier testing and maintenance
- Extract skipQuotedString helper to deduplicate string-skipping logic
- Convert delimiterToTypst from switch to DELIMITER_LITERAL_MAP lookup
- Extract FLATTENABLE_CONTAINER_KINDS set and shouldFlattenNode helper
- Rename escapeLrOpenDelimiter → escapeLrDelimiter (handles both open and close delimiters); rename lrOpenEscapeMap → LR_DELIMITER_ESCAPE_MAP
- Use node.childNodes ?? [] in treeContainsMo and walk
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
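One plausible reading of "strict stack pairing" is sketched below: a closer pairs only with a matching opener on top of the stack, and anything left over is unpaired. `findUnpairedIndicesSketch` is a hypothetical reimplementation that works on raw characters; the PR's version operates on scanned bracket tokens and skips quoted strings, which this sketch does not.

```typescript
// Closer → expected opener.
const SKETCH_CLOSERS: Record<string, string> = { ")": "(", "]": "[", "}": "{" };
const SKETCH_OPENERS = "([{";

function findUnpairedIndicesSketch(s: string): number[] {
  const stack: number[] = [];    // indices of openers still waiting for a closer
  const unpaired: number[] = []; // indices of closers that matched nothing
  for (let i = 0; i < s.length; i++) {
    const ch = s.charAt(i);
    if (SKETCH_OPENERS.indexOf(ch) !== -1) {
      stack.push(i);
      continue;
    }
    const expectedOpener = SKETCH_CLOSERS[ch];
    if (expectedOpener === undefined) continue; // not a bracket character
    const top = stack[stack.length - 1] ?? -1;
    if (top >= 0 && s.charAt(top) === expectedOpener) {
      stack.pop(); // strict: only the opener on top of the stack can pair
    } else {
      unpaired.push(i);
    }
  }
  // Openers left on the stack never found a closer.
  return unpaired.concat(stack).sort((a, b) => a - b);
}
```

For the input `(a]` this returns `[0, 2]`: the `]` does not match the `(` on top of the stack, and the `(` is never closed.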
branch: dev/olga/Add-typst-fotmat-for-math
Related issue: 18398