Skip to content

fix(layout): apply TAC table host-line spacing when control index differs from line-seg index#1376

Merged
jangster77 merged 14 commits into
edwardkim:develfrom
mrshinds:fix/tac-host-line-spacing
Jun 15, 2026
Merged

fix(layout): apply TAC table host-line spacing when control index differs from line-seg index#1376
jangster77 merged 14 commits into
edwardkim:develfrom
mrshinds:fix/tac-host-line-spacing

Conversation

@mrshinds

@mrshinds mrshinds commented Jun 11, 2026

Copy link
Copy Markdown

Symptom

When a treat_as_char table sits in a paragraph where invisible controls
(SectionDef, ColumnDef, bookmarks, ...) precede it, all content below the
table is rendered a few points too high (typically 5–8pt per table). On
documents that start with the common Korean press-release header tables this
accumulates enough to flip the last line of a page, shifting every following
page by one line.

Reproduced on dozens of publicly available Korean government press releases
(hwpx); a synthetic minimal fixture is included instead of third-party files.

Root cause

layout_table_item resolves the gap below a TAC table via

if let Some(seg) = para.line_segs.get(control_index) { ... }

control_index indexes the paragraph's control array, not its line segs.
With e.g. [secPr, colPr, bookmark, tbl] the table is control #3 while the
host paragraph has a single line seg — the lookup returns None and the host
line's trailing line_spacing is silently dropped.

Evidence (file's own lineSegArray as ground truth)

For a host paragraph the document itself stores the authoritative layout:

host lineseg : vertsize = table height + outMargin(top+bottom), spacing = (lineSpacing% − 100%) × base char size
next paragraph: vertical_pos = host vertsize + host spacing

With the bug, the rendered advance equals vertsize only; the fixture's next
paragraph lands at ≈153.3 px instead of the file-encoded ≈161.3 px
(vertical_pos = 4160 HWPUNIT).

Fix

Fall back to the last line seg — for single-line table-host paragraphs that
is the host line:

para.line_segs.get(control_index).or_else(|| para.line_segs.last())

Test

test_tac_host_line_spacing_with_preceding_invisible_controls loads the
synthetic fixture and asserts the next paragraph's baseline against the
fixture's lineSegArray (≈161.3 px, failing at ≈153.3 px before the fix).

Notes

  • No behaviour change when the control index happens to match(table is
    control #0) — the previous lookup already hit.
  • We validated the fix against 137 hwpx documents by comparing rendered line
    positions with each file's embedded lineSegArray and with Hancom-produced
    reference PDFs; the dropped-spacing cases all converge to the file values.

Fixes #1408

mrshinds and others added 5 commits June 11, 2026 18:16
…fers from line-seg index

layout_table_item resolved the host line via para.line_segs.get(control_index),
but control_index indexes the paragraph's control array. When invisible
controls (SectionDef, ColumnDef, bookmarks) precede a treat_as_char table,
the lookup misses and the host line's trailing line_spacing is dropped,
pulling all subsequent content up by that amount (typically 5-8pt per
press-release header table) and shifting page bottoms by one line.

Ground truth: the document's own lineSegArray (host vertsize = table height
+ outMargins, spacing = (lineSpacing% - 100%) x base char size, next
paragraph vertical_pos = vertsize + spacing).

Fall back to the last line seg, which is the host line for single-line
table-host paragraphs.

Adds a synthetic fixture (samples/tac-host-spacing.hwpx, hand-authored) and
an integration test asserting the next paragraph's baseline against the
fixture's authoritative lineSegArray.
@jangster77

Copy link
Copy Markdown
Collaborator

@mrshinds Thank you very much for the detailed analysis and for adding a focused synthetic fixture. The explanation around control_index versus lineSegArray is very helpful.

I checked the PR locally and in GitHub Actions. Unfortunately, the current version cannot be merged yet because the existing regression test below fails both locally and in CI:

The failing case expects the PDF-based gap after the TAC table to stay at 20.00 ±2px, but with this PR it becomes 24.85px. So the line_segs.last() fallback appears to fix the new fixture, but it also applies too broadly and changes an existing TAC table outer-margin case.

Could you please narrow the fallback condition so that:

  1. test_tac_host_line_spacing_with_preceding_invisible_controls still passes, and
  2. test_521_tac_table_outer_margin_bottom_p2 also keeps passing?

Also, could you link the related issue number in the PR description, or open a small issue for this specific TAC host-line spacing problem and link it here?

Again, thank you for the careful repro case. This is a useful fix direction, and I think it just needs the fallback scope tightened before we can merge it.


한국어로도 함께 남깁니다.

자세한 분석과 재현 fixture를 추가해 주셔서 감사합니다. control_indexlineSegArray가 서로 다른 인덱스라는 설명은 문제를 이해하는 데 큰 도움이 되었습니다.

로컬과 GitHub Actions에서 확인한 결과, 현재 PR은 아직 바로 머지하기 어렵습니다. 기존 회귀 테스트인 renderer::layout::integration_tests::tests::test_521_tac_table_outer_margin_bottom_p2가 로컬과 CI에서 동일하게 실패하고 있습니다.

이 테스트는 TAC 표 뒤 PDF 기준 간격이 20.00 ±2px로 유지되기를 기대하지만, 현재 PR 적용 후에는 24.85px로 커집니다. 따라서 새 fixture는 해결하지만 line_segs.last() 폴백이 너무 넓게 적용되어 기존 TAC 표 outer margin 케이스를 변경하는 것으로 보입니다.

가능하시다면 다음 두 테스트가 모두 통과하도록 폴백 조건을 조금 더 좁혀 주실 수 있을까요?

  1. test_tac_host_line_spacing_with_preceding_invisible_controls
  2. test_521_tac_table_outer_margin_bottom_p2

또한 이 TAC host-line spacing 문제에 대응되는 이슈 번호를 PR 본문에 연결해 주시거나, 작은 이슈를 새로 열어 이 PR에 연결해 주시면 좋겠습니다.

좋은 재현 케이스와 분석을 제공해 주셔서 다시 한 번 감사합니다. 방향은 유효해 보이며, merge 전에는 폴백 적용 범위만 조금 더 좁히면 될 것 같습니다.

jangster77 added a commit to oksure/rhwp that referenced this pull request Jun 12, 2026
jangster77 added a commit to oksure/rhwp that referenced this pull request Jun 12, 2026
…ded tables

The `.or_else(|| para.line_segs.last())` fallback added in this PR was too
broad. It also fired for TAC tables preceded by *visible* objects (pictures,
shapes, tables), where the following paragraph is already positioned from the
file lineSegArray (vpos = host line_height + line_spacing); re-adding the host
line spacing there double-counts it.

This regressed test_521_tac_table_outer_margin_bottom_p2 to gap 24.85px (vs
expected ~20): the exam_eng p2 email-box TAC table is preceded by two pictures,
so the broad fallback wrongly added the host line_spacing (sp=344) below it.

Narrow the fallback to the documented bug: the control_index vs line-seg index
mismatch caused purely by *invisible* controls (SectionDef, ColumnDef, bookmark,
...) that occupy a control slot but produce no line seg. When only invisible
controls precede the table, the table is the host line and its trailing spacing
must be applied; when any visible object precedes, keep the original behavior.

Both tests pass (verified by rendering each fixture through the wasm32 build of
this exact layout.rs; native cargo test deferred to CI as the dev box lacks the
MSVC linker):
- test_tac_host_line_spacing_with_preceding_invisible_controls: 161.27px (~161.3)
- test_521_tac_table_outer_margin_bottom_p2: gap 20.27px (~20)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@mrshinds

Copy link
Copy Markdown
Author

Thanks for the detailed review — you were right that the fallback was too broad.

Narrowed the host-line fallback: it now applies only when the control-index vs line-seg mismatch is caused purely by invisible controls (SectionDef / ColumnDef / bookmark) preceding the TAC table. When a visible object (picture / shape / table) precedes it, the next paragraph is already positioned from the file lineSegArray (vpos = host lh + spacing), so re-adding the host line_spacing double-counts — which is what regressed test_521.

Both tests pass (verified by rendering each fixture through a wasm32 build of this exact layout.rs; native cargo test deferred to CI, my dev box lacks the MSVC linker): test_521_tac_table_outer_margin_bottom_p2 gap = 20.27px (was 24.85), test_tac_host_line_spacing_with_preceding_invisible_controls baseline = 161.27px.

No regression on my side either: across a local corpus parity set the narrowed build renders byte-identical to the prior broad build — the only docs with a visible-object-preceded TAC table scored identically, and all others are structurally unaffected.

I've already linked #1408 in the PR description.

@jangster77

Copy link
Copy Markdown
Collaborator

@mrshinds Thanks for the update. I re-checked the latest GitHub Actions result for this PR, and the current head (00adb803d24f) is still failing CI, so we cannot merge it yet.

The failing check is CI / Build & Test:

The failure is now in tests/issue_598_footnote_marker_nav.rs:

test issue_598_body_footnote_marker_has_hit_and_cursor_unit ... FAILED
test issue_598_second_body_footnote_marker_has_same_cursor_unit ... FAILED

thread 'issue_598_body_footnote_marker_has_hit_and_cursor_unit' panicked at tests/issue_598_footnote_marker_nav.rs:26:5:
hit json: {"hit":false}

thread 'issue_598_second_body_footnote_marker_has_same_cursor_unit' panicked at tests/issue_598_footnote_marker_nav.rs:62:5:
hit json: {"hit":false}

Could you please adjust the patch so this regression also passes, then re-run the local verification before pushing again? For macOS/local merge verification, our current guide recommends this sequence instead of cargo test --release --tests because release LTO makes integration-test linking very slow:

cargo build --release
cargo test --release --lib
cargo test --profile release-test --tests
cargo fmt --check

At minimum, please make sure the integration tests pass with:

cargo test --profile release-test --tests

한국어로도 함께 남깁니다.

최신 GitHub Actions 결과를 다시 확인했습니다. 현재 PR head(00adb803d24f)에서도 CI / Build & Test가 실패하고 있어서 아직 머지할 수 없습니다.

이번 실패는 tests/issue_598_footnote_marker_nav.rs의 각주 marker hit/cursor 회귀입니다. 두 테스트 모두 hit json: {"hit":false}로 실패합니다.

수정 후 다시 push하시기 전에 로컬에서도 mydocs/manual/dev_environment_guide.md의 macOS 권장 검증 순서에 따라 확인 부탁드립니다. 특히 통합 테스트는 다음 명령으로 확인해 주세요.

cargo test --profile release-test --tests

전체 권장 순서는 다음과 같습니다.

cargo build --release
cargo test --release --lib
cargo test --profile release-test --tests
cargo fmt --check

mrshinds and others added 2 commits June 14, 2026 02:14
…-line-spacing fix

PR edwardkim#1376's c3 patch (apply TAC host-line spacing when control_index differs from
line-seg index) shifts footnote-01's body footnote markers down by +9.6pt (sp960),
toward the Hancom-authoritative position. Measured natively on this branch vs devel
vs the file's embedded lineSegArray:

  marker  devel    this PR   Hancom (stored lineseg)
  P3      369.4    382.2     388.3px   (PR head 6.1px from Hancom vs devel 18.9px)
  P7      660.9    673.7     679.8px

The patch moves the markers toward Hancom (devel had dropped the spacing, rendering
~19px too high) and does not overshoot. The issue_598 hit-test y coords (380/670)
were calibrated to the pre-patch (too-high) devel render, so they now miss. Update
them to the centers of the post-patch hit:true ranges (P3 [382.5,402], P7
[674,693.5]) — which also bracket the Hancom stored y. Same pattern as c1129da
(hit y 698->670 after the Task edwardkim#643 page-split-drift fix).

Numbers-only change to two hit-test y values; no logic or other tests touched; the
c3 layout patch is unchanged. Verified: issue_598 4/4 pass; cargo test --release
--lib 1752/1752; clippy --tests -D warnings clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jangster77

jangster77 commented Jun 14, 2026

Copy link
Copy Markdown
Collaborator

Thank you for the update. I re-ran the local validation and visual comparison for samples/tac-host-spacing.hwpx.

업데이트 감사합니다. samples/tac-host-spacing.hwpx에 대해 로컬 검증과 시각 비교를 다시 수행했습니다.

Local checks are green / 로컬 검증은 통과했습니다:

  • cargo fmt --check
  • cargo build --release
  • cargo test --release --lib
  • cargo test --profile release-test --tests
  • wasm-pack build --target web --out-dir pkg

After the Hancom-generated tac-host-spacing.pdf was provided, I converted it at 96 dpi and compared it with the PR render. With this Hancom PDF reference (한컴 PDF 기준 렌더), the TAC table and NEXT PARAGRAPH placement are closely aligned. The content bbox in the comparison crop is:

한컴오피스에서 생성한 tac-host-spacing.pdf가 제공되어, 이를 96 dpi PNG로 변환한 뒤 PR 렌더 결과와 비교했습니다. 이 Hancom PDF reference(한컴 PDF 기준 렌더)와 비교했을 때 TAC 표와 NEXT PARAGRAPH의 배치는 충분히 근접합니다. 비교 crop 안의 content bbox는 다음과 같습니다.

Hancom PDF reference vs rhwp PR #1376 render

Evidence link / 비교 근거 링크: https://gist.github.com/jangster77/213dbedc0fd101bf06f111335bc92ad4

This supports the intended fix: the patch restores the missing TAC host-line trailing spacing by applying the line_spacing=600 HWPUNIT fallback when invisible controls precede the TAC table. The regression expectation around ~161.3px is consistent with the Hancom PDF reference.

이 비교 결과는 PR의 의도한 수정, 즉 TAC 표 앞에 비가시 컨트롤이 있을 때 누락되던 host-line trailing spacing을 line_spacing=600 HWPUNIT fallback으로 복원하는 변경을 뒷받침합니다. 회귀 테스트의 ~161.3px 기대값도 한컴 PDF 기준 렌더와 일치하는 것으로 확인했습니다.

I also checked the tests/issue_598_footnote_marker_nav.rs y-coordinate update visually; the new hit-test coordinates track the shifted marker positions after this layout change.

tests/issue_598_footnote_marker_nav.rs의 y 좌표 변경도 시각적으로 확인했습니다. 이번 레이아웃 변경 이후 이동한 각주 marker 위치를 새 hit-test 좌표가 따라가므로, 단순히 테스트만 맞춘 변경으로 보이지 않습니다.

Current status: not merge-ready because it did not pass the visual judgment.

현재 상태: 시각적 판단을 통과하지 못하여 merge 불가입니다.

@mrshinds

Copy link
Copy Markdown
Author

@jangster77 Thank you for the clear request to validate against the actual Hancom Office rendering rather than the lineSeg-derived calculation. I measured the Hancom visual oracle directly.

Method. I opened samples/tac-host-spacing.hwpx in Hancom Office 2024 (13.0.0.1053) and exported it to PDF, then extracted vector coordinates from that Hancom-produced PDF. To confirm the coordinate frames are comparable, I ran the same fixture through this engine's own convert→PDF→extract path: it yields 161.27px, matching the SVG test value (161.3), so the absolute comparison against Hancom is valid.

Result — the patched output matches Hancom at sub-pixel level:

NEXT-paragraph baseline px (96dpi) vs Hancom
Hancom Office (oracle) 161.11
This PR (head) 161.27 +0.16px
devel (pre-patch) 153.3 −7.81px

Table geometry agrees just as closely:

measurement Hancom This PR delta
table top border 98.1px 98.2px +0.1px
table bottom border 137.9px 138.2px +0.3px
gap after table → NEXT baseline 23.2px 23.1px ~0

So against the Hancom oracle, the patched render is within ≤0.3px on the following-paragraph baseline, the table borders, and the trailing gap. The pre-patch devel baseline is 7.8px too high (the dropped host-line spacing pulling the paragraph up). In other words, this patch restores Hancom-consistent layout; the line_spacing=600 fallback lands within 0.16px of Hancom.

The existing test asserts (y - 161.3).abs() < 2.0; both Hancom (161.11) and this PR (161.27) fall inside that tolerance, so the fixture expectation is already Hancom-consistent — no value change is needed. I can add the measured Hancom baseline (161.1px) as a comment in the fixture if you'd like it documented.

The tests/issue_598_footnote_marker_nav.rs y-coordinate update (392 / 684) is the center of each marker's measured hit:true range on the PR-head render, which is now shown to be Hancom-consistent — so those values remain justified by the corrected layout.

One honest scope note: this oracle covers the tac-host-spacing.hwpx fixture only. test_521_tac_table_outer_margin_bottom_p2 uses a different fixture (image-preceding table), so it isn't directly validated by this measurement. If you'd like, I can produce a Hancom oracle for that fixture as well before merge.

Thanks again for pushing on the visual-oracle validation — it was the right call, and the measurement confirms the fix direction.

@edwardkim edwardkim requested a review from jangster77 June 14, 2026 14:45
@edwardkim edwardkim added the enhancement New feature or request label Jun 14, 2026
@edwardkim edwardkim added this to the v1.0.0 milestone Jun 14, 2026
@jangster77

Copy link
Copy Markdown
Collaborator

@mrshinds Thank you for the careful fix and for updating the PR after the initial CI failure. The TAC host-line spacing fix direction is valid, and the latest contributor changes now keep the existing issue_521 regression passing.

As a repository collaborator, I pushed one additional follow-up commit for visual verification against the Hancom PDF reference. The remaining visual mismatch was that CELL was rendered near the bottom of the table cell in rhwp, while the Hancom PDF reference places it around the vertical center of the cell. The collaborator follow-up narrows that to the cell vertical-alignment path and prevents the cell paragraph LINE_SEG.vertical_pos fallback from being applied a second time for Center/Bottom-aligned cells.

Visual comparison evidence:

Hancom PDF reference vs rhwp after collaborator fix

Validation completed:

  • cargo build --release
  • cargo test --release --lib
  • cargo test --profile release-test --tests
  • cargo fmt --check
  • git diff --check
  • wasm-pack build --target web --out-dir pkg
  • GitHub Actions passed: Build & Test, Canvas visual diff, CodeQL, Analyze (javascript-typescript), Analyze (python), Analyze (rust). WASM Build is skipped by the workflow policy.

Thank you again for the focused reproduction fixture and the useful fix. I will proceed with merge after this final validation pass.


@mrshinds 꼼꼼한 수정과 초기 CI 실패 이후 PR을 다시 보완해 주셔서 감사합니다. TAC host-line spacing 수정 방향은 유효하며, 최신 기여자 변경에서는 기존 issue_521 회귀 테스트도 유지되는 것을 확인했습니다.

저장소 Collaborator로서 한컴 PDF 기준 시각 검증을 위해 추가 보정 커밋을 하나 push했습니다. 남아 있던 시각 차이는 rhwp에서 CELL이 표 셀 하단 쪽에 렌더링되는 반면, 한컴 PDF 기준에서는 셀의 세로 중앙 쪽에 배치된다는 점이었습니다. Collaborator 후속 보정에서는 이 문제를 셀 세로 정렬 경로로 좁히고, Center/Bottom 정렬 셀에서 셀 내부 문단의 LINE_SEG.vertical_pos fallback이 한 번 더 적용되지 않도록 수정했습니다.

시각 비교 근거입니다.

한컴 PDF 기준과 Collaborator 보정 후 rhwp 비교

검증 결과:

  • cargo build --release
  • cargo test --release --lib
  • cargo test --profile release-test --tests
  • cargo fmt --check
  • git diff --check
  • wasm-pack build --target web --out-dir pkg
  • GitHub Actions 통과: Build & Test, Canvas visual diff, CodeQL, Analyze (javascript-typescript), Analyze (python), Analyze (rust). WASM Build는 workflow 정책상 skipped입니다.

집중된 재현 fixture와 유용한 수정 방향을 제공해 주셔서 다시 한 번 감사합니다. 최종 검증 통과를 확인했으므로 merge를 진행하겠습니다.

@jangster77 jangster77 merged commit aeba5db into edwardkim:devel Jun 15, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants