fix: flatten merged widget annotations in form.flatten() #134
Open
horacio-penya wants to merge 1 commit into cantoo-scribe:master
Extends form.flatten() to handle "merged" widget annotations - widgets that have field properties (/FT, /V, /T) directly on the annotation dict in page Annots rather than being registered in AcroForm.Fields.

Previously, form.flatten() would do nothing for PDFs with merged widgets because getFields() only traverses AcroForm.Fields. Now it also scans page annotations for orphaned widgets and flattens them.

Handles both simple appearances (text fields) and stateful appearances (checkboxes, radio buttons with /AS appearance state).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sharcoux requested changes on Feb 17, 2026
    const annotsToRemove: PDFRef[] = [];

    for (let i = 0; i < annots.size(); i++) {
      const annotRef = annots.get(i);
Collaborator
Probably a good idea to wrap this in a try/catch and handle errors gracefully.
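One way to read this suggestion is to catch errors per annotation, so a single malformed widget does not abort the whole loop. The sketch below is only an illustration: `flattenEach` and `FlattenFailure` are hypothetical names, not part of pdf-lib or of this PR, and plain values stand in for annotation objects.

```typescript
// Hypothetical record of one annotation that could not be flattened.
interface FlattenFailure {
  index: number;
  error: unknown;
}

// Hypothetical helper: runs `flattenOne` on every item and collects failures
// instead of letting the first exception stop the loop.
function flattenEach<T>(
  items: readonly T[],
  flattenOne: (item: T, index: number) => void,
): { flattened: number; failures: FlattenFailure[] } {
  const failures: FlattenFailure[] = [];
  let flattened = 0;
  items.forEach((item, index) => {
    try {
      flattenOne(item, index);
      flattened++;
    } catch (error) {
      // Record the failure and carry on with the remaining annotations.
      failures.push({ index, error });
    }
  });
  return { flattened, failures };
}
```

Whether failures should be logged, returned, or silently skipped is a design choice the comment leaves open; returning them, as here, lets the caller decide.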
    const rect = annot.get(PDFName.of('Rect'));
    if (!(rect instanceof PDFArray) || rect.size() < 4) continue;

    const x1 = (rect.get(0) as any)?.asNumber?.() ?? 0;
Collaborator
You could use:

    const rect = annot.get(PDFName.of('Rect'));
    const rectangle = rect.asRectangle();

This will properly normalize the rectangle.
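For context on what normalizing means here: a PDF /Rect may be given as any two diagonally opposite corners, so reading the four numbers as lower-left and upper-right can produce a negative width or height. The sketch below shows the idea with plain numbers; `normalizeRect` is a hypothetical helper and is not claimed to match the library's asRectangle() implementation.

```typescript
interface Rectangle {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: accepts two diagonally opposite corners in any order
// and returns the lower-left origin with a non-negative size.
function normalizeRect(x1: number, y1: number, x2: number, y2: number): Rectangle {
  return {
    x: Math.min(x1, x2),
    y: Math.min(y1, y2),
    width: Math.abs(x2 - x1),
    height: Math.abs(y2 - y1),
  };
}
```

With this, a rectangle written as upper-right corner first, such as `normalizeRect(100, 200, 50, 150)`, yields the same result as `normalizeRect(50, 150, 100, 200)`.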
What?
Extends form.flatten() to handle "merged" widget annotations - widgets that have field properties (/FT, /V, /T) directly on the annotation dict in page Annots rather than being registered in AcroForm.Fields.
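To make the distinction concrete, here is a minimal sketch of how a merged widget could be told apart from a registered field. It uses plain objects rather than pdf-lib's classes; `AnnotModel`, `findMergedWidgets`, and the string ids are all hypothetical stand-ins, and the check against AcroForm.Fields is an assumption about what "orphaned" means here.

```typescript
// Plain-object stand-in for an annotation dictionary (hypothetical model,
// not pdf-lib's PDFDict API).
interface AnnotModel {
  id: string; // stands in for the indirect object reference
  subtype: string; // /Subtype, e.g. "Widget" or "Link"
  fieldType?: string; // /FT, e.g. "Tx" (text) or "Btn" (button)
}

// Returns widgets that carry /FT themselves but are not listed in
// AcroForm.Fields, which a Fields-only traversal would never visit.
function findMergedWidgets(
  pageAnnots: readonly AnnotModel[],
  acroFormFieldIds: ReadonlySet<string>,
): AnnotModel[] {
  return pageAnnots.filter(
    (annot) =>
      annot.subtype === 'Widget' &&
      annot.fieldType !== undefined &&
      !acroFormFieldIds.has(annot.id),
  );
}
```

A link annotation, or a widget whose id already appears in the registered set, would be left alone by this filter.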
Why?
embedPages missed those annotations.
How?
Added a private method flattenMergedWidgets(), called at the end of flatten(), that:
- Iterates all pages and their annotations
- Identifies widget annotations (/Subtype: /Widget) with field type (/FT) directly on them
- Resolves the appearance stream from /AP/N, handling:
  - Direct streams (text fields)
  - Appearance state dictionaries (checkboxes/radio buttons): looks up /AS to get the current state
- Draws the appearance as an XObject at the widget's /Rect position
- Removes the widget annotation
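The appearance-resolution step above can be sketched as follows. This is an illustration with plain objects, not the PR's actual code: `WidgetModel`, `AppearanceStream`, `AppearanceStates`, and `resolveNormalAppearance` are hypothetical names, and real PDF dictionaries would need dereferencing that is omitted here.

```typescript
// Hypothetical stand-ins for the two shapes /AP /N can take.
type AppearanceStream = { kind: 'stream'; name: string };
type AppearanceStates = {
  kind: 'states';
  states: Record<string, AppearanceStream | undefined>;
};

interface WidgetModel {
  normalAppearance?: AppearanceStream | AppearanceStates; // /AP /N
  appearanceState?: string; // /AS, e.g. "Off" or an on-state name
}

// Picks the stream to draw: the stream itself when /AP /N is a direct stream,
// or the entry selected by /AS when /AP /N is a dictionary of states.
function resolveNormalAppearance(widget: WidgetModel): AppearanceStream | undefined {
  const normal = widget.normalAppearance;
  if (normal === undefined) return undefined; // nothing to draw
  if (normal.kind === 'stream') return normal; // text-field style
  if (widget.appearanceState === undefined) return undefined; // no /AS to select with
  return normal.states[widget.appearanceState]; // checkbox/radio style
}
```

Returning `undefined` when no stream can be resolved lets the caller skip the widget rather than draw nothing and then remove it; whether the PR behaves that way is not stated in the description.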
Alternative implementation: embedPages could copy the annotations as such, as copyPages does, but my current code uses flatten, so this seemed like the right fix.
Testing?
I tested with documents that had "merged" widget annotations. Before this PR the annotations were lost; now they are preserved.
New Dependencies?
No.
Screenshots
Before:

After:

Suggested Reading?
No
Anything Else?
I used Claude Code in an AI pair-programming fashion (I directed the work, even though I did not write the code myself).
I'm not sure how a test for this should be done.
I ran the linter, and it wanted to make changes in files unrelated to this PR, so I didn't commit those.
Checklist