fix: flatten merged widget annotations in form.flatten() #134

Open
horacio-penya wants to merge 1 commit into cantoo-scribe:master from horacio-penya:fix/flatten-merged-widgets

Conversation

@horacio-penya

Extends form.flatten() to handle "merged" widget annotations - widgets that have field properties (/FT, /V, /T) directly on the annotation dict in page Annots rather than being registered in AcroForm.Fields.

Previously, form.flatten() would do nothing for PDFs with merged widgets because getFields() only traverses AcroForm.Fields. Now it also scans page annotations for orphaned widgets and flattens them.

Handles both simple appearances (text fields) and stateful appearances (checkboxes, radio buttons with /AS appearance state).
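The two appearance cases can be sketched roughly as follows. This is a minimal illustration over plain objects, not the code in this PR and not the pdf-lib API: `WidgetDict`, `AppearanceStream`, `AppearanceStates` and `resolveNormalAppearance` are hypothetical names invented for the example.

```typescript
// Hypothetical plain-object model of a widget annotation (assumption, not pdf-lib types).
type AppearanceStream = { kind: 'stream'; id: string };
type AppearanceStates = {
  kind: 'states';
  states: Record<string, AppearanceStream>;
};

interface WidgetDict {
  AS?: string; // current appearance state name, e.g. "Yes" or "Off"
  AP?: { N?: AppearanceStream | AppearanceStates }; // normal appearance
}

// Returns the stream that should be drawn, or undefined when none can be resolved.
function resolveNormalAppearance(
  widget: WidgetDict,
): AppearanceStream | undefined {
  const normal = widget.AP?.N;
  if (!normal) return undefined;

  // Simple case (e.g. text fields): /AP/N is the stream itself.
  if (normal.kind === 'stream') return normal;

  // Stateful case (checkboxes, radio buttons): /AP/N maps state names to
  // streams, and /AS names the one currently shown.
  if (widget.AS === undefined) return undefined;
  return normal.states[widget.AS];
}
```

A widget with no /AP/N, or with a state dictionary but no matching /AS entry, resolves to `undefined` in this sketch, so a caller could skip it rather than draw nothing meaningful.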

What?

Extends form.flatten() to handle "merged" widget annotations - widgets that have field properties (/FT, /V, /T) directly on the annotation dict in page Annots rather than being registered in AcroForm.Fields.

Why?

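embedPages does not carry these annotations over, and because form.flatten() skipped them as well, their content was lost from the output.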

How?

Added a private method flattenMergedWidgets() called at the end of flatten() that:

  • Iterates all pages and their annotations
  • Identifies widget annotations (/Subtype: /Widget) with field type (/FT) directly on them
  • Resolves the appearance stream from /AP/N, handling:
      • Direct streams (text fields)
      • Appearance state dictionaries (checkboxes/radio buttons) - looks up /AS to get current state
  • Draws the appearance as an XObject at the widget's /Rect position
  • Removes the widget annotation
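The identification and removal steps above could look something like the sketch below. It works on plain objects rather than pdf-lib types. `AnnotDict`, `findMergedWidgets` and `removeAnnots` are hypothetical names, and the check against `AcroForm.Fields` references is an assumption about what "orphaned" means here, not something confirmed by the diff.

```typescript
// Hypothetical stand-ins for PDF objects (assumption, not the pdf-lib API).
interface AnnotDict {
  ref: number; // stand-in for an indirect object reference
  Subtype?: string; // e.g. "Widget", "Link"
  FT?: string; // field type placed directly on the annotation, e.g. "Tx", "Btn"
}

// A "merged" widget: a /Widget annotation that carries /FT itself and
// (assumption) is not listed among the AcroForm field references.
function findMergedWidgets(
  pageAnnots: AnnotDict[],
  acroFormFieldRefs: Set<number>,
): AnnotDict[] {
  return pageAnnots.filter(
    (annot) =>
      annot.Subtype === 'Widget' &&
      annot.FT !== undefined &&
      !acroFormFieldRefs.has(annot.ref),
  );
}

// Removal is done after iteration so the list is not mutated while scanning,
// mirroring the idea of collecting references first and deleting afterwards.
function removeAnnots(pageAnnots: AnnotDict[], toRemove: AnnotDict[]): AnnotDict[] {
  const refs = new Set(toRemove.map((annot) => annot.ref));
  return pageAnnots.filter((annot) => !refs.has(annot.ref));
}
```

Collecting the widgets first and removing them in a second pass avoids skipping entries when the annotation array shrinks during iteration.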

Alternative implementation: embedPages could copy the annotations as such, as copyPages does, but my current code uses flatten, so this seemed like the right fix.

Testing?

I tested with documents that had "merged" widget annotations. Before this PR the annotations were lost; now they are preserved.

New Dependencies?

No.

Screenshots

Before:
image

After:
image

Suggested Reading?

No

Anything Else?

I used Claude Code in an AI pair-programming way (I directed the work, even though I did not write the code myself).

I'm not sure how a test for this should be done.

I ran the linter, and it wanted to make changes in files unrelated to this PR, so I didn't commit those.

Checklist

  • I read CONTRIBUTING.md.
  • I read MAINTAINERSHIP.md#pull-requests.
  • I added/updated unit tests for my changes.
  • I added/updated integration tests for my changes.
  • I ran the integration tests.
  • I tested my changes in Node, Deno, and the browser.
  • I viewed documents produced with my changes in Adobe Acrobat, Foxit Reader, Firefox, and Chrome.
  • I added/updated doc comments for any new/modified public APIs.
  • My changes work for both new and existing PDF files.
  • I ran the linter on my changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
const annotsToRemove: PDFRef[] = [];

for (let i = 0; i < annots.size(); i++) {
  const annotRef = annots.get(i);
Collaborator

Probably a good idea to wrap this in a try/catch and handle errors gracefully
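
One way to read this suggestion is to isolate each annotation so that a single malformed entry does not abort the whole pass. The sketch below is generic and independent of pdf-lib. `forEachSafely` and `Failure` are hypothetical names, and whether failures should be collected, logged, or ignored is left open.

```typescript
// Hypothetical record of an item that could not be processed.
interface Failure {
  index: number;
  error: unknown;
}

// Runs `handle` on every item; an exception from one item is recorded
// and the loop continues with the next item.
function forEachSafely<T>(items: T[], handle: (item: T) => void): Failure[] {
  const failures: Failure[] = [];
  items.forEach((item, index) => {
    try {
      handle(item);
    } catch (error) {
      failures.push({ index, error });
    }
  });
  return failures;
}
```

Returning the failures instead of swallowing them lets the caller decide how to report annotations that were skipped.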

const rect = annot.get(PDFName.of('Rect'));
if (!(rect instanceof PDFArray) || rect.size() < 4) continue;

const x1 = (rect.get(0) as any)?.asNumber?.() ?? 0;
Collaborator

You could use:

const rect = annot.get(PDFName.of('Rect'));
const rectangle = rect.asRectangle();

This will properly normalize the rectangle
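
For illustration, here is what normalizing a /Rect could mean, assuming the goal is an origin at the lower-left corner and non-negative width and height regardless of the order in which the corners were written. This is a standalone sketch, not the implementation of `asRectangle()`. `Rectangle` and `normalizeRect` are hypothetical names.

```typescript
interface Rectangle {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Takes the four /Rect numbers [x1, y1, x2, y2], which describe two opposite
// corners in any order, and returns a lower-left origin plus positive size.
function normalizeRect(
  [x1, y1, x2, y2]: [number, number, number, number],
): Rectangle {
  return {
    x: Math.min(x1, x2),
    y: Math.min(y1, y2),
    width: Math.abs(x2 - x1),
    height: Math.abs(y2 - y1),
  };
}
```

With corners given as `[100, 50, 20, 10]`, this yields an origin of (20, 10) with width 80 and height 40, whereas reading the first two numbers as the origin would give negative dimensions.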
