chore(deps): update dependency pdf2json to v4 #42
Open
renovate[bot] wants to merge 1 commit into master
ba3ef24 to 5957ddc
5957ddc to 5506003
This PR contains the following updates:
1.2.1 → 4.0.2
Release Notes
modesty/pdf2json (pdf2json)
v4.0.2: Stable Build v4.0.2
Adds support for transparent groups and ensures that endGroup merges sub-canvas text, lines, etc. back into the primary output data. This completes the fix for #418.
v4.0.1: Stable Build v4.0.1
Bug fixes
v4.0.0: Stable Build v4.0.0 [Breaking Changes]
v4.0.0 Release Notes
v4.0.0 includes critical fixes for text encoding, space preservation, and text positioning, along with improved error handling. This release contains breaking changes that require attention when upgrading from v3.x.
🚨 Breaking Changes
Text Encoding Change (Issue #385, PR #410)
What Changed: Text in JSON output is no longer URI-encoded. All text now outputs as UTF-8 directly.
Why: To properly support Chinese, Japanese, Korean, and other multi-byte Unicode characters. The previous URI encoding caused issues with CJK text display and partial character extraction.
Migration Required: If your code expects URI-encoded text, you must update it to handle plain UTF-8 text.
JSON Output Examples
Before v4.0.0 (URI-encoded):
{ "Pages": [{ "Texts": [{ "R": [{ "T": "Added%20Text%20from%20Acrobat" }] }] }] }
After v4.0.0 (UTF-8):
{ "Pages": [{ "Texts": [{ "R": [{ "T": "Added Text from Acrobat" }] }] }] }
Code Migration
Before v4.0.0:
After v4.0.0:
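As a rough sketch of this migration, assuming only the JSON shape from the examples above: code written for v3.x typically passed each `T` value through `decodeURIComponent`, while code for v4.0.0 reads `T` unchanged. The interfaces and function names below (`PdfData`, `collectText`, `extractTextV3`, `extractTextV4`) are hypothetical and not part of the pdf2json API.

```typescript
// Hypothetical minimal types for the JSON shape shown in the examples above.
interface TextRun { T: string }
interface TextBlock { R: TextRun[] }
interface Page { Texts: TextBlock[] }
interface PdfData { Pages: Page[] }

// Collects every T value, passing each one through the given decoder.
function collectText(data: PdfData, decode: (t: string) => string): string[] {
  const out: string[] = [];
  for (const page of data.Pages) {
    for (const block of page.Texts) {
      for (const run of block.R) {
        out.push(decode(run.T));
      }
    }
  }
  return out;
}

// Before v4.0.0: T is URI-encoded, so each value must be decoded.
function extractTextV3(data: PdfData): string[] {
  return collectText(data, decodeURIComponent);
}

// After v4.0.0: T is already plain UTF-8, so it is used unchanged.
function extractTextV4(data: PdfData): string[] {
  return collectText(data, (t) => t);
}

const v3Output: PdfData = {
  Pages: [{ Texts: [{ R: [{ T: "Added%20Text%20from%20Acrobat" }] }] }],
};
const v4Output: PdfData = {
  Pages: [{ Texts: [{ R: [{ T: "Added Text from Acrobat" }] }] }],
};

// Both calls yield the same human-readable text.
console.log(extractTextV3(v3Output));
console.log(extractTextV4(v4Output));
```

Leaving the `decodeURIComponent` call in place after upgrading is not harmless: v4 text that contains a literal `%` (for example `100%`) would make it throw a `URIError`, so the decode step should be removed and not merely tolerated.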
CJK Character Support
Before v4.0.0:
{ "T": "%E4%B8%AD%E6%96%87" }
After v4.0.0:
{ "T": "中文" }
✨ Features & Enhancements
Accurate Space Preservation (Issues #355, #361, #319, PR #411)
Complete overhaul of space detection and preservation in text extraction (test CLI with -c command line option):
textHScale for compressed/expanded text
Impact: Spaces in extracted text (both content.txt and JSON output) now accurately reflect the original PDF layout. Multi-word phrases, tables, and formatted text preserve proper spacing.
Example Output Improvement
Before v4.0.0:
After v4.0.0:
🐛 Bug Fixes
Text Block Coordinate Accuracy (Issue #408, PR #409)
Character Extraction Completeness (Issue #385, PR #410)
CLI Error Handling (Issue #414)
More related issues have likely been fixed as well (test PDFs are needed to confirm).
📦 Dependencies
v3.2.2: Stable Build v3.2.2
v3.2.1: Stable build: V3.2.1
v3.2.0: Stable build v3.2.0
-- fix: issue #68 and #396
-- add the node: protocol prefix to built-in module imports to make them explicit when running in environments other than Node, including Deno and Bun
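For illustration, the `node:` prefix marks an import as a runtime built-in instead of a package that happens to share its name. The snippet below is a generic sketch of that import style using `node:path`; it is not taken from the pdf2json source, and the file path is a made-up example.

```typescript
// Bare specifier (older style): resolved as a built-in by Node, but it is
// not self-evidently a built-in to other runtimes or to readers.
// import { basename } from "path";

// Explicit node: specifier: unambiguously the runtime's built-in module.
import { basename } from "node:path";

// Hypothetical path, used only to show that the import works.
const fileName = basename("/tmp/sample.pdf");
console.log(fileName);
```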
v3.1.6: Stable build v3.1.6
What's Changed
engines to devEngines, thanks @styfle for #387
New Contributors
Full Changelog: modesty/pdf2json@v3.1.5...v3.1.6
v3.1.5: Stable build v3.1.5
feature added:
Issues addressed:
v3.1.4: Stable Build v3.1.4
v3.1.3: Stable build v3.1.3
** ENOENT: no such file or directory, open '/var/task/../package.json' #343
** Node.js Server got stuck when parsing specific PDF while it is working for other PDFs #321
** TypeError: Cannot read property 'free' of undefined #318
** parserError: 'bad XRef entry' #277
** params.get is not a function #262
** Error: Requesting object that isn't resolved yet #255
v3.1.2: Stable build v3.1.2
v3.1.1: Stable build v3.1.1
This v3.1.1 release replaces pdf2json@3.1.0.
v3.1.0
v3.0.5: Stable build v3.0.5
v3.0.4: Stable Build v3.0.4
v3.0.3: Stable build v3.0.3
Enhancement:
v3.0.2: Stable Build v3.0.2
Bug fixes:
v3.0.1: Stable build v3.0.1
dependency update xmldom
v3.0.0: Stable build v3.0.0: ES Module
Breaking changes: converted from CommonJS to ES Module; see the README for details
plus a dependency upgrade for a security patch and other minor bug fixes
v2.1.0
v2.0.2: Stable build v2.0.2
release/version2 branch: patch security issues in 2.x line. issue #300
v2.0.1: Stable Build v2.0.1
Patch release, fix value of checkbox and add support for signature field.
v2.0.0: Stable build v2.0.0 (w/ breaking changes)
Major refactoring since 2015. Full meta support, minimal dependencies, improved exception handling and performance, better stream support, and more tests. See the README for details on breaking changes to the output JSON format.
v1.3.1
v1.3.0
v1.2.5: Stable build v1.2.5
Better error handling. README updates.
v1.2.4: Stable build v1.2.4
bug fixes and security updates
v1.2.3
v1.2.2
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.