Segmentation fault on /JPXDecode images in Termux (Android aarch64) - Proposed WITHOUT_JPX build option #251

### Description
On Termux (Android aarch64, Python 3.13), `docling-parse` crashes with a segmentation fault when it parses a PDF that contains JPEG 2000 (JPX) images. The crash occurs in the native C++ code during stream decoding.

### Environment
- **OS**: Android 11 (Termux)
- **Arch**: aarch64
- **Python**: 3.13
- **Library Version**: 1.0.0 (current source build)

### Root Cause
The crash happens in `src/parse/pdf_resources/page_xobject_image.h` within `init_stream_data()`, when `qpdf_xobject.getStreamData()` is called on a stream filtered with `/JPXDecode`. The last log message before the fault is the raw stream size, which points at the decode call itself. I have not confirmed the underlying cause; it appears to be related to ABI conflicts or instability in the OpenJPEG backend in the Termux environment.

### Log Snippet
```log
2026-04-12 19:41:46.035 (   0.053s) [         FC6C500]   page_xobject_image.h:427   INFO| filter: /JPXDecode
2026-04-12 19:41:46.035 (   0.053s) [         FC6C500]   page_xobject_image.h:433   INFO| init_stream_data
2026-04-12 19:41:46.035 (   0.053s) [         FC6C500]   page_xobject_image.h:444   INFO| raw stream size: 21435 bytes
Fatal Python error: Segmentation fault
```

### Proposed Fix
JPX decoding is unstable on Termux/Android, and not every use case needs it. A `WITHOUT_JPX` build option would let the library skip the decoding path for `/JPXDecode` streams, so that the rest of the document still parses.

**1. CMakeLists.txt**
Add a toggle to make JPX support optional:
```cmake
option(WITHOUT_JPX "Disable JPX support" OFF)
if(WITHOUT_JPX)
    add_definitions(-DWITHOUT_JPX)
endif()
```

**2. src/parse/pdf_resources/page_xobject_image.h**
Implement a check and guard the decoding call:
```cpp
bool has_jpx_filter() const {
    for(auto const& f : image_filters) {
        if(f == "/JPXDecode") return true;
    }
    return false;
}

// In init_stream_data()
try {
    bool skip_decoding = false;
#ifdef WITHOUT_JPX
    if (has_jpx_filter()) skip_decoding = true;
#endif
    if (!skip_decoding) {
        decoded_stream_data = to_shared_ptr(qpdf_xobject.getStreamData());
    } else {
        LOG_S(WARNING) << "skipping decoding due to WITHOUT_JPX and /JPXDecode filter";
        decoded_stream_data = nullptr;
    }
} catch(...) { ... }
```

I verified this fix locally on Termux. With `WITHOUT_JPX` enabled, JPX images are skipped (their `decoded_stream_data` stays null) and the rest of the document content (text, fonts, vectors) parses without crashing.
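
The guard decision can also be exercised in isolation, outside the parser. The following is a minimal sketch and not the library's code: `should_skip_decoding` and its `jpx_disabled` parameter are hypothetical names, and the filter list is passed in as an argument instead of being read from the `image_filters` member.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns true if any filter in the list is /JPXDecode.
// Free-function variant of the has_jpx_filter() member proposed above.
inline bool has_jpx_filter(const std::vector<std::string>& image_filters) {
    return std::any_of(image_filters.begin(), image_filters.end(),
                       [](const std::string& f) { return f == "/JPXDecode"; });
}

// Hypothetical helper: decides whether decoding should be skipped.
// `jpx_disabled` stands in for "WITHOUT_JPX was defined at build time".
inline bool should_skip_decoding(const std::vector<std::string>& image_filters,
                                 bool jpx_disabled) {
    return jpx_disabled && has_jpx_filter(image_filters);
}
```

Because a skipped image ends up with a null `decoded_stream_data`, any code that consumes it would need a null check before dereferencing.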

