feat(analyzer): Add ONNX Runtime backend to HuggingFaceNerRecognizer #2086
yuriihavrylko wants to merge 7 commits into
Pull request overview
This PR extends presidio-analyzer’s HuggingFaceNerRecognizer to support an optional ONNX Runtime inference path (via Optimum), enabling execution-provider based acceleration and use of pre-quantized ONNX models while preserving the existing PyTorch/transformers behavior by default.
Changes:
- Added `backend` selection (`torch` default / `ort`) and `**model_kwargs` pass-through to support Optimum ORT model loading and future loader options.
- Added optional dependency group (`onnxruntime`) plus configuration/docs for running HF NER via ONNX Runtime execution providers.
- Expanded unit tests and added new end-to-end tests for real model loading (torch + ort paths, when optional deps are present).
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| presidio-analyzer/presidio_analyzer/predefined_recognizers/ner/huggingface_ner_recognizer.py | Implements backend selection and ORT loading path, and forwards extra loader kwargs. |
| presidio-analyzer/pyproject.toml | Adds an onnxruntime optional-dependency group for Optimum/ORT usage. |
| presidio-analyzer/presidio_analyzer/input_validation/schemas.py | Updates config validation dump behavior to exclude None values. |
| presidio-analyzer/presidio_analyzer/conf/hf_ner_onnx.yaml | Adds a sample analyzer config using the new ORT backend with a mixed-layout HF repo. |
| presidio-analyzer/tests/test_huggingface_ner_recognizer.py | Updates mocked tests to reflect model_kwargs and adds ORT-path unit coverage. |
| presidio-analyzer/tests/test_huggingface_ner_recognizer_e2e.py | Adds new opt-in E2E tests that exercise real transformers/optimum pipelines. |
| mkdocs.yml | Adds the new documentation page to the Analyzer docs nav. |
| docs/analyzer/recognizer_registry_provider.md | Documents the new backend parameter and links to backend guidance. |
| docs/analyzer/nlp_engines/gpu_usage.md | Links GPU usage docs to the new HF NER backend guidance. |
| docs/analyzer/huggingface_ner_inference.md | New detailed guide for torch vs ORT backends, installation, and execution-provider configuration. |
… validation" This reverts commit b2c7847.
    mock_hf_pipeline.assert_called_once_with(
        "token-classification",
        model="test-model",
        tokenizer="test-model",
        aggregation_strategy="simple",
These are false positives - EntityRecognizer.__init__ calls self.load(), so constructing the recognizer already triggers the pipeline/ORT model creation - the assertions run against that.
Verified the test passes as written. Adding an explicit rec.load() would be a redundant no-op (load() early-returns when the pipeline is already built).
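The pattern this reply relies on can be illustrated with a small sketch. The class names, the `factory` argument, and the `calls` list below are hypothetical stand-ins, not Presidio's actual classes; the sketch only assumes what the reply states, namely that the base constructor calls `load()` and that `load()` returns early once the pipeline exists.

```python
class BaseRecognizer:
    """Hypothetical stand-in for a base class whose constructor calls load()."""

    def __init__(self) -> None:
        self.load()

    def load(self) -> None:
        raise NotImplementedError


class PipelineRecognizer(BaseRecognizer):
    """Hypothetical subclass that builds its pipeline at most once."""

    def __init__(self, factory) -> None:
        self._factory = factory
        self.pipeline = None
        super().__init__()  # the base constructor triggers load()

    def load(self) -> None:
        if self.pipeline is not None:
            return  # already built, so a second load() does nothing
        self.pipeline = self._factory()


calls = []


def fake_factory():
    # Records each build so the number of factory calls can be checked.
    calls.append("built")
    return object()


rec = PipelineRecognizer(fake_factory)
first_pipeline = rec.pipeline
rec.load()  # explicit second call, expected to be a no-op
```

Under these assumptions, an assertion on the factory made right after construction already sees exactly one call, which is why an extra `rec.load()` in the test would change nothing.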
Change Description
Adds an optional ONNX Runtime inference backend to `HuggingFaceNerRecognizer`, selected via a new `backend` parameter (`"torch"` - default, unchanged - or `"ort"`). The ort backend loads token-classification models through `optimum.onnxruntime`, enabling execution-provider based acceleration and use of pre-quantized ONNX models.
How
- `backend` parameter on `HuggingFaceNerRecognizer`. For `"ort"`, the recognizer pre-loads `ORTModelForTokenClassification.from_pretrained()` explicitly and hands the model object to the pipeline. This scopes loader kwargs (`subfolder`, `file_name`, `provider`, …) to the model loader only - required for mixed-layout repos (`onnx-community/*`, etc) that keep ONNX files under `onnx/` while config/tokenizer live at the repo root. Passing these at the pipeline level breaks on such repos.
- `**model_kwargs` pass-through (mirrors `GLiNERRecognizer`): extra kwargs, whether from Python or YAML, are forwarded to the active backend's loader. No recognizer changes needed for future loader options.
- `onnxruntime` extra in `pyproject.toml`: `pip install 'presidio-analyzer[onnxruntime]'` (CPU build). GPU/accelerator builds are installed directly instead of the extra (see docs) because the `onnxruntime*` packages ship the same Python module and conflict.
Usage
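As a rough illustration of the loader scoping and kwargs forwarding described in this PR, the sketch below uses stand-in functions in place of the real Optimum loader and transformers pipeline. `fake_ort_loader`, `fake_pipeline`, `build_ort_pipeline`, the model id, and the ONNX file name are all hypothetical; only the kwarg names `subfolder`, `file_name`, and `provider` come from the description.

```python
def fake_ort_loader(
    model_id: str,
    *,
    subfolder: str = "",
    file_name: str = "model.onnx",
    provider: str = "CPUExecutionProvider",
) -> dict:
    """Stand-in for a model loader that accepts only loader-specific kwargs."""
    return {
        "model_id": model_id,
        "subfolder": subfolder,
        "file_name": file_name,
        "provider": provider,
    }


def fake_pipeline(task: str, *, model, tokenizer: str, aggregation_strategy: str) -> dict:
    """Stand-in for a pipeline factory; it never sees the loader kwargs."""
    return {
        "task": task,
        "model": model,
        "tokenizer": tokenizer,
        "aggregation_strategy": aggregation_strategy,
    }


def build_ort_pipeline(model_id: str, **model_kwargs) -> dict:
    # Loader kwargs go to the model loader only; the pipeline receives the
    # already-loaded model object plus the repo id for the tokenizer.
    model = fake_ort_loader(model_id, **model_kwargs)
    return fake_pipeline(
        "token-classification",
        model=model,
        tokenizer=model_id,
        aggregation_strategy="simple",
    )


# Hypothetical mixed-layout repo: ONNX files under onnx/, tokenizer at the root.
pipe = build_ort_pipeline(
    "example-org/example-ner",
    subfolder="onnx",
    file_name="model_quantized.onnx",
)

# A misspelled kwarg is forwarded to the loader, which rejects it.
try:
    build_ort_pipeline("example-org/example-ner", subfoldr="onnx")
except TypeError as exc:
    typo_error = exc
else:
    typo_error = None
```

In this sketch the tokenizer is still resolved from the repo root while `subfolder` applies only to the model files, and a misspelled key surfaces as a `TypeError` instead of being silently dropped.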
Breaking change
Unknown constructor kwargs were previously logged and dropped; they are now forwarded to the model loader and raise `TypeError` if the loader rejects them. Typos in YAML recognizer configs that loaded silently before will now fail loudly at startup. This is intentional (silent misconfiguration of a PII detector is worse than a startup error), but existing configs with stray keys need cleanup.
Issue reference
None
Checklist