Commit d32d67a

sc-protegrity and protegrity-gheuser authored
push to pre-release branch (#20) (#21)
Co-authored-by: protegrity-gheuser <protegrity.gheuser@protegrity.com>
Co-authored-by: protegrity-gheuser <protegrity-gheuser@protegrity.com>
1 parent 0fa0cbd commit d32d67a

30 files changed, +1679 -191 lines changed

CHANGELOG.md

Lines changed: 79 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -2,7 +2,84 @@
22

33
All notable changes to the Protegrity Developer Edition Python project will be documented in this file.
44

5-
## [1.0.0] - Current Release
5+
## [1.1.0] - 2025-12-15
6+
7+
### 🎉 Major New Features
8+
9+
#### Semantic Guardrails v1.1.0 SDK Support
10+
- **Enhanced Risk Scoring**: Updated SDK to support Semantic Guardrails v1.1.0 with improved risk assessment capabilities
11+
- **Vertical-Specific Models**: Added support for Finance and Healthcare industry-specific models
12+
- **Multi-turn Conversation Support**: Enhanced PII scanning and risk scoring across conversation history
13+
- **Improved API Interface**: Streamlined SDK interface for semantic guardrail operations
14+
15+
#### Data Discovery Enhancements
16+
- **Harmonized Classifications**: Support for categorized "harmonized" entity classifications
17+
- **Entity Mapping Updates**: Updated entity-to-data-element mapping to align with Data Discovery v1.1.1
18+
- **Improved Accuracy**: Enhanced classification accuracy and confidence scoring
19+
- **Overlapping Labels**: Fixed ordering logic for overlapping classification labels
20+
21+
#### Conda Package Support (NEW)
22+
- **Conda Recipe**: Added `conda-recipe/` directory with complete build configuration
23+
- **Cross-Platform Distribution**: Support for conda package distribution across platforms
24+
- **Meta.yaml Configuration**: Comprehensive conda package metadata and dependencies
25+
26+
### 🏗️ Architecture & Structure Changes
27+
28+
#### Repository Structure Enhancements
29+
- **Conda Recipe Directory**: New `conda-recipe/` with build scripts and metadata
30+
- **Enhanced Test Structure**: Improved test organization and expected outputs
31+
- **Configuration Updates**: Removed hardcoded endpoint URLs from `mapping_config.json`
32+
33+
#### SDK Interface Improvements
34+
- **Cleaner APIs**: Simplified method signatures for semantic guardrail operations
35+
- **Better Error Handling**: Enhanced error messages and exception handling
36+
- **Type Hints**: Improved type annotations for better IDE support
37+
38+
### 🔧 Enhanced Configuration & Service Features
39+
40+
#### Configuration Updates
41+
- **Dynamic Endpoint Configuration**: Removed hardcoded `endpoint_url` from mapping configuration
42+
- **Flexible Mapping**: Enhanced entity mapping configuration options
43+
- **Environment-Based Config**: Better support for environment-specific configurations
44+
45+
#### Testing Improvements
46+
- **Updated Test Outputs**: Refreshed expected test outputs to match Data Discovery 1.1.1 entity names and patterns
47+
- **Semantic Guardrails Unit Tests**: Updated unit tests for v1.1.0 compatibility
48+
- **Better Test Coverage**: Expanded test scenarios for new features
49+
50+
### 📚 Documentation & Developer Experience
51+
52+
#### README Enhancements
53+
- **"Why This Matters" Section**: Added context about the importance of data protection
54+
- **Improved Examples**: More comprehensive code examples and use cases
55+
- **Better Prerequisites**: Clearer setup instructions and dependency documentation
56+
57+
#### Developer Guidance
58+
- **Conda Installation**: New installation method via conda packages
59+
- **API Documentation**: Enhanced inline documentation and docstrings
60+
- **Migration Notes**: Guidance for upgrading from 1.0.0 to 1.1.0
61+
62+
### 🔄 Dependencies
63+
- **Updated Requirements**: Refreshed `requirements.txt` with compatible versions
64+
- **Conda Dependencies**: Added conda-specific dependency management
65+
- **Python Version**: Maintained Python 3.12.11+ requirement
66+
67+
### 🔐 Security
68+
- **Dependency Updates**: Updated to latest secure versions of dependencies
69+
- **Vulnerability Fixes**: Applied security patches as needed
70+
71+
### ⚠️ Breaking Changes
72+
- **Configuration Schema**: Removed `endpoint_url` from `mapping_config.json` - endpoints are now dynamically determined
73+
- **Entity Mapping**: Updated entity names and patterns to match Data Discovery 1.1.1 - may require configuration updates for custom mappings
74+
75+
### 📦 Distribution
76+
- **PyPI Package**: Available as `protegrity-developer-python` v1.1.0
77+
- **Conda Package**: New distribution channel via conda (coming soon)
78+
- **Wheel Distribution**: Pre-built wheel available for quick installation
79+
80+
---
81+
82+
## [1.0.0] - 2025-09-30
683

784
### 🎉 Major New Features
885

@@ -93,15 +170,11 @@ All notable changes to the Protegrity Developer Edition Python project will be d
93170

94171
---
95172

96-
## [Previous Release] - README1.md Baseline
173+
## [Previous Release] - README.md Baseline
97174

98175
### Features (Baseline)
99176
- Basic Find and Redact functionality
100177
- Single module structure (`protegrity_developer_python`)
101178
- Python 3.9.23 support
102179
- Basic configuration options
103180
- Simple repository structure
104-
105-
---
106-
107-
*Note: This changelog reflects the transition from the previous single-module approach to the current dual-module architecture with enhanced protection capabilities.*
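
Migration sketch for the "Entity Mapping" breaking change listed in the 1.1.0 entry: a minimal example, assuming a custom `named_entity_map` is a plain dict from entity name to data element (the shape used in the README snippets in this commit). The only rename this commit itself shows is `SOCIAL_SECURITY_NUMBER` to `SOCIAL_SECURITY_ID`. `ENTITY_RENAMES` and `migrate_named_entity_map` are hypothetical helper names, not part of the SDK, and any further renames would have to come from the Data Discovery 1.1.1 documentation.

```python
# Hypothetical helper, not part of the SDK. The single rename listed here is
# the one visible in this commit's README diff; extend the table as needed.
ENTITY_RENAMES = {"SOCIAL_SECURITY_NUMBER": "SOCIAL_SECURITY_ID"}


def migrate_named_entity_map(old_map, renames=ENTITY_RENAMES):
    """Return a copy of old_map with old entity names replaced by new ones."""
    migrated = {}
    for entity, data_element in old_map.items():
        new_entity = renames.get(entity, entity)
        # Refuse to silently merge two entries that disagree after renaming.
        if new_entity in migrated and migrated[new_entity] != data_element:
            raise ValueError(f"conflicting mappings for {new_entity!r}")
        migrated[new_entity] = data_element
    return migrated


old_map = {"PERSON": "NAME", "SOCIAL_SECURITY_NUMBER": "SSN"}
print(migrate_named_entity_map(old_map))
# {'PERSON': 'NAME', 'SOCIAL_SECURITY_ID': 'SSN'}
```

Returning a new dict rather than mutating the input keeps the old mapping available for comparison while a configuration is being upgraded.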

README.md

Lines changed: 29 additions & 9 deletions
@@ -1,12 +1,14 @@
11
<div align="center">
22

33
# Protegrity Developer Edition Python
4-
[![Version](https://img.shields.io/badge/version-1.0.0-green.svg?style=flat)](https://github.com/Protegrity-Developer-Edition/protegrity-developer-python/releases)
4+
[![Version](https://img.shields.io/badge/version-1.1.0-green.svg?style=flat)](https://github.com/Protegrity-Developer-Edition/protegrity-developer-python/releases)
55
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg?style=flat)](https://github.com/Protegrity-Developer-Edition/protegrity-developer-python/blob/main/LICENSE)
66
[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg?style=flat)](https://www.python.org/downloads/)
77
[![Linux](https://img.shields.io/badge/Linux-FCC624?style=flat&logo=linux&logoColor=black)](https://www.linux.org/)
88
[![Windows](https://img.shields.io/badge/Windows-0078D6?style=flat&logo=windows&logoColor=white)](https://www.microsoft.com/windows/)
99
[![macOS](https://img.shields.io/badge/mac%20os-000000?style=flat&logo=macos&logoColor=F0F0F0)](https://www.apple.com/macos/)
10+
[![PyPI 1.1.0](https://img.shields.io/pypi/v/protegrity-developer-python.svg)](https://pypi.org/project/protegrity-developer-python/)
11+
[![Anaconda 1.1.0](https://anaconda.org/protegrity/protegrity-developer-python/badges/version.svg?style=flat)](https://anaconda.org/protegrity/protegrity-developer-python)
1012
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/Protegrity-Developer-Edition/protegrity-developer-python)
1113
</div>
1214

@@ -18,6 +20,7 @@ Customize, compile, and use the module as per your requirement.
1820
## Table of Contents
1921

2022
1. [Overview](#overview)
23+
- [Why This Matters](#why-this-matters)
2124
2. [Repository Structure](#repository-structure)
2225
3. [Features](#features)
2326
- [Protegrity Developer Python](#Protegrity-Developer-Python)
@@ -44,6 +47,17 @@ This repository contains two powerful modules designed to handle different aspec
4447
- **protegrity_developer_python** - Focuses on data discovery, classification, and redaction of Personally Identifiable Information (PII) in unstructured text
4548
- **appython** - Provides comprehensive data protection and unprotection capabilities for structured data
4649

50+
#### Why This Matters
51+
52+
Sensitive data shows up in more places than you'd expect — logs, payloads, prompts, training sets, and unstructured text. This Python module gives you tools to find and protect that data using tokenization, masking, and discovery — whether it's in an AI pipeline or a local script. No infrastructure, no UI, just code.
53+
54+
- **Developer-first experience:** Open APIs, sample apps, and modular design make it easy to embed data discovery and protection into any Python project.
55+
56+
- **Accelerate innovation:** Prototype and validate data discovery and protection strategies in a lightweight, containerized sandbox.
57+
58+
- **Enable responsible AI:** Protect sensitive information in training data, prompts, and outputs for GenAI and machine learning workflows.
59+
60+
- **Simplify compliance:** Meet regulatory requirements for data privacy with built-in detection and protection capabilities.
4761

4862
## Repository Structure
4963

@@ -63,6 +77,7 @@ This repository contains two powerful modules designed to handle different aspec
6377
│ └── protegrity_developer_python
6478
│ ├── __init__.py
6579
│ ├── securefind.py
80+
│ ├── scan.py
6681
│ └── utils
6782
└── tests
6883
├── e2e
@@ -77,7 +92,8 @@ This repository contains two powerful modules designed to handle different aspec
7792
│ ├── bulk
7893
│ ├── mock
7994
│ └── single
80-
└── find_and_secure
95+
├── find_and_secure
96+
└── semantic_guardrail
8197
8298
```
8399

@@ -91,6 +107,7 @@ This repository contains two powerful modules designed to handle different aspec
91107
| **Find and Protect** | Classifies and protects Personally Identifiable Information (PII) in unstructured text using Protegrity protection policies. |
92108
| **Find and Unprotect** | Restores original Personally Identifiable Information (PII) data from its protected form. |
93109
| **Cross-Platform Support** | Compatible with **Linux**, **Windows**, and **MacOS**. |
110+
| **Semantic Guardrail Support** | Scan conversations for PII and risk using Semantic Guardrail API. |
94111

95112
### Application Protector Python
96113

@@ -162,8 +179,8 @@ For setup instructions, please refer to the documentation [here](https://github.
162179
import protegrity_developer_python
163180
164181
protegrity_developer_python.configure(
165-
endpoint_url="http://localhost:8580/pty/data-discovery/v1.0/classify",
166-
named_entity_map={"PERSON": "NAME", "SOCIAL_SECURITY_NUMBER": "SSN"},
182+
endpoint_url="http://localhost:8580/pty/data-discovery/v1.1/classify",
183+
named_entity_map={"PERSON": "NAME", "SOCIAL_SECURITY_ID": "SSN"},
167184
masking_char="#",
168185
classification_score_threshold=0.6,
169186
method="redact",
@@ -182,8 +199,8 @@ print(output_text)
182199
import protegrity_developer_python
183200

184201
protegrity_developer_python.configure(
185-
endpoint_url="http://localhost:8580/pty/data-discovery/v1.0/classify",
186-
named_entity_map={"PERSON": "NAME", "SOCIAL_SECURITY_NUMBER": "SSN"},
202+
endpoint_url="http://localhost:8580/pty/data-discovery/v1.1/classify",
203+
named_entity_map={"PERSON": "NAME", "SOCIAL_SECURITY_ID": "SSN"},
187204
masking_char="#",
188205
classification_score_threshold=0.6,
189206
method="redact",
@@ -202,8 +219,8 @@ print(output_text)
202219
import protegrity_developer_python
203220

204221
protegrity_developer_python.configure(
205-
endpoint_url="http://localhost:8580/pty/data-discovery/v1.0/classify",
206-
named_entity_map={"PERSON": "NAME", "SOCIAL_SECURITY_NUMBER": "SSN"},
222+
endpoint_url="http://localhost:8580/pty/data-discovery/v1.1/classify",
223+
named_entity_map={"PERSON": "NAME", "SOCIAL_SECURITY_ID": "SSN"},
207224
masking_char="#",
208225
classification_score_threshold=0.6,
209226
method="redact",
@@ -212,7 +229,7 @@ protegrity_developer_python.configure(
212229
)
213230

214231
#Pass the output received from find and protect
215-
input_text = "[PERSON]7ro8 lfU'I[/PERSON] SSN is [SOCIAL_SECURITY_NUMBER]616-16-2210[/SOCIAL_SECURITY_NUMBER]."
232+
input_text = "[PERSON]7ro8 lfU'I[/PERSON] SSN is [SOCIAL_SECURITY_ID]616-16-2210[/SOCIAL_SECURITY_ID]."
216233
output_text = protegrity_developer_python.find_and_unprotect(input_text)
217234
print(output_text)
218235
```
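
The three README hunks above repeat the same `configure(...)` arguments with the endpoint hardcoded. One way to keep the endpoint out of source, in the spirit of the changelog's "Environment-Based Config" item, is sketched below. It assumes only the keyword arguments visible in these hunks; the environment variable name `PTY_DISCOVERY_ENDPOINT` and the helper `build_configure_kwargs` are hypothetical and are not defined by the SDK.

```python
import os

# Default taken from the README snippets in this commit.
DEFAULT_ENDPOINT = "http://localhost:8580/pty/data-discovery/v1.1/classify"


def build_configure_kwargs(env=None):
    """Assemble keyword arguments for configure(), reading the endpoint from env.

    PTY_DISCOVERY_ENDPOINT is a hypothetical variable name used for illustration.
    """
    env = os.environ if env is None else env
    return {
        "endpoint_url": env.get("PTY_DISCOVERY_ENDPOINT", DEFAULT_ENDPOINT),
        "named_entity_map": {"PERSON": "NAME", "SOCIAL_SECURITY_ID": "SSN"},
        "masking_char": "#",
        "classification_score_threshold": 0.6,
        "method": "redact",
    }


kwargs = build_configure_kwargs({})
print(kwargs["endpoint_url"])

# With the package installed, the dict could be passed straight through:
# import protegrity_developer_python
# protegrity_developer_python.configure(**kwargs)
```

Passing an explicit mapping instead of reading `os.environ` directly makes the helper easy to exercise in tests without touching the process environment.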
@@ -266,6 +283,9 @@ print("Unprotected Data:%s "%unprotected_data)
266283

267284
- [Protegrity Developer Edition documentation](http://developer.docs.protegrity.com/)
268285
- For API reference and tutorials, visit [Developer Portal](https://www.protegrity.com/developers)
286+
- For more information about Data Discovery, refer to the [Data Discovery documentation]( https://docs.protegrity.com/data-discovery/1.1.1/docs/).
287+
- For more information about Semantic Guardrails, refer to the [Semantic Guardrails documentation]( https://docs.protegrity.com/sem_guardrail/1.1.0/docs/).
288+
- For more information about Application Protector Python, refer to the [Application Protector Python documentation]( https://docs.protegrity.com/10.0/protectors/application_protector/ap_python/).
269289

270290
## Sample Use Case
271291

conda-recipe/bld.bat

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
REM Install the package using pip
2+
%PYTHON% -m pip install . -vv
3+
if errorlevel 1 exit 1

conda-recipe/build.sh

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
#!/bin/bash
2+
3+
# Install the package using pip
4+
$PYTHON -m pip install . -vv

conda-recipe/meta.yaml

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
{% set name = "protegrity-developer-python" %}
2+
{% set version = "1.1.0" %}
3+
4+
package:
5+
name: {{ name|lower }}
6+
version: {{ version }}
7+
8+
source:
9+
# Building from local source
10+
path: ..
11+
12+
build:
13+
number: 0
14+
noarch: python
15+
16+
requirements:
17+
host:
18+
- python >=3.12.11
19+
- pip
20+
- setuptools >=61.0
21+
run:
22+
- python >=3.12.11
23+
- requests
24+
25+
test:
26+
imports:
27+
- protegrity_developer_python
28+
- appython
29+
commands:
30+
- pip check
31+
requires:
32+
- pip
33+
34+
about:
35+
home: https://www.protegrity.com/developers
36+
license: MIT
37+
license_family: MIT
38+
license_file: LICENSE
39+
summary: Python module for integrating Protegrity's Data Discovery and Protection APIs into GenAI and traditional applications
40+
description: |
41+
This repository contains two powerful modules designed to handle different aspects of data protection:
42+
43+
- protegrity_developer_python - Focuses on data discovery, classification, and redaction of
44+
Personally Identifiable Information (PII) in unstructured text
45+
- appython - Provides comprehensive data protection and unprotection capabilities for structured data
46+
47+
Part of the Protegrity Developer Edition suite for integrating Protegrity's APIs into applications.
48+
doc_url: http://developer.docs.protegrity.com
49+
dev_url: https://github.com/Protegrity-Developer-Edition/protegrity-developer-python
50+
51+
extra:
52+
recipe-maintainers:
53+
- Protegrity

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
[project]
22
name = "protegrity-developer-python"
3-
version = "1.0.0"
3+
version = "1.1.0"
44
description = "Python module for integrating Protegrity's Data Discovery and Protection APIs into GenAI and traditional applications."
55
authors = [{ name = "Protegrity", email="info@protegrity.com" }]
66
requires-python = ">=3.12.11"

requirements.txt

Lines changed: 2 additions & 1 deletion
@@ -12,4 +12,5 @@ attrs==25.3.0
1212
requests==2.32.4
1313
python-dotenv==1.1.1
1414
pytest-html==4.1.1
15-
pytest-metadata==3.1.1
15+
pytest-metadata==3.1.1
16+
pydantic==2.12.4

setup.cfg

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
[metadata]
22
name = protegrity-developer-python
3-
version = 1.0.0
3+
version = 1.1.0
44
author = Protegrity
55
author_email = info@protegrity.com
66
license = MIT

src/appython/protector.py

Lines changed: 2 additions & 2 deletions
@@ -78,7 +78,7 @@ def get_version(self):
7878
protector.get_version()
7979
8080
"""
81-
return "1.0.0"
81+
return "1.1.0"
8282

8383
def get_version_ex(self):
8484
"""Returns the extended version of the AP Python in use.
@@ -97,7 +97,7 @@ def get_version_ex(self):
9797
protector.get_version_ex()
9898
9999
"""
100-
return "SDK Version: 1.0.0, Core Version: 1.0.0"
100+
return "SDK Version: 1.1.0, Core Version: 1.1.0"
101101

102102
def terminate(self):
103103
return True
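
This commit bumps the version string by hand in four places: `conda-recipe/meta.yaml`, `pyproject.toml`, `setup.cfg`, and `src/appython/protector.py`. A small consistency check along the following lines could catch a missed file in a future release. It is a sketch under the assumption that each file keeps the layout shown in the hunks above; all function names are hypothetical, and regular expressions are used for the TOML and Jinja lines purely to keep the example free of extra parsers.

```python
import configparser
import re


def version_from_pyproject(text):
    # Matches a top-level line such as: version = "1.1.0"
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    return match.group(1) if match else None


def version_from_setup_cfg(text):
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser.get("metadata", "version", fallback=None)


def version_from_meta_yaml(text):
    # Matches the Jinja assignment: {% set version = "1.1.0" %}
    match = re.search(r'\{%\s*set\s+version\s*=\s*"([^"]+)"\s*%\}', text)
    return match.group(1) if match else None


def version_from_protector(text):
    # Matches get_version()'s literal, e.g. return "1.1.0"
    match = re.search(r'return\s+"(\d+\.\d+\.\d+)"', text)
    return match.group(1) if match else None


def versions_agree(versions):
    """True when every file yielded a version and all of them are equal."""
    found = set(versions.values())
    return None not in found and len(found) == 1


versions = {
    "pyproject.toml": version_from_pyproject('[project]\nversion = "1.1.0"\n'),
    "setup.cfg": version_from_setup_cfg("[metadata]\nversion = 1.1.0\n"),
    "meta.yaml": version_from_meta_yaml('{% set version = "1.1.0" %}'),
    "protector.py": version_from_protector('        return "1.1.0"\n'),
}
print(versions_agree(versions))
```

In a real check the inline strings would be replaced by the contents of the files read from the repository.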
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
1+
"""
2+
Module for scanning conversations for security risks using semantic guardrails
3+
"""
4+
5+
from protegrity_developer_python.utils.logger import get_logger
6+
7+
8+
from protegrity_developer_python.utils.semantic_guardrails import (
9+
MessageRiskRequest,
10+
scan_messages,
11+
MessageBatchRiskResponse,
12+
MessageBatchRiskRequest,
13+
)
14+
15+
logger = get_logger()
16+
17+
18+
def scan_conversation_messages(
19+
messages: list[MessageRiskRequest],
20+
) -> MessageBatchRiskResponse:
21+
"""
22+
Scan a batch of conversation messages for security risks using the Semantic Guardrails API.
23+
A risk assessment is executed on messages to ensure they are within the boundaries of expected topics and content.
24+
25+
Args:
26+
messages (list[MessageRiskRequest]): List of messages to scan.
27+
28+
Returns:
29+
MessageBatchRiskResponse: Risk assessment for the batch of messages.
30+
"""
31+
try:
32+
request = MessageBatchRiskRequest(messages=messages)
33+
response = scan_messages(request)
34+
return response
35+
except Exception as e:
36+
logger.error(f"Failed to scan conversation messages: {e}")
37+
raise
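
The new function logs a failure and then re-raises it with a bare `raise`, so callers still receive the original exception. The standalone sketch below illustrates that pattern without the SDK. `FakeMessage`, `scan_with_logging`, and `failing_backend` are stand-ins invented for this example; the real field layout of `MessageRiskRequest` is not shown in this diff and is not assumed here.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("scan_sketch")


@dataclass
class FakeMessage:
    """Stand-in for MessageRiskRequest; these fields are hypothetical."""
    role: str
    content: str


def scan_with_logging(messages, scan_fn):
    """Delegate to scan_fn, log any failure, and re-raise it unchanged."""
    try:
        return scan_fn(messages)
    except Exception as exc:
        logger.error("Failed to scan conversation messages: %s", exc)
        raise  # bare raise preserves the original exception type and traceback


def failing_backend(messages):
    # Simulates the service being unavailable.
    raise ConnectionError("service unreachable")


try:
    scan_with_logging([FakeMessage("user", "hello")], failing_backend)
except ConnectionError as exc:
    print(f"caller still sees: {exc}")
```

Because the exception type is preserved, a caller can handle connection problems differently from, say, validation errors, while the log line still records every failure in one place.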
