Protegrity AI Developer Edition

Protect Sensitive
Data from AI Workflows

Downloadable SDKs for discovering and protecting sensitive data in prompts, logs, and unstructured text.

Build and Validate
Privacy-First AI Pipelines

Composable tools for discovering, protecting, and governing sensitive data in AI-driven applications.

Data Discovery

Detect sensitive entities in raw, unstructured data.

Find & Protect

Tokenize or mask sensitive values inline.

Semantic Guardrails

Score prompt and output risk in real time.

Synthetic Data

Generate safe datasets from real schemas.

DATA DISCOVERY

Detect PII in
Unstructured Text

Identify PII in prompts, logs, and messages with precise boundaries – even without schemas.

Input
Hi, I’m Dan Johnson and I need to update my account information.
My ssn is 123-45-6789, phone number is (555) 234-5678, and I live in LA.
Please send the confirmation to [email protected].
Raw unstructured text before discovery.
Protegrity SDK
import protegrity_developer_python as protegrity

# Discover all PII entities (returns dict by type)
result = protegrity.discover(text)

# Extract just the text values for display
entities = {
    entity_type: [d["text"] for d in detections]
    for entity_type, detections in result.items()
}
One SDK call discovers PII entities across the raw text.
Output
{
  "EMAIL_ADDRESS":  ["[email protected]"],
  "PERSON": ["Dan Johnson"],
  "LOCATION":  ["LA"],
  "PHONE_NUMBER": ["(555) 234-5678"],
  "SOCIAL_SECURITY_ID":  ["123-45-6789"]
}
Structured entities ready for masking, tokenization, or policy enforcement.
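As a rough illustration of what can be done with the structured output above, the sketch below replaces each discovered value with a `[TYPE]` placeholder. It is not part of the Protegrity SDK: `redact` is a hypothetical helper, it works on a copy of the sample output (without the email entry), and it relies on plain string matching, which is an assumption made for brevity.

```python
import re


def redact(text, entities):
    """Replace every discovered value with a [TYPE] placeholder (hypothetical helper)."""
    # Map each discovered value back to its entity type.
    lookup = {value: etype for etype, values in entities.items() for value in values}
    if not lookup:
        return text
    # Try longer values first so a short value never matches inside a longer one.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(value) for value in alternatives))
    # Single pass over the text, so inserted placeholders are never rescanned.
    return pattern.sub(lambda m: f"[{lookup[m.group(0)]}]", text)


text = (
    "Hi, I'm Dan Johnson and I need to update my account information. "
    "My ssn is 123-45-6789, phone number is (555) 234-5678, and I live in LA."
)
entities = {
    "PERSON": ["Dan Johnson"],
    "LOCATION": ["LA"],
    "PHONE_NUMBER": ["(555) 234-5678"],
    "SOCIAL_SECURITY_ID": ["123-45-6789"],
}

redacted = redact(text, entities)
print(redacted)
```

Matching on the value alone can over-redact short entities such as "LA" if the same characters appear elsewhere, so a production integration would more likely rely on the SDK's own protection call.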
FIND & PROTECT

Tokenize and Mask
Sensitive Data Inline

Tokenize or mask detected entities while preserving structure and readability.

Input
Hi, I’m Dan Johnson and I need to update my account information.
My ssn is 123-45-6789, phone number is (555) 234-5678, and I live in LA.
Please send the confirmation to [email protected], thanks!
Raw text containing unprotected PII.
Protegrity SDK
import protegrity_developer_python as protegrity

# Configure SDK with data discovery endpoint
protegrity.configure(endpoint_url="classify_endpoint")

# Find and protect PII in one call
protected_text = protegrity.find_and_protect(text)
One SDK call discovers and protects PII inline.
Output
Hi, I’m [PERSON]ybe B1elUnm[/PERSON] and I need to update my account information.
My ssn is [SSN]142-42-0001[/SSN], phone number is [PHONE](857) 142-4221[/PHONE], and I live in [LOCATION]iK[/LOCATION].
Please send the confirmation to [EMAIL][email protected][/EMAIL], thanks!
Tokenized/masked text safe for logs, pipelines, analytics, and AI systems.
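Downstream code may need to locate the protected spans in that output. The sketch below parses the `[TYPE]...[/TYPE]` tags seen in the sample. The tag grammar (uppercase labels, matching close tag) is inferred from the example output only, and `extract_protected` and `strip_tags` are hypothetical helpers, not SDK functions.

```python
import re

# Assumed tag shape, inferred from the sample output: [LABEL]value[/LABEL]
TAGGED = re.compile(r"\[([A-Z_]+)\](.*?)\[/\1\]", re.DOTALL)


def extract_protected(text):
    """Return (label, protected_value) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in TAGGED.finditer(text)]


def strip_tags(text):
    """Drop the tags but keep the protected values in place."""
    return TAGGED.sub(lambda m: m.group(2), text)


protected_text = (
    "Hi, I'm [PERSON]ybe B1elUnm[/PERSON] and I need to update my account information. "
    "My ssn is [SSN]142-42-0001[/SSN], phone number is [PHONE](857) 142-4221[/PHONE], "
    "and I live in [LOCATION]iK[/LOCATION]."
)

spans = extract_protected(protected_text)
plain = strip_tags(protected_text)
print(spans)
print(plain)
```

Keeping the labels alongside the tokens lets a log pipeline count entity types without ever seeing the original values.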
SEMANTIC GUARDRAILS

Score Prompt and
Output Risk in Real Time

Evaluate messages in real time and return structured risk signals your app can act on.

User message
Hi, I’m Dan Johnson and I need to update my account information.
My ssn is 123-45-6789, phone number is (555) 234-5678, and I live in LA.
Please send the confirmation to [email protected], thanks!
Raw prompt / log line before protection.
Semantic Guardrail (API call)
import requests

# Scan for risks before sending to LLM
res = requests.post(
    "http://localhost:8581/.../scan",
    json={"messages": [{"content": user_input}]}
)
res.raise_for_status()
data = res.json()

# Overall risk score for the batch
score = data["batch"]["score"]
# Per-message result; the "messages" key is an assumption about the response shape
msg = data["messages"][0]

# Build assessment from response
assessment = {
    "risk_score": score,
    "risk_level": "HIGH" if score > 0.7 else "MEDIUM",
    "issues": [msg["processors"][0]["explanation"]],
    "action": "BLOCK" if score > 0.7 else "ALLOW"
}
One API call evaluates semantic risk and returns structured results.
Risk assessment
{
  "risk_score": 0.5102,
  "risk_level": "MEDIUM",
  "issues": [
    "OFFTOPIC"
  ],
  "action": "ALLOW"
}
Your app decides whether to allow, challenge, or block the request.
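One way an application might turn the risk score into the allow, challenge, or block decision is sketched below. The 0.7 block threshold comes from the snippet above; the `decide` function and the 0.6 challenge threshold are hypothetical and would need tuning for a real workload.

```python
def decide(risk_score, block_at=0.7, challenge_at=0.6):
    """Map a risk score to an action (hypothetical policy, not part of the API)."""
    if risk_score > block_at:
        return "BLOCK"       # too risky to forward to the LLM
    if risk_score > challenge_at:
        return "CHALLENGE"   # e.g. ask the user to confirm or rephrase
    return "ALLOW"


# Score taken from the sample risk assessment above.
action = decide(0.5102)
print(action)
```

Keeping the thresholds as parameters makes it easy to run stricter policies for some routes than for others.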
SYNTHETIC DATA GENERATION

Build Safe Datasets for
AI and Analytics

Create statistically similar datasets from real schemas without exposing real records.

Real data
Name,Age,City,Income
Jennifer Martinez,34,San Francisco,$95000
John Smith,45,New York,$120000
Sarah Johnson,28,Austin,$75000
Sample production-like records that must not be shared directly.
Synthetic generator (API call)
import requests

# Generate synthetic data from real data
response = requests.post(
    "http://localhost:8095/.../generate",
    json={"source_data": real_data, "num_records": 1000}
)

synthetic_data = response.json()["synthetic_data"]
Protegrity generates statistically similar data without direct identifiers.
Synthetic data
Name,Age,City,Income
Alex Chen,35,San Francisco,$93500
Michael Brown,44,New York,$118000
Emma Davis,29,Austin,$76200
Safe, production-like records for testing, training, and analytics.
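A quick sanity check on generated data could look like the sketch below, which compares simple summary statistics between the two samples above and confirms that no real name reappears. The `load` and `summarize` helpers are hypothetical, and three rows are far too few to say anything about statistical similarity; the point is only to show the shape of such a check.

```python
import csv
import io
from statistics import mean

REAL = """Name,Age,City,Income
Jennifer Martinez,34,San Francisco,$95000
John Smith,45,New York,$120000
Sarah Johnson,28,Austin,$75000
"""

SYNTHETIC = """Name,Age,City,Income
Alex Chen,35,San Francisco,$93500
Michael Brown,44,New York,$118000
Emma Davis,29,Austin,$76200
"""


def load(csv_text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def summarize(rows):
    """Compute a few coarse statistics for comparison."""
    return {
        "mean_age": mean(int(r["Age"]) for r in rows),
        "mean_income": mean(int(r["Income"].lstrip("$")) for r in rows),
        "cities": sorted({r["City"] for r in rows}),
    }


real_rows, synth_rows = load(REAL), load(SYNTHETIC)
real_stats, synth_stats = summarize(real_rows), summarize(synth_rows)

# Direct identifiers from the real data should not reappear in the synthetic set.
leaked_names = {r["Name"] for r in real_rows} & {r["Name"] for r in synth_rows}

print(real_stats)
print(synth_stats)
print("leaked names:", leaked_names)
```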
Docs center

Get Started in Minutes

Run locally, explore examples, and integrate with your stack.

Introduction &
Architecture

Understand how the Developer Edition fits together — components, data flow, data discovery, and data security.

Install &
Configure

Step-by-step instructions to pull the Docker images, install Python modules, and get started with sample applications.

Run the
Sample App

Walk through the sample apps to see enterprise-grade data privacy in action.