Protegrity AI Developer Edition

Protect Sensitive
Data from AI Workflows

Downloadable SDKs for discovering and protecting sensitive data in prompts, logs, and unstructured text.

DOWNLOAD ON GITHUB
GET API KEY


Data Discovery

Detect sensitive entities in raw, unstructured data.

Find & Protect

Tokenize or mask sensitive values inline.

Semantic Guardrails

Score prompt and output risk in real time.

Synthetic Data

Generate safe datasets from real schemas.

DATA DISCOVERY

Detect PII in
Unstructured Text

Identify PII in prompts, logs, and messages with precise boundaries – even without schemas.

Input

Hi, I’m Dan Johnson and I need to update my account information.
My ssn is 123-45-6789, phone number is (555) 234-5678, and I live in LA.
Please send the confirmation to [email protected].

Raw unstructured text before discovery.

Protegrity SDK

import protegrity_developer_python as protegrity

# Discover all PII entities (returns dict by type)
result = protegrity.discover(text)

# Extract just the text values for display
entities = {
    entity_type: [d["text"] for d in detections]
    for entity_type, detections in result.items()
}

One SDK call discovers PII entities across the raw text.

Output

{
  "EMAIL_ADDRESS":  ["[email protected]"],
  "PERSON": ["Dan Johnson"],
  "LOCATION":  ["LA"],
  "PHONE_NUMBER": ["(555) 234-5678"],
  "SOCIAL_SECURITY_ID":  ["123-45-6789"]
}

Structured entities ready for masking, tokenization, or policy enforcement.
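As a rough illustration of what "ready for masking" can look like, the sketch below takes a dictionary shaped like the output above and swaps each detected value for a placeholder. It is plain Python, not part of the SDK: `mask_entities` and the sample strings are hypothetical names used only for this example, and simple string replacement ignores the entity boundaries a real integration would use.

```python
def mask_entities(text: str, entities: dict[str, list[str]]) -> str:
    """Replace each discovered value with a [TYPE] placeholder (illustrative helper)."""
    # Replace longer values first so a short value never clobbers part of a longer one.
    pairs = sorted(
        (
            (value, entity_type)
            for entity_type, values in entities.items()
            for value in values
        ),
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    for value, entity_type in pairs:
        text = text.replace(value, f"[{entity_type}]")
    return text


# Sample input shaped like the discovery output above (hypothetical data).
sample_text = "I'm Dan Johnson, my ssn is 123-45-6789 and I live in LA."
sample_entities = {
    "PERSON": ["Dan Johnson"],
    "LOCATION": ["LA"],
    "SOCIAL_SECURITY_ID": ["123-45-6789"],
}

masked = mask_entities(sample_text, sample_entities)
print(masked)
```

For production use, the SDK's own protection call is the better fit, since it works from detected positions rather than string matching.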

FIND & PROTECT

Tokenize and Mask
Sensitive Data Inline

Tokenize or mask detected entities while preserving structure and readability.

Input

Hi, I’m Dan Johnson and I need to update my account information.
My ssn is 123-45-6789, phone number is (555) 234-5678, and I live in LA.
Please send the confirmation to [email protected], thanks!

Raw text containing unprotected PII.

Protegrity SDK

import protegrity_developer_python as protegrity

# Configure SDK with data discovery endpoint
protegrity.configure(endpoint_url="classify_endpoint")

# Find and protect PII in one call
protected_text = protegrity.find_and_protect(text)

One SDK call discovers and protects PII inline.

Output

Hi, I’m [PERSON]ybe B1elUnm[/PERSON] and I need to update my account information.
My ssn is [SSN]142-42-0001[/SSN], phone number is [PHONE](857) 142-4221[/PHONE], and I live in [LOCATION]iK[/LOCATION].
Please send the confirmation to [EMAIL][email protected][/EMAIL] thanks!

Tokenized/masked text safe for logs, pipelines, analytics, and AI systems.
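Downstream code sometimes needs to pull the tokens back out of the tagged text, or drop the tags before display. The sketch below assumes the `[TYPE]value[/TYPE]` layout shown in the output above; `extract_tokens` and `strip_tags` are hypothetical helpers written with the standard `re` module, not SDK functions.

```python
import re

# Matches [TYPE]value[/TYPE], where the closing tag repeats the opening type.
TAG_PATTERN = re.compile(r"\[([A-Z_]+)\](.*?)\[/\1\]")


def extract_tokens(protected_text: str) -> list[tuple[str, str]]:
    """Return (entity_type, token) pairs in the order they appear."""
    return [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(protected_text)]


def strip_tags(protected_text: str) -> str:
    """Keep the protected tokens but remove the surrounding tags."""
    return TAG_PATTERN.sub(lambda m: m.group(2), protected_text)


# Fragment of the protected output shown above.
sample = "I live in [LOCATION]iK[/LOCATION] and my phone is [PHONE](857) 142-4221[/PHONE]."

tokens = extract_tokens(sample)
plain = strip_tags(sample)
print(tokens)
print(plain)
```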

SEE PROTEGRITY IN ACTION – LAUNCH NOTEBOOK

SEMANTIC GUARDRAILS

Score prompt and
output risk in real time

Evaluate messages in real time and return structured risk signals your app can act on.

User message

Hi, I’m Dan Johnson and I need to update my account information.
My ssn is 123-45-6789, phone number is (555) 234-5678, and I live in LA.
Please send the confirmation to [email protected], thanks!

Raw prompt / log line before protection.

Semantic Guardrail (API call)

import requests

# Scan for risks before sending to LLM
res = requests.post(
    "http://localhost:8581/.../scan",
    json={"messages": [{"content": user_input}]}
)
body = res.json()

# Overall risk score for the submitted batch
score = body["batch"]["score"]
# Per-message result (assumes the response lists results under "messages")
msg = body["messages"][0]

# Build assessment from response
assessment = {
    "risk_score": score,
    "risk_level": "HIGH" if score > 0.7 else "MEDIUM",
    "issues": [msg["processors"][0]["explanation"]],
    "action": "BLOCK" if score > 0.7 else "ALLOW"
}

One API call evaluates semantic risk and returns structured results.

Risk assessment

{
  "risk_score": 0.5102,
  "risk_level": "MEDIUM",
  "issues": [
    "OFFTOPIC"
  ],
  "action": "ALLOW"
}

Your app decides whether to allow, challenge, or block the request.
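One way an app might turn the score into an allow, challenge, or block decision is sketched below. The 0.7 block threshold comes from the snippet above; the 0.6 challenge threshold and the `decide_action` helper are assumptions made for this example, not product defaults.

```python
def decide_action(
    risk_score: float,
    block_above: float = 0.7,      # threshold used in the snippet above
    challenge_above: float = 0.6,  # hypothetical value, tune for your app
) -> str:
    """Map a guardrail risk score to an application-level action."""
    if risk_score > block_above:
        return "BLOCK"
    if risk_score > challenge_above:
        return "CHALLENGE"
    return "ALLOW"


# Assessment shaped like the example output above.
assessment = {"risk_score": 0.5102, "risk_level": "MEDIUM", "issues": ["OFFTOPIC"]}
assessment["action"] = decide_action(assessment["risk_score"])
print(assessment["action"])
```

Keeping the thresholds as parameters lets each application choose how strict to be without changing the scoring call.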

SYNTHETIC DATA GENERATION

Build safe datasets for
AI and analytics

Create statistically similar datasets from real schemas without exposing real records.

Real data

Name,Age,City,Income
Jennifer Martinez,34,San Francisco,$95000
John Smith,45,New York,$120000
Sarah Johnson,28,Austin,$75000

Sample production-like records that must not be shared directly.

Synthetic generator (API call)

import requests

# Generate synthetic data from real data
response = requests.post(
    "http://localhost:8095/.../generate",
    json={"source_data": real_data, "num_records": 1000}
)

synthetic_data = response.json()["synthetic_data"]

Protegrity generates statistically similar data without direct identifiers.

Synthetic data

Name,Age,City,Income
Alex Chen,35,San Francisco,$93500
Michael Brown,44,New York,$118000
Emma Davis,29,Austin,$76200

Safe, production-like records for testing, training, and analytics.
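A quick sanity check on generated data can be sketched with the standard library alone. The example below uses the two small tables shown above to confirm that no real name reappears in the synthetic set and that the average age and income stay close. The helper names are hypothetical, and three rows are far too few for a meaningful statistical comparison; treat it as an illustration of the idea only.

```python
import csv
import io
from statistics import mean

REAL_CSV = """Name,Age,City,Income
Jennifer Martinez,34,San Francisco,$95000
John Smith,45,New York,$120000
Sarah Johnson,28,Austin,$75000
"""

SYNTHETIC_CSV = """Name,Age,City,Income
Alex Chen,35,San Francisco,$93500
Michael Brown,44,New York,$118000
Emma Davis,29,Austin,$76200
"""


def load_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def income(row: dict[str, str]) -> float:
    """Convert an Income cell such as '$95000' to a number."""
    return float(row["Income"].lstrip("$"))


real = load_rows(REAL_CSV)
synthetic = load_rows(SYNTHETIC_CSV)

# Direct identifiers from the real data should not reappear.
leaked_names = {r["Name"] for r in real} & {r["Name"] for r in synthetic}

# Aggregate statistics should stay close between the two sets.
age_gap = abs(mean(int(r["Age"]) for r in real) - mean(int(r["Age"]) for r in synthetic))
income_gap = abs(mean(income(r) for r in real) - mean(income(r) for r in synthetic))

print(leaked_names, round(age_gap, 2), round(income_gap, 2))
```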

Docs center

Get Started in Minutes

Run locally, explore examples, and integrate with your stack.

Introduction &
Architecture

Understand how the Developer Edition fits together — components, data flow, data discovery, and data security.

VIEW DOCUMENTATION

Install &
Configure

Step-by-step instructions to pull the Docker images, install the Python modules, and get started with the sample applications.

VIEW DOCUMENTATION

Run the
Sample app

Walk through the sample apps to see enterprise-grade data privacy in action.

VIEW DOCUMENTATION

Protect Sensitive
Data from AI Workflows

Build and Validate
Privacy-First AI Pipelines

Data Discovery

Find & Protect

Semantic Guardrails

Synthetic Data

Detect PII in
Unstructured Text

Tokenize and Mask
Sensitive Data Inline

Score prompt and
output risk in real time

Build safe datasets for
AI and analytics

Get Started in Minutes

Introduction &
Architecture

Install &
Configure

Run the
Sample app

Keep Building
as Your
Team Grows

See for yourself

Technical Demos

INTERACTIVE CALCULATOR

Start Building Today
