Partner integration

Protegrity & Bodo.ai

Dynamic, metadata-driven protection and unprotection of sensitive data within AI-assisted SQL workflows—without sacrificing performance.

Overview

The Protegrity–Bodo Secure Text-to-Analytics solution lets organizations unlock the value of sensitive data with natural language queries and high-performance analytics while ensuring end-to-end security and compliance.

Protegrity guarantees that sensitive data fields (e.g., PII, PHI, financial data) are always protected.
Bodo delivers HPC-scale performance for analytics, reducing processing times from hours to minutes.

Together, they enable enterprises to run analytics and AI use cases that were previously blocked by compliance concerns or performance limits. The integration also broadens data access across the organization: even users without permission to view sensitive data in the clear can still derive analytical value. Users ask questions in plain English; behind the scenes, the questions are translated into SQL and Python, which return the analytical insights they need while the underlying data remains protected.

Key Integration Feature

The integration between Protegrity and Bodo establishes a critical defense for Generative AI by enabling dynamic, metadata-driven protection and unprotection of sensitive data assets. By embedding Protegrity’s granular security policies directly into Bodo’s compute engine, the solution automatically secures data within AI-assisted SQL workflows based on user authorization and context. Crucially, this seamless enforcement occurs without sacrificing performance, allowing enterprises to leverage the full velocity of Bodo’s parallel processing while maintaining strict compliance and privacy standards in their GenAI pipelines.

Features & Capabilities

01
Secure Text-to-Analytics: Natural language queries on structured data with complete privacy.
Why It Matters
Natural language queries on structured data with complete privacy, enabling business users to interact with data safely while avoiding the risk of data leakage.
How it Works
Protegrity ensures queries and responses remain fully compliant, even when sensitive data is processed.
02
End-to-End Data Protection: From ingestion to analytics results, all data is safeguarded.
Why It Matters
From ingestion to analytics results, all data is safeguarded, guaranteeing compliance with regulations like GDPR, HIPAA, and PCI-DSS.
How it Works
Field-level encryption ensures sensitive identifiers are protected at every stage.
03
High-Performance Parallel Analytics: Bodo’s distributed engine processes petabytes of data with Python simplicity.
Why It Matters
Bodo’s distributed engine processes petabytes of data with Python simplicity, delivering lightning-fast performance for AI/ML workloads.
How it Works
Bodo customers achieve up to 10x faster data analytics vs. legacy solutions.
04
Flexible Deployment: Works across multi-cloud and hybrid environments.
Why It Matters
Works across multi-cloud and hybrid environments, reducing vendor lock-in and supporting enterprise-scale architectures.
How it Works
Seamless integration into existing data lakes and pipelines.
05
Developer-Friendly Experience: Simple Python APIs with enterprise-grade security.
Why It Matters
Simple Python APIs with enterprise-grade security, making advanced analytics accessible without deep security expertise.
How it Works
Data scientists can focus on models, while security is automated.

Architecture & Sample Data Flow

The data journey

Visualizing the data journey

The data journey explained

01
User Input (UI Layer)
- The user submits a natural-language query.
- Authentication occurs and the user context is loaded (role, permissions, session).
- Output: an authenticated request moves to the API layer.
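
The step above can be sketched as follows. This is a minimal illustration, not the product's implementation: `UserContext`, `AuthenticatedRequest`, `authenticate`, and the in-memory session table are all hypothetical names introduced here, and a real deployment would delegate to an identity provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserContext:
    """Hypothetical user context loaded after authentication."""
    user_id: str
    role: str
    permissions: frozenset
    session_id: str


@dataclass(frozen=True)
class AuthenticatedRequest:
    """What the UI layer hands to the API layer in this sketch."""
    query: str
    context: UserContext


# Illustrative in-memory session table (assumption); stands in for an identity provider.
_SESSIONS = {
    "sess-123": UserContext("u1", "analyst", frozenset({"query:submit"}), "sess-123"),
}


def authenticate(session_id: str, query: str) -> AuthenticatedRequest:
    """Load the user context for a session and attach it to the query."""
    context = _SESSIONS.get(session_id)
    if context is None:
        raise PermissionError("unknown or expired session")
    return AuthenticatedRequest(query=query.strip(), context=context)


request = authenticate("sess-123", "  What was total revenue by region last quarter? ")
print(request.context.role, "->", request.query)
```

Carrying the role and permissions alongside the query lets later stages make authorization decisions without another lookup.
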
02
API Processing
- Request validation checks structure and permissions.
- Sanitization removes/neutralizes unsafe content.
- Chat management maintains conversation state.
- Output: a clean, authorized prompt for model orchestration.
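
As a rough sketch of validation and sanitization, assuming a hypothetical JSON-like payload with a `query` field and a hypothetical `query:submit` permission (neither is specified by the source):

```python
import re
import unicodedata

MAX_PROMPT_CHARS = 2000  # assumed limit, chosen for illustration

# ASCII control characters except tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def validate_request(payload: dict, permissions: set) -> str:
    """Check structure and permissions; return the raw query text."""
    query = payload.get("query")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("payload must contain a non-empty 'query' string")
    if len(query) > MAX_PROMPT_CHARS:
        raise ValueError("query is too long")
    if "query:submit" not in permissions:
        raise PermissionError("user may not submit queries")
    return query


def sanitize(query: str) -> str:
    """Normalize Unicode, drop control characters, and collapse whitespace."""
    normalized = unicodedata.normalize("NFKC", query)
    without_controls = _CONTROL_CHARS.sub("", normalized)
    return " ".join(without_controls.split())


raw = validate_request({"query": "Average  claim\x00 amount\nby state"}, {"query:submit"})
clean_prompt = sanitize(raw)
print(clean_prompt)
```

Text normalization of this kind is only one layer; it does not by itself defend against prompt injection.
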
03
LLM Processing
- Provider selection chooses the model/service.
- Session management tracks tokens and state.
- Prompt engineering structures the instruction the model receives.
- Output: an intent/plan that can be translated to executable data work.
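
One way to picture session tracking and prompt construction is sketched below. `ChatSession` and `build_prompt` are hypothetical, the word count is only a crude stand-in for a real tokenizer, and passing schema metadata rather than row values to the model is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ChatSession:
    """Hypothetical session that tracks history and a rough token budget."""
    token_budget: int
    tokens_used: int = 0
    history: list = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        # Crude estimate: whitespace-separated words stand in for tokenizer counts.
        cost = len(content.split())
        if self.tokens_used + cost > self.token_budget:
            raise RuntimeError("token budget exceeded")
        self.tokens_used += cost
        self.history.append({"role": role, "content": content})


def build_prompt(session: ChatSession, question: str, schema: dict) -> list:
    """Compose a system message from schema metadata only, then add the user question."""
    tables = "; ".join(
        f"{table}({', '.join(columns)})" for table, columns in sorted(schema.items())
    )
    session.add("system", f"Translate the question into an analytics plan. Tables: {tables}")
    session.add("user", question)
    return session.history


session = ChatSession(token_budget=200)
messages = build_prompt(session, "Total sales by region", {"sales": ["region", "amount"]})
print(messages)
```
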
04
PyDough Translation
- Build a plan in PyDough DSL.
- Code generation creates executable queries/operations.
- Code validation ensures safety and correctness before execution.
- Output: vetted code ready to run against the data platform.
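
The code-validation idea can be illustrated with Python's standard `ast` module. This is not PyDough's validator: the allowlist and the rules below are assumptions made for the sketch, and a static check like this is not a substitute for sandboxed execution.

```python
import ast

ALLOWED_CALLS = {"sum", "len", "min", "max", "sorted"}  # assumed allowlist


def validate_generated_code(source: str) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("imports are not allowed")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            problems.append(f"dunder attribute access: {node.attr}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id not in ALLOWED_CALLS:
                problems.append(f"call to disallowed function: {node.func.id}")
    return problems


safe = validate_generated_code("total = sum(row['amount'] for row in rows)")
unsafe = validate_generated_code("import os\nos.system('echo hi')")
print(safe, unsafe)
```

Only code that returns an empty problem list would move on to execution in this sketch.
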
05
Database Execution
- The generated code runs against the client’s database/data platform.
- Data is stored at rest in protected form by Protegrity.
- During reads and writes, Protegrity policies are enforced via its APIs and protectors.
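
The protect-on-write, unprotect-on-read pattern can be sketched with a toy stand-in. `ToyProtector` is hypothetical and is not Protegrity's API or algorithm; in particular, its in-memory lookup table exists only so the round trip can be demonstrated and differs from the vaultless tokenization mentioned elsewhere on this page.

```python
import hashlib
import hmac


class ToyProtector:
    """Illustrative stand-in for a protector; NOT Protegrity's API or algorithm."""

    def __init__(self, key: bytes):
        self._key = key
        self._lookup = {}  # demo-only reverse mapping

    def protect(self, value: str) -> str:
        digest = hmac.new(self._key, value.encode(), hashlib.sha256).hexdigest()
        token = "tok_" + digest[:16]
        self._lookup[token] = value
        return token

    def unprotect(self, token: str, authorized: bool) -> str:
        # Unauthorized callers keep seeing the protected token.
        return self._lookup[token] if authorized else token


SENSITIVE_FIELDS = {"ssn"}  # assumed data classification


def write_row(table: list, row: dict, protector: ToyProtector) -> None:
    """Protect sensitive fields before the row is stored."""
    table.append(
        {k: protector.protect(v) if k in SENSITIVE_FIELDS else v for k, v in row.items()}
    )


def read_rows(table: list, protector: ToyProtector, authorized: bool) -> list:
    """Unprotect sensitive fields on read, only for authorized callers."""
    return [
        {k: protector.unprotect(v, authorized) if k in SENSITIVE_FIELDS else v
         for k, v in row.items()}
        for row in table
    ]


protector = ToyProtector(key=b"demo-key-not-for-production")
table = []
write_row(table, {"name": "Ada", "ssn": "123-45-6789"}, protector)
print(read_rows(table, protector, authorized=False))
```
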
06
Response Processing
- Aggregation combines results.
- Format conversion prepares tabular/graph-friendly outputs.
- Visual preparation organizes content for display.
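
A small sketch of these three steps, using plain Python structures. The function names and the record and series shapes are assumptions for illustration, not the solution's actual response format.

```python
from collections import defaultdict


def aggregate(partials: list) -> dict:
    """Combine partial (key, value) results into per-key totals."""
    totals = defaultdict(float)
    for part in partials:
        for key, value in part:
            totals[key] += value
    return dict(totals)


def to_table(totals: dict, key_name: str, value_name: str) -> list:
    """Convert totals to a list-of-records table sorted by key."""
    return [{key_name: k, value_name: v} for k, v in sorted(totals.items())]


def to_chart_series(table: list, x: str, y: str) -> dict:
    """Split a table into parallel x/y lists, a shape many charting tools accept."""
    return {"x": [row[x] for row in table], "y": [row[y] for row in table]}


partials = [[("east", 10.0), ("west", 5.0)], [("east", 2.5)]]
result_table = to_table(aggregate(partials), "region", "revenue")
series = to_chart_series(result_table, "region", "revenue")
print(result_table)
print(series)
```
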
07
Frontend Display (UI Layer)
- The UI renders tables, charts, and graphs, and updates the chat/UI with the results.
- Control returns to the user for further questions/refinement.

Use Cases

Examples where Bodo has helped achieve a business goal.

Finance

Challenge

Financial institutions must process massive volumes of sensitive transactions in real time, while maintaining strict controls over PCI, PII, and confidential data. Migrating legacy platforms to the cloud introduces additional risks around data sovereignty and compliance.

Solution

Bodo executes distributed fraud analytics at scale, while Protegrity enforces tokenization and masking across hybrid and multi-cloud environments. Vaultless Tokenization and centralized policy management ensure consistent protection, even as workloads shift between on-premises and cloud.

Result

Banks and financial services firms reduce risk and accelerate cloud adoption. They can migrate previously blocked workloads, standardize data protection policies, and provide secure, real-time data access for authorized teams, supporting PCI and PII use cases and simplifying audits.

Healthcare

Challenge

Healthcare organizations face stringent compliance requirements when analyzing sensitive patient data stored in Electronic Health Records (EHRs). Balancing privacy, regulatory mandates (like HIPAA), and the need for rapid clinical insights is a persistent struggle.

Solution

The Protegrity–Bodo integration enables protected-at-rest EHR data to be analyzed at scale. Protegrity’s field-level security policies govern every view and query, ensuring only authorized access while Bodo’s parallel engine delivers high-speed analytics.

Result

Healthcare providers gain faster, actionable insights for patient care and operational efficiency, all without exposing patient identities or breaching compliance. This unlocks new possibilities for population health analytics, predictive modeling, and research, while maintaining strict privacy controls.

DEPLOYMENT

Customer-controlled data:

Data remains within the customer’s existing platforms, whether those are on-premises databases, private cloud environments, or public cloud storage solutions. Protegrity’s protection mechanisms are applied directly to the data at rest, ensuring that sensitive information is encrypted or tokenized before any analytics or processing occurs. This approach allows organizations to maintain full control over their data assets, manage access according to internal policies, and comply with data residency requirements. The integration does not require data migration or duplication; instead, it leverages the customer’s current infrastructure, applying security controls seamlessly across all supported environments.

Bodo compute:

Bodo’s compute engine is deployed within the customer’s cloud environment, enabling scalable, parallel execution of analytics workloads. The engine is designed to handle large volumes of data and complex queries by distributing processing tasks across multiple nodes or instances. This architecture supports both batch and real-time analytics, adapting to the needs of different business units and use cases. Bodo integrates with existing cloud resources, orchestrating compute jobs in a way that optimizes performance and resource utilization. The deployment is flexible, supporting multi-cloud and hybrid scenarios, and can be tailored to meet specific organizational requirements for scalability and reliability.
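
The idea of distributing work across workers and merging the results can be illustrated with the Python standard library. This is a stand-in using a thread pool, not Bodo's engine or API; the function names are hypothetical, and the sketch only shows the partition, map, and merge shape of the computation.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows: list, n_parts: int) -> list:
    """Split rows into n_parts chunks by round-robin assignment."""
    return [rows[i::n_parts] for i in range(n_parts)]


def partial_sum(chunk: list) -> dict:
    """Work done independently on each partition."""
    totals = {}
    for region, amount in chunk:
        totals[region] = totals.get(region, 0) + amount
    return totals


def parallel_group_sum(rows: list, n_parts: int = 4) -> dict:
    """Map partial_sum over partitions concurrently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(partial_sum, partition(rows, n_parts)))
    merged = {}
    for part in partials:
        for region, amount in part.items():
            merged[region] = merged.get(region, 0) + amount
    return merged


rows = [("east", 1), ("west", 2), ("east", 3), ("north", 4), ("west", 5)]
result = parallel_group_sum(rows)
print(result)
```
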

Policy integration:

Security policies are centrally defined and managed within Protegrity, and are enforced dynamically during data operations. When Bodo jobs are executed, they invoke Protegrity APIs and protectors to apply field-level security controls in real time. This means that every read, write, or transformation operation on sensitive data is subject to the appropriate policy, based on user roles, context, and data classification. The integration ensures that security is not an afterthought but an intrinsic part of the analytics workflow, with policies automatically applied as data moves through pipelines and processes. This setup supports granular control, allowing organizations to specify exactly how different types of data should be protected throughout their lifecycle.
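
A minimal sketch of role- and classification-based enforcement follows. The policy table, role names, `Action` enum, and the hash-based `demo_tokenize` helper are all hypothetical; real policies are defined and enforced in Protegrity, whose API is not shown here.

```python
import hashlib
from enum import Enum


class Action(Enum):
    CLEAR = "clear"
    MASK = "mask"
    TOKENIZE = "tokenize"


# Hypothetical central policy: (role, data classification) -> action.
POLICY = {
    ("compliance_officer", "pii"): Action.CLEAR,
    ("analyst", "pii"): Action.MASK,
}
DEFAULT_ACTION = Action.TOKENIZE  # least-privilege fallback (assumption)


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters."""
    return "*" * max(len(value) - visible, 0) + value[-visible:]


def demo_tokenize(value: str) -> str:
    """Demo-only, one-way token; not a real tokenization scheme."""
    return "tok_" + hashlib.sha256(value.encode()).hexdigest()[:12]


def apply_policy(role: str, classification: str, value: str) -> str:
    """Return the value as the given role is allowed to see it."""
    action = POLICY.get((role, classification), DEFAULT_ACTION)
    if action is Action.CLEAR:
        return value
    if action is Action.MASK:
        return mask(value)
    return demo_tokenize(value)


card = "4111111111111111"
for role in ("compliance_officer", "analyst", "intern"):
    print(role, apply_policy(role, "pii", card))
```

Falling back to the most restrictive action for unknown roles keeps a missing policy entry from exposing data in the clear.
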

Governance:

Protegrity provides a centralized platform for managing security policies, encryption keys, and audit logs. All actions related to data access, policy enforcement, and protector operations are recorded, enabling comprehensive auditing and compliance reporting. The governance framework supports regulatory requirements such as GDPR, HIPAA, and PCI-DSS, providing detailed visibility into who accessed what data, when, and under what policy conditions. Administrators can update policies and rotate keys from a single interface, ensuring consistency and reducing operational overhead. The auditing capabilities also facilitate forensic analysis and incident response, helping organizations maintain a robust security posture across all environments.
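
To make the "who accessed what data, when, and under what policy conditions" idea concrete, here is a sketch of a structured audit record. The field names and the JSON-lines shape are assumptions for illustration, not Protegrity's audit log format.

```python
import json
from datetime import datetime, timezone


def audit_record(user_id, field_name, policy, decision, when=None):
    """Build one JSON audit line: who accessed what, when, and under which policy."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": when.isoformat(),
            "user": user_id,
            "field": field_name,
            "policy": policy,
            "decision": decision,
        },
        sort_keys=True,
    )


# Fixed timestamp used only so the example output is reproducible.
line = audit_record(
    "u1", "patients.ssn", "pii-default", "masked",
    when=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(line)
```
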

RESOURCES

Quick reads and docs to help your team deploy Protegrity with Bodo—natural-language SQL, HPC-scale analytics, and field-level protection without slowing performance.

Protegrity Documentation

Product docs, APIs, protectors, deployment guides, and policy examples for discovery, tokenization/masking, and everything you need to implement Protegrity.

Bodo/PyDough Documentation

Developer guides for Bodo’s distributed compute and PyDough DSL—install, scale-out patterns, tuning, and best practices for high-performance analytics.

Frequently Asked Questions

During analytics operations, Bodo transparently invokes Protegrity APIs and protectors. This means that security policies are enforced in real time, and data is shown either in the clear, masked, or tokenized according to user permissions and context. Every read, write, or transformation operation is subject to the appropriate policy, ensuring compliance and privacy throughout the analytics workflow.

Protegrity’s governance framework records all actions related to data access, policy enforcement, and protector operations. Audit logs capture access context, policy decisions, and protector actions, enabling comprehensive compliance reporting and forensic analysis. This supports regulatory requirements such as GDPR, HIPAA, and PCI-DSS, and provides detailed visibility into who accessed what data, when, and under what policy conditions.

