Optimized, seamlessly integrated data-centric security for HDP

As Hadoop takes on a more mission-critical role within the enterprise, the top IT imperatives of process innovation, operational efficiency, and data security take center stage. Organizations must protect sensitive data to meet corporate governance and compliance requirements. At the same time, IT must satisfy users who need to access that data in business processes and analytical engines to gain insight and drive growth initiatives.


Protegrity Avatar™ for Hortonworks can help resolve these challenges by protecting the data itself within Hortonworks Data Platform (HDP) with enterprise-class encryption and Protegrity Vaultless Tokenization (PVT), while managing and monitoring the data flow.

Key Features

  • Transparent data encryption and tokenization for HDP
  • Role and process-based policy and enterprise-grade key management
  • Selective access to de-identified data fields with business intelligence
  • Comprehensive monitoring, auditing and reporting for regulatory compliance
  • Seamless deployment with no process change and minimal performance impact

Protecting Sensitive Data in HDP

Protegrity Avatar™ for Hortonworks delivers transparent file-level AES-256 encryption and patented Protegrity Vaultless Tokenization of individual data elements on the node. It also includes Protegrity’s industry-leading centralized data security administration software, with comprehensive monitoring, auditing, and policy and key management. All sensitive data in HDP can be protected from internal and external threats, including privileged users: at rest in HDFS; in use during processing and analysis with MapReduce, Hive, and Pig; and in transit to and from enterprise data systems such as an enterprise data warehouse.

Data Protection Methods

An effective data security strategy matches the risk associated with each type of data to a specific data protection method. Protegrity Avatar supports a comprehensive range of data protection methods, including PVT, format-preserving encryption, file- and field-level AES-256 encryption, masking, hashing, and monitoring.

Coarse-Grained Security (File/Volume)

  • Methods: File or volume encryption (see the sketch below)
  • “All or nothing” approach
  • Secures data only at rest, and files in transit
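
The sketch below illustrates the coarse-grained method in general terms: a minimal example of whole-file AES-256 encryption using the standard Java cryptography APIs, not Protegrity’s implementation. Everything in the file is protected or nothing is, which is why field-level methods are still needed once the data is in use.

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.security.SecureRandom;
    import javax.crypto.Cipher;
    import javax.crypto.CipherOutputStream;
    import javax.crypto.KeyGenerator;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;

    /*
     * Illustration of the general coarse-grained method only (not Protegrity's
     * implementation): whole-file AES-256 encryption with a random IV written
     * ahead of the ciphertext.
     */
    public class FileEncryptExample {

        public static void encrypt(String inPath, String outPath, SecretKey key)
                throws Exception {
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);

            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));

            try (FileInputStream in = new FileInputStream(inPath);
                 FileOutputStream out = new FileOutputStream(outPath);
                 CipherOutputStream enc = new CipherOutputStream(out, cipher)) {
                out.write(iv);                      // store IV ahead of ciphertext
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    enc.write(buf, 0, n);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(256);                           // AES-256 key
            encrypt(args[0], args[1], kg.generateKey());
        }
    }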

Fine-Grained Security (Field/Column)

  • Methods: Field/column PVT, encryption, or masking
  • Data is protected wherever it goes
  • Embedded business intelligence keeps protected data usable

Protegrity Vaultless Tokenization (PVT)

Granular data protection is essential to any comprehensive data security solution, and many companies are now turning to technologies such as tokenization to secure the data itself. The industry-first, patented Protegrity Vaultless Tokenization process eliminates the challenges of standard, vault-based tokenization and provides all the benefits of masking, with the additional advantage of reversibility for situations that require data in the clear. PVT removes the performance and scalability bottlenecks caused by vault lookup latency, eliminates the risk of token collisions, and leaves no sensitive data or tokens residing in a token server. In the event of a breach, tokens hold no value to a potential thief. Tokens can also be embedded with business intelligence, allowing analytics and business processes to run without the need to detokenize data.
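
As an illustration of the general idea only (this is a toy, not Protegrity’s patented algorithm), the sketch below derives reversible, format-preserving tokens for digit strings from static substitution tables instead of a token vault, while leaving the last four digits in the clear so analytics can still group and match on them.

    import java.util.Random;

    /*
     * Toy illustration only, not Protegrity's patented PVT algorithm: a
     * table-driven, format-preserving, reversible substitution for digit
     * strings. It demonstrates the general vaultless idea of deriving tokens
     * from static lookup tables instead of storing value-to-token mappings
     * in a vault, while leaving the last four digits in the clear.
     */
    public class ToyVaultlessTokenizer {

        private final int[][] forward;  // per-position digit substitution tables
        private final int[][] inverse;  // inverse tables used for detokenization
        private final int clearTail;    // trailing digits left unprotected

        public ToyVaultlessTokenizer(long seed, int maxLen, int clearTail) {
            // A real system would derive the tables from centrally managed
            // keys, not from a toy seed.
            Random rnd = new Random(seed);
            this.clearTail = clearTail;
            forward = new int[maxLen][10];
            inverse = new int[maxLen][10];
            for (int pos = 0; pos < maxLen; pos++) {
                int[] perm = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
                for (int i = 9; i > 0; i--) {       // Fisher-Yates shuffle
                    int j = rnd.nextInt(i + 1);
                    int tmp = perm[i]; perm[i] = perm[j]; perm[j] = tmp;
                }
                forward[pos] = perm;
                for (int d = 0; d < 10; d++) inverse[pos][perm[d]] = d;
            }
        }

        public String tokenize(String digits) {
            char[] out = digits.toCharArray();
            for (int i = 0; i < digits.length() - clearTail; i++) {
                out[i] = (char) ('0' + forward[i][digits.charAt(i) - '0']);
            }
            return new String(out);
        }

        public String detokenize(String token) {
            char[] out = token.toCharArray();
            for (int i = 0; i < token.length() - clearTail; i++) {
                out[i] = (char) ('0' + inverse[i][token.charAt(i) - '0']);
            }
            return new String(out);
        }

        public static void main(String[] args) {
            ToyVaultlessTokenizer t = new ToyVaultlessTokenizer(42L, 16, 4);
            String pan = "4111111111111111";
            String token = t.tokenize(pan);     // format preserved, last 4 clear
            System.out.println(token + " -> " + t.detokenize(token));
        }
    }

Because the tables are static, the same input always yields the same token, so joins and aggregations keep working on the protected values without any lookup against a token server.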

Enterprise-Grade Policy and Key Management, Auditing and Reporting

Protegrity Avatar™ provides central policy and key management, auditing, and reporting across HDP. In addition, all audit logs can be fed back into Hadoop for intelligence-based security analysis. Protegrity Avatar enables seamless security in all layers of HDP without disruption to business processes.

Applying Data Protection

Protegrity Avatar™ provides comprehensive security throughout HDP and protects data in all critical states.

Protecting New Data Entering Hadoop

Protegrity Avatar’s Extended HDFS File Encryption has minimal impact on the rate at which data can be loaded into HDFS. As data enters, individual data elements can also be tokenized on the node by Hadoop applications such as MapReduce and Hive, and then distributed across the cluster inside encrypted files.
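
For example, a load-time MapReduce job could de-identify a field of each incoming record on the node before the record lands in HDFS. In the sketch below, DataProtector and the assumed column position are hypothetical stand-ins, not actual Protegrity class or policy names.

    import java.io.IOException;
    import java.util.Arrays;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    /*
     * Hypothetical sketch: a load-time Mapper that de-identifies one field of
     * each incoming CSV record on the node before it is written to HDFS.
     * DataProtector is a stand-in stub, not an actual Protegrity class; a real
     * deployment would call the vendor's node-local protect function instead.
     */
    public class IngestTokenizeMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        /** Stand-in for the node-local tokenization call (assumption). */
        static class DataProtector {
            static String protect(String element, String clearValue) {
                // Placeholder behaviour: keep the last four characters in the
                // clear and mask the rest, preserving length.
                int keep = Math.min(4, clearValue.length());
                char[] masked = new char[clearValue.length() - keep];
                Arrays.fill(masked, '#');
                return new String(masked)
                        + clearValue.substring(clearValue.length() - keep);
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] cols = value.toString().split(",", -1);
            // Assumption: column 2 of the feed holds the sensitive element.
            if (cols.length > 2) {
                cols[2] = DataProtector.protect("ssn", cols[2]);
            }
            context.write(NullWritable.get(), new Text(String.join(",", cols)));
        }
    }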

Protecting Data To And From The Enterprise

As enterprise data enters or leaves HDFS, including monetized data, the same policy-level protection, file encryption, and data tokenization can be applied. This continuous protection ensures security of sensitive data throughout the enterprise and beyond.

Protecting Data During Analysis

The APIs available to tasks in HDP applications such as MapReduce, Hive, Pig, Sqoop, Flume, and HBase are extended with policy-driven data protection functions, including PVT and format-preserving encryption. This allows sensitive data to remain secure during analysis while retaining the visibility critical to deep business insight.
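
One way such an extension point can be surfaced to analysts is as a Hive user-defined function, sketched below; protectField() is a placeholder, not Protegrity’s API, and a real deployment would delegate to the node-local protector under the central policy.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    /*
     * Hypothetical sketch: exposing a policy-driven protection function to
     * Hive as a simple UDF so a sensitive column stays de-identified during
     * analysis. protectField() is a placeholder, not Protegrity's API.
     */
    public class ProtectFieldUDF extends UDF {

        public Text evaluate(Text clearValue) {
            if (clearValue == null) {
                return null;
            }
            return new Text(protectField(clearValue.toString()));
        }

        /** Stand-in for the vendor's protect call (assumption). */
        private static String protectField(String value) {
            // Placeholder behaviour: preserve length and the last four
            // characters so grouping and partial matching still work.
            int keep = Math.min(4, value.length());
            char[] out = value.toCharArray();
            for (int i = 0; i < value.length() - keep; i++) {
                out[i] = '*';
            }
            return new String(out);
        }
    }

Once registered (for example, CREATE TEMPORARY FUNCTION protect_field AS 'ProtectFieldUDF';), the function can be applied to a sensitive column directly in a query, such as SELECT protect_field(ssn) FROM customers.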

Protegrity Avatar deployment diagram:
  1. Apply data protection at the database, application, or file level outside Hadoop
  2. Transfer data to staging area (edge node) and apply data protection outside Hadoop
  3. Apply volume-level encryption within Hadoop
  4. Extend HBase, Pig, Hive, Flume and Sqoop job functions using the data protection API within Hadoop
  5. Extend MapReduce framework with data protection API within Hadoop
  6. Apply transparent HDFS folder and file encryption
  7. Import de-identified data into Hadoop
  8. Export de-identified data for input into BI applications
  9. Export identifiable data to trusted sources
  10. Export audit data for monitoring, reporting and analysis

Try the Protegrity Avatar Sandbox