Protegrity Blog

Big Data Security Best Practices

In a recent blog post from Wikibon, Jeff Kelly wrote, “When talking about Big Data, the conversation tends to focus on Data Science and analytics. That is, the stories about Big Data that hit the front pages of the mainstream press … are mostly about all the cool new ways to use data to greater effect. But Big Data Analytics doesn’t take place in a vacuum. It takes place in the enterprise. And any time you mix data and the enterprise, you can’t afford to ignore data management best practices. Failure to apply fundamental data management best practices to Big Data projects can lead not just to failed projects, but to potential legal consequences as well.”

I agree that the conversation tends to focus on Data Science and analytics. What is most troubling is that Big Data deployment is growing fast, and many organizations are rushing into it with a focus solely on ROI. Security is not usually part of the strategy. There is also a shortage of Big Data skills and an industry-wide shortage of data security personnel, so many organizations don’t even know they are doing anything wrong from a security perspective.

As Anjul Bhambhri notes, “The same principles from the standpoint of data governance that apply to the structured world definitely apply to Big Data.” However, most traditional security solutions are not suitable for a Big Data environment. It is very difficult to build a security solution that provides data breach protection and compliance with security and privacy regulations, yet still allows the powerful analysis and data insight promised by Big Data environments.

A data security solution for Big Data should protect the entire data flow, including surrounding legacy systems. In addition to volume or file encryption, it should support granular protection at the field level. Data tokenization and/or encryption should operate on each node to provide scalability and high performance.
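
To make the idea of field-level protection concrete, here is a minimal Python sketch of vault-style tokenization applied to selected fields of a record. It is an illustration under stated assumptions, not a description of any particular product: the `TokenVault` class, the `protect_record` helper, the `tok_` prefix, and the sample field names are all hypothetical, and the vault is a plain in-memory dictionary rather than a hardened, access-controlled store.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory token vault, for illustration only.

    A real deployment would keep this mapping in a secured,
    access-controlled service, not in process memory.
    """

    def __init__(self):
        self._value_to_token = {}
        self._token_to_value = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so equal values map to equal tokens,
        # which keeps joins and group-by counts meaningful for analytics.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)
        # Regenerate in the unlikely event of a collision.
        while token in self._token_to_value:
            token = "tok_" + secrets.token_hex(8)
        self._value_to_token[value] = token
        self._token_to_value[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only callers with access to the vault can recover the original.
        return self._token_to_value[token]


def protect_record(record: dict, sensitive_fields: set, vault: TokenVault) -> dict:
    """Return a copy of the record with only the sensitive fields tokenized."""
    return {
        field: vault.tokenize(value) if field in sensitive_fields else value
        for field, value in record.items()
    }


if __name__ == "__main__":
    vault = TokenVault()
    record = {"name": "Alice", "ssn": "123-45-6789", "purchase": "42.50"}
    protected = protect_record(record, {"ssn"}, vault)
    print(protected)  # the ssn value is replaced by a random token
```

Because the token is random and carries no information about the original value, a copy of the protected record is of little use without the vault, while non-sensitive fields remain readable for analysis. In a cluster, the same function could run on each node, provided every node can reach a shared vault or an equivalent tokenization service.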
Metadata can define the policy for data usage, and a separation of duties is required to maintain proper internal security. Additionally, a Data Usage Control function that monitors access to sensitive data fields can detect abnormal access attempts, whether they come from external or internal threats. This approach can be highly effective both for compliance and for mitigating data breaches: tokenized data holds no value to a potential thief, yet it remains available for analytics and business processes. The full post is available on the Wikibon blog.
