TOKENIZATION VS. ENCRYPTION: HOW DATA PROTECTION METHODS WORK FOR YOU

Jun 20, 2022

Summary

Data security is crucial for businesses, and two popular methods to protect sensitive data are tokenization and encryption. Tokenization substitutes real values with tokens, maintaining referential integrity and future-proofing security.
Vaultless tokenization is a powerful, scalable, and flexible method that produces ciphertext maintaining data type, format, value, and length.
Encryption changes data into unreadable ciphertext and requires a key to unlock it. It is appropriate for file types that contain sensitive information. The strength of the encryption algorithm and key determine the security of the data.

PROTEGRITY VAULTLESS TOKENIZATION IS USED BY SOME OF THE LARGEST FINANCIAL, HEALTHCARE, AND RETAIL INSTITUTIONS IN THE WORLD

Data security is a fundamental requirement for today’s businesses, and organizations have a plethora of choices when it comes to the types of methods and technologies they can employ to protect sensitive information. Oftentimes, the right data protection method—or combination of methods—is dependent on the type of data and use case in question. Two of the more popular types of data protection are tokenization and encryption.

To choose the most appropriate data-security technology based on your business’ needs, it is important to understand the similarities and differences between these two unique methods of data protection. Let’s dive in.

WHAT IS TOKENIZATION AND HOW IS IT USED?

In its most basic definition, tokenization is a form of cryptography, though some may call it a type of format-preserving encryption. Regardless, it replaces a real value with a token that obfuscates that value. Data is thus sorted into two buckets: Cleartext (unprotected data) and Ciphertext (encrypted data). Tokenization allows enterprises to securely store sensitive data, such as a nine-digit Social Security number (or a nine-character National Insurance number), as a reversible token of the same length and format. Because the token keeps the original format, it is easier for downstream systems to use and minimizes changes to an application. If an application is expecting a nine-digit number and we send a 256-bit encrypted value, the system will fail.
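To make the format point concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: `ToyTokenizer` is a simple lookup-table tokenizer (not any vendor's product), `downstream_accepts` stands in for an application field that only takes nine digits, and 32 random bytes stand in for a 256-bit encrypted value.

```python
import re
import secrets

NINE_DIGITS = re.compile(r"\d{9}")


def downstream_accepts(value):
    """Stand-in for an application field that only accepts nine digits."""
    return NINE_DIGITS.fullmatch(value) is not None


class ToyTokenizer:
    """Hypothetical lookup-table tokenizer: random, reversible, same format."""

    def __init__(self):
        self._to_token = {}   # real value -> token
        self._to_value = {}   # token -> real value

    def tokenize(self, value):
        if value in self._to_token:        # same input always gets the same token
            return self._to_token[value]
        while True:
            # Draw random digits until we find an unused token of the same length.
            token = "".join(secrets.choice("0123456789") for _ in value)
            if token not in self._to_value and token != value:
                break
        self._to_token[value] = token
        self._to_value[token] = value
        return token

    def detokenize(self, token):
        return self._to_value[token]


tokenizer = ToyTokenizer()
ssn = "123456789"
token = tokenizer.tokenize(ssn)

# Stand-in for a 256-bit encrypted value, rendered as 64 hex characters.
ciphertext_stand_in = secrets.token_bytes(32).hex()

print(downstream_accepts(token))                # the token still fits the field
print(downstream_accepts(ciphertext_stand_in))  # the 256-bit value does not
```

The token passes the nine-digit check and can be reversed with `detokenize`, while the 256-bit stand-in is rejected because it no longer matches the format the application expects.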

Because tokenization uses randomization to generate tokens that substitute for real values, it delivers powerful business value. Randomization helps future-proof data security because a randomly assigned token has no mathematical relationship to the original value, which reduces concerns about quantum computing’s potential to break key-based encryption algorithms. Further, tokenization substitutes the real value with the same token consistently across the enterprise. This referential integrity means that data can be joined in a protected state to power AI, ML, data analytics initiatives, or any other application that requires data from multiple silos to drive business outcomes.
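Referential integrity is easiest to see with two small datasets. The sketch below is a toy, under stated assumptions: `_token_for` is a hypothetical enterprise-wide mapping, the `billing` and `claims` records are made-up sample data, and the random nine-digit tokens are for illustration only. Because the same value always receives the same token, the two silos can be joined without ever exposing the real number.

```python
import secrets

_token_for = {}  # hypothetical enterprise-wide mapping: real value -> token


def tokenize(value):
    """Return the same random token every time the same value is seen."""
    if value not in _token_for:
        used = set(_token_for.values())
        while True:
            token = "".join(secrets.choice("0123456789") for _ in value)
            if token not in used:
                break
        _token_for[value] = token
    return _token_for[value]


# Two silos holding the same customers (made-up sample data).
billing = [
    {"ssn": "123456789", "balance": 250.0},
    {"ssn": "987654321", "balance": 75.5},
]
claims = [
    {"ssn": "987654321", "claim_id": "C-1"},
    {"ssn": "123456789", "claim_id": "C-2"},
    {"ssn": "555443333", "claim_id": "C-3"},
]

# Each silo is protected independently; only tokens leave the source systems.
protected_billing = [
    {"ssn_token": tokenize(r["ssn"]), "balance": r["balance"]} for r in billing
]
protected_claims = [
    {"ssn_token": tokenize(r["ssn"]), "claim_id": r["claim_id"]} for r in claims
]

# Join on the token alone: no cleartext is needed.
balance_by_token = {r["ssn_token"]: r["balance"] for r in protected_billing}
joined = [
    {"claim_id": r["claim_id"], "balance": balance_by_token[r["ssn_token"]]}
    for r in protected_claims
    if r["ssn_token"] in balance_by_token
]
print(joined)
```

Claims C-1 and C-2 match their billing records through the shared token, and C-3 drops out because that customer has no billing record.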

Ars Technica reported that researchers demonstrated that 87% of Americans can be identified with only three pieces of information: ZIP code, sex, and birthdate. Organizations use tokenization to protect sensitive data elements like these for internal and external compliance requirements. Because only those elements are tokenized, most of the enterprise’s data is left untouched, which increases processing performance and eases implementation.

Tokenization is a data security technology that is used with data de-identification to deliver data privacy. Data de-identification removes the ability to identify an individual within a dataset by making personal identifiers and quasi-identifiers like social security or national insurance number, name, email, birthdate, etc. unreadable without explicit authorization.

Combined, these techniques deliver a privacy-enhanced dataset that remains protected even after exfiltration by a bad actor. In the event of a data breach, sensitive data is concealed, and all the hacker has access to is the tokenized Ciphertext data, which appears as unusable, random characters.

Tokenization is used to protect structured data: data in columns and rows that is fed into transactional and analytical systems. This solves a big portion of the data protection problem, but not all data is neatly organized in a table. Audio, video, and bodies of text, like the doctor’s notes and x-ray images in a patient’s file, are called unstructured data, and they can also contain personal identifiers. We use encryption to safeguard unstructured data in emails, word processing documents, PDF files, photos, and more.

WHAT IS PROTEGRITY VAULTLESS TOKENIZATION AND HOW IS IT USED?

But not all tokenization is created equal. Vaultless tokenization is particularly powerful because of its performance characteristics and its ability to produce ciphertext that maintains data type, format, value, and length. This ciphertext is transparent to the systems as it moves and can be reversed to the original cleartext value just in time for an authorized user.

Vaultless tokenization uses small codebooks to perform protection operations without the need for massive, vaulted token-lookup tables. This approach creates a highly scalable, flexible, and powerful protection method for structured and semi-structured data.
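The general idea of codebook-based protection can be sketched in a few lines of Python. To be clear, this is a toy construction of our own for illustration: it is not the Protegrity algorithm (whose internals are not described here), it is not secure, and the names `build_codebooks`, `tokenize`, and `detokenize` are hypothetical. It derives small substitution tables from a secret and chains them, so a nine-digit value can be protected and reversed without storing one row per value.

```python
import hashlib
import random

DIGITS = "0123456789"


def build_codebooks(secret, length):
    """Derive small digit-substitution tables from a secret (toy, not secure)."""
    books = []
    for position in range(length):
        per_context = []
        for context in range(10):
            # One shuffled digit alphabet per (position, previous token digit).
            seed = hashlib.sha256(secret + bytes([position, context])).digest()
            table = list(DIGITS)
            random.Random(seed).shuffle(table)
            per_context.append("".join(table))
        books.append(per_context)
    return books


def tokenize(value, books):
    out, context = [], 0
    for position, ch in enumerate(value):
        token_digit = books[position][context][int(ch)]
        out.append(token_digit)
        context = int(token_digit)   # chain: next table depends on this output
    return "".join(out)


def detokenize(token, books):
    out, context = [], 0
    for position, ch in enumerate(token):
        # Invert the substitution by finding where the token digit sits.
        out.append(str(books[position][context].index(ch)))
        context = int(ch)
    return "".join(out)


books = build_codebooks(b"demo-secret", 9)
token = tokenize("123456789", books)
table_entries = sum(len(table) for per_context in books for table in per_context)

print(token, detokenize(token, books), table_entries)
```

In this sketch the codebooks hold 900 entries in total, whereas a vaulted lookup table covering every possible nine-digit value could need up to a billion rows. That difference in size is the intuition behind the scalability claim above.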

Protegrity Vaultless Tokenization (PVT) is used by some of the largest financial, healthcare, and retail institutions in the world to enable them to safely store and analyze their most sensitive customer data. In fact, we interact with systems that protect our data through tokenization every day simply by using our bank cards at a store.

Traditional, vault-based tokenization methods face challenges because they are costly to scale and reach speed and capacity limitations. Protegrity Vaultless Tokenization is meeting our customers’ security performance requirements with a substantial ROI using embeddable data protection in their applications, datastores, or even over network protocols.

WHAT IS ENCRYPTION AND HOW IS IT USED?

Encryption uses mathematical algorithms and cryptographic keys to change data into “binary ciphertext,” rendering the encrypted data unreadable and unusable. To access the original data, a user needs to present an encryption key, which reverses the encryption process and unlocks the data.
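As a rough illustration of key-based encryption, the following Python sketch uses only the standard library. It is a teaching toy, not a vetted cipher: the keystream construction (HMAC-SHA256 in a counter arrangement) and the names `encrypt`, `decrypt`, and `_keystream` are our own assumptions, and a real system would use an established algorithm from a reviewed cryptographic library instead.

```python
import hashlib
import hmac
import secrets


def _keystream(key, nonce, length):
    """Generate `length` pseudo-random bytes from the key and nonce (toy)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        stream += block.digest()
        counter += 1
    return stream[:length]


def encrypt(key, plaintext):
    nonce = secrets.token_bytes(16)   # fresh random value for every message
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(p ^ k for p, k in zip(plaintext, stream))
    return nonce + body               # the nonce is stored alongside the data


def decrypt(key, ciphertext):
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ k for c, k in zip(body, stream))


key = secrets.token_bytes(32)         # 256-bit key
message = "SSN 123-45-6789".encode()

ciphertext = encrypt(key, message)
recovered = decrypt(key, ciphertext)
wrong_key_result = decrypt(secrets.token_bytes(32), ciphertext)

print(ciphertext.hex())               # binary ciphertext, shown as hex
print(recovered.decode())
```

The ciphertext is arbitrary binary data, only the matching key turns it back into the message, and a different key yields unrelated bytes.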

The security of encrypted data depends on the strength of the algorithm and the strength of the key. Modern encryption algorithms are rooted in complex mathematical problems that are extremely difficult to break with the computing resources currently available.

One way to think about encryption is to compare it to a lock on a safe. However, instead of the combination being a mix of numbers 0-99, the key is a combination of 0s and 1s, and the size of the encryption key is the number of times you need to turn the wheel in the lock. When using encryption, it is important to change the key—via a process called key rotation—more frequently than the time it would take to complete a brute force attack on the encrypted data. This is why many regulations require key rotation on a regular basis, usually every one or two years.
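The relationship between key size and brute-force time is simple arithmetic: a key of n bits has 2^n possible values. The sketch below works that out in Python. The guess rate of one trillion keys per second is purely an assumption chosen for illustration, and `years_to_exhaust` is a hypothetical helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
ASSUMED_GUESSES_PER_SECOND = 1e12   # illustrative assumption, not a measurement


def years_to_exhaust(key_bits, guesses_per_second=ASSUMED_GUESSES_PER_SECOND):
    """Years needed to try every possible key of the given size.

    On average an attacker finds the key after trying half the key space,
    so the expected time is half of this worst-case figure.
    """
    return 2 ** key_bits / guesses_per_second / SECONDS_PER_YEAR


for bits in (56, 128, 256):
    print(f"{bits}-bit key: {years_to_exhaust(bits):.3g} years")
```

Each additional bit doubles the work. At the assumed rate, a 56-bit key space can be exhausted in well under a year, while a 128-bit key space would take vastly longer than any practical rotation schedule.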

Encryption is appropriate for file types that contain sensitive information, such as documents, images, entire hard drives or volumes, video calls, etc. Anything larger than discrete data, typically more than 1,000-2,000 bytes, may best be handled with encryption.

WHEN IS ADDITIONAL PROTECTION NEEDED?

There are a variety of use cases in which advanced data protection capabilities, including encryption and Protegrity Vaultless Tokenization, are valuable and help drive tangible business outcomes. One of the most common reasons that enterprises adopt advanced data protection methods is when migrating to the cloud.

Oftentimes, enterprises feel more comfortable moving data and workloads to the cloud once they are able to attain the same levels of data security in the cloud as they have in their on-premises environments. This is especially true for organizations in highly regulated industries, such as financial services, insurance and healthcare. Traditionally, organizations in these verticals have been hesitant to move their data to the cloud since they must adhere to stringent government regulations around data privacy and security. These regulations often place restrictions on how data can be used and the level of data privacy that must be maintained regardless of where that data resides.

Enterprises in regulated industries deal with highly confidential and sensitive data, including medical records, credit card information, social security numbers, and other personally identifiable information. If data of this nature were to be breached, it could have significant implications in terms of customer trust, not to mention lead to potential regulatory fines.

This has compelled many organizations to keep some of their most sensitive data in their on-premises data centers, and so to miss out on the capabilities the cloud provides, including cloud scale, agility, and advanced compute capabilities.

For example, Amazon Web Services (AWS), the world’s most comprehensive and broadly adopted cloud platform, offers more than 200 fully featured services including artificial intelligence (AI) and machine learning (ML) capabilities, advanced data analytics, scalable storage offerings, managed databases, and more. With advanced data protection capabilities from Protegrity, including encryption and Protegrity Vaultless Tokenization, organizations moving to the cloud are equipped with an additional layer of protection to safeguard their sensitive enterprise data.

Protegrity enables quick, seamless migration and adoption of AWS services by further protecting all forms of data, both structured and unstructured, within AWS. Protegrity’s data protection capabilities apply the same controls and data security standards in AWS as for data maintained in on-premises environments, so organizations have greater freedom to use the full scope of cloud capabilities without security being a concern.

This gives businesses greater freedom to innovate. With advanced data protection, enterprises can safely use even their most confidential data for AI initiatives, advanced data analytics, personalization programs, fraud prevention, and other initiatives to which it would otherwise be challenging (or risky) to apply sensitive data. Protegrity empowers organizations to protect their data so it can propel their business forward.

MAKING THE RIGHT CHOICE FOR YOUR BUSINESS

Tokenization and encryption are methods of data protection that can be used separately or combined to deliver data security and privacy to a business. The first step for businesses on the journey to security maturity is to determine what sensitive data exists and where it is stored. This operational intelligence, along with your use cases, will help you determine which method or methods will work best for you.

Businesses considering solutions that offer Vaultless Tokenization or a similar method as their main data protection process often find that the persistent protection and the benefits of extra storage space made possible by this method work more effectively to meet their goals to scale and increase revenue.

To learn more about the various data protection methods for securing sensitive data, including Protegrity Vaultless Tokenization, check out the Methods of Data Protection Guide or contact our team today!