More and more organizations are moving from encryption to tokenization, recognizing it as a more secure, more flexible and more cost-effective approach. They realize that tokenization goes well beyond reducing the burden of PCI compliance: it is the best way to protect private or sensitive data, unlock business value and minimize risk in scenarios ranging from GDPR to HIPAA.
Encryption Is Different from Tokenization
While both are forms of cryptography, which McGraw-Hill’s Encyclopedia of Science & Technology defines as the various methods of writing in secret codes or ciphers, in practice encryption and tokenization are very different data security methods.
The History of Encryption
The term encryption applies to the use of mathematical cryptographic algorithms to protect data by rendering it unreadable. Authorized users who possess the appropriate cryptographic ‘keys’ can decrypt the protected data to render it readable.
Since the 1970s, encryption has evolved in line with computing power and technology to offer stronger protection against brute-force attacks, from the Data Encryption Standard (DES) to the Advanced Encryption Standard (AES).
Encryption brings vulnerabilities of its own, stemming from the keys and the unchanging nature of the algorithms. Cryptographic keys are themselves vulnerable to exposure and must be protected with the same care as the data: a compromised key compromises the encrypted data, regardless of the algorithm’s strength.
Encryption can also lack versatility, as it changes the appearance and increases the size of the original data. Applications and databases must be able to read data of a specific type and length in order to accept it; if the encrypted data’s type and length are incompatible with those systems, the systems will effectively break.
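The data-type problem shows up even with a toy cipher. The sketch below (Python; a keyed XOR stream derived from SHA-256, purely for illustration and in no way a secure or standard algorithm) encrypts a 16-digit card number: the result is opaque binary that no longer passes numeric validation, and it grows further once encoded for storage.

```python
import hashlib
import secrets

def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Toy keyed stream cipher for illustration only -- NOT secure.
    XORs the data with a SHA-256-derived keystream; applying it
    twice with the same key decrypts."""
    keystream = b""
    counter = 0
    while len(keystream) < len(data):
        keystream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ k for d, k in zip(data, keystream))

key = secrets.token_bytes(32)        # the key must be guarded like the data
pan = b"4111111111111111"            # 16 ASCII digits
ciphertext = toy_encrypt(pan, key)

print(ciphertext.hex())              # opaque bytes: a numeric CHAR(16) column
                                     # would reject them, and hex or base64
                                     # encoding roughly doubles the size
print(toy_encrypt(ciphertext, key))  # round-trips back to the original PAN
```

A real system would use a standard algorithm such as AES, but the mismatch is the same: ciphertext neither looks like, nor fits where, the original data did.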
Format Preserving Encryption (FPE), recently standardized and developed to reduce encryption’s impact on systems, still requires extensive computing resources and supports only a limited breadth of data types and formats, while its strength of protection can be regarded as weaker than other available alternatives.
The History of Tokenization
Tokenization is a non-mathematical approach to protecting data while preserving its type and length. It is based on replacing sensitive data, regardless of nature (PII, PHI, PCI), type or format, with non-sensitive substitutes. This mitigates impact should a breach occur and offers more flexibility than FPE.
Tokens, by definition, match the original value in data type and length while de-identifying the sensitive information, allowing data to travel throughout its lifecycle without modification to systems. Furthermore, tokens can keep specific data fully or partially visible for processing and analytics, overcoming many of the problems associated with encryption.
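These properties can be sketched in a few lines (Python; the function name and the keep-last-four policy are illustrative assumptions, not any vendor’s API):

```python
import secrets

def tokenize_pan(pan: str, keep_last: int = 4) -> str:
    """Replace the leading digits of a card number with random digits,
    preserving length and data type while leaving the last `keep_last`
    digits visible for processing and analytics.
    Illustrative only: a real tokenization service maps each value
    consistently and securely rather than drawing fresh randomness."""
    head = "".join(secrets.choice("0123456789")
                   for _ in range(len(pan) - keep_last))
    return head + pan[-keep_last:]

token = tokenize_pan("4111111111111111")
print(token)   # still 16 digits, with the last four ("1111") preserved
```

Because the token is still 16 digits, every downstream system that validated the original card number accepts it unchanged.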
Born from a desire to simplify PCI compliance, vault-based tokenization uses a large database table to create lookup pairs that associate a token with the encrypted sensitive information, such as a credit card number, for which it is a substitute. The only way to recover the original information, or its token, is to reference the lookup table, taking the data out of scope for audit. As the lookup table grows with each instance of tokenization, lookups slow down and performance suffers. Vault-based tokenization also requires costly synchronization capabilities to maintain reliability and high availability and to avoid collisions. Additionally, it is too complex to tokenize anything beyond credit card numbers without massive architectural problems.
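The vault mechanics, and the scaling problem, can be seen in a minimal sketch (Python; class and method names are illustrative, and a real vault would encrypt the stored values and replicate the table across sites):

```python
import secrets

class TokenVault:
    """Minimal vault-based tokenization sketch, for illustration only.
    A lookup table pairs each token with the sensitive value it replaces;
    the table grows by one entry for every new value tokenized."""
    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        if value in self._value_to_token:       # reuse the existing pairing
            return self._value_to_token[value]
        while True:                             # retry to avoid collisions
            token = "".join(secrets.choice("0123456789") for _ in value)
            if token not in self._token_to_value:
                break
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # The lookup table is the only path back to the original value.
        return self._token_to_value[token]

vault = TokenVault()
token = vault.tokenize("4111111111111111")
print(token, "->", vault.detokenize(token))
```

Every distinct value adds a row, so the table, and the cost of searching and synchronizing it, grows without bound.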
In contrast, Protegrity Vaultless Tokenization (PVT) is a lightweight and powerful solution that eliminates the operational and management problems associated with vault-based tokenization. PVT deploys a very small set of lookup tables of random values without having to store either the sensitive data or the tokens. Because the tables do not grow with the data, as they do in vault-based tokenization, PVT is faster, more reliable and more secure, and it can scale to a range of data protection tasks beyond credit card numbers, such as health and privacy information.
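A toy sketch of the vaultless idea is shown below (Python). Protegrity’s actual PVT algorithm is proprietary and considerably more sophisticated; this version only illustrates how small, static random tables can tokenize reversibly without storing sensitive data or tokens. The simple per-position substitution used here leaks patterns that a real scheme would not.

```python
import random

def make_tables(seed: int, n: int = 4):
    """Build n small random digit-substitution tables once, up front.
    The seed stands in for a protected secret; the tables never grow,
    no matter how many values are tokenized."""
    rng = random.Random(seed)
    tables = []
    for _ in range(n):
        digits = list("0123456789")
        rng.shuffle(digits)                  # a random permutation of digits
        tables.append({str(i): digits[i] for i in range(10)})
    return tables

TABLES = make_tables(seed=2024)

def tokenize(pan: str) -> str:
    # Substitute each digit via a position-dependent table: the token
    # preserves length and digit format, and nothing is ever stored.
    return "".join(TABLES[i % len(TABLES)][d] for i, d in enumerate(pan))

def detokenize(token: str) -> str:
    # Each table is a permutation, so it can be inverted directly.
    inverse = [{v: k for k, v in t.items()} for t in TABLES]
    return "".join(inverse[i % len(inverse)][d] for i, d in enumerate(token))

pan = "4111111111111111"
token = tokenize(pan)
print(token, "->", detokenize(token))   # round-trips with no vault at all
```

The tables are a fixed few hundred bytes whether one value or one billion values are tokenized, which is the operational advantage the text describes.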
Stateless tokenization has also emerged as an alternative to vault-based solutions, but it operates through a single, appliance-based design pattern that introduces latency. PVT, by contrast, occurs at the point where data security policies are enforced, keeping system performance uninhibited and supporting varied architectural design patterns.
Most use cases require data security without business compromise, meaning PVT is well suited to any data-driven organization with a growing need to protect the data itself.
Given the variations outlined above, organizations need to ask questions of their tokenization solution providers in order to fully understand what is on offer.
Encryption or Tokenization?
The virtues of tokenization over encryption in terms of protecting data without impacting systems or processes are clear. In addition to simplifying PCI compliance, PVT’s ability to easily protect a broad variety of data types makes it a natural choice for meeting other data protection requirements including internal policies, industry standards such as HIPAA and geographical laws such as GDPR.
In certain circumstances it makes sense to use encryption rather than tokenization, typically when the data is unstructured: binary files, images, biometrics and so on. The lack of standardization around tokenization may also make encryption seem the more obvious choice, but it is worth noting that PVT has been validated by some of the most distinguished names in cryptography and data security:
“The Protegrity tokenization scheme offers excellent security, since it is based on fully randomized tables. This is a fully distributed tokenization approach with no need for synchronization and there is no risk for collisions.” Professor Dr. Ir. Bart Preneel of Katholieke Universiteit Leuven, Belgium
“Our analysis shows that sans access to the payment card environment and the tokenization mechanism, an attacker will be forced to resort to a brute-force attack since tokens are not mathematically derived from the PAN.” C. Matthew Curtin, CISSP and Matthew Brothers-McGrew at Interhack