Posted on: December 6, 2019

The 5 Biggest Big Data Solutions of 2019

As 2019 comes to a close, our team looks back at 5 of the solutions that had a resounding impact on the world of big data:

1. Deep Learning as a Service

While Deep Learning as a Service (DLaaS) was not invented this year, it was in 2019 that the global adoption of DLaaS solutions moved from novelty to reality.

This milestone coincides with the introduction of Amazon's first DLaaS solutions, Contact Lens and Amazon Kendra. These solutions are revolutionary in the sense that they allow businesses hosting data on AWS infrastructure to access world-class machine-learning technology out of the box, without employing data scientists with AI-project expertise.

Contact Lens, for instance, gives customer service teams using Amazon Connect, Amazon's cloud-based customer contact center service, real-time data on caller sentiment, call themes, and call vocabulary. Amazon Kendra, on the other hand, is an AI-enabled enterprise search tool that offers users state-of-the-art natural language search capabilities. Using this technology, a user’s search query yields accurate and specific answers driven by machine learning, as opposed to the traditional string of results based on keywords.

2. Kubernetes

If there is one buzzword that has permeated the product floor at Protegrity HQ this year, it’s definitely ‘Kubernetes’. It’s not just us though: Gartner predicts that by 2022, 75% of global companies will be running containerized applications, and that these containers will be managed by Kubernetes. So, what has fueled the meteoric rise of this container orchestration tool? Firstly, it was originally developed by and for Google, meaning there is an army behind it to fix bugs and keep it continuously updated. Secondly, Kubernetes supports higher load demands with more complexity than the other major container orchestration tool, Docker Swarm. The automated, continuous container deployment that Kubernetes provides is the most scalable way to approach running application workflows. In many ways, Kubernetes is rendering a formerly manual function of DevOps teams obsolete.
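To make the idea of automated deployment a little more concrete: Kubernetes works declaratively, meaning you describe the state you want and the system keeps working to make the running state match it. The snippet below is a toy Python sketch of that reconcile idea only. It does not use the Kubernetes API, and the `reconcile` function, the app names, and the replica counts are all made up for illustration.

```python
def reconcile(desired, running):
    """Return the actions needed to move `running` replica counts toward `desired`.

    Both arguments are hypothetical dicts of app name -> replica count.
    """
    actions = []
    for app in sorted(set(desired) | set(running)):
        want = desired.get(app, 0)   # apps missing from `desired` should not run
        have = running.get(app, 0)   # apps missing from `running` have 0 replicas
        if want > have:
            actions.append(("start", app, want - have))
        elif want < have:
            actions.append(("stop", app, have - want))
    return actions


# Hypothetical example: what we asked for vs. what is currently running.
desired = {"web": 3, "worker": 2}
running = {"web": 1, "worker": 2, "legacy": 1}
actions = reconcile(desired, running)
print(actions)
```

In a real cluster this comparison runs continuously rather than once, which is what takes the manual step out of deployment.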

3. Apache Kafka

Today, the music Spotify recommends, the trip pricing Uber offers, and the family trees Ancestry.com produces all happen in real time thanks to Kafka. In fact, over 35% of Fortune 500 companies currently use systems built on Kafka. The proliferation and dominance of Kafka can be attributed to two factors:

  1. The rise of event data
  2. The scalability of Kafka’s log-centric approach to managing event data

When software engineers at LinkedIn developed Kafka in 2011, their goal was to ingest large batches of event data from LinkedIn into a lambda architecture, so that LinkedIn could display relevant posts to users in real time. They probably did not envision that Etsy would use the same technology to display the perfect Christmas ornament to a craft enthusiast. As we enter the new decade, though, this is what consumers have come to expect.
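For readers wondering what a "log-centric approach" looks like in practice, here is a minimal Python sketch of the core idea: events are only ever appended to a log, and each consumer keeps track of its own position in it. This is a toy illustration, not Kafka's API. The `EventLog` class, its method names, and the sample events are assumptions made up for this example.

```python
from collections import defaultdict


class EventLog:
    """A toy append-only log; not Kafka's API, just the idea behind it."""

    def __init__(self):
        self._records = []                 # events are only ever appended
        self._offsets = defaultdict(int)   # next offset to read, per consumer

    def append(self, event):
        """Add an event to the end of the log and return its offset."""
        self._records.append(event)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return unread events for a consumer and advance its offset."""
        start = self._offsets[consumer]
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = EventLog()
for event in ["play:song-1", "play:song-2", "skip:song-3"]:
    log.append(event)

# Two independent consumers read the same log at their own pace.
recommendations = log.poll("recommender", max_records=2)
analytics = log.poll("analytics")
remaining = log.poll("recommender")
```

Because reading never removes anything from the log, adding another consumer does not disturb the existing ones, which is a large part of why this design scales.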

4. Graph Databases

While graph databases are nowhere close to displacing traditional relational database technology, many companies are migrating to graph databases to overcome some of the inherent limitations of SQL queries, especially when analyzing complex or indirect relationships in data.

While SQL queries are written to answer anticipated questions, graphs provide the flexibility to answer unknown and unexpected questions. This is because the relationships between nodes in a graph are explicitly represented, as opposed to the relationships between SQL tables, which first need to be defined through foreign-key constraints.
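As a rough illustration of what "explicitly represented" relationships buy you, the Python sketch below stores each node's connections directly and then walks them to answer an indirect question: how is one person linked to another? It uses a plain dictionary and a breadth-first search rather than any real graph database, and the names and connections are hypothetical.

```python
from collections import deque

# Hypothetical data: each key is a node, each value lists the nodes it links to.
graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol"],
    "erin": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a shortest path from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = shortest_path(graph, "alice", "dave")
no_path = shortest_path(graph, "alice", "erin")
print(path)
```

The question "how many hops apart are these two nodes?" was never designed into the data ahead of time; it falls out of following the stored relationships.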

Areas where graph databases are proving to be particularly advantageous include genomics, aerospace, precision medicine, and threat detection and response (TDR).

5. Data Orchestration Platforms

Data orchestration platforms help businesses marshal and organize the disparate silos of data across their enterprises, so that there is a ‘single source of truth’. As data ecosystems continue to expand in both size and complexity, the importance of such technology has become increasingly apparent, and it’s no coincidence that 2019 marked the inaugural Data Orchestration Summit. One sustained skepticism of these platforms, however, has been that they will be unable to keep up with the ever-expanding compute workloads of modern, global businesses. This year, data orchestration technology leader Alluxio countered this skepticism by expanding the limits of its software from 200 million files to 1 billion files with the launch of version 2.0.