Last week’s DataWorks Summit in Berlin, attended by Protegrity, had a strong emphasis on data privacy and data protection, with GDPR core to many of the sessions. A poll in the morning keynote of day two highlighted that 15 percent of the attendees would not be ready for the May 25th deadline, with just over 50 percent still making preparations.
The conference, themed “Ideas. Insights. Innovation.”, brought together a large, mainly technical audience with a shared interest in maximising data exploitation for business value. Throughout, however, there was strong recognition of the fundamental importance of data privacy, with many references to recent cases of data loss and misuse, including the revelations from Facebook.
Scott Gnau, Hortonworks CTO, highlighted the trend of cloud adoption with a simple but effective reminder of the need for organisations to align their use of data (and analytics) and cloud adoption with their business strategies. An audience poll indicated that cloud adoption is still at an early stage, with only 25 percent of respondents moving (or planning to move) more than half their analytics to the cloud. The poll didn’t enquire as to the reasons for hesitancy, but often it’s security concerns. And of course the cynic in me might question why 35 percent of respondents have “no plans” to move to the cloud. Is this evidence of vendor and analyst hype around cloud migration, or of poll bias? It’s worth noting that 10 percent of poll respondents answered “What is GDPR?” to the question about GDPR readiness. A suitable reminder that data is often unreliable, and often not fit for purpose!
Clarity in “data understanding” was another strong theme at DataWorks, with an emphasis on metadata. The “right data, for the right use” was how Mandy Chessell from IBM described the aim of the metadata initiatives underway. Specifically, this referenced the Apache Atlas initiative, which aims to define a set of open standard interfaces, message protocols and frameworks for metadata management and governance. As she succinctly put it, “Good analytics needs good data, and that needs good metadata.”
A case study from MunichRe emphasised the proliferation of data, but also the quest to find new data to create richer data sets for analytics: not in a random way, but through a coordinated approach that extends data sets by acquiring specific data for a specific purpose. MunichRe has identified a role for “data hunters” who go out into the business, and also externally, to identify new, untapped data sources for collection and integration into analytics. All of this fuels the drive towards faster, more responsive, more accurate analytics.
Many presenters discussed the practical challenges of fulfilling data protection requirements. Worldpay explained how fundamental data protection is to their business, and the decision they have made to tokenise or encrypt sensitive fields in their Hadoop platform, knowing that clusters are a key target for hackers and that hackers will go to any extreme to get personal data.
Apache Ranger is now widely used to provide access controls, but Hortonworks security specialists also pointed to the options available through partners such as Protegrity to use encryption or tokenisation to protect sensitive data at field level.
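To make “field level” protection concrete, here is a minimal Python sketch. It is not Protegrity’s, Hortonworks’ or Worldpay’s implementation: it simply swaps in a keyed hash (HMAC-SHA256) as a deterministic, non-reversible stand-in for tokenisation, and the key, the field names and the `tok_` prefix are all hypothetical choices made for illustration.

```python
import hashlib
import hmac

# Hypothetical demo key; a real deployment would fetch this from a key manager.
SECRET_KEY = b"demo-key-not-for-production"

# Hypothetical set of field names treated as sensitive.
SENSITIVE_FIELDS = {"card_number", "email"}


def tokenise(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a deterministic, non-reversible token for a single field value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def protect_record(record: dict) -> dict:
    """Tokenise only the sensitive fields; leave all other fields untouched."""
    return {
        name: tokenise(value) if name in SENSITIVE_FIELDS else value
        for name, value in record.items()
    }


record = {"customer_id": "C-1001", "card_number": "4111111111111111", "amount": "25.00"}
protected = protect_record(record)
```

Because the same input always yields the same token, analysts can still join and count on the protected field without ever seeing the original value. A production tokenisation service would typically add reversibility for authorised users and proper key management, neither of which this sketch attempts.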
Such protection will become an increasing necessity as breaches have more impact; David Walker from Worldpay illustrated this with a recent example of a casino losing its high-roller customer database via an IoT thermostat in the lobby fish tank!
IoT was an area of increased interest for attendees. Renault presented an example of how they use sensors to keep track of gearbox packaging: a superficially trivial issue, but one with a high cost to the business. Safe transportation of complete gearboxes requires expensive packaging that should be reusable; sensors help to maximise re-use and minimise packaging costs. Based on the outline numbers presented, gearbox packaging costs are around €750K a year for the Renault brand.
Predictive analytics is a particularly hot area for project initiatives. Teradata highlighted how customers are extending the scope of predictive models with many new data sources. Frank Sauberlich cited the example of Danske Bank, which has extended the scope of data used in its fraud prediction models and, as a result, increased its detection rate by 50 percent (to 60 percent of actual fraud cases), whilst bringing the level of false positives down by 60 percent from 99.9 percent. Whilst financial services continue to provide leading examples, it was interesting to note the strong presence of vehicle manufacturers: not just the Renault keynote, but sessions from Audi and BMW and a reference to John Deere in the Bernard Marr keynote. This is strong evidence of the direction of data and analytics.
This explosion in data, and the growth in data lakes, has prompted the creation of “Data Steward Studio,” launched by Hortonworks at the event. It is a component of the Hortonworks DataPlane Service and aims to provide better understanding and governance of data across data lakes.
The challenges of GDPR were summarised in a keynote session from Enza Iannopollo at Forrester Research, with some clear, actionable advice: identify high-priority risks and take action on these first, then deploy security controls and re-engineer processes to fulfil consent, data subject and data breach requirements.
Throughout the event, momentum was the key message. Analytics continues to evolve at an incredible pace, and fundamentals like data security need to keep up, but without hindering the business or its analysts from getting value from data assets and exploring new ideas. Exactly the philosophy at Protegrity!