Veritas Strategy Analysis | Veritas Vision Solution Day
>> From Tavern on the Green in Central Park, New York, it's theCUBE covering Veritas Solution Day. Brought to you by Veritas. >> Welcome to New York City, everybody. We're here in the heart of Central Park at the beautiful location, Tavern on the Green. You're watching theCUBE, the leader in live tech coverage. And this is our special coverage of the Veritas Solutions Day. The hashtag is VtasVision. Veritas Vision last year was a big tent customer event, several thousand customers at that event and Veritas decided this year to go out to the field. 20 of these solution days, very intimate events, couple hundred customers, keynote presentations from Veritas, breakout sessions, getting deep into the product but also talking strategy, and intimate conversations with executives, CxOs, CIOs, backup admins, and of course, New York City is one of those places where you get very advanced customers pushing the envelope, very demanding. I often joke they're as demanding as New York sports fans, and so they have high expectations. But they also have a lot of money, and so the vendor community loves to come to New York, they love to get intimate with these customers in New York, as do we at theCUBE. So we're going to be talking to customers today, we're going to be talking to executives of Veritas, some partners. So I want to talk a little bit about what's going on in the marketplace, in this backup and recovery space. It's transforming quite dramatically. For those of you who follow theCUBE, you know last year at VMworld, last two years, actually, data protection was one of the hottest topics at the event. Of course, multi-cloud, of course there was a lot of AI talk and containers and Kubernetes. But staid old backup, old, reliable data protection was one of the hottest topics. We're seeing VC money pour into this space. 
We're seeing upstarts like Cohesity and Rubrik trying to take aim at the incumbents like Veritas and Commvault, and IBM, and Dell EMC, so those traditional companies, those enterprise companies that have large install bases are trying to hold onto that install base and migrate their platforms to a modern software-defined platform, API-based, using containers, using microservices, building on top of the code that they've developed, simplifying the UI, and at the same time, allowing for an abstraction layer across clouds and multi-clouds. So what are the big drivers that are really pushing the trends, the megatrends of this space? Well, certainly digital transformation is one of them. The last 10 years of big data, people have gathered all this data, and now that data is in this place and people are now applying machine intelligence to that data. They're doing a lot of this work in the cloud. So digital transformation, data, big data, cloud, multi-cloud, simplification. People want a much simpler experience, so bringing the cloud experience to their data, wherever the data might live. Because of course, you get the three laws of cloud. You've got the law of physics, right? Physics says you can't just shove everything into the cloud. It just takes too long. If I have a big bog of data, if I have a petabyte of data, you know how long that's going to take to put into the cloud? So I may not just move it in there unless I stick it on a Chevy truck and cart it over on a bunch of tapes, and nobody really wants to do that. So there's the law of physics. There's also the law of economics. It's very expensive to move that data. You need a lot of network bandwidth, so, you know, you might not necessarily put everything into the cloud, you might keep stuff on-prem. And of course, there's a law of the land. And the law of the land says, well, if I'm in country X, let's say Germany, that data can't leave that country. 
It's got to be physically proximate inside the boundaries, the borders of the country, by local law. So these three laws are something that was put forth to us by Pat Gelsinger in theCUBE at VMworld this year. We've evolved that thinking, but it's very true when we talk to customers about this. These are trends that are driving their decisions about cloud and multi-cloud and where to put it. We talked in theCUBE about the stat that the average enterprise has eight clouds. Well, we're a small enterprise and we have eight clouds, so I think that number's actually much, much higher, especially when you include SaaS. So lots of data, lots of copies of data, so you need a way to abstract all that complexity and have a single place to protect your data. Now, a big part of this, digital transformation is driving more intense requirements on recovery point objectives and recovery time objectives, RPO and RTO. What do those words mean? Recovery point objective, think about... Ask a businessperson, how much data are you willing to lose? And they go, oh, what are you talking about? I don't want to lose any data. But if you think about it and you ask the next question, how much are you willing to spend so that you lose no data, and if they have to spend millions and millions of dollars to do that, they might relax that requirement a little bit. They might say, well, you know, if I lose 15 minutes of data in any given time and have to recreate it, not the end of the world. So that's what RPO is: essentially the point in time that you go in to recover and how much data loss you're exposed to. And the way this works is you take, let's say, snapshots to simplify the equation, you push those offsite away from any potential disaster, and it's that gap between when you actually capture the data and when that disaster might happen that you're exposed. 
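The RPO gap described above can be put into rough numbers. The following is a minimal sketch, not any vendor's method: it assumes snapshots are taken at a fixed interval and each one takes a fixed time to land offsite, and the function names `worst_case_data_loss` and `meets_rpo` are hypothetical, introduced only for illustration.

```python
from datetime import timedelta


def worst_case_data_loss(snapshot_interval: timedelta,
                         offsite_lag: timedelta) -> timedelta:
    """Largest window of data that could be lost under the stated assumptions.

    If a disaster strikes just before the newest snapshot finishes copying
    offsite, the last safe copy is one full interval plus the copy lag old.
    """
    return snapshot_interval + offsite_lag


def meets_rpo(snapshot_interval: timedelta,
              offsite_lag: timedelta,
              rpo: timedelta) -> bool:
    """True if the worst-case exposure stays within the stated RPO."""
    return worst_case_data_loss(snapshot_interval, offsite_lag) <= rpo


# Assumed example: the "15 minutes of data" tolerance from the discussion.
rpo = timedelta(minutes=15)
print(meets_rpo(timedelta(minutes=10), timedelta(minutes=5), rpo))
print(meets_rpo(timedelta(minutes=15), timedelta(minutes=5), rpo))
```

Shrinking either the snapshot interval or the offsite lag narrows the exposure, which is where the cost of chasing a near-zero RPO comes from.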
So to make that gap as close to zero as possible is very, very expensive, so a lot of companies don't want to do that. At the same time, digital transformation's pushing them to get as close to zero as possible without breaking the bank. The other part of that equation is recovery time objective, how long it takes to get the application and the data back and running. And because of digital transformation, people want to make that virtually instantaneous. So because of digital transformation, people are re-architecting their data protection strategies to have near-instantaneous recovery. This all fits into the megatrend of cloud. People want it to be simpler, they want it to mimic the cloud-like experience, almost as if I'm on Amazon or I'm on Netflix, so simplifying the recovery process and the backup process is something that we're going to hear a lot more of. Automation is another big theme. People tend to automate through scripts. Well, scripts are fragile, scripts tend to break. When changes are made in software, scripts tend to have to be rewritten and maintained. And so it's a very high maintenance type of activity to do scripts, and over time, they just fade away, or they stop working. So automation through APIs is very, very important, something that's much more thematic in this world of data protection. The other is getting more out of the corpus of data in my data protection infrastructure, because, let's face it, backup and recovery, it's like insurance. I hope I never need it, but if I do need it, it's very valuable at that point in time that I do need it. But it's an expense. It's not driving bottom-line revenue. It's not necessarily cutting cost. It is indirectly in the form of reducing the cost of downtime, but that's harder. That's kind of viewed oftentimes as a soft dollar benefit. 
So what you're hearing is a lot of the vendor community and the user community are talking about getting more out of the data that they have and out of the backup and recovery infrastructure by bringing analytics, and machine intelligence, or AI and machine learning to the equation. Studying analytics to identify anomalous behavior, maybe identifying security breaches, creating air gaps such that I can potentially thwart ransomware or other malware infections, analyzing the corpus of backup data because it holds all the company's corporate data, it's accessible. If you can analyze that data and look for anomalies, you might be able to thwart an attack. So getting more out of that data through analytics. Predictive maintenance is another example of data analytics that's driving some of these trends beyond just backup and recovery. And also governance. Security and privacy are two sides of the same coin, so with GDPR, the General Data Protection Regulation, that went into effect in terms of fines this past May, very, very onerous and expensive fines, people are using their data protection corpus and the analytics around that to reduce their risk and to better govern their data. So these are some of the big trends that we're seeing. So Veritas is a leader here, we're going to be covering this all day. Veritas and some of its other brethren that have been around for decades are getting attacked by a lot of the upstarts, but the incumbent vendors have the advantage of a large install base. The upstarts have the advantage that they're starting with a clean sheet of paper. We're going to talk to customers and find out what they are thinking in terms of their backup approach. Industry data suggests that over half of the customers that you talk to are rethinking their backup strategies because of digital transformation. 
Well, we're going to talk to some customers. Are they thinking about sticking with Veritas or are they thinking about migrating? Why or why not? What are some of the advantages and considerations there? So Veritas, a long, rich story going back to the '80s when the company was founded, was a hot IPO, really super hot company, got sold to Symantec for about 13.5 billion, and then Symantec spun it out to private equity several years ago in an eight billion dollar go-private sale, and subsequently, Veritas got off the 90-day shot clock. We heard this from companies like Dell where they didn't have to report and get abused by the street for either missing a number or having one little metric that was off. So they could write their own narrative. They could invest in R&D, they could have more patient capital. And so you saw this from the Carlyle Group that took Veritas private, and there has been sort of this march toward a new platform, spending money on R&D, and now, really going to market very aggressively. Another thing you're going to hear about is partnerships, partnerships with AWS and some of the other cloud providers. There's a partnership that's being announced with the flash storage company, Pure, today. So we're going to dig into some of that. So we'll be here all day, Tavern on the Green. You're watching theCUBE and we're here in New York City. Keep it right there, we'll be right back. I'm Dave Vellante, back shortly. (digitalized music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Symantec | ORGANIZATION | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Pat Gelsinger | PERSON | 0.99+ |
15 minutes | QUANTITY | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Dell | ORGANIZATION | 0.99+ |
Veritas | ORGANIZATION | 0.99+ |
New York | LOCATION | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
New York City | LOCATION | 0.99+ |
millions | QUANTITY | 0.99+ |
90-day | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
two sides | QUANTITY | 0.99+ |
eight clouds | QUANTITY | 0.99+ |
GDPR | TITLE | 0.99+ |
Veritas Solutions Day | EVENT | 0.99+ |
three laws | QUANTITY | 0.99+ |
Veritas Solution Day | EVENT | 0.99+ |
Central Park | LOCATION | 0.99+ |
this year | DATE | 0.98+ |
eight billion dollar | QUANTITY | 0.98+ |
Netflix | ORGANIZATION | 0.98+ |
today | DATE | 0.98+ |
several years ago | DATE | 0.98+ |
SAS | ORGANIZATION | 0.98+ |
millions of dollars | QUANTITY | 0.98+ |
about 13.5 billion | QUANTITY | 0.97+ |
Carlisle | ORGANIZATION | 0.97+ |
Dell EMC | ORGANIZATION | 0.97+ |
one | QUANTITY | 0.97+ |
last two years | DATE | 0.97+ |
zero | QUANTITY | 0.96+ |
VMworld | ORGANIZATION | 0.96+ |
Germany | LOCATION | 0.95+ |
Rubrik | ORGANIZATION | 0.95+ |
Pure | ORGANIZATION | 0.94+ |
couple hundred customers | QUANTITY | 0.92+ |
one little metric | QUANTITY | 0.92+ |
20 of these solution days | QUANTITY | 0.91+ |
Central Park, New York | LOCATION | 0.91+ |
single | QUANTITY | 0.9+ |
Tavern on the Green | LOCATION | 0.9+ |
Cohesity | ORGANIZATION | 0.87+ |
Veritas Vision | EVENT | 0.87+ |
this march | DATE | 0.85+ |
Chevy | ORGANIZATION | 0.84+ |
'80s | DATE | 0.83+ |
theCUBE | ORGANIZATION | 0.83+ |
past May | DATE | 0.82+ |
law of physics | TITLE | 0.79+ |
thousand customers | QUANTITY | 0.73+ |
petabyte of data | QUANTITY | 0.72+ |
VtasVision | ORGANIZATION | 0.7+ |
last 10 years | DATE | 0.7+ |
de | QUANTITY | 0.66+ |
Tavern on the Green | TITLE | 0.65+ |
Commvault | ORGANIZATION | 0.55+ |
over half | QUANTITY | 0.52+ |
Day | EVENT | 0.47+ |
country | LOCATION | 0.44+ |
GDPR on theCUBE, Highlight Reel #4 | GDPR Day
- So our first prediction relates to how data governance is likely to change on a global basis. If we believe that we need to turn more data into work, businesses haven't generally adopted many of the principles associated with those practices. They haven't optimized to do that better. They haven't elevated those concepts within the business as broadly and successfully as they have, or as they should. We think that's gonna change, in part, by the emergence of GDPR, or the General Data Protection Regulation. It's gonna go into full effect in May 2018. A lot has been written about it. A lot has been talked about. But our core issue ultimately is that the dictates associated with GDPR are going to elevate the conversation on a global basis. And it mandates something that's now called the Data Protection Officer. We're gonna talk about that in a second, Dave Vellante. But it is going to have real teeth. So we were talking with one Chief Privacy Officer not too long ago who suggested that had the Equifax breach occurred under the rules of GDPR, the actual fines that would have been levied would have been in excess of $160 billion, which is a little bit more than the $0 that has been fined thus far. Now we see new bills introduced in Congress, but ultimately our observation and our conversation with a lot of Chief Privacy Officers or Data Protection Officers is that in the B to B world, GDPR is going to strongly influence not just how businesses behave regarding data in Europe, but on a global basis.
- A lot of the undertone is, "Cloud, cloud, cloud, governance, governance, governance." Those are the two drivers I've been seeing as the forces this week: a lot of people trying to get their act together on those two fronts. And you can kind of see the scabs on the industry. Some people haven't been paying attention and they're weak in the area. 
Cloud is absolutely going to be driving the big data world, because data's horizontal, cloud's the power source to that. You guys have been on that. What's your thoughts? What other drivers and currents-- first of all do you agree with what I'm saying? And what else did I miss? I mean, security is obviously in there, but--
- Absolutely, so I think you're exactly right on. So, obviously governance and security are a big deal. Largely being driven by the GDPR regulation that's happening in Europe. But I mean, every company today is global, so everybody's essentially affected by it. So I think data up til now has always been a kind of opportunistic thing, that there's a couple guys in the organization who are looking at it as, "Oh, let's do some experimentation, let's do something interesting here." Now it's becoming a government mandate. And so I think there's a lot of organizations who are, like to your point, getting their act together, and that's driving a lot of demand for data management products. So now people say, "Well, if I gotta get my act together, I don't want to have to hire armies of people to do it. Let me look for automated, machine-learning based ways of doing it," so that they can actually deliver on the audit reports that they need to deliver on, ensure the compliance that they need to ensure, but do it in a very scalable way.
- Me as a customer come to an enterprise and say, "I don't want any of my data stored." It's up to you to go delete that data completely, right? That's the term that's being used, and that goes into effect in May. How do you make sure that that data gets completely deleted by that time the customer has? How do you get that consent from the customer to go do all this? So there's a whole lot of challenges as data multiplies. How do you deal with the data? How do you create insights to the data? How do you pay the consent on the data? How do you be compliant on the data? 
You know, how do you create the policies that are needed to govern that data? All those things, those are the challenges that enterprises are facing.
- Digital transformation's accelerating, data protection's being disrupted, millions of jobs are coming in. You guys are playing a role. What is the role that Druva is playing in the digital transformation acceleration?
- Absolutely. You think about the world, right, and you think of companies like Domino's or Tesla, they think they are software companies, right, they deliver a software approach to the traditional business model. At the heart of this transformation of enterprises becoming software, digitalized, is the data at the core. And data today will outlive most systems. And the more and more fragmented your approach to data becomes, you store data on prem, in the cloud, everywhere in between, the data management has to become more and more centralized. So Druva is at the core of this transformation, making a data transformation and making sure your data architecture has the future of a better approach to manageability and protection with the Druva platform.
- You guys had a busy month this month. You got a couple big news we're gonna be talking about today. Funding and next generation platform. Walk us through that.
- Absolutely, so we have two big news to announce today. The first one being $80 million of capital raised, led by Riverwood Capital, followed by most of their investors, including Sequoia, Nexus and Tenaya Capital. And then the number two being announcing a whole new Druva cloud platform, which holistically takes our entire product portfolio and puts it together in a nice, simplistic approach to manage your entire information workload in a single platform in the cloud.
- The first question in mind is, is everybody ready for GDPR? The answer is "no". Have they started into the journey to get, have they started getting on the racetrack, right? On the road? Yes. 
Yeah, it depends on the maturity of the organization. Some people have just started building a small strategy around GDPR. Some people have actually started doing assessment to understand how complex is this beast and regulation. And some people have just moved further in the journey of doing assessment, but they're now putting up changes in their infrastructure to handle remediation, right? Things like, for example, consent management. Things about, things like deletion. It could be very big deal to do, right? So they are making changes to the infrastructure that they have or the IT systems to manage it effectively. But I don't think there's any company which probably can claim that they have got it right fully end to end.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Tesla | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
May 2018 | DATE | 0.99+ |
Riverwood Capital | ORGANIZATION | 0.99+ |
May | DATE | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
two | QUANTITY | 0.99+ |
Sequoia | ORGANIZATION | 0.99+ |
Equifax | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
two fronts | QUANTITY | 0.99+ |
Congress | ORGANIZATION | 0.99+ |
GDPR | TITLE | 0.99+ |
$0 dollars | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
first question | QUANTITY | 0.98+ |
$80 million dollars | QUANTITY | 0.98+ |
Druva | TITLE | 0.97+ |
this week | DATE | 0.96+ |
first prediction | QUANTITY | 0.96+ |
single platform | QUANTITY | 0.95+ |
two big news | QUANTITY | 0.95+ |
millions of jobs | QUANTITY | 0.94+ |
first one | QUANTITY | 0.92+ |
this month | DATE | 0.92+ |
first | QUANTITY | 0.9+ |
Chief Privacy Officer | PERSON | 0.9+ |
GDPR Day | EVENT | 0.89+ |
second | QUANTITY | 0.88+ |
couple guys | QUANTITY | 0.85+ |
$160 billion dollars | QUANTITY | 0.85+ |
Domino's | ORGANIZATION | 0.82+ |
number two | QUANTITY | 0.79+ |
GDPR | EVENT | 0.77+ |
Data Protection Officer | PERSON | 0.74+ |
4 | QUANTITY | 0.68+ |
couple big news | QUANTITY | 0.65+ |
Druva | ORGANIZATION | 0.56+ |
Highlight Reel | ORGANIZATION | 0.48+ |
GDPR on theCUBE, Highlight Reel #3 | GDPR Day
(bouncy, melodic music)
- The world's kind of revolting against these mega-siloed platforms.
- That's the risk of having such centralized control over technology. If you remember in the old days, when Microsoft dominance was rising, all you had to do was target Windows as a virus platform, and you were able to impact thousands of businesses even in the early Internet days, within hours. And it's the same thing happening right now, as a weaponization of these social media platforms, and Google's search engine technology and so forth, is the same side effect now. The centralization, that control, is the problem. One of the reasons I love the Blockstack technology, and Blockchain in general, is the ability to decentralize these things right now. And the most passionate thing I care about nowadays is being driven out of Europe, where they have a lot more maturity in terms of handling these nuances--
- You mean the tech being driven out of Europe.
- Their laws,
- The laws, okay.
- being driven out of Europe and--
- Be specific, we'd like an example.
- The major deadline that's coming up on May 25th of 2018 is GDPR, General Data Protection Regulation, where European citizens now, and any company, American or otherwise, catering to European citizens, has to respond to things like the Right To Be Forgotten request. You've got 24 hours as a global corporation with European operations, to respond to European citizens', EU citizens', Right To Be Forgotten request where all the personally identifiable information, the PII, has to be removed and audit-trailed, proving it's been removed, has to be gone from two, three hundred internal systems within 24 hours. And this has teeth by the way. It's not like the 2.7 billion dollar fine that Google just flipped away casually. This has up to 4% of your global profits per incident where you don't meet that requirement. 
- And so what we're seeing in the case of GDPR is that's an accelerant to adopt Cloud, because we actually isolate the data down into regions and the way we've architected our platform from day one has always been a true multi-tenant SaaS technology platform. And so there's not that worry about data resiliency and where it resides, and how you get access to it, because we've built all that up. And so, when we go through all of our own attestations, whether it's SOC Type One, Type Two, GDPR as an initiative, what we're doin' for HIPAA, what we're doin' for a plethora of other things, usually the CSO says, "Oh, I get it, you're way more secure, now help me," because I don't want the folks in development or operations to go amuck, so to speak, I want to be an enabler, not Doctor No.
- I'm a developer, I search for data, I'm just searching for data.
- That's right.
- What's the controls available for making sure that I don't go afoul of GDPR?
- So absolutely. So we have phenomenal security capabilities that are built into our product, both from an identification point of view, giving rights and privileges, as well as protecting that data from any third party access. All of this information is going to be compliant with these regulations, beyond GDPR. There's enormous regulations around data that require us to keep our security levels as high as we go. In fact, we would argue that AWS itself is now typically more secure, more secure,
- [Mike] They've done the work.
- than your classic data center.
- [Mike] Yeah, they've done the work.
- AI-ers, explicable machine learning.
- Yeah, that's a hot focus,
- Indeed.
- or concern of enterprises everywhere, especially in a world where governance and tracking and lineage,
- Precisely.
- GDPR and so forth, so hot.
- Yes, you have mentioned all the right things. Now, so given those two things, there's normal web data, NML is not easy, why the partnership between Hortonworks and IBM makes sense? 
Well, you're looking at the number one, industry leading big data platform, Hortonworks. Then you look at a DSX Local, which I'm proud to say I've been there since the first line of code, and I'm feeling very passionate about the product, is the merge between the two. Ability to integrate them tightly together, gives your data scientists secure access to data, ability to leverage the Spark that runs inside of Hortonworks Glassdoor, ability to actually work in a platform like DSX, that doesn't limit you to just one kind of technology but allows you to work within multiple technologies. Ability to actually work on your, not only Spark--
- You say technologies here, are you referring to frameworks like TensorFlow, and--
- [Piotr] Precisely.
- Okay, okay.
- Very good, now, that part I'm gonna get into very shortly. So please don't steal my thunder.
- So GDPR you see as a big opportunity for Cloud providers, like Azure. Or they bring something to the table, right?
- Yeah, they bring different things to the table. You have elements of data where you need the on-premise solution, you need to have control, and you need to have that restriction about where that data sits. And some of the talks here that are going on at the moment, is understanding, again, how critical and how risky is that data? What is it you're keepin' and how high does that come up in our business value it is? So if that's gonna be on your on-prem solution, there may be other data that can get pushed out into the Cloud, but, I would say, Azure, the AWS Suites and Google, they are really pushing down that security, what you can do, how you protect it, how you can protect that data, and you've got the capabilities of things like LRS or GRS, and having that global reach or those local repositories, for the object storage. So you can start to control by policies. 
You can write into this country, but you're not allowed to go to this country, and you're not allowed to go to that one, and Cloud does give you that to a certain element, but also then, you have to step back into, maybe the sorts of things that--
- So does that make Cloud Orchestrator more valuable, or has it still got more work to do? Because under what Adam was saying, is that the point and click, is a great way to provision, right?
ENTITIES
Entity | Category | Confidence |
---|---|---|
Hortonworks | ORGANIZATION | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Adam | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
May 25th of 2018 | DATE | 0.99+ |
GDPR | TITLE | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Google | ORGANIZATION | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
24 hours | QUANTITY | 0.99+ |
two things | QUANTITY | 0.99+ |
Mike | PERSON | 0.99+ |
first line | QUANTITY | 0.99+ |
HIPAA | TITLE | 0.98+ |
Cloud | TITLE | 0.98+ |
One | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
thousands of businesses | QUANTITY | 0.96+ |
Windows | TITLE | 0.96+ |
Piotr | PERSON | 0.95+ |
up to 4% | QUANTITY | 0.95+ |
TensorFlow | TITLE | 0.94+ |
one kind | QUANTITY | 0.93+ |
European | OTHER | 0.93+ |
2.7 billion dollar | QUANTITY | 0.91+ |
Azure | ORGANIZATION | 0.89+ |
AWS Suites | ORGANIZATION | 0.89+ |
Spark | TITLE | 0.88+ |
three hundred internal systems | QUANTITY | 0.86+ |
EU | LOCATION | 0.85+ |
Hortonworks Glassdoor | ORGANIZATION | 0.84+ |
NML | ORGANIZATION | 0.83+ |
GDPR Day | EVENT | 0.78+ |
day one | QUANTITY | 0.75+ |
American | OTHER | 0.74+ |
CSO | ORGANIZATION | 0.72+ |
LSR | TITLE | 0.7+ |
Right To Be Forgotten | OTHER | 0.68+ |
GSR | TITLE | 0.62+ |
Type Two | OTHER | 0.62+ |
To | OTHER | 0.6+ |
DSX | ORGANIZATION | 0.59+ |
One | OTHER | 0.59+ |
Highlight Reel | ORGANIZATION | 0.56+ |
#3 | QUANTITY | 0.54+ |
one | QUANTITY | 0.5+ |
SOC Type | TITLE | 0.49+ |
DSX Local | TITLE | 0.44+ |
GDPR on theCUBE, Highlight Reel #1 | GDPR Day
(inspirational music)
- So GDPR, the General Data Protection Regulation, was passed by the EU in 2016, in May of 2016. It is, as Ronald was saying, four basic things. The right to privacy, the right to be forgotten, privacy built into systems by default, and the right to data transfer.
- [Panelist] Takes effect next year.
- It is already in effect. GDPR took effect in May of 2016. The enforcement penalties take place the 25th of May 2018. Now here's where there's two things on the penalty side that are important for everyone to know. Number one. GDPR is extraterritorial. Which means that for any EU citizen anywhere on the planet, GDPR goes with them. So say you are a pizza shop in Nebraska. An EU citizen walks in, orders a pizza, gives the credit card, stuff like that. If you for some reason destroy that data, GDPR now applies to you, Mr. Pizza Shop, whether or not you do business in the EU, because an EU citizen's data is with you. It's true, the penalties are much different than they ever have been. In the old days companies could simply write off penalties, saying that's the cost of doing business. With GDPR the penalties are up to 4% of your annual revenue or 20 million euros, whichever is greater, and there may be criminal sanctions against, charges against key company executives. So there's a lot of questions about how this is going to be implemented. But one of the first impacts you will see from a marketing perspective is, all the advertising we do, targeting people by their age, by their personally identifiable information, by their demographics, between now and May 25th 2018 a good chunk of that may have to go away because there's no way for you to say, well, this person's an EU citizen, this person's not. People give false information all the time online. So how do you differentiate? Every company, regardless of whether they are in the EU or not, will have to adapt to it. Or deal with the penalties. 
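The penalty ceiling quoted above ("up to 4% of your annual revenue or 20 million euros, whichever is greater") is simple arithmetic. Here is a minimal sketch of that rule exactly as stated in the discussion; the function name `max_gdpr_fine_eur` is hypothetical, and the sketch ignores everything a regulator would actually weigh when setting a fine.

```python
def max_gdpr_fine_eur(annual_revenue_eur: float) -> float:
    """Upper bound on a fine under the rule as quoted in the transcript:
    the greater of 4% of annual revenue and a flat 20 million euros."""
    four_percent = annual_revenue_eur * 4 / 100
    flat_cap = 20_000_000.0
    return max(four_percent, flat_cap)


# Assumed revenues, for illustration only.
for revenue in (100_000_000, 1_000_000_000, 50_000_000_000):
    print(f"revenue {revenue:>14,} EUR -> ceiling {max_gdpr_fine_eur(revenue):>16,.0f} EUR")
```

For smaller companies the flat 20 million euro figure dominates; the 4% term only takes over once annual revenue passes 500 million euros.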
- When you think about the principles that GDPR gives you, I look at that and think that's just, to me that's just good data management practices and principles. It happens to be around personal data for GDPR right now, but those principles are just valid for probably any kind of data. So if you're on the digital transformation journey, with all the change and all the opportunity that brings, these practices and principles for GDPR, they should be helping drive things like your digital transformation. For a lot of our customers, change is the only constant they've got, especially managing all this whilst everything is changing around you. It's tough for a lot of them.
- How are people thinking about the data layer, where it lives, on prem, in the cloud, think about GDPR compliance, you know all that sort of good stuff. How are you and Red Hat, how are you asking people to think about that?
- So, you know, data management is a big question. We build storage tooling. We understand how to put the bytes on disk, and persist and maintain the storage. It's a different question what are the data services and what is the data governance or policy around placement. And, I think it's a really interesting part of the ecosystem today. We've been working with some research partners in the Massachusetts Open Cloud at Boston University on a project called Cloud Dataverse. And it has a whole policy question around data. It's there, scientists want to share data sets, to control and understand who you're sharing your data sets with. So it's definitely a space that we are interested in. Understand that there's a lot of work to be done there, and GDPR just kind of shines the light right on it. Says, policy and governance around where data is placed is actually fundamental and important. And I think it's an important part because you have seen some of the data issues recently in the news. 
And, we got to get a handle on where data goes, and ultimately I'd love to see a place where I'm in control of how my data is shared with the rest of the world. - GDPR provides for two types of things that a business must do. It must provide insight into the data that it's captured, about business or an individual, legal entity. And it must also then provide the processes for mediating or taking action against that data according to whatever the customer's wishes are. Tell us a little bit about that. - So these are two important features because of GDPR. First thing, GDPR has 99 articles and 173 articles and 99 like term technological ways. There are other ways, legal ways to do it, but technologically what they want. Like if Peter decides that I need to know from this bank or from this social media company how much information you have about me, and what are you doing with it. They have to provide that information in 30 days. That is called right to access. And the second thing is you can come and say, well I'm not using these five things which you sold me earlier, I don't want you to use that information, or even have information on that for me or my son or my kid. So you can tell them delete that information or mask that. - And that's called the right to? - Right to erasure, right to remove the data. And these two things are very important. This gives customer, they make customer the king. They make the individual the king. He can say tell me what you have on me, and delete what you have on me. - Now the laws have been in the books in, at least in the EU for GDPR for a while. But the fines start getting leveled in May. - May 5th. - Now we've heard that... - So GDPR is a big thing for us and our customers and prospects as well. So we are actively working on getting GDPR compliant. Today our platform is FIPS compliant, so that's already a big stepping stone to getting there. So we look at GDPR in one of, in two ways again, right? 
One is the solution that we provide to our customers, the data platform and the data protect as we call it. Being GDPR compliant. Meaning the data that lands on that system. The ability to delete the data, the ability to say who has access to the data, role-based access, things like that. The second aspect is our support and the fact that we have access to a lot of customer information ourselves, right? The fact that we can look at their systems and make sure that everything we do internally is also GDPR compliant, so that the customers and our support systems and our sales force database is all GDPR as well. So both those elements come into play and we are actively working on all of them. (inspirational music)
SUMMARY :
and the right to data transfer.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Ronald | PERSON | 0.99+ |
May of 2016 | DATE | 0.99+ |
May 5th | DATE | 0.99+ |
Nebraska | LOCATION | 0.99+ |
May 25th 2018 | DATE | 0.99+ |
2016 | DATE | 0.99+ |
First | QUANTITY | 0.99+ |
EU | ORGANIZATION | 0.99+ |
five things | QUANTITY | 0.99+ |
next year | DATE | 0.99+ |
May | DATE | 0.99+ |
GDPR | TITLE | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
two things | QUANTITY | 0.99+ |
25th of May 2018 | DATE | 0.99+ |
two types | QUANTITY | 0.99+ |
two ways | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
Peter | PERSON | 0.99+ |
second thing | QUANTITY | 0.99+ |
second aspect | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
99 | QUANTITY | 0.99+ |
30 days | QUANTITY | 0.99+ |
173 articles | QUANTITY | 0.99+ |
one | QUANTITY | 0.98+ |
Today | DATE | 0.98+ |
99 articles | QUANTITY | 0.98+ |
20 million euros | QUANTITY | 0.98+ |
Red Hat | ORGANIZATION | 0.98+ |
Mr. Pizza Shop | ORGANIZATION | 0.98+ |
Boston University | ORGANIZATION | 0.98+ |
up to 4% | QUANTITY | 0.97+ |
two important features | QUANTITY | 0.96+ |
EU | LOCATION | 0.94+ |
today | DATE | 0.87+ |
GDPR Day | EVENT | 0.85+ |
Massachusetts | ORGANIZATION | 0.82+ |
first impacts | QUANTITY | 0.81+ |
four base | QUANTITY | 0.7+ |
Cloud Dataverse | TITLE | 0.68+ |
Number one | QUANTITY | 0.64+ |
theCUBE | ORGANIZATION | 0.51+ |
Highlight Reel | ORGANIZATION | 0.4+ |
Day Two Keynote Analysis | Dataworks Summit 2018
>> Announcer: From Berlin, Germany, it's the Cube covering Dataworks Summit Europe 2018. Brought to you by Hortonworks. (electronic music) >> Hello and welcome to the Cube on day two of Dataworks Summit 2018 from Berlin. It's been a great show so far. We have just completed the day two keynote and in just a moment I'll bring ya up to speed on the major points and the presentations from that. It's been a great conference. Fairly well attended here. The hallway chatter, discussion's been great. The breakouts have been stimulating. For me the takeaway is the fact that Hortonworks, the show host, has announced yesterday at the keynote, Scott Gnau, the CTO of Hortonworks, announced Data Steward Studio, DSS they call it, part of the data plane, Hortonworks data plane services portfolio, and it could not be more timely, Data Steward Studio, because we are now five weeks away from GDPR, that's the General Data Protection Regulation, becoming the law of the land. When I say the land, the EU, but really any company that operates in the EU, and that includes many U.S.-based and APAC-based and other companies, will need to comply with the GDPR as of May 25th and ongoing. In terms of protecting the personal data of EU citizens. And that means a lot of different things. Data Steward Studio, announced yesterday, was demo'd today by Hortonworks and it was a really excellent demo, and showed that it's a powerful solution for a number of things that are at the core of GDPR compliance. The demo covered the capability of the solution to discover and inventory personal data within a distributed data lake or enterprise data environment, number one. Number two, the ability of the solution to centralize consent, provide a consent portal essentially that data subjects can use then to review the data that's kept on them, to make fine grain consents or withdraw consents for use in profiling of their data that they own. 
And then number three, the show, they demonstrated the capability of the solution then to execute the data subjects', the people's, requests in terms of the handling of their personal data. The three main points in terms of enabling, adding the teeth to enforce GDPR in an operational setting in any company that needs to comply with GDPR. So, what we're going to see, I believe going forward in the, really in the whole global economy and in the big data space is that Hortonworks and others in the data lake industry, and there's many others, are going to need to roll out similar capabilities in their portfolios 'cause their customers are absolutely going to demand it. In fact the deadline is fast approaching, it's only five weeks away. One of the interesting takeaways from the, the keynote this morning was the fact that John Kreisa, the VP for marketing at Hortonworks, today did a quick survey of those in the audience, a poll, asking how ready they are to comply with GDPR as of May 25th and it was a bit eye opening. I wasn't surprised, but I think it was 19 or 20%, I don't have the numbers in front of me, said that they won't be ready to comply. I believe it was somewhere between 20 and 30% said they will be able to comply. About 40% I'm, don't quote me on that, but a fair plurality said that they're preparing. So that indicates that they're not entirely 100% sure that they will be able to comply 100% to the letter of the law as of May 25th. I think that's probably accurate in terms of ballpark figures. I think there's a lot of, I know there's a lot of companies, users racing for compliance by that date. And so really GDPR is definitely the headline banner, umbrella story around this event and really around the big data community world-wide right now in terms of enterprise, investments in the needed compliance software and services and capabilities are needed to comply with GDPR. That was important. 
That wasn't the only thing that was covered in, not only the keynotes, but in the sessions here so far. AI, clearly AI and machine learning are hot themes in terms of the innovation side of big data. There's compliance, there's GDPR, but really innovation in terms of what enterprises are doing with their data, with their analytics, they're building more and more AI and embedding that in conversational UIs and chatbots and they're embedding AI in, you know, all manner of e-commerce applications, internal applications in terms of search, as well as things like face recognition, voice recognition, and so forth and so on. So, what we've seen here at the show is what I've been seeing for quite some time is that more of the actual developers who are working with big data are the data scientists of the world. And more of the traditional coders are getting up to speed very rapidly on the new state of the art for building machine learning and deep learning AI natural language processing into their applications. That said, so Hortonworks has become a fairly substantial player in the machine learning space. In fact, you know, really across their portfolio many of the discussions here I've seen show that everybody's buzzing about getting up to speed on frameworks for building and deploying and iterating and refining machine learning models in operational environments. So that's definitely a hot theme. And so there was an AI presentation this morning from the first gentleman that came on that laid out the broad parameters of what, what developers are doing and looking to do with data that they maintain in their lakes, training data to both build the models and train them and deploy them. So, that was also something I expected and it's good to see at Dataworks Summit that there is a substantial focus on that in addition of course to GDPR and compliance. It's been about seven years now since Hortonworks was essentially spun off of Yahoo. 
It's been I think about three years or so since they went IPO. And what I can see is that they are making great progress in terms of their growth, in terms of not just the finances, but their customer acquisition and their deal size and also customer satisfaction. I get a sense from talking to many of the attendees at this event that Hortonworks has become a fairly blue chip vendor, that they're really in many ways, continuing to grow their footprint of Hortonworks products and services in most of their partners, such as IBM. And from what I can see everybody was rapt with attention around Data Steward Studio and I sensed, sort of a sigh of relief that it looks like a fairly good solution and so I have no doubt that a fair number of those in this hall right now are probably, as we say in the U.S., probably kicking the tires of DSS and probably going to expedite their adoption of it. So, with that said, we have day two here, so what we're going to have is Alan Gates, one of the founders of Hortonworks, coming on in just a few minutes and I'll be interviewing him, asking about the vibrancy and the health of the community, the Hortonworks ecosystem, developers, partners, and so forth as well as of course the open source communities for Hadoop and Ranger and Atlas and so forth, the growing stack of open source code upon which Hortonworks has built their substantial portfolio of solutions. Following him we'll have John Kreisa, the VP for marketing. I'm going to ask John to give us an update on, really the, sort of the health of Hortonworks as a business in terms of the reach out to the community in terms of their messaging obviously and have him really position Hortonworks in the community in terms of who he sees them competing with. What segments is Hortonworks in now? The whole Hadoop segment increasingly... Hadoop is there. It's the foundation. The word is not invoked in the context of discussions of Hortonworks as much now as it was in the past. 
And the same thing for, say, Cloudera, one of their closest traditional rivals, closest in the sense that people associate them. I was at the Cloudera analyst event the other week in Santa Monica, California. It was the same thing. I think both of these vendors are on a similar path to become fairly substantial data warehousing and data governance suppliers to the enterprises of the world that have traditionally gone with the likes of IBM and Oracle and SAP and so forth. So I think they're, Hortonworks, has definitely evolved into a far more diversified solution provider than people realize. And that's really one of the takeaways from Dataworks Summit. With that said, this is Jim Kobielus. I'm the lead analyst, I should've said that at the outset. I'm the lead analyst at SiliconANGLE Media's Wikibon team focused on big data analytics. I'm your host this week on the Cube at Dataworks Summit Berlin. I'll close out this segment and we'll get ready to talk to the Hortonworks and IBM personnel. I understand there's a gentleman from Accenture on as well today on the Cube here at Dataworks Summit Berlin. (electronic music)
SUMMARY :
Announcer: From Berlin, Germany, it's the Cube as a business in terms of the reach out to the community
ENTITIES
Entity | Category | Confidence |
---|---|---|
Jim Kobielus | PERSON | 0.99+ |
John Kreisa | PERSON | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Scott Gnau | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
John | PERSON | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
May 25th | DATE | 0.99+ |
Berlin | LOCATION | 0.99+ |
Yahoo | ORGANIZATION | 0.99+ |
five weeks | QUANTITY | 0.99+ |
Alan Gates | PERSON | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Hotronworks | ORGANIZATION | 0.99+ |
Data Steward Studio | ORGANIZATION | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
Santa Monica, California | LOCATION | 0.99+ |
GDPR | TITLE | 0.99+ |
19 | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
100% | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
20% | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
yesterday | DATE | 0.99+ |
U.S. | LOCATION | 0.99+ |
DSS | ORGANIZATION | 0.99+ |
30% | QUANTITY | 0.99+ |
Berlin, Germany | LOCATION | 0.98+ |
Dataworks Summit 2018 | EVENT | 0.98+ |
three main points | QUANTITY | 0.98+ |
Atlas | ORGANIZATION | 0.98+ |
20 | QUANTITY | 0.98+ |
about seven years | QUANTITY | 0.98+ |
Accenture | ORGANIZATION | 0.97+ |
SiliconANGLE | ORGANIZATION | 0.97+ |
One | QUANTITY | 0.97+ |
about three years | QUANTITY | 0.97+ |
Day Two | QUANTITY | 0.97+ |
first gentleman | QUANTITY | 0.96+ |
day two | QUANTITY | 0.96+ |
SAP | ORGANIZATION | 0.96+ |
EU | LOCATION | 0.95+ |
Datawork Summit Europe 2018 | EVENT | 0.95+ |
Dataworks Summit | EVENT | 0.94+ |
this morning | DATE | 0.91+ |
About 40% | QUANTITY | 0.91+ |
Wikibon | ORGANIZATION | 0.9+ |
EU | ORGANIZATION | 0.9+ |
Abhas Ricky, Hortonworks | Dataworks Summit 2018
>> Announcer: From Berlin, Germany, it's the CUBE covering Dataworks Summit Europe 2018. Brought to you by Hortonworks. >> Welcome to the CUBE, we're here at Dataworks Summit 2018 in Berlin. I'm James Kobielus. I am the lead analyst for big data analytics on the Wikibon team of SiliconANGLE Media. On the CUBE, we extract the signal from the noise and here at Dataworks Summit, the signal is big data analytics and increasingly the imperative for many enterprises is compliance with GDPR, the General Data Protection Regulation comes in five weeks, May 25th. There's more things going on so what I'm going to be doing today for the next 20 minutes or so is from Hortonworks I have Abhas Ricky who is the director of strategy and innovation. He helps customers, and he'll explain what he does, but at a high level, he helps customers to identify the value of investments in big data, analytics, big data platforms in their business. And Abhas, how do you justify the value of compliance with GDPR? I guess the value would be avoid penalties for noncompliance, right? Can you do it as an upside as well? Is there an upside in terms of if you make an investment, and you probably will need to make an investment to comply, can you turn this around as a strategic asset, possibly? >> Yeah, so I'll take a step back first. >> James: Like a big data catalog and so forth. >> Yeah, so if you look at the value part which you said, it's interesting that you mentioned it. So there's a study which was done by McKinsey which said that only 15% of executives can understand what is the value of a digital initiative, let alone big data initiative. >> James: Yeah. >> Similarly, Gartner says that if you look at the various portraits and if you look at various issues, the fundamental thing which executives struggle with is identifying the value which they will get. So that is where I pitch in. That is where I come in and do a data perspective. 
Now if you look at GDPR specifically, one of the things that we believe, and I've done multiple blogs around that and webinars, GDPR should be treated as a business opportunity because of the fact that -- >> James: Any opportunity? >> Business opportunity. It shouldn't necessarily be seen as a compliance burden on costs or your balance sheets because of the fact, it is the one single opportunity which allows you to clean up your data supply chain. It allows you to look at your data assets with a holistic view, and if you create a transparent data supply chain, and your IT systems talk to each other. So some of the provisions, as you know, in addition to right to consent, right to portability, etc. It is also privacy by design which says that you have to be proactive in defining your IT systems and architecture. It's not necessarily reactive. But guess what? If you're able to do that, you will see the benefits in other use cases like single view of customer or fraud or anti-money laundering because at the end of the day, all GDPR is allowing you to say is that where do you store your data, what's the lineage, what's the provenance? Can you identify what the personally identifiable information is for any particular customer? And can you use that to your effect as you go forward? So it's a great opportunity because to be able to comply with the provisions, you've got to take steps before that which is essentially streamlining your data operations which obviously will have a domino effect on the efficiency of other use cases. So I believe it's a business opportunity. >> Right, now part of that opportunity in terms of getting your arms around what data you have, when the GDPR is concerned, the customer has a right to withhold consent for you and the enterprise that holds that data to use that personal data of theirs which they own for various and sundry reasons. 
Many enterprises and many of Hortonworks' customers are using their big data for things like AI and machine learning. Won't this compliance with GDPR limit their ability to seize the opportunity to build deep learning and so forth? What are customers saying about that? Is that going to be kind of a downer or a chilling effect on their investments in AI and so forth? >> So there's two elements around it. The first thing which you said, there are customers, there's machine learning in AI, yes, there are. But broadly speaking, before you're able to do machine learning and AI, you need to get your data sets onto a particular platform in a particular fashion, clean data, otherwise, you can't do AI or machine learning on top of it. >> James: Right. >> So the reason why I say it's an opportunity is that because you're being forced by compliance to get that data from every other place onto this platform. So obviously those capabilities will get enhanced. Having said that, I do agree if I'm an organization which does targeting, retargeting of customers based on multiple segmentations and then one of the things is online advertisements. In that case, yes, your ability might get affected, but I don't think you'll get prohibited. And that affected time span will be only small because you just adapt. So the good thing about machine learning and AI is that you don't create rules, you don't create manual rules. They pick up the rules based on the patterns and how the data sets have been performing. So obviously once you have created those structures in place, initially, yes, you'll have to make an investment to alter your programs of work. However, going forward, it will be even better. Because guess what? You just cleaned your entire data supply chain. 
So that's how I would see that, yes, a lot of companies, ecommerce, you do targeting and retargeting based on the customer DNA, based on their shopping profiles, based on their shopping habits and then based off that, you give them the next best offer or whatever. So, yes, that might get affected initially, but that's not because GDPR is there or not. That's just because you're changing your program of work. You're changing the fundamental way by which you're sourcing the data, the way they are coming from and which data can you use. But once you have tags against each of those attributes, once you have access controls, once you know exactly which customer attributes you can touch and you cannot for the purposes, do you have consent or not, your life's even better. The AI tools or the machine learning algorithms will learn from themselves. >> Right, so essentially, once you have a tight ship in terms of managing your data in line with the GDPR strictures and so forth, it sounds like what you're saying is that it gives you as an enterprise the confidence and assurance that if you want to use that data and need to use that data, you know exactly how you've the processes in place to gain the necessary consents from customers. So there won't be any nasty surprises later on of customers complaining because you've got legal procedures for getting the consent and that's great. You know, one of the things, Abhas, we're hearing right now in terms of compliance requirements that are coming along, maybe not a part of GDPR directly yet, but related to it, is the whole notion of algorithmic transparency. 
As you build machine learning models and these machine learning models are driven into working applications, being able to transparently identify if those models make, in particular, let's say autonomous action based on particular data and particular variables, and then there is some nasty consequences like crashing an autonomous vehicle, the ability, they call it explainable AI, to roll that back and determine who's liable for that event. Does Hortonworks have any capability within your portfolio to enable more transparency into the algorithmic underpinnings of a given decision? Is that something that you enable in your solutions or that your partner IBM enables through DSX and so forth? Give us a sense whether that's a capability currently that you guys offer and whether that's something in terms of your understanding, are customers asking for that yet or is that too futuristic? >> So I would say that it's a two-part question. >> James: Yeah. >> The first one, yes, there are multiple regulations coming in, like Vilica Financial Markets, there's Mid Fair, the BCBS, etc. and organizations have to comply. You've got the IFRS which span to brokers, the insurance, etc., etc. So, yes, a lot of organizations across industries are getting affected by compliance use cases. Where does Hortonworks come into the picture is to be able to be compliant from a data standpoint, A, you need to be able to identify which of those data sources you need to implement a particular use case. B, you need to get them to a certain point whereby you can do analytics on that. And then there's the whole storage and processing and all of that. 
But also which you might have heard at the keynote today, from a cloud perspective, it's starting to get more and more complex because everyone's moving to the cloud which means, if you look at any large multi-national organization, most of them have a hybrid cloud structure because they work with two or three cloud vendors which makes the process even more complex because now you have multiple clusters, you have on premise and you have multiple different IT systems who need to talk to each other. Which is where the Hortonworks data plane services come into the picture because it gives you a unified view of your global data assets. >> James: Yes. >> Think of it like a single pane of glass whereby you can do security and governance across all data assets. So from those angles, yes, we definitely enable those use cases which will help with compliance. >> Making the case to the customer for a big data catalog along the lines of what you guys offer, in making the case, there's a lot of upfront data architectural work that needs to be done to get all your data assets into shape within the context of the catalog. How do they justify making that expense in terms of hiring the people, the data architects and so forth needed to put it all in shape? I mean, how long does it take before you can really stand up in your working data catalog in most companies? >> So again, you've asked two questions. First of all is how do they justify it? Which is where we say that the platform is a means to an end. It's enabling you to deliver use cases. So I look at it in terms of five key value drivers. Either it's a risk reduction or it's a cost reduction or it's a cost avoidance. >> James: Okay. >> Or it's a revenue optimization, or it's time to market. Against each one of these value drivers, or multiple of them or a combination of them, each of the use cases that you're delivering on the platform will lead you to benefits around that. 
My job, obviously, is to work with the customers and executives to understand what will that be, to quantify the potential impact which will then form the basis and give my customer champions enough ammunition so that they can go back and justify those investments. >> James: Abhas, we're going to have to cut it short, but I'm going to let you finish your point here, but we have to end this segment so go ahead. >> That's fine. >> Okay, well, anyway, we have had Abhas Ricky who is the director of strategy and innovation at Hortonworks. We're here at Dataworks Summit Berlin. And thank you very much. Sorry to cut it short, but we have to move to the next guest. >> No worries, pleasure, thank you very much. >> Take care, have a good one. >> Thanks a lot, yes. (upbeat music)
SUMMARY :
Brought to you by Hortonworks. and you probably will need to make an investment to comply, Yeah, so if you look at the value part which you said, the various portraits and if you look at various issues, So some of the provisions, as you know, the customer has a right to withhold consent for you you need to get your data sets onto a particular platform the way they are coming from and which data can you use. and need to use that data, you know exactly come into the picture because it gives you which whereby you can do security and governance a big data catalog along the lines of what you guys offer, the platform is a means to an end. will lead you to benefits around that. but I'm going to let you finish your point here, And thank you very much Thanks a lot, yes.
ENTITIES
Entity | Category | Confidence |
---|---|---|
James | PERSON | 0.99+ |
James Kobielus | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
Berlin | LOCATION | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
two questions | QUANTITY | 0.99+ |
BCBS | ORGANIZATION | 0.99+ |
two-part | QUANTITY | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
Abhas | PERSON | 0.99+ |
Gardner | PERSON | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
15% | QUANTITY | 0.99+ |
two elements | QUANTITY | 0.99+ |
Vilica Financial Markets | ORGANIZATION | 0.99+ |
each | QUANTITY | 0.99+ |
Abhas Ricky | PERSON | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
GDPR | TITLE | 0.99+ |
May 25th | DATE | 0.99+ |
today | DATE | 0.98+ |
First | QUANTITY | 0.98+ |
Berlin, Germany | LOCATION | 0.98+ |
Dataworks Summit 2018 | EVENT | 0.98+ |
one | QUANTITY | 0.98+ |
first | QUANTITY | 0.97+ |
first one | QUANTITY | 0.97+ |
single | QUANTITY | 0.97+ |
Dataworks Summit | EVENT | 0.96+ |
five weeks | QUANTITY | 0.95+ |
five key value drivers | QUANTITY | 0.95+ |
first thing | QUANTITY | 0.95+ |
Wikibon | ORGANIZATION | 0.95+ |
one single opportunity | QUANTITY | 0.93+ |
single pane | QUANTITY | 0.91+ |
McKinsey | ORGANIZATION | 0.9+ |
CUBE | ORGANIZATION | 0.9+ |
Mid Fair | ORGANIZATION | 0.89+ |
three cloud vendors | QUANTITY | 0.89+ |
IFRS | TITLE | 0.87+ |
each one | QUANTITY | 0.87+ |
Dataworks Summit Europe 2018 | EVENT | 0.86+ |
DSX | TITLE | 0.8+ |
Hortonwork | ORGANIZATION | 0.78+ |
next 20 minutes | DATE | 0.72+ |
Keynote Analysis | Dataworks Summit 2018
>> Narrator: From Berlin, Germany, it's theCUBE! Covering DataWorks Summit, Europe 2018. (upbeat music) Brought to you by Hortonworks. (upbeat music) >> Hello, and welcome to theCUBE. I'm James Kobielus. I'm the lead analyst for Big Data analytics in the Wikibon team of SiliconANGLE Media, and we're here at DataWorks Summit 2018 in Berlin, Germany. And it's an excellent event, and we are here for two days of hard-hitting interviews with industry experts focused on the hot issues facing customers, enterprises, in Europe and the world over, related to the management of data and analytics. And what's super hot this year, and it will remain hot as an issue, is data privacy and privacy protection. Five weeks from now, a new regulation of the European Union called the General Data Protection Regulation takes effect, and it's a mandate that is affecting any business that is not only based in the EU but that does business in the EU. It's coming fairly quickly, and enterprises on both sides of the Atlantic and really throughout the world are focused on GDPR compliance. So that's a hot issue that was discussed this morning in the keynote, and so what we're going to be doing over the next two days, we're going to be having experts from Hortonworks, the show's host, as well as IBM, Hortonworks is one of their lead partners, as well as a customer, Munich Re, will appear on theCUBE and I'll be interviewing them about not just GDPR but really the trends facing the Big Data industry. Hadoop, of course, Hortonworks got started about seven years ago as one of the solution providers that was focused on commercializing the open source Hadoop code base, and they've come quite a ways. They've had their recent financials were very good. They continue to rock 'n' roll on the growth side and customer acquisitions and deal sizes. So we'll be talking a little bit later to Scott Gnau, their chief technology officer, who did the core keynote this morning. 
He'll be talking not only about how the business is doing but about a new product announcement, the Data Steward Studio that Hortonworks announced overnight. It is directly related to or useful, this new solution, for GDPR compliance, and we'll ask Scott to bring us more insight there. But what we'll be doing over the next two days is extracting signal from noise. The Big Data space continues to grow and develop. Hadoop has been around for a number of years now, but in many ways it's been superseded in the agenda as the priorities of enterprises that are building applications from data by some newer primarily open source technology such as Apache Spark, TensorFlow for building deep learning and so forth. We'll be discussing the trends towards the deepening of the open source data analytics stack with our guests. We'll be talking with a European based reinsurance company, Munich Re, about the data lake that they have built for their internal operations, and we'll be asking their, Andres Kohlmaier, their lead of data engineering, to discuss how they're using it, how they're managing their data lake, and possibly to give us some insight about how it will serve them in achieving GDPR compliance and sustaining it going forward. So what we will be doing is that we'll be looking at trends, not just in compliance, not just in the underlying technologies, but the applications that Hadoop and Spark and so forth, these technologies are being used for, and the applications are really, the same initiatives in Europe are world-wide in terms of what enterprises are doing. They're moving away from Big Data environments built primarily on data at rest, that's where Hadoop has been, the sweet spot, towards more streaming architectures. And so Hortonworks, as I said the show's host, has been going more deeply towards streaming architectures with its investments in NiFi and so forth. We'll be asking them to give us some insight about where they're going with that. 
We'll also be looking at the growth of multi-cloud Big Data environments. What we're seeing is that there's a trend in the marketplace away from predominately premises-based Big Data platforms towards public cloud-based Big Data platforms. And so Hortonworks, they are partners with a number of the public cloud providers, the IBM that I mentioned. They've also got partnerships with Microsoft Azure, with Amazon Web Services, with Google and so forth. We'll be asking our guests to give us some insight about where they're going in terms of their support for multi-clouds, support for edge computing, analytics, and the internet of things. Big Data increasingly is evolving towards more of a focus on serving applications at the edge like mobile devices that have autonomous smarts like for self-driving vehicles. Big Data is critically important for feeding, for modeling and building the AI needed to power the intelligent endpoints. Not just self-driving cars but intelligent appliances, conversational user interfaces for mobile devices for our consumer appliances like, you know, Amazon's got their Alexa, Apple's got their Siri and so forth. So we'll be looking at those trends as well towards pushing more of that intelligence towards the edge and the power and the role of Big Data and data driven algorithms, like machine learning, in driving those kinds of applications. So what we see at Wikibon, the team that I'm embedded within, we have published just recently our updated forecast for the Big Data analytics market, and we've identified key trends that are... revolutionizing and disrupting and changing the market for Big Data analytics. So among the core trends, I mentioned the move towards multi-clouds. 
The move towards more public cloud-based Big Data environments in the enterprise, I'll be asking Hortonworks, who of course built their business and their revenue stream primarily on on-premises deployments, to give us a sense for how they plan to evolve as a business as their customers move towards more public cloud facing deployments. And IBM, of course, will be here in force. We have tomorrow, which is a Thursday. We have several representatives from IBM to talk about their initiatives and partnerships with Hortonworks and others in the area of metadata management, in the area of machine learning and AI development tools and collaboration platforms. We'll be also discussing the push by IBM and Hortonworks to enable greater depths of governance applied to enterprise deployments of Big Data, both data governance, which is an area where Hortonworks and IBM as partners have achieved a lot of traction in terms of recognition among the pace setters in data governance in the multi-cloud, unstructured, Big Data environments, but also model governance. The governing, the version controls and so forth of machine learning and AI models. Model governance is a huge push by enterprises who increasingly are doing data science, which is what machine learning is all about. Taking that competency, that practice, and turning it into more of an industrialized pipeline of building and training and deploying into an operational environment, a steady stream of machine-learning models into multiple applications, you know, edge applications, conversational UIs, search engines, eCommerce environments that are driven increasingly by machine learning that's able to process Big Data in real time and deliver next best actions and so forth, more intelligence into all applications. 
So we'll be asking Hortonworks and IBM to net out where they're going with their partnership in terms of enabling a multi-layered governance environment to enable this pipeline, this machine-learning pipeline, this data science pipeline, to be deployed as an operational capability into more organizations. Also, one of the areas where I'll be probing our guests is automation in the machine learning pipeline. That's been a hot theme that Wikibon has seen in our research. A lot of vendors in the data science arena are adding automation capabilities to their machine-learning tools. Automation is critically important for productivity. Data scientists as a discipline are in limited supply. I mean experienced, trained, seasoned data scientists fetch a high price. There aren't that many of them, so more of the work they do needs to be automated. It can be automated by a mature tool, increasingly mature tools on the market, a growing range of vendors. I'll be asking IBM and Hortonworks to net out where they're going with automation inside of their Big Data, their machine learning tools and partnerships going forward. So really what we're going to be doing over the next few days is looking at these trends, but it's going to come back down to GDPR as a core envelope that many companies attending this event, DataWorks Summit, Berlin, are facing. So I'm James Kobielus with theCUBE. Thank you very much for joining us, and we look forward to starting our interviews in just a little while. Our first up will be Scott Gnau from Hortonworks. Thank you very much. (upbeat music)
SUMMARY :
Brought to you by Hortonworks. and enterprises on both sides of the Atlantic
ENTITIES
Entity | Category | Confidence |
---|---|---|
James Kobielus | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Scott Gnau | PERSON | 0.99+ |
Andres Kohlmaier | PERSON | 0.99+ |
Apple | ORGANIZATION | 0.99+ |
European Union | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
Scott | PERSON | 0.99+ |
ORGANIZATION | 0.99+ | |
Amazon Web Services | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
two days | QUANTITY | 0.99+ |
Munich Re | ORGANIZATION | 0.99+ |
Thursday | DATE | 0.99+ |
Siri | TITLE | 0.99+ |
GDPR | TITLE | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
Berlin, Germany | LOCATION | 0.99+ |
Wikibon | ORGANIZATION | 0.99+ |
first | QUANTITY | 0.99+ |
Data Steward Studio | ORGANIZATION | 0.98+ |
both | QUANTITY | 0.98+ |
tomorrow | DATE | 0.98+ |
DataWorks Summit | EVENT | 0.98+ |
Atlantic | LOCATION | 0.98+ |
one | QUANTITY | 0.98+ |
Berlin | LOCATION | 0.98+ |
both sides | QUANTITY | 0.97+ |
DataWorks Summit 2018 | EVENT | 0.97+ |
Apache | ORGANIZATION | 0.96+ |
Hadoop | TITLE | 0.95+ |
Alexa | TITLE | 0.94+ |
this year | DATE | 0.94+ |
Spark | TITLE | 0.92+ |
2018 | EVENT | 0.91+ |
EU | ORGANIZATION | 0.91+ |
Dataworks Summit 2018 | EVENT | 0.88+ |
TensorFlow | ORGANIZATION | 0.81+ |
this morning | DATE | 0.77+ |
about seven years ago | DATE | 0.76+ |
Azure | TITLE | 0.7+ |
next two days | DATE | 0.68+ |
Five weeks | QUANTITY | 0.62+ |
NiFi | TITLE | 0.59+ |
European | LOCATION | 0.59+ |
theCUBE | ORGANIZATION | 0.58+ |
Dr. Tendu Yogurtcu, Syncsort | Big Data SV 2018
>> Announcer: Live from San Jose, it's theCUBE. Presenting Big Data Silicon Valley, brought to you by Silicon Angle Media and its ecosystem partners. >> Welcome back to theCUBE. We are live in San Jose at our event, Big Data SV. I'm Lisa Martin, my co-host is George Gilbert and we are down the street from the Strata Data Conference. We are at a really cool venue: Forager Eatery Tasting Room. Come down and join us, hang out with us, we've got a cocktail par-tay tonight. We also have an interesting briefing from our analysts on big data trends tomorrow morning. I want to welcome back to theCUBE now one of our CUBE VIPs and alumna Tendu Yogurtcu, the CTO at Syncsort, welcome back. >> Thank you. Hello Lisa, hi George, pleasure to be here. >> Yeah, it's our pleasure to have you back. So, what's going on at Syncsort, what are some of the big trends as CTO that you're seeing? >> In terms of the big trends that we are seeing, and Syncsort has grown a lot in the last 12 months, we actually doubled our revenue, it has been really a successful and organic growth path, and we have more than 7,000 customers now, so it's a great pool of customers that we are able to talk to and see the trends and how they are trying to adapt to the digital disruption and make data part of their core strategy. So data is no longer an enabler, and in all of the enterprise we are seeing data becoming the core strategy. This reflects in the four mega trends, they are all connected to enable business as well as operational analytics. Cloud is one, definitely. We are seeing more and more cloud adoption, even our financial services, healthcare, and banking customers now have a couple of clusters running in the cloud, in public cloud, multiple workloads, hybrid seems to be the new standard, and it comes with also challenges. IT governance as well as data governance is a major challenge, and also scoping and planning for the workloads in the cloud continues to be a challenge, as well. 
Our general strategy for all of the product portfolio is to have our products following our design once, deploy anywhere strategy. So whether it's a standalone environment on Linux or running on Hadoop or Spark, or running on premise or in the Cloud, regardless of the Cloud provider, we are enabling the same application, with no changes, to run in all of these environments, including hybrid. Then we are seeing the streaming trend, with the connected devices, with the digital disruption and so much data being generated, being able to stream and process data on the edge, with the Internet of things, and in order to address the use cases that Syncsort is focused on, we are really providing more on the Change Data Capture and near real-time and real-time data replication to the next generation analytics environments and big data environments. We launched last year our Change Data Capture, CDC, product offering with data integration, and we continue to strengthen that. With the Vision merger we have data replication, real-time data replication capabilities, and we are now seeing even the Kafka data bus becoming a consumer of this data. Not just keeping the data lake fresh, but really publishing the changes from a diverse set of multiple sources and publishing into a Kafka data bus and making it available for applications and analytics in the data pipeline. So the third trend we are seeing is around data science, and if you noticed, this morning's keynote was all about machine learning, artificial intelligence, deep learning, how do we make use of data science. And it was very interesting for me because we see everyone talking about the challenge of how do you prepare the data and how do you deliver the trusted data for machine learning and artificial intelligence use and deep learning. Because if you are using bad data, and creating your models based on bad data, then the insights you get are also impacted. 
We definitely offer our products, both on the data integration and data quality side, to prepare the data, cleanse, match, and deliver the trusted data set for data scientists and make their life easier. Another area of focus for 2018 is can we also add supervised learning to this, because with the premium quality domain experts that we have now in Syncsort, we have a lot of domain experts in the field, we can infuse the machine learning algorithms and connect data profiling capabilities we have with the data quality capabilities, recommending business rules for data scientists and helping them automate the mundane tasks with recommendations. And the last but not least trend is data governance, and data governance is almost an umbrella focus for everything we are doing at Syncsort because everything about the Cloud trend, the streaming, and the data science, and developing that next generation analytics environment for our customers depends on the data governance. It is, in fact, a business imperative, and the regulatory compliance use cases drives more importance today than governance. For example, General Data Protection Regulation in Europe, GDPR. >> Lisa: Just a few months away. >> Just a few months, May 2018, it is in the mind of every C-level executive. It's not just for European companies, but every enterprise has European data sourced in their environments. So compliance is a big driver of governance, and we look at governance in multiple aspects. Security and ensuring data is available in a secure way is one aspect, and delivering the high quality data, cleansing, matching, the example Hilary Mason this morning gave in the keynote about how the context matters in terms of searches of her name was very interesting because you really want to deliver that high quality data in the enterprise, a trusted data set, preparing that. 
Our Trillium Quality for big data, we launched in Q4, that product is generally available now, and actually we are in production with a very large deployment. So that's one area of focus. And the third area is how do you create visibility, the farm-to-table view of your data? >> Lisa: Yeah, that's the name of your talk! I love that. >> Yes, yes, thank you. So tomorrow I have a talk at 2:40, March 8th also, I'm so happy it's on the Women's Day that I'm talking-- >> Lisa: That's right, that's right! Get a farm-to-table view of your data is the name of your talk, track data lineage from source to analytics. Tell us a little bit more about that. >> It's all about creating more visibility, because for audit reasons, for understanding how many copies of my data are created, where my data has been, and who accessed it, creating that visibility is very important. And the last couple of years, we saw everyone was focused on how do I create a data lake and make my data accessible, break the data silos, and liberate my data from multiple platforms, legacy platforms that the enterprise might have. Once that happened, everybody started worrying about how do I create a consumable data set and how do I manage this data, because data has been on the legacy platforms like Mainframe, IBM i series, has been on relational data stores, it is in the Cloud, gravity of data originating in the Cloud is increasing, it's originating from mobile. Hadoop vendors like Hortonworks and Cloudera, they are creating visibility to what happens within the Hadoop framework. So we are deepening our integration with the Cloudera Navigator, that was our announcement last week. We already have integration both with Hortonworks and Cloudera Navigator, this is one step further where we actually publish what happened to every single granular level of data at the field level with all of the transformations that data have been through outside of the cluster. 
So that visibility is now published to Navigator itself, we also publish it through the RESTful API, so governance is a very strong and critical initiative for all of the businesses. And we are playing into the security aspect as well as the data lineage and tracking aspect and the quality aspect. >> So this sounds like an extremely capable infrastructure service, so that it's trusted data. But can you sell that to an economic buyer alone, or do you go in in conjunction with another solution like anti-money laundering for banks or, you know, what are the key things that they place enough value on that they would spend, you know, budget on it? >> Yes, absolutely. Usually the use cases might originate like anti-money laundering, which is very common, fraud detection, and it ties to getting a single view of an entity. Because in anti-money laundering, you want to understand the single view of your customer ultimately. So there is usually another solution that might be in the picture. We are providing the visibility of the data, as well as that single view of the entity, whether it's the customer view in this case or the product view in some of the use cases, by delivering the matching capabilities and the cleansing capabilities, the deduplication capabilities, in addition to accessing and integrating the data. >> When you go into a customer and, you know, recognizing that we still have tons of silos and we're realizing it's a lot harder to put everything in one repository, how do customers tell you they want to prioritize what they're bringing into the repository or even what do they want to work on that's continuously flowing in? >> So it depends on the business use case. And usually at the time that we are working with the customer, they selected that top priority use case. The risk here, and the anti-money laundering, or for insurance companies, we are seeing a trend, for example, building the data marketplace, as that centralized data marketplace concept. 
So depending on the business case, many of our insurance customers in the US, for example, they are creating the data marketplace and they are working with near real-time and microbatches. In Europe, Europe seems to be a bit ahead of the game in some cases, like Hadoop production was slow but certainly they went right into the streaming use cases. We are seeing more directly streaming and keeping it fresh and more utilization of Kafka and messaging frameworks and the data bus. >> And in that case, where they're sort of skipping the batch-oriented approach, how do they keep track of history? >> It's still, in most of the cases, microbatches, and the metadata is still associated with the data. So there is an analysis of what historically happened to that data. The tools, like ours and the vendors coming into the picture, keep track of that, basically. >> So, in other words, by knowing what happened operationally to the data, that paints a picture of a history. >> Exactly, exactly. >> Interesting. >> And for the governance we usually also partner, for example, we partner with the Collibra data platform, we partnered with ASG for creating those business rules and technical metadata and providing them to the business users, not just to the IT data infrastructure, and on the Hadoop side we partner with Cloudera and Hortonworks very closely to complete that picture for the customer, because nobody is just interested in what happened to the data in Hadoop or in Mainframe or in my relational data warehouse, they are really trying to see what's happening on Premise, in the Cloud, multiple clusters, traditional environments, legacy systems, and trying to get that big picture view. 
>> So on that, enabling a business to have that, we'll say in marketing, 360 degree view of data, knowing that there's so much potential for data to be analyzed to drive business decisions that might open up new business models, new revenue streams, increase profit, what are you seeing as a CTO of Syncsort when you go in to meet with a customer, data silos, when you're talking to a Chief Data Officer, what's the cultural, I guess, not shift but really journey that they have to go on to start opening up other organizations of the business, to have access to data so they really have that broader, 360 degree view? What's that cultural challenge that they have to, journey that they have to go on? >> Yes, Chief Data Officers are actually very good partners for us, because usually Chief Data Officers are trying to break the silos of data and make sure that the data is liberated for the business use cases. Still most of the time the infrastructure and the cluster, whether it's the deployment in the Cloud versus on Premise, it's owned by the IT infrastructure. And the lines of business are really the consumers and the clients of that. The CDO, in that sense, almost mediates and connects those lines of business with the IT infrastructure with the same goals for the business, right? They have to worry about the compliance, they have to worry about creating multiple copies of data, they have to worry about the security of the data and availability of the data, so CDOs actually help. So we are actually very good partners with the CDOs in that sense, and we also usually have the IT infrastructure owner in the room when we are talking with our customers because they have a big stake. They are like the gatekeepers of the data to make sure that it is accessed by the right... By the right folks in the business. >> Sounds like maybe they're in the role of like, good cop bad cop or maybe mediator. Well Tendu, I wish we had more time. 
Thanks so much for coming back to theCUBE and, like you said, you're speaking tomorrow at Strata Conference on International Women's Day: Get a farm-to-table view of your data. Love the title. >> Thank you. >> Good luck tomorrow, and we look forward to seeing you back on theCUBE. >> Thank you, I look forward to coming back and letting you know about more exciting both organic innovations and acquisitions. >> Alright, we look forward to that. We want to thank you for watching theCUBE, I'm Lisa Martin with my co-host George Gilbert. We are live at our event Big Data SV in San Jose. Come down and visit us, stick around, and we will be right back with our next guest after a short break. >> Tendu: Thank you. (upbeat music)
SUMMARY :
brought to you by Silicon Angle Media and we are down the street from the Strata Data Conference. Hello Lisa, hi George, pleasure to be here. Yeah, it's our pleasure to have you back. and in all of the enterprise we are seeing data and delivering the high quality data, Lisa: Yeah, that's the name of your talk! it's on the Women's Day that I'm talking-- is the name of your talk, track data lineage and make my data accessible, break the data silos, that they place enough value on that they would and the cleansing capabilities, the duplication So it depends on the business use case. It's still, in most of the cases, operationally to the data, that paints a picture And for the governance we usually also partner, and the cluster, whether it's the deployment Love the title. to seeing you back on theCUBE. and letting you know about more exciting and we will be right back with our next guest Tendu: Thank you.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Lisa Martin | PERSON | 0.99+ |
George | PERSON | 0.99+ |
May 2018 | DATE | 0.99+ |
George Gilbert | PERSON | 0.99+ |
Syncsort | ORGANIZATION | 0.99+ |
Lisa | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
US | LOCATION | 0.99+ |
Hilary Mason | PERSON | 0.99+ |
San Jose | LOCATION | 0.99+ |
ASG | ORGANIZATION | 0.99+ |
2018 | DATE | 0.99+ |
Tendu | PERSON | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
360 degree | QUANTITY | 0.99+ |
tomorrow | DATE | 0.99+ |
Collibra | ORGANIZATION | 0.99+ |
more than 7,000 customers | QUANTITY | 0.99+ |
last week | DATE | 0.99+ |
last year | DATE | 0.99+ |
tomorrow morning | DATE | 0.99+ |
one aspect | QUANTITY | 0.99+ |
third area | QUANTITY | 0.99+ |
Linux | TITLE | 0.99+ |
Cloud Navigator | TITLE | 0.99+ |
2:40 | DATE | 0.98+ |
Women's Day | EVENT | 0.98+ |
Tendu Yogurtcu | PERSON | 0.98+ |
GDPR | TITLE | 0.98+ |
Spark | TITLE | 0.97+ |
tonight | DATE | 0.97+ |
Big Data SV | EVENT | 0.97+ |
Kafka | TITLE | 0.97+ |
International Women's Day | EVENT | 0.97+ |
both | QUANTITY | 0.97+ |
CDC | ORGANIZATION | 0.96+ |
Navigator | TITLE | 0.96+ |
Strata Data Conference | EVENT | 0.96+ |
single view | QUANTITY | 0.96+ |
Hadoop | TITLE | 0.95+ |
third trend | QUANTITY | 0.95+ |
one step | QUANTITY | 0.95+ |
single view | QUANTITY | 0.95+ |
Dr. | PERSON | 0.94+ |
theCUBE | ORGANIZATION | 0.94+ |
CUBE | ORGANIZATION | 0.94+ |
this morning | DATE | 0.94+ |
Cloud | TITLE | 0.92+ |
last 12 months | DATE | 0.91+ |
Change Data Capture | ORGANIZATION | 0.9+ |
today | DATE | 0.9+ |
European | OTHER | 0.88+ |
last couple of years | DATE | 0.88+ |
General Data Protection Regulation in Europe | TITLE | 0.86+ |
Strata Conference | EVENT | 0.84+ |
one | QUANTITY | 0.83+ |
one repository | QUANTITY | 0.83+ |
tons of silos | QUANTITY | 0.82+ |
one area | QUANTITY | 0.82+ |
Q4 | DATE | 0.82+ |
Big Data SV 2018 | EVENT | 0.81+ |
four mega trends | QUANTITY | 0.76+ |
March 8th | DATE | 0.76+ |
Brian Brackeen, Kairos.com | Polycon 2018
(electronic theme music) >> Announcer: Live from Nassau in the Bahamas. It's the Cube. Covering Polycon 18. Brought to you by Polycom. >> Welcome to Nassau, everybody. This is the Cube, the leader in live tech coverage. And we're here at Polycon 18 in beautiful Bahamas, Nassau. Brian Brackeen is here, the CEO of Kairos. Brian, thanks for coming on. >> Thanks for having me. >> We just met this morning. >> Yep. >> I heard you up on the panel. So Kairos, first of all, I love the name. >> Thank you, thank you. >> Where's the name come from? >> It's Greek. >> Yeah, I thought so. >> It means, the most opportune moment. >> Love it. Okay, so you seize the opportune moment to do facial recognition. Everybody knows facial recognition from Facebook, but talk about what you guys bring to the table. >> Yeah and like you said. You've seen it from Facebook. The new iPhone has facial recognition. It's really all about identifying who someone is and verifying their identity. We use it for companies. Prior to doing this ICO stuff, we were an existing business, six years old. Mostly Fortune 500, Fortune 1,000 companies. We help retailers understand who's in the retail store. Their age, gender, ethnicities, their emotions, their feelings. We also help people like, even like school bus companies identify which kids are getting on the right bus. We help movie studios to understand how you feel about a film. So we've been in this industry some time. We think it's perfect for the blockchain. >> So there's a security angle there as well. >> Absolutely. >> As well as the fun on Facebook. What's the state of facial recognition technology? I'd love to hear from an expert. I've talked to some people who say, oh, it's nowhere near ready. And I'm like how can it not be ready? I go on Facebook, they tag me in an instant. (laughing) I go, no, don't tag me. Where are we in terms of the quality and efficacy of facial recognition? 
>> Yeah, we can find one person in a billion in about one third of a second. And we're about 99.8, 99.9% sure they are who we think they are. So definitely, the future is really now. >> Now you guys, unlike many companies who have either done an ICO or raised security tokens or done utility tokens, you guys are an established company. And then decided, so let's, but before we get into that. Give us the history of the company. You seized the moment and, how you got started and how you got here. >> Sure. My personal background, I'm from Philadelphia, originally. We were just talking about being an Eagles fan. >> Hey, congratulations. >> Thank you, thank you so much. Long time coming. >> The Eagles, a deserved win. It hurts me, being from Boston, but. (laughing) >> But we still get along. >> Yeah >> So worked in large corporations for most of my career. >> Comcast, IBM in Philly, took a job at Apple, just after the iPhone launch on through the iPad launch. Steve Jobs was still there. It was a period of exponential growth. It changed my life. And then I got the startup bug, and quit my job there. Which my parents thought was absolutely crazy. And started Kairos. First in San Francisco and then moved the company to Miami. We realized early on that facial recognition was the right direction, that helping companies to do it was a big idea. Essentially the market is anywhere or anyone that works with people. So thought it was a good and growing market. And we got into it deeply in the last three to four years or so. >> So a bit of a change up. I want to ask you, GDPR, the General Data Protection Regulation is coming, it's here but the fines and penalties go into effect in May of this year. I learned recently that pictures qualify for personally identifiable information. >> Correct. >> Has that been a tailwind for you? 
Have people come to you and say, hey, we need help because we, we're in the video business or whatever it is and we need help in case somebody needs to identify somebody. Is that a use case? >> Yeah, we think a lot about GDPR, a lot about it. As your viewers may know, that's really a European Union regulation. However, it kind of extends to people who, anybody doing business there. >> Dave: Right. >> Which is everyone in the US. (laughing) So it becomes almost like de facto US law, even though it's not a US law. There's a lot of concern about, because of facial recognition, your picture really becomes your identity. So how do we manage that. We're actually one of the first anonymous facial recognition companies in the world. We sometimes just let you know that it's the same person, but not who that person is. Protecting your anonymity and your individuality. >> Okay, is that where blockchain comes in? >> Exactly. >> Okay, let's pivot to that discussion, blockchain. Talk about the technology that allows me to own my own data, protect my own data, anonymize, how's that work? >> Absolutely. Let's say me and you were in a kind of friendly wager, if it's really a go right, on the Super Bowl, (laughing) right. And I, you lost the game, so now you owe me 20 ether. So you don't just want to send it to a random address. You want to make sure that, you know, it's really me. Because 20 ether is a decent amount of money these days, right. And so now you're going to use facial recognition transaction today. Only this face can unlock this transaction. Can open this ether and deposit into their wallet. You don't even know who I am, but just this face. And so I'm standing on the other side. I can say that I will only accept ether from this face. >> Right >> Yeah, it changes everything. >> And then the obvious question people are going to ask you, server address really, but how secured is that? You know, how hackable is that? 
Can I take a picture of somebody and then, you know, recreate, you know, that image? How do you, you know, thwart that? >> Yeah, yeah. A number of ways. Some things like you can take a picture of someone else and say hold it in front of the camera, that kind of thing. We have all kinds of anti fraud detection. So we can detect from the entrance of light, and because we can read emotions, is the person kind of really alive, are they feeling emotions or are they breathing. All kinds of technology we can use to verify someone's identity. >> Great. All right, let's get into the business of tokens. You chose to tokenize your business. Why does it make sense to tokenize your business? >> Yeah, and you know, you see this world, oftentimes people will write a white paper and say this is my idea. I appreciate that, but raising tens of millions of dollars sometimes, and never coming through on that idea, right. In our case, we were an existing business. We've already raised about $80 million in capital, you know, like a Series A, Series B, very traditional way. And we didn't think we could just go off and build a new division in Gibraltar or a different kind of exotic. I would say that we're in US space and we have US investors and venture capital investors. So we said, let's do this the right way. Let's create a security token. Completely SEC compliant. So let's just do this like another round. To completely tokenize the existing investors and the new investors. So we're all in the same boat. And we've seen great success because of it. >> Okay, and so the motivation for investors was equity. Motivation for the existing, preferred investors was liquidity. >> Liquidity. >> Okay, so you basically took those existing, preferred. They protected their ownership and you transferred them over to tokens. >> Transferred them over to tokens, yeah. Essentially, you don't lose any equity, right. But you gain liquidity. You're still in the business. 
You're long on Kairos, you can stay long on Kairos. If you want to take a little off the table, you can take a little off the table. It really changes overseas finance. >> Dave: And you're doing a utility token as well or no? >> We're doing a utility token as well. >> Dave: Okay. >> And with the utility token, we gave it away for free. Because then we say to the SEC or anyone else, look, we're not trying to profit or get investment from the utility tokens, that's why it's completely free. We're doing an SEC compliant token. >> And talk about the use cases for that utility token. How are people utilizing it and what's the value? >> So going back to our friendly bet for the 25 ether, when I click my face for the first time, when I give a scan, that costs one Tyro token. >> Right >> Now after that, to verify it, it's free. But to create your face the first time, it's a Tyro token. >> Let's see, okay, and then you guys charge a monthly subscription for your service, correct? >> For the blockchain service, no. We just do it, just face scan. >> Now right, okay. >> Yeah. >> But through your core business. >> For core business, monthly subscription, recurring revenue, absolutely. >> Excellent. I'll give you the last word. Kind of future, where's all this going? We're here at this investors conference. It's the first conference focused on security tokens? >> Yes, right. >> So, and you're a great example of that, of an existing company not a blank sheet of paper. >> Yeah. >> What's your outlook, you know, for the future of this industry, this ecosystem, this community? >> I'm literally like bubbling with excitement on the future. And it is, as you know, it's way tough for founders who are not based in San Francisco or Silicon Valley, to raise capital. This sort of democratizes that entire process. Now what you'll have is, somebody started in Miami or Portland or Boston, right. And first they would do a round of small investors, local VCs. Get their model together. 
Get their act right. Get some customers. Things start to work for the company. And then there, instead of trying to go to Silicon Valley, and beg them to invest, and maybe they won't just because of the location. Now, you do an ICO at that stage and make the folks in your community richer. They go off and do more things. Make better cities. It's really, really something great. >> Brian Brackeen, thanks very much. >> Thank you. >> For coming on the Cube. Really appreciate having you. >> Yup. >> Alright, keep it right there, buddy. We'll be back with our next guest right after this short break. This is Dave Vellante. You're watching the Cube. (electronic theme music)
SUMMARY :
Brought to you by Polycom. Brian Brackeen is here, the CEO of Kairos. So Kairos, first of all, I love the name. Okay, so you seize the opportune moment Yeah and like you said. As the fun on Facebook. So definitely, the future is really now. And then decided, so let's, but before we get into that. We were just talking about being an Eagles fan. Thank you, thank you so much. It hurts me from being from Boston, but. that helping companies to do it was a big idea. I want to ask you Have people come to you and say, However, it kind of extends to people who, We sometimes just let you know that it's the same person, Talk about the technology that allows me to own my own data, And I, you lost the game, so now you owe me 20 ether. and say hold it in front of the camera, that kind of thing. Why does it make sense to tokenize your business? Yeah, and you know, you see this world, Okay and so the motivation for them and you transferred them over to tokens. you can take a little off the table. from the utility tokens, that's why it's completely free. And talk about the use cases for that utility token. So going back to our friendly bet for the 25 ether, But to create your face the first time, it's a Tyro token. For the blockchain service, no. For core business, monthly subscription, It's the first conference focused on security tokens? So, and you're a great example of that, and make the folks in your community richer. For coming on the Cube. right after this short break.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Polycom | ORGANIZATION | 0.99+ |
Comcast | ORGANIZATION | 0.99+ |
Miami | LOCATION | 0.99+ |
Dave | PERSON | 0.99+ |
Brian Brackeen | PERSON | 0.99+ |
Boston | LOCATION | 0.99+ |
Portland | LOCATION | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
San Francisco | LOCATION | 0.99+ |
US | LOCATION | 0.99+ |
iPhone | COMMERCIAL_ITEM | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Bahamas | LOCATION | 0.99+ |
iPad | COMMERCIAL_ITEM | 0.99+ |
first time | QUANTITY | 0.99+ |
10 | QUANTITY | 0.99+ |
European Union | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
Phily | LOCATION | 0.99+ |
Kairos | ORGANIZATION | 0.99+ |
SEC | ORGANIZATION | 0.99+ |
about $80 million | QUANTITY | 0.99+ |
one person | QUANTITY | 0.99+ |
Philadelphia | LOCATION | 0.99+ |
Steve Job | PERSON | 0.99+ |
First | QUANTITY | 0.99+ |
Polycom 18 | ORGANIZATION | 0.99+ |
GDPR | TITLE | 0.98+ |
Apple | ORGANIZATION | 0.98+ |
millions of dollars | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
first | QUANTITY | 0.98+ |
four years | QUANTITY | 0.97+ |
first conference | QUANTITY | 0.97+ |
one | QUANTITY | 0.97+ |
Series A | COMMERCIAL_ITEM | 0.97+ |
Eagles | ORGANIZATION | 0.96+ |
Nassau | LOCATION | 0.94+ |
six years old | QUANTITY | 0.92+ |
Gibralter | LOCATION | 0.91+ |
about one third of a second | QUANTITY | 0.9+ |
Series B | COMMERCIAL_ITEM | 0.88+ |
Bahamas, Nassau | LOCATION | 0.87+ |
May of this year | DATE | 0.86+ |
about 99.8 | QUANTITY | 0.85+ |
Polycon | EVENT | 0.85+ |
20 | QUANTITY | 0.83+ |
this morning | DATE | 0.8+ |
a billion | QUANTITY | 0.79+ |
Kairos.com | ORGANIZATION | 0.72+ |
99.9% | QUANTITY | 0.72+ |
The Eagles | ORGANIZATION | 0.71+ |
1,000 companies | QUANTITY | 0.71+ |
2018 | DATE | 0.65+ |
three | QUANTITY | 0.6+ |
facial | QUANTITY | 0.6+ |
Greek | OTHER | 0.59+ |
Tyro | OTHER | 0.56+ |
25 ether | OTHER | 0.55+ |
Brain | PERSON | 0.54+ |
fortune | ORGANIZATION | 0.53+ |
Cube | COMMERCIAL_ITEM | 0.52+ |
ICS | ORGANIZATION | 0.51+ |
20 ether | OTHER | 0.47+ |
fortune 500 | ORGANIZATION | 0.45+ |
Tyro | COMMERCIAL_ITEM | 0.33+ |
Michelle Dennedy, Cisco | Data Privacy Day 2018
(screen switch sound) >> Hey, welcome back everybody. Jeff Frick here with theCUBE. We're at the place that you should be. Where is that, you say? LinkedIn's new downtown San Francisco headquarters at Data Privacy Day 2018. It's a small, but growing event. Talking, really a lot about privacy. You know we talk a lot about security all the time. But privacy is this kind of other piece of security and ironically it's often security that's used as a tool to kind of knock privacy down. So it's an interesting relationship. We're really excited to be joined by our first guest Michelle Dennedy. We had her on last year, terrific conversation. She's the Chief Privacy Officer at Cisco and a keynote speaker here. Michelle, great to see you again. >> Great to see you and happy privacy day. >> Thank you, thank you. So it's been a year, what has kind of changed on the landscape from a year ago? >> Well, we have this little thing called GDPR. >> Jeff: That's right. >> You know, it's this little old thing the General Data Protection Regulation. It's been, it was enacted almost two years ago. It will be enforced May 25, 2018. So everyone's getting ready. It's not Y2K, it's the beginning of a whole new era in data. >> But the potential penalties, direct penalties. Y2K had a lot of indirect penalties if the computers went down that night. But this has significant potential financial penalties that are spelled out very clearly. Multiples of revenue. >> Absolutely. >> So what are people doing? How are they getting ready? Obviously, the Y2K, great example. It was a scramble. No one really knew what was going to happen. So what are people doing to get ready for this? >> Yeah, I think it's, I like the analogy. It ends because January one, after 2000, we figured it out, right? Or it didn't happen because of our prep work. In this case, we have had 20 years of lead time.
1995, 1998, we had major pieces of legislation saying know thy data, know where it's going, value it and secure it, and make sure your users know where and what it is. We didn't do a whole lot about it. There are niche market people, like myself, who said "Oh my gosh, this is really important." but now the rest of the world has to wake up and pay attention because four percent of global turnover is not chump change in a multi-billion dollar business and in a small business it could be the only available revenue stream that you wanted to spend innovating-- >> Right, right >> rather than recovering. >> But the difficulty again, as we've talked about before is not as much the companies. I mean obviously the companies have a fiduciary responsibility. But the people-- >> Yes. >> On the end of the data, will hit the ULA as we talked about before without thinking about it. They're walking around sharing all this information. They're logging in to public WiFi's and we actually even just got a note at theCube the other day asking us what our impact, are we getting personal information when we're filming stuff that's going out live over the internet. So I think this is a kind of weird implication. >> I wish I could like feel sad for that but there's a part of my privacy soul that's like, "Yes! People should be asking. "What are you doing with my image after this? "How will you repurpose this video? "Who are my users looking at it?" I actually, I think it's difficult at first to get started. But once you know how to do it, it's like being a nutritionist and a chef all in one. Think about the days before nutrition labels for food. When it was first required, and very high penalties of the same quanta of the GDPR and some of these other Asiatic countries are the same, people simply didn't know what they were eating. >> Right. >> People couldn't take care of their health and look for gluten free, or vitamin E, or vitamin A, or omega whatever. Now, it's a differentiator. 
Now to get there, people had to test food. They had to understand sources. They had to look at organics and pesticides and say, "This is something that the populace wants." And look at the innovation and even something as basic and integral to your humanity as food now we're looking at what is the story that we're sharing with one another and can we put the same effort in to get the same benefits out. Putting together a nutrition label for your data, understanding the mechanisms, understanding the life cycle flow. It's everything and is it a pain in the tuckus sometimes? You betcha. Why do it? A: You're going to get punished if you don't. But more importantly, B: It's the gateway to innovation. >> Right. It's just funny. We talked to a gal in a security show and she's got 100% hit rate. She did this at Black Hat, social engineering access to anything. Basically by calling, being a sweetheart, asking the right questions and getting access to people's-- >> Exactly. >> So where does that fit in terms of the company responsibility, when they are putting everything, as much as they can in their place. Here like at AWS too you'll hear, "Somebody has a security breach at AWS." Well it wasn't the security of the AWS system, it was somebody didn't hit a toggle switch in the right position. >> That's right. >> So it's pretty complex versus if you're a food manufacturer, hopefully you have pretty good controls as to what you put in the food and then you can come back and define. So it's a really complicated problem when it's the users who you're trying to protect that are often the people that are causing the most problems. >> Absolutely. And every analogy has its failures right? >> Right, right. >> We'll stick with food for a while. >> Oh no I like the food one. >> Alright it's something you can understand. >> Y2K is kind of old, right. >> Yeah, yeah. But think about like, have we made, I was going to use a brand name, a spray on cheese chip, have we made that illegal?
That stuff is terrible for your body. We have an obesity crisis here in North America certainly, and other parts of the world, and yet we let people choose what they're putting into their bodies. At the same time we're educating consumers about what the new food chart should look like, we're listening to maybe sugar isn't as good as we thought it was, maybe fat isn't as bad. So giving people some modicum of control doesn't mean that people are always going to make the right choices but at least we give them a running chance by being able to test and separate and be accountable for at least what we put into the ingredients. >> Right, right, okay so what are some of the things you're working on at Cisco? I think you said before we go on the air you have a new report published, study, what's going on? >> I do, I'm ashamed Jeff to be so excited about data but, I'm excited about data. (laughter) >> Everybody's excited about data. >> Are they? >> Absolutely. >> Alright let's geek out for a moment. >> So what did you find out? >> So we actually did the first metrics reporting correlating data privacy maturity models and asking customers, 3,000 customers plus in 20 different countries from companies of all sizes, SMBs to very large corps, are you experiencing a slow down based on the fears of privacy and security problems? We found that 68 percent of these questions said yes indeed we are, and we asked them what is the average timing of slowing down closing business based on these fears. We found a big spread from over 16 and a half weeks all the way down to two weeks. We said that's interesting. We asked that same set of customers, where would you put yourself on a zero to five ad hoc to optimized privacy maturity model. What we found was if you were even middle of the road a three or a four, to having some awareness, having some basic tools, you can lower your risk of loss, by up to 70 percent.
I'm making it sound like it's causation, it's just a correlation but it was such a strong one that when we ran the data last year I didn't run the report, because we weren't sure enough. So we ran it again and got the same quantum with a larger sample size. So now I feel pretty confident that the self reporting of data maturity is related to closing business more efficiently and faster on the up side and limiting your losses on the down side. >> Right, so where are the holes? What's the easiest way to get from a zero or one to a three or a four, I don't even want to say three or four, two or three in terms of behaviors, actions, things that people do? >> So you're scratching on my geeky legal underbelly here. (laughter) I'm going to say it depends Jeff. >> Of course of course. >> Couching this and I'm not your lawyer. >> No forward-looking statements. >> No forward-looking statement. Well, for a reason what the heck. We're looking forward not back. It really does depend on your organization. So, Cisco, my company we are known for engineering. In fact on the down side of our brand, we're known for having trouble letting go until everything is perfect. So, sometimes it's slower than you want 'cause we want to be so perfect. In that culture my coming into the engineering with their bona fides and their pride in their brand, that's where I start to attack with privacy engineering education, and looking at specs and requirements for the products and services. So hitting my company where it lives in engineering was a great place to start to build in maturity. In a company like a large telco or healthcare or highly regulated industry, come from the legal aspect. Start with compliance if that's what is effective for your organization. >> Right, right. >> So look at where you are in your organization and then hit it there first, and then you can fill up, document those policies, make sure training is fun. Don't be afraid to embarrass yourself.
It's kind of my mantra these days. Be a storyteller, make it personal to your employees and your customers, and actually care. >> Right, hopefully, hopefully. >> It's a weird thing to say right, you actually should give a beep. >> Have a relationship with people. When you look at how companies moved that curve from last year to this year was it a significant movement? Was it more than you thought less than you thought? Is it appropriate for what's coming up? >> We haven't tracked individual companies time after time 'cause it's a double-blind study. So it's survey data. The survey numbers are statistically relevant. That when you have a greater level of less ad hoc and more routinized systems, more privacy policies that are stated and transparent, more tools and technologies that are implemented, measured, tested, and more board level engagement you start to see that even if you have a cyber risk the chances that it's over 500 thousand per event goes down precipitously. If you are at that kind of mid range level of maturity you can take off 70 percent of the lag time and go from about four months of closing a deal that has privacy and security implications to somewhere around two to three weeks. That's a lot of time. Time in business is everything. We run by the quarter. >> Yeah well if you don't sell it today, you never get today back. You might sell it tomorrow, but you never get today back. Alright so we just flipped the calendar. I can't believe it's 2018. That's a whole different conversation. (laughter) What are your priorities for 2018 as you look forward? >> Oh my gosh. I am hungry for privacy engineering to become a non niche topic. We're going out to universities. We're going out to high schools. We're doing innovation challenges within Cisco to make innovating around data a competitive advantage for everyone, and come up with a common language.
So that if you're a user interface guy you're thinking about data control and the stories that you're telling about what the real value is behind your thing. If you are a compliance guy or girl, how do I efficiently measure? How do I come back again in three months without having compliance fatigue, because after the first couple days of enforcement of GDPR and some of these other laws come into force it's really easy to say whew, it didn't hit me. I've got no problem now. >> Right. >> That is not the attitude I want people to take. I want them to take real ownership over this information. >> It's very analogous to what's happening in security. >> Very much so. >> Just baking it in all the way. It's not a walled garden. You can't defend the perimeter anymore, but it's got to be baked into everything. >> It's no mistake that it's like the security world. They're about 25 years ahead of us in data privacy and protection. My boss is our chief trust officer who formerly was our CISO. I am absolutely free riding on all the progress the security people have made. We're just really complementing each other's skills, and getting out into other parts of the business in addition to the technical part of the business. >> Exciting times. >> Yeah, it's going to be fun. >> Well great to catch up and >> Yeah you too. >> We'll let you go. Unfortunately we're out of time. We'll see you in 2019. >> Data Privacy Day. >> Data Privacy Day. She's Michelle Dennedy, I'm Jeff Frick. You're watching theCUBE. Thanks for tuning in from Data Privacy Day 2018. (music)
SUMMARY :
We're at the place that you should be. on the landscape from a year ago? it's the beginning of a whole new era in data. But the potential penalties, direct penalties. Obviously, the Y2k, great example. and in a small business it could be the only available is not as much the companies. They're logging in to public WiFi's and we actually even I actually, I think it's difficult at first to get started. But more importantly, B: It's the gateway to innovation. asking the right questions and getting access to people's-- in the right position. as to what you put in the food And every analogy has its failures right? and other parts of the world, and yet we let people I think you said before we go on the air you have a new So now I feel pretty confident that the self reporting I'm going to say it depends Jeff. In that culture my coming into the engineering with So look at where you are in your organization Was it more than you thought less than you thought? We run by the quarter. You might sell it tomorrow, but you never get today back. it's really easy to say whew, That is not the attitude I want people to take. Just baking it in all the way. and getting out into other parts of the business We'll see you in 2019. Thanks for tuning in from Data Privacy Day 2018.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Michelle Dennedy | PERSON | 0.99+ |
Jeff Frank | PERSON | 0.99+ |
Jeff | PERSON | 0.99+ |
May 25, 2018 | DATE | 0.99+ |
Jeff Frick | PERSON | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
100% | QUANTITY | 0.99+ |
2018 | DATE | 0.99+ |
1998 | DATE | 0.99+ |
20 years | QUANTITY | 0.99+ |
Y2K | ORGANIZATION | 0.99+ |
North America | LOCATION | 0.99+ |
70 percent | QUANTITY | 0.99+ |
Michelle | PERSON | 0.99+ |
1995 | DATE | 0.99+ |
tomorrow | DATE | 0.99+ |
2019 | DATE | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
last year | DATE | 0.99+ |
zero | QUANTITY | 0.99+ |
two weeks | QUANTITY | 0.99+ |
68 percent | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
four | QUANTITY | 0.99+ |
three | QUANTITY | 0.99+ |
GDPR | TITLE | 0.99+ |
3,000 customers | QUANTITY | 0.99+ |
four percent | QUANTITY | 0.99+ |
Y2k | ORGANIZATION | 0.99+ |
two | QUANTITY | 0.99+ |
January one | DATE | 0.99+ |
Data Privacy Day | EVENT | 0.99+ |
20 different countries | QUANTITY | 0.99+ |
this year | DATE | 0.99+ |
a year ago | DATE | 0.99+ |
three months | QUANTITY | 0.98+ |
five | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
Data Privacy Day 2018 | EVENT | 0.98+ |
about four months | QUANTITY | 0.98+ |
first guest | QUANTITY | 0.97+ |
Linked-In | ORGANIZATION | 0.97+ |
first couple days | QUANTITY | 0.97+ |
up to 70 percent | QUANTITY | 0.97+ |
first metrics | QUANTITY | 0.97+ |
three weeks | QUANTITY | 0.97+ |
over 16 and a half weeks | QUANTITY | 0.97+ |
first | QUANTITY | 0.97+ |
about 25 years | QUANTITY | 0.96+ |
multi-billion dollar | QUANTITY | 0.95+ |
San Francisco | LOCATION | 0.94+ |
theCube | ORGANIZATION | 0.94+ |
vitamin A | OTHER | 0.94+ |
around two | QUANTITY | 0.94+ |
2000 | DATE | 0.9+ |
over 500 thousand per event | QUANTITY | 0.9+ |
a year | QUANTITY | 0.87+ |
Black Hat | ORGANIZATION | 0.85+ |
two years ago | DATE | 0.85+ |
vitamin E | OTHER | 0.83+ |
theCUBE | ORGANIZATION | 0.78+ |
Asiatic | OTHER | 0.76+ |
double blind study | QUANTITY | 0.75+ |
telco | ORGANIZATION | 0.75+ |
almost | DATE | 0.67+ |
Privacy Officer | PERSON | 0.65+ |
ULA | ORGANIZATION | 0.63+ |
quarter | DATE | 0.53+ |
Wikibon Predictions Webinar with Slides
(upbeat music) >> Hi, welcome to this year's Annual Wikibon Predictions. This is our 2018 version. Last year, we had a very successful webinar describing what we thought was going to happen in 2017 and beyond and we've assembled a team to do the same thing again this year. I'm very excited to be joined by the folks listed here on the screen. My name is Peter Burris. But with me is David Floyer, Jim Kobielus is remote. George Gilbert's here in our Palo Alto studio with me. Neil Raden is remote. David Vellante is here in the studio with me. And Stuart Miniman is back in our Marlboro office. So thank you analysts for attending and we look forward to a great teleconference today. Now what we're going to do over the course of the next 45 minutes or so is we're going to hit about 13 of the 22 predictions that we have for the coming year. So if you have additional questions, I want to reinforce this, if you have additional questions or things that don't get answered, if you're a client, give us a call. Reach out to us. We'll leave you with the contact information at the end of the session. But to start things off we just want to make sure that everybody understands where we're coming from. And let you know who is Wikibon. So Wikibon is a company that starts with the idea of what's important as to research communities. Communities are where the action is. Community is where the change is happening. And community is where the trends are being established. And so we use digital technologies like theCUBE, CrowdChat and others to really ensure that we are surfacing the best ideas that are in a community and making them available to our clients so that they can succeed successfully, they can be more successful in their endeavors. When we do that, our focus has always been on a very simple premise. And that is that we're moving to an era of digital business. For many people, digital business can mean virtually anything. For us it means something very specific.
To us, the difference between business and digital business is data. A digital business uses data to differentially create and keep a customer. So borrowing from what Peter Drucker said if the goal of business is to create customers and keep and sustain customers, the goal of digital business is to use data to do that. And that's going to inform an enormous number of conversations and an enormous number of decisions and strategies over the next few years. We specifically believe that all businesses are going to have to establish what we regard as the five core digital business capabilities. First, they're going to have to put in place concrete approaches to turning more data into work. It's not enough to just accrete data, to capture data or to move data around. You have to be very purposeful and planful in how you establish the means by which you turn that data into work so that you can create and keep more customers. Secondly, it's absolutely essential that we build kind of the three core technology issues here, technology capabilities of effectively doing a better job of capturing data and IoT and people, or internet of things and people, mobile computing for example, is going to be a crucial feature of that. You have to then once you capture that data, turn it into value. And we think this is the essence of what big data and in many respects AI is going to be all about. And then once you have the possibility, kind of the potential energy of that data in place, then you have to turn it into kinetic energy and generate work in your business through what we call systems of agency. Now, all of this is made possible by this significant transformation that happens to be conterminous with this transition to digital business. And that is the emergence of the cloud. The technology industry has always been defined by the problems it was able to solve, catalyzed by the characteristics of the technology that made it possible to solve them.
And cloud is crucial to almost all of the new types of problems that we're going to solve. So these are the five digital business capabilities that we're going to talk about, where we're going to have our predictions. Let's start first and foremost with this notion of turn more data into work. So our first prediction relates to how data governance is likely to change in a global basis. If we believe that we need to turn more data into work well, businesses haven't generally adopted many of the principles associated with those practices. They haven't optimized to do that better. They haven't elevated those concepts within the business as broadly and successfully as they have or as they should. We think that's going to change in part by the emergence of GDPR or the General Data Protection Regulation. It's going to go in full effect in May 2018. A lot has been written about it. A lot has been talked about. But our core issue ultimately is that the dictates associated with GDPR are going to elevate the conversation on a global basis. And it mandates something that's now called the data protection officer. We're going to talk about that in a second David Vellante. But it is going to have real teeth. So we were talking with one chief privacy officer not too long ago who suggested that had the Equifax breach occurred under the rules of GDPR that the actual fines that would have been levied would have been in excess of 160 billion dollars which is a little bit more than the zero dollars that has been fined thus far. Now we've seen new bills introduced in Congress but ultimately our observation and our conversations with a lot of data chief privacy officers or data protection officers is that in the B2B world, GDPR is going to strongly influence not just our businesses behavior regarding data in Europe but on a global basis.
Now that has an enormous implication David Vellante because it certainly suggests this notion of a data protection officer is something now we've got another potential chief here. How do we think that's going to organize itself over the course of the next few years? >> Well thank you Peter. There are a lot of chiefs (laughs) in the house and sometimes it gets confusing as the CIO, there's the CDO and that's either chief digital officer or chief data officer. There's the CSO, could be strategy, sometimes that could be security. There's the CPO, is that privacy or product. As he says, it gets confusing sometimes. On theCUBE we talked to all of these roles so we wanted to try to add some clarity to that. First thing we want to say is that the CIO, the chief information officer, that role is not going away. A lot of people predict that, we think that's nonsense. They will continue to have a critical role. Digital transformations are the priority in organizations. And so the chief digital officer is evolving from more than just a strategy role to much more of an operation role. Generally speaking, these chiefs tend to report in our observation to the chief operating officer, president COO. And we see the chief digital officer as increasing operational responsibility aligning with the COO and getting incremental responsibility that's more operational in nature. So the prediction really is that the chief digital officer is going to emerge as a charismatic leader amongst these chiefs. And by 2022, nearly 50% of organizations will position the chief digital officer in a more prominent role than the CIO, the CISO, the CDO and the CPO. Those will still be critical roles. The CIO will be an enabler. The chief information security officer has a huge role obviously to play especially in terms of making security a team sport and not just falling on IT's shoulders or the security team's shoulders.
The chief data officer who really emerged from a records and data management role in many cases, particularly within regulated industries will still be responsible for that data architecture and data access working very closely with the emerging chief privacy officer and maybe even the chief data protection officer. Those roles will be pretty closely aligned. So again, these roles remain critical but the chief digital officer we see as increasing in prominence. >> Great, thank you very much David. So when we think about these two activities, what we're really describing is over the course of the next few years, we strongly believe that data will be regarded more as an asset within business and we'll see resources devoted to it and we'll see certainly management devoted to it. Now, that leads to the next set of questions as data becomes an asset, the pressure to acquire data becomes that much more acute. We believe strongly that IoT has an enormous implication longer term as a basis for thinking about how data gets acquired. Now, operational technology has been in place for a long time. We're not limiting ourselves just operational technology when we talk about this. We're really talking about the full range of devices that are going to provide and extend information and digital services out to consumers, out to the Edge, out to a number of other places. So let's start here. Over the course of the next few years, the Edge analytics are going to be an increasingly important feature overall of how technology decisions get made, how technology or digital business gets conceived and even ultimately how business gets defined. Now David Floyer's done a significant amount of work in this domain and we've provided that key finding on the right hand side. 
And what it shows is that if you take a look at an Edge based application, a stylized Edge based application and you presume that all the data moves back to a centralized cloud, you're going to increase your costs dramatically over a three year period. Now that moderates the idea or moderates the need ultimately for providing an approach to bringing greater autonomy, greater intelligence down to the Edge itself and we think that ultimately IoT and Edge analytics become increasingly synonymous. The challenge though is that as we evolve, while this has a pressure to keep more of the data at the Edge, that ultimately a lot of the data exhaust can someday become regarded as valuable data. And so as a consequence of that, there's still a countervailing pressure to try to still move all data not at the moment of automation but for modeling and integration purposes, back to some other location. The thing that's going to determine that is going to be the rate at which the costs of moving the data around go down. And our expectation is over the next few years when we think about the implications of some of the big cloud suppliers, Amazon, Google, others, that are building out significant networks to facilitate their business services may in fact have a greater impact on the common carriers or as great an impact on the common carriers as they have on any server or other infrastructure company. So our prediction over the next few years is watch what Amazon, watch what Google do as they try to drive costs down inside their networks because that will have an impact on how much data moves from the Edge back to the cloud. It won't have an impact necessarily on the need for automation at the Edge because latency doesn't change but it will have a cost impact. Now that leads to a second consideration and the second consideration is ultimately that when we talk about greater autonomy at the Edge we need to think about how that's going to play out. Jim Kobielus.
>> Jim: Hey thanks a lot Peter. Yeah, so what we're seeing at Wikibon is that more and more of the applications, more of the application development involves AI, and more and more of the AI involves deployment of those models, deep learning, machine learning and so forth, to the Edges of the internet of things and people. And much of that AI will be operating autonomously with little or no round-tripping back to the cloud. What that's causing, in fact, we're seeing really about a quarter of the AI development projects (static interference with web-conference) as Edge deployment. What that involves is that more and more of that AI will be, those applications will be bespoke. They'll be one of a kind, or unique or an unprecedented application, and what that means is that, you know, there's a lot of different deployment scenarios within which organizations will need to use new forms of learning to be able to ready that data, those AI applications, to do their jobs effectively, be it predictions in real time, guiding of an autonomous vehicle and so forth. Reinforcement learning is at the core of many of these kinds of projects, especially those that involve robotics. So really software is eating the world and, you know, the biggest bites are being taken out of the Edge. Much of that is AI, much of that is autonomous, where there is no need or less need for real time latency, in need of adaptive components, AI infused components, where they can learn by doing. From environmental variables, they can adapt their own algorithms to take the right actions. So, they'll have far reaching impacts on application development in 2018. For the developer, the new developer really is a data scientist at heart. They're going to have to tap into a new range of sources of data, especially Edge sourced data from the sensors on those devices. 
They're going to need to do commitment training and testing, especially reinforcement learning, which doesn't involve training data so much as it involves being able to build an algorithm that can learn to maximize what's called a cumulative reward function, and you do the training there adaptively in real time at the Edge and so forth and so on. So really, much of this will be bespoke in the sense that every Edge device increasingly will have its own set of parameters and its own set of objective functions which will need to be optimized. So that's one of the leading edge forces, trends, in development that we see in the coming year. Back to you Peter. >> Excellent Jim, thank you very much. The next question here: how are you going to create value from data? So we've gone through a couple trends and we have multiple others about what's going to happen at the Edge. But as we think about how we're going to create value from data, Neil Raden. >> Neil: You know, the problem is that data science emerged rapidly out of sort of a perfect storm of big data and cloud computing and so forth. And people who had been involved in quantitative methods, you know, rapidly glommed onto the title because it was, let's face it, it was very glamorous and paid very well. But there weren't really good best practices. So what we have in data science is a pretty wide field of things that are called data science. My opinion is that the true data scientists are people who are scientists and are involved in developing new or improving algorithms as opposed to prepping data and applying models. So the whole field really kind of generated very quickly, in really, just in a few years. To me, I called it generation zero, which is more like data prep and model management all done manually. And it wasn't really sustainable in most organizations, for obvious reasons. 
So generation one, then some vendors stepped up with tool kits or benchmarks or whatever for data scientists and made it a little better. And generation two is what we're going to see in 2018, is the need for data scientists to no longer prep data or at least not spend very much time with it. And not to do model management, because the software will not only manage the progression of the models but even recommend them and generate them and select the data and so forth. So it's in for a very big change and I think what you're going to see is that the ranks of data scientists are going to sort of bifurcate into old style, let me sit down and write some spaghetti code in R or Java or something, and those that use these advanced tool kits to really get the work done. >> That's great Neil and of course, when we start talking about getting the work done, we are becoming increasingly dependent upon tools, aren't we George? But the tool marketplace for data science, for big data, has been somewhat fragmented and fractured. And hasn't necessarily focused on solving the problems of the data scientists. But in many respects focusing on the problems that the tools themselves have. What's going to happen in the coming year when we start thinking about Neil's prescription? As the tools improve, what's going to happen to the tools? >> Okay so, the big thing that we see supporting what Neil's talking about, what Neil was talking about is partly a symptom of a product issue and a go to market issue, where the product issue was we had a lot of best of breed products that weren't all designed to fit together. In the broader big data space, that's the same issue that we faced more narrowly with on-prem Hadoop where, you know, we were trying to fit together a bunch of open source packages that had an admin and developer burden. 
More broadly, what Neil is talking about is sort of richer end to end tools that handle everything from the ingest all the way to the operationalization and feedback of the models. But part of what has to go on here is that with open source, these open source tools, the price point and the functional footprints that many of the vendors are supporting right now can't feed an enterprise sales force. Everyone talks with their open source business models about land and expand and inside sales. But the problem is once you want to go to wide deployment in an enterprise, you still need someone negotiating commercial terms at a senior level. You still need the technical people fitting the tools into a broader architecture. And most of the vendors that we have who are open source vendors today don't have either the product breadth or the deal size to support traditional enterprise software. An account team would typically carry a million and a half to two million quota every year, so we see consolidation, and the consolidation again driven by the need for simplicity for the admins and the developers and for business model reasons to support an enterprise sales force. >> All right, so what we're going to see happen in the course of the coming year is a lot of specialization and recognition of what is data science, what are the practices, how is it going to work, supported by an increasing quality of tools, and a lot of tool vendors are going to be left behind. Now the third kind of notion here for those core technology capabilities is we still have to enact based on data. The good news is that big data is starting to show some returns, in part because of some of the things that AI and other technologies are capable of doing. But we have to move beyond just creating the potential for, we have to turn that into work, and that's what we mean ultimately by this notion of systems of agency. 
The idea that data driven applications will increasingly act on behalf of a brand, on behalf of a company, and building those systems out is going to be crucial. It's going to have a whole new set of disciplines and expertise required. So when we think about what's going to be required, it always starts with this notion of AI. A lot of folks are presuming however, that AI is going to be relatively easy to build or relatively easy to put together. We have a different opinion George. What do we think is going to happen as these next few years unfold related to AI adoption in large enterprises? >> Okay so, let's go back to the lessons we learned from sort of the big data, the raw, you know, let's put a data lake in place, which was sort of the top of everyone's agenda for several years. The expectation was it was going to cure cancer, taste like chocolate and cost a dollar. And uh. (laughing) It didn't quite work out that way. Partly because we had a burden on the administrator again of so many tools that weren't all designed to fit together, even though they were distributed together. And then the data scientists, the guys who had to take all this data that wasn't carefully curated yet. And turn that into advanced analytics and machine learning models. We have many of the same problems now with tool sets that are becoming more integrated but at lower levels. This is partly what Neil Raden was just talking about. What we have to recognize is something that we see all along, I mean since the beginning of (laughs) corporate computing. We have different levels of abstraction and you know at the very bottom, when you're dealing with things like TensorFlow or MXNet, that's not for mainstream enterprises. That's for, you know, the big sophisticated tech companies who are building new algorithms on those frameworks. There's a level above that where you're using like a Spark cluster and the machine learning built into that. 
That's slightly more accessible, but when we talk about mainstream enterprises taking advantage of AI, the low hanging fruit is for them to use the pre-trained models that the public cloud vendors have created with all the consumer data on speech, image recognition, natural language processing. And then some of those capabilities can be further combined into applications like managing a contact center, and we'll see more from like Amazon, like recommendation engines, fulfillment optimization, pricing optimization. >> So our expectation ultimately George is that we're going to see a lot of this, a lot of AI adoption happen through existing applications, because the vendors that are capable of acquiring talent, experimenting, creating value, software vendors, are going to be where a lot of the talent ends up. So Neil, we have an example of that. Give us an example of what we think is going to happen in 2018 when we start thinking about exploiting AI and applications. >> Neil: I think that it's fairly clear to me that the application of what's called advanced analytics and data science and even machine learning is rapidly becoming commonplace in organizations, not just at the bottom of the triangle here. But I like the example of Salesforce.com. What they've done with Einstein is they've made machine learning and, I guess you can say, AI applications available to their customer base, and why is that a good thing? Because their customer base already has a giant database of clean data that they can use. So you're going to see a huge number of applications being built with Einstein against Salesforce.com data. But there's another thing to consider and that is a long time ago Salesforce.com built connectors to a zillion types of external data. So, if you're a Salesforce.com customer using Einstein, you're going to be able to use those advanced tools without knowing anything about how to train a machine learning model and start to build those things. 
And I think that they're going to lead the industry in that sense. That's going to push their revenue next year to, I don't know, 11 billion dollars or 12 billion dollars. >> Great, thanks Neil. All right so when we think about further evidence of this and further impacts, we ultimately have to consider some of the challenges associated with how we're going to create application value continually from these tools. And that leads to the idea that one of the cobbler's children that's going to gain or benefit from AI will in fact be the developer organization. Jim, what's our prediction for how auto-programming impacts development? >> Jim: Thank you very much Peter. Yeah, automation, wow. Auto-programming like I said is the epitome of enterprise application development for us going forward. People know it as code generation but that really understates the control of auto-programming as it's evolving. Within 2018, what we're going to see is machine learning driven code generation approaches coming to the forefront of innovation. We're seeing a lot of activity in the industry in which applications use ML to drive the productivity of developers for all kinds of applications. We're also seeing a fair amount of what's called RPA, robotic process automation. And really, how they differ is that ML will deliver or will drive code generation from what I call the inside out, meaning creating reams of code that are geared to optimize a particular application scenario. Versus RPA, which really takes the outside in approach, which is essentially the evolution of screen scraping, in that it's able to infer the underlying code needed for applications of various sorts from the external artifacts, the screens, and from sort of the flow of interactions and clicks and so forth for a given application. We're going to see that ML and RPA will complement each other in the next generation of auto-programming capabilities. 
And so, you know, really application development tedium is really the enemy of, one of the enemies of productivity (static interference with web-conference). This is a lot of work, very detailed painstaking work. And what they need is better, more nuanced and more adaptive auto-programming tools to be able to build the code at the pace that's absolutely necessary for this new environment of cloud computing. So really AI-related technologies can be applied and are being applied to application development productivity challenges of all sorts. AI is fundamental to RPA as well. We're seeing a fair number of the vendors in that space incorporate ML driven OCR and natural language processing and screen scraping and so forth into their core tools to be able to quickly build up the logic, be it to drive sort of the outside in automation of fairly complex orchestration scenarios. In 2018, we'll see more of these technologies come together. But you know, they're not a silver bullet. Because fundamentally, organizations that are considering going deeply down into auto-programming are going to have to factor AI into their overall plans. They need to get knowledgeable about AI. They're going to need to bring more AI specialists into their core development teams to be able to select from the growing range of tools that are out there, RPA and ML driven auto-programming. Overall, really what we're seeing is that the data scientists, who've been the fundamental developers of AI, they're coming into the core of development tools and skills in organizations. And they're going to be fundamental to this whole trend in 2018 and beyond. If AI gets proven out in auto-programming, these developers will then be able to evangelize the core utility of this technology, AI, in a variety of other backend but critically important investments that organizations will be making in 2018 and beyond. 
Especially in IT operations and in management, AI is big in that area as well. Back to you there, Peter. >> Yeah, we'll come to that a little bit later in the presentation Jim, that's a crucial point, but the other thing we want to note here regarding ultimately how folks will create value out of these technologies is to consider the simple question of okay, how much will developers need to know about infrastructure? And one of the big things we see happening is this notion of serverless. And here we've called it serverless, developer more. Jim, why don't you take us through why we think serverless is going to have a significant impact on the industry, at least certainly from a developer perspective and developer productivity perspective. >> Jim: Yeah, thanks. Serverless is really having an impact already and has for the last several years now. Now, many in the developer world are familiar with AWS Lambda, which is really the ground breaking public cloud service that incorporates the serverless capabilities, which essentially is an abstraction layer that enables developers to build stateless code that executes in a cloud environment, and to build microservices, without having to worry about the underlying management of containers and virtual machines and so forth. So in many ways, you know, serverless is a simplification strategy for developers. They don't have to worry about the underlying plumbing. They need to worry about the code, of course. What are called Lambda functions or functional methods and so forth. Now functional programming has been around for quite a while but now it's coming to the fore in this new era of serverless environments. What we're predicting for 2018 is that more than 50% of new microservices deployed in the public cloud will be deployed in serverless environments. There's AWS, and Microsoft has Azure Functions. IBM has their own. Google has their own. 
There's a variety of multiple serverless cloud code bases for private deployment of serverless environments that we're seeing evolving and beginning to deploy in 2018. They all involve functional programming which really, you know, when coupled with serverless clouds, enables greater scale and speed in terms of development. And it's very agile friendly in the sense that you can quickly Github a functionally programmed serverless microservice in a hurry without having to manage state and so forth. It's very DevOps friendly. In the very real sense it's a lot faster than having to build and manage and tune, you know, containers and VMs and so forth. So it can enable a more real time and rapid and iterative development pipeline going forward in cloud computing. And really fundamentally what serverless is doing is it's pushing more of these Lambda functions to the Edge, to the Edges. If you were at AWS re:Invent last week or the week before, you noticed AWS is putting a big push on putting Lambda functions at the Edge and in devices for the IoT, as we're going to see in 2018 across pretty much the entire cloud arena. Everybody will push more of the serverless, functional programming to the Edge devices. It's just a simplification strategy. And that actually is a powerful tool for speeding up some of the development metabolism. >> All right, so Jim let me jump in here and say that we've now introduced some of these benefits and really highlighted the role that the cloud is going to play. So, let's turn our attention to this question of cloud optimization. And Stu, I'm going to ask you to start us off by talking about what we mean by true private cloud and ultimately our prediction for private cloud. Why don't you take us through what we think is going to happen in this world of true private cloud? >> Stuart: Sure Peter, thanks a lot. 
So when Wikibon, when we launched the true private cloud terminology, which was about two years ago next week, it was in some ways a coming together of a lot of trends similar to things that, you know, George, Neil and James have been talking about. So, it is nothing new to say that we needed to simplify the IT stack. We all know, you know, the tried and true discussion of, you know, way too much of the budget is spent kind of keeping lights on. What we'd like to say is kind of running the business. If you squint through this beautiful chart that we have on here, a big piece of this is operational staffing, which is where we need to be able to make a significant change. And what we've been really excited about, and what led us to this initial market segment, and what we're continuing to see good growth on, is the move from traditional, really siloed infrastructure to, you know, infrastructure where it is software based. You want IT to really be able to focus on the application services that they're running. And our focus for this for 2018 is of course the central point, it's the data that matters here. The whole reason we have infrastructure is to be able to run applications, and one of the things that is a key determiner as to where and what I use is the data, and how can I not only store that data but actually gain value from that data. Something we've talked about time and again, and that is a major determining factor as to am I building this in a public cloud or am I doing it in, you know, my core? Is it something that is going to live on the Edge? So that's what we were saying here with the true private cloud: not only are we going to simplify our environment, it's really the operational model that we talked about. So we often say the line, cloud is not a destination. But it's an operational model. 
So a true private cloud is giving me some of the, you know, feel and management type of capability that I had had in the public cloud. It's, as I said, not just virtualization. It's much more than that. But how can I start getting services, and one of the extensions is true private cloud does not live in isolation. When we have kind of core, public cloud and Edge deployments, I need to think about the operational models: where data lives, what processing happens in those environments, and what data we'll need to move between them, and of course there's fundamental laws of physics that we need to consider in that. So, the prediction of course is that we know how much gear and focus has been on the traditional data center. And true private cloud helps that transformation to modernization, and the big focus is many of these applications we've been talking about and uses of data sets are starting to come into these true private cloud environments. So, you know, we've had discussions. There's Spark, there's modern databases. There's going to be many reasons why they might live in the private cloud environment. And therefore that's something that we're going to see tremendous growth and a lot of focus on. And we're seeing a new wave of companies that are focusing on this to deliver solutions that will do more than just a step function for infrastructure or get us outside of our silos. But really help us deliver on those cloud native applications where we pull in things like what Jim was talking about with serverless and the like. >> All right, so Stu, what that suggests ultimately is that data is going to dictate that everything's not going to end up in the private or in the public cloud or centralized public clouds, because of latency costs, data governance and IP protection reasons. And there will be some others. At bare minimum, that means that we're going to have in most large enterprises at least a couple of clouds. 
Talk to us about what this impact of multi cloud is going to look like over the course of the next few years. >> Stuart: Yeah, critical point there Peter. Because, right, unfortunately, we don't have one solution. There's nobody that we run into that says, oh, you know, I just do a single, you know, one environment. You know it would be great if we only had one application to worry about. But as you've done this lovely diagram here, we all use lots of SaaS and increasingly, you know, Oracle, Microsoft, Salesforce, you know, all pushing everybody to multiple SaaS environments, and that has major impacts on my security and where my data lives. Public cloud, no doubt, is growing at leaps and bounds. And many customers are choosing applications to live in different places. So just as in data centers, I would kind of look at it from an application standpoint and build up what I need. Often, there's, you know, Amazon doing phenomenal. But you know, maybe there's things that I'm doing with Azure. Maybe there's things that I'm doing with Google or others, as well as my service providers for locality, for, you know, specialized services; there's reasons why people are doing it. And what customers would love is an operational model that can actually span between those. So we are very early in trying to attack this multi cloud environment. There's everything from licensing to security to, you know, just operationally how do I manage those. And a piece of that that we were touching on in this prediction here is Kubernetes actually can be a key enabler for that cloud native environment. As Jim talked about with serverless, what we'd really like is our developer to be able to focus on building their application and not think as much about the underlying infrastructure, whether that be, you know, racked servers that I built myself or public cloud infrastructures. So we really want to think more at the data and application level. 
It's SaaS and PaaS that is the model, and Kubernetes holds the promise to solve a piece of this puzzle. Now Kubernetes is by no means a silver bullet for everything that we need. But it absolutely, it is doing very well. Our team was at the Linux, the CNCF show at KubeCon last week and there is, you know, broad adoption from over 40 of the leading providers, including Amazon is now a piece. Even Salesforce signed up to the CNCF. So Kubernetes is allowing me to be able to manage multi cloud workflows and therefore the prediction we have here Peter is that 50% of development teams will be building, sustaining multi cloud with Kubernetes as a foundational component of that. >> That's excellent Stu. But when we think about it, the hardware technology, especially because of the opportunities associated with true private cloud, the hardware technologies are also going to evolve. There will be enough money here to sustain that investment. David Floyer, we do see another architecture on the horizon where for certain classes of workloads, we will be able to collapse and replicate many of these things in an economical, practical way on premise. We call that UniGrid. NVMe over fabric is a crucial feature of UniGrid. >> Absolutely. So, NVMe takes, sorry, NVMe over fabric or NVMe-oF takes NVMe, which is out there as storage, and turns it into a system framework. It's a major change in system architecture. We call this UniGrid. And it's going to be a focus of our research in 2018. Vendors are already out there. This is the fastest movement from early standards into products themselves. You can see on the chart that IBM have come out with NVMe over fabrics with the 900 storage connected to the POWER9 systems. NetApp have the EF570. A lot of other companies are there. Mellanox is out there for networks, for high speed networks. Excelero has a major part of the storage software. So and it's going to be used in particular with things like AI. 
So what are the drivers and benefits of this architecture? The key is that data is the bottleneck for applications. We've talked about data. The amount of data is key to making applications more effective and higher value. So NVMe and NVMe over fabrics allows data to be accessed in microseconds as opposed to milliseconds. And it allows gigabytes of data per second as opposed to megabytes of data per second. And it also allows thousands of processors to access all of the data at very, very low latencies. And that gives us amazing parallelism. So what it's about is disaggregation of storage and network and processors. There are some huge benefits from that. Not least of which is you get back about 50% of the processor, because you don't have to do storage and networking on it. And you save from stranded storage. You save from stranded processor and networking capabilities. So overall, it's going to be cheaper. But more importantly, it makes it a basis for delivering systems of intelligence. And systems of intelligence are bringing together systems of record, the traditional systems, not rewriting them but attaching them to real time analytics, real time AI, and being able to blend those two systems together because you've got all of that additional data you can bring to bear on a particular problem. So systems themselves have reached pretty well the limit of human management. So, one of the great benefits of UniGrid is to have a single metadata layer for all of that data, all of those processes. >> Peter: All those infrastructure elements. >> All those infrastructure elements. >> Peter: And application. >> And applications themselves. So what that leads to is a huge potential to improve automation of the data center and the application of AI to operations, operational AI. >> So George, it sounds like it's going to be one of the key potential areas where we'll see AI be practically adopted within business. 
What do we think is going to happen here as we think about the role that AI is going to play in IT operations management? >> Well, if we go back to the analogy with big data, that we thought was going to, you know, cure cancer, taste like chocolate, cost a dollar, it turned out that the most widespread application of big data was to offload ETL from expensive data warehouses. And what we expect is the first widespread application of AI embedded in applications for horizontal use, where Neil mentioned Salesforce and the ability to use Einstein on Salesforce data and connected data. Now because the applications we're building are so complex that, as Stu mentioned, you know, we have this operational model with a true private cloud. It's actually not just the legacy stuff that's sucking up all the admin overhead. It's the complexity of the new applications, and the stringency of the SLAs means that we would have to turn millions of people into admins, the old, you know, when the telephone networks started, everyone's going to have to be an operator. The only way we can get past this is if we sort of apply machine learning to IT Ops and application performance management. The key here is that the models can learn how the infrastructure is laid out and how it operates. And they can also learn about how all the application services and middleware work, behaving independently and with each other, and how they tie in with the infrastructure. The reason that's important is because all of a sudden you can get very high fidelity root cause analysis. In the old management technology, if you had an underlying problem, you'd have a whole sort of storm of alerts, because there was no reliable way to really triangulate on or triage the root cause. 
Now, what's critical is if you have high fidelity root cause analysis, you can have really precise recommendations for remediation, or automated remediation, which is something that people will get comfortable with over time; that's not going to happen right away. But this is critical. And this is also the first large scale application of not just machine learning but machine data, and so this topology of collecting widely disparate machine data and then applying models and then reconfiguring the software, it's training wheels for IoT apps, where you're going to have it far more distributed and actuating devices instead of software. >> That's great, George. So let me sum up and then we'll take some questions. So very quickly, the action items that we have out of this overall session, and again, we have another 15 or so predictions that we didn't get to today. But one is, as we said, digital business is the use of data assets to compete. And so ultimately, this notion is starting to diffuse rapidly. We're seeing it on theCUBE. We're seeing it on the CrowdChats. We're seeing it in the inquiries of our customers. Ultimately, we believe that the users need to start preparing for even more business scrutiny over their technology management. For example, something very simple, and David Floyer, you and I have talked about this extensively in our weekly action item research meeting: the idea of backing up and restoring a system, in a digital business world, is no longer just backing up and restoring a system or an application. We're talking about restoring the entire business. That's going to require greater business scrutiny over technology management. It's going to lead to new organizational structures. New challenges of adopting systems, et cetera. 
But, ultimately, our observation is that data is going to indicate technology directions across the board whether we talk about how businesses evolve or the roles that technology takes in business or we talk about the key business capability, digital business capabilities, of capturing data, turning it into value and then turning it into work. Or whether we talk about how we think about cloud architecture and which organizations of cloud resources we're going to utilize. It all comes back to the role that data's going to play in helping us drive decisions. The last action item we want to put here before we get to the questions is clients, if we don't get to your question right now, contact us. Send us an inquiry. Support@silicongangle.freshdesk.com. And we'll respond to you as fast as we can over the course of the next day, two days, to try to answer your question. All right, David Vellante, you've been collecting some questions here. Why don't we see if we can take a couple of them before we close out. >> Yeah, we got about five or six minutes in the chat room, Jim Kobielus has been awesome helping out and so there's a lot of detailed answers there. The first, there's some questions and comments. The first one was, are there too many chiefs? And I guess, yeah. There's some title inflation. I guess my comment there would be titles are cheap, results aren't. So if you're creating chief X officers just to check a box, you're probably wasting money. So you've got to give them clear roles. But I think each of these chiefs has clear roles to the extent that they are you know empowered. Another comment came up which is we don't want you know, Hadoop spaghetti soup all over again. Well true that. Are we at risk of having Hadoop spaghetti soup as the centricity of big data moves from Hadoop to AI and ML and deep learning? >> Well, my answer is we are at risk of that but that there's customer pressure and vendor economic pressure to start consolidating. 
And we'll also see, what we didn't see in the on-prem big data era, with cloud vendors, they're just going to start making it easier to use some of the key services together. That's just natural. >> And I'll speak for Neil on this one too, very quickly, that the idea ultimately is as the discipline starts to mature, we won't have people that probably aren't really capable of doing some of this data science stuff, running around and buying a tool to try to supplement their knowledge and their experience. So, that's going to be another factor that I think ultimately leads to clarity in how we utilize these tools as we move into an AI oriented world. >> Okay, Jim is on mute so if you wouldn't mind unmuting him. There was a question, is ML a more informative way of describing AI? Jim, when you and I were in our Boston studio, I sort of asked a similar question. AI is sort of the uber category. Machine learning is math. Deep learning is a more sophisticated math. You have a detailed answer in the chat. But maybe you can give a brief summary. >> Jim: Sure, sure. I don't want to be too pedantic here but deep learning is essentially a lot more hierarchical, deeper stacks of neural network layers to be able to infer high-level abstractions from data, you know, face recognition, sentiment analysis and so forth. Machine learning is the broader phenomenon. That's simply a variety of different approaches for distilling patterns, correlations and algorithms from the data itself. What we've seen in the last five, six, ten years, let's say, is that all of the neural network approaches for AI have come to the forefront. And in fact, they're the core of the marketplace and the state of the art. AI is an ancient paradigm that's older than probably you or me that began and for the longest time was rules-based systems, expert systems. Those haven't gone away. 
The new era of AI we see as a combination of both statistical approaches as well as rules-based approaches, and possibly even orchestration-based approaches like graph models for building broader context for AI for a variety of applications, especially distributed Edge applications. >> Okay, thank you and then another question slash comment, AI like graphics in 1985, we move from a separate category to a core part of all apps. AI infused apps, again, Jim, you have a very detailed answer in the chat room but maybe you can give the summary version. >> Jim: Well quickly now, the most disruptive applications we see across the world, enterprise, consumer and so forth, the advantage involves AI. You know at the heart of it is machine learning, that's neural networking. I wouldn't say that every single application is doing AI. But the ones that are really blazing the trail in terms of changing the fabric of our lives very much, most of them have AI at their heart. That will continue as the state of the art of AI continues to advance. So really, one of the things we've been saying in our research at Wikibon is that the data scientists or those skills and tools are the nucleus of the next generation application developer, really in every sphere of our lives. >> Great, quick comment is we will be sending out these slides to all participants. We'll be posting these slides. So thank you Kip for that question. >> And very importantly Dave, over the course of the next few days, most of our predictions docs will be posted up on Wikibon and we'll do a summary of everything that we've talked about here. >> So now the questions are coming through fast and furious. But let me just try to rapid fire here 'cause we only got about a minute left. True private cloud definition. Just say this, we have a detailed definition that we can share but essentially it's substantially mimicking the public cloud experience on-prem. 
The way we like to say it is, bringing the cloud operating model to your data versus trying to force fit your business into the cloud. So we've got detailed definitions there that frankly are evolving. About PaaS, there's a question about PaaS. I think we have a prediction in one of our, you know, appendices predictions but maybe a quick word on PaaS. >> Yeah, very quick word on PaaS is that there's been an enormous amount of effort put on the idea of the PaaS marketplace. Cloud Foundry, others suggested that there would be a PaaS market that would evolve because you want to be able to effectively have mobility and migration and portability for this large cloud application. We're not seeing that happen necessarily but what we are seeing is that developers are increasingly becoming a force in dictating and driving cloud decision making and developers will start biasing their choices to the platforms that demonstrate that they have the best developer experience. So whether we call it PaaS, whether we call it something else. Providing the best developer experience is going to be really important to the future of the cloud marketplace. >> Okay great and then George, George O, George Gilbert, you'll follow up with George O with that other question we need some clarification on. There's a question, really David, I think it's for you. Will persistent DIMMs emerge first on public clouds? >> Almost certainly. But public clouds are where everything is going first. And when we talked about UniGrid, that's where it's going first. And then, the NVMe over Fabrics, that architecture is going to be in public clouds. And it has the same sort of benefits there. And NVDIMMs will again develop pretty rapidly as a part of the NVMe over Fabrics. >> Okay, we're out of time. We'll look through the chat and follow up with any other questions. Peter, back to you. >> Great, thanks very much Dave. So once again, we want to thank everybody here that has participated in the webinar today. 
I apologize for, I feel like Han Solo and saying it wasn't my fault. But having said that, nonetheless, I apologize to Neil Raden and everybody who had to deal with us finding and unmuting people but we hope you got a lot out of today's conversation. Look for those additional pieces of research on Wikibon, that pertain to the specific predictions on each of these different things that we're talking about. And by all means, Support@silicongangle.freshdesk.com, if you have an additional question but we will follow up with as many as we can from that significant list that's starting to queue up. So thank you very much. This closes out our webinar. We appreciate your time. We look forward to working with you more in 2018. (upbeat music)
Valentin Bercovici, PencilDATA | Cube Conversation with John Furrier
(light adventurous music) >> Hello everyone, welcome to theCUBE Studios here in Palo Alto. I'm John Furrier, the co-host of theCUBE, co-founder of SiliconANGLE Media. This is our CUBE Conversation Thought Leader Thursday and I'm here with Val Bercovici, who's the founder and CEO of a new startup called PencilDATA. Val, CUBE alumni, been on many times with NetApp and then a variety of other great startups, but now you're doing your own thing around cryptocurrency, blockchain, enterprise-like technical infrastructure. You've been a CTO, now entrepreneur, founder and CEO of PencilDATA. Congratulations, you're on the crypto wave, this wave is coming. >> I believe it's here. >> It's here. >> Timing couldn't be better. >> So, I interviewed Dr. Jian Wang who's the chairman of Alibaba's technology steering committee, also the founder of Alibaba Cloud, just recently in China. Presented by Intel, plug for Intel there, thanks Intel for supporting theCUBE. He said to me, and I put the clip out on Twitter, natively on the video clip, which was, I asked him about blockchain, you know China, they blocked the ICOs, he said, "Blockchain is fundamental, part of the Internet. "It's as fundamental as TCP/IP was." This is the nuance that is attracting a lot of tier one entrepreneurs. Obviously the money side is hyped up beyond all recognition right now. As Don Klein on our team was saying, "It's melting up in terms of hype." But this really speaks to the transformation of the web, and the Internet now, the web is the Internet, from distributed and decentralized. This is a big sea change. Kind of building on the fundamentals of the internet, formerly called the information superhighway, before the web came along, but the web was designed to withstand nuclear disaster, be resilient, be decentralized. 
>> It reminds me of Back to the Future in many, many ways, because if you're as old as we are, you remember those DARPA origins of the Internet and exactly that decentralized nature, and we've gone away from that, right? As Tim Berners-Lee brought on the HTTP protocol, we've had web protocols, and as major, the FANG vendors have really dominated their usage of that existing layer of technology we've gone away, we've gone to a very, very centralized approach, which as we're seeing with the tech hearings this week, carries all sorts of risks, it's not just business and legal and political. >> And you're referring to the senate hearings, where Facebook, Google, or Alphabet, and Twitter were in front of the senate committee, you're going to tell them about the Russians, the Russian political thing, but they're bringing up the issue of the role of these mega platforms that have all this data and the problem is that this is not what the users bargained for. I mean, I use Facebook as a free app, I love Facebook, Facebook, we love you, WhatsApp here and there, and Instagram, but you know, my bargain was simple. I'll use your free app and I'll let you use some of my data but now you're making billions, $10 billion quarter, fake news has infiltrated the country, I have a poor user experience every day, it's getting worse and worse, a lot of hate and division. This is not what I bargained for. >> Val: Exactly. >> So the world's kind of revolting against these mega-siloed platforms. >> That's the risk of having such centralized control of the technology. If you remember in the old days when Microsoft's dominance was rising, all you had to do was target Windows as a virus platform and you're able to impact thousands of businesses, even in the early Internet days, within hours. And it's the same thing happening right now, there's a weaponization of these social media platforms and Google's search engine technology and so forth. 
It's the same side effect now, the centralization of that control is the problem. One of the reasons I love the blockstack technology, and blockchain in general is the ability to decentralize these things right now, and the most passionate thing I care about nowadays is being driven out of Europe, where they have a lot more maturity in terms of handling these new scenarios. >> You mean the tech being driven out of Europe. >> The laws. >> The laws, okay. >> Being driven out of Europe. >> Be specific, we'd like an example. >> The major deadline that's coming up on May 25th of 2018 is GDPR, General Data Protection Regulation, where European citizens now in any company, American or otherwise, catering to European citizens, has to respond to things like the right to be forgotten request. You've got 24 hours, as a global corporation with European operations, to respond to European citizens', EU citizens', right to be forgotten request, where all the personally identifiable information, the PII, has to be removed and an audit trail, proving it's been removed, has to be gone from two, three hundred internal systems within 24 hours. And this has teeth by the way, it's not like the $2.7 billion fine that Google just flipped away casually, this has up to 4% of your global profits per incident where you don't meet that requirement. >> Well you bring up a good point, the GDPR is a good one, it has teeth and it's kind of in the weeds with the folks who might not know that regulation, but really it's about the privacy and the rights of the individual. But coming back to Facebook, to connect another dot is, what we're seeing with Facebook, Twitter, and Alphabet with the senate hearings is, and this is why the industry and the media is crumbling, publications are dying, the newspapers, the media's changing, is because knowing your customer is a really important thing. 
The people who want to be served need to have a closed loop with the publication, and these platforms are bogarting all the data, and so the right of the customer, the users are suffering, and that's what people are generally talking about. You know, personally, a guy can rent a truck and go mow people down in Manhattan, we should know who these people are, like the neighbors, so I think there's going to be a trend towards knowing who your neighbor is, knowing who the customers are, at a level that's not scary privacy violation, but we're going to know who the crazies are, we're going to know what's going on and then that's kind of out there, that's kind of my general feeling. But now, getting back to the impact. GDPR, these big mega platforms where the users are at the center of the value proposition, really comes down to the shift in user expectations around a decentralized Internet. That means agile goes to a whole other level. If I'm a user and I say, "Hey Facebook, "delete my digital exhaust or digital footprints "from Facebook over the past 10 years." I mean, that's hard to do. >> That's hard for them. >> That's not, technically is a really serious problem. >> And it's actually not just a technology challenge, I always love to go back to Conway's Law in these discussions, the org chart, you know, how information, infrastructure is budgeted for, and managed through various different departments within any large enterprise, data-savvy or not, is a challenge, as is coordinating these efforts, actually going beyond the talking phase, towards implementing a master data model. Those are the main challenges right now, and it's a movement that I believe now has political strength to actually migrate across the pond. 
Over here as well there's a groundswell movement called Digital Sovereignty as a response to GDPR in Europe, where people are realizing that they have the right to be sovereign over their data, their digital exhaust, their digital footprints online and that's a two-way street. You want and demand control over your data, but on the other hand your identity, which you control, has to be authentic as opposed to a fake identity, and your reputation has to be out there as well. >> These signals and these trends you were just referring to, to me are just like little tremors of the tectonic plates that are going to be changing, because if you look at the major shift in technology, let's take blockchain for instance, and look at the impact of a decentralized internet, now global, immutability with the ability now for more agile capability and not just permanent, "I want to erase things" that you're talking about, but three, the younger generation, if we look at what the young kids are doing, I have four kids, my oldest is 22, it's a gaming culture, right? It's a gaming culture, they're online all the time. They're not old like us, my son's like, "Dad, Google Search is for old people." I mean, that's a general sentiment, over-categorizing, but a combination of the new user experience, this younger generation, entrepreneurs and users, and these tremors we're seeing in the marketplace, signaling that, "hey Facebook, you might be too big for your britches," or, "hey Twitter, you got a bot problem, "hey all you gamers using Twitch," this is now a signal, where is it leading to? And where does blockchain in particular impact it? Because this is kind of where everything's converging to. >> So what I'd like to say right now is, you've got Marc Andreessen's premise that software is eating the world. If you extend that, data is feeding it, blockchain is valuing it, and it's AI that's automating it. 
So in my mind, particularly in my experience earlier this year in the AI industry, you realize that AI today really boils down to machine learning, which in itself boils down to deep learning, which boils down to data, your access to data. Professor Andrew Ng did this at the recent O'Reilly conference up in the city, he got up and lectured as the keynote instead of sharing slides and his number one, two, and three advice to everyone in the audience was, get the right datasets to train your model. If you don't have that you don't have a differentiated business, and that's what inspired PencilDATA, is my encountering of the cold start AI problem where the IP's in the public domain, public datasets are ubiquitous which is fantastic for academics, but as a business you can't differentiate unless you have access to the right datasets to train your models more specifically. >> Okay, as the founder and CEO of PencilDATA, that's your new startup, let's get into some of the reasons why you're starting it. What problem are you attacking? Obviously a pencil, I can see pencil and you erase things, it's got data... >> The internet is no longer written in ink, that's the premise. Now with Pencil you can erase some data. >> Well blockchain is immutable, so this is conflicting in my mind. Help me kind of rationalize this. The benefit of blockchain is everything's permanent, if you're on-chain as they say. >> Exactly. >> If you're off-chain, you could do some things. Is that kind of what we're getting at? >> We're mixing the best of both. So our premise is that again, whether you're an organization or an individual, you need to have, to survive in a new digital economy, control over your data. The blockchain part of it is the visibility side. If you don't know who's doing what to your data, you're far less likely to share it. 
And once you know who's doing what to your data, in an immutable blockchain, with a detailed audit trail, with strong authentication, of literally who's doing what to your data, gives you that visibility. Then you do what modern asset managers do. You can't really value an asset until you fully control it. And our premise is, you can't control something until you can take it back. So the notion of PencilDATA is the ability to go on-chain for the visibility and off-chain for managing data in encrypted containers, and if a data owner or publisher doesn't like how the subscriber's consuming their data, they have the power to revoke all downloaded copies. >> So is this kind of like a shadow blockchain model? I'm trying to find a mental model because I remember the old days back, I was breaking into the industry in the late '80s, early '90s, WORM drives, write once, read many. And you write it once, it's a laser, it was optical drives at the time. Also, demilitarized zones in networking was an area where there was a safe harbor kind of thing, where people could play around. What metaphor, what mental model can people take away from some of the things that you're trying to solve? Is it like a DMZ, is it like a-- >> The implementation's a lot like a DMZ and the business challenge and opportunity is that there's a lot of tension between protecting data, because we have an epidemic of data breaches right now, I think you're foolish if you're assuming that you haven't been breached yet but you might be, because everyone has been breached, personally and organizationally, so we have to deal with the rising need to protect data more and more. But at the same time, you can't stay in business if you don't optimize the monetization of the data you have. And so PencilDATA walks that fine line between letting you do both, letting you not just protect infrastructure, that's a whole other industry that we're not involved in, but literally protect data at the data level. 
If you look up terms like crypto anchor you'll see some of the technologies we're taking advantage of there. But being able to monetize data by unlocking all that latent value of data hidden behind firewalls. If you use a physics analogy of potential and kinetic energy, applied to data behind firewalls, there's hundreds of billions of dollars of value in latent data basically, potential data hiding behind firewalls, and when you can safely share it, give the owners control they've never had before, then you expose the value of that data for the first time. >> Alright, so let's take us through where you're at. Obviously super exciting, you're leveraging the blockchain and you've got an ICO, initial coin offering coming up but you're not just doing that for the sake of doing, there's a lot of scams out there, you're taking a little bit more of a pragmatic approach. Give us the status because you're the founder and CEO, what's the makeup of the team, how big are you guys, what are you guys looking for, obviously you're looking for team members most likely. >> We're looking for developers obviously. >> Where in the process are you? >> We are a two-month-old company. We're at the seed stage. And we've actually assembled a world-class team. You hear that a lot, but I'm really, really proud of the team members we have right now. >> World-class, are they from around the world and then they have class? Define world-class. >> They're worldly, like myself, I travel a lot. (laughter) An example, my chief privacy officer is Sheila Fitzpatrick, she's a worldwide recognized leader in data privacy, she's on many, many privacy boards in the US and EU and so forth, and she now is traveling nonstop lecturing on GDPR, itself specifically. 
She's one of those recognized-- >> Should you see yourself as a solution for GDPR, because that's, again, it does have teeth, I'll just say that we've been reporting on this through Wikibon, our research team as well as theCUBE, it comes up all the time and there's heavy fines associated with it, so it's not like- >> GDPR is the perfect use case because on the one hand, we have that audit trail that proves what you're doing with data. On the other hand we have a kill switch, that revocable use clause for data where you can literally comply with GDPR in minutes or seconds, as opposed to taking a full 24 hours to scour databases and delete selected records. >> Alright, so what about the product? Give us an example of the product. Will you be, first of all that's right around the corner, it's next year. >> Val: Yeah. >> I think it was a March or April's timeframe, I don't have the exact date but it's pretty soon. >> Public beta before the end of this year, version 1.0 first or second quarter next year. >> For you guys, PencilDATA. >> Yes. >> Clients, are you working with anyone right now, you have a handful? >> So we've actually got really interesting distribution partnerships that we're not in a position to announce right now but the top-tier brand name enterprise cloud vendors, both on the SaaS and infrastructure and database side, they're lining up to work with us. Because we're enabling amazing use cases in healthcare and life sciences, the ability to selectively share patient data with insurers, with healthcare providers, clinical trials now to share more information through differential privacy and collectively have more data to be processed and analyzed. Use cases are just off the charts. 
>> Well you know we go to all the big data shows, we're horizontally scaled on the event site circuit, but this is the number one thing that comes up, I want to move from batch marketing, batch process, batch business to real-time business, speed is essential, but it's always been a conflict between, how do I enable data to move really fast and be available for applications but protecting the privacy. >> Yeah. >> Do you solve that problem, is that something that you see yourselves solving? >> We aren't necessarily innovating on speed, of data movement, it's going to be a SaaS service. >> So it's availability model. >> It's availability of data that's really never been shared before and I think that's the key here, is we know there's a lot of value locked up behind corporate firewalls. The irony is, we don't even have to sell this outside firewalls initially, when you go to any medium-to-large size enterprise that has more than one site or more than one department, Sales doesn't trust Marketing and vice versa, Engineering doesn't trust Customer Support, none of the four of them trust each other, so we're actually going to enable more data shared within an enterprise at first. >> So that's a starting point for you guys. >> That's a starting point, that's the easiest low-hanging fruit sale we have. >> Well PencilDATA, it's great stuff, Val, congratulations on that startup. I mean, you've got a world-class management team, and this kind of brings up a point that I've been banging on theCUBE pretty much every time I go out I'll talk about blockchain and ICO because you know, theCUBE is a very decentralized audience and that's a value that we're looking at as well with blockchain. I've got to ask you the personal question, from your own personal perspective, experience, executive and CTO, why is blockchain attracting so many A players? 
Because you're seeing a lot of what I call A players, entrepreneurs, technical geeks, really jumping into this because they can see it, they can smell the opportunity, and also, it also attracts the scammers as well, but specifically, why are these A players coming in? Is it, what are you hearing, what's the general vibe, what's the anecdotal reason? >> So as you said earlier on, it's a fundamental evolution of the core internet as a technology, as fundamental as HTTP and web was on top of TCP/IP back 20 years ago, but it's got that rare combination of not only being a technical innovation that empowers new use cases on the web, on the internet, it's also got immediate, amazing business applications as a store of value initially, as an actual valuation of various business processes, or datasets in my case, as an ability to exchange that value so transparently, so, in such a friction-less liquid manner, those are some of the amazing innovations it brings to the table and I think the most important thing is not to think of this as being able to do digital transformation or faster analog, it's about completely reimagining the exchange of value, measurement of value, and new kinds of businesses that just weren't possible before. >> And at all points of the stack, not the low levels and at the application level, the business logic, and to the geek side, right? >> Absolutely. >> You agree. I mean, that's great and as you know, theCUBE is looking at a blockchain ICO on down the horizon so keep an eye out for that, CUBEcoins could be in everyone's future, so we're super excited like you. >> I'm looking forward to your presale, just like I'm looking forward to mine. (laughing) >> Well, we'll see. 
But the bottom line is that this is what the reality is, you know, reimagining the applications is what people are thinking and I think people should beware of the scams out there, and then final question I want to ask you is, obviously we're both in the community together, with our teams. Share your perspective on the ecosystem, because obviously decentralization will change the nature of traditional ecosystems. >> Very much so. >> What's your vision on how the ecosystem will evolve, and how big is it now relative to these early markets? >> We're actually starting to enter the middle innings of the cloud game, if you will, we're seeing a very good maturity, a good diversification of profitable earnings and outcomes for the major cloud players, so I think we've gone well down the cloud path so far. But the decentralized world is in its infancy. It's embryonic right now. And I've always been a proponent of the multi-cloud environment and a multi-cloud world, and decentralization fundamentally is based on and depends on a multi-cloud, not just multi-region, but multi-data-center-in-a-closet scenario as well, to be able to actually have a democratic model for determining where the value is, where the value isn't, blockchain node style. And that is incredibly exciting to me, because that really cements this rebalancing of the pendulum between core and edge in terms of where processing and value happens. >> Yeah, and value exchange obviously now, markup links are becoming the du jour way to exchange value, users are in control, infrastructure equilibrium is interesting. Great stuff. And I'll say, perfect storm for innovation. The waves are coming. (laughing) >> You know, one thing I've learned over the years is, the innovation, change never stops. There's always an opportunity to innovate, and that's what I love about this movement. 
>> Blockchain, ICO, PencilDATA, check 'em out, Val Bercovici, founder and CEO, great friend of theCUBE, also really strong CTO, check these guys out. This wave of innovation around blockchain ICOs and infrastructure, reimagining, the future is here upon us at theCUBE, be right back with more, thanks for watching. (electronic music)
SUMMARY :
I'm John Furrier, the co-host of theCUBE, Kind of building on the fundamentals of the internet, As Tim Berners-Lee brought on the HTTP protocol, the issue of the role of these mega platforms So the world's kind of revolting and blockchain in general is the ability the PII, has to be removed and an audit trail, and it's kind of in the weeds with but on the other hand your identity, which you control, and look at the impact of a decentralized internet, get the right datasets to train your model. some of the reasons why you're starting it. that's the premise. The benefit of blockchain is everything's permanent, Is that kind of what we're getting at? So the notion of PencilDATA is the ability to go from some of the things that you're trying to solve? But at the same time, you can't stay in business what are you guys looking for, of the team members we have right now. and then they have class? in the US and EU and so forth, and she now is traveling because on the one hand, we have that audit trail first of all that's right around the corner, it's next year. I don't have the exact date but it's pretty soon. Public beta before the end of this year, the ability to selectively share patient data available for applications but protecting the privacy. of data movement, it's going to be a SaaS service. neither of the four of them trust each other, That's a starting point, that's the easiest and also, it also attracts the scammers as well, evolution of the core internet as a technology, on down the horizon so keep an eye out for that, I'm looking forward to your presale, reimagining the applications is what people are thinking of the cloud game, if you will, we're seeing a very markup links are becoming the du jour way the innovation, change never stops. the future is here upon us at theCUBE,
ENTITIES
Entity | Category | Confidence |
---|---|---|
Don Klein | PERSON | 0.99+ |
Sheila Fitzpatrick | PERSON | 0.99+ |
two | QUANTITY | 0.99+ |
Marc Andreessen | PERSON | 0.99+ |
Alibaba | ORGANIZATION | 0.99+ |
John Furrier | PERSON | 0.99+ |
$2.7 billion | QUANTITY | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
Val Bercovici | PERSON | 0.99+ |
Valentin Bercovici | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
ORGANIZATION | 0.99+ | |
ORGANIZATION | 0.99+ | |
Manhattan | LOCATION | 0.99+ |
Jian Wang | PERSON | 0.99+ |
22 | QUANTITY | 0.99+ |
US | LOCATION | 0.99+ |
China | LOCATION | 0.99+ |
Alphabet | ORGANIZATION | 0.99+ |
May 25th of 2018 | DATE | 0.99+ |
24 hours | QUANTITY | 0.99+ |
PencilDATA | ORGANIZATION | 0.99+ |
$10 billion | QUANTITY | 0.99+ |
April | DATE | 0.99+ |
Thursday | DATE | 0.99+ |
Andrew Wang | PERSON | 0.99+ |
ORGANIZATION | 0.99+ | |
next year | DATE | 0.99+ |
Tim Berners-Lee | PERSON | 0.99+ |
March | DATE | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
GDPR | TITLE | 0.99+ |
billions | QUANTITY | 0.99+ |
CUBE | ORGANIZATION | 0.99+ |
first time | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
more than one site | QUANTITY | 0.99+ |
four kids | QUANTITY | 0.99+ |
both | QUANTITY | 0.99+ |
theCUBE | ORGANIZATION | 0.99+ |
theCUBE Studios | ORGANIZATION | 0.99+ |
Intel | ORGANIZATION | 0.98+ |
three | QUANTITY | 0.98+ |
early '90s | DATE | 0.98+ |
late '80s | DATE | 0.98+ |
this week | DATE | 0.98+ |
more than one department | QUANTITY | 0.98+ |
two-way | QUANTITY | 0.98+ |
Wikibon | ORGANIZATION | 0.97+ |
Alibaba Cloud | ORGANIZATION | 0.97+ |
EU | LOCATION | 0.97+ |
Val | PERSON | 0.97+ |
Pencil | ORGANIZATION | 0.97+ |
second quarter next year | DATE | 0.96+ |
end of this year | DATE | 0.96+ |
Russian | OTHER | 0.96+ |
Twitch | ORGANIZATION | 0.96+ |
hundreds of billions of dollars | QUANTITY | 0.95+ |
20 years ago | DATE | 0.95+ |
one | QUANTITY | 0.94+ |
today | DATE | 0.94+ |
Jane Allen & Jay Cline | Veritas Vision 2017
>> Male: Live from Las Vegas. It's theCUBE covering Veritas Vision 2017. Brought to you by Veritas. (upbeat music) >> Welcome to Las Vegas, everybody. This is the Cube and we are here covering Veritas Vision 2017. It's the hashtag Vtas, V-T-A-S Vision, and this is Day one of two days of coverage here. I'm with Stu Miniman. My name is Dave Vellante. Jane Allen and Jay Cline are here from PwC. Jane is a partner and principal and Jay is a partner. Folks, welcome to the Cube, good to see you. >> Thank you. >> Thank you. >> Thanks for having us. >> So PwC leading global consultancy, I would say one of the top three, four, easily. Top 2. Maybe even top 1. >> Jane: Yes. >> I mean, you guys are gold standard for global. You solve problems that most people can't even begin to touch, except for a handful of companies. Jane, let's start with you. What's hot these days in your world? >> So I lead a practice, an information governance practice here at PwC, founded in a lot of folks with technology, legal support, regulatory backgrounds. And it pertains to all companies these days, right? How do you manage your data, to manage all the risks and reap the benefits of it. Certainly a hot topic and certainly with your privacy regulations on board, cyber risk, and just again all the benefits of data that companies are trying to take advantage of. It's been a growing consultancy practice and something that's very relevant to companies of all industries. >> Jay, we've heard a lot today about GDPR. I know it's something that you've been knee-deep in. What do people need to know about GDPR? >> I think GDPR boils down to one proposition, being able to prove that you have control over people's data. I think that summarizes the 72 different requirements of GDPR. >> Yeah, so GDPR, for those of you who don't know, General Data Protection Regulation, came out of the EU. One person on theCUBE called it a socialist agenda. (Jane laughs) But it's serious business, and if you can't ... 
I mean, actually, Jay, summarize, you know, what people should know about the exposure. I mean, essentially you have to be able to identify personal information and be able to delete that personal information on request, right, for any European Union citizen? >> Resident or citizen. >> Right, okay. >> That's right. >> So if somebody walks into Joe's pizza shop and says I want to sign up a bingo card to get, you know, mailings and your emailings, technically speaking, that person, if they wanted to do business in the EU, is responsible, is that right? >> You've got to know 360 degree view of all the personal data that you have of your employees, your consumers, your customers. You've got to be able to produce evidence on demand that you have this level of control. And whenever somebody comes in and asks for access to their data, to correct it, to export it, to their email, or to erase it, you've got to know whether you can deny that request or do you have to fulfill it, and you usually only have 30 days to fulfill it. >> So is this one of the hotter topics going on in your world these days? And what percent of your clients are actually prepared? >> I'll let Jay comment on how many are prepared, but you know, I think most companies, frankly, are trying to figure out how to be compliant and what is it they actually need to do. But it is a hot topic. I think even before GDPR, the landscape was already complex, right? People are trying to respond to litigation investigations, retention requirements from regulations, cyber risk, how do we manage it? And it's all about, what data do we have, where is it, and what are we doing with it, and how are we controlling it? And those questions are already there. GDPR highlights it. And with a May 2018 deadline, I mean, it's really putting the spotlight on this topic. >> Oh, yeah, that's one little, the fact that we forgot to mention, the clock is ticking. We're down under a year. So how about customer readiness? 
>> I think when we crossed the one-year milestone in May, a lot of boards got exercised. The phones started ringing off the hook, because they realized, we only have one more budget cycle to get this done. And so now I think, they're realizing that because GDPR hits the tech stack, and the IT budgets had already been planned for, the release cycles had already been put in place, they're now starting to ask, well, we can't get everything done by next May. What are the most important high-risk things that we do need to get done? And there's going to be more spillover work after May, I think. >> I think this highlights something that was already present in terms of the need for cross-functional senior leadership to pay attention to this, right? This isn't just a legal or privacy topic. It isn't just an IT topic. This really hits across the organization and these folks need to work together. >> Jane, could you help us kind of uplevel a little bit. If I look at information governance, you mentioned it's super complex. You know, every company I talk to, they're deploying more and more SaaS. In the keynote this morning, Veritas said most of their customers have at least three clouds. We find, you know, absolutely it's, the strategy, especially if I start, oh, well, just different groups start using things, then how do I govern it? Do I even worry about security and backup and everything like that? How does this fit in the overall picture for most customers? >> Well, I guess that's what's interesting, right? There's no one right way of doing this right. And so it depends on your business, your industry, your customer base, your geographic location and outreach, and the data landscape. And you have to make smart decisions of what works within your corporate business culture even, of what is it that we need to keep and how we need to keep it and enable, you know, our engineers, our users, our customers, to leverage data, but also manage our risks.
And there's just not one way to look at it. But again it goes down to really knowing what control you have, what you have, and where is it, right? But that's what's interesting, is for every company to figure out what is the best way for them to tackle it. >> So who's driving the information governance bus these days? I mean, with Sarbanes-Oxley it was the CFO. With the Federal Rules of Civil Procedure, it was kind of the general counsel. Who's really sort of in charge today? >> Well, I mean, depending on who owns it in an organization, looks a little different, usually legal and/or privacy, and oftentimes they are within the same group. >> Dave: So a chief privacy officer? >> Yeah. >> General counsel obviously involved, IT? >> Sometimes the compliance office again, depending how that's structured, but generally in that legal compliance privacy realm. >> Right. Okay, and when I think about some of those previous, you know, generations, Sox in particular, but also I guess FRCP. There was an effort within the company, because the ROI was just like, oh, we got to do this. It was like, okay, what does it cost to not comply, you know. >> Jane: Yeah. >> They would try to thread that needle. But there was always a faction that said, hey, we can... And consultancies were part of this. We can actually get value out of this. It's an opportunity to clean up your data, maybe to get rid of stuff, maybe you can reclaim some wasted space or, you know, et cetera, et cetera. Is that the way it is today with GDPR? And maybe we could unpack that a little bit. >> Yeah. One of the first steps that you have to take for GDPR, is to discover where all of your European personal data is, so a data discovery effort. And in doing that, we've had a number of clients that for the first time, they've really put together a view of how they make money using data. And they're finding data, their chief marketing officer is finding data they didn't know they had.
And so now they're able to monetize that data if they can use it responsibly within the privacy regulations of GDPR. So marketing is oftentimes funding, helping IT and Legal fund their GDPR efforts. >> And I think one of the other benefits is, if you have to go through this exercise to be compliant, but then you get additional insights in your data and you know where to invest more for those additional business opportunities, then at least hopefully you're reaping, again, more ROI off the effort. >> Well, I know the clock's ticking and there's a sort of virtual gun to organization's heads, but getting into that whole value notion, monetization, most organizations that we talked to, they don't really have an understanding of how data fuels monetization. Not necessarily monetizing the data, but how it contributes to monetization. What do you see in the customer base? >> This is the biggest area I think where GDPR is going to morph after May of 2018. I think the companies that can protect their exposure to this regulation, by going through the same processes to find out where their data is, they are positioned to monetize that data, to take advantage of new market opportunities, in Europe in particular. >> Okay. By the way, we should mention that this actually, the law is in effect, it's just the penalties aren't being-- >> Jay: Right. >> invoked at this point in time, right? >> Jay: That's right. >> So the recital is one-year grace period? And a lot of people are thinking, well, maybe we'll get another year of grace period. It's going to be really interesting to see how that goes down. And presumably the EU's going to go after the big pockets, right? I mean, those are the guys who have to be most concerned about this. But what about that midsize company? For your midsize clients, what are you advising them, that may not have the budgets of the big guys? >> We've been advising our clients that there are actually three ways that you can get hit by GDPR. 
The one that everybody's talking about is the famous 4% fine on your global revenues. That's what the regulators would impose on you if they discovered that you had an egregious violation of privacy. But there's another way that people aren't talking about that's going to be live on May 25th of 2018. And that's a new litigation risk for B2C. Anybody in the B2C space, even if you're midsize, if you violate the rights of a class of people, they can sue you on May 25th. And you can bet there are going to be law firms that are going to take advantage of this new situation. >> Dave: So they can sue you as individuals? >> As a class of individuals. There's also for people in the B2B space, we're seeing right away the contracting risk. And RFPs, they're saying as a condition to bid for this work, you've got to be able to sign that you are GDPR compliant. So you'll be locked out of the European market if you're B2B and you're not ready on May 2018. >> So we were talking off-camera. I was sort of struggling with trying to understand the direct fit with technology, Jay, and I thought you had a good answer. So what's technology's role in all of this? I mean, technology, can it help us get out of this problem? >> There's two parts where technology's very important. First is just discovering where your data is. That takes a lot of technology tools based on your tech stack, to be able to have an ongoing real-time data map. But the other one, the harder part, is responding to these individual rights requests, to ask for where their data is, to correct it, to delete it, to have that 360 view of individuals throughout your information environment. I think that takes IT to a new level. It hits all parts of the tech stack. >> All right. Because an individual can essentially say, I need to know what you know about me, right, that's part of it? >> Well, exactly. 
And a lot of these companies that collect customer data in structured systems, they weren't really built for this type of exercise, to go through and search for something and actually dispose of it. And so companies are having to think very tactically. Okay, can I do this across all my different systems? And then certainly in unstructured data stores, again, what's there and how do we figure that out? >> So in the keynote this morning, we heard about GDPR. It looked like there was... I called it the doomsday clock, what was up on the wall. Can you bring back, how is Veritas doing? How are they helping customers with information governance and GDPR? >> Well, I think one of the really exciting things they demoed and talked about there is some of the data scanning or data profiling information, whether it be the classification or reporting out in terms of what is in these unstructured stores. Again, in order for companies to figure out what it is that they need to do process and technology wise is, what do we have out there again? And they're giving and enabling customers with some of their tools to be able to get some insights there, which I think is really transformative. I think people have been talking about these things from either a legal discovery standpoint, certainly a cyber risk. And I think this is just really adding on. So again, these tools help enable all of them, but certainly for GDPR. >> You have to get this first step right, the data discovery and classification, because if you scope GDPR too big, your compliance costs are through the roof. But if you scope it too small, your exposure's too big. So having a good discovery and classification approach is critical to the success of your GDPR program. >> Has the industry solved the classification problem? I mean, for years, you really struggled to classify data. You could classify, you know, maybe data in an email archive, but data became so distributed by its very nature. Has that problem been solved?
>> I would say no, but I've certainly seen a huge uptick in companies that are actually finally just biting the bullet and getting themselves organized. But again, at least doing it because, hey, we need to figure it out for GDPR and privacy, we need to figure it out for cyber security controls, we need to figure it out for e-discovery, and just regular records management and how long we need to keep things. And so I think they recognize, wait, this satisfies a lot of different needs. But I don't know that there's an easy solution to it either. >> And the best practice organizations have automated that presumably, 'cause otherwise it's not going to scale, right? >> In the long term that's what they're seeking, right, but you need to get the structure right, so you need to have file plans and organization of the information that makes sense to your employees and the way you do work, and then hopefully tie that back, knowing the data life cycle, to be able to classify things based on role, based on access, based on data type. So there's a lot of upfront work, but ultimately that's the-- >> So that's a taxonomical exercise, is that right? >> It is. That's a fancy word. >> Okay. But that's a heavy lift. And then it changes. >> It is, it is. But I think. Again, there's multiple benefits to that. >> Sure. >> And then going forward, you've got things in order for all those reasons. You can leverage the power of the technology, and then your functional groups and what work they do. People know what work they do, how long it generally needs to be kept. And if you kind of can marry those two things from the business, the technology side, you can get set up and launch. >> And then you can automate the policies around data retention. >> Exactly. >> What's your relationship specifically with Veritas? >> Well, you know, they're a client of ours, but we're also a client of theirs. >> Dave: Okay. >> I guess we're friends on a number of different angles and whatnot.
But our practice tends to... Or we are technology agnostic in general, but we definitely want to stay on top of the different leaders in the industry. So that when we go to our clients, we can recommend, hey, these are the top two or three that we believe could help you based on your situation, based on your data landscape, and be able to advise in that regard. So Veritas, between the backup tools, their e-discovery, and certainly some of the things they're doing on, you know, information governance and GDPR, is certainly one of the key providers that our clients should consider. >> So, I have sort of set up this discussion with a little background on PwC, clearly one of the leading consultancies out there. I would point to global, footprint, your deep industry expertise, you understand technology, you've been around, you know, you've got deep relationships. So other than those, what's the big difference, you know? Why PwC? And you can repeat some of those if you want. Probably be more articulate than I was. >> I think one thing that's different is what we call the end-to-end approach, where there might be other companies that have some of the qualities that you've talked about. But with GDPR, it hits across five to ten different budgets in an enterprise. And we'll take a company through a transformational journey across all of them. We have auditors, and we have lawyers, and technologists, forensic scientists. GDPR really hits across all the functions of the enterprise. Because of our scale, we can hit all of these. Whereas other providers will take different slices of that. >> I would also add, PwC looks at our clients as forever clients. We're not looking for a one transaction and see you later. I mean, we look at them in terms of we want to be a firm that supports and partners with them, whether it be on the consulting side, audit, tax, whatnot. And so we look at that that way in terms of trying to support them.
And maybe that's just one point solution, maybe it's broader. But we'll bring the right experts to the table that fit for that client. And so we always want to think about it that way. While we might have ways and approaches that we leverage, hey, if they've got a specific need or a specific specialty, we'll bring the right expert to the firm. >> So that leads me to like my last question, which is, so it sounds like GDPR, and, Jane, in the context of that answer, is not just a tactical sort of pain relief project. Is it part of more strategic digital transformations? Are you able to make that connection? Or are people just in too much of a rush to fix the pain? >> No. Jay and I were talking about this earlier today. I mean, I'll use the example of some of the cloud transformation that companies are going through, right, if they haven't already, and thinking about their data and how they operate differently. And wait a minute, we don't need to forklift all of our data over. Let's think about it. And oh, by the way, let's make sure we're compliant with GDPR, right? So there's a number of different ways that you can kind of pull in different pieces that are helpful to clients. I think there were a number of different aspects to that, that we were talking about. So it's certainly something front and center, but it's not a one time, let's check the box and move on exercise either. >> Awesome. All right. We got to go. Thanks very much for coming on theCUBE. >> Thank you. >> Thanks. >> It's good to meet you guys. All right, keep it right there, everybody. We'll be back with our next guests. This is theCUBE. We're live from Veritas Vision 2017 in Las Vegas. We'll be right back. (techno music)
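Editor's sketch: the individual-rights requests Jay describes in this interview (access, correct, export, erase, with "usually only ... 30 days to fulfill") and the "4% fine on your global revenues" can be made concrete with a little date and percentage arithmetic. The Python below is a minimal illustration under stated assumptions, not anything PwC or Veritas showed: the class name, field names, and the `max_fine` helper are hypothetical, and the 30-day window and 4% rate are simply the figures quoted by the speakers.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: figures as quoted in the interview, not legal advice.
RESPONSE_WINDOW_DAYS = 30
FINE_RATE = 0.04

# Request kinds mentioned by the speaker.
REQUEST_TYPES = {"access", "correct", "export", "erase"}


@dataclass
class SubjectRequest:
    """Hypothetical record of one individual-rights request."""
    requester_id: str
    kind: str
    received: date

    def __post_init__(self):
        if self.kind not in REQUEST_TYPES:
            raise ValueError(f"unknown request type: {self.kind}")

    @property
    def due(self) -> date:
        # Deadline is the receipt date plus the response window.
        return self.received + timedelta(days=RESPONSE_WINDOW_DAYS)

    def days_left(self, today: date) -> int:
        # Negative values mean the request is overdue.
        return (self.due - today).days


def max_fine(global_revenue: float, rate: float = FINE_RATE) -> float:
    """Exposure if the quoted percentage is applied to global revenue."""
    return global_revenue * rate
```

For example, a request received on the May 25th, 2018 date mentioned above would fall due 30 days later, and a company with one billion in global revenue would face exposure of roughly forty million at the quoted rate.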
SUMMARY :
Brought to you by Veritas. This is the Cube I would say one of the top three, I mean, you guys are gold standard for global. and just again all the benefits of data I know it's something that you've been knee-deep in. I think GDPR boils down to one proposition, I mean, essentially you have to be able to identify of all the personal data that you have I mean, it's really putting the spotlight on this topic. the fact that we forgot to mention, And there's going to be more spillover work and these folks need to work together. In the keynote this morning, Veritas said And you have to make smart decisions the information governance bus these days? and oftentimes they are within the same group. Sometimes the compliance office again, what does it cost to not comply, you know. It's an opportunity to clean up your data, And so now they're able to monetize that data but then you get additional insights in your data but how it contributes to monetization. This is the biggest area I think where GDPR it's just the penalties aren't being-- the EU's going to go after the big pockets, right? And you can bet there are going to be law firms that you are GDPR compliant. and I thought you had a good answer. I think that takes IT to a new level. I need to know what you know about me, right, And so companies are having to think very tactically. So in the keynote this morning, we heard about GDPR. that they need to do process and technology wise is, is critical to the success of your GDPR program. You could classify, you know, But I don't know that there's an easy solution to it either. and organization of the information that makes sense That's a fancy word. And then it changes. Again, there's multiple benefits to that. And if you kind of can marry those two things And then you can automate the policies Well, you know, they're a client of ours, and certainly some of the things they're doing on, you know, And you can repeat some of those if you want. 
some of the qualities that you've talked about. And so we always want to think about it that way. and in chain of the context of that answer, And oh, by the way, We got to go. It's good to meet you guys.
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Jay | PERSON | 0.99+ |
Jane | PERSON | 0.99+ |
Jane Allen | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
May 25th | DATE | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
May 2018 | DATE | 0.99+ |
30 days | QUANTITY | 0.99+ |
Jay Cline | PERSON | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
May | DATE | 0.99+ |
May 25th of 2018 | DATE | 0.99+ |
five | QUANTITY | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
two parts | QUANTITY | 0.99+ |
PwC | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
First | QUANTITY | 0.99+ |
May of 2018 | DATE | 0.99+ |
three | QUANTITY | 0.99+ |
GDPR | TITLE | 0.99+ |
360 degree | QUANTITY | 0.99+ |
one-year | QUANTITY | 0.99+ |
two things | QUANTITY | 0.99+ |
4% | QUANTITY | 0.99+ |
two days | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
one proposition | QUANTITY | 0.99+ |
Veritas | ORGANIZATION | 0.99+ |
EU | ORGANIZATION | 0.99+ |
next May | DATE | 0.99+ |
three ways | QUANTITY | 0.98+ |
first step | QUANTITY | 0.98+ |
first time | QUANTITY | 0.98+ |
first steps | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
Sox | ORGANIZATION | 0.98+ |
one transaction | QUANTITY | 0.98+ |
four | QUANTITY | 0.97+ |
under a year | QUANTITY | 0.96+ |
72 different requirements | QUANTITY | 0.95+ |
360 view | QUANTITY | 0.94+ |
One | QUANTITY | 0.94+ |
one time | QUANTITY | 0.94+ |
one thing | QUANTITY | 0.94+ |
one way | QUANTITY | 0.93+ |
Distributed Data with Unifi Software
>> Narrator: From the SiliconANGLE Media Office in Boston, Massachusetts, it's theCUBE. Now, here's your host, Stu Miniman. >> Hi, I'm Stu Miniman and we're here at the east coast studio for SiliconANGLE Media. Happy to welcome back to the program, a many-time guest, Chris Selland, who is now the Vice President of strategic growth with Unifi Software. Great to see you Chris. >> Thanks so much Stu, great to see you too. >> Alright, so Chris, we'd had you in your previous role many times. >> Chris: Yes >> I think not only is this the first time we've had you on since you made the switch, but also the first time we've had somebody from Unifi Software on. So, why don't you give us a little bit of background of Unifi and what brought you to this opportunity. >> Sure, absolutely happy to sort of open up the relationship with Unifi Software. I'm sure it's going to be a long and good one. But I joined the company about six months ago at this point. So I joined earlier this year. I actually had worked with Unifi for a bit as partners. Where when I was previously at the Vertica business inside of HP/HPE, as you know for a number of years prior to that, where we did all the work together. I also knew the founders of Unifi, who were actually at Greenplum, which was a direct Vertica competitor. Greenplum was acquired by EMC. Vertica was acquired by HP. We were sort of friendly respected competitors. And so I have known the founders for a long time. But it was partly the people, but it was really the sort of the idea, the product. I was actually reading the report that Peter Burris or the piece that Peter Burris just did on I guess wikibon.com about distributed data. And it played so into our value proposition. We just see it's where things are going. I think it's where things are going right now. And I think the market's bearing that out. >> The piece you reference, it was actually, it's a Wikibon research meeting, we run those weekly.
Internally, we're actually going to be doing them, and soon we will be broadcasting video. 'Cause, of course, we do a lot of video. But we pull the whole team together, and it was one, George Gilbert actually led this for us, talking about what architectures do I need to build, when I start doing distributed data. With my background really more in kind of the cloud and infrastructure world. We see it's a hybrid, and many times a multi-cloud world. And, therefore, one of the things we look at that's critical is wait, if I've got things in multiple places. I've got my SaaS over here, I've got multiple public clouds I'm using, and I've got my data center. How do I get my arms around all the pieces? And of course data is critical to that. >> Right, exactly, and the fact that more and more people need data to do their jobs these days. Working with data is no longer just the area where data scientists, I mean organizations are certainly investing in data scientists, but there's a shortage, but at the same time, marketing people, finance people, operations people, supply chain folks. They need data to do their jobs. And as you said where it is, it's distributed, it's in legacy systems, it's in the data center, it's in warehouses, it's in SaaS applications, it's in the cloud, it's on premise, It's all over the place, so, yep. >> Chris, I've talked to so many companies that are, everybody seems to be nibbling at a piece of this. We go to the Amazon show and there's this just ginormous ecosystem that everybody's picking at. Can you drill in a little bit for what problems do you solve there. I have talked to people. Everything from just trying to get the licensing in place, trying to empower the business unit to do things, trying to do government compliance of course. So where's Unifi's point in this. >> Well, having come out of essentially the data warehousing market.
And now of course this has been going on, of course with all the investments in HDFS, Hadoop infrastructure, and open source infrastructure. There's been this fundamental thinking that, well the answer's if I get all of the data in one place then I can analyze it. Well that just doesn't work. >> Right. >> Because it's just not feasible. So I think really and it's really when you step back it's one of these like ah-ha that makes total sense, right. What we do is we basically catalog the data in place. So you can use your legacy data that's on the mainframe. Let's say I'm a marketing person. I'm trying to do an analysis of selling trends, marketing trends, marketing effectiveness. And I want to use some order data that's on the mainframe, I want some click stream data that's sitting in HDFS, I want some customer data in the CRM system, or maybe it's in Salesforce, or Marketo. I need some data out of Workday. I want to use some external data. I want to use, say, weather data to look at seasonal analysis. I want to do neighborhooding. So, how do I do that? You know I may be sitting there with Qlik or Tableau or Looker or one of these modern B.I. products or visualization products, but at the same time where's the data. So our value proposition it starts with we catalog the data and we show where the data is. Okay, you've got these data sources, this is what they are, we describe them. And then there's a whole collaboration element to the platform that lets people as they're using the data say, well yes that's order data, but that's old data. So it's good if you use it up to 2007, but the more current data's over here. Do things like that. And then we also then help the person use it. And again I almost said IT, but it's not real data scientists, it's not just them. It's really about democratizing the use. Because business people don't know how to do inner and outer joins and things like that or what a schema is.
They just know, I'm trying to do a better job of analyzing sales trends. I got all these different data sources, but then once I found them, once I've decided what I want to use, how do I use them? So we answer that question too. >> Yeah, Chris, reminds me a lot of some of the early value propositions we heard when kind of Hadoop and the whole big data wave came. It was how do I get, as a smaller company, or even if I'm a bigger company, do it faster, do it for less money than the things it used to be. Okay, it's going to be millions of dollars and it's going to take me 18 months to roll out. Is it right to say this is kind of an extension of that big data wave, or what's different and what's the same? >> Absolutely, we use a lot of that stuff. I mean we basically use, and we've got flexibility in what we can use, but for most of our customers we use HDFS to store the data. We use Hive as the most typical data form, you have flexibility around there. We use MapReduce, or Spark to do transformation of the data. So we use all of those open source components, and as the product is being used, as the platform is being used and as multiple users, 'cause it's designed to be an enterprise platform, are using it, the data does eventually migrate into the data lake, but we don't require you to sort of get it there as a prerequisite. As I said, this is one of the things that we really talk about a lot. We catalog the data where it is, in place, so you don't have to move it to use it, you don't have to move it to see it. But at the same time if you want to move it you can. The fundamental idea, I got to move it all first, I got to put it all in one place first, it never works. We've come into so many projects where organizations have tried to do that and they just can't, it's too complex these days. >> Alright, Chris, what are some of the organizational dynamics you're seeing from your customers? You mentioned data scientists, the business users.
Who is identifying, who's driving these issues, who's got the budget to try to fix some of these challenges? >> Well, it tends to be our best implementations are driven really, almost all of them these days, are driven by use cases. So they're driven by business needs. Some of the big ones. I've sort of talked about customers already, but like customer 360 views. For instance, there's a very large credit union client of ours, that they have all of their data, that is organized by accounts, but they can't really look at Stu Miniman as my customer. How do I look at Stu's value to us as a customer? I can look at his mortgage account, I can look at his savings account, I can look at his checking account, I can look at his debit card, but I can't just see Stu. I want to like organize my data that way. That type of customer 360 or marketing analysis I talked about is a great use case. Another one that we've been seeing a lot of is compliance. Where just having a better handle on what data is, where it is. This is where some of the governance aspects of what we do also come into play. Even though we're very much about solving business problems, there's a very strong data governance. Because when you are doing things like data compliance. We're working, for instance, with MoneyGram, is a customer of ours. Who this day and age in particular, when there's money flows across the borders, there's oftentimes regulators want to know, wait, that money that went from here to there, tell me where it came from, tell me where it went, tell me the lineage. And they need to be able to respond to those inquiries very very quickly. Now the reality is that data sits in all sorts of different places, both inside and outside of the organization. Being able to organize that and give the ability to respond more quickly and effectively is a big competitive advantage. Both helps with avoiding regulatory fines, but also helps with customer responsiveness.
And then you've got things like GDPR, the General Data Protection Regulation, I believe it is, which is being driven by the EU. Where it's sort of like the next Y2K. Anybody in data, if they are not paying attention to it, they need to be pretty quick. At least if they're a big enough company they're doing business in Europe. Because if you are doing business with European companies or European customers, this is going to be a requirement as of May next year. There's a whole 'nother set of how data's kept, how data's stored, what customers can control over data. Things like 'Right to Be Forgotten'. This need to comply with regulatory... As data's gotten more important, as you might imagine, the regulators have gotten more interested in what organizations are doing with data. Having a framework that organizes and helps you be more compliant with those regulations is absolutely critical. >> Yeah, my understanding of GDPR, if you don't comply, there's hefty fines. >> Chris: Major fines. >> Major fines. That are going to hit you. Does Unifi solve that? Is there other re-architecture, redesign that customers need to do to be able to be compliant? [speaking at the same time] >> No, no, that's the whole idea again, where being able to leave the data where it is, but know what it is and know where it is and if and when I need to use it and where it came from and where it's going and where it went. All of those things, so we provide the platform that enables the customers to use it or the partners to build the solutions for their customers. >> Curious, customers, their adoption of public cloud, how does that play into what you are doing? They deploy more SaaS environments. We were having a conversation off camera today talking about the consolidation that's happening in the software world. What do those dynamics mean for your customers?
>> Well public cloud is obviously booming and growing and any organization has some public cloud infrastructure at this point, just about any organization. There's some very heavily regulated areas. Actually health care's probably a good example. Where there's very little public cloud. But even there we're working with... we're part of the Microsoft Accelerator Program. Work very closely with the Azure team, for instance. And they're working in some health care environments, where you have to be things like HIPAA compliant, so there is a lot of caution around that. But nonetheless, the move to public cloud is certainly happening. I think I was just reading some stats the other day. I can't remember if they're Wikibon or other stats. It's still only about 5% of IT spending. And the reality is organizations of any size have plenty of on-prem data. And of course with all the use of SaaS solutions, with Salesforce, Workday, Marketo, all of these different SaaS applications, it's also in somebody else's data center, much of our data as well. So it's absolutely a hybrid environment. That's why the report that you guys put out on distributed data, really it spoke so much to what our value proposition is. And that's why you know I'm really glad to be here to talk to you about it. >> Great, Chris, tell us a little bit, the company itself, how many employees you have, what metrics can you share about the number of customers, revenue, things like that. >> Sure, no, we've got about, I believe about 65 people at the company right now. I joined like I said earlier this year, late February, early March. At that point we were like 40 people, so we've been growing very quickly. I can't get in too specifically to like our revenue, but basically we're well in the triple digit growth phase. We're still a small company, but we're growing quickly. Our number of customers, it's up in the triple digits as well. So expanding very rapidly.
And again we're a platform company, so we serve a variety of industries. Some of the big ones are health care, financial services. But even more in the industries it tends to be driven by these use cases I talked about as well. And we're building out our partnerships also, so that's a big part of what I do also. >> Can you share anything about funding, where you are? >> Oh yeah, funding, you asked about that, sorry. Yes, we raised our B round of funding, which closed in March of this year. So we [mumbles], a company called Pelion Venture Partners, who you may know, Canaan Partners, and then most recently Scale Venture Partners are investors. So the company's raised a little over $32 million so far. >> Partnerships, you mentioned Microsoft already. Any other key partnerships you want to call out? >> We're doing a lot of work. We have a very broad partner network, which we're building up, but some of the ones that we are sort of leaning in the most with, Microsoft is certainly one. We're doing a lot of work with the guys at Cloudera as well. We also work with Hortonworks, we also work with MapR. We're really working almost across the board in the BI space. We have spent a lot of time with the folks at Looker. Who was also a partner I was working with very closely during my Vertica days. We're working with Qlik, we're working with Tableau. We're really working with actually just about everybody in sort of BI and visualization. I don't think people like the term BI anymore. The desktop visualization space. And then on public cloud, also Google, Amazon, so really all the kind of major players. I would say that they're the ones that we worked with the most closely to date. As I mentioned earlier we're part of the Microsoft Accelerator Program, so we're certainly very involved in the Microsoft ecosystem.
I actually just wrote a blog post, which I don't believe has been published yet, about some of the, what we call the full stack solutions we have been rolling out with Microsoft for a few customers. Where we're sitting on Azure, we're using HDInsight, which is essentially Microsoft's cloud Hadoop distribution, visualized in Power BI. So we've really got a lot of deep integration with Microsoft, but we've got a broad network as well. And then I should also mention service providers. We're building out our service provider partnerships also. >> Yeah, Chris, I'm surprised we haven't talked about kind of AI yet at all, machine learning. It feels like everybody that was doing big data now has kind of pivoted in, maybe a little bit early in the buzzword phase. What's your take on that? You've been a part of this for a while. Is big data just old now and we have a new thing, or how do you put those together? >> Well I think what we do maps very well to, at least my personal view of what's going on with AI/ML, is that it's really part of the fabric of what our product does. I talked before about once you've sort of found the data you want to use, how do I use it? Well there's a lot of ML built into that. Where essentially, I see these different datasets, I want to use them... We do what's called one click functions. Which basically... What happens is these one click functions get smarter as more and more people use the product and use the data. So that if I've got some table over here and then I've got some SaaS data source over there and one user of the product... or we might see field names that we, we grab the metadata, even though we don't require moving the data, we grab the metadata, we look at the metadata and then we'll sort of tell the user, we suggest that you join this data source with that data source and see what it looks like. And if they say: ah that worked, then we say oh okay, that's part of sort of the whole ML infrastructure.
Then we are more likely to advise the next few folks with the one click function that, hey, if you're trying to do an analysis of sales trends, well you might want to use this source and that source and you might want to join them together this way. So it's a combination of sort of AI and ML built into the fabric of what we do, and then also the community aspect of more and more people using it. But that's, going back to your original question, that's what I think that... There was a quote, I'll misquote it, so I'm not going to directly say it, but it was just... I think it might have been John Furrier, who was recently talking about ML and just sort of saying you know eventually we're not going to talk about ML any more than we talk about phone business or something. It's just going to become sort of integrated into the fabric of how organizations do business and how organizations do things. So we very much got it built in. You could certainly call us an AI/ML company if you want, it's actually definitely part of our slide deck. But at the same time it's something that will just sort of become a part of doing business over time. But it really, it depends on large data sets. As we all know, this is why it's so cheap to get Amazon Echos and such these days. Because it's really beneficial, because the more data... There's value in that data, there was just another piece, I actually shared it on LinkedIn today as a matter of fact, about, talking about Amazon and Whole Foods and saying: why are they getting such a valuation premium? They're getting such a valuation premium, because they're smart about using data, but one of the reasons they're smart about using the data is 'cause they have the data. So the more data you collect, the more data you use, the smarter the systems get, the more useful the solutions become.
>> Absolutely, last year at Amazon re:Invent, John Furrier interviewed Andy Jassy and I had posited that the customer flywheel is going to be replaced by that data flywheel. And enhanced to make things spin even further. >> That's exactly right and once you get that flywheel going it becomes a bigger and bigger competitive advantage, by the way that's also why the regulators are getting interested these days too, right? There's sort of, that flywheel going back the other way, but from our perspective... I mean first of all it just makes economic sense, right? These things could conceivably get out of control, that's at least what the regulators think, if you're not careful at least there's some oversight and I would say that, yes probably some oversight is a good idea, so you've got kind of flywheels pushing in both directions. But one way or another organizations need to get much smarter and much more precise and prescriptive about how they use data. And that's really what we're trying to help with. >> Okay, Chris, want to give you the final word. Unifi Software, you're working on kind of the strategic road pieces. What should we look for from you in your segment through the rest of 2017? >> Well, I think, I've always been a big believer, I've probably cited 'Crossing the Chasm' like so many times on theCUBE, during my prior HP tenure and such, but you know, I'm a big believer that we should be talking about customers, we should be talking about use cases. It's not about alphabet soup technology or data lakes, it's about the solutions and it's about how organizations are moving themselves forward with data. Going back to that Amazon example, so I think from us, yes we just released 2.0, we've got a very active blog, come by unifisoftware.com, visit it. But it's also going to be around what our customers are doing and that's really what we're going to try to promote.
I mean if you remember this was also something, that for all the years I've worked with you guys I've been very much... You always have to make sure that the customer has agreed to be cited, it's nice when you can name them and reference them and we're working on our customer references, because that's what I think is the most powerful in this day and age, because again, going back to my, what I said before about, this is going throughout organizations now. People don't necessarily care about the technology infrastructure, but they care about what's being done with it. And so, being able to tell those customer stories, I think that's what you're going to probably see and hear the most from us. But we'll talk about our product as much as you let us as well. >> Great thing, it reminds me of when Wikibon was founded it was really about IT practice, users being able to share with their peers. Now when the software economy today, when they're doing things in software often that can be leveraged by their peers and that flywheel that they're doing, just like when Salesforce first rolled out, they make one change and then everybody else has that option. We're starting to see that more and more as we deploy as SaaS and as cloud, it's not the shrink wrap software anymore. >> I think to that point, you know, I was at a conference earlier this year and it was an IT conference, but I was really sort of floored, because when you ask what we're talking about, what the enlightened IT folks, and there are more and more enlightened IT folks we're talking about these days, it's the same thing. Right, it's how our business is succeeding, by being better at leveraging data. And I think the opportunities for people in IT... But they really have to think outside of the box, it's not about Hadoop and Sqoop and SQL and Java anymore, it's really about business solutions, but if you can start to think that way, I think there's tremendous opportunities and we're just scratching the surface.
>> Absolutely, we found that really some of the proof points of what digital transformation really is for the companies. Alright Chris Selland, always a pleasure to catch up with you. Thanks so much for joining us and thank you for watching theCUBE. >> Chris: Thanks, Stu. (techno music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Chris | PERSON | 0.99+ |
George Gilbert | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
Unifi | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Chris Selland | PERSON | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
Pelion Venture Partners | ORGANIZATION | 0.99+ |
HP | ORGANIZATION | 0.99+ |
Greenplum | ORGANIZATION | 0.99+ |
Peter Burris | PERSON | 0.99+ |
Vertica | ORGANIZATION | 0.99+ |
Stu | PERSON | 0.99+ |
Unifi Software | ORGANIZATION | 0.99+ |
Whole Foods | ORGANIZATION | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
Canaan Partners | ORGANIZATION | 0.99+ |
Andy Jassy | PERSON | 0.99+ |
EMC | ORGANIZATION | 0.99+ |
Silicon Angle Media | ORGANIZATION | 0.99+ |
last year | DATE | 0.99+ |
Looker | ORGANIZATION | 0.99+ |
May next year | DATE | 0.99+ |
EU | ORGANIZATION | 0.99+ |
late February | DATE | 0.99+ |
40 people | QUANTITY | 0.99+ |
18 months | QUANTITY | 0.99+ |
MoneyGram | ORGANIZATION | 0.99+ |
Qlik | ORGANIZATION | 0.99+ |
HP/HP | ORGANIZATION | 0.99+ |
Scale Venture Partners | ORGANIZATION | 0.99+ |
360 views | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
MapR | ORGANIZATION | 0.99+ |
GDPR | TITLE | 0.99+ |
Cloudera | ORGANIZATION | 0.99+ |
early March | DATE | 0.99+ |
Echoes | COMMERCIAL_ITEM | 0.99+ |
Both | QUANTITY | 0.99+ |
Tableau | ORGANIZATION | 0.99+ |
millions of dollars | QUANTITY | 0.99+ |
Boston, Massachusetts | LOCATION | 0.99+ |
both | QUANTITY | 0.98+ |
Wikibon | ORGANIZATION | 0.98+ |
one click | QUANTITY | 0.98+ |
one place | QUANTITY | 0.98+ |
Java | TITLE | 0.98+ |
2007 | DATE | 0.98+ |
over $32 million | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
Spark | TITLE | 0.98+ |
HIPAA | TITLE | 0.98+ |
first time | QUANTITY | 0.98+ |
earlier this year | DATE | 0.98+ |
unifisoftware.com | OTHER | 0.98+ |
10 year | QUANTITY | 0.97+ |
Panel Discussion | IBM Fast Track Your Data 2017
>> Narrator: Live, from Munich, Germany, it's theCUBE. Covering IBM, Fast Track Your Data. Brought to you by IBM. >> Welcome to Munich everybody. This is a special presentation of theCUBE, Fast Track Your Data, brought to you by IBM. My name is Dave Vellante. And I'm here with my cohost, Jim Kobielus. Jim, good to see you. Really good to see you in Munich. >> Jim: I'm glad I made it. >> Thanks for being here. So last year Jim and I hosted a panel in New York City on theCUBE. And it was quite an experience. We had, I think it was nine or 10 data scientists and we felt like that was a lot of people to organize and talk about data science. Well today, we're going to do a repeat of that. With a little bit of twist on topics. And we've got five data scientists. We're here live, in Munich. And we're going to kick off the Fast Track Your Data event with this data science panel. So I'm going to now introduce some of the panelists, or all of the panelists. Then we'll get into the discussions. I'm going to start with Lillian Pierson. Lillian, thanks very much for being on the panel. You are in data science. You focus on training executives, students, and you're really a coach but with a lot of data science expertise based in Thailand, so welcome. >> Thank you, thank you so much for having me. >> Dave: You're very welcome. And so, I want to start with sort of when you focus on training people, data science, where do you start? >> Well it depends on the course that I'm teaching. But I try and start at the beginning so for my Big Data course, I actually start back at the fundamental concepts and definitions they would even need to understand in order to understand the basics of what Big Data is, data engineering. So, terms like data governance. Going into the vocabulary that makes up the very introduction of the course, so that later on the students can really grasp the concepts I present to them.
You know I'm teaching a deep learning course as well, so in that case I start at a lot more advanced concepts. So it just really depends on the level of the course. >> Great, and we're going to come back to this topic of women in tech. But you know, we looked at some CUBE data the other day. About 17% of the technology industry comprises women. And so we're a little bit over that on our data science panel, we're about 20% today. So we'll come back to that topic. But I don't know if there's anything you would add? >> I'm really passionate about women in tech and women who code, in particular. And I'm connected with a lot of female programmers through Instagram. And we're supporting each other. So I'd love to take any questions you have on what we're doing in that space. At least as far as what's happening across the Instagram platform. >> Great, we'll circle back to that. All right, let me introduce Chris Penn. Chris, Boston-based, all right, SMI. Chris is a marketing expert. Really trying to help people understand how to get, turn data into value from a marketing perspective. It's a very important topic. Not only because we get people to buy stuff but also understanding some of the risks associated with things like GDPR, which is coming up. So Chris, tell us a little bit about your background and your practice. >> So I actually started in IT and worked at a startup. And that's where I made the transition to marketing. Because marketing has much better parties. But what's really interesting about the way data science is infiltrating marketing is the technology came in first. You know, everything went digital. And now we're at a point where there's so much data. And most marketers, they kind of got into marketing as sort of the arts and crafts field. And are realizing now, they need a very strong, mathematical, statistical background.
So one of the things, Adam, the reason why we're here and IBM is helping out tremendously is, making a lot of the data more accessible to people who do not have a data science background and probably never will. >> Great, okay thank you. I'm going to introduce Ronald Van Loon. Ronald, your practice is really all about helping people extract value out of data, driving competitive advantage, business advantage, or organizational excellence. Tell us a little bit about yourself, your background, and your practice. >> Basically, I've three different backgrounds. On one hand, I'm a director at a data consultancy firm called Adversitement. Where we help companies to become data driven. Mainly large companies. I'm an advisory board member at Simplilearn, which is an e-learning platform, especially also for big data analytics. And on the other hand I'm a blogger and I host a series of webinars. >> Okay, great, now Dez, Dez Blanchfield, I met you on Twitter, you know, probably a couple of years ago. We first really started to collaborate last year. We've spent a fair amount of time together. You are a data scientist, but you're also a jack of all trades. You've got a technology background. You sit on a number of boards. You work very actively with public policy. So tell us a little bit more about what you're doing these days, a little bit more about your background. >> Sure, I think my primary challenge these days is communication. Trying to join the dots between my technical background and deeply technical pedigree, to just plain English, everyday language, and business speak. So bridging that technical world with what's happening in the boardroom. Toe to toe with the geeks to plain English to execs in boards. And just hand hold them and steward them through the journey of the challenges they're facing. Whether it's the enormous rapid of change and the pace of change, that's just almost exhaustive and causing them to sprint.
But not just sprint in one race but in multiple lanes at the same time. As well as some of the really big things that are coming up, that we've seen like GDPR. So it's that communication challenge and just hand holding people through that journey and that mix of technical and commercial experience. >> Great, thank you, and finally Joe Caserta. Founder and president of Caserta Concepts. Joe, you're a practitioner. You're in the front lines, helping organizations, similar to Ronald. Extracting value from data. Translate that into competitive advantage. Tell us a little bit about what you're doing these days in Caserta Concepts. >> Thanks Dave, thanks for having me. Yeah, so Caserta's been around. I've been doing this for 30 years now. And natural progressions have been just getting more from application development, to data warehousing, to big data analytics, to data science. Very, very organically, that's just because it's where businesses need the help the most, over the years. And right now, the big focus is governance. At least in my world. Trying to govern when you have a bunch of disparate data coming from a bunch of systems that you have no control over, right? Like social media, and third party data systems. Bringing it in and how do you organize it? How do you ingest it? How do you govern it? How do you keep it safe? And also help to define ownership of the data within an organization, within an enterprise. That's also a very hot topic. Which ties back into GDPR. >> Great, okay, so we're going to be unpacking a lot of topics associated with the expertise that these individuals have. I'm going to bring in Jim Kobielus to the conversation. Jim, the newest Wikibon analyst. And newest member of the SiliconANGLE Media Team. Jim, get us started off. >> Yeah, so we're at an event, at an IBM event where machine learning and data science are at the heart of it. There are really three core themes here. Machine learning and data science, on the one hand.
Unified governance on the other. And hybrid data management. I want to circle back or focus on machine learning. Machine learning is the coin of the realm, right now in all things data. Machine learning is the heart of AI. Machine learning, everybody is going, hiring, data scientists to do machine learning. I want to get a sense from our panel, who are experts in this area, what are the chief innovations and trends right now on machine learning. Not deep learning, the core of machine learning. What's super hot? What's in terms of new techniques, new technologies, new ways of organizing teams to build and to train machine learning models? I'd like to open it up. Let's just start with Lillian. What are your thoughts about trends in machine learning? What's really hot? >> It's funny that you excluded deep learning from the response for this, because I think the hottest space in machine learning is deep learning. And deep learning is machine learning. I see a lot of collaborative platforms coming out, where people, data scientists are able to work together with other sorts of data professionals to reduce redundancies in workflows. And create more efficient data science systems. >> Is there much uptake of these crowdsourcing environments for training machine learning models? Like CrowdFlower, or Amazon Mechanical Turk, or Mighty AI? Is that a huge trend in terms of the workflow of data science or machine learning, a lot of that? >> I don't see that crowdsourcing is like, okay maybe I've been out of the crowdsourcing space for a while. But I was working with Standby Task Force back in 2013. And we were doing a lot of crowdsourcing. And I haven't seen the industry has been increasing, but I could be wrong. I mean, because there's no, if you're building automation models, most of the, a lot of the work that's being crowdsourced could actually be automated if someone took the time to just build the scripts and build the models.
And so I don't imagine that, that's going to be a trend that's increasing. >> Well, automation machine learning pipeline is fairly hot, in terms of I'm seeing more and more research. Google's doing a fair amount of automated machine learning. The panel, what do you think about automation, in terms of the core modeling tasks involved in machine learning. Is that coming along? Are data scientists in danger of automating themselves out of a job? >> I don't think there's a risk of data scientists being put out of a job. Let's just put that on the thing. I do think we need to get a bit clearer about this meme of the mythical unicorn. But to your core point about machine learning, I think what you'll see, we saw the cloud become baked into products, just as a given. I think machine learning has already crossed this threshold. We just haven't necessarily noticed or caught up. And if we look at, we're at an IBM event, so let's just do a call out for them. The data science experience platform, for example. Machine learning's built into a whole range of things around algorithm and data classification. And there's an assisted, guided model for how you get to certain steps, where you don't actually have to understand how machine learning works. You don't have to understand how the algorithms work. It shows you the different options you've got and you can choose them. So you might choose regression. And it'll give you different options on how to do that. So I think we've already crossed this threshold of baking in machine learning and baking in the data science tools. And we've seen that with Cloud and other technologies where, you know, the Office 365 is not, you can't get a non Cloud Office 365 account, right? I think that's already happened in machine learning. What we're seeing though, is organizations even as large as the Googles still in catch up mode, in my view, on some of the shift that's taken place.
So we've seen them write little games and apps where people do doodles and then it runs through the ML library and says, "Well that's a cow, or a unicorn, or a duck." And you get awards, and gold coins, and whatnot. But you know, as far back as 12 years ago I was working on a project, where we had full size airplanes acting as drones. And we mapped with 2-D and 3-D imagery. With 2-D high res imagery and LiDAR for 3-D point clouds. We were finding poles and wires for utility companies, using ML before it even became a trend. And baking it right into the tools. And used to store on our web page and clicked and pointed on. >> To counter Lillian's point, it's not crowdsourcing but crowd sharing that's really powering a lot of the rapid leaps forward. If you look at, you know, DSX from IBM. Or you look at Node-RED, huge number of free workflows, so that someone has probably already done the thing that you are trying to do. Go out and find in the libraries, through Jupyter and R Notebooks, there's an ability-- >> Chris can you define before you go-- >> Chris: Sure. >> This is great, crowdsourcing versus crowd sharing. What's the distinction? >> Well, so crowdsourcing, kind of, in the context of the question you asked, is like I'm looking for stuff that other people, getting people to do stuff for me. It's like asking people to mine classifieds. Whereas crowd sharing, someone has done the thing already, it already exists. You're not purpose built, saying, "Jim, help me build this thing." It's like, "Oh Jim, you already built this thing, cool. So can I fork it and make my own from it?" >> Okay, I see what you mean, keep going. >> And then, again, going back to earlier. In terms of the advancements. Really deep learning, it probably is a good idea to just sort of define these things. Machine learning is how machines do things without being explicitly programmed to do them. Deep learning's like if you can imagine a stack of pancakes, right?
Each pancake is a type of machine learning algorithm. And your data is the syrup. You pour the data on it. It goes from layer, to layer, to layer, to layer, and what you end up with at the end is breakfast. That's the easiest analogy for what deep learning is. Now imagine a stack of pancakes, 500 or 1,000 high, that's where deep learning's going now. >> Sure, multi layered machine learning models, essentially, that have the ability to do higher levels of abstraction. Like image analysis, Lillian? >> I had a comment to add about automation and data science. Because there are a lot of tools that are able to, or applications that are able to use data science algorithms and output results. But the reason that data scientists aren't in risk of losing their jobs, is because just because you can get the result, you also have to be able to interpret it. Which means you have to understand it. And that involves deep math and statistical understanding. Plus domain expertise. So, okay, great, you took out the coding element but that doesn't mean you can codify a person's ability to understand and apply that insight. >> Dave: Joe, you have something to add? >> I could just add that I see the trend. Really, the reason we're talking about it today is machine learning is not necessarily, it's not new, like Dez was saying. But what's different is the accessibility of it now. It's just so easily accessible. All of the tools that are coming out, for data, have machine learning built into it. So the machine learning algorithms, which used to be a black art, you know, years ago, now is just very easily accessible. That you can get, it's part of everyone's toolbox. And the other reason that we're talking about it more, is that data science is starting to become a core curriculum in higher education. Which is something that's new, right? That didn't exist 10 years ago? But over the past five years, I'd say, you know, it's becoming more and more easily accessible for education. 
So now, people understand it. And now we have it accessible in our tool sets. So now we can apply it. And I think that's, those two things coming together is really making it become part of the standard of doing analytics. And I guess the last part is, once we can train the machines to start doing the analytics, right? And get smarter as it ingests more data. And then we can actually take that and embed it in our applications. That's the part that you still need data scientists to create. But once we can have standalone appliances that are intelligent, that's when we're going to start seeing, really, machine learning and artificial intelligence really start to take off even more. >> Dave: So I'd like to switch gears a little bit and bring Ronald on. >> Okay, yes. >> Here you go, there. >> Ronald, the bromide in this sort of big data world we live in is, the data is the new oil. You got to be a data driven company and many other cliches. But when you talk to organizations and you start to peel the onion, you find that most companies really don't have a good way to connect data with business impact and business value. What are you seeing with your clients and just generally in the community, with how companies are doing that? How should they do that? I mean, is that something that is a viable approach? You don't see accountants, for example, quantifying the value of data on a balance sheet. There's no standards for doing that. And so it's sort of this fuzzy concept. How are and how should organizations take advantage of data and turn it into value? >> So, I think in general, if you look how companies look at data. They have departments and within the departments they have tools specific for this department. And what you see is that there's no central, let's say, data collection. There's no central management of governance. There's no central management of quality. There's no central management of security. Each department manages their data on their own.
So if you then ask, on one hand, "Okay, how should they do it?" It's basically go back to the drawing table and say, "Okay, how should we do it?" We should collect centrally, the data. And we should take care for central governance. We should take care for central data quality. We should take care for centrally managing this data. And look from a company perspective and not from a department perspective what the value of data is. So, look at the perspective from your whole company. And this means that it has to be brought on one end to, whether it's from C level, where most of them still fail to understand what it really means. And what the impact can be for that company. >> It's a hard problem. Because data by its very nature is now so decentralized. But Chris you have a-- >> The thing I want to add to that is, think about in terms of valuing data. Look at what it would cost you for a data breach. Like what is the expense of having your data compromised. If you don't have governance. If you don't have policy in place. Look at the major breaches of the last couple years. And how many billions of dollars those companies lost in market value, and trust, and all that stuff. That's one way you can value data very easily. "What will it cost us if we mess this up?" >> So a lot of CEOs will hear that and say, "Okay, I get it. I have to spend to protect myself, but I'd like to make a little money off of this data thing. How do I do that?" >> Well, I like to think of it, you know, I think data's definitely an asset within an organization. And is becoming more and more of an asset as the years go by. But data is still a raw material. And that's the way I think about it. In order to actually get the value, just like if you're creating any product, you start with raw materials and then you refine it. And then it becomes a product. For data, data is a raw material. You need to refine it. And then the insight is the product. And that's really where the value is.
And the insight is absolutely, you can monetize your insight. >> So data is abundant, insights are scarce. >> Well, you know, actually you could say that intermediate between insights and the data are the models themselves. The statistical, predictive, machine learning models. That are a crystallization of insights that have been gained by people called data scientists. What are your thoughts on that? Are statistical, predictive, machine learning models something, an asset, that companies, organizations, should manage governance of on a centralized basis or not? >> Well the models are essentially the refinery system, right? So as you're refining your data, you need to have process around how you exactly do that. Just like refining anything else. It needs to be controlled and it needs to be governed. And I think that data is no different from that. And I think that it's very undisciplined right now, in the market or in the industry. And I think maturing that discipline around data science, I think is something that's going to be a very high focus in this year and next. >> You were mentioning, "How do you make money from data?" Because there's all this risk associated with security breaches. But at the risk of sounding simplistic, you can generate revenue from system optimization, or from developing products and services. Using data to develop products and services that better meet the demands and requirements of your markets. So that you can sell more. So either you are using data to earn more money. Or you're using data to optimize your system so you have less cost. And that's a simple answer for how you're going to be making money from the data. But yes, there is always the counter to that, which is the security risks. >> Well, and my question really relates to, you know, when you think of talking to C level executives, they kind of think about running the business, growing the business, and transforming the business.
And a lot of times they can't fund these transformations. And so I would agree, there's many, many opportunities to monetize data, cut costs, increase revenue. But organizations seem to struggle to either make a business case or actually implement that transformation. >> Dave, I'd love to have a crack at that. I think this conversation epitomizes the type of things that are happening in board rooms and C suites already. So we've really quickly dived into the detail of data. And the detail of machine learning. And the detail of data science, without actually stopping and taking a breath and saying, "Well, we've got lots of it, but what have we got? Where is it? What's the value of it? Is there any value in it at all?" And, "How much time and money should we invest in it?" For example, we talk about data being a resource. I look at data as a utility. When I turn the tap on to get a drink of water, it's there as a utility. I count on it being there but I don't always sample the quality of the water and I probably should. It could have Giardia in it, right? But what's interesting is I trust the water at home, in Sydney. Because we have a fairly good experience with good quality water. If I were to go to some other nation, I probably wouldn't trust that water. And I think, when you think about it, what's happening in organizations, it's almost the same as what we're seeing here today. We're having a lot of fun, diving into the detail. But what we've forgotten to do is ask the question, "Well why is data even important? What's the reasoning to the business? Why are we in business? What are we doing as an organization? And where does data fit into that?" As opposed to becoming so fixated on data because it's a media hyped topic. I think once you can wind that back a bit and say, "Well, we have lots of data, but is it good data? Is it quality data? Where's it coming from? Is it ours? Are we allowed to have it? What treatment are we allowed to give that data?"
As you said, "Are we controlling it? And where are we controlling it? Who owns it?" There's so many questions to be asked. But the first question I like to ask people in plain English is, "Well is there any value in data in the first place? What decisions are you making that data can help drive? What things are in your organization's KPIs and milestones you're trying to meet that data might be a support?" So then instead of becoming fixated with data as a thing in itself, it becomes part of your DNA. Does that make sense? >> Think about what money means. The economists' rhyme, "Money is a matter of functions four: a medium, a measure, a standard, a store." So it's a medium of exchange. A measure of value, a way to exchange something. And a way to store value. Data, good clean data, well governed, fits all four of those. So if you're trying to figure out, "How do we make money out of stuff?" Figure out how money works. And then figure out how you map data to it. >> So if we approach and we start with a company, we always start with a business case, which is quite clear. And a defined use case, basically, start with a team on one hand, marketing people, sales people, operational people, and also the whole data science team. So start with this case. It's like, defining, basically a movie. If you want to create the movie, you know where you're going to. You know what you want to achieve to create the customer experience. And this is basically the same with a business case. Where you define, "This is the case. And this is how we're going to derive value, start with it and deliver value within a month." And after the month, you check, "Okay, where are we and how can we move forward? And what's the value that we've brought?" >> Now I as well, start with business case. I've done thousands of business cases in my life, with organizations. And unless that organization was kind of a data broker, the business case rarely has a discrete component around data.
Is that changing, in your experience? >> Yes, so we guide companies into being data driven. So initially, indeed, they don't like to use the data. They don't like to use the analysis. So that's why, how we help. And is it changing? Yes, they understand that they need to change. But changing people is not always easy. So, you see, it's hard if you're not involved and you're not guiding it, they fall back into doing the daily tasks. So it's changing, but it's a hard change. >> Well and that's where this common parlance comes in. And Lillian, you, sort of, this is what you do for a living, is helping people understand these things, as you've been sort of evangelizing that common parlance. But do you have anything to add? >> I wanted to add that for organizational implementations, another key component to success is to start small. Start in one small line of business. And then when you've mastered that area and made it successful, then try and deploy it in more areas of the business. And as far as initializing big data implementation, that's generally how to do it successfully. >> There's the whole issue of putting a value on data as a discrete asset. Then there's the issue, how do you put a value on a data lake? Because a data lake is essentially an asset you build on spec. It's an exploratory archive, essentially, of all kinds of data that might yield some insights, but you have to have a team of data scientists doing exploration and modeling. But it's all on spec. How do you put a value on a data lake? And at what point does the data lake itself become a burden? Because you got to store that data and manage it. At what point do you drain that lake? At what point do the costs of maintaining that lake outweigh the opportunity costs of not holding onto it? >> So each Hadoop node is approximately $20,000 per year cost for storage. So I think that there needs to be a test and a diagnostic, before even inputting, ingesting the data and storing it.
"Is this actually going to be useful? What value do we plan to create from this?" Because really, you can't store all the data. And it's a lot cheaper to store data in Hadoop than it was in traditional systems but it's definitely not free. So people need to be applying this test before even ingesting the data. Why do we need this? What business value? >> I think the question we need to also ask around this is, "Why are we building data lakes in the first place? So what's the function it's going to perform for you?" There's been a huge drive to this idea. "We need a data lake. We need to put it all somewhere." But invariably they become data swamps. And we only half jokingly say that because I've seen 90 day projects turn from a great idea, to a really bad nightmare. And as Lillian said, it is cheaper in some ways to put it into an HDFS platform, in a technical sense. But when we look at all the fully burdened components, it's actually more expensive to find Hadoop specialists and Spark specialists to maintain that cluster. And invariably I'm finding that big data, quote unquote, is not actually so much lots of data, it's complex data. And as Lillian said, "You don't always need to store it all." So I think if we go back to the question of, "What's the function of a data lake in the first place? Why are we building one?" And then start to build some fully burdened cost components around that. We'll quickly find that we don't actually need a data lake, per se. We just need an interim data store. So we might take last year's data and tokenize it, and analyze it, and do some analytics on it, and just keep the metadata. So I think there is this rush, for a whole range of reasons, particularly vendor driven. To build data lakes because we think they're a necessity, when in reality they may just be an interim requirement and we don't need to keep them for a long term. >> I'm going to attempt to, the last few questions, put them all together.
And I think, they all belong together because one of the reasons why there's such hesitation about progress within the data world is because there's just so much accumulated tech debt already. Where there's a new idea. We go out and we build it. And six months, three years, it really depends on how big the idea is, millions of dollars is spent. And then by the time things are built the idea is pretty much obsolete, no one really cares anymore. And I think what's exciting now is that the speed to value is just so much faster than it's ever been before. And I think that, you know, what makes that possible is this concept of, I don't think of a data lake as a thing. I think of a data lake as an ecosystem. And that ecosystem has evolved so much more, probably in the last three years than it has in the past 30 years. And it's exciting times, because now once we have this ecosystem in place, if we have a new idea, we can actually do it in minutes not years. And that's really the exciting part. And I think, you know, data lake versus a data swamp, comes back to just traditional data architecture. And if you architect your data lake right, you're going to have something that's substantial, that you're going to be able to harness and grow. If you don't do it right. If you just throw data. If you buy a Hadoop cluster or a Cloud platform and just throw your data out there and say, "We have a lake now." yeah, you're going to create a mess. And I think taking the time to really understand, you know, the new paradigm of data architecture and modern data engineering, and actually doing it in a very disciplined way. If you think about it, what we're doing is we're building laboratories. And if you have a shabby, poorly built laboratory, the best scientist in the world isn't going to be able to prove his theories. So if you have a well built laboratory and a clean room, then, you know a scientist can get what he needs done very, very, very efficiently.
And that's the goal, I think, of data management today. >> I'd like to just quickly add that I totally agree with the challenge between on premise and Cloud mode. And I think one of the strong themes of today is going to be the hybrid data management challenge. And I think organizations, some organizations, have rushed to adopt Cloud. And thinking it's a really good place to dump the data and someone else has to manage the problem. And then they've ended up with a very expensive death by 1,000 cuts in some senses. And then others have been very reluctant and as a result have not gotten access to rapid moving and disruptive technology. So I think there's a really big challenge to get a basic conversation going around what's the value of using Cloud technology, as in adopting it, versus what are the risks? And when's the right time to move? For example, should we Cloud Burst for workloads? Do we move whole data sets in there? You know, moving half a petabyte of data into a Cloud platform and back is a non-trivial exercise. But moving a terabyte isn't actually that big a deal anymore. So, you know, should we keep stuff behind the firewalls? I'd be interested in seeing this week where 80% of the data, supposedly is. And just push out for Cloud tools, machine learning, data science tools, whatever they might be, cognitive analytics, et cetera. And keep the bulk of the data on premise. Or should we just move whole spools into the Cloud? There is no one size fits all. There's no silver bullet. Every organization has its own quirks and own nuances they need to think through and make a decision themselves. >> Very often, Dez, organizations have zonal architectures so you'll have a data lake that consists of a NoSQL platform that might be used for say, mobile applications. A Hadoop platform that might be used for unstructured data refinement, so forth. A streaming platform, so forth and so on.
And then you'll have machine learning models that are built and optimized for those different platforms. So, you know, think of it in terms of then, your data lake, is a set of zones that-- >> It gets even more complex just playing on that theme, when you think about what Cisco started, called Fog Computing. I don't really like that term. But edge analytics, or computing at the edge. We've seen with the internet coming along where we couldn't deliver everything with a central data center. So we started creating this concept of content delivery networks, right? I think the same thing, I know the same thing has happened in data analysis and data processing. Where we've been pulling social media out of the Cloud, per se, and bringing it back to a central source. And doing analytics on it. But when you think of something like, say for example, when the Dreamliner 787 from Boeing came out, this airplane created half a terabyte of data per flight. Now let's just do some quick, back of the envelope math. There's 87,400 flights a day, just in the domestic airspace in the USA alone, per day. Now 87,400 by half a terabyte, that's 43.5 petabytes a day. You physically can't copy that from quote unquote in the Cloud, if you'll pardon the pun, back to the data center. So now we've got the challenge, a lot of our Enterprise data's behind a firewall, supposedly 80% of it. But what's out at the edge of the network? Where's the value in that data? So there are zonal challenges. Now what do I do with my Enterprise versus the open data, the mobile data, the machine data. >> Yeah, we've seen some recent data from IDC that says, "About 43% of the data is going to stay at the edge." We think that, that's way understated, just given the examples. We think it's closer to 90% is going to stay at the edge. >> Just on the airplane topic, right? So Airbus wasn't going to be outdone. Boeing put 4,000 sensors or something in their 787 Dreamliner six years ago.
Airbus just announced an A380-1000 with 10,000 sensors in it. Do the same math. Now the FAA in the US said that all aircraft and all carriers have to be, by early next year, I think it's like March or April next year, have to be at the same level of BIOS. Or the same capability of data collection and so forth. It's kind of like a mini GDPR for airlines. So with the A380-1000 with 10,000 sensors, that becomes 2.5 terabytes per flight. If you do the math, it's 220 petabytes of data just in one day's traffic, domestically in the US. Now, it's just so mind boggling that we're going to have to completely turn our thinking on its head, on what do we do behind the firewall? What do we do in the Cloud versus what we might have to do in the airplane? I mean, think about edge analytics in the airplane processing data, as you said, Jim, streaming analytics in flight. >> Yeah that's a big topic within Wikibon, so, within the team. Me and David Floyer, and my other colleagues. They're talking about the whole notion of edge architecture. Not only will most of the data be persisted at the edge, most of the deep learning models like TensorFlow will be executed at the edge. To some degree, the training of those models will happen in the Cloud. But much of that will be pushed in a federated fashion to the edge, or at least I'm predicting. We're already seeing some industry moves in that direction, in terms of architectures. Google has a federated training, project or initiative. >> Chris: Look at TensorFlow Lite. >> Which is really fascinating for it's geared to IOT, I'm sorry, go ahead. >> Look at TensorFlow Lite. I mean in the announcement of having every Android device having ML capabilities, is Google's essential acknowledgment, "We can't do it all." So we need to essentially, sort of like SETI@home. Everyone's smartphone, set-top TV box, just to help with the processing.
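[Editor's note: the back-of-the-envelope aircraft figures quoted in this exchange can be checked with a short sketch. The flight count and the per-flight data volumes are the numbers quoted by the panel, not independently verified figures. Decimal units (1 petabyte = 1,000 terabytes) and the function and constant names are assumptions made for illustration.]

```python
# Back-of-the-envelope check of the daily data volumes quoted by the panel.
# All inputs are the speakers' figures; decimal units are an assumption.

FLIGHTS_PER_DAY = 87_400   # US domestic flights per day, as quoted
TB_PER_PB = 1_000          # assumed decimal conversion: 1 PB = 1,000 TB


def daily_petabytes(tb_per_flight, flights=FLIGHTS_PER_DAY):
    """Return total data generated per day, in petabytes."""
    return tb_per_flight * flights / TB_PER_PB


# Half a terabyte per flight (the Boeing 787 figure quoted above).
boeing_pb = daily_petabytes(0.5)

# 2.5 terabytes per flight (the Airbus figure quoted above).
airbus_pb = daily_petabytes(2.5)

print(f"0.5 TB per flight: {boeing_pb:.1f} PB per day")
print(f"2.5 TB per flight: {airbus_pb:.1f} PB per day")
```

[Under these assumptions the totals come to about 43.7 and 218.5 petabytes per day, in line with the rounded figures the speaker gives from memory.]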
>> Now we're talking about this, this sort of leads to this IOT discussion but I want to underscore the operating model. As you were saying, "You can't just lift and shift to the Cloud." You're not going to, CEOs aren't going to get the billion dollar hit by just doing that. So you got to change the operating model. And that leads to this discussion of IOT. And an entirely new operating model. >> Well, there are companies that are like Sisense who have worked with Intel. And they've taken this concept. They've taken the business logic and not just putting it in the chip, but actually putting it in memory, in the chip. So as data's going through the chip it's not just actually being processed but it's actually being baked in memory. So level one, two, and three cache. Now this is a game changer. Because as Chris was saying, even if we were to get the data back to a central location, the compute load, I saw a real interesting thing from I think it was Google the other day, one of the guys was doing a talk. And he spoke about what it meant to add cognitive and voice processing into just the Android platform. And they used some number, like they had to double the amount of compute they had, just to add voice for free, to the Android platform. Now even for Google, that's a nontrivial exercise. So as Chris was saying, I think we have to again, flip it on its head and say, "How much can we put at the edge of the network?" Because think about these phones. I mean, even your fridge and microwave, right? We put a man on the moon with something that these days, we make for $89 at home, on the Raspberry Pi computer, right? And even that was 1,000 times more powerful. When we start looking at what's going into the chips, we've seen people build new, not even GPUs, but deep learning and stream analytics capable chips. Like Google, for example. That's going to make its way into consumer products.
So that, now the compute capacity in phones, is going to, I think transmogrify in some ways because there is some magic in there. To the point where, as Chris was saying, "We're going to have the smarts in our phone." And a lot of that workload is going to move closer to us. And only the metadata that we need to move is going to go centrally. >> Well here's the thing. The edge isn't the technology. The edge is actually the people. When you look at, for example, the MIT language Scratch. This is a kids' programming language. It's drag and drop. You know, kids can assemble really fun animations and make little movies. We're training them to build for IOT. Because if you look at a system like Node-RED, it's an IBM interface that is drag and drop. Your workflow is for IOT. And you can push that to a device. Scratch has a converter for doing those. So the edge is what those thousands and millions of kids who are learning how to code, learning how to think architecturally and algorithmically. What they're going to create that is beyond what any of us can possibly imagine. >> I'd like to add one other thing, as well. I think there's a topic we've got to start tabling. And that is what I refer to as the gravity of data. So when you think about how planets are formed, right? Particles of dust accrete. They form into planets. Planets develop gravity. And the reason we're not flying into space right now is that there's gravitational force. Even though it's one of the weakest forces, it keeps us on our feet. Oftentimes in organizations, I ask them to start thinking about, "Where is the center of your universe with regard to the gravity of data?" Because if you can follow the center of your universe and the gravity of your data, you can often, as Chris is saying, find where the business logic needs to be. And it could be that you got to think about a storage problem. You can think about a compute problem. You can think about a streaming analytics problem.
But if you can find where the center of your universe and the center of your gravity for your data is, often you can get a really good insight into where you can start focusing on where the workloads are going to be, where the smarts are going to be. Whether it's small, medium, or large. >> So this brings up the topic of data governance. One of the themes here at Fast Track Your Data is GDPR. What it means. It's one of the reasons, I think IBM selected Europe, generally, Munich specifically. So let's talk about GDPR. We had a really interesting discussion last night. So let's kind of recreate some of that. I'd like somebody in the panel to start with, what is GDPR? And why does it matter, Ronald? >> Yeah, maybe I can start. Maybe a little bit more in general unified governance. So if I talk to companies and I need to explain to them what's governance, I basically compare it with a crime scene. So in a crime scene if something happens, they start with securing all the evidence. So they start sealing the environment. And take care that all the evidence is collected. And on the other hand, you see that they need to protect this evidence. There are all kinds of policies. There are all kinds of procedures. There are all kinds of rules, that need to be followed. To take care that the whole evidence is secured well. And once you start, basically, investigating. So you have the crime scene investigators. You have the research lab. You have all different kinds of people. They need to have consent before they can use all this evidence. And the whole reason why they're doing this is in order to collect the villain, the crook. To catch him and on the other hand, once he's there, to convict him. And we do this to have trust in the materials. Or trust in basically, the analytics. And on the other hand, so the public has trust in everything that's happened with the data. So if you look to a company, where data is basically the evidence, this is the value of your data.
It's similar to like the evidence within a crime scene. But most companies don't treat it like this. So if we then look to GDPR, GDPR basically shifts the power and the ownership of the data from the company to the person that created it. Which is often, let's say the consumer. And there's a lot of paradox in this. Because all the companies say, "We need to have this customer data. Because we need to improve the customer experience." So if you make it concrete and let's say it's 1st of June, so GDPR is active. And it's first of June 2018. And I go to iTunes, so I use iTunes. Let's say I go to iTunes and say, "Okay, Apple please give me access to my data." I want to see which kind of personal information you have stored for me. On the other end, I want to have the right to rectify all this data. I want to be able to change it and give them a different level of how they can use my data. So I ask this to iTunes. And then I say to them, okay, "I basically don't like you anymore. I want to go to Spotify. So please transfer all my personal data to Spotify." So that's possible once it's June 18. Then I go back to iTunes and say, "Okay, I don't like it anymore. Please reduce my consent. I withdraw my consent. And I want you to remove all my personal data for everything that you use." And I go to Spotify and I give them, let's say, consent for using my data. So this is a shift where you can, as a person be the owner of the data. And this has a lot of consequences, of course, for organizations, how to manage this. So it's quite simple for the consumer. They get the power, it's maturing the whole law system. But it's a big consequence of course for organizations. >> This is going to be a nightmare for marketers. But fill in some of the gaps there. >> Let's go back, so GDPR, the General Data Protection Regulation, was passed by the EU in 2016, in May of 2016. It is, as Ronald was saying, it's four basic things. The right to privacy. The right to be forgotten.
Privacy built into systems by default. And the right to data transfer. >> Joe: It takes effect next year. >> It is already in effect. GDPR took effect in May of 2016. The enforcement penalties take place the 25th of May 2018. Now here's where, there's two things on the penalty side that are important for everyone to know. Now number one, GDPR is extraterritorial. Which means that an EU citizen, anywhere on the planet has GDPR, goes with them. So say you're a pizza shop in Nebraska. And an EU citizen walks in, orders a pizza. Gives her the credit card and stuff like that. If you for some reason, store that data, GDPR now applies to you, Mr. Pizza shop, whether or not you do business in the EU. Because an EU citizen's data is with you. Two, the penalties are much stiffer than they ever have been. In the old days companies could simply write off penalties as saying, "That's the cost of doing business." With GDPR the penalties are up to 4% of your annual revenue or 20 million Euros, whichever is greater. And there may be criminal sanctions, charges, against key company executives. So there's a lot of questions about how this is going to be implemented. But one of the first impacts you'll see from a marketing perspective is all the advertising we do, targeting people by their age, by their personally identifiable information, by their demographics. Between now and May 25th 2018, a good chunk of that may have to go away because there's no way for you to say, "Well this person's an EU citizen, this person's not." People give false information all the time online. So how do you differentiate it? Every company, regardless of whether they're in the EU or not will have to adapt to it, or deal with the penalties. >> So Lillian, as a consumer this is designed to protect you. But you had a very negative perception of this regulation. >> I've looked over the GDPR and to me it actually looks like a socialist agenda. 
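To make the penalty ceiling mentioned by the panelist concrete, here is a minimal sketch of the arithmetic, assuming the rule exactly as he states it: up to 4% of annual revenue or 20 million euros, whichever is greater. The function name and the sample revenue figures are hypothetical and only illustrate the "whichever is greater" logic; this is not legal guidance.

```python
def max_gdpr_fine_eur(annual_revenue_eur: int) -> float:
    """Ceiling on a fine as described in the panel (an assumption, not legal advice):
    the greater of 4% of annual revenue or 20 million euros."""
    four_percent = annual_revenue_eur * 4 / 100
    flat_ceiling = 20_000_000
    return max(four_percent, flat_ceiling)


# Hypothetical examples: a large firm and a small shop.
large_firm = max_gdpr_fine_eur(1_000_000_000)  # 4% of revenue exceeds the flat ceiling
small_shop = max_gdpr_fine_eur(500_000)        # the flat 20 million ceiling applies
print(large_firm, small_shop)
```

The point of the sketch is that for small businesses the flat figure, not the percentage, sets the ceiling, which is why the "pizza shop" example feels so disproportionate to some of the panelists.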
It looks like (panel laughs) no, it looks like a full assault on free enterprise and capitalism. And on its face from a legal perspective, it's completely and wholly unenforceable. Because they're assigning jurisdictional rights to the citizen. But what are they going to do? They're going to go to Nebraska and they're going to call in the guy from the pizza shop? And call him into what court? The EU court? It's unenforceable from a legal perspective. And if you write a law that's unenforceable, you know, it's got to be enforceable in every element. It can't be just, "Oh, we're only going to enforce it for Facebook and for Google. But it's not enforceable for," it needs to be written so that it's a complete and actionable law. And it's not written in that way. And from a technological perspective it's not implementable. I think you said something like 652 EU regulators or political people voted for this and 10 voted against it. But what do they know about actually implementing it? Is it possible? There's all sorts of regulations out there that aren't possible to implement. I come from an environmental engineering background. And it's absolutely ridiculous because these agencies will pass laws that actually, it's not possible to implement those in practice. The cost would be too great. And it's not even needed. So I don't know, I just saw this and I thought, "You know, if the EU wants to," what they're essentially trying to do is regulate what the rest of the world does on the internet. And if they want to build their own internet like China has and police it the way that they want to. But Ronald here, made an analogy between data, and free enterprise, and a crime scene. Now to me, that's absolutely ridiculous. What does data and someone signing up for an email list have to do with a crime scene? And if EU wants to make it that way they can police their own internet. But they can't go across the world. 
They can't go to Singapore and tell Singapore, or go to the pizza shop in Nebraska and tell them how to run their business. >> You know, EU overreach in the post-Brexit era, of what you're saying has a lot of validity. How far can the tentacles of the EU reach into other sovereign nations? >> What court are they going to call them into? >> Yeah. >> I'd like to weigh in on this. There are lots of unknowns, right? So I'd like us to focus on the things we do know. We've already dealt with similar situations before. In Australia, we introduced a goods and sales tax. Completely foreign concept. Everything you bought had 10% on it. No one knew how to deal with this. It was a completely new practice in accounting. There's a whole bunch of new software that had to be written. MYOB had to have new capability, but we coped. No one actually went to jail yet. It's decades later, for not complying with GST. So what it was, was a framework on how to shift from non sales tax related revenue collection. To sales tax related revenue collection. I agree that there are some egregious things built into this. I don't disagree with that at all. But I think if I put my slightly broader view of the world hat on, we have well and truly gone past the point in my mind, where data was respected, data was treated in a sensible way. I mean I get emails from companies I've never done business with. And when I follow it up, it's because I did business with a credit card company, that gave it to a service provider, that thought that I was going to, when I bought a holiday to come to Europe, that I might want travel insurance. Now some might say there's value in that. And others say there's not, there's the debate. But let's just focus on what we're talking about. We're talking about a framework for governance of the treatment of data. 
If we remove all the emotive component, what we are talking about is a series of guidelines, backed by laws, that say, "We would like you to do this," in an ideal world. But I don't think anyone's going to go to jail, on day one. They may go to jail on day 180. If they continue to do nothing about it. So they're asking you to sort of sit up and pay attention. Do something about it. There's a whole bunch of relief around how you approach it. The big thing for me, is there's no get out of jail card, right? There is no get out of jail card for not complying. But there's plenty of support. I mean, we're going to have ambulance chasers everywhere. We're going to have class actions. We're going to have individual suits. The greatest thing to do right now is get into GDPR law. Because you seem to think data scientists are unicorns? >> What kind of life is that if there's ambulance chasers everywhere? You want to live like that? >> Well I think we've seen ad blocking. I use ad blocking as an example, right? A lot of organizations with advertising broke the internet by just throwing too much content on pages, to the point where they're just unusable. And so we had this response with ad blocking. I think in many ways, GDPR is a regional response to a situation where I don't think it's the exact right answer. But it's the next evolutionary step. We'll see things evolve over time. >> It's funny you mentioned it because in the United States one of the things that has happened, is that with the change in political administrations, the regulations on what companies can do with your data have actually been loosened, to the point where, for example, your internet service provider can resell your browsing history, with or without your consent. Or your consent's probably buried in there, on page 47. And so, GDPR is kind of a response to saying, "You know what? 
You guys over there across the Atlantic are kind of doing some fairly irresponsible things with what you allow companies to do." Now, to Lillian's point, no one's probably going to go after the pizza shop in Nebraska because they don't do business in the EU. They don't have an EU presence. And it's unlikely that an EU regulator's going to get on a plane from Brussels and fly to Topeka and say, or Omaha, sorry, "Come on Joe, let's get the pizza shop in order here." But for companies, particularly Cloud companies, that have offices and operations within the EU, they have to sit up and pay attention. So if you have any kind of EU operations, or any kind of fiscal presence in the EU, you need to get on board. >> But to Lillian's point it becomes a boondoggle for lawyers in the EU who want to go after deep pocketed companies like Facebook and Google. >> What's the value in that? It seems like regulators are just trying to create work for themselves. >> What about the things that, say, advertisers can do, not so much with the data that they have? With the data that they don't have. In other words, they have people called data scientists who build models that can do inferences on sparse data. And do amazing things in terms of personalization. What do you do about all those gray areas? Where you got machine learning models and so forth? >> But it applies-- >> It applies to personally identifiable information. But if you have a talented enough data scientist, you don't need the PII or even the inferred characteristics. If a certain type of behavior happens on your website, for example. And this path of 17 pages almost always leads to a conversion, it doesn't matter who you are or where you're coming from. If you're a good enough data scientist, you can build a model that will track that. >> Like, you know, Target inferred some young woman was pregnant. And they inferred correctly even though that was never divulged. 
I mean, there's all those gray areas that, how can you stop that slippery slope? >> Well I'm going to weigh in really quickly. A really interesting experiment for people to do. When people get very emotional about it I say to them, "Go to Google.com, view source, put it in seven point Courier font in Word and count how many pages it is." I guess you can't guess how many pages? It's 52 pages of seven point Courier font, HTML to render one logo, and a search field, and a click button. Now why do we need 52 pages of HTML source code and JavaScript just to take a search query. Think about what's being done in that. It's effectively a mini operating system, to figure out who you are, and what you're doing, and where you've been. Now is that a good or bad thing? I don't know, I'm not going to make a judgment call. But what I'm saying is we need to stop and take a deep breath and say, "Does anybody need a 52 page home page to take a search query?" Because that's just the tip of the iceberg. >> To that point, I like the results that Google gives me. That's why I use Google and not Bing. Because I get better search results. So, yeah, I don't mind if you mine my personal data and give me, our Facebook ads, those are the only ads, I saw in your article that GDPR is going to take out targeted advertising. The only ads in the entire world, that I like are Facebook ads. Because I actually see products I'm interested in. And I'm happy to learn about that. I think, "Oh I want to research that. I want to see this new line of products and what are their competitors?" And I like the targeted advertising. I like the targeted search results because it's giving me more of the information that I'm actually interested in. >> And that's exactly what it's about. You can still decide, yourself, if you want to have this targeted advertising. If not, then you don't give consent. If you like it, you give consent. So if a company gives you value, you give consent back. 
So it's not that it's restricting everything. It's giving consent. And I think it's similar to what happened and the same type of response, what happened, we had the Mad Cow Disease here in Europe, where you had the whole food chain that needed to be tracked. And everybody said, "No, it's not required." But now it's implemented. Everybody in Europe does it. So it's the same, what probably going to happen over here as well. >> So what does GDPR mean for data scientists? >> I think GDPR is, I think it is needed. I think one of the things that may be slowing data science down is fear. People are afraid to share their data. Because they don't know what's going to be done with it. If there are some guidelines around it that should be enforced and I think, you know, I think it's been said but as long as a company could prove that it's doing due diligence to protect your data, I think no one is going to go to jail. I think when there's, you know, we reference a crime scene, if there's a heinous crime being committed, all right, then it's going to become obvious. And then you do go directly to jail. But I think having guidelines and even laws around privacy and protection of data is not necessarily a bad thing. You can do a lot of data, really meaningful data science, without understanding that it's Joe Caserta. All of the demographics about me. All of the characteristics about me as a human being, I think are still on the table. All that they're saying is that you can't go after Joe, himself, directly. And I think that's okay. You know, there's still a lot of things. We could still cure diseases without knowing that I'm Joe Caserta, right? As long as you know everything else about me. And I think that's really at the core, that's what we're trying to do. We're trying to protect the individual and the individual's data about themselves. 
But I think as far as how it affects data science, you know, a lot of our clients, they're afraid to implement things because they don't exactly understand what the guideline is. And they don't want to go to jail. So they wind up doing nothing. So now that we have something in writing that, at least, it's something that we can work towards, I think is a good thing. >> In many ways, organizations are suffering from the deer in the headlights problem. They don't understand it. And so they just end up frozen in the headlights. But I just want to go back one step if I could. We could get really excited about what it is and is not. But for me, the most critical thing there is to remember though, data breaches are happening. There are over 1,400 data breaches, on average, per day. And most of them are not trivial. And when we saw half a billion from Yahoo. And then one point one billion and then one point five billion. I mean, think about what that actually means. There were 47,500 MongoDBs breached in an 18 hour window, after an automated upgrade. And they were airlines, they were banks, they were police stations. They were hospitals. So when I think about frameworks like GDPR, I'm less worried about whether I'm going to see ads and be sold stuff. I'm more worried about, and I'll give you one example. My 12 year old son has an account at a platform called Edmodo. Now I'm not going to pick on that brand for any reason but it's a current issue. Something like, I think it was like 19 million children in the world had their username, password, email address, home address, and all this social interaction on this Facebook for kids platform called Edmodo, breached in one night. Now I got my hands on a copy. And everything about my son is there. Now I have a major issue with that. Because I can't do anything to undo that, nothing. The fact that I was able to get a copy, within hours on a dark website, for free. 
The fact that his first name, last name, email, mobile phone number, all these personal messages from friends. Nobody has the right to allow that to breach on my son. Or your children, or our children. For me, GDPR, is a framework for us to try and behave better about really big issues. Whether it's a socialist issue. Whether someone's got an issue with advertising. I'm actually not interested in that at all. What I'm interested in is companies need to behave much better about the treatment of data when it's the type of data that's being breached. And I get really emotional when it's my son, or someone else's child. Because I don't care if my bank account gets hacked. Because they hedge that. They underwrite and insure themselves and the money arrives back to my bank. But when it's my wife who donated blood and a blood donor website got breached and her details got lost. Even things like sexual preferences. That they ask questions on, is out there. My 12 year old son is out there. Nobody has the right to allow that to happen. For me, GDPR is the framework for us to focus on that. >> Dave: Lillian, is there a comment you have? >> Yeah, I think that, I think that security concerns are 100% and definitely a serious issue. Security needs to be addressed. And I think a lot of the stuff that's happening is due to, I think we need better security personnel. I think we need better people working in the security area where they're actually looking and securing. Because I don't think you can regulate-- I was just, I wanted to take the microphone back when you were talking about taking someone to jail. Okay, I have a background in law. And if you look at this, you guys are calling it a framework. But it's not a framework. What they're trying to do is take 4% of your business revenues per infraction. They want to say, "If a person signs up on your email list and you didn't, like, necessarily give whatever disclaimer that the EU said you need to give. 
Per infraction, we're going to take 4% of your business revenue." That's a law, that they're trying to put into place. And you guys are talking about taking people to jail. What jail are you? EU is not a country. What jurisdiction do they have? Like, you're going to take pizza man Joe and put him in the EU jail? Is there an EU jail? Are you going to take them to a UN jail? I mean, it's just on its face it doesn't hold up to legal tests. I don't understand how they could enforce this. >> I'd like to just answer the question on-- >> Security is a serious issue. I would be extremely upset if I were you. >> I personally know, people who work for companies who've had data breaches. And I respect them all. They're really smart people. They've got 25 plus years in security. And they are shocked that they've allowed a breach to take place. What they've invariably all agreed on is that a whole range of drivers have caused them to get to a bad practice. So then, for example, the donate blood website. The young person who was a sysadmin with all the right skills and all the right experience just made a basic mistake. They took a DB dump of a MySQL database before they upgraded their WordPress website for the business. And they happened to leave it in a folder that was indexable by Google. And so somebody wrote a regular expression to search in Google to find SQL backups. Now this person, I personally respect them. I think they're an amazing practitioner. They just made a mistake. So what does that bring us back to? It brings us back to the point that we need a safety net or a framework or whatever you want to call it. Where organizations have checks and balances no matter what they do. Whether it's an upgrade, a backup, a modification, you know. And they all think they do, but invariably we've seen from the hundreds of thousands of breaches, they don't. Now on the point of law, we could debate that all day. I mean the EU does have a remit. 
If I was caught speeding in Germany, as an Australian, I would be thrown into a German jail. If I got caught as an organization in France, breaching GDPR, I would be held accountable to the law in that region, by the organization pursuing me. So I think it's a bit of a misnomer saying I can't go to an EU jail. I don't disagree with you, totally, but I think it's regional. If I get a speeding fine and break the law of driving fast in EU, it's in the country, in the region, that I'm caught. And I think GDPR's going to be enforced in that same approach. >> All right folks, unfortunately the 60 minutes flew right by. And it does when you have great guests like yourselves. So thank you very much for joining this panel today. And we have an action-packed day here. So we're going to cut over. The CUBE is going to have its interview format starting in about a half hour. And then we cut over to the main tent. Who's on the main tent? Dez, you're doing a main stage presentation today. Data Science is a Team Sport. Hillary Mason, has a breakout session. We also have a breakout session on GDPR and what it means for you. Are you ready for GDPR? Check out ibmgo.com. It's all free content, it's all open. You do have to sign in to see the Hillary Mason and the GDPR sessions. And we'll be back in about a half hour with the CUBE. We'll be running replays all day on SiliconAngle.tv and also ibmgo.com. So thanks for watching everybody. Keep it right there, we'll be back in about a half hour with the CUBE interviews. We're live from Munich, Germany, at Fast Track Your Data. This is Dave Vellante with Jim Kobielus, we'll see you shortly. (electronic music)
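The blood donor incident described in the panel, a database dump left in a web-served folder and later found by a pattern search, suggests a simple defensive check. The following is a minimal sketch, assuming a hypothetical file-name pattern and a hypothetical function name; it only scans a local folder for files that look like database dumps and says nothing about how the actual incident was discovered or remediated.

```python
import re
from pathlib import Path

# Hypothetical pattern: file names that commonly indicate a database dump,
# optionally compressed (for example backup.sql or site.sql.gz).
DUMP_PATTERN = re.compile(r"\.(sql|dump|bak)(\.(gz|zip|bz2))?$", re.IGNORECASE)


def find_exposed_dumps(web_root):
    """Return files under web_root whose names look like database dumps."""
    root = Path(web_root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and DUMP_PATTERN.search(path.name)
    )


if __name__ == "__main__":
    # Scan the current directory as a stand-in for a publicly served folder.
    for path in find_exposed_dumps("."):
        print(f"possible database dump in served folder: {path}")
```

Run before and after every upgrade or backup, a check like this is one small example of the "checks and balances" the panelist argues organizations need.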
Conquering Big Data Part 1: Data as Capital
>> Narrator: From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE. Now here is your host, Dave Vellante. >> Hi, everybody. This is Dave Vellante. Welcome to a special presentation, Conquering Big Data. This is part one: Data as Capital, and this is sponsored by Oracle. With me is Paul Sonderegger, a big data strategist from Oracle. Paul, it's good to see you in theCUBE again. >> It's good to be here. >> Okay, so we were talking earlier. This whole thing for us at SiliconANGLE Media started around 2010 when we started to pay attention to the Hadoop trend, and data is the new source of competitive advantage, data is the new oil, and in six or seven short years, we've come quite a long way. Everybody says that they want to be data-driven. Where are we today from your perspective? >> I think the cover article of the Economist just a couple of weeks ago captured it pretty well where it said that data is the world's most valuable resource, and part of the evidence for that is that the top five most valuable listed firms or publicly listed firms worldwide are all data-heavy technology companies, so we're at the point now where the effect of accumulating data, stocks of data capital is obvious and using it is obvious but nonetheless, we are still at the beginning of the changes that the rise of data capital is going to bring. >> As I said, most executives would say they want their companies to be data-driven. Many actually say, "Oh yes, our company is data-driven," but when you start to peel the onion, do you agree that most companies aren't really as data-centric as they may claim to be? >> A lot of companies, they just struggle with the philosophy of what data is and what effect it has on the way they compete. Don't get me wrong. All executives understand that more data helps you make better decisions. That's evergreen. That's a good idea. But a lot of companies fail to appreciate that data, contrary to popular wisdom, is not abundant. 
There's a lot of it but it consists of countless unique observations, and so really, the way that executives need to think about data is that it is scarce. Data really consists of observations of things that are going on in the world, and if you are not there when those activities happen, when these events take place, your opportunity to capture those observations is lost. It doesn't come back. >> Okay, so let's get into this. You've written about and talked about the three principles of data capital, so let's start there and go through them. Principle one is data comes from activity. Okay. I guess that sounds obvious but what does it mean? >> This is the issue that we were just talking about. This is the first principle of data capital, that data comes from activity and a lot of executives will say, "Yes, obviously. We put in this big ERP application back in the '90s, and it captured all of this data about our own processes, so then we reported on it so we can see what's going on." All of that is true but what a lot of executives miss is that they're in competition for data. So the data that ERP apps and CRM apps and all of these enterprise applications produce, those are all data from the company's own activities but what's happening now is the digitization and datafication of activities outside the company, activities that customers carry on. It could be in everyday consumer life, it could be in B2B environments as well, it could be the movement of trucks, the movement of inventory done through supply chains run by partners. Executives have to get in the habit of looking out at the world and seeing the data that is not there yet, information coming from these activities that is lost. It's either captured on paper or it's not captured at all, and putting sensors and mobile apps into those activities before their rivals do because when an activity happens, if you are not part of it, your opportunity to capture its data is lost. It doesn't come back. 
>> So data, raw data is abundant but the data that is actually valuable to organizations you're saying is scarce and takes a lot of refinement to use the oil analogy. >> Think about it this way. Remember Sir Edmund Halley, the guy who predicted the comet? >> Dave: Right. >> Sir Edmund Halley predicted when you will die. This is actually one of his signal achievements a lot of people have forgotten about. Halley was the first one to work out mortality tables, what is expected, what is life expectancy. The reason that that could be valuable is that he showed that life insurance policies that the British government was offering were mispriced depending on how old you were and how much longer you expected to live. The data that he used to make those calculations was not his. It came from Breslau. It came from another city, and it came from a particular church, which had kept really rigorous records during that time. Before the priests of Breslau said, "Hey, you could use this data," Halley had no ability to make this prediction. He had no ability to identify the mispricing of life insurance policies. That data, those observations was a scarce resource concentrated in another city that he needed in order to figure all this out. We have exactly the same situation now. Exactly the same situation now where companies taking observations of activities that they conduct with their partners, activities that they conduct with their customers build up into these concentrations of observations that are unique, they're proprietary, and they are the necessary fuel for creating new digital products and services. >> And many of those observations come from data outside of the organization. Okay, let's look at the second principle. Data makes more data. What are you talking about here? Are you talking about metadata? Can you explain? >> Sure. Providing data to people so they can make better decisions is always a good thing. It has been a good thing for a long time. 
It will continue to be a good thing. But the real money is in algorithms. The real money is in using these stocks of data capital to feed algorithms for two reasons. One is that algorithms can take decisions beyond human scale either in a more situations per unit time or simply faster than human beings can. The second reason it's important is because algorithms produce data about their own performance, which can be fed back into the model to improve their future performance. This is true of dynamic pricing algorithms, which capture data about what change did this price switch have on conversion rates, for example. It applies in fraud detection. We have customers who are banks who look at how many legitimate transactions did our current fraud detection algorithm wrongly flagged because they get complaints about it, how many fraudulent transactions did our current algorithm actually missed because investigations get kicked off through other processes. Those observations about the performance of the algorithm go back into the model improving its future performance. This applies to algorithms for inventory detection and fleet movement. So the second principle is the data tends to make more data, and this virtuous cycle with algorithms creates a competitive advantage that is very, very hard to catch. >> And I'm hearing you have to act on that data and continue to iterate. It's not obviously a one-shot static deal. We kind of all know that but it's this constant improvement that's going to give you that competitive edge. >> That's really the key, and this is at the very heart of machine learning, so all the talk about AI and all the talk about machine learning, one of the tactics of machine learning algorithms is that they learn from their own behaviors and improve their behaviors over time, so really, this particular kind of competitive advantage is baked in to the practice of machine learning and AI. >> Okay, great. 
Now your third principle is that platforms tend to win. You've written that this is where the real money is, so what do you mean by platforms? Are you talking about platforms versus products? What do you mean? >> Here, we're talking about platforms not as technologists often think about it where there is a foundational technology and then you build on top. We're talking about platforms as economists see them, so through the eyes of an economist, a platform is an intermediary that serves a two-sided market, and usually it makes it easier, cheaper, faster for the two sides to do business with each other. So just to use a very familiar example, credit cards are a payment platform, and they serve a two-sided market. On one side, you have merchants. On the other side, you have consumers. And of course, we as consumers, we want to carry the card more merchants will take. Merchants want to take the card more consumers have in their pocket. And so growth on one side of the market tends to encourage growth on the other side of the market. They kind of ladder up like that, and that means that platform competition tends toward a winner-take-all outcome, and so we have seen this in, say, the competition for the desktop operating system. That was a platform competition. We see it in the competition for the mobile operating system but it's also something that you see in gaming platforms, for example. More game developers want to develop for the platforms where there are more gamers. Gamers want to have the platform where there are more games. The reason that this matters now is because the digitization and datafication of more daily activities brings platform competition to industries that have never seen it before. So just to use a simple example, look at farming. You can now have a drone.
It will go out and take pictures of a field, and the drone will do spectrographic analysis of the images, and it's looking for green, which is a proxy for the degree of chlorophyll in the plants. It uses that information to inform the fertilizer spreader about how to tailor the fertilizer to the plants, not to the field but to the individual plants. The tractor in the middle is in competition to be the platform for digital agricultural services, and that is not how makers of large agricultural equipment typically think about competition. >> Okay, so let's move on. If data is so important, it's the new source of competitive advantage, we're talking today about data as capital, but the accounting field doesn't look at data in the same way in which they do a financial asset. You don't see companies recognizing the value of data on their balance sheets yet at the same time, you said the top five firms worldwide in terms of market value are data-oriented. So I'm sure that's much greater than the capital assets that they have on their books. So what's going on there? Should the accounting world be coming into the 21st century? Should companies wait until they do? What are your thoughts on that? >> I won't presume to give the accounting industry any advice on what they ought to do, but I will say that regardless of how the accounting standards look at data, the most successful data-driven companies already recognize that data is a true asset despite the fact that they cannot put it on the balance sheet as an asset with a certain dollar value. These firms, they already recognize that data is not just a record of what happened, it is a raw material for creating new digital products and services. In that way, it is capital like capital equipment, like financial capital, like if you do not have this input, you cannot create the service that you have in mind. And so that's why these data-heavy companies are not satisfied with the stocks of data capital they've got.
These platform businesses are constantly on the lookout for new activities they can go digitize and datafy, adjacent activities that are next to the ones that they have already captured in order to further build out this stock of data capital, in order to create more raw material for new products and services. I will presume to give corporations in general advice, and the advice is that you've got to get this idea that data is not just a record of what happened, it is a raw material for new digital products and services. Digital products and services are the competitive field for providing value to your customers. >> So don't wait for the accounting industry to catch up is really your advice there. >> Not at all. >> So you said digitize, datafy, and that leads us to what you've talked about in the past, data trade, the monetization question, so let's talk about monetization. How should organizations think about monetizing data? Should they be selling data? Should they be thinking about it differently? Why should they be monetizing data? >> The first thing to remember is that data trade is a decades-old practice. Credit bureaus were one of the first kinds of companies to build an entire business on the trade of data, and so they're accumulating information about consumers and then providing them to banks so the banks can more easily, quickly, effectively make lending decisions, and that increases access to credit, which is a good thing overall. It's a very, very useful thing. But what's happening now is that the data trade is massively expanding, buying and selling of data about different kinds of aspects of consumer buying and shopping behavior, for example, but we're also starting to see the buying and selling of data in the world of the Internet of Things.
As you may know, Oracle has a very large data marketplace, the largest online marketplace, a data marketplace of consumer shopping and browsing behavior, so we have five billion consumer profiles, 400 million business profiles, $3 trillion in transactions. One of the things to note about this whole business is that the data in our marketplace is created by a whole set of other firms. Just to give you one example, there's 15,000 websites which are the sources for online browsing behavior, those websites have no idea what value that data will provide to the companies who use it. They don't know. Instead, they are originating this data, and they are selling it on for these secondary purposes, and those secondary purposes really are discovered by the companies who buy the data and use it, and that data then goes into targeting marketing campaigns. It goes into refining product launch plans. It goes into redesigning social media publishing calendars and activities. The reason all this matters is because data consists of observations. The value from those observations only happens when it gets used. There is this curious issue. Just like Edmund Halley needed data from Breslau in order to figure out life expectancy and figure out the proper pricing of these insurance policies, we have the same issue today where data originates in one set of activities but the firms that create it may not create the greatest value from it, and so we need these data marketplaces in order to grow the overall value created from this digitization and datafication. >> Paul, are there pitfalls that people should, I'm sure there are many but maybe a couple you could point to that people need to think about when they enter this data monetization journey? >> Sure. One of the ones that comes out right away is personally identifiable information and invasions of privacy. 
So one of the ways to deal with that is to anonymize these records, strip out all the personally identifiable information, and then the next step that you can take is to aggregate them. So on that first piece about stripping out personally identifiable information, there are obvious pieces like name, first name, last name, and social security number, taxpayer ID number but new regulations in Europe, the General Data Protection Regulation, the GDPR has expanded the notion of personally identifiable information to any piece of data that could be uniquely tied back to a specific individual, so for example, something like an IMEI number, that unique code for your phone as it connects to the cellular network, in some cases perhaps even IP address. So this notion of personally identifiable information is expanding, so that's one thing for companies to be aware of. This notion of aggregation is an interesting one because even the GDPR says that if you aggregate a whole bunch of records together, and reidentification of those individual records is no longer possible, the GDPR doesn't even apply to those data products, so one of the things companies should be thinking about is can they create data products that provide observations about a part of the world that other firms are interested in and yet at a high enough, at a large enough level of aggregation that the issues around personally identifiable information are all resolved. >> And this becomes really important. GDPR goes into effect next May, next May 18. >> Next May. >> So things to think about. All right. Last question before we summarize this. Metrics, even though the accounting industry isn't counting data as an asset, are there new metrics that organizations are using or should be using to quantify the value of their data? >> There are. McKinsey writes about this occasionally.
They have taken just a really simple, back of the envelope calculation for looking at revenue per employee for companies in a given industry, and then calling out the radical differences in revenue per employee for firms known to be highly data-centric versus others who perhaps are older or have been in the business longer or who have greater traditional capital assets, so something even that simple can be a useful tool but I suspect that we're going to need a new family of metrics. There has been talk for a while about data productivity, about measuring that. It's often been difficult to do but we've entered into a new world now where observations about how data gets used within a company, looking at the queries going against the data management infrastructure is now not only possible but cost-effective. I suspect that we're actually going to see a new metric of data productivity that is related to traditional measures of labor productivity and capital productivity, which economists have known about for a long time, but I think we'll see a way of measuring the work done, the value-creating work done by a company's digital data infrastructure which can then be related to what's their return on invested capital as well as what is their labor productivity. I think we'll start to see a new set of metrics like that. >> And it maybe is implicit in even the McKinsey example of revenue per employee, something as simple as that. Maybe if you could isolate that and identify the input of labor and capital, maybe you can get to that. >> And then if you could isolate the input of work done by queries acting on data, then yeah, you ought to be able to establish that relationship. >> Okay, good. Let's summarize. Before I do, I just want to remind people to think about some questions. We're going to have a Q&A session right after this in the chat area right below. Okay, so we kind of introduced the notion of data capital and talked about why it's important. 
You mentioned the top five firms worldwide in terms of value are data-oriented companies, and then we talked about your three principles around data capital. Why don't you summarize the three for us? >> Sure. Data comes from activity, so digitize and datafy activities outside your firms before your rivals do. Data tends to make more data, so feed the data you've got into algorithms so that they can create data about their own performance creating a virtuous cycle. And then the third is platforms tend to win, and here, companies really need an active imagination to look at their industries and their business models and imagine them, either imagine their own business model reinvented as a platform, an intermediary between two sides of the market where the digitization and datafication helps them create a new kind of value, or imagine another firm like that that comes to attack them. >> Okay, and then we talked about the accounting industry, how it has not begun to recognize data as value, put in a balance sheet, et cetera. You chose not to suggest that they should or should not. Rather, you chose to focus on the companies, the organizations that they should not wait for the accounting industry to catch up, that they should really dive in and begin thinking about how to digitize, you call it datafy, and that led to a conversation on monetization, and then you talked about data markets as a critical emerging, re-emerging entity and dynamic that's occurring there. Maybe some comments? >> Sure. For decades now, we've had businesses with traditional business models working as data sellers. Again, credit bureaus are a good example, market research firms are another good one, LexisNexis, Bloomberg but I think what we're going to see is a rise in data marketplaces where you've got a new kind of business model. It's an exchange.
And you've got data originators providing data into the marketplace for sale, and you've got buyers on the other side, probably mostly companies but there could be nonprofits, there could be governments as well actually, and those, those are actually really exciting because exchanges like that, increases in data trade help to spread the wealth of data capital to more parties. It makes it possible for companies who need data but have not datafied the activities that they just discovered they care about go and source that data. It also helps firms who have managed to create these data capital assets but they're not sure what to do with them themselves make them available to places where they can create value. >> Excellent. Then you talked about ways to avoid some of the pitfalls, particularly those associated with personal information and the upcoming GDPR, and then we wrapped with a conversation around metrics, some simple metrics have been posed like revenue per employee, and you noted a McKinsey study that those data-oriented companies have a higher revenue per employee but then you suggested that we're going to start peeling back those metrics and looking at the contribution of labor plus capital in terms of what you call, a new metric called data productivity, so we're going to follow that and hopefully talk to you down the road and learn more about that. Paul, thanks so much for spending some time with us. I really appreciate it. >> Thank you. >> You're welcome. Okay, now as I say, think about your questions. Go down below. Paul and I will be here for a Q&A in the chat below. Thanks for watching, everybody. We'll see you next time. (light music)
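The anonymize-then-aggregate approach Paul describes in the interview (strip the personally identifiable fields, then report only at a coarse level) can be sketched roughly as follows. This is a minimal illustration, not anything Oracle or the GDPR prescribes: the field names, the `MIN_GROUP_SIZE` threshold, and both helper functions are hypothetical, and suppressing small groups does not by itself guarantee that reidentification is impossible.

```python
from collections import defaultdict

# Hypothetical field names chosen for illustration; not from any real schema.
DIRECT_IDENTIFIERS = {"name", "ssn", "imei", "ip_address"}

# Illustrative threshold only; the GDPR does not specify a number like this.
MIN_GROUP_SIZE = 5


def strip_identifiers(record):
    """Return a copy of the record without the directly identifying fields."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def aggregate(records, group_key, value_key, min_group_size=MIN_GROUP_SIZE):
    """Average value_key per group, dropping groups below the size threshold."""
    groups = defaultdict(list)
    for record in records:
        clean = strip_identifiers(record)
        groups[clean[group_key]].append(clean[value_key])
    return {
        key: {"count": len(values), "mean": sum(values) / len(values)}
        for key, values in groups.items()
        if len(values) >= min_group_size
    }


if __name__ == "__main__":
    sample = [
        {"name": f"person{i}", "ip_address": f"10.0.0.{i}",
         "region": "north", "spend": 10 * (i + 1)}
        for i in range(5)
    ] + [
        {"name": "person9", "ip_address": "10.0.0.9",
         "region": "south", "spend": 99},
    ]
    # The single-record "south" group is suppressed; only "north" is reported.
    print(aggregate(sample, "region", "spend"))
```

Whether a given data product actually falls outside the regulation is a legal question; the sketch only shows the mechanical steps mentioned in the conversation.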
ENTITIES
Entity | Category | Confidence |
---|---|---|
Paul Sonderegger | PERSON | 0.99+ |
Paul | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
LexisNexis | ORGANIZATION | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
Edmund Halley | PERSON | 0.99+ |
$3 trillion | QUANTITY | 0.99+ |
Bloomberg | ORGANIZATION | 0.99+ |
Europe | LOCATION | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
15,000 websites | QUANTITY | 0.99+ |
21st century | DATE | 0.99+ |
Halley | PERSON | 0.99+ |
One | QUANTITY | 0.99+ |
two reasons | QUANTITY | 0.99+ |
Dave | PERSON | 0.99+ |
three | QUANTITY | 0.99+ |
Breslau | LOCATION | 0.99+ |
second reason | QUANTITY | 0.99+ |
two side | QUANTITY | 0.99+ |
two sides | QUANTITY | 0.99+ |
first one | QUANTITY | 0.99+ |
SiliconANGLE Media | ORGANIZATION | 0.99+ |
one side | QUANTITY | 0.99+ |
McKinsey | ORGANIZATION | 0.99+ |
GDPR | TITLE | 0.99+ |
one | QUANTITY | 0.99+ |
next May | DATE | 0.99+ |
first piece | QUANTITY | 0.99+ |
one example | QUANTITY | 0.99+ |
Boston, Massachusetts | LOCATION | 0.99+ |
six | QUANTITY | 0.98+ |
Next May | DATE | 0.98+ |
two-sided | QUANTITY | 0.98+ |
five firms | QUANTITY | 0.98+ |
today | DATE | 0.98+ |
next May 18 | DATE | 0.98+ |
second principle | QUANTITY | 0.98+ |
third principle | QUANTITY | 0.98+ |
third | QUANTITY | 0.98+ |
400 million business profiles | QUANTITY | 0.98+ |
first principle | QUANTITY | 0.97+ |
three principles | QUANTITY | 0.97+ |
seven short years | QUANTITY | 0.96+ |
first thing | QUANTITY | 0.95+ |
one set | QUANTITY | 0.95+ |
one thing | QUANTITY | 0.92+ |
five billion consumer profiles | QUANTITY | 0.9+ |
90s | DATE | 0.9+ |
Sir | PERSON | 0.89+ |
couple of weeks ago | DATE | 0.88+ |
British government | ORGANIZATION | 0.85+ |
first kinds | QUANTITY | 0.85+ |
2010 | DATE | 0.84+ |
one-shot | QUANTITY | 0.84+ |
Oracl | ORGANIZATION | 0.8+ |
part one | QUANTITY | 0.74+ |
Economist | TITLE | 0.72+ |
five most valuable listed | QUANTITY | 0.71+ |
couple | QUANTITY | 0.68+ |
Part 1 | OTHER | 0.67+ |
McKinsey | PERSON | 0.67+ |
unique observations | QUANTITY | 0.62+ |
top | QUANTITY | 0.6+ |
Last | QUANTITY | 0.56+ |
decades | QUANTITY | 0.5+ |
Vijay Vijayasankar & Cortnie Abercrombie, IBM - IBM CDO Strategy Summit - #IBMCDO - #theCUBE
(lively music) >> Narrator: Live from Fisherman's Wharf in San Francisco, it's theCUBE. Covering IBM Chief Data Officer Strategy Summit Spring 2017. Brought to you by IBM. >> Hey, welcome back everybody. Jeff Frick here at theCUBE. It is lunchtime at the IBM CDO Summit. Packed house, you can see them back there getting their nutrition. But we're going to give you some mental nutrition. We're excited to be joined by a repeat performance of Cortnie Abercrombie. Coming on back with Vijay Vijayasankar. He's the GM Cognitive, IoT, and Analytics for IBM, welcome. >> Thanks for having me. >> So first off, did you eat before you came on? >> I did, thank you. >> I want to make sure you don't pass out or anything. (group laughing) Cortnie and I both managed to grab a quick bite. >> Excellent. So let's jump into it. Cognitive, lot of buzz, IoT, lot of buzz. How do they fit? Where do they mesh? Why is it, why are they so important to one another? >> Excellent question. >> IoT has been around for a long time even though we never called it IoT. My favorite example is smart meters that utility companies use. So these things have been here for more than a decade. And if you think about IoT, there are two aspects to it. There's the instrumentation by putting the sensors in and getting the data. And the insights aspect, where there's making sense of what the sensor is trying to tell us. Combining these two is where the value is for the client. Just by putting out the sensors, it doesn't make much sense. So, look at the world around us now, right? The traditional utility, I will stick with the utilities to complete the story. Utilities all get dissected from both sides.
On one hand you have your electric vehicles plugging into the grid to draw power. On the other hand, you have supply coming from solar roofs and so on. So optimizing this is where the cognitive and analytics kicks in. So that's the beauty of this world. All these things come together, that convergence is where the big value is. >> Right because the third element that you didn't have in your original one was what's going on, what should we do, and then actually doing something. >> Vijay: Exactly. >> You got to have the action to pull it all together. >> Yes, and learning as we go. The one thing that is available today with cognitive systems that we did not have in the past was this ability to learn as you go. So you don't need human intervention to keep changing the optimization algorithms. These things can learn by themselves and improve over time which is huge. >> But do you still need a person to help kind of figure out what you're optimizing for? That's where, can you have a pure, machine-driven algorithm without knowing exactly what are you optimizing for? >> We are nowhere close to that today. General AI, where the system is super smart by itself, is a far away concept. But there are lots of aspects of specific AI optimizing a given process that can still go into these unsupervised learning aspects. But it needs boundaries. The system can get smart within boundaries, the system cannot just replace human thought. Just augmenting our intelligence. >> Jeff: Cortnie, you're shaking your head over there. >> I'm completely in agreement. We are nowhere near, and my husband's actually looking forward to the robotic apocalypse by the way, so. (group laughing) >> He must be an Arnold Schwarzenegger fan. >> He's the opposite of me. I love people, he's like looking forward to that. He's like, the less people, the better. >> Jeff: He must have his Roomba, or whatever those little vacuum cleaner things are called. >> Yeah, no.
(group laughing) >> Peter: Tell him it's the fewer the people, the better. >> The fewer the people the better for him. He's a finance guy, he'd rather just sit with the money all day. What does that say about me? Anyway, (laughing) no, less with the gross. Yeah no, I think we're never going to really get to that point. Because we as people always have to be training these systems to think like us. So we're never going to have systems that are just autonomically out there without having an intervention here and there to learn the next steps. That's just how it works. >> I always thought the autonomous vehicle, just example, 'cause it's just so clean. You know, if somebody jumps in front of the car, does the car hit the person, or run into the ditch? >> Where today a person can't make that judgment very fast. They're just going to react. But in computer time, that's like forever. So you can actually make rules. And then people go bananas, well what if it's a grandma on one side and kids on the other? Which do you go? Or what if it's a criminal that just robbed a bank? Do you take him out on purpose? >> Trade off. >> So, you get into a lot of, interesting parameters that have nothing to do necessarily with the mechanics of making that decision. >> And this changes the fundamentals of computing big time too, right? Because a car cannot wait to ping the Cloud to find out, you know, should I brake, or should I just run over this person in front of me. So it needs to make that determination right away. And hopefully the right decision which is to brake. But on the other hand, all the cars that have this algorithm, together have collective learning, which needs some kind of Cloud computing. So this whole idea of Edge computing will come and replace a lot of what exists today. So see this disruption even behind the scenes on how we architect these systems, it's a fascinating time. >> And then how much of the compute, the store is at the Edge?
How much of the compute and the store is in the Cloud and then depending on the decision, how do you say it, can you do it locally or do you have to send it upstream or break it in pieces. >> I mean if you look at a car of the future, forget car of the future, car of the present like Tesla, that has more compute power than a small data center, it has multiple CPUs, lots of RAM, a lot of hard disk. It's a little Cloud that runs on wheels. >> Well it's a little data center that runs on wheels. But, let me ask you a question. And here's the question, we talk about systems that learn, cognitive systems that are constantly learning, and we're training them. How do we ensure that Watson, for example, is constantly operating in the interest of the customer, and not the interest of IBM? Now there's a reason I'm asking this question, because at some point in time, I can perceive some other company offering up a similar set of services. I can see those services competing for attention. As we move forward with increasingly complex decisions, with increasingly complex sources of information, what does that say about how these systems are going to interact with each other? >> He always with the loaded questions today. (group laughing) >> It's an excellent question, it's something that I worry about all the time as well. >> Something we worry about with our clients too. >> So, couple of approaches by which this will exist. And to begin with, while we have the big lead in cognitive computing now, there is no hesitation on my part to admit that the ecosystem around us is also fast developing and there will be hefty competition going forward, which is a good thing. 'Cause if you look at how this world is developing, it is developing as APIs. APIs will fight on their own merits. So it's a very pluggable architecture. If my API is not very good, then it will get replaced by somebody else's API. So that's one aspect.
The second aspect is, there is a difference between the provider and the client in terms of who owns the data. We strongly believe from IBM that the client owns the data. So we will not go in and do anything crazy with it. We won't even touch it. So we will provide a framework and a cartridge that is very industry specific. Like for example, if Watson has to act as a call center agent for a Telco, we will provide a set of instructions that are applicable to Telco. But, all the learning that Watson does is on top of that client's data. We are not going to take it from one Telco and put it in another Telco. That will stay very local to that Telco. And hopefully that is the way the rest of the industry develops too. That they don't take information from one and provide to another. Even on an anonymous basis, it's a really bad idea to take a client's data and then feed it elsewhere. It has all kinds of ethical and moral consequences, even if it's legal. >> Absolutely. >> And we would encourage clients to take a look at some of the others out there and make sure that that's the arrangement that they have. >> Absolutely, what a great job for an analyst firm, right? But I want to build upon this point, because I heard something very interesting in the keynote, the CDO of IBM, in the keynote this morning. >> He used a term that I've thought about, but never heard before, trust as a service. Are you guys familiar with his use of that term? >> Vijay: Yep. >> Okay, what does trust as a service mean, and how does it play out so that as a consumer of IBM cognitive services, I have a measurable difference in how I trust IBM's cognitive services versus somebody else? >> Some would call that Blockchain. In fact Blockchain has often been called trust as a service. >> Okay, and Blockchain is probably the most physical form of it that we can find at the moment, right? At the (mumbles) where it's open to everybody but then no one transaction can be tampered with by somebody else.
But if we extend that concept philosophically, it also includes a lot of the concept about identity. Identity. I as a user today don't have an easy way to identify myself across systems. Like, if I'm behind the firewall I have one identity, if I am outside the firewall I have another identity. But, if you look at the world tomorrow where I have to deal with a zillion APIs, this concept of a consistent identity needs to pass through all of them. It's a very complicated and difficult concept to implement. So that trust as a service, essentially, like Blockchain, there needs to be an identity service that follows me around that is not restricted to an IBM system, or an Oracle system or something. >> But at the end of the day, Blockchain's a mechanism. >> Yes. >> Trust as a service sounds like a-- >> It's transparency, is what it is, the more transparency, the more trust. >> It's a way of doing business. >> Yes. >> Sure. >> So is IBM going to be a leader in defining what that means? >> Well look, in all cases, IBM has, we have always strove, what's the right word? Striven, strove, whatever it is. >> Strove. >> Strove (laughing)? >> I'll take that anyway. >> Strove, thank you. To be a leader in how we approach everything ethically. I mean, this is truly in our blood, I mean, we are here for our clients. And we aren't trying to just get them to give us all of their data and then go off and use it anywhere. You have to pay attention sometimes, that what you're paying for is exactly what you're getting, because people will try to do those things, and you just need to have a partner that you trust in this. And, I know it's self-serving to say, but we think about data ethics, we think about these things when we talk to our clients, and that's one of the things that we try to bring to the table is that moral, ethical, should you.
Just because you can, and we have, just so you know walked away from deals that were very lucrative before, because we didn't feel it was the right thing to do. And we will always, I mean, I know it sounds self-serving, I don't know how to, you won't know until you deal with us, but pay attention, buyer beware. >> You're just Cortnie from IBM, we know what side you're on. (group laughing) It's not a mystery. >> Believe me, if I'm associated with it, it's yeah. >> But you know, it's a great point, because the other kind of ethical thing that comes up a lot with data, is do you have the ethical conversation before you collect that data, and how you're going to be using it. >> Exactly. >> But that's just today. You don't necessarily know what's going to, what and how that might be used tomorrow. >> Well, in other countries. >> That's what gets really tricky. >> Future-proofing is a very interesting concept. For example, vast majority of our analytics conversation today is around structure and security, those kinds of terms. But, where is the vast majority of data sitting today? It is in video and sound files, which okay. >> Cortnie: That's even more scary. >> It is significantly scary because the technology to get insights out of this is still developing. So all these things like cluster and identity and security and so on, and quantum computing for that matter. All these things need to think about the future. But some arbitrary form of data can come hit you and all these principles of ethics and legality and all should apply. It's a very non-trivial challenge. >> But I do see that some countries are starting to develop their own protections like the General Data Protection Regulation is going to be a huge driver of forced ethics. >> And some countries are not. >> And some countries are not. I mean, it's just like, cognitive is just like anything else. 
When the car was developed, I'm sure people said, hey everybody's going to go out killing people with their cars now, you know? But it's the same thing, you can use it as a mode of transportation, or you can do something evil with it. It really is going to be governed by the societal norms that you live in, as to how much you're going to get away with. And transparency is our friend, so the more transparent we can be, things like Blockchain, other enablers like that that allow you to see what's going on, and have multiple copies, the better. >> All right, well Cortnie, Vijay, great topics. And that's why gatherings like this are so important, to be with your peer group, you know, to talk about these much deeper issues that are really kind of tangential to technology but really central to the bigger picture. So, keep getting out on the fringe to help us figure this stuff out. >> I appreciate it, thanks for having us. >> Thanks. >> Pleasure. All right, I'm Jeff Frick with Peter Burris. We're at the Fisherman's Wharf in San Francisco at the IBM Chief Data Officer Strategy Summit 2017. Thanks for watching. (upbeat music) (dramatic music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Telco | ORGANIZATION | 0.99+ |
Jeff Frick | PERSON | 0.99+ |
Peter Burris | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Jeff | PERSON | 0.99+ |
Vijay Vijayasankar | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
General Data Protection Regulation | TITLE | 0.99+ |
Cortnie | PERSON | 0.99+ |
second aspect | QUANTITY | 0.99+ |
Vijay | PERSON | 0.99+ |
Peter | PERSON | 0.99+ |
Tesla | ORGANIZATION | 0.99+ |
Cortnie Abercrombie | PERSON | 0.99+ |
tomorrow | DATE | 0.99+ |
Vijay Vijayasanker | PERSON | 0.99+ |
both sides | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
two aspects | QUANTITY | 0.99+ |
third element | QUANTITY | 0.99+ |
one aspect | QUANTITY | 0.98+ |
Spring 2017 | DATE | 0.98+ |
San Francisco | LOCATION | 0.98+ |
two | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
Arnold Schwarzenegger | PERSON | 0.97+ |
one | QUANTITY | 0.97+ |
first | QUANTITY | 0.97+ |
Over 31 million people | QUANTITY | 0.96+ |
more than a decade | QUANTITY | 0.95+ |
IBM Chief Data Officer | EVENT | 0.95+ |
this morning | DATE | 0.94+ |
Watson | ORGANIZATION | 0.91+ |
one thing | QUANTITY | 0.9+ |
Strategy Summit 2017 | EVENT | 0.9+ |
IBM CDO Summit | EVENT | 0.89+ |
Fisherman's Wharf | LOCATION | 0.88+ |
IOT | ORGANIZATION | 0.88+ |
Fisherman's Wharf | TITLE | 0.88+ |
#IBMCDO | ORGANIZATION | 0.87+ |
couple | QUANTITY | 0.86+ |
theCUBE | TITLE | 0.83+ |
one hand | QUANTITY | 0.82+ |
Chief Data Officer | EVENT | 0.8+ |
IBM CDO Strategy Summit | EVENT | 0.8+ |
theCUBE | ORGANIZATION | 0.77+ |
Strategy Summit | EVENT | 0.74+ |
one side | QUANTITY | 0.73+ |
Cognitive | ORGANIZATION | 0.7+ |
zillion APIs | QUANTITY | 0.65+ |
Zoomba | ORGANIZATION | 0.61+ |
IMB | ORGANIZATION | 0.6+ |
GM Cognitive | ORGANIZATION | 0.6+ |
Analytics | ORGANIZATION | 0.54+ |
#theCUBE | ORGANIZATION | 0.46+ |