Arun Krishnamoorthy, Dell Technologies & Mihir Maniar, Dell Technologies | Dell Tech World '22
>> theCUBE presents Dell Technologies World, brought to you by Dell. >> Hey everyone. Welcome back to theCUBE's live coverage of Dell Technologies World 2022 from the Venetian in Las Vegas. Lisa Martin here with Dave Vellante. Dave, this is our second day. Lots of conversations. We've been talking a lot about APEX, multi-cloud, edge, resilience, cyber resilience. >> It is the number one topic actually. I mean, a lot of multi-cloud talk obviously, too. But I think security is the hot topic at the event. >> It is a hot topic and we've got two guests joining us from Dell Technologies. We're going to unpack that and talk about some of the great new things they are enabling. Please welcome one of our alumni, Mihir Maniar, vice president at Dell Technologies, and Arun Krishnamoorthy, global strategy, resiliency and security at Dell Technologies. All right guys, welcome to the program. >> Pleasure meeting you, Lisa and Dave. >> So ransomware, it's a household term. I'm pretty sure my mom even knows what ransomware is. >> Exactly. >> Legitimately. >> Yeah. >> But I mean, if you look at the numbers, a ransomware attack is happening once every 11 seconds. The numbers, the stats say, you know, an estimated 75% of organizations are going to face an attack, 75%, by 2025, it's around the corner. So it's no longer a matter of are we going to get hit; if we get hit, it's when. And that resiliency and that recovery is absolutely critical. Talk about some of the things there, Dell's comprehensive approach to helping organizations really build resiliency. >> That's a great point. So if you go to see, organizations are going to get hit, if not already, 75% already out there. And then we find that through research, a lot of our customers need a lot of help. They need help because security is really complex. I mean, they have a tough job, right? Because there's so many attacks happening at the same time. One single ransomware incident can cost them on average 13 million dollars. They have to integrate 50 plus different security vendors to go and build a secure defense in depth kind of mechanism. They're liable to the board. At the same time, they have lines of business that are talking about, hey, can you provide me security, but make sure productivity doesn't get impacted. So it's a tough role for them. And that's where Dell services comes in, with our Dell Managed Security Services. We have a full comprehensive suite of offers for our customers to help them, right, to remain secure. And we've focused the services on the NIST framework, so I can talk more about the NIST framework as we go about doing that. >> There's a lot of talk in the community about, should I pay the ransom? Should they not pay the ransom? And I suppose your advice would be, well, pay up front and avoid the ransom if you can. Right? >> Absolutely. >> Yeah. Yeah, Dave, what we've seen is the ransomware payment has been very unreliable. We know of many, many examples where either they paid the ransom and they were not able to recover data, or they got the decryption keys and the recovery process was too slow. So we are all about helping customers understand the risks that they have today and giving them some pragmatic technology solutions. >> Talk about that conversation. Where is it happening, Arun, at the customer level, as security is a board level conversation? >> Right. >> Are you still talking with the CIOs and lines of business?
Who all is involved in really understanding where all these vulnerabilities are within an organization? >> Yeah, so that's a great question. So we work with CIOs, we work with CSOs a lot more, and the CSOs actually are facing the skills shortage problem. >> Yes. >> That's where they actually need help from vendors like Dell. And talking about ransomware, if you go to see the NIST framework, it goes all the way from identification of threats to prevention, creating prevention measures with different defense in depth, how do you detect and respond to threats in time, because time is critical actually, and then recovering from threats. So in that whole process, it's better for customers to have the full suite of security services installed, so that they don't end up paying the ransom eventually, right, to provide their whole defense mechanism. >> So the adversary is very, they're motivated, they're well funded, incredibly sophisticated these days. Okay. So how do you not lose, if you're a customer? What's the playbook that you're helping your customers proceed with? >> Yeah, it's a great question. So in the NIST framework, as I mentioned before, services are evolving around, how do you identify the threats that exist in the customer's network? So we provide advisory services and we provide assessment of the customer's vulnerabilities that exist, so we can detect those vulnerabilities. And then we can build the prevention mechanisms once you detect those vulnerabilities. This is all about, what you cannot see, you can't really defend against. So that's where the whole assessment comes in, where you can go and do a zero trust assessment for the customer's, you know, entire infrastructure, and then figure out where those issues lie. So we can go and block those loopholes with the prevention mechanisms. And in the prevention mechanisms, actually we have a whole zero trust prevention mechanism. So you can actually go and build out end to end defense in depth kind of security. >> Arun, before the pandemic, the term zero trust, people would roll their eyes. It was kind of a buzzword, and it's becoming sort of a mandate. >> Yeah. >> What does zero trust mean to your customers? How are you helping them achieve it? >> Yeah. So, great question, Dave. A lot of customers think zero trust is a product. It's not. It's a framework. It's a mindset. It helps customers think through what kind of access do I want to give my users, my third party, my customers? Where does my data sit in my environment? Have I configured the right network policies? Have I segmented my network? So it is a collection of different strategies that work across cloud, across data, across network, across applications that interact with each other, and what we are helping customers with is understanding what zero trust actually means and how they can translate it into actionable technology implementations. >> How do you help customers do that? When we know that, I mean, the average customer has what, seven different backup protection solutions alone, if we're talking about like data protection. How do you help them understand what's in their environment now? If they're talking about protecting applications, users, data, network, what's that conversation? And what's that process like to simplify their protection so that they really can achieve cyber resilience? >> That's correct. That's a great question, Lisa. One of the big issues we see with customers is they don't know what they don't know. There's data across multi-cloud, which is great.
It enables productivity, but it also is not within the four walls of a data center. So one of the first things we do is identify where the customer's data is, where their applications live. And then we look for blind spots. Are you protecting your SaaS workloads? Are you protecting your endpoints? And we give them a holistic strategy on data protection. And you bring up a great point, a lot of customers have had accidental growth over the years. They started off with one tool and then different business needs drove them to different tools. And maybe now is a good time to evaluate what is your tool set? Can we consolidate it and reduce the risk in the environment? >> Yeah, I dunno if you guys are probably familiar with that, I use it a lot when I write, it's an Optiv chart, it's this eye test, and it says, here's the security landscape, the taxonomy. It's got to be the most complicated of any in the business. And so my question is ecosystem, right. You've got to have partners, right. But there's so many choices. How are you helping to solve that problem of consolidating choices and tools? >> That's a great point. So if you look at the zero trust framework, which Lisa, you talked about. In the zero trust framework, we have a few things we look at, and that is through Dell's technologies and partner technologies. So we can provide things like secure access, context based, right. So which users can access which applications, identity based. The second one is, which applications can talk to which applications, for micro segmentation, again identity based. And then you have encryption everywhere. Encryption with data in motion, data at rest. Because encryption is super important to prevent hacks. So, and then you have cloud workloads. We have cloud workload protection. So some of those things we rely on our partners, and some of them actually we have technologies in house, like Arun talked about the cyber resilience and the vault that we have in house. So we provide the end-to-end framework for our customers for zero trust, where we can go and identify, we can assess, we can go build it out for them, we can detect and respond with our excellent MDR service that we came out with just last year. So that MDR service allows you to detect attacks and respond automatically using our AI enabled platform that reduces the signal from the noise and allows us to prevent these attacks, right, from happening. >> Arun, question for you. As we've seen the proliferation of cyber attacks during the pandemic, we've seen the sophistication increasing, the personalization is increasing. Ransomware as a service is making it so there is no barrier to entry these days. >> Right. >> How has Dell Technologies' overall cyber resilience strategy evolved in the last couple of years? I imagine that there's been some silver linings and some accelerations there. >> No, absolutely, Lisa. One of the things we recognized very early on, with big cyber attacks going on five years ago, we knew that as much as customers had great technologies to prevent a cyber attack, it was a matter of when, not if. So we created the first purpose built solution to help customers respond and recover from a cyber attack. We created innovative technologies to isolate the data in a cyber vault. We have immutable technologies that lock the data, so they can't be tampered with. And we also build some great intelligence based on AI/ML.
In fact, this is the first and only product in the world that looks at its backup data, does full content indexing, and it's able to look for behaviors or patterns in your environment that you could normally not find with signature based detection systems. So it's very revolutionary, and we want to help customers not only on the prevention side, which is proactive. We want them to equally have a sound strategy on how they would respond and recover from a cyber attack. >> Okay. So there's two pieces there, proactive, and then if and when you get hit, how do you react. And I think about moments in cyber, I mean, Stuxnet was obviously a huge turning point. And then of course the SolarWinds, and you see that, the supply chain hacks, you see the island hopping and the living off the land and the stealth moves. So it's almost like, wow, some of these techniques, even being proactive, you're not going to catch them. Right. So you've got to have this, you talked about the NIST framework, multilevel, but I mean, customers are aware, obviously every customer you talk to knows about SolarWinds, but it seems like they're still sleeping with one eye open. Like they're really nervous. Right. >> Right. >> And like, we haven't figured it out as an industry yet. And so that's where solutions like this are so critical, because you're almost resigning yourself to the fact that, well, you may not find it being proactive. >> Yeah. Right. >> But you've got to have, you know, it's like putting tapes in a truck and driving them somewhere. Do you sense that it was a major milestone in the industry? Milestone, negative milestone. And that was a turning point, and it was kind of a wake up call for the industry, a new wake up call. What's your sense of how the industry is responding? >> Yeah. I think that's a great point. So if you go to see, the verbiage is that it's not if you're going to get attacked, it's when you're going to get attacked. So the attacks are going to happen no matter what. So that's the reason why the defense in depth and the zero trust framework comes into play. The customers have to have an end-to-end holistic framework, so that they can have not just the defensive mechanisms, but also detect and respond when the attacks happen. And then as you mentioned, some of them, you just can't catch all of them. So we have excellent incident response and recovery mechanisms. So if the attack happened, it will cause damage. We can do forensics analysis. And on top of that, we can go and recover, like with the cyber recovery vault, we can recover that data, make it production again. >> Right. Ready. >> I guess. I'm sorry. What I was trying to ask is, do you think we understand SolarWinds? Has the industry figured it out? >> Yeah. You know, great question. Right. I think this is where customers have to take a pragmatic approach on how they do security. And we talk about concepts like intrinsic security. So in other words, you can't just do a certain activity in your environment and punt the ball to some other team to figure out security. Part of what Dell does, you know, you asked the question, right, there's a lot of tools, where do customers start? One of the big values we bring to customers is the initial awareness and just educating customers. Hey, what happened in these watershed moments with these different attacks, right, WannaCry, Stuxnet, and how did those customers respond and where did they fail? So let's do some lessons learned with past attacks and let's move forward with some pragmatic solutions.
And we usually don't overwhelm our customers with a lot of tools. Let's have a road map. Let's do an incremental build of your security posture. And over time, let's get your entire organization to play with it. >> You talk about awareness, obviously that's critical, but one of the other things that's critical with the cyber threats and what's going on today is, the biggest threat vector still is people. >> Exactly. >> So talk to me about some of the things that you help organizations do when you're talking about it from an awareness perspective. It's training the people not to open certain links if they look suspicious, that sort of thing. How involved is Dell Technologies with your customers, from a strategic perspective, about really drilling this into the end users, that they've got a lot of responsibility here? >> Yeah, if you go to see, phishing is one of the most common attack vectors to go and infiltrate these attacks. So Dell has a whole employee education program that they rolled out. So we all are aware of the fact that clicking on links and phishing is a risk factor. And we are trying to take that same message to our customers through an employee awareness training service. So we can actually provide education for the employees to keep these phishing attacks from happening. >> Yeah. That's really critical, because as I mentioned, we talked about the sophistication, but the personalization, the social engineering is off the charts these days. And it's so easy for someone to, especially with all these distractions that we have going on. >> Right. >> If you're working from home and you've got kids at home or dogs barking and whatnot, it's easy to be fooled into something that looks incredibly legitimate. >> Yeah, yeah. >> You know, you bring up another great point, right. You can keep telling people in your environment, don't do things, don't do it. You create a friction, right. We want people to be productive. We want them to use different access to different applications, both in house and in the cloud. So this is where technology comes into play. There are some modern malware defenses that will help customers identify some of these email phishing, spear phishing, so they are in a better prepared position. And we don't want to curb productivity, but we want to also make a very secure environment where people can work. >> That's a great point, that it has to be frictionless. >> I do have a question for you guys with respect to SaaS applications. I talk to a lot of customers using certain SaaS applications who have this sort of, there's a dual responsibility model there, where the SaaS vendor's responsible for the application protection, but Mr. and Miss Customer, you're responsible for the data. We are? >> Yeah. >> Are you finding that a lot of organizations are going, help, we've got Google Workspace, Microsoft 365, Salesforce, and it's really incredibly business critical data. Dell Technologies, help us protect this, because this is a vulnerability that we were not aware of. >> Absolutely. And that's why we have the backup service with APEX, where we can actually have SaaS data which is backed up using our APEX solution for backup recovery. So, yes, that's very critical. We have the end-to-end portfolio for backing it up, having the vault, which is an air gap solution, recovering from it when you have an attack. And I think the value prop that Dell brings to the table is, we have the client side and we have the data center side, right, with the multi-cloud.
So we provide a completely hardened infrastructure, all the way from supply chain to secure OS, secure boot and secure image. Everything is kind of hardened, with STIG hardening on top of that. And then we have the services layer to go and make sure we can assess the risks, we can detect and respond, we can recover, right, so that we can keep our customers completely secure. That's the value prop that we bring to the table with unmatched scale of Dell services, right, in terms of the scale that we bring to the table to our customers and help them out. >> Well, it's an interesting opportunity, and it's certainly, from a threats perspective, one that's going to persist, obviously we know that. Great that there's been such a focus from Dell on cyber resiliency for its customers, whether we're talking about multi-cloud, on-prem, public cloud, SaaS applications, it's critical. It's a technology, it's a solution that every industry has to take advantage of. Guys, thank you so much for joining us. Wish we had more time. I could talk about this all day. >> Yes. >> Thank you. >> Great work going on there. Congratulations on what was going on with APEX and the announcement. And I'm sure we'll be hearing more from you in the future. >> Excellent. Thank you, Lisa. >> Thank you very much. >> We are super excited about Dell services and what we can bring for managed security services for our customers. >> Great. >> Excellent. >> Appreciate it. >> Thanks, guys. >> Thank you. >> For our guests and for Dave Vellante, I'm Lisa Martin, and you're watching theCUBE, live from day two of our coverage of Dell Technologies World, live from Las Vegas. Dave and I will be right back with our last guest of the day. (upbeat music)
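As an aside, here is a minimal Python sketch of the identity-based, default-deny micro-segmentation that Mihir describes above (which applications can talk to which applications, identity based). It is only a hypothetical illustration of the general zero trust idea, not Dell's actual implementation; the workload names, ports, and rule structure are invented for the example.

```python
# Hypothetical illustration of identity-based micro-segmentation:
# every application-to-application flow is denied unless an explicit
# allow rule exists for that (source identity, destination identity, port).

from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    source: str       # workload identity, e.g. "web-frontend"
    destination: str  # workload identity, e.g. "payments-api"
    port: int


# Explicitly declared flows; everything else is blocked (zero trust default-deny).
ALLOWED_FLOWS = {
    AllowRule("web-frontend", "payments-api", 443),
    AllowRule("payments-api", "orders-db", 5432),
}


def is_flow_allowed(source: str, destination: str, port: int) -> bool:
    """Return True only if the flow matches an explicit allow rule."""
    return AllowRule(source, destination, port) in ALLOWED_FLOWS


if __name__ == "__main__":
    print(is_flow_allowed("web-frontend", "payments-api", 443))  # True: declared flow
    print(is_flow_allowed("web-frontend", "orders-db", 5432))    # False: undeclared, denied
```

In practice the same default-deny principle is enforced by network and identity tooling rather than application code, but the shape of the policy is the same.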
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Lisa Martin | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
Mihir Maniar | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
Arun Krishnamoorthy | PERSON | 0.99+ |
Dell | ORGANIZATION | 0.99+ |
Lisa | PERSON | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
two pieces | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
13 million dollars | QUANTITY | 0.99+ |
75% | QUANTITY | 0.99+ |
2025 | DATE | 0.99+ |
second day | QUANTITY | 0.99+ |
two guests | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
Dell Technologies | ORGANIZATION | 0.99+ |
One | QUANTITY | 0.99+ |
one tool | QUANTITY | 0.99+ |
first purpose | QUANTITY | 0.99+ |
five years ago | DATE | 0.98+ |
second one | QUANTITY | 0.98+ |
ORGANIZATION | 0.98+ | |
one | QUANTITY | 0.98+ |
apex | TITLE | 0.98+ |
SolarWinds | ORGANIZATION | 0.97+ |
today | DATE | 0.96+ |
zero trust | QUANTITY | 0.96+ |
pandemic | EVENT | 0.96+ |
both | QUANTITY | 0.94+ |
last couple of years | DATE | 0.93+ |
Arun | PERSON | 0.93+ |
Venetian | LOCATION | 0.93+ |
day two | QUANTITY | 0.91+ |
NIST | ORGANIZATION | 0.91+ |
zero | QUANTITY | 0.87+ |
zero trust | QUANTITY | 0.87+ |
once every 11 seconds | QUANTITY | 0.82+ |
one eye | QUANTITY | 0.79+ |
Salesforce | ORGANIZATION | 0.79+ |
50 plus different security vendors | QUANTITY | 0.78+ |
One single ransomware incident | QUANTITY | 0.77+ |
Microsoft 365 | ORGANIZATION | 0.74+ |
2022 | DATE | 0.73+ |
seven different backup protection solutions | QUANTITY | 0.72+ |
NSS | ORGANIZATION | 0.7+ |
Adam Leftik, Lacework & Arun Sankaran, LendingTree | AWS Startup Showcase
>> Welcome to today's session of theCUBE's presentation of the AWS Startup Showcase, The Next Big Thing in AI, Security and Life Sciences. Today featuring Lacework for the security track. I'm your host Natalie Erlich. Thank you for joining us. And we will discuss today how LendingTree automates AWS security for DevOps teams and stays compliant with Lacework. Now we're joined by Adam Leftik, the VP of Product at Lacework, as well as Arun Sankaran, CISO of LendingTree. Thank you both very much for joining us today. >> Thank you for having us. >> Well, wonderful. Adam, let's start with you. Lacework positions itself as, "cloud security at the speed of cloud innovation." What does that mean to you and how are you helping your customers? >> Great question, Natalie. I think one of the things that's really important to understand about Lacework really comes back to essentially what's happening at cloud speed, which is customers are aggressively moving more and more of their applications to the cloud, but they're doing so with the same number of resources to secure that environment. And as the cloud continues to grow, both in terms of complexity as well as overall ability to unlock new styles of applications that were never before even possible without this new technology landscape, fundamentally, Lacework is designed to enable those builders to go faster without worrying about all the different intricacies and threats that they face out there on the internet. And so the core mission of Lacework is really about enabling builders to build those applications and leverage those cloud resources and new cloud technologies to move quicker and quicker. >> Natalie: Fascinating. >> Yeah, thanks. If you go back to the sort of foundation of the company, there we took a very different approach to how we think about security. Often, you know, security approaches in the past have been a rules driven model where you try and think of all the different vectors that attacks can come at. And fundamentally, you end up writing a series of these rules that are impossible to maintain, they atrophy over time, and you can't possibly think ahead of all these nefarious actors. So one of the things that Lacework did from the very beginning was take a very different approach, which is leveraging security as a data problem. And the way we do this is through what we refer to as our Polygraph. And the Polygraph essentially looks at all the exhaust telemetry that we're able to ingest, both from your cloud accounts as well as the underlying infrastructure. And we take that and we build a baseline and a behavioral model for how the application should behave when it's normal. And this baseline represents the state of normalcy. And so then we leverage modern data science techniques to essentially build a model that can identify potential threats without requiring our users to build rules and ultimately play catch up to all the different threats that they face. And this is a really, really powerful capability because it allows our customers both to identify misconfigurations and remediate them, monitor all the activity to reduce the overall overhead on their security organization, and of course help them build faster and identify threats as they come into the system. And we differentiate in lots of different ways as well. So one of the things we're looking to do as part of the overall cloud transformation is really meet the DevOps teams and the security teams where they are.
And so all of the information that Lacework captures, synthesizes, and produces through our automation ultimately feeds into the different channels that our users are really leveraging at scale today, whether that's through their ChatOps windows or ultimately into their CICD pipeline, so that we give broad coverage both at build time as well as run time and give them full visibility and insights and the ability to remediate those quickly. You know, one of the other things that we're really proud of, and this is core to our product philosophy, is building more and more partnerships with our customers, and LendingTree is really at the forefront of that partnership and we're super excited to be partnering with them. And that's certainly something that we've done to differentiate our product offering, and I'd love to hear from Arun, how have you been working with Lacework and how has that been going so far? >> Yeah, thank you, Adam. You know, frankly I think that's a huge differentiator for us. There's a lot of players that can solve technology problems, but what we've really appreciated is that as a smaller shop and a smaller organization, the level of connectedness that we feel with the development teams at Lacework. We raise an opportunity. You know, this can make things more efficient for us, or this can reduce our time to triage, or this visualization or this UI could be modified to support certain security operations center use cases, maybe that's not what it's designed for. And we've enjoyed just a lot of success in kind of shaping the product in order to meet all the different use cases. And as Adam mentioned, you know, as a CISO, my primary responsibility is security, but frankly there's a lot of DevOps and tech use cases within the Polygraph visualization tool, and understanding our environment and troubleshooting has frankly saved us quite a bit of time, and we're looking forward to the partnership to continue to grow out the tool. As we, as a company, scale in today's world, it's very important that we're able to scale our capability 2-3X without a corresponding 2-3X in staff and resources. I think this is the kind of tool that's going to help us get there.
Our environment, within a 24 hour period, might generate 300, 400 million events, and that's process level data from hosts and network data access. It's just a very noisy amount of alerts. With the Lacework platform, those 300, 400 million get reduced to about a hundred alerts a day that we see, and of those, five are critical, and those tend to all be very actionable. So from an alert fatigue perspective, we really rely on this to give us actionable data, actionable alerts that teams can really focus on, and reduces that noise. So I would say that's probably the number one way that our detection process has changed, and frankly, a lot of it is what Adam mentioned as far as the underlying self-learning, self-tuning engine. There's not a whole lot of active rules that we had to create or configuration that we had to do. It's kind of a learning system, and I think it's really, probably, I would estimate maybe 50-60% reduction in triage and response time for alerts as well. >> And Adam, now going to you, while 2020 was a really rough year for a lot of people, a lot of businesses, Lacework realized 300% revenue growth. So now that the economy is bouncing back and seemingly so in full force, what are your expectations for Lacework in the next year?
And so we have a great capability with what we call vulnerability discovery, which enables our customers to understand where they're vulnerable and not simply tell them how many vulnerabilities they have, but help them isolate, leveraging all the run time and build time context we have, so that they can really prioritize what's important to them and what represents the highest risk. And then of course, lastly, you know, where the company really got started is in helping customers protect their cloud workloads. And we do this by identifying threats; we're able to leverage our machine learning and data clustering algorithms so that once we have those baseline behaviors identified and modeled, we can leverage all of our threat intelligence to identify anomalies in that system and help customers really identify those risks as they're coming into the system and deal with those in a really timely manner. So those are kind of the overall key capabilities that really help teams scale and drive their overall cloud security programs. >> And Arun, really quickly from your perspective, what is a key feature that is really beneficial to LendingTree? >> It's kind of what Adam mentioned with the self-tuning capability, the reduction of alerts and data based on behavioral-based detection versus rule-based. A lot of people have, you have fancy words, they call AI and machine learning, this and that, but I've rarely seen it work effectively. I think this is a situation where it does work really effectively and does free up time and resources on our side that we can apply to other problems we're trying to solve, so I think that's the number one. >> Okay, terrific. Well, I'm really curious, Adam. Got to ask you this question. I mean, we saw a really big software IPO last year. What do you think is in store for Lacework? >> Yeah, well, you know, the IPO is just a point in time as opposed to, it's part of the journey. Lacework's continuing to invest and really focus on fundamentally changing the security landscape. One of the reasons why I joined Lacework and continue to be really excited about the opportunity comes back to the fundamental challenge that all security tools have. We do not want to create a platform that drives wet blanket behavior, but really fundamentally enables teams like Arun's to move faster and enable the builders to build the applications that fundamentally drive great business outcomes for our customers. And so that's what gets me out of bed. And I think everyone at Lacework is really focused on helping drive great outcomes for our customers. >> Fascinating to hear how Lacework is securing the cloud around the world. Lovely to have you on the show. Adam Leftik, the VP of Product at Lacework, as well as Arun Sankaran, the CISO of LendingTree. I'm your host for the AWS Startup Showcase here on theCUBE. Thank you very much for watching.
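As an aside, here is a minimal Python sketch of the behavioral-baseline idea Adam and Arun describe above, where hundreds of millions of raw events collapse into a handful of actionable alerts. It is only a hypothetical illustration of the general approach (learn what normal looks like, then flag deviations), not Lacework's actual Polygraph implementation; the event fields, thresholds, and sample data are invented for the example.

```python
# Hypothetical sketch of behavioral-baseline anomaly detection:
# learn the set of (process, destination) behaviors seen during a baseline
# window, then surface only the events that deviate from that learned behavior.

from collections import Counter


def build_baseline(events, min_count=3):
    """Treat behaviors seen at least `min_count` times as 'normal'."""
    counts = Counter((e["process"], e["dest"]) for e in events)
    return {behavior for behavior, n in counts.items() if n >= min_count}


def detect_anomalies(events, baseline):
    """Return only the events whose behavior was never part of the baseline."""
    return [e for e in events if (e["process"], e["dest"]) not in baseline]


if __name__ == "__main__":
    history = [{"process": "nginx", "dest": "10.0.0.5:443"}] * 5 \
            + [{"process": "cron", "dest": "10.0.0.9:22"}] * 4
    baseline = build_baseline(history)

    today = history + [{"process": "nginx", "dest": "198.51.100.7:4444"}]  # unusual egress
    print(detect_anomalies(today, baseline))  # only the single anomalous event surfaces
```

Real systems cluster far richer telemetry and score deviations rather than using a hard count threshold, but the shape of the idea, baseline first and alert on deviation, is the same.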
Arun Varadarajan, Cognizant | Informatica World 2019
>> Live from Las Vegas, it's theCUBE. Covering Informatica World 2019. Brought to you by Informatica. >> Welcome back everyone to theCUBE's live coverage of Informatica World 2019 here in Sin City. I'm your host Rebecca Knight. We're here with Arun Varadarajan. He is the vice president of AI and analytics at Cognizant. Thank you so much for coming on theCUBE, Arun. >> Wonderful, it's also great to meet you folks at theCUBE. >> You are a Cube alum. >> I am a Cube alum. This is probably the third or fourth time that I'm on theCUBE. >> Excellent. Well, for those viewers who have not seen your previous clips, tell us a little bit about your role at Cognizant. >> My role at Cognizant is focused on two primary things. One is to really get our customers ready for AI and to truly compete in the digital world. The second big focus for me is to get them there. To me it's all about the data. So many times we don't realize this, but if you look at a lot of the FANG players, the digital natives who are born digital, they have really leveraged machine learning and AI to disrupt the marketplace. They do it with data. It's all about the data. So the big push that I'm working on these days is to help our clients create this new modern data platform that can truly help them leverage AI and disrupt the market where possible. >> So tell us what you've-- So we know that this journey is incredibly complex and there's a lot of layers, a lot of questions, hard questions that companies are wrestling with. >> Yes. >> Give us the lay of the land. What do you see as sort of the big dominant forces happening in AI and ML? >> I think the first thing is that companies are still trying to figure out where to apply AI and ML. I think that is where they need to start, because if the initiative is not designed and purposed around a specific business area, business focus, or business outcome, it becomes an engineering project that really doesn't see the light of day. If you remember back in the days when Hadoop was big, Hadoop was almost like a solution trying to find a problem, whichever way it is. So I think, as opposed to taking a technology view, which has been the traditional approach that most CIO organizations have used, in AI even more so there needs to be significant participation from the business to decide where the opportunities are to drive business value. So I've always told my clients that the place to start is asking where can I apply AI and machine learning, because at the end of the day it is just a technique, right, and the technique has to be focused on delivering true business outcomes and business value. So that is where I think our clients need to start. If you go back in time and remember the ERP days, when people were implementing SAP and Oracle there was a very strong focus on process optimization and process excellence. How do I get a straight-through process organization? Really create that process orchestration layer that could execute with excellence. I think that needs to be brought back today, but in a different light, and the light is: now let me view my value chain not just from a process orchestration standpoint, but ask where the opportunities are for me to leverage machine learning and AI to create very different outcomes within that process layer. And I think-- Sorry.
>> I definitely want to go back to that but I also want to remember that we are here at Informatica World and I want to make sure I ask you how you at Cognizant work with Informatica. >> Informatica is a strategic partner of ours and as I was saying, while you start with that outcome in mind and really say these are the areas I want to drive business outcomes it's very important you understand how data plays a role in delivering those outcomes. So that's where Informatica and our partnership really comes to fruition. You know that Informatica has been working very strong in the areas of metadata management, data governance, security. All of these are essential part of you knowing your data and knowing where your data's coming from, where is it going, who is using it, how is it being consumed, in what form and shape should it be delivered so that we can deliver business value is a key aspect of really leveraging AI and machine learning. In AI and machine learning the one thing that we have to be cognizant of, pun intended, is the fact that when you're going to get the machine to start making decisions for you, the quality of your data has to be significantly higher than just a report that is inaccurate, right. Report inaccuracy, yes you're going to get shouted at by the consumer of the report but that's the only problem you face but with AI and machine learning coming into play if your data is not truly representative of the decision area that the machine is working on then you're going to have a very bad outcome. >> This is a deep and philosophical issue because if the data is shoddy or biased there is a lot of problems that companies can get into. So where do you even start? How do you even work with a company to make sure that their data is the right data, is pure? What do you think? >> Interesting you ask that question. We've come up with this notion that even data has got IQ. We call it data IQ in Cognizant and it's a mathematical measure that we have come up with which allows us to score a data's ability to perform in a given area or function. So it could be in the area of sales effectiveness. Look we have a large retail company that is really trying to figure out how can they improve their store level information so that they can execute more sales orders with their customers. Their assumption is that they're working with a data set that can help them drive that outcome. How do they know that? Well there's one way to find out, which is for you to experiment, test, and learn and test and learn but that's an arduous process. Which is why a lot of the data science work that is happening today is, I would say, probably seventy to 80% of the data science effort goes waste because there are experiments that fail. This was-- >> But is that a waste? So it failed, but you tried and you maybe had some learnings from it, right? >> So a lot of people keep saying that failure is a great teacher of-- >> That's the Silicon Valley mantra right now. >> Well you can be smart about where you fail. >> True. >> Right. >> Good point. >> If there are opportunities for you to prevent that failure why wouldn't you? >> Okay. All right. >> That's what we're looking at. So what I'm saying is that before you go into doing any data science experiment, what if I came back and told you that the data that you're working on is not going to be sufficient for you to deliver that outcome. Would it not be interesting? 
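Cognizant's Data IQ measure is proprietary and its formula is not spelled out in the conversation, so the snippet below should be read only as a rough illustration of the idea Arun describes: scoring a dataset's fitness for a specific outcome before any modeling starts. The field names, weights, and the sales-effectiveness framing are all assumptions invented for the example.

```python
def data_fitness_score(rows, required_features, label_field):
    """Illustrative 0-100 'fitness' score for an outcome area: are the needed
    features present, how complete are they, and is the label usable?
    This is NOT the Data IQ formula, just the general idea described above."""
    if not rows:
        return 0.0
    present = [f for f in required_features if any(f in r for r in rows)]
    coverage = len(present) / len(required_features)
    if present:
        filled = sum(1 for r in rows for f in present if r.get(f) is not None)
        completeness = filled / (len(rows) * len(present))
    else:
        completeness = 0.0
    labels = {r.get(label_field) for r in rows if r.get(label_field) is not None}
    label_ok = 1.0 if len(labels) > 1 else 0.0
    return round(100 * (0.4 * coverage + 0.4 * completeness + 0.2 * label_ok), 1)

# Hypothetical store-level data for a sales-effectiveness use case.
stores = [
    {"store_id": 1, "foot_traffic": 120, "promo_active": 1, "units_sold": 40},
    {"store_id": 2, "foot_traffic": None, "promo_active": 0, "units_sold": 35},
    {"store_id": 3, "foot_traffic": 95, "promo_active": 1, "units_sold": 28},
]
needed = ["store_id", "foot_traffic", "promo_active", "weather"]
print("Fitness for this outcome area:", data_fitness_score(stores, needed, "units_sold"), "/100")
```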
>> Exactly, so it's making sure that you at least are maximizing your chance of success by having the right data to begin with. It is a failure for failure's sake if you're not even starting with the right data. >> Absolutely and you know the other thing that people don't realize is is if you go and ask-- If you just do it, I'm going back to my industrial engineering days, if you go and do a simple time and motion study of data science, data scientists, I can guarantee you that 80% or 90% of their time is spent on just prepping the data and only less than 10% or 15% on truly driving business value. So my question is you're spending big dollars on data science experiments where eighty to 90% of the time the data scientists are prepping. Looking at the data, is it the right skew, has it the right features, do I need to do some feature engineering, do you denormalize it? There are a whole bunch of data prep work that they do. My question is, what if we take that pain away from them? That's what I call as data science freedom and this is what we are promoting to our clients saying what can you do with your data so that your data is ready for the data science folks? Today it's data science folks, tomorrow it's going to be hopefully machine learning algorithms that can self model because a lot of people are talking about auto ML which is the new buzz-word, which is AI doing AI and that's an area that we're heavily invested in. Where you really want to make sure that the data going in is of the veracity and the complexity and the texture required for that outcome area. So that's where I think things like data IQ as a concept would really help our clients to know that hey the data I'm working with has got the intrinsic intelligence in that outcome area for me to drive that particular business outcome that I'm working on. That's where I think the magic lies. >> That's where they'll see the value. >> That's where they'll see the value. >> So talk a little about the AI journey because that is, it's all intertwined but so many companies are coming to you, to Cognizant and saying we know we need to do more of this, we want to make it real, how do we get there? So what do you say? What's your advice? >> So, I think I mentioned this right up-front when we started the conversation. It all has to start with purpose. Without purpose no AI project really succeeds. You'll end up creating a few bots. In fact when I look out there in the world and look at the kind of work that is happening in machine Learning and AI, many of the so-called AI projects, if you double click on them, are just bots. So we are doing some level of maybe process automation, we're trying to reduce labor content, bringing in bots, but are we truly driving change? I'm not saying that that's not a change. There is definitely a change but it's more of an incremental change. It is not the kind of disruptive change that some of the FANG leaders that are showing right. If you take Facebook, Amazon, the whole gamut of digital natives, they are truly disrupting the market place. Some of them are even able to do a million predictions a second to match demand, supply, and price. Now that is how they are using it. Now the question I think for our clients, for our enterprise clients is to say that's a great goal to have but where do I start and how do I start? It starts with, in my opinion, two or three big notions. 
One is, honestly ask yourself, how much of a change are you willing to make, because if you have to compete and really leverage AI and machine learning the way it has been designed to do so you have to be willing to press the reset button. You have to be willing to destroy what you have today and there is, I think Bill Baker back in the days, he was a SQL server guy. He was talking about this whole concept of what is known as scale up and scale out and he was talking about it from the angle of managing a pet versus managing cattle. So when you're managing a pet, a pet is a very unique component like your mail server So Bob the mail server, if the mail server goes down then all hell breaks loose and hopefully you have another alternate to Bob to manage the mail server. So it's more like a scale up model where you are looking at, hey how do I manage high availability as opposed to today's world where you have the opportunity to really look at things in a far more expansive manner. So if you have to do that you can't be saying I have this on-prem data warehouse right, which is running on X Y Z, and I want to take that on-prem data warehouse and move it to the Cloud and expect magic to happen, because all you're doing is you're shifting your mess from your data center to somebody else's data center which is called the Cloud. >> Right. >> Right? So I think the big thing for clients to really understand is how much are they invested in this change. How are they willing to drive this change? I'll tell you it's not about the technology. There are so many technology options today and we have got some really smart engineers who know how to engineer things. The question is, what are you doing this for? Are you willing, if you want to compete in that paradigm, are you willing to let go of what you have tody? That is a big question. That I would start with. >> An important question but I want to sneak in one more question and that is about the skills gap because this is something that we hear so much about. So many companies facing a, there is a dearth of qualified candidates who can do these jobs in data science and AI and ML. What are you seeing at Cognizant and what are you doing to remedy the problem? >> So I think it's definitely a challenge for the industry at large and what we are starting to see is two things emerging. One is the new workforce coming into the market is better equipped because of the way the school systems have changed in the last few years and I would say this is a global phenomenon not just in North America or in Europe or in China or India. It's a global phenomenon. We're starting to see that undergrad students who come out of school today are better equipped to learn the new capabilities. That's number one. Which is very heartening for us right, in the whole talent space. What I've always believed in, and this is my personal view on this, what I've always believed in is that these skills will come into fashion and go out of fashion in months and days. It's about the kind of engineering approach you have that stays constant, right. If you look at any of the new technologies today, they all are based on some core standard principles. Yes the semantics will change, the structure will change, but some of the engineering principles remain the same. So what we've been doing in Cognizant is really investing in our engineering talent. 
So we call it data engineering and to us data engineering means that if you're a data engineer you can't tell me I will only work with A, B, or C technology. You should be in a position to work with all of these technologies and you should be in a position to approach it from an engineering mindset as opposed to a skill or a tool based mindset and that's the change that we need with fads coming in and out of Vogue. I think it's super important for all consultants in this space to be grounded on some core engineering principles. That's what we are investing in very heavily. >> Well it sounds like a sound investment. Well thank you so much for coming on the show Arun. I appreciate it. >> Thank you so much. It was a pleasure. >> I'm Rebecca Knight for theCUBE. You are watching theCUBE at Informatica World 2019. Stay tuned. (lighthearted music)
Arun Murthy, Hortonworks | theCUBE NYC 2018
>> Live from New York, it's theCUBE, covering theCUBE New York City 2018, brought to you by SiliconANGLE Media and its ecosystem partners. >> Okay, welcome back everyone, here live in New York City for CubeNYC, formerly Big Data NYC, now called CubeNYC. The topic has moved beyond big data. It's about cloud, it's about data, it's also about potentially blockchain in the future. I'm John Furrier with Dave Vellante. We're happy to have a special guest here, Arun Murthy. He's the cofounder and chief product officer of Hortonworks, been in the ecosystem from the beginning, at Yahoo, already been on the Cube many times, but great to see you, thanks for coming in, >> My pleasure, >> appreciate it. >> thanks for having me. >> Super smart to have you on here, because a lot of people have been squinting through the noise of the marketplace. You guys have been on this DataPlane idea for a few years now. You guys actually launched Hadoop with Cloudera; they were first, you came after, Yahoo became second, two big players. You evolved it quickly; you guys saw early on that this is bigger than Hadoop. And now all the conversations are about what you guys were talking about three years ago. Give us the update, what's the product update? How is hybrid a big part of that, what's the story? >> We started off being the Hadoop company, and Rob, our CEO who was here on theCUBE a couple of hours ago, calls it sort of phase one of the company, where we were a Hadoop company. We very quickly realized we had to help enterprises manage the entire life cycle of data, all the way from the edge to the data center, to the cloud, and in between, right. Which is why we did the acquisition of Onyara, which we've been talking about, and which kind of became the basis of our Hortonworks DataFlow product. And then as we went through that phase of the journey it was quickly obvious to us that enterprises had to manage data and applications in a hybrid manner, right, which is both on-prem and public cloud and increasingly the edge, which is really where we spend a lot of time these days, with IoT and everything from autonomous cars to video monitoring to all these aspects coming in. Which is why we wanted to get to the DataPlane architecture; it allows us to get you to a consistent security and governance model. There's a lot of, I'll call it, fighting about the Cloud being insecure and so on; I don't think there's anything inherently insecure about the Cloud. The issue that we see is a lack of skills. Our enterprises know how to manage the data on-prem; they know how to do LDAP, groups, and Kerberos, and AAD, and what have you. They just don't have the skill sets yet to be able to do it on the public cloud, which leads to mistakes occasionally. >> Um-hm. >> And data breaches and so on. So we recognized really early that part of DataPlane was to get that consistent security and governance model, so you don't have to worry about how you set up IAM roles on Amazon versus LDAP on-prem versus something else on Google. >> It's operating consistency. >> It's operating consistency, exactly. I've talked about this in the past. So getting DataPlane out was that journey, and this week at Charlotte work week, what we announced was that we wanted to take that a step further; we've been able to kind of allow enterprises to manage this hybrid architecture on-prem and across multiple public clouds. >> And the Edge. >> In a connected manner. The issue we saw early on, and it's something we've been working on for a long while,
is how we connect the two architectures. Hadoop, when it started, was more of an on-premise architecture, right, and I was there in 2005, 2006 when it started. Hadoop was built back when, on the world wide web side, we had a gigabit of Ethernet up to the rack. From the rack on up we had only eight gigs, so if you have a 2,000-node cluster you're dealing with eight gigs of connectivity. >> Bottleneck. >> Huge bottleneck. Fast forward to today: you have at least ten if not one hundred gigabits, moving from one hundred gigabits toward a terabit, from that standpoint. And then what's happening is, with everything in that world, we've had the opportunity to rethink the assumptions we had in Hadoop. And the good news is that when the Cloud came along, the Cloud already had decoupled storage and compute architectures. As we've helped customers navigate the two worlds with DataPlane, it's been a journey that's been reasonably successful, and I think we have an opportunity to provide identical, consistent architectures both on-prem and in the Cloud. So it's almost like we took Hadoop and adapted it to the Cloud; I think we can adapt the Cloud architecture back on-prem too, to have consistent architectures. >> So talk about the Cloud-native architecture. So you have a post that just got published, "Cloud-Native Architecture for Big Data in the Data Center." That's hybrid; explain the hybrid model, how do you define that? >> Like I said, for us it's really important to be able to have consistent architectures, consistent security, consistent governance, a consistent way to manage data, and a consistent way to actually develop and port applications. So portability for data is important, which is why having security and governance consistently is key. And then portability for the applications themselves is important, which is why we are so excited to be kind of first to embrace the whole containerize-the-ecosystem effort. We've announced the Open Hybrid Architecture Initiative, which is about decoupling storage and compute and then leveraging containers for all the big data apps, for the entire ecosystem. And this is where we are really excited to be working with both IBM and Red Hat, especially Red Hat given their investments in Kubernetes and OpenShift. We see that, much like you'll have S3 and EC2, S3 for storage and EC2 for compute, and the same thing with ADLS and Azure compute, you'll actually have the next-gen HDFS and Kubernetes. >> So is this a massive architectural rewrite, or is it more sort of management around the core? >> Great question. So part of it is evolution of the architecture. Whether it's Spark or Kafka or any of these open source projects, we need to do some evolution in the architecture to make them work in the ecosystem, in the containerized world. So we are containerizing every one of the 28, 30 animals in the zoo, right. That's a lot of work; we are kind of, you know, sort of doing it, and we've done it in the past. And to your point, it's not enough to just have the architecture; you need a consistent fabric to be able to manage and operate it, which is really where DataPlane comes in again. That was really the point of DataPlane all along. This is a multi-year roadmap; you know, when we sit down we are thinking about what we'll do in '22 and '23. But we really have to execute on a multi-year roadmap.
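A quick back-of-the-envelope calculation makes the bandwidth point above concrete. The figures below are rounded and purely illustrative, but they show why co-locating compute with data was essential at roughly eight shared gigabits per rack, and why decoupled storage and compute becomes reasonable once east-west bandwidth reaches a hundred gigabits and beyond.

```python
# Time to pull 1 TB over the uplinks mentioned above (ignoring protocol overhead).
TB_BITS = 1e12 * 8

for label, gbps in [("2006-era rack uplink (~8 Gb/s shared)", 8),
                    ("modern fabric (~100 Gb/s)", 100)]:
    seconds = TB_BITS / (gbps * 1e9)
    print(f"{label}: ~{seconds / 60:.1f} min per TB")

# With roughly 100x more network bandwidth, reading data from a decoupled
# store (S3, ADLS, or a remote HDFS tier) is no longer the obvious bottleneck
# it was when Hadoop's data-locality design was chosen.
```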
>> And Data plane was a lynch pin. >> Well it was just like the sharp edge of the sword. Right, it was the tip of the sphere, but really the idea was always that we have to get data plan in to kind of get that hybrid product out there. And then we can sort of get to a inter generational data plan which would work with the next generation of the big data ecosystem itself. >> Do you see Kubernetes and things like Kubernetes, you've got STO a few service meshes up the stack, >> Absolutely are going to play a pretty instrumental role around orchestrating work loads and providing new stateless and stateful application with data, so now data you've got more data being generated there. So this is a new dynamic, it sounds like that's a fit for what you guys are doing. >> Which is something we've seen for awhile now. Like containers are something we've tracked for a long time and really excited to see Docker and RedHat. All the work that they are doing with Redhat containers. Get the security and so on. It's the maturing of that ecosystem. And now, the ability to port, build and port applications. And the really cool part for me is that, we will definitely see Kubenetes and open shift, and prem but even if you look at the Cloud the really nice part is that each of the Cloud providers themselves, provide a Kubenesos. Whether it's GKE on Google or Fargate on Amazon or AKS on Microsoft, we will be able to take identical architectures and leverage them. When we containerize high mark aft or spark we will be able to do this with kubernetes on spark with open shift and there will be open shift on leg which is available in the public cloud but also GKE and Fargate and AKS. >> What's interesting about the Redhat relationship is that I think you guys are smart to do this, is by partnering with Redhat you can, customers can run their workloads, analytical workloads, in the same production environment that Redhat is in. But with kind of differentiation if you will. >> Exactly with data plane. >> Data plane is just a wonderful thing there. So again good move there. Now around the ecosystem. Who else are you partnering with? what else do you see out there? who is in your world that is important? >> You know again our friends at IBM, that we've had a long relationship with them. We are doing a lot of work with IBM to integrate, data plane and also ICPD, which is the IBM Cloud plane for data, which brings along all of the IBM ecosystem. Whether it's DBT or IGC information governance catalogs, all that kind of were back in this world. What we also believe this will give a flip to is the whole continued standardization of security and governance. So you guys remember the old dpi, it caused a bit of a flutter, a few years ago. (anxious laughing) >> We know how that turned out. >> What we did was we kind of said, old DPI was based on the old distributions, now it's DPI's turn to be more about merit and governance. So we are collaborating with IBM on DPI more on merit and governance, because again we see that as being very critical in this sort of multi-Cloud, on prem edge world. >> Well the narrative, was always why do you need it, but it's clear that these three companies have succeeded dramatically, when you look at the financials, there has been statements made about IBM's contribution to seven figure deals to you guys. We had Redhat on and you guys are birds of a feather. [Murhty] Exactly. >> It certainly worked for you three, which presumably means it confers value to your customers. 
>> Which is really important, right from a customer standpoint, what is something we really focus on is that the benefit of the bargain is that now they understand that some of their key vendor partners that's us and Ibm and Redhat, we have a shared roadmap so now they can be much more sure about the fact that they can go to containers and kubernetes and so on and so on. Because all of the tools that they depend on are and all the partners they depend on are working together. >> So they can place bets. >> So they can place bets, and the important thing is that they can place longer term bets. Not a quarter bet, we hear about customers talking about building the next gen data centers, with kubernetes in mind. >> They have too. >> They have too, right and it's more than just building machines up, because what happens is with this world we talked about things like networking the way you do networking in this world with kubernetes, is different than you do before. So now they have to place longer term bets and they can do this now with the guarantee that the three of us will work together to deliver on the architecture. >> Well Arun, great to have you on the Cube, great to see you, final question for you, as you guys have a good long plan which is very cool. Short term customers are realizing, the set-up phase is over, okay now they're in usage mode. So the data has got to deliver value, so there is a real pressure for ROI, we would give people a little bit of a pass earlier on because set-up everything, set-up the data legs, do all this stuff, get it all operationalized, but now, with the AI and the machine learning front and center that's a signal that people want to start putting this to work. What have you seen customers gravitate to from the product side? Where are they going, is it the streaming is it the Kafka, is it the, what products are they gravitating to? >> Yeah definitely, I look at these in my role, in terms of use cases, right, we are certainly seeing a continued push towards the real-time analytics space. Which is why we place a longer-term bet on HDF and Kafka and so on. What's been really heartening kind of back to your sentiment, is we are seeing a lot of push right now on security garments. That's why we introduced for GDPR, we introduced a bunch of cable readies and data plane, with DSS and James Cornelius wrote about this earlier in the year, we are seeing customers really push us for key aspects like GDPR. This is a reflection for me of the fact of the maturing of the ecosystem, it means that it's no longer something on the side that you play with, it's something that's more, the whole ecosystem is now more a system of record instead of a system of augmentation, so that is really heartening but also brings a sharper focus and more sort of responsibility on our shoulders. >> Awesome, well congratulations, you guys have stock prices at a 52-week high. Congratulations. >> Those things take care of themselves. >> Good products, and stock prices take care of themselves. >> Okay the Cube coverage here in New York City, I'm John Vellante, stay with us for more live coverage all things data happening here in New York City. We will be right back after this short break. (digital beat)
SUMMARY :
brought to you by SiliconAngle Media at Yahoo, already been on the Cube many times, And now, all the conversations on what you guys a couple of hours ago, he calls it sort of the phase one so you don't have to worry about how you set up IMRL's on was we wanted to take that step further we've been able In a connected manner, the issue we saw early on on the assumptions we have in Hadoop. So talk about the Cloud native architecture. it more sort of management around the core. evolution in the architecture, to make them work in idea was always that we have to get data plan in to for what you guys are doing. And the really cool part for me is that, we will definitely What's interesting about the Redhat relationship is that Now around the ecosystem. So you guys remember the old dpi, it caused a bit of a So we are collaborating with IBM on DPI more on merit and Well the narrative, was always why do you need it, but It certainly worked for you three, which presumably be much more sure about the fact that they can go to building the next gen data centers, with kubernetes in mind. So now they have to place longer term bets and they So the data has got to deliver value, so there is a on the side that you play with, it's something that's Awesome, well congratulations, you guys have stock Okay the Cube coverage here in New York City,
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Dave Vellante | PERSON | 0.99+ |
Arun Murthy | PERSON | 0.99+ |
Rob | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
2005 | DATE | 0.99+ |
John Vellante | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
Redhat | ORGANIZATION | 0.99+ |
Yahoo | ORGANIZATION | 0.99+ |
30 animals | QUANTITY | 0.99+ |
SiliconAngle Media | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
AKS | ORGANIZATION | 0.99+ |
New York City | LOCATION | 0.99+ |
second | QUANTITY | 0.99+ |
52-week | QUANTITY | 0.99+ |
James Cornelius | PERSON | 0.99+ |
ORGANIZATION | 0.99+ | |
Microsoft | ORGANIZATION | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
New York | LOCATION | 0.99+ |
three | QUANTITY | 0.99+ |
YARN | ORGANIZATION | 0.99+ |
28 animals | QUANTITY | 0.99+ |
one hundred | QUANTITY | 0.99+ |
Fargate | ORGANIZATION | 0.99+ |
two worlds | QUANTITY | 0.99+ |
GDPR | TITLE | 0.99+ |
2006 | DATE | 0.99+ |
Arun | PERSON | 0.99+ |
three companies | QUANTITY | 0.99+ |
one hundred gigabits | QUANTITY | 0.99+ |
eight gigs | QUANTITY | 0.99+ |
this week | DATE | 0.99+ |
two big players | QUANTITY | 0.99+ |
Hadoop | TITLE | 0.98+ |
first | QUANTITY | 0.98+ |
Spark | TITLE | 0.98+ |
GKE | ORGANIZATION | 0.98+ |
Kafka | TITLE | 0.98+ |
both | QUANTITY | 0.98+ |
Kubernetes | TITLE | 0.98+ |
each | QUANTITY | 0.97+ |
today | DATE | 0.97+ |
NYC | LOCATION | 0.97+ |
three years ago | DATE | 0.97+ |
Cloud | TITLE | 0.97+ |
Charlotte | LOCATION | 0.96+ |
seven figure | QUANTITY | 0.96+ |
DSS | ORGANIZATION | 0.96+ |
EC2 | TITLE | 0.95+ |
S3 | TITLE | 0.95+ |
Cube | COMMERCIAL_ITEM | 0.94+ |
Cube | ORGANIZATION | 0.92+ |
Murhty | PERSON | 0.88+ |
2000 | QUANTITY | 0.88+ |
few years ago | DATE | 0.87+ |
couple of hours ago | DATE | 0.87+ |
Ecosystem | ORGANIZATION | 0.86+ |
Ibm | PERSON | 0.85+ |
Arun Murthy, Hortonworks | DataWorks Summit 2018
>> Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018, brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my cohost, Jim Kobielus. We're joined by Aaron Murthy, Arun Murthy, sorry. He is the co-founder and chief product officer of Hortonworks. Thank you so much for returning to theCUBE. It's great to have you on. >> Yeah, likewise. It's been a fun time getting back, yeah. >> So you were on the main stage this morning in the keynote, and you were describing the journey, the data journey that so many customers are on right now, and you were talking about the cloud, saying that the cloud is part of the strategy but it really needs to fit into the overall business strategy. Can you describe a little bit about your approach to that? >> Absolutely, and the way we look at this is we help customers leverage data to actually deliver better capabilities, better services, better experiences to their customers, and that's the business we are in. Now, with that, obviously we look at cloud as a really key part of the overall strategy in terms of how you want to manage data on-prem and in the cloud. We kind of joke that we ourselves live in a world of real-time data. We just live in it and data is everywhere. You might have trucks on the road, you might have drones, you might have sensors, and you have it all over the world. At this point, we've kind of got to a place where enterprises understand that they'll manage all the infrastructure, but in a lot of cases it will make a lot more sense to actually lease some of it, and that's the cloud. It's the same way, if you're delivering packages, you don't go buy planes and lay out roads; you go to FedEx and actually let them handle that view. That's kind of what the cloud is. So that is why we really fundamentally believe that we have to help customers leverage infrastructure wherever it makes sense pragmatically, both from an architectural standpoint and from a financial standpoint, and that's kind of why we talked about how your cloud strategy is part of your data strategy, which is actually fundamentally part of your business strategy. >> So how are you helping customers to leverage this? What is on their minds and what's your response? >> Yeah, it's really interesting. Like I said, cloud is cloud, and infrastructure management is certainly something that's at the forefront, at the top of mind for every CIO today. And what we've consistently heard is they need a way to manage all this data and all this infrastructure in a hybrid, multi-tenant, multi-cloud fashion. Because in some geos you might not have your favorite cloud vendor. You know, parts of Asia are a great example; you might have to use one of the Chinese clouds. You go to parts of Europe, especially with things like the GDPR, the data residency laws and so on; you have to be very, very cognizant of where your data gets stored and where your infrastructure is present. And that is why we fundamentally believe it's really important to give enterprises a fabric with which they can manage all of this, and hide the details of all of the underlying infrastructure from them as much as possible. >> And that's DataPlane Services. >> And that's DataPlane Services, exactly. >> The Hortonworks DataPlane Services we launched in October of last year. Actually, I was on theCUBE talking about it back then too.
We see a lot of interest, a lot of excitement around it, because now they understand that, again, this doesn't mean that we drive it down to the least common denominator. It is about helping enterprises leverage the key differentiators of each of the cloud vendors' products. For example, Google, with whom we announced a partnership: they are really strong on AI and ML. So if you are running TensorFlow and you want to deal with things like Kubernetes, GKE is a great place to do it. And, for example, you can now go to Google Cloud and get TPUs, which work great for TensorFlow. Similarly, a lot of customers run on Amazon for a bunch of the operational stuff, Redshift as an example. So in the world we live in, we want to help the CIO leverage the best piece of each cloud, but then give them a consistent way to manage and govern that data. We were joking on stage that IT has just about learned how to deal with Kerberos and Hadoop, and now we're telling them, "Oh, go figure out IAM on Google," which is also IAM on Amazon, but they are completely different. The only thing that's consistent is the name. So I think we have a unique opportunity, especially with the open source technologies like Atlas, Ranger, Knox and so on, to be able to draw a consistent fabric over this and secure and govern it, and help the enterprise leverage the best parts of the cloud to put a best-fit architecture together, which also happens to be a best-of-breed architecture. >> So the fabric is everything you're describing: all the Apache open source projects, in which Hortonworks is a primary committer and contributor, able to share schemas and policies and metadata and so forth across this distributed, heterogeneous fabric of public and private cloud segments within a distributed environment. >> Exactly. >> That's increasingly being containerized in terms of the applications for deployment to edge nodes. Containerization is a big theme in HDP 3.0, which you announced at this show. >> Yeah. >> So, if you could give us a quick sense for how that containerization capability plays into more of an edge focus for what your customers are doing. >> Exactly, great point. And again, the core parts of the fabric are obviously the open source projects, but we've also done a lot of net new innovation with DataPlane which, by the way, is also open source. It's a new product and a new platform that you can actually leverage, to lay it out over the open source ones you're familiar with. And again, like you said, containerization is what is actually driving the fundamentals of this. The details matter; at the scale at which we operate, we're talking about thousands of nodes, terabytes of data, and the details really matter, because a 5% improvement at that scale leads to millions of dollars in optimization for capex and opex. So all of that, the details, are being fueled and driven by the community, which is kind of what we deliver with HDP 3. And the key ones, like you said, are containerization, because now we can actually get complete agility in terms of how you deploy the applications. You get isolation not only at the resource management level with containers, but you also get it at the software level, which means if two data scientists wanted to use a different version of Python or Scala or Spark or whatever it is, they get that consistently and holistically, so that now they can actually go from the test/dev cycle into production in a completely consistent manner.
So that's why containers are so big, because now we can actually leverage them across the stack, and you have things like MiNiFi showing up. We can actually-- >> Define MiNiFi before you go further. What is MiNiFi for our listeners? >> Great question. Yeah, so we've always had NiFi-- >> Real-time. >> Real-time data flow management, and NiFi was still sort of within the data center. What MiNiFi does is it's actually now a really, really small layer, a small thin library if you will, that you can throw on a phone, a doorbell, a sensor, and that gives you all the capabilities of NiFi but at the edge. >> Mmm. >> Right? And it's actually not just data flow; what is really cool about NiFi is it's actually command and control. So you can actually do bidirectional command and control, so you can actually change, in real time, the flows you want, the processing you do, and so on. So what we're trying to do with MiNiFi is actually not just collect data from the edge but also push the processing as much as possible to the edge, because we really do believe a lot more processing is going to happen at the edge, especially with the ASICs and so on coming out. There will be custom hardware that you can throw out there and essentially leverage that hardware at the edge to actually do this processing. And we believe, you know, we want to do that even at the cost of the data not actually landing at rest, because at the end of the day we're in the insights business, not in the data storage business. >> Well, I want to get back to that. You were talking about innovation and how so much of it is driven by the open source community, and you're a veteran of the big data open source community. How do we maintain that? How does that continue to be the fuel? >> Yeah, and a lot of it starts with just being consistent. From day one, and James was around back then, in 2011 when we started, we've always said, "We're going to be open source," because we fundamentally believed that the community is going to out-innovate any one vendor regardless of how much money they have in the bank. So we really do believe that's the best way to innovate, mostly because there is a sense of shared ownership of that product. It's not just one vendor throwing some code out there trying to shove it down the customer's throat. And we've seen this over and over again, right. Three years ago, a lot of the DataPlane stuff we talk about, which comes from Atlas and Ranger and so on, none of these existed. These actually came from the fruits of the collaboration with the community, with actually some very large enterprises being a part of it. So it's a great example of how we continue to drive it, because we fundamentally believe that that's the best way to innovate, and we continue to believe so. >> Right. And the community, the Apache community as a whole, has so many different projects; for example, in streaming there is Kafka, >> Okay. >> and there are others that address a core set of common requirements but in different ways, >> Exactly. >> supporting different approaches, for example, doing streaming with stateless transactions and so forth, or stateless semantics and so forth. Seems to me that Hortonworks is shifting towards being more of a streaming-oriented vendor, away from data at rest. Though I should say HDP 3.0 has got great scalability and storage efficiency capabilities baked in.
I wonder if you could just break it down a little bit: what are the innovations or enhancements in HDP 3.0 for those of your core customers, which is most of them, who are managing massive multi-terabyte, multi-petabyte, distributed, federated big data lakes? What's in HDP 3.0 for them? >> Oh, lots. Again, like I said, we obviously spend a lot of time on the streaming side because that's where we see it; we live in a real-time world. But again, we don't do it at the cost of our core business, which continues to be HDP. And as you can see, the community continues to drive it; we talked about containerization, a massive step up for the Hadoop community. We've also added support for GPUs. Again, if you think about true at-scale machine learning. >> Graphics processing units, >> Graphical-- >> AI, deep learning. >> Yeah, it's huge. Deep learning, TensorFlow and so on really, really need custom hardware, sort of a GPU, if you will. So that's coming; that's in HDP 3. We've added a whole bunch of scalability improvements with HDFS. We've added federation, because now you can go over a billion files, a billion objects, in HDFS. We also added capabilities for-- >> But you indicated yesterday when we were talking that very few of your customers need that capacity yet, but you think they will, so-- >> Oh, for sure. Again, part of this is, as we enable more sources of data in real time, that's the fuel which drives it, and that was always the strategy behind the HDF product. It was about, can we leverage the synergies between the real-time world, feed that into what you do today in your classic enterprise with data at rest, and that is what is driving the necessity for scale. >> Yes. >> Right. We've done that. We've also spent a lot of work, again, on lowering the total cost of ownership, the TCO, so we added erasure coding. >> What is that exactly? >> Yeah, so erasure coding is a classic storage concept. You know, HDFS has always used three replicas, for redundancy, fault tolerance, and recovery. Now, it sounds okay having three replicas because it's cheap disk, right. But when you start to think about our customers running 70, 80 petabytes of data, those three replicas add up, because you've now gone from 80 petabytes of effective data to actually a quarter of an exabyte in terms of raw storage. So now what we can do with erasure coding is, instead of storing the three blocks, we actually store parity. We store the encoding of it, which means we can actually go down from three to, like, two, one and a half, whatever we want to do. So if we can get from three blocks to one and a half, especially for your core data, >> Yeah
>> The only thing that's same from five years ago is the name (laughing) >> So again, the community has done a phenomenal job, kind of, really taking sort of a, we used to call it like a sequel engine on HDFS. From there, to drive it with 3.0, it's now like, with Hive 3 which is part of HDP3 it's a full fledged database. It's got full asset support. In fact, the asset support is so good that writing asset tables is at least as fast as writing non-asset tables now. And you can do that not only on-- >> Transactional database. >> Exactly. Now not only can you do it on prem, you can do it on S3. So you can actually drive the transactions through Hive on S3. We've done a lot of work to actually, you were there yesterday when we were talking about some of the performance work we've done with LAP and so on to actually give consistent performance both on-prem and the cloud and this is a lot of effort simply because the performance characteristics you get from the storage layer with HDFS versus S3 are significantly different. So now we have been able to bridge those with things with LAP. We've done a lot of work and sort of enhanced the security model around it, governance and security. So now you get things like account level, masking, row-level filtering, all the standard stuff that you would expect and more from an Enprise air house. We talked to a lot of our customers, they're doing, literally tens of thousands of views because they don't have the capabilities that exist in Hive now. >> Mmm-hmm 6 And I'm sitting here kind of being amazed that for an open source set of tools to have the best security and governance at this point is pretty amazing coming from where we started off. >> And it's absolutely essential for GDPR compliance and compliance HIPA and every other mandate and sensitivity that requires you to protect personally identifiable information, so very important. So in many ways HortonWorks has one of the premier big data catalogs for all manner of compliance requirements that your customers are chasing. >> Yeah, and James, you wrote about it in the contex6t of data storage studio which we introduced >> Yes. >> You know, things like consent management, having--- >> A consent portal >> A consent portal >> In which the customer can indicate the degree to which >> Exactly. >> they require controls over their management of their PII possibly to be forgotten and so forth. >> Yeah, it's going to be forgotten, it's consent even for analytics. Within the context of GDPR, you have to allow the customer to opt out of analytics, them being part of an analytic itself, right. >> Yeah. >> So things like those are now something we enable to the enhanced security models that are done in Ranger. So now, it's sort of the really cool part of what we've done now with GDPR is that we can get all these capabilities on existing data an existing applications by just adding a security policy, not rewriting It's a massive, massive, massive deal which I cannot tell you how much customers are excited about because they now understand. They were sort of freaking out that I have to go to 30, 40, 50 thousand enterprise apps6 and change them to take advantage, to actually provide consent, and try to be forgotten. The fact that you can do that now by changing a security policy with Ranger is huge for them. >> Arun, thank you so much for coming on theCUBE. It's always so much fun talking to you. >> Likewise. Thank you so much. >> I learned something every time I listen to you. >> Indeed, indeed. 
I'm Rebecca Knight for James Kobeilus, we will have more from theCUBE's live coverage of DataWorks just after this. (Techno music)
Arun Garg, NetApp | Cisco Live 2018
>> Live from Orlando, Florida it's theCUBE covering Cisco Live 2018. Brought to you by Cisco, NetApp and theCUBE's ecosystem partners. >> Hey, welcome back everyone. This is theCUBE's coverage here in Orlando, Florida at Cisco Live 2018. Our first year here at Cisco Live. We were in Barcelona this past year. Again, Cisco transforming to a next generation set of networking capabilities while maintaining all the existing networks and all the security. I'm John Furrier your host with Stu Miniman my co-host for the next three days. Our next guest is Arun Garg. Welcome to theCUBE. You are the Director of Product Management Converged Infrastructure Group at NetApp. >> Correct, thank you very much for having me on your show and it's a pleasure to meet with you. >> One of the things that we've been covering a lot lately is the NetApp's really rise in the cloud. I mean NetApp's been doing a lot of work on the cloud. I mean I've wrote stories back when Tom Georges was the CEO when Amazon just came on the scene. NetApp has been really into the cloud and from the customer's standpoint but now with storage and elastic resources and server lists, the customers are now startin' to be mindful. >> Absolutely. >> Of how to maximize the scale and with All Flash kind of a perfect storm. What are you guys up to? What's your core thing that you guys are talking about here at Cisco Live? >> So absolutely, thank you. So George Kurian, our CEO at NetApp, is very much in taking us to the next generation and the cloud. Within that I take care of some of the expansion plans we have on FlexPod with Cisco and in that we have got two new things that we are announcing right now. One is the FlexPod for Healthcare which is in FlexPod we've been doing horizontal application so far which are like the data bases, tier one database, as well as applications from Microsoft and virtual desktops. Now we are going vertical. Within the vertical our application, the first one we're looking in the vertical is healthcare. And so it's FlexPod for Healthcare. That's the first piece that we are addressing. >> What's the big thing with update on FlexPod? Obviously FlexPod's been very successful. What's the modernization aspect of it because Cisco's CEO was onstage today talking about Cisco's value proposition, about the old ways now transitioning to a new network architecture in the modern era. What's the update on FlexPod? Take a minute to explain what are the cool, new things going on with FlexPod. >> Correct, so the All Flash FAS, which is the underlying technology, which is driving the FlexPod, has really picked up over the last year as customers keep wanting to improve their infrastructure with better latencies and better performance the All Flash FAS has driven even the FlexPod into the next generation. So that's the place where we are seeing double-digit growth over the last five quarters consistently in FlexPod. So that's a very important development for us. We've also done more of the standard CVDs that we do on SAP and a few other are coming out. So those are all out there. Now we are going to make sure that all these assets can be consumed by the vertical industry in healthcare. And there's another solution we'll talk about, the managed private cloud on FlexPod. >> Yeah, Arun, I'd love to talk about the private cloud. So I think back to when Cisco launched UCS it was the storage partners that really helped drive that modernization for virtualization. NetApp with FlexPod, very successful over the years doing that. 
As we know, virtualization isn't enough to really be a private cloud. All the things that Chuck Robbins is talking about onstage, how do I modernize, how do I get, you know, automation in there? So help us connect the dots as to how we got from, you know, a good virtualized platform to this, I think you said managed private cloud, FlexPod and Cisco. >> Absolutely. So everybody likes to consume a cloud. It's easy to consume a cloud. You go and you click on I need a VM, small, medium, large, and I just want to see a dashboard with how my VMs are doing. But in reality it's more difficult to just build your own cloud. There's complexity associated with it. You need a service platform where you can raise a ticket, then you need an orchestration platform where you can set up the infrastructure, then you need a monitoring platform which will show you all of the ways your infrastructure's working. You need a capacity planning tool. There are tens of tools that need to be integrated. So what we have done is we have partnered with some of the premium partners and some DSIs who have already built this. So the risk of a customer using their private cloud infrastructure is minimized, and therefore these partners also have a managed service. So when you combine the fact that you have a private cloud infrastructure in the software domain as well as a managed service, and you put it on the on-prem FlexPods that are already sold, then the customer benefits from having the best of both worlds, a cloud-like experience on their own premises. And that is what we are delivering with this FlexPod managed private cloud solution. >> Talk about the relationship with Cisco. So we're here at Cisco Live, you guys have a good relationship with Cisco. What should customers understand about the relationship? What are the top bullet points and value opportunities, and what does it mean in terms of impact for the customer? >> So all these solutions we work very closely with the Cisco business unit, and we jointly develop these solutions. So within that, what we do is there's the BU-to-BU interaction where the solution is developed and defined. There is a marketing-to-marketing interaction where the collateral gets created and reviewed by both parties. So you will not put a FlexPod brand on it unless the two companies agree. >> So it's tightly integrated. >> It's tightly integrated. The sales teams are aligned, the marketing, the communications team, the channel partner team. That's the whole value that the end customer gets, because when a partner goes to a high-end enterprise customer he knows that both Cisco and NetApp teams can be brought to the table for the customer to showcase the value as well as help them through it all. >> Yeah, one of the other areas that's been talked about at this show is modernization. You talk about things like microservices. >> Yes. >> Containers are pretty important. How does that story of containerization fit into FlexPod? >> Absolutely. So containerization helps you get workloads, the cloud-native workloads, or the Type 2 workloads as Gartner calls them. So, our Mode 2. What we do is we work with the Cisco teams, and we already had a CVD design for a hybrid cloud with the Cisco CloudCenter platform, which is the CliQr acquisition. And we showed a design with that. What we are now bringing to the table is the ability for our customers to benefit from a managed service on top of it. So that's the piece where we are working with the cloud teams. 
With the Cisco team the ACI fabric is very important to them. So that ACI fabric is visible and shown in our designs whether you do SAP, you do Oracle, you do VDI and you do basic infrastructure or you do the managed private cloud or FlexPod on Healthcare. All of these have the core networking technologies from Cisco, as well as the cloud technologies from Cisco in a form factor or in a manner that easily consumable by our customers. >> Arun, talk about the customer use cases. So say you've got a customer, obviously you guys have a lot of customers together with Cisco, they're doing some complex things with the technology, but for the customer out there that has not yet kinda went down the NetApp Cisco route, what do they do? 'Cause a lot of storage guys are lookin' at All Flash, so check, you guys have that. They want great performance, check. But then they gotta integrate. So what do you say to the folks watching that aren't yet customers about what they should look at and evaluate vis-a-vis your opportunity with them and say the competition? >> So yes, there are customers who are doing all this as separate silos, but the advantage of taking a converged infrastructure approach is that you benefit from the years of man experience or person experience that we have put behind in our labs to architect this, make sure that everything is working correctly and therefore is reduces their deployment time and reduces the risk. And if you want to be agile and faster even in the traditional infrastructure, while you're being asked to go to the cloud you can do it with our FlexPod design guides. If you want the cloud-like experience then you can do it with a managed private cloud solution on your premise. >> So they got options and they got flexibility on migrating to the cloud or architecting that. >> Yes. >> Okay, great, now I'm gonna ask you another question. This comes up a lot on theCUBE and certainly we see it in the industry. One of the trends is verticalization. >> Yes. >> So verticalization is not a new thing. Vertical industry, people go to market that way, they build products that are custom to verticals. But with cloud one of the benefits of cloud and kind of a cloud operations is you have a horizontally scalable capability. So how do you guys look at that, because these verticals, they gotta get closer to the front lines and have apps that are customized. I mean data that's fastly delivered to the app. How should verticals think about architecting storage to maintain the scale of horizontally scalable but yet provide customization into the applications that might be unique to the vertical? >> Okay, so let me give a trend first and then I'll get to the specific. So in the vertical industry, the next trend is industry clouds. For example, you have healthcare clouds and you'll have clouds to specific industries. And the reason is because these industries have to keep their data on-prem. So the data gravity plays a lot of impact in all of these decisions. And the security of their data. So that is getting into industry-specific clouds. The second pieces are analytics. So customers now are finding that data is valuable and the insight you can get from the data are actually more valuable. So what they want is the data on their premise, they want the ability all in their control so to say, they want the ability to not only run their production applications but also the ability to run analytics on top of that. 
In the specific example for health care, what it does is, when you have All Flash FAS it provides you a faster response for the patient, because the physician is able to get the diagnostics done better if he has some kind of analytics helping him. >> Yeah. >> Plus the first piece I talked about, the rapid deployment, is very important because you want to get your infrastructure set up, so I can give an example on that too. >> Well before we get to the example, this is an important point, because I think this is really the big megatrend. It's not really talked much about, but it's pretty much happening, is that what you just pointed out was it's not just about speeds and feeds and IOPs, the performance criteria; the industry cloud has other new things, like data, the role of data, what they're using for the application. >> Correct. >> So you've just gotta have table stakes of great, fast storage. >> Yes. >> But it's gotta be integrated into what is becoming a use case for the verticals. Did I get that right? >> Yes, absolutely. So I'll give two examples. One I can name the customer. So they'll come to our booth tomorrow, in a minute here. So LCMC Health, part of UMC, and they have the UMC Medical Center. So when New Orleans had the Katrina disaster in Louisiana, they came up and said they need a hospital, fast. And they decided on FlexPod because within three months, with the wire-once architecture and the applications, they could scale their whole IT data center for health care. So that has helped them tremendously to get it up and running. Second is, with the All Flash FAS they're able to provide faster response to their customer. So that's a typical example that we see in these kinds of industries. >> Arun, thanks for coming on theCUBE. We really appreciate it. You guys are doing a great job. Following NetApp's recent success lately, as always, NetApp's always going to the next level. Quick question for you to end the segment. What's your take on Cisco Live this year? What's some of the vibe of the show? I know it's day one, there's a lot more to come and you're just getting a sense of it. What's the vibe? What's coming out of the show this year? What's the big ah-ha? >> So I attended the keynote today and it was very interesting, because Cisco has taken networking to the next level with intent-based networking, it's data and analytics where you can put a subscription model on all the pieces of the networking infrastructure. And that's exactly the same thing which NetApp is doing, where we are going up in the cloud with this subscription base. And when you add the two subscription bases, then for us, at least in the managed private cloud solution, we can provide the subscription base through the managed private cloud, through our managed service provider. So knowing where the industry was going, knowing where Cisco was going and knowing where we want to go, we have come up with this solution which matches both these trends of Cisco as well as NetApp. >> And the number of connected devices going up every day. >> Yes. >> More network connections, more geo domains, it's complicated. >> It is complicated, but if you do it correctly we can help you find a way through it. >> Arun, thank you for coming on theCUBE. I'm John Furrier here on theCUBE with Stu Miniman here with NetApp at Cisco Live 2018. Back with more live coverage after this short break. (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Cisco | ORGANIZATION | 0.99+ |
Tom Georges | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Arun | PERSON | 0.99+ |
George Kurian | PERSON | 0.99+ |
UMC Medical Center | ORGANIZATION | 0.99+ |
Barcelona | LOCATION | 0.99+ |
Arun Garg | PERSON | 0.99+ |
two companies | QUANTITY | 0.99+ |
John Furrier | PERSON | 0.99+ |
LCMC Health | ORGANIZATION | 0.99+ |
Chuck Robbins | PERSON | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
UMC | ORGANIZATION | 0.99+ |
second pieces | QUANTITY | 0.99+ |
Louisiana | LOCATION | 0.99+ |
Orlando, Florida | LOCATION | 0.99+ |
Katrina | EVENT | 0.99+ |
FlexPod | COMMERCIAL_ITEM | 0.99+ |
NetApp | ORGANIZATION | 0.99+ |
both parties | QUANTITY | 0.99+ |
New Orleans | LOCATION | 0.99+ |
Second | QUANTITY | 0.99+ |
10 base | QUANTITY | 0.99+ |
three months | QUANTITY | 0.99+ |
first piece | QUANTITY | 0.99+ |
two examples | QUANTITY | 0.99+ |
tomorrow | DATE | 0.99+ |
both | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
first one | QUANTITY | 0.99+ |
today | DATE | 0.98+ |
theCUBE | ORGANIZATION | 0.98+ |
first year | QUANTITY | 0.98+ |
last year | DATE | 0.98+ |
Gartner | ORGANIZATION | 0.98+ |
both worlds | QUANTITY | 0.97+ |
Cisco Live 2018 | EVENT | 0.97+ |
two | QUANTITY | 0.97+ |
NetApp | TITLE | 0.97+ |
this year | DATE | 0.97+ |
All Flash FAS | COMMERCIAL_ITEM | 0.97+ |
one | QUANTITY | 0.97+ |
two new things | QUANTITY | 0.96+ |
tens of tools | QUANTITY | 0.95+ |
UCS | ORGANIZATION | 0.94+ |
Oracle | ORGANIZATION | 0.94+ |
Arun Varadarajan, Cognizant | Informatica World 2018
>> Voiceover: Live from Las Vegas, it's theCUBE. Covering Informatica World 2018, brought to you by Informatica. >> Hey, welcome back everyone, we're here live at the Venetian, we're at the Sands Convention Center, the Venetian, the Palazzo, for Informatica World 2018. I'm John Furrier, with Peter Burris, my co-host. Our next guest, Arun Varadarajan, who's the VP of AI and Analytics at Cognizant. Great to see you. It's been a while. Thanks for coming on. >> Thank you. Thank you John, it's wonderful meeting you again. >> So, the last time you were on was 2015, on theCUBE. We were in San Francisco, where the event was. You kind of nailed the real-time piece; also, the disruption of data. Looking forward, right now, we're kind of right at the spot you were talking about there. What's different? What's new for you? As data's at the center of the value proposition. >> Arun: Yep. People are now realizing, I need to have a strategic data plan, not just store it and go do analytics on it. GDPR is a signal; obviously we're seeing that. What's new? >> So, I think a couple of things, John. One is, I think the customers have realized that there is a need to have a very deliberate approach. Last time, when we spoke, we spoke about digital transformation; it was a cool thing. It had this nice feel to it. But I think what has happened in the last couple of years is that we've been able to help our clients understand what exactly digital transformation is, apart from it being a very simple competitive tactic to deal with the fact that digital natives are, you know, barking down your path. It also is an opportunity for you to really reimagine your business architecture. So, what we're telling our clients is that when you're thinking about digital transformation, think of it from a 3-layer standpoint, the first layer being your business model itself, right? Because, if you're a traditional taxi service, and you're dealing with the Uber war, you better reimagine your business model. It starts there. And then, if your business model has to change to compete in the digital world, your operating model has to be extremely aligned to that new business model paradigm that you've defined. And, to that, if you don't have a technology model that is adapting to that change, none of this is going to happen. So, we're telling our clients, when you think about digital transformation, think of it from these three dimensions. >> It's interesting, because back in the old days, your technology model dictated what you could do. It's almost flipped around, where the business model is dictating the direction. So, business model, operating model, technology model. Is that because technology is more versatile? Or, as Peter says, processes are known, and you can manage it? It used to be, hey, let's make a technology decision. Which database, and we're off to the races. Now it seems to be flipped around. >> There are two reasons for that. One is, I think, technology itself has proliferated so much that there are so many choices to be made. And if you start looking at technology first, you get kind of burdened by the choices you need to make. Because, at the end of the day, the choice you make on technology has to have a very strong alignment and impact to business. So, what we're telling our clients is, choices are there; there are plenty of choices. There are compute strategies available that are out there. There are new analytical capabilities. There's a whole lot of that. 
But if you do not purpose and engineer your technology model to a specific business objective, it's lost. So, when we think about business architecture, and really competing in the digital space, it's really about you saying, how do I make sure that my business model is such that I can thwart the competition that is likely to come from digital natives? You saw Amazon the other day, right? They bought an insurance company. Who knows what they're going to buy next? My view is that Uber may buy one of the auto companies, and completely change the car industry. So, what does Ford do? What does General Motors do? And, if they're going to go about this in a very incremental fashion, my view is that they may not exist. >> So, we have been in our research arguing that digital transformation does mean something. We think that it's the difference between a business and a digital business is the role that data plays in a digital 6business, and whether or not a business treats data as an asset. Now, in every business, in every business strategy, the most simple, straightforward, bottom-line thing you can acknowledge is that businesses organize work around assets. >> John: Yep. >> So, does it comport with your observation that, to many respects, what we're talking about here is, how are we reinstitutionalizing work around data, and what impact does that have on our business model, our operating model, and our technology selection? Does that line up for you? >> Totally, totally. So, if you think about business model change, to me, it starts by re-imagining your engagement process with your customers. Re-imagining customer experience. Now, how are you going to be able to re-imagine customer experience and customer engagement if you don't know your customer? Right? So, the first building block in my mind is, do you have customer intelligence? So, when you're talking about data as an asset, to me, the asset is intelligence, right? So, customer intelligence, to me, is the first analytical building block for you to start re-imagining your business model. The second block, very clearly, is fantastic. I've re-imagined customer experience. I've re-imagined how I am going to engage with my customer. Is your product, and service, intelligent enough to develop that experience? Because, experience has to change with customers wanting new things. You know, today I was okay with buying that item online, and getting the shipment done to me in 4 days. But, that may change; I may need overnight shipping. How do you know that, right? Are you really aware of my preferences, and how quickly is your product and service aligning to that change? And, to your point, if I have customer intelligence, and product intelligence sorted out, I better make sure that my business processes are equally capable of institutionalizing intelligence. Right? So, my process orchestration, whether it's my supply chain, whether it's my auto management, whether it's my, you know, let's say fulfillment process; all of these must be equally intelligent. So, in my mind, these are three intelligent blocks: there's customer intelligence, product intelligence, and operations intelligence. If you have these three building blocks in place, then I think you can start thinking about what should your new data foundation look like. >> I want to take that and overlay kind of like, what's going on in the landscape of the industry. You have infrastructure world, which you buy some rack and stack the servers; clouds now on the scene, so there's overlapping there. 
We used to have a big data category. You know, ADO; but, that's now AI and machine learning, and data ware. It's kind of its own category, call it AI. And then, you have kind of emerging tech, whether you call, block chain, these kind of... confluence of all these things. But there's a data component that sits in the center of all these things. Security, data, IOT, traverse infrastructure, cloud, the classic data industry, analytics, AI, and emerging. You need data that traverses all these new environments. How does someone set up their architecture so that, because now I say, okay, I got a dat big data analytics package over here. I'm doing some analytics, next gen analytics. But, now I got to move data around for its cloud services, or for an application. So, you're seeing data as to being architected to be addressable across multiple industries. >> Great point John. In fact, that leads logically to the next thing that me and my team are working on. So we are calling it the Adaptive Data Foundation. Right? The reason why we chose the word adaptive is because in my mind it's all about adapting to change. I think Chal Salvan, or somebody said that the survival of the fittest is not, the survival is not of the survival of the fittest or the survival of the species that is intelligent, but it's the survival of those who can adapt to change, right? To me, your data foundation has to be super adaptive. So what we've done is, in fact, my notion, and I keep throwing this at you every time I meet you, in my opinion, big data is legacy. >> John: Yeah, I would agree with that. >> And its coming.. >> John: The debate. >> It's pretty much legacy in my mind. Today it's all about scale-out, responsive, compute. The data world. Now, if you looked at most of the architectures of the past of the data world, it was all about store and forward. Right? I would, it's a left to right architecture. To me it's become a multi-directional architecture. Therefore what we have done is, and this is where I think the industry is still struggling, and so are our customers. I understand I need to have a new modern data foundation, but what does that look like? What does it feel like? So with the Adaptive Data Foundation... >> They've never seen it before by the way. >> They have not seen it. >> This is new. >> They are not able to envision it. >> It is net new. >> Exactly. They're not able to envision it. So what I tell my clients is, if you really want to reimagine, just as you're reimagining your business model, your operating model, you better reimagine your data model. Is your data model capable of high velocity resolutions? Whether it's identity resolution of a client who's calling in. Whether it's the resolution of the right product and service to deliver to the client. Whether it's your process orchestration, they're able to quickly resolve that this data, this distribution center is better capable of servicing their customer need. You better have that kind of environment, right? So, somebody told me the other day that Amazon can identify an analytical opportunity and deliver a new experience and productionize it in 11.56 seconds. Today my customers, on average, the enterprise customers, barely get to have a reasonable release on a monthly basis. Forget about 11.56 seconds. So if they have to move at that kind of velocity, and that kind of responsiveness, they need to reimagine their data foundation. What we have done is, we have tried to break it down into three broad components. 
The first component that they're saying is that you need a highly responsive architecture. The question that you asked. And a highly responsive architecture, we've defined, we've got about seven to eight attributes that defines what a responsive architecture is. And in my mind, you'll hear a lot of, I've been hearing a lot of this that a friend, even in today's conference, people are saying, 'Oh, its going to be a hybrid world. There's going to be Onprim, there's going to be cloud, there's going to be multicloud. My view is, if you're going to have all of that mess, you're going to die, right? So I know I'm being a little harsh on this subject, but my view is you got to move to a very simplified responsive architecture right up front. >> Well you'd be prepared for any architecture. >> I've always said, we've debated this many times, I think it's a cloud world, public cloud, everything. Where the data center on premise is a huge edge. Right, so? If you think of the data center as an edge, you can say okay, it's a large edge. It's a big fat edge. >> Our fundamentalists, I don't think it exists. Our fundamental position is data increasingly, the physical realities of data, the legal realities of data, the intellectual property control realities of data, the cost realities of data are going to dictate where the processing actually takes place. There's going to be a tendency to try to move the activity as close to the data as possible so you don't have to move the data. It's not in opposition, but we think increasingly people are going to not move the data to the cloud, but move the cloud to the data. That's how we think. >> That's an interesting notion. My view is that the data has to be really close to the source of position and execution, right? >> Peter: Yeah. Data has got to be close to the activity. >> It has to be very close to the activity. >> The locality matters. >> Exactly, exactly, and my view is, if you can, I know it's tough, but a lot of our clients are struggling with that, I'm pushing them to move their data to the cloud, only for one purpose. It gives them that accessibility to a wide ranging of computer and analytical options. >> And also microservices. >> Oh yeah. >> We had a customer on earlier who's moved to the cloud. This is what we're saying about the edge being data centered. Hybrid cloud just means you're running cloud operations. Which just means you got to have a data architecture that supports cloud operations. Which means orchestration, not having siloed systems, but essentially having these kind of, data traversal, but workload management, and I think that seems to be the consistency there. This plays right into what you're saying. That adaptive platform has to enable that. >> Exactly. >> If it forecloses it, then you're missing an opportunity. I guess, how do you... Okay tell me about a customer where you had the opportunity to do the adaptive platform, and they say no, I want a silo inside my network. I got the cloud for that. I got the proprietary system here. Which is eventually foreclosing their future revenue. How do you handle that scenario? >> So the way we handle that scenario, is again, focusing on what the end objective, that the client has, from an analytical opportunity, respectfully. What I mean by that is that semi-customer says I need to be significantly more responsive in my service management, right? 
So if he says I want to get that achieved, then what we start thinking about is, what is that responsive data architecture that can deliver a better outcome, because like you said, there's stuff in the data center, there's stuff all over the place, it's going to be difficult to take that all away. But can I create a purpose for change? Many times you need a purpose for change. So the purpose being, if I can get to a much more intelligent service management framework, I will be able to either take cost out or I can increase my revenue through services. It has to be tied to an outcome. So then the conversation becomes very easy, because you're building a business case for investing in change, resulting in a measurable business outcome. So that engineering to purpose is the way I'm finding it easier to have that conversation. And I'm telling the client, keep what you have, so you've got all the spaghetti mess, somebody said, right? You've got all of the spaghetti mess out there. Let us focus on, if there are 15 data sets that we think are relevant for us to deliver service management intelligence, let's focus on those 15 data sets. Let's get that into a new scalable, hyper-responsive modern architecture. Then it becomes easier. Then I can tell the customer, now we have created an ecosystem where we can truly get to the 11.56 seconds analytical opportunity getting productionized. Move to experiment as a service. That's another concept. So all of that, in my opinion John, is, if you can put a purpose around it, as opposed to saying let's rip and replace, let's do this large scale transformation program, those things cost a lot of money. >> Well the good news is containers and Kubernetes are showing a way to get those projects moving cloud-native as fast as possible. Love the architecture vision. Love to follow up with you on that. Great conversation. I think that's a path, in my opinion. Now short-term, the house is on fire in many areas. I want to get your thoughts on this final question. GDPR, the house is on fire, it's kind of critical, it's kind of tactical. People are, like, freaking out, saying okay, what does this mean? Okay, it's a signal, it is important. I think it's a technical mess. I mean, where's the data? What schema? John Furrier, am I J Furrier, or Furrier, John? There's data on me everywhere inside the company. It's hard. >> Arun: It is. >> So, how are you guys helping customers navigate the landscape of GDPR? >> GDPR is a whole, it's actually a much bigger problem than we all thought it was. It is securing things at the source system, because there are vulnerabilities at the source systems. Forget about it entering into any sort of mastering or data lakes. Securing it at the source, that is so critical. Then, as you said, the same John Furrier, who is probably exposed to GDPR, is defined in ten different ways. How do I make sure that those ten definitions are managed? >> Tells you, you need an adaptive data platform to understand this. >> So right now most of our work is just doing that impact analysis, right? Whether it's at a source system level, it has data governance issues, it has data security issues, it has mastering issues. So it's a fairly complex problem. I think customers are still grappling with it. They're barely, in my opinion, getting to the point of having that plan, because May 2018 was when you were supposed to show evidence of a plan. So I think there... >> The plan is we have no plan. 
>> Right, the plan of the plan, I guess, is what they're going to show. In May, as opposed to the plan. >> Well I'm sure it's keeping you guys super busy. I know it's on everyone's mind. We've been talking a lot about it. Great to have you on again. Great to see you. Live here at Informatica World. Day one of two days of coverage at theCUBE here. In Las Vegas, I'm John Furrier here with Peter Burris, with more coverage after this short break. (techno music)
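To make the "am I J Furrier, or Furrier, John?" problem above concrete, here is a hedged, minimal sketch of the kind of impact analysis being described: normalizing name variants so every record tied to one data subject can be located across source systems. The record list, field names, and matching rule are illustrative assumptions; a real GDPR workflow would lean on proper master data management, identifiers beyond names, and an audit trail.

```python
# Hedged sketch: find every record for one data subject across "source systems"
# by normalizing common name variants ("John Furrier", "Furrier, John", "J Furrier").
# The records list below stands in for real source-system extracts.
import re

def normalize_name(raw: str) -> tuple:
    """Return (last-name, first-initial) as a crude matching key."""
    raw = raw.strip().lower()
    if "," in raw:                      # "Furrier, John" -> last, first
        last, first = [p.strip() for p in raw.split(",", 1)]
    else:                               # "John Furrier" / "J Furrier"
        parts = raw.split()
        first, last = " ".join(parts[:-1]), parts[-1]
    return (re.sub(r"[^a-z]", "", last), first[:1])

records = [
    {"system": "crm",     "name": "John Furrier",  "email": "jf@example.com"},
    {"system": "billing", "name": "Furrier, John", "email": "jf@example.com"},
    {"system": "support", "name": "J Furrier",     "email": "john@example.com"},
    {"system": "crm",     "name": "Jane Doe",      "email": "jd@example.com"},
]

def records_for_subject(subject_name: str, recs: list) -> list:
    key = normalize_name(subject_name)
    return [r for r in recs if normalize_name(r["name"]) == key]

# Everything that would need review for an access or erasure request:
for r in records_for_subject("Furrier, John", records):
    print(r["system"], r["name"], r["email"])
```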
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Peter Burris | PERSON | 0.99+ |
Arun Varadarajan | PERSON | 0.99+ |
Peter | PERSON | 0.99+ |
John | PERSON | 0.99+ |
General Motors | ORGANIZATION | 0.99+ |
Ford | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Uber | ORGANIZATION | 0.99+ |
Arun | PERSON | 0.99+ |
2015 | DATE | 0.99+ |
Informatica | ORGANIZATION | 0.99+ |
John Furrier | PERSON | 0.99+ |
3-layer | QUANTITY | 0.99+ |
15 data sets | QUANTITY | 0.99+ |
first | QUANTITY | 0.99+ |
Adaptive Data Foundation | ORGANIZATION | 0.99+ |
Las Vegas | LOCATION | 0.99+ |
two reasons | QUANTITY | 0.99+ |
second block | QUANTITY | 0.99+ |
two days | QUANTITY | 0.99+ |
San Francisco | LOCATION | 0.99+ |
first layer | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
Today | DATE | 0.99+ |
11.56 seconds | QUANTITY | 0.99+ |
GDPR | TITLE | 0.99+ |
one purpose | QUANTITY | 0.99+ |
first component | QUANTITY | 0.98+ |
Chal Salvan | PERSON | 0.98+ |
4 days | QUANTITY | 0.98+ |
Venetian | LOCATION | 0.98+ |
Cognizant | PERSON | 0.98+ |
Cognizant | ORGANIZATION | 0.97+ |
Informatica World 2018 | EVENT | 0.97+ |
Sands Convention Center | LOCATION | 0.96+ |
11.56 seconds | QUANTITY | 0.95+ |
one | QUANTITY | 0.94+ |
ten definitions | QUANTITY | 0.92+ |
J Furrier | PERSON | 0.92+ |
May 18, 2018 May | DATE | 0.9+ |
Day one | QUANTITY | 0.89+ |
eight attributes | QUANTITY | 0.89+ |
Informatica World | EVENT | 0.87+ |
about 11.56 seconds | QUANTITY | 0.87+ |
three intelligent blocks | QUANTITY | 0.87+ |
Palazzo | LOCATION | 0.86+ |
Onprim | ORGANIZATION | 0.85+ |
three building blocks | QUANTITY | 0.84+ |
three dimensions | QUANTITY | 0.82+ |
Furrier | ORGANIZATION | 0.79+ |
first building | QUANTITY | 0.77+ |
ten different ways | QUANTITY | 0.74+ |
ADO | TITLE | 0.7+ |
three | QUANTITY | 0.69+ |
years | DATE | 0.66+ |
last couple | DATE | 0.63+ |
about seven | QUANTITY | 0.59+ |
components | QUANTITY | 0.57+ |
2018 | DATE | 0.48+ |
theCUBE | ORGANIZATION | 0.43+ |
Arun Murthy, Hortonworks | BigData NYC 2017
>> Host: Live from mid-town Manhattan, it's theCUBE covering BigData New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (upbeat electronic music) >> Welcome back, everyone. We're here, live, on day two of our three days of coverage of BigData NYC. This is our event that we put on every year. It's our fifth year doing BigData NYC in conjunction with Hadoop World, which evolved into Strata Conference, which evolved into Strata Hadoop, now called Strata Data. Probably next year it will be called Strata AI, but we're still theCUBE, we'll always be theCUBE, and this is our BigData NYC, our eighth year covering the BigData world since Hadoop World. And then as Hortonworks came on we started covering Hortonworks' data summit. >> Arun: DataWorks Summit. >> DataWorks Summit. Arun Murthy, my next guest, Co-Founder and Chief Product Officer of Hortonworks. Great to see you, looking good. >> Likewise, thank you. Thanks for having me. >> Boy, what a journey. Hadoop, years ago, >> 12 years now. >> I still remember, you guys came out of Yahoo, you guys put Hortonworks together and then since, gone public, first to go public, then Cloudera just went public. So, the Hadoop world is pretty much out there, everyone knows where it's at, it's got a nice use case, but the whole world's moved around it. You guys have been really the first of the Hadoop players, before even Cloudera, on this notion of data in flight, or, I call it, real-time data, but I think you guys call it data-in-motion. Batch, we all know what Batch does, a lot of things to do with Batch, you can optimize it, it's not going anywhere, it's going to grow. Real-time data-in-motion's a huge deal. Give us the update. >> Absolutely, you know, we've obviously been in this space, personally, I've been in this for about 12 years now. So, we've had a lot of time to think about it. >> Host: Since you were 12? >> Yeah. (laughs) Almost. Probably look like it. So, back in 2014 and '15 when we, sort of, went public and we started looking around, the thesis always was, yes, Hadoop is important, we're going to allow you to manage lots and lots of data, but a lot of the stuff we've done since the beginning, starting with YARN and so on, was really to enable the use cases beyond the whole traditional transactions and analytics. And Rob, our CEO, calls it, his vision's always been, we've got to get into a pre-transactional world, if you will, rather than the post-transactional analytics and BI and so on. So that's where it started. And increasingly, the obvious next step was to say, look, enterprises want to be able to get insights from data, but they also want, increasingly, they want to get insights and they want to deal with it in real-time. You know, while you're in your shopping cart. They want to make sure you don't abandon your shopping cart. If you were sitting at a retailer and you're in an aisle and you're about to walk away from a dress, you want to be able to do something about it. So, this notion of real-time is really important because it helps the enterprise connect with the customer at the point of action, if you will, and provide value right away rather than having to try to do this post-transaction. So, it's been a really important journey. 
We went and bought this company called Onyara, which is a bunch of geeks like us who started off with the government, built this Apache NiFi thing, huge community. It's just, like, taking off at this point. It's been a fantastic thing to join hands and join the team and keep pushing on the whole streaming data side. >> There's a real, I don't mean to tangent but I do, since you brought up community I wanted to bring this up. It's been the theme here this week. It's more and more obvious that the community role is becoming central, beyond open-source. We all know open-source, standing on the shoulders of those before us, you know. And the Linux Foundation showing code numbers hitting up from $64 million to billions in the next five, ten years, exponential growth of new code coming in. So open-source has certainly blown up. But now community is translating to things, you start to see blockchain, very community based. That's a whole new currency market that's changing the financial landscape, ICOs and what-not, that's just one data point. Businesses, marketing communities, you're starting to see data as a fundamental thing around communities. And certainly it's going to change the vendor landscape. So you guys, compared to Cloudera and others, have always been community driven. >> Yeah, our philosophy has been simple. You know, more eyes and more hands are better than fewer. And it's been one of the cornerstones of our founding thesis, if you will. And you saw how that's gone on over the course of the six years we've been around. Super-excited to have someone like IBM join hands, it happened at DataWorks Summit in San Jose. That announcement, again, is a reflection of the fact that we've been very, very community driven and very, very ecosystem driven. >> Communities are fundamentally built on trust and partnering. >> Arun: Exactly >> Coding is pretty obvious, you code with your friends. You code with people who are good, they become your friends. There's an honor system among you. You're starting to see that in the corporate deals. So explain the dynamic there and some of the successes that you guys have had on the product side where one plus one equals more than two. One plus one equals five or three. >> You know, IBM has been a great example. They've decided to focus on their strengths, which is around Watson and machine learning, and for us to focus on our strengths around data management, infrastructure, cloud and so on. So this combination of DSX, which is their Data Science Experience, along with Hortonworks is really powerful. We are seeing that over and over again. Just yesterday we announced the whole Dataplane thing, we were super excited about it. And now to get IBM to say, we'll get in our technologies and our IP, big data, whether it's BigQuality or BigInsights or Big SQL, and the work has been phenomenal. >> Well, the Dataplane announcement, finally, people who know me know that I hate the term data lake. I always said it's always been a data ocean. So I get redemption, because now the data lakes, now it's admitting it's a horrible name, but just saying stitching together the data lakes, which is essentially a data ocean. Data lakes are out there and you can form these data lakes, or data sets, batch, whatever, but connecting them and integrating them is a huge issue, especially with security. >> And a lot of it is, it's also just pragmatism. We start off with this notion of data lake and say, hey, you got too many silos inside the enterprise in one data center, you want to put them together. 
But then increasingly, as Hadoop has become more and more mainstream, I can't remember the last time I had to explain what Hadoop is to somebody. As it has become mainstream, a couple of things have happened. One is, we talked about streaming data. We see it all the time, especially with HDF. We have customers streaming data from autonomous cars. You have customers streaming from security cameras. You can put a small MiNiFi agent in a security camera or smartphone and stream it all the way back. Then you get into physics. You're up against the laws of physics. If you have a security camera in Japan, why would you want to move it all the way to California and process it? You'd rather do it right there, right? So this notion of a regional data center becomes really important. >> And that talks to the Edge as well. >> Exactly, right. So you want to have something in Japan that collects all of the security cameras in Tokyo, and you do analysis and push what you want back here, right. So that's physics. The other thing we are increasingly seeing is, with data sovereignty rules, especially things like GDPR, there are now regulatory reasons where data has to naturally stay in different regions. Customer data from Germany cannot move to France or vice versa, right. >> Data governance is a huge issue and this is the problem I have with data governance. I am really looking for a solution, so if you can illuminate this it would be great. So there is going to be an Equifax out there again. >> Arun: Oh, for sure. >> And the problem is, is that going to force some regulation change? So what we see, certainly on the regulation side, what I see personally is that you can almost see that something else will happen that'll force some policy regulation or governance. You don't want to screw up your data. You also don't want to rewrite your applications or rewrite your machine learning algorithms. So there's a lot of wasted potential by not structuring the data properly. Can you comment on what's the preferred path? >> Absolutely, and that's why we've been working on things like Dataplane for almost a couple of years now. Which is to say, you have to have data and policies which make sense, given a context. And the context is going to change by application, by usage, by compliance, by law. So, now to manage 20, 30, 50, a hundred data lakes, would it be better, not saying lakes, data ponds, >> Host: Any data. >> Any data >> Any data pool, stream, river, ocean, whatever. (laughs) >> Jacuzzis. Data jacuzzis, right. So what you want is a holistic fabric, I like the term, you know, Forrester uses, they call it the fabric. >> Host: Data fabric. >> Data fabric, right? You want a fabric over these so you can actually control and maintain governance and security centrally, but apply it with context. Last but not least, you want to do this whether it's on-prem or on the cloud, or multi-cloud. So we've been working with a bank. They were probably based in Germany, but for GDPR they had to stand up something in France now. They had French customers, but for a bunch of new reasons, regulation reasons, they had to stand up something in France. So they bring their own data center, then they had one cloud provider, right, who I won't name. And they were great, things are working well. Now they want to expand the similar offering to customers in Asia. It turns out their favorite cloud vendor was not available in Asia, or they were not available in a time frame which made sense for the offering. 
So they had to go with cloud vendor two. So now although each of the vendors will do their job in terms of giving you all the security and governance and so on, the fact that you are to manage it three ways, one for OnFrame, one for cloud vendor A and B, was really hard, too hard for them. So this notion of a fabric across these things, which is Dataplane. And that, by the way, is based by all the open source technologies we love like Atlas and Ranger. By the way, that is also what IBM is betting on and what the entire ecosystem, but it seems like a no-brainer at this point. That was the kind of reason why we foresaw the need for something like a Dataplane and obviously couldn't be more excited to have something like that in the market today as a net new service that people can use. >> You get the catalogs, security controls, data integration. >> Arun: Exactly. >> Then you get the cloud, whatever, pick your cloud scenario, you can do that. Killer architecture, I liked it a lot. I guess the question I have for you personally is what's driving the product decisions at Hortonworks? And the second part of that question is, how does that change your ecosystem engagement? Because you guys have been very friendly in a partnering sense and also very good with the ecosystem. How are you guys deciding the product strategies? Does it bubble up from the community? Is there an ivory tower, let's go take that hill? >> It's both, because what typically happens is obviously we've been in the community now for a long time. Working publicly now with well over 1,000 customers not only puts a lot of responsibility on our shoulders but it's also very nice because it gives us a vantage point which is unique. That's number one. The second one we see is being in the community, also we see the fact that people are starting to solve the problems. So it's another elementary for us. So you have one as the enterprise side, we see what the enterprises are facing which is kind of where Dataplane came in, but we also saw in the community where people are starting to ask us about hey, can you do multi-cluster Atlas? Or multi-cluster Ranger? Put two and two together and say there is a real need. >> So you get some consensus. >> You get some consensus, and you also see that on the enterprise side. Last not least is when went to friends like IBM and say hey we're doing this. This is where we can position this, right. So we can actually bring in IGSC, you can bring big Quality and bring all these type, >> [Host} So things had clicked with IBM? >> Exactly. >> Rob Thomas was thinking the same thing. Bring in the power system and the horsepower. >> Exactly, yep. We announced something, for example, we have been working with the power guys and NVIDIA, for deep learning, right. That sort of stuff is what clicks if you're in the community long enough, if you have the vantage point of the enterprise long enough, it feels like the two of them click. And that's frankly, my job. >> Great, and you've got obviously the landscape. The waves are coming in. So I've got to ask you, the big waves are coming in and you're seeing people starting to get hip with the couple of key things that they got to get their hands on. They need to have the big surfboards, metaphorically speaking. They got to have some good products, big emphasis on real value. Don't give me any hype, don't give me a head fake. You know, I buy, okay, AI Wash, and people can see right through that. Alright, that's clear. But AI's great. 
We all cheer for AI but the reality is, everyone knows that's pretty much b.s. except for core machine learning is on the front edge of innovation. So that's cool, but value. [Laughs] Hey I've got the integrate and operationalize my data so that's the big wave that's coming. Comment on the community piece because enterprises now are realizing as open source becomes the dominant source of value for them, they are now really going to the next level. It used to be like the emerging enterprises that knew open source. The guys will volunteer and they may not go deeper in the community. But now more people in the enterprises are in open source communities, they are recruiting from open source communities, and that's impacting their business. What's your advice for someone who's been in the community of open source? Lessons you've learned, what is the best practice, from your standpoint on philosophy, how to build into the community, how to build a community model. >> Yeah, I mean, the end of the day, my best advice is to say look, the community is defined by the people who contribute. So, you get advice if you contribute. Which means, if that's the fundamental truth. Which means you have to get your legal policies and so on to a point that you can actually start to let your employees contribute. That kicks off a flywheel, where you can actually go then recruit the best talent, because the best talent wants to stand out. Github is a resume now. It is not a word doc. If you don't allow them to build that resume they're not going to come by and it's just a fundamental truth. >> It's self governing, it's reality. >> It's reality, exactly. Right and we see that over and over again. It's taken time but it as with things, the flywheel has changed enough. >> A whole new generation's coming online. If you look at the young kids coming in now, it is an amazing environment. You've got TensorFlow, all this cool stuff happening. It's just amazing. >> You, know 20 years ago that wouldn't happen because the Googles of the world won't open source it. Now increasingly, >> The secret's out, open source works. >> Yeah, (laughs) shh. >> Tell everybody. You know they know already but, This is changing some of the how H.R. works and how people collaborate, >> And the policies around it. The legal policies around contribution so, >> Arun, great to see you. Congratulations. It's been fun to watch the Hortonworks journey. I want to appreciate you and Rob Bearden for supporting theCUBE here in BigData NYC. If is wasn't for Hortonworks and Rob Bearden and your support, theCUBE would not be part of the Strata Data, which we are not allowed to broadcast into, for the record. O'Reilly Media does not allow TheCube or our analysts inside their venue. They've excluded us and that's a bummer for them. They're a closed organization. But I want to thank Hortonworks and you guys for supporting us. >> Arun: Likewise. >> We really appreciate it. >> Arun: Thanks for having me back. >> Thanks and shout out to Rob Bearden. Good luck and CPO, it's a fun job, you know, not the pressure. I got a lot of pressure. A whole lot. >> Arun: Alright, thanks. >> More Cube coverage after this short break. (upbeat electronic music)
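The edge pattern described in this interview, a small agent on a camera or phone that batches events and ships them to the nearest regional collector instead of hauling raw data across regions, is what MiNiFi feeding NiFi is designed for. The sketch below is only a language-level stand-in for that idea, not MiNiFi or Hortonworks code; the collector URL, batch size, and payload shape are assumptions for illustration.

```python
# Hedged stand-in for the edge pattern discussed above: buffer readings locally,
# ship them in batches to a regional collector (the role a MiNiFi -> NiFi flow plays).
# Assumes `pip install requests` and an illustrative collector URL.
import time
import random
import requests

REGIONAL_COLLECTOR = "https://collector.tokyo.example.com/ingest"  # assumed endpoint
BATCH_SIZE = 50

def read_sensor() -> dict:
    # Stand-in for a camera or vehicle sensor reading.
    return {"ts": time.time(), "device": "cam-042", "motion": random.random() > 0.8}

def ship(batch: list) -> bool:
    # Send one batch; report failure so the caller can keep the data at the edge.
    try:
        resp = requests.post(REGIONAL_COLLECTOR, json=batch, timeout=5)
        return resp.ok
    except requests.RequestException as err:
        print(f"ship failed, keeping batch at the edge: {err}")
        return False

buffer = []
for _ in range(200):                    # stand-in for an endless device loop
    buffer.append(read_sensor())
    if len(buffer) >= BATCH_SIZE:
        if ship(buffer):                # only aggregated batches leave the edge
            buffer = []                 # data stays local if the regional hop is down
```

Only the distilled batches cross the wire, which is also how the data sovereignty point lands: a regional collector keeps in-country data in region and forwards only what is needed.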
Arun Murthy, Hortonworks | DataWorks Summit 2017
>> Announcer: Live from San Jose, in the heart of Silicon Valley, it's theCUBE covering DataWorks Summit 2017. Brought to you by Hortonworks. >> Good morning, welcome to theCUBE. We are live at day 2 of the DataWorks Summit, and have had a great day so far, yesterday and today. I'm Lisa Martin with my co-host George Gilbert. George and I are very excited to be joined by a multi-time CUBE alum, the co-founder and VP of Engineering at Hortonworks, Arun Murthy. Hey, Arun. >> Thanks for having me, it's good to be back. >> Great to have you back. So yesterday, great energy at the event. You could see and hear behind us, great energy this morning. One of the things that was really interesting yesterday, besides the IBM announcement, and we'll dig into that, was that we had your CEO on, as well as Rob Thomas from IBM, and Rob said, you know, one of the interesting things over the last five years was that there have been only 10 companies that have beat the S&P 500, have outperformed, in each of the last five years, and those companies have made big bets on data science and machine learning. And as we heard yesterday, there are these four meta-trends: IoT, cloud, streaming analytics, and now the fourth big leg, data science. Talk to us about what Hortonworks is doing. You've been here from the beginning, as a co-founder as I've mentioned, you've been with Hadoop since it was a little baby. How is Hortonworks evolving to become one of those big users making big bets on helping your customers, and yourselves, leverage machine learning to really drive the business forward? >> Absolutely, a great question. So, you know, if you look at some of the history of Hadoop, it started off with this notion of a data lake, and then, I'm talking about the enterprise side of Hadoop, right? I've been working on Hadoop for about 12 years now, you know, and the last six of it has been as a vendor selling Hadoop to enterprises. They started off with this notion of a data lake, and as people have adopted that vision of a data lake, you know, you bring all the data in, and now you're starting to get governance and security, and all of that. Obviously, one of the best ways to get value out of the data is the notion of, you know, can you sort of predict what is going to happen in your world, with your customers, and, you know, whatever it is with the data that you already have. So that notion of, you know, Rob, our CEO, talks about how we're trying to move from a post-transactional world to a pre-transactional world, and doing the analytics and data science is obviously key to that. We could talk about, and there are so many applications of it, something as simple as, you know, we did a demo last year of how we're working with a freight company, and we're starting to show them, you know, how to predict which drivers and which routes are going to have issues as they're trying to move, alright? Four years ago we did the same demo, and we would say, okay, we would show that this driver had an issue on this route, but now, with the models, we can actually predict and let you know to take preventive measures up front. Similarly internally, you know, you can take things from, you know, machine learning, and log analytics, and so on. We have an internal problem, you know, where we have to test two different versions of HDP itself, and as you can imagine, it's a really, really hard problem.
We have to support 10 operating systems, seven databases; like, if you multiply that matrix, it's, you know, tens of thousands of options. So, if you do all that testing, we now use machine learning internally to look through the logs, and kind of predict where the failures were, and help our own, sort of, software engineers understand where the problems were, right? An extension of that has been, you know, the work we've done in SmartSense, which is a service we offer our enterprise customers. We collect logs from their Hadoop clusters, and then we can actually help them understand where they can either tune their applications, or even tune their hardware, right? They might have a, you know, we have this example I really like where at a really large enterprise Financial Services client, they had literally, you know, hundreds and thousands of machines on HDP, and we, using SmartSense, actually found that there were 25 machines which had bad NIC configuration, and we proved to them that by fixing those, we got 30% throughput back on their cluster. At that scale, it's a lot of money, it's a lot of capex, it's a lot of opex. So, as a company, we try it ourselves as much as we, kind of, try to help our customers adopt it. Does that make sense? >> Yeah, let's drill down on that even a little more, 'cause it's pretty easy to understand what's the standard telemetry you would want out of hardware, but as you, sort of, move up the stack the metrics, I guess, become more custom. So how do you learn, not just from one customer, but from many customers, especially when you can't standardize what you're supposed to pull out of them? >> Yeah so, we're sort of really big believers in, sort of, dogfooding our own stuff, right? So, we talk about the notion of a data lake; we actually run a SmartSense data lake where we actually get data across, you know, hundreds of our customers, and we can actually do predictive machine learning on that data in our own data lake. Right? And to your point about how we go up the stack, this is, kind of, where we feel like we have a natural advantage, because we work on all the layers, whether it's the SQL engine, or the storage engine, or, you know, above and beyond the hardware. So, as we build these models, we understand that we need more, or different, telemetry, right? And we put that back into the product so the next version of HDP will have the metrics that we wanted. And, now we've been doing this for a couple of years, which means we've done three, four, five turns of the crank, obviously something we always get better at, but I feel like, compared to where we were a couple of years ago when SmartSense first came out, it's actually matured quite a lot, from that perspective. >> So, there's a couple different paths you can add to this, which is customers might want, as part of their big data workloads, some non-Hortonworks, you know, services or software when it's on-prem, and then can you also extend this management to the Cloud if they want a hybrid setup where, in the not too distant future, the Cloud vendor will also be a provider for this type of management. >> So absolutely, in fact it's true today when, you know, we work with, you know, Microsoft's a great partner of ours. We work with them to enable SmartSense on HDI, which means we can actually get the same telemetry back, whether you're running the data on an on-prem HDP, or you're running this on HDI.
Similarly, we shipped a version of our Cloud product, our Hortonworks Data Cloud, on Amazon, and again SmartSense is plugged in there, so whether you're on Amazon, or Microsoft, or on-prem, we get the same telemetry, we get the same data back. We can actually, if you're a customer using many of these products, we can actually give you that telemetry back. Similarly, if you guys probably know this, you were probably there at the analyst event when they announced the Flex Support subscription, which means that now we can actually take the support subscription you get from Hortonworks, and you can actually use it on-prem or on the Cloud. >> So in terms of transforming, HDP for example, just want to make sure I'm understanding this, you're pulling in data from customers to help evolve the product, and that data can be on-prem, it can be in Microsoft Azure, it can be in AWS? >> Exactly. The HDP can be running in any of these; we will actually pull all of it into our data lake, we actually do the analytics, and then present it back to the customers. So, in our support subscription, the way this works is we do the analytics in our lake, and it pushes it back, in fact to our support team tickets, and our sales force, and all the support mechanisms. And they get a set of recommendations saying, Hey, we know these are the workloads you're running, we see these are the opportunities for you to do better, whether it's tuning the hardware, tuning an application, or tuning the software. We sort of send the recommendations back, and the customer can go and say, Oh, that makes sense, they accept that and we'll, you know, we'll apply the recommendation for you automatically. Or you can say, Maybe I don't want to change my kernel parameters, let's have a conversation. And if the customer, you know, is going through with that, then they can go and change it on their own. We do that, sort of, back and forth with the customer.
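To make the telemetry-to-recommendation loop described above a little more concrete, here is a minimal, hypothetical sketch in Python. The field names, thresholds, and rules are invented for illustration only; they are not SmartSense's actual schema or logic.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class HostTelemetry:
    host: str
    nic_speed_mbps: int          # negotiated NIC speed reported by the agent
    yarn_container_gb: int       # configured container size
    avg_container_usage_gb: float

def recommend(telemetry: List[HostTelemetry], expected_nic_mbps: int = 10_000):
    """Turn raw cluster telemetry into human-readable tuning recommendations."""
    recs = []
    for t in telemetry:
        if t.nic_speed_mbps < expected_nic_mbps:
            recs.append(f"{t.host}: NIC negotiated at {t.nic_speed_mbps} Mbps, "
                        f"expected {expected_nic_mbps}; check cabling/driver config.")
        if t.avg_container_usage_gb < 0.5 * t.yarn_container_gb:
            recs.append(f"{t.host}: containers sized at {t.yarn_container_gb} GB but average "
                        f"usage is {t.avg_container_usage_gb:.1f} GB; consider shrinking them.")
    return recs

# Example run over two hosts, one with the kind of bad NIC config described above.
fleet = [HostTelemetry("worker-01", 1_000, 16, 6.2),
         HostTelemetry("worker-02", 10_000, 16, 14.8)]
for line in recommend(fleet):
    print(line)
```

In the real service, as described in the conversation, the interesting part is that findings learned across hundreds of customers' clusters feed back into both the recommendations and the product itself.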
>> One thing that just pops into my mind is, we talked a lot yesterday about data governance, are there particular, and also yesterday on stage were >> Arun: With IBM >> Yes exactly, when we think of, you know, really data-intensive industries, retail, financial services, insurance, healthcare, manufacturing, are there particular industries where you're really leveraging this, kind of, bi-directionally, because there's no governance restrictions, or maybe I shouldn't say none, but. Give us a sense of which particular industries are really helping to fuel the evolution of the Hortonworks data lake. >> So, I think healthcare is a great example. You know, when we started off, sort of, this open-source project, Apache Atlas, you know, a couple of years ago, we got a lot of traction in the healthcare, sort of, insurance industry. You know, folks like Aetna were actually founding members of that, you know, sort of consortium doing this, right? And we're starting to see them get a lot of leverage out of all of this. Similarly now as we go into, you know, Europe and expand there, things like GDPR are really, really important, right? And, you guys know GDPR is a really big deal. Like, you pay, if you're not compliant by, I think it's like March of next year, you pay a portion of your revenue as fines. That's, you know, big money for everybody. So, I think that's why we're really excited about the partnership with IBM, because we feel like the two of us can help a lot of customers, especially in countries that are significantly more highly regulated than the United States, to actually leverage our, sort of, giant portfolio of products. And IBM's been a great partner on Atlas; they've adopted it wholesale, as you saw, you know, in the announcements yesterday. >> So, you're doing a Keynote tomorrow, so give us maybe the top three things. You're giving the Keynote on Data Lake 3.0, walk us through the evolution, Data Lakes 1.0, 2.0, 3.0, where you are now, and what folks can expect to hear and see in your Keynote. >> Absolutely. So as we've, kind of, continued to work with customers and we see the maturity model of customers, you know, initially people are standing up a data lake, and then they want, you know, sort of security, basic security, what it covers, and so on. Now, they want governance, and as we're starting to go through that journey, clearly our customers are pushing us to help them get more value from the data. It's not just about putting up the data lake, and obviously managing data with governance; it's also about, can you help us, you know, do machine learning, can you help us build other apps, and so on. So, as we look at this, there's a fundamental evolution that, you know, the Hadoop ecosystem had to go through with the advance of technologies like, you know, Docker: it's really important first to help the customers bring in more than just workloads which are sort of native to Hadoop. You know, Hadoop started off with MapReduce, obviously Spark has done great, and now we're starting to see technologies like Flink coming, but increasingly, you know, we want to do data science. To mass-market data science, obviously, you know, people want to use Spark, but the mass market is still Python, and R, and so on, right? >> Lisa: Non-native, okay. >> Non-native. Which are not really built, you know, these predate Hadoop by a long way, right. So now as we bring these applications in, having technology like Docker is really important, because now we can actually containerize these apps. It's not just about running Spark, you know, running Spark with R, or running Spark with Python, which you can do today. The problem is, in a true multi-tenant governed system, you want not just R, but you want a specific set of libraries for R, right. And the libraries, you know, George wants might be completely different than what I want. And, you know, you can't do a multi-tenant system where you install both of them simultaneously. So Docker is a really elegant solution to problems like those. So now we can actually bring those technologies into a Docker container, so George's Docker containers will not, you know, conflict with mine. And you can actually be off to the races, you know, doing data science. Which is really key for technologies like DSX, right? Because with DSX, if you see, obviously DSX supports Spark with technologies like, you know, Zeppelin, which is a front-end, but they also have Jupyter, which is going to serve the mass-market users for Python and R, right? So we want to make sure there's no friction, whether it's, sort of, the guys using Spark, or the guys using R, and equally importantly DSX, you know, in the near-term roadmap will also support things like, you know, the classic IBM portfolio, SPSS and so on. So bringing all of those things in together, making sure they run with the data in the data lake, and also the compute in the data lake, is really big for us.
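The library-isolation point above can be sketched with the Docker SDK for Python, assuming Docker and the SDK are installed. The image names, registry, package sets, and paths below are made up for illustration; this is not the actual YARN or DSX container integration, just the general idea of giving each tenant their own image.

```python
import docker  # Docker SDK for Python (pip install docker)

client = docker.from_env()

# Each tenant gets an image with exactly the library versions they need.
# Image names and paths here are purely illustrative.
TENANT_IMAGES = {
    "george": "registry.example.com/ds/r-3.4-georgelibs:1.0",
    "arun":   "registry.example.com/ds/py-2.7-arunlibs:2.3",
}

def run_job(tenant: str, command: list):
    """Run a tenant's job inside that tenant's own container image, so one
    user's library versions never collide with another's."""
    return client.containers.run(
        image=TENANT_IMAGES[tenant],
        command=command,
        volumes={"/data/lake": {"bind": "/data", "mode": "ro"}},  # shared, read-only data
        detach=True,
    )

job = run_job("george", ["Rscript", "/jobs/churn_model.R"])
print(job.id)  # container id; logs and status can be polled from here
```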
>> Wow, so it sounds like your Keynote's going to be very educational for the folks that are attending tomorrow, so last question for you. One of the themes that occurred in the Keynote this morning was sharing a fun fact about the speakers. What's a fun fact about Arun Murthy? >> Great question. I guess, you know, people have been looking for folks with, you know, 10 years of experience on Hadoop. I'm here finally, right? There's not a lot of people, but, you know, it's fun to be one of those people who've worked on this for about 10 years. Obviously, I look forward to working on this for another 10 or 15 more, but it's been an amazing journey. >> Excellent. Well, we thank you again for sharing time with us again on theCUBE. You've been watching theCUBE live on day 2 of the DataWorks Summit, hashtag DWS17, with my co-host George Gilbert. I am Lisa Martin. Stick around, we've got great content coming your way.
Arun Murthy, Hortonworks - Spark Summit East 2017 - #SparkSummit - #theCUBE
>> [Announcer] Live, from Boston, Massachusetts, it's theCUBE, covering Spark Summit East 2017, brought to you by Databricks. Now, your hosts, Dave Vellante and George Gilbert. >> Welcome back to snowy Boston everybody, this is theCUBE, the leader in live tech coverage. Arun Murthy is here, he's the founder and vice president of engineering at Hortonworks, father of YARN, can I call you that, godfather of YARN, is that fair, or? (laughs) Anyway. He's so, so modest. Welcome back to theCUBE, it's great to see you. >> Pleasure to have you. >> Coming off the big keynote, (laughs) you ended the session this morning, so that was great. Glad you made it in to Boston, and uh, lot of talk about security and governance, you know we've been talking about that for years, it feels like it's truly starting to come into the mainstream, Arun, so. >> Well I think it's just a reflection of what customers are doing with the tech now. Now, three, four years ago, a lot of it was pilots, a lot of it was, you know, people playing with the tech. But increasingly, it's about, you know, people actually applying stuff in production, having data as a system of record, running workloads both on-prem and on the cloud; cloud is sort of becoming more and more real at mainstream enterprises. So a lot of it means, if you take any of the examples today, any interesting app will have some sort of real-time data feed, it's probably coming out of a cell phone or sensor, which means that data is actually, in most cases, not coming on-prem, it's actually getting collected in a local cloud somewhere. It's just more cost effective; why would you put up 25 data centers if you don't have to, right? So then you've got to connect that data, production data you have or customer data you have or data you might have purchased, and then join them up, run some interesting analytics, do geo-based real-time threat detection, cybersecurity. A lot of it means that you need a common way to secure data, govern it, and that's where we see the action. I think it's a really good sign for the market and for the community that people are pushing on these dimensions, because it means that people are actually using it for real production workloads. >> Well in the early days of Hadoop you really didn't talk that much about cloud. >> Yeah. >> You know, and now, >> Absolutely. >> It's like, you know, duh, cloud. >> Yeah. >> It's everywhere, and of course the whole hybrid cloud thing comes into play. What are you seeing there, what are things you can do in a hybrid, you know, or on-prem that you can't do in a public cloud, and what's the dynamic look like? >> Well, it's definitely not an either-or, right? So what we're seeing is, increasingly, interesting apps need data which is born in the cloud and will stay in the cloud, but they also need transactional data which stays on-prem; you might have an EDW for example, right? >> Right. >> There's not a lot of, you know, people want to solve business problems and not just move data from one place to another, right? Or back from one place to another, so it's not interesting to move an EDW to the cloud, and similarly it's not interesting to bring your IoT data or sensor data back on-prem, right? It just makes sense. So naturally what happens is, you know, at Hortonworks we talk of a kind of modern app, or a modern data app, which means a modern data app has to span, has to sort of, you know, be able to access both on-prem data and cloud data.
>> Yeah, you talked about that in your keynote years ago. Furrier said that data is the new development kit. And now you're seeing the apps are just so dang rich, >> Exactly, exactly. >> And they have to span >> Absolutely. >> physical locations, >> Yeah. >> But then this whole thing of IoT comes up, we've been having a conversation on theCUBE, the last several CUBEs, of, okay, how much stays out, how much stays in, there's a lot of debate about that, there's reasons not to bring it in, but you talked today about how some of the important stuff will come back. >> Yeah. >> So the way this all is going to be, you know, there's a lot of data that should be born in the cloud and stay there, the IoT data, but then what will happen increasingly is, key summaries of the data will move back and forth. So key summaries of your EDW will move to the cloud, and sometimes key summaries of your IoT data, you know, if you want to do some sort of historical training and analytics, that will come back on-prem. So I think there's bi-directional data movement, but it just won't be all the data, right? It'll be key, interesting summaries of the data, but not all of it. >> And a lot of times, people say well it doesn't matter where it lives, cloud should be an operating model, not a place where you put data or applications, and while that's true and we would agree with that, from a customer standpoint it matters in terms of performance and latency issues and cost and regulation, >> And security and governance. >> Yeah. >> Absolutely. >> You need to think those things through. >> Exactly, so I mean, that's what we're focused on, to make sure that you have a common security and governance model regardless of where data is, so you can think of it as infrastructure you own and infrastructure you lease. >> Right. >> Right? Now, the details matter of course; when you go to the cloud you use S3 for example, or ADLS from Microsoft, but you've got to make sure that there's a common sort of security and governance layer on top of it, in front of it. As an example, one of the things that, you know, in the open source community, Ranger's a really sort of key project right now from a security authorization and authentication standpoint. We've done a lot of work with our friends at Microsoft to make sure you can actually now manage data in WASB, which is their object store, and data streams, natively with Ranger, so you can set a policy that says only Dave can access these files, you know, George can access these columns; that sort of stuff is natively done on the Microsoft platform thanks to the relationship we have with them. >> Right. >> So that's actually really interesting for the open source communities. So you've talked about sort of commodity storage at the bottom layer, and even if they're different sorts of interfaces and implementations, it's still commodity storage, and now what's really helpful to customers is that they have a common security model, >> Exactly. >> Authorization, authentication, >> Authentication, lineage, provenance, >> Oh okay. >> You want to make sure all of these are common services across. >> But you've mentioned all of the different data patterns, like the stuff that might be streaming in on the cloud, what, assuming you're not putting it into just a file system or an object store, and you want to sort of merge it with >> Yeah. >> Historical data, so what are some of the data stores other than the file system, in other words, newfangled databases to manage this sort of interaction?
>> So I think what you're saying is, we certainly have the raw data; the raw data is going to land in whatever cloud-native storage, >> Yeah. >> It's going to be Amazon S3, WASB, ADLS, Google Cloud Storage. But then increasingly you want, so now the patterns change: you have raw data, you have some sort of an ETL process, and what's interesting in the cloud is that even the processed data, if you take the unstructured raw data and structure it, that structured data also needs to live on the cloud platform, right? The reason that's important is because, A, it's cheaper to use the native platform rather than set up your own database on top of it. The other one is you also want to take advantage of all the native services that the cloud storage provides, so for example, replication. So, automatically, for data in WASB, you know, you can set up a policy and easily say this structured data table that I have, which is a summary of all the IoT activity in the last 24 hours, you can, using the cloud provider's technologies, actually make it show up easily in Europe; like, you don't have to do any work, right? So increasingly what we at Hortonworks have focused a lot on is to make sure that all of the compute engines, whether it's Spark or Hive or, you know, MapReduce, it doesn't really matter, are all natively working on the cloud provider's storage platform. >> [George] Okay. >> Right, so, >> Okay. >> That's a really key consideration for us.
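As a rough illustration of compute engines working natively against either on-prem or cloud storage, here is a minimal PySpark sketch. The bucket, namenode, and table paths are hypothetical, and it assumes the cluster is already configured with the relevant S3A connectors and credentials.

```python
from pyspark.sql import SparkSession

# All paths and column names below are made up for illustration.
spark = SparkSession.builder.appName("hybrid-join").getOrCreate()

# Structured summary of IoT activity that was born in (and stays in) the cloud.
iot_summary = spark.read.parquet("s3a://example-bucket/iot/daily_summary/")

# Transactional reference data that lives on the on-prem cluster.
customers = spark.read.parquet("hdfs://onprem-nn:8020/warehouse/customers/")

# The engine doesn't care where each side lives; the storage connector does the work.
enriched = iot_summary.join(customers, on="customer_id", how="left")
enriched.write.mode("overwrite").parquet("s3a://example-bucket/iot/enriched/")
```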
>> So that makes life better, a simplification use case if you will, >> Yeah. >> What are some of the other use cases that you're seeing things like Spark enable? >> Machine learning is a really big one. Increasingly, every product is going to have some, people call it machine learning and AI and deep learning, there's a lot of techniques out there, but the key part is you want to build a predictive model. In the past (mumbles) everybody wanted to build a model and score what's happening in the real world against the model, but it's equally important to make sure the model gets updated as more data comes in, and that the model's error actually gets smaller over time. So that's something we see all over, so for example, even within our own product, it's not just us enabling this for the customer; for example at Hortonworks we have a product called SmartSense which allows you to optimize how people use Hadoop. Where are the opportunities for you to explore deficiencies within your own Hadoop system, whether it's Spark or Hive, right? So we now put machine learning into SmartSense. And show you that customers who are running queries like you are running, Mr. Customer X, other customers like you are tuning Hadoop this way, they're running this sort of config, they're using these sorts of features in Hadoop. That allows us to actually make the product itself better all the way down the pipe. >> So you're improving the scoring algorithm or you're sort of replacing it with something better? >> What we're doing there is just helping them optimize their Hadoop deploys. >> Yep. >> Right? You know, configuration and tuning and kernel settings and network settings, we do that automatically with SmartSense. >> But the customer, you talked about scoring and trying to, >> Yeah. >> They're tuning that, improving that and increasing the probability of its accuracy, or is it? >> It's both. >> Okay. >> So the thing is what they do is, you initially come with a hypothesis, you have some amount of data, right? I'm a big believer that over time, with more data, you're better off spending more time getting more data into the system than tuning that algorithm, frankly, right? >> Interesting, okay. >> Right, so you know, for example, you know, talk to any of the big guys, Facebook, because they'll do the same; what they'll say is it's much better to spend your time getting 10x the data into the system and improving the model that way, rather than spending 10x the time improving the model itself on day one. >> Yeah, but that's a key choice, because you got to >> Exactly. >> Spend money on doing either, >> One of them. >> And you're saying go for the data. >> Go for the data. >> At least now. >> Yeah, go for data. What happens is, the good part of that is it's not just the model; what you've got to really get through is the entire end-to-end flow. >> Yeah. >> All the way from data aggregation to ingestion to collection to scoring, all of that aspect; you're better off sort of walking through the paces, like building the entire end-to-end product, rather than spending time in a silo trying to make a lot of change.
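A toy sketch of the "keep updating the model as more data arrives" idea, using scikit-learn's incremental partial_fit. The features and labels are synthetic stand-ins; this is not Hortonworks' pipeline, just the general pattern of scoring fresh data and then folding it back into the model.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
# Logistic-regression-style loss (older scikit-learn versions call this "log").
model = SGDClassifier(loss="log_loss")
classes = np.array([0, 1])  # e.g., "route on time" vs "route will have an issue"

def new_batch(n=500):
    X = rng.normal(size=(n, 4))                    # made-up telemetry features
    y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)  # made-up ground truth
    return X, y

X0, y0 = new_batch()
model.partial_fit(X0, y0, classes=classes)         # initial model

for day in range(5):                               # each "day", more data arrives
    X, y = new_batch()
    print(f"day {day}: accuracy on fresh data = {model.score(X, y):.3f}")  # score first
    model.partial_fit(X, y)                        # then update rather than rebuild
```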
>> We've talked to a lot of machine learning tool vendors, application vendors, and it seems like we got to the point with Big Data where we put it in a repository, then we started doing better at curating it and understanding it, then starting to do a little bit of exploration with business intelligence, but with machine learning, we don't have something that does this end to end, you know, from acquiring the data, to building the model, to operationalizing it. Where are we on that, who should we look to for that? >> It's definitely very early. I mean, if you look at even the EDW space, for example, what is EDW? EDW is ingestion, ETL, and then sort of a fast query layer, OLAP, BI, on and on and on, right? So that's the full EDW flow. I don't think, as a market, I mean, it's really early in this space; not only as an overall industry do we not yet have that end-to-end sort of industrialized design concept, it's going to take time. But a lot of people are ahead, you know, the Googles of the world are ahead; over time a lot of people will catch up. >> We got to go, I wish we had more time, I had so many other questions for you but I know time is tight in our schedule, so thanks so much Arun, >> Appreciate it. For coming on, appreciate it, alright, keep right there everybody, we'll be back with our next guest, it's theCUBE, we're live from Spark Summit East in Boston, right back. (upbeat music)
Breaking Analysis: Chaos Creates Cash for Criminals & Cyber Companies
From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante.

The pandemic not only accelerated the shift to digital but also highlighted a rush of cybercriminal sophistication, collaboration, and chaotic responses by virtually every major company on the planet. The SolarWinds hack exposed supply chain weaknesses and so-called island hopping techniques that are exceedingly difficult to detect. Moreover, the will and aggressiveness of well-organized cybercriminals has elevated to the point where incident responses are now met with counterattacks designed to both punish and extract money from victims via ransomware and other criminal activities. The only upshot is the cybersecurity market remains one of the most enduring and attractive investment sectors for those that can figure out where the market is headed and which firms are best positioned to capitalize.

Hello everyone, and welcome to this week's Wikibon CUBE Insights, powered by ETR. In this Breaking Analysis we'll provide our quarterly update of the security industry and share new survey data from ETR and theCUBE community that will help you navigate through the maze of corporate cyber warfare. We'll also share our thoughts on the game of 3D chess that Okta CEO Todd McKinnon is playing against the market.

Now, we all know this market is complicated, fragmented, and fast moving, and this next chart says it all. It's an interactive graphic from Optiv, a Denver, Colorado-based SI that's focused on cybersecurity. They've done some really excellent research and put together this awesome taxonomy and mapped vendor names therein, and this helps users navigate the complex security landscape. There are over a dozen major high-level sectors within the security taxonomy and nearly 60 sub-sectors, from monitoring, vulnerability assessment, identity, asset management, firewalls, automation, cloud, data center, SIEM, threat detection and intelligence, endpoint, network, and so on and so on and so on. But this is a terrific resource that can help you understand where players fit and help you connect the dots in the space.

Now let's talk about what's going on in the market. The dynamics in this crazy mess of a landscape are really confusing sometimes. Now, since the beginning of cyber time we've talked about the increasing sophistication of the adversary and the back-and-forth escalation between good and evil, and unfortunately this trend is unlikely to stop. Here's some data from Carbon Black's annual Modern Bank Heist report; this is the fourth, and of course now VMware's brand highlights the Carbon Black study since the acquisition, which catalyzed the creation of VMware's cloud security division. Destructive malware attacks, according to the recent study, are up 118 percent from last year. Now, one major takeaway from the report is that hackers aren't just conducting wire fraud (they are; 57% of the banks surveyed saw an increase in wire fraud), but the cybercriminals are also targeting non-public information such as future trading strategies. This allows the bad guys to front-run large block trades and profit; it's become a very lucrative practice. Now, the prevalence of so-called island hopping is up 38% from already elevated levels. This is where a virus enters a company's supply chain via a partner and then often connects with other stealthy malware downstream.
These techniques are more common where the malware will actually self-form with other infected parts of the supply chain and create actions with different signatures designed to identify and exfiltrate valuable information. It's a really complex problem. Of major concern is that 63% of banking respondents in the study reported that responses to incidents were then met with retaliation designed to intimidate, or to initiate ransomware attacks to extract a final pound of flesh from the victim. Notably, the study found that 75 percent of CISOs reported to the CIO, which many feel is not the right regime. The study called for a rethinking of the right cyber regime, where the CISO has increased responsibility in a direct reporting line to the CEO, or perhaps the COO, with greater exposure to boards of directors. So many thanks to VMware and Tom Kellermann specifically for sharing this information with us this past week; great work by your team.

Now, some of the themes that we've been talking about for several quarters are shown in the lower half of the chart. Cloud, of course, is the big driver thanks to work from home and the pandemic, and the interesting corollary of course is that we see a rapid rethinking of endpoint and identity access management and the concept of zero trust. In a recent ESG survey, two-thirds of respondents said that their use of cloud computing necessitated a change in how they approach identity access management. Now, as shown in the chart from Optiv, the market remains highly fragmented, and M&A is of course way up. Based on our research it looks like transaction volume has increased more than 40 percent just in the last five months.

So let's dig into the M&A, the merger and acquisition trends, for just a moment. We took a five-month snapshot and we were able to count about 80 deals that were completed in that time frame. Those transactions represented more than 20 billion dollars in value. Some of the larger ones are highlighted here, the biggest of course being Thoma Bravo taking Proofpoint private for a 12-plus billion dollar price tag. The stock went from the low 130s and is trading in the low 170s based on the 176-dollar-per-share offer, so there's your arbitrage, folks, go for it. Perhaps the more interesting acquisition was Auth0 by Okta for 6.5 billion, which we're going to talk about more in a moment. There's more private equity action we saw, as Insight bought Armis, an IoT security play, and Cisco shelled out 730 million dollars for IMImobile, which is more of an adjacency to cyber but is going to go under Cisco's security and applications business run by Jeetu Patel. But these are just the tip of the iceberg.

Some of the themes we see connecting the dots of these acquisitions are: first, SIs like Accenture, Atos, and Wipro are making moves in cyber to go local. They're buying SecOps expertise, as I say, locally in places like France, Germany, the Netherlands, Canada, and Australia; that last mile, that belly-to-belly intimate service. Israeli-based startups chalked up five acquired companies in the space over the last five months. Also, financial services firms are getting into the act, with Goldman and Mastercard making moves to own part of the stack themselves to combat things like fraud and identity theft. And then finally, numerous moves to expand markets: Okta with Auth0, CrowdStrike buying a log management company, Palo Alto picking up DevOps expertise, Rapid7 shoring up its Kubernetes chops, Tenable expanding beyond insights and going after identity (interesting), Fortinet filling gaps in a multi-cloud offering, SailPoint extending to governance, risk, and compliance (GRC), Zscaler picking up an Israeli firm to fill gaps in access control, and then VMware buying Mesh7 to secure modern app development and distribution services. So tons and tons of activity here.
Okay, so let's look at some of the ETR data to put the cyber market in context. ETR uses the concept of market share, one of its key metrics, which is a measure of pervasiveness in the data set. For each sector it calculates the number of respondents for that sector divided by the total, to get a sense for how prominent the sector is within the CIO and IT buyer communities. Okay, this chart shows the full ETR sector taxonomy with security highlighted across three survey periods: April last year, January this year, and April this year. Now, you wouldn't expect big moves in market share over time, so it's relatively stable by sector, but the big takeaway comes from observing which sectors are most prominent. You see that red dotted line imposed at the sixty percent level; you can see there are only six sectors above that line, and cybersecurity is one of them. Okay, so we know that security is important and a large market, but this puts it in the context of the other sectors.

However, we know from previous Breaking Analysis episodes that despite the importance of cyber and the urgency catalyzed by the pandemic, budgets unfortunately are not unlimited and spending is bounded; it's not an open checkbook for CISOs, as shown in this chart. This is a two-dimensional graphic showing market share, or pervasiveness, on the horizontal axis and Net Score on the vertical axis. Net Score is ETR's measurement of spending velocity, and we've superimposed a red line at 40 percent, because anything over 40 percent we consider extremely elevated. We've filtered and limited the number of sectors to simplify the graphic, and you can see, in the sectors that we've highlighted, only the big four are above that forty percent line: AI, containers, RPA, and cloud exceed that sort of forty percent magic water line. Information security, you can see, is highlighted and it's respectable, but it competes for budget with other important sectors. So this of course creates challenges for organizations, because not only are they strapped for talent, as we've reported, they, like everyone else in IT, face ongoing budget pressures. Research firm Cybersecurity Ventures estimates that in 2021, 6 trillion dollars worldwide will be lost to cybercrime. Conversely, research firm Canalys pegs security spending somewhere around 60 billion dollars annually, and IDC has it higher, around 100 billion. So either way, we're talking about spending between one and one point six percent annually of how much the bad guys are taking out. That's peanuts, really, when you consider the consequences.

So let's double-click into the cyber landscape a bit and further look at some of the companies. Here's that same XY graphic with the companies ETR captures from respondents in the cybersecurity sector; that's what's shown on the chart here. Now, the usefulness of the red lines is this: 20 percent on the horizontal indicates the largest presence in the survey, and the magic 40 percent line that we talked about earlier shows those firms with the most elevated momentum. Only Microsoft and Palo Alto exceed both high-water marks. Of course, Splunk and Cisco are prominent horizontally, and there are numerous companies to the left of the 20 percent line and many above that 40 percent high-water mark on the vertical axis. Now, the bottom left quadrant includes many of the legacy names that have been around for a long time, and there are dozens of companies that show spending momentum on their platforms, i.e., above single digits. So that picture, like the first one we showed you, is a very, very crowded space.
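For reference, the two calculations described above are simple ratios. Here is a small sketch in Python, using the figures cited in this segment; the survey counts themselves are hypothetical.

```python
def pervasiveness(sector_respondents: int, total_respondents: int) -> float:
    """ETR-style 'market share': share of survey respondents citing a sector."""
    return sector_respondents / total_respondents

# Hypothetical survey counts, just to show the shape of the calculation.
print(f"{pervasiveness(780, 1200):.0%} of respondents cite information security")

# Spending versus estimated losses, using the figures cited above (in $B).
losses = 6_000                    # $6 trillion lost to cybercrime (Cybersecurity Ventures, 2021)
spend_low, spend_high = 60, 100   # annual security spend estimates (Canalys, IDC)
print(f"spend is {spend_low / losses:.1%} to {spend_high / losses:.1%} of estimated losses")
```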
So let's filter it a bit and only include companies in the ETR survey that had at least a hundred responses, an N of a hundred or greater. So it's a little easier to read, but still it's kind of crowded when you think about it. Okay, so, same graphic, and we've superimposed the data that determined the plot position over in the bottom right there; so it's Net Score and Shared N, including only companies with more than 100 N. So what does this data tell us about the market? Well, Microsoft is dominant, as always it seems, in all dimensions, but let's focus on that red line for a moment. Some of the names that we've highlighted over the past two years show very well here.

First I want to talk about Palo Alto Networks. Pre-COVID, as you might recall, we highlighted the valuation divergence between Palo Alto and Fortinet, and we said Fortinet was executing better on its cloud strategy and Palo Alto was at the time struggling with the transition, especially with its go-to-market and its sales force compensation, and really refreshing its portfolio. But we told you that we were bullish on Palo Alto Networks at the time because of its track record and the fact that CIOs consistently told us that they saw Palo Alto as a thought leader in the space that they wanted to work with. They said that Palo Alto was the gold standard, the best, especially larger-company CISOs. So that gave us confidence that Palo Alto, a very well-run company, was going to get its act together and perform better, and Palo Alto has done just that. As we expected, they've done very well, they've been rapidly moving customers to the next generation of platforms, we're very impressed by the company's execution, and the stock has generally reflected that.

Now, some other names that hit our radar in the ETR data a couple of years ago continue to perform well: CrowdStrike, Zscaler, SailPoint, and Cloudflare. Cloudflare just reported and beat earnings, but the stock was off; it fell on headwinds for tech overall, the big rotation, but the company is doing very well, growing rapidly, and it has momentum, as you can see from the ETR data. And we put that double star around Proofpoint to highlight that it was worthy of fetching 12 and a half billion dollars from a private equity firm. So nice exit there, supporting the continued consolidation trend that we've predicted in cybersecurity.

Now let's turn our attention to Okta and Auth0. This is where it gets interesting, and it's a clever play for Okta, we think, and we want to drill into it a bit. Okta is acquiring Auth0 for big money. Why? Well, we think Todd McKinnon, Okta's CEO, wants to run the table on identity and then continue to expand his TAM. He has to do that to justify his lofty valuation. So Okta's ascendancy around identity and single sign-on is notable. The fragmented pictures that we've shown you scream out for simplification and trust, and that's what Okta brings. But it competes with some major players, most notably Microsoft with Active Directory. So look, of course Microsoft is going to dominate in its massive customer base, but the rest of the market, that's like a jump ball, it's wide open, and we think McKinnon saw the opportunity to go dominate that sector. Now, Okta comes at this from an enterprise perspective, bringing top-down trust to the equation, throwing a big blanket over all the discrete SaaS platforms, and unifying employee access.
Okta's timing was perfect. It was founded in 2009, just as the massive SaaS-ification trend was happening around CRM and HR and service management and cloud, etc. But the one thing that Okta didn't have that Auth0 does is serious developer chops. While Okta was crushing it with its enterprise sales strategy, Auth0 was laser-focused on developers and building a bottoms-up approach to identity. By acquiring Auth0, Okta can dominate both sides of the barbell and then capture the fat middle. So yes, it's a pricey acquisition, but in our view it's a great move by McKinnon. Now, I don't know McKinnon personally, but last week I spoke to Arun Shrestha, who's the CEO of security specialist BeyondID; they're a platinum services partner of Okta and a zero-trust expert. He worked for Okta for a number of years and shared with me a bit about McKinnon's style and think-big approach. Arun said something that caught my attention. He said firewalls used to be the perimeter, now people are. And while that's self-serving to Okta, and probably BeyondID, it's true: people, apps, and data are the new perimeter, and they're not in one location, and that's the point.

Now, unfortunately, I had lined up an interview with Diya Jolly, who is the chief product officer at Okta and a CUBE alum, for this past week, knowing that we were running this segment in this episode, but she unfortunately fell ill the day of our interview and had to cancel. But I want to follow up with her and understand how she's thinking about connecting the dots with Auth0, with devs and enterprises, and really test our thesis there. This is a really interesting chess match that's going on.

Let's look a little deeper into that identity space. This chart shows some of the major identity players; it has some of the leaders in the identity market, and there's a breakdown of ETR's Net Score. Now, Net Score comprises five elements: the lime green is "we're adding the platform," it's new; the forest green is "we're spending six percent or more relative to last year"; the gray is flat spend, plus or minus five percent; the pinkish is "we're spending less"; and the bright red is "we're exiting the platform," retiring it. Now, you subtract the red from the green and that gets you the result for Net Score, which you can see superimposed on the right-hand chart at the bottom, that first column there. The far column is Shared N, which indicates the number of responses and is a proxy for presence in the market. Now look at the top two players in terms of spending momentum. SailPoint is right there, but Auth0 combined with Okta's distribution channel will extend Okta's lead significantly, in our view. And then there's Microsoft. Now, just a caveat: this includes all of Microsoft's security offerings, not just identity, but it's there for context. And CyberArk as well includes its acquisition of Idaptive, but also other parts of CyberArk's portfolio. So you can see some of the other names that are there, many of which you'll find in the Gartner Magic Quadrant for identity. And as we said, we really like this move by Okta. It combines positive market forces with lead offerings from very well-run companies that have winning DNA and passionate people.

Now, to further emphasize what's happening here, take a look at this. This chart shows ETR data for Okta within SailPoint and CyberArk accounts. Out of the 230 CyberArk and SailPoint customers in the data set, there are 81 Okta accounts; that's a 35% overlap. And the good news for Okta is that within that base of SailPoint and CyberArk accounts, Okta is shown by the Net Score line, that green line, to have very elevated spending momentum.
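Based on the description above, Net Score is just the green percentages minus the red ones. Here is a small sketch with a made-up response breakdown, plus the 81-of-230 overlap arithmetic cited in the passage.

```python
def net_score(new_pct, up_pct, flat_pct, down_pct, churn_pct):
    """Net Score as described above: the greens (adding, spending more)
    minus the reds (spending less, leaving). Inputs are percentages of respondents."""
    assert abs((new_pct + up_pct + flat_pct + down_pct + churn_pct) - 100) < 1e-6
    return (new_pct + up_pct) - (down_pct + churn_pct)

# Hypothetical response breakdown for one vendor.
print(net_score(new_pct=20, up_pct=45, flat_pct=25, down_pct=7, churn_pct=3))  # -> 55

# Account overlap cited above: 81 Okta accounts among 230 SailPoint/CyberArk customers.
print(f"{81 / 230:.0%} overlap")  # -> 35%
```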
And the kicker is, if you read the fine print in the right-hand column, ETR correctly points out that while SailPoint and CyberArk have long been partners with Okta, at the recent Oktane21 event, Okta's big customer event, the company announced that it was expanding into privileged access management (PAM) and identity governance. Hello, and welcome to coopetition in the 2020s. Now, our current thinking is that this bodes very well for Okta, and CyberArk and SailPoint, well, they're going to have to make some counter moves to fend off the onslaught that is coming.

Now let's wrap up with what has become a tradition in our quarterly security updates: looking at those two dimensions of Net Score and market share, we're going to see which companies crack the top 10 for both measures within the ETR data set. We do this every quarter. So here on the left we have the top 20 sorted by Net Score, or spending momentum, and on the right we sort by Shared N; so again, the top 20, which informs Shared N and forms the market share metric, or presence in the data set. The red horizontal lines, those two lines on each, separate the top 10 from the remaining 10 within those top 20. In our method, we assign four stars to those companies that crack the top ten for both metrics. So again you see Microsoft, Palo Alto Networks, Okta, CrowdStrike, and Fortinet. Fortinet, by the way, didn't make it last quarter; they've kind of been in and out and on the bubble, but you know, this company is very strong and doing quite well. Only the other four did last quarter; it was the same four last quarter. And we give two stars to those companies that make it in both categories within the top 20 but didn't make the top 10: so Cisco; Splunk, which has been steadily decelerating from a spending momentum standpoint; and Zscaler, which is just on the cusp. You know, we really like Zscaler and the company has great momentum, but that's the methodology; it is what it is. Now, you can see we kept Carbon Black on the rightmost chart; it's kind of cut off, it's number 21, only because they're just outside looking in on Net Score; you see them there, they're just below, at number 11 on Net Score. And given VMware's presence in the market, we think that Carbon Black is really worth paying attention to.
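The star methodology just described reduces to a simple ranking check. Here is a sketch with illustrative ranks only, not the actual survey results.

```python
def assign_stars(net_score_rank: dict, shared_n_rank: dict) -> dict:
    """Four stars if a vendor is top 10 on both Net Score and Shared N,
    two stars if top 20 on both, per the methodology described above."""
    stars = {}
    for vendor in set(net_score_rank) & set(shared_n_rank):
        ns, sn = net_score_rank[vendor], shared_n_rank[vendor]
        if ns <= 10 and sn <= 10:
            stars[vendor] = 4
        elif ns <= 20 and sn <= 20:
            stars[vendor] = 2
    return stars

# Illustrative ranks only.
ns = {"Microsoft": 2, "Okta": 5, "Cisco": 14, "Splunk": 16}
sn = {"Microsoft": 1, "Okta": 9, "Cisco": 3, "Splunk": 6}
print(assign_stars(ns, sn))  # e.g. {'Microsoft': 4, 'Okta': 4, 'Cisco': 2, 'Splunk': 2} (order may vary)
```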
and vmware's presence in the market we think that carbon black is really worth paying attention to okay so we're going to close with some summary and final thoughts last quarter we did a deeper dive on the solar winds hack and we think the ramifications are significant it has set the stage for a new era of escalation and adversary sophistication now major change we see is a heightened awareness that when you find intruders you'd better think very carefully about your next moves when someone breaks into your house if the dog barks or if you come down with a baseball bat or other weapon you might think the intruder is going to flee but if the criminal badly wants what you have in your house and it's valuable enough you might find yourself in a bloody knife fight or worse what's happening is intruders come to your company via island hopping or inside or subterfuge or whatever method and they'll live off the land stealthily using your own tools against you so they can you can't find them so easily so instead of injecting new tools in that send off an alert they just use what you already have there that's what's called living off the land they'll steal sensitive data for example positive covid test results when that was really really sensitive obviously still is or other medical data and when you retaliate they will double extort you they'll encrypt your data and hold it for ransom and at the same time threaten to release the sensitive information to crushing your brand in the process so your response must be as stealthy as their intrusion as you marshal your resources and devise an attack plan you face serious headwinds not only is this a complicated situation there's your ongoing and acute talent shortage that you tell us about all the time many companies are mired in technical debt that's an additional challenge and then you've got to balance the running of the business while actually affecting a digital transformation that's very very difficult and it's risky because the more digital you become the more exposed you are so this idea of zero trust people used to call it a buzzword it's now a mandate along with automation because you just can't throw labor at the problem this is all good news for investors as cyber remains a market that's ripe for valuation increases and m a activity especially if you know where to look hopefully we've helped you squint through the maze a little bit okay that's it for now thanks to the community for your comments and insights remember i publish each week on wikibon.com and siliconangle.com these episodes they're all available as podcasts all you do is search breaking analysis podcast put in the headphones listen when you're in your car out for your walk or run and you can always connect on twitter at divalante or email me at david.valante at siliconangle.com i appreciate the comments on linkedin and in clubhouse please follow me so you're notified when we start a room and riff on these topics and others and don't forget to check out etr.plus for all the survey data this is dave vellante for the cube insights powered by etr be well and we'll see you next time [Music] you
SUMMARY :
and on the bubble but you know this
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
2009 | DATE | 0.99+ |
20 percent | QUANTITY | 0.99+ |
six percent | QUANTITY | 0.99+ |
microsoft | ORGANIZATION | 0.99+ |
57 | QUANTITY | 0.99+ |
2021 | DATE | 0.99+ |
40 percent | QUANTITY | 0.99+ |
palo alto | ORGANIZATION | 0.99+ |
five elements | QUANTITY | 0.99+ |
81 | QUANTITY | 0.99+ |
fortinet | ORGANIZATION | 0.99+ |
tom kellerman | PERSON | 0.99+ |
palo alto | ORGANIZATION | 0.99+ |
75 percent | QUANTITY | 0.99+ |
6.5 billion | QUANTITY | 0.99+ |
australia | LOCATION | 0.99+ |
cisco | ORGANIZATION | 0.99+ |
730 million dollars | QUANTITY | 0.99+ |
sixty percent | QUANTITY | 0.99+ |
dia jolly | PERSON | 0.99+ |
france | LOCATION | 0.99+ |
more than 20 billion dollars | QUANTITY | 0.99+ |
12 and a half billion dollars | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
april last year | DATE | 0.99+ |
april this year | DATE | 0.99+ |
6 trillion dollars | QUANTITY | 0.99+ |
octa | ORGANIZATION | 0.99+ |
two stars | QUANTITY | 0.99+ |
boston | LOCATION | 0.99+ |
g2 patel | ORGANIZATION | 0.99+ |
2020s | DATE | 0.99+ |
siliconangle.com | OTHER | 0.99+ |
forty percent | QUANTITY | 0.99+ |
more than 40 percent | QUANTITY | 0.99+ |
five month | QUANTITY | 0.99+ |
vmware | ORGANIZATION | 0.99+ |
first column | QUANTITY | 0.99+ |
arun shrestha | PERSON | 0.99+ |
last week | DATE | 0.99+ |
dozens of companies | QUANTITY | 0.98+ |
both categories | QUANTITY | 0.98+ |
both measures | QUANTITY | 0.98+ |
both metrics | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
pandemic | EVENT | 0.98+ |
each week | QUANTITY | 0.98+ |
two dimensions | QUANTITY | 0.98+ |
last quarter | DATE | 0.98+ |
five acquired companies | QUANTITY | 0.98+ |
12 plus billion dollar | QUANTITY | 0.98+ |
six sectors | QUANTITY | 0.98+ |
canada | LOCATION | 0.98+ |
wipro | ORGANIZATION | 0.97+ |
january this year | DATE | 0.97+ |
last quarter | DATE | 0.97+ |
10 | QUANTITY | 0.97+ |
first one | QUANTITY | 0.97+ |
netherlands | LOCATION | 0.96+ |
accenture atos | ORGANIZATION | 0.96+ |
more than 100 n | QUANTITY | 0.96+ |
dave vellante | PERSON | 0.96+ |
each sector | QUANTITY | 0.96+ |
arun | PERSON | 0.96+ |
two lines | QUANTITY | 0.96+ |
fourth | QUANTITY | 0.96+ |
imi mobile | ORGANIZATION | 0.95+ |
Breaking Analysis: Chaos Creates Cash for Criminals & Cyber Companies
>> From The Cube Studios in Palo Alto and Boston, bringing you data-driven insights from The Cube and ETR. This is "Breaking Analysis" with Dave Vellante. >> The pandemic not only accelerated the shift to digital, but it also highlighted a rush of cyber criminal sophistication, collaboration, and chaotic responses by virtually every major company on the planet. The SolarWinds hack exposed supply chain weaknesses and so-called island hopping techniques that are exceedingly difficult to detect. Moreover, the will and aggressiveness of well-organized cybercriminals have elevated to the point where incident responses are now met with counter attacks, designed to both punish and extract money from victims via ransomware and other criminal activities. The only upshot is the cybersecurity market remains one of the most enduring and attractive investment sectors for those that can figure out where the market is headed and which firms are best positioned to capitalize. Hello, everyone. And welcome to this week's Wikibon Cube Insights powered by ETR. In this "Breaking Analysis" we'll provide our quarterly update of the security industry, and share new survey data from ETR and the Cube community that will help you navigate through the maze of corporate cyber warfare. We'll also share our thoughts on the game of 3D chess that Okta CEO, Todd McKinnon, is playing against the market. Now, we all know this market is complicated, fragmented and fast moving. And this next chart says it all. It's an interactive graphic from Optiv, a Denver, Colorado-based SI that's focused on cybersecurity. They've done some really excellent research and put together this awesome taxonomy, and it maps vendor names therein. And this helps users navigate the complex security landscape. And there are over a dozen major sectors, high-level sectors within the security taxonomy and nearly 60 subsectors. From monitoring, vulnerability assessment, identity, asset management, firewalls, automation, cloud, data center, SIEM, threat detection and intelligence, endpoint, network, and so on and so on and so on. But this is a terrific resource, and it's going to help you understand where players fit and help you connect the dots in the space. Now let's talk about what's going on in the market. The dynamics in this crazy mess of a landscape are really confusing sometimes. Now, since the beginning of cyber time, we've talked about the increasing sophistication of the adversary, and the back and forth escalation between good and evil. And unfortunately, this trend is unlikely to stop. Here's some data from Carbon Black's annual modern bank heist report. This is the fourth such study, and of course Carbon Black is now a VMware brand since the acquisition, which helped catalyze the creation of VMware's cloud security division. Destructive malware attacks, according to the recent study, are up 118% from last year. Now, one major takeaway from the report is that hackers aren't just conducting wire fraud, though they are; 57% of the banks surveyed saw an increase in wire fraud. But the cybercriminals are also targeting non-public information such as future trading strategies. This allows the bad guys to front-run large block trades and profit. It's become a very lucrative practice. Now the prevalence of so-called island hopping is up 38% from already elevated levels. This is where a virus enters a company supply chain via a partner, and then often connects with other stealthy malware downstream.
These techniques are more common where the malware will actually self-form with other infected parts of the supply chain and create actions with different signatures, designed to identify and exfiltrate valuable information. It's a really complex problem. Of major concern is that 63% of banking respondents in the study reported that responses to incidents were then met with retaliation designed to intimidate, or initiate ransomware attacks to extract a final pound of flesh from the victim. Notably, the study found that 75% of CISOs reported to the CIO, which many feel is not the right regime. The study called for a rethinking of the right cyber regime, where the CISO has increased responsibility and a direct reporting line to the CEO, or perhaps the COO, with greater exposure to boards of directors. So, many thanks to VMware and Tom Kellerman specifically for sharing this information with us this past week. Great work by your team. Now, some of the themes that we've been talking about for several quarters are shown in the lower half of the chart. Cloud, of course, is the big driver, thanks to work-from-home and to the pandemic. And the interesting corollary, of course, is we see a rapid rethinking of endpoint and identity access management, and the concept of zero trust. In a recent ESG survey, two thirds of respondents said that their use of cloud computing necessitated a change in how they approach identity access management. Now, as shown in the chart from Optiv, the market remains highly fragmented, and M&A is, of course, way up. Now, based on our research, it looks like transaction volume has increased more than 40% just in the last five months. So let's dig into the M&A, the merger and acquisition trends, for just a moment. We took a five-month snapshot and we were able to count about 80 deals that were completed in that timeframe. Those transactions represented more than $20 billion in value. Some of the larger ones are highlighted here. The biggest, of course, being Thoma Bravo taking Proofpoint private for a $12 plus billion price tag. The stock went from the low 130s and is trading in the low 170s based on the $176 per share offer. So there's your arbitrage, folks. Go for it. Perhaps the more interesting acquisition was Auth0 by Okta for $6.5 billion, which we're going to talk about more in a moment. There was more private equity action we saw as Insight bought Armis, an IoT security play, and Cisco shelled out $730 million for IMImobile, which is more of an adjacency to cyber, but it's going to go under Cisco's security and applications business run by Jeetu Patel. But these are just the tip of the iceberg. Some of the themes that we see connecting the dots of these acquisitions are, first, SIs like Accenture, Atos and Wipro are making moves in cyber to go local. They're buying SecOps expertise, as I say, locally in places like France, Germany, Netherlands, Canada, and Australia, that last mile, that belly to belly intimate service. Israeli-based startups chalked up five acquired companies in the space over the last five months. Also, financial services firms are getting into the act, with Goldman and MasterCard making moves to own their own part of the stack themselves to combat things like fraud and identity theft. And then finally, numerous moves to expand markets. Okta with Auth0, CrowdStrike buying a log management company, Palo Alto picking up dev ops expertise, Rapid7 shoring up its Kubernetes chops, Tenable expanding beyond insights and going after identity, interesting.
Fortinet filling gaps in a multi-cloud offering. SailPoint extending to governance, risk and compliance, GRC. Zscaler picked up an Israeli firm to fill gaps in access control. And then VMware buying Mesh7 to secure modern app development and distribution service. So tons and tons of activity here. Okay, so let's look at some of the ETR data to put the cyber market in context. ETR uses the concept of market share, it's one of the key metrics, which is a measure of pervasiveness in the dataset. So for each sector, it calculates the number of respondents for that sector divided by the total to get a sense for how prominent the sector is within the CIO and IT buyer communities. Okay, this chart shows the full ETR sector taxonomy with security highlighted across three survey periods; April last year, January this year, and April this year. Now you wouldn't expect big moves in market share over time. So it's relatively stable by sector, but the big takeaway comes from observing which sectors are most prominent. So you see that red line, that dotted line imposed at the 60% level? You can see there are only six sectors above that line, and cyber security is one of them. Okay, so we know that security is important and a large market. But this puts it in the context of the other sectors. However, we know from previous Breaking Analysis episodes that despite the importance of cyber, and the urgency catalyzed by the pandemic, budgets unfortunately are not unlimited, and spending is bounded. It's not an open checkbook for CSOs, as shown in this chart. This is a two-dimensional graphic showing market share, or pervasiveness, on the horizontal axis and net score on the vertical axis. Net score is ETR's measurement of spending velocity. And we've superimposed a red line at 40% because anything over 40%, we consider extremely elevated. We've filtered and limited the number of sectors to simplify the graphic. And you can see, in the sectors that we've highlighted, only the big four are above that 40% line; AI, containers, RPA, and cloud. They exceed that sort of 40% magic waterline. Information security, you can see that as highlighted, and it's respectable, but it competes for budget with other important sectors. So this of course creates challenges for organizations, because not only are they strapped for talent as we've reported, they, like everyone else in IT, face ongoing budget pressures. Research firm Cybersecurity Ventures estimates that in 2021, $6 trillion worldwide will be lost on cyber crime. Conversely, research firm Canalys pegs security spending somewhere around $60 billion annually. IDC has it higher, around $100 billion. So either way, we're talking about spending between 1% and 1.6% annually of what the bad guys are taking out. That's peanuts really when you consider the consequences. So let's double-click into the cyber landscape a bit and further look at some of the companies. Here's that same X/Y graphic with the companies ETR captures from respondents in the cybersecurity sector. That's what's shown on the chart here. Now, the usefulness of the red lines is this: 20% on the horizontal indicates the largest presence in the survey, and the magic 40% line that we talked about earlier shows those firms with the most elevated momentum. Only Microsoft and Palo Alto exceed both high watermarks. Of course, Splunk and Cisco are prominent horizontally. And there are numerous companies to the left of the 20% line and many above that 40% high watermark on the vertical axis.
Now, the bottom left quadrant includes many of the legacy names that have been around for a long time. And there are dozens of companies that show spending momentum on their platforms, i.e. above single digits. So that picture is, like the first one we showed you, a very, very crowded space. So let's filter it a bit and only include companies in the ETR survey that had at least 100 responses. So an N of 100 or greater. So it's a little easier to read, but still it's kind of crowded when you think about it. Okay, so same graphic, and we've superimposed the data that determined the plot position over in the bottom right there. So there's net score and Shared N, including only companies with more than 100 N. So what does this data tell us about the market? Well, Microsoft is dominant as always, it seems, in all dimensions, but let's focus on that red line for a moment. Some of the names that we've highlighted over the past two years show very well here. First, I want to talk about Palo Alto Networks. Pre-COVID, as you might recall, we highlighted the valuation divergence between Palo Alto and Fortinet. And we said Fortinet was executing better on its cloud strategy, and Palo Alto was at the time struggling with the transition, especially with its go-to-market and its sales force compensation, and really refreshing its portfolio. But we told you that we were bullish on Palo Alto Networks at the time because of its track record, and the fact that CIOs consistently told us that they saw Palo Alto as a thought leader in the space that they wanted to work with. They said that Palo Alto was the gold standard, the best, especially larger company CISOs. So that gave us confidence that Palo Alto, a very well-run company, was going to get its act together and perform better. And Palo Alto has done just that. As we expected, they've done very well, rapidly moving customers to the next generation of platforms. And we're very impressed by the company's execution. And the stock has generally reflected that. Now, some other names that hit our radar in the ETR data a couple of years ago continue to perform well. CrowdStrike, Zscaler, SailPoint, and CloudFlare. Now, CloudFlare just reported and beat earnings but was off; the stock fell on headwinds for tech overall, the big rotation. But the company is doing very well, and they're growing rapidly and they have momentum, as you can see from the ETR data. Now, we put that double star around Proofpoint to highlight that it was worthy of fetching $12.5 billion from a private equity firm. So nice exit there, supporting the continued consolidation trend that we've predicted in cybersecurity. Now let's turn our attention to Okta and Auth0. This is where it gets interesting, and is a clever play for Okta we think, and we want to drill into it a bit. Okta is acquiring Auth0 for big money. Why? Well, we think Todd McKinnon, Okta's CEO, wants to run the table on identity and then continue to expand its TAM. It has to do that to justify its lofty valuation. So Okta's ascendancy around identity and single sign-on is notable. The fragmented pictures that we've shown you, they scream out for simplification and trust, and that's what Okta brings. But it competes with some major players, most notably Microsoft with Active Directory. So look, of course, Microsoft is going to dominate in its massive customer base, but the rest of the market, that's like (indistinct) wide open. And we think McKinnon saw the opportunity to go dominate that sector.
Now, Okta comes at this from an enterprise perspective, bringing top-down trust to the equation, and throwing a big blanket over all the discrete SaaS platforms and unifying employee access. Okta's timing was perfect. It was founded in 2009, just as the massive SaaSification trend was happening around CRM and HR, and service management and cloud, et cetera. But the one thing that Okta didn't have that Auth0 does is serious developer chops. While Okta was crushing it with its enterprise sales strategy, Auth0 was laser-focused on developers and building a bottoms-up approach to identity. By acquiring Auth0, Okta can dominate both sides of the barbell and then capture the fat middle. So yes, it's a pricey acquisition, but in our view, it's a great move by McKinnon. Now, I don't know McKinnon personally, but last week I spoke to Arun Shrestha, who's the CEO of security specialist BeyondID. They're a platinum services partner of Okta, and they're a zero trust expert. He worked for Okta for a number of years and shared with me a bit about McKinnon's style and his think-big approach. Arun said something that caught my attention. He said, firewalls used to be the perimeter, now people are. And while that's self-serving to Okta and probably BeyondID, it's true. People, apps and data are the new perimeter, and they're not in one location. And that's the point. Now, unfortunately, I had lined up an interview with Diya Jolly, who is the chief product officer at Okta and a Cube alum, for this past week, knowing that we were running this segment in this episode, but she unfortunately fell ill the day of our interview and had to cancel. But I want to follow up with her and understand how she's thinking about connecting the dots with Auth0, with devs and enterprises, and really test our thesis there. This is a really interesting chess match that's going on. Let's look a little deeper into that identity space. This chart here shows some of the major identity players. It has some of the leaders in the identity market, and a breakdown of ETR's net score. Now, net score comprises five elements. The lime green is we're adding the platform new. The forest green is we're spending 6% or more relative to last year. The gray is flat spend, plus or minus 5%. The pinkish is spending less. And the bright red is we're exiting the platform, retiring. Now you subtract the red from the green, and that gets you the result for net score, which you can see superimposed on the right hand chart at the bottom, that first column there. The far column is Shared N, which informs and indicates the number of responses and is a proxy for presence in the market. Oh, look at the top two players in terms of spending momentum. Now SailPoint is right there, but Auth0 combined with Okta's distribution channel will extend Okta's lead significantly in our view. And then there's Microsoft. Now, just a caveat, this includes all of Microsoft's security offerings, not just identity, but it's there for context. And CyberArk as well includes its acquisition of Idaptive, but also other parts of CyberArk's portfolio. So you can see some of the other names that are there, many of which you'll find in the Gartner Magic Quadrant for identity. And as we said, we really like this move by Okta. It combines positive market forces with lead offerings from very well-run companies that have winning DNA and passionate people. Now, to further emphasize what's happening here, take a look at this.
This chart shows ETR data for Okta within SailPoint and CyberArk accounts. Out of the 230 CyberArk and SailPoint customers in the dataset, there are 81 Okta accounts. That's a 35% overlap. And the good news for Okta is that within that base of SailPoint and CyberArk accounts, Okta is shown by the net score line; that green line shows very elevated spending momentum. And the kicker is, if you read the fine print in the right hand column, ETR correctly points out that while SailPoint and CyberArk have long been partners with Okta, at the recent Octane21 event, Okta's big customer event, the company announced that it was expanding into privileged access management, PAM, and identity governance. Hello, and welcome to co-opetition in the 2020s. Now, our current thinking is that this bodes very well for Okta. And CyberArk and SailPoint? Well, they're going to have to make some counter moves to fend off the onslaught that is coming. Now, let's wrap up with what has become a tradition in our quarterly security updates. Looking at those two dimensions of net score and market share, we're going to see which companies crack the top 10 for both measures within the ETR dataset. We do this every quarter. So here on the left, we have the top 20, sorted by net score, or spending momentum, and on the right, we sort by Shared N. So it's again the top 20, and Shared N informs the market share metric, or presence in the dataset. The red horizontal lines, those two lines, one on each chart, separate the top 10 from the remaining 10 within those top 20. And our method, what we do is we assign four stars to those companies that crack the top 10 for both metrics. So again, you see Microsoft, Palo Alto Networks, Okta, CrowdStrike, and Fortinet. Fortinet, by the way, didn't make it last quarter. They've kind of been in and out and on the bubble, but the company is very strong, and doing quite well. The other four did make it last quarter; they were the same four last quarter. And we give two stars to those companies that make it in both categories within the top 20 but didn't make the top 10. So Cisco; Splunk, which has been steadily decelerating from a spending momentum standpoint; and Zscaler, which is just on the cusp. We really like Zscaler and the company has great momentum, but that's the methodology. It is what it is. Now you can see we kept Carbon Black on the rightmost chart, it's kind of cut off, it's number 21, only because they're just outside looking in on net score. You see them there, they're just below on net score, number 11. And given VMware's presence in the market, we think that Carbon Black is really worth paying attention to. Okay, so we're going to close with some summary and final thoughts. Last quarter, we did a deeper dive on the SolarWinds hack, and we think the ramifications are significant. It has set the stage for a new era of escalation and adversary sophistication. Now, a major change we see is a heightened awareness that when you find intruders, you'd better think very carefully about your next moves. When someone breaks into your house, if the dog barks, or if you come down with a baseball bat or other weapon, you might think the intruder is going to flee. But if the criminal badly wants what you have in your house and it's valuable enough, you might find yourself in a bloody knife fight or worse. Well, what's happening is intruders come to your company via island hopping or insider subterfuge or whatever method.
And they'll live off the land, stealthily using your own tools against you so that you can't find them so easily. So instead of injecting new tools that send off an alert, they just use what you already have there. That's what's called living off the land. They'll steal sensitive data, for example, positive COVID test results when that was really, really sensitive, obviously still is, or other medical data. And when you retaliate, they will double-extort you. They'll encrypt your data and hold it for ransom, and at the same time threaten to release the sensitive information, crushing your brand in the process. So your response must be as stealthy as their intrusion, as you marshal your resources and devise an attack plan. And you face serious headwinds. Not only is this a complicated situation, there's your ongoing and acute talent shortage that you tell us about all the time. Many companies are mired in technical debt, that's an additional challenge. And then you've got to balance the running of the business while actually effecting a digital transformation. That's very, very difficult, and it's risky, because the more digital you become, the more exposed you are. So this idea of zero trust, people used to call it a buzzword, it's now a mandate, along with automation, because you just can't throw labor at the problem. This is all good news for investors, as cyber remains a market that's ripe for valuation increases and M&A activity, especially if you know where to look. Hopefully we've helped you squint through the maze a little bit. Okay, that's it for now. Thanks to the community for your comments and insights. Remember, I publish each week on wikibon.com and siliconangle.com. These episodes, they're all available as podcasts. All you've got to do is search Breaking Analysis podcast, put in the headphones, listen when you're in your car, or out for your walk or run, and you can always connect on Twitter @DVellante, or email me at david.vellante@siliconangle.com. I appreciate the comments on LinkedIn, and in Clubhouse, please follow me, so you're notified when we start a room and riff on these topics and others. And don't forget to check out etr.plus for all the survey data. This is Dave Vellante for The Cube Insights powered by ETR. Be well, and we'll see you next time. (light instrumental music)
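For readers who want to see the net score arithmetic and the star-rating methodology described in this episode in concrete form, here is a minimal sketch in Python. The vendor names, response tallies, Shared N values and the top-2 cutoff are hypothetical stand-ins (the episode uses the top 10 within the top 20); ETR's actual survey data and tooling are not reproduced here.

```python
# Minimal sketch of the net score and star-rating arithmetic described above.
# Vendor names and tallies are made up for illustration only.

def net_score(adding, spending_more, flat, spending_less, replacing):
    """Net score = (% adding + % spending 6% or more) - (% spending less + % replacing)."""
    total = adding + spending_more + flat + spending_less + replacing
    positive = (adding + spending_more) / total
    negative = (spending_less + replacing) / total
    return round(100 * (positive - negative), 1)

# Hypothetical response buckets: (adding, spending more, flat, spending less, replacing)
survey = {
    "VendorA": (30, 45, 20, 3, 2),
    "VendorB": (10, 35, 40, 10, 5),
    "VendorC": (5, 25, 50, 15, 5),
}

shared_n = {"VendorA": 240, "VendorB": 410, "VendorC": 130}  # proxy for market presence

scores = {vendor: net_score(*buckets) for vendor, buckets in survey.items()}

def stars(vendor, top_n=2):
    """Four stars if top-N on both net score and Shared N, two stars if top-2N on both."""
    by_score = sorted(scores, key=scores.get, reverse=True)
    by_presence = sorted(shared_n, key=shared_n.get, reverse=True)
    if vendor in by_score[:top_n] and vendor in by_presence[:top_n]:
        return "****"
    if vendor in by_score[:2 * top_n] and vendor in by_presence[:2 * top_n]:
        return "**"
    return ""

for vendor in survey:
    print(vendor, scores[vendor], shared_n[vendor], stars(vendor))
```

With the made-up tallies above, VendorA lands the highest net score while VendorB has the largest presence, which mirrors how a vendor can lead on momentum without leading on market share, and why the four-star cut requires both.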
SUMMARY :
This is "Breaking Analysis" and at the same time threaten to release
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Microsoft | ORGANIZATION | 0.99+ |
Fortinet | ORGANIZATION | 0.99+ |
Cisco | ORGANIZATION | 0.99+ |
Todd McKinnon | PERSON | 0.99+ |
2009 | DATE | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
April | DATE | 0.99+ |
Goldman | ORGANIZATION | 0.99+ |
Okta | ORGANIZATION | 0.99+ |
Arun Shrestha | PERSON | 0.99+ |
IMImobile | ORGANIZATION | 0.99+ |
$12 | QUANTITY | 0.99+ |
Netherlands | LOCATION | 0.99+ |
Canada | LOCATION | 0.99+ |
6% | QUANTITY | 0.99+ |
SailPoint | ORGANIZATION | 0.99+ |
France | LOCATION | 0.99+ |
$730 million | QUANTITY | 0.99+ |
2021 | DATE | 0.99+ |
Accenture | ORGANIZATION | 0.99+ |
$12.5 billion | QUANTITY | 0.99+ |
Atos | ORGANIZATION | 0.99+ |
Auth0 | ORGANIZATION | 0.99+ |
Palo Alto | ORGANIZATION | 0.99+ |
Carbon Black | ORGANIZATION | 0.99+ |
Palo Alto Networks | ORGANIZATION | 0.99+ |
CrowdStrike | ORGANIZATION | 0.99+ |
20% | QUANTITY | 0.99+ |
Germany | LOCATION | 0.99+ |
billion | QUANTITY | 0.99+ |
Diya Jolly | PERSON | 0.99+ |
60% | QUANTITY | 0.99+ |
Australia | LOCATION | 0.99+ |
63% | QUANTITY | 0.99+ |
35% | QUANTITY | 0.99+ |
Palo Alto | LOCATION | 0.99+ |
more than $20 billion | QUANTITY | 0.99+ |
five-month | QUANTITY | 0.99+ |
five elements | QUANTITY | 0.99+ |
Tom Kellerman | PERSON | 0.99+ |
VMware | ORGANIZATION | 0.99+ |
40% | QUANTITY | 0.99+ |
First | QUANTITY | 0.99+ |
Jeetu Patel | PERSON | 0.99+ |
Splunk | ORGANIZATION | 0.99+ |
75% | QUANTITY | 0.99+ |
6.5 billion | QUANTITY | 0.99+ |
CyberArk | ORGANIZATION | 0.99+ |
$6 trillion | QUANTITY | 0.99+ |
last year | DATE | 0.99+ |
MasterCard | ORGANIZATION | 0.99+ |
Wipro | ORGANIZATION | 0.99+ |
two stars | QUANTITY | 0.99+ |
Last quarter | DATE | 0.99+ |
81 | QUANTITY | 0.99+ |
Cannolis | ORGANIZATION | 0.99+ |
Daphne Koller, insitro | WiDS Women in Data Science Conference 2020
>> Live from Stanford University, it's theCUBE, covering Stanford Women in Data Science 2020, brought to you by SiliconANGLE Media. >> Hi, and welcome to theCUBE. I'm your host, Sonia Tagare, and we're live at Stanford University, covering WiDS, the Women in Data Science conference, the fifth annual one. And joining us today is Daphne Koller, who is the co-founder, who, sorry, is the CEO and founder of insitro. Daphne, welcome to theCUBE. >> Nice to be here, Sonia. Thank you for having me. >> So tell us a little bit about insitro, how you got it founded, and more about your role. >> So I've been working in the intersection of machine learning and biology and health for quite a while, and it was always a bit of an interesting journey, in that the data sets were quite small and limited. We're now in a different world where there's tools that are allowing us to create massive biological data sets that I think can help us solve really significant societal problems. And one of those problems that I think is really important is drug discovery and development, where despite many important advancements, the costs just keep going up and up and up, and the question is, can we use machine learning to solve that problem better? >> And you talk about this more in your keynote, so give us a few highlights of what you talked about. >> So you can think of drug discovery and development in the last 50 to 70 years as being a bit of a glass half-full, glass half-empty. The glass half-full is the fact that there's diseases that used to be a death sentence, or a sentence to a lifelong of pain and suffering, that are now addressed by some of the modern-day medicines, and I think that's absolutely amazing. The other side of it is that the cost of developing new drugs has been growing exponentially, in what's come to be known as Eroom's Law, being the inverse of Moore's Law, which is the one we're all familiar with, because the number of drugs approved per billion U.S. dollars just keeps going down exponentially. So the question is, can we change that curve? >> And you talked in your keynote about the interdisciplinary culture, so tell us more about that. >> I think in order to address some of the critical problems that we're facing, one needs to really build a culture of people who work together from different disciplines, each bringing their own insights and their own ideas into the mix. So at insitro we actually have a company that's half life scientists, many of whom are producing data for the purpose of driving machine learning models, and the other half are machine learning people and data scientists who are working on those. But it's not a handoff, where one group produces the data and the other one consumes and interprets it, but really they start from the very beginning to understand what are the problems that one could solve together, how do you design the experiment, how do you build the model, and how do you derive insights from that that can help us make better medicines for people. >> And I also wanted to ask you, you co-founded Coursera, so tell us a little bit more about that platform. >> So I founded Coursera as a result of work that I'd been doing at Stanford, working on how technology can make education better and more accessible. This was a project that I did here with a number of my colleagues as well, and at some point in the fall of 2011 there was an experiment: let's take some of the content that we've been developing within Stanford and put it out there for people to just benefit from, and we didn't know what would happen, would it be a few thousand people. But within a matter of weeks, with minimal advertising other than one New York Times article that went viral, we had a hundred thousand people in each of those courses. And that was a moment in time where, you know, we looked at this and said, can we just go back to writing more papers, or is there an incredible opportunity to transform access to education for people all over the world? And so I ended up taking what was supposed to be a two-year leave of absence from Stanford to go and co-found Coursera, and I thought I'd go back after two years, but at the end of that two-year period there was just so much more to be done, and so much more impact that we could bring to people all over the world, people of both genders, people of different socioeconomic status, every single country around the world. I just felt like this was something that I couldn't not do. >> And why did you decide to go from an educational platform to then going into machine learning and biomedicine? >> So I'd been doing Coursera for about five years. In 2016, the company was on a great trajectory, but it's primarily a content company, and around me machine learning was transforming the world, and I wanted to come back and be part of that. And when I looked around, I saw machine learning being applied to e-commerce and to natural language and to self-driving cars, but there really wasn't a lot of impact being made on the life science area, and I wanted to be part of making that happen. Partly because I felt, coming back to our earlier comment, that in order to really have that impact you need to have someone who speaks both languages, and while there's a new generation of researchers who are bilingual in biology and in machine learning, it's still a small group, and there are very few of those in kind of my age cohort, and I thought that I would be able to have a real impact by building a company in the space. >> So it sounds like your background is pretty varied. What advice would you give to women who are just starting college now, who may be interested in a similar field? Would you tell them they have to major in math, or do you think that maybe there are some other majors that may be influential as well? >> I think there's a lot of ways to get into data science. Math is one of them, but there's also statistics or physics, and I would say that especially for the field that I'm currently in, which is at the intersection of machine learning and data science on the one hand and biology and health on the other, one can get there from biology or medicine as well. But what I think is important is not to shy away from the more mathematically oriented courses in whatever major you're in, because that foundation is a really strong one. There's a lot of people out there who are basically lightweight consumers of data science, and they don't really understand how the methods that they're deploying work, and that limits them in their ability to advance the field and come up with new methods that are better suited perhaps to the problems that they're tackling. So I think it's totally fine, and in fact there's a lot of value, to coming into data science from fields other than, sort of, computer science. But I think taking courses in those fields, even while you're majoring in whatever field you're interested in, is going to make you a much better person who lives at that intersection. >> And how do you think having a technology background has helped you in founding your companies and has helped you become a successful CEO? >> In companies that are very strongly R&D focused, like insitro and others, having a technical co-founder is absolutely essential. Because it's fine to have an understanding of whatever the user needs and so on and come from the business side of it, and a lot of companies have a business co-founder, but not understanding what the technology can actually do is highly limiting, because you end up hallucinating, oh, if we could only do this, and yeah, that would be great, but you can't. And people end up oftentimes making ridiculous promises about what technology will or will not do, because they just don't understand where the land mines sit and where you're going to hit real obstacles in the path. So I think it's really important to have a strong technical foundation in these companies. >> And that being said, where do you see insitro in the future, and how do you see it solving, say, NASH, that you talked about in your keynote? >> So we hope that insitro will be a fully integrated drug discovery and development company that is based on a slightly different foundation than a traditional pharma company, which grew up in the old approach that is very much bespoke scientific analysis of the biology of different diseases, and then going after targets or ways of dealing with the disease that are driven by human intuition. Where I think we have the opportunity to go today is to build a very data-driven approach that collects massive amounts of data and then lets analysis of those data really reveal new hypotheses that might not be the ones that accord with people's preconceptions of what matters and what doesn't. And so hopefully we'll be able to, over time, create enough data and apply machine learning to address key bottlenecks in the drug discovery and development process, so we can bring better drugs to people, and we can do it faster and hopefully at much lower cost. >> That's great. And you also mentioned in your keynote that you think that the 2020s is like a digital biology era, so tell us more about that. >> So I think if you take a historical perspective on science and think back, you realize that there's periods in history where one discipline has made a tremendous amount of progress in a relatively short amount of time, because of a new technology or a new way of looking at things. In the 1870s, that discipline was chemistry, with the understanding of the periodic table and that you actually couldn't turn lead into gold. In the 1900s, that was physics, with understanding the connection between matter and energy and between space and time. In the 1950s, that was computing, where silicon chips were suddenly able to perform calculations that up until that point only people had been able to do. And then in the 1990s there was an interesting bifurcation. One was the era of data, which is related to computing but also involves elements of statistics and optimization and neuroscience, and the other one was quantitative biology, in which biology moved from a descriptive science of taxonomizing phenomena to really probing and measuring biology in a very detailed and high-throughput way, using techniques like microarrays that measure the activity of 20,000 genes at once, or the sequencing of the human genome, and many others. But these two fields kind of evolved in parallel, and what I think is coming now, 30 years later, is the convergence of those two fields into one field that I like to think of as digital biology, where we are able, using the tools that have been and continue to be developed, to measure biology at entirely new levels of detail, of fidelity, of scale. We can use the techniques of machine learning and data science to interpret what we're seeing, and then use some of the technologies that are also emerging to engineer biology to do things that it otherwise wouldn't do. And that will have implications in biomaterials, in energy, in the environment, in agriculture, and I think also in human health. And it's an incredibly exciting space to be in right now, because just so much is happening, and the opportunities to make a difference and make the world a better place are just so large. >> That sounds awesome. Daphne, thank you for your insight, and thank you for being on theCUBE. >> Thank you. >> I'm Sonia Tagare, thanks for watching. Stay tuned for more great coverage.
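To put a rough shape on the Eroom's Law trend Koller references, here is a small illustrative sketch. The starting value and the roughly nine-year halving time come from the commonly cited Eroom's Law literature, not from this interview, so treat both numbers as assumptions used only to show the direction of the curve.

```python
# Illustrative sketch of Eroom's Law: drugs approved per billion (inflation-adjusted)
# R&D dollars falling exponentially over time. The base value of ~30 in 1950 and the
# ~9-year halving time are assumptions from the published literature, not figures
# quoted in this interview.

def drugs_per_billion(year, base_year=1950, base_value=30.0, halving_years=9.0):
    """Approximate drugs approved per $1B of R&D, assuming steady exponential decline."""
    return base_value * 0.5 ** ((year - base_year) / halving_years)

for year in (1950, 1970, 1990, 2010):
    print(year, round(drugs_per_billion(year), 2))
# Run in reverse, the same arithmetic is the familiar Moore's Law doubling,
# which is why the pharma version is "Moore" spelled backwards.
```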
SUMMARY :
in the last you can think of drug
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Daphne Koller | PERSON | 0.99+ |
Sonia | PERSON | 0.99+ |
Daphne | PERSON | 0.99+ |
1950s | DATE | 0.99+ |
1990s | DATE | 0.99+ |
Sonia - Garrett | PERSON | 0.99+ |
2016 | DATE | 0.99+ |
20,000 genes | QUANTITY | 0.99+ |
1900s | DATE | 0.99+ |
1870s | DATE | 0.99+ |
two fields | QUANTITY | 0.99+ |
one field | QUANTITY | 0.99+ |
Stanford University | ORGANIZATION | 0.99+ |
Stanford | ORGANIZATION | 0.99+ |
Coursera | ORGANIZATION | 0.98+ |
2020s | DATE | 0.98+ |
both languages | QUANTITY | 0.98+ |
both genders | QUANTITY | 0.98+ |
two | QUANTITY | 0.98+ |
fall of 2011 | DATE | 0.98+ |
two-year | QUANTITY | 0.98+ |
today | DATE | 0.97+ |
about five years | QUANTITY | 0.96+ |
30 years later | DATE | 0.93+ |
every single country | QUANTITY | 0.93+ |
WiDS Women in Data Science Conference 2020 | EVENT | 0.93+ |
one | QUANTITY | 0.91+ |
one discipline | QUANTITY | 0.9+ |
a hundred thousand people | QUANTITY | 0.9+ |
Nash | PERSON | 0.89+ |
sari | PERSON | 0.89+ |
each | QUANTITY | 0.84+ |
Silicon angle media | ORGANIZATION | 0.83+ |
few thousand people | QUANTITY | 0.83+ |
billion u.s. dollars | QUANTITY | 0.83+ |
two years | QUANTITY | 0.82+ |
New York Times | ORGANIZATION | 0.8+ |
one of those problems | QUANTITY | 0.79+ |
Moore's Law | TITLE | 0.79+ |
one group | QUANTITY | 0.79+ |
Coursera | TITLE | 0.78+ |
2020 | DATE | 0.77+ |
70 years | QUANTITY | 0.76+ |
third computer | QUANTITY | 0.74+ |
fifth annual one | QUANTITY | 0.68+ |
each of those courses | QUANTITY | 0.68+ |
science | EVENT | 0.68+ |
lot of people | QUANTITY | 0.66+ |
half | QUANTITY | 0.64+ |
per | QUANTITY | 0.49+ |
last 50 | DATE | 0.46+ |
Arun | TITLE | 0.4+ |
Scott Gnau, Hortonworks | DataWorks Summit 2018
>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE. Covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back to theCUBE's live coverage of DataWorks Summit here in San Jose, California. I'm your host, Rebecca Knight, along with my cohost James Kobielus. We're joined by Scott Gnau, he is the chief technology officer at Hortonworks. Welcome back to theCUBE, Scott. >> Great to be here. >> It's always fun to have you on the show. So, you have really spent your entire career in the data industry. I want to start off at 10,000 feet, and just have you talk about where we are now, in terms of customer attitudes, in terms of the industry, in terms of where customers feel, how they're dealing with their data and how they're thinking about their approach in their business strategy. >> Well I have to say, 30 plus years ago, starting in the data field, it wasn't as exciting as it is today. Of course, I always found it very exciting. >> Exciting means nerve-wracking. Keep going. >> Or nerve-wracking. But you know, we've been predicting it. I remember even, you know, 10, 15 years ago, before big data was a thing, it's like, oh, all this data's going to come, and it's going to be, you know, 10x what it is. And we were wrong. It was like 5000x, you know, what it is. And I think the really exciting part is that data really used to be relegated, frankly, to big companies as a derivative work of ERP systems, and so on and so forth. And while that's very interesting, and certainly enabled a whole level of productivity for industry, when you compare that to all of the data flying around everywhere today, whether it be Twitter feeds and even doing live polls, like we did in the opening session today, data is just being created everywhere. And the same thing applies to that data that applied to the ERP data of old. And that is, being able to harness, manage and understand that data is a new business-creating opportunity. And you know, we were with some analysts the other day, and I think one of the more quoted things that came out of that when I was speaking with them was really, like railroads and shipping in the 1800s and oil in the 1900s, data really is the wealth creator of this century. And so that creates a very nerve-wracking environment. It also creates an environment of very agile and very important technological breakthroughs that enable those things to be turned into wealth. >> So thinking about that, in terms of where we are at this point in time, and on the main stage this morning someone had likened it to the interstate highway system, that really revolutionized transportation, but also commerce. >> I love that actually. I may steal it in some of my future presentations. >> That's good, but we'll know where you pilfered it. >> Well, perhaps if data is oil, the edge, in containerized applications and piping data, you know, microbursts of data across the internet of things, is sort of like the new fracking. You know, you're being able to extract more of this precious resource from the territory. >> Hopefully not quite as damaging to the environment. >> Maybe not. I'm sorry for environmentalists if I just offended you, I apologize. >> But I think, you know, all of those analogies are very true, and I particularly like the interstate one this morning. Because when I think about what we've done in our core HDP platform, and I know Arun was here talking about all the great advances that we built into this, the kind of the core Hadoop platform. Very traditional.
Store data, analyze data, but also bring in new kinds of algorithms, rapid innovation and so on. That's really great, but that's kind of half of the story. In a device-connected world, in a consumer-centric world, capturing data at the edge, moving and processing data at the edge is the new normal, right? And so just like the interstate highway system actually created new ways of commerce because we could move people and things more efficiently, moving data and processing data more efficiently is kind of the second part of the opportunity that we have in this new deluge of data. And that's really where we've been with our Hortonworks DataFlow. And really saying that the complete package of managing data from origination at the edge all the way through analytics to a decision that's triggered back at the edge is like the holy grail, right? And building a technology for that footprint is why I'm certainly excited today. It's not the caffeine, it's just the opportunity of making all of that work. >> You know, one of the, I think the key announcement for me at this show, that you guys made on HDP 3.0, was containerization of more of the capabilities of your distributed environment, so that these capabilities, in terms of processing, first of all capturing and analyzing and moving that data, can be pushed closer to the endpoints. Can you speak a bit, Scott, about this new capability or this containerization support? Within HDP 3.0 but really in your broader portfolio, and where you're going with that in terms of addressing edge applications perhaps, autonomous vehicles or, you know, whatever you might put into a new smart phone or whatever you put at the edge. Describe the potential of containerization to sort of break this ecosystem wide open. >> Yeah, I think there are a couple of aspects to containerization, and by the way, we're like so excited about kind of the cloud-first, containerized HDP 3.0 that we launched here today. There's a lot of great tech that our customers have been clamoring for that they can take advantage of. And it's really just the beginning, which again is part of the excitement of being in the technology space and certainly being part of Hortonworks. So containerization affords a couple of things. Certainly, agility. Agility in deploying applications. So, you know, for 30 years we've built these enterprise software stacks that were very integrated, hugely complicated systems that could bring together multiple different applications, different workloads and manage all that in a multi-tenancy kind of environment. And that was because we had to do that, right? Servers were getting bigger, they were more powerful, but not particularly well distributed. Obviously in a containerized world, you now turn that whole paradigm on its head and you say, you know what? I'm just going to collect these three microservices that I need to do this job. I can isolate them. I can have them run in a serverless technology. I can actually allocate, in the cloud, servers to go run, and when they're done they go away. And I don't pay for them anymore. So thinking about kind of that from a software development, deployment, implementation perspective, there are huge implications, but the real value for customers is agility, right? I don't have to wait until next year to upgrade my enterprise software stack to take advantage of this new algorithm. I can simply isolate it inside of a container, have it run, and have it go away. And get the answer, right?
And so when I think about, and a number of our keynotes this morning were talking about just kind of the exponential rate of change, this is really the net new norm. Because the only way we can do things faster, is in fact to be able to provide this. >> And it's not just microservices. Also orchestrating them through Kubernetes, and so forth, so they can be. >> Sure. That's the how versus yeah. >> Quickly deployed as an ensemble and then quickly de-provisioned when you don't need them anymore. >> Yeah so then there's obviously the cost aspect, right? >> Yeah. >> So if you're going to run a whole bunch of stuff or even if you have something as mundane as a really big merge join inside of hive. Let me spin up a thousand extra containers to go do that big thing, and then have them go away when it's done. >> And oh, by the way, you'll be deployed on. >> And only pay for it while I'm using it. >> And then you can possibly distribute those containers across different public clouds depending on what's most cost effective at any point in time Azure or AWS or whatever it might be. >> And I tease with Arun, you know the only thing that we haven't solved is for the speed of light, but we're working on it. >> In talking about how this warp speed change, being the new norm, can you talk about some of the most exciting use cases you've seen in terms of the customers and clients that are using Hortonworks in the coolest ways. >> Well I mean obviously autonomous vehicles is one that we all captured all of our imagination. 'Cause we understand how that works. But it's a perfect use case for this kind of technology. But the technology also applies in fraud detection and prevention. It applies in healthcare management, in proactive personalized medicine delivery, and in generating better outcomes for treatment. So, you know, all across. >> It will bind us in every aspect of our lives including the consumer realm increasingly, yeah. >> Yeah, all across the board. And you know one of the things that really changed, right, is well a couple things. A lot of bandwidth so you can start to connect these things. The devices themselves are particularly smart, so you don't any longer have to transfer all the data to a mainframe and then wait three weeks, sorry, wait three weeks for your answer and then come back. You can have analytic models running on and edge device. And think about, you know, that is really real time. And that actually kind of solves for the speed of light. 'Cause you're not waiting for those things to go back and forth. So there are a lot of new opportunities and those architectures really depend on some of the core tenets of ultimately containerization stateless application deployment and delivery. And they also depend on the ability to create feedback loops to do point-to-point and peer kinds of communication between devices. This is a whole new world of how data get moved and how the decisions around date movement get made. And certainly that's what we're excited about, building with the core components. The other implication of all of this, and we've know each other for a long time. Data has gravity. Data movements expensive. It takes time, frankly, you have to pay for the bandwidth and all that kind of stuff. So being able to play the data where it lies becomes a lot more interesting from an application portability perspective and with all of these new sensors, devices and applications out there, a lot more data is living its entire lifecycle in the cloud. 
And so being able to create that connective tissue. >> Or as being as terralexical on the edge. >> And even on the edge. >> And with machine learning, let me just butt in for a second. One of the areas that we're focusing on increasingly in Wikibon, in terms of our focus on machine learning at the edge, is more and more machine learning frameworks are coming into the browser world. JavaScript for the most part, like TensorFlow.js, you know, more of this inferencing and training is going to happen inside your browser. That blows a lot of people's minds. It may not be heavy hitting machine learning, but it'll be good enough for a lot of things that people do in their normal life, where you don't want to round trip back to the cloud. It's all happening right there, in, you know, Chrome or whatever you happen to be using.
So it's a combination of things. I'm seeing a really good supply there; obviously we invest in education through the community. And then frankly, the education system itself, and folks saying this is really the hot job of the next century. You know, I can be the new oil baron. Or I can be the new railroad captain. It's actually creating more supply which is also very helpful. >> Data's the heart of what I call the new STEM cell. It's science, technology, engineering, mathematics that you want to implant in the brains of the young as soon as possible. I hear ya. >> Yeah, absolutely. >> Well Scott thanks so much for coming on. But I want to first also, we can't let you go without the fashion statement. You arrived on set wearing it. >> The elephants. >> I mean it was quite a look. >> Well I did it because then you couldn't see I was sweating on my brow. >> Oh please, no, no, no. >> 'Cause I was worried about this tough interview. >> You know one of the things I love about your logo, and I'll just, you know, sounds like I'm fawning. The elephant is a very intelligent animal. >> It is indeed. >> My wife's from Indonesia. I remember going back one time they had Asian elephants at one of these safari parks. And watching it perform, and then my son was very little then. The elephant is a very sensitive, intelligent animal. You don't realize 'till you're up close. They pick up all manner of social cues. I think it's an awesome symbol for a company that's all about data driven intelligence. >> The elephant never forgets. >> Yeah. >> That's what we know. >> That's right, we never forget. >> He won't forget 'cause he's got a brain. Or she, I'm sorry. He or she has a brain. >> And it's data driven. >> Yeah. >> Thanks very much. >> Great. Well thanks for coming on theCUBE. I'm Rebecca Knight for James Kobielus. We will have more coming up from DataWorks just after this. (upbeat music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Rebecca Knight | PERSON | 0.99+ |
James Kobielus | PERSON | 0.99+ |
Scott | PERSON | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Scott Gnau | PERSON | 0.99+ |
Indonesia | LOCATION | 0.99+ |
three weeks | QUANTITY | 0.99+ |
30 years | QUANTITY | 0.99+ |
10x | QUANTITY | 0.99+ |
San Jose | LOCATION | 0.99+ |
Marriott | ORGANIZATION | 0.99+ |
San Jose, California | LOCATION | 0.99+ |
1900s | DATE | 0.99+ |
1800s | DATE | 0.99+ |
10,000 feet | QUANTITY | 0.99+ |
Silicone Valley | LOCATION | 0.99+ |
one piece | QUANTITY | 0.99+ |
Dataworks Summit | EVENT | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Chrome | TITLE | 0.99+ |
theCUBE | ORGANIZATION | 0.99+ |
next year | DATE | 0.98+ |
next century | DATE | 0.98+ |
today | DATE | 0.98+ |
30 plus years ago | DATE | 0.98+ |
Javascript | TITLE | 0.98+ |
second part | QUANTITY | 0.98+ |
ORGANIZATION | 0.98+ | |
first | QUANTITY | 0.97+ |
Dataworks | ORGANIZATION | 0.97+ |
One | QUANTITY | 0.97+ |
5000x | QUANTITY | 0.97+ |
Datawork Summit 2018 | EVENT | 0.96+ |
HDP 3.0 | TITLE | 0.95+ |
one | QUANTITY | 0.95+ |
this morning | DATE | 0.95+ |
HDP 3.0 | TITLE | 0.94+ |
three microservices | QUANTITY | 0.93+ |
first one terabyte | QUANTITY | 0.93+ |
First | QUANTITY | 0.92+ |
DataWorks Summit 2018 | EVENT | 0.92+ |
JS | TITLE | 0.9+ |
Asian | OTHER | 0.9+ |
3.0 | TITLE | 0.87+ |
one time | QUANTITY | 0.86+ |
a thousand extra containers | QUANTITY | 0.84+ |
this morning | DATE | 0.83+ |
15 years ago | DATE | 0.82+ |
Arun | PERSON | 0.81+ |
this century | DATE | 0.81+ |
10, | DATE | 0.8+ |
first 10 terabyte | QUANTITY | 0.79+ |
couple | QUANTITY | 0.72+ |
Azure | ORGANIZATION | 0.7+ |
Kubernetes | TITLE | 0.7+ |
theCUBE | EVENT | 0.66+ |
parks | QUANTITY | 0.59+ |
a second | QUANTITY | 0.58+ |
past 10 years | DATE | 0.57+ |
number two | QUANTITY | 0.56+ |
Wikibot | TITLE | 0.55+ |
HDP | COMMERCIAL_ITEM | 0.54+ |
rd. | QUANTITY | 0.48+ |
John Kreisa, Hortonworks | DataWorks Summit 2018
>> Live from San José, in the heart of Silicon Valley, it's theCUBE! Covering DataWorks Summit 2018. Brought to you by Hortonworks. (electro music) >> Welcome back to theCUBE's live coverage of DataWorks here in sunny San José, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We're joined by John Kreisa. He is the VP of marketing here at Hortonworks. Thanks so much for coming on the show. >> Thank you for having me. >> We've enjoyed watching you on the main stage, it's been a lot of fun. >> Thank you, it's been great. It's been great general sessions, some great talks. Talking about the technology, we've heard from some customers, some third parties, and most recently from Kevin Slavin from The Shed which is really amazing. >> So I really want to get into this event. You have 2,100 attendees from 23 different countries, 32 different industries. >> Yep. This started as a small, >> That's right. tiny little thing! >> Didn't Yahoo start it in 2008? >> It did, yeah. >> You changed names a few years ago, but it's still the same event, looming larger and larger. >> Yeah! >> It's been great, it's gone international as you've said. It's actually the 17th total event that we've done. >> Yeah. >> If you count the ones we've done in Europe and Asia. It's a global community around data, so it's no surprise. The growth has been phenomenal, the energy is great, the innovations that the community is talking about, the ecosystem is talking about, is really great. It just continues to evolve as an event, it continues to bring new ideas and share those ideas. >> What are you hearing from customers? What are they buzzing about? Every morning on the main stage, you do different polls that say, "how much are you using machine learning? What portion of your data are you moving to the cloud?" What are you learning? >> So it's interesting because we've done similar polls in our show in Berlin, and the results are very similar. We did the cloud poll and there's a lot of buzz around cloud. What we're hearing is there's a lot of companies that are thinking about, or are somewhere along their cloud journey. It's a question of exactly what their overall plans are, and there's a lot of news about maybe cloud will eat everything, but if you look at the poll results, something like 75% of the attendees said they have cloud in their plans. Only about 12% said they're going to move everything to the cloud, so a lot of hybrid with cloud. It's how to figure out which workloads to run where, how to think about that strategy in terms of where to deploy the data, where to deploy the workloads and what that should look like and that's one of the main things that we're hearing and talking a lot about. >> We've been seeing that at Wikibon, and our recent update to the market forecast showed that public cloud will dominate increasingly in the coming decade, but hybrid cloud will be a long transition period for many or most enterprises who are still firmly rooted in on-premises deployment, so forth and so on. Clearly, the bulk of your customers, the bulk of your customer deployments, are on premises. >> They are. >> So you're working from a good starting point which means you've got what, 1,400 customers? >> That's right, thereabouts.
>> Predominantly on premises, but many of them here at this show want to sustain their investment in a vendor that provides them with that flexibility as they decide they want to use Google or Microsoft or AWS or IBM for a particular workload that their existing investment to Hortonworks doesn't prevent them from facilitating. It moves that data and those workloads. >> That's right. The fact that we want to help them do that, a lot of our customers have, I'll call it a multi-cloud strategy. They want to be able to work with an Amazon or a Google or any of the other vendors in the space equally well and have the ability to move workloads around and that's one of the things that we can help them with. >> One of the things you also did yesterday on the main stage, was you talked about this conference in the greater context of the world and what's going on right now. This is happening against the backdrop of the World Cup, and you said that this is really emblematic of data because this is a game, a tournament that generates tons of data. >> A tremendous amount of data. >> It's showing how data can launch new business models, disrupt old ones. Where do you think we're at right now? For someone who's been in this industry for a long time, just lay the scene. >> I think we're still very much at the beginning. Even though the conference has been around for awhile, the technology has been. It's emerging so fast and just evolving so fast that we're still at the beginning of all the transformations. I've been listening to the customer presentations here and all of them are at some point along the journey. Many are really still starting. Even in some of the polls that we had today talked about the fact that they're very much at the beginning of their journey with things like streaming or some of the A.I. machine learning technologies. They're at various stages, so I believe we're really at the beginning of the transformation that we'll see. >> That reminds me of another detail of your product portfolio or your architecture streaming and edge deployments are also in the future for many of your customers who still primarily do analytics on data at rest. You made an investment in a number of technologies NiFi from streaming. There's something called MiNiFi that has been discussed here at this show as an enabler for streaming all the way out to edge devices. What I'm getting at is that's indicative of Arun Murthy, one of your co-founders, has made- it was a very good discussion for us analysts and also here at the show. That is one of many investments you're making is to prepare for a future that will set workloads that will be more predominant in the coming decade. One of the new things I've heard this week that I'd not heard in terms of emphasis from you guys is more of an emphasis on data warehousing as an important use case for HDP in your portfolios, specifically with HIVE. The HIVE 3.0 now in- HDP3.0. >> Yes. >> With the enhancements to HIVE to support more real time and low latency, but also there's ACID capabilities there. I'm hearing something- what you guys are doing is consistent with one of your competitors, Cloudera. They're going deeper into data warehousing too because they recognize they've got to got there like you do to be able to absorb more of your customers' workloads. I think that's important that you guys are making that investment. You're not just big data, you're all data and all data applications. Potentially, if your customers want to go there and engage you. >> Yes. 
>> I think that was a significant, subtle emphasis that I as an analyst noticed. >> Thank you. There were so many enhancements in 3.0 that were brought from the community that it was hard to talk about everything in depth, but you're right. The enhancements to HIVE in terms of performance have really enabled it to take on a greater set of workloads and interactivity that we know that our customers want. The advantage being that you have a common data layer in the back end and you can run all this different work. It might be data warehousing, high speed query workloads, but you can do it on that same data with Spark and data-science related workloads. Again, it's that common pool backend of the data lake and having that ability to do it with common security and governance. It's one of the benefits our customers are telling us they really appreciate. >> One of the things we've also heard this morning was talking about data analytics in terms of brand value and brand protection importantly. FedEx, exactly. Talking about, the speaker said, we've all seen these apology commercials. What do you think- is it damage control? What is the customer motivation here? >> Well a company can have billions of dollars of market cap wiped out by breaches in security, and we've seen it. This is not theoretical, these are actual occurrences that we've seen. Really, they're trying to protect the brand and the business and continue to be viable. They can get knocked back so far that it can take years to recover from the impact. They're looking at the security aspects of it, the governance of their data, the regulations of GDPR. These things you've mentioned have real financial impact on the businesses, and I think it's brand and the actual operations and finances of the businesses that can be impacted negatively. >> When you're thinking about Hortonworks's marketing messages going forward, how do you want to be described now, and then how do you want customers to think of you five or 10 years from now? >> I want them to think of us as a partner to help them with their data journey, on all aspects of their data journey, whether they're collecting data from the edge, you mentioned NiFi and things like that. Bringing that data back, processing it in motion, as well as processing it at rest, regardless of where that data lands. On premise, in the cloud, somewhere in between, the hybrid, multi-cloud strategy. We really want to be thought of as their partner in their data journey. That's really what we're doing. >> Even going forward, one of the things you were talking about earlier is the company's sort of saying, "we want to be boring. We want to help you do all the stuff-" >> There's a lot of money in boring. >> There's a lot of money, right! Exactly! As you said, a partner in their data journey. Is it "we'll do anything and everything"? Are you going to do niche stuff? >> That's a good question. Not everything. We are focused on the data layer. The movement of data, the process and storage, and truly the analytic applications that can be built on top of the platform. Right now we've stuck to our strategy. It's been very consistent since the beginning of the company in terms of taking these open source technologies, making them enterprise viable, developing an ecosystem around it and fostering a community around it. That's been our strategy since before the company even started. We want to continue to do that and we will continue to do that.
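To illustrate the "common data layer" point, here is a minimal sketch of the pattern being described: one Spark session with Hive support running both a warehouse-style SQL query and a data-science-style aggregation against the same backing table. The database and column names are hypothetical, and this is a sketch of the idea, not Hortonworks' actual stack configuration.

```python
# Minimal sketch: one Hive-backed table, two kinds of workloads, one engine.
# Assumes Spark is configured to see the Hive metastore; the table and column
# names (sales.orders, amount, customer_id) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (SparkSession.builder
         .appName("shared-data-layer")
         .enableHiveSupport()
         .getOrCreate())

# Warehouse-style query: the kind of interactive SQL a BI tool would issue.
revenue_by_region = spark.sql("""
    SELECT region, SUM(amount) AS revenue
    FROM sales.orders
    GROUP BY region
""")
revenue_by_region.show()

# Data-science-style work on the very same table: per-customer feature preparation.
orders = spark.table("sales.orders")
features = (orders.groupBy("customer_id")
            .agg(F.count("*").alias("order_count"),
                 F.avg("amount").alias("avg_order_value")))
features.show(5)
```

Because both jobs read the same governed table, security and lineage policies only have to be applied once, which is the benefit being claimed above.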
There's so much innovation happening in the community that we quickly bring that into the products and make sure that's available in a trusted, enterprise-tested platform. That's really one of the things we see our customers- over and over again they select us because we bring innovation to them quickly, in a safe and consumable way. >> Before we came on camera, I was telling Rebecca that Hortonworks has done a sensational job of continuing to align your product roadmaps with those of your leading partners. IBM, AWS, Microsoft. In many ways, your primary partners are not them, but the entire open source community. 26 open source projects that Hortonworks represents and incorporates in your product portfolio, in which you are a primary player and committer. You're a primary ingester of innovation from all the communities in which you operate. >> We do. >> That is your core business model. >> That's right. We both foster the innovation and we help drive the innovation ourselves with our engineers and architects. You're absolutely right, Jim. It's the ability to get that innovation, which is happening so fast in the community, into the product and companies need to innovate. Things are happening so fast. Moore's Law was mentioned multiple times on the main stage, you know, and how it's impacting different parts of the organization. It's not just the technology, but business models are evolving quickly. We heard a little bit about Trimble, and if you've seen Tim Leonard's talk that he gave around what they're doing in terms of logistics and the ability to go all the way out to the farmer and impact what's happening at the farm and tracking things down to the level of a tomato or an egg all the way back and just understand that. It's evolving business models. It's not just the tech but the evolution of business models. Rob talked about it yesterday. I think those are some of the things that are kind of key. >> Let me stay on that point really quick. The industrial internet, like precision agriculture and everything it relates to, is increasingly relying on visual analysis, parts and eggs and whatever it might be. That is convolutional neural networks, that is A.I., it has to be trained, and it has to be trained increasingly in the cloud where the data lives. The data lives in HDP clusters and whatnot. In many ways, no matter where the world goes in terms of industrial IoT, there will be massive clusters of HDFS and object storage driving it and also embedded A.I. models that have to follow a specific DevOps life cycle. You guys have a strong orientation in your portfolio towards that degree of real-time streaming, as it were, of tasks that go through the entire life cycle. From preparing the data, to modeling, to training, to deploying it out, to Google or IBM or wherever else they want to go. So I'm thinking that you guys are in a good position for that as well. >> Yeah. >> I just wanted to ask you finally, what is the takeaway? We're talking about the attendees, talking about the community that you're cultivating here, theme, ideas, innovation, insight. What do you hope an attendee leaves with? >> I hope that the attendee leaves educated, understanding the technology and the impacts that it can have so that they will go back and change their business and continue to drive their data projects. The whole intent is really, and we even changed the format of the conference for more educational opportunities.
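The "prepare, model, train, deploy" life cycle mentioned above can be shown end to end in a few lines. The sketch below uses scikit-learn with synthetic data and a simple classifier as a stand-in for the convolutional networks being discussed; the file name and sizes are hypothetical, and the point is only the shape of the pipeline, not any specific Hortonworks or cloud tooling.

```python
# Minimal sketch of the model life cycle: prepare -> train -> evaluate -> export.
# Synthetic data and a simple classifier stand in for a real CNN training job.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import joblib

# Prepare: in practice this data would come out of the data lake.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Train in the environment where the data lives.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Evaluate before promoting the model.
print("holdout accuracy:", accuracy_score(y_test, model.predict(X_test)))

# Export a versioned artifact a deployment pipeline can push to the edge or a cloud endpoint.
joblib.dump(model, "model-v1.joblib")
```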
For me, I want attendees to- a satisfied attendee would be one that learned about the things they came to learn so that they could go back to achieve the goals that they have when they get back. Whether it's business transformation, technology transformation, some combination of the two. To me, that's what I hope that everyone is taking away and that they want to come back next year when we're in Washington, D.C. and- >> My stomping ground. >> His hometown. >> Easy trip for you. They'll probably send you out here- (laughs) >> Yeah, that's right. >> Well John, it's always fun talking to you. Thank you so much. >> Thank you very much. >> We will have more from theCUBE's live coverage of DataWorks right after this. I'm Rebecca Knight for James Kobielus. (upbeat electro music)
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
James Kobielus | PERSON | 0.99+ |
Rebecca Knight | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
Rebecca | PERSON | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
Tim Leonard | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Arun Murthy | PERSON | 0.99+ |
Jim | PERSON | 0.99+ |
Kevin Slavin | PERSON | 0.99+ |
Europe | LOCATION | 0.99+ |
John Kreisa | PERSON | 0.99+ |
Berlin | LOCATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
John | PERSON | 0.99+ |
ORGANIZATION | 0.99+ | |
2008 | DATE | 0.99+ |
Washington, D.C. | LOCATION | 0.99+ |
Asia | LOCATION | 0.99+ |
75% | QUANTITY | 0.99+ |
Rob | PERSON | 0.99+ |
five | QUANTITY | 0.99+ |
San José | LOCATION | 0.99+ |
next year | DATE | 0.99+ |
Yahoo | ORGANIZATION | 0.99+ |
Silicon Valley | LOCATION | 0.99+ |
32 different industries | QUANTITY | 0.99+ |
World Cup | EVENT | 0.99+ |
yesterday | DATE | 0.99+ |
23 different countries | QUANTITY | 0.99+ |
one | QUANTITY | 0.99+ |
1,400 customers | QUANTITY | 0.99+ |
today | DATE | 0.99+ |
two | QUANTITY | 0.99+ |
2,100 attendees | QUANTITY | 0.99+ |
Fedex | ORGANIZATION | 0.99+ |
10 years | QUANTITY | 0.99+ |
26 open source projects | QUANTITY | 0.99+ |
Hortonworks | ORGANIZATION | 0.98+ |
17th | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
One | QUANTITY | 0.98+ |
billions of dollars | QUANTITY | 0.98+ |
Cloudera | ORGANIZATION | 0.97+ |
about 12% | QUANTITY | 0.97+ |
theCUBE | ORGANIZATION | 0.97+ |
this week | DATE | 0.96+ |
DataWorks Summit 2018 | EVENT | 0.95+ |
NiFi | ORGANIZATION | 0.91+ |
this morning | DATE | 0.89+ |
HIVE 3.0 | OTHER | 0.86+ |
Spark | TITLE | 0.86+ |
few year ago | DATE | 0.85+ |
Wikiban | ORGANIZATION | 0.85+ |
The Shed | ORGANIZATION | 0.84+ |
San José, California | LOCATION | 0.84+ |
tons | QUANTITY | 0.82+ |
H.D.P | LOCATION | 0.82+ |
DataWorks | EVENT | 0.81+ |
things | QUANTITY | 0.78+ |
DataWorks | ORGANIZATION | 0.74+ |
MiNiFi | TITLE | 0.62+ |
data | QUANTITY | 0.61+ |
Moore | TITLE | 0.6+ |
years | QUANTITY | 0.59+ |
coming decade | DATE | 0.59+ |
Trumble | ORGANIZATION | 0.59+ |
GVPR | ORGANIZATION | 0.58+ |
3.0 | OTHER | 0.56+ |
Adrian Cockcroft, AWS | KubeCon 2017
>> Announcer: Live from Austin, Texas, It's The Cube. Covering KubeCon 2017 and CloudNativeCon 2017. Brought to you by Red Hat, The Linux Foundation, and The Cube's ecosystem partners. >> Okay, welcome back everyone. Live here in Austin, Texas, this is The Cube's exclusive coverage of the CNCF CloudNativeCon which was yesterday, and today is KubeCon, for Kubernetes conference, and a little bit tomorrow as well, some sessions. Our next guest is Adrian Cockcroft, VP of Cloud Architecture Strategy at AWS, Amazon Web Services, and my co-host Stu Miniman. Obviously, Adrian, an industry legend on Twitter and the industry, formerly with Netflix, knows a lot about AWS, now VP of Cloud Architecture, thanks for joining us. Appreciate it. >> Thanks very much. >> This is your first time as an AWS employee on The Cube. You've been verified. >> I've been on The Cube before. >> Many times. You've been verified. What's going on now with you guys, obviously coming off a hugely successful reinvent, there's a ton of video of me ranting and raving about how you guys are winning, and there's no second place, in the rear-view mirror, certainly Amazon's doing great. But CloudNative's got the formula, here. This is a cultural shift. What is going on here that's similar to what you guys are doing architecturally, why are you guys here, are you evangelizing, are you recruiting, are you proposing anything? What's the story? >> Yeah, it's really all of those things. We've been doing CloudNative for a long time, and the key thing with AWS, we always listen to our customers, and go wherever they take us. That's a big piece of the way we've always managed to keep on top of everything. And in this case, the whole container industry, there's a whole market there, there's a lot of different pieces, we've been working on that for a long time, and we found more and more people interested in CNCF and Kubernetes, and really started to engage. Part of my role is to host the open source team that does outbound engagement with all the different open source communities. So I've hired a few people, I hired Arun Gupta, who's very active in CNCF, earlier this year, and internally we were looking at, we need to join CNCF at some point. We got to do that eventually and venture in, let's go make it happen. So last summer we just did all the internal paperwork, and running around talking to people and got everyone on the same page. And then in August we announced, hey, we're joining. So we got that done. I'm on the board of CNCF, Arun's my alternate for the board and technical, running around, and really deeply involved in as much of the technology and everything. And then that was largely so that we could kind of get our contributions from engineering on a clear footing. We were starting to contribute to Kubernetes, like as an outsider to the whole thing. So that's why we're, what's going on here? So getting that in place was like the basis for getting the contributions in place, we start hiring, we get the teams in place, and then getting our ducks in a row, if you like. And then last week at Reinvent, we announced EKS, the EC2 Kubernetes Service. And this week, we all had to be here. Like last week after Reinvent, everyone at AWS wants to go and sleep for a week. But no, we're going to go to Austin, we're going to do this. So we have about 20 people here, we came in, I did a little keynote yesterday.
I could talk through the different topics, there, but fundamentally we wanted to be here where we've got the engineering teams here, we've got the engineering managers, they're in full-on hiring mode, because we've got the basic teams in place, but there's a lot more we want to do, and we're just going out and engaging, really getting to know the customers in detail. So that's really what drives it. Customer interactions, little bit of hiring, and just being present in this community. >> Adrian, you're very well known in the open source community, everything that you've done. Netflix, when you were on the VC side, you evangelized a bunch of it, if I can use the term. Amazon, many of us from the outside looked and, trying to understand. Obviously Amazon used lots of open source, Amazon's participated in a number of open source. MXNet got a lot of attention, joining the CNCF is something, I know this community, it's been very positively received, everybody's been waiting for it. What can you tell us about how Amazon, how do they think about open source? Is that something that fits into the strategy, or is it a tactic? Obviously, you're building out your teams, that sends certain signals to market, but can you help clarify for those of us that are watching what Amazon thinks about when it comes to this space? >> I think we've been, so, we didn't really have a team focused on outbound communication of what we were doing in open source until I started building this team a year ago. I think that was the missing link. We were actually doing a lot more than most people realized. I'd summarize it as saying, we were doing more than most people expected, but less than we probably could have been given the scale of what we are, the scale that AWS is at. So part of what we're doing is unlocking some internal demand where engineering teams were going. We'd like to open source something, we don't know how to engage with the communities. We're trying to build trust with these communities, and I've hired a team, I've got several people now, who are mostly from the open source community, we were also was kind of interviewing people like crazy. That was our sourcing for this team. So we get these people in and then we kind of say, all right, we have somebody that understands how to build these communities, how to respond, how to engage with the open source community. It's a little different to a standard customer, enterprise, start up, those are different entities that you'd want to relate to. But from a customer point of view, being customer-obsessed as AWS is, how do we get AWS to listen to an open source community and work with them, and meet all their concerns. So we've been, I think, doing a better job of that now we've pretty much got the team in place. >> That's your point, is customer focus is the ethos there. The communities are your customers in this case. So you're formalizing, you're formalizing that for Amazon, which has been so busy building out, and contributing here and there, so it sounds like there was a lot of activity going on within AWS, it was just kind of like contributing, but so much work on building out cloud ... >> Well there's a lot going on, but if no one was out there telling the story, you didn't know about it. Actually one of the best analogies we have for the EKS is actually our EMR, our Hadoop service, which launched 2010 or something, 2009, we've had it forever. But from the first few years when we did EMR, it was actually in a fork. 
We kept just sort of building our own version of it to do things, but about three or four years ago, we started upstreaming everything, and it's a completely clean, upstreamed version of all the Hadoop and all the related projects. But you make one API call, a cluster appears. Hey, give me a Hadoop cluster. Voom, and I want Spark and I want all these other things on it. And we're basically taking Kubernetes, it's very similar, we're going to reduce that to a single API call, a cluster appears, and it's a fully upstreamed experience. So that's, in terms of an engineering relationship to open source, we've already got a pretty good success story that nobody really knew about. And we're following a very similar path. >> Adrian, can you help us kind of unpack the Amazon Kubernetes stack a little bit? One of the announcements had a lot of attention, definitely got our attention, Fargate, kind of sits underneath what Kubernetes is doing, my understanding. Where are you sitting with the service measures, kind of bring us through the Amazon stack. What does Amazon do on its own versus the open source, and how those all fit together. >> Yeah, so everyone knows Amazon is a place where you can get virtual machines. It's easy to get me a virtual machine from ten years ago, everyone gets that, right? And then about three years ago, I think it was three years ago, we announced Lambda - was that two or three years ago? I lose track of how many reinvents ago it was. But with Lambda it's like, well, just give me a function. But as a first class entity, there's a, give me a function, here's the code I want you to run. We've now added two new ways that you can deploy to, two things you can deploy to. One of them's bare metal, which is already announced, one of the many, many, many announcements last week that might have slipped by without you noticing, but Bare Metal is a service. People go, 'those machines are really big'. Yes, of course they're really big! You get the whole machine and you can be able to bring your own virtualization or run whatever you want. But you could launch, you could run Kubernetes on that if you wanted, but we don't really care what you run it on. So we had Bare Metal, and then we have container. So Fargate is container as a first class entity that you deploy to. So here's my container registry, point you at it, and run one of these for me. And you don't have to think about deploying the underlying machines it's running on, you don't have to think about what version of Lennox it is, you have to build an AMI, all of the agents and fussing around, and you can get it in much smaller chunks. So you can say you get a CPU and half a gig of ram, and have that as just a small container. So it becomes much more granular, and you can get a broader range of mixes. A lot of our instances are sort of powers of two of a ratio of CPU to memory, and with Fargate you can ask for a much broader ratio. So you can have more CPU, less memory, and go back the other way, as well. 'Cause we can mix it up more easily at the container level. So it gives you a lot more flexibility, and if you buy into this, basically you'll get to do a lot of cost reduction for the sort of smaller scale things that you're running. Maybe test environments, you could shrink them down to just the containers and not have a lot of wasted space where you're trying to, you have too many instances running that you want to put it in. So it's partly the finer grain giving you more ability to say -- >> John: Or consumption choice. 
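The "one API call, a cluster appears" model described above for EMR (and promised for EKS) looks roughly like the boto3 sketch below. Cluster sizing, names, release label, and IAM roles are hypothetical placeholders; this shows the shape of the call, not a recommended production configuration.

```python
# Minimal sketch: a single API call that asks for a Hadoop/Spark/Hive cluster.
# Region, names, release label, instance types, and IAM roles are hypothetical.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="one-call-cluster",
    ReleaseLabel="emr-5.20.0",
    Applications=[{"Name": "Hadoop"}, {"Name": "Spark"}, {"Name": "Hive"}],
    Instances={
        "InstanceGroups": [
            {"Name": "master", "InstanceRole": "MASTER",
             "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"Name": "core", "InstanceRole": "CORE",
             "InstanceType": "m5.xlarge", "InstanceCount": 4},
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print("cluster id:", response["JobFlowId"])  # the cluster "appears" behind this id
```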
>> Yeah, and the other thing that we did recently was move to per-second billing, after the first minute, it's per-second. So the granularity of Cloud is now getting to be extremely fine-grained, and Lambda is per hundred millisecond, so it's just a little bit -- >> $4.03 for your bill, I mean this is the key thing. You guys have simplified the consumption experience. Bare Metal, VM's, containers, and functions. I mean pick one. >> Or pick all of them, it's fine. And when you look at the way Fargate's deployed in ECS it's a mixture. It's not all one or all the other, you deploy a number of instances with your containers on them, plus Fargate to deploy some additional containers that maybe didn't fit those instances. Maybe you've got a fleet of GPU enhanced machines, but you want to run a bit of Logic around it, some other containers in the same execution environment, but these don't need to be on the GPU. That kind of thing, you can mix it up. The other part of the question was, so how does this play into Kubernetes, and the discussions are just that we had to release the thing first, and then we can start talking, okay, how does this fit. Parts of the model fit into Kubernetes, parts don't. So we have to expose some more functionality in Fargate for this to make sense, 'cause we've got a really minimal initial release right now, we're going to expose it and add some more features. And then we possibly have to look at ways that we mutate Kubernetes a little bit for it to fit. So the initial EKS release won't include Fargate, because we're just trying to get it out based on what everyone knows today, we'd rather get that out earlier. But we'll be doing development work in the meantime, so a subsequent release we'll have done the integration work, which will all happen in public, in discussion with the community, and we'll have a debate about, okay, this is the features Fargate needs to properly integrate into Kubernetes, and there are other similar services from other top providers that want to integrate to the same API. So it's all going to be done as a public development, how we architect this. >> I saw a tweet here, I want to hear your comments on, it's from your keynote, someone retweeted, "managing over 100,000 clusters on ACS, hashtag Fargate," integrated into ECS, your hashtag, open, ADM's open. What is that hundred thousand number. Is that the total number, is that an example? On elastic container service, what does that mean? >> So ECS is a very large scale, multi-tenant container operation service that we've had for several years. It's in production, if you compare it to Kubernetes it's running much larger clusters, and it's been running at production-grade for longer. So it's a little bit more robust and secure and all those kinds of things. So I think it's missing some Kubernetes features, and there's a few places where we want to bring in capabilities from Kubernetes and make ECS a better experience for people. Think of Kubernetes as some what optimized for the developer experience, and ECS for more the operations experience, and we're trying to bring all this together. It is operating over a hundred thousand clusters of containers, over a hundred thousand clusters. And I think the other number was hundreds of millions of new containers are launched every week, or something like that. I think it was hundreds of millions a week. So, it's a very large scale system that is already deployed, and we're running some extremely large customers on, like Expedia and Macbook. Macbook ... 
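For the Fargate side of that discussion, launching containers as a first-class deploy target without managing instances looks roughly like this boto3 sketch. The cluster name, task definition, and subnet are hypothetical, and the CPU/memory granularity mentioned above would be declared in that task definition.

```python
# Minimal sketch: run containers on Fargate without provisioning instances first.
# Cluster, task definition, and subnet IDs are hypothetical placeholders.
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

response = ecs.run_task(
    cluster="analytics-cluster",
    launchType="FARGATE",
    taskDefinition="merge-join-worker:1",   # defines e.g. 0.25 vCPU / 512 MiB per container
    count=10,                               # burst out a small fleet of containers
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "assignPublicIp": "ENABLED",
        }
    },
)
for task in response["tasks"]:
    print("started:", task["taskArn"])
# When the work finishes, the tasks stop and the per-second billing stops with them.
```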
Mac Box. Some of these people are running tens of thousands of containers in production as a single, we have single clusters in the tens of thousands range. So it's a different beast, right? And it meets a certain need, and we're going to evolve it forwards, and Kubernetes is serving a very different purpose. If you look at our data science space, if you want exactly the same Hadoop thing, you can get that on prem, you can run EMR. But we have Athena and Redshift and all these other ways that are more native to the way we think, where we can go iterate and build something very specific to AWS, so you blend these two together and it depends on what you're trying to achieve. >> Well Adrian, congratulations on a great opportunity, I think the world is excited to have you in your role, if you could clarify and just put the narrative around, what's actually happening in AWS, what's been happening, and what you guys are going to do going forward. I'll give you the last minute to let folks know what your job is, what your objective is, what you're looking for to hire, and your philosophy in the open source for AWS. >> I think there's a couple of other projects, and we've talked, this is really all about containers. The other two key project areas that we've been looking at are deep learning frameworks, since all of the deep learning frameworks are open source. A lot of Kubernetes people are using it to run GPUs and do that kind of stuff. So Apache MXNet is another focus of my team. It went into the incubation phase last January, we're walking it through, helping it on its way. It's something where 30, 40% of that project is AWS contribution. So we're not dominating it, but we're one of its main sponsors, and we're working with other companies. There's joint work with, it's lots of open source projects around here. We're working with Microsoft on Gluon, we're working with Facebook and Microsoft on ONNX, which is an open neural network exchange. There's a whole lot of things going on here. And I have somebody on my team who hasn't started yet, can't tell you who it is, but they're starting pretty soon, who's going to be focusing on that open source, deep learning AI space. And the final area I think is interesting is IoT, serverless, Edge, that whole space. One announcement recently is FreeRTOS. So again, we sort of acquired the founder of this thing, this free real-time operating system. Everything you have, you probably personally own hundreds of instances of this without knowing it, it's in everything. Just about every little thing that sits there, that runs itself, every light bulb, probably, in your house that has a processor in it, those are all FreeRTOS. So it's incredibly pervasive, and we did an open source announcement last week where we switched its license to be a pure MIT license, to be more friendly for the community, and announced an Amazon version of it with better Amazon integration, but also some upgrades to the open source version. So, again, we're pushing an open source platform strategy in the embedded and IoT space as well. >> And enabling people to build great software, take the software engineering hassles out for the application developers, while giving the software engineers more engineering opportunities to create some good stuff. Thanks for coming on The Cube and congratulations on your continued success, and looking forward to following up on the Amazon Web Services open source collaboration, contribution, and of course, innovation.
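Since Gluon and MXNet come up above, here is a minimal sketch of what MXNet's Gluon API looks like: defining a tiny network and pushing a random batch through it. The layer sizes and input shape are arbitrary illustration values, and this is only meant to show the flavor of the interface, not anything from the interview.

```python
# Minimal sketch of Apache MXNet's Gluon API: define a tiny network, run a forward pass.
# Layer sizes and the input shape are arbitrary illustration values.
import mxnet as mx
from mxnet import nd, gluon

net = gluon.nn.Sequential()
net.add(gluon.nn.Dense(16, activation="relu"))
net.add(gluon.nn.Dense(1))
net.initialize(mx.init.Xavier())

x = nd.random.uniform(shape=(4, 8))   # a batch of 4 examples with 8 features
print(net(x))                         # forward pass; training would add a loss and autograd
```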
The Cube doing its part here with its open source content, three days of coverage of CloudNativeCon and KubeCon. It's our second day, I'm John Furrier, with Stu Miniman, we'll be back with more live coverage in Austin, Texas, after this short break. >> Offscreen: Thank you.
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
Adrian | PERSON | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Adrian Cockcroft | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
Amazon Web Services | ORGANIZATION | 0.99+ |
Red Hat | ORGANIZATION | 0.99+ |
Stu Miniman | PERSON | 0.99+ |
John Furrier | PERSON | 0.99+ |
last week | DATE | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
August | DATE | 0.99+ |
Netflix | ORGANIZATION | 0.99+ |
ORGANIZATION | 0.99+ | |
second day | QUANTITY | 0.99+ |
One | QUANTITY | 0.99+ |
CNCF | ORGANIZATION | 0.99+ |
2010 | DATE | 0.99+ |
this week | DATE | 0.99+ |
AltOS | TITLE | 0.99+ |
Austin, Texas | LOCATION | 0.99+ |
yesterday | DATE | 0.99+ |
first minute | QUANTITY | 0.99+ |
Austin | LOCATION | 0.99+ |
last summer | DATE | 0.99+ |
Arun Gupta | PERSON | 0.99+ |
tens of thousands | QUANTITY | 0.99+ |
KubeCon | EVENT | 0.99+ |
today | DATE | 0.99+ |
one | QUANTITY | 0.99+ |
MXNet | ORGANIZATION | 0.99+ |
tomorrow | DATE | 0.99+ |
Macbook | COMMERCIAL_ITEM | 0.99+ |
2009 | DATE | 0.99+ |
John | PERSON | 0.99+ |
three years ago | DATE | 0.99+ |
a year ago | DATE | 0.99+ |
hundreds of millions a week | QUANTITY | 0.99+ |
two | DATE | 0.98+ |
last January | DATE | 0.98+ |
The Cube | ORGANIZATION | 0.98+ |
ten years ago | DATE | 0.98+ |
two things | QUANTITY | 0.98+ |
three days | QUANTITY | 0.98+ |
over a hundred thousand clusters | QUANTITY | 0.98+ |
KubeCon 2017 | EVENT | 0.98+ |
over 100,000 clusters | QUANTITY | 0.98+ |
$4.03 | QUANTITY | 0.97+ |
two | QUANTITY | 0.97+ |
hundred thousand | QUANTITY | 0.97+ |
two new ways | QUANTITY | 0.97+ |
Fargate | ORGANIZATION | 0.97+ |
Lambda | TITLE | 0.97+ |
CloudNativeCon 2017 | EVENT | 0.97+ |
The Lennox Foundation | ORGANIZATION | 0.97+ |
half a gig | QUANTITY | 0.97+ |
Kickoff - Spark Summit East 2017 - #sparksummit - #theCUBE
>> Narrator: Live from Boston, Massachusetts, this is theCUBE covering Spark Summit East 2017. Brought to you by Databricks. Now, here are your hosts, Dave Vellante and George Gilbert. >> Everybody, the euphoria is still palpable here, we're in downtown Boston at the Hynes Convention Center. For Spark Summit East, #SparkSummit, my co-host and I, George Gilbert, will be unpacking what's going on for the next two days. George, it's good to be working with you again. >> Likewise. >> I always like working with my man, George Gilbert. We go deep, George goes deeper. Fantastic action going on here in Boston, actually quite a good crowd here, it was packed this morning in the keynotes. The rage is streaming. Everybody's talking about streaming. Let's sort of go back a little bit though George. When Spark first came onto the scene, you saw these projects coming out of Berkeley, it was the hope of bringing real-timeness to big data, dealing with some of the memory constraints that we found going from batch to real-time interactive and now streaming, you're going to talk about that a lot. Then you had IBM come in and put a lot of dough behind Spark, basically giving it a stamp, IBM's imprimatur-- >> George: Yeah. >> Much in the same way it did with Linux-- >> George: Yeah. >> Kind of elbowing its way in-- >> George: Yeah. >> The marketplace and sort of gaining a foothold. Many people at the time thought that Hadoop needed Spark more than Spark needed Hadoop. A lot of people thought that Spark was going to replace Hadoop. Where are we today? What's the state of big data? >> Okay so to set some context, when Hadoop V1, classic Hadoop came out it was file system, commodity file system, keep everything really cheap, don't have to worry about shared storage, which is very expensive, and the processing model, the execution of munging through data, was MapReduce. We're all familiar with those-- >> Dave: Complicated but dirt cheap. >> Yes. >> Dave: Relative to a traditional data warehouse. >> Yes. >> Don't buy a big Oracle Unix box or Linux box, buy this new file system and figure out how to make it work and you'll save a ton of money. >> Yeah, but unlike the traditional RDBMSes, it wasn't really that great for doing interactive business intelligence and things like that. It was really good for big batch jobs that would run overnight or periods of hours, things like that. The irony is when Matei Zaharia, the co-creator of Spark or actually the creator and co-founder of Databricks, which is the steward of Spark. When he created the language and the execution environment, his objective was to do a better MapReduce than Hadoop's MapReduce, make it faster, take advantage of memory, but he did such a good job of it, that he was able to extend it to be a uniform engine not just for MapReduce type batch stuff, but for streaming stuff. >> Dave: So originally they start out thinking that if I get this right-- >> Yeah. >> It was sort of a microbatch leveraging memory more effectively and then it extended beyond-- >> The microbatch is their current way to address the streaming stuff. >> Dave: Okay. >> It takes MapReduce, which would be big long running jobs, and they can slice them up and so each little slice turns into an element in the stream. >> Dave: Okay, so the point is it was an improvement upon these big long batch jobs-- >> George: Yeah. >> They're making it batch to interactive in real-time, so let's go back to big data for a moment here. >> George: Yeah.
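As a rough illustration of the "better MapReduce, in memory, one engine" point, here is a minimal PySpark sketch: a classic batch-style word count whose result is cached so that follow-up interactive queries reuse the in-memory data instead of re-reading the files. The input path is a hypothetical placeholder.

```python
# Minimal sketch: a MapReduce-style word count in Spark, cached for interactive reuse.
# The input path is a hypothetical placeholder.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("wordcount-sketch").getOrCreate()

lines = spark.read.text("hdfs:///data/logs/*.txt")
words = lines.select(F.explode(F.split(F.col("value"), r"\s+")).alias("word"))

counts = words.groupBy("word").count().cache()    # keep the hot data in memory

counts.orderBy(F.desc("count")).show(10)          # batch-style report
counts.filter(F.col("word") == "error").show()    # interactive follow-up, served from cache
```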
>> Big data was the hottest topic in the world three or four years ago and now it's sort of waned as a buzz word, but big data is now becoming more mainstream. We've talked about that a lot. A lot of people think it's done. Is big data done? >> George: No, it's more that it's sort of-- it's boring for us, kind of pundits, to talk about because it's becoming part of the fabric. The use cases are what's interesting. It started out as a way to collect all data into this really cheap storage repository and then once you did that, this was the data you couldn't afford to put into your Teradata data warehouse at $25,000 per terabyte or with running costs a multiple of that. Here you put all your data in here, your data scientists and data engineers started munging with the data, you started taking workloads off your data warehouse, like ETL things that didn't belong there. Now people are beginning to experiment with business intelligence sort of exploration and reporting on Hadoop, so taking more workloads off the data warehouse. The limitations, there are limitations there that will get solved by putting MPP SQL back-ends on it, but the next step after that. So we're working on that step, but the one that comes after that is make it easier for data scientists to use this data, to create predictive models-- >> Dave: Okay, so I often joke that the ROI on big data was reduction on investment and lowering the denominator-- >> George: Yeah. >> In the expense equation, which I think it's fair to say that big data and Hadoop succeeded in achieving that, but then the question becomes, what's the real business impact. Clearly big data has not, except in some edge cases and there are a number of edge cases and examples, but it's not yet anyway lived up to the promise of real-time, affecting outcomes before, you know taking the human out of the decision, bringing transaction and analytics together. Now we're hearing a lot of that talk around AI and machine learning, of course, IoT is the next big thing, that's where streaming fits in. Is it same wine, new bottle? Or is it sort of the evolution of the data meme? >> George: It's an evolution, but it's not just a technology evolution to make it work. When we've been talking about big data as efficiency, like low cost, cost reduction for the existing type of infrastructure, but when it starts going into machine learning you're doing applications that are more strategic and more top line focused. That means your c-level execs actually have to get involved because they have to talk about the strategic objectives, like growth versus profitability or which markets you want to target first. >> So has Spark been a headwind or tailwind to Hadoop? >> I think it's very much been a tailwind because it simplified a lot of things that took many, many engines in Hadoop. That's something that Matei, creator of Spark, has been talking about for a while. >> Dave: Okay something I learned today and actually I had heard this before, but the way I phrased it in my tweet, Genomics is kicking Moore's Law's ass. >> George: Yeah. >> That the price performance of sequencing a gene improves three x every year to what is essentially a doubling every 18 months for Moore's Law. The amount of data that's being created is just enormous, I think we heard from Broad Institute that they create 17 terabytes a day-- >> George: Yeah. >> As compared to YouTube, which is 24 terabytes a day. >> And then a few years it will be-- >> It will be dwarfing YouTube >> Yeah.
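The gap between those two growth rates compounds quickly, and a couple of lines of arithmetic make the point. The five-year horizon below is an arbitrary choice for illustration.

```python
# Tripling every year vs. doubling every 18 months, compounded over five years.
years = 5
genomics_growth = 3 ** years                 # ~3x per year        -> 243x
moores_law_growth = 2 ** (12 * years / 18)   # doubling every 18mo -> ~10x
print(genomics_growth, round(moores_law_growth, 1),
      round(genomics_growth / moores_law_growth, 1))  # roughly a 24x gap after 5 years
```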
>> Of course Twitter you couldn't even see-- >> Yeah. >> So what do you make of that? Is that just the fun fact, is that a new use case, is that really where this whole market is headed? >> It's not a fun fact because we've been hearing for years and years about this study about data doubling every 18 to 24 months, that's coming from the legacy storage guys who can only double their capacity every 18 to 24 months. The reality is that when we take what was analog data and we make it digitally accessible, the only thing that's preventing us from capturing all this data is the cost to acquire and manage it. The available data is growing much, much faster than 40% every 18 months. >> Dave: So what you're saying is that-- I mean this industry has marched to the cadence of Moore's Law for decades and what you're saying is that linear curve is actually reshaping and it's becoming exponential. >> George: For data-- >> Yes. >> George: So the pressure is on for compute, which is now the bottleneck to get clever and clever about how to process it-- >> So that says innovation has to come from elsewhere, not just Moore's Law. It's got to come from a combination of-- Thomas Friedman talks a lot about Moore's Law being one of the fundamentals, but there are others. >> George: Right. >> So from a data perspective, what are those combinatorial effects that are going to drive innovation forward? >> George: There was a big meetup for Spark last night and the focus was this new database called SnappyData that spun out of Pivotal and it's being mentored by Paul Maritz, ex-head of Development in Microsoft in the 90s and former head of VMWare. The interesting thing about this database, and we'll start seeing it in others, is you don't necessarily want to be able to query and analyze petabytes at once, it will take too long, sort of like munging through data of that size on Hadoop took too long. You can do things that approximate the answer and get it much faster. We're going to see more tricks like that. >> Dave: It's interesting you mention Maritz, I heard a lot of messaging this morning that talked about essentially real-time analysis and being able to make decisions on data that you've never seen before and actually affect outcomes. This narrative I first heard from Maritz many, many years ago when they launched Pivotal. He launched Pivotal to be this platform for building big data apps and now you're seeing Databricks and others sort of usurp that messaging and actually seeming to be at the center of that trend. What's going on there? >> I think there's two, what would you call it, two centers of gravity and our CTO David Floyer talks about this. The edge is becoming more intelligent because there's a huge bandwidth and latency gap between these smart devices at the edge, whether the smart device is like a car or a drone or just a bunch of sensors on a turbine. Those things need to analyze and respond in near real-time or hard real-time, like how to tune themselves, things like that, but they also have to send a lot of data back to the cloud to learn about how these things evolve. In other words it would be like sending the data to the cloud to figure out how the weather patterns are changing. >> Dave: Um,humm. >> That's the analogy. You need them both. >> Dave: Okay. >> So Spark right now is really good in the cloud, but they're doing work so that they can take a lighter weight version and put at the edge. We've also seen Amazon put some stuff at the edge and Azure as well. >> Dave: I want you to comment. 
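The "approximate the answer and get it much faster" trick mentioned for SnappyData exists in stock Spark as well, so here is a minimal sketch of the general idea using Spark's built-in approximate functions on synthetic data. The error tolerances are arbitrary, and this illustrates the technique in general rather than how SnappyData itself is implemented.

```python
# Minimal sketch: trade a small, bounded error for a much cheaper answer.
# Synthetic data; the error tolerances (2%, 0.001) are arbitrary illustration values.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("approx-sketch").getOrCreate()

df = spark.range(0, 10_000_000).withColumn("user_id", F.col("id") % 1_000_000)

exact = df.select(F.countDistinct("user_id")).first()[0]                      # full exact distinct count
approx = df.select(F.approx_count_distinct("user_id", rsd=0.02)).first()[0]   # HyperLogLog-based estimate
median = df.approxQuantile("id", [0.5], 0.001)[0]                             # approximate median

print(exact, approx, median)
```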
We're going to talk about this later, we have a-- George and I are going to do a two-part series at this event. We're going to talk about the state of the market and then we're going to release a glimpse into our big data numbers, our Spark forecast, our streaming forecast-- I say, I mention streaming because that is-- we talk about batch, we talk about interactive/real-time, you know you're at a terminal-- anybody who's as old as I am remembers that. But now you're talking about streaming. Streaming is a new workload type, you call these things continuous apps, like streams of events coming into a call center, for example, >> George: Yeah. >> As one example that you used. Add some color to that. Talk about that new workload type and the role of streaming, and really potentially how it fits into IoT. >> Okay, so for the last 60 years, since the birth of digital computing, we've had either one of two workloads, they were either batch, which is jobs that ran offline, you put your punch cards in and sometime later the answer comes out. Or we've had interactive, which is originally it was green screens and now we have PCs and mobile devices. The third one coming up now is continuous or streaming data that you act on in near real-time. It's not that those apps will replace the previous ones, it's that you'll have apps that have continuous processing, batch processing, interactive as a mix. An example would be today all the information about how your applications and data center infrastructure are operating, that's a lot of streams of data that Splunk first took on and did very well with-- so that you're looking in real-time and able to figure out if something goes wrong. That type of stuff, all the telemetry from your data center, that is a training wheel for the Internet of Things, where you've got lots of stuff out at the edge. >> Dave: It's interesting you mention Splunk, Splunk doesn't actually use the big data term in its marketing, but they actually are big data and they are streaming. They're actually not talking about it, they're just doing it, but anyway-- Alright George, great thanks for that overview. We're going to break now, bring back our first guest, Arun Murthy, coming in from Hortonworks, co-founder at Hortonworks, so keep it right there everybody. This is theCUBE, we're live from Spark Summit East, #SparkSummit, we'll be right back. (upbeat music)
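To make the "continuous app" workload concrete, here is a minimal Spark Structured Streaming sketch that treats a synthetic event stream as a stand-in for those call-center events and keeps a running windowed count, processed in small micro-batches. The rate source, window size, and the way a "call type" is derived are all illustrative assumptions, not anything from the discussion.

```python
# Minimal sketch of a continuous app: micro-batch processing over a stream of events.
# The built-in "rate" source generates synthetic events standing in for real call-center traffic.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("continuous-app-sketch").getOrCreate()

events = (spark.readStream.format("rate")
          .option("rowsPerSecond", 100)
          .load()
          .withColumn("call_type", (F.col("value") % 3).cast("string")))  # hypothetical label

counts = events.groupBy(F.window("timestamp", "10 seconds"), "call_type").count()

query = (counts.writeStream
         .outputMode("update")
         .format("console")
         .trigger(processingTime="10 seconds")   # each trigger is one small micro-batch
         .start())

query.awaitTermination(60)   # let it run for about a minute for the demo
query.stop()
```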
SENTIMENT ANALYSIS :
ENTITIES
Entity | Category | Confidence |
---|---|---|
George | PERSON | 0.99+ |
Paul Maritz | PERSON | 0.99+ |
Dave Vellante | PERSON | 0.99+ |
George Gilbert | PERSON | 0.99+ |
Arun Murthy | PERSON | 0.99+ |
Matei Zaharia | PERSON | 0.99+ |
Dave | PERSON | 0.99+ |
Boston | LOCATION | 0.99+ |
Hortonworks | ORGANIZATION | 0.99+ |
Amazon | ORGANIZATION | 0.99+ |
Thomas Friedman | PERSON | 0.99+ |
IBM | ORGANIZATION | 0.99+ |
David Floyer | PERSON | 0.99+ |
Matei | PERSON | 0.99+ |
Broad Institute | ORGANIZATION | 0.99+ |
Berkeley | LOCATION | 0.99+ |
two | QUANTITY | 0.99+ |
Maritz | PERSON | 0.99+ |
Databricks | ORGANIZATION | 0.99+ |
two-part | QUANTITY | 0.99+ |
Microsoft | ORGANIZATION | 0.99+ |
one | QUANTITY | 0.99+ |
third one | QUANTITY | 0.99+ |
Oracle | ORGANIZATION | 0.99+ |
YouTube | ORGANIZATION | 0.99+ |
25,000 per terabyte | QUANTITY | 0.99+ |
Hynes Convention Center | LOCATION | 0.99+ |
24 months | QUANTITY | 0.99+ |
Boston, Massachusetts | LOCATION | 0.98+ |
first guest | QUANTITY | 0.98+ |
three | QUANTITY | 0.98+ |
one example | QUANTITY | 0.98+ |
Hadoop | TITLE | 0.97+ |
last night | DATE | 0.97+ |
three | DATE | 0.97+ |
both | QUANTITY | 0.97+ |
40% | QUANTITY | 0.97+ |
today | DATE | 0.97+ |
Spark Summit East 2017 | EVENT | 0.97+ |
17 terabytes a day | QUANTITY | 0.97+ |
first | QUANTITY | 0.97+ |
24 terabytes a day | QUANTITY | 0.97+ |
ORGANIZATION | 0.96+ | |
decades | QUANTITY | 0.96+ |
90s | DATE | 0.96+ |
Moore's Law | TITLE | 0.96+ |
two workloads | QUANTITY | 0.96+ |
Spark | TITLE | 0.95+ |
four years ago | DATE | 0.94+ |
Moore's | TITLE | 0.94+ |
two centers | QUANTITY | 0.92+ |
Unix | COMMERCIAL_ITEM | 0.92+ |
Kickoff | EVENT | 0.92+ |
#SparkSummit | EVENT | 0.91+ |