
Search Results for NYC:

Basil Faruqui, BMC Software | BigData NYC 2017


 

>> Live from Midtown Manhattan, it's theCUBE. Covering BigData New York City 2017. Brought to you by SiliconANGLE Media and its ecosystem sponsors. (calm electronic music) >> Basil Faruqui, who's the Solutions Marketing Manager at BMC, welcome to theCUBE. >> Thank you, good to be back on theCUBE. >> So first of all, heard you guys had a tough time in Houston, so hope everything's gettin' better, and best wishes to everyone down in-- >> We're definitely in recovery mode now. >> Yeah and so hopefully that can get straightened out quick. What's going on with BMC? Give us a quick update in context to BigData NYC. What's happening, what is BMC doing in the big data space now, the AI space now, the IOT space now, the cloud space? >> So like you said that, you know, the data lake space, the IOT space, the AI space, there are four components of this entire picture that literally haven't changed since the beginning of computing. If you look at those four components of a data pipeline it's ingestion, storage, processing, and analytics. What keeps changing around it is the infrastructure, the types of data, the volume of data, and the applications that surround it. And the rate of change has picked up immensely over the last few years with Hadoop coming into the picture, public cloud providers pushing it. It's obviously creating a number of challenges, but one of the biggest challenges that we are seeing in the market, and we're helping customers address, is the challenge of automating this and, obviously, the benefit of automation is in scalability as well as reliability. So when you look at this rather simple data pipeline, which is now becoming more and more complex, how do you automate all of this from a single point of control? How do you continue to absorb new technologies, and not re-architect your automation strategy every time, whether it's Hadoop, whether it's bringing in machine learning from a cloud provider? And that is the issue we've been solving for customers-- >> Alright let me jump into it. So, first of all, you mention some things that never change, ingestion, storage, and what's the third one? >> Ingestion, storage, processing and eventually analytics. >> And analytics. >> Okay so that's cool, totally buy that. Now if you move and say, hey okay, if you believe that standard, but now in the modern era that we live in, which is complex, you want breadth of data, but also you want the specialization when you get down to machine learning, it's highly bounded, that's where the automation is right now. We see the trend essentially making that automation broader as it goes into the customer environments. >> Correct. >> How do you architect that? If I'm a CXO, or I'm a CDO, what's in it for me? How do I architect this? 'Cause that's really the number one thing, as I know what the building blocks are, but they've changed in their dynamics to the marketplace. >> So the way I look at it is that what defines success and failure, particularly in big data projects, is your ability to scale. If you start a pilot, and you spend three months on it, and you deliver some results, but if you cannot roll it out worldwide, nationwide, whatever it is, essentially the project has failed. The analogy I often give is Walmart has been testing the pick-up tower, I don't know if you've seen it. So this is basically a giant ATM for you to go pick up an order that you placed online. They're testing this at about a hundred stores today.
Now if that's a success, and Walmart wants to roll this out nation wide, how much time do you think their IT department's going to have? Is this a five year project, a ten year project? No, and the management's going to want this done six months, ten months. So essentially, this is where automation becomes extremely crucial because it is now allowing you to deliver speed to market and without automation, you are not going to be able to get to an operational stage in a repeatable and reliable manner. >> But you're describing a very complex automation scenario. How can you automate in a hurry without sacrificing the details of what needs to be? In other words, there would seem to call for repurposing or reusing prior automation scripts and rules, so forth. How can the Walmart's of the world do that fast, but also do it well? >> Yeah so we do it, we go about it in two ways. One is that out of the box we provide a lot of pre-built integrations to some of the most commonly used systems in an enterprise. All the way from the Mainframes, Oracles, SAPs, Hadoop, Tableaus of the world, they're all available out of the box for you to quickly reuse these objects and build an automated data pipeline. The other challenge we saw, and particularly when we entered the big data space four years ago was that the automation was something that was considered close to the project becoming operational. Okay, and that's where a lot of rework happened because developers had been writing their own scripts using point solutions, so we said alright, it's time to shift automation left, and allow companies to build automations and artifact very early in the developmental life cycle. About a month ago, we released what we call Control-M Workbench, its essentially a community edition of Control-M, targeted towards developers so that instead of writing their own scripts, they can use Control-M in a completely offline manner, without having to connect to an enterprise system. As they build, and test, and iterate, they're using Control-M to do that. So as the application progresses through the development life cycle, and all of that work can then translate easily into an enterprise edition of Control-M. >> Just want to quickly define what shift left means for the folks that might not know software methodologies, they don't think >> Yeah, so. of left political, left or right. >> So, we're not shifting Control-M-- >> Alt-left, alt-right, I mean, this is software development, so quickly take a minute and explain what shift left means, and the importance of it. >> Correct, so if you think of software development as a straight line continuum, you've got, you will start with building some code, you will do some testing, then unit testing, then user acceptance testing. As it moves along this chain, there was a point right before production where all of the automation used to happen. Developers would come in and deliver the application to Ops and Ops would say, well hang on a second, all this Crontab, and these other point solutions we've been using for automation, that's not what we use in production, and we need you to now go right in-- >> So test early and often. >> Test early and often. So the challenge was the developers, the tools they used were not the tools that were being used on the production end of the site. And there was good reason for it, because developers don't need something really heavy and with all the bells and whistles early in the development lifecycle. 
Now Control-M Workbench is a very light version, which is targeted at developers and focuses on the needs that they have when they're building and developing it. So as the application progresses-- >> How much are you seeing waterfall-- >> But how much can they, go ahead. >> How much are you seeing waterfall, and then people shifting left becoming more prominent now? What percentage of your customers have moved to Agile, and shifting left percentage wise? >> So we survey our customers on a regular basis, and the last survey showed that eighty percent of the customers have either implemented a more continuous integration delivery type of framework, or are in the process of doing it, And that's the other-- >> And getting close to a 100 as possible, pretty much. >> Yeah, exactly. The tipping point is reached. >> And what is driving. >> What is driving all is the need from the business. The days of the five year implementation timelines are gone. This is something that you need to deliver every week, two weeks, and iteration. >> Iteration, yeah, yeah. And we have also innovated in that space, and the approach we call jobs as code, where you can build entire complex data pipelines in code format, so that you can enable the automation in a continuous integration and delivery framework. >> I have one quick question, Jim, and I'll let you take the floor and get a word in soon, but I have one final question on this BMC methodology thing. You guys have a history, obviously BMC goes way back. Remember Max Watson CEO, and Bob Beach, back in '97 we used to chat with him, dominated that landscape. But we're kind of going back to a systems mindset. The question for you is, how do you view the issue of this holy grail, the promised land of AI and machine learning, where end-to-end visibility is really the goal, right? At the same time, you want bounded experiences at root level so automation can kick in to enable more activity. So there's a trade-off between going for the end-to-end visibility out of the gate, but also having bounded visibility and data to automate. How do you guys look at that market? Because customers want the end-to-end promise, but they don't want to try to get there too fast. There's a diseconomies of scale potentially. How do you talk about that? >> Correct. >> And that's exactly the approach we've taken with Control-M Workbench, the Community Edition, because earlier on you don't need capabilities like SLA management and forecasting and automated promotion between environments. Developers want to be able to quickly build and test and show value, okay, and they don't need something that is with all the bells and whistles. We're allowing you to handle that piece, in that manner, through Control-M Workbench. As things progress and the application progresses, the needs change as well. Well now I'm closer to delivering this to the business, I need to be able to manage this within an SLA, I need to be able to manage this end-to-end and connect this to other systems of record, and streaming data, and clickstream data, all of that. So that, we believe that it doesn't have to be a trade off, that you don't have to compromise speed and quality for end-to-end visibility and enterprise grade automation. >> You mentioned trade offs, so the Control-M Workbench, the developer can use it offline, so what amount of testing can they possibly do on a complex data pipeline automation when the tool's offline? 
I mean it seems like the more development they do offline, the greater the risk that it simply won't work when they go into production. Give us a sense for how they mitigate that risk in using Control-M Workbench. >> Sure, so we spend a lot of time observing how developers work, right? And very early in the development stage, all they're doing is working off of their Mac or their laptop, and they're not really connected to anything. And that is where they end up writing a lot of scripts, because whatever code business logic they've written, the way they're going to make it run is by writing scripts. And that, essentially, becomes the problem, because then you have scripts managing more scripts, and as the application progresses, you have this complex web of scripts and Crontabs and maybe some open source solutions, trying to simply make all of this run. And by doing this in an offline manner, that doesn't mean that they're losing all of the other Control-M capabilities. Simply, as the application progresses, whatever automation they've built in Control-M can seamlessly now flow into the next stage. So when you are ready to take an application into production, there's essentially no rework required from an automation perspective. All of that, that was built, can now be translated into the enterprise-grade Control-M, and that's where operations can then go in and add the other artifacts, such as SLA management and forecasting and other things that are important from an operational perspective. >> I'd like to get both your perspectives, 'cause, so you're like an analyst here, so Jim, I want you guys to comment. My question to both of you would be, lookin' at this time in history, obviously on the BMC side we mention some of the history, you guys are transforming on a new journey in extending that capability of this world. Jim, you're covering state-of-the-art AI machine learning. What's your take on this space now? Strata Data, which is now Hadoop World, which is Cloudera went public, Hortonworks is now public, kind of the big, the Hadoop guys kind of grew up, but the world has changed around them, it's not just about Hadoop anymore. So I'd like to get your thoughts on this kind of perspective, that we're seeing a much broader picture in big data in NYC, versus the Strata Hadoop show, which seems to be losing steam, but I mean in terms of the focus. The bigger focus is much broader, horizontally scalable. And your thoughts on the ecosystem right now? >> Let Basil answer first, unless Basil wants me to go first. >> I think that the reason the focus is changing is because of where the projects are in their lifecycle. Now what we're seeing is most companies are grappling with, how do I take this to the next level? How do I scale? How do I go from just proving out one or two use cases to making the entire organization data driven, and really inject data driven decision making in all facets of decision making? So that is, I believe, what's driving the change that we're seeing, that now you've gone from Strata Hadoop to being Strata Data, and focus on that element. And, like I said earlier, the difference between success and failure is your ability to scale and operationalize. Take machine learning for an example. >> Good, that's where there's no, it's not a hype market, it's show me the meat on the bone, show me scale, I got operational concerns of security and what not. >> And machine learning, that's one of the hottest topics.
A recent survey I read, which polled a number of data scientists, revealed that they spent less than 3% of their time training the data models, and about 80% of their time on data manipulation, data transformation and enrichment. That is obviously not the best use of a data scientist's time, and that is exactly one of the problems we're solving for our customers around the world. >> That needs to be automated to the hilt. To help them >> Correct. to be more productive, to deliver faster results. >> Ecosystem perspective, Jim, what's your thoughts? >> Yeah, everything that Basil said, and I'll just point out that many of the core use cases for AI are automation of the data pipeline. It's driving machine learning driven predictions, classifications, abstractions and so forth, into the data pipeline, into the application pipeline to drive results in a way that is contextually and environmentally aware of what's goin' on. The history, historical data, what's goin' on in terms of current streaming data, to drive optimal outcomes, using predictive models and so forth, in line to applications. So really, fundamentally then, what's goin' on is that automation is an artifact that needs to be driven into your application architecture as a repurposable resource for a variety of-- >> Do customers even know what to automate? I mean, that's the question, what do I-- >> You're automating human judgment. You're automating effort, like the judgments that a working data engineer makes to prepare data for modeling and whatever. More and more that can be automated, 'cause those are pattern structured activities that have been mastered by smart people over many years. >> I mean we just had a customer on with a Glass'Gim CSK, with that scale, and his attitude is, we see the results from the users, then we double down and pay for it and automate it. So the automation question, it's an option question, it's a rhetorical question, but it just begs the question, which is who's writing the algorithms as machines get smarter and start throwing off their own real-time data? What are you looking at? How do you determine? You're going to need machine learning for machine learning? Are you going to need AI for AI? Who writes the algorithms >> It's actually, that's. for the algorithm? >> Automated machine learning is a hot, hot research focus, and we're seeing more and more solution providers, like Microsoft and Google and others, goin' deep down, doubling down in investments in exactly that area. That's a productivity play for data scientists. >> I think the data market's going to change radically in my opinion. You're startin' to see some things with blockchain and some other things that are interesting. Data sovereignty, data governance are huge issues. Basil, just give your final thoughts for this segment as we wrap this up. Final thoughts on data and BMC, what should people know about BMC right now? Because people might have a historical view of BMC. What's the latest, what should they know? What's the new Instagram picture of BMC? What should they know about you guys? >> So I think what I would say people should know about BMC is that all the work that we've done over the last 25 years, in virtually every platform that came before Hadoop, we have now innovated to take this into things like big data and cloud platforms. So when you are choosing Control-M as a platform for automation, you are choosing a very, very mature solution, an example of which is Navistar.
Their CIO's actually speaking at the keynote tomorrow. They've had Control-M for 15, 20 years, and they've automated virtually every business function through Control-M. And when they started their predictive maintenance project, where they're ingesting data from about 300,000 vehicles today to figure out when a vehicle might break, and to predict maintenance on it. When they started their journey, they said that they always knew that they were going to use Control-M for it, because that was the enterprise standard, and they knew that they could simply now extend that capability into this area. And when they started about three, four years ago, they were ingesting data from about 100,000 vehicles. That has now scaled to over 325,000 vehicles, and they have not had to re-architect their strategy as they grow and scale. So I would say that is one of the key messages that we are taking to market, is that we are bringing innovation that spans over 25 years, and evolving it-- >> Modernizing it, basically. >> Modernizing it, and bringing it to newer platforms. >> Well congratulations, I wouldn't call that a pivot, I'd call it an extensibility issue, kind of modernizing kind of the core things. >> Absolutely. >> Thanks for coming and sharing the BMC perspective inside theCUBE here, on BigData NYC, this is theCUBE, I'm John Furrier. Jim Kobielus here in New York City. More live coverage, for three days we'll be here, today, tomorrow and Thursday, and BigData NYC, more coverage after this short break. (calm electronic music) (vibrant electronic music)
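
The "jobs as code" approach Basil describes is only sketched at a high level in the conversation. As a rough, generic illustration of the idea (a data pipeline defined in ordinary source code so it can be versioned, tested, and promoted like any other artifact), the sketch below uses plain Python with hypothetical job names; it is not Control-M syntax or the Control-M Workbench API.

```python
# Hypothetical sketch of a "jobs as code" pipeline definition (not Control-M syntax).
# The four stages mirror the components named in the interview:
# ingestion, storage, processing, and analytics.

from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Job:
    name: str
    action: Callable[[], None]
    depends_on: List[str] = field(default_factory=list)


def run_pipeline(jobs: List[Job]) -> None:
    """Run jobs in order, checking that each job's dependencies already ran."""
    done = set()
    for job in jobs:
        missing = [d for d in job.depends_on if d not in done]
        if missing:
            raise RuntimeError(f"{job.name} is missing dependencies: {missing}")
        print(f"running {job.name}")
        job.action()
        done.add(job.name)


# The pipeline definition lives in source control alongside the application code,
# so it can be tested early and carried forward into later environments.
pipeline = [
    Job("ingest_clickstream", lambda: print("  pull raw events")),
    Job("land_in_storage", lambda: print("  write to the data lake"), ["ingest_clickstream"]),
    Job("process_events", lambda: print("  transform and enrich"), ["land_in_storage"]),
    Job("publish_analytics", lambda: print("  refresh dashboards"), ["process_events"]),
]

if __name__ == "__main__":
    run_pipeline(pipeline)
```

In a setup like this, the same definition a developer runs on a laptop early in the lifecycle travels with the application toward production, which is the shift-left point made in the interview.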

Published Date : Feb 11 2019

SUMMARY :

Brought to you by SiliconANGLE Media who's the Solutions Marketing Manger at BMC, in the big data space now, the AI space now, And that is the issue we've been solving for customers-- So, first of all, you mention some things that never change, and eventually analytics. but now in the modern era that we live in, 'Cause that's really the number one thing, No, and the management's going to How can the Walmart's of the world do that fast, One is that out of the box we provide a lot of left political, left or right. Alt-left, alt-right, I mean, this is software development, and we need you to now go right in-- and focuses on the needs that they have And getting close to a 100 The tipping point is reached. The days of the five year implementation timelines are gone. and the approach we call jobs as code, At the same time, you want bounded experiences at root level And that's exactly the approach I mean it seems like the more development and as the application progresses, kind of the big, the Hadoop guys kind of grew up, Let the Basil answer fist, and focus on that element. it's not a hype market, it's show me the meat of the problems we're solving That needs to be automated to the hilt. to be more productive, to deliver faster results. and I'll just point out that many of the core uses cases like the judgments that a working data engineer makes So the automation question, it's an option question, for the algorithm? doubling down in investments in exactly that area. What's the latest, what should they know? should know about BMC is that all the work kind of modernizing kind of the core things. Thanks for coming and sharing the BMC perspective

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Jim | PERSON | 0.99+
Jim Kobielus | PERSON | 0.99+
Walmart | ORGANIZATION | 0.99+
BMC | ORGANIZATION | 0.99+
Google | ORGANIZATION | 0.99+
NYC | LOCATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
one | QUANTITY | 0.99+
Basil Faruqui | PERSON | 0.99+
five year | QUANTITY | 0.99+
ten months | QUANTITY | 0.99+
two weeks | QUANTITY | 0.99+
three months | QUANTITY | 0.99+
six months | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
15 | QUANTITY | 0.99+
Basil | PERSON | 0.99+
Houston | LOCATION | 0.99+
Hortonworks | ORGANIZATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
Mac | COMMERCIAL_ITEM | 0.99+
BMC Software | ORGANIZATION | 0.99+
two ways | QUANTITY | 0.99+
both | QUANTITY | 0.99+
tomorrow | DATE | 0.99+
Midtown Manhattan | LOCATION | 0.99+
One | QUANTITY | 0.99+
ten year | QUANTITY | 0.99+
over 25 years | QUANTITY | 0.99+
over 325,000 vehicles | QUANTITY | 0.99+
about 300,000 vehicles | QUANTITY | 0.99+
third one | QUANTITY | 0.99+
three days | QUANTITY | 0.99+
about 100,000 vehicles | QUANTITY | 0.99+
about 80% | QUANTITY | 0.98+
BigData | ORGANIZATION | 0.98+
Thursday | DATE | 0.98+
eighty percent | QUANTITY | 0.98+
today | DATE | 0.98+
20 years | QUANTITY | 0.98+
one quick question | QUANTITY | 0.98+
single point | QUANTITY | 0.98+
Bob Beach | PERSON | 0.97+
four years ago | DATE | 0.97+
two use cases | QUANTITY | 0.97+
one final question | QUANTITY | 0.97+
'97 | DATE | 0.97+
Instagram | ORGANIZATION | 0.97+
Agile | TITLE | 0.96+
New York city | LOCATION | 0.96+
About a month ago | DATE | 0.96+
Oracles | ORGANIZATION | 0.96+
Hadoop | TITLE | 0.95+
about a hundred stores | QUANTITY | 0.94+
less than 3% | QUANTITY | 0.94+
2017 | DATE | 0.93+
Glass'Gim | ORGANIZATION | 0.92+
about | QUANTITY | 0.92+
first | QUANTITY | 0.91+
Ops | ORGANIZATION | 0.91+
Hadoop | ORGANIZATION | 0.9+
Max Watson | PERSON | 0.88+
100 | QUANTITY | 0.88+
theCUBE | ORGANIZATION | 0.88+
Mainframes | ORGANIZATION | 0.88+
Navistar | ORGANIZATION | 0.86+

Harish Venkat, Veritas | Veritas Vision Solution Day NYC 2018


 

>> From Tavern on the Green, in Central Park, New York, it's theCUBE, covering Veritas Vision Solution Day, brought to you by Veritas. >> Welcome back to the beautiful Tavern on the Green, in the heart of Central Park. You're watching theCUBE, the leader in live tech coverage. My name's Dave Vellante. We're covering Vertias Solution Days, #VtasVision. Veritas used to have the big, single tent, big tent customer event, and decided this year, it's going to go belly to belly. Go out to 20 cities, intimate customer events where they can really sit down with customers across from the table; certainly, this beautiful venue is the perfect place to do that. Harish Venkat is here as the VP of Marketing and Global Sales Enablement at Veritas. Thanks for coming on, Harish. >> Yeah, thanks for having me. >> So, we're going to change it up a little bit. Let's hit the Escape key a few times and talk about >> Yeah. >> some of the big mega trends that you're seeing. You spend a lot of time with customers. You had some intimate conversations today. What do you see as the big trends driving the marketplace? >> So at my level, what I observe with the highest thing is simplicity, instant gratification, is two things that customers love. Forget about customers, even we as individuals, we love simplicity and instant gratification. Examples around that, you know, think about back in the days where you had to take a picture, process the film, and then realize, "oh my god, the film's not even worth watching." Now you have digital photography, you take millions of pictures, and instantly you view the picture, and keep whatever you want, delete whatever you don't want. A small example of how simplicity and instant gratification is changing the world. In fact, if you listen to Warren Buffett, he'll say, "Invest in companies that is making your life a lot easier," so, if I spread that across the entire industry, I can go on with examples like Netflix disrupting Blockbuster because it made it easy for customers to watch movies at their time, and making it easy for consumption. You look at showrooming concept, where you go to Best Buy's of the world and many others, and look at a product, but you don't buy it right there. You go to your phone and say, "okay, do I do a price compare?" And then order it on the phone, where someone delivers it to your house So the list goes on and on, and the underpinning result as a result of this is disruption, all right? You look at Fortune 500 companies, just in the last decade. Over 52% of those companies have been disrupted and the underpinning phenomenon is all about instant gratification and simplicity. >> And Amazon is another great example of, I remember when my wife said to me, "Dave, you got to invest in this company." It was like... 1997. >> Yeah. >> Invest in this company, Amazon? >> Yeah. >> At the time, it was mostly books, but they started to get into other retail, so right-- >> We missed that boat, didn't we? >> I actually did, but I sold, ah! (laughs) >> I never lost money making a profit, so okay. So, at the same time, customer... Customers just can't get there... >> Yeah. >> Overnight, so what are some of the challenges that they have in getting to that level of simplicity? >> Yeah, so you look at IT spend, and when you look at the breakdown of IT spend, you'll see that about 87%, and in many cases, even greater than 90%, they spend just to keep the lights on and these are well-established companies that I'm talking about. 
In fact, I was doing a Keynote in, in Minneapolis one time and a CIO came and said, "Harish, I totally disagree." "In my company, it's 96%." >> (Dave laughs) >> Just to keep the lights on! So you're talking about less than 10% of your IT spend gone towards innovation, and then you look at emerging companies who are spending almost 100% all around innovation, leveraging the clouds of the world, leveraging the latest and greatest technology, and then doing these disruptions, and making things simple for consumption, and as a result, the disruption happens, so I think we have an opportunity to re-balance the equation in the enterprise space, and making it more available for innovation than just keeping the lights on. >> So part of that... the equation of shifting that needle, moving that needle, if you will, just eliminating non-value-producing activities that are expensive. We know, still, IT is still very labor-intensive, so we got to take that equation down and shift it. Are you seeing companies have success in shifting, re-training people toward digital initiatives and removing some of the heavy lifting, and what's driving that? >> Yeah, so I think it's a journey, right? So, I mean, the entire notion of journeying to the cloud is one of the big initiative to take out heavily manual-intensive, data center-intensive, which is costing a lot of money. If I can just shift all of those workloads to the cloud, that'll help me re-balance the equation. I view the concept of data intensity, which is really two variables to it. Back to your point, if I can take the non-core activity, rely on my partner ecosystem to say what is best in class solutions that I can use as my foundation layer, and then innovate on top of it, then yes, you have the perfect winning formula to really have a lot of market share and wallet share. If you're trying to do the entire stack by yourself, good luck. You'll be one of those guys who will be disrupted. There is no doubt. >> So well, okay, that says partnerships are very important. >> Without a doubt. >> You're not too alone. >> Channel is very important. >> Yes. >> So, so what do you see, in terms of the ebb and flow in the industry, of partnerships, how those are forming? Hear a lot about "co-opetition," which is kind of an interesting term, that is now, we're living. >> Yeah. >> What's your, what's your observation about partnerships, and how companies are able to leverage them? What's best practice there? >> Yeah, so just as Veritas, we're a data protection leader company. We have incredible market share and wallet share, amongst the Fortune 500 and Fortune 100 companies, but even within our incredible standing, we have to rely on other partners. We don't do everything on our own. We have incredible relationship with our cloud service providers, with the hyper-converged system to the world, like Nutanix. We just announced Pure today, so when we combine those partnerships, we can offer incredible solutions for our customers, who can then take care of the first variable that I talked about, and then innovate on top of it. So I think partner ecosystem is extremely important. For customers, it's very important that they pick the right players, so they don't have to worry about the data, and they can continually focus on innovation. >> We were talking to NBC Universal today, and one of themes in my take-aways was he's trying to get to the... 
he's a, basically a data protector, backup administrator, essentially, but he's trying to get to the point where he can get the business lines to self-serve. >> Yeah. >> And that seems to me to be part of the simplicity. Now... an individual like that, got to re-skill. Move toward a digital transformation. Move that needle so it's not 90% keeping the lights on. It's maybe you get to 50/50. >> Yeah. What are you seeing in terms of training and re-education of both existing people and maybe even how young people are being educated, your thoughts? >> Yeah, I think the young people coming out of college, they're already tuned to this, so to me, those are the disruptors of the world. You got to keep an eye on those millennials of the world because you don't have to train them more, because they're coming out of college, you know. They don't have the legacy background. They don't have the data centers of the world. They are already in the cloud. They're born in the cloud, sort of individuals, so I think the challenge is more about existing individuals who have the pedigree of all the journey that, you and I, we have seen, and how do you re-tune yourself to the modern world? And I think that presents an opportunity to say, "Okay look, if you don't adapt real quick," "you don't have a chance to survive" "in this limited amount of time you have in the IT space," but having said that, we're also seeing that you have some time window, and that time window will continue to shrink, so when we talk about this transformation journey, you can see year after year, the progress that, that's been made in the transformation, this leap and bound, and that's all related to Moore's Law. You think about computer and storage, it's becoming a lot cheaper, and so the innovation rate is continuing to go up. So you have very limited window: adapt or die. >> So, Harish, we were talking about, we've talked about digital transformation. We talk about simplifying; we're talking about agility. We're talking about shifting budget priorities, all very important initiatives. How is Veritas helping customers achieve these goals, so that they can move the needle from 90% keep the lights on to maybe 50/50, and put more into innovation. >> So four major themes: one is data protection. If you don't have your core enterprise asset, which is your data protected, then you can't really innovate anything on top of it. You'll constantly be worrying about what happens if I have a ransomware attack, what if I have a data outage, so Veritas takes care of it, back to the notion that you pick the best players to take care of the fundamental layer, which is around the data. The second thing that I... I would say Veritas can help is the journey to the cloud. Cloud, again, is another instrument for you to take out cost out of your data center. You're agile, you're nimble, so you can focus on innovation. Do you see the trend? So again, Veritas helps you with that journey to the cloud. It allows to move data and application to the cloud. When you're in the cloud, we protect your data in the cloud. The third thing I would say is doing more with less. I talked about the IT equation already. Software-defined storage allows you to do that. And the last thing I would say is compliance. We can't get away from compliance, the fact that Veritas has solutions to have visibility around the data. You can classify the data. You can always be compliant working with Veritas. You take care of these four layers, you don't have to worry about your data asset. 
You can worry about innovation at that point. >> So it, to me, it's sort of a modern version of the rebirth of Veritas. When Veritas first started, I always used to think of it as a data management company, not just a backup company. >> Right. >> And that's really what we're talking about here today, evolving toward a data-centric approach, that full life cycle of data management, simplifying that, bringing the cloud experience to your data wherever it is. Could be "on-prem." >> Yeah. >> Could be in the cloud, sort of this API-based architecture, microservices, containers... >> Yep. >> All the kind of interesting buzzwords today, but they enable agility in a cloud-like experience, that Netflix-like experience that you were talking about. >> Absolutely, right, so we're super excited. The one thing I would also say is what our latest net backup, 812, the other thing that I talked about, which is simplicity and ease of use: we are addressing both of that in addition to the robust brand that we have around protecting data. So you now you have simplicity, ease of use, instant gratification, all the basic ingredients, and Veritas is here to protect them. >> Harish, it's been a great day. Thanks for helping me close out the segment here. This venue is really terrific. It's been a while since I've been at Tavern on the Green. Some of you guys, I don't think you've ever seen it before. Seth's down here; he's, he's a city boy but we country bumpkins up in Massachusetts, we love coming down here, in the heart of Yankee country. So thanks very much-- >> Of course. >> For helping me close out here, great segment. All right, thanks for watching, everybody. We're out here, from New York City, Tavern on the Green. You've been watching theCUBE; I'm Dave Vellante. We'll see you next time. (light electronic music)

Published Date : Oct 11 2018

SUMMARY :

brought to you by Veritas. is the perfect place to do that. Let's hit the Escape key some of the big mega trends that you're seeing. back in the days where you had to take a picture, "Dave, you got to invest in this company." So, at the same time, customer... and when you look at the breakdown of IT spend, and then you look at emerging companies and removing some of the heavy lifting, is one of the big initiative to take out So, so what do you see, so they don't have to worry about the data, and one of themes in my take-aways was Move that needle so it's not 90% keeping the lights on. What are you seeing in terms of training and re-education and so the innovation rate is continuing to go up. so that they can move the needle from 90% keep the lights on is the journey to the cloud. of the rebirth of Veritas. bringing the cloud experience to your data wherever it is. Could be in the cloud, sort of this API-based architecture, that Netflix-like experience that you were talking about. and Veritas is here to protect them. Thanks for helping me close out the segment here. We're out here, from New York City, Tavern on the Green.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Harish | PERSON | 0.99+
Nutanix | ORGANIZATION | 0.99+
90% | QUANTITY | 0.99+
Dave | PERSON | 0.99+
Veritas | ORGANIZATION | 0.99+
Minneapolis | LOCATION | 0.99+
Harish Venkat | PERSON | 0.99+
Massachusetts | LOCATION | 0.99+
96% | QUANTITY | 0.99+
New York City | LOCATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Warren Buffett | PERSON | 0.99+
NBC Universal | ORGANIZATION | 0.99+
two things | QUANTITY | 0.99+
20 cities | QUANTITY | 0.99+
1997 | DATE | 0.99+
Best Buy | ORGANIZATION | 0.99+
first variable | QUANTITY | 0.99+
Seth | PERSON | 0.99+
Central Park | LOCATION | 0.99+
Netflix | ORGANIZATION | 0.99+
second thing | QUANTITY | 0.99+
less than 10% | QUANTITY | 0.98+
millions of pictures | QUANTITY | 0.98+
this year | DATE | 0.98+
both | QUANTITY | 0.98+
50/50 | QUANTITY | 0.98+
two variables | QUANTITY | 0.98+
first | QUANTITY | 0.98+
one | QUANTITY | 0.98+
last decade | DATE | 0.98+
today | DATE | 0.98+
Veritas Vision Solution Day | EVENT | 0.96+
Over 52% | QUANTITY | 0.96+
almost 100% | QUANTITY | 0.96+
about 87% | QUANTITY | 0.96+
greater than 90% | QUANTITY | 0.95+
Vertias Solution Days | EVENT | 0.95+
third thing | QUANTITY | 0.94+
four major themes | QUANTITY | 0.94+
Fortune 500 | ORGANIZATION | 0.93+
single tent | QUANTITY | 0.9+
Central Park, New York | LOCATION | 0.89+
Yankee | LOCATION | 0.89+
four layers | QUANTITY | 0.86+
Tavern on the Green | LOCATION | 0.83+
themes | QUANTITY | 0.8+
#VtasVision | EVENT | 0.78+
theCUBE | ORGANIZATION | 0.74+
Veritas Vision Solution Day NYC 2018 | EVENT | 0.73+
one time | QUANTITY | 0.73+
Fortune 100 | ORGANIZATION | 0.68+
Fortune | ORGANIZATION | 0.62+
812 | OTHER | 0.6+
Moore | ORGANIZATION | 0.6+
Enablement | ORGANIZATION | 0.57+
on the Green | TITLE | 0.56+
Keynote | EVENT | 0.55+
Blockbuster | ORGANIZATION | 0.47+
Tavern | ORGANIZATION | 0.44+
theCUBE | EVENT | 0.38+
500 | QUANTITY | 0.37+

Greg Hughes, Veritas | Veritas Vision Solution Day NYC 2018


 

>> From Tavern on the Green in Central Park, New York, it's theCUBE, covering Veritas Vision Solution Day. Brought to you by Veritas. (robotic music) >> We're back in the heart of Central Park. We're here at Tavern on the Green. Beautiful location for the Veritas Vision Day. You're watching theCUBE, my name is Dave Vellante. We go out to the events, we extract the signal from the noise, we got the CEO of Veritas here, Greg Hughes, newly minted, nine months in. Greg, thanks for coming on theCUBE. >> It's great to be here Dave, thank you. >> So let's talk about your nine months. What was your agenda your first nine months? You know they talk about the 100 day plan. What was your nine month plan? >> Yeah, well look, I've been here for nine months, but I'm a boomerang. So I was here from 2003 to 2010. I ran all of global services during that time, and became the chief strategy officer after that. Was here during the merger with Symantec. And then ran the Enterprise Product Group. So I had all the products and all the engineering teams for all the Enterprise products. And really my starting point is the customer. I really like to hear directly from the customer. So I've spent probably 50% of my time out and about, meeting with customers. And at this point, I've met with 100 different accounts all around the world. And what I'm hearing makes me even more excited to be here. Digital transformation is real. These customers are investing a lot in digitizing their companies. And that's driving an explosion of data. That data all needs to be available and recoverable and that's where we step in. We're the best at that. >> Okay, so that was sort of alluring to you. You're right, everybody's trying to get digital transformation right. It changes the whole data protection equation. It kind of reminds me, on a much bigger scale, of virtualization. You remember, everybody had to rethink their backup strategies because you now have less physical resources. This is a whole different set of pressures, isn't it? It's like you can't go down, you have to always have access to data. Data is-- >> 24 by seven. >> Increasingly valuable. >> Yup. >> So talk a little bit more about the importance of data, the role of data, and where Veritas fits in. >> Well, our customers are using new, they're driving new applications throughout the enterprise. So machine learning, AI, big data, internet of things. And that's all driving the use of new data management technologies. Cassandra, Hadoop, Open Sequel, MongoDB. You've heard all of these, right? And then that's driving the use of new platforms. Hyper-converged, virtual machines, the cloud. So all this data is popping up in all these different areas. And without Veritas, it can exist, it'll just be in silos. And that becomes very hard to manage and protect it. All that data needs to be protected. We're there to protect everything. And that's really how we think about it. >> The big message we heard today was you got a lot of different clouds, you don't want to have a different data protection strategy for each cloud. So you've got to simplify that for people. Sounds easy, but from an R&D perspective, you've got a large install base, you've been around for a long, long time. So you've got to put investments to actually see that through. Talk about your R&D and investment strategy. >> Well, our investment strategy's very simple. We are the market share leader in data protection and software-defined storage. And that scale gives us a tremendous advantage.
We can use that scale to invest more aggressively than anybody else, in those areas. So we can cover all the workloads, we can cover wherever our customers are putting their data, and we can help them standardize on one provider of data protection, and that's us. So they don't have to have the complexity of point products in their infrastructure. >> So I wonder if we could talk, just a little veer here, and talk about the private equity play. You guys are the private equity exit. And you're seeing a lot of high profile PE companies. It used to be where companies would go to die, and now it's becoming a way for the PE guys to actually get step-ups, and make a lot of money by investing in companies, and building communities, investing in R&D. Some of the stuff we've covered. We've followed Syncsort, BMC, Infor, a really interesting company, what's kind of an exit from PE, right? Dell, the biggest one of all. Riverbed, and of course Veritas. So, there's like a new private equity playbook. It's something you know well from your Silver Lake days. Describe what that dynamic is like, and how it's changed. >> Oh look, private equity's been involved in software for 10 or 15 years. It's been a very important area of investment in private equity. I've worked for private equity firms, worked for software companies, so I know it very well. And the basic idea is, continue the investment. Continue in the investment in the core products and the core customers, to make sure that there is continued enhancement and innovation, of the core products. With that, there'll be continuity in customer relationships, and those customer relationships are very valuable. That's really the secret, if you will, of the private equity playbook. >> Well and public markets are very fickle. I mean, they want growth now. They don't care about profits. I see you've got a very nice cash flow, you and some of the brethren that I mentioned. So that could be very attractive, particularly when, you know, public markets they ebb and flow. The key is value for customers, and that's going to drive value for shareholders. >> That's absolutely right. >> So talk about the TAM. Part of a CEOs job, is to continually find new ways, you're a strategy guy, so TAM expansion is part of the role. How do you look at the market? Where are the growth opportunities? >> We see our TAM, or our total addressable market, at being around $17 billion, cutting across all of our areas. Probably growing into high single digits, 8%. That's kind of a big picture view of it. When I like to think about it, I like to think about it from the themes I'm hearing from customers. What are our customers doing? They're trying to leverage the cloud. Most of our customers, which are large enterprises. We work with the blue-chip enterprises on the planet. They're going to move to a hybrid approach. They're going to on-premise infrastructure and multiple cloud providers. So that's really what they're doing. The second thing our customers are worried about is ransomware, and ransomware attacks. Spearfishing works, the bad guys are going to get in. They're going to put some bad malware in your environment. The key is to be resilient and to be able to restore at scale. That's another area of significant investment. The third, they're trying to automate. They're trying to make investments in automation, to take out manual labor, to reduce error rate. In this whole world, tape should go away. 
So one of the things our customers are doing, is trying to get rid of tape backup in their environment. Tape is a long-term retention strategy. And then finally, if you get rid of tape, and you have all your secondary data on disc or in the cloud, what becomes really cool, is you can analyze all that data. Out of bound, from the primary storage. That's one of the bigger changes I've seen since I've returned back to Veritas. >> So $17 billion, obviously, that transcends backup. Frankly, we go back to the early days of Veritas, I always thought of it as a data management company and sort of returned to those roots. >> Backup, software defined storage, compliance, all those areas are key to what we do. >> You mentioned automation. When you think about cloud and digital transformation, automation is fundamental, we had NBCUniversal on earlier, and the customer was talking about scripts and how scripts are fragile and they need to be maintained and it doesn't scale. So he wants to drive automation into his processes as much as possible, using a platform, a sort of API based, modern, microservices, containers. Kind of using all those terms. What does that mean for you guys in terms of your R&D roadmap, in terms of the investments that you're making in those types of software innovations? >> Well actually one of the things we're talking about today is our latest release of NetBackup 812, which had a significant investment in APIs and that allow our customers to use the product and automate processes, tie it together with their infrastructure, like ServiceNow, or whatever they have. And we're going to continue full throttle on APIs. Just having lunch with some customers just today, they want us to go even further in our APIs. So that's really core to what we're doing. >> So you guys are a little bit like the New England Patriots. You're the leader, and everybody wants to take you down. So you always start-- >> Nobody's confused me for Tom Brady. Although my wife looks... I'll stack her up against Giselle anytime, but I'm no Tom Brady. >> So okay, how do you maintain your leadership and your relevance for customers? A lot of VC money coming into the marketplace. Like I said, everybody wants to take the leader down. How do you maintain your leadership? >> We've been around for 25 years. We're very honored to have 95% of the Fortune 100, are our customers. If you go to any large country in the world it's very much like that. We work with the bluest of blue-chips, the biggest companies, the most complex, the most demanding (chuckling), the most highly regulated. Those are our customers. We steer the ship based on their input, and that's why we're relevant. We're listening to them. Our customer's extremely relevant. We're going to help them protect, classify, archive their data, wherever it is. >> So the first nine months was all about hearing from customers. So what's the next 12 to 18 months about for you? >> We're continuing to invest, delighted to talk about partnerships, and where those are going, as well. I think that's going to be a major emphasis of us to continue to drive our partnerships. We can't do this alone. Our customers use products from a variety of other players. Today we had Henry Axelrod, from Amazon Web Services, here talking about how we're working closely with Amazon. We announced a really cool partnership with Pure Storage. Our customers that use Pure Storage's all-flash arrays, they know their data's backed up and protected with Veritas and with NetBackup. 
It's to continually make sure that across this ecosystem of partners, we are the one player that can help our large customers. >> Great, thank you for mentioning that ecosystem is a key part of it. The channel, that's how you continue to grow. You get a lot of leverage out of that. Well Greg, thanks very much for coming on theCUBE. Congratulations on your-- >> Dave, thank you. >> On the new role. We are super excited for you guys, and we'll be watching. >> I enjoyed it, thank you. >> All right. Keep it right there everybody, we'll be back with our next guest. This is Dave Vellante, we're here in Central Park. Be right back, Veritas Vision, be right back. (robotic music)
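
The API-driven automation Greg describes for NetBackup 812, exposing backup operations so they can be wired into tools like ServiceNow, is discussed above only in general terms. As a loose sketch of what driving a backup job through a REST API can look like, the example below uses entirely hypothetical endpoint paths, field names, and credentials; it is not taken from NetBackup's documented API.

```python
# Hypothetical sketch of triggering a backup job over REST, in the spirit of the
# API-first automation discussed above. The host, endpoint, payload fields, and
# token handling are illustrative placeholders, not a documented NetBackup API.

import requests

BASE_URL = "https://backup.example.com/api"   # placeholder host
TOKEN = "replace-with-a-real-token"           # placeholder credential


def trigger_backup(policy: str, client: str) -> str:
    """Request an on-demand backup for one client under a named policy."""
    resp = requests.post(
        f"{BASE_URL}/backup-jobs",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"policy": policy, "client": client},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["jobId"]


if __name__ == "__main__":
    # A call like this could sit behind a ticketing workflow or a CI/CD step.
    job_id = trigger_backup(policy="prod-databases", client="db01.example.com")
    print(f"submitted backup job {job_id}")
```

A small script like this is the kind of building block an external workflow can call, which is the point of exposing the operations as APIs rather than scripts tied to one tool.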

Published Date : Oct 11 2018

SUMMARY :

Brought to you by Veritas. We're back in the So let's talk about your nine. and became the chief It changes the whole about the importance of data, And that's all driving the use to actually see that through. So they don't have to have the complexity and talk about the private equity play. and innovation, of the core products. and that's going to drive So talk about the TAM. So one of the things and sort of returned to those roots. all those areas are key to what we do. and the customer was talking about scripts So that's really core to what we're doing. like the New England Patriots. for Tom Brady. into the marketplace. of the Fortune 100, are our customers. So the first nine months We're continuing to invest, You get a lot of leverage out of that. On the new role. This is Dave Vellante,

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Greg Hughes | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Amazon Web Services | ORGANIZATION | 0.99+
Veritas | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
BMC | ORGANIZATION | 0.99+
2003 | DATE | 0.99+
Syncsort | ORGANIZATION | 0.99+
Greg | PERSON | 0.99+
50% | QUANTITY | 0.99+
$17 billion | QUANTITY | 0.99+
2010 | DATE | 0.99+
Tom Brady | PERSON | 0.99+
nine months | QUANTITY | 0.99+
8% | QUANTITY | 0.99+
Henry Axelrod | PERSON | 0.99+
Pure Storage | ORGANIZATION | 0.99+
10 | QUANTITY | 0.99+
95% | QUANTITY | 0.99+
Semantic | ORGANIZATION | 0.99+
15 years | QUANTITY | 0.99+
Giselle | PERSON | 0.99+
NBCUniversal | ORGANIZATION | 0.99+
Veritas Vision | ORGANIZATION | 0.99+
TAM | ORGANIZATION | 0.99+
Dell | ORGANIZATION | 0.99+
New England Patriots | ORGANIZATION | 0.99+
Infor | ORGANIZATION | 0.99+
100 day | QUANTITY | 0.99+
100 different accounts | QUANTITY | 0.99+
Silver Lake | LOCATION | 0.99+
24 | QUANTITY | 0.99+
around $17 billion | QUANTITY | 0.99+
first nine months | QUANTITY | 0.99+
Central Park | LOCATION | 0.99+
third | QUANTITY | 0.99+
Veritas Vision Day | EVENT | 0.98+
Today | DATE | 0.98+
nine month | QUANTITY | 0.98+
today | DATE | 0.98+
nine | QUANTITY | 0.98+
each cloud | QUANTITY | 0.98+
one | QUANTITY | 0.97+
Pure Storage | ORGANIZATION | 0.97+
Enterprise Product Group | ORGANIZATION | 0.97+
second thing | QUANTITY | 0.97+
seven | QUANTITY | 0.96+
one player | QUANTITY | 0.96+
NetBackup 812 | TITLE | 0.95+
Vision Solution Day | EVENT | 0.95+
18 months | QUANTITY | 0.94+
12 | QUANTITY | 0.94+
Central Park, New York | LOCATION | 0.93+
25 years | QUANTITY | 0.92+
Cassandra | PERSON | 0.91+
ServiceNow | TITLE | 0.91+
Tavern on the Green | LOCATION | 0.9+
NYC | LOCATION | 0.84+
Hadoop | PERSON | 0.82+
one provider | QUANTITY | 0.81+
PE | ORGANIZATION | 0.75+
NetBackup | ORGANIZATION | 0.73+
Veritas Vision Solution Day | EVENT | 0.72+
theCUBE | ORGANIZATION | 0.68+
2018 | EVENT | 0.6+
MongoDB | TITLE | 0.58+
Open Sequel | ORGANIZATION | 0.57+
Riverbed | ORGANIZATION | 0.55+
Fortune | ORGANIZATION | 0.48+
100 | QUANTITY | 0.28+

David Raffo, TechTarget Storage | Veritas Vision Solution Day NYC 2018


 

>> From Tavern on the Green in Central Park, New York, it's theCUBE, covering Veritas Vision Solution Day. Brought to you by Veritas. >> Hi everybody, welcome back to Tavern on the Green. We're in the heart of Central Park in New York City, the Big Apple. My name is Dave Vellante and you're watching theCUBE, the leader in live tech coverage. We're here at the Veritas Solution Day #VtasVision. Veritas used to have a big main tent day where they brought in all the customers. Now they're going out, belly-to-belly, 20 cities. Dave Raffo is here, he's the editorial director for TechTarget Storage. Somebody who follows this space very closely. David, good to see you, welcome to theCUBE. >> Yeah, it's great to be on theCUBE. I always hear and watch you guys but never been on before. >> Well you're now an alum, I got to get him a sticker. So, we were talking about VMworld just now, and that show, last two years, one of the hottest topics anyway, was cloud, multi-cloud, Kubernetes of course was a hot topic. But, data protection was right up there. Why, in your view, is data protection such a hot topic right now? >> Well there's a lot of changes going on. First of all, couple years ago it was backup, nobody calls it backup anymore, right. The whole market is changing. Data protection, you have newer guys like Cohesity and Rubrik, who came out with a, you know, architecture. They're basically, from scratch, they built scale-out and that's changing the way people look at data protection. You have all of the data protection guys, the Dell EMC, CommVault, Veeam, they're all kind of changing a little. And Veritas, the old guys, have been doing it forever. And now they're changing the way that they're reacting to the competition. The cloud is becoming a major force in where data lives, and you have to protect that. So there's a lot of changes going on in the market. >> Yeah I was talking to a Gartner analyst recently, he said their data suggested about 2/3 of the customers that they talk to, within the next, I think, 18 months, are going to change their backup approach or reconsider how they do backup or data protection as it were, as you just said. What do you think is driving that? I mean, people cite digital transformation, they cite cloud, they cite big data, all the buzz words. You know, where there's smoke, there's fire, I guess. But what are your thoughts?
They're not sure yet, but they want data protection, data management that will kind of fit in no matter which direction they go. It's kind of, you know, we know we're looking at where we're going to be in five years and now we want to know how we're going to protect, how we're going to manage our data, how we're going to use it, move it from cloud to cloud. So, you know, it's kind of like, it's a lot of positioning going on now. A lot of planing for the future. And they're trying to figure out what's the best way they're going to be able to do all this stuff. >> Yeah, so, you know the hot thing, it used to be, like you said, backup. And then of course, people said backup is one thing, recovery is everything. You know, so it was the old bromide, my friend Fred Moore, I think coined that term, back in the old storage tech days. But when you think about cloud, and you think about the different cloud suppliers, they've all got different approaches, they're different walled gardens, essentially. And they've got different processes for at least replicating, backing up data. Where do you see customers, in terms of having that sort of single abstraction layer, the single data protection philosophy or strategy and set of products for multi-cloud? >> Well, where they are is they're not there, and they're, you know, far from it, but that's where they want to be. So, that's where a lot of the vendor positioning is going. A lot of the customers are looking to do that. But another thing that's changing it is, you know, people aren't using Oracle, SQL databases all the time anymore either. They're using the NoSQL MongoDB. So that change, you know, you need different products for that too. So, the whole, almost every type of product, hyper-converged is changing backup. So, you know, all these technologies are changing the way people actually are going to protect their data. >> So you look at the guys with the big install base, obviously Veritas is one, guys like IBM, certainly CommVault and there are others that have large install bases. And the new guys, the upstarts, they're licking their chops to go after them. What do you see as, let's take Veritas as an example, the vulnerabilities and the strengths of a company like that? >> So the vulnerabilities of an old company that's been around forever is that, the newer guys are coming with a clean sheet of paper and coming up and developing their products around technologies that didn't exist when NetBackup was created, right. So the strength is that, for Veritas, they have huge install base. They have all the products, technology they need. They have a lot of engineers so they can get to the board, drawing board, and figure it out and add stuff. And what they're trying to do is build around NetBackup saying all these companies are using NetBackup, so let's expand that, let's build archiving in, let's build, you know, copy data protect, copy data management into that. Let's build encryption, all of that, into NetBackup. You know, appliances, they're going farther, farther and farther into appliances. Seems like nobody wants to just buy backup software, and backup hardware as separate, which they were forever. So you know, we're seeing the integration there. >> Well that brings up another good point, is you know, for years, backup's been kind of one size fits all. So that meant you were either over protected, or under protected. It was maybe an after thought, a bolt-on, you put in applications, put it in a server, an application on top of it. 
You know, install Linux, maybe some Oracle databases. All of a sudden, oh, we got to back this thing up. And increasingly, people are saying, hey, I don't want to just pay for insurance, I'd like to get more value. And so, you're hearing a lot of talk about governance, certainly security, ransomware is now a big topic, analytics. What are you seeing, in terms of some of those additional value adds beyond that, is it still just insurance, or are we seeing incremental value to customers? >> Yeah, well I think everybody wants incremental value. They have the data, now it's not just, like you said, insurance. It's like how is this going to, how am I going to use this data? How's it going to help my business? So, the analytics is a big thing that everybody's trying to get in. You know, primary and secondary storage, everybody's adding analytics. AI, how we use AI, machine learning. You know, how we're going to back up data from the edge, into and out of things. What are we going to do with all this data? How are we going to collect it, centralize it, and then use it for our business purposes? So there's, you know, it's a wide open field. Remember it used to be, people would say backup, nobody ever changes their backup, nobody wants to change backup. Now surveys are saying within the next two years or so, more than 50% of people are looking to either add a backup product, or just change out their whole backup infrastructure. >> Well that was the interesting thing about, you know, the ascendancy of Data Domain, as you recall, you were following the company back then. The beauty of that architecture was, you don't have to change your backup processes. And now, that's maybe a challenge for a company like that. Where people are, because of digital, because of cloud, they're actually looking to change their backup processes. Not unlike, although there are differences, but a similar wave, remember the early days of virtualization, you had, you're losing physical resources, so you had to rethink backup. Are you seeing similar trends today, with cloud, and digital? >> Yeah, the cloud, containers, microservices, things like that, you know, how do you protect that data? You know people, some people are still struggling with virtualization, you know, like, there's so many more VMs being created so quickly, and that you know, a lot of the backup products still haven't caught up to that. So, I mean Veeam has made an awfully great business around dealing with VM backup, right? >> Right. >> Where was everybody else before that? Nobody else could do it. >> We storage guys, we're like the cockroaches of the industry. We're just this, storage just doesn't seem to die. You know the joke is, there's a hundred people in storage and 99 seats. But you've been following it for a long time. Yeah, you see all the hot topics like cloud and multi-cloud and digital transformation. Are you surprised at the amount of venture capital over the last, you know, four or five years, that has flooded into storage, that continues to flood into storage? And you see some notable successes, sure some failures, but even those failures, you're seeing the CEOs come out and sell to new companies and you're seeing the rise of a lot of these startups and a lot of these unicorns. Does it surprise you, or is that kind of your expectation? >> Well, I mean, like you said, that's the way it's always been in storage. When you look at storage compared to networking and compute, how many startups are there in those other areas.
Very few, but storage keeps getting funded. A couple of years ago, I used to joke, if you said I do Flash, people would just throw hundreds of millions of dollars at you, then it was cloud. There always seems to be a hot topic, a hot spot, that you can get money from VCs. And there's always four or five, at least, storage vendors who are in that space. >> Yeah, the cloud, the storage cloud AI blockchain company is really the next unicorn, right? >> Right, yeah, if you know the right buzz words you can get money. And there's never just one, right, there's always a couple in that same area and then one or two make it. >> Yeah, and or, if you've done it before, right, you're seeing that a lot. I mean, you see what the guys like, for instance, at Datrium are doing. Brian Biles, he did it at Data Domain, and now he's, they just did a giant raise. >> Qumulo. >> Yeah, you know, Qumulo, for sure. Obviously the Cohesitys are sort of well known, in terms of how they've done giant raises. So there's a massive amount of capital now pouring in, much of which will go into innovation. It's kind of, it's engineering and it's, you know, go to market and marketing. So, you know, no doubt that that innovation curve will continue. I guess you can't bet against data growth. >> Right, you know, yeah, right, everybody knows data is going to grow. They're saying it's the new oil, right. Data is the big thing. The interesting thing with the funding stuff now is the, not the new companies, but the companies that have been around a little bit, and it's now time for them to start showing revenue. And where in the past it was easier for them to get money, now it seems a little tougher for those guys. So, you know, we could see more companies go away without getting bought up or go public, but-- >> Okay, great. Dave, thanks very much for coming on theCUBE. >> Alright. >> It was great to have you. >> Thanks for having me on. >> Alright, keep it right there everybody. We'll be back with our next guest. You're watching theCUBE from Veritas Vision in Central Park. We'll be right back. (theCUBE theme music)

Published Date : Oct 11 2018


David Noy, Veritas | Veritas Vision Solution Day NYC 2018


 

>> From Tavern on the Green in Central Park, New York, it's theCUBE, covering Veritas Vision Solution Day. Brought to you by Veritas. >> Welcome back to New York City, everybody. We're in the heart of Central Park at the Tavern on the Green, beautiful location here, a lot of customers coming in to see and hear Veritas Solution Days. We talked to Scott earlier about sort of why these solution days, very intimate customer events around the world. David Noy is here, he's the vice president of software-defined storage and appliances at Veritas. David, thanks for coming on. >> Oh, thanks for having me. >> You're very welcome. So wait, appliances? You guys are a software company, what's going on? >> We are a software company, and we have been a software company for a very long time, and we will continue to be a software company for a very long time. But what we find is that, you know, customers oftentimes, they want to deploy software, but then they find that there's a lot of additional challenges that come with that. There's the maintenance of the actual server infrastructure, there's patching of the operating system, there's vulnerabilities that show up. Those additional operational costs sometimes outweigh the benefits of just buying a purpose-built appliance. And those purpose-built appliances can sometimes, you know, have workflows built in to them that just make them so easy to use that, you know, one person could potentially operate petabytes of an appliance versus a whole army of people trying to maintain hundreds or thousands of individual servers. >> So you put out a stat this morning which I wasn't aware of. You guys have more than half the market, I think it was an IDC stat, maybe it was Gartner, more than half the market for integrated, purpose-built backup appliances, is that right? >> That's right. So for backup appliances, purpose-built backup appliances that actually host the backup software, we have more than 50% market share. People have that much trust in the appliances that we build and find that simplicity so compelling that they want to buy it in that form factor from us. >> You know, about a decade ago I wrote a piece in the early days of Wikibon talking about services-oriented storage, and when we think about software-defined storage, what you described today was actually sets of granular services that I can invoke when I need them, or not if I don't need them, very cloud-like. I mean, I could replace the S in software-defined with S for services, and that's your kind of philosophy and approach, isn't it? >> It is. In fact, what we find is that people want data services on top of that data. When you're protecting enterprise data, and you're protecting exabytes of enterprise data the way Veritas does, having it just sit there and do nothing is kind of wasteful to a large extent, unless you're just waiting for a disaster to occur, or data corruption, or something like that. If we can start to think about what are the governance capabilities, lineage, audit, all of the different things that we could potentially build as services that then can be downloaded onto that purpose-built backup appliance, those all become added value to the customer. >> But what you described was not just another, you're not just dropping in another stovepipe appliance. You talked about having visibility across the entire portfolio. You really stressed that a lot, you said several times you can't get this from any other backup vendor, or any other vendor really. Talk about that a little bit. >> Well, what's interesting is that, look, from the moment that data is
actually born from a primary application, and then it's protected, it's protected into a backup solution, and then it's probably put into some sort of a storage solution, maybe a deduplication storage solution, then it's moved to an even cheaper tier and eventually off to the cloud or somewhere else. Each of those are disparate. If we can actually build a connector framework that can actually extract that information and bring it all together, so that we can start to make assertions about, hey, how did this data flow, how did it originate, what kind of information do I have, do I even need it anymore after seven years, should I purge it or should I delete it, is it even a risk to my organization to keep it? You can only do that when you can make those associations across the lifecycle of that data. And so we track the data through its entire lifecycle, and that's only through the integration of all that product portfolio, and that's something that you're not going to see with the small point solutions that are being built by startups, and it's even very rare to see in some of the larger companies that build these solutions. >> You know, I'm glad you mentioned that about getting rid of data, because so often today in the news media, and you hear in vendor presentations, people talk about keeping data forever. That's dangerous in a lot of cases. A lot of general counsels out there don't want to keep data forever. There's data that you want to delete, if in fact you can, because of the compliance risks that it brings to your company. You don't want to keep working process or some rogue email that floats around the organization. Get rid of that, keep what you have to, and get rid of the rest. >> Right, and then the problem is, if you don't know what you have, and also you don't know how many places that data is propagated, how can you possibly delete it all? We've helped customers, in some cases that I'm not going to mention who they are, but we've helped them delete up to 50 percent of their data after it's aged out. And I've talked to banks before, and I've asked them, like, hey, after seven years, what do you do with your data? They say, well, we just keep it, because we don't know what it was originated for, we don't even know what's in it, and therefore it's too risky for us to go and delete. But at the same time, to your point, it may be even too risky to keep in some cases, especially a liability to keep that data. >> And what's the technology enabler there? Is that your catalog, your sort of copy data management? >> So it's a combination of things. The catalog is what helps us understand what we have and where it is and how it's actually moved through that lifecycle. And then we have a component called the Veritas Information Classifier, and that component allows us to crack open the data and actually determine what's inside, whether it's personally identifiable information, social security numbers, there's a number of different patterns we can look for, actually document types, and we can actually tag that data and say, hey, this data has information that pertains to specific individuals. So for example, if I'm following GDPR rules, I can now find out all the data, where it's propagated, specific to an individual who said, now I want all my stuff deleted, and that's a very powerful technique. >> So it's not just the data, it's the metadata associated with that. That I think is a unique capability in terms of being integrated into a solution, and so that's cool. I also want to talk about Acme
Financial Services, yeah, an artificial company, or a real company but an anonymized company, that you talked about moving to your system, your appliance-based system. They were able to reduce TCO by 40%, shave two thirds of their hardware infrastructure away. Come back to that, I have sort of a tongue-in-cheek there. Get rid of tape, at least where possible, no new tape I think is what you said. >> That's right. >> And then save 20 million dollars a year in reduced downtime costs. So my tongue-in-cheek is, everybody remembers the no-hardware-agenda signs, you know, I live in Massachusetts, we used to see those right next to the EMC facilities. And your only hardware agenda, it seems, with your appliances, is to get rid of hardware. >> That's exactly right. Look, we are actually putting out technology that allows you to take ten heads or ten servers and consolidate it down to two or three to make the total cost of ownership for your product less, because at the end of the day it's in our benefit, right, we're a software vendor, we want to maintain ourselves as a software vendor. We want to take hardware out of the equation to the extent that's possible, but we don't want to do it at the expense of simplicity, and so striking that balance is what's most important. >> Yeah, and you also talked about, you showed a little leg, if you will, on a roadmap. >> That's right. >> One of the things that struck me, and it's sort of there today, but even more in the future, is the ability to scale compute and storage independently and in more granular chunks. Explain what that is and why that's important to customers. >> Well, you know, if you think about it, the way that these integrated backup appliances work, even backup people just dismiss backup appliances, they just grow and grow and grow to a certain capacity, they scale up and scale up, and at some point the performance just starts to tank and taper off. So what if you could actually grow them almost in a node-based architecture? Think about it as compute and storage that you grow together, and as you add more compute, you add more storage, and so that means that I can do more microservices, or I can provide better deduplication, but my deduplication doesn't slow down when I go from two petabytes to four petabytes, because my compute has actually grown in lockstep with my storage. >> You made a big deal about eliminating tape where possible, and you also, and I want to push at this a little bit, talked about the economics of the solution relative to tape. I was somewhat surprised, because the conventional wisdom would say tape is, you know, pennies on the dollar compared to disk-based solutions. How is it that you're able to make that claim? >> Well, there's a couple of different things that come into play. Number one is that, again, through the CloudCatalyst capability of NetBackup, we can actually keep data deduplicated before we send it to our disk-based solution, which means that it stays in some cases 50 to one deduplicated, and you're not necessarily going to get that capability on tape, you'd have to rehydrate. >> Don't have to rehydrate. >> So if you talk about pennies, well, multiply that by 50, because if your data deduplicates that much, that's the kind of thing you're talking about. The second part is the operational cost of actually maintaining tape. Now, if I'm keeping data for seven years, thirty years, the lifetime of a patient, that tape infrastructure ages out and I'm doing tape migrations all the time. Those are not cheap, and sometimes the tape infrastructure's not even available anymore, the
other compatibility is not there. >> That's right. The other thing is, well, the other thing is the network costs, right? If you're going to be pushing stuff over the network-- >> Absolutely, your network-- >> And you're pushing bigger things over the network, you have more network infrastructure just for the purposes of moving data from one tier to another tier. It's wasteful. >> David, you talked about the Flex appliance. What I took away from that is it sort of allows for services-oriented deployment, fast migration, it allows you to sort of test out new services to see whether or not you like it. What is that? What does the customer have to do to exploit that capability? >> Today the customer would buy our high-end backup appliance, which is the 5340. They would buy the Flex software, which is a software package that allows them to basically augment that appliance so that it can actually maintain that catalog and then quickly deploy those services. As I showed in the demo, in three minutes or less we can deploy a service from the service catalog. In the future, you should expect that we will begin to build that into all of our appliances. That's just the way of doing things, it becomes a service-oriented architecture, and that catalog is just going to be a natural way of us operating. >> Okay. I want to also ask you about another capability that you discussed, which was your ability to look across the portfolio and identify predictive failures. Yeah, everybody talks about machine intelligence being used, you know, in that use case, IOT, you hear about that a lot. What are you guys doing, what's unique? >> Well, you know, in some cases what we're doing is not completely unique. I mean, there are some companies that have done this pretty well, they've done it for their own point solutions. I think where it gets interesting is when you say, look, we're not just building a point solution, we have a number of different products, again, for the entire lifecycle of data from the moment it's born, and if I can integrate all of the telemetry that I get from those different products, I can now start to get predictive about things that might have happened not just at one stage within one product, but might happen down the road when I go to move it into my long-term retention. Is my long-term retention ready for it? Is it going to impact the performance of my long-term retention solution? And so therefore should I think about scaling my long-term retention solution independent of, you know, or ahead of the actual growth of my purpose-built backup appliance, right? >> So it's that portfolio view that makes it so powerful. >> Right. >> It's, okay, two things actually, the portfolio view and also the full lifecycle view as well. >> That's right. >> Something you've been hitting on, not just a point product. All right, David, I know you're jammed with a lot of customer conversations here in New York, so I've got to let you go, but thanks so much for stopping by theCUBE, appreciate it. >> My pleasure, I really appreciate your time, thank you. >> You're welcome. All right, keep it right there everybody, you're watching theCUBE live from Veritas Solution Days in New York City, right in the heart of Central Park. We'll be right back after this short break. [Music]
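The pattern-based classification Noy describes here, crack the data open, look for patterns like social security numbers, and tag anything that pertains to a specific individual so a GDPR erasure request can be acted on, can be sketched generically. The snippet below is only a minimal illustration of that tagging step, not the Veritas Information Classifier itself; the patterns, the sample document, and the tag names are hypothetical stand-ins.

```python
import re

# Hypothetical patterns; a real classifier ships far richer rules plus document-type detection.
PII_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def classify_document(text: str) -> dict:
    """Return a tag -> match-count map for one document's text."""
    hits = {tag: pattern.findall(text) for tag, pattern in PII_PATTERNS.items()}
    return {tag: len(found) for tag, found in hits.items() if found}

# Tag a made-up document so downstream policy (retain, purge, or honor a GDPR
# erasure request) can act on anything tied to an individual.
doc = "Customer 123-45-6789 disputed a charge, contact jane@example.com"
print(classify_document(doc))  # {'us_ssn': 1, 'email': 1}
```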

Published Date : Oct 11 2018

**Summary and Sentiment Analysis are not shown because of an improper transcript**


CUBE Highlights | theCUBE NYC 2018


 

>> The foundation of having that data management platform is absolutely fundamental and necessary to do good machine learning. Without good data, without good data management, you can't do good ML or AI. Sounds sort of simple, but very true. >> The nature and velocity of data has evolved in the last five years. >> Screens taking over the world and being in charge of you, and us being dominated by them, as often we say in culture, now it's about having this really beautiful interface between technology objects. >> I can take the traditional tools, using, like, Jupyter notebooks, Spark, TensorFlow, you know, those packages, with Kubernetes on top of the databases as a service and some object stores, I have a much easier stack to work in. >> Enable everyone to make data-driven decisions, but make sure that they're interpreting that data in the right way, right? Give them enough guidance, don't let them just kind of attack the... well. [Music]

Published Date : Sep 19 2018

**Summary and Sentiment Analysis are not shown because of an improper transcript**


Josh Rogers, Syncsort | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE, covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Okay, welcome back, everyone. We're here live in New York City for CUBE NYC. This is our ninth year covering the big data ecosystem, now it's AI, machine-learning, used to be Hadoop, now it's growing, ninth year covering theCUBE here in New York City. I'm John Furrier, with Dave Vellante. Our next guest, Josh Rogers, CEO of Syncsort. I'm going back, long history in theCUBE. You guys have been on every year. Really appreciate chatting with you. Been fun to watch the evolution of Syncsort and also get the insight. Thanks for coming on, appreciate it. >> Thanks for having me. It's great to see you. >> So you guys have constantly been on this wave, and it's been fun to watch. You guys had a lot of IP in your company, and then just watching you guys kind of surf the big data wave, but also make some good decisions, made some good calls. You're always out front. You guys are on the right parts of the wave. I mean now it's cloud, you guys are doing some things. Give us a quick update. You guys got a brand refresh, so you got the new logo goin' on there. Give us a quick update on Syncsort. You got some news, you got the brand refresh. Give us a quick update. >> Sure. I'll start with the brand refresh. We refreshed the brand, and you see that in the web properties and in the messaging that we use in all of our communications. And, we did that because the value proposition of the portfolio had expanded so much, and we had gained so much more insight into some of the key use cases that we're helping customers solve that we really felt we had to do a better job of telling our story and, probably most importantly, engage with the more senior level within these organizations. What we've seen is that when you think about the largest enterprises in the world, we offer a series of solutions around two fundamental value propositions that tend to be top of mind for these executives. The first is how do I take the 20, 30, 40 years of investment in infrastructure and run that as efficiently as possible. You know, I can't make any compromises on the availability of that. I certainly have to improve my governance and secureability of that environment. But, fundamentally, I need to make sure I could run those mission-critical workloads, but I need to also save some money along the way, because what I really want to do is be a data-driven enterprise. What I really want to do is take advantage of the data that gets produced in these transactional applications that run on my AS400 or IBM I-infra environment, my mainframe environment, even in my traditional data warehouse, and make sure that I'm getting the most out of that data by analyzing it in a next-generation set of-- >> I mean one of the trends I want to get your thoughts on, Josh, cause you're kind of talking through the big, meagatrend which is infrastructure agnostic from an application standpoint. So the that's the trend with dev ops, and you guys have certainly had diverse solutions across your portfolio, but, at the end of the day, this is the abstraction layer customers want. They want to run workloads on environments that they know are in production, that work well with applications, so they almost want to view the infrastructure, or cloud, if you will, same thing, as just agnostic, but let the programmability take care of itself, under the hood, if you will. 
>> Right, and what we see is that people are absolutely kind of into extending and modernizing existing applications. This is in the large enterprise, and those applications and core components will still run on mainframe environments. And so, what we see in terms of use cases is how do we help customers understand how to monitor that, the performance of those applications. If I have a tier that's sitting on the cloud, but it's transacting with the mainframe behind the firewall, how do I get an end-to-end view of application performance? How do I take the data that ultimately gets logged in a DB2 database on the mainframe and make that available in a next-generation repository, like Hadoop, so that I can do advanced analytics? When you think about solving both the optimization and the integration challenge there, you need a lot of expertise in both sides, the old and the new, and I think that's what we uniquely offer. >> You guys done a good job with integration. I want to ask quick question on the integration piece. Is this becoming more and more table stakes, but also challenging at the same time? Integration and connecting systems together, if their stateless, is no problem, you use APIs, right, and do that, but as you start to get data that needs state information, you start to think to think about some of the challenges around different, disparate systems being distributed, but networked, in some cases, even decentralized, so distributed networking is being radically changed by the data decisions on the architecture, but also integration, call it API 2.0 or this new way to connect and integrate. >> Yeah, so what we've tried to focus on is kind of solving that piece between these older applications that run these legacy platforms and making them available to whatever the consumer is. Today, we see Kafka and in Amazon we see Kinesis as kind of key buses delivering data as a service, and so the role that we see ourselves playing and what we announced this week is an ability to track changed data, deliver it in realtime in these older systems, but deliver it to these new targets: Kafka, Kinesis, and whatever comes next. Because really that's the fundamental partner we're trying to be to our customers is we will help you solve the integration challenge between this infrastructure you've been building for 30 years and this next-generation technology that lets you get the next leg of value out of your data. >> So Jim, when you think about the evolution of this whole big data space, the early narrative in the trade press was, well, NoSQL is going to replace Oracle and DB2, and the data lake is going to replace the EDW, and unstructured data is all that matters, and so forth. And now, you look at what's really happened is the EDW is a fundamental component of making decisions and insights, and SQL is the killer app for Hadoop. And I take an example of say fraud detection, and when you think and this is where you guys sit in the middle from the standpoint of data quality, data integration, in order to do what we've done in the past 10 years take fraud detection down from well, I look at my statement a month or two later and then call the credit card company, it's now gone to a text that's instantaneous. Still some false positives, and I'm sure working on that even. So maybe you could describe that use case or any other, your favorite use case, and what your role is there in terms of taking those different data sources, integrating them, improving the data quality. 
>> So, I think when you think about a use case where I'm trying to improve the SLA or the responsiveness of how do manage against or detect fraud, rather than trying to detect it on a daily basis, I'm trying to detect it at transaction time. The reality is you want to leverage the existing infrastructure you have. So if you have a data warehouse that has detailed information about transaction history, maybe that's a good source. If you have an application that's running on the mainframe that's doing those transaction realtime, the ultimate answer is how do I knit together the existing infrastructure I have and embed the additional intelligence and capability I need from these new capabilities, like, for example, using Kafka, to deliver a complete solution. What we do is we help customers kind of tie that together, Specifically, we announced this integration I mentioned earlier where we can take a changed data element in a DB2 database and publish it into Kafka. That is a key requirement in delivering this real-time fraud detection if I in fact am running transactions on a mainframe, which most of the banks are. >> Without ripping and replacing >> Why would you want to rip out an application >> You don't. >> your core customer file when you can just extend it. >> And you mentioned the Cloudera 6 certification. You guys have been early on there. Maybe talk a little about that relationship, the engineering work that has to get done for you to be able to get into the press release day one. >> We just mentioned that my first time on theCUBE was in 2013, and that was on the back of our initial product release in the big data world. When we brought the initial DMX-h release to market, we knew that we needed to have deep partnerships with Cloudera and the key platform providers. I went and saw Mike Olson, I introduced myself, he was gracious enough to give me an hour, and explain what we thought we could do to help them develop more value proposition around their platform, and it's been a terrific relationship. Our architecture and our engineering and product management relationship is such that it allows us to very rapidly certify and work on their new releases, usually within a couple a days. Not only can customers take advantage of that, which is pretty unique in the industry, but we get some some visibility from Cloudera as evidenced by Tendu's quote in the press release that was released this week, which is terrific. >> Talk about your business a little bit. You guys are like a 50-year old startup. You've had this really interesting history. I remember you from when I first started in the industry following you guys. You've restructured the company, you've done some spin outs, you've done some M and A, but it seems to be working. Talk about growth and progress that you're making. >> We're the leader in the Big Iron to Big Data market. We define that as allowing customers to optimize their traditional legacy investments for cost and performance, and then we help them maximize the value of the data that get generated in those environments by integrating it with next-generation analytic environments. To do that, we need a broad set of capability. There's a lot of different ways to optimize existing infrastructure. One is capacity management, so we made an acquisition about a year ago in the capacity management space. We're allowing customers to figure out how do I make sure I've got not too much and not too little capacity. That's an example of optimization. 
Another area of capability is data quality. If I'm maximize the value of the data that gets produced in these older environments, it would be great that when it lands in these next-generation repositories it's as high quality as possible. We acquired Trillium about a year ago, or actually coming up >> How's that comin'? >> on two years ago and we think that's a great capability for our customers It's going terrific. We took their core data quality engine, and now it runs natively on a distributed Hadoop infrastructure. We have customers leveraging it to deliver unprecedented volume of matching, so not only breakthrough performance, but this whole notion of write once, run anywhere. I can run it on an SMP environment. I can run it on Hadoop. I can run it Hadoop in the cloud. We've seen terrific growth in that business based on our continued innovation, particularly pointing it at the big data space. >> One of the things that I'm impressed with you guys is you guys have transformed, so having a transformation message to your customers is you have a lot of credibility, but what's interesting is is that the world with containers and Kubernetes now and multi-cloud, you're seeing that you don't have to kill the legacy to bring in the new stuff. You can see you can connect systems, when you guys have done with legacy systems, look at connect the data. You don't have to kill that to bring in the new. >> Right >> You can do cloud-native, you can do some really cool things. >> Right. I think there's-- >> This rip and replace concept is kind of going away. You put containers around it too. That helps. >> Right. It's expensive and it's risky, so why do that. I think that's the realization. The reality is that when people build these mission-critical systems, they stay in place for not five years, but 25 years. The question is how do you allow the customers to leverage what they have and the investment they've made, but take advantage of the next wave, and that's what we're singularly focused on, and I think we're doing a great job of that, not just for customers, but also for these next-generation partners, which has been a lot of fun for us. >> And we also heard people doing analytics they want to have their own multi-tenent, isolated environments, which goes to don't screw this system up, if it's doing a great job on a mission-critical thing, don't bundle it, just connect it to the network, and you're good. >> And on the cloud side, we're continuing to look at our portfolio and say what capabilities will customers want to consume in a cloud-delivery model. We've been doing that in the data quality space for quite awhile. We just launched and announced over the last about three months ago capacity management as a service. You'll continue to see, both on the optimization side and on the integration side, us continuing to deliver new ways for customers to consume the capabilities they need. >> That's a key thing for you guys, integration. That's pretty much how you guys put the stake in the ground and engineer your activities around integration. >> Yeah, we start with the premise that your going to need to continue to run this older investments that you made, and you're going to need to integrate the new stuff with that. >> What's next? What's goin' on the rest of the year with you guys? >> We'll continue to invest heavily in the realtime and changed-data capture space. We think that's really interesting. We're seeing a tremendous amount of demand there. 
We've made a series of acquisitions in the security space. We believe that the ability to secure data in the core systems and its journey to the next-generation systems is absolutely critical, so we'll continue to invest there. And then, I'd say governance, that's an area that we think is incredibly important as people start to really take advantage of these data lakes they're building, they have to establish real governance capabilities around those. We believe we have an important role to play there. And there's other adjacencies, but those are probably the big areas we're investing in right now. >> Just continuing to move the ball down the field in the Syncsort cadence of acquisitions, organic development. Congratulations. Josh, thanks for comin' on. To John Rogers, CEO of Syncsort, here inside theCUBE. I'm John Furrier with Dave Vellante. Stay with us for more big data coverage, AI coverage, cloud coverage here. Part of CUBE NYC, we're in New York City live. We'll be right back after this short break. Stay with us. (techno music)
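The change-data-capture flow Rogers describes, changes in a DB2 table published onto a Kafka (or Kinesis) topic so a downstream service can react in near real time, for example to score card transactions for fraud, looks roughly like the sketch below on the consuming side. The topic name, event fields, and the toy rule are hypothetical, and this is a generic kafka-python consumer, not Syncsort's connector or API.

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

# Subscribe to a hypothetical CDC topic that a mainframe-side connector feeds.
consumer = KafkaConsumer(
    "db2.cards.transactions.cdc",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    auto_offset_reset="latest",
)

def looks_fraudulent(change: dict) -> bool:
    # Placeholder rule; a production system would call a trained model here.
    return change.get("amount", 0) > 5000 and change.get("country") != change.get("home_country")

for record in consumer:
    change = record.value
    if change.get("op") == "INSERT" and looks_fraudulent(change):
        print("flagging transaction", change.get("txn_id"))
```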

Published Date : Sep 17 2018


Basil Faruqui, BMC | theCUBE NYC 2018


 

(upbeat music) >> Live from New York, it's theCUBE. Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Okay, welcome back everyone to theCUBE NYC. This is theCUBE's live coverage covering CubeNYC Strata Hadoop Strata Data Conference. All things data happen here in New York this week. I'm John Furrier with Peter Burris. Our next guest is Basil Faruqui lead solutions marketing manager digital business automation within BMC returns, he was here last year with us and also Big Data SV, which has been renamed CubeNYC, Cube SV because it's not just big data anymore. We're hearing words like multi cloud, Istio, all those Kubernetes. Data now is so important, it's now up and down the stack, impacting everyone, we talked about this last year with Control M, how you guys are automating in a hurry. The four pillars of pipelining data. The setup days are over; welcome to theCUBE. >> Well thank you and it's great to be back on theCUBE. And yeah, what you said is exactly right, so you know, big data has really, I think now been distilled down to data. Everybody understands data is big, and it's important, and it is really you know, it's quite a cliche, but to a larger degree, data is the new oil, as some people say. And I think what you said earlier is important in that we've been very fortunate to be able to not only follow the journey of our customers but be a part of it. So about six years ago, some of the early adopters of Hadoop came to us and said that look, we use your products for traditional data warehousing on the ERP side for orchestration workloads. We're about to take some of these projects on Hadoop into production and really feel that the Hadoop ecosystem is lacking enterprise-grade workflow orchestration tools. So we partnered with them and some of the earliest goals they wanted to achieve was build a data lake, provide richer and wider data sets to the end users to be able to do some dashboarding, customer 360, and things of that nature. Very quickly, in about five years time, we have seen a lot of these projects mature from how do I build a data lake to now applying cutting-edge ML and AI and cloud is a major enabler of that. You know, it's really, as we were talking about earlier, it's really taking away excuses for not being able to scale quickly from an infrastructure perspective. Now you're talking about is it Hadoop or is it S3 or is it Azure Blob Storage, is it Snowflake? And from a control-end perspective, we're very platform and technology agnostic, so some of our customers who had started with Hadoop as a platform, they are now looking at other technologies like Snowflake, so one of our customers describes it as kind of the spine or a power strip of orchestration where regardless of what technology you have, you can just plug and play in and not worry about how do I rewire the orchestration workflows because control end is taking care of it. >> Well you probably always will have to worry about that to some degree. But I think where you're going, and this is where I'm going to test with you, is that as analytics, as data is increasingly recognized as a strategic asset, as analytics increasingly recognizes the way that you create value out of those data assets, and as a business becomes increasingly dependent upon the output of analytics to make decisions and ultimately through AI to act differently in markets, you are embedding these capabilities or these technologies deeper into business. They have to become capabilities. 
They have to become dependable. They have to become reliable, predictable, cost, performance, all these other things. That suggests that ultimately, the historical approach of focusing on the technology and trying to apply it to a periodic or series of data science problems has to become a little bit more mature so it actually becomes a strategic capability. So the business can say we're operating on this, but the technologies to take that underlying data science technology to turn into business operations that's where a lot of the net work has to happen. Is that what you guys are focused on? >> Yeah, absolutely, and I think one of the big differences that we're seeing in general in the industry is that this time around, the pull of how do you enable technology to drive the business is really coming from the line of business, versus starting on the technology side of the house and then coming to the business and saying hey we've got some cool technologies that can probably help you, it's really line of business now saying no, I need better analytics so I can drive new business models for my company, right? So the need for speed is greater than ever because the pull is from the line of business side. And this is another area where we are unique is that, you know, Control M has been designed in a way where it's not just a set of solutions or tools for the technical guys. Now, the line of business is getting closer and closer, you know, it's blending into the technical side as well. They have a very, very keen interest in understanding are the dashboards going to be refreshed on time? Are we going to be able to get all the right promotional offers at the right time? I mean, we're here at NYC Strata, there's a lot of real-time promotion happening here. The line of business has direct interest in the delivery and the timing of all of this, so we have always had multiple interfaces to Control M where a business user who has an interest in understanding are the promotional offers going to happen at the right time and is that on schedule? They have a mobile app for them to do that. A developer who's building up complex, multi-application platform, they have an API and a programmatic interface to do that. Operations that has to monitor all of this has rich dashboards to be able to do that. That's one of the areas that has been key for our success over the last couple decades, and we're seeing that translate very well into the big data place. >> So I just want to go under the hood for a minute because I love that answer. And I'd like to pivot off what Peter said, tying it back to the business, okay, that's awesome. And I want to learn a little bit more about this because we talked about this last year and I kind of am seeing it now. Kubernetes and all this orchestration is about workloads. You guys nailed the workflow issue, complex workflows. Because if you look at it, if you're adding line of business into the equation, that's just complexity in and of itself. As more workflows exist within its own line of business, whether it's recommendations and offers and workflow issues, more lines of business in there is complex for even IT to deal with, so you guys have nailed that. How does that work? Do you plug it in and the lines of businesses have their own developers, so the people who work with the workflows engage how? >> So that's a good question, with sort of orchestration and automation now becoming very, very generic, it's kind of important to classify where we play. 
So there's a lot of tools that do release and build automation. There's a lot of tools that'll do infrastructure automation and orchestration. All of this infrastructure and release management process is done ultimately to run applications on top of it, and the workflows of the application need orchestration and that's the layer that we play in. And if you think about how does the end user, the business and consumer interact with all of this technology is through applications, k? So the orchestration of the workflow's inside the applications, whether you start all the way from an ERP or a CRM and then you land into a data lake and then do an ML model, and then out come the recommendations analytics, that's the layer we are automating today. Obviously, all of this-- >> By the way, the technical complexity for the user's in the app. >> Correct, so the line of business obviously has a lot more control, you're seeing roles like chief digital officers emerge, you're seeing CTOs that have mandates like okay you're going to be responsible for all applications that are facing customer facing where the CIO is going to take care of everything that's inward facing. It's not a settled structure or science involved. >> It's evolving fast. >> It's evolving fast. But what's clear is that line of business has a lot more interest and influence in driving these technology projects and it's important that technologies evolve in a way where line of business can not only understand but take advantage of that. >> So I think it's a great question, John, and I want to build on that and then ask you something. So the way we look at the world is we say the first fifty years of computing were known process, unknown technology. The next fifty years are going to be unknown process, known technology. It's all going to look like a cloud. But think about what that means. Known process, unknown technology, Control M and related types of technologies tended to focus on how you put in place predictable workflows in the technology layer. And now, unknown process, known technology, driven by the line of business, now we're talking about controlling process flows that are being created, bespoke, strategic, differentiating doing business. >> Well, dynamic, too, I mean, dynamic. >> Highly dynamic, and those workflows in many respects, those technologies, piecing applications and services together, become the process that differentiates the business. Again, you're still focused on the infrastructure a bit, but you've moved it up. Is that right? >> Yeah, that's exactly right. We see our goal as abstracting the complexity of the underlying application data and infrastructure. So, I mean, it's quite amazing-- >> So it could be easily reconfigured to a business's needs. >> Exactly, so whether you're on Hadoop and now you're thinking about moving to Snowflake or tomorrow something else that comes up, the orchestration or the workflow, you know, that's as a business as a product that's our goal is to continue to evolve quickly and in a manner that we continue to abstract the complexity so from-- >> So I've got to ask you, we've been having a lot of conversations around Hadoop versus Kubernetes on multi cloud, so as cloud has certainly come in and changed the game, there's no debate on that. How it changes is debatable, but we know that multiple clouds is going to be the modus operandus for customers. >> Correct. 
>> So I got a lot of data and now I've got pipelining complexities and workflows are going to get even more complex, potentially. How do you see the impact of the cloud, how are you guys looking at that, and what are some customer use cases that you see for you guys? >> So the, what I mentioned earlier, that being platform and technology agnostic is actually one of the unique differentiating factors for us, so whether you are an AWS or an Azure or a Google or On-Prem or still on a mainframe, a lot of, we're in New York, a lot of the banks, insurance companies here still do some of the most critical processing on the mainframe. The ability to abstract all of that whether it's cloud or legacy solutions is one of our key enablers for our customers, and I'll give you an example. So Malwarebytes is one of our customers and they've been using Control M for several years. Primarily the entire structure is built on AWS, but they are now utilizing Google cloud for some of their recommendation analysis on sentiment analysis because their goal is to pick the best of breed technology for the problem they're looking to solve. >> Service, the best breed service is in the cloud. >> The best breed service is in the cloud to solve the business problem. So from Control M's perspective, transcending from AWS to Google cloud is completely abstracted for them, so runs Google tomorrow it's Azure, they decide to build a private cloud, they will be able to extend the same workflow orchestration. >> But you can build these workflows across whatever set of services are available. >> Correct, and you bring up an important point. It's not only being able to build the workflows across platforms but being able to define dependencies and track the dependencies across all of this, because none of this is happening in silos. If you want to use Google's API to do the recommendations, well, you've got to feed it the data, and the data's pipeline, like we talked about last time, data ingestion, data storage, data processing, and analytics have very, very intricate dependencies, and these solutions should be able to manage not only the building of the workflow but the dependencies as well. >> But you're defining those elements as fundamental building blocks through a control model >> Correct. >> That allows you to treat the higher level services as reliable, consistent, capabilities. >> Correct, and the other thing I would like to add here is not only just build complex multiplatform, multiapplication workflows, but never lose focus of the business service of the business process there, so you can tie all of this to a business service and then, these things are complex, there are problems, let's say there's an ETL job that fails somewhere upstream, Control M will immediately be able to predict the impact and be able to tell you this means the recommendation engine will not be able to make the recommendations. Now, the staff that's going to work under mediation understands the business impact versus looking at a screen where there's 500 jobs and one of them has failed. What does that really mean? >> Set priorities and focal points and everything else. >> Right. >> So I just want to wrap up by asking you how your talk went at Strata Hadoop Data Conference. What were you talking about, what was the core message? Was it Control M, was it customer presentations? What was the focus? 
>> So the focus of yesterday's talk was actually, you know, one of the things is that academic talk is great, but it's important to, you know, show how things work in real life. The session was focused on a real use case from a customer, Navistar. They have IOT data-driven pipelines where they are predicting failures of parts inside trucks and buses that they manufacture, you know, reducing vehicle downtime. So we wanted to simulate a demo like that, so that's exactly what we did. It was very well received. In real time, we spun up an EMR environment in AWS, automatically provisioned the Control M infrastructure there, applied Spark and machine learning algorithms to the data, and out came the recommendation at the end, which was that, you know, here are the vehicles that are-- >> Fix their brakes. (laughing) >> Exactly, so it was very, very well received. >> I mean, there's a real-world example, there's real money to be saved, maintenance, scheduling, potential liability, accidents. >> Liability is a huge issue for a lot of manufacturers. >> And Navistar has been at the leading edge of how to apply technologies in that business. >> They really have been a poster child for digital transformation. >> They sure have. >> Here's a company that's been around for 100 plus years, and when we talk to them they tell us that they have every technology under the sun that has come since the mainframe, and for them to be transforming and leading in this way, we're very fortunate to be part of their journey. >> Well, we'd love to talk more about some of these customer use cases. That's what people love about theCUBE - we want to do more of them and share those examples. People love to see proof in real-world examples, not just talk, so appreciate you sharing. >> Absolutely. >> Thanks for sharing, thanks for the insights. We're here with theCUBE live in New York City, part of CubeNYC, we're getting all the data, sharing that with you. I'm John Furrier with Peter Burris. Stay with us for more day two coverage after this short break. (upbeat music)
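As a rough illustration of the Spark-based predictive-maintenance scoring described in that demo, the sketch below trains a classifier on historical truck telemetry and flags vehicles likely to need service. The column names, storage paths, and model choice are hypothetical stand-ins, not Navistar's or BMC's actual pipeline.

```python
# Minimal PySpark sketch: fit a classifier on historical sensor readings and
# flag vehicles likely to fail. Paths and column names are assumptions.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassifier

spark = SparkSession.builder.appName("predictive-maintenance-sketch").getOrCreate()

# Historical telemetry: one row per vehicle per day, labelled 1 when a part
# actually failed within the following 30 days (hypothetical schema).
df = spark.read.parquet("s3://example-bucket/truck-telemetry/")

features = ["engine_temp", "oil_pressure", "brake_wear", "vibration", "mileage"]
assembler = VectorAssembler(inputCols=features, outputCol="features")
train = assembler.transform(df).select("vehicle_id", "features", "failed_within_30d")

model = RandomForestClassifier(
    labelCol="failed_within_30d", featuresCol="features", numTrees=50
).fit(train)

# Score the latest readings and surface the vehicles to pull in for service.
latest = assembler.transform(spark.read.parquet("s3://example-bucket/latest-readings/"))
at_risk = model.transform(latest).filter("prediction = 1.0").select("vehicle_id")
at_risk.show()
```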

Published Date : Sep 13 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
John | PERSON | 0.99+
Basil Faruqui | PERSON | 0.99+
Peter Burris | PERSON | 0.99+
BMC | ORGANIZATION | 0.99+
Peter | PERSON | 0.99+
500 jobs | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
New York | LOCATION | 0.99+
last year | DATE | 0.99+
AWS | ORGANIZATION | 0.99+
New York City | LOCATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
John Furrier | PERSON | 0.99+
Hadoop | TITLE | 0.99+
first fifty years | QUANTITY | 0.99+
theCUBE | ORGANIZATION | 0.99+
Navistar | ORGANIZATION | 0.99+
tomorrow | DATE | 0.98+
yesterday | DATE | 0.98+
one | QUANTITY | 0.98+
this week | DATE | 0.97+
Malwarebytes | ORGANIZATION | 0.97+
Cube | ORGANIZATION | 0.95+
Control M | ORGANIZATION | 0.95+
NYC | LOCATION | 0.95+
Snowflake | TITLE | 0.95+
Strata Hadoop Data Conference | EVENT | 0.94+
100 plus years | QUANTITY | 0.93+
CubeNYC Strata Hadoop Strata Data Conference | EVENT | 0.92+
last couple decades | DATE | 0.91+
Azure | TITLE | 0.91+
about five years | QUANTITY | 0.91+
Istio | ORGANIZATION | 0.9+
CubeNYC | ORGANIZATION | 0.89+
day | QUANTITY | 0.87+
about six years ago | DATE | 0.85+
Kubernetes | TITLE | 0.85+
today | DATE | 0.84+
NYC Strata | ORGANIZATION | 0.83+
Hadoop | ORGANIZATION | 0.78+
one of them | QUANTITY | 0.77+
Big Data SV | ORGANIZATION | 0.75+
2018 | EVENT | 0.7+
Kubernetes | ORGANIZATION | 0.66+
fifty years | DATE | 0.62+
Control M | TITLE | 0.61+
four pillars | QUANTITY | 0.61+
two | QUANTITY | 0.6+
-Prem | ORGANIZATION | 0.6+
Cube SV | COMMERCIAL_ITEM | 0.58+
a minute | QUANTITY | 0.58+
S3 | TITLE | 0.55+
Azure | ORGANIZATION | 0.49+
cloud | TITLE | 0.49+
2018 | DATE | 0.43+

Ronen Schwartz, Informatica | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. (techy music) >> Welcome back to the Big Apple, everybody. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante, I'm here with my cohost Peter Burris, and this is our week-long coverage of CUBENYC. It used to be, really, a big data theme. It sort of evolved into data, AI, machine learning. Ronan Schwartz is here, he's the senior vice president and general manager of cloud, big data, and data integration at data integration company Informatica. Great to see you again, Ronan, thanks so much for coming on. >> Thanks for inviting me, it's a good, warm day in New York. >> Yeah, the storm is coming and... Well, speaking of storms, the data center is booming. Data is this, you know, crescendo of storms (chuckles) have occurred, and you guys are at the center of that. It's been a tailwind for your business. Give us the update, how's business these days? >> So, we finished Q2 in a great, great success, the best Q2 that we ever had, and the third quarter looks just as promising, so I think the short answer is that we are seeing the strong demand for data, for technologies that supports data. We're seeing more users, new use cases, and definitely a huge growth in need to support... To support data, big data, data in the cloud, and so on, so I think very, very good Q2 and it looks like Q3's going to be just as good, if not better. >> That's great, so there's been a decades-long conversation, of course, about data, the value of data, but more often than not over the history of recent history, when I say recent I mean let's say 20 years on, data's been a problem for people. It's been expensive, how do you manage it, when do you delete it? It's sort of this nasty thing that people have to deal with. Fast forward to 2010, the whole Hadoop movement, all of a sudden data's the new oil, data's... You know, which Peter, of course, disagrees with for many reasons. >> No, it's... >> We don't have to get into it. >> It's subtlety. >> It's a subtlety, but you're right about it, and well, maybe if we have time we can talk about that, but the bromide of... But really focused attention on data and the importance of data and the value of data, and that was really a big contribution that Hadoop made. There were a lot of misconceptions. "Oh, we don't need the data warehouse anymore. "Oh, we don't need old," you know, "legacy databases." Of course none of those are true. Those are fundamental components of people's big data strategy, but talk about the importance of data and where Informatica fits. >> In a way, if I look into the same history that you described, and Informatica have definitely been a player through this history. We divide it into three eras. The first one is when data was like this thing that sits below the application, that used the application to feed the data in and if you want to see the data you go through the application, you see the data. We sometimes call that as Data 1.0. Data 2.0 was the time that companies, including Informatica, kind of froze and been able to give you a single view of the data across multiple systems, across your organization, and so on, because we're Informatica we have the ETL with data quality, even with master data management, kind of came into play and allowed an organization to actually build analytics as a system, to build single view as a system, et cetera. 
I think what is happening, and Hadoop was definitely a trigger, but I would say the cloud is just as big of a trigger as the big data technologies, and definitely everything that's happening right now with Spark and the processing power, et cetera, is contributing to that. This is the time of the Data 3.0 when data is actually in the center. It's not a single application like it was in the Data 2.0. It's not this thing below the application in Data 1.0. Data is in the center and everything else is just basically have to be connected to the data, and I think it's an amazing time. A big part of digitalization is the fact that the data is actually there. It's the most important asset the organization has. >> Yeah, so I want to follow up on something. So, last night we had a session Peter hosted on the future of AI, and he made the point, I said earlier data's the new oil. I said you disagreed, there's a nuance there. You made the point last night that oil, I can put oil in my car, I can put oil in my house, I can't do both. Data is the new currency, people said, "Well, I can spend a dollar or I can spend "a dollar on sports tickets, I can't do both." Data's different in that... >> It doesn't follow the economics of scarcity, and I think that's one of the main drivers here. As you talk about 1.0, 2.0, and 3.0, 1.0 it's locked in the application, 2.0 it's locked in a model, 3.0 now we're opening it up so that the same data can be shared, it can be evolved, it can be copied, it can be easily transformed, but their big issue is we have to sustain overall coherence of it. Security has to remain in place, we have to avoid corruption. Talk to us about some of the new demands given, especially that we've got this, more data but more users of that data. As we think about evidence-based management, where are we going to ensure that all of those new claims from all of those new users against those data sources can be satisfied? >> So, first, I truly like... This is a big nuance, it's not a small one. (laughs) The fact that you have better idea actually means that you do a lot of things better. It doesn't mean that you do one thing better and you cannot do the other. >> Right. I agree 100%, I actually contribute that for two things. One is more users, and the other thing is more ways to use the data, so the fact that you have better data, more data, big data, et cetera, actually means that your analytics is going to be better, right, but it actually means that if you are looking into hyperautomation and AI and machine learning and so on, suddenly this is possible to do because you have this data foundation that is big enough to actually support machine learning processes, and I think we're just in the beginning of that. I think we're going to see data being used for more and more use cases. We're in the integration business and in the data management business, and we're seeing, within what our customers are asking us to support, this huge growth in the number of patterns of how they want the data to be available, how they want to bring data into different places, into different users, so all of that is truly supporting what you just mentioned. I think if you look into the Data 2.0 timeframe, it was the time that a single team that is very, very strong with the right tools can actually handle the organization needs. In what you described, suddenly self-service. Can every group consume the data? Can I get the data in both batch and realtime? 
Can I get the data in a massive amount as well as in small chunks? These are all becoming very, very central. >> And very use case, but also user and context, you know, we think about time, dependent, and one of the biggest challenges that we have is to liberate the data in the context of the multiple different organization uses, and one of the biggest challenges that customers have, or that any enterprise has, and again, evidence-based management, nice trend, a lot of it's going to happen, but the familiarity with data is still something that's not, let's say broadly diffused, and a lot of the tools for ensuring that people can be made familiar, can discover, can reuse, can apply data, are modestly endowed today, so talk about some of these new tools that are going to make it easier to discover, capture, catalog, sustain these data assets? >> Yeah, and I think you're absolutely right, and if this is such a critical asset, and data is, and we're actually looking into more user consuming the data in more ways, it actually automatically create a bottleneck in how do I find the data, how do I identify the data that I need, and how am I making this available in the right place at the right time? In general, it looks like a problem that is almost unsolvable, like I got more data, more users, more patterns, nobody have their budget tripled or quadrupled just to be able to consume it. How do you address that, and I think Informatica very early have identified this growing need, and we have invested in a product that we call the enterprise data catalog, and it's actually... The concept of a catalog or a metadata repository, a place that you can actually identify all the data that exists, is not necessarily a new concept-- >> No, it's been around for years. >> Yes, but doing it in an enterprise-unified way is unique, and I think if you look into what we're trying to basically empower any user to do I basically, you know, we all using Google. You type something and you find it. If you're trying to find data in the organization in a similar way, it's a much harder task, and basically the catalog and Informatica unified, enterprise-unified catalog is doing that, leveraging a lot of machine learning and AI behind the scenes to basically make this search possible, make basically the identification of the data possible, the curation of the data possible, and basically empowering every user to find the data that he wants, see recommendation for other data that can work with it, and then basically consume the data in the way that he wants. I totally think that this will change the way IT is functioning. It is actually an amazing bridge between IT and the business. If there is one place that you can search all your data, suddenly the whole interface between IT and the business is changing, and Informatica's actually leading this change. >> So, the catalog gives you line-of-sight on all, (clears throat) all those data sources, what's the challenge in terms of creating a catalog and making it performant and useful? >> I think there are a few levels of the challenge. I chose the word enterprise-unified intelligent catalog deliberately, and I think each one of them is kind of representing a different challenge. The first challenge is the unified. There is technical metadata, this is the mapping and the processes that move data from one place to the other, then there is business metadata. 
These are the definitions the business is using, and then there is the operational metadata as well, as well as the physical location and so on. Unifying all of them so that you can actually connect and see them in one place is a unique challenge that at this stage we have already completely addressed. The second one is enterprise, and when talking about enterprise metadata it means that you want all of your applications, you want applications in the cloud, you want your cloud environment, your big data environment. You want, actually, your APIs, you want your integration environment. You want to be able to collect all of this metadata across the enterprise, so unified covers all the types, and enterprise is the second one. The third challenge is actually the most exciting one: how can you leverage intelligence so it's not limited by the human factor, by the amount of people that you have to actually put the data together, right? >> Mm-hm. >> And today we're using very, very sophisticated, interesting algorithms that run on the metadata and are able to tell you that even though you don't know how the data got from here to here, it actually did get from here to here. >> Mm-hm. >> It's a dotted line, maybe somebody copied it, maybe something else happened, but the data is so similar that we can actually tell you it came from one place. >> So, actually, let me see, because I think there's... I don't think you missed a step, but let me reveal a step that's in there. One of the key issues in the enterprise side of things is to reveal how data's being used. The value of data is tied to its context, and having catalogs that can do, as you said, the unified, but also the metadata becomes part of how it's used makes that opportunity, that ability to then create audit trails and create lineage possible. >> You're absolutely right, and I think it actually is one of the most important things, to see where the data came from and what steps it went through. >> Right. >> There's also one other very interesting value of lineage that I think people sometimes tend to ignore, which is: who else is using it? >> Right. >> Who else is consuming it, because that is actually, like, a very good indicator of how good the data is or how common the data is. The ability to actually leverage and create this lineage is a mandatory thing. The ability to create lineage that is inferred, and not actually specifically defined, is also very, very interesting, but we're now doing, like, things that are, I think, really exciting. For example, let's say that a user is looking into a data field in one source and he is actually identifying that this is a certain, specific ID that his organization is using. Now we're able to actually automatically understand that this field actually exists in 700 places, and actually leverage the intelligence that he just gave us and actually ask him, "Do you want it to be automatically updated everywhere? "Do you want to do it in a step-by-step, guided way?" And this is how you actually scale to handle the massive amount of data, and this is how organizations are going to learn more and more and get the data to be better and better the more they work with the data.
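As a toy illustration of the inferred-lineage idea just described - proposing a dotted-line relationship between fields whose values are nearly identical even when no job explicitly links them - the sketch below compares sampled column values with a simple Jaccard overlap. The systems, field names, sample values, and threshold are made up for the example; Informatica's actual matching algorithms are not described here and are certainly more sophisticated.

```python
# Toy "inferred lineage": if two fields in different systems hold nearly the
# same values, propose a dotted-line relationship between them.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Sampled values per (system, field) -- hypothetical data.
fields = {
    ("crm", "customer_id"):       {"C001", "C002", "C003", "C004", "C005"},
    ("data_lake", "cust_id"):     {"C001", "C002", "C003", "C004", "C006"},
    ("billing", "invoice_total"): {"19.99", "7.50", "102.00"},
}

THRESHOLD = 0.6
candidates = []
for (src, src_field), src_vals in fields.items():
    for (dst, dst_field), dst_vals in fields.items():
        if (src, src_field) >= (dst, dst_field):
            continue  # consider each unordered pair once
        score = jaccard(src_vals, dst_vals)
        if score >= THRESHOLD:
            candidates.append((src, src_field, dst, dst_field, round(score, 2)))

for src, sfld, dst, dfld, score in candidates:
    print(f"possible lineage: {src}.{sfld} <-> {dst}.{dfld} (similarity {score})")
```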
The first one is basically supporting data across, big data across multi-clouds. The ability to basically leverage all of these great tools, including the catalog, including the big data management, including data quality, data governance, and so on, on AWS, on Azure, on GCP, basically without any effort needed. We're even going further and we're empowering our user to use it in a serverless mode where we're actually allowing them full control over the resources that are being consumed. This is really, really critical because this is actually allowing them to do more with the data in a lower cost. I think the last part of the news that is really exciting is we added a lot, a lot of functionality around our Spark processing and the capabilities of the things that you can do so that the developers, the AI and machine learning can use their stuff, but at the same time we actually empower business users to do more than they ever did before. So, kind of being able to expand the amount of users that can access the data, wanting a more sophisticated way, and wanting a very simple but still very powerful way, I think this is kind of the summary of the news. >> And just a quick followup on that. If I understand it, it's your full complement of functionality across these clouds, is that right? You're not neutering... (chuckles) >> That is absolutely correct, yes, and we are seeing, definitely within our customers, a growing choice to decide to focus their big data efforts in the cloud, it makes a lot of sense. The ability to scale up and down in the cloud is significantly superior, but also the ability to give more users access in the cloud is typically easier, so I think Informatica have chosen as the market we're focusing on enterprise cloud data management. We talked a lot about data management. This is a lot about the cloud, the cloud part of it, and it's basically a very, very focused effort in optimizing things across clouds. >> Cloud is critical, obviously. That's how a lot of people want to do business. They want to do business in a cloud-like fashion, whether it's on-prem or off-prem. A lot of people want things to be off-prem. Cloud's important because it's where innovation is happening, and scale. Ronan, thanks so much for coming on theCUBE today. >> Yeah, thank you very much and I did learn something, oil is not one of the terms that I'm going to use for data in the future. >> Makes you think about that, right? >> I'm going to use something different, yes. >> It's good, and I also... My other takeaway is, in that context, being able to use data in multiple places. Usage is a proportional relationship between usage and value, so thanks for that. >> Excellent. >> Happy to be here. >> And thank you, everybody, for watching. We will be right back right after this short break. You're watching theCUBE at #CUBENYC, we'll be right back. (techy music)

Published Date : Sep 13 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Ronan | PERSON | 0.99+
Ronan Schwartz | PERSON | 0.99+
Informatica | ORGANIZATION | 0.99+
Peter | PERSON | 0.99+
New York | LOCATION | 0.99+
100% | QUANTITY | 0.99+
Peter Burris | PERSON | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
20 years | QUANTITY | 0.99+
Ronen Schwartz | PERSON | 0.99+
700 places | QUANTITY | 0.99+
2010 | DATE | 0.99+
third challenge | QUANTITY | 0.99+
One | QUANTITY | 0.99+
both | QUANTITY | 0.99+
two things | QUANTITY | 0.99+
AWS | ORGANIZATION | 0.99+
a dollar | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
one source | QUANTITY | 0.99+
first challenge | QUANTITY | 0.98+
first one | QUANTITY | 0.98+
today | DATE | 0.98+
one | QUANTITY | 0.98+
last night | DATE | 0.98+
first | QUANTITY | 0.98+
this week | DATE | 0.97+
this morning | DATE | 0.97+
second one | QUANTITY | 0.97+
one place | QUANTITY | 0.97+
Spark | TITLE | 0.97+
3.0 | OTHER | 0.96+
single application | QUANTITY | 0.96+
New York City | LOCATION | 0.95+
single team | QUANTITY | 0.93+
decades | QUANTITY | 0.92+
2.0 | OTHER | 0.91+
each one | QUANTITY | 0.91+
Hadoop | TITLE | 0.9+
theCUBE | ORGANIZATION | 0.89+
1.0 | OTHER | 0.89+
single view | QUANTITY | 0.89+
third quarter | DATE | 0.88+
Data 3.0 | TITLE | 0.85+
Data 2.0 | TITLE | 0.85+
Data 1.0 | OTHER | 0.84+
Q2 | DATE | 0.83+
Data 2.0 | OTHER | 0.83+
Azure | TITLE | 0.82+
both batch | QUANTITY | 0.81+
Big Apple | LOCATION | 0.81+
NYC | LOCATION | 0.78+
one thing | QUANTITY | 0.74+
three eras | QUANTITY | 0.74+
GCP | TITLE | 0.65+
Q3 | DATE | 0.64+
Hadoop | PERSON | 0.64+

David Richards, WANdisco | theCUBE NYC 2018


 

Live from New York, it's theCUBE. Covering theCUBE, New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Okay, welcome back everyone. This is theCUBE live in New York City for our CUBE NYC event, #cubenyc. This is our ninth year covering the big data ecosystem going back to the original Hadoop world, now it's evolved to essentially all things AI, future of AI. Peter Burris is my cohost. He gave a talk two nights ago on the future of AI presented in his research. So it's all about data, it's all about the cloud, it's all about live action here in theCUBE. Our next guest is David Richards, who's been in the industry for a long time, seen the evolution of Hadoop, been involved in it, has been a key enabler of the technology, certainly enabling cloud recovery replication for cloud, welcome back to theCUBE. It's good to see you. >> It's really good to be here. >> I got to say, you've been on theCUBE pretty much every year, I think every year, we've done nine years now. You made some predictions and calls that actually happened. Like five years ago you said the cloud's going to kill Hadoop. Yeah, I think you didn't say that off camera, but it might (laughing) maybe you said it on camera. >> I probably did, yeah. >> [John] But we were kind of pontificating but also speculating, okay, where does this go? You've been right on a lot of calls. You also were involved in the Hadoop distribution business >>back in the day. Oh god. >> You got out of that quickly. (laughing) You saw that early, good call. But you guys have essentially a core enabler that's been just consistently performing well in the market both on the Hadoop side, cloud, and as data becomes the conversation, which has always been your perspective, you guys have had a key in part of the infrastructure for a long time. What's going on? Is it still doing deals, what's? >> Yes, I mean, the history of WANdisco's play and big data in Hadoop has been, as you know because you've been with us for a long time, kind of an interesting one. So we back in sort of 2013, 2014, 2015 we built a Hadoop-specific product called Non-Stop NameNode and we had a Hadoop distribution. But we could see this transition, this change in the market happening. And the change wasn't driven necessarily by the advent of new technology. It was driven by overcomplexity associated with deploying, managing Hadoop clusters at scale because lots of people, and we were talking about this off-camera before, can deploy Hadoop in a fairly small way, but not many companies are equipped or built to deploy massive scale Hadoop distributions. >> Sustain it. >> They can't sustain it, and so the call that I made you know, actions speak louder than words. The company rebuilt the product, built a general purpose data replication platform called WANdisco Fusion that, yes, supported Hadoop but also supported object store and cloud technologies. And we're now seeing use cases in cloud certainly begin to overtake Hadoop for us for the first time. >> And you guys have a patent that's pretty critical in all this, right? >> Yeah. So there's some real IP. 
>> Yes, so people often make the mistake of calling us a data replication business, which we are, but data replication happens post-consensus or post-agreement, so the very heart of WANdisco of 35 patents are all based around a Paxos-based consensus algorithm, which wasn't a very cool thing to talk about now with the advent of blockchain and decentralized computing, consensus is at the core of pretty much that movement, so what WANdisco does is a consensus algorithm that enables things like hybrid cloud, multi cloud, poly cloud as Microsoft call it, as well as disaster recovery for Hadoop and other things. >> Yeah, as you have more disparate parts working together, say multi cloud, I mean, you're really perfectly positioned for multi cloud. I mean, hybrid cloud is hybrid cloud, but also multi cloud, they're two different things. Peter has been on the record describing the difference between hybrid cloud and multi cloud, but multi cloud is essentially connecting clouds. >> We're on a mission at the moment to define what those things actually are because I can tell you what it isn't. A multi cloud strategy doesn't mean you have disparate data and processes running in two different clouds that just means that you've got two different clouds. That's not a multi cloud strategy. >> [Peter] Two cloud silos. >> Yeah, correct. That's kind of creating problems that are really going to be bad further down the road. And hybrid cloud doesn't mean that you run some operations and processes and data on premise and a different siloed approach to cloud. What this means is that you have a data layer that's clustered and stretched, the same data that's stretched across different clouds, different on-premise systems, whether it's Hadoop on-premise and maybe I want to build a huge data lake in cloud and start running complex AI and analytics processes over there because I'm, less face it, banks et cetera ain't going to be able to manage and run AI themselves. It's already being done by Amazon, Google, Microsoft, Alibaba, and others in the cloud. So the ability to run this simultaneously in different locations is really important. That's what we do. >> [John] All right, let me just ask this directly since we're filming and we'll get a clip out of this. What is the definition of hybrid cloud? And what is the definition of multi cloud? Take, explain both of those. >> The ability to manage and run the same data set against different applications simultaneously. And achieve exactly the same result. >> [John] That's hybrid cloud or multi cloud? >> Both. >> So they're the same. >> The same. >> You consider hybrid cloud multi cloud the same? >> For us it's just a different end point. It's hybrid kind of mean that you're running something implies on-premise. A multi cloud or poly cloud implies that you're running between different cloud venues. >> So hybrid is location, multi is source. >> Correct. >> So but let's-- >> [David] That's a good definition. >> Yes, but let's unpack this a little bit because at the end of the day, what a business is going to want to do is they're going to want to be able to run apply their data to the best service. >> [David] Correct. >> And increasingly that's what we're advising our clients to think about. >> [David] Yeah. >> Don't think about being an AWS customer, per se, think about being a customer of AWS services that serve your business. Or IBM services that serve your business. 
But you want to ensure that your dependency on that service is not absolute, and that's why you want to be able to at least have the option of being able to run your data in all of these different places. >> And I think the market now realizes that there is not going to be a single, dominant vendor for cloud infrastructure. That's not going to happen. Yes, it happened, Oracle dominated in relational data. SAP dominated for ERP systems. For cloud, it's democratized. That's not going to happen. So everybody knows that Amazon probably have the best serverless compute lambda functions available. They've got millions of those things already written or in the process of being written. Everybody knows that Microsoft are going to extend the wonderful technology that they have on desktop and move that into cloud for analytics-based technologies and so on. The Google have been working on artificial intelligence for an elongated period of time, so vendors are going to arbitrage between different cloud vendors. They're going to choose the best of brood approach. >> [John] They're going to go to Google for AI and scale, they're going to go to Amazon for robustness of services, and they're going to go to Microsoft for the Suite. >> [Peter] They're going to go for the services. They're looking at the services, that's what they need to do. >> And the thing that we'll forget, that we don't at WANdisco, is that that requires guaranteed consistent data sets underneath the whole thing. >> So where does Fusion fit in here? How is that getting traction? Give us some update. Are you working with Microsoft? I know we've been talking about Amazon, what about Microsoft? >> So we've been working with Microsoft, we announced a strategic partnership with them in March where we became a tier zero vendor, which basically means that we're partnered with them in lockstep in the field. We executed extremely well since that point and we've done a number of fairly large, high-profile deals. A retailer, for example, that was based in Amazon didn't really like being based in Amazon so had to build a poly cloud implementation to move had to buy scale data from AWS into Azure, that went seamlessly. It was an overnight success. >> [John] And they're using your technology? >> They're using our technology. There's no other way to do that. I think the world has now, what Microsoft and others have realized, CDC technology changed data capture. Doesn't work at this kind of scale where you batch up a bunch of changes and then you ship them, block shipping or whatever, every 15 minutes or so. We're talking about petabyte scale ingest processes. We're talking about huge data lakes, that that technology simply doesn't work at this kind of scale. >> [John] We've got a couple minutes left, I want to just make sure we get your views on blockchain, you mentioned consensus, I want to get your thoughts on that because we're seeing blockchain is certainly experimental, it's got, it's certainly powering money, Bitcoin and the international markets, it's certainly becoming a money backbone for countries to move billions of dollars out. It's certainly in the tank right now about 600 million below its mark in January, but blockchain is fundamentally supply chain, you're seeing consensus, you're seeing some of these things that are in your realm, what's your view? >> So first of all, at WANdisco, we separate the notion of cryptocurrency and blockchain. We see blockchain as something that's been around for a long time. 
It's basically the world is moving to decentralization. We're seeing this with airlines, with supermarkets, and so on. People actually want to decentralize rather that centralize now. And the same thing is going to happen in the financial industry where we don't actually need a central transaction coordinator anymore, we don't need a clearinghouse, in other words. Now, how do you do that? At the very heart of blockchain is an incorrect assumption. So must people think that Satoshi's invention, whoever that may be, was based around the blockchain itself. Blockchain is pieced together technologies that doesn't actually scale, right? So it takes game-theoretic approach to consensus. And I won't get, we don't have enough time for me to delve into exactly what that means, but our consensus algorithm has already proven to scale, right? So what does that mean? Well, it means that if you want to go and buy a cup of coffee at the Starbucks next door, and you want to use a Bitcoin, you're going to be waiting maybe half an hour for that transaction to settle, right? Because the-- >> [John] The buyer's got to create a block, you know, all that step's in one. >> The game-theoretic approach basically-- >> Bitcoin's running 500,000 transactions a day. >> Yeah. That's eight. >> There's two transactions per second, right? Between two and eight transactions per second. We've already proven that we can achieve hundreds of thousands, potentially millions of agreements per second. Now the argument against using Paxos, which is what our technology's based on, is it's too complicated. Well, no shit, of course it's too complicated. We've solved that problem. That's what WANdisco does. So we've filed a patent >> So you've abstracted the complexity, that's your job. >> We've extracted the complexity. >> So you solve the complexity problem by being a complex solution, but you're making and abstracting it even easier. >> We have an algorithmic not a game-theoretic approach. >> Solving the scale problem Correct. >> Using Paxos in a way that allows real developers to be able to build consensus algorithm-based applications. >> Yes, and 90% of blockchain is consensus. We've solved the consensus problem. We'll be launching a product based around Hyperledger very soon, we're already in tests and we're already showing tens of thousands of transactions per second. Not two, not 2,000, two transactions. >> [Peter] The game theory side of it is still going to be important because when we start talking about machines and humans working together, programs don't require incentives. Human beings do, and so there will be very, very important applications for this stuff. But you're right, from the standpoint of the machine-to-machine when there is no need for incentive, you just want consensus, you want scale. >> Yeah and there are two approaches to this world of blockchains. There's public, which is where the Bitcoin guys are and the anarchists who firmly believe that there should be no oversight or control, then there's the real world which is permission blockchains, and permission blockchains is where the banks, where the regulators, where NASDAQ will be when we're trading shares in the future. That will be a permission blockchain that will be overseen by a regulator like the SEC, NASDAQ, or London Stock Exchange, et cetera. >> David, always great to chat with you. 
Thanks for coming on, again, always on the cutting edge, always having a great vision while knocking down some good technology and moving your IP on the right waves every time, congratulations. >> Thank you. >> Always on the next wave, David Richards here inside theCUBE. Every year, doesn't disappoint, theCUBE bringing you all the action here. Cube NYC, we'll be back with more coverage. Stay with us; a lot more action for the rest of the day. We'll be right back; stay with us for more after this short break. (upbeat music)

Published Date : Sep 13 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
David | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Microsoft | ORGANIZATION | 0.99+
Alibaba | ORGANIZATION | 0.99+
Peter Burris | PERSON | 0.99+
John | PERSON | 0.99+
Peter | PERSON | 0.99+
Google | ORGANIZATION | 0.99+
David Richards | PERSON | 0.99+
SEC | ORGANIZATION | 0.99+
NASDAQ | ORGANIZATION | 0.99+
March | DATE | 0.99+
two | QUANTITY | 0.99+
IBM | ORGANIZATION | 0.99+
January | DATE | 0.99+
AWS | ORGANIZATION | 0.99+
2014 | DATE | 0.99+
millions | QUANTITY | 0.99+
Oracle | ORGANIZATION | 0.99+
90% | QUANTITY | 0.99+
2013 | DATE | 0.99+
WANdisco | ORGANIZATION | 0.99+
London Stock Exchange | ORGANIZATION | 0.99+
2015 | DATE | 0.99+
New York City | LOCATION | 0.99+
nine years | QUANTITY | 0.99+
both | QUANTITY | 0.99+
two transactions | QUANTITY | 0.99+
eight | QUANTITY | 0.99+
five years ago | DATE | 0.99+
New York | LOCATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
half an hour | QUANTITY | 0.99+
35 patents | QUANTITY | 0.99+
hundreds of thousands | QUANTITY | 0.99+
2,000 | QUANTITY | 0.99+
Both | QUANTITY | 0.99+
ninth year | QUANTITY | 0.98+
first time | QUANTITY | 0.98+
billions of dollars | QUANTITY | 0.98+
Hadoop | TITLE | 0.98+
SAP | ORGANIZATION | 0.98+
Starbucks | ORGANIZATION | 0.98+
Paxos | ORGANIZATION | 0.98+
two nights ago | DATE | 0.97+
single | QUANTITY | 0.97+
two approaches | QUANTITY | 0.97+
500,000 transactions a day | QUANTITY | 0.97+
about 600 million | QUANTITY | 0.96+
theCUBE | ORGANIZATION | 0.96+
Satoshi | PERSON | 0.92+
two different clouds | QUANTITY | 0.91+
NYC | LOCATION | 0.89+
one | QUANTITY | 0.88+
theCUBE | EVENT | 0.87+

Paul Appleby, Kinetica | theCUBE NYC 2018


 

>> Live from New York, it's the Cube (funky music) covering the Cube New York City 2018 brought to you by SiliconANGLE Media and its ecosystem partners. (funky music) >> Everyone welcome back to theCUBE live in New York City for Cube NYC. This is our live broadcast - two days of coverage around the big data world, AI, the future of Cloud analytics. I'm John Furrier, my cohost Peter Burris. Our next guest is Paul Appleby, CEO Kinetica. Thanks for coming back to theCUBE - good to see you. >> Great to be back again and great to visit in New York City - it's incredible to be here on this really important week. >> Last time we chatted was in our big data Silicon Valley event, which is going to be renamed Cube SV, because it's not just data anymore; there's a lot of Cloud involved, a lot of new infrastructure. But analytics has certainly changed. What's your perspective now in New York as you're in here hearing all the stories around the show and you talk to customers - what's the update from your perspective? Because certainly we're hearing a lot of Cloud this year - Cloud, multi Cloud, analytics, and eyeing infrastructure, proof in the pudding, that kind of thing. >> I'm going to come back to the Cloud thing because I think that's really important. We have shifted to this sort of hybrid multi Cloud world, and that's our future - there is no doubt about it, and that's right across all spectre of computing, not just as it relates to data. But I think this evolution of data has continued this journey that we've all been on from whatever you want to call it - systems or record - to the world of big data where we're trying to gain insights out of this massive oceans of data. But we're in a world today where we're leveraging the power of analytics and intelligence, AI machine learning, to make fundamental decisions that drive some action. Now that action may be to a human to make a decision to interact more effectively with a customer, or it could be to a machine to automate some process. And we're seeing this fundamental shift towards a focus on that problem, and associated with that, we're leveraging the power of Cloud, AI, ML, and all the rest of it. >> And the human role in all this has been talked about. I've seen in the US in the political landscape, data for good, we see Facebook up there being basically litigated publicly in front of the Senate around the role of data and the elections. People are talking in the industry about the role of humans with machines is super important. This is now coming back as a front and center issue of hey, machines do great intelligence, but what about the human piece? What's your view on the human interaction component, whether it's the curation piece, the role of the citizen analyst, or whatever we're calling it these days, and what machines do to supplement that? >> Really good question - I've spent a lot of time thinking about this. I've had the incredible privilege of being able to attend the World Economic Forum for the last five years, and this particular topic of how Robotics Automation Artificial Intelligence machine learning is impacting economies, societies, and ultimately the nature of work has been a really big thread there for a number of years. I've formed a fundamental view: first of all, any technology can be used for good purposes and bad purposes, and it's - >> It always is. 
>> And it always is, and it's incumbent upon society and government to apply the appropriate levels of regulation, and for corporations to obviously behave the right way, but setting aside those topics - because we could spend hours talking about those alone - there is a fundamental issue, and this is this kind of conversation about what a lot of people like to describe as the fourth industrial revolution. I've spent a lot of time, because you hear people bandy that around - what do they really mean, and what are we really talking about? I've looked at every point in time where there's been an industrial revolution - there's been a fundamental shift of work that was done by humans that's now done by machines. There's been a societal uproar, and there're being new forms of work created, and society's evolved. What I look at today is yes, there's a responsibility and a regular treaside to this, but there's also a responsibility in business and society to prepare our workers and our kids for new forms of work, cause that's what I really think we should be thinking about - what are the new forms of work that are actually unlocked by these technologies, rather than what are the roles that are displaced by this steam powered engine. (laughs softly) >> Well, Paul, we totally agree with you. There's one other step in this process. It kind of anticipates each of these revolutions, and that is there is a process of new classes of asset formation. Mhm. So if you go back to when we put new power trains inside row houses to facilitate the industrial revolution in the early 1800s, and you could say the same thing about transportation, and what the trains did and whatnot. There's always this process of new asset formation that presaged some of these changes. Today it's data - data's an asset cause businesses ultimately institutionalize, or re institutionalize, their work around what they regard as valuable. Now, when we start talking about machines telling other machines what to do, or providing options or paring off options for humans so they have clear sets of things that they can take on, speed becomes a crucial issue, right? At the end of the day, all of this is going to come back to how fast can you process data? Talk to us a little bit about how that dynamic and what you guys are doing to make it possible is impacting business choices. >> Two really important things to unpack there, and one I think I'd love to touch on later, which is data as an asset class and how corporations should treat data. You talk about speed, and I want to talk about speed in the context of perishability, because the truth is if you're going to drive these incredible insights, whether it's related to a cyber threat, or a terrorist threat, or an opportunity to expand your relationship with a customer, or to make a critical decision in a motor vehicle in an autonomous operating mode, these things are about taking massive volumes of streaming data, running analytics in real time, and making decisions in real time. These are not about gleaning insights from historic pools or oceans of data; this is about making decisions that are fundamental to - >> Right now. >> The environment that you're in right now. You think about the autonomous car - great example of the industrial Internet, one we all love to talk about. The mechanical problems associated with autonomy have been solved, fundamentally sensors in cars, and the automated processes related to that. 
The decisioning engines - they need to be applied at scale in millions of vehicles in real time. That's an extreme data problem. The biggest problem solved there is data, and then over time, societal and regulatory change means that this is going to take some time before it comes to fruition. >> We were just saying - I think it was 100 Teslas generating 100 terabytes of data a day based on streams from its fleet of cars its customers have. >> We firmly believe that longer term, when you get to true autonomy, each car will probably generate around ten terabytes of data a day. That is an extremely complex problem to solve, because at the end of the day, this thinking that you're able to drive that data back to some centralized brain to be making those decisions for and on behalf of the cars is just fundamentally flawed. It has to happen in the car itself. >> Totally agree. >> This is putting super computers inside cars. >> Which is kind of happening - in fact, that 100 terabytes a day is in fact the data that does get back to Tesla. >> Yeah. >> As you said, there's probably 90% of the data is staying inside the car, which is unbelievable scale. >> So the question I wanted to ask you - you mentioned the industrial revolution, so every time there's a new revolution, there's an uproar, you mentioned. But there's also a step up of new capabilities, so if there's new work being developed, usually entrepreneur activity - weird entrepreneurs figured out that everyone says they're not weird anymore; it's great. But there's a step up of new capability that's built. Someone else says hey, the way we used to do databases and networks was great for moving one gig Ethernet on top of the rack; now you got 10 terabytes coming off a car or wireless spectrum. We got to rethink spectrum, or we got to rethink database. Let's use some of these GPUs - so a new step up of suppliers have to come in to support the new work. What's your vision on some of those things that are happening now - that you think people aren't yet seeing? What are some of those new step up functions? Is it on the database side, is it on the network, is it on the 5G - where's the action? >> Wow. Because who's going to support the Teslas? (Paul laughs) Who's going to support the new mobile revolution, the new iPhones the size of my two hands put together? What's your thoughts on that? >> The answer is all of the above. Let me talk about that and what I mean by that. Because you're looking at it from the technology perspective, I'd love to come back and talk about the human perspective as well, but from the technology perspective, of course leveraging power is going to be fundamental to this, because if you think about the types of use cases where you're going to have to be gigathreading queries against massive volumes of data, both static and streaming, you can't do that with historic technology, so that's going to be a critical part of it. The other part of it that we haven't mentioned a lot here but I think we should bring into it is if you think about these types of industrial Internet use cases, or IOT - even consumer Internet IOT related use cases - a lot of the decisioning has to occur out of the H. 
It cannot occur in a central facility, so it means actually putting the AI or ML engine inside the vehicle, or inside the cell phone tower, or inside the oil rig, and that is going to be a really big part of you know, shifting back to this very distributive model of machine lining in AI, which brings very complex questions in of how you drive governance - (John chuckles) >> And orchestration around employing Ai and ML models at massive scale, out to edge devices. >> Inferencing at the edge, certainly. It's going to be interesting to see what happens with training - we know that some of the original training will happen at the center, but some of that maintenance training? It's going to be interesting to see where that actually - it's probably going to be a split function, but you're going to need really high performing databases across the board, and I think that's one of the big answers, John, is that everybody says oh, it's all going to be in software. It's going to be a lot of hard word answers. >> Yep. >> Well the whole idea is just it's provocative to think about it and also intoxicating if you also want to go down that rabbit hole... If you think about that car, okay, if they're going to be doing century machine learning at the edge - okay, what data are you working off of? There's got to be some storage, and then what about real time data coming from other either horizontally scalable data sets. (laughs) So the question is, what do they have access to? Are they optimized for the decision making at that time? >> Mhm. >> Again, talk about the future of work - this is a big piece, but this is the human piece as well. >> Yeah. >> Are our kids going to be in a multi massive, multi player online game called Life? >> They are. >> They are now. They're on Fortnite, they're on Call of Duty, and all this gaming culture. >> But I think this is one of the interesting things, because there's a very strong correlation between information theory and thermodynamics. >> Mhm. >> They're the same exact - in physics, they are the identical algorithms and the identical equations. There's not a lot of difference, and you go back to the original revolution, you have a series of row houses, you put a power supply all the way down, you can run a bunch of looms. The big issue is entropy - how much heat are you generating? How do you get greater efficiency out of that single power supply? Same thing today: we're worried about the amount of cost, the amount of energy, the amount of administrative overhead associated with using data as an asset, and the faster the database, the more natural it is, the more easy it is to administer, the more easy it is to apply to a lot of different cases, the better. And it's going to be very, very interesting over the next few year to see how - Does database come in memory? Does database stay out over there? A lot of questions are going to be answered in the next couple years as we try to think about where these information transducers actually reside, and how they do their job. >> Yeah, and that's going to be driven yes, partially by the technology, but more importantly by the problems that we're solving. Here we are in New York City - you look at financial services. 
There are two massive factors in financial services going on what is the digital bank of the future look like, and how the banks interact with their customers, and how you get that true one-to-one engagement, which historically has been virtually impossible for companies that have millions or tens of millions of customers, so fundamental transformation of customer engagement driven by these advanced or excelerated analytics engines, and the pair of AI and ML, but then on the other side if you start looking at really incredibly important things for the banks like risk and spread, historically because of the volumes of data, it's been virtually impossible for them to present their employees with a true picture of those things. Now, with these accelerated technologies, you can take all the historic trading data, and all of the real time trading data, smash that together, and run real time analytics to make the right decisions in the moment of interaction with a customer, and that is incredibly powerful for both the customer, but also for the bank in mitigating risk, and they're the sorts of things we're doing with banks up and down the city here in New York, and of course, right around the world. >> So here's a question for you, so with that in mind - this is kind of more of a thought exercise - will banks even be around in 20 years? >> Wow. (laughs) >> I mean, you've got block chains saying we're going to have new crypto models here, if you take this Tesla with ten terabytes going out every second or whatever that number is. If that's the complex problem, banking should be really easy to solve. >> I think it's incumbent on boards in every industry, not just banking, to think about what existential threats exist, because there are incredibly powerful, successful companies that have gone out of existence because of fundamental shifts and buying behaviors or technologies - I think banks need to be concerned. >> Every industry needs to be concerned. >> Every industry needs to be concerned. >> At the end of the day, every board needs to better understand how they can reduce their assets specificities, right? How they can have their assets be more fungible and more applicable or appropriable to multiple different activities? Think about a future where data and digital assets are a dominant feature of business. Asset specificities go down; today their very definition of vertical industry is defined by the assets associated with bottling, the assets associated with flying, the assets associated with any number of other things. As aspect specialist needs to go down because of data, it changes even the definition of industry, let alone banking. >> Yeah, and auto industry's a great example. Will we own cars in the future? Will we confirm them as a service? >> Exactly. >> Car order manufacturers need to come to terms with that. The banks need to come to terms with the fact that the fundamental infrastructure for payments, whether it's domestic or global, will change. I mean, it is going to change. >> It's changing. It's changing. >> It has to change, and it's in the process of changing, and I'm not talking about crypto, you know, what form of digital currency exists in the future, we can argue about forever, but a fundamental underlying platform for real time exchange - that's just the future. Now, what does that mean for banks that rely heavily on payments as part of their core driver of profitability? Now that's a really important thing to come to terms with. 
>> Or going back to the point you made earlier. We may not have banks, but we have bankers. There's still going to be people who are providing advice and counsel, helping folks understand what businesses to buy, what businesses to sell. So whatever industry they're in, we will still have the people that bring the extra taste to the data. >> Okay, we've got to break it there, we've run out of time. Paul, love to chat further about future banking, all this other stuff, and also, as we live in a connected world, what does that mean? We're obviously connected to data; we certainly know there's gonna be a ton of data. We're bringing that to you here, New York City, with Cube NYC. Stay with us for more coverage after the short break. (funky music)

Published Date : Sep 13 2018


Jim Franklin & Anant Chintamaneni | theCUBE NYC 2018


 

>> Live from New York. It's theCUBE. Covering theCUBE New York City, 2018. Brought to you by SiliconANGLE Media, and its ecosystem partners. >> I'm John Furrier with Peter Burris, our next two guests are Jim Franklin, Dell EMC Director of Product Management, and Anant Chintamaneni, who is the Vice President of Products at BlueData. Welcome to theCUBE, good to see you. >> Thanks, John. >> Thank you. >> Thanks for coming on. >> I've been following BlueData since the founding. Great company, and the founders are great. Great teams, so thanks for coming on and sharing what's going on, I appreciate it. >> It's a pleasure, thanks for the opportunity. >> So Jim, talk about the Dell relationship with BlueData. What are you guys doing? You have the Dell-ready solutions. How is that related now, because you've seen this industry with us over the years morph. It's really now about, the set-up days are over, it's about proof points. >> That's right. >> AI and machine learning are driving the signal, which is saying, 'We need results'. There's action on the developer's side, there's action on the deployment, people want ROI, that's the main focus. >> That's right. That's right, and we've seen this journey happen from the new batch processing days, and we're seeing that customer base mature and come along, so the reason why we partnered with BlueData is, you have to have that software, you have to have the containers. They have to have the algorithms, and things like that, in order to make this real. So it's been a great partnership with BlueData, it actually dates back a little farther than some may realize, all the way to 2015, believe it or not, when we used to incorporate BlueData with Isilon. So it's been actually a pretty positive partnership. >> Now we've talked with you guys in the past, you guys were on the cutting edge, this was back when Docker containers were fashionable, but now containers have become so proliferated out there, it's not just Docker, containerization has been the wave. Now, Kubernetes on top of it is really bringing in the orchestration. This is really making the storage and the network so much more valuable with their respective workloads, and AI is a part of that. How do you guys navigate those waters now? What's the BlueData update, how are you guys taking advantage of that big wave? >> I think, great observation - we embraced Docker containers even before Docker was actually formed as a company, and Kubernetes was just getting launched, so we saw the value of Docker containers very early on, in terms of being able to obviously provide the agility, elasticity, but also, from a packaging of applications perspective, as we all know it's a very dynamic environment, and today, I think we are very happy to know that, with Kubernetes being a household name now, especially in tech companies, so the way we're navigating this is, we have a turnkey product, which has containerization, and then now we are taking our value proposition of big data and AI and lifecycle management and bringing it to Kubernetes with an open source project that we launched called KubeDirector under our umbrella. So, we're all about bringing stateful applications like Hadoop, AI, ML to the community and to our customer base, which includes some of the largest financial services and health care customers.
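(To make "bringing stateful clusters like Hadoop and Spark to Kubernetes" concrete, here is a minimal sketch that submits a custom resource describing a small Spark cluster via the official Kubernetes Python client. The group, version, kind, and spec fields are placeholders invented for illustration; they are not KubeDirector's actual schema.)

```python
# Minimal sketch: submit a custom resource describing a stateful Spark
# cluster to Kubernetes. CRD group/kind and spec fields are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
api = client.CustomObjectsApi()

spark_cluster = {
    "apiVersion": "example.bluedata.io/v1",   # placeholder group/version
    "kind": "BigDataCluster",                 # placeholder kind
    "metadata": {"name": "spark-demo", "namespace": "tenant-a"},
    "spec": {
        "app": "spark",
        "roles": [
            {"name": "master", "members": 1},
            {"name": "worker", "members": 3},
        ],
    },
}

api.create_namespaced_custom_object(
    group="example.bluedata.io",
    version="v1",
    namespace="tenant-a",
    plural="bigdataclusters",
    body=spark_cluster,
)
```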
>> So the container revolution has certainly gripped developers, and developers have always had a history of chasing after the next cool technology, and for good reason, it's not like just chasing after... Developers tend not to just chase after the shiny thing, they chase after the most productive thing, and they start using it, and they start learning about it, and they make themselves valuable, and they build more valuable applications as a result. But there's this interesting meshing of creators, makers, in the software world, between the development community and the data science community. How are data scientists, who you must be spending a fair amount of time with, starting to adopt containers, what are they looking at? Are they even aware of this, as you try to help these communities come together? >> We absolutely talk to the data scientists and they're the drivers of determining what applications they want to consume for the different use cases. But, at the end of the day, the person who has to deliver these applications - you know, data scientists care about time to value, getting the environment quickly all prepared so they can access the right data sets. So, in many ways, most of our customers, many of them are unaware that there's actually containers under the hood. >> So this is the data scientists. >> The data scientists, but the actual administrators, the system administrators who are making these tools available, are using containers as a way to accelerate the way they package the software, which has a whole bunch of dependent libraries, and there's a lot of complexity out there. So they're simplifying all that and providing the environment as quickly as possible. >> And in so doing, making sure that whatever workloads are put together can scale, can be combined differently and recombined differently, based on requirements of the data scientists. So the data scientist sees the tool... >> Yeah. >> The tool is manifest, in concert with some of these new container-related technologies, and then the whole CICD process supports the data scientist. >> The other thing to think about though, is that this also allows freedom of choice, and we were discussing off camera before, these developers want to pick out what they want to work with, they don't want to have to be locked in. So with containers, you can also speed that deployment but give them freedom to choose the tools that make them most productive. That'll make them much happier, and probably much more efficient. >> So there's a separation between the data science tools and the developer tools, but they end up all supporting the same basic objective. So how does the infrastructure play in this, because the challenge of big data for the last five years, as John and I both know, is that a lot of people conflated the outcome of data science, the outcome of big data, with the process of standing up clusters and lining up Hadoop, and if they failed on the infrastructure, they said it was a failure overall. So how are you making the infrastructure really simple, and lining it up with this time to value? >> Well, the reality is, we all need food and water. IT still needs server and storage in order to work. But at the end of the day, the abstraction has to be there, just like VMware in the early days, or clouds; containers with BlueData are just another way to create a layer of abstraction.
But this one is in the context of what the data scientist is trying to get done, and that's the key to why we partnered with BlueData and why we delivered big data as a service. >> So at that point, what's the update from Dell EMC and Dell, in particular, Analytics? Obviously you guys work with a lot of customers, have challenges, how are you solving those problems? What are those problems? Because we know there's some AI rumors, big Dell event coming up, there's rumors of a lot of AI involved, I'm speculating there's going to be probably a new kind of hardware device and software. What's the state of the analytics today? >> I think a lot of the customers we talked about, they were born in that batch processing, that Hadoop space we just talked about. I think they largely got that right, they've largely got that figured out, but now we're seeing proliferation of AI tools, proliferation of sandbox environments, and you're starting to see a little bit of silo behavior happening, so what we're trying to do is help that IT shop dispatch those environments with some speed, with some agility. They want to have it at the right economic model as well, so we're trying to strike a better balance, say 'Hey, I've invested in all this infrastructure already, I need to modernize it, and I also need to offer it up in a way that data scientists can consume it'. Oh, by the way, we're starting to see them start to hire more and more of these data scientists. Well, you don't want your data scientists, this very expensive, intelligent resource, sitting there doing data mining, data cleansing, ETL offloads, we want them actually doing modeling and analytics. So we find that a lot of times right now, as you're making an operational change, and as you're starting to hire these very expensive people to do this very good work at the core of the data, they need to get productive in the way that you hired them to be productive. >> So what is this ready solution, can you just explain what that is? Is it a program, is it hardware, is it a solution? What is the ready solution? >> Generally speaking, what we do as a division is we look for value workloads, just generally speaking, not necessarily in batch processing, or AI, or applications, and we try and create an environment that solves that customer challenge, typically they're very complex, SAP, Oracle Database, it's AI, my goodness. Very difficult. >> Variety of tools, using Hive, NoSQL, all this stuff's going on. >> Cassandra, you've got TensorFlow, so we try to fit together a set of knowledge experts, that's the key, the intellectual property of our engineers, and their deep knowledge expertise in a certain area. So for AI, we have a team of them back at the shop, they're in the lab, and this is what they do, and they're serving up these models, they're putting data through its paces, they're doing the work of a data scientist. They are data scientists. >> And so this is where BlueData comes in. You guys are part of this abstraction layer in the ready solutions. Offering? Is that how it works?
>> Yeah, we are the software that enables the self-service experience, the multitenancy, that the consumers of the ready solution would want in terms of being able to onboard multiple different groups of users, lines of business, so you could have a user that wants to run basic Spark clusters, Spark jobs, or you could have another user group that's using TensorFlow, accelerated by a special type of CPU or GPU, and so you can have them all on the same infrastructure. >> One of the things Peter and I were talking about, Dave Vellante, who was here, he's at another event right now getting some content but, one of the things we observed was, we saw this a while ago so it's not new to us but certainly we're seeing the impact at this event. Hadoop World, which is now called Strata Data NYC, is that we hear words like Kubernetes, and Multi Cloud, and Istio for the first time. At this event. This is the impact of the Cloud. The Cloud has essentially leveled the Hadoop World, certainly there's some Hadoop activity going on there, people have clusters, they're standing up analytical infrastructure that does analytics, obviously AI drives that, but now you have the Cloud being a power base. Changing that analytics infrastructure. How has it impacted you guys? BlueData, how are you guys impacted by the Cloud? Tailwind for you guys? Helpful? Good? >> You described it well, it is a tailwind. This space is about the data, not where the data lives necessarily, but the robustness of the data. So whether that's in the Cloud, whether that's on Premise, whether that's on Premise in your own private Cloud, I think anywhere where there's data that can be gathered, modeled, and new insights being pulled out of, this is wonderful, so as we ingest data, whether it's born in the Cloud or born on Premise, this is actually an accelerant to the solutions that we built together. >> As BlueData, we're all in on the Cloud, we support all three major Cloud providers, that was the big announcement that we made this week, we're generally available for AWS, GCP, and Azure, and, in particular, we start with customers who weren't born in the Cloud, so we're talking about some of the large financial services >> We had Barclays UK here who we nominated, they won the Cloudera Data Impact Award, and what they're actually going through right now, is they started on Prem, they have these really packaged certified technology stacks, whether they are Cloudera Hadoop, whether they are Anaconda for data science, and what they're trying to do right now is, they're obviously getting value from that on Premise with BlueData, and now they want to leverage the Cloud. They want to be able to extend into the Cloud. So, we as a company have made our product a hybrid Cloud-ready platform, so it can span on Prem as well as multiple Clouds, and you have the ability to move the workloads from one to the other, depending on data gravity, SLA considerations.
>> I think it's one more thing, I want to test this with you guys, John, and that is, analytics is, I don't want to call it inert, or passive, but analytics has always been about getting the right data to human beings so they can make decisions, and now we're seeing, because of AI, the distinction that we draw between analytics and AI is, AI is about taking action on the data, it's about having a consequential action, as a result of the data, so in many respects, NCL, Kubernetes, a lot of these are not only do some interesting things for the infrastructure associated with big data, but they also facilitate the incorporation of new causes of applications, that act on behalf of the brand. >> Here's the other thing I'll add to it, there's a time element here. It used to be we were passive, and it was in the past, and you're trying to project forward, that's no longer the case. You can do it right now. Exactly. >> In many respects, the history of the computing industry can be drawn in this way, you focused on the past, and then with spreadsheets in the 80s and personal computing, you focused on getting everybody to agree on the future, and now, it's about getting action to happen right now. >> At the moment it happens. >> And that's why there's so much action. We're passed the set-up phase, and I think this is why we're hearing, seeing machine learning being so popular because it's like, people want to take action there's a demand, that's a signal that it's time to show where the ROI is and get action done. Clearly we see that. >> We're capitalists, right? We're all trying to figure out how to make money in these spaces. >> Certainly there's a lot of movement, and Cloud has proven that spinning up an instance concept has been a great thing, and certainly analytics. It's okay to have these workloads, but how do you tie it together? So, I want to ask you, because you guys have been involved in containers, Cloud has certainly been a tailwind, we agree with you 100 percent on that. What is the relevance of Kubernetes and Istio? You're starting to see these new trends. Kubernetes, Istio, Cupflow. Higher level microservices with all kinds of stateful and stateless dynamics. I call it API 2.0, it's a whole other generation of abstractions that are going on, that are creating some goodness for people. What is the impact, in your opinion, of Kubernetes and this new revolution? >> I think the impact of Kubernetes is, I just gave a talk here yesterday, called Hadoop-la About Kubernetes. We were thinking very deeply about this. We're thinking deeply about this. So I think Kubernetes, if you look at the genesis, it's all about stateless applications, and I think as new applications are being written folks are thinking about writing them in a manner that are decomposed, stateless, microservices, things like Cupflow. When you write it like that, Kubernetes fits in very well, and you get all the benefits of auto-scaling, and so control a pattern, and ultimately Kubernetes is this finite state machine-type model where you describe what the state should be, and it will work and crank towards making it towards that state. 
I think it's a little bit harder for stateful applications, and I think that's where we believe that the Kubernetes community has to do a lot more work, and folks like BlueData are going to contribute to that work, which is, how do you bring stateful applications like Hadoop, where there are a lot of interdependent services, they're not necessarily microservices, they're actually almost close to monolithic applications. So I think new applications, new AI and ML tooling that's going to come out, they're going to be very conscious of how they're running in a Cloud world today that folks weren't aware of seven or eight years ago, so it's really going to make a huge difference. And I think things like Istio are going to make a huge difference because you can start in the cloud and maybe now expand on to Prem. So there's going to be some interesting dynamics. >> Without hopping management frameworks, absolutely. >> And this is really critical, you just nailed it. Stateful is where ML will shine, if you can then cross the chasm to the on Premise where the workloads can have state sharing. >> Right. >> Scales beautifully. It's a whole other level. >> Right. You're going to move the data to the action, or the activity, or you're going to have to move the processing to the data, and you nonetheless want to have a common, seamless management and development framework so that you have the choices about where you do those things. >> Absolutely. >> Great stuff. We can do a whole Cube segment just on that. We love talking about these new dynamics going on. We'll see you at CNCF KubeCon coming up in Seattle. Great to have you guys on. Thanks, and congratulations on the relationship between BlueData and Dell EMC and Ready Solutions. This is theCUBE, with Ready Solutions here. New York City, talking about big data and the impact, the future of AI, all things stateful, stateless, Cloud and all. It's theCUBE bringing you all the action. Stay with us for more after this short break.

Published Date : Sep 13 2018


Influencer Panel | theCUBE NYC 2018


 

- [Announcer] Live, from New York, it's theCUBE. Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media, and its ecosystem partners. - Hello everyone, welcome back to CUBE NYC. This is a CUBE special presentation of something that we've done now for the past couple of years. IBM has sponsored an influencer panel on some of the hottest topics in the industry, and of course, there's no hotter topic right now than AI. So, we've got nine of the top influencers in the AI space, and we're in Hell's Kitchen, and it's going to get hot in here. (laughing) And these guys, we're going to cover the gamut. So, first of all, folks, thanks so much for joining us today, really, as John said earlier, we love the collaboration with you all, and we'll definitely see you on social after the fact. I'm Dave Vellante, with my cohost for this session, Peter Burris, and again, thank you to IBM for sponsoring this and organizing this. IBM has a big event down here, in conjunction with Strata, called Change the Game, Winning with AI. We run theCUBE NYC, we've been here all week. So, here's the format. I'm going to kick it off, and then we'll see where it goes. So, I'm going to introduce each of the panelists, and then ask you guys to answer a question, I'm sorry, first, tell us a little bit about yourself, briefly, and then answer one of the following questions. Two big themes that have come up this week. One has been, because this is our ninth year covering what used to be Hadoop World, which kind of morphed into big data. Question is, AI, big data, same wine, new bottle? Or is it really substantive, and driving business value? So, that's one question to ponder. The other one is, you've heard the term, the phrase, data is the new oil. Is data really the new oil? Wonder what you think about that? Okay, so, Chris Penn, let's start with you. Chris is cofounder of Trust Insight, long time CUBE alum, and friend. Thanks for coming on. Tell us a little bit about yourself, and then pick one of those questions. - Sure, we're a data science consulting firm. We're an IBM business partner. When it comes to "data is the new oil," I love that expression because it's completely accurate. Crude oil is useless, you have to extract it out of the ground, refine it, and then bring it to distribution. Data is the same way, where you have to have developers and data architects get the data out. You need data scientists and tools, like Watson Studio, to refine it, and then you need to put it into production, and that's where marketing technologists, technologists, business analytics folks, and tools like Watson Machine Learning help bring the data and make it useful. - Okay, great, thank you. Tony Flath is a tech and media consultant, focus on cloud and cyber security, welcome. - Thank you. - Tell us a little bit about yourself and your thoughts on one of those questions. - Sure thing, well, thanks so much for having us on this show, really appreciate it. My background is in cloud, cyber security, and certainly in emerging tech with artificial intelligence. Certainly touched it from a cyber security play, how you can use machine learning, machine control, for better controlling security across the gamut. But I'll touch on your question about wine, is it a new bottle, new wine? Where does this come from, from artificial intelligence? And I really see it as a whole new wine that is coming along. 
When you look at emerging technology, and you look at all the deep learning that's happening, it's going just beyond being able to machine learn and know what's happening, it's making some meaning to that data. And things are being done with that data, from robotics, from automation, from all kinds of different things, where we're at a point in society where data, our technology is getting beyond us. Prior to this, it's always been command and control. You control data from a keyboard. Well, this is passing us. So, my passion and perspective on this is, the humanization of it, of IT. How do you ensure that people are in that process, right? - Excellent, and we're going to come back and talk about that. - Thanks so much. - Carla Gentry, @DataNerd? Great to see you live, as opposed to just in the ether on Twitter. Data scientist, and owner of Analytical Solution. Welcome, your thoughts? - Thank you for having us. Mine is, is data the new oil? And I'd like to rephrase that is, data equals human lives. So, with all the other artificial intelligence and everything that's going on, and all the algorithms and models that's being created, we have to think about things being biased, being fair, and understand that this data has impacts on people's lives. - Great. Steve Ardire, my paisan. - Paisan. - AI startup adviser, welcome, thanks for coming to theCUBE. - Thanks Dave. So, uh, my first career was geology, and I view AI as the new oil, but data is the new oil, but AI is the refinery. I've used that many times before. In fact, really, I've moved from just AI to augmented intelligence. So, augmented intelligence is really the way forward. This was a presentation I gave at IBM Think last spring, has almost 100,000 impressions right now, and the fundamental reason why is machines can attend to vastly more information than humans, but you still need humans in the loop, and we can talk about what they're bringing in terms of common sense reasoning, because big data does the who, what, when, and where, but not the why, and why is really the Holy Grail for causal analysis and reasoning. - Excellent, Bob Hayes, Business Over Broadway, welcome, great to see you again. - Thanks for having me. So, my background is in psychology, industrial psychology, and I'm interested in things like customer experience, data science, machine learning, so forth. And I'll answer the question around big data versus AI. And I think there's other terms we could talk about, big data, data science, machine learning, AI. And to me, it's kind of all the same. It's always been about analytics, and getting value from your data, big, small, what have you. And there's subtle differences among those terms. Machine learning is just about making a prediction, and knowing if things are classified correctly. Data science is more about understanding why things work, and understanding maybe the ethics behind it, what variables are predicting that outcome. But still, it's all the same thing, it's all about using data in a way that we can get value from that, as a society, in residences. - Excellent, thank you. Theo Lau, founder of Unconventional Ventures. What's your story? - Yeah, so, my background is driving technology innovation. So, together with my partner, what our work does is we work with organizations to try to help them leverage technology to drive systematic financial wellness. We connect founders, startup founders, with funders, we help them get money in the ecosystem. 
We also work with them to look at, how do we leverage emerging technology to do something good for the society. So, very much on point to what Bob was saying about. So when I look at AI, it is not new, right, it's been around for quite a while. But what's different is the amount of technological power that we have allow us to do so much more than what we were able to do before. And so, what my mantra is, great ideas can come from anywhere in the society, but it's our job to be able to leverage technology to shine a spotlight on people who can use this to do something different, to help seniors in our country to do better in their financial planning. - Okay, so, in your mind, it's not just a same wine, new bottle, it's more substantive than that. - [Theo] It's more substantive, it's a much better bottle. - Karen Lopez, senior project manager for Architect InfoAdvisors, welcome. - Thank you. So, I'm DataChick on twitter, and so that kind of tells my focus is that I'm here, I also call myself a data evangelist, and that means I'm there at organizations helping stand up for the data, because to me, that's the proxy for standing up for the people, and the places and the events that that data describes. That means I have a focus on security, data privacy and protection as well. And I'm going to kind of combine your two questions about whether data is the new wine bottle, I think is the combination. Oh, see, now I'm talking about alcohol. (laughing) But anyway, you know, all analogies are imperfect, so whether we say it's the new wine, or, you know, same wine, or whether it's oil, is that the analogy's good for both of them, but unlike oil, the amount of data's just growing like crazy, and the oil, we know at some point, I kind of doubt that we're going to hit peak data where we have not enough data, like we're going to do with oil. But that says to me that, how did we get here with big data, with machine learning and AI? And from my point of view, as someone who's been focused on data for 35 years, we have hit this perfect storm of open source technologies, cloud architectures and cloud services, data innovation, that if we didn't have those, we wouldn't be talking about large machine learning and deep learning-type things. So, because we have all these things coming together at the same time, we're now at explosions of data, which means we also have to protect them, and protect the people from doing harm with data, we need to do data for good things, and all of that. - Great, definite differences, we're not running out of data, data's like the terrible tribbles. (laughing) - Yes, but it's very cuddly, data is. - Yeah, cuddly data. Mark Lynd, founder of Relevant Track? - That's right. - I like the name. What's your story? - Well, thank you, and it actually plays into what my interest is. It's mainly around AI in enterprise operations and cyber security. You know, these teams that are in enterprise operations both, it can be sales, marketing, all the way through the organization, as well as cyber security, they're often under-sourced. And they need, what Steve pointed out, they need augmented intelligence, they need to take AI, the big data, all the information they have, and make use of that in a way where they're able to, even though they're under-sourced, make some use and some value for the organization, you know, make better use of the resources they have to grow and support the strategic goals of the organization. 
And oftentimes, when you get to budgeting, it doesn't really align, you know, you're short people, you're short time, but the data continues to grow, as Karen pointed out. So, when you take those together, using AI to augment, provided augmented intelligence, to help them get through that data, make real tangible decisions based on information versus just raw data, especially around cyber security, which is a big hit right now, is really a great place to be, and there's a lot of stuff going on, and a lot of exciting stuff in that area. - Great, thank you. Kevin L. Jackson, author and founder of GovCloud. GovCloud, that's big. - Yeah, GovCloud Network. Thank you very much for having me on the show. Up and working on cloud computing, initially in the federal government, with the intelligence community, as they adopted cloud computing for a lot of the nation's major missions. And what has happened is now I'm working a lot with commercial organizations and with the security of that data. And I'm going to sort of, on your questions, piggyback on Karen. There was a time when you would get a couple of bottles of wine, and they would come in, and you would savor that wine, and sip it, and it would take a few days to get through it, and you would enjoy it. The problem now is that you don't get a couple of bottles of wine into your house, you get two or three tankers of data. So, it's not that it's a new wine, you're just getting a lot of it. And the infrastructures that you need, before you could have a couple of computers, and a couple of people, now you need cloud, you need automated infrastructures, you need huge capabilities, and artificial intelligence and AI, it's what we can use as the tool on top of these huge infrastructures to drink that, you know. - Fire hose of wine. - Fire hose of wine. (laughs) - Everybody's having a good time. - Everybody's having a great time. (laughs) - Yeah, things are booming right now. Excellent, well, thank you all for those intros. Peter, I want to ask you a question. So, I heard there's some similarities and some definite differences with regard to data being the new oil. You have a perspective on this, and I wonder if you could inject it into the conversation. - Sure, so, the perspective that we take in a lot of conversations, a lot of folks here in theCUBE, what we've learned, and I'll kind of answer both questions a little bit. First off, on the question of data as the new oil, we definitely think that data is the new asset that business is going to be built on, in fact, our perspective is that there really is a difference between business and digital business, and that difference is data as an asset. And if you want to understand data transformation, you understand the degree to which businesses reinstitutionalizing work, reorganizing its people, reestablishing its mission around what you can do with data as an asset. The difference between data and oil is that oil still follows the economics of scarcity. Data is one of those things, you can copy it, you can share it, you can easily corrupt it, you can mess it up, you can do all kinds of awful things with it if you're not careful. And it's that core fundamental proposition that as an asset, when we think about cyber security, we think, in many respects, that is the approach to how we can go about privatizing data so that we can predict who's actually going to be able to appropriate returns on it. 
So, it's a good analogy, but as you said, it's not entirely perfect, but it's not perfect in a really fundamental way. It's not following the laws of scarcity, and that has an enormous effect. - In other words, I could put oil in my car, or I could put oil in my house, but I can't put the same oil in both. - Can't put it in both places. And now, the issue of the wine, I think it's, we think that it is, in fact, it is a new wine, and very simple abstraction, or generalization we come up with is the issue of agency. That analytics has historically not taken on agency, it hasn't acted on behalf of the brand. AI is going to act on behalf of the brand. Now, you're going to need both of them, you can't separate them. - A lot of implications there in terms of bias. - Absolutely. - In terms of privacy. You have a thought, here, Chris? - Well, the scarcity is our compute power, and our ability for us to process it. I mean, it's the same as oil, there's a ton of oil under the ground, right, we can't get to it as efficiently, or without severe environmental consequences to use it. Yeah, when you use it, it's transformed, but our scarcity is compute power, and our ability to use it intelligently. - Or even when you find it. I have data, I can apply it to six different applications, I have oil, I can apply it to one, and that's going to matter in how we think about work. - But one thing I'd like to add, sort of, you're talking about data as an asset. The issue we're having right now is we're trying to learn how to manage that asset. Artificial intelligence is a way of managing that asset, and that's important if you're going to use and leverage big data. - Yeah, but see, everybody's talking about the quantity, the quantity, it's not always the quantity. You know, we can have just oodles and oodles of data, but if it's not clean data, if it's not alphanumeric data, which is what's needed for machine learning. So, having lots of data is great, but you have to think about the signal versus the noise. So, sometimes you get so much data, you're looking at over-fitting, sometimes you get so much data, you're looking at biases within the data. So, it's not the amount of data, it's the, now that we have all of this data, making sure that we look at relevant data, to make sure we look at clean data. - One more thought, and we have a lot to cover, I want to get inside your big brain. - I was just thinking about it from a cyber security perspective, one of my customers, they were looking at the data that just comes from the perimeter, your firewalls, routers, all of that, and then not even looking internally, just the perimeter alone, and the amount of data being pulled off of those. And then trying to correlate that data so it makes some type of business sense, or they can determine if there's incidents that may happen, and take a predictive action, or threats that might be there because they haven't taken a certain action prior, it's overwhelming to them. So, having AI now, to be able to go through the logs to look at, and there's so many different types of data that come to those logs, but being able to pull that information, as well as looking at end points, and all that, and people's houses, which are an extension of the network oftentimes, it's an amazing amount of data, and they're only looking at a small portion today because they know, there's not enough resources, there's not enough trained people to do all that work. So, AI is doing a wonderful way of doing that. 
And some of the tools now are starting to mature and be sophisticated enough where they provide that augmented intelligence that Steve talked about earlier. - So, it's complicated. There's infrastructure, there's security, there's a lot of software, there's skills, and on and on. At IBM Think this year, Ginni Rometty talked about, there were a couple of themes, one was augmented intelligence, that was something that was clear. She also talked a lot about privacy, and you own your data, etc. One of the things that struck me was her discussion about incumbent disruptors. So, if you look at the top five companies, roughly, Facebook with fake news has dropped down a little bit, but top five companies in terms of market cap in the US. They're data companies, all right. Apple just hit a trillion, Amazon, Google, etc. How do those incumbents close the gap? Is that concept of incumbent disruptors actually something that is being put into practice? I mean, you guys work with a lot of practitioners. How are they going to close that gap with the data haves, meaning data at their core of their business, versus the data have-nots, it's not that they don't have a lot of data, but it's in silos, it's hard to get to? - Yeah, I got one more thing, so, you know, these companies, and whoever's going to be big next is, you have a digital persona, whether you want it or not. So, if you live in a farm out in the middle of Oklahoma, you still have a digital persona, people are collecting data on you, they're putting profiles of you, and the big companies know about you, and people that first interact with you, they're going to know that you have this digital persona. Personal AI, when AI from these companies could be used simply and easily, from a personal deal, to fill in those gaps, and to have a digital persona that supports your family, your growth, both personal and professional growth, and those type of things, there's a lot of applications for AI on a personal, enterprise, even small business, that have not been done yet, but the data is being collected now. So, you talk about the oil, the oil is being built right now, lots, and lots, and lots of it. It's the applications to use that, and turn that into something personally, professionally, educationally, powerful, that's what's missing. But it's coming. - Thank you, so, I'll add to that, and in answer to your question you raised. So, one example we always used in banking is, if you look at the big banks, right, and then you look at from a consumer perspective, and there's a lot of talk about Amazon being a bank. But the thing is, Amazon doesn't need to be a bank, they provide banking services, from a consumer perspective they don't really care if you're a bank or you're not a bank, but what's different between Amazon and some of the banks is that Amazon, like you say, has a lot of data, and they know how to make use of the data to offer something as relevant that consumers want. Whereas banks, they have a lot of data, but they're all silos, right. So, it's not just a matter of whether or not you have the data, it's also, can you actually access it and make something useful out of it so that you can create something that consumers want? Because otherwise, you're just a pipe. 
- Totally agree, like, when you look at it from a perspective of, there's a lot of terms out there, digital transformation is thrown out so much, right, and go to cloud, and you migrate to cloud, and you're going to take everything over, but really, when you look at it, and you both touched on it, it's the economics. You have to look at the data from an economics perspective, and how do you make some kind of way to take this data meaningful to your customers, that's going to work effectively for them, that they're going to drive? So, when you look at the big, big cloud providers, I think the push in things that's going to happen in the next few years is there's just going to be a bigger migration to public cloud. So then, between those, they have to differentiate themselves. Obvious is artificial intelligence, in a way that makes it easy to aggregate data from across platforms, to aggregate data from multi-cloud, effectively. To use that data in a meaningful way that's going to drive, not only better decisions for your business, and better outcomes, but drives our opportunities for customers, drives opportunities for employees and how they work. We're at a really interesting point in technology where we get to tell technology what to do. It's going beyond us, it's no longer what we're telling it to do, it's going to go beyond us. So, how we effectively manage that is going to be where we see that data flow, and those big five or big four, really take that to the next level. - Now, one of the things that Ginni Rometty said was, I forget the exact step, but it was like, 80% of the data, is not searchable. Kind of implying that it's sitting somewhere behind a firewall, presumably on somebody's premises. So, it was kind of interesting. You're talking about, certainly, a lot of momentum for public cloud, but at the same time, a lot of data is going to stay where it is. - Yeah, we're assuming that a lot of this data is just sitting there, available and ready, and we look at the desperate, or disparate kind of database situation, where you have 29 databases, and two of them have unique quantifiers that tie together, and the rest of them don't. So, there's nothing that you can do with that data. So, artificial intelligence is just that, it's artificial intelligence, so, they know, that's machine learning, that's natural language, that's classification, there's a lot of different parts of that that are moving, but we also have to have IT, good data infrastructure, master data management, compliance, there's so many moving parts to this, that it's not just about the data anymore. - I want to ask Steve to chime in here, go ahead. - Yeah, so, we also have to change the mentality that it's not just enterprise data. There's data on the web, the biggest thing is Internet of Things, the amount of sensor data will make the current data look like chump change. So, data is moving faster, okay. And this is where the sophistication of machine learning needs to kick in, going from just mostly supervised-learning today, to unsupervised learning. And in order to really get into, as I said, big data, and credible AI does the who, what, where, when, and how, but not the why. And this is really the Holy Grail to crack, and it's actually under a new moniker, it's called explainable AI, because it moves beyond just correlation into root cause analysis. Once we have that, then you have the means to be able to tap into augmented intelligence, where humans are working with the machines. - Karen, please. 
- Yeah, so, one of the things, like what Carla was saying, and what a lot of us had said, I like to think that the advent of ML technologies and AI is going to help me as a data architect to love my data better, right? So, that includes protecting it, but also, when you say that 80% of the data is unsearchable, it's not just an access problem, it's that no one knows what it was, what the sovereignty was, what the metadata was, what the quality was, or why there's huge anomalies in it. So, my favorite story about this is, in the 1980s, about, I forget the exact number, but like, 8 million children disappeared out of the US in April, on April 15th. And that was when the IRS enacted a rule that, in order to have a dependent, a deduction for a dependent on your tax returns, they had to have a valid social security number, and people who had accidentally miscounted their children and over-claimed them (laughter) over the years, stopped doing that. Well, some days it does feel like you have eight children running around. (laughter) - Agreed. - When that rule came about, literally, and they're not all children, because they're dependents, but literally millions of children disappeared off the face of the earth in April, but if you were doing analytics, or AI and ML, and you don't know that this anomaly happened, I can imagine in a hundred years, someone is saying some catastrophic event happened in April, 1983. (laughter) And what caused that, was it healthcare? Was it a meteor? Was it the clown attacking them? - That's where I was going. - Right. So, those are really important things that I want to use AI and ML to help me, not only document and capture that stuff, but to provide that information to the people, the data scientists and the analysts that are using the data. - Great story, thank you. Bob, you got a thought? You got the mic, go, jump in here. - Well, yeah, I do have a thought, actually. I was thinking about what Karen was talking about. I think it's really important that, not only that we understand AI, and machine learning, and data science, but that the regular folks and companies understand that, at the basic level. Because those are the people who will ask the questions, or who know what questions to ask of the data. And if they don't have the tools, and the knowledge of how to get access to that data, or even how to pose a question, then that data is going to be less valuable, I think, to companies. And the more that everybody knows about data, even people in Congress. Remember when Zuckerberg was asked about that? (laughter) - That was scary. - How do you make money? It's like, we all know this. But, we need to educate the masses on just basic data analytics. - We could have an hour-long panel on that. - Yeah, absolutely. - Peter, you and I were talking about, we had a couple of questions, sort of, how far can we take artificial intelligence? How far should we? You know, so that brings ethics and bias into the conversation, why don't you pick it up? - Yeah, so, one of the crucial things that we all are implying is that, at some point in time, AI is going to become a feature of the operations of our homes, our businesses. And as these technologies get more powerful, and they diffuse, and knowledge about how to use them diffuses more broadly, and you put more options into the hands of more people, the question slowly starts to turn from can we do it, to should we do it?
And, one of the issues that I introduce is that I think the difference between big data and AI, specifically, is this notion of agency. The AI will act on behalf of, perhaps you, or it will act on behalf of your business. And that conversation is not being had, today. It's being had in arguments between Elon Musk and Mark Zuckerberg, which pretty quickly get pretty boring. (laughing) At the end of the day, the real question is, should this machine, whether in concert with others, or not, be acting on behalf of me, on behalf of my business, or, and when I say on behalf of me, I'm also talking about privacy. Because Facebook is acting on behalf of me, it's not just what's going on in my home. So, the question of, can it be done? A lot of things can be done, and an increasing number of things will be able to be done. We've got to start having a conversation about should it be done? - So, humans exhibit tribal behavior, they exhibit bias. The machines are going to pick that up, go ahead, please. - Yeah, one thing that sort of tags onto the agency of artificial intelligence. Every industry, every business is now about identifying information and data sources, and their appropriate sinks, and learning how to draw value out of connecting the sources with the sinks. Artificial intelligence enables you to identify those sources and sinks, and when it gets agency, it will be able to make decisions on your behalf about what data is good, what data means, and who it should go to. - What actions are good. - Well, what actions are good. - And what data was used to make those actions. - Absolutely. - And was that the right data, and is there bias of data? And all the way down, turtles all the way down. - So, all this, the data pedigree will be driven by the agency of artificial intelligence, and this is a big issue. - It's really fundamental to understand and educate people on, there are four fundamental types of bias in machine learning. There's intentional bias: "Hey, we're going to make the algorithm generate a certain outcome regardless of what the data says." There's the source of the data itself, the historical data that the models are trained on; a model built on flawed data will behave in a flawed way. There's target source, which is, for example, we know that if you pull data from a certain social network, that network itself has an inherent bias. No matter how representative you try to make the data, it's still going to have flaws in it. Or, if you pull healthcare data about, for example, African-Americans from the US healthcare system, because of societal biases, that data will always be flawed. And then there's tool bias, there's limitations to what the tools can do, and so we will intentionally exclude some kinds of data, or not use it because we don't know how to, our tools are not able to, and if we don't teach people what those biases are, they won't know to look for them, and I know. - Yeah, it's like, one of the things that we were talking about before, I mean, artificial intelligence is not going to just create itself, it's lines of code, it's input, and it spits out output. So, if it learns from these learning sets, we don't want AI to become another buzzword. We don't want everybody to be an "AI guru" that has no idea what AI is. It takes months, and months, and months for these machines to learn. These learning sets are so very important, because that input is how this machine, think of it as your child, and that's basically the way artificial intelligence is learning, like your child.
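(As an aside on the "source of the data" bias in the list above: one very basic sanity check is simply to look at how outcomes are distributed across groups in the training set before any model is built. The sketch below assumes pandas; the column names, values, and threshold are invented for illustration.)

```python
# Minimal sketch: check whether the positive label rate differs sharply
# across groups in the training data - a crude flag for source bias.
import pandas as pd

training = pd.DataFrame({
    "group":    ["a", "a", "a", "b", "b", "b", "b", "b"],
    "approved": [1,   1,   0,   0,   0,   1,   0,   0],
})

rates = training.groupby("group")["approved"].mean()
print(rates)  # per-group approval rate in the training set
if rates.max() - rates.min() > 0.2:
    print("Warning: label rates differ across groups; investigate before training.")
```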
You're feeding it these learning sets, and then eventually it will make its own decisions. So, we know from some of us having children that you teach them the best that you can, but then later on, when they're doing their own thing, they're really, it's like a little myna bird, they've heard everything that you've said. (laughing) Not only the things that you said to them directly, but the things that you said indirectly. - Well, there are some very good AI researchers that might disagree with that metaphor, exactly. (laughing) But, having said that, what I think is very interesting about this conversation is that this notion of bias, one of the things that fascinates me about where AI goes, are we going to find a situation where tribalism more deeply infects business? Because we know that human beings do not seek out the best information, they seek out information that reinforces their beliefs. And that happens in business today. My line of business versus your line of business, engineering versus sales, that happens today, but it happens at a planning level, and when we start talking about AI, we have to put the appropriate dampers, understand the biases, so that we don't end up with deep tribalism inside of business. Because AI could have the deleterious effect that it actually starts ripping apart organizations. - Well, the input is data, and then the output could be a lot of things. - Could be a lot of things. - And that's where I said data equals human lives. So we look at the case in New York where the penal system was using artificial intelligence to make choices on people that were released from prison, and they saw that that was a miserable failure, because the people that were released actually re-offended, some committed murder and other things. So, I mean, it's, it's more than what anybody really thinks. It's not just, oh, well, we'll just train the machines, and a couple of weeks later they're good, we never have to touch them again. These things have to be continuously tweaked. So, just because you built an algorithm or a model doesn't mean you're done. You've got to go back later, and continue to tweak these models. - Mark, you got the mic. - Yeah, no, I think one thing, we've talked a lot about the data that's collected, but what about the data that's not collected? Incomplete profiles, incomplete datasets, that's a form of bias, and sometimes that's the worst. Because they'll fill that in, right, and then you can get some bias, but there's also a real issue for that around cyber security. Logs are not always complete, things are not always done, and when things are like that, people make assumptions based on what they've collected, not what they didn't collect. So, when they're looking at this, and they're using the AI on it, that's only on the data collected, not on the data that wasn't collected. So, if something is down for a little while, and no data's collected off that, the assumption is, well, it was down, or it was impacted, or there was a breach, or whatever, it could be any of those. So, you got to, there's still this human need, there's still the need for humans to look at the data and realize that there is the bias in there, there is, we're just looking at what data was collected, and you're going to have to make your own thoughts around that, and assumptions on how to actually use that data before you go make those decisions that can impact lots of people, at a human level, an enterprise's profitability, things like that.
And too often, people think of AI, when it comes out of there, that's the word. Well, it's not the word. - Can I ask a question about this? - Please. - Does that mean that we shouldn't act? - It does not. - Okay. - So, where's the fine line? - Yeah, I think. - Going back to this notion of can we do it, or should we do it? Should we act? - Yeah, I think you should do it, but you should use it for what it is. It's augmenting, it's helping you, assisting you to make a valued or good decision. And hopefully it's a better decision than you would've made without it. - I think it's great, I think also, your answer's right too, that you have to iterate faster, and faster, and faster, and discover sources of information, or sources of data that you're not currently using, and, that's why this thing starts getting really important. - I think you touch on a really good point about, should you or shouldn't you? You look at Google, and you look at the data that they've been using, and some of that out there, from a digital twin perspective, is not being approved, or not authorized, and even once they've made changes, it's still floating around out there. Where do you know where it is? So, there's this dilemma of, how do you have a digital twin that you want to have, and is going to work for you, and is going to do things for you to make your life easier, to do these things, mundane tasks, whatever? But how do you also control it to do things you don't want it to do? - Ad-based business models are inherently evil. (laughing) - Well, there's incentives to appropriate our data, and so, are things like blockchain potentially going to give users the ability to control their data? We'll see. - No, I, I'm sorry, but that's actually a really important point. The idea of consensus algorithms, whether it's blockchain or not, blockchain includes games, and something along those lines, whether it's Byzantine fault tolerance, or whether it's Paxos, consensus-based algorithms are going to be really, really important. Parts of this conversation, because the data's going to be more distributed, and you're going to have more elements participating in it. And so, something that allows, especially in the machine-to-machine world, which is a lot of what we're talking about right here, you may not have blockchain, because there's no need for a sense of incentive, which is what blockchain can help provide. - And there's no middleman. - And, well, all right, but there's really, the thing that makes blockchain so powerful is it liberates new classes of applications. But for a lot of the stuff that we're talking about, you can use a very powerful consensus algorithm without having a game side, and do some really amazing things at scale. - So, looking at blockchain, that's a great thing to bring up, right. I think what's inherently wrong with the way we do things today, and the whole overall design of technology, whether it be on-prem, or off-prem, is both the lock and key is behind the same wall. Whether that wall is in a cloud, or behind a firewall. So, really, when there is an audit, or when there is a forensics, it always comes down to a sysadmin, or something else, and the system administrator will have the finger pointed at them, because it all resides, you can edit it, you can augment it, or you can do things with it that you can't really determine. Now, take, as an example, blockchain, where you've got really the source of truth. Now you can take and have the lock in one place, and the key in another place. 
So that's certainly going to be interesting to see how that unfolds. - So, one of the things, it's good that, we've hit a lot of buzzwords, right now, right? (laughing) AI, and ML, block. - Bingo. - We got the blockchain bingo, yeah, yeah. So, one of the things is, you also brought up, I mean, ethics and everything, and one of the things that I've noticed over the last year or so is that, as I attend briefings or demos, everyone is now claiming that their product is AI or ML-enabled, or blockchain-enabled. And when you try to get answers to the questions, what you really find out is that some things are being pushed as, because they have if-then statements somewhere in their code, and therefore that's artificial intelligence or machine learning. - [Peter] At least it's not "go-to." (laughing) - Yeah, you're that experienced as well. (laughing) So, I mean, this is part of the thing you try to do as a practitioner, as an analyst, as an influencer, is trying to, you know, the hype of it all. And recently, I attended one where they said they use blockchain, and I couldn't figure it out, and it turns out they use GUIDs to identify things, and that's not blockchain, it's an identifier. (laughing) So, one of the ethics things that I think we, as an enterprise community, have to deal with, is the over-promising of AI, and ML, and deep learning, and recognition. It's not, I don't really consider it visual recognition services if they just look for red pixels. I mean, that's not quite the same thing. Yet, this is also making things much harder for your average CIO, or worse, CFO, to understand whether they're getting any value from these technologies. - Old bottle. - Old bottle, right. - And I wonder if the data companies, like that you talked about, or the top five, I'm more concerned about their nearly, or actual $1 trillion valuations having an impact on their ability of other companies to disrupt or enter into the field more so than their data technologies. Again, we're coming to another perfect storm of the companies that have data as their asset, even though it's still not on their financial statements, which is another indicator whether it's really an asset, is that, do we need to think about the terms of AI, about whose hands it's in, and who's, like, once one large trillion-dollar company decides that you are not a profitable company, how many other companies are going to buy that data and make that decision about you? - Well, and for the first time in business history, I think, this is true, we're seeing, because of digital, because it's data, you're seeing tech companies traverse industries, get into, whether it's content, or music, or publishing, or groceries, and that's powerful, and that's awful scary. - If you're a manger, one of the things your ownership is asking you to do is to reduce asset specificities, so that their capital could be applied to more productive uses. Data reduces asset specificities. It brings into question the whole notion of vertical industry. You're absolutely right. But you know, one quick question I got for you, playing off of this is, again, it goes back to this notion of can we do it, and should we do it? I find it interesting, if you look at those top five, all data companies, but all of them are very different business models, or they can classify the two different business models. Apple is transactional, Microsoft is transactional, Google is ad-based, Facebook is ad-based, before the fake news stuff. Amazon's kind of playing it both sides. 
- Yeah, they're kind of all on a collision course though, aren't they? - But, well, that's what's going to be interesting. I think, at some point in time, the "can we do it, should we do it" question is, brands are going to be identified by whether or not they have gone through that process of thinking about, should we do it, and say no. Apple is clearly, for example, incorporating that into their brand. - Well, Silicon Valley, broadly defined, if I include Seattle, and maybe Armlock, not so much IBM. But they've got a dual disruption agenda, they've always disrupted horizontal tech. Now they're disrupting vertical industries. - I was actually just going to pick up on what she was talking about, we were talking about buzzword, right. So, one we haven't heard yet is voice. Voice is another big buzzword right now, when you couple that with IoT and AI, here you go, bingo, do I got three points? (laughing) Voice recognition, voice technology, so all of the smart speakers, if you think about that in the world, there are 7,000 languages being spoken, but yet if you look at Google Home, you look at Siri, you look at any of the devices, I would challenge you, it would have a lot of problem understanding my accent, and even when my British accent creeps out, or it would have trouble understanding seniors, because the way they talk, it's very different than a typical 25-year-old person living in Silicon Valley, right. So, how do we solve that, especially going forward? We're seeing voice technology is going to be so more prominent in our homes, we're going to have it in the cars, we have it in the kitchen, it does everything, it listens to everything that we are talking about, not talking about, and records it. And to your point, is it going to start making decisions on our behalf, but then my question is, how much does it actually understand us? - So, I just want one short story. Siri can't translate a word that I ask it to translate into French, because my phone's set to Canadian English, and that's not supported. So I live in a bilingual French English country, and it can't translate. - But what this is really bringing up is if you look at society, and culture, what's legal, what's ethical, changes across the years. What was right 200 years ago is not right now, and what was right 50 years ago is not right now. - It changes across countries. - It changes across countries, it changes across regions. So, what does this mean when our AI has agency? How do we make ethical AI if we don't even know how to manage the change of what's right and what's wrong in human society? - One of the most important questions we have to worry about, right? - Absolutely. - But it also says one more thing, just before we go on. It also says that the issue of economies of scale, in the cloud. - Yes. - Are going to be strongly impacted, not just by how big you can build your data centers, but some of those regulatory issues that are going to influence strongly what constitutes good experience, good law, good acting on my behalf, agency. - And one thing that's underappreciated in the marketplace right now is the impact of data sovereignty, if you get back to data, countries are now recognizing the importance of managing that data, and they're implementing data sovereignty rules. Everyone talks about California issuing a new law that's aligned with GDPR, and you know what that meant. There are 30 other states in the United States alone that are modifying their laws to address this issue. - Steve. 
- So, um, so, we got a number of years, no matter what Ray Kurzweil says, until we get to artificial general intelligence. - The singularity's not so near? (laughing) - You know that he's changed the date over the last 10 years. - I did know it. - Quite a bit. And I don't even prognosticate where it's going to be. But really, where we're at right now, I keep coming back to, is that's why augmented intelligence is really going to be the new rage, humans working with machines. One of the hot topics, and the reason I chose to speak about it is, is the future of work. I don't care if you're a millennial, mid-career, or a baby boomer, people are paranoid. As machines get smarter, if your job is routine cognitive, yes, you have a higher propensity to be automated. So, this really shifts a number of things. A, you have to be a lifelong learner, you've got to learn new skillsets. And the dynamics are changing fast. Now, this is also a great equalizer for emerging startups, and even in SMBs. As the AI improves, they can become more nimble. So back to your point regarding colossal trillion dollar, wait a second, there's going to be quite a sea change going on right now, and regarding demographics, in 2020, millennials take over as the majority of the workforce, by 2025 it's 75%. - Great news. (laughing) - As a baby boomer, I try my damnedest to stay relevant. - Yeah, surround yourself with millennials is the takeaway there. - Or retire. (laughs) - Not yet. - One thing I think, this goes back to what Karen was saying, if you want a basic standard to put around the stuff, look at the old ISO 38500 framework. Business strategy, technology strategy. You have risk, compliance, change management, operations, and most importantly, the balance sheet in the financials. AI and what Tony was saying, digital transformation, if it's of meaning, it belongs on a balance sheet, and should factor into how you value your company. All the cyber security, and all of the compliance, and all of the regulation, is all stuff, this framework exists, so look it up, and every time you start some kind of new machine learning project, or data sense project, say, have we checked the box on each of these standards that's within this machine? And if you haven't, maybe slow down and do your homework. - To see a day when data is going to be valued on the balance sheet. - It is. - It's already valued as part of the current, but it's good will. - Certainly market value, as we were just talking about. - Well, we're talking about all of the companies that have opted in, right. There's tens of thousands of small businesses just in this region alone that are opt-out. They're small family businesses, or businesses that really aren't even technology-aware. But data's being collected about them, it's being on Yelp, they're being rated, they're being reviewed, the success to their business is out of their hands. And I think what's really going to be interesting is, you look at the big data, you look at AI, you look at things like that, blockchain may even be a potential for some of that, because of mutability, but it's when all of those businesses, when the technology becomes a cost, it's cost-prohibitive now, for a lot of them, or they just don't want to do it, and they're proudly opt-out. In fact, we talked about that last night at dinner. 
But when they opt-in, the company that can do that, and can reach out to them in a way that is economically feasible, and bring them back in, where they control their data, where they control their information, and they do it in such a way where it helps them build their business, and it may be a generational business that's been passed on. Those kinds of things are going to make a big impact, not only on the cloud, but the data being stored in the cloud, the AI, the applications that you talked about earlier, we talked about that. And that's where this bias, and some of these other things, are going to have a tremendous impact if they're not dealt with now, at least ethically. - Well, I feel like we just got started, and we're out of time. Time for a couple more comments, and then officially we have to wrap up. - Yeah, I had one thing to say, I mean, really, Henry Ford, and the creation of the automobile, back in the early 1900s, changed everything, because now we're no longer stuck in the country, we can get away from our parents, we can date without grandma and grandpa sitting on the porch with us. (laughing) We can take long trips, so now we're looked at, we've sprawled out, we're not all living in the country anymore, and it changed America. So, AI has those same capabilities, it will automate mundane routine tasks that nobody wanted to do anyway. So, a lot of that will change things, but it's not going to be any different than the way things changed in the early 1900s. - It's like you were saying, constant reinvention. - I think that's a great point, let me make one observation on that. Every period of significant industrial change was preceded by the formation, a period of formation, of new assets that nobody knew what to do with. Whether it was, what do we do, you know, industrial manufacturing, it was row houses with long shafts tied to an engine that was coal-fired, and drove a bunch of looms. Same thing, railroads, large factories for Henry Ford, before he figured out how to do an information-based notion of mass production. This is the period of asset formation for the next generation of social structures. - Those chip-makers are going to be all over these cars, I mean, you're going to have augmented reality right there, on your windshield. - Karen, bring it home. Give us the drop-the-mic moment. (laughing) - No pressure. - Your AV guys are not happy with that. So, I think it all comes down to, it's a people problem, a challenge, let's say that. The whole AI ML thing, people, it's a legal compliance thing. Enterprises are going to struggle with trying to meet five billion different types of compliance rules around data and its uses, and enforcement, because ROI is going to mean risk of incarceration as well as return on investment, and we'll have to manage both of those. I think businesses are struggling with a lot of this complexity, and you just opened a whole bunch of questions that we didn't really have solid answers to, "Oh, you can fix it by doing this." So, it's important, as we think of this new world of data focus, data-driven, everything like that, that the entire IT and business community realizes that focusing on data means we have to change how we do things and how we think about it, but we also have some of the same old challenges there. - Well, I have a feeling we're going to be talking about this for quite some time. What a great way to wrap up CUBE NYC here, our third day of activities down here at 37 Pillars, or Mercantile 37. 
Thank you all so much for joining us today. - Thank you. - Really, wonderful insights, really appreciate it, now, all this content is going to be available on theCUBE.net. We are exposing our video cloud, and our video search engine, so you'll be able to search our entire corpus of data. I can't wait to start searching and clipping up this session. Again, thank you so much, and thank you for watching. We'll see you next time.

Published Date : Sep 13 2018


Yaron Haviv, Iguazio | theCUBE NYC 2018


 

Live from New York, it's theCUBE! Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hey welcome back, and we're live in theCUBE in New York City. It's our 2nd day of two days of coverage, CUBE NYC. The hashtag CUBENYC. Formerly Big Data NYC, renamed because it's about big data, it's about serverless, it's about Kubernetes and multi-cloud data. It's all about data, and that's the fundamental change in the industry. Our next guest is Yaron Haviv, who's the CTO of Iguazio, a CUBE alumni, always coming out with some good commentary, smart analysis. Kind of a guest host as well as an industry participant supplier. Welcome back to theCUBE. Good to see you. >> Thank you John. >> Love having you on theCUBE because you always bring some good insight and we appreciate that. Thank you so much. First, before we get into some of the comments, because I really want to delve into comments that David Richards said a few years ago, CEO of WANdisco. He said, "Cloud's going to kill Hadoop". And people were looking at him like, "Oh my God, who is this heretic? He's crazy. What is he talking about?" But you might not need Hadoop, if you can run serverless Spark, TensorFlow... You talk about this off camera. Is Hadoop going to be the OpenStack of the big data world? >> I don't think cloud necessarily killed Hadoop, although it is working on that, you know, because you go to Amazon and, you know, you can consume a bunch of services and you don't really need to think about Hadoop. I think cloud-native services are starting to kill Hadoop, 'cause Hadoop is three layers, you know: it's a file system, HDFS, then you have resource scheduling, YARN, then you have applications, starting with MapReduce, and then you evolve into things like Spark. Okay, so, the file system I don't really need in the cloud. I use S3, I can use a database as a service, as you know, a pretty efficient way of storing data. For scheduling, Kubernetes is a much more generic way of scheduling workloads, and not confined to Spark and specific workloads. I can run with TensorFlow, I can run with data science tools, etc., just containerized. So essentially, why would I need Hadoop? If I can take the traditional tools people are now evolving in and using, like Jupyter Notebooks, Spark, TensorFlow, you know, those packages, with Kubernetes on top of a database as a service and some object store, I have a much easier stack to work with. And I could mobilize that whether it's in the cloud, you know, on different vendors. >> Scale is important too. How do you scale it? >> Of course, you have independent scaling between data and computation, unlike Hadoop. So I can just go to Google and use BigQuery, or use, you know, DynamoDB on Amazon or Redshift, or whatever, and automatically scale it down, and then, you know >> That's a unique position, so essentially, Hadoop versus Kubernetes is a top-line story. And wouldn't that be ironic for Google, because Google essentially created MapReduce and Cloudera ran with it and went public, but we're talking about the 2008 timeframe, 2009 timeframe, back when the cloud was just emerging into the mainstream. So wouldn't it be ironic if Kubernetes, which is being driven by Google, ends up taking over Hadoop? In terms of running things on Kubernetes and cloud, vis-a-vis on-premise with Hadoop. >> People tend to give Google the credit for this, but essentially Yahoo started Hadoop. 
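To make the stack Yaron describes concrete, here is a minimal sketch of the "Hadoop-less" combination he outlines: object storage in place of HDFS, Kubernetes in place of YARN, and containerized Spark for processing. It is only an illustration; the cluster endpoint, container image, and bucket name are hypothetical, and it assumes PySpark is installed with the S3A connector jars on the classpath.

# Minimal sketch: object store + Kubernetes + containerized Spark, no Hadoop cluster.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hadoop-less-analytics")
    # Schedule on a Kubernetes API server instead of YARN (hypothetical endpoint).
    .master("k8s://https://my-cluster.example.com:6443")
    .config("spark.kubernetes.container.image", "my-registry/spark-py:3.5.0")
    .getOrCreate()
)

# Read directly from object storage; there is no HDFS anywhere in the picture.
events = spark.read.parquet("s3a://example-bucket/events/")
events.groupBy("event_type").count().show()

spark.stop()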
Google started the technology, and a couple of years after Hadoop started, Google essentially moved to a different architecture, with something called Percolator. So Google's not too associated with Hadoop; they haven't really used this approach in a long time. >> Well, they wrote the MapReduce paper, and the internal conversations we report on theCUBE about Google were, they just let that go. And Yahoo grabbed it. (cross-conversation) >> The companies that had the most experience were the first to leave. And I think that, in many respects, is what you're saying. As the marketplace realizes the outcomes that Hadoop is associated with, they will find other ways of achieving those outcomes. >> There's also a fundamental shift in the consumption, where Hadoop was about ranking pages in batch form. You know, just collecting logs and ranking pages, okay. The challenges that people have today revolve around applying AI to business applications. It needs to be a lot more concurrent, transactional, real-time-ish, you know? It's nothing to do with Hadoop, okay? So that's why you'll see more and more workloads mobilizing into serverless functions, pre-canned services, etc. And Kubernetes, playing a good role here, is providing the transport for migrating workloads across cloud providers, because I can use GKE, the Google Kubernetes, or Amazon Kubernetes, or Azure Kubernetes, and I could write a similar application and deploy it on any cloud, or on-prem on my own private cluster. It makes the infrastructure agnostic, really application-focused. >> Question about Kubernetes we heard on theCUBE earlier: the VP of Product at BlueData said that the Kubernetes ecosystem and community needs to do a better job with state. They nailed stateless; stateful application support is something that they need help on. Do you agree with that comment, and if so, what alternatives do you have for customers who care about state? >> They should use our product (laughing) >> (mumbling) Is Kubernetes struggling there? And if so, talk about your product >> So, I think the challenge around that is that there are many solutions in that space. I think that they are attacking it from a different approach. Many of them are essentially providing some block storage to different containers, which isn't really cloud-native. What you want to be able to do is have multiple containers access the same data. That means sharing, either through file systems, through objects, or through databases, because one container is generating, for example, ingestion or __________. Another container is manipulating that same data. A third container may look for something in the data, and generate a trigger or an action. So you need shared access to data from those containers. >> The rest of the data synchronizes all three of those things. >> Yes, because the data is the form of state, and the form of state cannot be locked to a single container. This is where I am very active in those committees, and you have all the storage guys in the committees, and they think block storage is the solution, 'cause they still think like virtual machines, okay? But the general idea is that, if you think about it, Kubernetes is like the new OS, where you have many processes, they're just scattered around. In an OS, the way for us to share state between processes is either through files or through databases, in those forms. And that's really what >> Threads and databases as a positive engagement. 
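To illustrate the shared-access pattern being described, here is a minimal sketch in which three "containers" reach the same data through a shared data service rather than container-local block storage. Redis is used purely as a stand-in for any shared store; the service host, key names, and record fields are hypothetical.

# One container ingests, a second manipulates the same data, a third scans it
# and generates a trigger: the shared state lives in the data service, not the pods.
import json
from typing import Optional

import redis

store = redis.Redis(host="shared-data-service", port=6379, decode_responses=True)

def ingest_container(reading: dict) -> None:
    """Container 1: push raw readings onto a shared stream."""
    store.rpush("sensor-stream", json.dumps(reading))

def enrich_container() -> Optional[dict]:
    """Container 2: pop a reading and manipulate that same data."""
    raw = store.lpop("sensor-stream")
    if raw is None:
        return None
    reading = json.loads(raw)
    reading["fahrenheit"] = reading["celsius"] * 9 / 5 + 32
    store.hset("latest", reading["sensor_id"], json.dumps(reading))
    return reading

def trigger_container(threshold: float = 90.0) -> list:
    """Container 3: look for something in the shared data and raise an action."""
    alerts = []
    for sensor_id, raw in store.hgetall("latest").items():
        if json.loads(raw)["fahrenheit"] > threshold:
            alerts.append(f"ALERT {sensor_id}: overheating")
    return alerts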
>> So essentially I gave, maybe two years ago, a session at KubeCon in Europe about what we're doing on storing state. It's really high-performance access from those container processes to our database, in the form of objects, files, streams, or time series data, etc. And then essentially, all those workloads just mount on top of it, and they can all share state. We can even control the access for each. >> Do you think you nailed the state problem? >> Yes. By the way, we have a managed service. Anyone could go today to our cloud, to our website, that's in our cloud. It gets its own Kubernetes cluster, provisioned within less than 10 minutes, five to 10 minutes, with all of those services pre-integrated: Spark, Presto, ______________, real-time, these services, functions. All that pre-configured on its own. I figured all of these- >> 100% compatible with Kubernetes, it's a good investment >> Well, we're just expanding it to the Kubernetes providers. Now it's working on Amazon Kubernetes, EKS I think; we're working on AKS and GKE. We partner with Azure and Google. And we're also building an edge solution that is essentially exactly the same stack. It can run on an edge appliance in a factory. You can essentially mobilize data and functions back and forth. So you can go and develop your workloads, your application in the cloud, test it under simulation, push a single button, and teleport the artifacts into the edge factory. >> So is it like a real-time Kubernetes? >> Yes, it's a real-time Kubernetes. >> If you _______like the things we're doing, it's all real-time. >> Talk about real-time in the database world, because you mentioned time-series databases. You gave object store versus block. Talk about time series. You're talking about data that is very relevant in the moment. And also understanding time series data. And then, it's important post-event, if you will, meaning: How do you store it? Do you care? I mean, it's important to manage the time series. At the same time, it might not be as valuable as other data, or valuable at certain points in time, which changes its relationship to how it's stored and how it's used. Talk about the dynamic of time series. >> We figured out in the last six or 12 months that real-time is essentially about time series. Everything you think about, real-time sensor data, even video, is a time series of frames, okay. And what everyone has is just huge amounts of time series. They want to cross-correlate it, because, for example, you think about stock tickers, you know, the stock has an impact from news feeds or Twitter feeds, about a company or a segment. So essentially, what they need to do is something called multivariate analysis of multiple time series, to be able to extract some meaning, and then decide if you want to sell or buy a stock, as an example. And there is a huge gap in the solutions in that market, because most of the time series databases were designed for operational databases, you know, things that monitor apps. Nothing that ingests millions of data points per second, and cross-correlates and runs real-time AI analytics. Ah, so we've essentially extended, because we have a programmable database essentially under the hood. We've extended it to support time series data with about a 50 to 1 compression ratio, compared to some other solutions. You know, we worked with a customer, we've done sizing; they told us they need half a petabyte. 
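A minimal sketch of the cross-correlation idea just described: line up two time series, a synthetic news-sentiment score and synthetic price returns, and find the lag at which one best explains the other. The data, window, and lag range are all invented for illustration; a real system would do this in-flight over live feeds.

# Multivariate time series toy example: which lag of sentiment best matches returns?
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
idx = pd.date_range("2018-09-13 09:30", periods=390, freq="min")

sentiment = pd.Series(rng.normal(0, 1, len(idx)), index=idx).rolling(5).mean()
# Synthetic returns that loosely follow sentiment from 10 minutes earlier.
returns = 0.3 * sentiment.shift(10).fillna(0) + rng.normal(0, 1, len(idx))

def lagged_corr(a: pd.Series, b: pd.Series, max_lag: int = 30) -> pd.Series:
    """Correlation of b against a shifted by 0..max_lag samples."""
    return pd.Series({lag: a.shift(lag).corr(b) for lag in range(max_lag + 1)})

corr = lagged_corr(sentiment, returns)
print("best lag (minutes):", corr.idxmax(), "correlation:", round(corr.max(), 3))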
After a small sizing exercise, about 10 to 20 terabytes of storage for the same data they stored in Cassandra at 500 terabytes. Now, huge ingestion rates, and, what's very important, we can do all those cross-correlations in-flight, so, that's something that's working very well for us. >> This could help on smart mobility. As 5G comes on, certainly. Intelligent edge. >> So the customers we have, the use cases that we're applying right now are in financial services, two or three main applications. One is tick data and analytics; everyone wants to be smarter on how to buy and sell stocks or manage risk. The second one is infrastructure monitoring, critical infrastructure monitoring, SLA monitoring: being able to monitor network devices, latencies, applications, you know, transaction rates, and be able to predict potential failures or escalations. We have similar applications; we have about three telco customers using it for real-time time series analytics on metric data, cybersecurity attacks, congestion avoidance, SLA management. And also automotive: fleet management, vehicle linking, they are also essentially feeding huge data sets of time series. They're running cross-correlation and AI logic, so now they can generate triggers. Now, apply that to Hadoop. What does Hadoop have to do with those kinds of applications? It cannot feed huge amounts of data, it cannot react in real time, it doesn't store time series efficiently. >> Hapoop (laughing) >> You said that. >> Yeah. That's good. >> One, I know we don't have a lot of time left. We're running out of time, but I want to make sure we get this out here. How are you engaging with customers? You guys got great technical support. We can vouch for the tech chops that you guys have. We've seen the solution. If it's compatible with Kubernetes, certainly this is an alternative to have really great analytical infrastructure. Cloud-native, the goodness of what you're building. You do POCs, they go to your website, and how do you engage, how do you get deals? How do people work with you? >> So, because now we have a cloud service, we also engage through the cloud. Mainly, we're going after customers and leads, from webinars and activities on the internet, and we sort of follow up with those customers, you know >> Direct sales? >> Direct sales, but through a lead generation mechanism. Marketplace activity, Amazon, Azure, >> Partnerships with Azure and Google now. And Azure joint selling activities. They can actually resell and get compensated. Our solution is an edge solution for Azure. Working on a similar solution for Google. Very focused on retailers. That's the current market focus, since, if you think about stores, a single supermarket will have more than 1,000 cameras. Okay, just because they're monitoring shelves in real time. Think about an Amazon Go kind of replication, real-time inventory management. You cannot push 1,000 camera feeds into the cloud in order to analyze them and then decide on inventory levels and proactive action. So, those are the kinds of applications. >> So bigger deals, you've had some big deals. >> Yes, we're really not a Raspberry Pi kind of solution. That's where the bigger customers >> Got it. Yaron, thank you so much. The CTO of Iguazio. Check him out. It's actually been great commentary, the Hadoop versus Kubernetes narrative. Love to explore that further with you. Stay with us for more coverage after this short break. We're live in day 2 of CUBE NYC. Part Strata, Strata Hadoop World. 
CUBE Hadoop World, whatever you want to call it. It's all because of the data. We'll bring it to ya. Stay with us for more after this short break. (upbeat music)

Published Date : Sep 13 2018


Mick Hollison, Cloudera | theCUBE NYC 2018


 

(lively peaceful music) >> Live, from New York, it's The Cube. Covering "The Cube New York City 2018." Brought to you by SiliconANGLE Media and its ecosystem partners. >> Well, everyone, welcome back to The Cube special conversation here in New York City. We're live for Cube NYC. This is our ninth year covering the big data ecosystem, now evolved into AI, machine learning, cloud. All things data in conjunction with Strata Conference, which is going on right around the corner. This is the Cube studio. I'm John Furrier. Dave Vellante. Our next guest is Mick Hollison, who is the CMO, Chief Marketing Officer, of Cloudera. Welcome to The Cube, thanks for joining us. >> Thanks for having me. >> So Cloudera, obviously we love Cloudera. Cube started in Cloudera's office, (laughing) everyone in our community knows that. I keep, keep saying it all the time. But we're so proud to have the honor of working with Cloudera over the years. And, uh, the thing that's interesting though is that the new building in Palo Alto is right in front of the old building where the first Palo Alto office was. So, a lot of success. You have a billboard in the airport. Amr Awadallah is saying, hey, it's a milestone. You're in the airport. But your business is changing. You're reaching new audiences. You have, you're public. You guys are growing up fast. All the data is out there. Tom's doing a great job. But, the business side is changing. Data is everywhere, it's a big, hardcore enterprise conversation. Give us the update, what's new with Cloudera. >> Yeah. Thanks very much for having me again. It's, it's a delight. I've been with the company for about two years now, so I'm officially part of the problem now. (chuckling) It's been a, it's been a great journey thus far. And really the first order of business when I arrived at the company was, like, welcome aboard. We're going public. Time to dig into the S-1 and reimagine who Cloudera is going to be five, ten years out from now. And we spent a good deal of time, about three or four months, actually crafting what turned out to be just 38 total words and kind of a vision and mission statement. But the, the most central to those was what we were trying to build. And it was a modern platform for machine learning analytics in the cloud. And, each of those words, when you unpack them a little bit, are very, very important. And this week, at Strata, we're really happy on the modern platform side. We just released Cloudera Enterprise Six. It's the biggest release in the history of the company. There are now over 30 open-source projects embedded into this, something that Amr and Mike could have never imagined back in the day when it was just a couple of projects. So, a very very large and meaningful update to the platform. The next piece is machine learning, and Hilary Mason will be giving the kickoff tomorrow, and she's probably forgotten more about ML and AI than somebody like me will ever know. But she's going to give the audience an update on what we're doing in that space. But, the foundation of having that data management platform, is absolutely fundamental and necessary to do good machine learning. Without good data, without good data management, you can't do good ML or AI. Sounds sort of simple but very true. And then the last thing that we'll be announcing this week, is around the analytics space. So, on the analytic side, we announced Cloudera Data Warehouse and Altus Data Warehouse, which is a PaaS flavor of our new data warehouse offering. 
And last, but certainly not least, is just the "optimize for the cloud" bit. So, everything that we're doing is optimized not just around a single cloud but around multi-cloud, hybrid-cloud, and really trying to bridge that gap for enterprises and what they're doing today. So, it's a new Cloudera to say the very least, but it's all still based on that core foundation and platform that, you got to know it, with very early on. >> And you guys have operating history too, so it's not like it's a pivot for Cloudera. I know for a fact that you guys had very large-scale customers, both with three letter, letters in them, the government, as well as just commercial. So, that's cool. Question I want to ask you is, as the conversation changes from, how many clusters do I have, how am I storing the data, to what problems am I solving because of the enterprises. There's a lot of hard things that enterprises want. They want compliance, all these, you know things that have either legacy. You guys work on those technical products. But, at the end of the day, they want the outcomes, they want to solve some problems. And data is clearly an opportunity and a challenge for large enterprises. What problems are you guys going after, these large enterprises in this modern platform? What are the core problems that you guys knock down? >> Yeah, absolutely. It's a great question. And we sort of categorize the way we think about addressing business problems into three broad categories. We use the terms grow, connect, and protect. So, in the "grow" sense, we help companies build or find new revenue streams. And, this is an amazing part of our business. You see it in everything from doing analytics on clickstreams and helping people understand what's happening with their web visitors and the like, all the way through to people standing up entirely new businesses based simply on their data. One large insurance provider that is a customer of ours, as an example, has taken on the challenge and asked us to engage with them on building really, effectively, insurance as a service. So, think of it as data-driven insurance rates that are gauged based on your driving behaviors in real time. So no longer simply just using demographics as the way that you determine, you know, all 18-year old young men are poor drivers. As it turns out, with actual data you can find out there's some excellent 18 year olds. >> Telematic, not demographics! >> Yeah, yeah, yeah, exactly! >> That Tesla don't connect to the >> Exactly! And Parents will love this, love this as well, I think. So they can find out exactly how their kids are really behaving by the way. >> They're going to know I rolled through the stop signs in Palo Alto. (laughing) My rates just went up. >> Exactly, exactly. So, so helping people grow new businesses based on their data. The second piece is "Connect". This is not just simply connecting devices, but that's a big part of it, so the IOT world is a big engine for us there. One of our favorite customer stories is a company called Komatsu. It's a mining manufacturer. Think of it as the ones that make those, just massive mines that are, that are all over the world. They're particularly big in Australia. And, this is equipment that, when you leave it sit somewhere, because it doesn't work, it actually starts to sink into the earth. So, being able to do predictive maintenance on that level and type and expense of equipment is very valuable to a company like Komatsu. We're helping them do that. So that's the "Connect" piece. 
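A hedged, illustrative sketch of the predictive-maintenance idea in the Komatsu story above: fit a trend to a sensor reading that drifts as a component wears, and estimate how many operating hours remain before it crosses a failure threshold. Every number here is invented, and this is not Komatsu's or Cloudera's actual model.

# Toy remaining-useful-life estimate from a slowly drifting vibration sensor.
import numpy as np

rng = np.random.default_rng(1)
hours = np.arange(0, 500)
vibration = 0.8 + 0.004 * hours + rng.normal(0, 0.05, hours.size)  # slow wear drift
FAILURE_THRESHOLD = 3.0  # illustrative alarm level

slope, intercept = np.polyfit(hours, vibration, 1)  # linear trend fit
hours_to_failure = (FAILURE_THRESHOLD - intercept) / slope - hours[-1]
print(f"schedule maintenance in roughly {hours_to_failure:.0f} operating hours")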
And last is "Protect". Since data is in fact the new oil, the most valuable resource on earth, you really need to be able to protect it. Whether that's from a cyber security threat or it's just meeting compliance and regulations that are put in place by governments. Certainly GDPR is got a lot of people thinking very differently about their data management strategies. So we're helping a number of companies in that space as well. So that's how we kind of categorize what we're doing. >> So Mick, I wonder if you could address how that's all affected the ecosystem. I mean, one of the misconceptions early on was that Hadoop, Big Data, is going to kill the enterprise data warehouse. NoSQL is going to knock out Oracle. And, Mike has always said, "No, we are incremental". And people are like, "Yeah, right". But that's really, what's happened here. >> Yes. >> EDW was a fundamental component of your big data strategies. As Amr used to say, you know, SQL is the killer app for, for big data. (chuckling) So all those data sources that have been integrated. So you kind of fast forward to today, you talked about IOT and The Edge. You guys have announced, you know, your own data warehouse and platform as a service. So you see this embracing in this hybrid world emerging. How has that affected the evolution of your ecosystem? >> Yeah, it's definitely evolved considerably. So, I think I'd give you a couple of specific areas. So, clearly we've been quite successful in large enterprises, so the big SI type of vendors want a, want a piece of that action these days. And they're, they're much more engaged than they were early days, when they weren't so sure all of this was real. >> I always say, they like to eat at the trough and then the trough is full, so they dive right in. (all laughing) They're definitely very engaged, and they built big data practices and distinctive analytics practices as well. Beyond that, sort of the developer community has also begun to shift. And it's shifted from simply people that could spell, you know, Hive or could spell Kafka and all of the various projects that are involved. And it is elevated, in particular into a data science community. So one of additional communities that we sort of brought on board with what we're doing, not just with the engine and SPARK, but also with tools for data scientists like Cloudera Data Science Workbench, has added that element to the community that really wasn't a part of it, historically. So that's been a nice add on. And then last, but certainly not least, are the cloud providers. And like everybody, they're, those are complicated relationships because on the one hand, they're incredibly valuable partners to it, certainly both Microsoft and Amazon are critical partners for Cloudera, at the same time, they've got competitive offerings. So, like most successful software companies there's a lot of coopetition to contend with that also wasn't there just a few years ago when we didn't have cloud offerings, and they didn't have, you know, data warehouse in the cloud offerings. But, those are things that have sort of impacted the ecosystem. >> So, I've got to ask you a marketing question, since you're the CMO. By the way, great message UL. I like the, the "grow, connect, protect." I think that's really easy to understand. >> Thank you. >> And the other one was modern. The phrase, say the phrase again. >> Yeah. It's the "Cloudera builds the modern platform for machine learning analytics optimized for the cloud." >> Very tight mission statement. 
Question on the name. Cloudera. >> Mmhmm. >> It's spelled, it's actually cloud with ERA in the letters, so "the cloud era." People use that term all the time. We're living in the cloud era. >> Yes. >> Cloud-native is the hottest market right now in the Linux Foundation. The CNCF has over two hundred and forty members and growing. Cloud-native has clearly indicated that the new, modern developers are here in the renaissance of software development, in general, enterprises want more developers. (laughs) Not that you want to be against developers, because, clearly, they're going to hire developers. >> Absolutely. >> And you're going to enable that. And then you've got the, obviously, cloud-native on-premise dynamic. Hybrid cloud and multi-cloud. So are there plans to think about that cloud era, is it a cloud positioning? You see cloud certainly important in what you guys do, because the cloud creates more compute, more capabilities to move data around. >> Sure. >> And (laughs) process it. And make it, make machine learning go faster, which gives more data, more AI capabilities, >> It's the flywheel you and I were discussing. >> It's the flywheel of, what's the innovation sandwich, Dave? You know? (laughs) >> A little bit of data, a little bit of machine intelligence, in the cloud. >> So, the innovation's in play. >> Yeah, absolutely. >> Positioning around cloud. How are you looking at that? >> Yeah. So, it's a fascinating story. You were with us in the earliest days, so you know that the original architecture of everything that we built was intended to be run in the public cloud. It turns out, in 2008, there were exactly zero customers that wanted all of their data in a public cloud environment. So the company actually pivoted and re-architected the original design of the offerings to work on-prem. And, no sooner did we do that, than it was time to re-architect it yet again. And we are right in the midst of doing that. So, we really have offerings that span the whole gamut. If you want to just pick up your whole current Cloudera environment in an infrastructure as a service model, we offer something called Altus Director that allows you to do that. Just pick up the entire environment, ship it up onto AWS, or Microsoft Azure, and off you go. If you want the convenience and the elasticity and the ease of use of a true platform as a service, just this past week we announced Altus Data Warehouse, which is a platform as a service kind of a model. For data warehousing, we have the data engineering module for Altus as well. Last, but not least, is everybody's not going to sign up for just one cloud vendor. So we're big believers in multi-cloud. And that's why we support the major cloud vendors that are out there. And, in addition to that, it's going to be a hybrid world for as far out as we can see it. People are going to have certain workloads that, either for economics or for security reasons, they're going to continue to want to run in-house. And they're going to have other workloads, certainly more transient workloads, and I think ML and data science will fall into this camp, where the public cloud's going to make a great deal of sense. And, allowing companies to bridge that gap while maintaining one security, compliance, and management model, something we call a Shared Data Experience, is really our core differentiator as a business. That's at the very core of what we do. >> Classic cloud workload experience that you're bringing, whether it's on-prem or whatever cloud. >> That's right. 
>> Cloud is an operating environment for you guys. You look at it just as >> The delivery mechanism. In effect. Awesome. All right, future for Cloudera. What can you share with us. I know you're a public company. Can't say any forward-looking statements. Got to do all those disclaimers. But for customers, what's the, what's the North Star for Cloudera? You mentioned going after a much more hardcore enterprise. >> Yes. >> That's clear. What's the North Star for you guys when you talk to customers? What's the big pitch? >> Yeah. I think there's a, there's a couple of really interesting things that we learned about our business over the course of the past six, nine months or so here. One, was that the greatest need for our offerings is in very, very large and complex enterprises. They have the most data, not surprisingly. And they have the most business gain to be had from leveraging that data. So we narrowed our focus. We have now identified approximately five thousand global customers, so think of it as kind of Fortune or Forbes 5000. That is our sole focus. So, we are entirely focused on that end of the market. Within that market, there are certain industries that we play particularly well in. We're incredibly well-positioned in financial services. Very well-positioned in healthcare and telecommunications. Any regulated industry, that really cares about how they govern and maintain their data, is really the great target audience for us. And so, that continues to be the focus for the business. And we're really excited about that narrowing of focus and what opportunities that's going to build for us. To not just land new customers, but more to expand our existing ones into a broader and broader set of use cases. >> And data is coming down faster. There's more data growth than ever seen before. It's never stopping.. It's only going to get worse. >> We love it. >> Bring it on. >> Any way you look at it, it's getting worse or better. Mick, thanks for spending the time. I know you're super busy with the event going on. Congratulations on the success, and the focus, and the positioning. Appreciate it. Thanks for coming on The Cube. >> Absolutely. Thank you gentlemen. It was a pleasure. >> We are Cube NYC. This is our ninth year doing all action. Everything that's going on in the data world now is horizontally scaling across all aspects of the company, the society, as we know. It's super important, and this is what we're talking about here in New York. This is The Cube, and John Furrier. Dave Vellante. Be back with more after this short break. Stay with us for more coverage from New York City. (upbeat music)

Published Date : Sep 13 2018


John Mracek & Peter Smails, Imanis Data | theCUBE NYC 2018


 

Live from New York, it's theCUBE. Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> I'm John Furrier with Dave Vellante. We're here nine years, our nine years of coverage, two days live in New York City, and our next two guests: John Mracek, CEO of Imanis Data, and Peter Smails, CMO. Good to see you again, welcome back. >> Thank you, glad to be here guys. >> So obviously this show, we've been here nine years, we were the first original Hadoop World, and we've seen a change. Hadoop was gonna change the world. It kind of didn't, but we get the idea; it did and it didn't, but it did change our world. It brought open source and the notion of low-cost hardware into the big data game, and then big data became so much more powerful around these new tools. But then the cloud comes in full throttle, and now you can get horsepower, that compute, you can stand up infrastructure for analytics, and all this data goodness starts to change. Machine learning then becomes the real utility that's showing this demand for using data right now, not just setting up to use data. This is a fundamental big trend, so I want to get you guys' reaction. Do you see this evolving more cloud-like? How do you guys see the trend, as data science is certainly becoming more mainstream, from productivity users to hardcore users, and then you've got cloud-native developers doing things like Kubernetes, we've heard Kubernetes here. Cloud and data science together, what's going on? What's your view of the market? >> So I came from a company that was in ad tech, and we were built on big data, and in looking at how big data has evolved, and the movement towards analytics and machine learning really being enabled by big data, people have rushed to build these solutions, and they've done a great job, but it was always about, what's the solution to my problem, how do I leverage this data? And they built out these platforms, and in our context, what we've seen is that enterprises get to a certain point where they say, okay, I've got all these different stacks that have been built, these apps that have been built to solve my BI and analytics problems, but how do I manage all these? And that's what I encountered at my last company, where we built everything ourselves, and then, so wait a minute. What we see at an enterprise level is fascinating, because when I go to a large company I go, you know, we work with NoSQL databases and Hadoop, and, you know, how much Couchbase do you have, how much Mongo, etc., and the inevitable answer is yes, and five of each, right? And they're getting to this point where, I've got all this distributed data, distributed across my organization, how am I going to actually manage it, and make sure that that data is protected, that I can migrate to the cloud or in a hybrid cloud environment? And all these questions start to come up at an enterprise level. We actually have had some very high-level discussions at a large financial institution here in New York, where they literally brought 26 people to the meeting, the initial meeting; this was literally a second call where we were presenting our capability, because they're now at the point where it's like, this is mission-critical data. This is not just some cool stuff somebody built off in one of our divisions; it matters to the whole enterprise. How do we make sure that data is protected, backed up, how do we move data around? And that's really the trend that we're tapping into, and that the founders of our company saw many years ago, and said, we need 
to build a solution around this. >> It's interesting, you think about data as a concept, or data in general; it has some of the same concepts we've seen in networking and in cloud, a control plane of some sort out there. And where networking kind of went wrong was that the management plane was different from the control plane, so management and control are huge issues. You bring up this sprawl of data. These companies are data-full; it's not like, hey, we might have data in the future. They've got data now, they're bursting with data. So one, what does the control plane look like? What does the management plane look like? Those are technical concepts, but with that in mind, this is a big problem. What is your company doing right now, and what are some of the steps being taken to get a handle on data management? It's not just your grandfather's data management anymore; it's different, it looks different. Your thoughts on this change in management? >> They're approaching the problem now, and that's our sweet spot, but I don't think they have yet worked out exactly how to solve it. There's this realization that, we need to do this, at this point. And in fact, doing it right is something our founders recognized when they built the company: look, if this problem of data management across big data is going to be solved, it needs to be solved by a platform that is itself built on big data. Let's use big data techniques to solve the problem. >> Alright, before we get into the solution, take a minute to explain what you guys are doing: the company, the mission, the value proposition, the status. What do you do, and how are people going to consume your product? Give us the simple elevator pitch. >> Sure. We're enterprise data management, focused specifically on Hadoop and NoSQL. Everyone's familiar with the traditional space of data management in the relational world: very large market, very mature market. What we're tapping into is what John was just saying, which is that you've got this proliferation of Hadoop and NoSQL, and people are hitting the wall, hitting the ceiling, because they don't have the same level of operational tools they need to be able to mainstream these deployments, whether it's data protection, orchestration, migration, whatever the case may be. So what we do, essentially our value prop, is data management for Hadoop and NoSQL. We help organizations drive that control plane, really around three buckets. Data protection: if it's business critical, I've got to protect it, and disaster recovery falls into that protection bucket. Good old stuff everyone's familiar with, but not in the Hadoop and NoSQL space. Orchestration is the second big bucket for us: I'm moving to an agile development model, so how do I do things like automated test/dev, how do I do things like GDPR compliance management, how do I do things like cloud migration? John touched on this one before. A really interesting trend we're seeing, you asked what customers are doing: they're trying to create a unified taxonomy, a unified data strategy, which is why 26 people end up in the room. But in lieu of that, there's this huge opportunity, because of what they need. They know the data has to be protected, they have 12 different platforms, and they also want to be able to do things like: I'm on Mongo today, but I'll be on Cosmos tomorrow; I'm on
Hadoop today, but I might be on HDInsight tomorrow. I want to just move from one to the other, and I want to be able to do it intelligently. So essentially the problem we solve is that we give them that agility, and we give them that protection, as they're figuring all of this out. >> So do we have this right? You basically come in and say, look, you can have whatever platform you want for your data, whether it's Hadoop or NoSQL, bringing unstructured and structured data together, which makes sense. But protection specifically: does it have to morph and get swapped out based on a platform decision? >> Correct, although right now we're focused specifically on Hadoop and NoSQL. We're not going to be the 21st vendor helping SAP and Oracle customers back up their data. It's basically: if you're on Hadoop or NoSQL, that's the platform we cover, regardless of which Hadoop distribution you're running or which NoSQL database it is. >> And they can change out the pieces as they evolve? >> Exactly right. >> You're filling white space, right? Because when this whole movement started, it was, like you were saying, commodity hardware, and you had this idea of pushing code to data, and hey, life is so easy. And all of a sudden there's no governance, there's no data protection, no business continuity, all this enterprise stuff. You heard for a long time that people were going to bring enterprise grade to Hadoop, but they really didn't focus on the data protection space. >> Correct, or the orchestration piece either. Both of those are in our buckets. >> And the last piece of that puzzle, value-wise, is the machine learning piece. >> Yeah. We do protection, we do orchestration, and we're bringing machine learning to bear to automate protection. >> That's something we hear a lot, and it's a huge concern, because those HDFS clusters need to talk to each other. There are a lot of nuances in Hadoop that are great, but they can also create headaches from a human standpoint: errors can creep in, I've got to write scripts, and it creates problems on multiple fronts. >> The whole notion of being clustered in the first place, and eventually consistent in the second place, creates a huge opportunity for us. We get asked the question why, because there are a lot of traditional vendors just now getting into the space, and that's actually good; it rises all boats, if you will. Because we think we've got a pretty significant technology moat around our ability to provide protection and orchestration for eventually consistent, clustered environments, which is radically different from the traditional world. >> I love the story about the 26 people showing up. Take me through what happened, because that's kind of like being in a fishbowl. What did they do? Did they sit there auditing and taking notes, or were they raising their hands and peppering you with questions? What happened in that meeting? Tell us. >> It's an interesting microcosm of what's happening in these organizations. As the various divisions, that kind of federated IT structure, started building their own stuff, and I think the cloud enabled that, it was basically giving the middle finger to central IT: I can do all this stuff myself. Then the organization gets to this realization of, no, we need a central way to approach data management. So in this case we had an initial meeting with a
couple of senior people who said, we are going about consolidating how we manage all this data across all these platforms, and we want you to come in and present. When we presented, there was a lot of engagement, a lot of questions. You could also see that with some people there's still an element of, I want to protect my world, so this organizational dynamic plays out. But when you're at a Fortune 50 company and data is everything, central control starts to assert itself again, and that's what we saw here, because the consequence of not addressing it is potentially massive data loss, the loss of millions, hundreds of millions of dollars. Data is the gold now, right, the new oil. So the central organizations are starting to assert that, and we see that playing out; that's why all these people were in this meeting. Which is good in a way, because then we're not saying, okay, we've got to sell ten different groups or ten different organizations. There's this pull back to the center. >> It's happened in the NoSQL world; I'd love your perspective on this. Early on you had guys like Mongo take off because it was so simple to use and to capture unstructured data, and now everybody's talking about ACID compliance and enterprise-grade capabilities. That's got to be a tailwind for you guys, because you bring in the data protection and orchestration components. What do you see in that world, what do you support today, and maybe give us a glimpse of the future? >> Sure, a couple of different things. We are agnostic to the databases, in the sense that we are definitely Switzerland; we support all comers. It basically follows the market share, if you will: Cassandra, Mongo, Couchbase, DataStax, right on down the line on the NoSQL side. What's interesting is that they all have varying degrees of maturity in terms of their enterprise capabilities. Some of them offer sort of rudimentary backup-type capabilities, some have more than others, but at the end of the day their core differentiation is elsewhere; it's fascinating, they each have a unique value prop in terms of what they're good at. So it's a very fragmented market. That's an opportunity for us, but it's a challenge for them from a marketplace standpoint, because they've all got to carve out their niche. They all want the biggest slice of the pie, but it's very fragmented, because each of them is good at doing something slightly different. >> Okay, and so, like the situation you described before, they've got one of everything. >> Yes. >> So they've got 19 different backup and recovery processes and approaches, or nothing, or scripting. >> Right, so they've got a zillion steps associated with that, and they're all scripted, so there's a real probability of failure, and human error is another problem on top of that. You used the word tailwind, and I think that's very appropriate, because most of these vendors have their hands full just moving their database features forward; that's where their engagement is. So when we can come in and actually help them with a customer who's saying, okay, great, thank you, database platform, but what do you do for backup, and the answer is, well, we have a rudimentary thing we ship along with it, but there is one of our partners,
Imanis, who can provide this robust enterprise capability, it really helps them. So with some of those vendors we're actually getting a lot of partner traction, because they see that this is not where their strength is, and they've got to focus on moving their database forward. >> I'll give you some stats. I'm writing a piece right now on traditional enterprise backup and recovery, but I wonder if you could comment on how it applies to your world. This is research that David Floyer did, plus some survey work that we've done. On average, a Global 2000 organization will have 50 to 80 steps associated with its backup and recovery processes, and they're generally automated with scripts, which of course are fragile and prone to error. Because of all this complexity, there's a one-in-four chance of encountering an error on recovery, which is obviously going to lead to longer outages. And if you look at the average cost of downtime for a typical Global 2000 company, it's between 75 thousand and 215 thousand dollars an hour. Now, I don't know about your world; because it's data, and it's all digital, is it probably at the higher end of that spectrum? >> All those numbers go up, and here's why: all of those metrics tie back to a monolithic architecture. The world is now microservices-based apps, and you're running these applications on clusters and distributed architectures. Dropping a node is common; you're talking about commodity hardware across the infrastructure, so it's completely normal: a node drops off, you just add one back in, and everything keeps going. But if your script expects five nodes and now there are four, everything goes sideways. I don't have the equivalent stats to hand, but the probability is worse, because of the likelihood of error based on configuration changes, something as simple as that. >> Microservices is interesting too, because it's no longer just the data lake idea of storing data in a cluster. With microservices, you now have data that's an input to another app. >> Check. >> So the severity of an outage is multiplied, because there could be a revenue-generating app on top, some recommendation engine for e-commerce, or something important like, sorry, you can't get your bank balance right now, you can't do any transfers, because the Hadoop cluster is down. That's pretty big. >> Yes, so it's a little bit different from saying, oh well, have a guy go out there and add a new server. Maybe a little bit different. And those are the types of stats that the organizations we're talking to now care a lot more about. It speaks to the market maturity.
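To put the figures Dave cites above into rough perspective, here is a small back-of-the-envelope sketch. The hourly downtime cost range and the one-in-four recovery error rate come from the numbers quoted in the conversation; the outage duration and the number of recoveries per year are illustrative assumptions, not figures from the interview.

```python
# Rough, illustrative arithmetic based on the figures quoted above.
# The extra outage duration and recovery count are assumptions, not data
# from the interview or from the cited research.

downtime_cost_per_hour = (75_000 + 215_000) / 2   # midpoint of the quoted $75K-$215K range
recovery_error_rate = 1 / 4                        # "1 in 4 chance of encountering an error on recovery"
assumed_extra_outage_hours = 4                     # hypothetical added downtime when a recovery fails
recoveries_per_year = 12                           # hypothetical: one significant recovery per month

expected_extra_cost = (recoveries_per_year
                       * recovery_error_rate
                       * assumed_extra_outage_hours
                       * downtime_cost_per_hour)
print(f"Expected extra downtime cost per year: ${expected_extra_cost:,.0f}")
# -> Expected extra downtime cost per year: $1,740,000
```

Under these assumptions, a one-in-four failure rate on recovery turns into a seven-figure expected cost, which is the context for the "insurance" question in the next exchange.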
>> Do you run into the problem that it's insurance, and they don't want to pay for insurance? A big theme in traditional enterprises is, how do we get more out of this data? I guess that's where your orchestration comes in, cloud management, maybe cloud migration. Talk about some of the non-insurance, value-add components and how those are resonating with customers. >> Yeah, I'll jump in. The non-protection stuff, the orchestration bucket: it comes back to the problem statement we just described, which is that it's not a monolithic stack, it's a microservices-based stack. They've got multiple data sources, they've got multiple data types; it's the byproduct of essentially putting power into the divisions' hands to drive these different data strategies. So let me double-click on cloud migration, which is a huge value prop for us. We talked about this notion of the data being able to live anywhere: I'm here today, but I want to be somewhere else tomorrow. That's a very strong operational argument we hear from customers, and we also hear it from the SI community, because they hear it from their customers. The other piece of that puzzle is that you hear it from the cloud folks as well, because you've got multiple data platforms that you're dealing with, and you need the agility to move around. And the second piece is the cloud itself: there's obviously a massive migration to the cloud, particularly with Hadoop and NoSQL workloads, so how do I streamline that process, how do I provide the agility to go from point A to point B just from a migration standpoint? That's a very important use case for us, with a lot of strategic value; it's the market talking to us, saying, we have to be able to do this. And then there are things like automated test/dev, which is a big deal for us. Everybody has moved to agile development, so they want to spin things up: I want 10 percent of my data set, I want to mask out my PII data, I want to spin it up on Azure, and I want to do that automatically every hour, because I'm going to run six builds today.
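As a minimal sketch of the subset-and-mask step described above, the snippet below samples ten percent of a table and tokenizes a PII column before handing it to test/dev. The column names, hashing scheme, and toy data are assumptions made for illustration; this is not a description of Imanis Data's implementation.

```python
# Illustrative only: subset a table to 10% and mask a PII column for test/dev.
# Column names and the toy data are hypothetical.
import hashlib
import pandas as pd

def mask_value(value: str) -> str:
    """Replace a PII value with a stable, non-reversible token."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]

def subset_and_mask(df: pd.DataFrame, pii_columns, fraction: float = 0.10) -> pd.DataFrame:
    sample = df.sample(frac=fraction, random_state=42).copy()  # "I want 10% of my data set"
    for col in pii_columns:                                    # "I want to mask out my PII data"
        sample[col] = sample[col].astype(str).map(mask_value)
    return sample

# Toy usage: 1,000 patient rows in, roughly 100 masked rows out for the next build.
patients = pd.DataFrame({
    "patient_id": range(1000),
    "name": [f"patient_{i}" for i in range(1000)],
    "diagnosis_code": ["A10"] * 1000,
})
test_copy = subset_and_mask(patients, pii_columns=["name"])
print(len(test_copy), test_copy["name"].iloc[0])
```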
>> Cloud certainly accelerates your opportunity big-time. It forces everything to the table, right? You can't hide anymore; you've got to answer the questions. So the final question I want to get on the table for you in this segment is the product strategy. How are you looking at it? Is it going to be SaaS, software on-premise, cloud? How do people consume the offering? And then, because you're a young, growing company, you don't have the dogma or the baggage, and Hadoop has changed a lot; there's certainly a use case everyone's getting behind, but cloud is now a factor in that product strategy. And when you're in a deal, why are you being called in? What signals would tell a customer to give you guys a ring or make a digital connection? >> So the primary use cases we're talking about are recovery, data migration, and test/dev. We have a big account right now that we're in final negotiations with where the primary use case is healthcare; it's all about privacy, and they need to securely mask and subset the data. To your specific question about how we're getting called in, you've basically got two things. You've got the administrators, either the database architects or the IT and infrastructure people, who are saying, okay, I need a backup solution, I'm at a point now where I really need to protect my data; that's one. And then there's this other track, which is the higher-level strategic discussions where we're called in, like the 26-person meeting: okay, we need an enterprise-wide data strategy. So we're attacking it both at the use-case level and at the higher-level strategic level, and obviously the more we can drive that strategic discussion, and the more people want to talk to us about that, the better it is for our business. >> And the stakeholders in that strategic discussion, whoever they are: is IT involved, the CIO, maybe the chief data officer? >> Yeah, database architects, enterprise architecture, the head of enterprise architecture, various flavors, but it basically comes down to two poles. There's somebody who kind of owns infrastructure, and there's somebody who kind of owns the data, so it could be a chief data officer, a data architect, whatever, depending on the scale of the organization. >> And are they calling you because they're data-full, because they have production workloads that have gone uncared-for or under-served? Is that the main reason, they're in pain and you're the aspirin? Or is it more like, we had a data loss and we didn't have any point-in-time recovery, that's what you provide, so we don't want to go through it again? >> That last one is a huge impetus for us, and to your point, it is mature, it is production workloads. The simple qualifying questions are: are you running Hadoop or NoSQL? Yes. Are you running it in production? Yes. Do you have a backup strategy? That's sort of the tip of the spear. Now, to briefly answer your earlier question before we run out of time: it's not a SaaS-based model. We're a software-defined solution; we'll run on bare metal, we'll run in VMs, we'll run in the cloud, Azure, Google, whatever you want to run on, so we run anywhere you want, and we'll use any storage that you want. It's basically an annual subscription; it's not a SaaS consumption model, though that may come down the road. It's a license that you buy and deploy wherever you want, and customers choose what to do with it; it's completely flexible. >> Let's go back to something you said: they didn't have point-in-time recovery. Was their point-in-time recovery their last full backup, or did they just not have one? >> All of the above; we've seen both. There are market maturity issues, so you hear a lot of, it's clustered, I just replicate my data. And replication is not a backup. Truth be told, at my old company that was our approach; we had a script, but still. And the key thing is, even if you write that script, as you pointed out before, it's the whole recovery process that matters, so having a recovery sandbox where you can prove out the use case and extract the value is really the point; we designed everything around exactly that. >> Hadoop is real, and history is repeating itself in that regard. If you look at the relational space, there's a direct correlation in how the data management ecosystem evolved around the database platforms; eventually somebody comes in and says, okay, let's look at this in the big picture, what's the recovery strategy, how are we going to scale this? >> Exactly. >> So your granularity for point in time: do you offer any point in time? >> Any point in time, and we'll have more news on that in the next couple of weeks. >> Okay. Imanis Data here inside theCUBE, a hot new startup, a growing company really solving a real need in the marketplace. You're kind of an aspirant today, but there's a growth opportunity as customers scale up, so congratulations, and good luck with the opportunity. theCUBE is bringing you live coverage
here as part of CUBE NYC, our ninth year covering the big data ecosystem, which started originally in 2010 with Hadoop World; now it's machine learning and Hadoop clusters going into production. Guys, thanks for coming on, I really appreciate it. This is theCUBE. Thanks for watching day one; we'll be here all day tomorrow, so stay with us for more. We'll be right back.

Published Date : Sep 13 2018


Billie Whitehouse, Wearable X | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE. Covering theCUBE New York City 2018. Brought to you by Silicon Angle Media and its ecosystem partners. >> Hi, welcome back. I'm your host Sonia Tagare with my cohost Dave Vellante, and we're here at theCUBE NYC covering everything big data, AI, and the cloud. And this week is also New York Fashion Week, and with us today we have a guest who intersects both of those technologies, so Billie Whitehouse, CEO of Wearable X, thank you so much for being on. >> It's a pleasure, thank you for having me. >> Great to see you. >> Thank you. >> So your company Wearable X, which intersects fashion and technology, tell us more about that. >> So Wearable X started five years ago. And we started by building clothes that had integrated haptic feedback, which is just vibrational feedback on the body. And we really believe that we can empower clothing with technology to do far more than it ever has for you before, and to really give you control back of your life. >> That's amazing. So can you tell us more about the haptic, how it works and what the technology is about? >> Absolutely. So the haptics are integrated with accelerometers and they're paired through conductive pathways around the body, and specifically this is built for yoga in a line called NadiX. And Nadi is spelled N-A-D-I. I know that I have a funny accent so sometimes it helps to spell things out. They connect and understand your body orientation and then from understanding your body orientation we pair that back with your smartphone and then the app guides you with audio, how to move into each yoga pose, step by step. And at the end we ask you to address whether you made it into the pose or not by reading the accelerometer values, and then we give you vibrational feedback where to focus. >> And the accelerometer is what exactly? It's just a tiny device... Does it protrude or is it just...? >> I mean it's as invisibly integrated as we can get it so that we can make it washable and tumble-dryable. >> So I know I rented a car recently, big SUV with the family and when I started backing up or when I get close to another car, it started vibrating. So is it that kind of sensation? It was sort of a weird warning but then after a while I got used to it. It was kind of training me. Is that-- >> Precisely. >> Sort of the same thing? And it's just the pants or the leggings, or is it the top as well? >> So it's built in through the ankles, behind the knees and in the hip of the yoga pants, and then we will release upper body work as well. >> Alright, so let's double click on this. So if I'm in a crescent pose and I'm leaning too far forward, will it sort of correct me or hit me in the calf and say, "Put your heel down," or how would that work? >> Exactly. So the audio instructions will give you exactly the kind of instructions you would get if you were in a class. And then similarly to what you would get if you had a personal instructor, the vibrations will show you where to isolate and where to ground down, or where to lift up, or where to rotate, and then at the end of the pose, the accelerometer values are read and we understand whether you made it into the pose or whether you didn't quite get there, and whether you're overextended or not. And then we ask you to either go back and work on the pose again or move forward and move on to the next pose. >> That is amazing. I usually have to ask my daughters or my wife, "Is this right?" And then they'll just shake their heads. Now what do you do with the data? 
Do you collect the data and can I review and improve, feed it back? How does that all work? >> So the base level membership, which is free, is you don't see your progress tracking as yet. But we're about to release our membership, where you pay $10 a month, and with that you get progress tracking as a customer. Us on the back end, we can see how often people make it into particular poses. We can also see which ones they don't make it into very well, but we don't necessarily share that. >> And so presumably it tracks other things besides, like frequency, duration of the yoga? >> Exactly. Minutes of yoga, precisely right. >> Different body parts, or not necessarily? >> So the accelerometers are just giving us an individual value, and then we determine what pose you're in, so I don't know what you mean by different body parts? >> In other words, which parts of my body I'm working out or maybe need to work on? >> Oh precisely. Yeah if you're overextending a particular knee or an ankle, we can eventually tell you that very detailed. >> And how long have you been doing this? >> It's five years. >> Okay. And so what have you learned so far from all this data that you've collected? >> Well I mean, I'm going to start from a human learning first, and then I'll give you the data learnings. The human learning for me is equally as interesting. The language on the body and how people respond to vibration was learning number one. And we even did tests many years ago with a particular product, an upper body product, with kids, so aged between eight and 13, and I played a game of memory with them to see if they could learn and understand different vibrational sequences and what they meant. And it was astounding. They would get it every single time without fail. They would understand what the vibrations meant and they would remember it. For us, we are then trying to replicate that for yoga. And that has been a really interesting learning, to see how people need and understand and want to have audio cues with their vibrational feedback. From a data perspective, the biggest learning for us is that people are actually spending between 13 to 18 minutes inside the app. So they don't necessarily want an hour and a half class, which is what we originally thought. They want short, quick, easy-to-digest kind of flows. And that for me was very much a learning. They're also using it at really interesting times of the day. So it's before seven AM, in the middle of the day between 11 and three, and then after nine PM. And that just so happens to be when studios are shut. So it makes sense that they want to use something that's quick and easy for them, whether it's early morning when they have a big, full day, or late night 'cause they need to relax. >> Sounds like such a great social impact. Can you tell us more about why you decided to make this? >> Yeah, for me there was a personal problem. I was paying an extraordinary amount to go to classes, I was often in a class with another 50 people and not really getting any of the attention that I guess I thought I deserved, so I was frustrated. I was frustrated that I was paying so much money to go into class and not getting the attention, had been working with haptic feedback for quite some time at that point, realized that there was this language on the body that was being really underutilized, and then had this opportunity to start looking at how we could do it for yoga. 
Don't get me wrong, I had several engineers tell me this wasn't possible about three and a half years ago, and look at us now, we're shipping product and we're in retail and it's all working, but it took some time. >> So you're not an engineer, I take it? >> I am not an engineer. >> You certainly don't dress like an engineer, but you never know. What's your background? >> My background is in design. And I truly think that design, for us, has always come first. And I hope that it continues to be that way. I believe that designers have an ability to solve problems in, dare I say, in a horizontal way. We can understand pockets of things that are going on, whether it's the problem, whether it's ways to solve the solutions, and we can combine the two. It's not just about individual problem solving on a minute level; it's very much a macro view. And I hope that more and more designers go into this space because I truly believe that they have an ability to solve really interesting problems by asking empathetic questions. >> And how does the tech work? I mean, what do you need besides the clothing and the accelerometers to make this work? >> So we have a little device called the pulse. And the pulse has our Bluetooth module and our battery and our PCB, and that clips just behind the left knee. Now that's also the one spot on the body that during yoga doesn't get in the way, and we have tested that on every body shape you can imagine across five different continents, because we wanted to make sure that the algorithms that we built to understand the poses were going to be fair for everybody. So in doing that, that little pulse, you un-clip when you want to wash and dry. >> And is that connected to the app as well? >> Exactly, that's connected via Bluetooth to your app. >> That's great. So you have all your data in your hand and you know exactly what kind of yoga poses you're doing, where you need to strengthen up. >> Exactly. >> That's great. >> And is it a full program? In other words, are there different yoga programs I can do, or am I on my own for that? How does that work? >> So with the base level membership, you can choose different yoga instructors around New York that you'd like to follow, and then you can get progress tracking, you can get recommendations, and they are timed between that 10 to 20 minutes. If you want to pay the slightly more premium membership, you can actually build your own playlists, and that's something that our customers have said they're really interested in. It means that you can build a sequence of poses that is really defined by you, that is good for your body. So that means instead of going to a class where you end up getting a terrible teacher, or music that you don't like, you can actually build your own class and then share that with your friends as well. >> Is it a Spotify-like model, where the teachers get compensation at the back end, or how does that all work? >> Exactly. Yes, precisely. >> And what do you charge for this? >> So the pants are $250, and then the base level membership is $10 a month, and then the slightly more premium is $30 a month. >> If you think about how much you would spend for a yoga class, that actually seems like a pretty good deal. >> And trust me, when you start calculating, when you go to yoga at least once a week, and it's $20 a week and then you're like, "Oh, and I went every week this year," you realize that it racks up very quickly. >> Well plus the convenience of doing it... I love having... To be able to do it at six a.m. 
without having to go to a class, especially where I live in Boston, when it's cold in the winter, you don't even want to go out. (all laughing) >> So what do you think the future of the wearable industry is? >> This is a space that I get really excited about. I believe in a version of the future, which has been titled "enchanted objects." And the reason I sort of put it in inverted commas is I think that often has sometimes a magical element to it that people think is a little too far forward. But for me, I really believe that this is possible. So not only do I believe that we will have our own body area network, which I like to call an app store for the body, but I believe every object will have this. And there was a beautiful Wired article last month that actually described why the Japanese culture are adopting robotics and automation in a way that western culture often isn't. And that is because the Shinto religion is the predominant religion in Japan, and they believe that every object has a soul. And if in believing that, you're designing for that object to have a soul and a personality and an ecosystem, and dare we call it, a body area network for each object, then that area network can interface with yours or mine or whoever's, and you can create this really interesting communication that is enchanted and delightful, and not about domination. It's not about screens taking over the world and being in charge of you, and us being dominated by them, as often we see in culture now. It's about having this really beautiful interface between technology and objects. And I really believe that's going to be the version of the future. >> And looking good while you do it. >> Precisely. >> You've got visions to take this beyond yoga, is that right? Other sports, perhaps cycling and swimming and skiing, I can think of so many examples. >> Exactly. Well for us, we're focused on yoga to start with. And certainly areas that I would say are in the gaps. I like to think of our products as being very touch-focused and staying in areas of athleisure or sports that are around touch. So where you would get a natural adjustment from a coach or a teacher, our products can naturally fit into that space. So whether it is squats or whether it is Pilates, they're certainly in our pipeline. But in the immediate future, we're certainly looking at the upper body and in meditation, and how we can remind you to roll your shoulders back and down, and everyone sits up straight. And then longer term, we're looking at how we can move this into physiotherapy, and so as you mentioned, you can enter in that you have a left knee injury, and we'll be able to adjust what you should be working on because of that. >> Is there a possibility of a breathing component, or is that perhaps there today? Such an important part of yoga is breathing. >> 100%. That is very much part of what we're working on. I would say more silently, but very much will launch soon. >> Well it sounds like it's going to have such a positive impact on so many people and that it's going to be in so many different industries. >> I hope so. Yeah that's the plan. >> Well Billie Whitehouse, thank you so much for being on theCUBE, and Dave, thank you. We're here at theCUBE NYC, and stay tuned, don't go anywhere, we'll be back. (inquisitive electronic music)
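The pose-check flow Whitehouse describes, reading joint orientation from the accelerometers, comparing it with a target pose, and firing haptic feedback where the body is out of alignment, can be sketched roughly as follows. This is an illustrative outline only; the joint names, target angles, and tolerance are assumptions, not Wearable X's actual algorithm or app code.

```python
# Illustrative sketch of accelerometer-driven pose feedback; not NadiX code.
from dataclasses import dataclass

@dataclass
class SensorReading:
    joint: str          # e.g. "left_ankle", "right_knee", "hip"
    angle_deg: float    # orientation angle derived from an accelerometer

def joints_to_correct(readings, target, tolerance_deg=10.0):
    """Return the joints that are outside tolerance for the target pose."""
    off = []
    for r in readings:
        expected = target.get(r.joint)
        if expected is not None and abs(r.angle_deg - expected) > tolerance_deg:
            off.append(r.joint)
    return off

# Hypothetical target pose and one frame of sensor data:
target_pose = {"left_ankle": 90.0, "right_knee": 120.0, "hip": 0.0}
frame = [SensorReading("left_ankle", 88.0),
         SensorReading("right_knee", 138.0),
         SensorReading("hip", 2.0)]

for joint in joints_to_correct(frame, target_pose):
    print(f"send haptic pulse to {joint}")   # vibrate where the user should focus
```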

Published Date : Sep 12 2018


DD, Cisco + Han Yang, Cisco | theCUBE NYC 2018


 

>> Live from New York, It's the CUBE! Covering theCUBE, New York City 2018. Brought to you by SiliconANGLE Media and its Ecosystem partners. >> Welcome back to the live CUBE coverage here in New York City for CUBE NYC, #CubeNYC. This coverage of all things data, all things cloud, all things machine learning here in the big data realm. I'm John Furrier and Dave Vellante. We've got two great guests from Cisco. We got DD who is the Vice President of Data Center Marketing at Cisco, and Han Yang who is the Senior Product Manager at Cisco. Guys, welcome to the Cube. Thanks for coming on again. >> Good to see ya. >> Thanks for having us. >> So obviously one of the things that has come up this year at the Big Data Show, used to be called Hadoop World, Strata Data, now it's called, the latest name. And obviously CUBE NYC, we changed from Big Data NYC to CUBE NYC, because there's a lot more going on. I heard hallway conversations around blockchain, cryptocurrency, Kubernetes has been said on theCUBE already at least a dozen times here today, multicloud. So you're seeing the analytical world try to be, in a way, brought into the dynamics around IT infrastructure operations, both cloud and on premises. So interesting dynamics this year, almost a dev ops kind of culture to analytics. This is a new kind of sign from this community. Your thoughts? >> Absolutely, I think data and analytics is one of those things that's pervasive. Every industry, it doesn't matter. Even at Cisco, I know we're going to talk a little more about the new AI and ML workload, but for the last few years, we've been using AI and ML techniques to improve networking, to improve security, to improve collaboration. So it's everywhere. >> You mean internally, in your own IT? >> Internally, yeah. Not just in IT, in the way we're designing our network equipment. We're storing data that's flowing through the data center, flowing in and out of clouds, and using that data to make better predictions for better networking application performance, security, what have you. >> The first topic I want to talk to you guys about is around the data center. Obviously, you do data center marketing, that's where all the action is. The cloud, obviously, has been all the buzz, people going to the cloud, but Andy Jassy's announcement at VMworld really is a validation that we're seeing, for the first time, hybrid multicloud validated. Amazon announced RDS on VMware on-premises. >> That's right. This is the first time Amazon's ever done anything of this magnitude on-premises. So this is a signal from the customers voting with their wallet that on-premises is a dynamic. The data center is where the data is, that's where the main footprint of IT is. This is important. What's the impact of that dynamic, of data center, where the data is with the option of a cloud. How does that impact data, machine learning, and the things that you guys see as relevant? >> I'll start and Han, feel free to chime in here. So I think those boundaries between this is a data center, and this a cloud, and this is campus, and this is the edge, I think those boundaries are going away. Like you said, data center is where the data is. And it's the ability of our customers to be able to capture that data, process it, curate it, and use it for insight to take decision locally. A drone is a data center that flies, and boat is a data center that floats, right? >> And a cloud is a data center that no one sees. >> That's right. So those boundaries are going away. 
We at Cisco see this as a continuum. It's the edge cloud continuum. The edge is exploding, right? There's just more and more devices, and those devices are cranking out more data than ever before. Like I said, it's the ability of our customers to harness the data to make more meaningful decisions. So Cisco's take on this is the new architectural approach. It starts with the network, because the network is the one piece that connects everything- every device, every edge, every individual, every cloud. There's a lot of data within the network which we're using to make better decisions. >> I've been pretty close with Cisco over the years, since '95 timeframe. I've had hundreds of meetings, some technical, some kind of business. But I've heard that term edge the network many times over the years. This is not a new concept at Cisco. Edge of the network actually means something in Cisco parlance. The edge of the network >> Yeah. >> that the packets are moving around. So again, this is not a new idea at Cisco. It's just materialized itself in a new way. >> It's not, but what's happening is the edge is just now generating so much data, and if you can use that data, convert it into insight and make decisions, that's the exciting thing. And that's why this whole thing about machine learning and artificial intelligence, it's the data that's being generated by these cameras, these sensors. So that's what is really, really interesting. >> Go ahead, please. >> One of our own studies pointed out that by 2021, there will be 847 zettabytes of information out there, but only 1.3 zettabytes will actually ever make it back to the data center. That just means an opportunity for analytics at the edge to make sense of that information before it ever makes it home. >> What were those numbers again? >> I think it was like 847 zettabytes of information. >> And how much makes it back? >> About 1.3. >> Yeah, there you go. So- >> So a huge compression- >> That confirms your research, Dave. >> We've been saying for a while now that most of the data is going to stay at the edge. There's no reason to move it back. The economics don't support it, the latency doesn't make sense. >> The network cost alone is going to kill you. >> That's right. >> I think you really want to collect it, you want to clean it, and you want to correlate it before ever sending it back. Otherwise, sending that information, of useless information, that status is wonderful. Well that's not very valuable. And 99.9 percent, "things are going well." >> Temperature hasn't changed. (laughs) >> If it really goes wrong, that's when you want to alert or send more information. How did it go bad? Why did it go bad? Those are the more insightful things that you want to send back. >> This is not just for IoT. I mean, cat pictures moving between campuses cost money too, so why not just keep them local, right? But the basic concepts of networking. This is what I want to get in my point, too. You guys have some new announcements around UCS and some of the hardware and the gear and the software. What are some of the new announcements that you're announcing here in New York, and what does it mean for customers? Because they want to know not only speeds and feeds. It's a software-driven world. How does the software relate? How does the gear work? What's the management look like? Where's the control plane? Where's the management plane? Give us all the data. >> I think the biggest issues starts from this. 
Data scientists, their task is to explore different data sources and find out the value. But at the same time, IT is somewhat lagging behind, because as the data scientists go from data source A to data source B, it could be 3 petabytes of difference. IT is like, 3 petabytes? That's only from Monday through Wednesday? That's a huge infrastructure requirement change. So Cisco's way to help the customer is to make sure that we're able to come out with blueprints. Blueprints enable the IT team to scale, so that the data scientists can work beyond their own laptops. As they work through the petabytes of data coming in from all these different sources, they're able to collaborate well together and make sense of that information. Only by scaling, with IT helping the data scientists work at that scale, can they succeed. So that's why we announced a new server. It's called the C480 ML. It happens to have 8 GPUs from Nvidia inside, helping customers that want those deep learning kinds of capabilities. >> What are some of the use cases for these products? It's got some new data capabilities. What are some of the impacts? >> Some of the things that Han just mentioned. For me, I think the biggest differentiation in our solution is the things that we put around the box. So the management layer, right? I mean, this is not going to be one server in one data center. It's going to be multiples of them. You're never going to have one data center; you're going to have multiple data centers. And we've got a really cool management tool called Intersight, and this is supported in Intersight, day one. And Intersight also uses machine learning techniques to look at data from multiple data centers. And that's really where the innovation is. Honestly, I think every vendor is bending sheet metal around the latest chipset, and we've done the same. But the real differentiation is how we manage it, how we use the data for more meaningful insight. I think that's where some of our magic is. >> Can you add some color to that, in terms of infrastructure for AI and ML? How is it different than traditional infrastructures? Is the management different? The sheet metal is not different, you're saying. But what are some of those nuances that we should understand? >> I think especially for deep learning, multiple scientists around the world have pointed out that if you're able to use GPUs, you're able to run the deep learning frameworks faster by roughly two orders of magnitude. So that's part of the reason why, from an infrastructure perspective, we wanted to bring in those GPUs. But for the IT teams, we didn't want them to just add yet another infrastructure silo to support AI or ML. Therefore, we wanted to make sure it fits in with a UCS-managed unified architecture, enabling the IT team to scale without adding more infrastructure and silos just for that new workload. Having that unified architecture helps IT be more efficient and, at the same time, better supports the data scientists.
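A minimal sketch of the point Han makes about GPUs: the deep learning framework targets a GPU when one is exposed to it, so the infrastructure's job is to provision and manage those GPUs rather than change the data scientist's code. The toy model and batch below are illustrative; nothing here is specific to Cisco UCS, Intersight, or the C480 ML.

```python
# Minimal, illustrative PyTorch example: work lands on a GPU if one is present.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(1024, 10).to(device)   # toy model standing in for a real network
batch = torch.randn(64, 1024, device=device)   # toy batch of input data

logits = model(batch)                          # runs on the GPU when one was found
print(f"forward pass ran on: {device}")
```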
>> The other thing I would add is, again, the things around the box. Look, this industry is still pretty nascent. There are lots of start-ups, lots of different solutions, and when we build a server like this, we don't just build a server and toss it over the fence to the customer and say "figure it out." No, we've done validated design guides, with Google and with some of the leading vendors in the space, to make sure that everything works as we say it would. And so it's all of those integrations, those partnerships, all the way through our systems integrators, to really understand a customer's AI and ML environment and fine-tune it for that environment. >> So is that really where a lot of the innovation comes from? Doing that hard work to say, "yes, it's going to be a solution that's going to work in this environment. Here's what you have to do to ensure best practice," etc.? Is that right? >> So I think some of our blueprints, or validated designs, are basically enabling the IT team to scale: scale their storage, scale their CPU, scale their GPU, and scale their network. But do it in a way where we work with partners like Hortonworks or Cloudera, so that they're able to take advantage of the data lake, and adding in the GPU so they're able to do the deep learning with TensorFlow, with PyTorch, or whatever curated deep learning framework the data scientists need to be able to get value out of those multiple data sources. These are the kinds of solutions that we're putting together, making sure our customers are able to get to that business outcome sooner and faster, not just a-- >> Right, so there's innovation at all altitudes. There's the hardware, there's the integrations, there's the management. So it's innovation. >> So not to go too much into the weeds, but I'm curious. As you introduce these alternate processing units, what is the relationship between traditional CPUs and these GPUs? Are you managing them differently, communicating somehow, or are they sort of fenced off architecturally? I wonder if you could describe that. >> We actually want it to be integrated, because by having it separated and fenced off, well, that's an IT infrastructure silo. You're not going to have the same security policy or the same storage mechanisms. We want it to be unified so it's easier for IT teams to support the data scientists. So the latest software is able to manage both CPUs and GPUs, as well as having a new file system. Those are the solutions that we're putting forth, so that our IT folks can scale and our data scientists can succeed. >> So IT's managing a logical block. >> That's right. And even for things like inventory management, or going back and adding patches in the event of some security event, it's so much better to have one integrated system rather than silos of management, which we see in the industry. >> So the hard news is basically UCS for AI and ML workloads? >> That's right. This is our first server custom built from the ground up to support these deep learning, machine learning workloads. We partnered with Nvidia, with Google. We announced it earlier this week, and the phone is ringing constantly. >> I don't want to say god box. I just said it. (laughs) This is basically the power tool for deep learning. >> Absolutely. >> That's how you guys see it. Well, great. Thanks for coming out. Appreciate it, good to see you guys at Cisco. Again, deep learning, dedicated technology around the box, not just the box itself. Ecosystem, Nvidia, good call. Those guys really get the hot GPUs out there. Saw those guys last night, great success they're having. They're a key partner with you guys. >> Absolutely. >> Who else is partnering, real quick before we end the segment? >> We've been partnering on the software side with folks like Anaconda, with their Anaconda Enterprise, which data scientists love to use as their Python data science framework. We're working with Google, with their Kubeflow, which is an open source project integrating TensorFlow on top of Kubernetes. And of course we've been working with folks like Cloudera as well as Hortonworks to access the data lake from a big data perspective. >> Yeah, I know you guys didn't get a lot of credit. Google Cloud, we were certainly amplifying it. You guys were co-developing the Google Cloud servers with Google. I know they were announcing it, and you guys had Chuck on stage there with Diane Greene, so it was pretty positive. Good integration with Google can make a-- >> Absolutely. >> Thanks for coming on theCUBE, we appreciate the commentary. Cisco here on theCUBE. We're in New York City for theCUBE NYC. This is where the world of data is converging with IT infrastructure, developers, operators, all running analytics for future business. We'll be back with more coverage after this short break. (upbeat digital music)

Published Date : Sep 12 2018

SUMMARY :

It's the CUBE! Welcome back to the live CUBE coverage here So obviously one of the things that has come up this year but for the last few years, Not just in IT, in the way we're designing is around the data center. and the things that you guys see as relevant? And it's the ability of our customers to It's the edge cloud continuum. The edge of the network that the packets are moving around. is the edge is just now generating so much data, analytics at the edge Yeah, there you go. that most of the data is going to stay at the edge. I think you really want to collect it, (laughs) Those are the more insightful things and the gear and the software. the data scientists to work the scale, What are some of the use cases on these as products? Some of the things that Han just mentioned. So is the management different? it helps the IT to be more efficient in the space to make sure that everything works So is that really where a lot of the data scientists need to be able to get value There's the hardware, there's the integrations, So not to go too much into the weeds, Those are the solutions that we're putting forth, in the event of some security event, and the phone is ringing constantly. This is basically the power tool for deep learning. Those guys really get the hot GPUs out there. to access the data lake from a big data perspective. the Google Cloud servers with Google. This is where the world of data

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Nvidia | ORGANIZATION | 0.99+
Cisco | ORGANIZATION | 0.99+
Han Yang | PERSON | 0.99+
Google | ORGANIZATION | 0.99+
New York | LOCATION | 0.99+
Diane Greene | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Dave | PERSON | 0.99+
Hortonworks | ORGANIZATION | 0.99+
2021 | DATE | 0.99+
New York City | LOCATION | 0.99+
Andy Jassy | PERSON | 0.99+
8 GPUs | QUANTITY | 0.99+
847 zettabytes | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
99.9 percent | QUANTITY | 0.99+
Monday | DATE | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
3 petabytes | QUANTITY | 0.99+
Anaconda | ORGANIZATION | 0.99+
Wednesday | DATE | 0.99+
DD | PERSON | 0.99+
first time | QUANTITY | 0.99+
one server | QUANTITY | 0.99+
Cloudera | ORGANIZATION | 0.99+
Python | TITLE | 0.99+
first topic | QUANTITY | 0.99+
one piece | QUANTITY | 0.99+
VMworld | ORGANIZATION | 0.99+
'95 | DATE | 0.98+
1.3 zettabytes | QUANTITY | 0.98+
NYC | LOCATION | 0.98+
both | QUANTITY | 0.98+
one | QUANTITY | 0.98+
this year | DATE | 0.98+
Big Data Show | EVENT | 0.98+
Caldera | ORGANIZATION | 0.98+
two waters | QUANTITY | 0.97+
today | DATE | 0.97+
Chuck | PERSON | 0.97+
One | QUANTITY | 0.97+
Big Data | ORGANIZATION | 0.97+
earlier this week | DATE | 0.97+
Intersight | ORGANIZATION | 0.97+
hundreds of meetings | QUANTITY | 0.97+
CUBE | ORGANIZATION | 0.97+
first server | QUANTITY | 0.97+
last night | DATE | 0.95+
one data center | QUANTITY | 0.94+
UCS | ORGANIZATION | 0.92+
petabytes | QUANTITY | 0.92+
two great guests | QUANTITY | 0.9+
Tensorflow | TITLE | 0.86+
CUBE NYC | ORGANIZATION | 0.86+
Han | PERSON | 0.85+
#CubeNYC | LOCATION | 0.83+
Strata Data | ORGANIZATION | 0.83+
Kubeflow | TITLE | 0.82+
Hadoop World | ORGANIZATION | 0.81+
2018 | DATE | 0.8+

Brent Compton, Red Hat | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE, covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hello, everyone, welcome back. This is theCUBE live in New York City for theCUBE NYC, #CUBENYC. This is our ninth year covering the big data ecosystem, which has now merged into cloud. All things coming together. It's really about AI, it's about developers, it's about operations, it's about data scientists. I'm John Furrier, my co-host Dave Vellante. Our next guest is Brent Compton, Technical Marketing Director for Storage Business at Red Hat. As you know, we cover Red Hat Summit and great to have the conversation. Open source, DevOps is the theme here. Brent, thanks for joining us, thanks for coming on. >> My pleasure, thank you. >> We've been talking about the role of AI, and AI needs data and data needs storage, which is what you do, but if you look at what's going on in the marketplace, it's kind of an architectural shift. It's harder to find a cloud architect than it is to find diamonds these days. You can't find a good cloud architect. Cloud is driving a lot of the action. Data is a big part of that. What's Red Hat doing in this area and what's emerging for you guys in this data landscape? >> Really, the days of specialists are over. You mentioned it's more difficult to find a cloud architect than find diamonds. What we see is the infrastructure, it's become less about compute versus storage versus networking. It's the architect that can bring the confluence of those specialties together. One of the things that we see is people bringing their analytics workloads onto the common platforms where they've been running the rest of their enterprise applications. For instance, if they're running a lot of their enterprise applications on AWS, of course they want to run their analytics workloads in AWS, and that's EMR, that's long since in the history books. Likewise, if they're running a lot of their enterprise applications on OpenStack, it's natural that they want to run a lot of their analytics workloads on the same type of dynamically provisioned infrastructure. Emerging, of course, we just announced on Monday this week with Hortonworks and IBM, if they're running a lot of their enterprise applications on a Kubernetes substrate like OpenShift, they want to run their analytics workloads on that same kind of agile infrastructure. >> Talk about the private cloud impact and hybrid cloud because obviously we just talked to the CEO of Hortonworks. Normally it's about early days, about Hadoop, data lakes and then data planes. They had a good vision. They're years into it, but I like what Hortonworks is doing. But he said Kubernetes, on a data show Kubernetes. Kubernetes is a multi-cloud, hybrid cloud concept, containers. This is really enabling a lot of value and you guys have OpenShift which became very successful over the past few years, the growth has been phenomenal. So congratulations, but it's pointing to a bigger trend and that is that the infrastructure software, the platform as a service, is becoming the middleware, the glue, if you will, and Kubernetes and containers are facilitating a new architecture for developers and operators. How important is that with you guys, and what's the impact on the customer when they think, okay I'm going to have an agile DevOps environment, workload portability, but do I have to build that out? You mentioned people don't have to necessarily do that anymore. The trend has become on-premise.
What's the impact on the customer as they hear Kubernetes and containers and the data conversation? >> You mentioned agile DevOps environment, workload portability, so one of the things that customers come to us for is having that same thing, but infrastructure agnostic. They say, I don't want to be locked in. Love AWS, love Azure, but I don't want to be locked into those platforms. I want to have an abstraction layer for my Kubernetes layer that sits on top of those infrastructure platforms. As I bring my workloads, one-by-one, custom DevOps from a lift and shift of legacy apps onto that substrate, I want to have it be independent, private cloud or public cloud and, time permitting, we'll go into more details about what we've seen happening in the private cloud with analytics as well, which is effectively what brought us here today. The pattern that we've discovered with a lot of our large customers who are saying, hey, we're running OpenStack, they're large institutions that for lots of reasons store a lot of their data on-premises, saying, we want to use the utility compute model that OpenStack gives us as well as the shared data context that Ceph gives us. We want to use that same thing for our analytics workload. So effectively some of our large customers taught us this program. >> So they're building infrastructure for analytics essentially. >> That's what it is. >> One of the challenges with that is the data is everywhere. It's all in silos, it's locked in some server somewhere. First of all, am I overstating that problem and how are you seeing customers deal with that? What are some of the challenges that they're having and how are you guys helping? >> Perfect lead in, in fact, one of our large government customers, they recently sent us an unsolicited email after they deployed the first 10 petabytes in a deca-petabyte solution. It's OpenStack based as well as Ceph based. Three taglines in their email. The first was releasing the lock on data. The second was releasing the lock on compute. And the third was releasing the lock on innovation. Now, that sounds a bit buzzword-y, but when it comes from a customer to you. >> That came from a customer? Sounds like a marketing department wrote that. >> In the details, as you know, traditional HDFS clusters, traditional Hadoop clusters, Spark clusters or whatever, HDFS is not shared between clusters. One of our large customers has 50 plus analytics clusters. Their data platforms team employs a maze of scripts to copy data from one cluster to the other. And if you are a scientist or an engineer, you'd say, I'm trying to obtain these types of answers, but I need access to data sets A, B, C, and D, but data sets A and B are only on this cluster. I've got to go contact the data platforms team and have them copy it over and ensure that it's up-to-date and in sync, so it's messy. >> It's a nightmare. >> Messy. So that's why the one customer said releasing the lock on data, because now it's in a shared context. Similar paradigm as AWS with EMR. The data's in a shared context, in S3. You spin up your analytics workloads on EC2. Same paradigm discussion as with OpenStack. You're spinning up your analytics workloads via OpenStack virtualization and they're sourcing a shared data context inside of Ceph, S3-compatible Ceph, so same architecture. I love their last bit, the one that sounds the most buzzword-y, which was releasing the lock on innovation. And this individual, English was not this person's first language, so love the word.
He said, our developers no longer fear experimentation because it's so easy. In minutes they can spin up an analytics cluster with a shared data context, they get the wrong mix of things they shut it down and spin it up again. >> In previous example you used HDFS clusters. There's so many trip wires, right. You can break something. >> It's fragile. >> It's like scripts. You don't want to tinker with that. Developers don't want to get their hand slapped. >> The other thing is also the recognition that innovation comes from data. That's what my takeaway is. The customer saying, okay, now we can innovate because we have access to the data, we can apply intelligence to that data whether it's machine intelligence or analytics, et cetera. >> This the trend in infrastructure. You mentioned the shared context. What other observations and learnings have you guys come to as Red Hat starts to get more customer interactions around analytical infrastructure. Is it an IT problem? You mentioned abstracting the way different infrastructures, and that means multi-cloud's probably setup for you guys in a big way. But what does that mean for a customer? If you had to explain infrastructure analytics, what needs to get done, what does the customer need to do? How do you describe that? >> I love the term that industry uses of multi-tenant workload isolation with shared data context. That's such a concise term to describe what we talk to our customers about. And most of them, that's what they're looking for. They've got their data scientist teams that don't want their workloads mixed in with the long running batch workloads. They say, listen, I'm on deadline here. I've got an hour to get these answers. They're working with Impala. They're working with Presto. They iterate, they don't know exactly the pattern they're looking for. So having to take a long time because their jobs are mixed in with these long MapReduce jobs. They need to be able to spin up infrastructure, workload isolation meaning they have their own space, shared context, they don't want to be placing calls over to the platform team saying, I need data sets C, D, and E. Could you please send them over? I'm on deadline here. That phrase, I think, captures so nicely what customers are really looking to do with their analytics infrastructure. Analytics tools, they'll still do their thing, but the infrastructure underneath analytics delivering this new type of agility is giving that multi-tenant workload isolation with shared data context. >> You know what's funny is we were talking at the kickoff. We were looking back nine years. We've been at this event for nine years now. We made prediction there will be no Red Hat of big data. John, years ago said, unless it's Red Hat. You guys got dragged into this by your customers really is how it came about. >> Customers and partners, of course with your recent guest from Hortonworks, the announcement that Red Hat, Hortonworks, and IBM had on Monday of this week. Dialing up even further taking the agility, okay, OpenStack is great for agility, private cloud, utility based computing and storage with OpenStack and Ceph, great. OpenShift dials up that agility another notch. Of course, we heard from the CEO of Hortonworks how much they love the agility that a Kubernetes based substrate provides their analytics customers. >> That's essentially how you're creating that sort of same-same experience between on-prem and multi-cloud, is that right? 
>> Yeah, OpenShift is deployed pervasively on AWS, on-premises, on Azure, on GCE. >> It's a multi-cloud world, we see that for sure. Again, the validation was at VMworld. AWS CEO, Andy Jassy announced RDS which is their product on VMware on-premises which they've never done. Amazon's never done any product on-premises. We were speculating it would be a hardware device. We missed that one, but it's a software. But this is the validation, seamless cloud operations on-premise in the cloud really is what people want. They want one standard operating model and they want to abstract away the infrastructure, as you were saying, as the big trend. The question that we have is, okay, go to the next level. From a developer standpoint, what is this modern developer using for tools in the infrastructure? How can they get that agility and spinning up isolated, multi-tenant infrastructure concept all the time? This is the demand we're seeing, that's an evolution. Question for Red Hat is, how does that change your partnership strategy because you mentioned Rob Bearden. They've been hardcore enterprise and you guys are hardcore enterprise. You kind of know the little things that customers want that might not be obvious to people: compliance, certification, a decade of support. How is Red Hat's partnership model changing with this changing landscape, if you will? You mentioned IBM and Hortonworks release this week, but what in general, how does the partnership strategy look for you? >> The more it changes, the more it looks the same. When you go back 20 years ago, what Red Hat has always stood for is any application on any infrastructure. But back in the day it was we had n-thousand of applications that were certified on Red Hat Linux and we ran on anybody's server. >> Box. >> Running on a box, exactly. It's a similar play, just in 2018 in the world of hybrid, multi-cloud architectures. >> Well, you guys have done some serious heavy lifting. Don't hate me for saying this, but you're kind of like the mules of the industry. You do a lot of stuff that nobody either wants to do or knows how to do and it's really paid off. You just look at the ascendancy of the company, it's been amazing. >> Well, multi-cloud is hard. Look at what it takes to do multi-cloud in DevOps. It's not easy and a lot of pretenders will fall out of the way, you guys have done well. What's next for you guys? What's on the horizon? What's happening for you guys this next couple months for Red Hat and technology? Any new announcements coming? What's the vision, what's happening? >> One of the announcements that you saw last week, was Red Hat, Cloudera, and Eurotech as analytics in the data center is great. Increasingly, the world's businesses run on data-driven decisions. That's great, but analytics at the edge for more realtime industrial automation, et cetera. Per the announcements we did with Cloudera and Eurotech about the use of, we haven't even talked about Red Hat's middleware platforms, such as AMQ Streams now based on Kafka, a Kafka distribution, Fuze, an integration master effectively bringing Red Hat technology to the edge of analytics so that you have the ability to do some processing in realtime before back calling all the way back to the data center. That's an area that you'll also see is pushing some analytics to the edge through our partnerships such as announced with Cloudera and Eurotech. >> You guys got the Red Hat Summit coming up next year. theCUBE will be there, as usual. It's great to cover Red Hat. 
Thanks for coming on theCUBE, Brent. Appreciate it, thanks for spending the time. We're here in New York City live. I'm John Furrier, Dave Vallante, stay with us. All day coverage today and tomorrow in New York City. We'll be right back. (upbeat music)
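The shared data context pattern Compton describes, where short-lived analytics clusters all read the same datasets from an S3-compatible Ceph object store instead of copying data between per-cluster HDFS silos, can be sketched in a few lines of Python. This is an illustrative sketch only, not Red Hat or Ceph project code; the endpoint URL, bucket name, and credentials are hypothetical.

```python
# Illustrative sketch: listing and reading a shared dataset from an
# S3-compatible Ceph RADOS Gateway endpoint with boto3.
# The endpoint, bucket, and credentials below are hypothetical.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://ceph-rgw.example.internal:8443",  # hypothetical Ceph RGW endpoint
    aws_access_key_id="ANALYTICS_TENANT_KEY",                # hypothetical credentials
    aws_secret_access_key="ANALYTICS_TENANT_SECRET",
)

# Every ephemeral analytics cluster reads the same shared objects;
# nothing is copied between cluster-local HDFS silos.
resp = s3.list_objects_v2(Bucket="shared-data-lake", Prefix="clickstream/2018/09/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])

body = s3.get_object(Bucket="shared-data-lake",
                     Key="clickstream/2018/09/part-0000.csv")["Body"].read()
```

Because the endpoint is S3-compatible, the same pattern works whether the object store is Ceph on-premises or a public cloud bucket, which is what makes the multi-tenant workload isolation with shared data context idea portable across tiers.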

Published Date : Sep 12 2018

SUMMARY :

Brought to you by SiliconANGLE Media Open source, DevOps is the theme here. Cloud is driving a lot of the action. One of the things that we see is people and that is that the infrastructure software, the shared data context that Ceph gives us. So they're building infrastructure One of the challenges with that is the data is everywhere. And the third was releasing the lock on innovation. That came from a customer? In the details, as you know, I love his last bit, the one that sounds the most buzzword-y In previous example you used HDFS clusters. You don't want to tinker with that. that innovation comes from data. You mentioned the shared context. I love the term that industry uses of You guys got dragged into this from Hortonworks, the announcement that Yeah, OpenShift is deployed pervasively on AWS, You kind of know the little things that customers want But back in the day it was we had n-thousand of applications in the world of hybrid, multi-cloud architectures. You just look at the ascendancy of the company, What's on the horizon? One of the announcements that you saw last week, You guys got the Red Hat Summit coming up next year.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vallante | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
IBM | ORGANIZATION | 0.99+
John | PERSON | 0.99+
Brent Compton | PERSON | 0.99+
AWS | ORGANIZATION | 0.99+
John Furrier | PERSON | 0.99+
Eurotech | ORGANIZATION | 0.99+
Hortonworks | ORGANIZATION | 0.99+
Amazon | ORGANIZATION | 0.99+
Brent | PERSON | 0.99+
New York City | LOCATION | 0.99+
2018 | DATE | 0.99+
Red Hat | ORGANIZATION | 0.99+
Rob Bearden | PERSON | 0.99+
nine years | QUANTITY | 0.99+
Andy Jassy | PERSON | 0.99+
last week | DATE | 0.99+
first language | QUANTITY | 0.99+
Three taglines | QUANTITY | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
first | QUANTITY | 0.99+
tomorrow | DATE | 0.99+
second | QUANTITY | 0.99+
One | QUANTITY | 0.99+
Cloudera | ORGANIZATION | 0.99+
next year | DATE | 0.99+
third | QUANTITY | 0.99+
New York | LOCATION | 0.99+
Impala | ORGANIZATION | 0.99+
Monday this week | DATE | 0.99+
VMworld | ORGANIZATION | 0.98+
one cluster | QUANTITY | 0.98+
Red Hat Summit | EVENT | 0.98+
ninth year | QUANTITY | 0.98+
one | QUANTITY | 0.98+
OpenStack | TITLE | 0.98+
today | DATE | 0.98+
NYC | LOCATION | 0.97+
20 years ago | DATE | 0.97+
Kubernetese | TITLE | 0.97+
Kafka | TITLE | 0.97+
First | QUANTITY | 0.96+
this week | DATE | 0.96+
Red Hat | TITLE | 0.95+
English | OTHER | 0.95+
Monday of this week | DATE | 0.94+
OpenShift | TITLE | 0.94+
one standard | QUANTITY | 0.94+
50 plus analytics clusters | QUANTITY | 0.93+
Ceph | TITLE | 0.92+
Azure | TITLE | 0.92+
GCE | TITLE | 0.9+
Presto | ORGANIZATION | 0.9+
agile DevOps | TITLE | 0.89+
theCUBE | ORGANIZATION | 0.88+
DevOps | TITLE | 0.87+

Stephanie McReynolds, Alation | theCUBE NYC 2018


 

>> Live from New York, It's theCUBE! Covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> Hello and welcome back to theCUBE live in New York City, here for CUBE NYC. In conjunct with Strata Conference, Strata Data, Strata Hadoop This is our ninth year covering the big data ecosystem which has evolved into machine learning, A.I., data science, cloud, a lot of great things happening all things data, impacting all businesses I'm John Furrier, your host with Dave Vellante and Peter Burris, Peter is filling in for Dave Vellante. Next guest, Stephanie McReynolds who is the CMO, VP of Marketing for Alation, thanks for joining us. >> Thanks for having me. >> Good to see you. So you guys have a pretty spectacular exhibit here in New York. I want to get to that right away, top story is Attack of the Bots. And you're showing a great demo. Explain what you guys are doing in the show. >> Yah, well it's robot fighting time in our booth, so we brought a little fun to the show floor my kids are.. >> You mean big data is not fun enough? >> Well big data is pretty fun but occasionally you got to get your geek battle on there so we're having fun with robots but I think the real story in the Alation booth is about the product and how machine learning data catalogs are helping a whole variety of users in the organization everything from improving analyst productivity and even some business user productivity of data to then really supporting data scientists in their work by helping them to distribute their data products through a data catalog. >> You guys are one of the new guard companies that are doing things that make it really easy for people who want to use data, practitioners that the average data citizen has been called, or people who want productivity. Not necessarily the hardcore, setting up clusters, really kind of like the big data user. What's that market look like right now, has it met your expectations, how's business, what's the update? >> Yah, I think we have a strong perspective that for us to close the final mile and get to real value out of the data, it's a human challenge, there's a trust gap with managers. Today on stage over at STRATA it was interesting because Google had a speaker and it wasn't their chief data officer it was their chief decision scientist and I think that reflects what that final mile is is that making decisions and it's the trust gap that managers have with data because they don't know how the insides are coming to them, what are all the details underneath. In order to be able to trust decisions you have to understand who processed the data, what decision making criteria did they use, was this data governed well, are we introducing some bias into our algorithms, and can that be controlled? And so Alation becomes a platform for supporting getting answers to those issues. And then there's plenty of other companies that are optimizing the performance of those QUERYS and the storage of that data, but we're trying to really to close that trust gap. >> It's very interesting because from a management standpoint we're trying to do more evidence based management. So there's a major trend in board rooms, and executive offices to try to find ways to acculturate the executive team to using data, evidence based management healthcare now being applied to a lot of other domains. 
We've also historically had a situation where the people who focused on or worked with the data were a relatively small coterie of individuals that created these crazy systems to try to bring those two together. It sounds like what you're doing, and I really like the idea of the data scientists being able to create data products that then can be distributed. It sounds like you're trying to look at data as an asset to be created, to be distributed, so it can be more easily used by more people in your organization, have we got that right? >> Absolutely. So we're now seeing we're in just over a hundred production implementations of Alation, at large enterprises, and we're now seeing those production implementations get into the thousands of users. So this is going beyond those data specialists. Beyond the unicorn data scientists that understand the systems and math and technology. >> And business. >> And business, right. In business. So what we're seeing now is that a data catalog can be a point of collaboration across those different audiences in an enterprise. So whereas three years ago some of our initial customers kept the data catalog implementations small, right. They were giving access for the specialists to this catalog and asking them to certify data assets for others; what we're starting to see is a proliferation of creation of self-service data assets, a certification process that now is enterprise-wide, and thousands of users in these organizations. So eBay has over a thousand weekly logins, Munich Reinsurance was on stage yesterday, their head of data engineering said they have 2,000 users on Alation at this point on their data lake, Fiserv is going to speak on Thursday and they're getting up to those numbers as well, so we see some really solid organizations that are solving medical, pharmaceutical issues, right, the largest reinsurer in the world, leading tech companies, starting to adopt a data catalog as a foundation for how they're going to make those data driven decisions in the organization. >> Talk about how the product works because essentially you're bringing kind of the decision scientists, for lack of a better word, and productivity worker, almost like a business office suite concept, as a SaaS, so you got a SaaS model that says "Hey you want to play with data, use it but you have to do some front end work." Take us through how you guys roll out the platform, how are your customers consuming the service, take us through the engagement with customers. >> I think for customers, the most interesting part of this product is that it displays itself as an application that anyone can use, right? So there's a super familiar search interface that, rather than bringing back webpages, allows you to search for data assets in your organization. If you want more information on that data asset you click on those search results and you can see all of the information of how that data has been used in the organization, as well as the technical details and the technical metadata. And I think what's even more powerful is we actually have a recommendation engine that recommends data assets to the user. And that can be plugged into Tableau and Salesforce Einstein Analytics, and a whole variety of other data science tools like Dataiku that you might be using in your organization. So this looks like a very easy to use application that folks are familiar with, that you just need a web browser to access, but on the backend, the hard work that's happening is the automation that we do with the platform.
So by going out and crawling these source systems and looking at not just the technical descriptions of data, the metadata that exists, but then being able to understand, by parsing the SQL query logs, how that data is actually being used in the organization. We call it behavior I/O: by looking at the behaviors of how that data's being used, from those logs, we can actually give you a really good sense of how that data should be used in the future, or where you might have gaps in governing that data, or how you might want to reorient your storage or compute infrastructure to support the type of analytics that are actually being executed by real humans in your organization. And that's eye opening to a lot of I.T. sources. >> So you're providing insights into the data usage so that the business could get optimized, whether it's the I.T. footprint component, or kinds of use cases, is that kind of how it's working? >> So what's interesting is the optimization actually happens in a pretty automated way, because we can make recommendations to those consumers of data of how they want to navigate the system. Kind of like Google makes recommendations as you browse the web, right? >> If you misspell something, "Oh did you mean this", kind of thing? >> "Did you mean this, might you also be interested in this", right? It's kind of a cross between Google and Amazon. Others like you may have used these other data assets in the past to determine revenue for that particular region, have you thought about using this filter, have you thought about using this join, did you know that you're trying to do analysis that maybe the sales ops guy has already done, and here's the certified report, why don't you just start with that? We're seeing a lot of reuse in organizations, where in the past, I think as an industry, when Tableau and Qlik and all these BI tools that were very self-service oriented started to take off, it was all about democratizing visualization by letting every user do their own thing, and now we're realizing, to get speed and accuracy and efficiency and effectiveness, maybe there's more reuse of the work we've already done in existing data assets, and by recommending those and expanding the data literacy around the interpretation of those, you might actually close this trust gap with the data. >> But there's one really important point that you raised, and I want to come back to it, and that is this notion of bias. So you know, Alation knows something about the data, knows a lot about the metadata, so therefore, I don't want to say understands, but it's capable of categorizing data in that way. And you're also able to look at the usage of that data by parsing some of the SQL statements and then making a determination of whether the data, as it's identified, is appropriately being used based on how people are actually applying it, so you can identify potential bias or potential misuse or whatever else it might be. That is an incredibly important thing. As you know John, we had an event last night and one of the things that popped up is how do you deal with emergence in data science and A.I., etc. And what methods do you put in place to actually ensure that the governance model can be extended to understand how those things are potentially, in a very soft way, corrupting the use of the data. So could you spend a little bit more time talking about that, because it's something a lot of people are interested in, quite frankly we don't know about a lot of tools that are doing that kind of work right now. It's an important point.
>> I think the traditional viewpoint was if we can just manage the data we will be able to have a governed system. So if we control the inputs then we'll have a safe environment, and that was kind of like the classic single source of truth, data warehouse type model. >> Stewards of the data. >> What we're seeing is, with the proliferation of sources of data and how quickly data is getting created with IOT and new modern sources, you're not able to manage data at that entry point. And it's not just about systems, it's about individuals that go on the web and find a dataset and then load it into a corporate database, right? Or you merge an Excel file with something that's in a database. And so I think what we see happening, not only when you look at bias but if you look at some of the new regulations like [Inaudible] >> Sure. Ownership, [Inaudible] >> The logic that you're using to process that data, the algorithm itself, can be biased; if you have a biased training data set that you feed into a machine learning algorithm, the algorithm itself is going to be biased. And so the control point in this world, where data is proliferating and we're not sure we can control that entirely, becomes the logic embedded in the algorithm. Even if that's a simple SQL statement that's feeding a report. And so Alation is able to introspect that SQL and highlight that maybe there is bias at work in how this algorithm is composed. So with GDPR the consumer owns their own data; if they want to pull it out from a training data set, you've got to rerun that algorithm without that consumer data, and that's your control point then going forward for the organization on different governance issues that pop up. >> Talk about the psychology of the user base, because one of the things that shifted in the data world is a few stewards of data managed everything; now you've got a model where literally thousands of people in an organization could be users, productivity users, so you get a social component in here, that people know who's doing data work, which in a way creates a new persona or class of worker. A non-techy worker. >> Yeah. It's interesting if you think about moving access to the data and moving the individuals that are creating algorithms out to a broader user group; what's important, you have to make sure that you're educating and training and sharing knowledge with that democratized audience, right? And to be able to do that you kind of want to work with human psychology, right? You want to be able to give people guidance in the course of their work rather than have them memorize a set of rules and try to remember to apply those. If you had a specialist group you can kind of control and force them to memorize and then apply; the more modern approach is to say "look, with some of these machine learning techniques that we have, why don't we make a recommendation." What you're going to do is introduce bias into that calculation. >> And we're capturing that information as you use the data. >> Well we're also making a recommendation to say "Hey do you know you're doing this? Maybe you don't want to do that." Most people using the data are not bad actors. They just can't remember all the rule sets to apply. So what we're trying to do is catch someone behaviorally in the act before they make that mistake and say hey, just a bit of a reminder, a bit of a coaching moment, did you know what you're doing? Maybe you can think of another approach to this.
And we've found that in many organizations that changes the discussion around data governance. It's no longer this top-down constraint to finding insight, which frustrates the audience that is trying to use that data. It's more like a coach helping you improve, and then the social aspect of wanting to contribute to the system comes into play, and people start communicating, collaborating on the platform, and curating information a little bit. >> I remember when Microsoft Excel came out, the spreadsheet, or Lotus 123, oh my God, people are going to use these amazing things with spreadsheets, and they did. You're taking a similar approach with analytics, a much bigger surface area of work to kind of attack from a data perspective, but in a way kind of the same kind of concept, put it in the hands of the users, have the data in their hands so to speak. >> Yeah, enable everyone to make data driven decisions. But make sure that they're interpreting that data in the right way, right? Give them enough guidance, don't let them just kind of attack the wild west and ferret it out. >> Well looking back at the Microsoft Excel spreadsheet example, I remember when a finance department would send a formatted spreadsheet with all the rules for how to use it out to 50 different groups around the world, and everyone figured out that you can go in and manipulate the macros and deliver any results they want. And so it's that same notion, you have to know something about that, but this, in many respects Stephanie, you're describing a data governance model that really is more truly governance, that if we think about a data asset, it's how do we mediate a lot of different claims against that set of data so that it's used appropriately, so it's not corrupted, so that it doesn't affect other people, but very importantly so that the outcomes are easier to agree upon because there's some trust and there's some valid behaviors and there's some verification in the flow of the data utilization. >> And where we give voice to a number of different constituencies. Because business opinions from different departments can run slightly counter to one another. There can be friction in how to use particular data assets in the business depending on the lens that you have in that business, and so what we're trying to do is surface those different perspectives, give them voice, allow those constituencies to work that out in a platform that captures that debate, captures that knowledge, makes that debate a foundation of knowledge to build upon, so in many ways it's kind of like the scientific method, right? As a scientist I publish a paper. >> Get peer reviewed. >> Get peer reviewed, let other people weigh in. >> And it becomes part of the canon of knowledge. >> And it becomes part of the canon. And in the scientific community over the last several years you see that folks are publishing their data sets out publicly, so why can't an enterprise do the same thing internally for different business groups. Take the same approach. Allow others to weigh in. It gets them better insights and it gets them more trust in that foundation. >> You get collective intelligence from the user base to help come in and make the data smarter and sharper. >> Yeah, and have reusable assets that you can then build upon to find the higher level insights. Don't run the same report that a hundred people in the organization have already run. >> So the final question for you.
As you guys are emerging, starting to do really well, you have a unique approach; honestly we think it fits in kind of the new guard of analytics, a productivity worker with data, which we think is going to be a huge persona. Where are you guys winning, and why are you winning with your customer base? What are some things that are resonating as you go in and engage with prospects and customers and existing customers? What are they attracted to, what do they like, and why are you beating the competition in your sales and opportunities? >> I think this concept of a more agile, grassroots approach to data governance is a breath of fresh air for anyone who has spent their career in the data space. We're at a turning point in the industry where you're now seeing chief decision scientists, chief data officers, chief analytics officers take a leadership role in organizations. Munich Reinsurance is using their data team to actually invest in and build new arms of their business. That's how they're pushing the envelope on leadership in the insurance space, and we're seeing that across our install base. Alation becomes this knowledge repository for all of those minds in the organization, and encourages a community to be built around data and insightful questions of data. And in that way the whole organization rises to the next level, and I think it's that vision of what can be created internally, how we can move away from just claiming that we're a big data organization and really starting to see the impact of how new business models can be created on these data assets, that's exciting to our customer base. >> Well congratulations. A hot start up. Alation here on theCUBE in New York City for CUBE NYC. Changing the game on analytics, bringing a breath of fresh air to the hands of the users. A new persona developing. Congratulations, great to have you. Stephanie McReynolds. It's theCUBE. Stay with us for more live coverage, day one of two days live in New York City. We'll be right back.
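The "behavior I/O" idea McReynolds describes, learning which data assets matter by looking at how they are actually queried, can be illustrated with a toy example. The Python sketch below is not Alation's implementation; it simply counts table references in a SQL query log, and the log file name and query format are hypothetical assumptions.

```python
# Illustrative sketch only (not Alation's implementation): estimate which
# tables analysts actually use by scanning a SQL query log.
# The log file name and its format are hypothetical.
import re
from collections import Counter

TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)

usage = Counter()
with open("query_log.sql") as log:            # hypothetical export of query history
    for statement in log.read().split(";"):
        usage.update(t.lower() for t in TABLE_REF.findall(statement))

# The most-referenced tables are natural candidates to certify and recommend first.
for table, count in usage.most_common(10):
    print(f"{table}\t{count}")
```

A production catalog would obviously do far more (lineage, joins, filters, user context), but even this toy version shows how usage logs, rather than top-down rules, can drive which assets get certified and recommended.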

Published Date : Sep 12 2018

SUMMARY :

Brought to you by SiliconANGLE Media the CMO, VP of Marketing for Alation, thanks for joining us. So you guys have a pretty spectacular so we brought a little fun to the show floor in the Alation booth is about the product You guys are one of the new guard companies is that making decisions and it's the trust gap and I really like the idea of the data scientists, production implementations get into the thousands of users. and asked them to certify data assets for others, kind of the decision scientists, gaps in governing that data or how you might want to so that the business could get optimized as you browse the web, right? in the past to determine revenue for that particular region, and one of the things that popped up is how do you deal and that was kind of like the classic it's about individuals that go on the web and find a dataset the algorithm itself is going to be biased. because one of the things that shifted in the data world And to be able to do that you kind of They just can't remember all the rule sets to apply. have the data in their hands so to speak. that data in the right way, right? and everyone figured out that you can go in in the business depending on the lens that you have And in the scientific community over the last several years You get collective intelligence from the user base Yeah and have reusable assets that you can then build upon and why are you winning with your customer base? and really starting to see the impact of how new business bringing a breath of fresh air to hands of the users.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Stephanie McReynolds | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
Dave Vellante | PERSON | 0.99+
John | PERSON | 0.99+
Peter Burris | PERSON | 0.99+
Google | ORGANIZATION | 0.99+
Stephanie | PERSON | 0.99+
Thursday | DATE | 0.99+
New York | LOCATION | 0.99+
John Furrier | PERSON | 0.99+
50 different groups | QUANTITY | 0.99+
Peter | PERSON | 0.99+
New York City | LOCATION | 0.99+
Ebay | ORGANIZATION | 0.99+
2,000 users | QUANTITY | 0.99+
Excel | TITLE | 0.99+
Attack of the Bots | TITLE | 0.99+
thousands | QUANTITY | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
two days | QUANTITY | 0.99+
yesterday | DATE | 0.99+
ninth year | QUANTITY | 0.99+
two | QUANTITY | 0.99+
STRATA | ORGANIZATION | 0.99+
Today | DATE | 0.99+
Fiserv | ORGANIZATION | 0.99+
last night | DATE | 0.99+
three years ago | DATE | 0.99+
Alation | PERSON | 0.99+
NYC | LOCATION | 0.98+
Lotus 123 | TITLE | 0.98+
Munich Reinsurance | ORGANIZATION | 0.98+
one | QUANTITY | 0.98+
GDPR | TITLE | 0.97+
Alation | ORGANIZATION | 0.96+
Microsoft | ORGANIZATION | 0.94+
SAS | ORGANIZATION | 0.94+
over a thousand weekly logins | QUANTITY | 0.91+
theCUBE | ORGANIZATION | 0.9+
Strata Conference | EVENT | 0.89+
single source | QUANTITY | 0.86+
thousands of people | QUANTITY | 0.86+
thousands of users | QUANTITY | 0.84+
Tablo | ORGANIZATION | 0.83+
day one | QUANTITY | 0.78+
2018 | EVENT | 0.75+
CUBE | ORGANIZATION | 0.75+
Salesworth | ORGANIZATION | 0.74+
Einstein Analytics | ORGANIZATION | 0.73+
Tablo | TITLE | 0.73+
Strata Hadoop | EVENT | 0.73+
a hundred people | QUANTITY | 0.7+
2018 | DATE | 0.66+
point | QUANTITY | 0.63+
years | DATE | 0.63+
Alation | LOCATION | 0.62+
Click | ORGANIZATION | 0.62+
Munich Reinsurance | TITLE | 0.6+
over a hundred | QUANTITY | 0.59+
Data | ORGANIZATION | 0.58+
Strata Data | EVENT | 0.57+
last | DATE | 0.55+
Haiku | TITLE | 0.47+

Rob Bearden, Hortonworks | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE, covering theCUBE, New York City, 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. >> And welcome to theCUBE here in New York City. We're live from CUBE NYC; this is our big data, now AI, now all things cloud event, nine years covering it from the beginning of Hadoop. Now into cloud and data as the center of the value. I'm John Furrier with David Vellante. Our special guest is Rob Bearden, CEO of Hortonworks, a CUBE alumni, been on many times, great supporter of theCUBE, legend in open source. Great to see you. >> It's great to be here, thanks. Yes, absolutely. >> So one of the things I wanted to talk to you about is that open source certainly has been a big part of the ethos, just seeing it in all sectors, again, growing even in blockchain, the open ethos is growing. The role of data now is certainly in the center. You guys have been on this vision of open data, if you will, and managing data in motion and in flight, and maybe at rest; all these things are going on. Certainly the Hadoop world has changed, not just Hadoop and data lakes anymore, it's data. All things data, it's happening. This is core to your business, you guys have been banging this drum for a long time. Stock's at an all-time high. Congratulations on the business performance. So it's working, things are working for you guys. >> I think the model and the strategy are really coming together nicely. And to your point, it's about all the data. It's about the entire life-cycle of the data and bringing all data under management through its entire life-cycle. And being able to give the enterprise that accessibility to that data across each tier, on-prem, private cloud, and across all the multi-clouds. And that's really changed, really in many regards, the overall core architecture of Hadoop and how it needs to manage data. And how it needs to interact with other data sources. And our model and strategy has been about not going above the Hadoop stack, but actually going out to the edge, and bringing data under management from the point of origination through its entire movement life-cycle until it comes to rest, and then have the ability to deploy and access that data across each tier and across a multi-cloud environment. And it's a hybrid architecture world now. >> You guys have been on this trend for a while now, it's kind of getting lift; obviously you're seeing the impact that cloud has, the impact on AI, because the faster compute you have, the faster you can process data, the faster the data can be used, machine learning, it's a nice flywheel. So again, that flywheel is being recognized. So I have to ask you, what in your opinion has been the impact of cloud computing, specifically the Amazons, and the Azures, and now Google, where certainly AI is in the center of their proposition, now hybrid cloud is validated with Amazon announcing RDS on premises on VMware. That's the first Amazon ever, ever on-premises activity. So this is clearly a validation of hybrid cloud. How has the cloud impacted the data space, and if you will, it used to be data warehousing, cloud has changed that. What's your opinion? >> Well, what it's done is given an architectural extension to the enterprise of what their data architecture needs to be, and the real key is, it's now not about hybrid or cloud or on-prem, it's about having a data strategy overall. And how do I bring all my different assets, and bring a connected community together, in real-time?
Because what enterprises are trying to do is connect and have higher velocity and faster visibility between the enterprise, the product, their customer, and their supply chain. And to do that, they need to be able to aggregate data into the best economic platform from the point of origination, maybe starting from the component on their product, a single component, and be able to bring all that data together through its life-cycle, aggregate it, and then deploy it on the most economically feasible tier. Whether that's on-prem, or a private cloud, or across multiple public clouds. And our platform, with HDF, HDP, and DataPlane, completes that hybrid data architecture. And by doing that, the real value is then the cloud, AI, and machine learning capabilities have the ability now to access all data across the enterprise, whether it be their tier in the cloud, or whether that be on-prem. And our strategy is around bringing that and being that fabric, to bring all the interconnectivity irrespective of whether it sits on the edge, in the cloud, or somewhere in between. Because the more accessibility AI has to data, the faster the velocity of driving value back into that AI cycle. >> Yeah, people don't want to move data if they don't have to. And so, and we've been on this for a while, there's this idea that you want to bring the cloud model to your data, and not the data to the cloud always. And so, how do you do that? How do you make it this kind of same-same environment? What role does Hortonworks play in it? >> Well, the first thing we want to do is bring the data under management from and through its life-cycle, where HDF goes to the edge, brings the data through its movement cycle, aggregates the streams. HDP is the data at rest platform that can sit on-prem, in a public cloud, or a private cloud. And then DataPlane is that fabric that ensures that we have connectivity to all types of data across all tiers, and then serves as the common security and governance framework, irrespective of which tier that is. And that's very, very important. And then that then gives the AI platforms the ability to bring AI onto a broader array of data, so they can then have a higher and better impact than just having an isolated AI impact on a single tier of data in the cloud. >> Well, that message seems to be resonating; we talked earlier about the stock price, but also I think Aneel Bhusri and Frank Slootman popularized the metric of number of seven-figure deals. You guys are closing some big deals, and remember in the early days Robert Vor Breath, people are like how are these guys going to sell anything, it's all open source, and you're doing a lot of million-plus dollar deals. So it's resonating not only with the Street but also with enterprises, your thoughts. >> Last quarter we, I think the key is that the industry really understands, the investors understand, the enterprises really now understand the importance of hybrid and hybrid cloud. And it's not going to be all about managing data lakes on-prem. All the data's not going to go and have this giant line of demarcation and now all reside in the cloud. It has to coexist across each tier, and our role is to be that aggregation point. >> And you've seen the big cloud players now, all of the big three, all have on-prem strategies. Azure with Azure Stack, Google we saw Kubernetes on-prem, and even AWS now, the latest being putting RDS on-prem, announced at VMworld. So they've all sort of recognized that not everything's going to go into the cloud.
So that's got to be, you know good confirmation for you guys >> It's great validation. What is also says though is, we must have cloud first architecture and a cloud first approach with all of our tech. And the key to that is, from our standpoint, within our strategy is to containerize everything. And we had an announcement earlier this week that was really a three-way announcement between us, Red Hat, and IBM; and the essence of that announcement is we've adopted the Kubernetes distro from Red Hat. To where we're are containerizing all of our platforms with Red Hat's Kubernetes distribution. And what that does, is gives us the ability to optimize our platforms for OpenShift, the Red Hat pass, and optimize then the deployment of that and the IBM private cloud, right. And naturally data plane will also then give us the ability, to extend those workloads; those very granular workloads up in to the public clouds, and we can even leverage their native objects stores. >> So that's an interesting love triangle right? You and Red Hat are kind of birds of a feather with open-source. IBM has always been a big proponent of open-source, you know funded Linux in the early days. And then brings this, a massive channel and brand, you know to that world. >> Yes. And you know this is really going to accelerate our movement into a cloud first architecture, with pure containerization. And the reason that's so important is, it gives us that modularity to move those applications and those workloads, across whichever tiers most appropriate architecturally for it to run and be deployed. >> You know we said this on theCUBE many many years ago, and continues to be this theme, enterprise is one really wanting hardened solutions, but they don't mind experimenting. And Stu Miniman and I, were always talking about and comparing OpenStack ecosystem to what's happened in the Hadoop ecosystem. There's some pockets of relevance and it's a lot of work to build your own, and OpenStack has a great solution for certain use cases, now mostly on the infrastructure side But when cloud came in and changed the game, because you saw things like Kubernetes. I mean we're here at the Hadoop show that started with Hadoop, now it's AI, the word Kubernetes is being talked about. You mentioned hybrid cloud, these aren't words that were spoken at an event like this. So the IT problem in multi-cloud has always been a storage issue. So you do some storage work, you got to store the data somewhere, but now you're talking about Kubernetes. You're talking about orchestration around workloads, the role of data in workloads. This is what enterprise IT actually cares about right now. This is not like, a small little thing, it's a big deal because data is not only in the workloads, they're using instrumentation with containers, with service meshes around the coin. You're starting to see policy, this is hardcore B2B enterprise features. >> This is where with what we're seeing is a massive transformational shift of how the IT architecture's going to look for the next 20 years. Right. The IT world it is been horribly constrained from this very highly configured, very procedural-based applications and now they want to create high velocity engagement between the enterprise, their product, their customer and supply chain. They were so constrained with these very procedural-based applications and containerization gives the ability now to create that velocity and to move those workloads, and those interactions between that four pillars. 
>> Now let's talk about the edge. Cause the pendulum is clearly swinging sort of back, there's some decentralization going on, and the edge to us is a data play. We talk about it all the time. What are your thoughts on the edge, where does Hortonworks fit? What's your vision of the data modeling and how that evolves? >> That goes back to, the insight to that would be our strategy and what we did; we had the great fortune, quite frankly, of having the ability to merge Onyara and Hortonworks back in 2015. And we wanted, and the whole goal of that, besides working with the great team Joe Witt had built, was being able to get to the edge. And what we wanted to have the ability to do was to operate on every sensor, on every device at the edge for the customer, so that they could bring the data under management wherever that may be, through its entire life-cycle; so from point of origination through its movement until it comes to rest. So our belief is that if we can bring enough intelligence and faster insights as that data is being generated, and as events or conditions are happening, moving, or changing, before it ever comes to rest we can process and take prescriptive action. Leveraging AI and machine learning while it's in its life-cycle, we can dramatically decrease the amount of data we have to bring to rest. We can just bring the provenance, the metadata, to rest and have that insight. And we try to get to these high velocity, real-time insights starting with the data on the edge. And that's why we think it's so important to manage the entire life-cycle. And then, what's even more important is to then put that data onto whatever tier. That may be bringing it back to rest in a data lake on-prem, right, to aggregate with other like data structures. Or it may be taking it into cold storage on a native object store in a cloud, that has the lowest cost storage structure for a particular time. >> Or take an action on the edge and leave it there. >> Yeah. You guys definitely think about the edge in a big way, that's pretty obvious. But what I want to get your thoughts on is an emerging area we're watching, and I'll call it, for lack of a better description, programmable data. And you mentioned data architecture; what's being set up now will probably set a 10-, 20-year run for enterprises as they set up their data architecture with the cloud architects. Making data programmable is kind of a DevOps concept, right. And this is something that you guys have thought about with DataPlane; what's your reaction to this notion of making data programmable? When you start talking about Kubernetes, you're going to have stateful applications, stateless applications, you have new dynamics, I call it API 2.0, happening. Whole new infrastructure happening, data has to be programmable, going to need policy around it, the role of data's certainly changing rather than just storing it somewhere. What's your view of programmable data, making it programmable? >> Well, to truly have programmable data, you can't have slices of accessibility or windows. You have to understand the lineage of that entire data, and the context of that data through its entire life-cycle. That's point number one. Point number two is, you have to be able to have that containerized so that you can take the module of data that you want to take prescriptive action against, or create action against a condition. And to be able to do that in granular bits or chunks, right.
And then you've got to have accessibility to all the other contextual data, which means whether that's as it's in motion, as it's at rest, or its contextual cousin, if you will, that sits up in an object store on another tier in a public cloud. Right. But what's important is that you have to be able to control and understand the entire lineage of that. And therefore, that's where the second step in this is DataPlane. And having the ability to have a full security model through that entire architectural chain, as well as the entire governance and lineage, leveraging Atlas through DataPlane. And that then gives you the ability to take these very prescriptive actions that are driven through AI and machine learning insights. >> And that makes you very agile, love it. I mean, the ethos of open source and DevOps is literally being applied to everything. We see it at the network layer, you see it at the data layer, you're starting to see this concept of dev and ops being applied in a big way. >> In previous years, you know, we've talked about what we're trying to accomplish. When we started Hortonworks, it was about changing the data architecture for the next 20 years and how data was going to be managed. And that's had, to your earlier point when we opened up the show, that's had twists and turns. Hadoop's evolved, the nature and velocity of data has evolved in the last five, six, seven, eight years, you know. It's about going to the edge, it's about leveraging the cloud, and we're very excited about where we're positioned as this massive transformation's happening. And what we're seeing is the iteration of change is happening at an incredibly fast pace. Even much more so than it was two, three years ago. >> Yeah, the clock speed's definitely up, the data is working. People putting it to work. What works... >> They're able to get more value faster because of it. >> The AI is great. >> The data economy is here and now. And the enterprise understands it. So they want to now move aggressively to change and transform their business model to take advantage of what their data is giving them the ability to do. >> That's great. They always want the value, and they want it fast, and anything that gets in the way, they'll remove the blockers, as we say. >> Alright, it's theCUBE here with Rob Bearden, CEO of Hortonworks, giving his vision but also an update on the company; data at the center of the value proposition. This is about AI, it's about big data, it's about the cloud. It's theCUBE bringing you theCUBE data here in New York City. CubeNYC, that's the hashtag; check us out on Twitter. Stay with us for live coverage all day today and tomorrow here in New York City. We'll be right back after this short break. (upbeat music)
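Bearden's point about processing data in motion and bringing only the provenance, the metadata, to rest can be made concrete with a tiny sketch. The Python below is illustrative only; it is not HDF or NiFi code, and the sensor reading format and the threshold are hypothetical assumptions.

```python
# Illustrative sketch only (not HDF/NiFi code): score events as they stream
# off a sensor, act on anomalies immediately at the edge, and keep only
# lightweight provenance records to forward to the data-at-rest tier.
# The reading format and the 90.0 threshold are hypothetical.
import json
import time

def process_edge_stream(readings, threshold=90.0):
    provenance = []                      # small records destined for the data lake
    for reading in readings:             # e.g. {"sensor": "pump-7", "temp_c": 93.2}
        if reading["temp_c"] > threshold:
            # Prescriptive action happens at the edge, before the data is at rest.
            print(f"ALERT {reading['sensor']}: {reading['temp_c']} C")
        provenance.append({
            "sensor": reading["sensor"],
            "ts": time.time(),
            "over_threshold": reading["temp_c"] > threshold,
        })
    # Only the compact provenance/metadata is shipped onward, not every raw event.
    return json.dumps(provenance)

# Example usage with two synthetic readings:
print(process_edge_stream([{"sensor": "pump-7", "temp_c": 93.2},
                           {"sensor": "pump-8", "temp_c": 71.0}]))
```

The design choice is the one described in the interview: act where the event happens, and dramatically shrink what has to land in the on-prem data lake or cloud object store.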

Published Date : Sep 12 2018

Arun Murthy, Hortonworks | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE, covering theCUBE New York City 2018, brought to you by SiliconANGLE Media and its ecosystem partners. >> Okay, welcome back everyone, here live in New York City for CUBENYC, formerly Big Data NYC, now called CUBENYC. The topic has moved beyond big data. It's about cloud, it's about data, it's also about potentially blockchain in the future. I'm John Furrier, Dave Vellante. We're happy to have a special guest here, Arun Murthy. He's the cofounder and chief product officer of Hortonworks, been in the ecosystem from the beginning, at Yahoo, already been on theCUBE many times, but great to see you, thanks for coming in, >> My pleasure, >> appreciate it. >> thanks for having me. >> Super smart to have you on here, because a lot of people have been squinting through the noise of the marketplace. You guys have been on this DataPlane idea for a few years now. Hadoop launched with Cloudera, they were first; you came after, out of Yahoo, became second, two big players. Evolved it quickly, you guys saw early on that this is bigger than Hadoop. And now, all the conversations are on what you guys were talking about three years ago. Give us the update, what's the product update? How is hybrid a big part of that, what's the story? >> We started off being the Hadoop company, and Rob, our CEO who was here on theCUBE a couple of hours ago, he calls it sort of the phase one of the company, where we were a Hadoop company. Very quickly we realized we had to help enterprises manage the entire life cycle of data, all the way from the edge to the data center, to the cloud, and in between, right. Which is why we did the acquisition of Onyara, we've been talking about it, which kind of became the basis of our Hortonworks DataFlow product. And then as we went through the phase of that journey it was quickly obvious to us that enterprises had to manage data and applications in a hybrid manner, right, which is both on-prem and public cloud and increasingly edge, which is really where we spend a lot of time these days, with IoT and everything from autonomous cars to video monitoring to all these aspects coming in. Which is why we wanted to get to the DataPlane architecture; it allows you to get to a consistent security and governance model. There's a lot of, I'll call it a lot of, a lot of fight about cloud being insecure and so on. I don't think there's anything inherently insecure about the cloud. The issue that we see is lack of skills. Our enterprises know how to manage the data on-prem, they know how to do LDAP, groups, and Kerberos, and AAD, and what have you; they just don't have the skill sets yet to be able to do it on the public cloud, which leads to mistakes occasionally. >> Um-hm. >> And data breaches and so on. So we recognized really early that part of DataPlane was to get that consistent security and governance model, so you don't have to worry about how you set up IAM roles on Amazon versus LDAP on-prem versus something else on Google. >> It's operating consistency. >> It's operating, exactly. I've talked about this in the past. So getting to DataPlane was that journey, and what we announced this week was that we wanted to take that step further: we've been able to kind of allow enterprises to manage this hybrid architecture on-prem, multiple public clouds. >> And the edge. >> In a connected manner. The issue we saw early on, and it's something we've been working on for a long while, 
is how we connect the architectures. Hadoop, when it started, was more of an on-premise architecture, right, and I was there in 2005, 2006 when it started. Hadoop, when it started, was built for the world wide web; we had a gigabit of Ethernet up to the rack. From the rack on up we had only eight gigs, so if you have a 2,000-node cluster you're dealing with eight gigs of connection. >> Bottleneck. >> Huge bottleneck. Fast forward to today, you have at least ten if not one hundred gigabits, moving to hundred-gigabit, even terabit, architectures from that standpoint, and then what's happening is, everything in that world, we've had the opportunity to rethink the assumptions we had in Hadoop. And then the good news is that when cloud came along, cloud already had decoupled storage and compute architectures. As we've sort of helped customers navigate the two worlds with DataPlane, it's been a journey that's been reasonably successful, and I think we have an opportunity to kind of provide identical, consistent architectures both on-prem and in cloud. So it's almost like we took Hadoop and adapted it to cloud. I think we can adapt the cloud architecture back on-prem, too, to have consistent architectures. >> So talk about the cloud native architecture. So you have a post that just got published, cloud native architecture for big data and the data center. No, cloud native architecture to big data in the data center. That's hybrid, explain the hybrid model, how do you define that? >> Like I said, for us it's really important to be able to have consistent architectures, consistent security, consistent governance, a consistent way to manage data, and a consistent way to actually develop and port applications. So portability for data is important, which is why having security and governance consistently is key. And then portability for the applications themselves is important, which is why we are so excited to kind of be, kind of first to embrace the whole containerize-the-ecosystem initiative. We've announced the open hybrid architecture initiative, which is about decoupling storage and compute and then leveraging containers for all the big data apps, for the entire ecosystem. And this is where we are really excited to be working with both IBM and Red Hat, especially Red Hat given their sort of investments in Kubernetes and OpenShift. We see that much like you'll have S3 and EC2, S3 for storage, EC2 for compute, and same thing with ADLS and Azure compute, you'll actually have the next-gen HDFS and Kubernetes. >> So is this a massive architectural rewrite, or is it more sort of management around the core? >> Great question. So part of it is evolution of the architecture. We have to get, whether it's Spark or Kafka or any of these open source projects, we need to do some evolution in the architecture to make them work in the ecosystem, in the containerized world. So we are containerizing every one of the 28, 30 animals in the zoo, right. That's a lot of work; we are kind of, you know, sort of doing it, we've done it in the past. And to your point, it's not enough to just have the architecture, you need to have a consistent fabric to be able to manage and operate it, which is really where DataPlane comes in again. That was really the point of DataPlane all along. This is a multi-year roadmap; you know, when we sit down we are thinking about what we'll do in '22 and '23. But we really have to execute on a multi-year roadmap. 
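As a rough illustration of the decoupled storage-and-compute pattern Murthy describes, the PySpark sketch below reads directly from object storage instead of from disks co-located with the compute nodes. The bucket path is a placeholder, and reading s3a:// paths assumes the cluster already has the usual Hadoop S3 connector configured; none of these names come from the interview.

```python
from pyspark.sql import SparkSession

# Compute is ephemeral; the data lives in an object store (S3 here, but the
# same pattern applies to ADLS or GCS with a different URI scheme).
spark = (
    SparkSession.builder
    .appName("decoupled-storage-example")
    .getOrCreate()
)

# Hypothetical bucket and path -- replace with a real location.
events = spark.read.parquet("s3a://example-analytics-bucket/events/2018/09/")

event_counts = (
    events.groupBy("event_type")
    .count()
    .orderBy("count", ascending=False)
)

event_counts.show(10)
spark.stop()
```

The same job runs unchanged against another object store by swapping the URI scheme, which is the consistency argument being made here.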
>> And DataPlane was a linchpin. >> Well, it was just like the sharp edge of the sword. Right, it was the tip of the spear, but really the idea was always that we have to get DataPlane in to kind of get that hybrid product out there. And then we can sort of get to an intergenerational DataPlane which would work with the next generation of the big data ecosystem itself. >> Do you see Kubernetes, and things like Kubernetes, you've got Istio, a few service meshes up the stack, >> Absolutely, are going to play a pretty instrumental role around orchestrating workloads and providing new stateless and stateful applications with data, so now you've got more data being generated there. So this is a new dynamic, it sounds like that's a fit for what you guys are doing. >> Which is something we've seen for a while now. Like, containers are something we've tracked for a long time, and really excited to see Docker and Red Hat, all the work that they are doing with Red Hat containers, get the security and so on. It's the maturing of that ecosystem. And now, the ability to port, build and port applications. And the really cool part for me is that we will definitely see Kubernetes and OpenShift on-prem, but even if you look at the cloud, the really nice part is that each of the cloud providers themselves provide a Kubernetes service. Whether it's GKE on Google or Fargate on Amazon or AKS on Microsoft, we will be able to take identical architectures and leverage them. When we containerize Hive, MapReduce, or Spark, we will be able to do this with Kubernetes and Spark with OpenShift, and there will be OpenShift Online, which is available in the public cloud, but also GKE and Fargate and AKS. >> What's interesting about the Red Hat relationship is that I think you guys are smart to do this; by partnering with Red Hat, customers can run their workloads, analytical workloads, in the same production environment that Red Hat is in, but with kind of differentiation, if you will. >> Exactly, with DataPlane. >> DataPlane is just a wonderful thing there. So again, good move there. Now around the ecosystem. Who else are you partnering with? What else do you see out there? Who is in your world that is important? >> You know, again, our friends at IBM, that we've had a long relationship with. We are doing a lot of work with IBM to integrate DataPlane and also ICP for Data, the IBM Cloud Private for Data, which brings along all of the IBM ecosystem, whether it's Db2 or IGC, the Information Governance Catalog; all of that kind of comes back into this world. What we also believe this will give a fillip to is the whole continued standardization of security and governance. So you guys remember the old ODPi, it caused a bit of a flutter a few years ago. (anxious laughing) >> We know how that turned out. >> What we did was we kind of said, the old ODPi was based on the old distributions; now it's ODPi's turn to be more about metadata and governance. So we are collaborating with IBM on ODPi more on metadata and governance, because again we see that as being very critical in this sort of multi-cloud, on-prem, edge world. >> Well, the narrative was always why do you need it, but it's clear that these three companies have succeeded dramatically when you look at the financials; there have been statements made about IBM's contribution of seven-figure deals to you guys. We had Red Hat on, and you guys are birds of a feather. [Murthy] Exactly. >> It certainly worked for you three, which presumably means it confers value to your customers. 
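To show what "identical architectures" across GKE, AKS, or OpenShift can look like in practice, here is a hedged sketch of submitting a containerized Spark job to a Kubernetes scheduler. The API server address, container image, and namespace are placeholders, not values from the interview; the same command targets a different cluster just by changing the endpoint.

```python
import shlex
import subprocess

# The same containerized Spark job can target any conformant Kubernetes
# endpoint -- GKE, AKS, or OpenShift -- by swapping the API server address.
K8S_API = "https://k8s-api.example.internal:6443"    # placeholder endpoint
IMAGE = "registry.example.com/spark-py:2.4.0"        # placeholder image

cmd = [
    "spark-submit",
    "--master", f"k8s://{K8S_API}",
    "--deploy-mode", "cluster",
    "--name", "portable-etl",
    "--conf", "spark.executor.instances=4",
    "--conf", f"spark.kubernetes.container.image={IMAGE}",
    "--conf", "spark.kubernetes.namespace=analytics",
    "local:///opt/spark/work-dir/etl_job.py",         # job script baked into the image
]

print(" ".join(shlex.quote(c) for c in cmd))
# subprocess.run(cmd, check=True)   # uncomment to actually submit the job
```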
>> Which is really important, right, from a customer standpoint. What's something we really focus on is that the benefit of the bargain is that now they understand that some of their key vendor partners, that's us and IBM and Red Hat, we have a shared roadmap, so now they can be much more sure about the fact that they can go to containers and Kubernetes and so on and so on, because all of the tools that they depend on, and all the partners they depend on, are working together. >> So they can place bets. >> So they can place bets, and the important thing is that they can place longer-term bets. Not a quarter bet; we hear about customers talking about building the next-gen data centers with Kubernetes in mind. >> They have to. >> They have to, right, and it's more than just building machines up, because what happens is, with this world we talked about, things like networking, the way you do networking in this world with Kubernetes is different than you did before. So now they have to place longer-term bets, and they can do this now with the guarantee that the three of us will work together to deliver on the architecture. >> Well Arun, great to have you on theCUBE, great to see you. Final question for you, as you guys have a good long plan, which is very cool. Short term, customers are realizing the set-up phase is over, okay, now they're in usage mode. So the data has got to deliver value, so there is a real pressure for ROI. We would give people a little bit of a pass earlier on, because set up everything, set up the data lakes, do all this stuff, get it all operationalized, but now, with AI and machine learning front and center, that's a signal that people want to start putting this to work. What have you seen customers gravitate to from the product side? Where are they going, is it the streaming, is it the Kafka, is it the, what products are they gravitating to? >> Yeah, definitely. I look at these, in my role, in terms of use cases, right. We are certainly seeing a continued push towards the real-time analytics space, which is why we placed a longer-term bet on HDF and Kafka and so on. What's been really heartening, kind of back to your sentiment, is we are seeing a lot of push right now on security and governance. That's why, for GDPR, we introduced a bunch of capabilities in DataPlane, with DSS, and James Cornelius wrote about this earlier in the year; we are seeing customers really push us for key aspects like GDPR. This is a reflection for me of the fact of the maturing of the ecosystem. It means that it's no longer something on the side that you play with; the whole ecosystem is now more a system of record instead of a system of augmentation, so that is really heartening, but it also brings a sharper focus and more sort of responsibility on our shoulders. >> Awesome, well congratulations, you guys have stock prices at a 52-week high. Congratulations. >> Those things take care of themselves. >> Good products, and stock prices take care of themselves. >> Okay, theCUBE coverage here in New York City, I'm John Furrier with Dave Vellante; stay with us for more live coverage, all things data, happening here in New York City. We will be right back after this short break. (digital beat)
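The GDPR point above is about capabilities rather than code, but a generic illustration of one such capability — pseudonymizing direct identifiers before data comes to rest — might look like the Python sketch below. It is not Data Steward Studio or DataPlane functionality; the field list and the salt handling are assumptions for the example.

```python
import hashlib

SALT = "rotate-me-regularly"      # assumption: a secret salt managed outside the code
PII_FIELDS = {"email", "phone"}   # assumption: which columns count as direct identifiers

def pseudonymize(record):
    """Replace direct identifiers with salted hashes before the record lands at rest."""
    cleaned = dict(record)
    for field in PII_FIELDS & cleaned.keys():
        digest = hashlib.sha256((SALT + str(cleaned[field])).encode()).hexdigest()
        cleaned[field] = digest[:16]
    return cleaned

records = [
    {"customer_id": 101, "email": "ada@example.com", "phone": "555-0100", "country": "DE"},
    {"customer_id": 102, "email": "sam@example.com", "phone": "555-0101", "country": "FR"},
]

for r in records:
    print(pseudonymize(r))
```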

Published Date : Sep 12 2018

Byron Banks, SAP Analytics | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. (techy music) >> Hey, welcome back, everyone. It's theCUBE live in New York City for CUBENYC, formerly Big Data NYC. Now it's turned from big data into a much broader conversation. CUBENYC is exploring all these around data, data intelligence, cloud computing, devops, application developers, data centers, the whole range, all things data. I'm John Furrier here with Peter Burris, cohost and analyst here on the session. Our next guest is Byron Banks, who's the vice president of product marketing at SAP Analytics. No stranger to enterprise analytics. Welcome to theCUBE, thanks for joining us. >> Thank you for having us. >> So, SAP is, you know, a brand that's been doing business analytics for a long, long time, certainly powering-- >> Mm-hm, sure. >> The software for larger enterprises. Supply chain, you name it-- >> Sure. >> ERP, everyone kind of knows the history of SAP, but you guys really have been involved in analytics. HANA's been tailor-made for some speed. We've been covering that, but now as the world turns into a cloud native-- >> Mm-hm. >> SAP has a global cloud platform that is multi-cloud driven you guys kind of see this picture of a horizontally scalable computing environment. Analytics is a big, big piece of that, so what's going on with machine learning and AI, and as analytical software and infrastructure need to be provisioned dynamically. >> Sure, sure. >> This is an opportunity for people who love to get into the data. >> Absolutely. >> This is a great opportunity. What's the uptake? >> Great opportunity for us. We firmly believe that the era of optimization and digitization is over. It's not enough, it's certainly important. It has given a lot of benefits, but just overwhelming every user, every customer with more data, more optimization, faster data, better data, it's not enough. So, we believe that the concept to switch to intelligence, so how do you make customers, how do you serve customers exactly what they need in the moment? How do you give them an offer that is relevant? Not spam them, give them a great offer. How do you motivate your employees to be the best at what they do, whether it's in HR or whether it's in sales, and we think technology's key to that, but at the end of the day, the customer, the organization is the driver. They are the driver, they know their business best, so what we want to do is be the pit crew, if you will, to use a racing analogy, if they're the driver of the race car we want to bring the technology to them with some best practices and advice, because again, we're SAP, we've been in the business for 45 years, so we have a very good perspective of what works based on the companies we see, and serve over 300,000 of them, but it's really enabling them to be their best, and the customers that are doing the best, we call those intelligent enterprises, and that means three components. It needs intelligent applications, what we call the intelligent suite. So, how do we make an HR application that is great at retaining the best employees and also attracting great ones? How do we enable a sales system to give the best offers and do the best forecasts? So, all of that is the intelligent applications. The middle layer for that is called intelligent technologies. So, how do we use these great technologies that we've been developing as an industry over the last three to five years? 
Things like big data, IoT, sensors, machine learning, and analytics. That intelligent technology layer, how do we make that available, and then finally, it's the digital core, the digital platform for that. So, how do we have this scalable platform, ideally in the cloud, that can pull data from both cloud sources, SAP sources, non-SAP sources, and give the right data to those applications-- >> Yeah. >> And technologies in realtime. >> I love the pit crew example of the race car on the track, because you want to get as much data in the system as possible because more data is, you know, more opportunities to understand and get insights, but at the end of the day, you want to make sure that the car not only runs well on the track, (chuckles) and is cost effective, but it's performing. It actually wins the race or stays in the race. So, customers want revenue, I mean, the big thing we're hearing is, "Okay, let's get some top line benefit, not just "good cost effectiveness." >> Right, right. >> So, the objective of the customer, and whatever, that can be applications, it could be, you know, insight into operational efficiency. The revenue piece of growth is a big part of the growth strategy-- >> Right. >> For companies to have a data-centric system. >> Absolutely. >> This is part of the intelligence. >> But it's not just presenting the data. We introduced a product a couple of years ago, and I promise this isn't going to be a marketing pitch, (chuckles) but I think it's very relevant to what you just said. So, the SAP Analytics Cloud, that's one of those technologies I talked about, intelligent technologies. So, it is modern, built from the ground for SAS applications, cloud-based, built on the SAP cloud platform, and it has three major components. It has planning, so what are my KPIs? If I'm in HR am I recruiting talent or am I retraining talent? What are my KPIs if I'm in sales? Am I trying to drive profitability or am I trying to track new customers? And if I'm in, you know, again, in marketing how effective are we on campaigns? Tied to that is all the data visualization we can do so that we can mix and match data to discover new insights about our business, make it very, very easy, again, to connect with both SAP and non-SAP sources, and then provide the machine learning capabilities. All of that predictive capability, so not just looking at what happened in the past, I'm also looking at what's likely to happen in the next week, and the key point to all of that is when you open the application and start, the first thing it asks you is, "What are you trying to do? "What is the business problem you're trying to solve?" It's a story, so it's designed from the get-go to be very business outcome focused, not just show you 50 different data sources or 100 different data sources and then leave it to you to figure out what you should be doing. >> Yeah. >> So, it is designed to be very much a business outcome driven environment, so that, again, people like me, a marketer, can logon to that product and immediately start to work in campaigns-- >> Yeah. >> And in the language that I want to work in, not in IT speak or geek speak. Nothing wrong with geek speak, but again-- >> Yeah, I want to get into a conversation, because one of the things, we're very data driven as a media company because we have data that's out there, consumption data, but some platforms don't have measurement capability, like LinkedIn doesn't finance any analytics. >> Sure. 
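As a toy version of the "what's likely to happen next" predictive capability described above, the sketch below fits a plain least-squares trend line to made-up quarterly figures. It is deliberately generic and is not the SAP Analytics Cloud predictive engine or its API; the numbers are invented for illustration.

```python
# A toy projection of the "what's likely to happen next quarter" idea.
# Plain least-squares trend line over hypothetical revenue figures.
quarterly_revenue = [4.1, 4.4, 4.9, 5.2, 5.8, 6.1]   # last six quarters, in $M (made up)

n = len(quarterly_revenue)
xs = range(n)
x_mean = sum(xs) / n
y_mean = sum(quarterly_revenue) / n

slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, quarterly_revenue)) / \
        sum((x - x_mean) ** 2 for x in xs)
intercept = y_mean - slope * x_mean

next_quarter = slope * n + intercept
print(f"Projected next-quarter revenue: ${next_quarter:.2f}M")
```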
>> So, this data that's out there that I need, I want, that might be available down the road, but not today, so I want to get to that conversation around, okay, you can measure what you're looking at, so everything that's measurable you've got dashboards for, but-- >> Sure. >> There's some elusive gaps between what's available that could help the data model. These are future data sets, or things that aren't yet instrumented properly. >> Correct. >> As new technology comes in with cloud native the need for instrumentation's critical. How do you guys think about that from a product standpoint, because you know, customers aren't going to say, "Well, create a magic linkage between something "that doesn't exist yet," but soon data will be existing. You know, for instance, network effect or other things that might be important for people that aren't yet measurable but might be in the future. >> Sure. >> They want to be set up for that, they don't want to foreclose that. >> Sure, well and I think one of the balances we have as SAP, because we're a technology company and we built a lot of great tools, but we also work a lot with our customers around business processes, so as I said, when we introduce our products we don't want to give them just a black box, which is a bunch of feeds and speeds technologies-- >> Yeah. >> That they need to figure it out. As we see patterns in our customers, we build an end-to-end process that is analytics driven and we provide that back to our customers to give them a headstart, but we have to have all of the capabilities in our solutions that allow them to build and extend in any way possible, because again, at the end of the day, they have a very unique business, but we want to give them a jumping off point so that they're not just staring at a blank screen. It's kind of like writing a speech. You don't want to start with just a blank screen. If you're in sales and marketing and you want to do a sales forecast, we will provide out-of-the-box, what we call embedded analytics, a fully complete dashboard that will take them through a guided workflow that says, "Hey, you want to do a sales forecast. "Here's the data we think you want to pull, "do you want to pull that? "Here's some additional inference we've seen "from some of our machine learning algorithms "based on what has happened in the last six weeks "of selling and make a projection as to what "we expect will happen between now and next quarter." >> You get people started quickly, that's the whole goal. Get people started quickly. >> Exactly, but we don't lock them into only doing it the one way, the right way. We're not preaching >> Yeah. >> We want to give them the flexibility. >> But this is an important point, because every, almost every decision at some point in time comes back to finance. >> Sure. >> And so, being able to extend your ability to learn something about data and act on data as measurements improve, you still want to be able to bring it back to what it means from a return standpoint, and that requires some agreement, not just some, a lot of agreement-- >> Sure. >> With a core financial system, and I think that this could be one of the big opportunities that you guys have, is because knowing a lot about how the data works, where it is, sustaining that so that the transactional integrity remains the same but you can review it through a lot of different analytics systems-- >> Right. >> Is a crucial element of this, would you agree? 
>> I fully agree, and I think if you look at the analytics cloud that I talked about, the very first solution capability we built into it was planning. What are my KPIs that I'm trying to measure? Now, yes, of course if you're in a business it all turns into dollars or euros at the end of the end of the day, but customer satisfaction, employee engagement, all of those things are incredibly important, so I do believe there is a way to put measurements, not always at a dollar value, that are important for what you're trying to do, because it will ultimately translate into dollars down the road. >> Right, and I want to get the news. You guys have some hard news here in New York this week on your analytics and the stuff you're working on. What's the hard news? >> Absolutely. Absolutely, so today we announced a bunch of updates to our analytics cloud platform. We've had it around for three or four years, thousands of customers, a lot of great innovation, and what we were doing today, what we announced today, is the update since our SAPPHIRE, our big, annual conference in June this year, so we have built a number of machine learning capabilities that, again, speak in the language of the business user, give them the tools that allow them to quickly benefit from things like correlations, things like regressions, patterns we've seen in the data to guide them through a process where they can do forecasting, retainment, recruiting, maybe even looking for bias, and unintended bias, in things like campaigns or marketing campaigns. Give them a guided approach to that, speaking in their terms, using very natural language processing, so for example, we have things like Smart Insights where you can ask questions about, "Give me the sales forecast for Japan," and you can say it, just type it that way and the analytic platform will start to construct and guide you through it, and it will build all the queries, it will give you, again, you're still in control, but it's a very guided process-- >> Yep. >> That says, "Do you want to run a forecast? "Here's how we recommend a forecast. "Here are some variables we find very, very interesting." That says, "Oh, in Japan this product sold "really well two quarters ago, "but it's not selling well this quarter." Maybe there's been a competitive action, maybe we need to look at pricing, maybe we need to retrain the sales organization. So, it's giving them information, again, in a very guided business focus, and I think that's the key thing. Like data scientists, we love them. We want to use them in a lot of places, but can't have data scientists involved in every single analytic that you're trying to do. >> Yeah. >> There are just not enough in the world. >> I mean, I love the conversation, because this exact conversation goes down the road of devops-like conversation. >> Right. >> Automation, agility, these are themes that we're talking about in cloud platforms, (chuckles) say data analytics. >> Absolutely. >> So, now you're bringing data down. Hey, we're automating things, so it could look like a Siri or voice activated construct for interaction. >> Yeah, absolutely, and in their language, again, in the language that the end user wants to speak, and it doesn't take the human out of it. It's actually making them better, right? We want to automate things and give recommendations so that you can automate things. >> Yeah. >> A great example is like invoice matching. 
We have customers that use, you know, spent hundreds of people, thousands of hours doing invoice matching because the address wouldn't line up or the purchase order had a transposed number in it, but using machine learning-- >> Yeah, yeah. >> Or using algorithms, we can automate all of that or go, "Hey, here's a pattern we see." >> Yeah. >> "Do you want us to automate "this matching process for you?" And customers that have-- >> Yeah. >> Implemented, they've found 70% of the transactions could be automated. >> I think you're right on, I personally believe that humans are more valuable, certainly in the media business that people think is, you know, sliding down, but humans, huge role. Now, data and automation can surface and create value that humans can curate on top of, so same with data. The human role is pretty critical in this because the synthesis is being helped by the computers, but the job's not going away, it's just shortcutting to the truth. >> And I think if you do it right machine learning can actually train the users on the job. >> Yeah. >> I think about myself and I think about unintended bias, right, and you look at a resume that you put out or a job posting, if you use the term I want somebody to lead a team, you will get a demographic profile of the people that apply to that job. If you use the term build a team, you'll get a different demographic profile, so I'm not saying one's better or the other, but me as a hiring manager, I'm not aware of that. I'm not totally on top of that, but if the tool is providing me information saying, "Hey, we've seen these keywords "in your marketing campaign," or in your recruiting, or even in your customer support and the way you speak with your customers, and it's starting to see patterns, just saying, "Hey, by the way, "we know that if you use these kinds of terms "it's more likely to get this kind of a response." That helps me become a better marketer. >> Yeah. >> Or be more appropriate in the way I engage with my customers. >> So, it assists you, it's your pit crew example, it's efficiency, all kind of betterment. >> Absolutely. >> Byron, thanks for coming on theCUBE, appreciate the time, coming to share and the insights on SAP's news and your vision on analytics. Thanks for coming on, appreciate it. It's theCUBE live in New York City for CUBENYC. I'm John Furrier with Peter Burris. Stay with us, day one continues. We're here for two days, all things data here in New York City. Stay with us, we'll be right back. (techy music)
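The invoice-matching example lends itself to a small illustration. The sketch below uses simple string similarity to auto-match invoices to purchase orders above a threshold and route the rest to a human reviewer, which is the split Banks describes; the data, threshold, and matching method are assumptions for the example, not SAP's implementation.

```python
from difflib import SequenceMatcher

# Toy data: in practice these would come from the ERP and the scanned invoices.
purchase_orders = {
    "PO-10234": "ACME Industrial Supply, 42 Harbor Rd, Hamburg",
    "PO-10235": "Globex Logistics GmbH, 7 Weser Strasse, Bremen",
}
invoices = [
    {"invoice_id": "INV-881", "vendor_line": "Acme Industril Supply, 42 Harbour Road, Hamburg"},
    {"invoice_id": "INV-882", "vendor_line": "Initech Consulting, Munich"},
]

AUTO_MATCH_THRESHOLD = 0.80   # assumption: tuned on historical matches

def best_match(vendor_line):
    """Return the purchase order whose text is most similar to the invoice line."""
    scored = [
        (po, SequenceMatcher(None, vendor_line.lower(), text.lower()).ratio())
        for po, text in purchase_orders.items()
    ]
    return max(scored, key=lambda s: s[1])

for inv in invoices:
    po, score = best_match(inv["vendor_line"])
    if score >= AUTO_MATCH_THRESHOLD:
        print(f'{inv["invoice_id"]} auto-matched to {po} (score {score:.2f})')
    else:
        print(f'{inv["invoice_id"]} routed to a human reviewer (best score {score:.2f})')
```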

Published Date : Sep 12 2018

Kickoff | theCUBE NYC 2018


 

>> Live from New York, it's theCUBE covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. (techy music) >> Hello, everyone, welcome to this CUBE special presentation here in New York City for CUBENYC. I'm John Furrier with Dave Vellante. This is our ninth year covering the big data industry, starting with Hadoop World and evolved over the years. This is our ninth year, Dave. We've been covering Hadoop World, Hadoop Summit, Strata Conference, Strata Hadoop. Now it's called Strata Data, I don't know what Strata O'Reilly's going to call it next. As you all know, theCUBE has been present for the creation at the Hadoop big data ecosystem. We're here for our ninth year, certainly a lot's changed. AI's the center of the conversation, and certainly we've seen some horses come in, some haven't come in, and trends have emerged, some gone away, your thoughts. Nine years covering big data. >> Well, John, I remember fondly, vividly, the call that I got. I was in Dallas at a storage networking world show and you called and said, "Hey, we're doing "Hadoop World, get over there," and of course, Hadoop, big data, was the new, hot thing. I told everybody, "I'm leaving." Most of the people said, "What's Hadoop?" Right, so we came, we started covering, it was people like Jeff Hammerbacher, Amr Awadallah, Doug Cutting, who invented Hadoop, Mike Olson, you know, head of Cloudera at the time, and people like Abi Mehda, who at the time was at B of A, and some of the things we learned then that were profound-- >> Yeah. >> As much as Hadoop is sort of on the back burner now and people really aren't talking about it, some of the things that are profound about Hadoop, really, were the idea, the notion of bringing five megabytes of code to a petabyte of data, for example, or the notion of no schema on write. You know, put it into the database and then figure it out. >> Unstructured data. >> Right. >> Object storage. >> And so, that created a state of innovation, of funding. We were talking last night about, you know, many, many years ago at this event this time of the year, concurrent with Strata you would have VCs all over the place. There really aren't a lot of VCs here this year, not a lot of VC parties-- >> Mm-hm. >> As there used to be, so that somewhat waned, but some of the things that we talked about back then, we said that big money and big data is going to be made by the practitioners, not by the vendors, and that's proved true. I mean... >> Yeah. >> The big three Hadoop distro vendors, Cloudera, Hortonworks, and MapR, you know, Cloudera's $2.5 billion valuation, you know, not bad, but it's not a $30, $40 billion value company. The other thing we said is there will be no Red Hat of big data. You said, "Well, the only Red Hat of big data might be "Red Hat," and so, (chuckles) that's basically proved true. >> Yeah. >> And so, I think if we look back we always talked about Hadoop and big data being a reduction, the ROI was a reduction on investment. >> Yeah. >> It was a way to have a cheaper data warehouse, and that's essentially-- Well, what did we get right and wrong? I mean, let's look at some of the trends. I mean, first of all, I think we got pretty much everything right, as you know. We tend to make the calls pretty accurately with theCUBE. Got a lot of data, we look, we have the analytics in our own system, plus we have the research team digging in, so you know, we pretty much get, do a good job. 
I think one thing that we predicted was that Hadoop certainly would change the game, and it did. We also predicted that there wouldn't be a Red Hat for Hadoop; that was a prediction. The other prediction was that we said Hadoop won't kill data warehouses, and it didn't, and then data lakes came along. You know my position on data lakes. >> Yeah. >> I've always hated the term. I always liked data ocean, because I think it was much more about the fluidity of the data, so I think we got that one right, and data lakes still don't look like they're going to pan out well. I mean, most people that deploy data lakes, it's really either not a core thing or it's part of something else, and it's turning into a data swamp, so I think the data lake piece is not panning out the way people thought it would. I think one thing we did get right, also, is that data would be the center of the value proposition, and it continues and remains to be, and I think we're seeing that now, and we said data's the development kit back in 2010 when we said data's going to be part of programming. >> Some of the other things, our early data, and we went out and we talked to a lot of practitioners who were hard to find in the early days. They were just a select few, I mean, other than inside of Google and Yahoo! But what they told us is that things like SQL and the enterprise data warehouse were key components on their big data strategy, so to your point, you know, it wasn't going to kill the EDW, but it was going to surround it. The other thing we called was cloud. Four years ago our data showed clearly that much of this work, the modeling, the big data wrangling, et cetera, was being done in the cloud, and Cloudera, Hortonworks, and MapR, none of them at the time really had a cloud strategy. Today that's all they're talking about is cloud and hybrid cloud. >> Well, it's interesting, I think it was like four years ago, I think, Dave, when we actually were riffing on the notion of, you know, Cloudera's name. It's called Cloudera, you know. If you spell it out, in Cloudera we're in a cloud era, and I think we were very aggressive at that point. I think Amr Awadallah even made a comment on Twitter. He was like, "I don't understand where you guys are coming from." We were actually saying at the time that Cloudera should actually leverage more cloud at that time, and they didn't. They stayed on their IPO track and they had to, because they had everything bet on Impala and this data model that they had being the business model, and then they went public, but I think clearly cloud is now part of Cloudera's story, and I think that's a good call, and it's not too late for them. It never was too late, but you know, Cloudera has executed. I mean, if you look at what's happened with Cloudera, they were the only game in town. When we started theCUBE we were in their office, as most people know in this industry, that we were there with Cloudera when they had like 17 employees. I thought Cloudera was going to run the table, but then what happened was Hortonworks came out of Yahoo! That, I think, changed the game, and I think that competitive battle between Hortonworks and Cloudera, in my opinion, changed the industry, because if Hortonworks did not come out of Yahoo! Cloudera would've had an uncontested run. I think the landscape of the ecosystem would look completely different had Hortonworks not competed, because you think about, Dave, they had that competitive battle for years. 
The Hortonworks-Cloudera battle, and I think it changed the industry. I think it couldn't been a different outcome. If Hortonworks wasn't there, I think Cloudera probably would've taken Hadoop and making it so much more, and I think they wouldn't gotten more done. >> Yeah, and I think the other point we have to make here is complexity really hurt the Hadoop ecosystem, and it was just bespoke, new projects coming out all the time, and you had Cloudera, Hortonworks, and maybe to a lesser extent MapR, doing a lot of the heavy lifting, particularly, you know, Hortonworks and Cloudera. They had to invest a lot of their R&D in making these systems work and integrating them, and you know, complexity just really broke the back of the Hadoop ecosystem, and so then Spark came in, everybody said, "Oh, Spark's going to basically replace Hadoop." You know, yes and no, the people who got Hadoop right, you know, embraced it and they still use it. Spark definitely simplified things, but now the conversation has turned to AI, John. So, I got to ask you, I'm going to use your line on you in kind of the ask-me-anything segment here. AI, is it same wine, new bottle, or is it really substantively different in your opinion? >> I think it's substantively different. I don't think it's the same wine in a new bottle. I'll tell you... Well, it's kind of, it's like the bad wine... (laughs) Is going to be kind of blended in with the good wine, which is now AI. If you look at this industry, the big data industry, if you look at what O'Reilly did with this conference. I think O'Reilly really has not done a good job with the conference of big data. I think they blew it, I think that they made it a, you know, monetization, closed system when the big data business could've been all about AI in a much deeper way. I think AI is subordinate to cloud, and you mentioned cloud earlier. If you look at all the action within the AI segment, Diane Greene talking about it at Google Next, Amazon, AI is a software layer substrate that will be underpinned by the cloud. Cloud will drive more action, you need more compute, that drives more data, more data drives the machine learning, machine learning drives the AI, so I think AI is always going to be dependent upon cloud ends or some sort of high compute resource base, and all the cloud analytics are feeding into these AI models, so I think cloud takes over AI, no doubt, and I think this whole ecosystem of big data gets subsumed under either an AWS, VMworld, Google, and Microsoft Cloud show, and then also I think specialization around data science is going to go off on its own. So, I think you're going to see the breakup of the big data industry as we know it today. Strata Hadoop, Strata Data Conference, that thing's going to crumble into multiple, fractured ecosystems. >> It's already starting to be forked. I think the other thing I want to say about Hadoop is that it actually brought such great awareness to the notion of data, putting data at the core of your company, data and data value, the ability to understand how data at least contributes to the monetization of your company. AI would not be possible without the data. Right, and we've talked about this before. You call it the innovation sandwich. The innovation sandwich, last decade, last three decades, has been Moore's law. The innovation sandwich going forward is data, machine intelligence applied to that data, and cloud for scale, and that's the sandwich of innovation over the next 10 to 20 years. 
>> Yeah, and I think data is everywhere, so this idea of being a categorical industry segment is a little bit off, I mean, although I know data warehouse is kind of its own category and you're seeing that, but I don't think it's like a Magic Quadrant anymore. Every quadrant has data. >> Mm-hm. >> So, I think data's fundamental, and I think that's why it's going to become a layer within a control plane of either cloud or some other system, I think. I think that's pretty clear, there's no, like, one. You can't buy big data, you can't buy AI. I think you can have AI, you know, things like TensorFlow, but it's going to be a completely... Every layer of the stack is going to be impacted by AI and data. >> And I think the big players are going to infuse their applications and their databases with machine intelligence. You're going to see this, you're certainly, you know, seeing it with IBM, the sort of Watson heavy lift. Clearly Google, Amazon, you know, Facebook, Alibaba, and Microsoft, they're infusing AI throughout their entire set of cloud services and applications and infrastructure, and I think that's good news for the practitioners. People aren't... Most companies aren't going to build their own AI, they're going to buy AI, and that's how they close the gap between the sort of data haves and the data have-nots, and again, I want to emphasize that the fundamental difference, to me anyway, is having data at the core. If you look at the top five companies in terms of market value, US companies, Facebook maybe not so much anymore because of the fake news, though Facebook will be back with it's two billion users, but Apple, Google, Facebook, Amazon, who am I... And Microsoft, those five have put data at the core and they're the most valuable companies in the stock market from a market cap standpoint, why? Because it's a recognition that that intangible value of the data is actually quite valuable, and even though banks and financial institutions are data companies, their data lives in silos. So, these five have put data at the center, surrounded it with human expertise, as opposed to having humans at the center and having data all over the place. So, how do they, how do these companies close the gap? How do the companies in the flyover states close the gap? The way they close the gap, in my view, is they buy technologies that have AI infused in it, and I think the last thing I'll say is I see cloud as the substrate, and AI, and blockchain and other services, as the automation layer on top of it. I think that's going to be the big tailwind for innovation over the next decade. >> Yeah, and obviously the theme of machine learning drives a lot of the conversations here, and that's essentially never going to go away. Machine learning is the core of AI, and I would argue that AI truly doesn't even exist yet. It's machine learning really driving the value, but to put a validation on the fact that cloud is going to be driving AI business is some of the terms in popular conversations we're hearing here in New York around this event and topic, CUBENYC and Strata Conference, is you're hearing Kubernetes and blockchain, and you know, these automation, AI operation kind of conversations. That's an IT conversation, (chuckles) so you know, that's interesting. You've got IT, really, with storage. 
You've got to store the data, so you can't not talk about workloads and how the data moves with workloads, so you're starting to see data and workloads kind of be tossed in the same conversation, that's a cloud conversation. That is all about multi-cloud. That's why you're seeing Kubernetes, a term I never thought I would be saying at a big data show, but Kubernetes is going to be key for moving workloads around, of which there's data involved. (chuckles) Instrumenting the workloads, data inside the workloads, data driving data. This is where AI and machine learning's going to play, so again, cloud subsumes AI, that's the story, and I think that's going to be the big trend. >> Well, and I think you're right, now. I mean, that's why you're hearing the messaging of hybrid cloud and from the big distro vendors, and the other thing is you're hearing from a lot of the no-SQL database guys, they're bringing ACID compliance, they're bringing enterprise-grade capability, so you're seeing the world is hybrid. You're seeing those two worlds come together, so... >> Their worlds, it's getting leveled in the playing field out there. It's all about enterprise, B2B, AI, cloud, and data. That's theCUBE bringing you the data here. New York City, CUBENYC, that's the hashtag. Stay with us for more coverage live in New York after this short break. (techy music)

Published Date : Sep 12 2018

Dr Matt Wood, AWS | AWS Summit NYC 2018


 

>> Live from New York, it's theCUBE. Covering AWS Summit New York 2018. Brought to you by Amazon Web Services and its ecosystem partners.

>> Hello and welcome back to live CUBE coverage here in New York City for the AWS, Amazon Web Services, Summit 2018. I'm John Furrier with Jeff Frick here at theCUBE. Our next guest is Dr. Matt Wood, General Manager of Artificial Intelligence with Amazon Web Services. CUBE alumni, been so busy for the past year since you've been on theCUBE. Thanks for coming back, appreciate you spending the time. So the promotions keep on going: you've now got General Manager of the AI group, AI operations, AI automation, machine learning, a lot of big categories of new things developing, and you guys have really taken AI and machine learning to a whole new level. It's one of the key value propositions that you guys now have, not just for the large enterprise but down to startups and developers. So, you know, congratulations, and what's the update?

>> Well, the update is, this morning in the keynote I was lucky enough to introduce some new capabilities across our platform when it comes to machine learning. Our mission is that we want to take machine learning and make it available to all developers. We joke internally that we just want to make machine learning boring, we want to make it vanilla; it's just another tool in the tool chest of any developer and any data scientist. And we've done that before: this idea of taking technology that is traditionally only within reach of a very, very small number of well-funded organizations and making it as broadly distributed as possible. We've done that pretty successfully with compute, storage, databases, analytics, and data warehousing, and we want to do the exact same thing for machine learning. And to do that we have to build an entirely new stack, and we think of that stack in three different tiers. The bottom tier, really for academics, researchers, and data scientists, is where we provide a wide range of frameworks, the open source programming libraries that developers and data scientists use to build neural networks and intelligence. They're things like TensorFlow and Apache MXNet and PyTorch, and they're very technical; you can build arbitrarily sophisticated systems with them.

>> Mostly open source, right?

>> Mostly open source, that's right. We contribute a lot of our work back to MXNet, but we also contribute to PyTorch and to TensorFlow, and there are big, healthy open source projects growing up around all these popular frameworks, plus more like Keras and Gluon and Horovod. So that's a key area for researchers and academics. The next level up, we have machine learning platforms. This is for developers and data scientists who have data they've either got in the cloud or want to move to the cloud quickly, which they want to be able to use for modeling, to build custom machine learning models. Here we try and remove as much of the undifferentiated heavy lifting associated with doing that as possible, and this is really where SageMaker fits in. SageMaker allows developers to quickly build, train, optimize, and host their machine learning models. And then at the top tier we have a set of AI services, which are for application developers that don't want to get into the weeds; they just want to get up and running really, really quickly. So today we announced four new services, really across that middle tier and that top tier. For SageMaker, we're very pleased to introduce a new streaming data protocol which allows you to take data straight from S3 and pump it straight into your algorithm and straight onto the compute infrastructure. What that means is you no longer have to copy data from S3 onto your compute infrastructure in order to start training; you just take away that step and stream it right on there. It's an approach that we use inside SageMaker for a lot of our built-in algorithms, and it significantly increases the speed of the algorithm and, of course, significantly decreases the cost of running the training, because you pay by the second, so any second you can save off is a cost saving for the customer.

>> And it also helps the machine learn more.

>> That's right, yeah, you can put more data through it, absolutely. You're no longer constrained by the amount of disk space, you're not even constrained by the amount of memory on the instance; you can just pump terabyte after terabyte after terabyte. And we actually had another example we talked about in the keynote this morning, a new customer of ours, Snap, who are routinely training on over 100 terabytes of image data using SageMaker. The ability to pump in lots of data is one of the keys to building successful machine learning applications, and we brought that capability to everybody that's using TensorFlow. Now you can just take your TensorFlow model, bring it to SageMaker, do a little bit of wiring, click a button, and you can start streaming your data to your training job.

>> What's the impact on developer time, speed?

>> It is the ability to pump more data, it is the decrease in time it takes to start the training, but most importantly it decreases the training time all up. You'll see between a 10 and 25 percent decrease in training time, which means you can train more models in the same unit of time, or you can just decrease the cost. So it's a completely different way of thinking about how to train over large amounts of data. We were doing it internally, and now we're making it available for everybody through SageMaker. That's the first thing.
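What Matt describes here corresponds to SageMaker's Pipe input mode. Below is a minimal sketch of how that might look with the SageMaker Python SDK; the IAM role, container image, bucket, and instance type are placeholders, not details from the interview.

```python
# Sketch: training with SageMaker's Pipe input mode so data is streamed from S3
# to the algorithm as it trains, rather than copied onto the instance first.
# The role ARN, image URI, bucket, and instance type are illustrative placeholders.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()

estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-algo:latest",  # placeholder
    role="arn:aws:iam::123456789012:role/ExampleSageMakerRole",                    # placeholder
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    input_mode="Pipe",  # stream the channel instead of downloading the full dataset
    sagemaker_session=session,
)

train_input = TrainingInput(
    s3_data="s3://example-bucket/training-data/",  # placeholder
    input_mode="Pipe",
)

estimator.fit({"train": train_input})
```

Pipe mode trades local random access for streaming throughput, which is why it suits the "terabyte after terabyte" case Matt describes: the dataset no longer has to fit on the training instance's disk.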
The second thing that we're adding is the ability to batch process in SageMaker. SageMaker used to be great at real-time predictions, but there are a lot of use cases where you don't want to just make a one-off prediction; you want to predict hundreds or thousands or even millions of things all at once. So let's say you've got all of your sales information at the end of the month and you want to use it to make a forecast for the next month. You don't need to do that in real time, you need to do it once and then place the order. So we added batch transforms to SageMaker: you can pull in all of that data, large amounts of data, batch process it within a fully automated environment, and then spin down the infrastructure, and you're done. It's a very, very simple API; anyone that can use a Lambda function can take advantage of this. Again, it just dramatically decreases the overhead and makes it so much easier for everybody to take advantage of machine learning.
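The batch capability he mentions is exposed as a batch transform job. A minimal sketch follows, again assuming the SageMaker Python SDK and a model that has already been trained; the model name, S3 paths, and instance type are placeholders.

```python
# Sketch: a one-off bulk prediction (batch transform) against a trained SageMaker
# model, e.g. scoring a month of sales records in a single pass and letting the
# infrastructure spin down afterwards. Names and paths are illustrative placeholders.
from sagemaker.transformer import Transformer

transformer = Transformer(
    model_name="example-forecast-model",                  # placeholder: an existing SageMaker model
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/forecasts/output/",  # placeholder
)

transformer.transform(
    data="s3://example-bucket/sales/2018-07/",  # placeholder: the records to score
    content_type="text/csv",
    split_type="Line",
)
transformer.wait()  # blocks until the job completes; compute is then released automatically
```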
And then at the top layer, we had new capabilities for our AI services. We announced 12 new language pairs for our translation service, and we announced a new transcription capability which allows us to take multi-channel audio, such as might be recorded here, but more commonly in contact centers. Just like you have a left channel and a right channel for stereo, contact centers often record the agent and the customer on the same track, and today you can pass that through our Transcribe service. Long-form speech will get split up into the channels and automatically transcribed; we'll analyze all the timestamps and create just a single script, and from there you can see what was being talked about. You can check the topics automatically using Comprehend, or you can check the compliance: did the agents say the words that they have to say, for compliance reasons, at some point during the conversation? That's a material new capability.
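The multi-channel transcription and the Comprehend-based compliance check described above can be wired together with a few boto3 calls. A sketch under those assumptions; the job name, S3 URI, and compliance phrase are placeholders.

```python
# Sketch: transcribe a two-channel contact-center recording with channel
# identification, then check the transcript for required compliance wording
# using Comprehend key phrases plus a simple string check.
# Job name, S3 URI, and the compliance phrase are illustrative placeholders.
import json
import time
import urllib.request

import boto3

transcribe = boto3.client("transcribe")
comprehend = boto3.client("comprehend")

transcribe.start_transcription_job(
    TranscriptionJobName="example-contact-center-call",               # placeholder
    Media={"MediaFileUri": "s3://example-bucket/calls/call-001.wav"},  # placeholder
    MediaFormat="wav",
    LanguageCode="en-US",
    Settings={"ChannelIdentification": True},  # keep agent and customer channels separate
)

# Poll until the job finishes, then fetch the merged transcript it produces.
while True:
    job = transcribe.get_transcription_job(TranscriptionJobName="example-contact-center-call")
    if job["TranscriptionJob"]["TranscriptionJobStatus"] in ("COMPLETED", "FAILED"):
        break
    time.sleep(10)

uri = job["TranscriptionJob"]["Transcript"]["TranscriptFileUri"]
result = json.loads(urllib.request.urlopen(uri).read())
text = result["results"]["transcripts"][0]["transcript"]

# Topics via Comprehend, and a simple check that the agent said the required wording.
key_phrases = comprehend.detect_key_phrases(Text=text[:4500], LanguageCode="en")
print([p["Text"] for p in key_phrases["KeyPhrases"]][:10])
print("Disclosure present:", "this call may be recorded" in text.lower())  # placeholder phrase
```

The key-phrase call is truncated because Comprehend's synchronous APIs cap input size; a longer transcript would need to be chunked.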
>> What's the top service being used? Obviously Comprehend, Transcribe, and a variety of others; you guys have put a lot of stuff out there, all kinds of stuff. What's the top seller, the top usage, as a proxy for uptake?

>> You know, I think we see a ton of adoption across all of these areas, but where a lot of the momentum is growing right now is SageMaker. If you look at Formula One, Formula One racing just chose AWS and SageMaker as their machine learning platform. The National Football League and Major League Baseball today announced they're renewing their relationship and their strategic partnership with AWS around machine learning. All of these groups are using the data which just streams out of these races, all these games...

>> Yeah.

>> ...and that can be the video, or the telemetry of the cars, or the telemetry of the players, and they're pumping that through SageMaker to drive more engaging experiences for their viewers.

>> So, okay, streaming this data is key, this is SageMaker. Quickly, this can do video?

>> Yeah, just get it all in, all of it. We love data.

>> I would love to follow up on that. So the question is, when will SageMaker overtake Aurora as the fastest growing product in the history of Amazon? Because I predicted at re:Invent that SageMaker would go on a tear. Is it looking good right now? I mean, on paper you guys are seeing it growing, but give us an indicator.

>> Well, I won't break out revenue per service, but I'll say this: the same excitement that I see for SageMaker now, and the same opportunity and the same momentum, really reminds me of AWS ten years ago. It's the same sort of transformative, democratizing approach which really engages builders. The excitement levels are super high in general out there, but I see the same level of enthusiasm and movement, and builders are building with it.

>> So what's this toy you have here? I know we don't have a lot of time, but you've brought a little prop.

>> This is the world's first deep learning enabled wireless video camera. We call it DeepLens. We announced it and launched it at re:Invent 2017, and I'll hold it up to the camera, it's a cute little device. We modeled it after WALL-E, the Pixar movie. It has an HD video camera on the front here, and in the base here we have an incredibly powerful, custom piece of machine learning hardware, so it can process over a billion machine learning operations per second. You take the video in real time, you send it to the GPU on board, and we just start processing the stream in real time. That's kind of interesting, but the real value of this, and why we designed it, was we wanted to find a way for developers to get literally hands-on with machine learning. Builders are lifelong learners, right? They love to learn. They have an insatiable appetite for new information and new technologies, and the way that they learn is they experiment. They start working, and they spin this flywheel where you try something out, it works, you fiddle with it, it stops working, you learn a little bit more, and you go around and around and around. That's been tried and tested for developers for four decades. The challenge with machine learning is that doing that is still very, very difficult: you need labeled data, you need to understand the algorithms, it's just hard to do. But with DeepLens you can get up and running in ten minutes. It's connected back to the cloud, it's connected up to SageMaker, and you can deploy a pre-built model down onto the device in ten minutes to do object detection. We do some wacky visual effects with neural style transfer, we do hot dog and not hot dog detection, of course. But the real value comes in that you can take any of those models, tear them apart in SageMaker, start fiddling around with them, and then immediately deploy them back down onto the camera. Every developer has things on their desk that they can detect, pens and cups and people, whatever it is, so they can very quickly spin this flywheel where they're experimenting, changing, succeeding, failing, and just going round and round.

>> That's for developers, your target audience, right?

>> Yes, right.

>> Okay, and what are some of the things that have come out of it? Have you seen anything cool evolve?

>> Yes, it has been incredibly gratifying and really humbling to see developers that have no machine learning experience take this out of the box and build some really wonderful projects. One really good example is exercise detection: when you're doing a workout, they built a model which detects the exercise and then detects the reps of the weights that you're lifting. We saw skeletal mapping, so you could map a person in 3D space using a simple camera. We saw security features, where you could put this on your door and it would send you a text message if it didn't recognize who was in front of the door. We saw one which was amazing, which would read books aloud to kids: you would hold up the book, and it would detect the text, extract the text, send the text to Polly, and then speak it aloud for the kids. So there are games, educational tools, little security gizmos. One group even trained a dog detection model which detected individual breeds, plugged it into an enormous power pack, and took it to the local dog park so they could test it out. So it's all of this, from a cold start, with no machine learning experience.
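The "read books aloud" project Matt describes can be approximated by chaining a text-detection step with Polly, which the interview names. The sketch below uses Rekognition's DetectText for the text step as one plausible choice (the interview doesn't specify which detector the community project used); the image file and voice are placeholders.

```python
# Sketch: the "read the page aloud" flow -- detect text in a captured frame,
# then have Amazon Polly speak it. Rekognition's DetectText stands in for the
# text-detection step; the image path and voice are illustrative placeholders.
import boto3

rekognition = boto3.client("rekognition")
polly = boto3.client("polly")

with open("page.jpg", "rb") as f:  # placeholder: a frame captured from the camera
    image_bytes = f.read()

detection = rekognition.detect_text(Image={"Bytes": image_bytes})
lines = [d["DetectedText"] for d in detection["TextDetections"] if d["Type"] == "LINE"]
page_text = " ".join(lines)

speech = polly.synthesize_speech(
    Text=page_text or "No text detected.",
    OutputFormat="mp3",
    VoiceId="Joanna",  # placeholder voice
)
with open("page.mp3", "wb") as out:
    out.write(speech["AudioStream"].read())
```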
>> You having fun?

>> Yes, absolutely. One of the great things about machine learning is you don't just get to work in one area: you get to work in Formula One and sports, you get to work in healthcare, you get to work in retail...

>> And every developer and CTO is gonna love this, chief toy officer.

>> Chief toy officers.

>> I love it. So I've got to ask you, what's new in your world? GM of AI, artificial intelligence, what does that mean? Just quickly explain it for our audience. Is that all the software? I mean, what specifically are you overseeing, what's your purview within the realm of AWS?

>> Yeah, that's a totally fair question. My purview is that I run the products for deep learning, machine learning, and artificial intelligence, really across the AWS machine learning team. So I have a lot of fingers in a lot of pies. I get involved in the new products we're going to go build out, I get involved in helping grow usage of existing products, I get to do a lot of invention, I spend a ton of time with customers, but overall I work with the rest of the team on setting the technical and product strategy for machine learning at AWS.

>> What are your top priorities this year? Adoption, uptake, new product introductions? And you guys don't stop.

>> Well, we do think we need to keep on introducing more and more things.

>> Any high ground that you want to take? What's the vision?

>> The vision is genuinely to continue to make it as easy as possible for developers to use machine learning. I can't overstate the importance, or the challenge. We're not at the point where you can just pull down some Python code and figure it out. We don't have a JVM for machine learning, there are no developer tools or debuggers, there are very few visualizers, so it's still very hard. If you think of it in computing terms, we're still working in assembly language when it comes to machine learning. So there's this wealth of opportunity ahead of us, and the responsibility that I feel very strongly is to continually improve on the stack and continually bring new capabilities to more builders.

>> Well, cloud has been disrupting IT operations, AIOps they're calling it in Silicon Valley and on the venture circuit, and AutoML as a term has been kicked around, automatic machine learning. You've got to train the machines with something, and data seems to be it. What strikes me about this, compared to storage or compute or some of the core Amazon foundational products: those are just better ways to do something that already existed. This is not a better way to do something that already exists; this is a way to get democratization at the start of the process, of the application of machine learning and artificial intelligence to a plethora of applications and use cases. That is fundamentally different, it's a step up in terms of...

>> Totally agree.

>> ...the power to the hands of the people.

>> It's an area which is very fast moving and very fast growing, but what's funny is it totally builds on top of the cloud. You really can't do machine learning in any meaningful production way unless you have a way that is cheap and easy to collect large amounts of data, and a way which allows you to pull down high-performance computation at any scale that you need it. And so through the cloud we've actually laid the foundations for machine learning going forward.

>> And other things too are coming. The cloud highlights the power that it brings to these new capabilities.

>> Absolutely, yeah, and we get to build on them at AWS and at Amazon just like our customers do. SageMaker runs on EC2; we wouldn't be able to do SageMaker without EC2. And in the fullness of time, we see that the usage of machine learning could be as big, if not bigger, than the whole of the rest of AWS combined. That's our aspiration.

>> Dr. Matt Wood, I wish we had more time to chat, love chatting with you. I'd love to do a whole other segment on what you're doing with customers; I know you guys have great customer focus, as Andy always mentions when he's on theCUBE, you guys listen to customers. We'd love to hear that, maybe at re:Invent we'll circle back.

>> Sounds good.

>> Congratulations on your success, great to see you.
>> Thank you.

>> Okay, Dr. Matt Wood here inside theCUBE. We're streaming all this data out to the Amazon cloud, which is where they host all of our stuff, of course. It's theCUBE, bringing you live action here in New York City, CUBE coverage of AWS Summit 2018 in Manhattan. We'll be back with more after this short break.

Published Date : Jul 17 2018


SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Jeff Rick | PERSON | 0.99+
Amazon Web Services | ORGANIZATION | 0.99+
John Fourier | PERSON | 0.99+
New York City | LOCATION | 0.99+
AWS | ORGANIZATION | 0.99+
ten minutes | QUANTITY | 0.99+
Silicon Valley | LOCATION | 0.99+
Amazon Web Services | ORGANIZATION | 0.99+
10 | QUANTITY | 0.99+
Manhattan | LOCATION | 0.99+
hundreds | QUANTITY | 0.99+
Andy | PERSON | 0.99+
Matt Wood | PERSON | 0.99+
Amazon | ORGANIZATION | 0.99+
New York City | LOCATION | 0.99+
25 percent | QUANTITY | 0.99+
ten minutes | QUANTITY | 0.99+
New York | LOCATION | 0.99+
second thing | QUANTITY | 0.99+
millions | QUANTITY | 0.99+
Pixar | ORGANIZATION | 0.99+
dr. Matt wood | PERSON | 0.99+
Python | TITLE | 0.99+
April | DATE | 0.99+
today | DATE | 0.99+
four decades | QUANTITY | 0.98+
terabyte | QUANTITY | 0.98+
over 100 terabytes | QUANTITY | 0.98+
Sage maker | ORGANIZATION | 0.98+
ten years ago | DATE | 0.97+
12 new language pairs | QUANTITY | 0.97+
next month | DATE | 0.97+
four new services | QUANTITY | 0.97+
first thing | QUANTITY | 0.96+
thousands | QUANTITY | 0.96+
s3 | TITLE | 0.95+
Aurora | TITLE | 0.95+
second | QUANTITY | 0.95+
one | QUANTITY | 0.95+
sage maker | ORGANIZATION | 0.94+
Formula One | TITLE | 0.94+
dr. Matt | PERSON | 0.93+
first deep learning | QUANTITY | 0.93+
ec2 | TITLE | 0.93+
AWS Summit | EVENT | 0.92+
single script | QUANTITY | 0.9+
a ton of time | QUANTITY | 0.9+
one of the keys | QUANTITY | 0.9+
this morning | DATE | 0.9+
MX net | ORGANIZATION | 0.89+
National Football League Major League Baseball | EVENT | 0.88+
Cersei | ORGANIZATION | 0.88+
sage maker | ORGANIZATION | 0.88+
year | DATE | 0.88+
reinvent 2017 | EVENT | 0.87+
three different tiers | QUANTITY | 0.87+
AWS summit 2018 | EVENT | 0.87+
cubanía | LOCATION | 0.86+
one area | QUANTITY | 0.86+
2018 | EVENT | 0.86+
dr. Matt | PERSON | 0.85+
Perseids | ORGANIZATION | 0.85+
about two stage | QUANTITY | 0.82+
lot of time | QUANTITY | 0.81+
Web Services summit 2018 | EVENT | 0.81+
this year | DATE | 0.8+
Apache | TITLE | 0.79+
over a billion machine learning operations per second | QUANTITY | 0.79+
Chad | PERSON | 0.79+
things | QUANTITY | 0.78+
lot of use cases | QUANTITY | 0.77+
a ton of | QUANTITY | 0.77+
lots of data | QUANTITY | 0.74+
CTO | TITLE | 0.73+
this morning | DATE | 0.72+
amounts of | QUANTITY | 0.71+
Sage maker | TITLE | 0.69+