Breaking Analysis: Cloud Revenue Accelerates in the COVID Era

From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante. As we watch an historic election unfold before our eyes, we look back at the early days of the millennium with the memorable presidential race of 2000. That decade, of course, was defined by 9/11, which permanently reshaped our thinking, and we exited that decade at the tail end of a massive financial crisis, only to enter the 2010s with the hope and the momentum of fiscal stimulus, a flat globe, job growth, and, very importantly, the ascendancy of the cloud. Cloud computing unquestionably powered the innovation engine over the last 10 years, and the pandemic marks a new era where adoption of cloud, data, and AI has been accelerated by at least two to three years. And that's what's going to shape the future of the technology industry and, frankly, all businesses and organizations. Hello everyone, and welcome to this week's episode of theCUBE Insights powered by ETR. In this Breaking Analysis, we're going to update you on our latest cloud market share and dig into some fresh October survey data from our partners over at ETR. Let me start with a brief summary of the latest action that's going on in cloud. Now, quite interestingly, each of the big three cloud players showed nearly identical year-on-year growth rates in Q3 as they did in Q2. We're going to dig into that in a moment, but our data suggests that these three companies combined will account for more than 75 billion dollars in infrastructure as a service and platform as a service revenue in 2020, and they're potentially on track to hit 100 billion in 2021. 
Customer survey data indicates that CIOs' top two infrastructure priorities remain security and cloud migration. Now, that said, as we previously reported, the cloud is not immune to the pandemic. The remote worker pivot, while it's a positive for cloud, hasn't completely eradicated certain headwinds. What I mean here is that because the cloud vendors are now so large, they're somewhat exposed to the softness in the overall IT spending climate and also to industries that have been hit hardest by the pandemic. Now, would the cloud growth have been better if the pandemic didn't hit? We'll never know for sure, but our data suggests no. COVID has definitely been a benefactor to cloud. In our view, cloud will remain at the center of technological innovation for the foreseeable future. The economics of cloud are becoming so compelling that we think the power of the big cloud companies will only increase this decade. Now, importantly, we're talking about the costs of running hyper-distributed systems. We're not commenting here on what they charge customers; that's a different story. We believe the cost structure for the hyperscalers is superior to alternative approaches, and we believe this advantage will only accelerate over the next several years. We also believe that competition is going to continue to drive competitive pricing and innovation. All right, let's look at our latest market share numbers for the big three. This chart shows our estimates of AWS, Azure, and the Google Cloud Platform. Now, viewers of this program know that these are IaaS and PaaS figures, and you also know that AWS is the only company that provides clean numbers on that sector, whereas Azure and GCP are estimates that we make based on tidbits of guidance that the companies give us, survey data that we capture, and other modeling that we do. Now, as we've said, we'll end this year at about 75 billion in revenue, or maybe even a little bit more, for these three. Note that we've slightly restated some of our 
earlier estimates for Azure to reconcile some differences that we had between constant currency and actual growth. We try to keep things in constant currency where possible. Sorry for that, but sometimes that happens. Azure, according to our estimates, as we reported last week, is now 18% of Microsoft's overall revenue. We had it at 19% last week, but when I dug in we made some adjustments, so we toned it down a bit. AWS represents a much smaller percentage, of course, of Amazon's revenues at about 12 percent, but it represents 56 percent of Amazon's profits. GCP, on the other hand, accounts for less than five percent of Google's overall revenue, which, as we stated a few weeks ago, needs more attention from Google. But look at the growth rates for these three platforms and the respective size of their IaaS and PaaS businesses. You hear all this talk about repatriation. What I mean by that is people go to the cloud, but they're unhappy, or the bill is too high, it's too expensive, so then they come back on-prem. Well, you just don't see that in the numbers. So you've got to be careful when a vendor tries to sell you on that trend. I don't buy it, except for selective situations. Now let's bring in some of the ETR data and compare the spending momentum for each of the big three. You've seen these wheel graphs before. They show the breakdown of net score for AWS, Microsoft, and Google. Now, one note: these figures represent these three companies overall within the ETR technology taxonomy. So, for example, they don't include Amazon's retail business, of course, but they do include, for example, Microsoft's entire tech portfolio, not just the cloud. The green portion of the wheel represents increases in spending via new adoptions and increased spending, whereas the red sections show decreases via lower spending and defections. Net score, which I've highlighted in orange, is calculated by subtracting the two reds from the two greens, in other words, adoptions and increases minus decreases and 
replacements. The takeaway here is these are all pretty strong, with AWS leading the pack. Microsoft is exceptionally strong, as we pointed out last week, because they're so huge and they still have net scores comparable to AWS, which is a pure play. GCP is a laggard and is showing softness in the data, despite a sanguine outlook that we had back in 2019 based on survey data. I don't know, perhaps Google's smaller presence muted their customers' ability to take advantage of the platform. The thinking there is that customers maybe needed to pivot to the cloud so quickly, and AWS and Azure were the incumbents, and that was maybe the most expedient path, hence the higher increases in the spend-more category. But you do see GCP had 13% new adoptions, which is pretty good, so we'll keep looking at that. Regardless, again, these are not pure-play cloud comparisons, but they give a good indication of spending momentum. I'd also note that all three show very low defections, while each is showing solid increases in new adoptions, especially Google, as I mentioned. So that's kind of interesting to see. But again, Google is much, much smaller, so you would expect that. Now I want to turn our attention to one of the hottest areas in cloud, which is serverless, and this is a pure-play comparison. So, serverless. Let me start there. It's a strange term because it's not really accurate, but it's stuck. Serverless computing is a model where the cloud platform dynamically delivers services as the application requires, so you don't have to configure the compute and the containers, for example. Rather, when an application needs resources, it goes and gets them, and you only pay when the services are actually invoked and in use. So it's really good for workloads that spin up and spin down very frequently. It kind of reminds me, in concept anyway, of the componentry that we saw in the days of SOA, if you remember that, services-oriented architecture. But now this is cloud, it's cloud native, it's a whole new world, and it's 
increasingly a popular model, and as we'll show in a moment, there's a lot of spending momentum in this area. But before we do that, I want to share some comments made by Andy Jassy a while back about serverless. Take a listen. >> It's a good question, and, you know, the comment I made was really about, directionally, what Amazon would do. In the very earliest days of AWS, Jeff used to say a lot, "If I were starting Amazon today, I'd have built it on top of AWS." We didn't have all the capability and all the functionality at that very moment, but he knew what was coming, and he saw what people were still able to accomplish even with where the services were at that point. I think the same thing is true here with Lambda, which is, I think if Amazon were starting today, it's a given they would build it on the cloud, and I think with a lot of the applications that comprise Amazon's consumer business, we would build those on our serverless capabilities. >> Now, Lambda, of course, Jassy is referring to Lambda, that's Amazon's serverless offering. And if you think about Amazon's retail business, and take, for example, the frequent spin up and spin down of resources for something like Black Monday, serverless would be a much more cost-effective approach. Same for a managed data warehouse service, for example, where you don't want to pay for the compute if it's idle. The app just calls for the compute when it's needed. So it's a very popular model, and it's got increased momentum today, and you see that in this slide. It shows the net score breakdown for serverless for Azure, AWS's Lambda, which again is their serverless offering, and Google Cloud Functions. Again, you're shipping functions to the application; that's why it's called functions. Look at the net scores: Azure Functions nearly 70, AWS at 65, Google again lagging, and that's a bit of a concern because this is a really, really hot space. All right, let's move on and look at the competitive landscape, as we like to do often, and 
update you on that. This XY graph is one of our favorites, and it shows net score, or spending momentum, on the vertical axis and market share on the horizontal. Market share is a measure of pervasiveness in the data set. In the upper right, you also see a table that ranks each vendor by net score, and it includes the shared N, in other words, the number of mentions in this sector for each vendor. Now, you can see up top in the middle I've selected on the cloud computing category, so this represents only the cloud businesses for each of these players. There's a little bit of nuance here in that we've selected on Microsoft Azure, there's a category in the ETR taxonomy for that, and we're comparing that with AWS overall. So there are things in the AWS overall number that fit into other parts of the taxonomy, like maybe AI, collaboration, etc., whereas Azure and GCP are just the cloud segments. So I know it's a bit strange, because AWS is all cloud, but don't get caught up in the taxonomical nuance. The point is, it's good to be Azure and AWS. It's shown there when you look at the upper right of the chart: they stand out and they stand alone in cloud leadership. Google Cloud has nice elevated levels, but they're much, much smaller. They don't have the presence in the market. Now look at that hybrid cloud zone emerging. We've talked about this sometimes in the past, and I want to call out VMware Cloud on AWS, Red Hat OpenShift, and VMware Cloud itself, like VMware Cloud Foundation and their other cloud services. All of these appear to be gaining traction, and you can see in the number of occurrences in the upper right, that shared N that I talked about, we're starting to see real numbers that are meaningful in this space. VMware Cloud on AWS, for example, has a net score of 53 percent with 116 accounts within that total respondent sample that you see there in the middle left of 1,438. That's how many CIOs and technology buyers responded to the ETR survey in October. You look 
at OpenShift at a 45 net score, and that's with 82 accounts. Now, OpenShift is in beta with what look to be some really strong offerings on AWS. And you can see, for context, I've added Dell EMC's cloud offerings, HPE's cloud offerings, the Oracle cloud, IBM Cloud, and also Rackspace. Dell is actually pretty strong, with a net score of 20 and 185 shared accounts, much, much higher than Dell overall, which is kind of in the red zone. Oracle, IBM, you see those, and Rackspace, you know, not killing it. Rackspace is kind of in the big negative, so that's a concern. But anyway, we'd like to see the data match the marketing rhetoric for the guys that are in the red. And look, Alibaba is starting to show up in the survey. There's only 26 shared N, but we thought we'd put it in there. So, three key points again: AWS and Microsoft keep on trucking, Google needs to do better, hybrid is becoming real, and that bodes well for multi-cloud, and the legacy on-prem guys have got a lot of work to do. They're under a lot of pressure. The pivot to cloud has not been easy for them, and it's still a case where, I've talked about this a lot, the declines in their on-premises offerings are not being offset by the new stuff, the cloud momentum. All right, I want to close out by sharing some of the conversations and thoughts that we've had in the community around SaaS and its impact on cloud. We really have been focusing on IaaS and PaaS; the SaaS layer is obviously up the stack. So let me first share that there's a lot of talk, and has been for years, about AWS, their slowing growth rates, and whether or not they'll have to enter the SaaS market to expand their total available market. And I've said consistently, while I never say never about AWS, I don't think so, at least not yet. This chart plots the big three cloud players. Note AWS is a bigger piece of this pie now that I've turned off the cloud computing filter, and I know, more nuances, but the data 
wonks will see this and they'll ask me about it. This is all of AWS's portfolio, and again, it's only the Microsoft Azure portfolio. So you see AWS now overtakes Azure on the x-axis, i.e., market share. Now, we've plotted some of the major SaaS vendors, and you can see ServiceNow and Salesforce, both very large, and they have really strong spending momentum. ServiceNow is, you know, pushing 100 billion dollars in market value. They surpassed Workday quite some time ago. Workday's got less presence, but they've got a really, really solid net score. And I've got to say, I'm impressed with SAP. Despite some of the earnings challenges that they've been having, they're right up there with Splunk and Tableau. Splunk has softened in recent surveys. I've also plotted in there NetSuite and Oracle Fusion, which are just okay. And that is, I think, for now anyway, AWS is going to position itself as the best place, the most friendly and highest quality cloud in which to run your SaaS. For example, Workday runs on AWS. AWS is Salesforce's preferred infrastructure platform. So my premise here is, just like retail companies might not want to run on AWS, a number of SaaS companies that compete with Microsoft might think twice about running on Azure. So AWS would be better off for now trying to attract those SaaS players and drive their services, and sticking to infrastructure and the PaaS layer. Snowflake is actually kind of interesting, and I've added them for context because their net score is always kind of a bellwether. It's really off the charts, and they're an ISV running on the cloud. They're different from some of the other SaaS players in that Snowflake is a database, okay? And most of Snowflake's business runs on AWS, and AWS competes with Snowflake with Redshift, but AWS has the best cloud and drives a lot of business for Snowflake, and vice versa. So it's kind of interesting. Snowflake to Redshift, as a much smaller example, is kind of like Netflix to Amazon Prime Video. They compete, 
they both thrive. So I think AWS is going to continue to grow by attracting SaaS players as the preferred platform, and they'll also attract developers and try to disrupt SaaS players like ServiceNow, which runs on its own cloud. I remember years ago David Floyer and I said that ServiceNow was awesome, but at some point its infrastructure cost structure is going to be less competitive than those companies that are running on hyperscale clouds, certainly the hyperscale clouds themselves. And ServiceNow has this multi-instance architecture, which just can't easily port over to the cloud, but it can charge a lot, which it does. Now, at some point, some sharp developers are going to look at all this and say, "Whoa, see that ServiceNow? I can build this for less." And they'll attack ServiceNow and their seat-based license model, maybe with a consumption pricing model and a platform, or a set of services, that are perhaps less expensive. You're seeing this to a certain degree with, like, Elastic inside the application performance management space. So there are some things to watch there. But there are those who firmly believe that AWS will and must enter the SaaS space directly. We talked last week about how beneficial Microsoft's application business is for Azure and what a flywheel that is, but for me, I think we're not there yet. Let's give it some time, I think maybe four to five years before AWS may even start to think about filling some of the space up the stack. Now, maybe they'll find some unique opportunities to do that, for instance at the edge, but I think that's way off. Okay, so bottom line: it's good to be in tech these days, it's even better to be in the cloud, and it's best if you're AWS and Microsoft, and I don't see that changing for a while. Now remember, these episodes are all available as podcasts wherever you listen. I publish each week on wikibon.com and siliconangle.com. You can get in touch with me through email, it's 
david.vellante@siliconangle.com. Feel free to DM me on Twitter @dvellante. I post on LinkedIn, love your comments there. Thank you, and don't forget to check out etr.plus for all the survey action. Thanks for watching this episode of theCUBE Insights powered by ETR. This is Dave Vellante. Stay safe, stay sane, and we'll see you next time.
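The wheel-graph arithmetic described in this segment (the two greens minus the two reds) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name, field names, and sample percentages below are hypothetical and are not ETR's data or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WheelBreakdown:
    """Percent of respondents in each slice of the wheel (hypothetical field names)."""

    adoption: float     # green: new adoptions
    increase: float     # green: increased spending
    flat: float         # neither green nor red: flat spending
    decrease: float     # red: lower spending
    replacement: float  # red: defections

    def net_score(self) -> float:
        # Two greens minus two reds, as described in the segment.
        return (self.adoption + self.increase) - (self.decrease + self.replacement)


# Illustrative percentages only; they are not taken from the ETR survey.
example = WheelBreakdown(adoption=10.0, increase=45.0, flat=38.0,
                         decrease=5.0, replacement=2.0)
print(example.net_score())
```

Flat spending appears in the breakdown but drops out of the calculation, which is why a vendor with mostly flat responses can still post a low net score.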

Published Date : Nov 7 2020

Breaking Analysis: Cloud Momentum Building for the Post COVID Era

>> From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a Cube Conversation. >> Analysis from company earnings reports and customer survey data continues to show that Microsoft Azure and GCP are closing the gap on AWS's cloud dominance. Now, while reporting definitions of the cloud remain fuzzy, it's very clear that cloud's steady march into the stronghold of on-premises computing continues. The global coronavirus pandemic has only strengthened the cloud's position in the overall marketplace. Now, as you might recall, we reported last week the story of the haves and the have-nots, and that's playing out in several sectors. And in this Breaking Analysis we're going to take a closer look at the big three cloud players, and we'll do a brief investigation of AWS specifically in a short drill down. Welcome, everyone, to theCUBE Insights powered by ETR. Today we're going to try to accomplish three things. First, we want to quantify how the cloud is impacting the on-prem business. As we enter this decade, let's take a snapshot of some of the vendors that are well positioned, and maybe some of those that are facing greater headwinds. The second thing we want to do is update you on the latest market share data for the big three cloud players. And then finally, I want to dig into the business of AWS in a little bit more depth to see where they're seeing the most strength, and where, perhaps, there are some cracks in their substantial armor. Now, let's look at the IT landscape where we are in 2020. The first data point that we want to share really tells a familiar story, and really drafts off the theme that we've set for the past several weeks, which is the bifurcation in the marketplace. Now, if you take a look at this chart, what it's really showing is ETR's version of the Gartner Magic Quadrant, but it uses survey data to plot the vendors. 
So on the y-axis is the metric net score, which is a measurement of spending momentum. And just to review, each quarter ETR surveys more than 1,200 CIOs and IT professionals and asks them, essentially, are they spending more or less on a particular supplier. And what we do is we subtract the less from the more, and the remainder is the net score. So it's sort of like NPS, and I'll go into that a little bit later. But that's the vertical axis. Now the x-axis is called market share. You know, it's really not market share like IDC measures; rather, it's a measure of pervasiveness in the survey, and it's calculated by dividing the mentions of a particular company by the total mentions in the overall survey. And you see that's plotted on the horizontal axis. So, several points here that I want to note. First, remember, this is April survey data from more than 1,200 buyers, and you can see we've plotted several companies, including the big three cloud players. You've got Microsoft and AWS in the upper right, and Google with much lower presence but decent spending momentum. And we've plotted a number of other enterprise players, including several on-prem leaders like Dell EMC, IBM, Oracle, and Cisco. And we've also included some of the companies that are showing real promise from a momentum standpoint, and penetration. These are business models that we like, and they include Snowflake, the analytic database disruptor; UiPath, the RPA specialist; Okta and CrowdStrike, who are really killing it in security; and Datadog, who provides cloud monitoring services. And as you can see, we've superimposed in the upper right a table showing the net scores and market shares for each of the companies. And the story here very clearly quantifies that cloud is winning, and we think it's likely to continue to grow fast and penetrate the enterprise. Now, as we've reported many times, downturns tend to be good for cloud. 
But the on-prem leaders, you know, as you can see by Cisco's position, for example, they're not going to just roll over. And we'll be covering winning strategies for legacy players in a later segment. But let me just say this: if you're a customer with a lot of on-prem infrastructure, and you're building out data centers, unless you're a big cloud provider, you're probably going to be on the wrong side of history here. Okay. Let's take a closer look at the big three. I want to update you on their IaaS and PaaS numbers as best we can. All recently reported earnings, and this chart shows the data for each of the companies. Now, as you can see, each of them has a substantial business, with AWS by far the largest and GCP growing the fastest. What's notable is that AWS in 2018 was 2.7x larger than Azure, and today that delta is under 2x based on our Q1 estimates. And it's just about 2x on a trailing 12-month basis. Now, I've got to caution you that the AWS numbers are the cleanest. AWS reports religiously an easy-to-understand revenue and operating profit number for its cloud business every quarter. Microsoft and Google are much fuzzier. You know, for example, you read through Microsoft's 10-K reports and you'll see that their intelligent cloud revenue comprises public and private clouds, hybrid, SQL Server, Windows Server, System Center, GitHub, enterprise support and consulting services and, oh yeah, Azure. So we have to estimate how much of that hairball is actually directly comparable to AWS. Now, Google similarly just started breaking out its cloud revenue, and it bundles more than just IaaS and PaaS into its cloud numbers. Now, having said that, both Microsoft and Google do give little tidbits of guidance, like Hansel and Gretel, in the form of growth rates or commentary on growth rates in their respective IaaS and PaaS businesses, i.e., Azure and GCP. So this is our best estimate, given all that is reported and what we know from survey data. 
Now, I also want to point out that these clouds are really different in quality, and they have different fits for different use cases. For example, Microsoft is building out a cloud really to support its huge installed base of customers and make it easy for them to tap into the Microsoft Cloud services, but it may not be the most robust cloud, as has been widely reported and analyzed in the press. You know, Microsoft is struggling to provide adequate capacity for its customers. It's kind of using the COVID-19 pandemic as a bit of a heat shield on this issue. Microsoft put out a blog post essentially saying that it'll prioritize first responders, health workers, and essential businesses during the COVID-19 pandemic, oh, and Teams customers. So okay, that's one of those caveat emptor situations, you know, if you're not in one of these camps, or frankly, maybe if you are. But it's unquestionable that Microsoft has strong momentum across its vast portfolio, including cloud. And really that's what I want to get into next. So let's take a look at some data. We've been reporting for quite some time, based on the ETR surveys, that the big cloud players have very, very strong momentum as measured by net scores. So what this chart shows is the most recent survey results, again, more than 1,200 IT buyers, 1,269 to be exact. And you can see broadly that all of the big three are well in the green for net scores, as we show in the upper right-hand box, with well over 50% net scores for all three, and Microsoft Azure is in the 70% range. So, very, very strong demand across the board. Now remember, ETR is asking buyers to comment on the areas with which they are familiar. So a buyer might be interpreting cloud to include all those things in Microsoft and Google that may not be directly comparable to the AWS responses, but it doesn't matter. 
The point is, they all have momentum, and you can see that even though there's a slight dip in the most recent survey, which ran during the peak of the shutdown in the US, it's a small dip relative to other parts of the survey. Cloud is very, very strong. Now, let's dig into the data a bit more and take a look at the Fortune 500 drill down. So of course, this is an indicator of larger companies. And you can see AWS overtakes Azure in this segment by a small margin, noting the same caveats that I mentioned earlier. But the strength of the net scores for all three is meaningful, as they all increased within these larger buying bases. Now let's take a look at this next chart. If we extend that cut to include the Fortune 1000, you can see here that all three companies, again, continue to show strength. But, you know, there's a convergence, which really says to me that this multi-cloud picture has emerged, and that CIOs are really now starting to see that, whether it's through M&A, or maybe it was shadow IT or whatever, they're faced with a variety of choices that are increasingly viable. And despite my previous and sometimes snarky comments that multi-cloud has been more of a symptom of multi-vendor than a clear CIO strategy, that is perhaps beginning to change, especially as they're asked to clean up what I've often called the crime scene. Now, I want to close by taking a little bit of a closer look at the AWS business specifically. And I want to come back to this notion of net score and explain it a little bit. So what we show here on this wheel chart is really a breakdown of responses across more than 600 AWS customers in the April survey. Remember, again, this survey ran at the height of the lockdown in the US. It's a global survey, with well over 100 responses from outside of the United States. But really, what's relevant here is the strength of the AWS business overall. 
This chart shows how net score is essentially derived. ETR asks customers: Are you adopting new? Are you increasing spend, meaning increasing by 6% or more? Are you keeping spending flat? Are you decreasing spending by more than 6%? Or are you chucking the platform, i.e., replacement? So look at this, we're talking about nearly 70% of customers spending more in 2020 on AWS than they spent last year, and only 4% are spending less. That's pretty impressive for a player with a $38 billion business. Now, the next data point I want to share really shows where the action is across the AWS portfolio, so let's take a look at this. The chart here shows the responses from an N of more than 700 and the net score, or spending momentum, across the AWS portfolio with a comparison across three survey dates: last April, January 2020, and April 2020. And as you can see, there's very elevated spending momentum across most of the AWS key business lines, including cloud functions, data warehouse, which is EDW, etc., AI and machine learning, and workspaces, with the work-from-home pivot. And, you know, there are some areas that are maybe less robust, but nothing in the red zone, red zone meaning net scores below, let's say, 25%. And as you can see, there's really nothing close to that in the AWS portfolio. So you're seeing very strong momentum for AWS specifically and, of course, the cloud in general. Now, as I said, the pandemic has been good for cloud; downturns generally are a tailwind. So if you're building data centers, it's probably not a good use of capital, you know, so server huggers, beware. There's an attractiveness, more so than ever with this COVID-19 pandemic, of that dial-up, dial-down service. Watch for software companies starting to use that model, whereas today they often try to lock you into a one-year or a two-year or three-year license. 
Increasingly, we're seeing companies investigate and actually go to market with a true cloud model. Okay, thanks for watching this episode of theCUBE Insights powered by ETR. Remember, these Breaking Analysis segments are all available as podcasts. Check out siliconangle.com, I publish there weekly, they have all the news, and I also publish on Wikibon. Don't forget to check out etr.plus as well. Get in touch with me @dvellante, or you can email me at david.vellante@siliconangle.com. Stay safe, everybody, and we'll see you next time. (gentle music)
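Two calculations described in this segment lend themselves to a short sketch: sorting a respondent into one of the five survey answers using the 6% thresholds, and the "market share" (pervasiveness) ratio of a vendor's mentions to total mentions. The Python below is a rough illustration only. The function names, parameters, and bucket labels are hypothetical, and treating a change of exactly -6% as flat is an assumption read from the phrase "decreasing spending by more than 6%".

```python
def classify_spend(change_pct: float, is_new: bool = False,
                   is_replacing: bool = False) -> str:
    """Map one respondent's answer to a bucket (hypothetical labels)."""
    if is_replacing:
        return "replacement"   # chucking the platform
    if is_new:
        return "adoption"      # adopting new
    if change_pct >= 6.0:
        return "increase"      # increasing by 6% or more
    if change_pct < -6.0:
        return "decrease"      # decreasing by more than 6%
    return "flat"              # everything in between


def pervasiveness(vendor_mentions: int, total_mentions: int) -> float:
    """Share of survey mentions for one vendor, the x-axis 'market share'."""
    if total_mentions <= 0:
        raise ValueError("total_mentions must be positive")
    return vendor_mentions / total_mentions


# Illustrative inputs only; they are not taken from the ETR survey.
print(classify_spend(8.0))
print(pervasiveness(50, 200))
```

Keeping classification separate from the ratio mirrors the two axes of the chart: the buckets feed net score on the vertical axis, while pervasiveness is plotted on the horizontal.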

Published Date : May 13 2020

Analyst Predictions 2023: The Future of Data Management

(upbeat music) >> Hello, this is Dave Vellante with theCUBE, and one of the most gratifying aspects of my role as a host of "theCUBE TV" is I get to cover a wide range of topics. And quite often, we're able to bring to our program a level of expertise that allows us to more deeply explore and unpack some of the topics that we cover throughout the year. And one of our favorite topics, of course, is data. Now, in 2021, after being in isolation for the better part of two years, a group of industry analysts met up at AWS re:Invent and started a collaboration to look at the trends in data and predict what some likely outcomes will be for the coming year. And it resulted in a very popular session that we had last year focused on the future of data management. And I'm very excited and pleased to tell you that the 2023 edition of that predictions episode is back, and with me are five outstanding market analysts, Sanjeev Mohan of SanjMo, Tony Baer of dbInsight, Carl Olofson from IDC, Dave Menninger from Ventana Research, and Doug Henschen, VP and Principal Analyst at Constellation Research. Now, what is it that we're calling you, guys? A data pack like the rat pack? No, no, no, no, that's not it. It's the data crowd, the data crowd, and the crowd includes some of the best minds in the data analyst community. They'll discuss how data management is evolving and what listeners should prepare for in 2023. Guys, welcome back. Great to see you. >> Good to be here. >> Thank you. >> Thanks, Dave. (Tony and Dave faintly speak) >> All right, before we get into 2023 predictions, we thought it'd be good to do a look back at how we did in 2022 and give a transparent assessment of those predictions. So, let's get right into it. We're going to bring these up here, the predictions from 2022, they're color-coded red, yellow, and green to signify the degree of accuracy. And I'm pleased to report there's no red. Well, maybe some of you will want to debate that grading system. 
But as always, we want to be open, so you can decide for yourselves. So, we're going to ask each analyst to review their 2022 prediction and explain their rating and what evidence they have that led them to their conclusion. So, Sanjeev, please kick it off. Your prediction was data governance becomes key. I know that's going to knock you guys over, but elaborate, because you had more detail when you double click on that. >> Yeah, absolutely. Thank you so much, Dave, for having us on the show today. And we self-graded ourselves. I could have very easily made my prediction from last year green, but I mentioned why I left it as yellow. I totally fully believe that data governance was in a renaissance in 2022. And why do I say that? You have to look no further than AWS launching its own data catalog called DataZone. Before that, mid-year, we saw Unity Catalog from Databricks went GA. So, overall, I saw there was tremendous movement. When you see these big players launching a new data catalog, you know that they want to be in this space. And this space is highly critical to everything that I feel we will talk about in today's call. Also, if you look at established players, I spoke at Collibra's conference, data.world, work closely with Alation, Informatica, a bunch of other companies, they all added tremendous new capabilities. So, it did become key. The reason I left it as yellow is because I had made a prediction that Collibra would go IPO, and it did not. And I don't think anyone is going IPO right now. The market is really, really down, the funding in VC IPO market. But other than that, data governance had a banner year in 2022. >> Yeah. Well, thank you for that. And of course, you saw data clean rooms being announced at AWS re:Invent, so more evidence. And I like how the fact that you included in your predictions some things that were binary, so you dinged yourself there. So, good job. Okay, Tony Baer, you're up next. Data mesh hits reality check. 
As you see here, you've given yourself a bright green thumbs up. (Tony laughing) Okay. Let's hear why you feel that was the case. What do you mean by reality check? >> Okay. Thanks, Dave, for having us back again. This is something I wrote about and tried to get away from, and this is a topic that just won't go away. I did speak with a number of folks, early adopters and non-adopters during the year. And I did find that it pretty much validated what I was expecting, which was that this has now become a front-burner issue. And if I had any doubt in my mind, the evidence I would point to is what was originally intended to be a throwaway post on LinkedIn, which I just quickly scribbled down the night before leaving for re:Invent. I was packing at the time, and for some reason, I was doing a Google search on data mesh. And I happened to have tripped across this ridiculous article, I will not say where, because it doesn't deserve any publicity, about the eight (Dave laughing) best data mesh software companies of 2022. (Tony laughing) One of my predictions was that you'd see data mesh washing. And I just quickly hopped on that, maybe three sentences, and wrote it in about a couple of minutes saying this is hogwash, essentially. (laughs) And that just reun... And then, I left for re:Invent. And the next night, when I got into my Vegas hotel room, I clicked on my computer. I saw 15,000 hits on that post, which was the most hits of any single post I put up all year. And the responses were wildly pro and con. So, it pretty much validates my expectation in that data mesh really did hit a lot more scrutiny over this past year. >> Yeah, thank you for that. I remember that article. I remember rolling my eyes when I saw it, and then I recently, (Tony laughing) I talked to Walmart and they actually invoked Martin Fowler and they said that they're working through their data mesh. 
So, it takes a lot of thought, and it really, as we've talked about, is really as much an organizational construct. You're not buying data mesh >> Bingo. >> to your point. Okay. Thank you, Tony. Carl Olofson, here we go. You've graded yourself a yellow on the prediction that graph databases take off. Please elaborate. >> Yeah, sure. So, I realized in looking at the prediction that it seemed to imply that graph databases could be a major factor in the data world in 2022, which obviously didn't become the case. It was an error on my part in that I should have said it in the right context. It's really a three to five-year time period that graph databases will really become significant, because they still need accepted methodologies that can be applied in a business context as well as proper tools in order for people to be able to use them seriously. But I stand by the idea that it is taking off, because for one thing, Neo4j, which is the leading independent graph database provider, had a very good year. And also, we're seeing interesting developments in terms of things like AWS with Neptune and with Oracle providing graph support in Oracle database this past year. Those things are, as I said, growing gradually. There are other companies like TigerGraph and so forth, that deserve watching as well. But as far as becoming mainstream, it's going to be a few years before we get all the elements together to make that happen. Like any new technology, you have to create an environment in which ordinary people without a whole ton of technical training can actually apply the technology to solve business problems. >> Yeah, thank you for that. These specialized databases, graph databases, time series databases, you see them embedded into mainstream data platforms, but there's a place for these specialized databases. I would suspect we're going to see new types of databases emerge with all this cloud sprawl that we have and maybe to the edge. 
>> Well, part of it is that it's not as specialized as you might think. You can apply graphs to a great many workloads and use cases. It's just that people have yet to fully explore and discover what those are. >> Yeah. >> And so, it's going to be a process. (laughs) >> All right, Dave Menninger, streaming data permeates the landscape. You gave yourself a yellow. Why? >> Well, I couldn't think of an appropriate combination of yellow and green. Maybe I should have used chartreuse, (Dave laughing) but I was probably a little hard on myself making it yellow. This is another type of specialized data processing, like Carl was talking about with graph databases: stream processing. And nearly every data platform offers streaming capabilities now. Often, it's based on Kafka. If you look at Confluent, their revenues have grown at more than 50%, continue to grow at more than 50% a year. They're expected to do more than half a billion dollars in revenue this year. But the thing that hasn't happened yet, and to be honest, I didn't necessarily expect it to happen in one year, is that streaming hasn't become the default way in which we deal with data. It's still a sidecar to data at rest. And I do expect that we'll continue to see streaming become more and more mainstream. I do expect perhaps in the five-year timeframe that we will first deal with data as streaming and then at rest, but the worlds are starting to merge. And we even see some vendors bringing products to market, such as K2View, Hazelcast, and RisingWave Labs. So, in addition to all those core data platform vendors adding these capabilities, there are new vendors approaching this market as well. >> I like the tough grading system, and it's not trivial. And when you talk to practitioners doing this stuff, there's still some complications in the data pipeline. And so, but I think, you're right, it probably was a yellow plus. Doug Henschen, data lakehouses will emerge as dominant. 
When you talk to people about lakehouses, practitioners, they all use that term. They certainly use the term data lake, but now, they're using lakehouse more and more. What are your thoughts here? Why the green? What's your evidence there? >> Well, I think, I was accurate. I spoke about it specifically as something that vendors would be pursuing. And we saw yet more lakehouse advocacy in 2022. Google introduced its BigLake service alongside BigQuery. Salesforce introduced Genie, which is really a lakehouse architecture. And it was a safe prediction to say vendors are going to be pursuing this in that AWS, Cloudera, Databricks, Microsoft, Oracle, SAP, Salesforce now, IBM, all advocate this idea of a single platform for all of your data. Now, the trend was also supported in 2022, in that we saw a big embrace of Apache Iceberg. That's a structured table format. It's used with these lakehouse platforms. It's open, so it ensures portability and it also ensures performance. And that's a structured table that helps with the warehouse side performance. But among those announcements, Snowflake, Google, Cloudera, SAP, Salesforce, IBM, all embraced Iceberg. But keep in mind, again, I'm talking about this as something that vendors are pursuing as their approach. So, they're advocating end users. It's very cutting edge. I'd say the top, leading edge, 5% of companies have really embraced the lakehouse. I think, we're now seeing the fast followers, the next 20 to 25% of firms embracing this idea and embracing a lakehouse architecture. I recall Christian Kleinerman at the big Snowflake event last summer, making the announcement about Iceberg, and he asked for a show of hands for any of you in the audience at the keynote, have you heard of Iceberg? And just a smattering of hands went up. So, the vendors are ahead of the curve. They're pushing this trend, and we're now seeing a little bit more mainstream uptake. >> Good. Doug, I was there. 
It was you, me, and I think, two other hands were up. That was just humorous. (Doug laughing) All right, well, so I liked the fact that we had some yellow and some green. When you think about these things, there's the prediction itself. Did it come true or not? There are the sub predictions that you guys make, and of course, the degree of difficulty. So, thank you for that open assessment. All right, let's get into the 2023 predictions. Let's bring up the predictions. Sanjeev, you're going first. You've got a prediction around unified metadata. What's the prediction, please? >> So, my prediction is that metadata space is currently a mess. It needs to get unified. There are too many use cases of metadata, which are being addressed by disparate systems. For example, data quality has become really big in the last couple of years, data observability, the whole catalog space is actually, people don't like to use the word data catalog anymore, because data catalog sounds like it's a catalog, a museum, if you may, of metadata that you go and admire. So, what I'm saying is that in 2023, we will see that metadata will become the driving force behind things like data ops, things like orchestration of tasks using metadata, not rules. Not saying that if this fails, then do this, if this succeeds, go do that. But it's like getting to the metadata level, and then making a decision as to what to orchestrate, what to automate, how to do data quality check, data observability. So, this space is starting to gel, and I see there'll be more maturation in the metadata space. Even security privacy, some of these topics, which are handled separately. And I'm just talking about data security and data privacy. I'm not talking about infrastructure security. These also need to merge into a unified metadata management piece with some knowledge graph, semantic layer on top, so you can do analytics on it. So, it's no longer something that sits on the side, it's limited in its scope. 
It is actually the very engine, the very glue that is going to connect data producers and consumers. >> Great. Thank you for that. Doug. Doug Henschen, any thoughts on what Sanjeev just said? Do you agree? Do you disagree? >> Well, I agree with many aspects of what he says. I think, there's a huge opportunity for consolidation and streamlining of these aspects of governance. Last year, Sanjeev, you said something like, we'll see more people using catalogs than BI. And I have to disagree. I don't think this is a category that's headed for mainstream adoption. It's a behind the scenes activity for the wonky few, or better yet, companies want machine learning and automation to take care of these messy details. We've seen these waves of management technologies, some of the latest being data observability and customer data platforms, but they failed to sweep away all the earlier investments in data quality and master data management. So, yes, I hope the latest tech offers glimmers that there's going to be a better, cleaner way of addressing these things. But to my mind, the business leaders, including the CIO, only want to spend as much time and effort and money and resources on these sorts of things as it takes to avoid getting breached, ending up in headlines, getting fired or going to jail. So, vendors, bring on the ML and AI smarts and the automation of these sorts of activities. >> So, if I may say something, the reason why we have this dichotomy between data catalog and the BI vendors is because data catalogs are very soon not going to be standalone products, in my opinion. They're going to get embedded. So, when you use a BI tool, you'll actually use the catalog to find out what is it that you want to do, whether you are looking for data or you're looking for an existing dashboard. So, the catalog becomes embedded into the BI tool. >> Hey, Dave Menninger, sometimes you have some data in your back pocket. Do you have any stats (chuckles) on this topic? 
>> No, I'm glad you asked, because I'm going to... Now, data catalogs are something that's interesting. Sanjeev made a statement that data catalogs are falling out of favor. I don't care what you call them. They're valuable to organizations. Our research shows that organizations that have adequate data catalog technologies are three times more likely to express satisfaction with their analytics for just the reasons that Sanjeev was talking about. You can find what you want, you know you're getting the right information, you know whether or not it's trusted. So, those are good things. So, we expect to see the capabilities, whether it's embedded or separate. We expect to see those capabilities continue to permeate the market. >> And a lot of those catalogs are driven now by machine learning and things. So, they're learning from those patterns of usage by people when people use the data. (airy laughs) >> All right. Okay. Thank you, guys. All right. Let's move on to the next one. Tony Baer, let's bring up the predictions. You got something in here about the modern data stack. We need to rethink it. Is the modern data stack getting long in the tooth? Is it not so modern anymore? >> I think, in a way, it's got almost too modern. It's gotten too, I don't know if it's being long in the tooth, but it is getting long. The modern data stack, it's traditionally been defined as basically you have the data platform, which would be the operational database and the data warehouse. And in between, you have all the tools that are necessary to essentially get that data from the operational realm or the streaming realm for that matter into basically the data warehouse, or as we might be seeing more and more, the data lakehouse. And I think, what's important here is that, or I think, we have seen a lot of progress, and this would be in the cloud, is with the SaaS services. 
And especially you see that in the modern data stack, where all these players, not just the MongoDBs or the Oracles or the Amazons, have their database platforms. You see the Informaticas, the Fivetrans, and all the other players there have their own SaaS services. And within those SaaS services, you get a certain degree of simplicity, which is it takes all the housekeeping off the shoulders of the customers. That's a good thing. The problem is that what we're getting to unfortunately is what I would call lots of islands of simplicity, which means that it leaves it (Dave laughing) to the customer to have to integrate or put all that stuff together. It's a complex tool chain. And so, what we really need to think about here, we have too many pieces. And going back to the discussion of catalogs, it's like we have so many catalogs out there, which one do we use? 'Cause chances are most organizations do not rely on a single catalog at this point. What I'm calling on all the data providers or all the SaaS service providers to do is to literally get it together and essentially make this modern data stack less of a stack, make it more of a blending of an end-to-end solution. And that can come in a number of different ways. Part of it is that data platform providers have been adding services that are adjacent. And there's some very good examples of this. We've seen progress over the past year or so. For instance, MongoDB integrating search. It's a very common sort of tool that the applications developed on MongoDB use, so MongoDB then built it into the database rather than requiring an extra Elasticsearch or OpenSearch stack. Amazon just... AWS just did the zero-ETL, which is a first step towards simplifying the process of going from Aurora to Redshift. You've seen the same thing with Google BigQuery integrating basically streaming pipelines. And you're seeing also a lot of movement in in-database machine learning. 
So, there's some good moves in this direction. I expect to see more of this this year. Part of it is basically the SaaS platforms adding some functionality. But I also see more importantly, because you're never going to get... This is like asking your data team and your developers, herding cats, to standardize on the same tool. In most organizations, that is not going to happen. So, take a look at the most popular combinations of tools and start to come up with some pre-built integrations and pre-built orchestrations, and offer some promotional pricing, maybe not quite two-for-one, but in other words, get two products or two services for the price of one and a half. I see a lot of potential for this. And to me, if the class was to simplify things, this is the next logical step and I expect to see more of this here. >> Yeah, and you see Oracle MySQL HeatWave, yet another example of eliminating that ETL. Carl Olofson, today, if you think about the data stack and the application stack, they're largely separate. Do you have any thoughts on how that's going to play out? Does that play into this prediction? What do you think? >> Well, I think, that the... I really like Tony's phrase, islands of simplification. It really says (Tony chuckles) what's going on here, which is that all these different vendors you ask about, about how these stacks work. All these different vendors have their own stack vision. And you can... One application group is going to use one, and another application group is going to use another. And some people will say, let's go to, like you go to an Informatica conference and they say, we should be the center of your universe, but you can't connect everything in your universe to Informatica, so you need to use other things. So, the challenge is how do we make those things work together? As Tony has said, and I totally agree, we're never going to get to the point where people standardize on one organizing system. 
So, the alternative is to have metadata that can be shared amongst those systems and protocols that allow those systems to coordinate their operations. This is standard stuff. It's not easy. But the motive for the vendors is that they can become more active critical players in the enterprise. And of course, the motive for the customer is that things will run better and more completely. So, I've been looking at this in terms of two kinds of metadata. One is the meaning metadata, which says what data can be put together. The other is the operational metadata, which says basically where did it come from? Who created it? What's its current state? What's the security level? Et cetera, et cetera, et cetera. The good news is the operational stuff can actually be done automatically, whereas the meaning stuff requires some human intervention. And as we've already heard from, was it Doug, I think, people are disinclined to put a lot of definition into meaning metadata. So, that may be the harder one, but coordination is key. This problem has been with us forever, but with the addition of new data sources, with streaming data with data in different formats, the whole thing has, it's been like what a customer of mine used to say, "I understand your product can make my system run faster, but right now I just feel I'm putting my problems on roller skates. (chuckles) I don't need that to accelerate what's already not working." >> Excellent. Okay, Carl, let's stay with you. I remember in the early days of the big data movement, Hadoop movement, NoSQL was the big thing. And I remember Amr Awadallah said to us in theCUBE that SQL is the killer app for big data. So, your prediction here, if we bring that up is SQL is back. Please elaborate. >> Yeah. So, of course, some people would say, well, it never left. 
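
Editor's sketch: Carl's two kinds of metadata can be pictured as two small record types. This is a minimal illustration under stated assumptions, not any vendor's schema. Every class, field, and function name below is hypothetical. The operational record is filled in by code at capture time, while the meaning record carries fields that only a human reviewer would complete, which mirrors his point that the operational side can be automated and the meaning side cannot.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class OperationalMetadata:
    """Where the data came from, who created it, its state and security level."""
    source: str
    created_by: str
    state: str
    security_level: str
    captured_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class MeaningMetadata:
    """What the data means and what it can sensibly be combined with."""
    business_term: str
    joinable_with: List[str] = field(default_factory=list)
    reviewed_by: Optional[str] = None  # stays empty until a person curates it


def capture_operational(source: str, user: str, security_level: str = "internal") -> OperationalMetadata:
    """Build the operational record automatically when data is ingested."""
    return OperationalMetadata(
        source=source,
        created_by=user,
        state="raw",
        security_level=security_level,
    )


# Operational metadata needs no human input; meaning metadata starts unreviewed.
ops = capture_operational("orders_stream", "ingest_service")
meaning = MeaningMetadata(business_term="customer order")
print(ops.state, ops.security_level, meaning.reviewed_by)
```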
Actually, that's probably closer to true, but in the perception of the marketplace, there's been all this noise about alternative ways of storing, retrieving data, whether it's in key value stores or document databases and so forth. We're getting a lot of messaging that for a while had persuaded people that, oh, we're not going to do analytics in SQL anymore. We're going to use Spark for everything, except that only a handful of people know how to use Spark. Oh, well, that's a problem. Well, how about, and for ordinary conventional business analytics, Spark is like an over-engineered solution to the problem. SQL works just great. What's happened in the past couple years, and what's going to continue to happen is that SQL is insinuating itself into everything we're seeing. We're seeing all the major data lake providers offering SQL support, whether it's Databricks or... And of course, Snowflake is loving this, because that is what they do, and their success certainly points to the success of SQL, even MongoDB. And we were all, I think, at the MongoDB conference where on one day, we hear SQL is dead. They're not teaching SQL in schools anymore, and this kind of thing. And then, a couple days later at the same conference, they announced we're adding a new analytic capability based on SQL. But didn't you just say SQL is dead? So, the reality is that SQL is better understood than most other methods of certainly of retrieving and finding data in a data collection, no matter whether it happens to be relational or non-relational. And even in systems that are very non-relational, such as graph and document databases, their query languages are being built or extended to resemble SQL, because SQL is something people understand. >> Now, you remember when we were in high school and you had to take the... You're debating in the class and you were forced to take one side and defend it. 
So, I was at a Vertica conference one time up on stage with Curt Monash, and I had to take the NoSQL, the world is changing paradigm shift. And so just to be controversial, I said to him, Curt Monash, I said, who really needs ACID compliance anyway? Tony Baer. And so, (chuckles) of course, his head exploded, but what are your thoughts (guests laughing) on all this? >> Well, my first thought is congratulations, Dave, for surviving being up on stage with Curt Monash. >> Amen. (group laughing) >> I definitely would concur with Carl. We actually are definitely seeing a SQL renaissance and if there's any proof of the pudding here, I see lakehouse as being the icing on the cake. As Doug had predicted last year, now, (clears throat) for the record, I think, Doug was about a year ahead of time in his predictions that this year is really the year that I see (clears throat) the lakehouse ecosystems really firming up. You saw the first shots last year. But anyway, on this, data lakes will not go away. I've actually, I'm on the home stretch of doing a market, a landscape on the lakehouse. And lakehouse will not replace data lakes in terms of that. There is the need for those data scientists who do know Python, who know Spark, to go in there and basically do their thing without all the restrictions or the constraints of a pre-built, pre-designed table structure. I get that. Same thing for developing models. But on the other hand, there is huge need. Basically, (clears throat) maybe MongoDB was saying that we're not teaching SQL anymore. Well, maybe we have an oversupply of SQL developers. Well, I'm being facetious there, but there is a huge skills base in SQL. Analytics have been built on SQL. Then came the lakehouse, and why this really helps to fuel a SQL revival is that the core need in the data lake, what brought on the lakehouse, was not so much SQL, it was a need for ACID. And what was the best way to do it? It was through a relational table structure. 
So, the whole idea of ACID in the lakehouse was not to turn it into a transaction database, but to make the data trusted, secure, and more granularly governed, where you could govern down to column and row level, which you really could not do in a data lake or a file system. So, while lakehouse can be queried in a manner, you can go in there with Python or whatever, it's built on a relational table structure. And so, for that end, for those types of data lakes, it becomes the end state. You cannot bypass that table structure as I learned the hard way during my research. So, the bottom line I'd say here is that lakehouse is proof that we're starting to see the revenge of the SQL nerds. (Dave chuckles) >> Excellent. Okay, let's bring back up the predictions. Dave Menninger, this one's really thought-provoking and interesting. We're hearing things like data as code, new data applications, machines actually generating plans with no human involvement. And your prediction is the definition of data is expanding. What do you mean by that? >> So, I think, for too long, we've thought about data as the, I would say, facts that we collect, the readings off of devices and things like that, but data on its own is really insufficient. Organizations need to manipulate that data and examine derivatives of the data to really understand what's happening in their organization, why has it happened, and to project what might happen in the future. And my comment is that these data derivatives need to be supported and managed just like the data needs to be managed. We can't treat this as entirely separate. Think about all the governance discussions we've had. Think about the metadata discussions we've had. If you separate these things, now you've got more moving parts. We're talking about simplicity and simplifying the stack. So, if these things are treated separately, it creates much more complexity. 
I also think it creates a little bit of a myopic view on the part of the IT organizations that are acquiring these technologies. They need to think more broadly. So, for instance, metrics. Metric stores are becoming much more common part of the tooling that's part of a data platform. Similarly, feature stores are gaining traction. So, those are designed to promote the reuse and consistency across the AI and ML initiatives. The elements that are used in developing an AI or ML model. And let me go back to metrics and just clarify what I mean by that. So, any type of formula involving the data points. I'm distinguishing metrics from features that are used in AI and ML models. And the data platforms themselves are increasingly managing the models as an element of data. So, just like figuring out how to calculate a metric. Well, if you're going to have the features associated with an AI and ML model, you probably need to be managing the model that's associated with those features. The other element where I see expansion is around external data. Organizations for decades have been focused on the data that they generate within their own organization. We see more and more of these platforms acquiring and publishing data to external third-party sources, whether they're within some sort of a partner ecosystem or whether it's a commercial distribution of that information. And our research shows that when organizations use external data, they derive even more benefits from the various analyses that they're conducting. And the last great frontier in my opinion on this expanding world of data is the world of driver-based planning. Very few of the major data platform providers provide these capabilities today. These are the types of things you would do in a spreadsheet. And we all know the issues associated with spreadsheets. They're hard to govern, they're error-prone. 
And so, if we can take that type of analysis, collecting the occupancy of a rental property, the projected rise in rental rates, the fluctuations perhaps in occupancy, the interest rates associated with financing that property, we can project forward. And that's a very common thing to do. What the income might look like from that property income, the expenses, we can plan and purchase things appropriately. So, I think, we need this broader purview and I'm beginning to see some of those things happen. And the evidence today I would say, is more focused around the metric stores and the feature stores starting to see vendors offer those capabilities. And we're starting to see the ML ops elements of managing the AI and ML models find their way closer to the data platforms as well. >> Very interesting. When I hear metrics, I think of KPIs, I think of data apps, orchestrate people and places and things to optimize around a set of KPIs. It sounds like a metadata challenge more... Somebody once predicted they'll have more metadata than data. Carl, what are your thoughts on this prediction? >> Yeah, I think that what Dave is describing as data derivatives is in a way, another word for what I was calling operational metadata, which is not about the data itself, but how it's used, where it came from, what the rules are governing it, and that kind of thing. If you have a rich enough set of those things, then not only can you do a model of how well your vacation property rental may do in terms of income, but also how well your application that's measuring that is doing for you. In other words, how many times have I used it, how much data have I used and what is the relationship between the data that I've used and the benefits that I've derived from using it? Well, we don't have ways of doing that. What's interesting to me is that folks in the content world are way ahead of us here, because they have always tracked their content using these kinds of attributes. 
Where did it come from? When was it created, when was it modified? Who modified it? And so on and so forth. We need to do more of that with the structured data that we have, so that we can track how it's used. And also, it tells us how well we're doing with it. Is it really benefiting us? Are we being efficient? Are there improvements in processes that we need to consider? Because maybe data gets created and then it isn't used or it gets used, but it gets altered in some way that actually misleads people. (laughs) So, we need the mechanisms to be able to do that. So, I would say that that's... And I'd say that it's true that we need that stuff. I think, that starting to expand is probably the right way to put it. It's going to be expanding for some time. I think, we're still a distance from having all that stuff really working together. >> Maybe we should say it's gestating. (Dave and Carl laughing) >> Sorry, if I may- >> Sanjeev, yeah, I was going to say this... Sanjeev, please comment. This sounds to me like it supports Zhamak Dehghani's principles, but please. >> Absolutely. So, whether we call it data mesh or not, I'm not getting into that conversation, (Dave chuckles) but data (audio breaking) (Tony laughing) everything that I'm hearing what Dave is saying, Carl, this is the year when data products will start to take off. I'm not saying they'll become mainstream. They may take a couple of years to become so, but this is data products, all this thing about vacation rentals and how is it doing, that data is coming from different sources. I'm packaging it into our data product. And to Carl's point, there's a whole operational metadata associated with it. The idea is for organizations to see things like developer productivity, how many releases am I doing of this? What data products are most popular? 
I'm actually right now in the process of formulating this concept that just like we had data catalogs, we are very soon going to be requiring a data products catalog. So, I can discover these data products. I'm not just creating data products left, right, and center. I need to know, do they already exist? What is the usage? If no one is using a data product, maybe I want to retire and save cost. But this is a data product. Now, there's an associated thing that is also getting debated quite a bit called data contracts. And a data contract to me is literally just formalization of all these aspects of a product. How do you use it? What is the SLA on it, what is the quality that I am prescribing? So, data product, in my opinion, shifts the conversation to the consumers or to the business people. Up to this point when, Dave, you're talking about data and all of data discovery curation is very data producer-centric. So, I think, we'll see a shift more into the consumer space. >> Yeah. Dave, can I just jump in there just very quickly there, which is that what Sanjeev has been saying there, this is really central to what Zhamak has been talking about. It's basically about making, one, data products are about the lifecycle management of data. Metadata is just elemental to that. And essentially, one of the things that she calls for is making data products discoverable. That's exactly what Sanjeev was talking about. >> By the way, did everyone just notice how Sanjeev just snuck in another prediction there? So, we've got- >> Yeah. (group laughing) >> But you- >> Can we also say that he snuck in, I think, the term that we'll remember today, which is metadata museums. >> Yeah, but- >> Yeah. >> And also comment to, Tony, to your last year's prediction, you're really talking about it's not something that you're going to buy from a vendor. >> No. >> It's very specific >> Mm-hmm. >> to an organization, their own data product. So, touche on that one. Okay, last prediction. 
Let's bring them up. Doug Henschen, BI analytics is headed to embedding. What does that mean? >> Well, we all know that conventional BI dashboarding reporting is really commoditized from a vendor perspective. It never enjoyed truly mainstream adoption. Always that 25% of employees are really using these things. I'm seeing rising interest in embedding concise analytics at the point of decision or better still, using analytics as triggers for automation and workflows, and not even necessitating human interaction with visualizations, for example, if we have confidence in the analytics. So, leading companies are pushing for next generation applications, part of this low-code, no-code movement we've seen. And they want to build that decision support right into the app. So, the analytic is right there. Leading enterprise apps vendors, Salesforce, SAP, Microsoft, Oracle, they're all building smart apps with the analytics predictions, even recommendations built into these applications. And I think, the progressive BI analytics vendors are supporting this idea of driving insight to action, not necessarily necessitating humans interacting with it if there's confidence. So, we want prediction, we want embedding, we want automation. This low-code, no-code development movement is very important to bringing the analytics to where people are doing their work. We got to move beyond the, what I call swivel chair integration, between where people do their work and going off to separate reports and dashboards, and having to interpret and analyze before you can go back and take action. >> And Dave Menninger, today, if you want analytics or you want to absorb what's happening in the business, you typically got to go ask an expert, and then wait. So, what are your thoughts on Doug's prediction? >> I'm in total agreement with Doug. I'm going to say that collectively... So, how did we get here? I'm going to say collectively as an industry, we made a mistake. 
We made BI and analytics separate from the operational systems. Now, okay, it wasn't really a mistake. We were limited by the technology available at the time. Decades ago, we had to separate these two systems, so that the analytics didn't impact the operations. You don't want the operations preventing you from being able to do a transaction. But we've gone beyond that now. We can bring these two systems and worlds together and organizations recognize that need to change. As Doug said, the majority of the workforce and the majority of organizations doesn't have access to analytics. That's wrong. (chuckles) We've got to change that. And one of the ways that's going to change is with embedded analytics. 2/3 of organizations recognize that embedded analytics are important and it even ranks higher in importance than AI and ML in those organizations. So, it's interesting. This is a really important topic to the organizations that are consuming these technologies. The good news is it works. Organizations that have embraced embedded analytics are more comfortable with self-service than those that have not, as opposed to turning somebody loose, in the wild with the data. They're given a guided path to the data. And the research shows that 65% of organizations that have adopted embedded analytics are comfortable with self-service compared with just 40% of organizations that are turning people loose in an ad hoc way with the data. So, totally behind Doug's predictions. >> Can I just break in with something here, a comment on what Dave said about what Doug said, which (laughs) is that I totally agree with what you said about embedded analytics. And at IDC, we made a prediction in our future intelligence, future of intelligence service three years ago that this was going to happen. And the thing that we're waiting for is for developers to build... You have to write the applications to work that way. It just doesn't happen automagically. 
Developers have to write applications that reference analytic data and apply it while they're running. And that could involve simple things like complex queries against the live data, which is through something that I've been calling analytic transaction processing. Or it could be through something more sophisticated that involves AI operations as Doug has been suggesting, where the result is enacted pretty much automatically unless the scores are too low and you need to have a human being look at it. So, I think that that is definitely something we've been watching for. I'm not sure how soon it will come, because it seems to take a long time for people to change their thinking. But I think, as Dave was saying, once they do and they apply these principles in their application development, the rewards are great. >> Yeah, this is very much, I would say, very consistent with what we were talking about, I was talking about before, about basically rethinking the modern data stack and going into more of an end-to-end solution. I think, that what we're talking about clearly here is operational analytics. There'll still be a need for your data scientists to go offline just in their data lakes to do all that very exploratory and that deep modeling. But clearly, it just makes sense to bring operational analytics into where people work into their workspace and further flatten that modern data stack. >> But with all this metadata and all this intelligence, we're talking about injecting AI into applications, it does seem like we're entering a new era of not only data, but new era of apps. Today, most applications are about filling forms out or codifying processes and require a human input. 
And it seems like there's enough data now and enough intelligence in the system that the system can actually pull data from, whether it's the transaction system, e-commerce, the supply chain, ERP, and actually do something with that data without human involvement, present it to humans. Do you guys see this as a new frontier? >> I think, that's certainly- >> Very much so, but it's going to take a while, as Carl said. You have to design it, you have to get the prediction into the system, you have to get the analytics at the point of decision has to be relevant to that decision point. >> And I also recall basically a lot of the ERP vendors back like 10 years ago, were promising that. And the fact that we're still looking at the promises shows just how difficult, how much of a challenge it is to get to what Doug's saying. >> One element that could be applied in this case is (indistinct) architecture. If applications are developed that are event-driven rather than following the script or sequence that some programmer or designer had preconceived, then you'll have much more flexible applications. You can inject decisions at various points using this technology much more easily. It's a completely different way of writing applications. And it actually involves a lot more data, which is why we should all like it. (laughs) But in the end (Tony laughing) it's more stable, it's easier to manage, easier to maintain, and it's actually more efficient, which is the result of an MIT study from about 10 years ago, and still, we are not seeing this come to fruition in most business applications. >> And do you think it's going to require a new type of data platform database? Today, data's all far-flung. We see that's all over the clouds and at the edge. Today, you cache- >> We need a super cloud. >> You cache that data, you're throwing into memory. I mentioned, MySQL HeatWave. 
There are other examples where it's a brute force approach, but maybe we need new ways of laying data out on disk and new database architectures, and just when we thought we had it all figured out. >> Well, without referring to disk, which to my mind, is almost like talking about cave painting. I think, that (Dave laughing) all the things that have been mentioned by all of us today are elements of what I'm talking about. In other words, the whole improvement of the data mesh, the improvement of metadata across the board and improvement of the ability to track data and judge its freshness the way we judge the freshness of a melon or something like that, to determine whether we can still use it. Is it still good? That kind of thing. Bringing together data from multiple sources dynamically and real-time requires all the things we've been talking about. All the predictions that we've talked about today add up to elements that can make this happen. >> Well, guys, it's always tremendous to get these wonderful minds together and get your insights, and I love how it shapes the outcome here of the predictions, and let's see how we did. We're going to leave it there. I want to thank Sanjeev, Tony, Carl, David, and Doug. Really appreciate the collaboration and thought that you guys put into these sessions. Really, thank you. >> Thank you. >> Thanks, Dave. >> Thank you for having us. >> Thanks. >> Thank you. >> All right, this is Dave Vellante for theCUBE, signing off for now. Follow these guys on social media. Look for coverage on siliconangle.com, theCUBE.net. Thank you for watching. (upbeat music)
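
Carl's point in the discussion above, that an analytic result can be enacted pretty much automatically unless the score is too low and a human being needs to look at it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Prediction` type, the `route` function, and the 0.8 threshold are hypothetical names and values chosen for the example, and none of them come from the panel or from any vendor's product.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the discussion does not give a specific value.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Prediction:
    """An analytic result embedded in an application (illustrative only)."""
    action: str   # what the application would do, e.g. "reorder_stock"
    score: float  # model confidence, assumed to lie in [0, 1]


def route(prediction: Prediction, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Decide whether to act automatically or hand off to a person.

    Returns "auto" when the score meets the threshold and
    "human_review" otherwise.
    """
    if prediction.score >= threshold:
        return "auto"
    return "human_review"


if __name__ == "__main__":
    for p in (Prediction("reorder_stock", 0.93), Prediction("reorder_stock", 0.41)):
        print(p.action, p.score, route(p))
```

In practice the threshold would likely differ per decision type, since a wrong automatic action costs more in some workflows than in others; a single constant is used here only to keep the sketch short.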

Published Date : Jan 11 2023

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Doug Henschen	PERSON	0.99+
Dave Menninger	PERSON	0.99+
Doug	PERSON	0.99+
Carl	PERSON	0.99+
Carl Olofson	PERSON	0.99+
Tony Baer	PERSON	0.99+
Tony	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Collibra	ORGANIZATION	0.99+
Curt Monash	PERSON	0.99+
Sanjeev Mohan	PERSON	0.99+
Christian Kleinerman	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Sanjeev	PERSON	0.99+
Constellation Research	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Ventana Research	ORGANIZATION	0.99+
2022	DATE	0.99+
Hazelcast	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Tony Bear	PERSON	0.99+
25%	QUANTITY	0.99+
2021	DATE	0.99+
last year	DATE	0.99+
65%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
today	DATE	0.99+
five-year	QUANTITY	0.99+
TigerGraph	ORGANIZATION	0.99+
Databricks	ORGANIZATION	0.99+
two services	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
David	PERSON	0.99+
RisingWave Labs	ORGANIZATION	0.99+

Deploying AI in the Enterprise

(orchestral music) >> Hi, I'm Peter Burris and welcome to another digital community event. As we do with all digital community events, we're gonna start off by having a series of conversations with real thought leaders about a topic that's pressing to today's enterprises as they try to achieve new classes of business outcomes with technology. At the end of that series of conversations, we're gonna go into a crowd chat and give you an opportunity to voice your opinions and ask your questions. So stay with us throughout. So, what are we going to be talking about today? We're going to be talking about the challenge that businesses face as they try to apply AI, ML, and new classes of analytics to their very challenging, very difficult, but nonetheless very value-producing outcomes associated with data. The challenge that all these businesses have is that often, you spend too much time in the infrastructure and not enough time solving the problem. And so what's required is new classes of technology and new classes of partnerships and business arrangements that allow for us to mask the underlying infrastructure complexity from data science practitioners, so that they can focus more time and attention on building out the outcomes that the business wants and a sustained business capability so that we can continue to do so. Once again, at the end of this series of conversations, stay with us, so that we can have that crowd chat and you can, again, ask your questions, provide your insights, and participate with the community to help all of us move faster in this crucial direction for better AI, better ML and better analytics. So, the first conversation we're going to have is with Anant Chintamaneni. Anant's the Vice President of Products at BlueData. Anant, welcome to theCUBE. >> Hi Peter, it's great to be here. I think the topic that you just outlined is a very fascinating and interesting one. 
Over the last 10 years, data and analytics have been used to create transformative experiences and drive a lot of business growth. You look at companies like Uber, AirBnB, and you know, Spotify, practically, every industry's being disrupted. And the reason why they're able to do this is because data is in their DNA; it's their key asset and they've leveraged it in every aspect of their product development to deliver amazing experiences and drive business growth. And the reason why they're able to do this is they've been able to leverage open-source technologies, data science techniques, and big data, fast data, all types of data to extract that business value and inject analytics into every part of their business process. Enterprises of all sizes want to take advantage of those same assets that the new digital companies are taking and drive digital transformation and innovation in their organizations. But there's a number of challenges. First and foremost, if you look at the enterprises where data was not necessarily in their DNA and to inject that into their DNA, it is a big challenge. The executives, the executive branch, definitely wants to understand where they want to apply AI, how to kind of identify which use cases to go after. There is some recognition coming in. They want faster time-to-value and they're willing to invest in that. >> And they want to focus more on the actual outcomes they seek as opposed to the technology selection that's required to achieve those outcomes. >> Absolutely. 
I think it's, you know, a boardroom mandate for them to drive new business outcomes, new business models, but I think there is still some level of misalignment between the executive branch and the data worker community which they're trying to upgrade with the new-age data scientists, the AI developer and then you have IT in the middle who has to basically bridge the gap and enable the digital transformation journey and provide the infrastructure, provide the capabilities. >> So we've got a situation where people readily acknowledge the potential of some of these new AI, ML, big data related technologies, but we've got a mismatch between the executives that are trying to do evidence-based management, drive new models, the IT organization who's struggling to deal with data-first technologies, and data scientists who are few and far between, and leave quickly if they don't get the tooling that they need. So, what's the way forward, that's the problem. How do we move forward? >> Yeah, so I think, you know, I think we have to double-click into some of the problems. So the data scientists, they want to build a tool chain that leverages the best-in-class, open source technologies to solve the problem at hand and they don't want, they want to be able to compile these tool chains, they want to be able to apply and create new algorithms and operationalize and do it in a very iterative cycle. It's a continuous development, continuous improvement process which is at odds with what IT can deliver, which is they have to deliver data that is dispersed all over the place to these data scientists. They need to be able to provide infrastructure, which today, they're not, there's an impedance mismatch. It takes them months, if not years, to be able to make those available, make that infrastructure available. And last but not the least, security and control. 
It's just fundamentally not the way they've worked where they can make data and new tool chains available very quickly to the data scientists. And the executives, it's all about faster time-to-value so there's a little bit of an expectation mismatch as well there and so those are some of the fundamental problems. There's also reproducibility, like, once you've created an analytics model, to be able to reproduce that at scale, to be then able to govern that and make sure that it's producing the right results is fundamentally a challenge. >> Auditability of that process. >> Absolutely, auditability. And, in general, being able to apply this sort of model for many different business problems so you can drive outcomes in different parts of your business. So there's a huge number of problems here. And so what I believe, and what we've seen with some of these larger companies, the new digital companies that are driving business value, they have invested in a unified platform where they've made the infrastructure invisible by leveraging cloud technologies or containers and essentially, made it such that the data scientists don't have to worry about the infrastructure, they can be a lot more agile, they can quickly create the tool chains that work for the specific business problem at hand, scale it up and down as needed, be able to access data where it lies, whether it's on-prem, whether it's in the cloud or whether it's a hybrid model. And so that's something that's required from a unified platform where you can do your rapid prototyping, you can do your development and ultimately, the business outcome and the value comes when you operationalize it and inject it into your business processes. So, I think fundamentally, this start, this kind of a unified platform, is critical. Which, I think, a lot of the new age companies have, but is missing with a lot of the enterprises. 
>> So, a big challenge for the enterprise over the next few years is to bring these three groups together; the business, data science world and infrastructure world or others to help with those problems and apply it successfully to some of the new business challenges that we have. >> Yeah, and I would add one last point is that we are on this continuous journey, as I mentioned, this is a world of open source technologies that are coming out from a lot of the large organizations out there. Whether it's your Googles and your Facebooks. And so there is an evolution in these technologies much like we've evolved from big data and data management to capture the data. The next sort of phase is around data exploitation with artificial intelligence and machine learning type techniques. And so, it's extremely important that this platform enables these organizations to future proof themselves. So as new technologies come in, they can leverage them >> Great point. >> for delivering exponential business value. >> Deliver value now, but show a path to delivering value in the future as all of these technologies and practices evolve. >> Absolutely. >> Excellent, all right, Anant Chintamaneni, thanks very much for giving us some insight into the nature of the problems that enterprises face and some of the way forward. We're gonna be right back, and we're gonna talk about how to actually do this in a second. (light techno music) >> Introducing, BlueData EPIC. The leading container-based software platform for distributed AI, machine learning, deep learning and analytics environments. Whether on-prem, in the cloud or in a hybrid model. Data scientists need to build models utilizing various stacks of AI, ML and DL applications and libraries. However, installing and validating these environments is time consuming and prone to errors. BlueData provides the ability to spin up these environments on demand. 
The BlueData EPIC app store includes best-of-breed, ready-to-run, Docker-based application images. Like TensorFlow and H2O Driverless AI. Teams can also add their own images, to provide the latest tools that data scientists prefer. And ensure compliance with enterprise standards. They can use the quick launch button, which provides pre-configured templates with the appropriate application image and resources. For example, they can instantly launch a new Sandbox environment using the template for TensorFlow with a Jupyter Notebook. Within just a few minutes, it'll be automatically configured with GPUs and easy access to their data. Users can launch experiments and make GPUs automatically available for analysis. In this case, the H2O environment was set up with one GPU. With BlueData EPIC, users can also deploy end points with the appropriate run time. And the inference run times can use CPUs or GPUs. With a container-based BlueData Platform, you can deploy fully configured distributed environments within a matter of minutes. Whether on-prem, in the public cloud, or in a hybrid architecture. BlueData was recently acquired by Hewlett Packard Enterprise. And now, HPE and BlueData are joining forces to help you on your AI journey. (light techno music) To learn more, visit www.BlueData.com >> And we're back. I'm Peter Burris and we're continuing to have this conversation about how businesses are turning experience with the problems of advanced analytics and the solutions that they seek into actual systems that deliver continuous ongoing value and achieve the business capabilities required to make possible these advanced outcomes associated with analytics, AI and ML. And to do that, we've got two great guests with us. We've got Kumar Sreekanti, who is the co-founder and CEO of BlueData. Kumar, welcome back to theCUBE. >> Thank you, it is nice to be here, back again. >> And Kumar, you're being joined by a customer. 
Ramesh Thyagarajan is the executive director of the Advisory Board Company which is part of Optum now. Ramesh, welcome to theCUBE. >> Great to be here. >> Alright, so Kumar let's start with you. I mentioned up front, this notion of turning technology and understanding into actual business capabilities to deliver outcomes. What has been BlueData's journey along, to make that happen? >> Yeah, it all started six years ago, Peter. It was a bold vision and a big idea and no pun intended on big data which was an emerging market then. And as everybody knows, the data was enormous and there was a lot of innovation around the periphery, but nobody was paying attention to how to make the big data consumable in enterprise. And I saw an enormous opportunity to make this data more consumable in the enterprise and to give a cloud-like experience with the agility and elasticity. So, our vision was to build a software infrastructure platform like VMware, specially focused on data-intensive distributed applications and this platform will allow enterprises to build cloud like experiences both on enterprise as well as on hybrid clouds. So that it paves the journey for their cloud experience. So I was very fortunate to put together a team and I found good partners like Intel. So that actually is the genesis for the BlueData. So, if you look back into the last six years, big data itself has gone through a lot of evolution and so the marketplace and the enterprises have gone from offline analytics to AI, ML based work loads that are actually giving them predictive and descriptive analytics. What BlueData has done is by making the infrastructure invisible, by making the tool set completely available as the tool set itself is evolving and in the process, we actually created so many game changing software technologies. For example, we are the first end-to-end containerized enterprise solution that gives you distributed applications. 
And we built a technology called DataTap, that provides compute and data separation so that you don't have to actually copy the data, which is a boon for enterprises. We also actually built multitenancy so those enterprises can run multiple work loads on the same data and Ramesh will tell you in a second here, in the healthcare enterprise, the multitenancy is such a very important element. And finally, we also actually contributed to many open source technologies including, we have a project called KubeDirector which is actually is our own Kubernetes and how to run stateful workloads on Kubernetes, and we are actually very happy to see that people like, customers like Ramesh are using the BlueData. >> Sounds like quite a journey and obviously you've intercepted companies like the Advisory Board Company. So Ramesh, a lot of enterprises have mastered or you know, gotten, understood how to create data lakes with Hadoop but then found that they still weren't able to connect to some of the outcomes that they saw. Is that the experience that you had? >> Right, to be precise, that is one of the kind of problems we have. It's not just the data lake that we need to be able to do the workflows or other things, but we also, being a traditional company, being in the business for a long time, we have a lot of data assets that are not part of this data lake. We're finding it hard to, how do we get the data, getting them and putting them in a data lake is a duplication of work. We were looking for some kind of solutions that will help us to gather the benefits of leaving the data alone but still be able to get into it. >> This is where (mumbles). 
>> This is where we were looking for things and then I was lucky and fortunate to run into Kumar and his crew in one of the Hadoop conferences and then they demonstrated the way it can be done so immediately hit upon, it's a big hit with us and then we went back and then did a POC, very quickly adopted the technology and that is also one of the benefits of adopting this technology is the level of containerization they are doing, it is helping me to address many needs. My data analyst, the data engineers and the data scientists so I'm able to serve all of them which otherwise wouldn't be possible for me with just this plain very (mumbles). >> So it sounds as though the partnership with BlueData has allowed you to focus on activities and problems and challenges above the technology so that you can actually start bringing data science, business objectives and infrastructure people together. Have I got that right? >> Absolutely. So BlueData is helping me to tie them all together and provide an excess value to my business. We being in the healthcare, the importance is we need to be able to look at the large data sets for a period of time in order to figure out how a patient's health journey is happening. That is very important so that we can figure out the ways and means in which we can lower the cost of health care and also provide insights to the physician, they can help get people better at health. >> So we're getting great outcomes today especially around, as you said that patient journey where all the constituents can get access to those insights without necessarily having to learn a whole bunch of new infrastructure stuff but presumably you need more. We're talking about a new world that you mentioned before upfront, talking about a new world, AI, ML, a lot of changes. 
A lot of our enterprise customers are telling us it's especially important that they find companies that not only deliver something today but demonstrate a commitment to sustain that value delivery process especially as the whole analytics world evolves. Are you experiencing that as well? >> Yes, we are experiencing and one of the great advantages of the platform, BlueData platform that gave me this ability to, add the new functionality, be it the TensorFlow, be it the H2O, be it the RStudio, anything that I needed, I call them, they give me the images that are plug-and-play, just put them and all the plumbing is practically transparent, nobody need to know how it is achieved. Now, in order to get to the next level of the predictive and prescriptive analytics, it is not just you having the data, you need to be able to have your curated data asset set process on top of a platform that will help you to get the data scientists to make you. One of the biggest challenges is that a data scientist is not able to get their hands on data. BlueData platform gives me the ability to do it and ensure all the security needs and all the compliances with the various other regulated compliances we need to make. >> Kumar, congratulations. >> Thank you. >> Sounds like you have a happy customer. >> Thank you. >> One of the challenges that every entrepreneur faces is how did you scale the business. So talk to us about where you are in the decisions that you made recently to achieve that. >> As an entrepreneur, when you start a company, odds are against you, right? You're always worried about it, right. You make so many sacrifices, yourself and your team and all that but the customer is the king. 
The most important thing for us is to find satisfied customers like Ramesh and so we were very happy and BlueData was very successful in finding that customer because I think as you pointed out, as Ramesh pointed out, we provide that clean solution for the customer but as you go through this journey as a co-founder and CEO, you always worry about how do you scale to the next level. So we had partnerships with many companies including HPE and we found when this opportunity came in front of me with myself and my board, we saw this opportunity of combining the forces of BlueData satisfied customers and innovative technology and the team with the HPE's brand name, their world-class service, their investment in R&D and they have a very long, large list of enterprise customers. We think putting these two things together provides that next journey in the BlueData's innovation and BlueData's customers. >> Excellent, so once again Kumar Sreekanti, co-founder and CEO of BlueData and Ramesh Thyagarajan who is the executive director of the Advisory Board Company and part of Optum, I want to thank both of you for being on theCUBE. >> Thank you. >> Thank you, great to be here. >> Now let's hear a little bit more about how this notion of bringing BlueData and HPE together is generating new classes of value that are making things happen today but are also gonna make things happen for customers in the future and to do that we've got Dave Vellante who's with SiliconANGLE Wikibon joined by Patrick Osbourne who's with HPE in our Marlborough studio so Dave over to you. >> Thanks Peter. We're here with Patrick Osbourne, the vice president and general manager of big data and analytics at Hewlett Packard Enterprise. Patrick, thanks for coming on. >> Thanks for having us. >> So we heard from Kumar, let's hear from you. Why did HPE purchase, acquire BlueData? >> So if you think about it from three angles. Platform, people and customers, right. 
Great platform, built for scale addressing a number of these new workloads and big data analytics and certainly AI, the people that they have are amazing, right, great engineering team, awesome customer success team, team of data scientists, right. So you know, all the folks that have some really, really great knowledge in this space so they're gonna be a great addition to HPE and also on the customer side, great logos, major fortune five customers in the financial services vertical, healthcare, pharma, manufacturing so a huge opportunity for us to scale that within the HPE context. >> Okay, so talk about how it fits into your strategy, specifically what are you gonna do with it? What are the priorities, can you share some roadmap? >> Yeah, so you take a look at HPE strategy. We talk about hybrid cloud and specifically edge to core to cloud and the common theme that runs through that is data, data-driven enterprises. So for us we see the BlueData EPIC platform as a way to you know, help our customers quickly deploy these new mode 2 applications that are fueling their digital transformation. So we have some great plans. We're gonna certainly invest in all the functions, right. So we're gonna do a force multiplier on not only on product engineering and product delivery but also go to market and customer success. We're gonna come out in our business day one with some really good reference architectures, with some of our partners like Cloudera, H2O, we've got some very scalable building block architectures to marry up the BlueData platform with our Apollo systems for those of you who have seen that in the market, we've got our Elastic platform for analytics for customers who run these workloads, now you'd be able to virtualize those in containers and we'll have you know, we're gonna be building out a big services practice in this area. So a lot of customers often talk to us about, we don't have the people to do this, right. 
So we're gonna bring those people to you as HPE through Pointnext, advisory services, implementation, ongoing help with customers. So it's going to be a really fantastic start. >> Apollo, as you mentioned Apollo. I think of Apollo sometimes as HPC high performance computing and we've had a lot of discussion about how that's sort of seeping in to mainstream, is that what you're seeing? >> Yeah absolutely, I mean we know that a lot of our customers have traditional workloads, you know, they're on the path to almost completely virtualizing those, right, but where a lot of the innovation is going on right now is in this mode two world, right. So your big data and analytics pipeline is getting longer, you're introducing new experiences on top of your product and that's fueling you know, essentially commercial HPC and now that folks are using techniques like AI and modeling inference to make those services more scalable, more automated, we're starting to bring in more of these platforms, these scalable architectures like Apollo. >> So it sounds like your roadmap has a lot of integration plans across the HPE portfolio. We certainly saw that with Nimble, but BlueData was working with a lot of different companies, its software, is the plan to remain open or is this an HPE thing? >> Yeah, we absolutely want to be open. So we know that we have lots of customers that choose, so HPE is all about hybrid cloud, right and that has a couple different implications. We want to talk about your choice of on-prem versus off-prem so BlueData has a great capability to run some of these workloads. It essentially allows you to do separation of compute and storage, right in the world of AI and analytics we can run it off-prem as well in the public cloud but then we also have choice for customers, you know, any customer's private cloud. So that means they want to run on other infrastructure besides HPE, we're gonna support that, we have existing customers that do that. 
We're also gonna provide infrastructure that marries the software and the hardware together with frameworks like InfoSight that we feel will be a you know, much better experience for the customers but we'll absolutely be open and absolutely have choice. >> All right, what about the business impact to take the customer perspective, what can they expect? >> So I think from a customer perspective, we're really just looking to accelerate deployment of AI in the enterprise, right and that has a lot of implications for us. We're gonna have very scalable infrastructure for them, we're gonna be really focused on this very dynamic AI and ML application ecosystems through partnerships and support within the BlueData platform. We want to provide a SaaS experience, right. So whether that's GPUs or accelerators as a service, analytics as a service, we really want to fuel innovation as a service. We want to empower those data scientists there, those are they're really hard to find you know, they're really hard to retain within your organization so we want to unlock all that capability and really just we want to focus on innovation of the customers. >> Yeah, and they spend a lot of time wrangling data so you're really going to simplify that with the cloud (mumbles). Patrick thank you, I appreciate it. >> Thank you very much. >> Alright Peter, back to you in Palo Alto. >> And welcome back, I'm Peter Burris and we've been talking a lot in the industry about how new tooling, new processes can achieve new classes of analytics, AI and ML outcomes within a business but if you don't get the people side of that right, you're not going to achieve the full range of benefits that you might get out of your investments. Now to talk a little bit about how important the data science practitioner is in this equation, we've got two great guests with us. Nanda Vijaydev is the chief data scientist of BlueData. Welcome to theCUBE. >> Thank you Peter, happy to be here. 
>> Ingrid Burton is the CMO and business leader at H2O.AI, Ingrid, welcome to the CUBE. >> Thank you so much for having us. >> So Nanda Vijaydev, let's start with you. Again, having a nice platform, very, very important but how does that turn into making the data science practitioner's life easier so they can deliver more business value. >> Yeah thank you, it's a great question. I think end of the day for a data scientist, what's most important is, did you understand the question that somebody asked you and what is expected of you when you deliver something and then you go about finding, what do I need for them, I need data, I need systems and you know, I need to work with people, the experts in the process to make sure that the hypothesis I'm doing is structured in a nice way where it is testable, it's modular and I have you know, a way for them to go back to show my results and keep doing this in an iterative manner. That's the biggest thing because the satisfaction for a data scientist is when you actually take this and make use of it, put it in production, right. To make this whole thing easier, we definitely need some way of bringing it all together. That's really where, especially compared to the traditional data science where everything was monolithic, it was one system, there was a very set way of doing things but now it is not so you know, with the growing types of data, with the growing types of computation algorithms that's available, there's a lot of opportunity and at the same time there is a lot of uncertainty. So it's really about putting that structure and it's really making sure you get the best of everything and still deliver the results, that is the focus that all data scientists strive for. >> And especially you wanted, the data scientists wants to operate in the world of uncertainty related to the business question and reducing uncertainty and not deal with the underlying some uncertainty associated with the infrastructure. 
>> Absolutely, absolutely you know, as a data scientist a lot of time used to spend in the past about where is the data, then the question was, what data do you want and give it to you because the data always came in a nice structured, row-column format, it had already lost a lot of context of what we had to look for. So it is really not about you know, getting the you know, it's really not about going back to systems that are pre-built or pre-processed, it's getting access to that real, raw data. It's getting access to the information as it came so you can actually make the best judgment of how to go forward with it. >> So you describe the world with business, technology and data science practitioners are working together but let's face it, there's an enormous amount of change in the industry and quite frankly, a deficit of expertise and I think that requires new types of partnerships, new types of collaboration, a real (mumbles) approach and Ingrid, I want to talk about what H2O.AI is doing as a partner of BlueData, HPE to ensure that you're complementing these skills in pursuit or in service to the customer's objectives. >> Absolutely, thank you for that. So as Nanda described, you know, data scientists want to get to answers and what we do at H2O.AI is we provide the algorithms, the platforms for data scientist to be successful. So when they want to try and solve a problem, they need to work with their business leaders, they need to work with IT and they actually don't want to do all the heavy lifting, they want to solve that problem. So what we do is we do automatic machine learning platforms, we do that with optimizing algorithms and doing all the kind of, a lot of the heavy lifting that novice data scientists need and help expert data scientists as well. 
I talk about it as algorithms to answers and actually solving business problems with predictions and that's what machine learning is really all about but really what we're seeing in the industry right now and BlueData is a great example of kind of taking away some of the hard stuff away from a data scientist and making them successful. So working with BlueData and HPE, making us together really solve the problems that businesses are looking for, it's really transformative and we've been through like the digital transformation journey, all of us have been through that. We are now what I would term an AI transformation of sorts and businesses are going to the next step. They had their data, they got their data, infrastructure is kind of seamlessly working together, the clusters and containerization that's very important. Now what we're trying to do is get to the answers and using automatic machine learning platforms is probably the best way forward. >> That's still hard stuff but we're trying to get rid of data science practitioners, focusing on hard stuff that doesn't directly deliver value. >> It doesn't deliver anything for them, right. They shouldn't have to worry about the infrastructure, they should worry about getting the answers to the business problems they've been asked to solve. >> So let's talk a little bit about some of the new business problems that are going to be able to be solved by these kinds of partnerships between BlueData and H2O.AI. Start, Nanda, what do you, what gets you excited when we think about the new types of business problems that customers are gonna be able to solve. >> Yeah, I think it is really you know, the question that comes to you is not filtered through someone else's lens, right. Someone is trying an optimization problem, someone is trying to do a new product discovery so all this is based on a combination of both data-driven and evidence-based, right. 
For us as a data scientist, what excites me is that I have the flexibility now that I can choose the best of the breed technologies. I should not be restricted to what is given to me by an IT organization or something like that but at the same time, in an organization, for things to work, there has to be some level of control. So it is really having this type of environments or having some platforms where some, there is a team that can work on the control aspect but as a data scientist, I don't have to worry about it. I have my flexibility of tools of choice that I can use. At the same time, when you talk about data, security is a big deal in companies and a lot of times data scientists don't get access to data because of the layers and layers of security that they have to go through, right. So the excitement of the opportunity for me is if someone else takes care of the problem you know, just tell me where is the source of data that I can go to, don't filter the data for me you know, don't already structure the data for me but just tell me it's an approved source, right then it gives me more flexibility to actually go and take that information and build. So the having those controls taken care of well before I get into the picture as a data scientist, it makes it extremely easy for us to focus on you know, to her point, focus on the problem, right, focus on accessing the best of the breed technology and you know, give back and have that interaction with the business users on an ongoing basis. >> So especially focus on, so speed to value so that you're not messing around with a bunch of underlying infrastructure, governance remaining in place so that you know what are the appropriate limits of using the data with security that is embedded within that entire model without removing fidelity out of the quality of data. >> Absolutely. >> Would you agree with those? 
>> I totally agree with all the points that she brought up and we have joint customers in the market today, they're solving very complex problems. We have customers in financial services, joint customers there. We have customers in healthcare that are really trying to solve today's business problems and these are everything from, how do I give new credit to somebody? How do I know what next product to give them? How do I know what customer recommendations can I make next? Why did that customer churn? How do I reach new people? How do I do drug discovery? How do I give a patient a better prescription? How do I pinpoint disease when I couldn't have seen it before? Now we have all that data that's available and it's very rich and data is a team sport. It takes data scientists, it takes business leaders and it takes IT to make it all work together and together the two companies are really working to solve problems that our customers are facing, working with our customers because they have the intellectual knowledge of what their problems are. We are providing the tools to help them solve those problems. >> Fantastic conversation about what is necessary to ensure that the data science practitioner remains at the center and is the ultimate test of whether or not these systems and these capabilities are working for business. Nanda Vijaydev, chief data scientist of BlueData, Ingrid Burton CMO and business leader, H2O.AI, thank you very much for being on theCUBE. >> Thank you. >> Thank you so much. >> So let's now spend some time talking about how ultimately, all of this comes together and what you're going to do as you participate in the crowd chat. To do that let me throw it back to Dave Vellante in our Marlborough studios. >> We're back with Patrick Osbourne, alright Patrick, let's wrap up here and summarize. We heard how you're gonna help data science teams, right. >> Yup, speed, agility, time to value. 
>> Alright and I know a bunch of folks at BlueData, the engineering team is very, very strong so you picked up a good asset there. >> Yeah, I mean, amazing technology, the founders have a long lineage of software development and adoption in the market so we're just gonna, we're gonna invest in them and let them loose. >> And then we heard their sort of better together story from you, you got a roadmap, you're making some investments here, as I heard. >> Yeah, I mean so if we're really focused on hybrid cloud and we want to have all these as a services experience, whether it's through GreenLake or providing innovation, AI, GPUs as a service is something that we're gonna be you know, continuing to provide our customers as we move along. >> Okay and then we heard the data science angle and the data science community and the partner angle, that's exciting. >> Yeah, I mean, I think it's two approaches as well too. We have data scientists, right. So we're gonna bring that capability to bear whether it's through the product experience or through a professional services organization and then number two, you know, this is a very dynamic ecosystem from an application standpoint. There's commercial applications, there's certainly open source and we're gonna bring a fully vetted, full stack experience for our customers that they can feel confident in this you know, it's a very dynamic space. >> Excellent, well thank you very much. >> Thank you. Alright, now it's your turn. Go into the crowd chat and start talking. Ask questions, we're gonna have polls, we've got experts in there so let's crowd chat.

Published Date : May 7 2019



Jim Franklin & Anant Chintamaneni | theCUBE NYC 2018

>> Live from New York. It's theCUBE. Covering theCUBE New York City, 2018. Brought to you by SiliconANGLE Media, and its ecosystem partners. >> I'm John Furrier with Peter Burris, our next two guests are Jim Franklin with Dell EMC, Director of Product Management, and Anant Chintamaneni, who is the Vice President of Products at BlueData. Welcome to theCUBE, good to see you. >> Thanks, John. >> Thank you. >> Thanks for coming on. >> I've been following BlueData since the founding. Great company, and the founders are great. Great teams, so thanks for coming on and sharing what's going on, I appreciate it. >> It's a pleasure, thanks for the opportunity. >> So Jim, talk about the Dell relationship with BlueData. What are you guys doing? You have the Dell-ready solutions. How is that related now, because you've seen this industry with us over the years morph. It's really now about, the set-up days are over, it's about proof points. >> That's right. >> AI and machine learning are driving the signal, which is saying, 'We need results'. There's action on the developer's side, there's action on the deployment, people want ROI, that's the main focus. >> That's right. That's right, and we've seen this journey happen from the new batch processing days, and we're seeing that customer base mature and come along, so the reason why we partnered with BlueData is, you have to have that software, you have to have the containers. They have to have the algorithms, and things like that, in order to make this real. So it's been a great partnership with BlueData, it's dated back actually a little farther back than some may realize, all the way to 2015, believe it or not, when we used to incorporate BlueData with Isilon. So it's been actually a pretty positive partnership. 
>> Now we've talked with you guys in the past, you guys were on the cutting edge, this was back when Docker containers were fashionable, but now containers have become so proliferated out there, it's not just Docker, containerization has been the wave. Now, Kubernetes on top of it is really bringing in the orchestration. This is really making the storage and the network so much more valuable with workloads, whether respective workloads, and AI is a part of that. How do you guys navigate those waters now? What's the BlueData update, how are you guys taking advantage of that big wave? >> I think, great observation, we embraced Docker containers, even before actually Docker was even formed as a company by that time, and Kubernetes was just getting launched, so we saw the value of Docker containers very early on, in terms of being able to obviously provide the agility, elasticity, but also, from a packaging of applications perspective, as we all know it's a very dynamic environment, and today, I think we are very happy to know that, with Kubernetes being a household name now, especially a tech company, so the way we're navigating this is, we have a turnkey product, which has containerization, and then now we are taking our value proposition of big data and AI and lifecycle management and bringing it to Kubernetes with an open source project that we launched called KubeDirector under our umbrella. So, we're all about bringing stateful applications like Hadoop, AI, ML to the community and to our customer base, which is some of the largest financial services and health care customers. >> So the container revolution has certainly gripped developers, and developers have always had a history of chasing after the next cool technology, and for good reason, it's not like just chasing after... 
Developers tend not to just chase after the shiny thing, they chased after the most productive thing, and they start using it, and they start learning about it, and they make themselves valuable, and they build more valuable applications as a result. But there's this interesting meshing of creators, makers, in the software world, between the development community and the data science community. How are data scientists, who you must be spending a fair amount of time with, starting to adopt containers, what are they looking at? Are they even aware of this, as you try to help these communities come together? >> We absolutely talk to the data scientists and they're the drivers of determining what applications they want to consume for the different use cases. But, at the end of the day, the person who has to deliver these applications, you know data scientists care about time to value, getting the environment quickly all prepared so they can access the right data sets. So, in many ways, most of our customers, many of them are unaware that there's actually containers under the hood. >> So this is the data scientists. >> The data scientists, but the actual administrators and the system administrators who are making these tools available, are using containers as a way to accelerate the way they package the software, which has a whole bunch of dependent libraries, and there's a lot of complexity out there. So they're simplifying all that and providing the environment as quickly as possible. >> And in so doing, making sure that whatever workloads are put together, can be scaled, can be combined differently and recombined differently, based on requirements of the data scientists. So the data scientist sees the tool... >> Yeah. 
>> The tool is manifest as, in concert with some of these new container related technologies, and then the whole CI/CD process supports the data scientist. >> The other thing to think about though, is that this also allows freedom of choice, and we were discussing off camera before, these developers want to pick out what they want to work with, they don't want to have to be locked in. So with containers, you can also speed that deployment but give them freedom to choose the tools that make them most productive. That'll make them much happier, and probably much more efficient. >> So there's a separation under the data science tools, and the developer tools, but they end up all supporting the same basic objective. So how does the infrastructure play in this, because the challenge of big data for the last five years as John and I both know, is that a lot of people conflated the outcome of data science, the outcome of big data, with the process of standing up clusters, and lining up Hadoop, and if they failed on the infrastructure, they said it was a failure overall. So how are you making the infrastructure really simple, and line up with this time to value? >> Well, the reality is, we all need food and water. IT still needs server and storage in order to work. But at the end of the day, the abstraction has to be there just like VMware in the early days, clouds, containers with BlueData is just another way to create a layer of abstraction. But this one is in the context of what the data scientist is trying to get done, and that's the key to why we partnered with BlueData and why we delivered big data as a service. >> So at that point, what's the update from Dell EMC and Dell, in particular, Analytics? Obviously you guys work with a lot of customers, have challenges, how are you solving those problems? What are those problems? 
Because we know there's some AI rumors, big Dell event coming up, there's rumors of a lot of AI involved, I'm speculating there's going to be probably a new kind of hardware device and software. What's the state of the analytics today? >> I think a lot of the customers we talked about, they were born in that batch processing, that Hadoop space we just talked about. I think they largely got that right, they've largely got that figured out, but now we're seeing proliferation of AI tools, proliferation of sandbox environments, and you're starting to see a little bit of silo behavior happening, so what we're trying to do is that IT shop is trying to dispatch those environments, dispatch with some speed, with some agility. They want to have it at the right economic model as well, so we're trying to strike a better balance, say 'Hey, I've invested in all this infrastructure already, I need to modernize it, and I also need to offer it up in a way that data scientists can consume it'. Oh, by the way, we're starting to see them start to hire more and more of these data scientists. Well, you don't want your data scientists, this very expensive, intelligent resource, sitting there doing data mining, data cleansing, ETL offloads, we want them actually doing modeling and analytics. So we find that a lot of times right now you're doing an operational change, the operational mindset, as you're starting to hire these very expensive people to do this very good work at the core of the data, but they need to get productive in the way that you hired them to be productive. >> So what is this Ready Solution, can you just explain what that is? Is it a program, is it a hardware, is it a solution? What is the Ready Solution?
>> Generally speaking, what we do as a division is we look for value workloads, just generally speaking, not necessarily in batch processing, or AI, or applications, and we try and create an environment that solves that customer challenge, typically they're very complex, SAP, Oracle Database, it's AI, my goodness. Very difficult. >> Variety of tools, using Hive, NoSQL, all this stuff's going on. >> Cassandra, you've got TensorFlow, so we try to fit together a set of knowledge experts, that's the key, the intellectual property of our engineers, and their deep knowledge expertise in a certain area. So for AI, we have a set of them back at the shop, they're in the lab, and this is what they do, and they're serving up these models, they're putting data through its paces, they're doing the work of a data scientist. They are data scientists. >> And so this is where BlueData comes in. You guys are part of this abstraction layer in the Ready Solutions offering? Is that how it works? >> Yeah, we are the software that enables the self-service experience, the multitenancy, that the consumers of the Ready Solution would want in terms of being able to onboard multiple different groups of users, lines of business, so you could have a user that wants to run a basic Spark cluster, Spark jobs, or you could have another user group that's using TensorFlow, or accelerated by a special type of CPU or GPU, and so you can have them all on the same infrastructure. >> One of the things Peter and I were talking about, Dave Vellante, who was here, he's at another event right now getting some content but, one of the things we observed was, we saw this a while ago so it's not new to us but certainly we're seeing the impact at this event. Hadoop World, which is now called Strata Data NYC, is that we hear words like Kubernetes, and multi-cloud, and Istio for the first time at this event. This is the impact of the Cloud.
The Cloud has essentially leveled the Hadoop World, certainly there's some Hadoop activity going on there, people have clusters, they're standing up infrastructure, analytical infrastructures that do analytics, obviously AI drives that, but now you have the Cloud being a power base, changing that analytics infrastructure. How has it impacted you guys? BlueData, how are you guys impacted by the Cloud? Tailwind for you guys? Helpful? Good? >> You described it well, it is a tailwind. This space is about the data, not where the data lives necessarily, but the robustness of the data. So whether that's in the Cloud, whether that's on premises, whether that's on premises in your own private Cloud, I think anywhere where there's data that can be gathered, modeled, and new insights being pulled out of, this is wonderful, so as we ingest data, whether it's born in the Cloud or born on premises, this is actually an accelerant to the solutions that we built together. >> As BlueData, we're all in on the Cloud, we support all three major Cloud providers, that was the big announcement that we made this week, we're generally available for AWS, GCP, and Azure, and, in particular, we start with customers who weren't born in the Cloud, so we're talking about some of the large financial services. >> We had Barclays UK here who we nominated, they won the Cloudera Data Impact Award, and what they're actually going through right now, is they started on prem, they have these really packaged certified technology stacks, whether they are Cloudera Hadoop, whether they are Anaconda for data science, and what they're trying to do right now is, they're obviously getting value from that on premises with BlueData, and now they want to leverage the Cloud. They want to be able to extend into the Cloud.
So, we as a company have made our product a hybrid Cloud-ready platform, so it can span on prem as well as multiple Clouds, and you have the ability to move the workloads from one to the other, depending on data gravity, SLA considerations. >> Compliance. >> I think it's one more thing, I want to test this with you guys, John, and that is, analytics is, I don't want to call it inert, or passive, but analytics has always been about getting the right data to human beings so they can make decisions, and now we're seeing, because of AI, the distinction that we draw between analytics and AI is, AI is about taking action on the data, it's about having a consequential action as a result of the data, so in many respects, NCL, Kubernetes, a lot of these not only do some interesting things for the infrastructure associated with big data, but they also facilitate the incorporation of new classes of applications that act on behalf of the brand. >> Here's the other thing I'll add to it, there's a time element here. It used to be we were passive, and it was in the past, and you're trying to project forward, that's no longer the case. You can do it right now. Exactly. >> In many respects, the history of the computing industry can be drawn in this way, you focused on the past, and then with spreadsheets in the 80s and personal computing, you focused on getting everybody to agree on the future, and now, it's about getting action to happen right now. >> At the moment it happens. >> And that's why there's so much action. We're past the set-up phase, and I think this is why we're hearing, seeing machine learning being so popular because it's like, people want to take action, there's a demand, that's a signal that it's time to show where the ROI is and get action done. Clearly we see that. >> We're capitalists, right? We're all trying to figure out how to make money in these spaces.
>> Certainly there's a lot of movement, and Cloud has proven that spinning up an instance concept has been a great thing, and certainly analytics. It's okay to have these workloads, but how do you tie it together? So, I want to ask you, because you guys have been involved in containers, Cloud has certainly been a tailwind, we agree with you 100 percent on that. What is the relevance of Kubernetes and Istio? You're starting to see these new trends. Kubernetes, Istio, Kubeflow. Higher level microservices with all kinds of stateful and stateless dynamics. I call it API 2.0, it's a whole other generation of abstractions that are going on, that are creating some goodness for people. What is the impact, in your opinion, of Kubernetes and this new revolution? >> I think the impact of Kubernetes is, I just gave a talk here yesterday, called Hadoop-la About Kubernetes. We're thinking very deeply about this. So I think Kubernetes, if you look at the genesis, it's all about stateless applications, and I think as new applications are being written folks are thinking about writing them in a manner that is decomposed, stateless, microservices, things like Kubeflow. When you write it like that, Kubernetes fits in very well, and you get all the benefits of auto-scaling, and the controller pattern, and ultimately Kubernetes is this finite state machine-type model where you describe what the state should be, and it will work and crank towards that state. I think it's a little bit harder for stateful applications, and I think that's where we believe that the Kubernetes community has to do a lot more work, and folks like BlueData are going to contribute to that work, which is, how do you bring stateful applications like Hadoop, where there's a lot of interdependent services, they're not necessarily microservices, they're actually almost close to monolithic applications.
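The "describe what the state should be, and it will crank towards that state" model mentioned above can be illustrated with a small sketch. This is a toy, assuming a hypothetical in-memory `Cluster` class and made-up `reconcile` and `converge` helpers; it does not use or imitate the real Kubernetes API, it only shows the shape of a desired-state control loop.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """Hypothetical toy cluster: maps an app name to its running replica count."""
    running: Dict[str, int] = field(default_factory=dict)


def reconcile(desired: Dict[str, int], cluster: Cluster) -> List[str]:
    """Make one pass that moves the observed state toward the desired state.

    Returns the actions taken; an empty list means the cluster has converged.
    """
    actions = []
    # Scale every declared app up or down to the declared replica count.
    for app, want in desired.items():
        have = cluster.running.get(app, 0)
        if have < want:
            actions.append(f"start {want - have} x {app}")
        elif have > want:
            actions.append(f"stop {have - want} x {app}")
        cluster.running[app] = want
    # Stop anything that is running but is no longer declared.
    for app in list(cluster.running):
        if app not in desired:
            actions.append(f"stop {cluster.running[app]} x {app}")
            del cluster.running[app]
    return actions


def converge(desired: Dict[str, int], cluster: Cluster, max_passes: int = 10) -> List[str]:
    """Repeat reconcile passes until a pass finds nothing left to do."""
    log = []
    for _ in range(max_passes):
        actions = reconcile(desired, cluster)
        if not actions:
            break
        log.extend(actions)
    return log


if __name__ == "__main__":
    cluster = Cluster(running={"web": 1, "old-job": 2})
    for action in converge({"web": 3, "spark": 2}, cluster):
        print(action)
```

The caller only states the target (three `web` replicas, two `spark` replicas) and never issues start or stop commands directly. That works well when replicas are interchangeable, which is the stateless case the speaker describes; a stateful system with interdependent services would need ordering and identity that this sketch deliberately leaves out.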
So I think new applications, new AI ML tooling that's going to come out, they're going to be very conscious of how they're running in a Cloud world today that folks weren't aware of seven or eight years ago, so it's really going to make a huge difference. And I think things like Istio are going to make a huge difference because you can start in the cloud and maybe now expand on to prem. So there's going to be some interesting dynamics. >> Without hopping management frameworks, absolutely. >> And this is really critical, you just nailed it. Stateful is where ML will shine, if you can then cross the chasm to on premises, where the workloads can have state sharing. >> Right. >> Scales beautifully. It's a whole other level. >> Right. You're going to move the data to the action, or the activity, or you're going to have to move the processing to the data, and you want to have, nonetheless, a common, seamless management development framework so that you have the choices about where you do those things. >> Absolutely. >> Great stuff. We can do a whole Cube segment just on that. We love talking about these new dynamics going on. We'll see you at CNCF KubeCon coming up in Seattle. Great to have you guys on. Thanks, and congratulations on the relationship between BlueData and Dell EMC and Ready Solutions. This is theCUBE, with the Ready Solutions here. New York City, talking about big data and the impact, the future of AI, all things stateful, stateless, Cloud and all. It's theCUBE bringing you all the action. Stay with us for more after this short break.

Published Date : Sep 13 2018


Joe Beda, Heptio | KubeCon 2017

>> Announcer: Live, from Austin, Texas, it's theCube, covering KubeCon and CloudNativeCon 2017. Brought to you by Red Hat, The Linux Foundation, and theCube's ecosystem partners. >> Welcome back everyone, live here. This is theCube's exclusive coverage, live in Austin, Texas for CloudNativeCon and KubeCon with The Linux Foundation. I'm John Furrier, the founder of SiliconANGLE Media, with my cohost Stu Miniman, and next to us Joe Beda, who's the co-founder and CTO of Heptio with Craig McLuckie, the famous startup that came out of the Google team, really one of the principal founders of Kubernetes with Craig and the team, Brendan Burns and the like. Great to have you on theCube, thanks for coming on. >> Thank you very much for having me, it's exciting. >> Good time, first time on theCube, glad to have you, we've been trying to get your perspective because obviously we're fans of Kubernetes, I just had Lou Tucker on, we were talking interclouding and some orchestration opportunity. You guys had that vision and it's really important to tell the story, at the beginning with Kubernetes. You guys were sitting around, having a little beer, free food at the Google cafeteria, what was it like? What happened? How did it all come together? >> All right well, I started at Google probably 10, 12 years ago, did a whole bunch of stuff but eventually landed doing cloud. Craig and I started up Google Compute Engine, VM as a service, and the odd thing to recognize is that nobody who had been at Google for a long time thought that there was anything to this VM stuff. Because Google had been on containers for so long, that was their mindset, Borg was the way that stuff was actually deployed, so my boss at the time, who's now at Cloudera, booted up a VM for the first time, and anybody in the outside world would be like hey, that's really cool and his response was like, well now what? You're sitting at a prompt, that's not super interesting, how do I run my app?
That's what everybody's been struggling with with Cloud, it's not how do I get a VM, it's how do I actually run my code? As Google got more and more serious about Cloud, every big company wants to dogfood their products. How do we make the experience that folks inside of Google have, developers inside of Google have, match the experience that Cloud customers have? The choice there was either we make everybody inside of Google start using VMs, which would have felt like a step backwards, or we teach the rest of the world about Borg. Now around the same time, Docker started getting a lot of attention and we were like hey, those guys are onto something, they really found a good way to make this technology accessible to users on a single node level, but our experience at Google really taught us that it's about clusters: how do you actually create this abstraction that a whole bunch of computers are one thing that you operate with? That was the thing that was going to be interesting and so out of that, we decided Kubernetes was going to be the thing or at least getting Borg out to the rest of the world, and we knew for it to be effective, it couldn't just be Google doing it alone, we had to do it in a way that would bring the rest of the industry with us. That's the motivation behind Kubernetes. It took us about another three months to convince all the folks at Google that this was a good idea, it was controversial, the open source projects at the time were things like, the biggest things would be like Chrome and Android. Those things were, the relationship with their community was very different from what we were aiming for with Kubernetes, they were much more consumer focused versus infrastructure focused. >> It was early too for Google to recognize the multi cloud world.
>> I think it wasn't so much multi cloud as much as developers have a really strong sense of where the lock in is, where the vendor lock in is, and we knew that if we wanted to win the hearts and minds of engineers and developers and folks that took this stuff seriously, as the underdog in the cloud world at the time, you had to really go out there and build something that was going to be widely applicable. Because you don't want to invest your time and energy into something that's super specialized to one cloud and I think the whole multi cloud thing, honestly I think it's engineers and developers and operations folks that had that sense from the get go, we were just reacting to that. >> Good instincts too. Kubernetes certainly working out today, state of the union, cause we're still only less than three years old as a community, seems like 20, but the momentum's been amazing, has been a lot of revision, a lot of people have their own versions of Kubernetes, yet there's a core, vanilla Kubernetes, but it's working. People have gotten around this. What is the big thing that has surprised you the most and where are you most excited right now, where Kubernetes is at? >> Okay surprise, there's 4100 people here at KubeCon, that's absolutely insane. I think we had this idea that it could be a thing, but I don't think that any of us imagined that within three years we'd be sitting here, doing this type of thing. That I think for me is the most surprising. It's a challenge to take these ideas that have been successful inside at Google and translate those to the rest of the world and it wasn't an easy or obvious thing, there were a lot of good ideas but figuring out how to get those out there, I think that really is due to the larger community. Folks like Clayton Coleman from Red Hat coming in early with a lot of that really brought a lot of that outside DNA necessary to bridge that gap.
Surprising that we got here, but really it took the community to make that happen. In terms of what I'm most excited about right now, with the announcement of EKS from Amazon, it definitely feels like we're moving into a new phase of Kubernetes where folks are being much more focused on what do you do with Kubernetes versus how do you get Kubernetes running. Kelsey tweeted it the other day, but I think we've been saying for a while, Kubernetes at its heart is a platform for building platforms, really we viewed it from the start as a toolbox and I think we're only now starting to see, what other things are people going to be building with that toolbox and I think that's going to be that larger ecosystem, is going to be much larger than Kubernetes itself. >> Joe, coming into this show, there were so many announcements around Kubernetes, there's like 42 certified different versions out there. I think you could help explain a little bit because there's the big cloud guys, you mentioned Clayton who we had earlier from Red Hat, there's all these companies, oh well, Kubernetes is just like it's a piece and it's in there. Your company is around Kubernetes, so what does this mean that Kubernetes is, I guess we'd say commoditized across there, I think it's a good thing for the industry, but what does it mean, why is there a need for Heptio and what do you guys see as your role in the ecosystem? >> There's a bunch of folks that are really concentrating on how do I get Kubernetes up and running and that's one thing, and I think that landscape is going to be changing and evolving over time. 
We're definitely happy to help folks be successful with Kubernetes, it's one of those things we're going to do, we're going to do an open source project, services, support and training with that, but when we look forward, I think a big part of it is, how do we bridge the gap to integrate Kubernetes into businesses, how do we start building those next layer tools on top of it and to some degree, it's a wild west. There's those 42 companies, everybody's trying to actually find something that's going to be interesting, start solving problems, but the thing that's really encouraging to me is that Kubernetes is the base and we're doing work, both Heptio and the community around conformance to make sure that we actually have a solid base that folks can build on top of. Then everybody's focused on how can we actually capture the attention of developers, how can we actually deliver value there and so that's a really great dynamic, when everybody's like I want to do something really great that people are going to get a lot out of, only good things are going to come from that. >> Yeah and I liked, there was a concern some people had, oh last week AWS is now all in, they've got EKS, but you had an announcement about the Heptio authenticator open source authentication, a little bit of a partnership with AWS it looked like. Maybe explain, it sounds like one of the things you're building on top of this. >> Yeah exactly. Like everybody else, we had heard all the rumors, hey is Amazon going to do a Kubernetes offering or not. In our mind, there were two ways. >> Didn't they have to Joe? >> Well that's what I thought last year, but who knows, I think Amazon doesn't have to do anything but when we first started Kubernetes, we reached out to the folks at Amazon including Deepak and we're like hey, you guys are welcome, come join us here and they were like yeah, yeah, we'll join you when the customers are asking for it. 
Well it turns out the customers were asking for it, so here they are and I think it's a great thing. I think it could've gone two ways, they could have built in a bunch of integrations into Kubernetes that were only available through EKS that really made EKS a more integrated, better Kubernetes than running open source Kubernetes on top of Amazon, or they could've worked with the community, with upstream to try and make Kubernetes run great on Amazon, better on Amazon as is but then run even better when you're running it with EKS and they actually have the management on top of it. I think they decided to go that second route which is much more community friendly. A couple weeks before the announcement, they reached out to us, said hey, we noticed you had this project, it looks really interesting, we need a way to bridge IAM to authenticate to Kubernetes and we like the approach that you're taking, can we work together to continue to develop this and that was the first signal to us that they wanted to really reach out and work with the community and so we're like hey, that sounds great, let's work together and get that stuff out there. It's still very early, I think EKS is GA next year, they set an aggressive goal for themselves, so I'm really looking forward to see where they take that and we're going to partner with them where it makes sense around things like authenticator. >> You mentioned we're going to a whole other level with Kubernetes and Amazon's announcement goes to the next level, you also mentioned you worked at Google Compute, Apple, all these other cool names with Google and you got Heptio, you're solving making interesting things happen with Kubernetes and you got a new class of developers coming in that have never heard of what a LocalDirector is. Infrastructure as code is happening, so you got the cloud game going on. I got to ask you, as Kubernetes starts to continue to take shape, a lot of people are trying to survive.
In these technical architecture decisions, it's almost a tech chess game, a which-side-of-history-will-you-be-on thing going on, and customers want more clarity. You have a lot of movement and customers want clarity. How do you see it continuing and what is the right path in your mind because it's looking good right now and commoditization as some say, I think is a good thing because value, there's value in interoperability, there's value in orchestration, there's value in a new class of web developer creating, solving problems with code, whether it's societal problems or other things, so there's a lot of big picture, holistic things happening and Kubernetes kind of strikes at the heart of that. What's the right path in your mind, what's the vision you think Kubernetes should go into? >> Well I think first of all, I think change happens in the industry both fast and slow. It feels like it's been three years since Kubernetes, since we open sourced Kubernetes, and it's come a huge way since then. That happened really fast. You look at Enterprise, you look at Enterprise adoption cycles, I believe last I heard the mainframe division was a growing profit center for IBM. This stuff doesn't go away so as we see things like containers and Kubernetes and serverless and cloud, as we see these things come on the scene, it doesn't necessarily replace stuff, it augments and it adds over time so we see the mix of where people invest shift. In that way, things become established quickly, but old things go away slowly. I don't think it's going to be as quick of a shift as maybe it might seem at first. Now in terms of where the opportunities are moving forward and where we see this developing, the thing that's exciting for me is as we have, and this is something early on, talking with Brendan, he got super excited about, is as we provide new abstractions, as we provide a new toolbox, how do people start creating systems and applications that take advantage of that.
I'll give you an example, distributed systems, pre-systems like Kubernetes were very difficult because not only did you have to do the thing that you wanted to do, you had to build all of this plumbing to actually get your things to talk to each other, the discovery, the security, all that stuff had to be created from scratch and those systems were rare and hard to manage and few and far between. Now with things like Kubernetes, there's a whole set of problems that you actually don't have to solve. The floor that you need, the floor is that much higher for building these systems so I think we're going to see a shift not just to cloud native, but I also think we're going to see a set of applications that are Kubernetes native. These are applications that assume that Kubernetes is the substrate that they're running on, and they take special advantage of it and I think we're going to see amazing things happen when we really democratize the plumbing for building distributed systems. >> And that's the key, make that frictionless so if people want to go Kubernetes native, they're taking advantage, that's cool. I want to get to, to take that to the next level, as the world of IOT comes down, you can almost look at the world now as all IOT. There's no on prem and there's no cloud. If you believe in this service mesh and pluggable architectures, you could argue that a data center is a network point, it's an attached device to a myriad things, so you're going to need policy, the light bulb has a processor in it, the wifi is everywhere, so in a way, this is all going to be a grid if you will, it's going to be kind of a mesh. This is the right direction don't you think, the more services that come online, you just want to connect to them. That's the nirvana right? Are we smoking the peace pipe here too much? >> I think there's a bunch of trends that we're seeing happen there.
I think with IOT, we see also a move towards edge computing, this idea of, we're going to see much more stuff happening in a more distributed manner. Whether that edge happens to be in your house or whether it's in a telecom cabinet or whether it's just mini data centers that are dropped in to parking lots here and there. That introduces a whole bunch of new problems in terms of how do you manage that stuff at scale. One of the things that I see is that we're seeing an interesting overlap between CDN providers and cloud providers, so you have Cloudflare introducing their Workers, where you can start running actual code in their CDN nodes and that's the culmination of CDN providers over time fighting with each other to drive more and more customization. On the other hand, you have Amazon taking Lambda, finding ways to actually use Lambda and push that out to the edge, even into devices that are doing local machine learning. There's this overlap between these two different worlds. Then also, as we move stuff closer out to the clouds, the political situations that people deal with become that much more complex. As you start running compute in all these different countries, all of a sudden you can't necessarily go to one provider to actually deal with all of that. We're moving from this world where, when you're centered around data which is the traditional cloud, when you want to put it all in one big pile with compute around the edges, that's kind of like the traditional data center. Going with a few large providers makes a ton of sense. As we move towards a much more distributed world, it becomes a more distributed problem both in terms of how do you manage the compute, but how do you manage the relationships and how do you actually understand what's happening across all that and I think Kubernetes can be a part of that puzzle for sure, but it's not the end of the answer, there's still a lot of problems to be solved there.
>> No but you get the first mile post. You can say hey, I can start orchestrating workloads and have endpoints that have services that talk to each other as the first step. >> Joe, one thing I wanted to ask you, what are the stumbling blocks? What do people need to look out for? Because most companies out there aren't Google. >> This morning at today's keynote and you can find it online, there's that cloud native road map that Dan was showing. That is an interesting thing that cuts both ways. On the one hand, it shows an enormous amount of innovation, it shows that we're seeing this explosion of interest in this world and it's really invigorating. That's from an entrepreneur's view and a technologist's view. If I'm a customer, that thing's kind of horrifying. I look at that and I say wow, I really have to understand all of this stuff to get ahead? I think the biggest stumbling block is really being able to make sense of all the noise out there. I think that noise is part and parcel of an active, innovative, chaotic ecosystem, but I think it's one of those things that makes it that much harder for enterprises and for more mainstream developers to adopt. Tim, we've been saying this for a while, for Kubernetes to be successful, we had to make it boring. That's Tim Hockin, I think maybe was the first one to say that, but we not only had to make Kubernetes boring, we had to make that entire stack boring, we had to make cloud native boring. That's when it will have succeeded. I don't know what this conference will look like when cloud native is boring, but it'll probably be very different. >> It'll certainly create some excitement, boring is reliable, boring is safe, boring is secure, boring is comfortable. Mark Zuckerberg once said move fast, break stuff, then he revised it to move fast and be 100% reliable. That's boring. >> Did he actually say that?
>> I don't know, he shifted his narrative because that was the maverick early days, when he started running at five nines it's like a whole nother ball game. >> Actually that matters. >> Joe, great to have you on theCube, thanks for sharing your awesome insight into the dynamics of the computing industry that's going cloud native, going KubeCon, and certainly Kubernetes that you helped put together with the team, it's certainly taken on a life of its own, last minute, take a minute to talk about Heptio, what you guys are working on, get the plug in. >> Yeah Heptio, we have services, support and training that we're offering to make customers successful with Kubernetes today and that's been invigorating, really getting out there and talking with folks, seeing the problems that they're hitting now versus where we want it to go. We're doing a bunch of work around open source projects, we have Heptio Ark, which is an open source backup and disaster recovery project, we have Sonobuoy, which is a diagnostic project for running the conformance tests and it underpins the Kubernetes conformance effort. We have ksonnet, which helps you configure applications, and then we also have Contour, which is an ingress controller building on top of Envoy, another CNCF project, and then into 2018, we're going to be offering more products and projects and services that really start targeting the special needs of larger and larger enterprises and that's where our focus is going to shift over time. >> You guys are certainly helping customers who are under pressure to add more services, including what Amazon's doing, more pronouncements, there are little announcements, some big some little, but still, the cadence of new things happening is fast at all times right now. >> I can't keep up either, nobody else can. >> We try. Two and a half hour keynote, it's ridiculous. 
Joe Beda here inside theCube, cofounder CTO of Heptio a hot startup, making Kubernetes interesting and exciting and reliable and boring. Not boring, we should say that. >> Oh boring's good. >> Infrastructure's good, it's theCube, bringing you all the live action from Austin, Texas, I'm John Furrier, Stu Miniman, KubeCon and Cloud Native Con, we'll be right back after this short break.

Published Date : Dec 7 2017

ENTITIES

Entity	Category	Confidence
Tim Hockin	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Joe Beda	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Tim	PERSON	0.99+
Clayton Coleman	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Red Hat	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
Mark Zuckerberg	PERSON	0.99+
Apple	ORGANIZATION	0.99+
100%	QUANTITY	0.99+
2018	DATE	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
last year	DATE	0.99+
42 companies	QUANTITY	0.99+
two ways	QUANTITY	0.99+
Dan	PERSON	0.99+
Lew Tucker	PERSON	0.99+
Joe	PERSON	0.99+
Stu Miniman	PERSON	0.99+
Craig	PERSON	0.99+
Kelsey	PERSON	0.99+
4100 people	QUANTITY	0.99+
one provider	QUANTITY	0.99+
Austin, Texas	LOCATION	0.99+
Android	TITLE	0.99+
next year	DATE	0.99+
first mile	QUANTITY	0.99+
first step	QUANTITY	0.99+
first time	QUANTITY	0.99+
Clayton	PERSON	0.99+
Two and a half hour	QUANTITY	0.98+
Cloud	TITLE	0.98+
Craig McLuckie	PERSON	0.98+
Brendon	PERSON	0.98+
both ways	QUANTITY	0.98+
KubeCon	EVENT	0.98+
Google Compute	ORGANIZATION	0.98+
first	QUANTITY	0.98+
three years	QUANTITY	0.98+
today	DATE	0.98+
Kubernetes	TITLE	0.98+
last week	DATE	0.98+
EKS	ORGANIZATION	0.98+
Chrome	TITLE	0.98+
Cloud Native Con	EVENT	0.97+
first one	QUANTITY	0.97+
both	QUANTITY	0.97+
Heptio	ORGANIZATION	0.97+
first signal	QUANTITY	0.97+
Cloud Native Con 2017	EVENT	0.96+
theCube	ORGANIZATION	0.96+
two different worlds	QUANTITY	0.96+
three months	QUANTITY	0.96+
Borg	ORGANIZATION	0.95+
Kubernetes	ORGANIZATION	0.95+

Greg Sands, Costanoa | Big Data NYC 2017

(electronic music) >> Host: Live from Midtown Manhattan it's The Cube! Covering Big Data New York City 2017, brought to you by Silicon Angle Media, and its Ecosystem sponsors. >> Okay, welcome back everyone. We are here live, The Cube in New York City for Big Data NYC, this is our fifth year, doing our own event, not with O'Reilly or Cloudera at Strata Data, which was Hadoop World, Strata Conference, Strata Hadoop, now called Strata Data, probably called Strata AI next year, we're The Cube every year, bringing you all the great data, and what's going on. Entrepreneurs, VCs, thought leaders, we interview them and bring that to you. I'm John Furrier with our next guest, Greg Sands, who's the managing director and founder of Costanoa Ventures in Palo Alto, started out as an entrepreneur himself, then single shingle out there, now he's a big VC firm on a third fund. >> On the third fund. >> Third fund. How much in that fund? >> 175 million dollar fund. >> So now you're a big firm now, congratulations, and really great to see your success. >> Thanks very much. I mean, we're still very much an early stage boutique focused on companies that change the way the world does business, but it is the case that we have a bigger team and a bigger fund, to go do the same thing. >> Well you've been great to work with, I've been following you, we've known each other for a while, watched you leave Sutter Hill and start Costanoa, but what's interesting is that, I can kind of joke and kid you, the VC inside joke about being a big firm, because I know you want to be small, and like to be small, help entrepreneurs, that's your thing. But it's really not a big firm, it's a few partners, but a lot of people helping companies, that's your ethos, that's what you're all about at your firm. Take a minute to just share with the folks the kinds of things you do and how you get involved in companies, you're hands on, you roll up your sleeves. 
You get out of the way at the right time, you help when you can, share your ethos. >> Yeah, absolutely so the way we think of it is, combining the craft of old school venture capital, with a modern operating team, and so since most founders these days are product-oriented, our job is to think like product people, not think like investors. So we think like product people, we do product level analysis, we do customer discovery, we do, we go ride along on sales calls when we're making investment decisions. And then we do the things that great venture capitalists have done for years, and so for example, at Alation, who I know has been on the show today, we were able to incubate them in our office for a year, I had many conversations with Satyen after he'd sold the first two or three customers. Okay, who's the next person we hire? Who isn't a founder? Who's going to go out and sell? What does that person look like? Do you go straight to a VP? Or do you hire an individual contributor? Do you hire someone for domain, or do you hire someone for talent? And that's the thing that we love doing. Now we've actually built out an operating team, so Martina Lauchengco as a marketing partner, and Jim Wilson as a sales partner, to really help turn that into a program, so that they can, we can take these founders who find product market fit, and say, how do we help you build the right sales process and marketing process, sales team and marketing team, for your company, your customer, your product? >> Well it's interesting since you mention old school venture capital, I'll get into some of the dynamics that are going on in Silicon Valley, but it's important to bring that forward, because now with cloud you can get to critical mass on the flywheel, on economics, you can see the visibility faster now. >> Greg: Absolutely. 
>> So the game of the old school venture capitalist is all the same, how do you get to cruising altitude, whatever metaphor you want to use, the key was getting there, and sometimes it took a couple of rounds, but now you can get these companies with five million, maybe $10 million funding, they can have unit economics visibility, scales insight, then the scale game comes in, so that seems to be the secret trick right now in venture is, don't overspend, keep the valuation in range and allows you to look for multiple exits potentially, or growth. Talk about that dynamic, because this is like, I call it the hourglass. You get through the hourglass, everyone's down here, but if you can sneak through and get the visibility on the economics, then you grow quickly. >> Absolutely. I mean, it's exactly right and I haven't heard the hourglass metaphor before but I like it. You want to basically get through the narrows of product market fit and the beginnings of scalable sales and marketing. You don't need to know all the answers, but you can do that in a capital-efficient way, building really solid foundations for future explosive growth, look, everybody loves fast growth and big markets, and being grown into. But the number of people who basically don't build those foundations and then say, go big or go home! And they take a ton of money, and they go spend all the money, doing things that just fundamentally don't work, and they blow themselves up. >> Well this is the hourglass problem. You have, once you get through that unit economics, then you have true scale, and value will increase. Everybody wins there so it's about getting through that, and you can get through it fast with good mentoring, but here's the challenge, the trap that entrepreneurs fall into. I call it the, I think I made it trap. And what happens is they think they're on the other side of the hourglass, but they still haven't even gone through the straight and narrow yet, and they don't know it. 
And what they do is they overfund and implode. That seems to be a major trap I see a lot of entrepreneurs fall into, while I got a 50 million pre on my B round, or some monster valuation, and they get way too much cash, and they're behaving as if they're scaling, and they haven't even nailed it yet. >> Well, I think that's right. So there's certainly, there are stages of product market fit, and so I think people hit that first stage, and they say, oh I've got it. And they try to explode out of the gates. And we, in fact I know one good example of somebody saying, hey, by the way, we're doing great in field sales, and our investors want us to go really fast, so we are going to go inside and we, my job was to hire 50 inside people, without ever having tried it. And so we always preach crawl, walk, run, right? Hire a couple, see how it works. Right, in a new channel. Or a new category, or an adjacent space, and I think that it's helpful to have an investor who has seen the whole picture to say, yeah, I know it looks like light at the end of the tunnel, but see how it's a relatively small dot? You still got to go a little farther, and then the other thing I say is, look, don't build your company to feed your venture capitalist's ego. Right? People do these big rounds at big valuations, and the big dog investors say, go, go, go! But, you're the CEO. Your job is to analyze the data. >> John: You can find during the day (laughs). >> And say, you know, given what we know, how fast should we go? Which investments should we make? And you've got to own that. And I think sometimes our job is just to be the pulling guard and clear space for the CEO to make good decisions. >> So you know I'm a big fan, so my bias is pretty much out there, love what you guys are doing. Tim Connors at PivotNorth is doing the same thing. 
Really adding value, getting down and dirty, but the question that entrepreneurs always ask me and talk privately, not about you, but in general, I don't want the VC to get in the way. I want them, I don't want them to preach to me, I don't want too many know-it-alls on my board, I want added value, but again, I don't want the preaching, I don't want them to get in the way, 'cause that's the fear. I'm not saying the same about VCs in general, but that's kind of the mentality of an entrepreneur. I want someone who's going to help me, be in the boat with me, but not be in my way. How do you address that concern to the founders who think, not think like that, but might have a fear. >> Well, by the way, I think it's a legitimate fear, and I think it actually is uncorrelated with added value, right? I think the idea that the board has certain responsibilities, and management has certain responsibilities, is incredibly important. And I think, I can speak for myself in saying, I'm quite conscious of not crossing that line, I think you talk. >> John: You got to build a return, that's the thing. >> But ultimately I would say to an entrepreneur, I'd just say, hey look, call references. And by the way, here are 30 names and phone numbers, and call any one of them, because I think that people who are, so a venture capital know-it-all, in the board room, telling CEOs what to do, destroys value. It's sand in the gears, and it's bad for the company. >> Absolutely, I agree 100% >> And some of my, when I talk about being a pulling guard for the CEO, that's what I'm talking about, which is blocking people who are destructive. >> And rolling the block for a touchdown, kind of use the metaphor. Adding value, that's the key, and that's why I wanted to get that out there because most guys don't get that nuance, and entrepreneurs, especially the younger ones. So it's good and important. 
Okay, let's talk about culture, obviously in Silicon Valley, I get, reading this morning in the Waymo guy, and they're writing it, that's the Silicon Valley, that's not crazy, there's a lot of great people in Silicon Valley, you're one of them. The culture's certainly an innovative culture, there's been some things in the press, inclusion and diversity, obviously is super important. This whole brogrammer thing that's been kind of kicked around. How are you dealing with all that? Because, you know, this is a cultural shift, but I think it's being made out more than it really is, but there are still core issues, your thoughts on the whole inclusion and diversity, and this whole brogrammer blowback thing. >> Yeah, well so I think, so first of all, really important issues, glad we're talking about them, and we all need to get better. And to me the question for us has been, what role do we play? And because I would say it is a relatively small subset of the tech industry, and the venture capital industry. At the same time the behavior that has become public is appalling. It's appalling and totally unacceptable, and so the question is, okay, how can we be a part of the stand-up part of the ecosystem, and some of which is calling things out when we see them. Though frankly we work with and hang out with people and we don't see them that often, and then part of which is, how do we find a couple of ways to contribute meaningfully? So for example this summer we ran what we called the Costanoa Access Fellowship, intentionally, trying to provide first opportunity in venture capital for people who traditionally haven't had as much access. We created an event in the spring called, Seat at the Table, really, particularly around women in the tech industry, and it went so well that we're running it in New York on October 19th, so if you're a woman in tech in New York, we'd love to see you then. 
>> And we're just trying to figure-- >> You're doing it in an authentic way though, you're not really doing it from a promotional standpoint. It's legit. >> Yeah, we're just trying to do, you know, pick off a couple of things that we can do, so that we can be on the side of the good guys. >> So I guess what you're saying is just have high integrity, and be part of the solution not part of the problem. >> That's right, and by the way, both of these initiatives were ones that were kicked off in late 2016, so it's not a reaction to things like Binary Capital, and the problems at Uber, both of which are appalling. >> Self-awareness is critical. Let's get back to the nuts and bolts of the real reason why I wanted you to come on, one was to find out how much money you have to spend for the entrepreneurs that are watching. Give us the update on the last fund, so you got a new fund that you just closed, the new fund, fund three. You have your other funds that are still out there, and some funds reserved, which, what's the number amount, how much are you writing checks for? Give the whole thesis. >> Absolutely. So we're an early stage investor, so we lead series A and seed financing companies that change the way the world does business, so up and down the stack, a business-facing software, data-driven applications. Machine-learning and AI driven applications. >> John: But the filter is changing the way the world works? >> The way, yes, but in particular the way the world does business. You can think of it as a business-facing software stack. We're not social media investors, it's not what we know, it's not what we're good at. And it includes security and management, and the data stack and-- >> Joe: Enterprise and emerging tech. >> That's right. And the-- >> And every crazy idea in between. >> That's right. 
(laughs) Absolutely, and so we'll participate in or lead seed financings, which most typically are half a million to maybe one and a quarter, and we'll lead series A financing, small ones might be two or two and a half million dollars, at the outer edge is probably a six million dollar check. We were just opening up in the next couple of days, a thousand square feet of incubation space at world headquarters at Palo Alto. >> John: Nice. >> So Alation, Acme Ticketing and ZenIQ are companies that we invested in. >> Joe: What location is this going to be at? >> That's, near the Philz in downtown Palo Alto, 164 staff, and those three companies are ones where we effectively invested at formation and incubated it for a year, we love doing that. >> At the hangout at Philz and get the data. And so you got some funds, what else do you have going on? 175 million? >> So one was a $100 million fund, and then fund two was $135 million fund, and the last investment of fund two which we announced about three weeks ago was called Roadster, so it's ecommerce enablement for the modern dealerships. So omnichannel and mobile-first infrastructure for auto dealers. We have already closed, and had the first board meeting for the first new investment of fund three, which isn't yet announced, but in the land of computer vision and deep learning, so a couple of the subjects that we care deeply about, and spend a lot of time thinking about. >> And the average check size for the A round again, seed and A, what do you know about the? The lowest and highest? >> The average for the seed is half a million to one and a quarter, and probably average for a series A is four or five. >> And you'll lead As. >> And we will lead As. >> Okay great. What's the coolest thing you're working on right now that gets you excited? It doesn't have to be a portfolio company, but the research you're doing, thing, tires you're kicking, in subjects, or domains? 
>> You know, so honestly, one of the great benefits of the venture capital business is that I get up and my neurons are firing right away every day. And I do think that for example, one of the things that we love is all of the adulant infrastructure and so we've got our friends at VictorOps that are in the middle of that space, and the thinking about how the modern programmer works, how everybody-- >> Joe: Is security on your radar? >> Security is very much on our radar, in fact, someone who you should have on your show is Ashish Gupta, and Casey Ellis, so Ashish just joined Bugcrowd as the CEO and Casey moves over to CTO, and the word Bug Bounty was just entered into the Oxford Dictionary for the first time last week, so that to me is the ultimate in category creation. So security and DevOps tools are among the things that we really like. >> And bounties will become the norm as more and more decentralized apps hit the scene. Are you doing anything on decentralized applications? I'm not saying Blockchain in particular, but Blockchain like apps, distributed computing you're well versed on. >> That's right, well we-- >> Blockchain will have an impact in your area. >> Blockchain will have an impact, we just spent an hour talking about it in the context of our off site at the Costanoa Lodge in Pescadero, it felt like it was important that we go there. And digging into it. I think actually the edge computing is actually more actionable for us right now, given the things that we're, given the things that we're interested in, and we're doing and they, it is just fascinating how compute centralizes and then decentralizes, centralizes and then decentralizes again, and I do think that there are a set of things that are fascinating about what you process at the edge, and what you send back to the core. 
>> As Pat Gelsinger said here on theCUBE, if you're not out in front of that next wave, you're driftwood, a lot of big waves coming in, you've seen a lot of waves, you were part of one that changed the world, Netscape browser, or the business plan for that first project manager, congratulations. Now you're at a whole nother generation. You ready? (laughs) >> Absolutely, I'm totally ready, I'm ready to go. >> Greg Sands here in The Cube in New York City, part of Big Data NYC, more live coverage with The Cube after this short break, thanks for watching. (electronic jingle) (inspiring electronic music)

Published Date : Sep 29 2017

ENTITIES

Entity	Category	Confidence
Greg Sands	PERSON	0.99+
Ashish Gupta	PERSON	0.99+
John	PERSON	0.99+
two	QUANTITY	0.99+
Tim Connors	PERSON	0.99+
John Furrier	PERSON	0.99+
Costa Nova	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
Joe	PERSON	0.99+
October 19th	DATE	0.99+
Costanova	ORGANIZATION	0.99+
Silicon Angle Media	ORGANIZATION	0.99+
$10 million	QUANTITY	0.99+
New York	LOCATION	0.99+
$100 million	QUANTITY	0.99+
five million	QUANTITY	0.99+
Casey Ellis	PERSON	0.99+
$135 million	QUANTITY	0.99+
Zen IQ	ORGANIZATION	0.99+
Omnichannel	ORGANIZATION	0.99+
50 million	QUANTITY	0.99+
three companies	QUANTITY	0.99+
Pescadero	LOCATION	0.99+
Greg	PERSON	0.99+
New York City	LOCATION	0.99+
100%	QUANTITY	0.99+
50	QUANTITY	0.99+
Silicon valley	LOCATION	0.99+
Jim Wilson	PERSON	0.99+
O'Reilly	ORGANIZATION	0.99+
Casey	PERSON	0.99+
Alation	ORGANIZATION	0.99+
half a million	QUANTITY	0.99+
30 names	QUANTITY	0.99+
Silicon Valley	LOCATION	0.99+
175 million	QUANTITY	0.99+
first	QUANTITY	0.99+
Victor Ops	ORGANIZATION	0.99+
Pat Gelsinger	PERSON	0.99+
both	QUANTITY	0.99+
last week	DATE	0.99+
four	QUANTITY	0.99+
three customers	QUANTITY	0.99+
late 2016	DATE	0.99+
fifth year	QUANTITY	0.99+
Cloudera	ORGANIZATION	0.99+
Acme Ticketing	ORGANIZATION	0.98+
164 staff	QUANTITY	0.98+
NYC	LOCATION	0.98+
five	QUANTITY	0.98+
Oxford Dictionary	TITLE	0.98+
Midtown Manhattan	LOCATION	0.98+
Alatian	ORGANIZATION	0.98+
175 million dollar	QUANTITY	0.98+
next year	DATE	0.98+
today	DATE	0.97+
first time	QUANTITY	0.97+
third fund	QUANTITY	0.97+
first board	QUANTITY	0.97+
Costanoa	PERSON	0.97+
a year	QUANTITY	0.97+
six	QUANTITY	0.97+
one	QUANTITY	0.97+
one and a quarter	QUANTITY	0.96+
Strata Conference	EVENT	0.96+
The Cube	TITLE	0.96+
Strata AI	EVENT	0.96+
million dollar	QUANTITY	0.96+
2017	EVENT	0.95+
first project	QUANTITY	0.95+
two and a half million dollars	QUANTITY	0.95+
Hadoop World	EVENT	0.94+
Sathien	PERSON	0.93+
single shingle	QUANTITY	0.93+
first two	QUANTITY	0.93+
an hour	QUANTITY	0.92+
this summer	DATE	0.92+
first stage	QUANTITY	0.92+
Bugcrowd	ORGANIZATION	0.91+

Dr. Amr Awadallah - Interview 1 - Hadoop World 2011 - theCUBE

okay we're back live in new york city for hadoop world 2011 i'm john furrier the founder of siliconangle.com and we have a special walk-in guest amr awadallah the vp of engineering and co-founder of Cloudera who's going to be on at two thirty eastern time on the cube to go more in depth but since we saw him in the hallway we had a quick spot wanted to grab him in here this is the cube our flagship telecast where we go out to the event and talk to the smartest people and i'm here with my co-host i'm dave vellante of wikibon.org welcome back you're a longtime cube alum so appreciate you coming back on and doing a quick drive by here thanks for the nice welcome so you know we go talk to the smart people in the room you're one of the smartest guys that I know and we've been friends for years and it was your my tweet heard around the world by you to find space and we've been sharing the office space at Cloudera a year didn't have you I meant to have you we're going to be trying to find space because you're expanding so fast we have to get in a new home sorry about that but I wanted to really thank you personally here live you've enabled SiliconANGLE Wikibon too we figured it out early because of you I mean we had our nose sniffing around the big data area before it was called big data but when we met and talked we've been tracking the social web and really it's exploded in an amazing way and I'm just really thankful because I've had a front-row seat in the trenches with you guys and and it's been amazing so I want to thank you you're welcome and that's great to have you on board and so so you you've been evangelizing in the trenches at Yahoo you were an eir at accel partners announcing the hundred million dollar fund which is all great news today but you've been the real spark at cloudera is one of the 10 others one of them but I know one of the main sparks a co-founder a lots of ginger cuz I'm Rebecca and my co-founder from facebook I mean we both we said this before like we saw 
the future like at our companies we saw the future where everybody is gonna go next and now Jeff's gonna be on as well he's now taking this whole data science thing yep building out a team you gotta drill that down with him what do you what do you think about all this I mean like right now how do you feel personally emotionally and looking at the marketplace share with us your yeah I'm very emotional today actually yeah lots of the good news is you heard about the funding news yes million dollars for startups but no but the 14 oh yeah yeah it is more most actually the news was supposed to come out today came out a bit earlier yesterday but yeah I'm very very emotional because of that it's a very big testament from very big name investors of how well we were doing and recognition of how big this wave really is also the hundred million fund from accel that's also a huge testament and lots of hopefully lots of new innovations or startups will come out of that so I'm very emotional about that but also overwhelmed by the by the the size of this event and how many people are really gravitating towards the technology which shows how much work we still have to do going forward it was very very August of a great a bit scared a bit scared mike olson is a great CEO on stage they're great guy we love Mike just really he's geeky and he's pragmatic Jerry strategist and you got Kirk who's the operator yeah but he showed a slide up at his keynote that showed the evolution of Hadoop yes the core Hadoop and then he showed ya year-by-year and now we got that columns extending and you got new new components coming out take us through that that progression just go back a few years in and walk us through why is this going on so fast and what are the what's the what's the community doing and just yeah and what happened in 2008 it doesn't need was one mr. 
yeah when we when we started so I mean first 2008 when we started and nobody was believing us back then that hey this thing is going to be big like we had the belief because we saw it happen firsthand but many folks were dismissive and no no no this this big data thing is a fad and nobody will care about it and lo and behold today it's obviously proving not to be the case in terms of the maturity of the of the platform you're absolutely right i mean the slide that Mike showed showed that only thirty percent of the contributions happening today are in the Hadoop core layer and and and and the overall kind of vision there is very similar to the operating system right except what this really is it's a data operating system right it's how to operate large amounts of data in a big data center so sorry it's like an operating system for many machines as opposed to Linux which is an operating system for a single machine right so Hadoop when it came out Hadoop is only the kernel it's only that inner layer which if you look at any operating system like windows or linux and so on the core functionality is two things storing files and running applications on top of these files that's what windows does that's what linux does that's what hadoop does at the heart but then to really get an operating system to work you need many ancillary components around it that really make it functional you need libraries you need applications you need integration IO devices etc etc and that's really what's happening in the hadoop world so it started with the core OS layer which is Hadoop HDFS for storage MapReduce for computation but then now all of these other things are showing up around that core kernel to really make it a fully functional extensible data operating system I wish we had a little replay button but let's just put the pause on that because this is kind of an important point and folks out there there's a lot of different and a lot of people and metaphors are used in this business so it's 
the Linux, I want to be, it's just like Red Hat, right? Yeah, we kind of use that term. The business model... talk a little bit about that. You just mentioned, you know, not like Linux. Just unpack that a little bit deeper for us. What's the difference? You mentioned Linux. Can you replay what you just said? That was really...

So I was actually talking about the similarity, and then I can talk about the difference. The similarity is that the heart of Hadoop is a system for storing files, which is HDFS, and a system for running applications on top of these files, which is MapReduce. The heart of Linux is the same thing: a system for storing files, which is ext4, and a system for scheduling applications on top of these files. That's the same heart as Windows, and so on. So that's the similarity. The difference is that Linux is made to run on a single node, right? And Windows is made to run on a single node. Hadoop is really made to run on many, many nodes. So Hadoop cares about taking a rack of servers or a data center of servers and having them look like one big, massive mainframe built out of commodity hardware that can store arbitrary amounts of data and run any type of...

Hence the new components, like the Hives of the world.

So now these new components are coming up. Hive, for example: Hive makes it easier to write queries for Hadoop. It's a SQL language for writing queries on top of Hadoop, so you don't have to go and write them in MapReduce, which we call the assembly language of Hadoop. If you write it in MapReduce, you will get the most flexibility, you will get the most performance, but only if you know what you're doing. It's very similar to machine code: if you write machine code or assembly, you will be able to do anything, but you can also shoot yourself in the foot. Is that right? The same thing with MapReduce, right. When you use Hive, Hive abstracts that out for you, so you write SQL and then Hive
takes care of doing all of the plumbing work to get that compiled to MapReduce for you. So that's Hive. HBase, for example, is a very nice system that augments Hadoop, makes it low latency, and makes it support update, insert, and delete transactions, which HDFS does not support out of the box.

So it's more like a database?

It's more like MySQL, yeah. The analogy of MySQL to Linux is very similar to HBase to HDFS.

And what's your take, you know, with your founder's hat on now, on the business model similarities and differences with Red Hat?

Yes, so actually they are different. The similarity stops at open source. We are both open source, in the sense that the core system is open source, it's available out there, you can look at the source code, and so on. The difference is that Red Hat actually has a license on their bits. So there's the source code, and then there's the bits. When Red Hat compiles the source code into bits, you cannot deploy these bits without having a Red Hat license. With us it's very different. We have the source code, which is Apache, it's all in Apache. We compile the source code into a bunch of bits, which is our distribution, called CDH. These bits are one hundred percent open source, one hundred percent free. You can deploy them, use them, you don't have to pay us anything. The only reason why you would come back and pay us is for Cloudera Enterprise, which is really for when you go operational, when you become operational and mission-critical. Cloudera Enterprise gives you two things. First, it gives you a proprietary management suite that we built, and it's very unique to us. Nobody in the market has anything close to what we have right now. It makes it easier for you to deploy, configure, monitor, provision, do capacity planning, security management, etc. for Hadoop. Nobody else has anything close to what we have right now.

So that management suite is unique to Cloudera and not part of Apache open source?

Yes, it's not part of
the vet's office. You only get that as a subscriber to Cloudera. We do have a free version of that that's available for download, and it can run up to 15 hours, just for you to get up and running quickly. And it's really very simple, it has a very simple installer. You should be able to go fire off that software and say, install Hadoop, these are my servers, and it would take care of everything else for you. It's like having these installers, you know, when Windows came out in the beginning and you had this nice progress bar and you could install applications very easily. Imagine that now for a cluster of servers. That's really what this is. The other reason why people subscribe to Cloudera Enterprise, in addition to getting this management suite, is getting our support services. And support is necessary for any software, even if it's free. Even for hardware. Think if I give you a free airplane right now. Just, here you go, here is an airplane. You can run this airplane, make money from passengers. You still need somebody to maintain that plane for you, right? You can still go hire your mechanics, maybe we'd have a tweetup bummer, you can hire your own mechanics to maintain that airplane. But we tell you, if you subscribe with us as the mechanics for your airplane, the support you will get with us will be way better than anything else, and the economics of it also would be way better than having your own staff doing the maintenance for that airplane.

Okay, final question, and we've got one minute, because we slid you in real quick. For folks watching, Amr is going to come back at two-thirty, so come back, it's Eastern time, and we'll have a more in-depth conversation. But just share with the folks watching your view of what's going on in Apache. You know, there's all this kind of weird FUD being thrown around, that Cloudera is not this and that, and you guys are clearly the leader. We talked with Kirk about that, we don't need to
go into that. But just share with us what's going on. What's the real deal happening with Apache, the code? And you have a unique offering...

I mean, the real deal, and I advise people to go look at this blog post that our CEO, Mike Olson, wrote, called "The Community Effect." The real deal is that there is a very big, healthy community developing the source code for Hadoop: the core system, which is HDFS and MapReduce, and all the components around that core system. We at Cloudera employ a very large engineering organization. In fact, our engineering organization is bigger than many of these other companies in the space. That's just our engineering. If you look at the whole company itself, it's much, much bigger than any of these other players. So we do a lot of contributions, both to the core system and to the projects around it. However, we are part of the community, and we're definitely doing this with the community. It's not just a Cloudera thing for the core platform. So that's the real deal.

All right. So here we are, Amr, the co-founder. Congratulations, great funding, hundred mil from Accel Partners, who invested in you guys. Congratulations. You're part of the community, we all know that, just kind of clarifying that for the record. And you have a unique differentiator: the management suite and the enterprise stuff.

And, say, experience.

Experience, yeah.

I think a huge differentiation we have is that we have been doing this for three years ahead of everybody else. We have the experience across all the industries that matter. So when you come to us, we know how to do this in the finance industry, in the retail industry, in the health industry, and in government. So that's something also that...

So, just for the audience out there, Amr is coming back at two-thirty. We're gonna go deeper. Today's the highly decorated or a general because there is there a leak oh and thanks for the small extra info, he's in the uniform to the Cloudera logo. Yes sir, affecting some of
those for us to someday. Great. So we'll see you again. Love Amr, great, great friend.
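
As a rough illustration of the point made in the interview about Hive versus hand-written MapReduce, here is a minimal single-process Python sketch of a word count written in map, shuffle, and reduce steps. It is not Hadoop's API and does not run on a cluster; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and the SQL in the comment only indicates the kind of query that would replace this plumbing, with an assumed table and column name.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# The same job expressed as a SQL-style query (illustrative only;
# the table "words" and column "word" are assumptions):
#   SELECT word, COUNT(*) FROM words GROUP BY word;

if __name__ == "__main__":
    sample = ["Hadoop stores files", "Hadoop runs applications on files"]
    print(word_count(sample))
```

The hand-written version gives full control over each step, which is the flexibility described in the interview; the one-line query leaves the grouping and counting to the system.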

Published Date : May 1 2012

SUMMARY :

clarifying that for the record and you

ENTITIES

Entity	Category	Confidence
Rebecca	PERSON	0.99+
Mike	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
2008	DATE	0.99+
Excel	TITLE	0.99+
Hadoop	TITLE	0.99+
three years	QUANTITY	0.99+
linux	TITLE	0.99+
one-minute	QUANTITY	0.99+
windows	TITLE	0.99+
Michaels	PERSON	0.99+
Jeff	PERSON	0.99+
john furrier	PERSON	0.99+
2011	DATE	0.99+
Linux	TITLE	0.99+
Kirk	PERSON	0.99+
today	DATE	0.99+
thirty percent	QUANTITY	0.99+
Yahoo	ORGANIZATION	0.99+
hbase	TITLE	0.98+
single note	QUANTITY	0.98+
two things	QUANTITY	0.97+
single note	QUANTITY	0.97+
two bits	QUANTITY	0.97+
dave vellante	PERSON	0.97+
HDFS	TITLE	0.97+
10	QUANTITY	0.97+
first	QUANTITY	0.97+
Jerry	PERSON	0.97+
facebook	ORGANIZATION	0.97+
hundred L	QUANTITY	0.96+
both	QUANTITY	0.96+
million dollars	QUANTITY	0.96+
one hundred percent	QUANTITY	0.95+
Red Hat	TITLE	0.95+
August	DATE	0.95+
MapReduce	TITLE	0.95+
Amr Awadallah	PERSON	0.95+
tomorrow	DATE	0.94+
hundred million	QUANTITY	0.94+
Dr.	PERSON	0.94+
hundred million dollar	QUANTITY	0.94+
up to 15 hours	QUANTITY	0.93+
hadoop	TITLE	0.93+
Windows	TITLE	0.93+
single machine	QUANTITY	0.92+
HBase	TITLE	0.92+
new york city	LOCATION	0.9+
years	QUANTITY	0.9+
a year	QUANTITY	0.9+
Apache	ORGANIZATION	0.9+
one	QUANTITY	0.89+
a lot of people	QUANTITY	0.87+
red hat	TITLE	0.85+
Hadoop World	TITLE	0.84+
SiliconANGLE	ORGANIZATION	0.82+
two-thirty	DATE	0.8+
Fudd	PERSON	0.77+
Michaelson road	PERSON	0.74+

Search Results for Cloud Era: