Ali Ghodsi, Databricks | Cube Conversation Partner Exclusive

(outro music) >> Hey, I'm John Furrier, here with an exclusive interview with Ali Ghodsi, who's the CEO of Databricks. Ali, great to see you. Preview for reinvent. We're going to launch this story, exclusive Databricks material on the notes, after the keynotes prior to the keynotes and after the keynotes that reinvent. So great to see you. You know, you've been a partner of AWS for a very, very long time. I think five years ago, I think I first interviewed you, you were one of the first to publicly declare that this was a place to build a company on and not just post an application, but refactor capabilities to create, essentially a platform in the cloud, on the cloud. Not just an ISV; Independent Software Vendor, kind of an old term, we're talking about real platform like capability to change the game. Can you talk about your experience as an AWS partner? >> Yeah, look, so we started in 2013. I swiped my personal credit card on AWS and some of my co-founders did the same. And we started building. And we were excited because we just thought this is a much better way to launch a company because you can just much faster get time to market and launch your thing and you can get the end users much quicker access to the thing you're building. So we didn't really talk to anyone at AWS, we just swiped a credit card. And eventually they told us, "Hey, do you want to buy extra support?" "You're asking a lot of advanced questions from us." "Maybe you want to buy our advanced support." And we said, no, no, no, no. We're very advanced ourselves, we know what we're doing. We're not going to buy any advanced support. So, you know, we just built this, you know, startup from nothing on AWS without even talking to anyone there. So at some point, I think around 2017, they suddenly saw this company with maybe a hundred million ARR pop up on their radar and it's driving massive amounts of compute, massive amounts of data. 
And it took a little bit in the beginning just for us to get to know each other because as I said, it's like we were not on their radar and we weren't really looking, we were just doing our thing. And then over the years the partnership has deepened and deepened and deepened and then with, you know, Andy (indistinct) really leaning into the partnership, he mentioned us at Reinvent. And then we sort of figured out a way to really integrate the two services, the Databricks platform with AWS. And today it's an amazing partnership. You know, we're directly connected with the general managers for the services. We're connected at the CEO level, you know, the sellers get compensated for pushing Databricks, we have multiple offerings on their marketplace. We have a native offering on AWS. You know, we're prominently always sort of marketed and you know, we're aligned also vision wise in what we're trying to do. So yeah, we've come a very, very long way. >> Do you consider yourself a SaaS app or an ISV or do you see yourself more of a platform company because you have customers. How would you categorize your category as a company? >> Well, it's a data platform, right? And actually the strategy of Databricks is take what's otherwise five, six services in the industry or five, six different startups, but do them as part of one data platform that's integrated. So in one word, the strategy of Databricks is "unification." We call it the data lake house. But really the idea behind the data lake house is that of unification, or in more words it's, "The whole is greater than the sum of its parts." So you could actually go and buy five, six services out there or actually use five, six services from the cloud vendors, stitch it together and it kind of resembles Databricks. Our power is in doing those integrated, together in a way in which it's really, really easy and simple to use for end users. So yeah, we're a data platform. 
I wouldn't, you know, ISV that's a old term, you know, Independent Software Vendor. You know, I think, you know, we have actually a whole slew of ISVs on top of Databricks, that integrate with our platform. And you know, in our marketplace as well as in our partner connect, we host those ISVs that then, you know, work on top of the data that we have in the Databricks, data lake house. >> You know, I think one of the things your journey has been great to document and watch from the beginning. I got to give you guys credit over there and props, congratulations. But I think you're the poster child as a company to what we see enterprises doing now. So go back in time when you guys swiped a credit card, you didn't need attending technical support because you guys had brains, you were refactoring, rethinking. It wasn't just banging out software, you had, you were doing some complex things. It wasn't like it was just write some software hosted on server. It was really a lot more. And as a result your business worth billions of dollars. I think 38 billion or something like that, big numbers, big numbers of great revenue growth as well, billions in revenue. You have customers, you have an ecosystem, you have data applications on top of Databricks. So in a way you're a cloud on top of the cloud. So is there a cloud on top of the cloud? So you have ISVs, Amazon has ISVs. Can you take us through what this means and at this point in history, because this seems to be an advanced version of benefits of platforming and refactoring, leveraging say AWS. >> Yeah, so look, when we started, there was really only one game in town. It was AWS. So it was one cloud. And the strategy of the company then was, well Amazon had this beautiful set of services that they're building bottom up, they have storage, compute, networking, and then they have databases and so on. But it's a lot of services. So let us not directly compete with AWS and try to take out one of their services. 
Let's not do that because frankly we can't. We were not of that size. They had the scale, they had the size and they were the only cloud vendor in town. So our strategy instead was, let's do something else. Let's not compete directly with say, a particular service they're building, let's take a different strategy. What if we had a unified holistic data platform, where it's just one integrated service end to end. So think of it as Microsoft office, which contains PowerPoint, and Word, and Excel and even Access, if you want to use it. What if we build that and AWS has this really amazing knack for releasing things, you know services, lots of them, every reinvent. And they're sort of a DevOps person's dream and you can stitch these together and you know you have to be technical. How do we elevate that and make it simpler and integrate it? That was our original strategy and it resonated with a segment of the market. And the reason it worked with AWS so that we wouldn't butt heads with AWS was because we weren't a direct replacement for this service or for that service, we were taking a different approach. And AWS, because credit goes to them, they're so customer obsessed, they would actually do what's right for the customer. So if the customer said we want this unified thing, their sellers would actually say, okay, so then you should use Databricks. So they truly are customer obsessed in that way. And I really mean it, John. Things have changed over the years. They're not the only cloud anymore. You know, Azure is real, GCP is real, there's also Alibaba. And now over 70% of our customers are on more than one cloud. So now what we hear from them is, not only want, do we want a simplified, unified thing, but we want it also to work across the clouds. Because those of them that are seriously considering multiple clouds, they don't want to use a service on cloud one and then use a similar service on cloud two. But it's a little bit different. 
And now they have to do twice the work to make it work. You know, John, it's hard enough as it is, like it's this data stuff and analytics. It's not a walk in the park, you know. You hire an administrator in the back office that clicks a button and it's just, now you're a data driven digitally transformed company. It's hard. If you now have to do it again on the second cloud with different set of services and then again on a third cloud with a different set of services. That's very, very costly. So the strategy then has changed that, how do we take that unified simple approach and make it also the same and standardized across the clouds, but then also integrate it as far down as we can on each of the clouds. So that you're not giving up any of the benefits that the particular cloud has. >> Yeah, I think one of the things that we see, and I want to get your reaction to this, is this rise of the super cloud as we call it. I think you were involved in the Sky paper that I saw your position paper came out after we had introduced Super Cloud, which is great. Congratulations to the Berkeley team, wearing the hat here. But you guys are, I think a driver of this because you're creating the need for these things. You're saying, okay, we went on one cloud with AWS and you didn't hide that. And now you're publicly saying there's other clouds too, increased TAM for your business. And customers have multiple clouds in their infrastructure for the best of breed that they have. Okay, get that. But there's still a challenge around the innovation, growth that's still around the corner. We still have a supply chain problem, we still have skill gaps. You know, you guys are unique at Databricks, as are these other big examples of super clouds that are developing. Enterprises don't have the Databricks kind of talent. They need, they need turnkey solutions. So Adam and the team at Amazon are promoting, you know, more solution oriented approaches higher up on the stack. 
You're starting to see kind of like, I won't say templates, but you know, almost like application specific headless like, low code, no code capability to accelerate clients who are wanting to write code for the modern era. Right, so this kind of, and then now you, as you guys pointed out with these common services, you're pushing the envelope. So you're saying, hey, I need to compete, I don't want to go to my customers and have them have a staff for this cloud and this cloud and this cloud because they don't have the staff. Or if they do, they're very unique. So what's your reaction? Because this kind of shows your leadership as a partner of AWS and the clouds, but also highlights I think what's coming. But you share your reaction. >> Yeah, look, it's, first of all, you know, I wish I could take credit for this but I can't because it's really the customers that have decided to go on multiple clouds. You know, it's not Databricks that, you know, pushed this or some other vendor, you know, that, Snowflake or someone who pushed this and now enterprises listened to us and they picked two clouds. That's not how it happened. The enterprises picked two clouds or three clouds themselves and we can get into why, but they did that. So this largely just happened in the market. We as data platforms responded to what they're then saying, which is they're saying, "I don't want to redo this again on the other cloud." So I think the writing is on the wall. I think it's super obvious what's going to happen next. They will say, "Any service I'm using, it better work exactly the same on all the clouds." You know, that's what's going to happen. So in the next five years, every enterprise will say, "I'm going to use the service, but you better make sure that this service works equally well on all of the clouds." And obviously the multicloud vendors like us, are there to do that. 
But I actually think that what you're going to see happening is that you're going to see the cloud vendors changing the existing services that they have to make them work on the other clouds. That's what's going to happen, I think. >> Yeah, and I think I would add that, first of all, I agree with you. I think that's going to be a forcing function. Because I think you're driving it. You guys are, in a way, one actor in driving this because you're on the front end of this and there are others and there will be people following. But I think to me, I'm a cloud vendor, I got to differentiate. If I'm Adam Selipsky, I got to say, "Hey, I got to differentiate." So I don't want to get stuck in the middle, so to speak. Am I just going to innovate on the hardware AKA infrastructure or am I going to innovate at the higher level services? So what we're talking about here is the tale of two clouds within Amazon, for instance. So do I innovate on the silicon and get low level into the physics and squeeze performance out of the hardware and infrastructure? Or do I focus on ease of use at the top of the stack for the developers? So again, there's a tale of two clouds here. So I got to ask you, number one, how do they differentiate? And number two, I never heard a developer ever say, "I want to run my app or workload on the slower cloud." So I mean, you know, back when we had PCs you wanted to go, "I want the fastest processor." So again, you can have common level services, but where is that performance differentiation with the cloud? What do the clouds do in your opinion? >> Yeah, look, I think it's pretty clear. I think that it's, this is, you know, no surprise. Probably 70% or so of the revenue is in the lower infrastructure layers, compute, storage, networking. And they have to win that. They have to be competitive there. As you said, you can say, oh you know, I guess my CPUs are slower than the other cloud, but who cares? 
I have amazing other services which only work on my cloud by the way, right? That's not going to be a winning recipe. So I think all three are laser focused on, we're going to have specialized hardware and the nuts and bolts of the infrastructure, we can do it better than the other clouds for sure. And you can see lots of innovation happening there, right? The Graviton chips, you know, we see huge price performance benefits in those chips. I mean it's real, right? It's basically a 20, 30% free lunch. You know, why wouldn't you, why wouldn't you go for it there? There's no downside. You know, there's no "gotcha" or no catch. But we see Azure doing the same thing now, they're also building their own chips and we know that Google builds specialized machine learning chips, TPUs, Tensor Processing Units. So they're laser focused on that. I don't think they can give up that or focus on higher levels if they had to pick bets. And I think actually in the next few years, most of us have to make more, we have to be more deliberate and calculated in the picks we do. I think in the last five years, most of us have said, "We'll do all of it." You know. >> Well you made a good bet with Spark, you know, Hadoop was a pretty obvious trend, everyone had jumped on that bandwagon and you guys picked a big bet with Spark. Look what happened with you guys. So again, I love this betting kind of concept because as the world matures, growth slows down and shifts and that next wave of value coming in, AKA customers, they're going to integrate with a new ecosystem. A new kind of partner network for AWS and the other clouds. But with AWS they're going to need to nurture the next Databricks. They're going to need to still provide that SaaS, ISV like experience for, you know, a basic software hosting or some application. 
But I got to get your thoughts on this idea of multiple clouds because if I'm a developer, the old days was, old days, within our decade, full stack developer- >> It was two years ago, yeah (John laughing) >> This is a decade ago, full stack and then the cloud came in, you kind of had the half stack and then you would do some things. It seems like the clouds are trying to say, we want to be the full stack or not. Or is it still going to be, you know, I'm an application like a PC and a Mac, I'm going to write the same application for both hardware. I mean what's your take on this? Are they trying to do full stack and you see them more like- >> Absolutely. I mean look, of course they're going, they have, I mean they have over 300, I think Amazon has over 300 services, right? That's not just compute, storage, networking, it's the whole stack, right? But my key point is, I think they have to nail the core infrastructure, storage, compute, networking, because the three clouds that are there competing, they're formidable companies with formidable balance sheets and it doesn't look like any of them is going to throw in the towel and say, we give up. So I think it's going to intensify. And given that they have 70% of revenue on that infrastructure layer, I think they, if they have to pick their bets, I think they'll focus it on that infrastructure layer. I think the layer above where they're also placing bets, they're doing that, the full stack, right? But there I think the demand will be, can you make that work on the other clouds? And therein lies an innovator's dilemma because if I make it work on the other clouds, then I'm forgoing that 70% revenue of the infrastructure. I'm not getting it. The other cloud vendor is going to get it. So should I do that or not? Second, is the other cloud vendor going to be welcoming of me making my service work on their cloud if I am a competing cloud, right? And what kind of terms of service are they giving me? 
And am I going to really invest in doing that? And I think right now we, you know, most, the vast, vast, vast majority of the services only work on the one cloud that you know, it's built on. It doesn't work on others, but this will shift. >> Yeah, I think the innovators dilemma is also very good point. And also add, it's an integrators dilemma too because now you talk about integration across services. So I believe that the super cloud movement's going to happen before Sky. And I think what explained by that, what you guys did and what other companies are doing by representing advanced, I call platform engineering, refactoring an existing market really fast, time to value and CAPEX is, I mean capital, market cap is going to be really fast. I think there's going to be an opportunity for those to emerge that's going to set the table for global multicloud ultimately in the future. So I think you're going to start to see the same pattern of what you guys did get in, leverage the hell out of it, use it, not in the way just to host, but to refactor and take down territory of markets. So number one, and then ultimately you get into, okay, I want to run some SLA across services, then there's a little bit more complication. I think that's where you guys put that beautiful paper out on Sky Computing. Okay, that makes sense. Now if you go to today's market, okay, I'm betting on Amazon because they're the best, this is the best cloud win scenario, not the most robust cloud. So if I'm a developer, I want the best. How do you look at their bet when it comes to data? Because now they've got machine learning, Swami's got a big keynote on Wednesday, I'm expecting to see a lot of AI and machine learning. I'm expecting to hear an end to end data story. This is what you do, so as a major partner, how do you view the moves Amazon's making and the bets they're making with data and machine learning and AI? >> First I want to lift off my hat to AWS for being customer obsessed. 
So I know that if a customer wants Databricks, I know that AWS and their sellers will actually help us get that customer to deploy Databricks. Now which of the services is the customer going to pick? Are they going to pick ours or the end to end, what Swami is going to present on stage? Right? So that's the question we're getting. But I wanted to start by just saying, they're customer obsessed. So I think they're going to do the right thing for the customer and I see the evidence of it again and again and again. So kudos to them. They're amazing at this actually. Ultimately our bet is, customers want this to be simple, integrated, okay? So yes there are hundreds of services that together give you the end to end experience and they're very customizable that AWS gives you. But if you want just something simply integrated that also works across the clouds, then I think there's a special place for Databricks. And I think the lake house approach that we have, which is completely integrated, we integrate data lakes with data warehouses, integrate workflows with machine learning, with real time processing, all these in one platform. I think there's going to be tailwinds because I think the most important thing that's going to happen in the next few years is that every customer is going to now be obsessed, given the recession and the environment we're in. How do I cut my costs? How do I cut my costs? And we learn this from the customers. They're adopting the lake house because they're thinking, instead of using five vendors or three vendors, I can simplify it down to one with you and I can cut my cost. So I think that's going to be one of the main drivers of why people bet on the lake house because it helps them lower their TCO; Total Cost of Ownership. And it's as simple as that. Like I have three things right now. If I can get the same job done of those three with one, I'd rather do that. 
And by the way, if it's three or four across two clouds and I can just use one and it just works across two clouds, I'm going to do that. Because my boss is telling me I need to cut my budget. >> (indistinct) (John laughing) >> Yeah, and I'd rather not to do layoffs and they're asking me to do more. How can I get smaller budgets, not lay people off and do more? I have to cut, I have to optimize. What's happened in the last five, six years is there's been a huge sprawl of services and startups, you know, you know most of them, all these startups, all of them, all the activity, all the VC investments, well those companies sold their software, right? Even if a startup didn't make it big, you know, they still sold their software to some vendors. So the ecosystem is now full of lots and lots and lots and lots of different software. And right now people are looking, how do I consolidate, how do I simplify, how do I cut my costs? >> And you guys have a great solution. You're also an arms dealer and a innovator. So I have to ask this question, because you're a professor of the industry as well as at Berkeley, you've seen a lot of the historical innovations. If you look at the moment we're in right now with the recession, okay we had COVID, okay, it changed how people work, you know, people working at home, provisioning VLAN, all that (indistinct) infrastructure, okay, yeah, technology and cloud health. But we're in a recession. This is the first recession where the Amazon and the other cloud, mainly Amazon Web Services is a major economic puzzle in the piece. So they were never around before, even 2008, they were too small. They're now a major economic enabler, player, they're serving startups, enterprises, they have super clouds like you guys. They're a force and the people, their customers are cutting back but also they can also get faster. So agility is now an equation in the economic recovery. And I want to get your thoughts because you just brought that up. 
Customers can actually use the cloud and Databricks to actually get out of the recovery because no one's going to say, stop making profit or make more profit. So yeah, cut costs, be more efficient, but agility's also like, let's drive more revenue. So in this digital transformation, if you take this to conclusion, every company transforms, their company is the app. So their revenue is tied directly to their technology deployment. What's your reaction and comment to that because this is a new historical moment where cloud and scale and data, actually could be configured in a way to actually change the nature of a business in such a short time. And with the recession looming, no one's got time to wait. >> Yeah, absolutely. Look, the secular tailwind in the market is that of, you know, 10 years ago it was software is eating the world, now it's AI's going to eat all of software. So more and more we're going to have, wherever you have software, which is everywhere now because it's eaten the world, it's going to be eaten up by AI and data. You know, AI doesn't exist without data so they're synonymous. You can't do machine learning if you don't have data. So yeah, you're going to see that everywhere and that automation will help people simplify things and cut down the costs and automate more things. And in the cloud you can also do that by changing your CAPEX to OPEX. So instead of I invest, you know, 10 million into a data center that I buy, I'm going to have headcount to manage the software. Why don't we change this to OPEX? And then they are going to optimize it. They want to lower the TCO because okay, it's in the cloud, but I do want the costs to be much lower than what they were in the previous years. Last five years, nobody cared. Who cares, you know, what it costs? You know, there's a brave new world out there. Now it's like, no, it has to be efficient. So I think they're going to optimize it. 
And I think this lake house approach, which is an integration of the lakes and the warehouse, allows you to rationalize the two and simplify them. It allows you to basically rationalize away the data warehouse. So I think much faster we're going to see the, why do I need the data warehouse? If I can get the same thing done with the lake house for a fraction of the cost, that's what's going to happen. I think there's going to be focus on that simplification. But I agree with you. Ultimately everyone knows, everybody's a software company. Every company out there is a software company and in the next 10 years, all of them are also going to be AI companies. So that is going to continue. >> (indistinct), dev's going to stop. And right sizing right now is a key economic forcing function. Final question for you and I really appreciate you taking the time. This year Reinvent, what's the bumper sticker in your mind around what's the most important industry dynamic, power dynamic, ecosystem dynamic that people should pay attention to as we move from the brave new world of okay, I see cloud, cloud operations. I need to really make it structurally change my business. How do I, what's the most important story? What's the bumper sticker in your mind for Reinvent? >> Bumper sticker? Lake house 24. (John laughing) >> That's data (indistinct) bumper sticker. What's the- >> (indistinct) in the market. No, no, no, no. You know, it's, AWS talks about, you know, all of their services becoming a lake house because they want the center of gravity to be S3, their lake. And they want all the services to directly work on that, so that's a lake house. We see Microsoft with Synapse, you know, the modern intelligent data platform. Same thing there. We're going to see the same thing, we're already seeing it on GCP with BigLake and so on. So I actually think it's the how do I reduce my costs and the lake house integrates those two. 
So that's one of the main ways you can rationalize and simplify. You get it in the lake house; the name itself is a (indistinct) of two things, right? Lake house, "lake" gives you the AI, "house" gives you the database, data warehouse. So you get your AI and you get your data warehousing in one place at the lower cost. So for me, the bumper sticker is lake house, you know, 24. >> All right. Awesome Ali, well thanks for the exclusive interview. Appreciate it and great to see you. Congratulations on your success and I know you guys are going to be fine. >> Awesome. Thank you John. It's always a pleasure. >> Always great to chat with you again. >> Likewise. >> You guys are a great team. We're big fans of what you guys have done. We think you're an example of what we call "super cloud." Which is getting the hype up and again your paper speaks to some of the innovation, which I agree with by the way. I think that that approach of not forcing standards is really smart. And I think that's absolutely correct, that having the market still innovate is going to be key. standards with- >> Yeah, I love it. We're big fans too, you know, you're doing awesome work. We'd love to continue the partnership. >> So, great, great Ali, thanks. >> Take care (outro music)

Published Date : Nov 23 2022


ENTITIES

Entity	Category	Confidence
Amazon	ORGANIZATION	0.99+
John	PERSON	0.99+
Ali Ghodsi	PERSON	0.99+
Adam	PERSON	0.99+
AWS	ORGANIZATION	0.99+
2013	DATE	0.99+
Google	ORGANIZATION	0.99+
Alibaba	ORGANIZATION	0.99+
2008	DATE	0.99+
five vendors	QUANTITY	0.99+
Adam Selipsky	PERSON	0.99+
five	QUANTITY	0.99+
John Furrier	PERSON	0.99+
Ali	PERSON	0.99+
Databricks	ORGANIZATION	0.99+
three vendors	QUANTITY	0.99+
70%	QUANTITY	0.99+
Wednesday	DATE	0.99+
Excel	TITLE	0.99+
38 billion	QUANTITY	0.99+
four	QUANTITY	0.99+
Amazon Web Services	ORGANIZATION	0.99+
Word	TITLE	0.99+
three	QUANTITY	0.99+
two clouds	QUANTITY	0.99+
Andy	PERSON	0.99+
three clouds	QUANTITY	0.99+
10 million	QUANTITY	0.99+
PowerPoint	TITLE	0.99+
one	QUANTITY	0.99+
two	QUANTITY	0.99+
twice	QUANTITY	0.99+
Second	QUANTITY	0.99+
over 300 services	QUANTITY	0.99+
one game	QUANTITY	0.99+
second cloud	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+
Sky	ORGANIZATION	0.99+
one word	QUANTITY	0.99+
OPEX	ORGANIZATION	0.99+
two things	QUANTITY	0.98+
two years ago	DATE	0.98+
Access	TITLE	0.98+
over 300	QUANTITY	0.98+
six years	QUANTITY	0.98+
over 70%	QUANTITY	0.98+
five years ago	DATE	0.98+

Ali Ghodsi, Databricks | Supercloud22

(light hearted music) >> Okay, welcome back to Supercloud '22. I'm John Furrier, host of theCUBE. We got Ali Ghodsi here, co-founder and CEO of Databricks. Ali, Great to see you. Thanks for spending your valuable time to come on and talk about Supercloud and the future of all the structural change that's happening in cloud computing. >> My pleasure, thanks for having me. >> Well, first of all, congratulations. We've been talking for many, many years, and I still go back to the video that we have in archive, you talking about cloud. And really, at the beginning of the big reboot, I called the post Hadoop, a revitalization of data. Congratulations, you've been cloud-first, now on multiple clouds. Congratulations to you and your team for achieving what looks like a billion dollars in annualized revenue as reported by the Wall Street Journal, so first, congratulations. >> Thank you so much, appreciate it. >> So I was talking to some young developers and I asked a random poll, what do you think about Databricks? Oh, we love those guys, they're AI and ML-native, and that's their advantage over the competition. So I pressed why. I don't think they knew why, but that's an interesting perspective. This idea of cloud native, AI/ML-native, ML Ops, this has been a big trend and it's continuing. This is a big part of how this change and this structural change is happening. How do you react to that? And how do you see Databricks evolving into this new Supercloud-like multi-cloud environment? >> Yeah, look, I think it's a continuum. It starts with having data, but they want to clean it, you know, and they want to get insights out of it. But then, eventually, you'd like to start asking questions, doing reports, maybe ask questions about what was my revenue yesterday, last week, but soon you want to start using the crystal ball, predictive technology. Okay, but what will my revenue be next week? Next quarter? Who's going to churn? 
And if you can finally automate that completely so that you can act on the predictions, right? So this credit card that got swiped, the AI thinks it's fraud, we're going to deny it. That's when you get real value. So we're trying to help all these organizations move through this data AI maturity curve, all the way to that, the prescriptive, automated AI machine learning. That's when you get real competitive advantage. And you know, we saw that with the FAANGs, right? I mean, Google wouldn't be here today if it wasn't for AI. You know, we'd be using AltaVista or something. We want to help all organizations to be able to leverage data and AI the way that the FAANGs did. >> One of the things we're looking at with supercloud and why we call it supercloud versus other things like multi-cloud is that today a lot of the successful companies that have started in the cloud have been successful, but have realized, and even enterprises who have gotten there by accident, and maybe have done nothing with cloud, have just some cloud projects on multiple clouds. So, people have multiple cloud operational things going on but it hasn't necessarily been a strategy per se. It's been more of kind of a default reaction to things, but the ones that are innovating have been successful in one native cloud because the use cases that drove that got scale, got value, and then they're making that super by bringing it on premise, putting in a modern data stack for modern application development, and kind of dealing with the things that you guys are in the middle of with Databricks. That is where the action is, and they don't want to lose the trajectory in all the economies of scale. So we're seeing another structural change where the evolutionary nature of the cloud has solved a bunch of use cases, but now other use cases are emerging on premises and at the edge that have been driven by applications because of the developer boom that's happening. You guys are in the middle of it.
What is happening with this structural change? Are people looking for the modern data stack? Are they looking for more AI? What's the, what's your perspective on this supercloud kind of position? >> Look, it started with, are you on multiple clouds, right? So multi-cloud has been a thing. It became a thing; 70, 80% of our customers, when you ask them, they're on more than one cloud. But then they soon start realizing that, hey, you know, if I'm on multiple clouds, this data stuff is hard enough as it is. Do I want to redo it again and again with different proprietary technologies on each of the clouds? And that's when they started thinking about, let's standardize this, let's figure out a way which just works across them. That's where I think open source comes in, becomes really important. Hey, can we leverage open standards, because then we can make it work in these different environments, as we said, so that we can actually go super, as you said. That's one. The second thing is, can we simplify it? You know, and I think today, the data landscape is complicated. Conceptually it's simple. You have data, which is essentially customer data that you have, maybe employee data. And you want to get some kind of insights from that. But how you do that is very complicated. You have to buy a data warehouse, hire data analysts. You have to store stuff in the data lake, you know, get your data engineers. If you want a streaming, real time thing, that's another completely different set of technologies you have to buy. And then you have to stitch all these together, and you have to do it again and again on every cloud. So they just want simplification. So that's why we're big believers in this data lakehouse concept, which is an open standard for simplifying this data stack and helping people to just get value out of their data in any environment. So they can do that in this sort of supercloud, as you call it.
>> You know, we've been talking about that in previous interviews, do the heavy lifting, let them get the value. I have to ask you about how you see that going forward, because if I'm a customer, I have a lot of operational challenges. 'Cause the developers are kicking butt right now. We see that clearly. Open source is growing and continues to be great. But ops and security teams, they really care about this stuff. And most companies don't want to spin up multiple ops teams to deal with different stacks. This is one big problem that I think is leading into the multi-cloud viability. How do you guys deal with that? How do you talk to customers when they say, I want to have less complications on operations? >> Yeah, you're absolutely right. You know, it's easy for a developer to adopt all these technologies, and new things are coming out all the time. The ops teams are the ones that have to make sure this works. Doing that in multiple different environments is super hard, especially when there's a proprietary stack in each environment that's different. So they just want standardization. They want open source, that's super important. We hear that all the time from them. They want the open source technologies. They believe in the communities around it. You know, they know that source code is open. So you can also see if there's issues with it. If there's security breaches, those kind of things, that they can have a community around it. So they can actually leverage that. So they're the ones that are really pushing this, and we're seeing it across the board. You know, it starts first with the digital natives, you know, the companies that are, but slowly it's also now percolating to the other organizations, we're hearing across the board. >> Where are we, Ali, on the innovation strategies for customers? Where are they on the trajectory around how they're building out their teams? How are they looking at the open source?
How are they extending the value proposition of Databricks, and data at scale, as they start to build out their teams and operations, because some are like kind of starting, crawl, walk, run, kind of vibe. Some are big companies, they're dealing with data all the time. Where are they in their journey? What's the core issues that they're solving? What are some of the use cases that you see that are most pressing for customers? >> Yeah, what I've seen that's really exciting about this data lakehouse concept is that we're now seeing a lot of use cases around real time. So real time fraud detection, real time stock ticker pricing, anyone that's doing trading, they want that to work real time. Lots of use cases around that. Lots of use cases around how do we in real time drive more engagement on our web assets if we're a media company, right? We have all these assets, how do we get people to get engaged? Stay on our sites. Continue engaging with the material we have. Those are real time use cases. And the interesting thing is, A, they're real time. So, you know, it's really important that you don't recommend to someone, hey, you should go check out this restaurant, if they just came from that restaurant half an hour ago. So you want it to be real time, but B, that it's also all based on machine learning. A lot of this is trying to predict what you want to see, what you want to do, is it fraudulent? And that's also interesting because basically more and more machine learning is coming in. So that's super exciting to see, the combination of real time and machine learning on the Lakehouse. And finally, I would say the Lakehouse is really important for this because that's where the data is flowing in. If they have to take that data that's flowing into the lake and actually copy it into a separate warehouse, that delays the real time use cases. And then it can't hit those real time deadlines. So that's another catalyst for this Lakehouse pattern.
>> Would that be an example of how the metrics are changing? 'Cause I've been looking at some people saying, well, you can tell if someone's doing well, there's a lot of data being transferred. And then I was saying, well, wait a minute. Data transfer costs money, right? And time. So this is an interesting dynamic, in a way you don't want to have a lot of movement, right? >> Yeah, movement actually decreases for a lot of these real time use cases. 'Cause what we saw in the past was that they would run a batch processing to process all the data. So once they process all the data. But actually if you look at the things that have changed since the data that we had yesterday, it's actually not that much. So if you can actually incrementally process it in real time, you can actually reduce the cost of transfers and storage and processing. So that's actually a great point. That's also one of the main things that we're seeing with the use cases, the bill shrinks and the cost goes down, and they can process less. >> Yeah, and it'd be interesting to see how those KPIs evolve into industry metrics down the road around the supercloud evolution. I got to ask you about the open source concept of data platforms. You guys have been a pioneer in there doing great work, kind of picking the baton up where the Hadoop World left off, as Dave Vellante always points out. But working across clouds is super important. How are you guys looking at the ability to work across the different clouds with Databricks? Are you going to build that abstraction yourself? Does data sharing and model sharing kind of come into play there? How do you see this Databricks capability across the clouds? >> Yeah, I mean, let me start by saying, we're just big fans of open source. We think that open source is a force in software. That's going to continue for decades, hundreds of years, and it's going to slowly replace all proprietary code in its way.
We saw that it could do that with the most advanced technology. Windows, you know, proprietary operating system, very complicated, got replaced with Linux. So open source can pretty much do anything. And what we're seeing with the data lakehouse is that slowly the open source community is building a replacement for the proprietary data warehouse, data lake, machine learning, real time stack in open source. And we're excited to be part of it. For us, Delta Lake is a very important project that really helps you standardize how you lay out your data in the cloud. And with it comes a really important protocol called Delta Sharing, that enables you, in an open way, actually for the first time ever, to share large data sets between organizations, but it uses an open protocol. So the great thing about that is you don't need to be a Databricks customer. You don't need to even like Databricks, you just need to use this open source project and you can now securely share data sets between organizations across clouds. And it actually does so really efficiently, with just one copy of the data. So you don't have to copy it if you're within the same cloud. >> So you're playing the long game on open source. >> Absolutely. I mean, this is a force, it's going to be there. If you deny it, before you know it there's going to be something like Linux that is going to be a threat to your proprietary software. >> I totally agree, by the way. I was just talking to somebody the other day and they're like, hey, the software industry, someone made the comment, the software industry is open source. There's no more software industry, it's called open source. It's integrations that become interesting. And I was looking at integrations, that now is really where the action is. And we had a panel with the Clouderati, we called it, the people that have been around for a long time. And it was called the innovator's dilemma. And one of the comments was, it's the integrator's dilemma, not the innovator's dilemma.
And this is a big part of this piece of supercloud. Can you share your thoughts on how cloud and integration need to be tightened up to really make it super? >> Actually that's a great point. I think the beauty of this is, look, the ecosystem of data today is vast. There's this picture that someone puts together every year of all the different vendors and how they relate, and it gets bigger and bigger and messier and messier. So, we see customers use all kinds of different aspects of what's existing in the ecosystem, and they want it to be integrated in whatever you're selling them. And that's where I think the power of open source comes in. With open source, you get integrations that people will do without you having to push it. So us, Databricks as a vendor, we don't have to go tell people, please integrate with Databricks. The open source technology that we contribute to, automatically, people are integrating with it. Delta Lake has integrations with lots of different software out there, and Databricks as a company doesn't have to push that. So I think open source is also another thing that really helps with the ecosystem integrations. Many of these companies in this data space actually have employees that are full-time dedicated to, make sure our software works well with Spark. Make sure our software works well with Delta. And they contribute back to that community. And that's the way you get this sort of ecosystem to further sort of flourish. >> Well, I really appreciate your time. And my final question for you is, as we kind of unpack and kind of shape and frame supercloud for the future, how would you see a roadmap or architecture or outcome for companies that are going to clearly be in the cloud, where open source is going to be dominating? Integrations have got to be seamless and frictionless. Abstraction layers make things super easy and take away the complexity. What is supercloud to them? What does the outcome look like?
How would you define a supercloud environment for an enterprise? >> Yeah, for me, it's the simplification that you get where you standardize on open source. You get your data in one place, in one format, in one standardized way, and then you can get your insights from it, without having to buy lots of different idiosyncratic proprietary software from different vendors that's different in each environment. So it's this slow standardization that's happening. And I think it's going to happen faster than we think. And I think in a couple years it's going to be a requirement that, does your software work on all these different departments? Is it based on open source? Is it using this data lakehouse pattern? And if it's not, I think they're going to demand it. >> Yeah, I feel like we're close to some sort of de facto standard coming, and you guys are a big part of it. Once that clicks in, it's going to highly accelerate in the open, and I think it's going to be super valuable. Ali, thank you so much for your time, and congratulations to you and your team. We've been following you guys since the beginning. Remember the early days, and look how far it's come. And again, you guys are really making a big difference in making a super cool environment out there. Thanks for coming on and sharing. >> Thank you so much, John. >> Okay, this is Supercloud '22. I'm John Furrier, stay with us for more coverage and more commentary after this break. (light hearted music)

Published Date : Aug 7 2022

ENTITIES

Entity	Category	Confidence
Ali Ghodsi	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Google	ORGANIZATION	0.99+
Databricks	ORGANIZATION	0.99+
John	PERSON	0.99+
last week	DATE	0.99+
next week	DATE	0.99+
Ali	PERSON	0.99+
Next quarter	DATE	0.99+
yesterday	DATE	0.99+
John Furrier	PERSON	0.99+
Delta	ORGANIZATION	0.99+
one format	QUANTITY	0.99+
first	QUANTITY	0.99+
today	DATE	0.98+
second thing	QUANTITY	0.98+
one	QUANTITY	0.98+
Linux	TITLE	0.98+
one copy	QUANTITY	0.98+
Delta Lakehouse	ORGANIZATION	0.98+
supercloud 22	ORGANIZATION	0.98+
more than one cloud	QUANTITY	0.98+
each environment	QUANTITY	0.98+
Clouderati	ORGANIZATION	0.98+
Supercloud22	ORGANIZATION	0.98+
hundreds of years	QUANTITY	0.97+
Delta Lake	LOCATION	0.97+
one big problem	QUANTITY	0.97+
70, 80%	QUANTITY	0.97+
Windows	TITLE	0.96+
one place	QUANTITY	0.96+
first time	QUANTITY	0.96+
billion dollars	QUANTITY	0.95+
decades	QUANTITY	0.95+
Delta Lake	ORGANIZATION	0.95+
One	QUANTITY	0.94+
supercloud	ORGANIZATION	0.94+
Supercloud	ORGANIZATION	0.94+
half an hour ago	DATE	0.93+
Delta Lake	TITLE	0.92+
Lakehouse	ORGANIZATION	0.92+
Spark	TITLE	0.91+
each	QUANTITY	0.91+
a minute	QUANTITY	0.85+
one of	QUANTITY	0.73+
one native	QUANTITY	0.72+
supercloud	TITLE	0.7+
couple years	QUANTITY	0.66+
AltaVista	ORGANIZATION	0.65+
Wall Street Journal	ORGANIZATION	0.63+
theCUBE	ORGANIZATION	0.63+
Lakehouse	TITLE	0.51+
Lake	LOCATION	0.46+
Hadoop World	TITLE	0.41+
'22	EVENT	0.24+

Ali Ghodsi, Databricks | Informatica World 2019

>> Live from Las Vegas, it's theCUBE, covering Informatica World 2019. Brought to you by Informatica. >> Welcome back everyone to theCUBE's live coverage of Informatica World 2019. I'm your host Rebecca Knight, along with my co-host John Furrier. We're joined by Ali Ghodsi, he is the CEO of Databricks. Thank you so much for coming on, for returning to theCUBE. You're a CUBE veteran. >> Yes, thank you for having me. >> So I want to pick up on something that you said up on the main stage, and that is that every enterprise on the planet wants to add AI capabilities, but the hardest part of AI is not AI, it's the data. >> Yeah. >> Can you riff on that a little bit for our viewers? Elaborate? >> Yeah, actually, the interesting part is that, if you look at the companies that succeeded with AI, the actual AI algorithms they're using are actually algorithms from the 70s, you know, they were actually developed in the 70s, that's 50 years ago. So then how come they're succeeding now? When actually the same algorithms weren't working in the 70s, so people gave up on them. Like, these things called neural nets, right? Now they're en vogue and they're, you know, super successful. The reason is you have to apply orders of magnitude more data. If you feed those algorithms that we thought were broken orders of magnitude more data, you actually get great results, but that's actually hard. You know, dealing with petabyte scale data and cleaning it, making sure that it's actually the right data for the task at hand, is not easy. So that's the part that people are struggling with. >> I saw you up on stage, I'm like ah, Ali's here, Databricks is here, that's awesome. Psyched that you stopped by theCUBE. Been a while. I wanted to get a quick update, 'cause you guys have been on a tear, doing some great work at Cal, we were just told before we came on camera. But what are you doing here? What's the, is there any announcements or news with Informatica? What's the story?
>> Yeah, it's, we're doing a partnership around Delta Lake, which is our next generation engine that we built, so we're super excited about that. It integrates with all of the Informatica platform. So their ingestion tools, their transformation tools, and the catalog that they also have. So we think together, this can actually really help enterprises make that transition into the AI era. >> So you know, we've been followers, our 10th year, so remember when we were in the Cloudera office of Mike Olson and Amr Awadallah when we first started, and now, the Hadoop movement started, and then the cloud came along. Right when you guys started your company, the cloud growth took off. You guys were instrumental in changing the equation in dealing with data, data lakes, whatever they were calling it back then. So now, data, holistically, is a systems architecture. On premise it's a huge challenge. Cloud native, well, no real challenge, people love that. Data feeds AI, lot of risk taking, lot of reward. We're seeing the SaaS business explode, Zoom communications. The list goes on and on. You know, an enterprise that's trying to be SaaS is hard. So you can't just take data from an enterprise and make it SaaS-ified. You really got to think differently. What are you guys doing? How have you guys evolved and vectored into that challenge, because this is where your core value proposition initially started to change. Take us through that Databricks story and how you're solving that problem today. >> Yeah, it's a great question. Really what happened is that people started collecting a lot of data about a decade ago. And the promise was, you can do great things with this. There are all these aspirational use cases around machine learning, real time, it's going to be amazing. Right? So people started collecting it. They started storing one petabyte, two petabytes, and they kept going back to their boss and saying, this project is really successful, I now have five petabytes in it.
But at some point the business said, okay, that's great, but what can you do with it? What business problems are you actually addressing? What are you solving? And so, in the last couple years there's been a push towards, let's prove the value of these data lakes. And actually, many of these projects are falling short. Many are failing. And the reason is, people have just been dumping this data into data lakes without thinking about the structure, the quality, how it's going to be used. The use cases have been an afterthought. So the number one thing at the top of mind for everyone right now is, how do we make these data lakes that we have successful so we can prove some business value to our management? This is the main problem that we're focusing on. Towards this, we built something called Delta Lake. It's something you situate on top of your data lake. And what it does is it increases the quality, the reliability, the performance, and the scale of your data lake. >> (John) So it's like a filter. >> Yeah. >> The cream rises to the top. >> (Ali) Exactly. >> Lets the sludge, the data swamp, stay below the clean water, if you will. >> Exactly, actually you nailed it. So basically, we look at the data as it comes in, filter as you said, and then look at, if there's any quality issues we then put it back in the data lake. It's fine, it can stay there. We'll figure out how to get value out of it later. But if it makes it into the Delta Lake, it will have high quality. Right? So that's great. And since we're anyway already looking at all the data as it's coming in, we might as well also store a lot of indices and a lot of things that let us performance optimize it later on. So that, later, when people are actually trying to use that data, they get really high performance, they get really good quality. And we also added ACID transactions to it, so that now you're also getting all those transactional use cases working on your existing data lake.
>> I saw, at my daughter's graduation at Cal Berkeley this weekend and yesterday, people around with Databricks backpacks. Very popular in academia. You guys got the young generation coming in. What's the update on the company? How many employees? What's the traction? Give us a quick business update. >> Yeah, we're about 800 employees now. About 100 people in Europe, I would say, and maybe 40-50 people in Asia-Pacific. We're expanding the EMEA and the Asia business. >> (John) Growth mode. >> Yeah, growth mode. So it's expanding as fast as possible. I mean, I actually, as a CEO, I try to always slow the hiring down to make sure that we keep the quality bar. So that's actually top of mind for me. But yeah we're-- >> (John) You did Delta Lake on that one. >> Yeah (laughing) >> Exactly. Yeah, and we're super excited about working with these universities. We get a lot of graduate students from top universities-- >> And Cal had the first ever class in college of data analytics, what was that? Data analytics, the first inaugural class graduated. Shows how early it is. >> Yeah, yeah, yeah. And they actually used Databricks, the community edition. A class of over a thousand students at Cal used the platform. So they're going to be trained in data science as they come out. >> So I want to ask about that, because as you said you're trying to slow down the hiring to make sure that you are maintaining a high bar for your new hires. But yet, I'm sure there's a huge demand because you are in growth mode. So what are you doing? You said you're working with universities to make sure that the next generation is trained up and is capable of performing at Databricks. So tell us more about those efforts. >> Yeah I mean, so, obviously university recruiting is big for us. At Cal, I think Databricks has the longest line of all the companies that come there on the career fair day. So, we work very closely with these universities.
I think, next generation, as they come out, this generation that's coming out today actually is data science trained. So it's a big difference. There is a huge skills gap out there. Every big enterprise you talk to tells you, my biggest problem is actually, I don't have skilled people. Can you help me hire people? I say, hey, we're not in the recruiting business. But the good news is, if you look at the universities, they're all training thousands and thousands of data scientists every year now. I can tell you just at Cal, because I happen to be on the faculty there, almost every applicant now, to grad school, wants to do something AI related. Which has actually led to, if you look at all the programs in universities today, people who used to do networking, professors who used to do networking, say we do intelligent networks. People who do databases say, we do intelligent databases. People who do systems research say, hey, we do intelligent systems, right? So what that means is, in a couple years you'll have lots of students coming out, and these companies that are now struggling with hiring then will be able to hire this talent and will actually succeed better with these AI projects. >> As they say in Berkeley, nothing like a good revolution once in a while. AI is kind of changing everyone over. I got to ask you, for the young kids out there, and parents who have kids either in elementary school or high school, everyone is trying to figure out, and there's no clear playbook yet, we're starting to see first generation training, but is there a skill set? Because there's a range in surface area, you got hardcore coding to ethics, and everything in between, from visualization, multiple dimensions of opportunities.
What skills do you think people could hone or tweak that may not be on a curriculum that they could get, or pieces of different curriculums in school that would be a good foundation for folks learning and wanting to jump in to data and data value, whether it's coding to ethics? >> Yeah, just looking at my own background and seeing what I got to learn in school, the thing that was lacking, compared to what's needed today, is statistics. Understanding of statistics, statistical knowledge. That, I think, is going to be pervasive. So I think, 10, 15 years from now, no matter which field you're in, actually whatever job you have, you have to have some basic level of statistical understanding, 'cause the systems you're working with will be spitting out statistics and numbers, and you need to understand what are false positives, what is this, what is the sample, what is that? What do these things mean? So that's one thing that's definitely missing, and actually it's coming. That's one. The second is computing will continue being important. So, in the intersection of those two is, I think, a lot of those jobs. >> In all fields, we were talking about earlier, biology, everything's intersecting, biochemistry to whatever, right? >> (Ali) Yeah. >> I got to ask you about, well, I'm a little old school, I'm 53 years old, but I remember when I broke into the business coding, I used to walk into departments, they were called DP, data processing. So we're getting into the data processing world now, you've got statistics, you've got pipelines, these are data concepts. So I got to ask you, as companies that are in the enterprise may be slower to move to the cutting edge like you guys are, they got to figure out where to store the data. So can you share your opinion or view on how customers are thinking and how they maybe should be architecting data on premise, in the cloud.
Certainly cloud's great, if you're going cloud native for pure SaaS, and born in the cloud like a start-up. But if you're a large enterprise, and you want to be SaaS-like, to have all that benefit, take the risk with the reward of being agile, you got to have data, because if you don't get the data into the machine learning or AI, you're not going to have good AI. So you need to get that data feeding in fast. And if it's constrained with regulation and compliance, you're screwed. So what's your view on this? Where should it be stored? What's your opinion? >> Yeah, we've had the same opinion for five, six years, right? Which is, the data belongs in the cloud. Don't try to do this yourself. Don't try to do this on prem. Don't store it in Hadoop, it's not built for this. Store it in the cloud. In the cloud, first of all, you get a lot of security benefits that the cloud vendors are already working on. So that's one good thing about it. Second, it's reliable. You get the 10, 11 nines of availability, so that's great, you get that. Start collecting data there. Another reason you want to do it in the cloud is that a lot of the data sets that you need to actually get good quality results are available in the cloud. Oftentimes what happens with AI is, you build a predictive model, but actually, it's terrible. It didn't work well. So you go back, and then the main trick, the first trick you use to increase the quality, is actually augmenting that data with other data sets. You might purchase those data sets from other vendors. You don't want to be shipping hard drives around or, you know, getting that into your data center. Those will be available in the cloud, so you can augment that data. So we're big fans of storing your data in data lakes, in the cloud. We obviously believe that you need to make that data high quality and reliable. For that, we believe Delta Lake, the open-source project that we created, is a great vehicle.
But I think moving to the cloud is the number one thing. >> (John) And hybrid works with that if you need to have something on premise? >> In my opinion the two worlds are so different, that it's hard. You hear a lot of vendors that say we're the hybrid solution that works on both and so on. But the two models are so different, fundamentally, that it's hard to actually make them work well. I have not yet seen a customer yet or enterprise. You see a lot of offerings, where people say hybrid is the way. Of course, a lot of on prem vendors are now saying, hey, we're the hybrid solution. I haven't actually seen that be successful to be frank. Maybe someone will crack that nut but-- >> I think it's an operational question to see who can make it work. Ali, congratulations on all your success. Great to see you. >> Yeah it's been great having you on the show. >> Thank you so much for having me. >> You are watching theCUBE, Informatica 2019. I'm Rebecca Knight, for John Furrier, stay tuned.

Published Date : May 21 2019

ENTITIES

Entity	Category	Confidence
Rebecca Knight	PERSON	0.99+
Ali Ghodsi	PERSON	0.99+
10	QUANTITY	0.99+
Databricks	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
John Furrier	PERSON	0.99+
Informatica	ORGANIZATION	0.99+
first	QUANTITY	0.99+
five	QUANTITY	0.99+
Cal	ORGANIZATION	0.99+
Ali	PERSON	0.99+
John	PERSON	0.99+
two	QUANTITY	0.99+
two models	QUANTITY	0.99+
thousands	QUANTITY	0.99+
one petabytes	QUANTITY	0.99+
10th year	QUANTITY	0.99+
Second	QUANTITY	0.99+
yesterday	DATE	0.99+
two petabytes	QUANTITY	0.99+
70s	DATE	0.99+
six years	QUANTITY	0.99+
Las Vegas	LOCATION	0.99+
Duke	ORGANIZATION	0.99+
five petabytes	QUANTITY	0.99+
Delta Lake	LOCATION	0.99+
both	QUANTITY	0.99+
Delta Lake	ORGANIZATION	0.99+
second	QUANTITY	0.98+
first tricks	QUANTITY	0.98+
Berkeley	LOCATION	0.98+
40-50 people	QUANTITY	0.98+
two worlds	QUANTITY	0.98+
one good thing	QUANTITY	0.98+
one	QUANTITY	0.98+
Asia	LOCATION	0.98+
50 years ago	DATE	0.98+
CUBE	ORGANIZATION	0.97+
Cal Berkeley	LOCATION	0.97+
over a thousand students	QUANTITY	0.97+
theCUBE	ORGANIZATION	0.96+
15 years	QUANTITY	0.96+
today	DATE	0.96+
Asiapac	LOCATION	0.96+
Mike Olsen	PERSON	0.96+
Amr Awadallah	PERSON	0.96+
About 100 people	QUANTITY	0.96+
53 years old	QUANTITY	0.95+
about 800 employees	QUANTITY	0.95+
first generation	QUANTITY	0.92+
11 lines	QUANTITY	0.92+
one thing	QUANTITY	0.91+
2019	DATE	0.89+
Informatica World 2019	EVENT	0.88+
SaaS	TITLE	0.86+
a decade ago	DATE	0.85+
thousands of data scientists	QUANTITY	0.84+
SAS	ORGANIZATION	0.84+
this weekend	DATE	0.82+
last couple years	DATE	0.81+
Informatica World	TITLE	0.62+

Ali Ghodsi, Databricks - #SparkSummit - #theCUBE

>> Narrator: Live from San Francisco, it's the Cube. Covering Spark Summit 2017. Brought to you by Databricks. (upbeat music) >> Welcome back to the Cube, day two at Spark Summit. It's very exciting. I can't wait to talk to this gentleman. We have the CEO from Databricks, Ali Ghodsi, joining us. Ali, welcome to the show. >> Thank you so much. >> David: Well, we sat here and watched the keynote this morning with Databricks and you delivered. Some big announcements. Before we get into some of that, I want to ask you, it's been about a year and a half since you transitioned from VP Products and Engineering into a CEO role. What's the most fun part of that and maybe what's the toughest part? >> Oh, I see. That's a good question and that's a tough question too. Most fun part is... You know, you touch many more facets of the business. So in engineering, it's all the tech and you're dealing only with engineers, mostly. Customers are one hop away, there's a product management layer between you and the customers. So you're very inwards focused. As a CEO you're dealing with marketing, finance, sales, these different functions. And then, externally with media, with stakeholders, a lot of customer calls. There's many, many more facets of the business that you're seeing. And it also gives you a preview and it also gives you a perspective that you couldn't have before. You see how the pieces fit together so you actually can have a better perspective and see further out than you could before. Before, I was more in my own pick situation where I was seeing sort of just the things relating to engineering, so that's the best part. >> You're obviously working close with customers. You introduced a few customers this morning up on stage. But after the keynote, did you hear any reactions from people? What are they saying? >> Yes, the keynote was recent, so on my way here I've had multiple people sort of... A couple people that high-fived just before I got up on stage here. 
On the serverless offering, people are really excited about that. Less devops, less configuration, let them focus on the innovation, they want that. So that's something that's celebrated. Yesterday-- >> Recap that real quickly for our audience here, what the serverless offering is. >> Absolutely, so it's very simple. We want lots of data scientists to be able to do machine learning without having to worry about the infrastructure underneath it. So we have something called serverless pools, and with serverless pools you can just have lots of data scientists use it. Under the hood, this pool of resources shrinks and expands automatically. It adds storage, if needed. And you don't have to worry about the configuration of it. And it also makes sure that it's isolating the different data scientists. So if one data scientist happened to run something that takes much more resources, it won't affect the other data scientists that are sharing that. So the short story of it is you cut costs significantly, you can now have 3000 people share the same resources and it enables them to move faster because they don't have to worry about all the devops that they otherwise have to do. >> George, is that a really big deal? >> Well, we know whenever there's infrastructure that gets between a developer, data scientist, and their outcomes, that's friction. I'd be curious to say let's put that into a bigger perspective, which is if you go back several years, what were the class of apps that Spark was being used for, and in conjunction with what other technologies. Then bring us forward to today and then maybe look out three years. >> Ali: Yeah, that's a great question. So from the very beginning, data is key for any of these predictive analytics that we are doing. So that was always a key thing. But back then we saw more Hadoop data lakes. There were more data lakes, data reservoirs, data marts that people were building out. We saw also a lot of traditional data warehousing. 
These days, we see more and more things moving to cloud. The Hadoop data lake we see, oftentimes at enterprises, being transformed into cloud blob storage. That's cheaper, it's geo-replicated, it's on many continents. That's something that we've seen happen. And we work across any of these, frankly. We, from the very beginning, Spark, one of its strengths is it integrates really well wherever your data is. And there's a huge community of developers around it, over 1000 people now that have contributed to it. Many of these people are in other organizations, they're employed by other companies and their job is to make sure that Databricks or Spark works really, really well with, say, Cassandra or with S3. That's a shift that we're seeing. In terms of applications people are building, it's moving more into production. Four years ago much more of it was interactive exploratory. Now we're seeing production use cases. The fraud analytics use case that I mentioned, that's running continuously and the requirements there are different. You can't go down for ten minutes on a Saturday morning at 4 a.m. when you're doing credit card fraud because that's a lot of fraud and that affects the business of, say, Capital One. So that's much more crucial for them. >> So what would be the surrounding infrastructure and applications to make that whole solution work? Would you plug into a traditional system of record at the sales order entry kind of process point? Are you working off sort of semi-real-time or near real-time data? And did you train the models on the data lake? How did the pieces fit together? >> Unfortunately the answer depends on the particular architecture that the customer has. Every enterprise is slightly different. But it's not uncommon that the data is coming in. They're using Spark structured streaming in Databricks to get it into S3, so that's one piece of the puzzle. Then when it ends up there, from then on it funnels out to many different use cases. 
It could be a data warehousing use case, where they're just using interactive SQL on it. So that's the traditional interactive use case, but it could be a real-time use case, where it's actually taking those data that it's processed and it's detecting anomalies and putting triggers in other systems and then those systems downstream will react to those triggers for anomalies. But it could also be that it's periodically training models and storing the models somewhere. Oftentimes it might be in a Cassandra, or in a Redis, or something of that sort. It will store the model there and then some web application can then take it from there, do point queries to it and say okay, I have a particular user that came in here, George, now, quickly look up what is his feature vector, figure out what the product recommendations we should show to this person and then it takes it from there. >> So in those cases, Cassandra or Redis, they're playing the serving layer. But generating the prediction model is coming from you and they're just doing the inferencing, the prediction itself. So if you look out several years, without asking you the roadmap, which you can feel free to answer, how do you see that scope of apps expanding or the share of an existing app like that? >> Yeah, I think two interesting trends that I believe in, I'll be foolish enough to make predictions. One is that I think that data warehousing, as we know it today, will continue to exist. However, it will be transformed and all the data warehousing solutions that we have today will add predictive capabilities or it will disappear. So let me motivate that. If you have a data warehouse with customer data in it and a fact table, you have all your transactions there, you have all your products there. Today, you can plug in BI tools and on top of that you can see what's my business health today and yesterday. But you can't ask it: tell me about tomorrow. Why not? 
The data is there, why can I not ask it, this customer data, you tell me which of these customers are going to churn, or which one of them should I reach out to because I can possibly upsell these? Why wouldn't I want to do that? I think everyone would want to do that and every data warehousing solution in ten years will have these capabilities. Now with Spark SQL you can do that and the announcement yesterday showed you also how you can bake models, machine learning models, and export them so a SQL analyst can just access them directly with no machine learning experience. It's just a simple function call and it just works. So that's one prediction I'll make. The second prediction I'll make is that we're going to see lots of revolutions in different industries, beyond the traditional 'get people to click on ads' and understand social behavior. We're going to go beyond that. So for those use cases it will be closer to the things I mentioned like Shell and what you need to do there is involve these domain experts. The domain experts will come in, the doctors, or the machine specialists, you have to involve them in the loop. And they'll be able to transform, maybe much less exotic applications, it's not the super high-tech Silicon Valley stuff, but it's nevertheless extremely important to every enterprise, to every vertical, on the planet. That's, I think, the exciting part of where predictions will go in the next decade or two. >> If I were to try and pick out the most man-bites-dog kind of observation in there, you know, it's supposed to be the unexpected thing, I would say where you said all data warehouses are going to become predictive services. Because what we've been hearing, it's sort of the other side of that coin which is all the operational databases will get all the predictive capabilities. But you said something very different. 
I guess my question is are you seeing the advanced analytics going to the data warehouse because the repository of data is going to be bigger there and so you can either build better models or because it's not burdened with transaction SLAs that you can serve up predictions quicker? >> The data warehousing has been about basic statistics. It's been SQL; the language that is used is to get descriptive statistics. Tables with averages and medians, that's statistics. Why wouldn't you want to have advanced statistics which now does predictions on it. It just so happens that SQL is not the right interface for that. So it's going to be very natural that people who are already asking statistical questions for the last 30 years from their customer data, these massive troves of data that they have stored. Why wouldn't they want to also say, 'okay now give me more advanced statistics?' I'm not an expert on advanced statistics but you, the system, tell me what I should watch out for. Which of these customers do I talk to? Which of the products are in trouble? Which of the products are not, or which parts of my business are not doing well now? Predict the future for me. >> George: When you're doing that though, you're now doing it on data that has a fair amount of latency built into it. Because that's how it got into the data warehouse. Where if it's in the operational database, it's really low latency, typically low latency stuff. Where and why do you see that distinction? >> I do think also that we'll see more and more real-time engines take over. If you do things in real-time you can do it for a fraction of the cost. So we'll also see those capabilities come in. So you don't have to... Your question is, why would you want to once a week batch everything into a central warehouse and I agree with that. It will be streaming in live and then you can, on that, do predictions, you can do basic analytics. 
I think basically the lines will blur between all these technologies that we're seeing. In some sense, Spark actually was the precursor to all that. So Spark already was unifying machine learning, SQL, ETL, real-time, and you're going to see that everywhere appear. >> You mentioned Shell as an example, one of your customers, you also had HP, Capital One, and you developed this unified analytics platform, that's solving some of their common problems. Now that you're in the mood to make predictions, what do you think are going to be the most compelling use cases or industries where you're going to see Databricks going in the future? >> That's a hard one. Right now, I think healthcare. There's a lot of data sets, there's a lot of gene sequencing data. They want to be able to use machine learning. In fact, I think those industries are being transformed slowly from using classical statistics into machine learning. We've actually helped some of these companies do that. We've set up workshops and they've gotten people trained. And now they're hiring machine learning experts that are coming in. So that's one I think in the healthcare industry, whether it's for drug-testing, clinical-trials, even diagnosis, that's a big one. I do think industrial IT. These are big companies with lots of equipment, they have tons of sensor data, massive data sets. There's a lot of predictions that they can do on that. So that's a second one I would say. Financial industry, they've always been about predictions, so it makes a lot of sense that they continue doing that. Those are the biggest ones for Databricks. But I think now also as slowly, other verticals are moving into the cloud. We'll see more of other use cases as well. But those are the biggest ones I see right now. It's hard to say where it will be ten years from now, or 15. Things are going so fast that it's hard to even predict six months. >> David: Do you believe IOT is going to be a big business driver? >> Yes, absolutely. 
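The real-time pattern described earlier in this conversation (data streams in, anomalies are detected as they arrive, and downstream systems react to the resulting triggers) can be illustrated with a small sketch. This is not Spark Structured Streaming or any Databricks API; it is a plain Python stand-in that assumes a rolling window and a simple standard-deviation rule, and the names `detect_anomalies`, `window`, and `threshold` are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean.

    Assumption: a value is anomalous when it is more than `threshold`
    standard deviations away from the mean of the previous `window` values.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                # In a real pipeline this is where a trigger would be sent
                # to a downstream system.
                yield i, value
        recent.append(value)


# Hypothetical sensor stream with one obvious spike.
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 25.0, 10.0, 9.8]
alerts = list(detect_anomalies(stream))
print(alerts)
```

Because the function is a generator, it processes one reading at a time and never needs the whole data set in memory, which is the property that separates this style from a once-a-week batch load.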
>> I want to circle back where you said that we've got different types of databases but we're going to unify the capabilities. Without saying, it's not like one wins, one loses. >> Ali: Yes, I didn't want to do that. >> So describe maybe the characteristics of what a database that complements Spark really well might look like. >> That's hard for me to say. The capabilities of Spark, I think, are here to stay. The ability to be able to ETL a variety of data that doesn't have structure, so Structured Query Language, SQL, is not fit for it, that is really important and it's going to become more important since data is the new oil, as they said. Well, then it's going to be very important to be able to work with all kinds of data and getting that into the systems. There's more things every day being created. Devices, IOT, whatever it is that are spewing out this data in different forms and shapes. So being able to work with that variety, that's going to be an important property. So they'll have to do that. That's the ETL portion or the ELT portion. The real-time portion, not having to do this in a batch manner once a week because now time is a competitive advantage. So if I'm one week behind you that means I'm going to lose out. So doing that in real-time, or near human-time or human real-time, that's going to be really important. So that's going to come as well, I think, and people will demand that. That's going to be a competitive advantage. Wherever you can add that secret sauce it's going to add value to the customers. And then finally the predictive stuff, adding the predictive stuff. But I think people will want to continue to also do all the old stuff they've been doing. I don't think that's going to go away. Those bring value to customers, they want to do all those traditional use cases as well. >> So what about now where customers expect to have some, not clear how much, on-prem application platform like Spark. 
Some in the cloud, now that you've totally reordered the TCO equation. But then also at the edge for IOT-type use cases, do you have to slim down Spark to work at the edge? If you have serverless working in the cloud, does that mean you have to change the management paradigm on-prem? What does that mix look like? How does someone, you know, how does a Fortune 200 company, get their arms around that? >> Ali: Yeah, this is a surprising thing, most surprising thing for me in the last year, is how many of those Fortune 200s that I was talking to three years ago and they were saying 'no way, we're not going into the cloud. You don't understand the regulations that we are facing or the amount of data that we have.' Or 'we can do it better,' or 'the security requirements that we have, no one can match that.' To now, those very same companies are saying 'absolutely, we're going.' It's not about if, it's about when. Now I would be hard-pressed to find any enterprise that says 'no, we're not going to go, ever.' And some companies we've even seen go from the cloud to on-prem, and then now back. Because the prices are getting more competitive in the cloud. Because now there's three, at least, major players that are competing and they're well-funded companies. In some sense, you have ad money and office money and retail money being thrown at this problem. Prices are getting competitive. Very soon, most IT folks will realize, there's no way we can do this faster, or better, or more reliable, or more secure ourselves. >> David: We've got just a minute to go here before the break so we're going to kind of wrap it up here. And we got over 3000 people here at Spark Summit so it's the Spark community. I want you to talk to them for a moment. What problems do you want them to work on the most? And what are we going to be talking about a year from now at this table? >> The second one is harder. So I think the Spark community is doing a phenomenal job. I'm not going to tell them what to do. 
They should continue doing what they are doing already which is integrating Spark in the ecosystem, adding more and more integrations with the greatest technologies that are happening out there. Continue the innovation and we're super happy to have them here. We'll continue it as well, we'll continue to host this event and look forward to also having a Spark Summit in Europe, and also the East Coast soon. >> David: Okay, so I'm not going to ask you to make any more predictions. >> Alright, excellent. >> David: Ali, this is great stuff today. Thank you so much for taking some time and giving us more insight after the keynote this morning. Good luck with the rest of the show. >> Thank you. >> Thanks, Ali. And thank you for watching. That's Ali Ghodsi, CEO of Databricks. We are at Spark Summit 2017 here, on the Cube. Thanks for watching, stay with us. (upbeat music)
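In the interview above, Ali describes exporting a trained model so that a SQL analyst can score customers with "a simple function call." The sketch below illustrates that idea only; it does not use Spark SQL or any Databricks export feature. It assumes Python's built-in `sqlite3` module as a stand-in SQL engine, and the `predict_churn` function, its hand-picked weights, and the `customers` table are all hypothetical.

```python
import math
import sqlite3

# Hypothetical, hand-picked weights standing in for a trained churn model.
WEIGHTS = {"bias": -2.0, "days_inactive": 0.05, "support_tickets": 0.4}


def predict_churn(days_inactive, support_tickets):
    """Return a logistic score in [0, 1]; higher means more likely to churn."""
    z = (WEIGHTS["bias"]
         + WEIGHTS["days_inactive"] * days_inactive
         + WEIGHTS["support_tickets"] * support_tickets)
    return 1.0 / (1.0 + math.exp(-z))


conn = sqlite3.connect(":memory:")
# Register the model as a two-argument SQL function.
conn.create_function("predict_churn", 2, predict_churn)
conn.execute(
    "CREATE TABLE customers "
    "(name TEXT, days_inactive INTEGER, support_tickets INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("acme", 2, 0), ("globex", 60, 3), ("initech", 30, 1)],
)

# The analyst's side: plain SQL, no machine learning knowledge required.
rows = conn.execute(
    "SELECT name, predict_churn(days_inactive, support_tickets) AS risk "
    "FROM customers ORDER BY risk DESC"
).fetchall()
at_risk = [name for name, risk in rows if risk > 0.5]
print(at_risk)
```

The design point is the separation of roles: whoever builds the model registers it once, and everyone else asks "tell me about tomorrow" in the same query language they already use for "tell me about yesterday."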

Published Date : Jun 8 2017

SUMMARY :

Brought to you by Databricks. We have the CEO from Databricks, Ali Ghodsi, joining us. the keynote this morning with Databricks and you delivered. that you couldn't have before. But after the keynote, did you Yes the keynote was recently so on my way here Recap that real quickly for our audience here, and server-less pools you can just have into a bigger perspective, which is if you go back So from the very beginning, So that's the traditional interactive use case, But generating the prediction model is coming from you and the announcement yesterday showed you also and so you can either build better models It's been a sequel that the language that is used Where and why do you see that distinction? and then you can on that, do predictions, what do you think are going to be It's hard to say where it will be ten years from now, or 15. I want to circle back where you said So describe maybe the characteristics of what a database and getting that into the systems. does that mean you have to change or the amount of data that we have.' I want you to talk to them for a moment. and also the East Coast soon. David: Okay, so I'm not going to ask you Thank you so much for taking some time And thank you for watching.

ENTITIES

Entity	Category	Confidence
George	PERSON	0.99+
David	PERSON	0.99+
HP	ORGANIZATION	0.99+
Ali Ghodsi	PERSON	0.99+
Europe	LOCATION	0.99+
Ali	PERSON	0.99+
Databricks	ORGANIZATION	0.99+
San Francisco	LOCATION	0.99+
Capital One	ORGANIZATION	0.99+
three	QUANTITY	0.99+
Today	DATE	0.99+
one week	QUANTITY	0.99+
tomorrow	DATE	0.99+
last year	DATE	0.99+
ten years	QUANTITY	0.99+
yesterday	DATE	0.99+
three years	QUANTITY	0.99+
3000 people	QUANTITY	0.99+
One	QUANTITY	0.99+
ten minutes	QUANTITY	0.99+
Four years ago	DATE	0.99+
three years ago	DATE	0.99+
next decade	DATE	0.99+
six months	QUANTITY	0.99+
Yesterday	DATE	0.98+
over 1000 people	QUANTITY	0.98+
East Coast	LOCATION	0.98+
today	DATE	0.98+
one	QUANTITY	0.98+
one prediction	QUANTITY	0.98+
second prediction	QUANTITY	0.98+
Silicon Valley	LOCATION	0.97+
Spark Summit 2017	EVENT	0.97+
Spark	TITLE	0.97+
once a week	QUANTITY	0.97+
Sparks Summit	EVENT	0.97+
Fortune 200	ORGANIZATION	0.96+
over 3000 people	QUANTITY	0.96+
about a year and a half	QUANTITY	0.95+
Shell	ORGANIZATION	0.95+
Spark	ORGANIZATION	0.95+
Sparks	TITLE	0.94+
IOT	ORGANIZATION	0.94+
day two	QUANTITY	0.94+
Sparks Summit 2017	EVENT	0.94+
this morning	DATE	0.93+
second one	QUANTITY	0.93+
S3	TITLE	0.85+
one data scientist	QUANTITY	0.85+
15	QUANTITY	0.85+
Saturday morning at	DATE	0.84+
tons	QUANTITY	0.83+
S3	ORGANIZATION	0.8+
one piece of the puzzle	QUANTITY	0.79+
couple people	QUANTITY	0.77+
Prim	ORGANIZATION	0.76+
several years	QUANTITY	0.75+

Breaking Analysis: Databricks faces critical strategic decisions…here’s why

>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. >> Spark became a top level Apache project in 2014, and then shortly thereafter, burst onto the big data scene. Spark, along with the cloud, transformed and in many ways, disrupted the big data market. Databricks optimized its tech stack for Spark and took advantage of the cloud to really cleverly deliver a managed service that has become a leading AI and data platform among data scientists and data engineers. However, emerging customer data requirements are shifting into a direction that will cause modern data platform players generally and Databricks, specifically, we think, to make some key directional decisions and perhaps even reinvent themselves. Hello and welcome to this week's Wikibon theCUBE Insights, powered by ETR. In this Breaking Analysis, we're going to do a deep dive into Databricks. We'll explore its current impressive market momentum. We're going to use some ETR survey data to show that, and then we'll lay out how customer data requirements are changing and what the ideal data platform will look like in the midterm future. We'll then evaluate core elements of the Databricks portfolio against that vision, and then we'll close with some strategic decisions that we think the company faces. And to do so, we welcome in our good friend, George Gilbert, former equities analyst, market analyst, and current Principal at TechAlpha Partners. George, good to see you. Thanks for coming on. >> Good to see you, Dave. >> All right, let me set this up. We're going to start by taking a look at where Databricks sits in the market in terms of how customers perceive the company and what its momentum looks like. And this chart that we're showing here is data from ETS, the emerging technology survey of private companies. The N is 1,421. 
What we did is we cut the data on three sectors, analytics, database-data warehouse, and AI/ML. The vertical axis is a measure of customer sentiment, which evaluates an IT decision maker's awareness of the firm and the likelihood of engaging and/or purchase intent. The horizontal axis shows mindshare in the dataset, and we've highlighted Databricks, which has been a consistent high performer in this survey over the last several quarters. And, by the way, just as an aside, as we previously reported, OpenAI, which burst onto the scene this past quarter, leads all names, but Databricks is still prominent. You can see that the ETR shows some open source tools for reference, but as far as firms go, Databricks is very impressively positioned. Now, let's see how they stack up to some mainstream cohorts in the data space, against some bigger companies and sometimes public companies. This chart shows net score on the vertical axis, which is a measure of spending momentum, and pervasiveness in the data set is on the horizontal axis. You can see that chart insert in the upper right, that informs how the dots are plotted, and net score against shared N. And that red dotted line at 40% indicates a highly elevated net score, anything above that we think is really, really impressive. And here we're just comparing Databricks with Snowflake, Cloudera, and Oracle. And that squiggly line leading to Databricks shows their path since 2021 by quarter. And you can see it's performing extremely well, maintaining an elevated net score in that range. Now it's comparable in the vertical axis to Snowflake, and it consistently is moving to the right and gaining share. Now, why did we choose to show Cloudera and Oracle? The reason is that Cloudera got the whole big data era started and was disrupted by Spark and, of course, the cloud, Spark and Databricks. And Oracle, in many ways, was the target of early big data players like Cloudera. Take a listen to Cloudera CEO at the time, Mike Olson. 
This is back in 2010, first year of theCUBE, play the clip. >> Look, back in the day, if you had a data problem, if you needed to run business analytics, you wrote the biggest check you could to Sun Microsystems, and you bought a great big, single box, central server, and any money that was left over, you handed to Oracle for database licenses and you installed that database on that box, and that was where you went for data. That was your temple of information. >> Okay? So Mike Olson implied that monolithic model was too expensive and inflexible, and Cloudera set out to fix that. But the best laid plans, as they say, George, what do you make of the data that we just shared? >> So where Databricks has really come up out of sort of Cloudera's tailpipe was they took big data processing, made it coherent, made it a managed service so it could run in the cloud. So it relieved customers of the operational burden. Where they're really strong and where their traditional meat and potatoes or bread and butter is the predictive and prescriptive analytics, that's building and training and serving machine learning models. They've tried to move into traditional business intelligence, the more traditional descriptive and diagnostic analytics, but they're less mature there. So what that means is, the reason you see Databricks and Snowflake kind of side by side is there are many, many accounts that have both Snowflake for business intelligence, Databricks for AI machine learning, where Snowflake, I'm sorry, where Databricks also did really well was in core data engineering, refining the data, the old ETL process, which kind of turned into ELT, where you load it into the analytic repository in raw form and refine it. And so people have really used both, and each is trying to get into the other. >> Yeah, absolutely. We've reported on this quite a bit. Snowflake, kind of moving into the domain of Databricks and vice versa. 
And the last bit of ETR evidence that we want to share in terms of the company's momentum comes from ETR's Round Tables. They're run by Erik Bradley, and now former Gartner analyst and George, your colleague back at Gartner, Daren Brabham. And what we're going to show here is some direct quotes of IT pros in those Round Tables. There's a data science head and a CIO as well. Just make a few call outs here, we won't spend too much time on it, but starting at the top, like all of us, we can't talk about Databricks without mentioning Snowflake. Those two get us excited. Second comment zeros in on the flexibility and the robustness of Databricks from a data warehouse perspective. And then the last point is, despite competition from cloud players, Databricks has reinvented itself a couple of times over the years. And George, we're going to lay out today a scenario that perhaps calls for Databricks to do that once again. >> Their big opportunity and their big challenge, as for every tech company, is managing a technology transition. The transition that we're talking about is something that's been bubbling up, but it's really epochal. First time in 60 years, we're moving from an application-centric view of the world to a data-centric view, because decisions are becoming more important than automating processes. So let me let you sort of develop. >> Yeah, so let's talk about that here. We're going to put up some bullets on precisely that point and the changing sort of customer environment. So IT stacks are shifting, as George just said, from application-centric silos to data-centric stacks where the priority is shifting from automating processes to automating decisions. You know, look at RPA and there's still a lot of automation going on, but from the focus of that application centricity and the data locked into those apps, that's changing. 
Data has historically been on the outskirts in silos, but organizations, you think of Amazon, think Uber, Airbnb, they're putting data at the core, and logic is increasingly being embedded in the data instead of the reverse. In other words, today, the data's locked inside the app, which is why you need to extract that data and stick it into a data warehouse. The point, George, is we're putting forth this new vision for how data is going to be used. And you've used this Uber example to underscore the future state. Please explain? >> Okay, so this is hopefully an example everyone can relate to. The idea is first, you're automating things that are happening in the real world and decisions that make those things happen autonomously without humans in the loop all the time. So to use the Uber example on your phone, you call a car, you call a driver. Automatically, the Uber app then looks at what drivers are in the vicinity, what drivers are free, matches one, calculates an ETA to you, calculates a price, calculates an ETA to your destination, and then directs the driver once they're there. The point of this is that that cannot happen in an application-centric world very easily because all these little apps, the drivers, the riders, the routes, the fares, those call on data locked up in many different apps, but they have to sit on a layer that makes it all coherent. >> But George, so if Uber's doing this, doesn't this tech already exist? Isn't there a tech platform that does this already? >> Yes, and the mission of the entire tech industry is to build services that make it possible to compose and operate similar platforms and tools, but with the skills of mainstream developers in mainstream corporations, not the rocket scientists at Uber and Amazon. >> Okay, so we're talking about horizontally scaling across the industry, and actually giving a lot more organizations access to this technology. 
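The sequence George walks through in the Uber example (find free drivers nearby, match one, compute an ETA to the rider, a price, and an ETA to the destination) can be sketched in a few lines once driver, rider, and fare data sit in one coherent layer. This is an illustration only and says nothing about how Uber actually works: the `Driver` type, the straight-line distances, the constant speed, and the fare parameters are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Driver:
    name: str
    x_km: float
    y_km: float
    free: bool


def match_ride(drivers, rider_xy, dest_xy,
               speed_kmh=30.0, base_fare=2.0, per_km=1.5):
    """Pick the nearest free driver and compute ETAs and a price.

    Assumptions: straight-line distance, constant speed, linear fare.
    Returns None when no driver is free.
    """
    free = [d for d in drivers if d.free]
    if not free:
        return None
    nearest = min(free, key=lambda d: math.dist((d.x_km, d.y_km), rider_xy))
    pickup_km = math.dist((nearest.x_km, nearest.y_km), rider_xy)
    trip_km = math.dist(rider_xy, dest_xy)
    return {
        "driver": nearest.name,
        "pickup_eta_min": pickup_km * 60 / speed_kmh,
        "trip_eta_min": trip_km * 60 / speed_kmh,
        "price": base_fare + per_km * trip_km,
    }


# Hypothetical fleet: ben is closest but busy, so ana gets the ride.
fleet = [
    Driver("ana", 0.0, 3.0, True),
    Driver("ben", 1.0, 0.0, False),
    Driver("cy", 4.0, 0.0, True),
]
ride = match_ride(fleet, rider_xy=(0.0, 0.0), dest_xy=(3.0, 4.0))
print(ride)
```

The function is short precisely because every input it needs is available in one place with one meaning; in an application-centric world each of those inputs would live in a separate app with its own schema.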
So by way of review, let's summarize the trend that's going on today in terms of the modern data stack that is propelling the likes of Databricks and Snowflake, which we just showed you in the ETR data and is really a tailwind for them. So the trend is toward this common repository for analytic data, that could be multiple virtual data warehouses inside of Snowflake, but you're in that Snowflake environment or Lakehouses from Databricks or multiple data lakes. And we've talked about what JP Morgan Chase is doing with the data mesh and gluing data lakes together, you've got various public clouds playing in this game, and then the data is annotated to have a common meaning. In other words, there's a semantic layer that enables applications to talk to the data elements and know that they have common and coherent meaning. So George, the good news is this approach is more effective than the legacy monolithic models that Mike Olson was talking about, so what's the problem with this in your view? >> So today's data platforms added immense value 'cause they connected the data that was previously locked up in these monolithic apps or on all these different microservices, and that supported traditional BI and AI/ML use cases. But now if we want to build apps like Uber or Amazon.com, where they've got essentially an autonomously running supply chain and e-commerce app where humans only care and feed it. But the thing is figuring out what to buy, when to buy, where to deploy it, when to ship it. We needed a semantic layer on top of the data. So that, as you were saying, the data that's coming from all those apps, the different apps, is integrated, not just connected, but it means the same. And the issue is whenever you add a new layer to a stack to support new applications, there are implications for the already existing layers, like can they support the new layer and its use cases?
So for instance, if you add a semantic layer that embeds app logic with the data rather than vice versa, which we've been talking about and that's been the case for 60 years, then the new data layer faces challenges that the way you manage that data, the way you analyze that data, is not supported by today's tools. >> Okay, so actually Alex, bring me up that last slide if you would, I mean, you're basically saying at the bottom here, today's repositories don't really do joins at scale. The future is you're talking about hundreds or thousands or millions of data connections, and today's systems, we're talking about, I don't know, 6, 8, 10 joins and that is the fundamental problem you're saying, is a new data era coming and existing systems won't be able to handle it? >> Yeah, one way of thinking about it is that even though we call them relational databases, when we actually want to do lots of joins or when we want to analyze data from lots of different tables, we created a whole new industry for analytic databases where you sort of mung the data together into fewer tables. So you didn't have to do as many joins because the joins are difficult and slow. And when you're going to arbitrarily join thousands, hundreds of thousands or across millions of elements, you need a new type of database. We have them, they're called graph databases, but to query them, you go back to the prerelational era in terms of their usability. >> Okay, so we're going to come back to that and talk about how you get around that problem. But let's first lay out what the ideal data platform of the future we think looks like. And again, we're going to come back to use this Uber example. In this graphic that George put together, awesome. We got three layers. The application layer is where the data products reside. The example here is drivers, rides, maps, routes, ETA, et cetera. The digital version of what we were talking about in the previous slide, people, places and things.
The next layer is the data layer, that breaks down the silos and connects the data elements through semantics and everything is coherent. And then the bottom layer, the legacy operational systems, feeds that data layer. George, explain what's different here, the graph database element, you talk about the relational query capabilities, and why can't I just throw memory at solving this problem? >> Some of the graph databases do throw memory at the problem and maybe without naming names, some of them live entirely in memory. And what you're dealing with is a prerelational in-memory database system where you navigate between elements, and the issue with that is we've had SQL for 50 years, so we don't have to navigate, we can say what we want without how to get it. That's the core of the problem. >> Okay. So if I may, I just want to drill into this a little bit. So you're talking about the expressiveness of a graph. Alex, if you'd bring that back out, the fourth bullet, expressiveness of a graph database with the relational ease of query. Can you explain what you mean by that? >> Yeah, so graphs are great because you can describe anything with a graph, that's why they're becoming so popular. Expressive means you can represent anything easily. They're conducive to, you might say, in a world where we now want like the metaverse, like with a 3D world, and I don't mean the Facebook metaverse, I mean like the business metaverse when we want to capture data about everything, but we want it in context, we want to build a set of digital twins that represent everything going on in the world. And Uber is a tiny example of that. Uber built a graph to represent all the drivers and riders and maps and routes. But what you need out of a database isn't just a way to store stuff and update stuff. You need to be able to ask questions of it, you need to be able to query it. And if you go back to prerelational days, you had to know how to find your way to the data.
It's sort of like when you give directions to someone and they didn't have a GPS system and a mapping system, you had to give them turn by turn directions. Whereas when you have a GPS and a mapping system, which is like the relational thing, you just say where you want to go, and it spits out the turn by turn directions, which let's say, the car might follow or whoever you're directing would follow. But the point is, it's much easier in a relational database to say, "I just want to get these results. You figure out how to get it." The graph database, they have not taken over the world because in some ways, it's taking a 50 year leap backwards. >> Alright, got it. Okay. Let's take a look at how the current Databricks offerings map to that ideal state that we just laid out. So to do that, we put together this chart that looks at the key elements of the Databricks portfolio, the core capability, the weakness, and the threat that may loom. Start with the Delta Lake, that's the storage layer, which is great for files and tables. It's got true separation of compute and storage, I want you to double click on that George, as independent elements, but it's weaker for the type of low latency ingest that we see coming in the future. And some of the threats highlighted here. AWS could add transactional tables to S3, Iceberg adoption is picking up and could accelerate, that could disrupt Databricks. George, add some color here please? >> Okay, so this is the sort of a classic competitive forces where you want to look at, so what are customers demanding? What's competitive pressure? What are substitutes? Even what your suppliers might be pushing. Here, Delta Lake is at its core, a set of transactional tables that sit on an object store. So think of it in a database system, this is the storage engine. So since S3 has been getting stronger for 15 years, you could see a scenario where they add transactional tables. 
We have an open source alternative in Iceberg, which Snowflake and others support. But at the same time, Databricks has built an ecosystem out of tools, their own and others, that read and write to Delta tables, that's what makes the Delta Lake an ecosystem. So they have a catalog, the whole machine learning tool chain talks directly to the data here. That was their great advantage because in the past with Snowflake, you had to pull all the data out of the database before the machine learning tools could work with it, that was a major shortcoming. They fixed that. But the point here is that even before we get to the semantic layer, the core foundation is under threat. >> Yep. Got it. Okay. We got a lot of ground to cover. So we're going to take a look at the Spark Execution Engine next. Think of that as the refinery that runs really efficient batch processing. That's kind of what disrupted Hadoop in a large way, but it's not Python friendly and that's an issue because the data science and the data engineering crowd are moving in that direction, and/or they're using DBT. George, we had Tristan Handy on at Supercloud, really interesting discussion that you and I did. Explain why this is an issue for Databricks? >> So once the data lake was in place, what people did was they refined their data batch, and Spark has always had streaming support and it's gotten better. The underlying storage as we've talked about is an issue. But basically they took raw data, then they refined it into tables that were like customers and products and partners. And then they refined that again into what was like gold artifacts, which might be business intelligence metrics or dashboards, which were collections of metrics. But they were running it on the Spark Execution Engine, which it's a Java-based engine or it's running on a Java-based virtual machine, which means all the data scientists and the data engineers who want to work with Python are really working in sort of oil and water.
Like if you get an error in Python, you can't tell whether the problem's in Python or whether it's in Spark. There's just an impedance mismatch between the two. And then at the same time, the whole world is now gravitating towards DBT because it's a very nice and simple way to compose these data processing pipelines, and people are using either SQL in DBT or Python in DBT, and that kind of is a substitute for doing it all in Spark. So it's under threat even before we get to that semantic layer, it so happens that DBT itself is becoming the authoring environment for the semantic layer with business intelligence metrics. But that's again, this is the second element that's under direct substitution and competitive threat. >> Okay, let's now move down to the third element, which is the Photon. Photon is Databricks' BI Lakehouse, which has integration with the Databricks tooling, which is very rich, it's newer. And it's also not well suited for high concurrency and low latency use cases, which we think are going to increasingly become the norm over time. George, the call out threat here is customers want to connect everything to a semantic layer. Explain your thinking here and why this is a potential threat to Databricks? >> Okay, so two issues here. What you were touching on, which is the high concurrency, low latency, when people are running like thousands of dashboards and data is streaming in, that's a problem because SQL data warehouse, the query engine, something like that matures over five to 10 years. It's one of these things, the joke that Andy Jassy makes just in general, he's really talking about Azure, but there's no compression algorithm for experience. The Snowflake guys started more than five years earlier, and for a bunch of reasons, that lead is not something that Databricks can shrink. They'll always be behind. So that's why Snowflake has transactional tables now and we can get into that in another show.
But the key point is, so near term, it's struggling to keep up with the use cases that are core to business intelligence, which is highly concurrent, lots of users doing interactive query. But then when you get to a semantic layer, that's when you need to be able to query data that might have thousands or tens of thousands or hundreds of thousands of joins. And that's a SQL query engine, traditional SQL query engine is just not built for that. That's the core problem of traditional relational databases. >> Now this is a quick aside. We always talk about Snowflake and Databricks in sort of the same context. We're not necessarily saying that Snowflake is in a position to tackle all these problems. We'll deal with that separately. So we don't mean to imply that, but we're just sort of laying out some of the things that Snowflake or rather Databricks customers we think, need to be thinking about and having conversations with Databricks about and we hope to have them as well. We'll come back to that in terms of sort of strategic options. But finally, when we come back to the table, we have Databricks' AI/ML Tool Chain, which has been an awesome capability for the data science crowd. It's comprehensive, it's a one-stop shop solution, but the kicker here is that it's optimized for supervised model building. And the concern is that foundational models like GPT could cannibalize the current Databricks tooling, but George, can't Databricks, like other software companies, integrate foundation model capabilities into its platform? >> Okay, so the sound bite answer to that is sure, IBM 3270 terminals could call out to a graphical user interface when they're running on the XT terminal, but they're not exactly good citizens in that world. The core issue is Databricks has this wonderful end-to-end tool chain for training, deploying, monitoring, running inference on supervised models.
But the paradigm there is the customer builds and trains and deploys each model for each feature or application. In a world of foundation models which are pre-trained and unsupervised, the entire tool chain is different. So it's not like Databricks can junk everything they've done and start over with all their engineers. They have to keep maintaining what they've done in the old world, but they have to build something new that's optimized for the new world. It's a classic technology transition and their mentality appears to be, "Oh, we'll support the new stuff from our old stuff." Which is suboptimal, and as we'll talk about, their biggest patron and the company that put them on the map, Microsoft, really stopped working on their old stuff three years ago so that they could build a new tool chain optimized for this new world. >> Yeah, and so let's sort of close with what we think the options are and decisions that Databricks has for its future architecture. They're smart people. I mean we've had Ali Ghodsi on many times, super impressive. I think they've got to be keenly aware of the limitations, what's going on with foundation models. But at any rate, here in this chart, we lay out sort of three scenarios. One is re-architect the platform by incrementally adopting new technologies. An example might be to layer a graph query engine on top of its stack. They could license key technologies like graph database, they could get aggressive on M&A and buy in relational knowledge graphs, semantic technologies, vector database technologies. George, as David Floyer always says, "A lot of ways to skin a cat." We've seen companies like, even think about EMC maintained its relevance through M&A for many, many years. George, give us your thought on each of these strategic options? >> Okay, I find this question the most challenging 'cause remember, I used to be an equity research analyst.
I worked for Frank Quattrone, we were one of the top tech shops in the banking industry, although this is 20 years ago. But the M&A team was the top team in the industry and everyone wanted them on their side. And I remember going to meetings with these CEOs, where Frank and the bankers would say, "You want us for your M&A work because we can do better." And they really could do better. But in software, it's not like with EMC in hardware because with hardware, it's easier to connect different boxes. With software, the whole point of a software company is to integrate and architect the components so they fit together and reinforce each other, and that makes M&A harder. You can do it, but it takes a long time to fit the pieces together. Let me give you examples. If they put a graph query engine, let's say something like TinkerPop, on top of, I don't even know if it's possible, but let's say they put it on top of Delta Lake, then you have this graph query engine talking to their storage layer, Delta Lake. But if you want to do analysis, you got to put the data in Photon, which is not really ideal for highly connected data. If you license a graph database, then most of your data is in the Delta Lake and how do you sync it with the graph database? If you do sync it, you've got data in two places, which kind of defeats the purpose of having a unified repository. I find this semantic layer option in number three actually more promising, because that's something that you can layer on top of the storage layer that you have already. You just have to figure out then how to have your query engines talk to that. What I'm trying to highlight is, it's easy as an analyst to say, "You can buy this company or license that technology." But the really hard work is making it all work together and that is where the challenge is. >> Yeah, and well look, I thank you for laying that out. We've seen it, certainly Microsoft and Oracle. 
I guess you might argue that well, Microsoft had a monopoly in its desktop software and was able to throw off cash for a decade plus while its stock was going sideways. Oracle had won the database wars and had amazing margins and cash flow to be able to do that. Databricks hasn't even gone public yet, but I want to close with some of the players to watch. Alex, if you'd bring that back up, number four here. AWS, we talked about some of their options with S3 and it's not just AWS, it's blob storage, object storage. Microsoft, as you sort of alluded to, was an early go-to market channel for Databricks. We didn't address that really. So maybe in the closing comments we can. Google obviously, Snowflake of course, we're going to dissect their options in future Breaking Analysis. dbt Labs, where do they fit? Bob Muglia's company, Relational.ai, why are these players to watch George, in your opinion? >> So everyone is trying to assemble and integrate the pieces that would make building data applications, data products easy. And the critical part isn't just assembling a bunch of pieces, which is traditionally what AWS did. It's a Unix ethos, which is we give you the tools, you put 'em together, 'cause you then have the maximum choice and maximum power. So what the hyperscalers are doing is they're taking their key value stores, in the case of AWS it's DynamoDB, in the case of Azure it's Cosmos DB, and each are putting a graph query engine on top of those. So they have a unified storage and graph database engine, like all the data would be collected in the key value store. Then you have a graph database, that's how they're going to be presenting a foundation for building these data apps. dbt Labs is putting a semantic layer on top of data lakes and data warehouses and as we'll talk about, I'm sure in the future, that makes it easier to swap out the underlying data platform or swap in new ones for specialized use cases.
Snowflake, what they're doing, they're so strong in data management and with their transactional tables, what they're trying to do is take in the operational data that used to be in the province of many state stores like MongoDB and say, "If you manage that data with us, it'll be connected to your analytic data without having to send it through a pipeline." And that's hugely valuable. Relational.ai is the wildcard, 'cause what they're trying to do, it's almost like a holy grail where you're trying to take the expressiveness of connecting all your data in a graph but making it as easy to query as you've always had it in a SQL database or I should say, in a relational database. And if they do that, it's sort of like, it'll be as easy to program these data apps as a spreadsheet was compared to procedural languages, like BASIC or Pascal. That's the implications of Relational.ai. >> Yeah, and again, we talked before, why can't you just throw this all in memory? We're talking in that example of really getting down to differences in how you lay the data out on disk in really, new database architecture, correct? >> Yes. And that's why it's not clear that you could take a data lake or even a Snowflake and why you can't put a relational knowledge graph on those. You could potentially put a graph database, but it'll be compromised because to really do what Relational.ai has done, which is the ease of Relational on top of the power of graph, you actually need to change how you're storing your data on disk or even in memory. So you can't, in other words, it's not like, oh we can add graph support to Snowflake, 'cause if you did that, you'd have to change, or in your data lake, you'd have to change how the data is physically laid out. And then that would break all the tools that talk to that currently. >> What in your estimation, is the timeframe where this becomes critical for a Databricks and potentially Snowflake and others? 
I mentioned earlier midterm, are we talking three to five years here? Are we talking end of decade? What's your radar say? >> I think something surprising is going on that's going to sort of come up the tailpipe and take everyone by storm. All the hype around business intelligence metrics, which is what we used to put in our dashboards where bookings, billings, revenue, customer, those things, those were the key artifacts that used to live in definitions in your BI tools, and DBT has basically created a standard for defining those so they live in your data pipeline or they're defined in their data pipeline and executed in the data warehouse or data lake in a shared way, so that all tools can use them. This sounds like a digression, it's not. All this stuff about data mesh, data fabric, all that's going on is we need a semantic layer and the business intelligence metrics are defining common semantics for your data. And I think we're going to find by the end of this year, that metrics are how we annotate all our analytic data to start adding common semantics to it. And we're going to find this semantic layer, it's not three to five years off, it's going to be staring us in the face by the end of this year. >> Interesting. And of course SVB today was shut down. We're seeing serious tech headwinds, and oftentimes in these sort of downturns or flat turns, which feels like this could be going on for a while, we emerge with a lot of new players and a lot of new technology. George, we got to leave it there. Thank you to George Gilbert for excellent insights and input for today's episode. I want to thank Alex Myerson who's on production and manages the podcast, of course Ken Schiffman as well. Kristin Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our EIC over at Siliconangle.com, he does some great editing. Remember all these episodes, they're available as podcasts. 
Wherever you listen, all you got to do is search Breaking Analysis Podcast, we publish each week on wikibon.com and siliconangle.com, or you can email me at David.Vellante@siliconangle.com, or DM me @DVellante. Comment on our LinkedIn post, and please do check out ETR.ai, great survey data, enterprise tech focus, phenomenal. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching, and we'll see you next time on Breaking Analysis.

Published Date : Mar 10 2023

ENTITIES

Entity	Category	Confidence
Alex Myerson	PERSON	0.99+
David Floyer	PERSON	0.99+
Mike Olson	PERSON	0.99+
2014	DATE	0.99+
George Gilbert	PERSON	0.99+
Dave Vellante	PERSON	0.99+
George	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
Ken Schiffman	PERSON	0.99+
Andy Jassy	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Erik Bradley	PERSON	0.99+
Dave	PERSON	0.99+
Uber	ORGANIZATION	0.99+
thousands	QUANTITY	0.99+
Sun Microsystems	ORGANIZATION	0.99+
50 years	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Bob Muglia	PERSON	0.99+
Gartner	ORGANIZATION	0.99+
Airbnb	ORGANIZATION	0.99+
60 years	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
Ali Ghodsi	PERSON	0.99+
2010	DATE	0.99+
Databricks	ORGANIZATION	0.99+
Kristin Martin	PERSON	0.99+
Rob Hof	PERSON	0.99+
three	QUANTITY	0.99+
15 years	QUANTITY	0.99+
Databricks'	ORGANIZATION	0.99+
two places	QUANTITY	0.99+
Boston	LOCATION	0.99+
Tristan Handy	PERSON	0.99+
M&A	ORGANIZATION	0.99+
Frank Quattrone	PERSON	0.99+
second element	QUANTITY	0.99+
Daren Brabham	PERSON	0.99+
TechAlpha Partners	ORGANIZATION	0.99+
third element	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+
50 year	QUANTITY	0.99+
40%	QUANTITY	0.99+
Cloudera	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
five years	QUANTITY	0.99+

SiliconANGLE Report: Reporters Notebook with Adrian Cockcroft | AWS re:Invent 2022

(soft techno upbeat music) >> Hi there. Welcome back to Las Vegas. This is Dave Vellante with Paul Gillin. Reinvent day one and a half. We started last night, Monday, theCUBE after dark. Now we're going wall to wall. Today. Today was of course the big keynote, Adam Selipsky, kind of the baton now handing, you know, last year when he did his keynote, he was very new. He was sort of still getting his feet wet and finding his guru swing. Settling in a little bit more this year, learning a lot more, getting deeper into the tech, but of course, sharing the love with other leaders like Peter DeSantis. Tomorrow's going to be Swamy in the keynote. Adrian Cockcroft is here. Former AWS, former Netflix CTO, currently an analyst. You got your own firm now. You're out there. Great to see you again. Thanks for coming on theCUBE. >> Yeah, thanks. >> We heard you on at Super Cloud, you gave some really good insights there back in August. So now as an outsider, you come in obviously, you got to be impressed with the size and the ecosystem and the energy. Of course. What were your thoughts on, you know what you've seen so far, today's keynotes, last night Peter DeSantis, what stood out to you? >> Yeah, I think it's great to be back at Reinvent again. We're kind of pretty much back to where we were before the pandemic sort of shut it down. This is a little, it's almost as big as the, the largest one that we had before. And everyone's turned up. It just feels like we're back. So that's really good to see. And it's a slightly different style. I think there was more sort of video production things happening. I think in this keynote, more storytelling. I'm not sure it really all stitched together very well. Right. Some of the stories like, how does that follow that? So there were a few things there, and there were some spelling mistakes on the slides, you know, ELT instead of ETL, and they spelled ZFS wrong or something.
So it just seemed like there was, I'm not quite sure just maybe a few things were sort of rushed at the last minute. >> Not really AWS like, was it? It kind of reminds me of the Patriots, Paul, you know Bill Belichick's teams are fumbling all over the place. >> That's right. That's right. >> Part of it may be, I mean the sort of the market. They have a leader in marketing right now but they're going to have a CMO. So that's sort of maybe a lack of a single threaded leader for this thing. Everything's being shared around a bit more. So maybe, I mean, it's all fixable and it's minor. This is minor stuff. I'm just sort of looking at it and going there's a few things that looked like they were not quite as good as they could have been in the way it was put together. Right? >> But I mean, you're taking a, you know a year of not doing Reinvent. Yeah. Being isolated. You know, we've certainly seen it with theCUBE. It's like, okay, it's not like riding a bike. You know, things that, you know you got to kind of relearn the muscle memories. It's more like golf than is bicycle riding. >> Well I've done AWS keynotes myself. And they are pretty much scrambled. It looks nice, but there's a lot of scrambling leading up to when it actually goes. Right? And sometimes you can, you sometimes see a little kind of the edges of that, and sometimes it's much more polished. But you know, overall it's pretty good. I think Peter DeSantis keynote yesterday was a lot of really good meat there. There was some nice presentations, and some great announcements there. And today I was, I thought I was a little disappointed with some of the, I thought they could have been more. I think the way Andy Jassy did it, he crammed more announcements into his keynote, and Adam seems to be taking sort of a bit more of a measured approach. There were a few things he picked up on and then I'm expecting more to be spread throughout the rest of the day. >> This was more poetic. Right?
He took the universe as the analogy for data, the ocean for security. Right? The Antarctic was sort of. >> Yeah. It looked pretty, >> yeah. >> But I'm not sure that was like, we're not here really to watch nature videos >> As analysts and journalists, You're like, come on. >> Yeah, >> Give us the meat >> That was kind of the thing, yeah, >> AWS has always been, Reinvent has always been, a shock-and-awe approach. 100, 150 announcements. And they're really, that kind of pressure seems to be off them now. Their position at the top of the market seems to be unshakeable. There's no clear competition that's creeping up behind them. So how does that affect the messaging you think that AWS brings to market when it doesn't really have to prove that it's a leader anymore? It can go after maybe more of the niche markets or fix the stuff that's a little broken more fine tuning than grandiose statements. >> I think so AWS for a long time was so far out that they basically said, "We don't think about the competition, we listen to the customers." And that was always the statement that works as long as you're always in the lead, right? Because you are introducing the new idea to the customer. Nobody else got there first. So that was the case. But in a few areas they aren't leading. Right? You could argue in machine learning, not necessarily leading. In sustainability, they're not leading, and they don't want to talk about some of these areas and-- >> Database. I mean arguably, >> They're pretty strong there, but the areas when you are behind, it's like they kind of know how to play offense. But when you're playing defense, it's a different set of game. You're playing a different game and it's hard to be good at both. I think and I'm not sure that they're really used to following somebody into a market and making a success of that. So there's something, it's a little harder. Do you see what I mean? >> Let me get your opinion on this.
So when I say database, David Floyer, two years ago, predicted AWS is going to have to converge somehow. They have no choice. And they sort of touched on that today, right? Eliminating ETL, that's one thing. But Aurora to Redshift. >> Yeah. >> You know, end to end. I'm not sure they're fully end to end. >> That's a really good, that is an excellent piece of work, because there's a lot of work that it eliminates. There are clear pain points, but then you've got sort of the competing thing, which is like MongoDB, and it's like, it's just a way where one database keeps it simple. >> Snowflake, >> Or you've got Snowflake. Maybe you've got all these 20 different things you're trying to integrate at AWS, but it's kind of like you have a bag of Lego bricks. It's my favorite analogy, right? You want a toy for Christmas, you want a toy Formula One racing car, since that seems to be the theme, right? >> Okay. Do you want the fully built model that you can play with right now? Or do you want the Lego version that you have to spend three days building? Right? And AWS is the Lego Technic thing. You have to spend some time building it, but once you've built it, you can evolve it, and you'll still be playing; those are still good bricks years later. Whereas that prebuilt toy is probably broken, gathering dust, right? So there's something about having an evolvable architecture which is harder to get into, but more durable in the long term. And so AWS tends to play the long game in many ways. And that's one of the elements that they do, and that's good, but it makes it hard to consume for enterprise buyers that are used to getting it with a bow on top. And here's the solution. You know? >> And Paul, that was always Andy Jassy's answer when we would ask him, you know, all these primitives, are you going to make it simpler? You see, the primitives give us the advantage to turn on a dime in the marketplace. And that's true. >> Yeah.
So you're saying, you know, you take all these things together and you wrap it up, and you put a Snowflake on top, and now you've got a simple thing, or a Mongo or Mongo Atlas or whatever. So you've got these layered platforms now which are making it simpler to consume, but now you're kind of, you know, you're all stuck in that ecosystem, you know, so it's like what layer of abstractions do you want to tie yourself to, right? >> Databricks is coming at it from more of an open source approach. But it's similar. >> We're seeing Amazon direct more into vertical markets. They spotlighted what Goldman Sachs is doing on their platform. They've got a variety of platforms that are supposedly custom built for vertical markets. How successful do you see that play being? Is this something that the customers you think are looking for, a fully integrated Amazon solution? >> I think so. If you look at, you know, MongoDB or DataStax or Elastic, you know, they've got the specific solution with the people that really are developing the core technology, and there's an open source equivalent version that AWS is running. And usually maybe they've got a price advantage or, you know, there's some data integration in there or it's somehow easier to integrate, but it's not stopping those companies from growing. And what it's doing is it's endorsing that platform. So if you look at the collection of databases that have been around over the last few years, now you've got basically Elastic, Mongo, and Cassandra, you know, DataStax, as being endorsed by the cloud vendors. These are winners. They're going to be around for a very long time. You can build yourself on that architecture. But what happened to Couchbase and, you know, a few of the other ones? You know, they don't really fit. Like, how are you going to bet, if you are now becoming an also-ran because you didn't get cloned by the cloud vendor?
So the customers are going is that a safe place to be, right? >> But isn't it, don't they want to encourage those partners though in the name of building the marketplace ecosystem? >> Yeah. >> This is huge. >> But certainly the platform, yeah, the platform encourages people to do more. And there's always room around the edge. But the mainstream customers like that really like spending the good money, are looking for something that's got a long term life to it. Right? They're looking for a long commitment to that technology and that it's going to be invested in and grow. And the fact that the cloud providers are adopting and particularly AWS is adopting some of these technologies means that is a very long term commitment. You can base, you know, you can bet your future architecture on that for a decade probably. >> So they have to pick winners. >> Yeah. So it's sort of picking winners. And then if you're the open source company that's now got AWS turning up, you have to then leverage it and use that as a way to grow the market. And I think Mongo have done an excellent job of that. I mean, they're top level sponsors of Reinvent, and they're out there messaging that and doing a good job of showing people how to layer on top of AWS and make it a win-win both sides. >> So ever since we've been in the business, you hear the narrative hardware's going to die. It's just, you know, it's commodity and there's some truth to that. But hardware's actually driving good gross margins for the Cisco's of the world. Storage companies have always made good margins. Servers maybe not so much, 'cause Intel sucked all the margin out of it. But let's face it, AWS makes most of its money. We know on compute, it's got 25 plus percent operating margins depending on the seasonality there. What do you think happens long term to the infrastructure layer discussion? Okay, commodity cloud, you know, we talk about super cloud. 
Do you think that for AWS and the other cloud vendors, that infrastructure, IaaS, gets commoditized and they have to go up market, or do you see that continuing? I mean, history would say there's still good margins in hardware. What are your thoughts on that? >> It's not commoditizing, it's becoming more specific. We've got all these accelerators and custom chips now, and this is something, this almost goes back. I mean, I was with Sun Microsystems 20, 30 years ago and we developed our own chips, and HP developed their own chips, and SGI, MIPS, right? The architectures were all squabbling over who had the best processor chips, and it took years to get chips that worked. Now if you make a chip and it doesn't work immediately, you screwed up somewhere, right? It's the technology of building these immensely complicated, powerful chips that has become commoditized. So the cost of building a custom chip is now getting to the point where Apple and Amazon, your Apple laptop has got full custom chips, your phone, your iPhone, whatever, and you're getting Google making custom chips, and we've got Nvidia now getting into CPUs as well as GPUs. So we're seeing that the ability to build a custom chip is becoming something that everyone is leveraging. And the cost of doing that is coming down so that startups are doing it. So we're going to see much more innovation, I think. And Intel and AMD, you know, they've got the compatibility legacy, but the most powerful, most interesting new things I think are going to be custom. And we're seeing that with Graviton3, in particular the 3E that was announced last night, with like 30, 40%, whatever it was, more performance for HPC workloads. And, you know, the HPC market is going to have to deal with cloud. I mean, they are starting to, and I was at Supercomputing a few weeks ago and they are tiptoeing around the edge of cloud, but those supercomputers are water cooled. They are monsters.
I mean, you go around Supercomputing, there are plumbing vendors on the booth. >> Of course. Yeah. >> Right? And they're highly concentrated systems, and that's really the only difference, is like, is it water cooled or air cooled? The rest of the technology stack is pretty much off the shelf stuff with a few tweaks of software. >> Your point about, you know, the chips and what AWS is doing. The Annapurna acquisition. >> Yeah. >> They're on a dramatically different curve now. I think it comes down to, again, David Floyer's premise, it really comes down to volume. The Arm wafer volumes are 10x those of x86, and volume always wins. And the economics of semis. >> That kind of got us there. But now there's also RISC-V coming along, in that licensing is becoming one of the bottlenecks. Like, if the cost of building a chip is really low, then it comes down to licensing costs, and do you want to pay the Arm license? And RISC-V is an open source chip set which some people are starting to use for things. So your disk controller may have a RISC-V in it, for example, nowadays, those kinds of things. So I think that's kind of the dynamic that's playing out. There's a lot of innovation in hardware to come in the next few years. There's a thing called CXL, Compute Express Link, which is going to be really interesting. I think that's probably two years out before we start seeing it for real. But it lets you glue together an entire rack in a very flexible way. And that's the entire industry coming together around a single standard, the whole industry except for Amazon, in fact, just about. >> Well, but maybe I think eventually they'll get there. Don't use system on a chip CXL. >> I have no idea; I have no knowledge about whether they're going to do anything with CXL. >> Presuming, I'm not trying to tap anything confidential. It just makes sense that they would do a system on chip. It makes sense that they would do something like CXL.
Why not adopt the standard, if it's going to be as the cost. >> Yeah. And so that was one of the things out of zip computing. The other thing is the low latency networking with the Elastic Fabric Adapter, EFA, and the extensions to that that were announced last night. They doubled the throughput. So you get twice the capacity on the Nitro chip. And then the other thing was, this is a bit technical, but this Scalable Reliable Datagram protocol that they've got, which basically says, if I want to send a message, a packet from one machine to another machine, instead of sending it over one wire, I can send it over 16 wires in parallel. And I will just flood the network with all the packets and they can arrive in any order. This is why it isn't done normally. TCP is in order, the packets come in the order they're supposed to, but this is fully flooding them around with its own fast retry, and then they get reassembled at the other end. So they're not just using this now for HPC workloads. They've turned it on for TCP without any change to your application. If you are trying to move a large piece of data between two machines, and you're just pushing it down a network, a single connection, it takes it from five gigabits per second to 25 gigabits per second. A 5x speedup, with a protocol tweak that's run by the Nitro. This is super interesting. >> Probably want to get all that AI/ML stuff that's going on. >> Well, the AI/ML stuff is leveraging it underneath, but this is for everybody. Like, you're just copying data around, right? And you're limited. "Hey, this is going to get there five times faster, pushing a big enough chunk of data around." So this is turning on gradually as the Nitro 5 comes out, and you have to enable it at the instance level. But it's a super interesting announcement from last night. >> So the bottom line bumper sticker on commoditization is what? >> I don't think so. I mean what's the APIs?
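
A toy sketch of the packet-spraying idea described above: a message is cut into numbered packets, the packets are spread over several paths, they arrive in arbitrary order, and the receiver puts them back together by sequence number. This is only an illustration under simple assumptions, not AWS's actual protocol. The function names (spray, deliver, reassemble), the 16-path count, and the packet size are hypothetical, and retries and loss handling are left out.

```python
import random


def spray(message: bytes, packet_size: int, paths: int):
    """Cut the message into numbered packets and spread them over `paths` lanes.

    Hypothetical helper: lanes stand in for the parallel "wires" in the talk.
    """
    lanes = [[] for _ in range(paths)]
    for seq, offset in enumerate(range(0, len(message), packet_size)):
        payload = message[offset:offset + packet_size]
        lanes[seq % paths].append((seq, payload))  # round-robin across lanes
    return lanes


def deliver(lanes, rng: random.Random):
    """Simulate arrival: packets from all lanes show up in arbitrary order."""
    arrived = [packet for lane in lanes for packet in lane]
    rng.shuffle(arrived)
    return arrived


def reassemble(arrived):
    """Receiver side: order by sequence number, then join the payloads."""
    ordered = sorted(arrived, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


if __name__ == "__main__":
    data = bytes(range(256)) * 10
    lanes = spray(data, packet_size=64, paths=16)
    received = reassemble(deliver(lanes, random.Random(0)))
    print("message intact:", received == data)
```

Because every packet carries its own sequence number, the receiver does not depend on the network delivering in order, which is what lets a single flow use many paths at once.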
You're Arm compatible, you're Intel x86 compatible, or you're maybe RISC-V compatible one day in the cloud. And those are the APIs, right? That's the commodity level. And the software ecosystem is now super portable across those, as we're seeing with Apple moving from Intel to Arm; it's really not an issue, right? The software and the tooling is all there to do that. But underneath that, we're going to see an arms race between the top providers as they all try and develop faster chips for doing more specific things. We've got Trainium for training; that instance, they announced it last year with 800 gigabits going out of a single instance, 800 gigabits, but this year they doubled it. Yeah. So 1.6 terabits out of a single machine, right? That's insane, right? But what you're doing is you're putting together hundreds or thousands of those to solve the big machine learning training problems. These enormous clusters that are being formed for doing these massive problems. And there is a market now for these incredibly large supercomputer clusters built for doing AI. That's all bandwidth limited. >> And you think about the timeframe from design to tape out. >> Yeah. >> Is just getting compressed. It's relative. >> It is. >> Six is going the other way. >> The tooling is all there. Yeah. >> Fantastic. Adrian, always a pleasure to have you on. Thanks so much. >> Yeah. >> Really appreciate it. >> Yeah, thank you. >> Thank you, Paul. >> Cheers. All right. Keep it right there, everybody. Don't forget, go to thecube.net, you'll see all these videos. Go to siliconangle.com. We've got features with Adam Selipsky, we've got my breaking analysis, we have another feature with MongoDB's Dev Ittycheria, Ali Ghodsi, as well as Frank Slootman tomorrow. So check that out. Keep it right there. You're watching theCUBE, the leader in enterprise and emerging tech, right back. (soft techno upbeat music)

Published Date : Nov 30 2022


ENTITIES

Entity	Category	Confidence
Adam Selipsky	PERSON	0.99+
David Floyd	PERSON	0.99+
Peter DeSantis	PERSON	0.99+
Paul	PERSON	0.99+
Ali Ghodsi	PERSON	0.99+
Adrian Cockcroft	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Frank Sluman	PERSON	0.99+
Paul Gillon	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
Andy Chassy	PERSON	0.99+
Las Vegas	LOCATION	0.99+
Adam	PERSON	0.99+
Dev Ittycheria	PERSON	0.99+
Andy Jesse	PERSON	0.99+
Dave Villante	PERSON	0.99+
August	DATE	0.99+
two machines	QUANTITY	0.99+
Bill Belichick	PERSON	0.99+
10	QUANTITY	0.99+
Cisco	ORGANIZATION	0.99+
today	DATE	0.99+
last year	DATE	0.99+
1.6 terabytes	QUANTITY	0.99+
AMD	ORGANIZATION	0.99+
Goldman Sachs	ORGANIZATION	0.99+
hundreds	QUANTITY	0.99+
one machine	QUANTITY	0.99+
three days	QUANTITY	0.99+
Adrian	PERSON	0.99+
800 gigabits	QUANTITY	0.99+
Today	DATE	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
David Foyer	PERSON	0.99+
two years	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
yesterday	DATE	0.99+
this year	DATE	0.99+
Snowflake	TITLE	0.99+
Nvidia	ORGANIZATION	0.99+
five times	QUANTITY	0.99+
one	QUANTITY	0.99+
Netflix	ORGANIZATION	0.99+
thecube.net	OTHER	0.99+
Intel	ORGANIZATION	0.99+
five	QUANTITY	0.99+
both sides	QUANTITY	0.99+
Mongo	ORGANIZATION	0.99+
Christmas	EVENT	0.99+
last night	DATE	0.99+
HP	ORGANIZATION	0.98+
25 plus percent	QUANTITY	0.98+
thousands	QUANTITY	0.98+
20,30 years ago	DATE	0.98+
pandemic	EVENT	0.98+
both	QUANTITY	0.98+
two years ago	DATE	0.98+
twice	QUANTITY	0.98+
tomorrow	DATE	0.98+
X 86	COMMERCIAL_ITEM	0.98+
Antarctic	LOCATION	0.98+
Patriots	ORGANIZATION	0.98+
siliconangle.com	OTHER	0.97+

Ali Ghodsi, Databricks | AWS Partner Exclusive

Published Date : Nov 23 2022

SUMMARY :

ENTITIES

Entity	Category	Confidence
John	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Ali Ghodsi	PERSON	0.99+
Adam	PERSON	0.99+
AWS	ORGANIZATION	0.99+
2013	DATE	0.99+
Google	ORGANIZATION	0.99+
Alibaba	ORGANIZATION	0.99+
2008	DATE	0.99+
Ali Ghosdi	PERSON	0.99+
five vendors	QUANTITY	0.99+
Adam Saleski	PERSON	0.99+
five	QUANTITY	0.99+
John Furrier	PERSON	0.99+
Ali	PERSON	0.99+
Databricks	ORGANIZATION	0.99+
three vendors	QUANTITY	0.99+
70%	QUANTITY	0.99+
Wednesday	DATE	0.99+
Excel	TITLE	0.99+
38 billion	QUANTITY	0.99+
four	QUANTITY	0.99+
Amazon Web Services	ORGANIZATION	0.99+
Word	TITLE	0.99+
three	QUANTITY	0.99+
two clouds	QUANTITY	0.99+
Andy	PERSON	0.99+
three clouds	QUANTITY	0.99+
10 million	QUANTITY	0.99+
PowerPoint	TITLE	0.99+
one	QUANTITY	0.99+
two	QUANTITY	0.99+
twice	QUANTITY	0.99+
Second	QUANTITY	0.99+
over 300 services	QUANTITY	0.99+
one game	QUANTITY	0.99+
second cloud	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+
Sky	ORGANIZATION	0.99+
one word	QUANTITY	0.99+
OPEX	ORGANIZATION	0.99+
two things	QUANTITY	0.98+
two years ago	DATE	0.98+
Access	TITLE	0.98+
over 300	QUANTITY	0.98+
six years	QUANTITY	0.98+
over 70%	QUANTITY	0.98+
five years ago	DATE	0.98+

David Linthicum, Deloitte US | Supercloud22

(bright music) >> "Supermetafragilisticexpialadotious." What's in a name? In an homage to the inimitable Charles Fitzgerald, we've chosen this title for today's session because of all the buzz surrounding "supercloud," a term that we introduced last year to signify a major architectural trend and shift that's occurring in the technology industry. Since that time, we've published numerous videos and articles on the topic, and on August 9th, kicked off "Supercloud22," an open industry event designed to advance the supercloud conversation, gathering input from more than 30 experienced technologists and business leaders in "The Cube" and broader technology community. We're talking about individuals like Benoit Dageville, Kit Colbert, Ali Ghodsi, Mohit Aron, David McJannet, and dozens of other experts. And today, we're pleased to welcome David Linthicum, who's a Chief Strategy Officer of Cloud Services at Deloitte Consulting. David is a technology visionary, a technical CTO. He's an author and a frequently sought after keynote speaker at high profile conferences like "VMware Explore" next week. David Linthicum, welcome back to "The Cube." Good to see you again. >> Oh, it's great to be here. Thanks for the invitation. Thanks for having me. >> Yeah, you're very welcome. Okay, so this topic of supercloud, what you call metacloud, has created a lot of interest. VMware calls it cross-cloud services, Snowflake calls it their data cloud, there's a lot of different names, but recently, you published a piece in "InfoWorld" where you said the following. "I really don't care what we call it, "and I really don't care if I put "my own buzzword into the mix. "However, this does not change the fact "that metacloud is perhaps the most important "architectural evolution occurring right now, "and we need to get this right out of the gate. "If we do that, who cares what it's named?" So very cool. 
And you also mentioned in a recent article that you don't like to put out new terms out in the wild without defining them. So what is a metacloud, or what we call supercloud? What's your definition? >> Yeah, and again, I don't care what people call it. The reality is it's the ability to have a layer of cross-cloud services. It sits above existing public cloud providers. So the idea here is that instead of building different security systems, different governance systems, different operational systems in each specific cloud provider, using whatever native features they provide, we're trying to do that in a cross-cloud way. So in other words, we're pushing out data integration, security, all these other things that we have to take care of as part of deploying a particular cloud provider. And in a multicloud scenario, we're building those in and between the clouds. And so we've been tracking this for about five years. We understood that multicloud is not necessarily about the particular public cloud providers, it's about things that you build in and between the clouds. >> Got it, okay. So I want to come back to that, to the definition, but I want to tie us to the so-called multicloud. You guys did a survey recently. We've said that multicloud was mostly a symptom of multi-vendor, Shadow Cloud, M&A, and only recently has become a strategic imperative. Now, Deloitte published a survey recently entitled "Closing the Cloud Strategy, Technology, Innovation Gap," and I'd like to explore that a little bit. And so in that survey, you showed data. What I liked about it is you went beyond what we all know, right? The old, "Our research shows that on average, "X number of clouds are used at an individual company." I mean, you had that too, but you really went deeper. You identified why companies are using multiple clouds, and you developed different categories of practitioners across 500 survey respondents. 
But the reasons were very clear for "why multicloud," as this becomes more strategic. Service choice scale, negotiating leverage, improved business resiliency, minimizing lock-in, interoperability of data, et cetera. So my question to you, David, is what's the problem supercloud or metacloud solves, and what's different from multicloud? >> That's a great question. The reality is that if we're... Well, supercloud or metacloud, whatever, is really something that exists above a multicloud, but I kind of view them as the same thing. It's an architectural pattern. We can name it anything. But the reality is that if we're moving to these multicloud environments, we're doing so to leverage best of breed things. In other words, best of breed technology to provide the innovators within the company to take the business to the next level, and we determine that in the survey. And so if we're looking at what a multicloud provides, it's the ability to provide different choices of different services or piece parts that allows us to build anything that we need to do. And so what we found in the survey and what we found in just practice in dealing with our clients is that ultimately, the value of cloud computing is going to be the innovation aspects. In other words, the ability to take the company to the next level from being more innovative and more disruptive in the marketplace that they're in. And the only way to do that, instead of basically leveraging the services of a particular walled garden of a single public cloud provider, is to cast a wider net and get out and leverage all kinds of services to make these happen. So if you think about that, that's basically how multicloud has evolved. In other words, it wasn't planned. They didn't say, "We're going to go do a multicloud." It was different developers and innovators in the company that went off and leveraged these cloud services, sometimes with the consent of IT leadership, sometimes not. 
And now we have these multitudes of different services that we're leveraging. And so many of these enterprises are going from 1000 to, say, 3000 services under management. That creates a complexity problem. We have a problem of heterogeneity, different platforms, different tools, different services, different AI technology, database technology, things like that. So the metacloud, or the supercloud, or whatever you want to call it, is the ability to deal with that complexity on the complexity's terms. And so instead of building all these various things that we have to do individually in each of the cloud providers, we're trying to do so within a cross-cloud service layer. We're trying to create this layer of technology, which removes us from dealing with the complexity of the underlying multicloud services and makes it manageable. Because right now, I think we're getting to a point of complexity we just can't operate it at the budgetary limits that we are right now. We can't keep the number of skills around, the number of operators around, to keep these things going. We're going to have to get creative in terms of how we manage these things, how we manage a multicloud. And that's where the supercloud, metacloud, whatever they want to call it, comes that. >> Yeah, and as John Furrier likes to say, in IT, we tend to solve complexity with more complexity, and that's not what we're talking about here. We're talking about simplifying, and you talked about the abstraction layer, and then it sounds like I'm inferring more. There's value that's added on top of that. And then you also said the hyperscalers are in a walled garden. So I've been asked, why aren't the hyperscalers superclouds? And I've said, essentially, they want to put your data into their cloud and keep it there. Now, that doesn't mean they won't eventually get into that. 
We've seen examples a little bit, Outposts, Anthos, Azure Arc, but the hyperscalers really aren't building superclouds or metaclouds, at least today, are they? >> No, they're not. And I always have the predictions for every major cloud conference that this is the conference that the hyperscaler is going to figure out some sort of a multicloud across-cloud strategy. In other words, building services that are able to operate across clouds. That really has never happened. It has happened in dribs and drabs, and you just mentioned a few examples of that, but the ability to own the space, to understand that we're not going to be the center of the universe in how people are going to leverage it, is going to be multiple things, including legacy systems and other cloud providers, and even industry clouds that are emerging these days, and SaaS providers, and all these things. So we're going to assist you in dealing with complexity, and we're going to provide the core services of being there. That hasn't happened yet. And they may be worried about conflicting their market, and the messaging is a bit different, even actively pushing back on the concept of multicloud, but the reality is the market's going to take them there. So in other words, if enough of their customers are asking for this and asking that they take the lead in building these cross-cloud technologies, even if they're participating in the stack and not being the stack, it's too compelling of a market that it's not going to drag a lot of the existing public cloud providers there. >> Well, it's going to be interesting to see how that plays out, David, because I never say never when it comes to a company like AWS, and we've seen how fast they move. And at the same time, they don't want to be commoditized. There's the layer underneath all this infrastructure, and they got this ecosystem that's adding all this tremendous value. 
But I want to ask you, what are the essential elements of supercloud, coming back to the definition, if you will, and what's different about metacloud, as you call it, from plain old SaaS or PaaS? What are the key elements there? >> Well, the key elements would be holistic management of all of the IT infrastructure. So even though it's sitting above a multicloud, I view metacloud, supercloud as the ability to also manage your existing legacy systems, your existing security stack, your existing network operations, basically everything that exists under the purview of IT. If you think about it, we're moving our infrastructure into the clouds, and we're probably going to hit a saturation point of about 70%. And really, if the supercloud, metacloud, which is going to be expensive to build for most of the enterprises, it needs to support these things holistically. So it needs to have all the services, that is going to be shareable across the different providers, and also existing legacy systems, and also edge computing, and IoT, and all these very diverse systems that we're building there right now. So if complexity is a core challenge to operate these things at scale and the ability to secure these things at scale, we have to have commonality in terms of security architecture and technology, commonality in terms of our directory services, commonality in terms of network operations, commonality in term of cloud operations, commonality in terms of FinOps. All these things should exist in some holistic cross-cloud layer that sits above all this complexity. And you pointed out something very profound. In other words, that is going to mean that we're hiding a lot of the existing cloud providers in terms of their interfaces and dashboards and things like that that we're dealing with today, their APIs. But the reality is that if we're able to manage these things at scale, the public cloud providers are going to benefit greatly from that. 
They're going to sell more services because people are going to find they're able to leverage them easier. And so in other words, if we're removing the complexity wall, which many in the industry are calling it right now, then suddenly we're moving from, say, the 25 to 30% migrated in the cloud, which most enterprises are today, to 50, 60, 70%. And we're able to do this at scale, and we're doing it at scale because we're providing some architectural optimization through the supercloud, metacloud layer. >> Okay, thanks for that. David, I just want to tap your CTO brain for a minute. At "Supercloud22," we came up with these three deployment models. Kit Colbert put forth the idea that one model would be your control planes running in one cloud, let's say AWS, but it interacts with and can manage and deploy on other clouds, the Kubernetes Cluster Management System. The second one, Mohit Aron from Cohesity laid out, where you instantiate the stack on different clouds and different cloud regions, and then you create a layer, a common interface across those. And then Snowflake was the third deployment model where it's a single global instance, it's one instantiation, and basically building out their own cloud across these regions. Help us parse through that. Do those seem like reasonable deployment models to you? Do you have any thoughts on that? >> Yeah, I mean, that's a distributed computing trick we've been doing, which is, in essence, an agent of the supercloud that's carrying out some of the cloud native functions on that particular cloud, but is, in essence, a slave to the metacloud, or the supercloud, whatever, that's able to run across the various cloud providers. In other words, when it wants to access a service, it may not go directly to that service. It goes directly to the control plane, and that control plane is responsible... Very much like Kubernetes and Docker works, that control plane is responsible for reaching out and leveraging those native services. 
I think that that's thinking that's a step in the right direction. I think these things unto themselves, at least initially, are going to be a very complex array of technology. Even though we're trying to remove complexity, the supercloud unto itself, in terms of the ability to build this thing that's able to operate at scale across-cloud, is going to be a collection of many different technologies that are interfacing with the public cloud providers in different ways. And so we can start putting these meta architectures together, and I certainly have written and spoke about this for years, but initially, this is going to be something that may escape the detail or the holistic nature of these meta architectures that people are floating around right now. >> Yeah, so I want to stay on this, because anytime I get a CTO brain, I like to... I'm not an engineer, but I've been around a long time, so I know a lot of buzzwords and have absorbed a lot over the years, but so you take those, the second two models, the Mohit instantiate on each cloud and each cloud region versus the Snowflake approach. I asked Benoit Dageville, "Does that mean if I'm in "an AWS east region and I want to do a query on Azure West, "I can do that without moving data?" And he said, "Yes and no." And the answer was really, "No, we actually take a subset of that data," so there's the latency problem. From those deployment model standpoints, what are the trade-offs that you see in terms of instantiating the stack on each individual cloud versus that single instance? Is there a benefit of the single instance for governance and security and simplicity, but a trade-off on latency, or am I overthinking this? >> Yeah, you hit it on the nose. The reality is that the trade-off is going to be latency and performance. 
If we get wiggy with the distributed nature, like the distributed data example you just provided, we have to basically separate the queries and communicate with the databases on each instance, and then reassemble the result set that goes back to the people who are requesting it. And so we can do caching systems and things like that. But the reality is, if it's a distributed system, we're going to have latency and bandwidth issues that are going to be limiting us. And also security issues, because if we're moving lots of information over the open internet, or even private circuits, those are going to be attack vectors that hackers can leverage. You have to keep that in mind. We're trying to reduce those attack vectors. So it would be, in many instances, and I think we have to think about this, that we're going to keep the data in the same physical region for just that. So in other words, it's going to provide the best performance and also the most simplistic access to dealing with security. And so we're not, in essence, thinking about where the data's going, how it's moving across things, things like that. So the challenge, when you're dealing with a supercloud or metacloud, is going to be, when do you make those decisions? And I think, in many instances, even though we're leveraging multiple databases across multiple regions and multiple public cloud providers, and that's the idea of it, we're still going to localize the data for performance reasons. I mean, I just wrote a blog in "InfoWorld" a couple of months ago and talked about, people who are trying to distribute data across different public cloud providers for different reasons, distribute an application development system, things like that, you can do it. With enough time and money, you can do anything. I think the challenge is going to be operating that thing, and also providing a viable business return based on the application. 
And so while it may look like a good science experiment, and it's cool unto itself as an architect, the reality is the more pragmatic approach is going to be to leave it in a single region on a single cloud. >> Very interesting. The other reason I like to talk to companies like Deloitte and experienced people like you is 'cause I can get... You're agnostic, right? I mean, you're technology agnostic, vendor agnostic. So I want to come back with another question, which is, how do you deal with what I call the lowest common denominator problem? What I mean by that is if one cloud has, let's say, a superior service... Let's take an example of Nitro and Graviton. AWS seems to be ahead on that, but let's say some other cloud isn't quite there yet, and you're building a supercloud or a metacloud. How do you rationalize that? Does it have to be like a caravan in the army where you slow down so all the slowest trucks can keep up, or are there ways to adjudicate that that are advantageous to hide that deficiency? >> Yeah, and that's a great thing about leveraging a supercloud or a metacloud is we're putting that management in a single layer. So as far as a user or even a developer on those systems, they shouldn't worry about the performance that may come back, because we're dealing with the... You hit the nail on the head with that one. The slowest component is the one that dictates performance. And so we have to have some sort of a performance management layer. We're also making dynamic decisions to move data, to move processing, from one server to the other to try to minimize the amount of latency that's coming from a single component. 
So the great thing about that is we're putting that volatility into a single domain, and it's making architectural decisions in terms of where something will run and where it's getting its data from, things are stored, things like that, based on the performance feedback that's coming back from the various cloud services that are under management. And so if you're running across clouds, it becomes even more interesting, because ultimately, you're going to make some architectural choices on the fly in terms of where that stuff runs based on the active dynamic performance that that public cloud provider is providing. So in other words, we may find that it automatically shut down a database service, say MySQL, on one cloud instance, and moved it to a MySQL instance on another public cloud provider because there was some sort of a performance issue that it couldn't work around. And by the way, it does so dynamically. Away from you making that decision, it's making that decision on your behalf. Again, this is a matter of abstraction, removing complexity, and dealing with complexity through abstraction and automation, and this is... That would be an example of fixing something with automation, self-healing. >> When you meet with some of the public cloud providers and they talk about on-prem private cloud, the general narrative from the hyperscalers is, "Well, that's not a cloud." Should on-prem be inclusive of supercloud, metacloud? >> Absolutely, I mean, and they're selling private cloud instances with the edge cloud that they're selling. The reality is that we're going to have to keep a certain amount of our infrastructure, including private clouds, on premise. 
It's something that's shrinking as a market share, and it's going to be tougher and tougher to justify as the public cloud providers become better and better at what they do, but we certainly have edge clouds now, and hyperscalers have examples of that where they run an instance of their public cloud infrastructure on premise on physical hardware and software. And the reality is, too, we have data centers and we have systems that just won't go away for another 20 or 30 years. They're just too sticky. They're uneconomically viable to move into the cloud. That's the core thing. It's not that we can't do it. The fact of the matter is we shouldn't do it, because there's not going to be an economic... There's not going to be an economic incentive of making that happen. So if we're going to create this meta layer or this infrastructure which is going to run across clouds, and everybody agrees on, that's what the supercloud is, we have to include the on-premise systems, including private clouds, including legacy systems. And by the way, include the rising number of IoT systems that are out there, and edge-based systems out there. So we're managing it using the same infrastructure into cloud services. So they have metadata systems and they have specialized services, and service finance and retail and things like doing risk analytics. So it gets them further down that path, but not necessarily giving them a SaaS application where they're forced into all of the business processes. We're giving you piece parts. So we'll give you 1000 different parts that are related to the finance industry. You can assemble anything you need, but the thing is, it's not going to be like building it from scratch. We're going to give you risk analytics, we're giving you the financial analytics, all these things that you can leverage within your applications how you want to leverage them. We'll maintain them. So in other words, you don't have to maintain 'em just like a cloud service. 
And suddenly, we can build applications in a couple of weeks that used to take a couple of months, in some cases, a couple of years. So that seems to be a large take of it moving forward. So get it up in the supercloud. Those become just other services that are under managed... That are under management on the supercloud, the metacloud. So we're able to take those services, abstract them, assemble them, use them in different applications. And the ability to manage where those services are originated versus where they're consumed is going to be managed by the supercloud layer, which, you're dealing with the governance, the service governance, the security systems, the directory systems, identity access management, things like that. They're going to get you further along down the pike, and that comes back as real value. If I'm able to build something in two weeks that used to take me two months, and I'm able to give my creators in the organization the ability to move faster, that's a real advantage. And suddenly, we are going to be valued by our digital footprint, our ability to do things in a creative and innovative way. And so organizations are able to move that fast, leveraging cloud computing for what it should be leveraged, as a true force multiplier for the business. They're going to win the game. They're going to get the most value. They're going to be around in 20 years, the others won't. >> David Linthicum, always love talking. You have a dangerous combination of business and technology expertise. Let's tease. "VMware Explore" next week, you're giving a keynote, if they're going to be there. Which day are you? >> Tuesday. Tuesday, 11 o'clock. >> All right, that's a big day. Tuesday, 11 o'clock. And David, please do stop by "The Cube." We're in Moscone West. Love to get you on and continue this conversation. I got 100 more questions for you. Really appreciate your time. >> I always love talking to people at "The Cube." Thank you very much. 
>> All right, and thanks for watching our ongoing coverage of "Supercloud22" on "The Cube," your leader in enterprise tech and emerging tech coverage. (bright music)

Published Date : Aug 24 2022


Closing Remarks | Supercloud22

(gentle upbeat music) >> Welcome back everyone, to "theCUBE"'s live stage performance here in Palo Alto, California at "theCUBE" Studios. I'm John Furrier with Dave Vellante, kicking off our first inaugural Supercloud event. It's an editorial event, we wanted to bring together the best in the business, the smartest, the biggest, the up-and-coming startups, venture capitalists, everybody, to weigh in on this new Supercloud trend, this structural change in the cloud computing business. We're about to run the Ecosystem Speaks, which is a bunch of pre-recorded companies that wanted to get their voices on the record, so stay tuned for the rest of the day. We'll be replaying all that content and they're going to be having some really good commentary and hear what they have to say. I had a chance to interview and so did Dave. Dave, this is our closing segment where we kind of unpack everything or kind of digest and report. So much to kind of digest from the conversations today, a wide range of commentary from Supercloud operating system to developers who are in charge to maybe it's an ops problem or maybe Oracle's a Supercloud. I mean, that was debated. So so much discussion, lot to unpack. What was your favorite moments? >> Well, before I get to that, I think, I go back to something that happened at re:Invent last year. Nick Sturiale came up, Steve Mullaney from Aviatrix; we're going to hear from him shortly in the Ecosystem Speaks. Nick Sturiale's VC said "it's happening"! And what he was talking about is this ecosystem is exploding. They're building infrastructure or capabilities on top of the CapEx infrastructure. So, I think it is happening. I think we confirmed today that Supercloud is a thing. It's a very immature thing. And I think the other thing, John is that, it seems to me that the further you go up the stack, the weaker the business case gets for doing Supercloud. 
We heard from Marianna Tessel, it's like, "Eh, you know, we can- it was easier to just do it all on one cloud." This is a point that Adrian Cockcroft just made on the panel, and so I think that when you break out the pieces of the stack, I think very clearly the infrastructure layer, what we heard from Confluent and HashiCorp, and certainly VMware, there's a real problem there. There's a real need at the infrastructure layer and then even at the data layer, I think Benoit Dageville did a great job of- You know, I was peppering him with all my questions, which I basically was going through, the Supercloud definition, and they ticked the box on pretty much every one of 'em as did, by the way, Ali Ghodsi. You know, the big difference there is the philosophy of Republicans and Democrats- got open versus closed, not to apply that to either one side, but you know what I mean! >> And the similarities are probably greater than differences. >> Berkeley, I would probably put them on the- >> Yeah, we'll put them on the Democrat side, we'll make Snowflake the Republicans. But so- but as we say there's a lot of similarities as well in terms of what their objectives are. So, I mean, I thought it was a great program and a really good start to, you know, an industry- You brought up the point about the industry consortium, asked Kit Colbert- >> Yep. >> If he thought that was something that was viable and what'd they say? That hyperscale should lead it? >> Yeah, they said hyperscale should lead it and there also should be an industry consortium to get the voices out there. And I think VMware is very humble in how they're putting out their white paper because I think they know that they can't do it all and that they do not have a great track record relative to cloud. And I think, but they have a great track record of loyal installed base ops people using VMware vSphere all the time. >> Yeah. 
>> So I think they need a catapult moment where they can catapult to the cloud native which they've been working on for years under Raghu and the team. So the question on VMware is in the light of Broadcom, okay, acquisition of VMware, this is an opportunity or it might not be an opportunity or it might be a spin-out or something, I just think VMware's got way too much engineering culture to be ignored, Dave. And I think- well, I'm going to watch this very closely because they can pull off some sort of rallying moment. I think they could. And then you hear the upstarts like Platform9, Rafay Systems and others they're all like, "Yes, we need to unify behind something. There needs to be some sort of standard". You know, we heard the argument of you know, more standards bodies type thing. So, it's interesting, maybe "theCUBE" could be that but we're going to certainly keep the conversation going. >> I thought one of the most memorable statements was Vittorio who said we- for VMware, we want our cake, we want to eat it too and we want to lose weight. So they have a lot of that aspirations there! (John laughs) >> And then I thought, Adrian Cockcroft said you know, the devs, they want to get married. They were marrying everybody, and then the ops team, they have to deal with the divorce. >> Yeah. >> And I thought that was poignant. It's like, they want consistency, they want standards, they got to be able to scale And Lori MacVittie, I'm not sure you agree with this, I'd have to think about it, but she was basically saying, all we've talked about is devs devs devs for the last 10 years, going forward we're going to be talking about ops. >> Yeah, and I think one of the things I learned from this day and looking back, and some kind of- I've been sauteing through all the interviews. If you zoom out, for me it was the epiphany of developers are still in charge. And I've said, you know, the developers are doing great, it's an ops security thing. 
Not sure I see that the way I was seeing before. I think what I learned was the refactoring pattern that's emerging, In Sik Rhee brought this up from Vertex Ventures with Marianna Tessel, it's a nuanced point but I think he's right on, which is the pattern that's emerging is developers want ease-of-use tooling, they're driving the change, and I think the developers in the DevOps ethos- it's never going to be separate. It's going to be DevOps. That means developers are driving operations and then security. So what I learned was it's not ops teams leveling up, it's devs redefining what ops is. >> Mm. And I think that to me is where Supercloud's going to be interesting- >> Forcing that. >> Yeah. >> Forcing the change because the structural change is open source is thriving, devs are still in charge and they still want more developers, Vittorio: "we need more developers", right? So the developers are in charge and that's clear. Now, if that happens- if you believe that to be true, the domino effect of that is going to be amazing because then everyone who gets on the wrong side of history, on the ops and security side, is going to be fighting a trend that may not be fight-able, you know, it might be inevitable. And so the winners are the ones that are refactoring their business like Snowflake. Snowflake is a data warehouse that had nothing to do with Amazon at first. It was the developers who said "I'm going to refactor data warehouse on AWS". That is a developer-driven refactorization and a business model. So I think that's the pattern I'm seeing, is that this concept of refactoring, patterns and the developer trajectory is critical. >> I thought there was another great comment. Maribel Lopez, her Lord of the Rings comment: "there will be no one ring to rule them all". Now at the same time, Kit Colbert, you know what we asked him straight out, "are you the- do you want to be the, the Supercloud OS?" and he basically said, "yeah, we do". 
Now, of course they're confined to their world, which is a pretty substantial world. I think, John, the reason why Maribel is so correct is security. I think security's a really hard problem to solve. You've got cloud as the first layer of defense and now you've got multiple clouds, multiple layers of defense, multiple shared responsibility models. You've got different tools for XDR, for identity, for governance, for privacy all within those different clouds. I mean, that really is a confusing picture. And I think the hardest- one of the hardest parts of Supercloud to solve. >> Yeah, and I thought the security founder Gee Rittenhouse, Piyush Sharma from Accurics, which sold to Tenable, and Tony Kueh, former head of product at VMware. >> Right. >> Who's now an investor kind of looking for his next gig or what he is going to do next. He's obviously been extremely successful. They brought up the, the OS factor. Another point that they made I thought was interesting is that a lot of the things to do to solve the complexity is not doable. >> Yeah. >> It's too much work. So managed services might fit the bill. So, and Chris Hoff mentioned on the Clouderati segment that the higher level services being a managed service and differentiating around the service could be the key competitive advantage for whoever does it. >> I think the other thing is Chris Hoff said "yeah, well, Web 3, metaverse, you know, DAO, Superclouds" you know, "Stupercloud" he called it and this bring up- It resonates because one of the criticisms that Charles Fitzgerald laid on us was, well, it doesn't help to throw out another term. I actually think it does help. And I think the reason it does help is because it's getting people to think. When you ask people about Supercloud, they automatically- it resonates with them. They play back what they think is the future of cloud. So Supercloud really talks to the future of cloud. 
There's a lot of aspects to it that need to be further defined, further thought out and we're getting to the point now where we- we can start- begin to say, okay that is Supercloud or that isn't Supercloud. >> I think that's really right on. I think Supercloud at the end of the day, for me from the simplest way to describe it is making sure that the developer experience is so good that the operations just happen. And Marianna Tessel said, she's investing in making their developer experience high velocity, very easy. So if you do that, you have to run on premise and on the cloud. So hybrid really is where Supercloud is going right now. It's not multi-cloud. Multi-cloud was- that was debunked on this session today. I thought that was clear. >> Yeah. Yeah, I mean I think- >> It's not about multi-cloud. It's about operationally seamless operations across environments, public cloud to on-premise, basically. >> I think we got consensus across the board that multi-cloud, you know, is a symptom, Chuck Whitten's thing of multi-cloud by default, versus- multi-cloud has not been a strategy, Kit Colbert said, up until the last couple of years. Yeah, because people said, "oh we got all these multiple clouds, what do we do with it?" and we got this mess that we have to solve. Whereas, I think Supercloud is something that is a strategy and then the other nuance that I keep bringing up is it's industries that are- as part of their digital transformation, are building clouds. Now, whether or not they become superclouds, I'm not convinced. I mean, what Goldman Sachs is doing, you know, with AWS, what Walmart's doing with Azure connecting their on-prem tools to those public clouds, you know, is that a supercloud? I mean, we're going to have to go back and really look at that definition. Or is it just kind of a SaaS that spans on-prem and cloud. 
So, as I said, the further you go up the stack, the business case seems to wane a little bit but there's no question in my mind that from an infrastructure standpoint, to your point about operations, there's a real requirement for super- what we call Supercloud. >> Well, we're going to keep the conversation going, Dave. I want to put a shout out to our founding supporters of this initiative. Again, we put this together really fast kind of like a pilot series, an inaugural event. We want to have a face-to-face event as an industry event. Want to thank the founding supporters. These are the people who donated their time, their resource to contribute content, ideas and some cash, not everyone has committed some financial contribution but we want to recognize the names here. VMware, Intuit, Red Hat, Snowflake, Aisera, Alteryx, Confluent, Couchbase, Nutanix, Rafay Systems, Skyhigh Security, Aviatrix, Zscaler, Platform9, HashiCorp, F5 and all the media partners. Without their support, this wouldn't have happened. And there are more people that wanted to weigh in. There was more demand than we could pull off. We'll certainly continue the Supercloud conversation series here on "theCUBE" and we'll add more people in. And now, after this session, the Ecosystem Speaks session, we're going to run all the videos of the big name companies. We have the Nutanix CEO weighing in, Aviatrix, to name a few. >> Yeah. Let me, let me chime in, I mean you got Couchbase talking about Edge, Platform9's going to be on, you know, everybody, you know In Sik was poopoo-ing Oracle, but you know, Oracle and Azure, what they did, two technical guys, developers are coming on, we dig into what they did. Howie Xu from Zscaler, Paula Hansen is going to talk about going to market in the multi-cloud world. You mentioned Rajiv, the CEO of Nutanix, Ramesh is going to talk about multi-cloud infrastructure. 
So that's going to run now for, you know, quite some time here and some of the pre-record so super excited about that and I just want to thank the crew. I hope guys, I hope you have a list of credits, there's too many of you to mention, but you know, awesome job, really appreciate the work that you did in a very short amount of time. >> Well, I'm excited. I learned a lot and my takeaway was that Supercloud's a thing, there's a kind of sense that people want to talk about it and have real conversations, not BS or FUD. They want to have real substantive conversations and we're going to enable that on "theCUBE". Dave, final thoughts for you. >> Well, I mean, as I say, we put this together very quickly. It was really a phenomenal, you know, enlightening experience. I think it confirmed a lot of the concepts and the premises that we've put forth, that David Floyer helped evolve, that a lot of these analysts have helped evolve, that even Charles Fitzgerald with his antagonism helped to really sharpen our knives. So, you know, thank you Charles. And- >> I like his blog, by the way, I'm a reader- >> Yeah, absolutely. And it was great to be back in Palo Alto. It was my first time back since pre-COVID, so, you know, great job. >> All right. I want to thank all the crew and everyone. Thanks for watching this first, inaugural Supercloud event. We are definitely going to be doing more of these. So stay tuned, maybe face-to-face in person. I'm John Furrier with Dave Vellante now for the Ecosystem chiming in, and they're going to speak and share their thoughts here with "theCUBE" our first live stage performance event in our studio. Thanks for watching. (gentle upbeat music)

Published Date : Aug 9 2022


Breaking Analysis: What we hope to learn at Supercloud22

>> From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. >> The term Supercloud is somewhat new, but the concepts behind it have been bubbling for years. Early last decade, when NIST put forth a definition of cloud computing, it said services had to be accessible over a public network, essentially cutting the on-prem crowd out of the cloud conversation. Now a guy named Chuck Hollis, who was a field CTO at EMC at the time and a prolific blogger, objected to that criterion and laid out his vision for what he termed a private cloud. Now, in that post, he showed a workload running both on premises and in a public cloud, sharing the underlying resources in an automated and seamless manner. What later became known more broadly as hybrid cloud, that vision, as we now know, really never materialized, and we were left with multi-cloud, sets of largely incompatible and disconnected cloud services running in separate silos. The point is what Hollis laid out, i.e. the ability to abstract underlying infrastructure complexity and run workloads across multiple heterogeneous estates with an identical experience, is what supercloud is all about. Hello and welcome to this week's Wikibon CUBE Insights powered by ETR. In this Breaking Analysis, we share what we hope to learn from Supercloud 22 next week, next Tuesday at 9:00 AM Pacific. The community is gathering for Supercloud 22, an inclusive pilot symposium hosted by theCUBE and made possible by VMware and other founding partners. It's a one day single track event with more than 25 speakers digging into the architectural, the technical, structural and business aspects of Supercloud. This is a hybrid event with a live program in the morning running out of our Palo Alto studio and pre-recorded content in the afternoon featuring industry leaders, technologists, analysts and investors up and down the technology stack. 
Now, as I said up front, the seeds of supercloud were sown early last decade. After the very first re:Invent we published our Amazon gorilla post, that's seen in the upper right corner here. And we talked about how to differentiate from Amazon and form ecosystems around industries and data and how the cloud would change IT permanently. And then up in the upper left we put up a post on the old Wikibon Wiki. Yeah, it used to be a Wiki. Check out my hair, by the way, no gray, that's how long ago this was. And we talked about in that post how to compete in the Amazon economy. And we showed a graph of how IT economics were changing. And cloud services had marginal economics that looked more like software than hardware at scale. And this would reset, we said, opportunities for both technology sellers and buyers for the next 20 years. And this came into sharper focus in the ensuing years, culminating in a milestone post by Greylock's Jerry Chen called Castles in the Cloud. It was an inspiration and catalyst for us using the term Supercloud in John Furrier's post prior to re:Invent 2021. So we started to flesh out this idea of Supercloud where companies of all types build services on top of hyperscale infrastructure and across multiple clouds, going beyond multicloud 1.0, if you will, which was really a symptom, as we said many times, of multi-vendor, at least that's what we argued. And despite its fuzzy definition, it resonated with people because they knew something was brewing. Keith Townsend, the CTO Advisor, even though he frankly wasn't a big fan of the buzzy nature of the term Supercloud, posted this awesome blackboard on Twitter. Take a listen to how he framed it. Please play the clip. >> Is VMware the right company to make the supercloud work, a term that Wikibon came up with to describe the taking of discrete services. 
So, say, RDS from AWS, cloud compute engines from GCP and authentication from Azure to build SaaS applications or enterprise applications that connect back to your data center. Is VMware's cross-cloud vision, 'cause it is just a vision today, the right approach? Or should you be looking towards companies like HashiCorp to provide this overall capability that we all agree, or maybe you don't, that we need in an enterprise? Comment below with your thoughts. >> So I really like that Keith has deep practitioner knowledge and lays out a couple of options. I especially like the examples he uses of cloud services. He recognizes the need for cross-cloud services and he notes this capability is aspirational today. Remember, this was eight or nine months ago. And he brings HashiCorp into the conversation, as they're one of the speakers at Supercloud 22, and he asks the community what they think. The thing is, we're trying to really test out this concept, and people like Keith are instrumental as collaborators. Now I'm sure you're not surprised to hear that not everyone is on board with the Supercloud meme. In particular, Charles Fitzgerald has been a wonderful collaborator just by his hilarious criticisms of the concept. After a couple of Supercloud posts, Charles put up his second rendition of "Supercloudifragilisticexpialidoucious". I mean, it's just beautiful. But to boot, he put up this picture of Baghdad Bob asking us to just stop. Bob's real name is Mohamed Said al-Sahaf. He was the minister of propaganda for Saddam Hussein during the 2003 invasion of Iraq. And he made these outrageous claims of, you know, US troops running in fear and putting down their arms and so forth. So anyway, Charles laid out several frankly very helpful critiques of Supercloud, which has led us to really advance the definition and catalyze the community's thinking on the topic. Now, one of his issues, and there are many, is we said a prerequisite of Supercloud was a super PaaS layer. 
Gartner's Lydia Leong chimed in, saying there were many examples of successful PaaS vendors built on top of a hyperscaler, some having the option to run in more than one cloud provider. But the key point we're trying to explore is the degree to which that PaaS layer is purpose built for a specific Supercloud function, and not only runs in more than one cloud provider, Lydia, but runs across multiple clouds simultaneously, creating an identical developer experience irrespective of estate. Now, maybe that's what Lydia meant. It's hard to say from just a tweet, and she's a sharp lady and knows more about that market, that PaaS market, than I do. But to the former point, at Supercloud 22 we have several examples we're going to test. One is Oracle and Microsoft's recent announcement to run database services on OCI and Azure, making them appear as one. Rather than use an off-the-shelf platform, Oracle claims to have developed a capability for developers specifically built to ensure high performance, low latency, and a common experience for developers across clouds. Another example we're going to test is Snowflake. I'll be interviewing Benoit Dageville, co-founder of Snowflake, to understand the degree to which Snowflake's recent announcement of an application development platform is purpose built for the Snowflake data cloud. Is it just a plain old PaaS, big whoop, as Lydia claims, or is it something new and innovative? By the way, we invited Charles Fitz to participate in Supercloud 22 and he declined, saying, in addition to a few other somewhat insulting things, "There's definitely interesting new stuff brewing that isn't traditional cloud or SaaS, but branding it all Supercloud doesn't help either." Well, indeed, we agree with part of that, and we'll see if it helps advance thinking and helps customers really plan for the future. And that's why Supercloud 22 is going to feature some of the best analysts in the business in The Great Supercloud Debate. 
In addition to Keith Townsend, Maribel Lopez of Lopez Research and Sanjeev Mohan, former Gartner analyst and principal at SanjMo, participated in this session. Now we don't want to mislead you. We don't want to imply that these analysts are hopping on the Supercloud bandwagon, but they're more than willing to go through the thought experiment and mental exercise. And we had a great conversation that you don't want to miss. Maribel Lopez had what I thought was a really excellent way to think about this. She used TCP/IP as an historical example. Listen to what she said. >> And Sanjeev Mohan has some excellent thoughts on the feasibility of an open versus de facto standard getting us to the vision of Supercloud, what's possible and what's likely. Now, again, I don't want to imply that these analysts are out banging the Supercloud drum. They're not necessarily doing that, but they do, I think it's fair to say, believe that something new is bubbling, and whether it's called Supercloud or multicloud 2.0 or cross-cloud services or whatever name you choose, it's not the multicloud of the 2010s. And we chose Supercloud. So our goal here is to advance the discussion on what's next in cloud, and Supercloud is meant to be a term to describe that future of cloud, and specifically the cloud opportunities that can be built on top of hyperscale compute, storage, networking, machine learning, and other services at scale. And that is why we posted this piece on answering the top 10 questions about Supercloud, many of which were floated by Charles Fitzgerald and others in the community. Why does the industry need another term? What's really new and different, and what is hype? What specific problems does Supercloud solve? What are the salient characteristics of Supercloud? What's different beyond multicloud? What is a super PaaS? Is it necessary to have a Supercloud? How will applications evolve on superclouds? What workloads will run? 
All these questions will be addressed in detail as a way to advance the discussion and help practitioners and business people understand what's real today and what's possible with cloud in the near future. And one other question we'll address is who will build superclouds, and what new entrants can we expect? This is an ETR graphic that we showed in a previous episode of Breaking Analysis, and it lays out some of the companies we think are building superclouds or are in a position to do so. By the way, the Y axis shows net score, or spending velocity, and the X axis depicts presence in the ETR survey of more than 1200 respondents. But the key callouts on this slide, in addition to some of the smaller firms that aren't yet showing up in the ETR data like Chaossearch and Starburst and Aviatrix and Clumio, the really interesting additions are industry players: Walmart with Azure, Capital One and Goldman Sachs with AWS, Oracle with Cerner. These, we think, are early examples bubbling up of industry clouds that will eventually become superclouds. So we'll explore these and other trends to get the community's input on how this will all play out. These are the things we hope you'll take away from Supercloud 22. And we have an amazing lineup of experts to answer your questions. Technologists like Kit Colbert, Adrian Cockcroft, Marianna Tessel, Chris Hoff, Will DeForest, Ali Ghodsi, Benoit Dageville, Muddu Sudhakar and many other tech athletes; investors like Jerry Chen and In Sik Rhee; the analysts we featured earlier; Paula Hansen talking about go to market in a multi-cloud world; Gee Rittenhouse talking about cloud security; David McJannet, Bhaskar Gorti of Platform9 and many, many more. And of course you, so please go to theCUBE.net and register for Supercloud 22, really lightweight reg. We're not doing this for lead gen. We're doing it for collaboration. If you sign in you can get the chat and ask questions in real time. 
So don't miss this inaugural event, Supercloud 22, on August 9th at 9:00 AM Pacific. We'll see you there. Okay. That's it for today. Thanks for watching. Thank you to Alex Myerson, who's on production and manages the podcast. Kristen Martin and Cheryl Knight, they help get the word out on social media and in our newsletters. And Rob Hof is our editor in chief over at SiliconANGLE. Does some really wonderful editing. Thank you to all. Remember these episodes are all available as podcasts wherever you listen, just search Breaking Analysis podcast. I publish each week on wikibon.com and siliconangle.com. And you can email me at David.Vellante@siliconangle.com or DM me @dvellante, or comment on my LinkedIn post. Please do check out ETR.AI for the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching. And we'll see you next week in Palo Alto at Supercloud 22 or next time on Breaking Analysis. (calm music)

Published Date : Aug 5 2022

SUMMARY :

This is breaking analysis and buyers for the next 20 years. Is VMware the right company is the degree to which that PaaS layer and specifically the cloud opportunities

ENTITIES

Entity	Category	Confidence
Alex Myerson	PERSON	0.99+
Dave Vellante	PERSON	0.99+
David McJannet	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
Paula Hansen	PERSON	0.99+
Jerry Chen	PERSON	0.99+
Adrian Cockcroft	PERSON	0.99+
Maribel Lopez	PERSON	0.99+
Keith Townsend	PERSON	0.99+
Kristen Martin	PERSON	0.99+
Chuck Hollis	PERSON	0.99+
Charles Fitz	PERSON	0.99+
Charles	PERSON	0.99+
Chris Hoff	PERSON	0.99+
Keith	PERSON	0.99+
Mariana Tessel	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Ali Ghodsi	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Charles Fitzgerald	PERSON	0.99+
Mohamed Said al-Sahaf	PERSON	0.99+
Kit Colbert	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Rob Hof	PERSON	0.99+
Clumio	ORGANIZATION	0.99+
Goldman Sachs	ORGANIZATION	0.99+
Gee Rittenhouse	PERSON	0.99+
Aviatrix	ORGANIZATION	0.99+
Chaossearch	ORGANIZATION	0.99+
Benoit Dageville	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
NIST	ORGANIZATION	0.99+
Lydia Leong	PERSON	0.99+
Muddu Sudhakar	PERSON	0.99+
Bob	PERSON	0.99+
Cerner	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
Sanjeev Mohan	PERSON	0.99+
Capital one	ORGANIZATION	0.99+
David.Vellante@siliconangle.com	OTHER	0.99+
Starburst	ORGANIZATION	0.99+
EMC	ORGANIZATION	0.99+
2010s	DATE	0.99+
Will DeForest	PERSON	0.99+
more than 1200 respondents	QUANTITY	0.99+
one day	QUANTITY	0.99+
VMware	ORGANIZATION	0.99+
Gartner	ORGANIZATION	0.99+
2021	DATE	0.99+
next week	DATE	0.99+
Supercloud 22	EVENT	0.99+
theCUBE.net	OTHER	0.99+
Bhaskar Gorti	PERSON	0.99+
Supercloud	ORGANIZATION	0.98+
each week	QUANTITY	0.98+
eight	DATE	0.98+
SanjMo	ORGANIZATION	0.98+
Lydia	PERSON	0.98+
theCUBE	ORGANIZATION	0.98+
PaaS	TITLE	0.98+
more than 25 speakers	QUANTITY	0.98+
Snowflake	ORGANIZATION	0.98+
Platform9	ORGANIZATION	0.97+
first	QUANTITY	0.97+
one	QUANTITY	0.97+
today	DATE	0.97+
Hollis	PERSON	0.97+
Sadam Husein	PERSON	0.97+
second rendition	QUANTITY	0.97+
Boston	LOCATION	0.97+
SiliconANGLE	ORGANIZATION	0.96+
more than one cloud provider	QUANTITY	0.96+
both	QUANTITY	0.95+
super cloud 22	EVENT	0.95+

Supercloud22

(upbeat music) >> On August 9th at 9:00 am Pacific, we'll be broadcasting live from theCUBE Studios in Palo Alto, California. Supercloud22, an open industry event made possible by VMware. Supercloud22 will lay out the future of multi-cloud services in the 2020s. John Furrier and I will be hosting a star lineup, including Kit Colbert, VMware CTO, Benoit Dageville, co-founder of Snowflake, Marianna Tessel, CTO of Intuit, Ali Ghodsi, CEO of Databricks, Adrian Cockcroft, former CTO of Netflix, Jerry Chen of Greylock, Chris Hoff aka Beaker, Maribel Lopez, Keith Townsend, Sanjeev Mohan, and dozens of thought leaders. A full day track with 17 sessions. You won't want to miss Supercloud22. Go to thecube.net to mark your calendar and learn more about this free hybrid event. We'll see you there. (upbeat music)

Published Date : Jul 30 2022

SUMMARY :

and dozens of thought leaders.

ENTITIES

Entity	Category	Confidence
Tristan	PERSON	0.99+
George Gilbert	PERSON	0.99+
John	PERSON	0.99+
George	PERSON	0.99+
Steve Mullaney	PERSON	0.99+
Katie	PERSON	0.99+
David Floyer	PERSON	0.99+
Charles	PERSON	0.99+
Mike Dooley	PERSON	0.99+
Peter Burris	PERSON	0.99+
Chris	PERSON	0.99+
Tristan Handy	PERSON	0.99+
Bob	PERSON	0.99+
Maribel Lopez	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Mike Wolf	PERSON	0.99+
VMware	ORGANIZATION	0.99+
Merim	PERSON	0.99+
Adrian Cockcroft	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Brian	PERSON	0.99+
Brian Rossi	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Chris Wegmann	PERSON	0.99+
Whole Foods	ORGANIZATION	0.99+
Eric	PERSON	0.99+
Chris Hoff	PERSON	0.99+
Jamak Dagani	PERSON	0.99+
Jerry Chen	PERSON	0.99+
Caterpillar	ORGANIZATION	0.99+
John Walls	PERSON	0.99+
Marianna Tessel	PERSON	0.99+
Josh	PERSON	0.99+
Europe	LOCATION	0.99+
Jerome	PERSON	0.99+
Google	ORGANIZATION	0.99+
Lori MacVittie	PERSON	0.99+
2007	DATE	0.99+
Seattle	LOCATION	0.99+
10	QUANTITY	0.99+
five	QUANTITY	0.99+
Ali Ghodsi	PERSON	0.99+
Peter McKee	PERSON	0.99+
Nutanix	ORGANIZATION	0.99+
Eric Herzog	PERSON	0.99+
India	LOCATION	0.99+
Mike	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
five years	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Kit Colbert	PERSON	0.99+
Peter	PERSON	0.99+
Dave	PERSON	0.99+
Tanuja Randery	PERSON	0.99+

The New Data Equation: Leveraging Cloud-Scale Data to Innovate in AI, CyberSecurity, & Life Sciences

>> Hi, I'm Natalie Ehrlich and welcome to the AWS Startup Showcase presented by theCUBE. We have an amazing lineup of great guests who will share their insights on the latest innovations and solutions in leveraging cloud-scale data in AI, security and life sciences. And now we're joined by the co-founders and co-CEOs of theCUBE, Dave Vellante and John Furrier. Thank you gentlemen for joining me. >> Hey Natalie. >> Hey Natalie. >> How are you doing? Hey John. >> Well, I'd love to get your insights here. Let's kick it off, and what are you looking forward to? >> Dave, I think one of the things that we've been doing on theCUBE for 11 years is looking at the signal in the marketplace. I wanted to focus on this because AI is cutting across all industries. So we're seeing that with cybersecurity and life sciences. It's the first time we've had a life sciences track in the showcase, which is amazing because it shows that growth of the cloud scale. So I'm super excited by that. And I think that's going to showcase some new business models. And of course the keynote from Ali Ghodsi, who's the CEO of Databricks, pushing a billion dollars in revenue, is clear validation that startups can go from zero to a billion dollars in revenue. So that should be really interesting. And of course the top venture capitalists are coming in to talk about what the enterprise dynamics are all about. And what about you, Dave? >> You know, I thought it was an interesting mix and choice of startups. When you think about, you know, AI, security and healthcare, and I've been thinking about that. Healthcare is the perfect industry, it is ripe for disruption. If you think about healthcare, you know, we all complain how expensive it is, it's not transparent. There's a lot of discussion about, you know, can everybody have equal access? And certainly with COVID the staff is burned out. 
There's a real divergence and diversity in the quality of healthcare and, you know, it all results in patients not being happy. And I mean, if you had to do an NPS score on the patients in healthcare, it would be pretty low, John, you know. So when I think about, you know, AI and security in the context of healthcare in cloud, I ask questions like, when are machines going to be able to meet or make better diagnoses than doctors? And that's starting. I mean, it's really assistance being put into play today. But I think when you think about cheaper and more accurate image analysis, when you think about the overall patient experience and trust and personalized medicine, self-service, you know, remote medicine that we've seen during the COVID pandemic, disease tracking, language translation, I mean, there are so many things where the cloud and data and AI can help. And then at the end of it, it's all about, okay, how do I authenticate? How do I deal with privacy and personal information and tamper resistance? And that's where the security play comes in. So it's a very interesting mix of startups. I think that I'm really looking forward to hearing from... >> You know, Natalie, one of the things we talked about, some of these companies, Dave, we've talked to a lot of these companies, and to me the business model innovations are coming out of two factors. The pandemic is kind of coming to an end, so that accelerated and really showed who had the right stuff, in my opinion. So you were either on the wrong side or right side of history when it comes to the pandemic, and as we look back, as we come out of it, there's clear growth in certain companies, and certain companies that adopted, let's say, cloud. And the other one is cloud scale. So the focus of these startup showcases is really to focus on how startups can align with the enterprise buyers and create the new kind of refactoring business models to go from, you know, a re-pivot or refactoring to more value. 
And the other thing that's interesting is that the business model isn't just for the good guys. If you look at, say, ransomware, for instance, the business model of hackers has gone completely amazing too. They're kicking butt in terms of revenue; they're well-funded machines on how to extort cash from companies. So there's a lot of security issues around the business model as well. So to me, the business model innovation with cloud-scale tech, with the pandemic as a forcing function, you've seen a lot of new kinds of decision-making in enterprises. You're seeing how enterprise buyers are changing their decision criteria, and frankly their existing suppliers. So if you're an old guard supplier, you're going to be potentially out, because if you didn't deliver during the pandemic, this is the issue that everyone's talking about. And it's kind of not publicized in the press very much, but this is actually happening. >> Well, thank you both very much for joining me to kick off our AWS Startup Showcase. Now we're going to go to our very special guest Ali Ghodsi, and John Furrier will sit with him for a fireside chat, and Dave and I will see you on the other side. >> Okay, Ali, great to see you. Thanks for coming on our AWS Startup Showcase, our second edition, second batch, season two, whatever we want to call it. It's our second version of this new series where we feature, you know, the hottest startups coming out of the AWS ecosystem. And you're one of them, I've been there, but you're not a startup anymore, you're here pushing serious success on the revenue side and company. Congratulations and great to see you. >> Likewise. Thank you so much, good to see you again. >> You know, I remember the first time we chatted on theCUBE, you weren't really doing much software revenue, you were really talking about the new revolution in data. And you were all in on cloud. 
And I will say that from day one, you were always adamant that it was cloud, cloud scale, before anyone was really talking about it. And at that time it was on premises with Hadoop and those kinds of things. You saw that early. I remember that conversation. Boy, that bet paid out great. So congratulations. >> Thank you so much. >> So I've got to ask you to jump right in. Enterprises are making decisions differently now, and you are an example of that, a company that has gone from literally zero software sales to pushing a billion dollars, as it's being reported. Certainly the success of Databricks has been written about, but what's not written about is the success of how you guys align with the changing criteria for the enterprise customer. Take us through that. And these companies here are aligning the same thing, and enterprises want to change. They want to be on the right side of history. What's the success formula? >> Yeah. I mean, basically what we always did was look a few years out: how can we help these enterprises future-proof what they're trying to achieve, right? They have, you know, 30 years of legacy software and, you know, baggage, and they have compliance and regulations. How do we help them move to the future? So we try to identify those kinds of secular trends that we think are going to, maybe you see them a little bit right now, cloud was one of them, but it gets more and more and more. So we identified those, and there were sort of three or four of those that we kind of latched onto. And then every year that passes, we're a little bit more right, 'cause it's a secular trend in the market. And then eventually, it becomes a force that you can't kind of fight anymore. >> Yeah. And I just want to put a plug for your Clubhouse talks with Andreessen Horowitz. You're always on Clubhouse talking about, you know, I won't say the killer instinct, but being a CEO in a time where there's so much change going on, you're constantly under pressure. 
It's a lonely job at the top, I know that, but you've made some good calls. What were some of the key moments that you can point to, where you were like, okay, the wave is coming in now, we'd better get on it? What were some of those key decisions? 'Cause a lot of these startups want to be in your position, and a lot of buyers want to take advantage of the technology that's coming. They've got to figure it out. What were some of those key inflection points for you? >> So if you're just listening to what everybody's saying, you're going to miss those trends. So then you're just going with the stream. So, one, you mentioned, that's cloud. Cloud was a thing at the time, we thought it's going to be the thing that takes over everything. Today it's actually multi-cloud. So multi-cloud is a thing, more and more people are thinking, wow, I'm paying a lot to the cloud vendors, do I want to buy more from them or do I want to have some optionality? So that's one. Two, open. They're worried about lock-in. You know, lock-in has happened for many, many decades. So they want open architectures, open source, open standards. So that's the second one that we bet on. The third one, which, you know, initially wasn't sort of super obvious, was AI and machine learning. Now it's super obvious, everybody's talking about it. But when we started, artificial intelligence kind of referred to robotics, and machine learning wasn't a term that people really knew about. Today, it's sort of, everybody's doing machine learning and AI. So betting on those future trends, those secular trends as we call them, is super critical. >> And one of the things that I want to get your thoughts on is this idea of re-platforming versus refactoring. You see a lot being talked about in some of these, what does that even mean? It's people trying to figure that out. Re-platforming, I get the cloud scale. 
But as you look at the cloud benefits, what do you say to customers out there and enterprises that are trying to use the benefits of the cloud? Say data, for instance, in the middle of it, how could they be thinking about refactoring? And how can they make a better selection on suppliers? I mean, you know, it used to be RFP, you deliver these speeds and feeds and you get selected. Now I think there's a little bit different science and methodology behind it. What's your thoughts on this refactoring as a buyer? What do I got to do? >> Well, I mean, let's start with, you said RFP and so on. Times have changed. Back in the day, you had to kind of sign up for something and then much later you're going to get it. So then you have to go through this arduous process. In the cloud, with the pay-as-you-go model, elasticity and so on, you can kind of try your way to it. You can try before you buy. And you can use more and more. You can do it gradually, you don't need to go all in and, you know, say we commit to 50 million and six months later find out that, wow, this stuff is shelfware, it doesn't work. So that's one thing that has changed, and it's beneficial. But the second thing is, don't just mimic what you had on prem in the cloud. So that's what this refactoring is about. If you had, you know, a Hadoop data lake, now you're just going to have an S3 data lake. If you had an on-prem data warehouse, now you're just going to have a cloud data warehouse. You're just repeating what you did on prem in the cloud. Architect it for the future. And you know, for us, the most important thing that we say is that this lake house paradigm is a cloud native way of organizing your data. That's different from how you would do things on premises. So think through what's the right way of doing it in the cloud. Don't just try to copy-paste what you had on premises into the cloud. >> It's interesting, one of the things that we're observing, and I'd love to get your reaction to this. 
Dave Vellante and I have been reporting on it, is two personas in the enterprise are changing their organization. One is what I call IT ops, or there's an SRE role developing. And the data teams are being dismantled and being kind of sprinkled through into other teams. There's this notion of data pipelining being part of workflows, not just a department. Are you seeing organizational shifts in how people are organizing their resources, their human resources, to take advantage of, say, the data problems that need to be solved with machine learning and whatnot and cloud scale? >> Yeah, absolutely. So you're right. SRE became a thing, lots of DevOps people. It was because when the cloud vendors launched their infrastructure as a service, to stitch all these things together and get it all working you needed a lot of DevOps people. But now things are maturing. So, you know, with vendors like Databricks and other multi-cloud vendors, you can actually get much higher level services where you don't need to necessarily have lots and lots of DevOps people that are themselves trying to stitch together lots of services to make this work. So that's one trend. But secondly, you're seeing more data teams being sort of completely ubiquitous in these organizations. Before, it used to be you have one data team and then we'll have data and AI and we'll be done. It's a one and done. But that's not how it works. That's not how Google, Facebook, Twitter did it, they had data throughout the organization. Every BU was empowered. It's sales, it's marketing, it's finance, it's engineering. So how do you embed all those data teams and make them actually run fast? And you know, there's this concept of a data mesh, which is super important, where you can actually decentralize and enable all these teams to focus on their domains and run super fast. And that's really enabled by this lake house paradigm in the cloud that we're talking about. Where you're open, you're basing it on open standards. 
You have flexibility in the data types and how they're going to store their data. So you kind of provide a lot of that flexibility, but at the same time, you have sort of centralized governance for it. So absolutely things are changing in the market. >> Well, you're just the professor, the masterclass right here is amazing. Thanks for sharing that insight. You're always got to go out of date and that's why we have you on here. You're amazing, a great resource for the community. Ransomware is a huge problem, it's now the government's focus. We're being attacked and we don't know where it's coming from. There's business models around cyber that are expanding rapidly. There's real revenue behind it. There's a data problem. It's not just a security problem. So one of the themes in all of these startup showcases is data is ubiquitous in the value propositions. One of them is ransomware. What's your thoughts on ransomware? Is it a data problem? Does cloud help? Some are saying that cloud's got better security with ransomware than, say, on premise. What's your vision of how you see this ransomware problem being addressed, besides the government taking over? >> Yeah, that's a great question. Let me start by saying, you know, we're a data company, right? And if you say you're a data company, you might as well just have said, we're a privacy company, right? It's like some people say, well, what do you think about privacy? Do you guys even do privacy? We're a data company. So yeah, we're a privacy company as well. Like, you can't talk about data without talking about privacy. With every customer, with every enterprise. So that's obviously top of mind for us. 
I do think that in the cloud, security is much better because, you know, vendors like us, we're investing so many resources into security and making sure that we harden the infrastructure and, you know, by actually having all of this infrastructure, we can monitor it, detect if something is, you know, an attack is happening, and we can immediately sort of stop it. So that's different from when it's on prem, where you have kind of like the separated duties, where the software vendor, which would have been us, doesn't really see what's happening in the data center. So, you know, there's an IT team that didn't develop the software that is responsible for the security. So I think things are much better now. I think we're much better set up, but of course, things like cryptocurrencies and so on are making it easier for people to sort of hide. There are decentralized networks. So, you know, the attackers are getting more and more sophisticated as well. So that's definitely something that's super important. It's super top of mind. We're all investing heavily into security and privacy because, you know, that's going to be super critical going forward. >> Yeah, we've got to move that red line, and figure that out and get more intelligence. Decentralized trends are not going away, it's going to be more of that, less of the centralized. But centralized does come into play with data. It's a mix, it's not mutually exclusive. And I'll get your thoughts on this. An architectural question with, you know, 5G and the edge coming. Amazon's got that Outposts, the Wavelength, you're seeing Mobile World Congress coming up this month. The focus on processing data at the edge is a huge issue. And enterprises are now going to be a commercial part of that. So architecture decisions are being made in enterprises right now. And this is a big issue. So you mentioned multi-cloud, so tools versus platforms. Now I'm an enterprise buyer and there's no more RFPs. 
I got all these new choices of startups and growing companies to choose from that are cloud native. I got all kinds of new challenges and opportunities. How do I build my architecture so I don't foreclose a future opportunity? >> Yeah, as I said, look, you're actually right. Cloud is becoming even more and more something that everybody's adopting, but at the same time, there is this thing that the edge is also more and more important. And the connectivity between those two, and making sure that you can really do that efficiently. My ask from enterprises, and I think this is top of mind for all the enterprise architects, is choose open, because that way you can avoid locking yourself in. So that's one thing that's really, really important. In the past, you know, all these vendors that locked you in, and then you try to move off of them, they were highly innovative back in the day. In the 80's and the 90's, they were the best companies. You gave them all your data and it was fantastic. But then because you were locked in, they didn't need to innovate anymore. And you know, they focused on margins instead. And then over time, the innovation stopped and now you were kind of locked in. So I think openness is really important. I think preserving optionality with multi-cloud, because we see the different clouds have different strengths and weaknesses and it changes over time. All right. Early on AWS was the only game. Then Azure showed up with much better security, Active Directory, and so on. Now Google with AI capabilities. Which one's going to win, which one's going to be better? Actually, probably all three are going to be around. So having that optionality that you can pick between the three. And then artificial intelligence. I think that's going to be the key to the future. You know, you asked about security earlier. That's how people detect zero day attacks, right? You ask about the edge, same thing there, that's where the predictions are going to happen. 
So make sure that you invest in AI and artificial intelligence very early on because it's not something you can just bolt on later on and have a little data team somewhere that then now you have AI and it's one and done. >> All right. Great insight. I've got to ask you, the folks may or may not know, but you're a professor at Berkeley as well, done a lot of great work. That's where you kind of came out of when Databricks was formed. And Berkeley basically invented distributed computing back in the 80's. I remember I was breaking in when Unix was proprietary, when software wasn't open you actually had to deal under the table to get code. Now it's all open. And the internet now, with distributed computing and how interconnects are happening. I mean, the internet didn't break during the pandemic, which proves the benefit of the internet. And that's a positive. But as you start seeing edge, it's essentially distributed computing. So I got to ask you from a computer science standpoint. What do you see as the key learnings or connect the dots for how this distributed model will work? I see hybrid clearly, hybrid cloud is clearly the operating model but if you take it to the next level of distributed computing, what are some of the key things that you look for in the next five years as this starts to be completely interoperable, obviously software is going to drive a lot of it. What's your vision on that? >> Yeah, I mean, you know, so Berkeley, you're right. For the geeks, you know, there was the NOW project 20, 30 years ago that basically is how we do things. There was a project on how you search, very early on, with Inktomi, that became how Google and everybody else do search today. So Berkeley was super, super early, sometimes way too early. And that was actually the mistake. Was that they were so early that people said that that stuff doesn't work. And then 20 years later it was reinvented. 
So I think 2009, Berkeley published "Above the Clouds," saying the cloud is the future. At that time, most industry leaders said, that's just, you know, that doesn't work. Today, recently they published a research paper called "Sky Computing." So sky computing is what you get above the clouds, right? So we have the cloud as the future, the next level after that is the sky. That's the one on top of them. That's what multi-cloud is. So that's a lot of the research at Berkeley, you know, in the distributed systems labs is about this. And we're excited about that. And we're one of the sky computing vendors out there. So I think you're going to see much more innovation happening at the sky level than at the compute level where you needed all those DevOps and SRE people to like, you know, build everything manually themselves. >> I can just see the memes now coming, Ali: Skynet, Star Trek. You've got space too, by the way, space is another frontier that is seeing a lot of action going on because now the surface area of data with satellites is huge. So again, I know you guys are doing a lot of business with folks in that vertical where you're starting to see real time data acquisition coming from these satellites. What's your take on the whole space as the, not the final frontier, but certainly as a new congested and contested space for, for data? >> Well, I mean, as a data vendor, we see a lot of, you know, alternative data sources coming in and people are using machine learning, AI to eke out signal out of the, you know, massive amounts of imagery that's coming out of these satellites. So that's actually pretty common in FinTech, which is a vertical for us. And also sort of in the public sector, lots of, lots of, lots of satellite imagery data that's coming. And these are massive volumes. I mean, it's like huge data sets and it's super, super exciting what they can do. 
Like, you know, extracting signal from the satellite imagery is, and you know, being able to handle that amount of data, it's a challenge for all the companies that we work with. So we're excited about that too. I mean, definitely that's a trend that's going to continue. >> All right. I'm super excited for you. And thanks for coming on The Cube here for our keynote. I got to ask you a final question. As you think about the future, I see your company has achieved great success in a very short time, and again, you guys have done the work, I've been following your company as you know. We've been breaking that Databricks story for a long time. I've been excited by it, but now what's changed. You got to start thinking about the next 20-mile stare when you look at, you know, the sky computing, you're thinking about these new architectures. As the CEO, your job is to, one, not run out of money, which you don't have to worry about anymore, so hiring. And then, you got to figure out that next 20-mile stare as a company. What's going on in your mind? Take us through your mindset of what's next. And what do you see out in that landscape? >> Yeah, so what I mentioned around sky computing, optionality around multi-cloud, you're going to see a lot of capabilities around that. Like how do you get multi-cloud disaster recovery? How do you leverage the best of all the clouds while at the same time not having to just pick one? So there's a lot of innovation there that, you know, we haven't announced yet, but you're going to see a lot of it over the next many years. Things that you can do when you have the optionality across the different parts. And the second thing that's really exciting for us is bringing AI to the masses. Democratizing data and AI. So how can you actually apply machine learning to machine learning? How can you automate machine learning? Today machine learning is still quite complicated and it's pretty advanced. 
It's not going to be that way 10 years from now. It's going to be very simple. Everybody's going to have it at their fingertips. So how do we apply machine learning to machine learning? It's called AutoML, automatic, you know, machine learning. So that's an area, and that's not something that we're done with, right? But the goal is to eventually be able to automate away the whole machine learning engineer and the machine learning data scientist altogether. >> You know what's really fun in talking with you is that, you know, for years we've been talking about this inside the ropes, inside the industry, around the future. Now people are starting to get some visibility, the pandemic's forced that. You're seeing the bad projects being exposed. It's like the tide pulled out and you see all the scabs and bad projects that were justified old guard technologies. If you get it right you're on a good wave. And this is clearly what we're seeing. And you guys are an example of that. So as enterprises realize this, that they're going to have to, like, double down on the right projects and probably trash the bad projects, new criteria, how should people be thinking about buying? Because again, we talked about the RFP before. I want to kind of circle back because this is something that people are trying to figure out. You're seeing, you know, organic, you come in with freemium models, as cloud scale becomes the advantage, and the lock-in frankly seems to be the value proposition. The more value you provide, the more lock-in you get. Which sounds like that's the way it should be versus proprietary, you know, protocols. The protocol is value. How should enterprises organize their teams? Is it end to end workflows? Is it, and how should they evaluate the criteria for these technologies that they want to buy? >> Yeah, that's a great question. So I, you know, it's very simple, try to future proof your decision-making. Make sure that whatever you're doing is not locking you in. 
So whatever decision you're making, what if the world changes in five years, make sure that if you're making a mistake now, that's not going to bite you five years later. So how do you do that? Well, open source is great. If you're leveraging open source, you can try it out already. You don't even need to talk to any vendor. Your teams can already download it and try it out and get some value out of it. If you're in the cloud, there are these pay-as-you-go models, you don't have to do a big RFP and commit big. You can try it, pay the vendor, pay as you go, $10, $15. It doesn't need to be a million dollar contract and slowly grow as you're providing value. And then make sure that you're not just locking yourself in to one cloud or, you know, one particular vendor. As much as possible preserve your optionality because then that's not a one-way door. If it turns out later you want to do something else, you can, you know, pick other things as well. You're not locked in. So that's what I would say. Keep that top of mind that you're not locking yourself into a particular decision that you made today, that you might regret in five years. >> I really appreciate you coming on and sharing your insights with our community and The Cube. And as always great to see you. I really enjoy your Clubhouse talks, and I really appreciate how you give back to the community. And I want to thank you for coming on and taking the time with us today. >> Thanks John, always appreciate talking to you. >> Okay, Ali Ghodsi, CEO of Databricks, a success story that proves the validation of cloud scale, open, and create value, value's the new lock-in. So Natalie, back to you for continuing coverage. >> That was a terrific interview John, but I'd love to get Dave's insights first. What were your takeaways, Dave? 
>> Well, if we have more time I'll tell you how Databricks got to where they are today, but I'll say this, the most important thing to me that Ali said was he conveyed a very clear understanding of what data companies are all about and what they're getting right. He talked about four things. There's not one data team, there's many data teams. And he talked about data is decentralized, and data has to have context and that context lives in the business. He said, look, think about it. The way that the data companies would get it right, they've got data teams in sales and marketing and finance and engineering. They all have their own data and data teams. And he referred to that as a data mesh. That's a term that Zhamak Dehghani coined, and the warehouse or the data lake is merely a node in that global mesh. The mesh is discoverable, he talked about federated governance, and Databricks, they're breaking the model of shoving everything into a single repository and trying to make that the so-called single version of the truth. Rather what they're doing, which is right on, is putting data in the hands of the business owners. And that's what true data companies do. And the last thing he talked about was sky computing, which I loved, it's that future layer, we talked about multi-cloud a lot, that abstracts the underlying complexity of the technical details of the cloud and creates additional value on top. I always say that the cloud players like Amazon have given the gift to the world of 100 billion dollars a year they spend in CapEx. Thank you. Now we're going to innovate on top of it. Yeah. And I think the refactoring... >> Hope by John. >> That was great insight and I totally agree. The refactoring piece too was key, he brought that home. But to me, I think the Databricks story that Ali shared there, and why he's been open and sharing a lot of his insights with the community. But what he's not saying, 'cause he's humble and polite, is they cracked the code on the enterprise, Dave. 
And to Dave's point, that's exactly the reason why they did it: they saw an opportunity to make it easier, at that time Hadoop was the rage, and they just made it easier. They were smart, they made good bets, they had a good formula and they cracked the code with the enterprise. They brought it in and they brought value. And see that's the key to the cloud as Dave pointed out. You replatform with the cloud, then you refactor. And I think he pointed out the multi-cloud and that really kind of teases out the whole future and landscape, which is essentially distributed computing. And I think, you know, companies are starting to figure that out with hybrid and on premises and now super edge I call it, with 5G coming. So it's just pretty incredible. >> Yeah. Databricks' IPO is coming and people should know. I mean, they created Spark, as you know, John, and what everybody thought they were going to do is mimic Red Hat and sell subscriptions and support. They didn't, they developed a managed service and they embedded AI tools to simplify data science. So to your point, enterprises could buy instead of build, we know this. Enterprises will spend money to make things simpler. They don't have the resources, and so what they got right was really embedding that, building a managed service, not mimicking the kind of Red Hat model, but actually creating a new value layer there. And that's a big part of their success. >> If I could just add one thing, Natalie, to that: what Dave's saying is really right on. And as an enterprise buyer, if we go to the other side of the equation, it used to be that you had to be a known company, get PR, you fill out RFPs, you had to meet all the specs. It's like going to the airport and getting a swab test, and getting a COVID test and all kinds of mechanisms to like block you and filter you. Most of the biggest success stories that have created the most value for enterprises have been the companies that nobody's understood. 
And Andy Jassy's famous quote of, you know, being misunderstood is actually a good thing. Databricks was very misunderstood at the beginning and no one kind of knew who they were but they did it right. And so the enterprise buyers out there, don't be afraid to test the startups because you know the next Databricks is out there. And I think that's where I see the psychology changing from the old IT buyers, Dave. It's like, okay, let's test this company. And there's plenty of ways to do that. He illuminated those: freemium, small pilots, you don't need to go on these big things. So I think that is going to be a shift in how companies are going to evaluate startups. >> Yeah. Think about it this way. Why should the large banks and insurance companies and big manufacturers and pharma companies, governments, why should they burn resources managing containers and figuring out data science tools if they can just tap into solutions like Databricks, which is an AI platform in the cloud, and let the experts manage all that stuff. Think about how much money and time that saves enterprises. >> Yeah, I mean, we've got 15 companies here we're showcasing in this batch, this season, if you call it that. Episode, are we going to call it? They're awesome. Right? And the next 15 will be the same. And these companies could be the next billion dollar revenue generator because the cloud enables that, Dave. I think that's the exciting part. >> Well thank you both so much for these insights. Really appreciate it. AWS startup showcase highlights the innovation that helps startups succeed. And no one knows that better than our very next guest, Jeff Barr. Welcome to the show and I will send this interview now to Dave and John and see you in just a bit. >> Okay, hey Jeff, great to see you. Thanks for coming on again. >> Great to be back. >> So this is a regular community segment with Jeff Barr who's a legend in the industry. Everyone knows your name. Everyone knows that. 
Congratulations on your recent blog posts we've been reading. Tons of news, I want to get your update because 5G has been all over the news, Mobile World Congress is right around the corner. I know Bill Vass was a keynote out there, virtual keynote. There's a lot of Amazon discussion around the edge with Wavelength. Specifically, this is the Outposts piece. And I know there is news I want to get to, but the top of mind is there's massive Amazon expansion and the cloud is going to the edge, it's here. What's up with Wavelength? Take us through the, I call it the power edge, the super edge. >> Well, I'm really excited about this mostly because it gives a lot more choice and flexibility and options to our customers. This idea that with Wavelength we announced quite some time ago, at least quite some time ago if we think in cloud years. We announced that we would be working with 5G providers all over the world to basically put AWS in the telecom providers' data centers or telecom centers, so that as their customers build apps, that those apps would take advantage of the low latency, the high bandwidth, the reliability of 5G, be able to get to some compute and storage services that are incredibly close geographically and latency wise to the compute and storage that is just going to give customers this new power and say, well, what are the cool things we can build? >> Do you see any correlation between Wavelength and some of the early Amazon services? Because to me, my gut feels like there's so much headroom there. I mean, I was just riffing on the notion of low latency packets. I mean, just think about the applications, gaming and VR, and metaverse kind of cool stuff like that where having the edge have that much power there. It just feels like a new, it feels like a new AWS. I mean, what's your take? You've seen the evolutions and the growth of a lot of the key services. Like EC2 and S3. >> So welcome to my life. 
And so to me, the way I always think about this is it's like when I go to a home improvement store and I wander through the aisles and I often wander through with no particular thing that I actually need, but I just go there and say, wow, they've got this and they've got this, they've got this other interesting thing. And I just let my creativity run wild. And instead of trying to solve a problem, I'm saying, well, if I had these different parts, well, what could I actually build with them? And I really think that this breadth of different services and locations and options and communication technologies. I suspect a lot of our customers and customers-to-be are in this same mode where they're saying, I've got all this awesomeness at my fingertips, what might I be able to do with it? >> It reminds me of when Fry's was around in Palo Alto, that store is no longer here but it used to be back in the day when it was good. It was you go in and just kind of spend hours and then next thing you know, you built a computer. Like, what? I just came in here to get some cables. Now I got a motherboard. >> I clearly remember Fry's and before that there was the Weird Stuff Warehouse, which was another really cool place to hang out if you remember that. >> Yeah I do. >> I wonder if I could jump in and you guys talking about the edge and Jeff I wanted to ask you about something that is, I think people are starting to really understand and appreciate what you did with the Annapurna acquisition, what you do with Nitro and Graviton, and really driving costs down, driving performance up. I mean, there's like a compute renaissance. And I wonder if you could talk about the importance of that at the edge, because it's got to be low power, it has to be low cost. You got to be doing processing at the edge. What's your take on how that's evolving? 
>> Certainly, so you're totally right that we started working with and then ultimately acquired Annapurna Labs in Israel a couple of years ago. I've worked directly with those folks and it's really awesome to see what they've been able to do. Just really saying, let's look at all of these different aspects of building the cloud that were once effectively kind of somewhat software intensive and say, where does it make sense to actually design, build, fabricate, and deploy custom silicon? So from booting up the system to doing all kinds of additional kinds of security checks, to running local IO devices, running the NVMe as fast as possible to support EBS. Each of those things has been a contributing factor to not just the power of the hardware itself, but what I'm seeing and have seen for the last probably two or three years at this point is the pace of innovation on instance types just continues to get faster and faster. And it's not just cranking out new instance types because we can, it's because our awesomely diverse base of customers keeps coming to us and saying, well, we're happy with what we have so far, but here's this really interesting new use case. And we need a different ratio of memory to CPU, or we need more cores based on the amount of memory, or we need a lot of IO bandwidth. And having that Nitro as the base lets us really, I don't want to say plug and play, 'cause I haven't actually built this myself, but it seems like they can actually put the different elements together, very very quickly and then come up with new instance types that just our customers say, yeah, that's exactly what I asked for and be able to just do this entire range of from like micro and nano sized all the way up to incredibly large with incredible just to me like, when we talk about terabytes of memory that are just like actually just RAM memory. It's like, that's just an inconceivably large number by the standards of where I started out in my career. 
So it's all putting this power in customer hands. >> You used the term plug and play, but Nitro does give you that optionality. And the other thing that to me is really exciting is the way in which ISVs are writing to whatever's underneath. So you're making that, you know, transparent to the users so I can choose as a customer, the best price performance for my workload and that's just going to grow that ISV portfolio. >> I think it's really important to be accurate and detailed and as thorough as possible as we launch each one of these new instance types with like what kind of processor is in there and what clock speed does it run at? What kind of, you know, how much memory do we have? What are the, just the ins and outs, and is it Intel or Arm or AMD based? It's such an interesting contrast to me. I can still remember back in the very, very early days, you know, going back almost 15 years at this point and effectively everybody said, well, not everybody. A few people looked and said, yeah, we kind of get the value here. Some people said, this just sounds like a bunch of generic hardware, just kind of generic hardware in a rack. And even back then it was something that we were very careful with to design and optimize for use cases. But this idea that it's generic is so, so, so incredibly inaccurate that I think people are now getting this. And it's okay. It's fine-tuned, not just for the cloud, but for very specific kinds of workloads and use cases. >> And you guys have announced obviously the performance improvements on Lambda, it's getting faster, you've got per-second billing on Windows and SQL Server on EC2. So I mean, obviously everyone kind of gets that, that's been your DNA, keep making it faster, cheaper, better, easier to use. But the other area I want to get your thoughts on because this is also more on the footprint side, is the regions and local regions. 
So you've got more region news, take us through the update on the expansion on the footprint of AWS because you know, a startup can come in and these 15 companies that are here, they're global with AWS, right? So this is a major benefit for customers around the world. And you know, Ali from Databricks mentioned privacy. Everyone's a privacy company now. So it's a huge issue, take us through the news on the region. >> Sure, so the two most recent regions that we announced are in the UAE and in Israel. And we generally like to pre-announce these anywhere from six months to two years at a time because we do know that the customers want to start making longer term plans to where they can start thinking about where they can do their computing, where they can store their data. I think at this point we now have seven regions under construction. And, again it's all about customer choice. Sometimes it's because they have very specific reasons where, based on local laws, based on national laws, they must compute and store within a particular geographic area. Other times we say, well, a lot of our customers are in this part of the world. Why don't we pick a region that is as close to that part of the world as possible. And one really important thing that I always like to remind our customers and my audience of is, anything that you choose to put in a region, stays in that region unless you very explicitly take an action that says I'd like to replicate it somewhere else. So if someone says, I want to store data in the US, or I want to store it in Frankfurt, or I want to store it in Sao Paulo, or I want to store it in Tokyo or Osaka. They get to make that very specific choice. We give them a lot of tools to help copy and replicate and do cross region operations of various sorts. But at the heart, the customer gets to choose those locations. 
And in the early days I think there was this weird sense that you would, you'd put things in the cloud and they would just mysteriously just kind of propagate all over the world. That's never been true, and we're very very clear on that. And I just always like to reinforce that point. >> That's great stuff, Jeff. Great to have you on again as a regular update here, just for the folks watching who don't know, Jeff's been blogging and sharing. He's been the one-man media band for Amazon since its early days. Now he's got departments, he's got people doing videos. It's a media franchise in and of itself, but without your rough days we wouldn't have gotten all the great news we subscribe to. We watch all the blog posts. It's essentially the flow coming out of AWS which is just a tsunami of new announcements. Always great to read, must read. Jeff, thanks for coming on, really appreciate it. That's great. >> Thank you John, great to catch up as always. >> Jeff Barr with AWS again, and follow his stuff. He's got a great audience and community. They talk back, they collaborate and they're highly engaged. So check out Jeff's blog and his social presence. All right, Natalie, back to you for more coverage. >> Terrific. Well, did you guys know that Jeff took a three week AWS road trip across 15 cities in America to meet with cloud computing enthusiasts? 5,500 miles he drove, really incredible, I didn't realize that. Let's unpack that interview though. What stood out to you John? >> I think Jeff Barr's an example of what I call a direct-to-audience business model. He's been doing it from the beginning and I've been following his career. I remember back in the day when Amazon was started, he was always building stuff. He's a builder, he's classic. And he's been there from the beginning. At the beginning it was just the blog and it became a huge audience. He was power blogging so hard, he now has support, and he still does it now. 
It's basically the conduit for information coming out of Amazon. I think Jeff has single-handedly made Amazon so successful at the community developer level, and that's where the startup action happened and that got them going. And I think he deserves a lot of the credit for the success of AWS. >> And Dave, how about you? What is your reaction? >> Well I think you know, and everybody knows about the cloud and CapEx versus OpEx and agility, and you know, eliminating the undifferentiated heavy lifting and all that stuff. And one of the things that's often overlooked, which is why I'm excited to be part of this program, is the innovation. And the innovation comes from startups, and startups start in the cloud. And so I think that that's part of the flywheel effect. You just don't see a lot of startups these days saying, okay, I'm going to do something that's outside of the cloud. There are some, but for the most part, you know, if you're in software, you're starting in the cloud, it's so capital efficient. I think that's one thing. Throughout my career, I've been obsessed with every part of the stack, whether it's, you know, close to the business process with the applications. And right now I'm really obsessed with the plumbing, which is why I was excited to talk about, you know, the Annapurna acquisition. Amazon bought Annapurna for $350 million, it's reported, you know, maybe a little bit more, but that is an amazing acquisition. And the reason why that's so important is because Amazon is continuing to drive costs down, drive performance up. And in my opinion, leaving a lot of the traditional players in their dust, especially when it comes to the power and cooling. Those are often overlooked things. And the other piece of the interview was that Amazon is actually getting ISVs to write to these new platforms so that you don't have to worry about, does the software run on this chip or that chip, or x86 or Arm or whatever it is. It runs. 
And so I can choose the best price performance. And that's where people don't, they misunderstand, you always say it John, just said that people are misunderstood. I think they misunderstand, they confuse, you know, the price of the cloud with the cost of the cloud. They ignore all the labor costs that are associated with that. And so, you know, there's a lot of discussion now about the cloud tax. I just think the pace is accelerating. The gap is not closing, it's widening. >> If you look at the one question I asked him about Wavelength and I had a follow up there when I said, you know, we riff on it and you see, he lit up, like, he was beaming, because he said something interesting. It's not that there's a problem to solve, it's this opportunity. And he conveyed it to like I said, walking through Fry's. But like, you go into a store and he's a builder. So he sees opportunity. And this comes back down to the Martin Casado paradox post he wrote about, do you optimize for CapEx or future revenue? And I think the tell sign is that the Wavelength edge piece is going to be so creative and that's going to open up massive opportunities. I think that's the place to watch. That's the place I'm watching. And I think startups are going to come out of the woodwork because that's where the action will be. And that's just Amazon at the edge, I mean, that's just cloud at the edge. I think that is going to be very effective. And that's a little tell sign; he kind of revealed a little bit there, a lot there, with that comment. >> Well that's a to-be-continued conversation. >> Indeed, I would love to introduce our next guest. We actually have Soma on the line. He's the managing director at Madrona Venture Group. Thank you Soma very much for coming for our keynote program. >> Thank you, Natalie, and it's great to be here and have the opportunity to spend some time with you all. >> Well, you have a long, storied history in the enterprise. 
How would you define the modern enterprise, also known as cloud scale? >> Yeah, so I would say I have, first of all, like, you know, we've all heard this now for the last, you know, say 10 years or so. Like, software is eating the world. Okay. Put it another way, we think about like, hey, every enterprise is a software company first and foremost. Okay. And companies that truly internalize that, that truly think about that, and truly act that way are going to, sort of, continue running well, and those that don't internalize that, and don't do that, are going to be left behind sooner than later. Right. And the last few years, you sort of think and now take it to the next level and talk about, like, now every enterprise is now going through a digital transformation. Okay. So when you sort of think about the world from that lens. Okay. Modern enterprise has to think about, like, hey, I am first and foremost a technology company. I may be in the business of making a car or, you know, manufacturing paper, or like you know, manufacturing some healthcare products or what have you, out there. But technology and software is what is going to give me a unique, differentiated advantage that's going to let me do what I need to do for my customers in the best possible way (indistinct). So that sort of level of focus, level of execution, has to be there in a modern enterprise. The other thing is, like, now every modern enterprise needs to think about it like, I'm competing for talent, not anymore with my peers in my industry. I'm competing for technology talent and software talent with the top five technology companies in the world. Whether it is Amazon or Facebook or Microsoft or Google, or what have you, kind of thing, right? So you really have to have that mindset, and then everything flows from that. >> So I got to ask you on the enterprise side again, you've seen many waves of innovation. You've got, you know, been in the industry for many, many years. 
The old way was enterprises want the best proven product and the startups want that lucrative contract. Right? Yeah. And get that beach in. And it used to be, and we addressed this in our earlier keynote with Ali and how it's changing, the buyers are changing because the cloud has enabled this new kind of execution. I call it agile, call it what you want. Developers are driving modern applications, so enterprises are still, there's no, the playbooks evolving. Right? So we see that with the pandemic, people had needs, urgent needs, and they tried new stuff and it worked. The parachute opened as they say. So how do you look at this as you look at stars, you're investing in and you're coaching them. What's the playbook? What's the secret sauce of how to crack the enterprise code today. And if you're an enterprise buyer, what do I need to do? I want to be more agile. Is there a clear path? Is there's a TSA to let stuff go through faster? I mean, what is the modern playbook for buying and being a supplier? >> That's a fantastic question, John, because I think that sort of playbook is changing, even as we speak here currently. A couple of key things to understand first of all is like, you know, decision-making inside an enterprise is getting more and more de-centralized. Particularly decisions around what technology to use and what solutions to use to be able to do what people need to do. That decision making is no longer sort of, you know, all done like the CEO's office or the CTO's office kind of thing. Developers are more and more like you rightly said, like sort of the central of the workflow and the decision making process. So it'll be who both the enterprises, as well as the startups to really understand that. So what does it mean now from a startup perspective, from a startup perspective, it means like, right. In addition to thinking about like hey, not do I go create an enterprise sales post, do I sell to the enterprise like what I might have done in the past? 
Is that the best way of moving forward, or should I be thinking about a product led growth go to market initiative? You know, build a product that is easy to use, that made self serve really works, you know, get the developers to start using to see the value to fall in love with the product and then you think about like hey, how do I go translate that into a contract with enterprise. Right? And more and more what I call particularly, you know, startups and technology companies that are focused on the developer audience are thinking about like, you know, how do I have a bottom up go to market motion? And sometime I may sort of, you know, overlap that with the top down enterprise sales motion that we know that has been going on for many, many years or decades kind of thing. But really this product led growth bottom up a go to market motion is something that we are seeing on the rise. I would say they're going to have more than half the startup that we come across today, have that in some way shape or form. And so the enterprise also needs to understand this, the CIO or the CTO needs to know that like hey, I'm not decision-making is getting de-centralized. I need to empower my engineers and my engineering managers and my engineering leaders to be able to make the right decision and trust them. I'm going to give them some guard rails so that I don't find myself in a soup, you know, sometime down the road. But once I give them the guard rails, I'm going to enable people to make the decisions. People who are closer to the problem, to make the right decision. >> Well Soma, what are some of the ways that startups can accelerate their enterprise penetration? >> I think that's another good question. First of all, you need to think about like, Hey, what are enterprises wanting to rec? Okay. If you start off take like two steps back and think about what the enterprise is really think about it going. I'm a software company, but I'm really manufacturing paper. What do I do? 
Right? The core thing that most enterprises care about is like, hey, how do I better engage with my customers? How do I better serve my customers? And how do I do it in the most optimal way? At the end of the day that's what like most enterprises really care about. So startups need to understand, what are the problems that the enterprise is trying to solve? What kind of tools and platform technologies and infrastructure support, and, you know, everything else that they need to be able to do what they need to do and what only they can do in the most optimal way. Right? So to the extent you are providing either a tool or platform or some technology that is going to enable your enterprise to make progress on what they want to do, you're going to get more traction within the enterprise. In other words, stop thinking about technology, and start thinking about the customer problem that they want to solve. And the more you anchor your company, and more you anchor your conversation with the customer around that, the more the enterprise is going to get excited about wanting to work with you. >> So I got to ask you on the enterprise and developer equation because CSOs and CXOs, depending who you talk to have that same answer. Oh yeah. In the 90's and 2000's, we kind of didn't, we throttled down, we were using the legacy developer tools and cloud came and then we had to rebuild and we didn't really know what to do. So you seeing a shift, and this is kind of been going on for at least the past five to eight years, a lot more developers being hired yet. I mean, at FinTech is clearly a vertical, they always had developers and everyone had developers, but there's a fast ramp up of developers now and the role of open source has changed. Just looking at the participation. They're not just consuming open source, open source is part of the business model for mainstream enterprises. How is this, first of all, do you agree? 
And if so, how has this changed the course of an enterprise human resource selection? How they're organized? What's your vision on that? >> Yeah. So as I mentioned earlier, John, in my mind the first thing is, and this sort of, you know, like you said financial services has always been sort of hiring people [Indistinct]. And this is like a five-year-old story. So bear with me, I'll tell you the five-year-old story and then come to it. I was talking to the CIO of Goldman Sachs. Okay. And this is five years ago when people were still like, hey, is this cloud thing real and, you know, is cloud going to take over the world? You know, am I really ready to put my data in the cloud? So there were a lot of questions and conversations, kind of thing. The CIO of Goldman Sachs told me two things that I remember to this day. One is, hey, we've got an internal edict. That we made a decision that in the next five years, everything in Goldman Sachs is going to be on the public cloud. And I literally jumped out of the chair and I said like, you know, how are you going to get there? And then he laughed and said like, you know, it really doesn't matter whether we get there or not. We want to set the tone, set the direction for the organization that hey, public cloud is here. Public cloud is there. And we need to like, you know, move as fast as we realistically can and think about all the financial regulations and security and privacy. And all these things that we care about deeply. But given all of that, the world is going towards public cloud and we better be on the leading edge as opposed to the lagging edge. And the second thing he said, like we're talking about like hey, how are you hiring, you know, engineers at Goldman Sachs, kind of thing? And he said like, hey, I sort of, my team goes out to the top 20 schools in the US. And the people we really compete with are, and he was saying this, hey, we don't compete with JP Morgan or Morgan Stanley, or pick any of your favorite financial institutions. 
We really think about like, hey, we want to get the best talent into Goldman Sachs out of these schools. And we really compete head to head with Google. We compete head to head with Microsoft. We compete head to head with Facebook. And we know that the caliber of people that we want to get is no different than what these companies want. If you want to continue being a successful, leading it, you know, financial services player. That sort of tells you what's going on. You also talked a little bit about like hey, open source is here to stay. What does that really mean kind of thing. In my mind like now, you can tell me that I can have from given my pedigree at Microsoft, I can tell you that we were the first embraces of open source in this world. So I'll say that right off the bat. But having said that we did in our turn around and said like, hey, this open source is real, this open source is going to be great. How can we embrace and how can we participate? And you fast forward to today, like in a Microsoft is probably as good as open source as probably any other large company I would say. Right? Including like the work that the company has done in terms of acquiring GitHub and letting it stay true to its original promise of open source and community can I think, right? I think Microsoft has come a long way kind of thing. But the thing that like in all these enterprises need to think about is you want your developers to have access to the latest and greatest tools. To the latest and greatest that the software can provide. And you really don't want your engineers to be reinventing the wheel all the time. So there is something available in the open source world. Go ahead, please set up, think about whether that makes sense for you to use it. And likewise, if you think that is something you can contribute to the open source work, go ahead and do that. 
So it's really a two-way symbiotic relationship that enterprises need to have, and they need to enable their developers to want to have that symbiotic relationship. >> Soma, fantastic insights. Thank you so much for joining our keynote program. >> Thank you Natalie and thank you John. It was always fun to chat with you guys. Thank you. >> Thank you. >> John we would love to get your quick insight on that. >> Well I think first of all, he's a prolific investor, a great one, from Madrona venture partners, which is well known in the tech circles. They're in Seattle, which is in the hub of what I call cloud city. You've got Amazon and Microsoft there. He'd been at Microsoft and he knows the developer ecosystem. And the reason why I like his perspective is that he understands the value of having developers as a core competency in Microsoft. That's their DNA. You look at Microsoft, their number one thing from day one besides software was developers. That was their army, the thousand centurions that won everything for them. That has shifted. And he brought up open source, and .NET and how they've embraced Linux, but something about Satya: before he became CEO, we interviewed him in the cube at an Accel Partners event at Stanford. He was open before he was CEO. He was talking about opening up. They opened up a lot of their open source infrastructure projects to the open compute foundation early. So they had already had that going, and since that time, the stock price of Microsoft has skyrocketed because as Ali said, open always wins. And I think that is what you see here, and as an investor now he's picking startups and investing in them. He's got to read the tea leaves. He's got to be on the right side of history. So he brings a great perspective because he sees the old way and he understands the new way. That is the key for success we've seen in the enterprise and with the startups. The people who get the future, and can create the value are going to win. 
>> Yeah, really excellent point. And just really quickly. What do you think were some of our greatest hits on this hour of programming? >> Well first of all I'm really impressed that Ali took the time to come join us because I know he's super busy. I think they're at a $28 billion valuation now, they're pushing a billion dollars in revenue, GAAP revenue. And again, just a few short years ago, they had zero software revenue. So of these 15 companies we're showcasing today, you know, there's a next Databricks in there. They're all going to be successful. They already are successful. And they're all on this rocket ship trajectory. Ali is smart, he's also got the advantage of being part of that Berkeley community, which is early on a lot of things now. Being early means you're wrong a lot, but you're also right, and you're right big. So Berkeley and Stanford obviously big areas here in the bay area as research. He is smart, he's got a great team and he's really open. So having him share his best practices, I thought that was a great highlight. Of course, Jeff Barr highlighting some of the insights that he brings and honestly having a perspective of a VC. And we're going to have Peter Wagner from Wing VC who's a classic enterprise investor, super smart. So he'll add some insight. Of course, one of the community session, whenever our influencers coming on, it's our beat coming on at the end, as well as Katie Drucker. Another Madrona person is going to talk about growth hacking, growth strategies, but yeah, sights Raleigh coming on. >> Terrific, well thank you so much for those insights and thank you to everyone who is watching the first hour of our live coverage of the AWS startup showcase. For myself, Natalie Ehrlich, John Furrier and Dave Vellante, we want to thank you very much for watching and do stay tuned for more amazing content, as well as a special live segment that John Furrier is going to be hosting. 
It takes place at 12:30 PM Pacific time, and it's called cracking the code, lessons learned on how enterprise buyers evaluate new startups. Don't go anywhere.

Published Date : Jun 24 2021



Cracking the Code: Lessons Learned from How Enterprise Buyers Evaluate New Startups

(bright music) >> Welcome back to the CUBE presents the AWS Startup Showcase The Next Big Thing in cloud startups with AI security and life science tracks, 15 hottest growing startups are presented. And we had a great opening keynote with luminaries in the industry. And now our closing keynote is to get a deeper dive on cracking the code in the enterprise, how startups are changing the game and helping companies change. And they're also changing the game of open source. We have a great guest, Katie Drucker, Head of Business Development, Madrona Venture Group. Katie, thank you for coming on the CUBE for this special closing keynote. >> Thank you for having me, I appreciate it. >> So one of the topics we talked about with Soma from Madrona on the opening keynote, as well as Ali from Databricks is how startups are seeing success faster. So that's the theme of the Cloud speed, agility, but the game has changed in the enterprise. And I want to really discuss with you how growth changes and growth strategy specifically. They talk, go to market. We hear things like good sales to enterprise sales, organic, freemium, there's all kinds of different approaches, but at the end of the day, the most successful companies, the ones that might not be known that just come out of nowhere. So the economics are changing and the buyers are thinking differently. So let's explore that topic. So take us through your view 'cause you have a lot of experience. But first talk about your role at Madrona, what you do. >> Absolutely all great points. So my role at Madrona, I think I have personally one of the more enviable jobs and that my job is to... I get the privilege of working with all of these fantastic entrepreneurs in our portfolio and doing whatever we can as a firm to harness resources, knowledge, expertise, connections, to accelerate their growth. 
So my role in setting up business development is taking a look at all of those tools in the tool chest and partnering with the portfolio to make it so. And in our portfolio, we have a wide range of companies, some rely on enterprise sales, some have other go to markets. Some are direct to consumer, a wide range. >> Talk about the growth strategies that you see evolving because what's clear with the pandemic. And as we come out of it is that there are growth plays happening that don't look a little bit differently, more obvious now because of the Cloud scale, we're seeing companies like Databricks, like Snowflake, like other companies that have been built on the cloud or standalone. What are some of the new growth techniques, or I don't want to say growth hacking, that is a pejorative term, but like just a way for companies to quickly describe their value to an enterprise buyer who's moving away from the old RFP days of vendor selection. The game has changed. So take us through how you see secret key and unlocking that new equation of how to present value to an enterprise and how you see enterprises evaluating startups. >> Yes, absolutely. Well, and that's got a question, that's got a few components nestled in what I think are some bigger trends going on. AWS of course brought us the Cloud first. I think now the Cloud is more and more a utility. And so it's incumbent upon thinking about how an enterprise 'cause using the Cloud is going to go up the value stack and partner with its cloud provider and other service providers. I think also with that agility of operations, you have thinning, if you will, the systems of record and a lot of new entrance into this space that are saying things like, how can we harness AIML and other emerging trends to provide more value directly around work streams that were historically locked into those systems of record? 
And then I think you also have some price plans that are far more flexible around usage based as opposed to just flat subscription or even these big clunky annual or multi-year RFP type stuff. So all of those trends are really designed in ways that favor the emerging startup. And I think if done well, and in partnership with those underlying cloud providers, there can be some amazing benefits that the enterprise realizes an opportunity for those startups to grow. And I think that's what you're seeing. I think there's also this emergence of a buyer that's different than the CIO or the site the CISO. You have things with low code, no code. You've got other buyers in the organization, other line of business executives that are coming to the table, making software purchase decisions. And then you also have empowered developers that are these citizen builders and developer buyers and personas that really matter. So lots of inroads in places for a startup to reach in the enterprise to make a connection and to bring value. That's a great insight. I want to ask that just if you don't mind follow up on that, you mentioned personas. And what we're seeing is the shift happens. There's new roles that are emerging and new things that are being reconfigured or refactored if you will, whether it's human resources or AI, and you mentioned ML playing a role in automation. These are big parts of the new value proposition. How should companies posture to the customer? Because I don't want to say pivot 'cause that means it's not working but mostly extending our iterating around their positioning because as new things have not yet been realized, it might not be operationalized in a company or maybe new things need to be operationalized, it's a new solution for that. Positioning the value is super important and a lot of companies often struggle with that, but also if they get it right, that's the key. What's your feeling on startups in their positioning? 
So people will dismiss it like, "Oh, that's marketing." But maybe that's important. What's your thoughts on the great positioning question? >> I've been in this industry a long time. And I think there are some things that are just tried and true, and it is not unique to tech, which is, look, you have to tell a story and you have to reach the customer and you have to speak to the customer's need. And what that means is, AWS is a great example. They're famous for the whole concept of working back from the customer and thinking about what that customer's need is. I think any startup that is looking to partner or work alongside of AWS really has to embody that very, very customer centric way of thinking about things, even though, as we just talked about those personas are changing who that customer really is in the enterprise. And then speaking to that value proposition and meeting that customer and creating a dialogue with them that really helps to understand not only what their pain points are, but how you were offering solves those pain points. And sometimes the customer doesn't realize that that is their pain point and that's part of the education and part of the way in which you engage that dialogue. That doesn't change a lot, just generation to generation. I think the modality of how we have that dialogue, the methods in which we choose to convey that change, but that basic discussion is what makes us human. >> What's your... Great, great, great insight. I want to ask you on the value proposition question again, the question I often get, and it's hard to answer is am I competing on value or am I competing on commodity? And depending on where you're in the stack, there could be different things like, for example, land is getting faster, smaller, cheaper, as an example on Amazon. That's driving down to low cost high value, but it shifts up the stack. You start to see in companies this changing the criteria for how to evaluate. So an enterprise might be struggling. 
And I often hear enterprises say, "I don't know how to pick who I need. I buy tools, I don't buy many platforms." So they're constantly trying to look for that answer key, if you will, what's your thoughts on the changing requirements of an enterprise? And how to do vendor selection. >> Yeah, so obviously I don't think there's a single magic bullet. I always liked just philosophically to think about, I think it's always easier and frankly more exciting as a buyer to want to buy stuff that's going to help me make more revenue and build and grow as opposed to do things that save me money. And just in a binary way, I like to think which side of the fence are you sitting on as a product offering? And the best ways that you can articulate that, what opportunities are you unlocking for your customer? The problems that you're solving, what kind of growth and what impact is that going to lead to, even if you're one or two removed from that? And again, that's not a new concept. And I think that the companies that have that squarely in mind when they think about their go-to market strategy, when they think about the dialogue they're having, when they think about the problems that they're solving, find a much faster path. And I think that also speaks to why we're seeing so many explosion in the line of business, SAS apps that are out there. Again, that thinning of the systems of record, really thinking about what are the scenarios and work streams that we can have happened that are going to help with that revenue growth and unlocking those opportunities. >> What's the common startup challenge that you see when they're trying to do business development? Usually they build the product first, product led value, you hear that a lot. And then they go, "Okay, we're ready to sell, hire a sales guy." That seems to be shifting away because of the go to markets are changing. What are some of the challenges that startups have? What are some that you're seeing? 
>> Well, and I think the point that you're making about the changes are really almost a result of the trends that we're talking about. The sales organization itself is becoming... These work streams are becoming instrumented. Data is being collected, insights are being derived off of those things. So you see companies like Clary or Highspot or two examples or tutorial that are in our portfolio that are looking at that action and making the art of sales and marketing far more sophisticated overall, which then leads to the different growth hacking and the different insights that are driven. I think the common mistakes that I see across the board, especially with earlier stage startups, look you got to find product market fit. I think that's always... You start with a thesis or a belief and a passion that you're building something that you think the market needs. And it's a lot of dialogue you have to have to make sure that you do find that. I think once you find that another common problem that I see is leading with an explanation of technology. And again, not focusing on the buyer or the... Sorry, the buyer about solving a problem and focusing on that problem as opposed to focusing on how cool your technology is. Those are basic and really, really simple. And then I think setting a set of expectations, especially as it comes to business development and partnering with companies like AWS. The researching that you need to adequately meet the demand that can be turned on. And then I'm sure you heard about from Databricks, from an organization like AWS, you have to be pragmatic. >> Yeah, Databricks gone from zero a software sales a few years ago to over a billion. Now it looks like a Snowflake which came out of nowhere and they had a great product, but built on Amazon, they became the data cloud on top of Amazon. And now they're growing just whole new business models and new business development techniques. Katie, thank you for sharing your insight here. 
The CUBE's closing keynote. Thanks for coming on. >> Appreciate it, thank you. >> Okay, Katie Drucker, Head of Business Development at Madrona Venture Group. Premier VC in the Seattle area and beyond, they're doing a lot of cloud action. And of course they know AWS very well and are investing in the ecosystem. So great, great stuff there. Next up is Peter Wagner, partner at Wing.VC. Love this URL first of all 'cause of the VC domain extension. But Peter is a long time venture capitalist. I've been following his career. He goes back to the old networking days, back when the internet was being connected during the OSI days, when the TCP/IP open systems interconnect was really happening and created so much. Well, Peter, great to see you on the CUBE here and congratulations with success at Wing VC. >> Yeah, thanks, John. It's great to be here. I really appreciate you having me. >> Reason why I wanted to have you come on. First of all, you had a great track record in investing over many decades. You've seen many waves of innovation, startups. You've seen all the stories. You've seen the movie a few times, as I say. But now more than ever, enterprise-wise it's probably the hottest I've ever seen. And you've got a confluence of many things on the stack. You were also an early seed investor in Snowflake, well-regarded as a huge success. So you've got your eye on some of these awesome deals. Got a great partner over there who has got networking experience as well. What is the big aha moment here for the industry? Because it's not your classic enterprise startups anymore. They have multiple things going on and some of the winners are not even known. They come out of nowhere and they connect to enterprise and get the lucrative positions and can create a moat and value. Like out of nowhere, it's not the old way of like going to the airport and doing an RFP and going through the stringent requirements, and then you're in, you get to win the lucrative contract and you're in. 
Not anymore, that seems to have changed. What's your take on this 'cause people are trying to crack the code here and sometimes you don't have to be well-known. >> Yeah, well, thank goodness the game has changed 'cause that old thing was (indistinct) So I for one don't miss it. There was some modernization movement in the enterprise and the modern enterprise is built on data powered by AI infrastructure. That's an agile workplace. All three of those things are really transformational. There's big investments being made by enterprises, a lot of receptivity and openness to technology to enable all those agendas, and that translates to good prospects for startups. So I think as far as my career goes, I've never seen a more positive or fertile ground for startups in terms of penetrating enterprise, it doesn't mean it's easy to do, but you have a receptive audience on the other side and that hasn't necessarily always been the case. >> Yeah, I got to ask you, I know that you're a big sailor and your family and Franks Lubens also has a boat and sailing metaphor is always good to have 'cause you got to have a race that's being run and they have tactics. And this game that we're in now, you see the successes, there's investment thesises, and then there's also actually bets. And I want to get your thoughts on this because a lot of enterprises are trying to figure out how to evaluate startups and starts also can make the wrong bet. They could sail to the wrong continent and be in the wrong spot. So how do you pick the winners and how should enterprises understand how to pick winners too? >> Yeah, well, one of the real important things right now that enterprise is facing startups are learning how to do and so learning how to leverage product led growth dynamics in selling to the enterprise. And so product led growth has certainly always been important consumer facing companies. And then there's a few enterprise facing companies, early ones that cracked the code, as you said. 
And some of these examples are so old, if you think about it. The ones that people will want to talk about, they'll talk about Atlassian and want to talk about Twilio, and these are of course iconic companies that showed the way for others. But even before that, folks like SolarWinds, their go-to-market model was clearly product-led, bottoms-up. Back then we didn't even have those words to talk about it. And then some of the examples are so enormous, if you think about them, like the one right in front of your face, like AWS. (laughing) Pretty good PLG, (indistinct) but it targeted builders, it targeted developers and flipped over the way you think about enterprise infrastructure. As a result, somehow every company, even if they're harnessing a relatively conventional sales and marketing motion, has to think about product-led growth as a way to kick that motion off. And so it's not really an either/or anymore. You might think, oh, PLG, that means there's no sales team at the company. Not true. But it's a way to set the table so that you can very efficiently use your sales and marketing resources, only on the most attractive targets and ones that are really (indistinct) >> I love the product-led growth. I've got to ask you, because in the networking days, I remember the term inevitability was used, being nested in a solution that they're just going to... Cisco, a router and a firewall is one you can unplug and replace with another vendor. Cisco, you'd have to go through... no, switching costs were huge. So when you get to the Cloud, how do you see the competitiveness? Because we were riffing on this with Ali from Databricks, where the lock-in might be value. The more value you provide, that is the lock-in. Is there nestedness? Is there inevitability as a competitive advantage for some of these startups? How do you look at that? Because startups, they're using open source.
They want to have a land position in an enterprise, but how do they create that sustainable competitive advantage going forward? Because again, this is what you do. You bet on ones that you can see could establish a moat, whatever we want to call it, but a competitive advantage and an ongoing nested position. >> Sometimes it has to do with data, John, and so you mentioned Snowflake a couple of times here. A big part of Snowflake's strategy is what they now call the data cloud. And one of the reasons you go there is not just to be able to process data, but to actually get access to it and exchange it with partners. And then that of course is a great reason for the customers to come to the Snowflake platform. And so the more data it gets, the more customers it gets, the more data it gets, and the whole thing starts spinning in the right direction. That's a really big example, but for all of these startups that are using ML in a fundamental way, applying it in a novel way, the data moats are really important. So getting to the right data sources and training on it, and then putting it to work so that you can see that in this process better and doing this earlier on that scale. That's a big part of success. Another company that I work with is a good example that I call (indistinct), which works in the sales technology space, really crushing it in terms of building better sales organizations, both at the performance level, in terms of the intelligence level, and just overall revenue attainment, using ML and using novel data sources, like the previously lost data of phone calls or Zoom calls, as you already know. So I think the data advantages are really big. And smart startups are thinking through it early. >> It's interest-- >> And they're planning, by the way, not to ramble on too much, but they're betting that PLG strategy. So their land option is designed not just to be an interesting way to gain usage, but it's also a way to gain access to data that then enables the expand component.
>> That is a huge call-out point there. I was going to ask another question, but I think that is the key I see. It's a new go-to-market in a way. Product-led with that kind of approach gets you a beachhead and you get a little position, you get some data. That is a cloud model, it means variable, whatever you want to call it, variable value proposition, value proof, or whatever, getting that data and reiterating it. So it brings up the whole philosophical question of, okay, product-led growth, I love that; with product-led growth of data, I get that. Remember the old platform versus a tool? That's the way buyers used to think. How has that changed? 'Cause now, almost, this conversation throws out the whole platform thing, but isn't it like a platform? >> It looks like a tool. (laughs) Even if it is a platform, though, you can reveal that later, but you're looking for adoption. So if it's a down-stack product, you're looking for adoption by developers or DevOps people or SREs, and they're trying to solve a problem, and they want rapid gratification. So they don't want to have an architectural (indistinct) placed in front of them. And if it's an up-stack product, an application, then it's a user or the business or whatever that is adopting the application. And again, they're trying to solve a very specific problem. You need instant, immediate, obvious time to value. And now you have a ticket to the dance and can build on that, and maybe a platform strategy can gradually take shape. But you know who's not in this conversation is the CIO. It's like, "I'm always the last to know." >> That's the CISO though. And they've got him there on the firing lines. CISOs are buying tools like it's nobody's business. They need everything. They'll buy anything or you go meet with sand, they'll buy it. >> And you make it sound so easy.
(laughing) We do a lot of security investment, if only (indistinct) (laughing) >> I'm a little bit over the top, but CISOs are under a lot of pressure. I talked to the CISO at Capital One and he was saying that he's on Amazon, now he's going to another cloud, not as a hedge, but he doesn't want to focus development teams. So he's making human resource decisions as well. Again, back to what IT used to be back in the old days, where you made a vendor decision and you built around it. So again, clouds play that way. I see that happening. But the question is that I think you nailed this whole idea of crosshairs on the target persona, because you've got to know who you are and then go to the market. So if you know you're problem solving and lower in the stack, do it and get a beachhead. That's a strategy, you can do that. You can't try to be the platform and then solve a problem at the same time. So you've got to be careful. Is that what you were getting at? >> Well, I think you just have to understand what you're trying to achieve in that land motion. And how those dynamics work, and you just can't drag it out. And they could make it too difficult. Another company I work with is a very strategic cloud data platform. It's a (indistinct) on systems. We're not trying to foist that vision, though, (laughs) on our adopters today. We're solving some thorny problems with them in the short term: rapid time to value, operational needs at scale. And then yeah, once they've found success with (indistinct) there would be an opportunity to be increasing the platform, and an obstacle for those customers. But we're not talking about that. >> Well, Peter, I appreciate you taking the time and coming out of a board meeting. I know that you're super busy and I really appreciate you making time for us.
I know you've got an impressive partner in (indistinct) who's a former Sequoia, but Redback Networks, part of that company over the years. You guys are doing extremely well, with even a unique investment thesis. I'd like you to put the plug in for the firm. I think you guys have a good approach. I like what you guys are doing. You're humble, you don't brag a lot, but you make a lot of great investments. So could you take a minute to explain what your investment thesis is and then how that relates to how an enterprise is making their investment thesis? >> Yeah, yeah, for sure. Well, it's the concept that I described earlier, the modern enterprise movement, as a workplace built on data, powered by AI. That's what we're trying to work with founders to enable. And also we're investing in companies that build the products and services that enable that modern enterprise to exist. And we do it from very early stages, but with a long-term outlook. So we'll be leading series and series, rounds of investment, but staying deeply involved, both operationally and financially, throughout the whole life cycle of the company. And then we've done that a bunch of times. Our goal is always the big independent public company, and they don't always make it, but enough of them do to have it all be worthwhile. An interesting special case of this, and by the way, I think it intersects with some of the startup showcase here, is in the life sciences. And I know you were highlighting a lot of healthcare websites and deals, and that's a vertical where there's a tremendous, disruptive impact of data, both new data availability and new ways to put it to use. I know several of my partners are very focused on that. They call it bio-X data. It's a transformation all on its own. >> That's awesome. And I think that the reason why we're focusing on these verticals is, if you have a cloud horizontal scale view and are vertically specialized with machine learning, every vertical is impacted by data.
It's so interesting that I think, first of all, it's probably the best time to be a cloud startup right now. I really am bullish on it. So I appreciate you taking the time, Peter, to come in again from your board meeting, popping out. Thanks for-- (indistinct) Go back in and approve those stock options for all the employees. Yeah, thanks for coming on. Appreciate it. >> All right, thank you John, it's a pleasure. >> Okay, Peter Wagner, premier VC, very humble. Wing.VC is a great firm. Really respect them. They do a lot of great investments, Snowflake among them, and we have Dave Vellante back, who knows a lot about Snowflake, he's been covering it like a blanket, and Sarbjeet Johal. Cloud influencer, friend of the CUBE. Cloud commentator with cloud experience: built clouds, runs clouds, now invests. So, Dave, thanks for coming back on. You heard Peter Wagner at Wing VC. These guys have their roots in networking, which, networking back in the day was, Dave... You remember the internet Cisco days, remember Cisco, Wellfleet routers. I think Peter invested in ArrowPoint, remember ArrowPoint, that was about in the 495 belt where you were. >> Lynch's company. >> That was Chris Lynch's company. I think, was he a sales guy there? (indistinct) >> That was his first big hit, I think. >> All right, well guys, let's wrap this up. We've got a great program here. Sarbjeet, thank you for coming on. >> No worries. Glad to be here today. >> Hey, Sarbjeet. >> First of all, really appreciate the Twitter activity lately on the commentary. The observability piece on Jeremy Burton's launch, Dave, was phenomenal, but Peter was talking about this dynamic and I think it ties this cracking-the-code thing together, which is, there's a product-led strategy that feels like a platform, but it's also a tool. In other words, it's not mutually exclusive; the old method's thrown out the window. Land in an account, know what problem you're solving. If you're below in the stack, nail it, get data and go from there.
If you're a process improvement up the stack, you have much more of a platform, longer-term sale, more business oriented, different motions, different mechanics. What do you think about that? What's your reaction? >> Yeah, I was thinking about this when I was listening to some of the startups pitching, if you will, or talking about what they bring to the table in this cloud scale or cloud era, if you will. And there are tools, there are applications and then there are big monolithic platforms, if you will. And then they're part of the ecosystem. So I think the companies need to know where they play. A startup cannot be a platform from the get-go, I believe. Now many aspire to be, but they have to start with tooling, I believe, especially on the B2B side of things, and then go into the applications. One way is to go into the application area, if you will, like very precise use cases for certain verticals and stuff like that. And the other path is going into the platform, which is like a horizontal play, if you will, in technology. So I think they have to understand their age, like how old they are, how new they are, how small they are, because their size matters. When you are procuring as a big business, procuring your technology, vendor size matters and the economic viability matters and their proximity to other vendors matters as well. So I think we'll jump into that in other discussions later, but I think that's key, as you said. >> I would agree with that. I would phrase it in my mind somewhat differently from Sarbjeet, which is, you have product-led growth, and that's your early phase, and you get product market fit, you get product-led growth, and then you expand, and there are many, many examples of this, and that's when you... As part of your TAM expansion strategy, you're going to get into the platform discussion. There are so many examples of that. You take a look at Ali Ghodsi today with what's happening at Databricks; Snowflake is another good example.
They've started with product-led growth. And then now they're like, "Okay, we've got to expand the TAM." Okta is another example, which just acquired Auth0. That's about building out the platform, versus more of a point product. And there are just many, many examples of that, but, to your point, it's very hard to start with a platform. Arm did it, but that was like a one in a million chance. >> It's just harder, especially if it's new and it's not operationalized yet. So one of the things, Dave, that we've observed in the Cloud is that some of the best-known successes were not known at all. Databricks we've been covering from the beginning 'cause we were close to that movement when they came out of Berkeley. But they still were misunderstood and they just started generating revenue only last year. So again, only a few years ago, zero software revenue, now they're approaching a billion dollars. So it's not easy to make these vendor selections anymore. And if you're new and you don't have someone to operate it, or there's no department and the department's changing, that's another problem. These are all like enterprisey problems. What are your thoughts on that, Dave? >> Well, I think there's a big discussion right now, and you've been talking all day about how enterprises should think about startups. And think about most of these startups: they're software companies, and software is a very capital-efficient business. At the same time, these companies are raising hundreds of millions, sometimes over a billion dollars, before they go to IPO. Why is that? A lot of it's going to promotion. I look at it as... And there's a big discussion going on that, well, maybe sales can be more efficient and more direct and so forth. I really think it comes down to the golden rule. Two things really matter in the early days of a startup: it's sales and engineering. Or rather, I should probably say engineering and sales, and start with engineering.
And then you've got to figure out your go-to-market. Everything else is peripheral to those two, and if you don't get those two things right, you struggle. And I think that's what some of these successful startups are proving. >> Sarbjeet, what's your take on that point? >> Could you repeat the point again? Sorry, I lost-- >> As cloud scale comes in, this whole idea of competing, the roles are changing. So look at IoT, look at the Edge, for instance. You've got all kinds of new use cases where no one actually knows it's a problem to solve. It's just pure opportunity. So no one's operational. I could have a product, but I don't know who can buy it yet. It's a problem. >> Yeah, I think the solutions have to be point solutions and the startups need to focus on the practitioners, number one, not the big buyers, not the IT, if you will, but the line of business. Even within that sphere, just focus on the practitioners who are going to use that technology. I talked to, I think it wasn't Fiddler, no, it was Coralogix. I think that story was great today, earlier, in how they kind of struggled in the beginning. They were trying to do a big bang approach as a startup, but then they almost stumbled. And then they found their mojo, if you will. They went down the market. Actually, that's a very classic theory of disruption, like what we study from Harvard Business School: you go down the market, go to the non-consumers, rather than trying to compete head to head with the big guys, because most of the big guys have a lot of features and functionality, especially at the platform level. And if you're trying to innovate in that space, you have to go to the practitioners and solve their core problems and then learn and expand, kind of thing. So I think you have to focus on practitioners a lot more than the traditional oracle buyers. >> Sarbjeet, we had a great thread last night on Twitter, on observability, that you started. And there are a couple of examples there.
ChaosSearch is a relatively small company right now; they just raised, though. And they're part of this startup showcase. And they could've said, "Hey, we're going to go after Splunk." But they chose not to. They said, "Okay, let's kind of disrupt the ELK stack and simplify that." Another example is a company, Observe, you've mentioned, Jeremy Burton's company, John. They're focused really on SaaS companies. They're not going after, initially, these complicated enterprise deals because they've got to get it right or else they'll get churn, and churn is that silent killer of software companies. >> The interesting other company that was on the showcase was TetraScience. I don't know if you noticed that one in the life science track, and again, Peter Wagner pointed out the life sciences. That's a vertical that's under-recognized in the press and that's exploding. Certainly during the pandemic you saw it. TetraScience is an R&D cloud, Dave, an R&D data cloud. So pharmaceuticals, they need to do their research. So the pandemic has brought to life this new notion of tapping into data resources, not just data lakes, but like the real deal. >> Yeah, you and Natalie and I were talking about that this morning, and that's one of the opportunities for R&D, and you have all these different data sources, and yeah, it's not just about the data lake. It's about the ecosystem that you're building around them. And I see, it's really interesting to juxtapose what Databricks is doing and what Snowflake is doing. They've got different strategies, but they play a part there. You can see how ecosystems can build that system. It's not that one company is going to solve all these problems. It's going to really have to be connections across these various companies. And that's what the Cloud enables, and ecosystems have all this data flowing that can really drive new insights. >> And I want to call your attention to a tweet, Sarbjeet, you wrote about Splunk's earnings, and they're a data company as well.
They've got Teresa Carlson there now, from AWS, as the president, working with Doug; that should change the game a little bit more. But there was a thread underneath there. Andy Thurai replies to Dave, or to you, Sarbjeet: if you're on AWS, they're a fine solution. The world doesn't just revolve around AWS, smiley face. Well, a lot of it does actually. So (laughing) nice point, Andy. But he brings up this thing, and Ali brought it up too: Hybrid now is a new operating system for what now Edge does. So we've got Mobile World Congress happening this month in person. This whole Telco 5G brings up a whole nother piece of the Cloud puzzle. Jeff Barr pointed it out in his keynote, Dave. Guys, I want to get your reaction. The Edge now is... I'm calling it the super Edge because it's not just Edge as we knew it before. You're going to have these PoPs, these points of presence, that are going to have Wavelength as your spectrum or whatever they have. I think that's the solution for Azure. So you're going to have all this new cloud power for low latency applications. Self-driving, delivery, VR, AR, gaming, telemetry data from Teslas, you name it, it's happening. This is huge, what are your thoughts? Sarbjeet, we'll start with you. >> Yeah, I think Edge is like bound to happen. And for many reasons, the volume of data is increasing. Our use cases are also expanding, if you will, with the democratization of computer analysis. Specialization of computer, actually Dave wrote extensively about how Intel and other chip players are gearing up for that future, if you will. Most of the inference in the AI world will happen in the field close to the workloads, if you will. That can be mobility, the self-driving car, that can be AR, VR. It can be healthcare. It can be gaming, you name it. Those are the few use cases which are in the forefront, and a lot more use cases will come into play, I believe.
I've said this many times: Edge, I think, will be dominated by the hyperscalers, mainly because they're building their Metro data centers now. And with a very low latency in the Metro areas where the population is, we're serving the people still, not the machines yet, or the empty areas where there is no population. So wherever the population is, all these big players are putting their data centers there. And I think they will dominate the Edge. And I know some Edge lovers. (indistinct) >> Edge huggers. >> Edge huggers, yeah. They don't like the hyperscalers story, but I think that's the way we're going. Why would we go backwards? >> I think you're right. First of all, I agree on the hyperscalers. You look at the top three clouds right now. They're all in the Edge, hardcore. It's a huge competitive battleground, Dave. And I think the missing piece, that's going to be uncovered at Mobile World Congress. Maybe they'll miss it this year, but it's the developer traction: whoever wins the developer market or wins the loyalty, winning over the market or having adoption. The applications will drive the Edge. >> And I would add the fourth cloud is Alibaba. Alibaba is actually bigger than Google and they're crushing it as well. But I would say this: first of all, it's popular to say, "Oh, not everything's going to move into the Cloud, John, Dave, Sarbjeet." But the fact is that AWS, they're the trendsetter. They are crushing it in terms of features. And you look at what they're doing in the plumbing with Annapurna. Everybody's following suit. So you can't just ignore that, number one. Second thing is, what is the Edge? Well, the Edge is... Where's the logical place to process the data? That's what the Edge is. And I think, to your point, both Sarbjeet and John, the Edge is going to be won by developers. It's going to be won by programmability and it's going to be low cost and really super efficient. And most of the data is going to stay at the Edge.
And so who is in the best position to actually create that? Is it going to be somebody who is taking an x86 box and throwing it over the fence and giving it a fancy name with the Edge in it and saying, "Here's our Edge box"? No, that's not what's going to win the Edge. And so I think, first of all, it's huge, it's wide open. And I think, where's the innovation coming from? I agree with you, it's the hyperscalers. >> I think the developers, as John said, developers are the kingmakers. They build the solutions. And in that context, I always talk about the skills gravity: a lot of people are educated in certain technologies and they will keep using those technologies. Their proximity to that technology is huge and they don't want to learn something new. So as humans we just tend to go with what we know how to use. So from that front, I usually talk about the consumption economics of cloud and Edge. It has to focus on the practitioners. And in this case, practitioners are developers, because you're just cooking up those solutions right now. We're not serving that in huge quantity right now, but-- >> Well, let's unpack that, Sarbjeet, let's unpack that 'cause I think you're right on the money on that. The consumption of the tech and also the consumption of the application, the end use and end user. And I think the reason why hyperscalers will continue to dominate, besides the fact that they have all the resources and they're going to bring that to the Edge, is that the developers are going to be driving the applications at the Edge. So if you're low latency Edge, that's going to open up new applications, not just the obvious ones I did mention, gaming, VR, AR, metaverse and other things that are obvious. There are going to be non-obvious things that are going to be huge that are going to come out from the developers. But the Cloud native aspect of the hyperscalers, to me, is where the scales are tipping. Let me explain.
IT was built to supply resources to the businesses who were writing business applications. Mostly driven by IBM in the mainframe in the old days, Dave, and then IT became IT. Telcos have been OT, closed: "This is our thing, that's it." Now they have to open up. And the Cloud native technologies are the fastest way to value. And I think that path, Sarbjeet, is going to be defined by this new developer and this new super Edge concept. So I think it's going to be wide open. I don't know what to say. I can't guess, but it's going to be creative. >> Let me ask you a question. You said years ago, data's the new development kit. Does low-code and no-code, to Sarbjeet's point, change the equation? In other words, putting data in the hands of those OT professionals, those practitioners who have the context. Does low-code and no-code enable more of those protocols? I know it's a bromide, but the citizen developer, and what impact does that have? And who's in the best position? >> Well, I think that anything that reduces friction to getting stuff out there, that can be automated, will increase the value. And then the question is... that's not even a debate. That's just fact, that's going to be like rent, massive rise. Then the issue comes down to who has the best asset? The software asset that's eating the world, or the tower and the physical infrastructure? So if the physical infrastructure, aka the Telcos, can't generate value fast enough, in my opinion, the private equity will come in and take it over, and then refactor that business model to take advantage of the over-the-top software model. That to me is the big stare-down competition between the Telco world and this new cloud native: whichever one yields in value is going to blink first, if you will. And I think the Cloud native wins this one hands down, because the assets are valuable, but only if they enable the new model.
If the old model tries to hang on to the old hog, the old model as the Edge hugger, as Sarbjeet says, they're just going to slowly milk that cow dry. So it's like, it's over. So to me, they have to move. And I think this Mobile World Congress, Dave, we will see, we will be looking for that. >> Yeah, I think that in the Mobile World Congress context, Telcos should partner with the hyperscalers very closely, like everybody else has. And they have to cave in. (laughs) I usually say that to them: like, people caved in. IBM tried to fight and they caved in. Other second tier vendors tried to fight the big cloud vendors, like the top three or four. And then they caved in: okay, we will serve our stuff through your cloud. And that's where all the buyers are congregating. They're going to buy stuff along with the skills gravity, the feature proximity. I've got another term I'll coin. It matters a lot when you're doing one thing and you want to do another thing. When you're doing all this transactional stuff and regular stuff, and now you want to do data science, where do you go? You go next to it, wherever you have been. Your skills are in that same bucket. And then also you don't have to write a new contract with a new vendor, you just go there. So in order to serve, and this is a lesson for startups as well, you need to prepare yourself for being in the Cloud marketplaces. You cannot go alone independently to fight. >> The Cloud marketplace is going to replace procurement, for sure, we know that. And this brings up the point, Dave, we talked about years ago, remember, on the CUBE. We said there are going to be Tier two clouds. I used that word in quotes 'cause nothing... What does it even mean, Tier two? And we were talking about, like, Amazon versus Microsoft and Google. We said at the time, and Alibaba, but they're in China, put that aside for a second, but the big three. They're going to win it all.
And they're all going to be successful in relative terms, but the winner is whoever can enable that second tier. And it ended up happening; Snowflake is that example. As is Databricks, as are others. So Google and Microsoft, as fast as they can replicate the success of AWS by enabling someone to build their business on their cloud in a way that allows the customer to refactor their business, will win. They will win most of the lion's share, in my opinion. So I think that applies to the Edge as well. So whoever can come in and say... Whichever cloud says, "I'm going to enable the next Snowflake, the next enterprise solution," I think takes it. >> Well, I think that it comes back... Every conversation is coming back to the data. And if you think about the prevailing way in which we've treated data, with the exceptions of the two data-driven companies in their quotes, it's that we've shoved all the data into some single repository and tried to come up with a single version of the truth, and it's adjudicated by a centralized team with hyper-specialized roles. And then guess what? The line of business, there's no context for the business in that data architecture or data corpus, if you will. And then the time it takes to go from idea for a data product or data service to commoditization is way too long. And that's changing. And the winners are going to be the ones who are able to exploit this notion of leaving data where it is, the point about data gravity, or coining a new term. I liked that; I think you said skills gravity. And then enabling the business lines to have access to their own data teams. That's exactly what Ali Ghodsi was saying this morning. And really having the ability to create their own data products without having to go bow down to an ivory tower. That is an emerging model. >> All right, well guys, I really appreciate the wrap-up here, Dave and Sarbjeet. I'd love to get your final thoughts.
I'll just start by saying that one of the highlights for me, besides the 15 great companies, was the luminary guests we had from our community on our keynotes today. Ali Ghodsi said, "Don't listen to what everyone's saying in the press." That was his position. He says, "You've got to figure out where the puck's going." He didn't say that, but I'm saying, I'm paraphrasing what he said. And I love how he brought up Sky Cloud. I call it Skynet. That's an interesting philosophy. And then he also brought up that machine learning, auto ML, has got to be table stakes. So I think to me, that's the highlight walk-away. And the second one is this idea that the enterprises have to have a new way to procure, and not just the consumption, but some vendor selection. I think it's going to be very interesting as value can be proved with data. So maybe the procurement process becomes, here's a beachhead, here's a little bit of data. Let me see what it can do. >> I would say... Again, I said it this morning, that the big four have given... Last year they spent a hundred billion dollars more on CapEx. To me, that's a gift. And so many companies, especially those focusing on trying to hang onto the legacy business, they're saying, "Well, not everything's going to move to the Cloud." Whatever. The narrative should change to, "Hey, thank you for that gift. We're now going to build value on top of the Cloud." Ali Ghodsi laid that out, how Databricks is doing it. And it's clearly what Snowflake's doing with the data cloud. It's basically a layer that abstracts all that underlying complexity and adds value on top. Eventually going out to the Edge. That's a value-added model that's enabled by the hyperscalers.
And that to me, if I have to evaluate where I'm going to place my bets as a CIO or IT practitioner, I'm going to look at who are the ones that are actually embracing that investment that's been made and adding value on top in a way that can drive my data-driven, my digital business or whatever buzzword you want to throw on. >> Yeah, I think we were talking about the startups in today's sessions. I think for startups, my advice is to be as close as you can be to hyperscalers and anybody who awards them, they will cave in at the end of the day, because that's where the whole span of gravity is. That's what the innovation gravity is, everybody's gravitating towards that. And I would say quite a few times in the last couple of years that the rate of innovation happening in a non-cloud companies, when I talk about non-cloud means are not public companies. I think it's like diminishing, if you will, as compared to in cloud, there's a lot of innovation. The Cloud companies are not paying by power people anymore. They have all sophisticated platforms and leverage those, and also leverage the marketplaces and leverage their buyers. And the key will be how you highlight yourself in that cloud market place if you will. It's like in a grocery store where your product is placed and you have to market around it, and you have to have a good story telling team in place as well after you do the product market fit. I think that's a key. I think just being close to the Cloud providers, that's the way to go for startups. >> Real, real quick. Each of you talk about what it takes to crack the code for the enterprise in the modern era now. Dave, we'll start with you. What's it take? (indistinct) >> You got to have it be solving a problem that is 10X better at one 10th a cost of anybody else, if you're a small company, that rule number one. Number two is you obviously got to get product market fit. You got to then figure out. 
And I think, and again, you're in your early phases, you have to be almost processed builders, figure out... Your KPIs should all be built around retention. How do I define customer success? How do I keep customers and how do I make them loyal so that I know that my cost of acquisition is going to be at least one-third or lower than my lifetime value of that customer? So you've got to nail that. And then once you nail that, you've got to codify that process in the next phase, which really probably gets into your platform discussion. And that's really where you can start to standardize and scale and figure out your go to market and the relationship between marketing spend and sales productivity. And then when you get that, then you got to move on to figure out your Mot. Your Mot might just be a brand. It might be some secret sauce, but more often than not though, it's going to be the relationship that you build. And I think you've got to think about those phases and in today's world, you got to move really fast. Sarbjeet, real quick. What's the secret to crack the code? >> I think the secret to crack the code is partnership and alliances. As a small company selling to the bigger enterprises, the vendors size will be one of the big objections. Even if they don't say it, it's on the back of their mind, "What if these guys disappear tomorrow what would we do if we pick this technology?" And another thing is like, if you're building on the left side, which is the developer side, not on the right side, which is the operations or production side, if you will, you have to understand the sales cycles are longer on the right side and left side is easier to get to, but that's why we see a lot more startups. And on the left side of your DevOps space, if you will, because it's easier to sell to practitioners and market to them and then show the value correctly. 
And also understand that on the left side, the developers are very know-how hungry, on the right side people are very cost-conscious. So understanding the traits of these different personas, buyers if you will, will, I think, set you apart. And as Dave said, you have to solve a problem, focus on practitioners first, because you're small. You have to solve political problems very well. And then you can expand. >> Well, guys, I really appreciate the time. Dave, we're going to do more of these, Sarbjeet we're going to do more of these. We're going to add more community to it. We're going to add our community rooms next time. We're going to do these quarterly and try to do them more frequently, we learned a lot and we still got a lot more to learn. There's a lot more contribution out in the community that we're going to tap into. Certainly the CUBE Club as we call it, Dave. We're going to build this actively around Cloud. This is another 20 years. The Edge brings us more life with Cloud, it's really exciting. And again, enterprise is no longer an enterprise, it's just the world now. So great companies here, the next Databricks, the next IPO. The next big thing is in this list, Dave. >> Hey, John, we'll see you in Barcelona. Looking forward to that. Sarbjeet, I know in the second half, we're going to run into each other. So (indistinct) thank you John. >> Trouble has started. Great talking to you guys today and have fun in Barcelona and keep us informed. >> Thanks for coming. I want to thank Natalie Erlich who's in Rome right now. She's probably well past her bedtime, but she kicked it off, emceeing and hosting with Dave and I for this AWS Startup Showcase. This is batch two episode two day. What do we call this? It's like a release so that the next 15 startups are coming. So we'll figure it out. (laughs) Thanks for watching, everyone. Thanks. (bright music)
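
Dave's retention rule of thumb from this conversation, that cost of acquisition should be one-third or lower of a customer's lifetime value, can be sketched as simple arithmetic. This is only an illustration: the function name, the `min_multiple` parameter, and the dollar figures are assumptions made up for the example, not numbers from the discussion.

```python
def meets_acquisition_rule(lifetime_value: float,
                           acquisition_cost: float,
                           min_multiple: int = 3) -> bool:
    """Check the rule of thumb: lifetime value is at least
    `min_multiple` times the cost of acquiring the customer.

    With the default of 3, this is the same as saying the acquisition
    cost is one-third or lower of the lifetime value.
    """
    if acquisition_cost < 0 or lifetime_value < 0:
        raise ValueError("values must be non-negative")
    return lifetime_value >= min_multiple * acquisition_cost


# Hypothetical figures, purely for illustration.
print(meets_acquisition_rule(lifetime_value=3000, acquisition_cost=1000))
print(meets_acquisition_rule(lifetime_value=3000, acquisition_cost=1500))
```

The first call passes the check because 3000 is exactly three times 1000; the second fails because acquiring the customer costs half of what the customer is worth.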

Published Date : Jun 24 2021


ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Katie	PERSON	0.99+
John	PERSON	0.99+
Natalie Erlich	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Sarbjeet	PERSON	0.99+
Google	ORGANIZATION	0.99+
Katie Drucker	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Peter Wagner	PERSON	0.99+
Telcos	ORGANIZATION	0.99+
Peter	PERSON	0.99+
Natalie	PERSON	0.99+
Ali Ghodsi	PERSON	0.99+
AWS	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Teresa Carlson	PERSON	0.99+
Jeff Barr	PERSON	0.99+
Alibaba	ORGANIZATION	0.99+
Andy	PERSON	0.99+
Cisco	ORGANIZATION	0.99+
Andy Thry	PERSON	0.99+
Barcelona	LOCATION	0.99+
Ali	PERSON	0.99+
Rome	LOCATION	0.99+
Madrona Venture Group	ORGANIZATION	0.99+
Jeremy Burton	PERSON	0.99+
Redback Networks	ORGANIZATION	0.99+
Madrona	ORGANIZATION	0.99+
Databricks	ORGANIZATION	0.99+
Telco	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Doug	PERSON	0.99+
Wellfleet	ORGANIZATION	0.99+
Harvard School of Business	ORGANIZATION	0.99+
Last year	DATE	0.99+
Berkeley	LOCATION	0.99+

Mark Grover & Jennifer Wu | Spark Summit 2017

>> Announcer: Live from San Francisco, it's the Cube covering Spark Summit 2017, brought to you by Databricks. >> Hi, we're back here where the Cube is live, and I didn't even know it. Welcome, we're at Spark Summit 2017. Having so much fun talking to our guests I didn't know the camera was on. We are doing a talk with Cloudera, a couple of experts that we have here. First is Mark Grover, who's a software engineer and an author. He wrote the book, "Hadoop Application Architectures." Mark, welcome to the show. >> Mark: Thank you very much. Glad to be here. >> And just to his left we also have Jennifer Wu, and Jennifer's director of product management at Cloudera. Did I get that right? >> That's right. I'm happy to be here, too. >> Alright, great to have you. Why don't we get started talking a little bit more about what Cloudera is maybe introducing new at the show? I saw a booth over here. Mark, do you want to get started? >> Mark: Yeah, there are two exciting things that we've launched at least recently. There's Cloudera Altus, which is for transient workloads and being able to do ETL-like workloads, and Jennifer will be happy to talk more about that. And then there's Cloudera Data Science Workbench, which is this tool that allows folks to use data science at scale. So, get away from doing data science in silos on your personal laptops, and do it in a secure environment on cloud. >> Alright, well, let's jump into Data Science Workbench first. Tell me a little bit more about that, and you mentioned it's for exploratory data science. So give us a little more detail on what it does. >> Yeah, absolutely. So, there was a private beta for Cloudera Data Science Workbench earlier in the year and then it was GA a few months ago. And it's like you said, an exploratory data science tool that brings data science to the masses within an enterprise. Previously people used to have, it was this dichotomy, right? As a data scientist, I want to have the latest and greatest tools.
I want to use the latest version of Python, the latest notebook kernel, and I want to be able to use R and Python to be able to crunch this data and run my models in machine learning. However, on the other side of this dichotomy is the IT organization, which wants to make sure that all tools are compliant and that your clusters are secure, and your data is not going into places that are not secured by state-of-the-art security solutions, like Kerberos for example, right? And of course if the data scientists are putting the data on their laptops and taking the laptop around to wherever they go, that's not really a solution. So, that was one problem. And the other one was if you were to bring them all together in the same solution, data scientists have different requirements. One may want to use Python 2.6. Another one may want to use 3.2, right? And so Cloudera Data Science Workbench is a new product that allows data scientists to visualize and do machine learning through this very nice notebook-like interface, share their work with the rest of their colleagues in the organization, but also allows you to keep your clusters secure. So it allows you to run against a Kerberized cluster, allows single sign-on to your web interface to Data Science Workbench, and provides a really nice developer experience in the sense that my workflow and my tools and my version of Python do not conflict with Jennifer's version of Python. We all have our own Docker and Kubernetes-based infrastructure that makes sure that we use the packages that we need, and they don't interfere with each other. >> We're going to go to Jennifer on Altus in just a few minutes, but George, first I'll give you a chance to maybe dig in on Data Science Workbench.
>> Two questions on the data science side: some of the really toughest nuts to crack have been sort of a common environment for the collaborators, but also the ability to operationalize the models once you've sort of agreed on them, and manage the lifecycle across teams, you know? Like, challenger champion, promote something, or even before that doing the A/B testing, and then sort of what's in production is typically in a different language from what, you know, it was designed in and sort of integrating it with the apps. Where is that on the road map? 'Cause no one really has a good answer for that. >> Yeah, that's an excellent question. In general I think it's the problem to crack these days. How do you productionalize something that was written by a data scientist in a notebook-like system onto the production cluster, right? And I think the part where the data scientist works in a different language than the language that's in production, I think that problem, the best I can say right now is to actually have someone rewrite that. Have someone rewrite that in the language you're going to use in production, right? I don't see that to be the more common part. I think the more widespread problem is even when the language is the same in production, how do you go about getting the part that the data scientist wrote, the model or whatever that would be, onto a production cluster? And so, Data Science Workbench in particular runs on the same cluster that is being managed by Cloudera Manager, right? So this is a tool that you install, but that is available to you as a web server, as a web interface, and so that allows you to move your development machine learning algorithms from your Data Science Workbench to production much more easily, because it's all running on the same hardware and same systems. There are no separate Cloudera Managers that you have to use to manage the workbench compared to your actual cluster. >> Okay.
A tangential question, but one of the, the difficulties of doing machine learning is finding all the training data and, and sort of data science expertise to sit with the domain expert to, you know, figure out proper model of features, things like that. One of the things we've seen so far from the cloud vendors is they take their huge datasets in terms of voice, you know, images. They do the natural language understanding, speech or rather text to speech, you know, facial recognition. 'Cause they have such huge datasets they can train on. We're hearing noises that they're going to take that down to the more mundane statistical kind of machine learning algorithms, so that you wouldn't be, like, here's an algorithm to do churn, you know, go to town, but that they might have something that's already kind of pre-populated that you would just customize. Is that something that you guys would tackle, too? >> I can't speak for the road map in that sense, but I think some of that problem needs to be tackled by projects like Spark for example. So I think as the stack matures, it's going to raise the level of abstraction as time goes on. And I think whatever benefits the Spark ecosystem will have will come directly to distributions like Cloudera. >> George: That's interesting. >> Yeah. >> Okay. >> Alright, well let's go to Jennifer now and talk about Altus a little bit. Now you've been on the Cube show before, right? >> I have not. >> Okay, well, familiar with your work. Tell us again, you're the product manager for Altus. What does it do, and what was the motivation to build it? >> Yeah, we're really excited about Cloudera Altus. So, we released Cloudera Altus in its first GA form in April, and we launched Cloudera Altus in a public environment at Strata London about two weeks ago, so we're really excited about this and we are very excited to now open this up to all of the customer base.
And what it is, is a platform-as-a-service offering designed to leverage, basically, the agility and the scale of cloud, and make a very easy to use type of experience to expose Cloudera capacity for, in particular for data engineering type of workloads. So the end user will be able to very easily, in a very agile manner, get data engineering capacity on Cloudera in the cloud, and they'll be able to do things like ETL and large scale data processing, productionized machine learning workflows in the cloud with this new data engineering as a service experience. And we wanted to abstract away the cloud, and cluster operations, and make the end user experience very easy. So, jobs and workloads as first-class objects. You can do things like submit jobs, clone jobs, terminate jobs, troubleshoot jobs. We wanted to make this very, very easy for the data engineering end user. >> It does sound like you've sort of abstracted away a lot of the infrastructure that you would associate with on-prem, and sort of almost make it, like, programmable and invisible. But, um, I guess my, one of my questions is when you put it in a cloud environment, when you're on-prem you have a certain set of competitors which is kind of restrictive, because you are the standalone platform. But when you go on the cloud, someone might say, "I want to use Redshift on Amazon," or Snowflake, you know, as the MPP SQL database at the end of a pipeline. And it's not just, I'm using those as examples. There's, you know, dozens, hundreds, thousands of other services to choose from. >> Yes. >> What happens to the integrity of that platform if someone carves off one piece? >> Right. So, interoperability and a unified data pipeline are very important to us, so we want to make sure that we can still service the entire data pipeline all the way from ingest and data processing to analytics.
So our team has 24 different open source components that we deliver in the CDH distribution, and we have committers across the entire stack. We know the application, and we want to make sure that everything's interoperable, no matter how you deploy the cluster. So if you deploy data engineering clusters through Cloudera Altus, but you deployed Impala clusters for data marts in the cloud through Cloudera Director or through any other format, we want all these clusters to be interoperable, and we've taken great pains in order to make everything work together well. >> George: Okay. So how do Altus and Data Science Workbench interoperate with Spark? Maybe start with... >> You want to go first with Altus? >> Sure, so, we, in terms of interoperability we focus on things like making sure there are no data silos so that the data that you use for your entire data lake can be consumed by the different components in our system, the different compute engines and different tools, and so if you're processing data you can also look at this data and visualize this data through Data Science Workbench. So after you do data ingestion and data processing, you can use any of the other analytic tools, and this includes Data Science Workbench. >> Right, and Data Science Workbench runs, for example, with the latest version of Spark you could pick, the currently latest released version of Spark, Spark 2.1. Spark 2.2 is being voted on of course, and that will soon be integrated after its release. For example you could use Data Science Workbench with your flavor of Spark 2 version and you can run PySpark or Scala jobs on this notebook-like interface, be able to share your work, and because you're using Spark underneath the hood, it uses YARN for resource management. The Data Science Workbench itself uses Docker for configuration management, and Kubernetes for resource managing these Docker containers.
>> What would be, if you had to describe sort of the edge conditions and the sweet spot of the application, I mean you talked about data engineering. One thing we were talking to Matei Zaharia and Reynold Xin about, and Ali Ghodsi as well, was if you put Spark on a database, or at least a, you know, sophisticated storage manager, like Kudu, all of a sudden there's a whole new class of jobs or applications that open up. Have you guys thought about what that might look like in the future, and what new applications you would tackle? >> I think a lot of that benefit, for example, could be coming from the underlying storage engine. So let's take Spark on Kudu, for example. The inherent characteristics of Kudu today allow you to do updates without having to either deal with the complexity of something like HBase, or the crappy performance of dealing with HDFS compactions, right? So the sweet spot comes from Kudu's capabilities. Of course it doesn't support transactions or anything like that today, but imagine putting something like Spark and being able to use the machine learning libraries and, we have been limited so far in the machine learning algorithms that we have implemented in Spark by the storage system sometimes, and, for example new machine learning algorithms or the existing ones could be rewritten to make use of the update features for example, in Kudu. >> And so, it sounds like it makes it, the machine learning pipeline might get richer, but I'm not hearing that, and maybe this isn't sort of in the near term sort of roadmap, the idea that you would build sort of operational apps that have these sophisticated analytics built in, you know, where the analytics, um, you've done the training but at run time, you know, the inferencing influences a transaction, influences a decision. Is that something that you would foresee? >> I think that's totally possible.
Again, at the core of it is the part that now you have one storage system that can do scans really well, and it can also do random reads and writes any place, right? And so that allows applications which were previously siloed, because one application ran off of HDFS, another application ran off of HBase, and then you had to correlate them, to just be one single application that can train and then also use its trained data to then make decisions on the new transactions that come in. >> So that's very much within the sort of scope of imagination, or scope. That's part of sort of the ultimate plan? >> Mark: I think it's definitely conceivable now, yeah. >> Okay. >> We're up against a hard break coming up in just a minute, so you each get a 30-second answer here, so it's the same question. You've been here for a day and a half now. What's the most surprising thing you've learned that you think should be shared more broadly with the Spark community? Let's start with you. >> I think one of the great things that's happening in Spark today is people have been complaining about latency for a long time. So if you saw the keynote yesterday, you would see that Spark is making forays into reducing that latency. And if you are interested in Spark, using Spark, it's very exciting news. You should keep tabs on it. We hope to deliver lower latency as a community sooner. >> How long is one millisecond? (Mark laughs) >> Yeah, I'm largely focused on cloud infrastructure and I found here at the conference that, like, many, many people are very much prepared to actually start taking more, you know, more POCs and more interest in cloud and the response in terms of all of this in Altus has been very encouraging. >> Great. Well, Jennifer, Mark, thank you so much for spending some time here on the Cube with us today. We're going to come by your booth and chat a little bit more later. It's some interesting stuff.
And thank you all for watching the Cube today here at Spark Summit 2017, and thanks to Cloudera for bringing us these two experts. And thank you for watching. We'll see you again in just a few minutes with our next interview.

Published Date : Jun 7 2017


ENTITIES

Entity	Category	Confidence
Jennifer	PERSON	0.99+
Mark Grover	PERSON	0.99+
Jennifer Wu	PERSON	0.99+
Ali Ghodsi	PERSON	0.99+
George	PERSON	0.99+
Mark	PERSON	0.99+
April	DATE	0.99+
Reynold Xin	PERSON	0.99+
San Francisco	LOCATION	0.99+
Matei Zaharia	PERSON	0.99+
30-second	QUANTITY	0.99+
Cloudera	ORGANIZATION	0.99+
Hadoop Application Architectures	TITLE	0.99+
dozens	QUANTITY	0.99+
Python	TITLE	0.99+
yesterday	DATE	0.99+
Two questions	QUANTITY	0.99+
today	DATE	0.99+
Spark	TITLE	0.99+
Amazon	ORGANIZATION	0.99+
two experts	QUANTITY	0.99+
a day and a half	QUANTITY	0.99+
First	QUANTITY	0.99+
one problem	QUANTITY	0.99+
Python 2.6	TITLE	0.99+
Strata London	LOCATION	0.99+
one piece	QUANTITY	0.99+
first	QUANTITY	0.98+
Spark Summit 2017	EVENT	0.98+
Cloudera Altus	TITLE	0.98+
Scala	TITLE	0.98+
Docker	TITLE	0.98+
One	QUANTITY	0.97+
Kudu	ORGANIZATION	0.97+
one millisecond	QUANTITY	0.97+
PySpark	TITLE	0.96+
R	TITLE	0.95+
one	QUANTITY	0.95+
two weeks ago	DATE	0.93+
Data Science Workbench	TITLE	0.92+
Cloudera	TITLE	0.91+
hundreds	QUANTITY	0.89+
Hbase	TITLE	0.89+
each	QUANTITY	0.89+
24 different open source components	QUANTITY	0.89+
few months ago	DATE	0.89+
single	QUANTITY	0.88+
kernel	TITLE	0.88+
Altus	TITLE	0.88+

Ash Munshi, Pepperdata - #SparkSummit - #theCUBE

(upbeat music) >> Announcer: Live from San Francisco, it's theCUBE, covering Spark Summit 2017, brought to you by Databricks. >> Welcome back to theCUBE, it's day two at the Spark Summit 2017. I'm David Goad and here with George Gilbert from Wikibon, George. >> George: Good to be here. >> Alright and the guest of honor of course, is Ash Munshi, who is the CEO of Pepperdata. Ash, welcome to the show. >> Thank you very much, thank you. >> Well you have an interesting background, I want you to just tell us real quick here, not give the whole bio, but you got a great background in machine learning, you were an early user of Spark, tell us a little bit about your experience. >> So I'm actually a mathematician originally, a theoretician who worked for IBM Research, and then subsequently Larry Ellison at Oracle, and a number of other places. But most recently I was CTO at Yahoo, and then subsequent to that I did a bunch of startups, that involved different types of machine learning, and also just in general, sort of a lot of big data infrastructure stuff. >> And go back to 2012 with Spark, right? You had an interesting development. >> Right, so 2011, 2012, when Spark was still early, we were actually building a recommendation system, based on user-generated reviews. That was a project that was done with Nando de Freitas, who is now at DeepMind, and Peter Cnudde, who's one of the key guys that runs infrastructure at Yahoo. We started that company, and we were one of the early users of Spark, and what we found was, that we were analyzing all the reviews at Amazon. So Amazon allows you to crawl all of their reviews, and we basically had natural language processing, that would allow us to analyze all those reviews. When we were doing sort of MapReduce stuff, it was taking us a huge number of nodes, and 24 hours to actually go do analysis. And then we had this little project called Spark, out of AMPLab, and we decided to spin it up, and see what we could do.
It had lots of issues at that time, but we were able to actually spin it up on to, I think it was in the order of 100,000 nodes, and we were able take our times for running our algorithms from you know, sort of tens of hours, down to sort of an hour or two, so it was a significant improvement in performance. And that's when we realized that, you know, this is going to be something that's going to be really important once this set of issues, where it, once it was going to get mature enough to make happen, and I'm glad to see that that it's actually happened now, and it's actually taken over the world. >> Yeah that little project became a big deal, didn't it? >> It became a big deal, and now everybody's taking advantage of the same thing. >> Well bring us to the present here. We'll talk about Pepperdata and what you do, and then George is going to ask a little bit more about some of the solutions that you have. >> Perfect, so Pepperdata was a company founded by two gentlemen, Sean Suchter and Chad Carson. Sean used to run Yahoo Search, and one of the first guys who actually helped develop Hadoop next to Eric14 and that team. And then Chad was one of the first guys who actually figured out how to monetize clicks, and was the data science guy around the whole thing. So those are the two guys that actually started the company. I joined the company last July as CEO, and you know, what we've done recently, is we've sort of expanded our focus of the company to addressing DevOps for big data. And the reason why DevOps for big data is important, is because what's happened in the last few years, is people have gone from experimenting with big data, to taking big data into production, and now they're actually starting to figure out how to actually make it so that it actually runs properly, and scales, and does all the other kinds of things that are there, right? 
So, it's that transition that's actually happened, so, "Hey, we ran it in production, "and it didn't quite work the way we wanted to, "now we actually have to make it work correctly." That's where we sort of fit in, and that's where DevOps comes in, right? DevOps comes in when you're actually trying to make production systems that are going to perform in the right way. And the reason for DevOps is it shortens the cycle between developers and operators, right? So the tighter the loop, the faster you can get solutions out, because business users are actually wanting that to happen. That's where we're squarely focused, is how do we make that work? How do we make that work correctly for big data? And the difference between, sort of classic DevOps and DevOps for big data, is that you're now dealing with not just, you know, a set of computers solving an isolated sort of problem. You're dealing with thousands of machines that are solving one problem, and the amount of data is significantly larger. So the classical methodologies that you have, while, you know, agile and all that still works, the tools don't work to actually figure out what you can do with DevOps, and that's where we come in. We've got a set of tools that are focused on performance effectively, 'cause that's the big difference between distributed systems performance I should say, that's the big difference between that, and sort of classic even scaled out computing, right? So if you've got web servers, yes performance is important, and you need data for those, but that can actually be sharded nicely. This is one system working on one problem, right? Or a set of systems working on one problem. That's much harder, it's a different set of problems, and we help solve those problems. >> Yeah, and George you look like you're itching to dig into this, feel free. 
(exclaims loudly) >> Well so, it was, so one of the big announcements at the show, and sort of the headline announcement today, was Spark Serverless, like so it's not just someone running Spark in the cloud sort of as a managed service, it's up there as a, you know, sort of SaaS application. And you could call it platform as a service, but it's basically a service where, you know, the infrastructure is invisible. Now, for all those customers who are running their own clusters, which is pretty much everyone I would imagine at this point, how far can you take them in hiding much of the overhead of running those clusters? And by the overhead I mean, you know, primarily performance and maximizing, you know, sort of maximizing resource efficiency. >> So, you have to actually sort of double-click on to the kind of resources that we're talking about here, right? So there's the number of nodes that you're going to need to actually do the computation. There is, you know, the amount of disk storage and stuff that you're going to need, what type of CPUs you're going to need. All of that stuff is sort of part of the costing if you will, of running an infrastructure. If somebody hides all that stuff, and makes it so that it's economical, then you know, that's a great thing, right? And if it can actually be made so that it works for huge installations, and hides it appropriately so I don't pay too much of a tax, that's a wonderful thing to do. But we have, our customers are enterprises, typically Fortune 200 enterprises, and they have both a mixture of cloud-based stuff, where they actually want to control everything about what's going on, and then they have infrastructure internally, which by definition they control everything that's going on, and for them we're very, very applicable. I don't know how applicable we'd be in this, sort of new world as a service that grows and shrinks. 
I can certainly imagine that whoever provides that service would embed us, to be able to use the stuff more efficiently. >> No, you answered my question, which is, for the people who aren't getting the turnkey you know, sort of SaaS solution, and they need help managing, you know, what's a fairly involved stack, they would turn to you? >> Ash: Yes. >> Okay. >> Can I ask you about the specific products? >> George: Oh yes. >> I saw you at the booth, and I saw you were announcing a couple of things. Well what is new-- >> Ash: Correct. >> With the show? >> Correct, so at the show we announced Code Analyzer for Apache Spark, and what that allows people to do, is really understand where performance issues are actually happening in their code. So, one of the wonderful things about Spark, compared to MapReduce, is that it abstracts the paradigm that you actually write against, right? So that's a wonderful thing, 'cause it makes it easier to write code. The problem when we abstract, is what does that abstraction do down in the hardware, and where am I losing performance? And being able to give that information back to the user. So you know, in Spark, you have jobs that can run in parallel. So an app consists of jobs, jobs can run in parallel, and each one of these things can consume resources, CPU, memory, and you see that through sort of garbage collection, or a disk or a network, and what you want to find out, is which one of these parallel tasks was dominating the CPU? Why was it dominating the CPU? Which one actually caused the garbage collector to actually go crazy at some point? While the Spark UI provides some of that information, what it doesn't do, is give you a time series view of what's going on. So it's sort of a blow-by-blow view of what's going on. By imposing the time series view on sort of an enhanced version of the Spark UI, you now have much better visibility about which offending stages are causing the issue. 
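(Editor's sketch, not part of the interview: the time series idea Ash describes can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not Pepperdata's Code Analyzer. The sample format, the stage numbers, the CPU figures, and the function names `dominant_stage_per_tick` and `cpu_seconds_per_stage` are all hypothetical. Given per-stage CPU samples taken at fixed intervals, it reports which stage dominates the CPU at each point in time and which one consumes the most CPU overall.)

```python
from collections import defaultdict

# Hypothetical samples: (timestamp_seconds, stage_id, cpu_percent).
# Stages 2 and 3 run in parallel; stage 3 finishes first.
samples = [
    (0, 2, 55.0), (0, 3, 30.0),
    (10, 2, 80.0), (10, 3, 15.0),
    (20, 2, 85.0), (20, 3, 10.0),
    (30, 2, 90.0),
]


def dominant_stage_per_tick(samples):
    """Return {timestamp: stage_id} for the stage using the most CPU at each tick."""
    by_tick = defaultdict(dict)
    for ts, stage, cpu in samples:
        by_tick[ts][stage] = by_tick[ts].get(stage, 0.0) + cpu
    return {ts: max(stages, key=stages.get) for ts, stages in sorted(by_tick.items())}


def cpu_seconds_per_stage(samples, interval_s):
    """Approximate CPU-seconds per stage, treating each sample as one full interval."""
    totals = defaultdict(float)
    for _, stage, cpu in samples:
        totals[stage] += cpu / 100.0 * interval_s
    return dict(totals)


dominant = dominant_stage_per_tick(samples)
totals = cpu_seconds_per_stage(samples, interval_s=10)
heaviest = max(totals, key=totals.get)

print("dominant stage at each tick:", dominant)
print("approximate CPU-seconds per stage:", totals)
print("heaviest stage overall:", heaviest)
```

(An aggregate view would only show totals per stage. Keeping the timestamps is what lets you see when a stage was pegging the CPU and what else was running alongside it.)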
And the nice thing about that is, once you know that, you know exactly which piece of code that you actually want to go and look at. So classic example would be, you might have two stages that are running in parallel. The Spark UI will tell you that it's stage three that's causing the problem, but if you look at the time series, you'll find out that stage two actually runs longer, and that's the one that's pegging the CPU. And you can see that because we have the time series, but you couldn't see that any other way. >> So you have a code analyzer and also the app profiler. >> So the app profiler is the other product that we announced a few months ago. We announced that I guess about three months ago or so. And the app profiler, what it does, is it actually looks after the run is done, it actually looks at all the data that the run produces, that the Spark history server produces, and then it actually goes back and analyzes that and says, "Well you know what? Your executors here are not working as efficiently, these are the executors that aren't working as efficiently." It might be using too much memory or whatever, and then it allows the developer to basically be able to click on it and say, "Explain to me why that's happening?" And then it gives you a little, you know, a little fix-it if you will. It's like, if this is happening, you probably want to do these things, in order to improve performance. So, what's happening with our customers, is our customers are asking developers to run the application profiler first, before they actually put stuff on production. Because if the application profiler comes back and says, "Everything is green." That there's no critical issues there. Then they're saying, "Okay fine, put it on my cluster, on the production cluster, but don't do it ahead of time." The application profiler, to be clear, is actually based on some work on an open source project called Dr. Elephant, which comes out of LinkedIn. 
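(Editor's sketch, not part of the interview: the "run the profiler first, deploy only if everything is green" workflow can be pictured as a set of rule-based checks over per-executor statistics collected after a run. The code below is a hedged illustration only. The `ExecutorStats` fields, the thresholds `max_gc_ratio` and `min_memory_use`, and the advice strings are assumptions made up for this example. They are not the actual heuristics of the Pepperdata application profiler or of Dr. Elephant.)

```python
from dataclasses import dataclass


@dataclass
class ExecutorStats:
    """Hypothetical post-run summary for one executor."""
    executor_id: str
    memory_allocated_mb: int
    peak_memory_used_mb: int
    gc_time_s: float
    run_time_s: float


def check_executor(stats, max_gc_ratio=0.10, min_memory_use=0.50):
    """Return a list of findings; an empty list means this executor is 'green'."""
    findings = []
    gc_ratio = stats.gc_time_s / stats.run_time_s if stats.run_time_s > 0 else 0.0
    if gc_ratio > max_gc_ratio:
        findings.append(
            f"GC takes {gc_ratio:.0%} of run time; consider more memory or fewer cached objects"
        )
    memory_use = stats.peak_memory_used_mb / stats.memory_allocated_mb
    if memory_use < min_memory_use:
        findings.append(
            f"only {memory_use:.0%} of allocated memory used; consider a smaller executor"
        )
    return findings


def report(executors):
    """Map each executor id to its findings."""
    return {e.executor_id: check_executor(e) for e in executors}


executors = [
    ExecutorStats("exec-1", 4096, 3500, 12.0, 600.0),
    ExecutorStats("exec-2", 8192, 2048, 90.0, 600.0),
]

result = report(executors)
all_green = all(not findings for findings in result.values())

for executor_id, findings in result.items():
    print(executor_id, "green" if not findings else findings)
print("safe to promote to production:", all_green)
```

(Each finding pairs a symptom with a suggested fix, which mirrors the "explain to me why that's happening" and "fix-it" steps described above. Real heuristics would draw on far more signals than these two.)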
And now we're working very closely together to make sure that we actually can advance the set of heuristics that we have, that will allow developers to understand and diagnose more and more complex problems. >> The Spark community has the best code names ever. Dr. Elephant, I've never heard of that one before. (laughter) >> Well Dr. Elephant, actually, is not just the Spark community, it's actually also part of the MapReduce community, right? >> David: Ah, okay. >> So yeah, I mean remember Hadoop? >> David: Yes. >> The elephant thing, so Dr. Elephant, and you know. >> Well let's talk about where things are going next, George? >> So, you know, one of the things we hear all the time from customers and vendors, is, "How are we going to deal with this new era of distributed computing?" You know, where we've got the cloud, on-prem, edge, and like so, for the first question, let's leave out the edge and say, you've got your Fortune 200 client, they have, you know, production clusters or even if it's just one on-prem, but they also want to work in the cloud, whether it's for elastic stuff, or just for, they're gathering a lot of data there. How can you help them manage both, you know, environments? >> Right, so I think there's a bunch of time still, before we get into most customers actually facing that problem. What we see today is, that a lot of the Fortune 200, or our customers, I shouldn't say a lot of the Fortune 200, a lot of our customers have significant, you know, deployments internally on-prem. They do experimentation on the cloud, right? The current infrastructure for managing all these, and sort of orchestrating all this stuff, is typically YARN. What we're seeing, is that more than likely they're going to wind up, or at least our intelligence tells us that it's going to wind up being Kubernetes that's actually going to wind up managing that. So, what will happen is-- >> George: Both on-prem and-- >> Well let me get to that, alright? >> George: Okay. 
>> So, I think YARN will be replaced certainly on-prem with Kubernetes, because then you can do multi data center, and things of that sort. The nice thing about Kubernetes, is it in fact can span the cloud as well. So, Kubernetes as an infrastructure, is certainly capable of being able to both handle a multi data center deployment on-prem, along with whatever actually happens on the cloud. There is infrastructure available to do that. It's very immature, most of the customers aren't anywhere close to being able to do that, and I would say even before Kubernetes gets accepted within the environment, it's probably 18 months, and there's probably another 18 months to two years, before we start facing this hybrid cloud, on-prem kind of problem. So we're a few years out I think. >> So, would, for those of us including our viewers, you know, who know the acronym, and know that it's a, you know, scheduler slash cluster manager, resource manager, would that give you enough of a control plane and knowledge of sort of the resources out there, for you to be able to either instrument or deploy an instrument to all the clusters (mumbles). >> So we are actually leading the effort right now for big data on Kubernetes. So there is a group of, there's a small group working. It's Google, us, Red Hat, Palantir, Bloomberg now has joined the group as well. We are actually today talking about our effort on getting HDFS working on Kubernetes, so we see the writing on the wall. We clearly are positioning ourselves to be a player in that particular space, so we think we'll be ready and able to take that challenge on. >> Ash this is great stuff, we've just got about a minute before the break, so I wanted to ask you just a final question. You've been in the Spark community for a while, so what other open source tools should we be keeping our eyes out for? >> Kubernetes. >> David: That's the one? >> To me that is the killer that's coming next. >> David: Alright. 
>> I think that's going to make life, it's going to unify the microservices architecture, plus the sort of multi data center and everything else. I think it's really, really good. Borg works, it's been working for a long time. >> David: Alright, and I want to thank you for that little Pepper pen that I got over at your booth, as the coolest-- >> Come and get more. >> Gadget here. >> We also have Pepper sauce. >> Oh, of course. (laughter) Well there sir-- >> It's our sauce. >> There's the hot news from-- >> Ash: There you go. >> Pepperdata Ash Munshi. Thank you so much for being on the show, we appreciate it. >> Ash: My pleasure, thank you very much. >> And thank you for watching theCUBE. We're going to be back with more guests, including Ali Ghodsi, CEO of Databricks, coming up next. (upbeat music) (ocean roaring)

Published Date : Jun 7 2017

SUMMARY :

brought to you by Databricks. and here with George Gilbert from Wikibon, George. Alright and the guest of honor of course, I want you to just tell us real quick here, and then subsequent to that I did a bunch of startups, and it's actually taken over the world. and now everybody's taking advantage of the same thing. about some of the solutions that you have. So the classical methodologies that you have, Yeah, and George you look like And by the overhead I mean, you know, is sort of part of the costing if you will, and I saw you were announcing a couple of things. And the nice thing about that is, once you know that, And then it gives you a little, The Spark community has the best code names ever. is not just the Spark community, and like so, for the first question, that a lot of the Fortune 200, or our customers, and there's probably another 18 months to two years, and know that it's a, you know, scheduler Bloomberg now has joined the group as well. so I wanted to ask you just a final question. plus the sort of multi data center Oh, of course. Thank you so much for being on the show, we appreciate it. And thank you for watching theCUBE.

ENTITIES

Entity	Category	Confidence
David Goad	PERSON	0.99+
Ash Munshi	PERSON	0.99+
George	PERSON	0.99+
Ali Ghodsi	PERSON	0.99+
Larry Ellison	PERSON	0.99+
George Gilbert	PERSON	0.99+
Google	ORGANIZATION	0.99+
Sean Suchter	PERSON	0.99+
David	PERSON	0.99+
Sean	PERSON	0.99+
Ash	PERSON	0.99+
Red Hat	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Yahoo	ORGANIZATION	0.99+
Peter Cnudde	PERSON	0.99+
2011	DATE	0.99+
DeepMind	ORGANIZATION	0.99+
Bloomberg	ORGANIZATION	0.99+
San Francisco	LOCATION	0.99+
two guys	QUANTITY	0.99+
Pepperdata	ORGANIZATION	0.99+
24 hours	QUANTITY	0.99+
first question	QUANTITY	0.99+
Spark UI	TITLE	0.99+
Amazon	ORGANIZATION	0.99+
DevOps	TITLE	0.99+
2012	DATE	0.99+
Chad Carson	PERSON	0.99+
two years	QUANTITY	0.99+
18 months	QUANTITY	0.99+
one	QUANTITY	0.99+
two	QUANTITY	0.99+
one problem	QUANTITY	0.99+
last July	DATE	0.99+
Databricks	ORGANIZATION	0.99+
LinkedIn	ORGANIZATION	0.99+
Spark Summit 2017	EVENT	0.99+
Code Analyzer	TITLE	0.99+
Spark	TITLE	0.98+
100,000 nodes	QUANTITY	0.98+
today	DATE	0.98+
Palantir	ORGANIZATION	0.98+
an hour	QUANTITY	0.98+
IBM Research	ORGANIZATION	0.98+
Both	QUANTITY	0.98+
two gentlemen	QUANTITY	0.98+
Chad	PERSON	0.98+
two stages	QUANTITY	0.98+
first guys	QUANTITY	0.98+
both	QUANTITY	0.97+
thousands of machines	QUANTITY	0.97+
each one	QUANTITY	0.97+
tens of hours	QUANTITY	0.95+
Kubernetes	ORGANIZATION	0.95+
MapReduce	TITLE	0.95+
Yahoo Search	ORGANIZATION	0.94+


Search Results for Ali Ghodsi: