The Truth About MySQL HeatWave

>>When Oracle acquired MySQL via the Sun acquisition, nobody really thought the company would put much effort into the platform, preferring to focus all the wood behind its leading Oracle Database, arrow pun intended. But two years ago, Oracle surprised many folks by announcing MySQL HeatWave, a new database as a service with a massively parallel hybrid columnar in-memory architecture that brings together transactional and analytic data in a single platform. Welcome to our latest database Power Panel on theCUBE. My name is Dave Vellante, and today we're gonna discuss Oracle's MySQL HeatWave with a who's who of cloud database industry analysts. Holger Mueller is with Constellation Research. Marc Staimer is the Dragon Slayer and a Wikibon contributor. And Ron Westfall is with Futurum Research. Gentlemen, welcome back to theCUBE. Always a pleasure to have you on. Thanks for having us. Great to be here. >>So we've had a number of deep-dive interviews on theCUBE with Nipun Agarwal. You guys know him? He's a senior vice president of MySQL HeatWave development at Oracle. I think you just saw him at Oracle CloudWorld, and he's come on to describe what I'll call shock-and-awe feature additions to HeatWave. You know, the company's clearly putting R&D into the platform, and I think at CloudWorld we saw like the fifth major release since 2020, when they first announced MySQL HeatWave. So just listing a few: they've brought in analytics and machine learning, and they've got Autopilot for machine learning, which is automation, on top of the basic OLTP functionality of the database. And it's been interesting to watch Oracle's converged database strategy. We've contrasted that amongst ourselves. Love to get your thoughts on Amazon's right-tool-for-the-right-job approach. >>Are they gonna have to change that? You know, Amazon's got the specialized databases, and, you know, both companies are doing well.
It just shows there are a lot of ways to skin a cat, cuz you see some traction in the market in both approaches. So today we're gonna focus on the latest HeatWave announcements, and we're gonna talk about multi-cloud with a native MySQL HeatWave implementation, which is available on AWS, and MySQL HeatWave for Azure via the Oracle-Microsoft interconnect. This is kind of cool hybrid action that they've got going. Sometimes we call it supercloud. And then we're gonna dive into MySQL HeatWave Lakehouse, which allows users to process and query data across MySQL databases and HeatWave databases, as well as object stores. So HeatWave has been announced on AWS and Azure, they're available now, and Lakehouse I believe is in beta, and I think it's coming out the second half of next year. So again, all of our guests are fresh off of Oracle CloudWorld in Las Vegas. So they've got the latest scoop. Guys, I'm done talking. Let's get into it. Marc, maybe you could start us off. What's your opinion of MySQL HeatWave's competitive position? When you think about what AWS is doing, you know, Google, we had Google Cloud Next recently, we heard about all their data innovations. Obviously Azure's got a big portfolio, Snowflake's doing well in the market. What's your take? >>Well, first let's look at it from the point of view that AWS is the market leader in cloud and cloud services. They own somewhere between 30 and 50% of the market, depending on who you read. And then you have Azure as number two, and after that it falls off. There's GCP, Google Cloud Platform, which is further down the list, and then Oracle and IBM and Alibaba. So when you look at AWS and Azure and say, hey, these are the market leaders in the cloud, then you start looking at it and saying, if I am going to provide a service that competes with the service they have, if I can make it available in their cloud, it means that I can be more competitive.
And if I'm compelling, and compelling means at least twice the performance or functionality or both at half the price, I should be able to gain market share. >>And that's what Oracle's done. They've taken a superior product in MySQL HeatWave, which is faster, lower cost, does more for a lot less at the end of the day, and they make it available to the users of those clouds. You avoid this little thing called egress fees, you avoid the issue of having to migrate from one cloud to another, and suddenly you have a very compelling offer. So I look at what Oracle's doing with MySQL and it feels like, I'm gonna use a war term, a flanking maneuver on their competition. They're offering a better service on their competitors' platforms. >>All right, so thank you for that. Holger, we've seen this sort of cadence, I sort of referenced it up front a little bit, and they sat on MySQL for a decade, then all of a sudden we see this rush of announcements. Why did it take so long? And more importantly, is Oracle developing the right features that cloud database customers are looking for, in your view? >>Yeah, great question, but first of all, in your intro you said they added analytics, right? Analytics is kind of like a marketing buzzword. Reports can be analytics, right? The interesting thing, the first thing they did, is they crossed the chasm between OLTP and OLAP, right? In the same database, right? So a major engineering feat, very much what customers want, and it's all about creating value for customers, which I think is part of why they go into multi-cloud and why they add these capabilities. And certainly with the AI capabilities, it's kind of like getting into an autonomous field, a self-driving field, now with the lakehouse capabilities, and meeting customers where they are, like Marc has talked about with the egress costs in the cloud.
So that's a significant advantage, creating value for customers, and that's what matters at the end of the day. >>And I believe strongly that long term it's gonna be the ones who create better value for customers who will get more of their money. From that perspective, why did it take them so long? I think it's a great question. I think largely, you mentioned the gentleman, Nipun, it's largely down to who leads a product. I used to build products too, so maybe I'm fooling myself a little here, but that made the difference in my view, right? So since he's been in charge, he's been building things faster than the rest of the competition in the MySQL space, which in hindsight we thought was in a hot and smoking innovation phase. It kind of was a little self-complacent when it comes to the traditional borders of where people think things are separated, between OLTP and OLAP, or, as an example, JSON support, right? Structured documents versus unstructured documents or databases, and all of that has been collapsed and brought together to build a more powerful database for customers. >>So I mean, it's certainly, you know, when Oracle talks about the competitors, I always say, if Oracle talks about you, you know you're doing well. So they talk a lot about AWS, talk a little bit about Snowflake, you know, sort of Google, they have partnerships with Azure, but so I'm presuming that MySQL HeatWave was really in response to what they were seeing from those big competitors. But then you had MariaDB coming out, you know, the day that Oracle acquired Sun, and launching and going after the MySQL base. So I'm interested, and we'll talk about this later, in what you guys think about AWS and Google and Azure and Snowflake and how they're gonna respond.
But before I do that, Ron, I want to ask you, you can get, you know, pretty technical, and you've probably seen the benchmarks. >>I know you have. Oracle makes a big deal out of it, publishes its benchmarks, makes them transparent on GitHub. Larry Ellison talked about this in his keynote at CloudWorld. What do the benchmarks show in general? I mean, when you're new to the market, you've gotta have a story. Like Marc was saying, you've gotta be, you know, 2x the performance at half the cost, or you're not gonna get any market share. And, you know, oftentimes companies don't publish benchmarks when they're leading. They do it when they need to gain share. So what do you make of the benchmarks? Have there been any results that were surprising to you? Have they, you know, been challenged by the competitors? Is it just a bunch of kind of desperate benchmarketing to make some noise in the market, or, you know, are they real? What's your view? >>Well, from my perspective, I think they have validity. And to your point, I believe that when it comes to competitor responses, that has not really happened. Nobody has pulled down the information that's on GitHub and said, oh, here are our price-performance results, and they counter Oracle's. In fact, I think part of the reason why that hasn't happened is that there's a risk: if Oracle's coming out and saying, hey, we can deliver 17 times better query performance using our capabilities versus, say, Snowflake when it comes to, you know, the Lakehouse platform, and Snowflake turns around and says it's actually only 15 times better query performance, that's not exactly an effective maneuver. And so I think this is really to Oracle's credit, and I think it's refreshing, because these differentiators are significant. We're not talking, you know, like 1.2% differences.
We're talking 17-fold differences, we're talking six-fold differences, depending on, you know, where the spotlight is being shined and so forth. >>And so I think this is actually something that is too good to believe at first blush. If I'm a cloud database decision maker, I really have to prioritize this. I really would now pay a lot more attention to this. And that's why I posed the question to Oracle and others: okay, if these differentiators are so significant, why isn't the needle moving a bit more? And it's for, you know, some of the usual reasons. One is really deep discounting coming from, you know, the other players. That's really kind of, you know, marketing 101, this is something you need to do when there's a real competitive threat, to keep, you know, a customer in your own customer base. Plus there is the usual fear and uncertainty about moving from one platform to another. But I think, you know, the traction, the momentum is shifting in Oracle's favor. I think we saw that in the Q1 results, for example, where Oracle cloud grew 44% and generated, you know, 4.8 billion in revenue, if I recall correctly. And so all of these are demonstrating that Oracle is making, I think, many of the right moves. Publishing these figures for anybody to look at from their own perspective is something that is, I think, good for the market, and I think it's just gonna continue to pay dividends for Oracle down the horizon as, you know, competition intensifies. So if I were in, >>Dave, can I interject something on what Ron just said there? Yeah, please go ahead. A couple things here. One, discounting, which is a common practice when you have a real threat, as Ron pointed out, isn't going to help much in this situation, simply because you can't discount to the point where you improve your performance, and the performance is a huge differentiator.
You may be able to get your price down, but the problem that most of them have is they don't have an integrated product service. They don't have an integrated OLTP, OLAP, ML, and data lake. Even if you cut out two of them, they don't have any of them integrated. They have multiple services that require separate integration, and that can't be overcome with discounting. And you have to pay for each one of these. And oh, by the way, as you grow, the discounts go away. So that's a minor but important detail. >>So that's a TCO question, Marc, right? And I know you look at this a lot. If I had that kind of price-performance advantage, I would be pounding TCO, especially if I need two separate databases to do the job that one can do. The TCO numbers are gonna be off the chart, or maybe down the chart, which is what you want. Have you looked at this, and how does it compare with, you know, the big cloud guys, for example? >>I've looked at it in depth. In fact, I'm working on another TCO in this arena, but you can find it on Wikibon, in which I compared TCO for MySQL HeatWave versus Aurora plus Redshift plus ML plus Glue. I've compared it against GCP's services, Azure services, Snowflake with other services. And there's just no comparison. The TCO differences are huge. More importantly, the TCO per performance difference is huge. We're talking in some cases multiple orders of magnitude, but at least an order of magnitude difference. So discounting isn't gonna help you much at the end of the day. It's only going to lower your cost a little, but it doesn't improve the automation, it doesn't improve the performance, it doesn't improve the time to insight, it doesn't improve all those things that you want out of a database or multiple databases, because you >>can't discount yourself to a higher value proposition. >>So what about, I wonder, Holger, if you could chime in on the developer angle. You've followed that market.
How do these innovations from HeatWave, I think you used the term developer velocity. I've heard you use that before. Yeah, I mean, look, Oracle owns Java, okay, so it's, you know, the most popular, you know, programming language in the world, blah, blah, blah. But does it have the minds and hearts of developers, and where does HeatWave fit into that equation? >>I think HeatWave is quickly gaining mindshare on the developer side, right? It's not the traditional NoSQL database that they grew up with. There's a traditional mistrust of Oracle among developers over what happens to open source when it gets acquired, like in the case of Oracle with Java and with MySQL, right? But we know it's not a good competitive strategy to bank on Oracle screwing up, because it hasn't happened, not on Java nor on MySQL, right? And for developers, once you get to know a technology product and you can do more with it, it becomes kind of like a Swiss Army knife, and you can build more use cases, you can build more powerful applications. That's super, super important, because you don't have to get certified in multiple databases. You are fast at getting things done, you achieve higher developer velocity, and the managers are happy because they don't have to license more things, send you to more trainings, have more risk of something not being delivered, right? >>So really we see the suite versus best-of-breed play happening here, which in general was already happening before with Oracle's flagship database versus those of Amazon, as an example, right?
And now the interesting thing is, every step of the way Oracle was always a one-database company, there can be only one, and they're now genuinely talking about HeatWave and being a two-database company with different market spaces, but the same value proposition of integrating more things very, very quickly to have a universal database, that's what I call it, they call it the converged database, for all the needs of an enterprise to run certain application use cases. And that's what's attractive to developers. >>It's ironic, isn't it? I mean, you know, the rumor was that TK, Thomas Kurian, left Oracle cuz he wanted to put Oracle Database on other clouds and other places. And maybe that was the rift. I'm sure there were other things, but Oracle clearly is now trying to expand its TAM, Ron, with HeatWave into AWS, into Azure. How do you think Oracle's gonna do? You were at CloudWorld. What was the sentiment from customers and the independent analysts? Is this just Oracle trying to screw with the competition, create a little diversion? Or is this, you know, serious business for Oracle? What do you think? >>No, I think it has legs. I think it's definitely, again, contributing to Oracle's overall ability to differentiate not only MySQL HeatWave but its overall portfolio. And I think the fact that they do have the alliance with Azure in place is definitely demonstrating their commitment to meeting the multi-cloud needs of their customers, as is what we pointed to in terms of the fact that they're now offering, you know, MySQL capabilities within AWS natively, and that it can now outperform AWS's own offering. And I think this is all demonstrating that Oracle is, you know, not letting up, they're not resting on their laurels. Clearly we are living in a multi-cloud world, so why not just make it easier for customers to be able to use cloud databases according to their own specific needs?
And I think, you know, to Holger's point, that definitely aligns with being able to bring on more application developers to leverage these capabilities. >>I think one important announcement that's related to all this was the JSON relational duality capabilities, where now it's a lot easier for application developers to use a language that they're very familiar with, JSON, and not have to worry about going into relational databases to store their JSON application code. So this is, I think, an example of the innovation that's enhancing the overall Oracle portfolio, and certainly all the work with machine learning is definitely paying dividends as well. And as a result, I see Oracle continuing to make these inroads that we pointed to. But I agree with Marc, you know, the short-term discounting is just a stall tactic. It doesn't change the fact that Oracle is able to not only deliver price-performance differentiators that are dramatic, but also meet a wide range of needs for customers out there that aren't just limited to price-performance considerations. >>Being able to support multi-cloud according to customer needs. Being able to reach out to the application developer community and address a very specific challenge that has plagued them for many years now. So bringing it all together, yeah, I see this as just enabling Oracle to ring true with customers. The customers that were there, even though not all of them are going to be saying the same things, were all basically giving positive feedback. And likewise, I think the analyst community is seeing this. It's always refreshing to be able to talk to customers directly, and at Oracle CloudWorld there was a litany of them, and so this is just a difference maker, as is being able to talk to strategic partners. The Nvidia partnership, I think, is also a testament to Oracle's ongoing ability to, you know, make the ecosystem more user-friendly for the customers out there.
>>Yeah, it's interesting, when you get these all-in-one tools, you know, the Swiss Army knife, you expect that it's not able to be best of breed. That's the kind of surprising thing that I'm hearing about HeatWave. I want to talk about Lakehouse, because when I think of lakehouse, I think Databricks, and to my knowledge Databricks hasn't been in the sights of Oracle yet. Maybe they're next, but Oracle claims that MySQL HeatWave Lakehouse is a breakthrough in terms of capacity and performance. Marc, what are your thoughts on that? Can you double-click on Lakehouse and Oracle's claims for things like query performance and data loading? What does it mean for the market? Is Oracle really leading in the lakehouse competitive landscape? What are your thoughts? >>Well, the name of the game is, what are the problems you're solving for the customer? More importantly, are those problems urgent or important? If they're urgent, customers wanna solve them now. If they're important, they might get around to them. So you look at what they're doing with Lakehouse, or previous to that machine learning, or previous to that automation, or previous to that OLAP with OLTP, and they're merging all this capability together. If you look at Snowflake or Databricks, they're tackling one problem. You look at MySQL HeatWave, they're tackling multiple problems. So when you say, yeah, their queries are much better against the lakehouse, it's in combination with other analytics, in combination with OLTP, and the fact that there are no ETLs. So you're getting all this done in real time. It's doing the query across everything in real time. >>You're solving multiple user and developer problems, you're increasing their ability to get insight faster, you're having shorter response times. So yeah, they really are solving urgent problems for customers. And by putting it where the customer lives, this is the brilliance of actually being multi-cloud.
And I know I'm backing up here a second, but by making it work in AWS and Azure, where people already live, where they already have applications, what they're saying is, we're bringing it to you. You don't have to come to us to get these benefits, this value. Overall, I think it's a brilliant strategy. I give Nipun Agarwal huge, huge kudos for what he's doing there. So yes, what they're doing with the lakehouse is going to put Databricks and Snowflake and everyone else, for that matter, on notice. Well, >>those are the guys that, Holger, you and I have talked about this. Those are the guys that are doing sort of the best of breed. You know, they're really focused and they, you know, tend to do well, at least out of the gate. Now you've got Oracle's converged philosophy, obviously with Oracle Database. We've seen that, and now it's kicking into gear with HeatWave, you know, this whole thing of suites versus best of breed. I mean, in the long term, you know, customers tend to migrate towards suites, but the new shiny toy tends to get the growth. How do you think this is gonna play out in cloud database? >>Well, it's the forever never-ending story in software, right? Suite versus best of breed, and so far in the long run suites have always won, right? And sometimes they struggle, because the inherent problem of suites is you build something larger, it has more complexity, and that means your cycles to get everything working together, to integrate, test, roll it out, certify, whatever it is, take longer, right? And that's not the case here. The fascinating part of the effort around MySQL HeatWave is that the team is out-executing the previous best-of-breed players while bringing something together. Now, whether they can maintain that pace, that's something to be seen.
But the strategy, like what Marc was saying, bring the software to the data, is of course interesting and unique and totally un-Oracle in the past, right? >>Yeah. It had to be in your database on OCI. >>And that's an interesting part. The interesting thing on the lakehouse side is, right, there are three key benefits of a lakehouse. The first one is better reporting and analytics, bringing more rich information together. Like, take the case of SiliconANGLE, right? We want to see engagements for this video, we want to know what's happening. That's a mixed transactional, video, media use case, right? A typical lakehouse use case. The next one is to build more rich applications, transactional applications which have video and these elements in there, which are the engaging ones. And the third one, and that's where I'm a little critical and concerned, is that it's really the base platform for artificial intelligence, right? To run deep learning, to run things automatically, because you have all the data in one place and can query it in one way. >>And that's where Oracle, I know that Ron talked about Nvidia for a moment, but that's where Oracle doesn't have the strongest story. Nonetheless, the two other main use cases of the lakehouse are very strong. My only concern is that 450 terabytes sounds like an arbitrary limitation. Yeah, it sounds big for a start, and it's the first version, so they can make that bigger. You don't want your lakehouse to be limited in terabyte sizes or even petabyte sizes, because you want to have the certainty that I can put everything in there that I think might be relevant, without knowing what questions to ask, and query those questions. >>Yeah. And you know, in the early days of no schema on write, it just became a mess. But now technology has evolved to allow us to actually get more value out of that data. Data lake, data swamp, it's, you know, now much more logical.
And in a moment I want to come back to how you think the competitors are gonna respond. Are they gonna have to sort of do more of a converged approach, AWS in particular? But before I do, Ron, I want to ask you a question about Autopilot, because I heard Larry Ellison's keynote and he was talking about how, you know, most security issues are human errors, and with autonomy and Autonomous Database and things like Autopilot, we take care of that. It's like autonomous vehicles, they're gonna be safer. And I went, well, maybe, maybe someday. So Oracle really tries to emphasize this. Every time you see an announcement from Oracle, they talk about new, you know, autonomous capabilities. How legit is it? Do people care? What about, you know, what's new for HeatWave Lakehouse? How much of a differentiator, Ron, do you really think Autopilot is in this cloud database space? >>Yeah, I think it will definitely enhance the overall proposition. I don't think people are gonna buy, you know, Lakehouse exclusively because of Autopilot capabilities, but when they look at the overall picture, I think it will be an added capability bonus to Oracle's benefit. And yeah, I think it's kind of one of these age-old questions: how much do you automate, and what is the balance to strike? And I think we all understand with the autonomous car analogy that there are limitations to being able to use that. However, I think it's a tool that basically every organization out there needs to at least have, or at least evaluate, because it helps with ease of use, it helps make automation more balanced in terms of, you know, being able to test: all right, let's automate this process and see if it works well, then we can go on and switch on Autopilot for other processes.
>>And then, you know, that allows, for example, the specialists to spend more time on business use cases versus, you know, manual maintenance of the cloud database and so forth. So I think that actually is a legitimate value proposition. I think it's just gonna be on a case-by-case basis. Some organizations are gonna be more aggressive with putting automation throughout their processes, throughout their organization. Others are gonna be more cautious. But it's gonna be, again, something that will help the overall Oracle proposition. And it's something that I think will be used with caution by many organizations, but other organizations are gonna be like, hey, great, this is something that is really answering a real problem. And that is just easing the use of these databases, but also being able to better handle the automation capabilities and the benefits that come with it without having, you know, a major screwup happen in the process of transitioning to more automated capabilities. >>Now, I didn't attend CloudWorld, it's just been too many red-eyes, you know, recently, so I passed. But one of the things I like to do at those events is talk to customers, you know, in the spirit of the truth. You know, you have the hallway track, and you talk to customers and they say, hey, you know, here's the good, the bad, and the ugly. So did you guys talk to any MySQL HeatWave customers at CloudWorld? And what did you learn? I don't know, Marc, did you have any luck having some private conversations? >>Yeah, I had quite a few private conversations. One thing before I get to that: I want to disagree with one point Ron made. I do believe there are customers out there buying the HeatWave service, the MySQL HeatWave service, because of Autopilot.
Because Autopilot is really revolutionary in many ways for the MySQL developer, in that it auto-provisions, it auto-parallel-loads, it does auto data placement, it does auto shape prediction. It can tell you which machine learning models are gonna give you your best results. And candidly, I've yet to meet a DBA who didn't wanna give up pedantic tasks that are a pain in the kazoo, which they'd rather not do, as long as it was done right for them. So yes, I do think people are buying it because of Autopilot, and that's based on some of the conversations I had with customers at Oracle CloudWorld. >>In fact, it was like, yeah, that's great, yeah, we get fantastic performance, but this really makes my life easier, and I've yet to meet a DBA who didn't want to make their life easier. And it does. So yeah, I've talked to a few of them. They were excited. I asked them if they ran into any bugs, and whether there were any difficulties in moving to it. And the answer was no in both cases. It's interesting to note, MySQL is the most popular database on the planet. Well, some will argue that it's neck and neck with SQL Server, but if you add in MariaDB and Percona, which are forks of MySQL, then yeah, far and away it's the most popular. And as a result of that, everybody for the most part typically has a MySQL database somewhere in their organization. So this is a brilliant situation for anybody going after MySQL, but especially for HeatWave. And the customers I talked to love it. I didn't find anybody complaining about it. And >>what about the migration? We talked about TCO earlier. Does your TCO analysis include the migration cost, or do you kind of conveniently leave that out, or what? >>Well, when you look at migration costs, there are different kinds of migration costs. By the way, the worst job in the data center is the data migration manager. Forget it, no other job is as bad as that one. You get no attaboys for doing it right.
And then when you screw up, oh boy. So in real terms, anything that can limit data migration is a good thing. And when you look at data lake, that limits data migration. So if you're already a MySQL user, this is pure MySQL as far as you're concerned. It's just a simple transition from one to the other. You may wanna make sure nothing broke, and all your tables are correct, and your schema's okay, but it's all the same. So it's a simple migration. It's pretty much a non-event, right? When you migrate data from an OLTP to an OLAP, that's an ETL, and that's gonna take time. >>But you don't have to do that with MySQL HeatWave. So that's gone. When you start talking about machine learning, again, you may have an ETL, you may not, depending on the circumstances, but again, with MySQL HeatWave, you don't, and you don't have duplicate storage, you don't have to copy it from one storage container to another to be able to use it in a different database, which, by the way, ultimately adds much more cost than just the other service. So yeah, I looked at the migration, and again, the users I talked to said it was a non-event. It was literally moving from one physical machine to another. If they had a new version of MySQL running on something else and just wanted to migrate it over, or just hook it up, or just connect it to the data, it worked just fine. >>Okay, so anyway, it sounds like you guys feel, and we've certainly heard this, my colleague David Floyer, the semi-retired David Floyer, was always very high on HeatWave. So I think it's got some real legitimacy here, coming from a standing start, but I wanna talk about the competition and how they're likely to respond. I mean, if you're AWS and you've got HeatWave now in your cloud, there are some good aspects of that. The database guys might not like that, but the infrastructure guys probably love it.
Hey, more ways to sell, you know, EC2 and Graviton. But the database guys in AWS are gonna respond. They're gonna say, hey, we've got Redshift, we've got AQUA. What are your thoughts on not only how that's gonna resonate with customers, but, and I never say never about AWS, are they gonna try to build, in your view, a converged OLAP and OLTP database? You know, Snowflake is taking an ecosystem approach. They've added transactional capabilities to the portfolio, so they're not standing still. What do you guys see in the competitive landscape in that regard going forward? Maybe Holger, you could start us off, and anybody else who wants to can chime in.
>>Happy to. You mentioned Snowflake last, so we'll start there. I think Snowflake is imitating that strategy, right? Building out from the original data warehouse in the cloud to really have a proposition for having other data available there, because AI is relevant for everybody. Ultimately, people keep data in the cloud for running AI. So you see the same suite-level strategy. It's gonna be a little harder because of the original positioning. How much would people know that you're doing other stuff? And as a former developer and manager of developers, I just don't see the speed at the moment happening at Snowflake to become really competitive with Oracle. On the flip side, putting my Oracle hat on for a moment, back to you, Mark and Ron, right? What could Oracle still add? Because the big, big things, right, the traditional chasms in the database world, they have built everything, right?
>>So I really scratched my head and gave Nipun a hard time at CloudWorld, saying, like, what could you be building? The answer was very conservative: let's get the Lakehouse thing done, it's gonna be spring next year, right? And AWS is really hard, because the AWS value proposition is these small innovation teams, right?
They build two-pizza teams, which can be fed by two pizzas, not large teams, right? And you need large teams to build these suites with lots of functionality, to make sure they work together, they're consistent, they have the same UX on the administration side, they can be consumed the same way, they have the same API registry. I can't even stop going on about where the synergy comes into play with a suite. So it's gonna be really, really hard for them to change that. But AWS is super pragmatic. They always pride themselves that they listen to customers. If they learn from customers that suite is a proposition, I would not be surprised if AWS tries to bring things closer together.
>>Yeah. Well, how about multicloud? Can we talk about that? Again, Oracle is very much on Oracle, as you said before, but let's look forward, you know, half a year or a year. What do you think about Oracle's moves in multicloud in terms of what kind of penetration they're gonna have in the marketplace? You saw a lot of presentations at CloudWorld. You know, we've looked pretty closely at the Microsoft Azure deal. I think that's really interesting. I've called it a little bit of early days of a supercloud. What impact do you think this is gonna have on the marketplace? And think about it both ways: within Oracle's customer base, I have no doubt they'll do great there. But what about beyond its existing install base? What do you guys think?
>>Ron, do you wanna jump on that? Go ahead. Go ahead, Ron. No, no, no.
>>That's an excellent point. I think it aligns with what we've been talking about in terms of Lakehouse. I think Lakehouse will enable Oracle to pull more customers, more MySQL customers, onto the Oracle platforms. And I think we're seeing all the signs pointing toward Oracle being able to make more inroads into the overall market.
And that includes garnering customers from the leaders. In other words, because they are, you know, coming in as an innovator, an alternative to, you know, the AWS proposition, the Google Cloud proposition, they have less to lose, and as a result they can really drive the multi-cloud messaging to resonate with not only their existing customers but also, to the question Dave's posing, actually garner customers onto their platform. And that includes naturally MySQL, but also OCI and so forth. So that's how I'm seeing this playing out. I think, you know, again, Oracle's reporting is indicating that, and I think what we saw at Oracle CloudWorld is definitely validating the idea that Oracle can make more waves in the overall market in this regard.
>>You know, I've floated this idea of supercloud. It's kind of tongue in cheek, but I think there is some merit to it in terms of building on top of hyperscale infrastructure and abstracting some of that complexity. And one of the things that I'm most interested in is industry clouds and Oracle's acquisition of Cerner. I was struck by Larry Ellison's keynote. It was like, I don't know, an hour and a half, and an hour and 15 minutes was focused on healthcare transformation.
>>So vertical.
>>Right. And so, yeah, Oracle's got some industry chops, and then you think about what they're building with not only OCI, but then, you know, MySQL, which you can now run in dedicated regions. You've got ADB on Exadata Cloud@Customer, you can put that on-prem in your data center, and you look at what the other hyperscalers are doing. I say other hyperscalers, I've always said Oracle's not really a hyperscaler, but they've got a cloud, so they're in the game. But you can't get, you know, BigQuery on-prem. You look at Outposts, it's very limited in terms of, you know, the database support, and again, that will evolve.
But now Oracle's announced Alloy, where you can white label their cloud. So I'm interested in what you guys think about these moves, especially the industry cloud. We see, you know, Walmart is doing sort of their own cloud. You've got Goldman Sachs doing a cloud. What do you guys think about that, and what role does Oracle play? Any thoughts?
>>Yeah, let me jump on that for a moment. Now, especially with MySQL, by making that available in multiple clouds, what they're doing follows the philosophy they've had in the past with Cloud@Customer: taking the application and the data and putting it where the customer lives. If it's on premise, it's on premise. If it's in the cloud, it's in the cloud. By making MySQL HeatWave essentially plug-compatible with any other MySQL as far as your database is concerned, and then giving you that integration with OLAP and ML and Data Lake and everything else, what you've got is a compelling offering. You're making it easier for the customer to use. So when I look at the difference between MySQL and the Oracle database, MySQL is going to capture more market share for them.
>>You're not gonna find a lot of new users for the Oracle database. Yeah, there are always gonna be new users, don't get me wrong, but it's not gonna be huge growth. Whereas MySQL HeatWave is probably gonna be a major growth engine for Oracle going forward, not just in their own cloud, but in AWS and in Azure, and on premise over time; eventually it'll get there. It's not there now, but it will be. They're doing the right thing on that basis. They're taking the services, when you talk about multicloud, and making them available where the customer wants them, not forcing them to go where you want them, if that makes sense. And as far as where they're going in the future, I think they're gonna take a page out of what they've done with the Oracle database.
They'll add things like JSON and XML and time series and spatial. Over time they'll make it a complete converged database, like they did with the Oracle database. The difference being the Oracle database will scale bigger, will have more transactions, and will be somewhat faster. And MySQL will be for anyone who's not on the Oracle database. They're not stupid, that's for sure.
>>They've done JSON already, right? But I give you that they could add graph and time series, right, in HeatWave. Right. Yeah, that's absolutely right.
>>That's sort of a logical move, right?
>>Right. But let's not kid ourselves, right? I mean, it has worked in Oracle's favor, right? 10x, 20x the amount of R&D that is in the MySQL space has been poured into trying to snatch workloads away from Oracle, starting with IBM 30 years ago, Microsoft 20 years ago, and it didn't work, right? Database applications are extremely sticky. When they run, you don't want to touch them, and they grow, right? So that doesn't mean that HeatWave is not an attractive offering, but it will be net new things, right? And what works in MySQL HeatWave's favor a little bit is that it's not the massive enterprise applications, which are really nailed down. Like, you might be running only 30% on Oracle, but the connections and the interfaces into that are like 70, 80% of your enterprise.
>>You take it out and it's like the spaghetti ball, where you say, ah, no, I really don't want to do all that, right? You don't have that massive part with the MySQL HeatWave kind of databases, which are smaller and more tactical in comparison. But still, I don't see them taking so much share. They will be growing because of an attractive value proposition. Quickly, on the multi-cloud, right? I think it's not really multi-cloud if you just give people the chance to run your offering on different clouds, right? You can run it there.
The multi-cloud advantage comes when the uber offering comes out, which allows you to do things across those installations, right? I can migrate data, I can query data across them, something like Google has done with BigQuery Omni, I can run predictive models or even make AI models in different places and distribute them, right? And Oracle is paving the road for that by being available on these clouds. But the multi-cloud capability of a database which knows it's running on different clouds, that is still yet to be built.
>>Yeah. And
>>The problem with
>>That's the supercloud concept that I floated, and I've always said Snowflake, with a single global instance, is sort of, you know, headed in that direction and maybe has a lead. What's the issue with that, Mark?
>>Yeah, the problem with that version of multi-cloud is that clouds charge egress fees. As long as they charge egress fees to move data between clouds, it's gonna be very difficult to do a real multi-cloud implementation. Even Snowflake, which runs multi-cloud, has to pass the egress fees on to their customers when data moves between clouds. And that's really expensive. I mean, there is one customer I talked to who is beta testing MySQL HeatWave on AWS for them. The only reason they didn't want to do that until it was running on AWS is that the egress fees to move it to OCI were so great that they couldn't afford it. Yeah, egress fees are the big issue.
>>But Mark, the point might be that you might wanna run a remote query and only get the result set back, right, which is much tinier. That's been the answer before for low latency between the clouds, a problem which we sometimes still have but mostly don't have, right? And I think in general, with egress fees coming down based on Oracle's general egress fee move, it's very hard to justify those, right? But it's not about moving data as a multi-cloud high-value use case.
It's about doing intelligent things with that data, right? Putting it into other places, replicating it, and, I'm saying the same thing you said before, running remote queries on it, analyzing it, running AI on it, running AI models on it. That's the interesting thing. Cross-administered in the same way. Taking things out, making sure compliance happens. Making sure that when Ron says, I don't want to be American anymore, I want to be in the European cloud, it gets migrated, right? So those are the interesting high-value use cases, which are really, really hard for enterprises to program by hand with developers, and which they would love to have out of the box. And that's the innovation yet to come, which we have yet to see. But the first step to get there is that your software runs in multiple clouds, and that's what Oracle's doing so well with MySQL.
>>Guys. Amazing.
>>Go ahead. Yeah.
>>Yeah.
>>For example,
>>Amazing amount of data knowledge and brain power in this market. Guys, I really want to thank you for coming on theCUBE. Ron, Holger, Mark, always a pleasure to have you on. Really appreciate your time.
>>Well all the last names we're very happy for Romanic last and moderator. Thanks, Dave, for moderating us.
>>All right, we'll see you guys around. Safe travels to all, and thank you for watching this power panel, The Truth About MySQL HeatWave, on theCUBE, your leader in enterprise and emerging tech coverage.

Published Date : Nov 1 2022

Oracle Announces MySQL HeatWave on AWS

>>Oracle continues to enhance MySQL HeatWave at a very rapid pace. The company is now in its fourth major release since the original announcement in December 2020. One of the main criticisms of MySQL HeatWave is that it only runs on OCI, Oracle Cloud Infrastructure, and is a lock-in to Oracle's cloud. Oracle recently announced that HeatWave is now going to be available in the AWS cloud, and it announced its intent to bring MySQL HeatWave to Azure. So MySQL HeatWave on AWS is a significant TAM expansion move for Oracle because of the momentum AWS cloud continues to show. And evidently the HeatWave engineering team has taken the development effort from OCI and is bringing that to AWS with a number of enhancements that we're gonna dig into today. Nipun Agarwal, senior vice president of MySQL HeatWave at Oracle, is back with me on a CUBE Conversation to discuss the latest HeatWave news, and we're eager to hear any benchmarks relative to AWS or any others. Nipun has been leading the HeatWave engineering team for over 10 years and has over 185 patents in database technology. Welcome back to the show, and good to see you.
>>Thank you. Very happy to be back.
>>Now, for those who might not have kept up with the news, to kick things off, give us an overview of MySQL HeatWave and its evolution so far.
>>So MySQL HeatWave is a fully managed MySQL database service offering from Oracle. Traditionally, MySQL has been designed and optimised for transaction processing. So when customers of MySQL had to run analytics, or when they had to run machine learning, they would extract the data out of MySQL into some other database for doing analytics processing or machine learning processing. MySQL HeatWave provides all these capabilities built into a single database service, which is MySQL HeatWave. So customers of MySQL don't need to move the data out; it stays in the same database.
They can run transaction processing, analytics, mixed workloads, and machine learning, all with very good performance and very good price performance. Furthermore, one of the design points of HeatWave is a scale-out architecture, so the system continues to scale and perform very well, even when customers have very large data sizes.
>>So we've seen some interesting moves by Oracle lately. The collaboration with Azure, we've covered that pretty extensively. What was the impetus here for bringing MySQL HeatWave onto the AWS cloud? What were the drivers that you considered?
>>So one of the observations is that a very large percentage of users of MySQL HeatWave are AWS users who are migrating off Aurora. So already we see that a good percentage of MySQL HeatWave customers are migrating from AWS. However, there are some AWS customers who are still not able to migrate to OCI, to MySQL HeatWave. And the reason is the exorbitant cost of egress charges. So in order to migrate the workload from AWS to OCI, the egress charges are very high fees, which become prohibitive for the customer. The second example we have seen is that the latency of accessing a database which is outside of AWS is very high. So there's a class of customers who would like to get the benefits of MySQL HeatWave but were unable to do so, and with this support of MySQL HeatWave inside of AWS, these customers can now get all the benefits of MySQL HeatWave without having to pay the high fees or without having to suffer from the poor latency, which is because of the AWS architecture.
>>Okay, so you're basically meeting the customers where they are. So was this a straightforward lift and shift from Oracle Cloud Infrastructure to AWS?
>>No, it is not, because one of the design goals we have with MySQL HeatWave is that we want to provide our customers with the best price performance regardless of the cloud.
So when we decided to offer MySQL HeatWave on AWS, we optimised MySQL HeatWave for it as well. So one of the things to point out is that this is a service where the data plane, the control plane, and the console are natively running on AWS. And the benefit of doing so is that now we can optimise MySQL HeatWave for the AWS architecture. In addition to that, we have also announced a bunch of new capabilities as a part of the service, which will also be available to the MySQL HeatWave customers on OCI, but we just announced them and we're offering them as a part of the MySQL HeatWave offering on AWS.
>>So I just want to make sure I understand that it's not like you just wrapped your stack in a container and stuck it into AWS to be hosted. You're saying you're actually taking advantage of the capabilities of the AWS cloud natively? And I think you've made some other enhancements as well that you're alluding to. Can you maybe elucidate on those?
>>Sure. So for starters, we have taken the MySQL HeatWave code and we have optimised it for the AWS infrastructure, with its compute and network. And as a result, customers get very good performance and price performance with MySQL HeatWave in AWS. That's one: performance. The second thing is, we have designed a new interactive console for the service, which means that customers can now provision their instances with the console. But in addition, they can also manage their schemas. They can then run queries directly from the console. Autopilot is integrated into the console. We have introduced performance monitoring. So there are a lot of capabilities which we have introduced as a part of the new console. The third thing is that we have added a bunch of new security features; we have exposed some of the security features which were part of the MySQL Enterprise Edition as a part of the service, which gives customers now a choice of using these features to build more secure applications.
And finally, we have extended MySQL Autopilot for a number of OLTP use cases. In the past, MySQL Autopilot had a lot of capabilities for analytics, and now we have augmented MySQL Autopilot to offer capabilities for OLTP workloads as well.
>>But there was something in your press release called auto thread pooling. It says it provides higher and sustained throughput at high concurrency by determining the optimal number of transactions which should be executed. What is that all about? The auto thread pool, it seems pretty interesting. How does it affect performance? Can you help us understand that?
>>Yes, and this is one of the capabilities I was alluding to, which we have added in MySQL Autopilot for transaction processing. So here is the basic idea. If you have a system where there's a large number of OLTP transactions coming into it at a high degree of concurrency, in many of the existing MySQL-based systems it can lead to a state where there are a few transactions executing, but a bunch of them can get blocked. With Autopilot thread pooling, what we basically do is workload-aware admission control, and what this does is figure out the right scheduling of all of these transactions, so that either the transactions are executing or, as soon as something frees up, they can start executing, so there's no transaction which is blocked. The advantage to the customer of this capability is twofold. A, they get significantly better throughput compared to a service like Aurora at high levels of concurrency. So at high concurrency, for instance, MySQL HeatWave, because of this capability, auto thread pooling, offers up to 10 times higher throughput compared to Aurora. That's the first benefit: better throughput.
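The admission-control idea described above can be illustrated with a minimal Python sketch. This is not Oracle's implementation and does not use any MySQL HeatWave API: the class name `AdmissionController`, its methods, and the fixed `max_active` limit are all hypothetical, chosen only to show the general pattern of capping the number of executing transactions and starting a queued one as soon as a slot frees up. The feature described in the interview is said to determine that number from the workload; the sketch simply assumes it is given.

```python
from collections import deque


class AdmissionController:
    """Toy admission control: run at most `max_active` transactions at once.

    Hypothetical illustration only. The names and the fixed limit are
    assumptions, not MySQL HeatWave's actual API or algorithm.
    """

    def __init__(self, max_active):
        if max_active < 1:
            raise ValueError("max_active must be at least 1")
        self.max_active = max_active
        self.active = set()      # transactions currently executing
        self.waiting = deque()   # queued transactions, in arrival order

    def submit(self, txn_id):
        """Admit the transaction if a slot is free, otherwise queue it."""
        if len(self.active) < self.max_active:
            self.active.add(txn_id)
            return True
        self.waiting.append(txn_id)
        return False

    def finish(self, txn_id):
        """Release a slot and admit the oldest waiting transaction, if any.

        Returns the id of the newly admitted transaction, or None.
        """
        self.active.remove(txn_id)
        if self.waiting:
            next_txn = self.waiting.popleft()
            self.active.add(next_txn)
            return next_txn
        return None


controller = AdmissionController(max_active=2)
admitted = [controller.submit(t) for t in ("t1", "t2", "t3")]
print(admitted)                 # prints [True, True, False]: t3 is queued
print(controller.finish("t1"))  # prints t3: it starts once a slot frees up
```

In this toy version a queued transaction consumes no execution slot while it waits, which is the general reason an admission limit can keep throughput steady as concurrency rises; how the real service picks the limit is not covered here.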
The second advantage is that the throughput of the system never drops, even at high levels of concurrency, whereas in the case of Aurora, the throughput goes up, but then at high concurrency, let's say starting at a level of 500 or something, depending upon the underlying shape they're using, the throughput just drops, whereas with MySQL HeatWave the throughput never drops. Now, the ramification for the customer is that if the throughput is not gonna drop, the user can start off with a small shape, get the performance, and be assured that even if the workload increases, they will never get performance which is worse than what they're getting at lower levels of concurrency. So this leads to customers provisioning a shape which is just right for them. And if they need to, they can go with a larger shape, but they don't, you know, overpay. So those are the two benefits: better performance, and sustained performance regardless of the level of concurrency.
>>So how do we quantify that? I know you've got some benchmarks. Can you share comparisons with other cloud databases? I'm especially interested in Amazon's own databases, which are obviously very popular. And are you publishing those again on GitHub, as you have done in the past? Take us through the benchmarks.
>>Sure. So benchmarks are important because they give customers a sense of what performance to expect and what price performance to expect. So we have run a number of benchmarks. And yes, all these benchmarks are available on GitHub for customers to take a look at. So we have performance results on all three classes of workloads: OLTP, analytics, and machine learning. So let's start with OLTP. For OLTP, primarily because of the auto thread pooling feature, we show that for TPC-C, for a 10-gigabyte dataset, at high levels of concurrency, HeatWave offers up to 10 times better throughput, and this performance is sustained, whereas in the case of Aurora, the performance really drops.
So that's the first thing: for 10 terabytes, sorry, 10 gigabytes, TPC-C at high concurrency, the throughput is 10 times better than Aurora. For analytics, we have done a comparison of MySQL HeatWave in AWS with Redshift, Snowflake, and Google BigQuery. We find that the price performance of MySQL HeatWave compared to Redshift is seven times better. So MySQL HeatWave in AWS provides seven times better price performance than Redshift. That's a very interesting result to us, which means that customers of Redshift are really going to take the service seriously, because they're gonna get seven times better price performance. And this is all running in AWS. So compared
>>Okay, carry on.
>>And then I was gonna say, compared to Snowflake, HeatWave in AWS offers 10 times better price performance. And compared to Google BigQuery, it offers 12 times better price performance. And this is based on a four-terabyte TPC-H workload. Results are available on GitHub. And then the third category is machine learning, and for machine learning training, the performance of MySQL HeatWave is 25 times faster compared to Redshift. So for all three workloads we have benchmark results, and all of these scripts are available on GitHub.
>>Okay, so you're comparing MySQL HeatWave on AWS to Redshift and Snowflake on AWS. And you're comparing MySQL HeatWave on AWS to BigQuery, obviously running on Google. You know, one of the things Oracle has done in the past when you get the price performance, and I've always tried to call fouls, is you, like, double your price for running the Oracle database, not HeatWave, but Oracle Database, on AWS, and then you'll show how it's so much cheaper on Oracle, and I'll be like, okay, come on. But you're not doing that here. You're basically taking MySQL HeatWave on AWS. I presume you're using the same pricing for whatever EC2, whatever else you're using.
Storage, reserved instances. That's apples to apples on AWS. And you have to obviously do some kind of mapping for Google, for BigQuery. Can you just verify that for me?
>>We are being more than fair on two dimensions. The first thing is, when I'm talking about the price performance for analytics with MySQL HeatWave, the cost I'm talking about for MySQL HeatWave is the cost of running transaction processing, analytics, and machine learning. So it's a fully loaded cost in the case of MySQL HeatWave. Whereas when I'm talking about Redshift, when I'm talking about Snowflake, I'm just talking about the cost of these databases for running analytics only. It's not including the source database, which may be Aurora or some other database, right? So that's the first aspect: for HeatWave, it's the cost of running all three kinds of workloads, whereas for the competition, it's only for running analytics. The second thing is that for these other services, whether it's Redshift or Snowflake, we're talking about the one-year, fully-paid-upfront cost, right? So that's what many of the customers would pay: they will sign a one-year contract and pay all the costs ahead of time because they get a discount. So we're using that price, and in the case of Snowflake, the cost we're using is their Standard Edition price, not the Enterprise Edition price. So yes, we are being more than fair in this comparison.
>>Yeah, I think that's an important point. I saw an analysis by Marc Staimer on Wikibon, where he was doing the TCO comparisons. And I mean, if you have to use two separate databases and two separate licences, and you have to do ETL-ing and all the labour associated with that, that's a big deal, and you're not even including that aspect in your comparison. So that's pretty impressive. To what do you attribute that?
You know, given that, unlike OCI, within the AWS cloud you don't have as much control over the underlying hardware.
>>So look, hardware is one aspect. Okay, so there are three things which give us this advantage. The first thing is, we have designed HeatWave for a scale-out architecture. So we came up with new algorithms. One of the design points for HeatWave is a massively partitioned architecture, which leads to a very high degree of parallelism. So that's a lot of how HeatWave is built. That's the first part. The second thing is that, although we don't have control over the hardware, the second design point for HeatWave is that it is optimised for commodity cloud and commodity infrastructure. So we can analyse what the compute is that we get, how much network bandwidth we get, how much object store bandwidth we get in AWS, and we have tuned HeatWave for that. That's the second point. And the third thing is MySQL Autopilot, which provides machine-learning-based automation. So what it does is, as the user's workload is running, it learns from it and improves various parameters in the system. So the system keeps getting better as it learns from more and more queries. And this is the third thing, as a result of which we get a significant edge over the competition.
>>Interesting. I mean, look, any ISV can go on any cloud and take advantage of it. And I love it. We live in a new world. How about machine learning workloads? What did you see there in terms of performance and benchmarks?
>>Right. So machine learning. We offer three capabilities: training, which is fully automated, running inference, and explanations. So one of the things which many of our customers coming from the enterprise told us is that explanations are very important to them, because customers want to know why the system chose a certain prediction.
So we offer explanations for all models which have been trained by HeatWave ML. That's the first thing. Now, one of the interesting things about training is that training is usually the most expensive phase of machine learning. So we have spent a lot of time improving the performance of training. So we have a bunch of techniques which we have developed inside of Oracle to improve the training process. For instance, we have meta-learned proxy models, which really give us an advantage. We use adaptive sampling. We have invented techniques for parallelising the hyperparameter search. So as a result of a lot of this work, our training is about 25 times faster than Redshift ML, and all the data is inside the database. All this processing is being done inside the database, so it's much faster. And I want to point out that there is no additional charge for HeatWave customers because we're using the same cluster. We are not invoking a new service. So all of these machine learning capabilities are being offered at no additional charge inside the database, and at a performance which is significantly faster. >> Are you taking advantage of, or is there any, not need, but any advantage that you can get by exploiting things like Graviton? We've talked about that a little bit in the past. Or Trainium? You just mentioned training, so custom silicon that AWS is doing, are you taking advantage of that? Do you need to? Can you give us some insight there? >> So there are two things, right? We're always evaluating what choices we have from a hardware perspective, obviously, for us to leverage, right? And all the things you mentioned, we have considered them. But there are two things to consider. One is that HeatWave is an in-memory system. So for HeatWave, memory is the dominant cost. The processor is a portion of the cost, but memory is the dominant cost.
So what we have evaluated and found is that the current shape which we are using is going to provide our customers with the best price performance. That's the first thing. The second thing is that there are opportunities at times when we can use a specialised processor for accelerating the workload a bit. But then it becomes a matter of the cost to the customer. The advantage of our current architecture is that on the same hardware, customers are getting very good performance: very good analytics performance and very good machine learning performance. If we go with a specialised processor, it may accelerate machine learning, but then it's an additional cost which the customers would need to pay. So we are very sensitive to the customer's request, which is usually to provide very good performance at a very low cost. And what we feel is that the current design we have is providing customers very good performance and very good price performance. >> So part of that is architectural, the memory intensive nature of HeatWave. The other is AWS pricing. If AWS pricing were to flip, it might make more sense for you to take advantage of something like Trainium. Okay, great. Thank you. And going back to the benchmarks: benchmarks sometimes are artificial, right? A car can go from 0 to 60 in two seconds, but I might not be able to experience that level of performance. Do you have any real world numbers from customers that have used MySQL HeatWave on AWS, and how they look at performance? >> Yes, absolutely. So the MySQL HeatWave service on AWS has been in beta since November, right? So we have a lot of customers who have tried the service. And what we have actually found is that many of these customers are planning to migrate from Aurora to MySQL HeatWave. And what they find is that the performance difference is actually much more pronounced than what I was talking about.
Because with Aurora, the performance is actually much poorer compared to what I've talked about. So in some of these cases, the customers found improvement from 60 times to 240 times, right? So HeatWave was 100 to 240 times faster, and it was much less expensive. And the third thing, which is noteworthy, is that customers don't need to change their applications. So if you ask the top three reasons why customers are migrating, it's because of this: no change to the application, much faster, and it is cheaper. So in some cases, like Johnny Bytes, what they found is that the performance of their applications for the complex queries was about 60 to 90 times faster. Then we had 6D Technologies. What they found is that the performance of HeatWave compared to Aurora was 139 times faster. So, yes, we do have many such examples from real workloads from customers who have tried it. And all across, what we find is that it offers better performance, lower cost and a single database, such that it is compatible with all existing MySQL-based applications and workloads. >> Really impressive. The analysts I talk to, they're all gaga over HeatWave, and I can see why. Okay, last question, maybe two in one. What's next in terms of new capabilities that customers are going to be able to leverage, and any other clouds that you're thinking about? We talked about that upfront, but >> So in terms of the capabilities, you have seen that we have been, you know, non-stop attending to the feedback from the customers and reacting to it. And also, we have been innovating organically. So that's something which is going to continue. So, yes, you can fully expect that we will not rest and will continue to innovate. And with respect to the other clouds, yes, we are planning to support MySQL HeatWave on Azure, and this is something that will be announced in the near future. >> Great. All right, thank you. Really appreciate the overview.
Congratulations on the work. Really exciting news that you're moving MySQL HeatWave into other clouds. It's something that we've been expecting for some time. So it's great to see you guys making that move, and as always, great to have you on the Cube. >> Thank you for the opportunity. >> All right. And thank you for watching this special Cube conversation. I'm Dave Vellante, and we'll see you next time.

Published Date : Sep 14 2022

Video exclusive: Oracle adds more wood to the MySQL HeatWave fire

(upbeat music) >> When Oracle acquired Sun in 2009, it paid $5.6 billion net of Sun's cash and debt. Now I argued at the time that Oracle got one of the best deals in the history of enterprise tech, and I got a lot of grief for saying that because Sun had a declining business, it was losing money, and its revenue was under serious pressure as it tried to hang on for dear life. But Safra Catz understood that Oracle could pare Sun's lower profit and lagging businesses, like its low-end x86 product lines, and even if Sun's revenue was cut in half, because Oracle has such a high revenue multiple as a software company, it could almost instantly generate $25 to $30 billion in shareholder value on paper. In addition, it was a catalyst for Oracle to initiate its highly differentiated engineered systems business, and was actually the precursor to Oracle's Cloud. Oracle saw that it could capture high margin dollars that used to go to partners like HP, its original Exadata partner, and get paid for the full stack across infrastructure, middleware, database, and application software, when it eventually got really serious about cloud. Now there was also a major technology angle to this story. Remember Sun's tagline, "the network is the computer"? Well, they should have just called it cloud. Through the Sun acquisition, Oracle also got a couple of key technologies: Java, the number one programming language in the world, and MySQL, a key ingredient of the LAMP stack, that's Linux, Apache, MySQL and PHP, Perl or Python, on which the internet is basically built, and is used by many cloud services like Facebook, Twitter, WordPress, Flickr, Amazon Aurora, and many other examples, including, by the way, MariaDB, which is a fork of MySQL created by MySQL's creator, basically in protest to Oracle's acquisition; the drama is Oscar worthy. It gets even better.
In 2020, Oracle began introducing a new version of MySQL called MySQL HeatWave, and since late 2020 it's been in sort of a super cycle, rolling out three new releases in less than a year and a half in an attempt to expand its TAM and compete in new markets. Now we covered the release of MySQL Autopilot, which uses machine learning to automate management functions. And we also covered the benchmarketing that Oracle produced against Snowflake, AWS, Azure, and Google. And Oracle's at it again with HeatWave, adding machine learning into its database capabilities, along with previously available integrations of OLAP and OLTP. This, of course, is in line with Oracle's converged database philosophy, which, as we've reported, is different from other cloud database providers, most notably Amazon, which takes the right tool for the right job approach and chooses database specialization over a one size fits all strategy. Now we've asked Oracle to come on theCUBE and explain these moves, and I'm pleased to welcome back Nipun Agarwal, who's the senior vice president for MySQL Database and HeatWave at Oracle. And today, in this video exclusive, we'll discuss machine learning, other new capabilities around elasticity and compression, and then any benchmark data that Nipun wants to share. Nipun's been a leading advocate of the HeatWave program. He's led engineering in that team for over 10 years, and he has over 185 patents in database technologies. Welcome back to the show, Nipun. Great to see you again. Thanks for coming on. >> Thank you, Dave. Very happy to be back. >> Yeah, now for those who may not have kept up with the news, maybe to kick things off you could give us an overview of what MySQL HeatWave actually is so that we're all on the same page. >> Sure, Dave. MySQL HeatWave is a fully managed MySQL database service from Oracle, and it has a built-in query accelerator called HeatWave, and that's the part which is unique.
So with MySQL HeatWave, customers of MySQL get a single database which they can use for transactional processing, for analytics, and for mixed workloads, because traditionally MySQL has been designed and optimized for transaction processing. So in the past, when customers had to run analytics with the MySQL based service, they would need to move the data out of MySQL into some other database for running analytics. So they would end up with two different databases and it would take some time to move the data out of MySQL into this other system. With MySQL HeatWave, we have solved this problem and customers now have a single MySQL database for all their applications, and they can get the good performance of analytics without any changes to their MySQL application. >> Now it's no secret that a lot of times, you know, queries are not, you know, most efficiently written, and critics of MySQL HeatWave will claim that this product is very memory and cluster intensive, it has a heavy footprint that adds to cost. How do you answer that, Nipun? >> Right, so for offering any database service in the cloud there are two dimensions, performance and cost, and we have been very cognizant of both of them. So it is indeed the case that HeatWave is an in-memory query accelerator, which is why we get very good performance, but it is also the case that we have optimized HeatWave for commodity cloud services. So for instance, we use the least expensive compute. We use the least expensive storage. So what I would suggest is, for the customers who would like to know what is the price performance advantage of HeatWave compared to any database: we have benchmarked against Redshift, Snowflake, Google BigQuery, and Azure Synapse, and HeatWave is significantly faster and significantly lower price on a multitude of workloads. So not only is it an in-memory database and optimized for that, but we have also optimized it for commodity cloud services, which makes it much lower price than the competition.
>> Well, at the end of the day, it's customers that sort of decide what the truth is. So to date, what's been the customer reaction? Are they moving from other clouds, from on-prem environments? Both? Why? You know, what are you seeing? >> Right, so we are definitely seeing a whole bunch of migrations of customers who are running MySQL on-premise to the cloud, to MySQL HeatWave. That's definitely happening. What is also very interesting is we are seeing that a very large percentage of customers, more than half the customers who are coming to MySQL HeatWave, are migrating from other clouds. We have a lot of migrations coming from AWS Aurora, migrations from Redshift, migrations from RDS MySQL, Teradata, SAP HANA, right. So we are seeing migrations from a whole bunch of other databases and other cloud services to MySQL HeatWave. And the main reasons we are told why customers are migrating from other databases to MySQL HeatWave are lower cost, better performance, and no change to their application, because many of these services, like AWS Aurora are ETL compatible with MySQL. So when customers try MySQL HeatWave, not only do they get better performance at a lower cost, but they find that they can migrate their application without any changes, and that's a big incentive for them. >> Great, thank you, Nipun. So can you give us some names? Are there some real world examples of these customers that have migrated to MySQL HeatWave that you can share? >> Oh, absolutely, I'll give you a few names. Estuda.com, this is an educational SaaS provider based out of Brazil. They were using Google BigQuery, and when they migrated to MySQL HeatWave, they found a 300X, right, 300 times improvement in performance, and it lowered their cost by 85 (audio cut out). Another example is Neovera.
They offer cybersecurity solutions and they were running their application on an on-premise version of MySQL. When they migrated to MySQL HeatWave, their application improved in performance by 300 times and their cost reduced by 80%, right. So by going from on-premise to MySQL HeatWave, they reduced the cost by 80% and improved performance by 300 times. VRGlass, another customer based out of Brazil: they were running on AWS EC2, and when they migrated, within hours they found that there was a significant improvement, like, you know, over 5X improvement in database performance, and they were able to accommodate a very large virtual event, which had more than a million visitors. Another example, Genius Sonority. They are a game designer in Japan, and when they moved to MySQL HeatWave, they found a 90 times improvement in performance. And there are many, many more, like a lot of migrations, again, from like, you know, Aurora, Redshift and many other databases as well. And consistently what we hear is (audio cut out) getting much better performance at a much lower cost without any change to their application. >> Great, thank you. You know, when I ask that question, a lot of times I get, "Well, I can't name the customer name," but I got to give Oracle credit, a lot of times you guys have them at your fingertips. So you're not the only one, but it's somewhat rare in this industry. So, okay, so you got some good feedback from those customers that did migrate to MySQL HeatWave. What else did they tell you that they wanted? Did they, you know, kind of share a wishlist and some of the white space that you guys should be working on? What'd they tell you? >> Right, so as customers are moving more data into MySQL HeatWave, as they're consolidating more data into MySQL HeatWave, customers want to run other kinds of processing with this data.
A very popular one is (audio cut out) So we have had multiple customers who told us that they wanted to run machine learning with data which is stored in MySQL HeatWave, and for that they have to extract the data out of MySQL (audio cut out). So that was the first feedback we got. Second thing is MySQL HeatWave is a highly scalable system. What that means is that as you add more nodes to a HeatWave cluster, the performance of the system improves almost linearly. But currently customers need to perform some manual steps to add nodes to a cluster or to reduce the cluster size. So that was other feedback we got, that people wanted this thing to be automated. Third thing is that we have shown in the previous results that HeatWave is significantly faster and significantly lower price compared to competitive services. So we got feedback from customers asking, can we trade off some performance to get even lower cost, and that's what we have looked at. And then finally, we have some results on various data sizes with TPC-H. Customers wanted to see if we can offer some more data points as to how HeatWave performs on other kinds of workloads. And that's what we've been working on for the last several months. >> Okay, Nipun, we're going to get into some of that, but, so how did you go about addressing these requirements? >> Right, so the first thing is we are announcing support for in-database machine learning, meaning that customers who have their data inside MySQL HeatWave can now run training, inference, and prediction all inside the database without the data or the model ever having to leave the database. So that's how we address the first one. Second thing is we are offering support for real time elasticity, meaning that customers can scale up or scale down to any number of nodes. This requires no manual intervention on part of the user, and for the entire duration of the resize operation, the system is fully available.
The third, in terms of the costs, we have doubled the amount of data that can be processed per node. So if you look at a HeatWave cluster, the size of the cluster determines the cost. So by doubling the amount of data that can be processed per node, we have effectively reduced the cluster size which is required for running a given workload to half, which means it reduces the cost to the customer by half. And finally, we have also run the TPC-DS workload on HeatWave and compared it with other vendors. So now customers can have another data point in terms of the performance and the cost comparison of HeatWave with other services. >> All right, and I promise, I'm going to ask you about the benchmarks, but I want to come back and drill into these a bit. How is HeatWave ML different from competitive offerings? Take for instance, Redshift ML, for example. >> Sure, okay, so this is a good comparison. Let's start with, let's say Redshift ML. There are some systems like, you know, Snowflake, which don't even offer any, like, processing of machine learning inside the database, and they expect customers to write a whole bunch of code, in say Python or Java, to do machine learning. Redshift ML does have integration with SQL. That's a good start. However, when customers of Redshift need to run machine learning, and they invoke Redshift ML, it makes a call to another service, SageMaker, right, so the data needs to be exported to a different service. The model is generated, and the model is also outside Redshift. With HeatWave ML, the data resides always inside the MySQL database service. We are able to generate models. We are able to train the models, run inference, run explanations, all inside the MySQL HeatWave service. So the data, or the model, never have to leave the database, which means that both the data and the models can now be secured by the same access control mechanisms as the rest of the data. So that's the first part, that there is no need for any ETL.
The second aspect is the automation. Training is a very important part of machine learning, right, and it impacts the quality of the predictions and such. So traditionally, customers would employ data scientists to influence the training process so that it's done right. And even in the case of Redshift ML, the users are expected to provide a lot of parameters to the training process. So the second thing which we have worked on with HeatWave ML is that it is fully automated. There is absolutely no user intervention required for training. Third is in terms of performance. So one of the things we are very, very sensitive to is performance because performance determines the eventual cost to the customer. So again, in some benchmarks, which we have published, and these are all available on GitHub, we are showing how HeatWave ML is 25 times faster than Redshift ML, and here's the kicker, at 1% of the cost. So four benefits: the data all remains secure inside the database service, it's fully automated, much faster, and much lower cost than the competition. >> All right, thank you Nipun. Now, so there's a lot of talk these days about explainability and AI. You know, the system can very accurately tell you that it's a cat, you know, or for you Silicon Valley fans, it's a hot dog or not a hot dog, but they can't tell you how the system got there. So what is explainability, and why should people care about it? >> Right, so when we were talking to customers about what they would like from a machine learning based solution, one piece of feedback we got is that enterprises are a little slow or averse to taking up machine learning, because it seems to be, you know, like magic, right? And enterprises have the obligation to be able to explain, or to provide an answer to their customers as to why the database made a certain choice. With a rule based solution it's simple, it's a rule based thing, and you know what the logic was.
So the reason explanations are important is because customers want to know why the system made a certain prediction. One of the important characteristics of HeatWave ML is that any model which is generated by HeatWave ML can be explained, and we can do both global explanations or model explanations as well as local explanations. So when the system makes a specific prediction using HeatWave ML, the user can find out why the system made such a prediction. So for instance, if someone is being denied a loan, the user can figure out what were the attributes, what were the features which led to that decision. So this ensures, like, you know, fairness, and many of the times there is also a need for regulatory compliance where users have a right to know. So we feel that explanations are very important for enterprise workloads, and that's why every model which is generated by HeatWave ML can be explained. >> Now I got to give Snowflake some props, you know, this whole idea of separating compute from storage, but also bringing the database to the cloud and driving elasticity. So that's been a key enabler and has solved a lot of problems, in particular the snake swallowing the basketball problem, as I often say. But what about elasticity and elasticity in real time? How is your version, and there's a lot of companies chasing this, how is your approach to an elastic cloud database service different from what others are promoting these days? >> Right, so a couple of characteristics. One is that we have now fully automated the process of elasticity, meaning that if a user wants to scale up or scale down, the only thing they need to specify is the eventual size of the cluster and the system completely takes care of it transparently. But then there are a few characteristics which are very unique. So for instance, we can scale up or scale down to any number of nodes.
Whereas in the case of Snowflake, the number of nodes someone can scale up or scale down to are the powers of two. So if a user needs 70 CPUs, well, their choice is either 64 or 128. So by providing this flexibility with MySQL HeatWave, customers get a custom fit. So they can get a cluster which is optimized for their specific workload. So that's the first thing, flexibility of scaling up or down to any number of nodes. The second thing is that after the operation is completed, the system is fully balanced, meaning the data across the various nodes is fully balanced. That is not the case with many solutions. So for instance, in the case of Redshift, after the resize operation is done, the user is expected to manually balance the data, which can be very cumbersome. And the third aspect is that while the resize operation is going on, the HeatWave cluster is completely available for queries, for DMLs, for loading more data. That is, again, not the case with Redshift. With Redshift, suppose the operation takes 10 to 15 minutes: during that window of time, the system is not available for writes, and for a big part of that chunk of time, the system is not even available for queries, which is very limiting. So the advantages we have are that it is fully flexible, the system is in a balanced state, and the system is completely available for the entire duration of the operation. >> Yeah, I guess you got that hypergranularity, which, you know, sometimes they say, "Well, t-shirt sizes are good enough," but then I think of myself, some t-shirts fit me better than others, so. Okay, I saw on the announcement that you have this lower price point for customers. How did you actually achieve this? Could you give us some details around that please? >> Sure, so there are two things we are announcing in this service which lower the cost for the customers. The first thing is that we have doubled the amount of data that can be processed by a HeatWave node.
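To make the sizing point above concrete (needing 70 CPUs when only powers of two are on offer), here is a minimal arithmetic sketch in Python. It models nothing about any vendor's real provisioning API; the function names are hypothetical and the figure of 70 is simply the number quoted in the interview.

```python
def power_of_two_sizes(required: int) -> tuple[int, int]:
    """Return the nearest power-of-two sizes at or below and at or above `required`."""
    if required < 1:
        raise ValueError("required must be a positive integer")
    lower = 1 << (required.bit_length() - 1)   # largest power of two <= required
    upper = lower if lower == required else lower * 2
    return lower, upper


def overprovision_pct(required: int, provisioned: int) -> float:
    """Percentage of capacity provisioned beyond what is required."""
    return 100.0 * (provisioned - required) / required


if __name__ == "__main__":
    need = 70  # figure taken from the interview
    lower, upper = power_of_two_sizes(need)
    print(f"power-of-two choices for {need}: {lower} or {upper}")
    print(f"choosing {upper} overprovisions by {overprovision_pct(need, upper):.1f}%")
    print(f"an exact fit of {need} overprovisions by {overprovision_pct(need, need):.1f}%")
```

Under these assumptions, the smaller choice falls short of the requirement and the larger one pays for capacity that goes unused, which is the "custom fit" argument in numeric form.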
So if we have doubled the amount of data which can be processed by a node, the cluster size which is required by customers reduces to half, and that's why the cost drops to half. The way we have managed to do this is by two things. One is support for Bloom filters, which reduces the amount of intermediate memory. And second is we compress the base data. So these are the two techniques we have used to process more data per node. The second way by which we are lowering the cost for the customers is by supporting pause and resume of HeatWave. Many times you find that customers of HeatWave and other services want to run some queries or some workloads for some duration of time, but then they don't need the cluster for a few hours. Now with the support for pause and resume, customers can pause the cluster and the HeatWave cluster instantaneously stops. And when they resume, not only do we fetch the data at a very, like, you know, quick pace from the object store, but we also preserve all the statistics which are used by Autopilot. So both the data and the metadata are fetched extremely fast from the object store. So with these two capabilities we feel that it'll drive down the cost to our customers even more. >> Got it, thank you. Okay, I promised I was going to get to the benchmarks. Let's have it. How do you compare with others, but specifically cloud databases? I mean, and how do we know these benchmarks are real? My friends at EMC, they were back in the day, they were brilliant at doing benchmarks. They would produce these beautiful PowerPoint charts, but it was kind of opaque, but what do you say to that? >> Right, so there are multiple things I would say. The first thing is that this time we have published two benchmarks, one is for machine learning and other is for SQL analytics. All the benchmarks, including the scripts which we have used, are available on GitHub.
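The interview mentions Bloom filters as a way to reduce intermediate memory but does not say how HeatWave implements them. The following is only a generic textbook sketch in Python, with a hypothetical `prefilter` helper, showing the general idea: build a small bit array from the join keys of one table, then use it to discard rows of the other table that cannot possibly match before any intermediate result is materialized.

```python
import hashlib
from typing import Iterable, Iterator


class BloomFilter:
    """Generic Bloom filter: no false negatives, a tunable rate of false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def prefilter(probe_rows: Iterable[tuple], build_keys: Iterable[int]) -> list[tuple]:
    """Hypothetical helper: keep only probe rows whose first column might join."""
    bloom = BloomFilter()
    for key in build_keys:
        bloom.add(str(key))
    return [row for row in probe_rows if bloom.might_contain(str(row[0]))]


if __name__ == "__main__":
    orders = [(1, "a"), (2, "b"), (99, "z")]   # illustrative data only
    customers = [1, 2, 3]
    print(prefilter(orders, customers))
```

Rows that do match are never dropped; a non-matching row may occasionally slip through as a false positive and is then rejected by the real join, so the filter trades a small fixed amount of memory for fewer rows carried into the intermediate result.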
So we have full transparency, and we invite and encourage customers or other service providers to download the scripts, to download the benchmarks and see if they get any different results, right. So what we are seeing, we have published it for other people to try and validate. That's the first part. Now for machine learning, there hasn't been a precedent for enterprise benchmarks so we talk about aiding open data sets and we have published benchmarks for those, right? So both for classification, as well as for regression, we have run the training times, and that's where we find that HeatWave ML is 25 times faster than Redshift ML at one percent of the cost. So fully transparent, available. For SQL analytics, in the past we have shown comparisons with TPC-H. So we would show TPC-H across various databases, across various data sizes. This time we decided to use TPC-DS. The advantage of TPC-DS over TPC-H is that it has a larger number of queries, the queries are more complex, the schema is more complex, and there is a lot more data skew. So it represents a different class of workloads, which is very interesting. So these are queries derived from the TPC-DS benchmark. So the numbers we have published this time are for 10 terabyte TPC-DS, and we are comparing with all the four major services, Redshift, Snowflake, Google BigQuery, Azure Synapse. And in all the cases, HeatWave is significantly faster and significantly lower priced. Now one of the things I want to point out is that when we are doing the cost comparison with other vendors, we are being overly fair. For instance, the cost of HeatWave includes the cost of both the MySQL node as well as the HeatWave node, and with this setup, customers can run transaction processing, analytics, as well as machine learning. So the price captures all of it. Whereas with the other vendors, the comparison is only for the analytic queries, right?
So if customers wanted to run OLTP, you would need to add the cost of that database. Or if customers wanted to run machine learning, you would need to add the cost of that service. Furthermore, in the case of HeatWave, we are quoting the pay-as-you-go price, whereas for other vendors like, you know, Redshift, where applicable, we are quoting the one year, fully paid upfront cost. So it's, you know, a very fair comparison. So in terms of the numbers though, price performance for TPC-DS, we are about 4.8 times better price performance compared to Redshift. We are 14.4 times better price performance compared to Snowflake, 13 times better than Google BigQuery, and 15 times better than Synapse. So across the board, we are significantly faster and significantly lower priced. And as I said, all of these scripts are available on GitHub for people to try for themselves. >> Okay, all right, I get it. So I think what you're saying is, you could have said this is what it's going to cost for you to do both analytics and transaction processing on a competitive platform versus what it takes to do that on Oracle MySQL HeatWave, but you're not doing that. You're saying, let's take them head on in their sweet spot of analytics, or OLTP separately, and you're saying you still beat them. Okay, so you got this one database service in your cloud that supports transactions and analytics and machine learning. How much do you estimate you're saving companies with this integrated approach versus the alternative of kind of what I called upfront the right tool for the right job, and admittedly having to use ETL tools? How can you quantify that? >> Right, so, okay. The numbers I quoted, right: at the end of the day, in a cloud service, price performance is the metric which gives a sense as to how much the customers are going to save. 
So for instance, for a TPC-DS workload, if we are 14 times better price performance than Snowflake, it means that our cost is going to be 1/14th of what customers would pay for Snowflake. Now, in addition, there are other costs in terms of migrating the data, having to manage two different databases, having to pay for another service for, you know, machine learning. That's all extra, and that depends upon what tools customers are using or what other services they're using for transaction processing or for machine learning. But these numbers themselves, right, they're very, very compelling. If we are 1/5th the cost of Redshift, right, or 1/14th of Snowflake, these numbers themselves are very, very compelling. And that's the reason we are seeing so many of these migrations from these databases to MySQL HeatWave. >> Okay, great, thank you. Our last question: in the Q3 earnings call for fiscal 22, Larry Ellison said that "MySQL HeatWave is coming soon on AWS," and that caught a lot of people's attention. That's not like Oracle. I mean, people might say maybe that's an indication that you're not having success moving customers to OCI. So you got to go to other clouds, which by the way I applaud, but any comments on that? >> Yep, this is very much like Oracle. So if you look at one of the big reasons for the success of the Oracle database and why Oracle database is the most popular database, it is because Oracle database runs on all the platforms, and that has been the case from day one. So very akin to that, the idea is that there's a lot of value in MySQL HeatWave, and we want to make sure that we can offer the same value to the customers of MySQL running on any cloud, whether it's OCI, whether it's AWS, or any other cloud. 
So this shows how confident we are in our offering, and we believe that in other clouds as well, customers will find significant advantage by having a single database which is much faster and much lower priced than the alternatives they currently have. So this shows how confident we are about our products and services. >> Well, that's great. I mean, obviously for you, you're in the MySQL group. You love that, right? The more places you can run, the better it is for you, of course, and your customers. Okay, Nipun, we got to leave it there. As always it's great to have you on theCUBE, really appreciate your time. Thanks for coming on and sharing the new innovations. Congratulations on all the progress you're making here. You're doing a great job. >> Thank you, Dave, and thank you for the opportunity. >> All right, and thank you for watching this CUBE conversation with Dave Vellante for theCUBE, your leader in enterprise tech coverage. We'll see you next time. (upbeat music)
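
The conversation above names Bloom filters as one of the two techniques used to cut intermediate memory, but does not describe how HeatWave applies them. As a rough, generic illustration only (not HeatWave's implementation), the sketch below shows one common use: building a Bloom filter over the smaller side of a join so that rows on the larger side that cannot possibly match are discarded before they occupy intermediate memory. The names `BloomFilter` and `filtered_join`, the bit-array size, and the hash count are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with several hash functions.

    A lookup can return a false positive, but never a false negative.
    """

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def filtered_join(build_rows, probe_rows):
    """Join two lists of (key, value) rows on key.

    Probe rows whose key is definitely absent from the build side are dropped
    early, so fewer rows are kept as intermediate state.
    Returns the joined rows and the number of probe rows that survived.
    """
    bloom = BloomFilter()
    build_index = {}
    for key, value in build_rows:
        bloom.add(key)
        build_index.setdefault(key, []).append(value)

    survivors = [row for row in probe_rows if bloom.might_contain(row[0])]
    # A false positive only costs an empty lookup here; results stay correct.
    joined = [
        (key, build_value, probe_value)
        for key, probe_value in survivors
        for build_value in build_index.get(key, [])
    ]
    return joined, len(survivors)
```

The design trade-off is the usual one for Bloom filters: a larger bit array lowers the false-positive rate (fewer useless rows survive the filter) at the price of more memory for the filter itself.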

Published Date : Mar 29 2022

Video Exclusive: Oracle Announces New MySQL HeatWave Capabilities

(bright music) >> Surprising many people, including myself, Oracle last year began investing pretty heavily in the MySQL space. Now those investments continue today. Let me give you a brief history. Last December, Oracle made its first HeatWave announcement, which converged OLTP and OLAP together in a single MySQL database. Now, what wasn't surprising was the approach Oracle took. They leveraged hardware to improve the performance and lower the cost. You see, when Oracle acquired Sun more than a decade ago, rather than rely on loosely coupled partnerships with hardware vendors to speed up its databases, Oracle set out on a path to tightly integrate hardware and software innovations using its own in-house engineering. So with its first MySQL HeatWave announcement, Oracle leaned heavily on developing software on top of an in-memory database technology to create an embedded OLAP capability that eliminates the need to ETL data from a transaction system into a separate analytics database. Now in doing so, Oracle took a similar approach with MySQL as it does for its mainstream Oracle database. And today extends that. And what I mean by that is it's converging capabilities in a single platform. So the argument is this simplifies and accelerates analytics, lowers the cost, and allows analytics to be run on data that is more fresh. Now, as many of you know, this is a different strategy than how, for example, AWS approaches database, where it creates purpose-built database services targeted at specific workloads. These are philosophical design decisions made for a variety of reasons, but it's very clear which direction Oracle is headed in. Today, Oracle continues its HeatWave announcement cadence with a focus on increased automation as well. The company is continuing the trend of using clustering technology to scale out for both performance and capacity. 
And again, that theme of marrying hardware with software. Oracle is also making announcements that focus on security. Hello everyone and welcome to this video exclusive. This is Dave Vellante. We're going to dig into these capabilities. Nipun Agarwal is here. He's VP of MySQL HeatWave and advanced development at Oracle. Nipun has been leading the MySQL and HeatWave development effort for nearly a decade. He's got 180 patents to his name, about half of which are associated with HeatWave. Nipun, welcome back to the show. Great to have you. >> Thank you, Dave. >> So before we get into the new news, if we could, maybe you could give us all a quick overview of HeatWave again, and what problems you originally set out to solve with it? >> Sure. So HeatWave is an in-memory query accelerator for MySQL. Now, as most people are aware, MySQL was originally designed and optimized for transactional processing. So when customers had the need to run analytics, they would need to extract data from the MySQL database into another database and run analytics. With MySQL HeatWave, customers get a single database, which can be used both for transactional processing and for analytics. There's no need to move the data from one database to another database, and all existing tools and applications which are compatible with MySQL continue to work as is. So it's an in-memory query accelerator for MySQL, and this is significantly faster than any version of MySQL database. And also it's much faster than specialized databases for analytics. >> Yeah, we're going to talk about that. And so obviously when you made the announcement last December, you had, I'm sure, a core group of early customers and beta customers, but then you opened it up to the world. So what was the reaction once you exposed that to customers? >> The reaction has been very positive, Dave. So initially we were thinking that there were going to be a lot of customers who are on-premise users of MySQL, who are going to migrate to the service. 
And surely that was the case. But the part which was very interesting and surprising is that we see many customers who are migrating from other cloud vendors or migrating from other cloud services to MySQL HeatWave. And most notably, the biggest number of migrations we are seeing are from AWS Aurora and AWS RDS. >> Interesting. Okay. I wonder if you've got other feedback. You're obviously responding in a pretty fast cadence here, you know, a seven, eight month cadence. What is the feedback that you got? Were there gaps that customers wanted you to close? >> Sure. Yes. So as customers started moving to HeatWave, they found that HeatWave is much faster, much cheaper. And when it's so much faster, they told us that there are some classes of queries which could just not run earlier, which they can now run with HeatWave. So it makes the applications richer because they can write new classes of queries which they could not in the past. But in terms of the feedback or enhancement requests we got, I would categorize them like this: number one was automation. When customers move their database from on-premise to the cloud, they expect more automation. So that was the number one thing. The second thing was people wanted the ability to run analytics on larger sizes of data with MySQL HeatWave, because they liked what they saw and they wanted us to increase the data size limit which can be processed by HeatWave. The third one was they wanted more classes of queries to be accelerated with HeatWave. Initially, when we went out, HeatWave was designed to be an accelerator for analytic queries, but more and more customers started seeing the benefit beyond just analytics, more towards mixed workloads. So that was the third request. And then finally they wanted us to scale to a larger cluster size. And that's what we have done over the last several months, incorporating this feedback which we've gotten from customers. 
>> So you're addressing those gaps. And thank you for sharing that with us. I got the press release here. I wonder if we could kind of go through these. Let's start with AutoPilot, you know, what's that all about? What's different about AutoPilot? >> That's right. So MySQL AutoPilot provides machine learning based automation. So the first difference is that not only is it automating things, where, as a cloud provider, as a service provider, we feel there are a lot of opportunities for us to automate, but the big difference about the approach we've taken with MySQL AutoPilot is that it's all driven based on the data and the queries. It's machine learning based automation. That's the first aspect. The second thing is this is all done natively in the server, right? So we are enhancing the MySQL engine. We're enhancing the HeatWave engine, and that's where all the logic and all the processing resides. In order to do this, we have had to collect new kinds of data. So for instance, in the past, people would collect statistics which are based on just the data. Now we also collect statistics based on queries, for instance, what is the compilation time? What is the execution time? And we have augmented this with new machine learning models. And finally we have made a lot of innovations, a lot of inventions in the process, where we collect data in a smart way. We process data in a smart way, and the machine learning models we are talking about also have a lot of innovation. And that's what gives us an edge over what other vendors may try to do. >> Yeah. I mean, I'm just, again, I'm looking at this pretty meaty press release. Auto-provisioning, auto parallel load, auto data placement, auto encoding, auto error recovery, auto scheduling, and you know, using a lot of, you know, computer science techniques that are well-known, first in first out, auto change propagation. 
So really focusing on driving that automation for customers. The other piece of it that struck me, and I said this in my intro, is, you know, using clustering technology. Clustering technology has been around for a long time, as has in-memory database, but applying it and integrating it. My sense is that's really about scale and performance and taking advantage, of course, of cloud, being able to drive that scale instantaneously. But talk about scale a little bit in your philosophy there, and why so much emphasis on scalability? >> Right. So what we want to do is to provide the fastest engine for running analytics. And that's why we do the processing in memory. Now, one of the issues with in-memory processing is that the amount of data which you're processing has to reside in memory. So when we went out in version one, given the footprint of the MySQL customers we spoke to, we thought 12 terabytes of processing at any given point in time would be adequate. In the very first month, we got feedback that customers wanted us to process larger amounts of data with HeatWave, because they really liked what they saw and they wanted us to increase it. So we have increased it from 12 terabytes to 32 terabytes, and in order to do so, we now have a HeatWave cluster which can be up to 64 nodes. That's one aspect, on the query processing side. Now to answer the question as to why so much of an emphasis: it's because this is something which is extremely difficult to do in query processing, that as you scale the size of the cluster, the kind of algorithms, the kind of techniques you have to use so that you achieve a very high efficiency with a very large cluster. 
These are things which are not easy to do. What we want to make sure is that as customers have the need for processing larger amounts of data, well, one of the big benefits customers get by using a cloud as opposed to on-premise is that they don't need to worry about provisioning gear ahead of time. So if they have more data, with the cloud they should be able to process more data easily. But when they process more data, they should expect the same kind of performance. So the same kind of efficiency on a larger data size, similar to a smaller data size. And this is something traditionally other database vendors have struggled to provide. So this is an important problem. This is a tough engineering problem. And that's why there's a lot of emphasis on this, to make sure that we provide our customers with very high efficiency of processing as they increase the size of the data. >> You're saying, traditionally, you'll get diminishing returns as you scale. So as the volume grows, you're not able to take as much advantage, or you're less efficient. And you're saying you've largely solved that problem. I mean, people always talk about scaling linearly and I'm always skeptical, but you're saying, especially in database, that's been a challenge, but you're saying you've solved that problem largely. >> Right. What I would say is that we have a system which is very efficient, more efficient than, you know, any of the databases we are aware of. So as you said, perfect scaling is hard to achieve, right? I mean, that's a theoretical limit of scale factor one. That's very hard to achieve. We are now close to 90% efficiency for end-to-end queries. This is not for primitives. This is for end-to-end queries, both on industry benchmarks as well as real world customer workloads. So this 90% efficiency we believe is very good and higher than what many of the vendors provide. >> Yeah. Right. 
So it's not just primitives, it's the whole end-to-end cycle. I think 0.89 was the number that I saw, just to be technically correct there, but that's pretty good. Now let's talk about the benchmarks. It wouldn't be an Oracle announcement without some benchmarks. So you laid out today in your announcement some pretty outstanding performance and price performance numbers. I feel like it's a badge of honor. If Oracle calls me out, I feel like I'm doing well. You called out Snowflake and Amazon. So maybe you could go over those benchmark results so we could peel the onion on that a little bit. >> Right. So the first thing to realize is that we want to have benchmarks which are credible, right? So it's not the case that we have taken some specific unique workloads where HeatWave shines. That's not the case. What we did was we took an industry standard benchmark, which is, you know, TPC-H. And furthermore, we had a third party, independent firm do this comparison. So let's first compare with Snowflake. On a 10 terabyte TPC-H benchmark, HeatWave is seven times faster and one fifth the cost. So with this, it is 35 times better price performance compared to Snowflake, right? So seven times faster than Snowflake and one fifth of the cost. So HeatWave is 35 times better price performance compared to Snowflake. Not just that, Snowflake only does analytics, whereas MySQL HeatWave does both transactional processing and analytics. It's not a specialized database. MySQL HeatWave is a general purpose database, which can do both OLTP and analytics, whereas Snowflake can only do analytics. So to be 35 times more efficient than a database service which is specialized only for one case, which is analytics, we think it's pretty good. So that's a comparison with Snowflake. >> So you're using, I presume you got to be using list prices for that, obviously. >> That is correct. 
>> So there's discounts. Let's put that into context of maybe 35X better. You're not going to get that kind of discount, I wouldn't think. >> That is correct. >> Okay. What about Redshift? Aqua for Redshift has gained a lot of momentum in the marketplace. How do you compare against that? >> Right. So we did a comparison with Redshift Aqua, same benchmark, 10 terabytes, TPC-H. And again, this was done by a third party. Here, HeatWave is six and a half times faster at half the cost. So HeatWave is 13 times better price performance compared to Redshift Aqua. And the same thing for Redshift. It's a specialized database only for analytics. So customers need to have two databases, one for transaction processing, one for analytics, with Redshift. Whereas with MySQL HeatWave, it's a single database for both. And it is so much faster than Redshift. That again, we feel, is pretty remarkable. >> Now, you mentioned earlier, but I presume you're not cheating here. You're not including the cost of the transaction processing data store, right? We're ignoring that for a minute. Ignoring that you got to, you know, move data, ETL. We're just talking about like for like, is that correct? >> Right. This is an extremely fair and extremely generous comparison. Not only are we not including the cost of the source OLTP database, the cost in the case of the Redshift I'm talking about is the cost for one year paid fully upfront. So this is the best pricing a customer can get for a one year subscription with Redshift. Whereas when I'm talking about HeatWave, this is the pay-as-you-go price. And the third aspect is, this is Redshift when it is completely, fully optimized. I don't think anyone else can get much better numbers on Redshift than we have. Right? So a fully optimized configuration of Redshift, looking at the one year pre-pay cost of Redshift, and not including the source database. >> Okay. 
And then speaking of transaction processing database, what about Aurora? You mentioned earlier that you're seeing a lot of migration from Aurora. Can you add some color to that? >> Right. And this is a very interesting question, and it was a very interesting observation for us. When we did the launch back in December, we had numbers on four terabytes TPC-H with Aurora. So if you look at the same benchmark, four terabytes TPC-H, HeatWave is 1,400 times faster than Aurora at half the cost, which makes it 2,800 times better price performance compared to Aurora. So a very good number. What we have found is that many customers who are running on Aurora started migrating to HeatWave, and these customers had a mix of transaction processing and analytics, and the data sizes are much smaller. Even those customers found that there was a significant improvement in performance and reduction in costs when they migrated to HeatWave. In the announcement today, many of the references are that class of customers. So for that, we decided to choose another benchmark, which is called CH-benchmark, on a much smaller data size. And even for that, even for mixed workloads, we find that HeatWave is 18 times faster and provides over a hundred times higher throughput than Aurora at 42% of the cost. So in terms of price performance gain, it is much, much better than Aurora, even for mixed workloads. And then if you consider pure OLTP, assume you have an application which has only OLTP, which by the way is, you know, a very uncommon scenario, but even if that were to be the case, for pure OLTP only, MySQL HeatWave is at par with Aurora with respect to performance, but MySQL HeatWave costs 42% of Aurora. So the point is that in the whole spectrum, pure OLTP, mixed workloads, or analytics, MySQL HeatWave is going to be a fraction of the cost of Aurora. And depending upon your query workload, your acceleration can be anywhere from 1,400 times to 18 times faster. 
>> That's interesting. I mean, you've been at this for the better part of a decade, because my sense is that HeatWave is all about OLAP. And that's really where you've put the majority, if not all, of the innovation. But you're saying just coming into December's announcement, you were at par in a rare, but hypothetical, OLTP workload. >> That is correct. >> Yeah. >> Well, you know, I got to push you still on this, because a lot of times these benchmarks are a function of the skills of the individuals performing these tests, right? So if I want to run them myself, you know, if you publish these benchmarks, what if a customer wants to replicate these tests and try to see if they can tune up, you know, Redshift better than you guys did? >> Sure. So I'll say a couple of things. One is all the numbers which I'm talking about, both for Redshift and Snowflake, were done by a third party firm, and for all the numbers we are talking about, TPC-H as well as CH-benchmark, all the scripts are published on GitHub. So anyone is very welcome. In fact, we encourage customers to go and try it for themselves, and they will find that the numbers are absolutely as advertised. In fact, we had a couple of companies in the last several months who went to GitHub, downloaded our TPC-H scripts, and reported that the performance numbers they were seeing with HeatWave were actually better than we had published back in December. And the reason was that since December we had new code which was running. So our numbers were actually better than advertised. So all the benchmarks are published. They are all available on GitHub. You can go to the HeatWave website on oracle.com and get the link for it. And we welcome anyone to come and try these numbers for themselves. >> All right. Good. Great. Thank you for that. 
Now you mentioned earlier that you were somewhat surprised, not surprised that you got customers migrating from on-prem databases, but you also saw migration from other clouds. How do you expect the trend with regard to this new announcement? Do you have any sense as to how that's going to go? >> Right. So one of the big changes from December to now is that we have now focused quite a bit on mixed workloads. So in the past, in December, when we first went out, HeatWave was designed primarily for analytics. Now, what we have found is that there's a very large class of customers who have mixed workloads and who also have smaller data sizes. We now have introduced a lot of technology, including things like auto scheduling and definite improvements in performance, where MySQL HeatWave is a very superior solution compared to Aurora or other databases out there, both in terms of performance as well as price for these mixed workloads: better latency, better throughput, lower costs. So we expect this trend of migration to MySQL HeatWave to accelerate. So we are seeing customers migrate from Azure. We are seeing customers migrate from GCP, and by far the number one migrations we are seeing are from AWS. So I think based on the new features and technologies we have announced today, this migration is going to accelerate. >> All right, last question. So I said earlier, it seems like you're applying what are generally well understood and proven technologies, like in-memory, like clustering, to solve these problems. And I think about, you know, the things that you're doing, and I wonder, you know, I mean, these things have been around for a while, and why has this type of approach not been introduced by others previously? >> Right. Well, so the main thing is it takes time, right? We designed HeatWave from the ground up for the cloud. And as a part of that, we had to invent new algorithms for distributed query processing for the cloud. 
We put in the hooks for machine learning processing and for scale-out processing right from the ground up. So this has taken us close to a decade. It's been hundreds of person-years of investment, dozens of patents which have gone in. Another aspect is it takes talent from different areas. So we have, you know, people working in distributed query processing, we have people who have a lot of background in machine learning. And then given that we are the custodians of the MySQL database, we have a very rich set of customers we can reach out to, to get feedback from them as to what the pain points are. So it's a culmination of these things: the talent, the customer base, and the time. We spent close to a decade to make this thing work. So that's what it takes. It takes time, patience, and talent. >> A lot of software innovation, bringing together, as I said, that hardware and software strategy. Very interesting. Nipun, thanks so much. I appreciate your insights and coming on this video exclusive. >> Thank you, Dave. Thank you for the opportunity. >> My pleasure. And thank you for watching everybody. This is Dave Vellante for theCUBE. We'll see you next time. (bright music)
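
The price-performance figures quoted in the interview above follow from simple arithmetic, sketched here for readers who want to check it. The sketch assumes "price performance" means speedup multiplied by how many times more the competitor costs, which matches the quoted pairs (seven times faster at one fifth the cost giving 35 times). The scaling-efficiency helper assumes efficiency is measured as achieved speedup divided by the increase in node count; the interview gives the resulting figure (about 0.89) but not the formula, so that definition, and both function names, are assumptions for illustration.

```python
def price_performance_gain(speedup: float, competitor_cost_multiple: float) -> float:
    """Combine a speedup with a cost advantage.

    speedup: how many times faster the workload runs.
    competitor_cost_multiple: how many times more the competitor costs
        (one fifth the cost -> 5, half the cost -> 2).
    """
    return speedup * competitor_cost_multiple


def scaling_efficiency(
    time_small: float, nodes_small: int, time_large: float, nodes_large: int
) -> float:
    """Fraction of ideal speedup achieved when growing a cluster.

    Assumes the same data and queries on both cluster sizes.
    1.0 means perfect linear scaling.
    """
    achieved_speedup = time_small / time_large
    ideal_speedup = nodes_large / nodes_small
    return achieved_speedup / ideal_speedup


if __name__ == "__main__":
    # Figures as quoted in the interview (10 TB and 4 TB TPC-H comparisons).
    quoted = {
        "Snowflake": (7.0, 5.0),
        "Redshift Aqua": (6.5, 2.0),
        "Aurora": (1400.0, 2.0),
    }
    for name, (speedup, cost_multiple) in quoted.items():
        gain = price_performance_gain(speedup, cost_multiple)
        print(f"{name}: {gain:g}x better price performance")
```

Note that list prices and pay-as-you-go versus prepaid terms, both discussed in the interview, change the cost multiple and therefore the result.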

Published Date : Aug 10 2021

Ed Walsh & Thomas Hazel | A New Database Architecture for Supercloud

(bright music) >> Hi, everybody, this is Dave Vellante, welcome back to Supercloud 2. Last August, at the first Supercloud event, we invited the broader community to help further define Supercloud, we assessed its viability, and identified the critical elements and deployment models of the concept. The objectives here at Supercloud 2 are, first of all, to continue to tighten and test the concept. The second is, we want to get real world input from practitioners on the problems that they're facing and the viability of Supercloud in terms of applying it to their business. So on the program, we got companies like Walmart, Saks, Western Union, Ionis Pharmaceuticals, NASDAQ, and others. And the third thing that we want to do is we want to drill into the intersection of cloud and data to project what the future looks like in the context of Supercloud. So in this segment, we want to explore the concept of data architectures and what's going to be required for Supercloud. And I'm pleased to welcome one of our Supercloud sponsors, ChaosSearch. Ed Walsh is the CEO of the company, with Thomas Hazel, who's the Founder, CTO, and Chief Scientist. Guys, good to see you again, thanks for coming into our Marlborough studio. >> Always great. >> Great to be here. >> Okay, so there's a little debate, I'm going to put you right on the spot. (Ed chuckling) A little debate going on in the community started by Bob Muglia, a former CEO of Snowflake, and he was at Microsoft for a long time, and he looked at the Supercloud definition and said, "I think you need to tighten it up a little bit." So, here's what he came up with. He said, "A Supercloud is a platform that provides a programmatically consistent set of services hosted on heterogeneous cloud providers." So he's calling it a platform, not an architecture, which was kind of interesting. And so presumably the platform owner is going to be responsible for the architecture, but Dr. 
Nelu Mihai, who's a computer scientist behind the Cloud of Clouds Project, he chimed in and responded with the following. He said, "Cloud is a programming paradigm supporting the entire lifecycle of applications with data and logic natively distributed. Supercloud is an open architecture that integrates heterogeneous clouds in an agnostic manner." So, Ed, words matter. Is this an architecture or is it a platform? >> Put us on the spot. So, I'm sure you have concepts, I would say it's an architectural or design principle. Listen, I look at Supercloud as a mega trend, just like cloud, just like data analytics. And some companies are using the principle, design principles, to literally get dramatically ahead of everyone else. I mean, things you couldn't possibly do if you didn't use cloud principles, right? So I think it's a Supercloud effect, you're able to do things you're otherwise not able to. So I think it's more a design principle, but if you do it right, you get dramatic effect as far as customer value. >> So the conversation that we were having with Muglia, and Tristan Handy of dbt Labs, was, I'll set it up as the following, and, Thomas, would love to get your thoughts, if you have a CRM, think about applications today, it's all about forms and codifying business processes, you type a bunch of stuff into Salesforce, and all the salespeople do it, and this machine generates a forecast. What if you have this new type of data app that pulls data from the transaction system, the e-commerce, the supply chain, the partner ecosystem, et cetera, and then, without humans, actually comes up with a plan? That's their vision. And Muglia was saying, in order to do that, you need to rethink data architectures and database architectures specifically, you need to get down to the level of how the data is stored on the disk. What are your thoughts on that? >> Well, first of all, I'm going to cop out, I think it's actually both.
I do think it's a design principle, I think it's not open technology, but open APIs, open access, and you can build a platform on that design principle architecture. Now, I'm a database person, I love solving the database problems. >> I was waiting for you to launch into this. >> Yeah, so I mean, you know, Snowflake is a database, right? It's a distributed database. And we wanted to crack those codes, because, multi-region, multi-cloud, customers wanted access to their data, and their data is in a variety of forms, all these services that you talked about. And so what I saw as a core principle was cloud object storage, everyone streams their data to cloud object storage. From there we said, well, how about we rethink database architecture, rethink file format, so that we can take each one of these services and bring them together, whether distributively or centrally, such that customers can access and get answers, whether it's operational data, whether it's business data, AKA search, or SQL, complex distributed joins. But we had to rethink the architecture. I like to say we're not a first generation, or a second, we're a third generation distributed database on pure, pure cloud storage, no caching, no SSDs. Why? Because all that availability, the cost of time, is a struggle, and cloud object storage, we think, is the answer. >> So when you're saying no caching, so when I think about how companies are solving some, you know, pretty hairy problems, take MySQL HeatWave, everybody thought Oracle was going to just forget about MySQL, well, they come out with HeatWave. And the way they solve problems, and you see their benchmarks against Amazon, "Oh, we crush everybody," is they put it all in memory. So you said no caching? You're not getting performance through caching? How is that true, and how are you getting performance? >> Well, so five, six years ago, right?
When you realize that cloud object storage is going to be everywhere, and it's going to be a core foundational, if you will, fabric, what would you do? Well, a lot of times the second generation say, "We'll take it out of cloud storage, put in SSDs or something, and put into cache." And that adds a lot of time, adds a lot of costs. But I said, what if, what if we could actually make the first read hot, the first read distributed joins and searching? And so what we went out to do was said, we can't cache, because that's adds time, that adds cost. We have to make cloud object storage high performance, like it feels like a caching SSD. That's where our patents are, that's where our technology is, and we've spent many years working towards this. So, to me, if you can crack that code, a lot of these issues we're talking about, multi-region, multicloud, different services, everybody wants to send their data to the data lake, but then they move it out, we said, "Keep it right there." >> You nailed it, the data gravity. So, Bob's right, the data's coming in, and you need to get the data from everywhere, but you need an environment that you can deal with all that different schema, all the different type of technology, but also at scale. Bob's right, you cannot use memory or SSDs to cache that, that doesn't scale, it doesn't scale cost effectively. But if you could, and what you did, is you made object storage, S3 first, but object storage, the only persistence by doing that. And then we get performance, we should talk about it, it's literally, you know, hundreds of terabytes of queries, and it's done in seconds, it's done without memory caching. We have concepts of caching, but the only caching, the only persistence, is actually when we're doing caching, we're just keeping another side-eye track of things on the S3 itself. 
So we're using, actually, the object storage to be a database, which is kind of what Bob was saying, we agree, but that's what you started at, people thought you were crazy. >> And maybe make it live. Don't think of it as archival or temporary space, make it live, real time streaming, operational data. What we do is make it smart, we see the data coming in, we uniquely index it such that you can get your use cases, that are search, observability, security, or backend operational. But we don't have to have this, I dunno, static, fixed, siloed type of architecture technologies that were traditionally built prior to Supercloud thinking. >> And you don't have to move everything, essentially, you can do it wherever the data lands, whatever cloud across the globe, you're able to bring it together, you get the cost effectiveness, because the only persistence is the cheapest storage persistent layer you can buy. But the key thing is you cracked the code. >> We had to crack the code, right? That was the key thing. >> That's where the patents are. >> And then once you do that, then everything else gets easier to scale, your architecture, across regions, across cloud. >> Now, it's a general purpose database, as Bob was saying, but we use that database to solve a particular issue, which is around operational data, right? So, we agree with Bob. >> Interesting. So this brings me to this concept of data mesh, Zhamak Dehghani is one of our speakers, you know, we talk about data fabric, which is a NetApp, originally NetApp concept, Gartner's kind of co-opted it. But so, the basic concept is, data lives everywhere, whether it's an S3 bucket, or a SQL database, or a data lake, it's just a node on the data mesh. So in your view, how does this fit in with Supercloud? Ed, you've said that you've built, essentially, an enabler for that, for the data mesh, I think you're an enabler for the Supercloud-like principles. This is a big, chewy opportunity, and it requires, you know, a team approach.
There's got to be an ecosystem, there's not going to be one Supercloud to rule them all, so where does the ecosystem fit into the discussion, and where do you fit into the ecosystem? >> Right, so we agree completely, there's not one Supercloud in effect, but we use Supercloud principles to build our platform, and then, you know, the ecosystem's going to be built on leveraging what everyone else's secret powers are, right? So our power, our superpower, based upon what we built is, we deal with, if you're having any scale, or cost effective scale issues, with data, machine generated data, like business observability or security data, we are your force multiplier, we will take that in singularly, just let it, simply put it in your object storage wherever it sits, and we give you uniformity access to that using OpenAPI access, SQL, or you know, Elasticsearch API. So, that's what we do, that's our superpower. So I'll play it into data mesh, that's a perfect, we are a node on a data mesh, but I'll play it in the soup about how, the ecosystem, we see it kind of playing, and we talked about it in just in the last couple days, how we see this kind of possibly. Short term, our superpowers, we deal with this data that's coming at these environments, people, customers, building out observability or security environments, or vendors that are selling their own Supercloud, I do observability, the Datadogs of the world, dot dot dot, the Splunks of the world, dot dot dot, and security. So what we do is we fit in naturally. What we do is a cost effective scale, just land it anywhere in the world, we deal with ingest, and it's a cost effective, an order of magnitude, or two or three order magnitudes more cost effective. Allows them, their customers are asking them to do the impossible, "Give me fast monitoring alerting. I want it snappy, but I want it to keep two years of data, (laughs) and I want it cost effective." It doesn't work. 
They're good at the fast monitoring alerting, we're good at the long-term retention. And yet there's some gray area between those two, but one to one is actually cheaper, so we would partner. So the first ecosystem plays, who wants to have the ability to, really, all the data's in those same environments, the security observability players, they can literally, just through API, drag our data into their point to grab. We can make it seamless for customers. Right now, we make it helpful to customers. Your Datadog, we make a button, easy go from Datadog to us for logs, save you money. Same thing with Grafana. But you can also look at ecosystem, those same vendors, it used to be a year ago it was, you know, it's all about how can you grow, like it's growth at all costs, now it's about COGS. So literally we can go into an environment, you supply what your customer wants, but we can help with COGS. And one-on-one in a partnership is better than you trying to build on your own. >> Thomas, you were saying you make the first read fast, so you think about Snowflake. Everybody wants to talk about Snowflake and Databricks. So, Snowflake, great, but you got to get the data in there. All right, so that's, can you help with that problem? >> I mean we want simple in, right? And if you have to have structure in, you're not simple. So the idea that you have a simple in, data lake, schema-on-read type philosophy, but schema-on-write type performance. And so what I wanted to do, what we have done, is have that simple lake, and stream that data real time, and those access points of Search or SQL, to go after whatever business case you need, security observability, warehouse integration. But the key thing is, how do I make that click, click, click answer, and do it quickly? And so what we want to do is, that first read has to be fast. Why? 'Cause otherwise you're going to do all this siloing, layers, complexity. If your first read's not fast, you're at a disadvantage, particularly in cost.
And nobody says I want less data, but everyone has to, whether they say we're going to shorten the window, we're going to use AI to choose, but in a security moment, when you don't have that answer, you're in trouble. And that's why we are this service, this Supercloud service, if you will, providing access, well-known search, well-known SQL type access, that if you just have one access point, you're at a disadvantage. >> We actually talked about Snowflake and BigQuery, and a different platform, Databricks. That's kind of where we see the phase two of ecosystem. One is easy, the low-hanging fruit is observability and security firms. But the next one is, what we do, our superpower is dealing with this messy data whose schema is changing like night and day. Pipelines are tough, and it's changing all the time, but you want these things fast, and it's big data around the world. That's the next point, just use us alongside, or inside, one of their platforms, and now we get the best of both worlds. Our superpower is keeping this messy data as a streaming, okay, not a batch thing, allow you to do that. So, that's the second one. And then to be honest, the third one, which plays to Supercloud, it also plays perfectly in the data mesh, is if you really go to the ultimate thing, what we have done is made object storage, S3, GCS, and blob storage, we made it a database. Put, get, complex query with big joins. You know, so back to your original thing, and Muglia teed it up perfectly, we've done that. Now imagine if that's an ecosystem, who would want that? If it's, again, it's uniformly available across all the regions, across all the clouds, and it's right next to where you are building a service, or a client's trying, that's where the ecosystem, I think people are going to use Superclouds for their superpowers. We're really good at this, allows that short term. I think the Snowflakes and the Databricks are the medium term, you know?
And then I think eventually gets to, hey, listen if you can make object storage fast, you can just go after it with simple SQL queries, or elastic. Who would want that? I think that's where people are going to leverage it. It's not going to be one Supercloud, and we leverage the super clouds. >> Our viewpoint is smart object storage can be programmable, and so we agree with Bob, but we're not saying do it here, do it here. This core, fundamental layer across regions, across clouds, that everyone has? Simple in. Right now, it's hard to get data in for access for analysis. So we said, simply, we'll automate the entire process, give you API access across regions, across clouds. And again, how do you do a distributed join that's fast? How do you do a distributed join that doesn't cost you an arm or a leg? And how do you do it at scale? And that's where we've been focused. >> So prior, the cloud object store was a niche. >> Yeah. >> S3 obviously changed that. How standard is, essentially, object store across the different cloud platforms? Is that a problem for you? Is that an easy thing to solve? >> Well, let's talk about it. I mean we've fundamentally, yeah we've extracted it, but fundamentally, cloud object storage, put, get, and list. That's why it's so scalable, 'cause it doesn't have all these other components. That complexity is where we have moved up, and provide direct analytical API access. So because of its simplicity, and costs, and security, and reliability, it can scale naturally. I mean, really, distributed object storage is easy, it's put-get anywhere, now what we've done is we put a layer of intelligence, you know, call it smart object storage, where access is simple. So whether it's multi-region, do a query across, or multicloud, do a query across, or hunting, searching. >> We've had clients doing Amazon and Google, we have some Azure, but we see Amazon and Google more, and it's a consistent service across all of them. 
Just literally put your data in the bucket of choice, or folder of choice, click a couple buttons, literally click that to say "that's hot," and after that, it's hot, you can see it. But we're not moving data, the data gravity issue, that's the other. That it's already natively flowing to these pools of object storage across different regions and clouds. We don't move it, we index it right there, we're spinning up stateless compute, back to the Supercloud concept. But now that allows us to do all these other things, right? >> And it's no longer just cheap and deep object storage. Right? >> Yeah, we make it the same, like you have an analytic platform regardless of where you're at, you don't have to worry about that. Yeah, we deal with that, we deal with a stateless compute coming up -- >> And make it programmable. Be able to say, "I want this bucket to provide these answers." Right, that's really the hope, the vision. And the complexity to build the entire stack, and then connect them together, we said, the fabric is cloud storage, we just provide the intelligence on top. >> Let's bring it back to the customers, and one of the things we're exploring in Supercloud 2 is, you know, is Supercloud a solution looking for a problem? Is multicloud really a problem? I mean, you hear, you know, a lot of the vendor marketing says, "Oh, it's a disaster, because it's all different across the clouds." And I talked to a lot of customers even as part of Supercloud 2, they're like, "Well, I solved that problem by just going mono cloud." Well, but then you're not able to take advantage of a lot of the capabilities and the primitives, you know, you like Google's data, or you like Microsoft's simplicity, their RPA, whatever it is. So what are customers telling you, what are their near term problems that they're trying to solve today, and how are they thinking about the future? >> Listen, it's a real problem. I think it started, I think this is a mega trend, just like cloud.
Just, cloud, data, and, I always add, analytics, are the mega trends. If you're looking at those, if you're not considering using the Supercloud principles, in other words, leveraging what I have, abstracting it out, and getting the most out of that, and then build value on top, I think you're not going to be able to keep up. In fact, no way you're going to keep up with this data volume. It's a geometric challenge, and you're trying to do linear things. So clients aren't necessarily asking, hey, for Supercloud, but they're really saying, I need to have a better mechanism to simplify this and get value across it, and how do you abstract that out to do that? And that's where, obviously, in our conversations they're more amazed at what we're able to do, and what they're able to do with our platform, because if you think of what we've done, the S3, or GCS, or object storage, is they can't imagine the ingest, they can't imagine how easy, time to glass, one minute, no matter where it lands in the world, querying this in seconds for hundreds of terabytes squared. People are amazed, but that's kind of, so they're not asking for that, but they are amazed. And then when you start talking on it, if you're an enterprise person, you're building a big cloud data platform, or doing data or analytics, if you're not trying to leverage the public clouds, and somehow leverage all of them, and then build on top, then I think you're missing it. So they might not be asking for it, but they're doing it. >> And they're looking for a lens, you mentioned all these different services, how do I bring those together quickly? You know, our viewpoint, our service, is I have all these streams of data, create a lens where they want to go after it via search, go after via SQL, bring them together instantly, no ETLing out, no define this table, put into this database. We said, let's have a service that creates a lens across all these streams, and then make those connections.
I want to take my CRM with my Google AdWords, and maybe my Salesforce, how do I do analysis? Maybe I want to hunt first, maybe I want to join, maybe I want to add another stream to it. And so our viewpoint is, it's so natural to get into these lake platforms and then provide lenses to get that access. >> And they don't want it separate, they don't want something different here, and different there. They want it basically -- >> So this is our industry, right? If something new comes out, remember virtualization came out, "Oh my God, this is so great, it's going to solve all these problems." And all of a sudden it just got to be this big, more complex thing. Same thing with cloud, you know? It started out with S3, and then EC2, and now hundreds and hundreds of different services. So, it's a complex matter for a lot of people, and this creates problems for customers, especially when you got divisions that are using different clouds, and you're saying that the solution, or a solution for the part of the problem, is to really allow the data to stay in place on S3, use that standard, super simple, but then give it what, Ed, you've called superpower a couple of times, to make it fast, make it inexpensive, and allow you to do that across clouds. >> Yeah, yeah. >> I'll give you guys the last word on that. >> No, listen, I think, we think Supercloud allows you to do a lot more. And for us, data, everyone says more data, more problems, more budget issue, everyone knows more data is better, and we show you how to do it cost effectively at scale. And we couldn't have done it without the design principles of we're leveraging the Supercloud to get capabilities, and because we use super, just the object storage, we're able to get these capabilities of ingest, scale, cost effectiveness, and then we built on top of this. 
In the end, a database is a data platform that allows you to go after everything distributed, and to get one platform for analytics, no matter where it lands, that's where we think the Supercloud concepts are perfect, that's where our clients are seeing it, and we're kind of excited about it. >> Yeah a third generation database, Supercloud database, however we want to phrase it, and make it simple, but provide the value, and make it instant. >> Guys, thanks so much for coming into the studio today, I really thank you for your support of theCUBE, and theCUBE community, it allows us to provide events like this and free content. I really appreciate it. >> Oh, thank you. >> Thank you. >> All right, this is Dave Vellante for John Furrier in theCUBE community, thanks for being with us today. You're watching Supercloud 2, keep it right there for more thought provoking discussions around the future of cloud and data. (bright music)
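
The "put, get, and list" model of cloud object storage that Thomas describes, with an index built over the data where it already sits, can be illustrated with a toy sketch. Everything below is an assumption for illustration only: `ToyObjectStore`, `build_index`, and `search` are hypothetical names, the store is an in-memory dictionary rather than a real bucket, and the token index is a generic inverted index, not ChaosSearch's patented indexing.

```python
from collections import defaultdict


class ToyObjectStore:
    """Hypothetical in-memory stand-in for a bucket: only put, get, and list."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # Store (or overwrite) one object under its key.
        self._objects[key] = data

    def get(self, key):
        # Return the object stored under the key.
        return self._objects[key]

    def list(self, prefix=""):
        # Return the sorted keys that start with the given prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


def build_index(store, prefix=""):
    """Map each lowercase token to the keys of the objects that contain it.

    The objects are only read, never moved or copied out of the store.
    """
    index = defaultdict(set)
    for key in store.list(prefix):
        for token in store.get(key).lower().split():
            index[token].add(key)
    return index


def search(index, *tokens):
    """Return the sorted keys of objects that contain every given token."""
    hits = [index.get(t.lower(), set()) for t in tokens]
    return sorted(set.intersection(*hits)) if hits else []


# Objects "land" under region-style prefixes and stay where they are.
store = ToyObjectStore()
store.put("us-east/app.log", "ERROR timeout calling payments")
store.put("eu-west/app.log", "INFO payments ok")
store.put("eu-west/auth.log", "ERROR invalid token")

index = build_index(store)
print(search(index, "error"))
print(search(index, "payments", "error"))
```

The point of the sketch is the narrow storage interface: because the store exposes only put, get, and list, any query capability has to live in a layer on top of it, which is the design choice the speakers call "smart object storage."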

Published Date : Feb 17 2023



Oracle Aspires to be the Netflix of AI | Cube Conversation

(gentle music playing) >> For centuries, we've been captivated by the concept of machines doing the job of humans. And over the past decade or so, we've really focused on AI and the possibility of intelligent machines that can perform cognitive tasks. Now in the past few years, with the popularity of machine learning models ranging from the recent ChatGPT to BERT, we're starting to see how AI is changing the way we interact with the world. How is AI transforming the way we do business? And what does the future hold for us there? At theCUBE, we've covered Oracle's AI and ML strategy for years, which has really been used to drive automation into Oracle's autonomous database. We've talked a lot about MySQL HeatWave in-database machine learning, and AI pushed into Oracle's business apps. Oracle, it intends to lead in AI, but not competing as a direct AI player per se, but rather embedding AI and machine learning into its portfolio to enhance its existing products, and bring new services and offerings to the market. Now, last October at CloudWorld in Las Vegas, Oracle partnered with Nvidia, which is the go-to AI silicon provider for vendors. And they announced an investment, a pretty significant investment to deploy tens of thousands more Nvidia GPUs to OCI, the Oracle Cloud Infrastructure and build out Oracle's infrastructure for enterprise scale AI. Now, Oracle CEO, Safra Catz said something to the effect of this alliance is going to help customers across industries from healthcare, manufacturing, telecoms, and financial services to overcome the multitude of challenges they face. Presumably she was talking about just driving more automation and more productivity. Now, to learn more about Oracle's plans for AI, we'd like to welcome in Elad Ziklik, who's the vice president of AI services at Oracle. Elad, great to see you. Welcome to the show. >> Thank you. Thanks for having me. >> You're very welcome. So first let's talk about Oracle's path to AI.
I mean, it's the hottest topic going. For years you've been incorporating machine learning into your products and services, you know, could you tell us what you've been working on, how you got here? >> So great question. So as you mentioned, I think most of the original foray into AI was on embedding AI and using AI to make our applications, and databases better. So inside MySQL HeatWave, inside our autonomous database in power, we've been driving AI, and of course our SaaS apps. So Fusion, our large enterprise business suite for HR applications and CRM and ERP, and whatnot has built in AI inside it. Most recently, NetSuite, our small medium business SaaS suite started using AI for things like automated invoice processing and whatnot. And most recently, over the last, I would say two years, we've started exposing and bringing these capabilities into the broader OCI, Oracle Cloud Infrastructure. So the developers, and ISVs and customers can start using our AI capabilities to make their apps better and their experiences and business workflow better, and not just consume these as embedded inside Oracle. And this recent partnership that you mentioned with Nvidia is another step in bringing the best AI infrastructure capabilities into this platform so you can actually build any type of machine learning workflow or AI model that you want on Oracle Cloud. >> So when I look at the market, I see companies out there like DataRobot or C3 AI, there's maybe a half dozen that sort of pop up on my radar anyway. And my premise has always been that most customers, they don't want to become AI experts, they want to buy applications and have AI embedded or they want AI to manage their infrastructure. So my question to you is, how does Oracle help its OCI customers support their business with AI? >> So it's a great question. So I think what most customers want is business AI. They want AI that works for the business. They want AI that works for the enterprise.
I call it the last mile of AI. And they want this thing to work. The majority of them don't want to hire large and expensive data science teams to go and build everything from scratch. They just want the business problem solved by applying AI to it. My best analogy is Lego. So if you think of Lego, Lego has these millions of Lego blocks that you can use to build anything that you want. But the majority of people like me or like my kids, they want the Lego Death Star kit or the Lego Eiffel Tower thing. They want a thing that just works, and it's very easy to use. It's still Lego blocks, you still need to build some things together, but it just works for the scenario that you're looking for. So that's our focus. Our focus is making it easy for customers to apply AI where they need to, in the right business context. So whether it's embedding it inside the business applications, like adding forecasting capabilities to your supply chain management or financial planning software, whether it's adding chat bots into the line of business applications, integrating these things into your analytics dashboard, even all the way to, we have a new platform piece we call ML applications that allows you to take a machine learning model, and scale it for the thousands of tenants that you would be. 'Cause this is a big problem for most of the ML use cases. It's very easy to build something for a proof of concept or a pilot or a demo. But then if you need to take this and then deploy it across your thousands of customers or your thousands of regions or facilities, then it becomes messy. So this is where we spend our time making it easy to take these things into production in the context of your business application or your business use case that you're interested in right now. >> So you mentioned chat bots, and I want to talk about ChatGPT, but my question here is different, we'll talk about that in a minute.
So when you think about these chat bots, the ones that are conversational, my experience anyway is they're just meh, they're not that great. But the ones that actually work pretty well, they have a conditioned response. Now they're limited, but they say, which of the following is your problem? And then if one of the following is your problem, you can maybe solve your problem. But this is clearly a trend and it helps the line of business. How does Oracle think about these use cases for your customers? >> Yeah, so I think the key here is exactly what you said. It's about task completion. The general purpose bots are interesting, but as you said, they're still limited. They're getting much better, I'm sure we'll talk about ChatGPT. But I think what most enterprises want is around task completion. I want to automate my expense report processing. So today inside Oracle we have a chat bot where I submit my expenses, the bot asks a couple of questions, I answer them, and then I'm done. Like I don't need to go to our fancy application, and manually submit an expense report. I do this via Slack. And the key is around managing the right expectations of what this thing is capable of doing. Like, I have a story from I think five, six years ago when technology was much inferior to what it is today. Well, one of the telco providers I was working with wanted to roll out a chat bot that does realtime translation. So it was for a support center for one of the call centers. And what they wanted to do is, Hey, we have English speaking employees, whatever, 24/7, if somebody's calling, and the native tongue is different like Hebrew in my case, or Chinese or whatnot, then we'll give them a chat bot that they will interact with and will translate this on the fly and everything would work. And when they rolled it out, the feedback from customers was horrendous. Customers said, the technology sucks. It's not good. I hate it, I hate your company, I hate your support.
And what they've done is they've changed the narrative. Instead of, you go to a support center, and you assume you're going to talk to a human, and instead you get a crappy chat bot, they're like, Hey, if you want to talk to a Hebrew speaking person, there's a four hour wait, please leave your phone number and we'll call you back. Or you can try a new amazing Hebrew speaking AI powered bot and it may help your use case. Do you want to try it out? And some people said, yeah, let's try it out. Press one to try it out. And the feedback, even though it was the exact same technology, was amazing. People were like, oh my God, this is so innovative, this is great. Even though it was the exact same experience that they hated a few weeks earlier on. So I think the key lesson that I picked from this experience is it's all about setting the right expectations, and working around the right use case. If you are replacing a human, the level is different than if you are just helping or augmenting something that otherwise would take a lot of time. And I think this is the focus that we are doing, picking up the tasks that people want to accomplish or that enterprises want to accomplish for the customers, for the employees. And using chat bots to make those specific ones better rather than, hey, this is going to replace all humans everywhere, and just be better than that. >> Yeah, I mean, to the point you mentioned expense reports. I'm in a Twitter thread and one guy says, my favorite part of business travel is filling out expense reports. It's an hour of excitement to figure out which receipts won't scan. We can all relate to that. It's just the worst. When you think about companies that are building custom AI driven apps, what can they do on OCI? What are the best options for them? Do they need to hire an army of machine intelligence experts and AI specialists? Help us understand your point of view there. 
>> So over the last, I would say the two or three years we've developed a full suite of machine learning and AI services for, I would say pretty much every use case that you would expect right now, from applying natural language processing to understanding customer support tickets or social media, or whatnot, to computer vision platforms or computer vision services that can understand and detect objects, and count objects on shelves or detect cracks in the pipe or defective parts, all the way to speech services that can actually transcribe human speech. And most recently we've launched a new document AI service that can actually look at unstructured documents like receipts or invoices or government IDs or even proprietary documents, loan applications, student application forms, patient intake forms and whatnot and completely automate them using AI. So if you want to do one of the things that are, I would say common bread and butter for any industry, whether it's financial services or healthcare or manufacturing, we have a suite of services that any developer can go, and use easily, customized with their own data. You don't need to be an expert in deep learning or large language models. You could just use our AutoML capabilities, and build your own version of the models. Just go ahead and use them. And if you do have proprietary complex scenarios that you need to customize from scratch, we actually have the most cost effective platform for that. So we have the OCI data science as well as built-in machine learning platform inside the databases, inside the Oracle database and MySQL HeatWave, that allow data scientists, Python-wielding people that actually like to build and tweak and control and improve, to have everything that they need to go and build the machine learning models from scratch, deploy them, monitor and manage them at scale in a production environment. And most of it is brand new. 
So we did not have these technologies four or five years ago and we've started building them and they're now at enterprise scale over the last couple of years. >> So what are some of the state-of-the-art tools that AI specialists and data scientists need if they're going to go out and develop these new models? >> So I think it's on three layers. I think there's an infrastructure layer where the Nvidias of the world come into play. For some of these things, you want massively efficient, massively scaled infrastructure in place. So we are the most cost effective and performant large scale GPU training environment today. We're going to be first to onboard the new Nvidia H100s. These are the new super powerful GPUs for large language model training. So we have that covered for you in case you need this 'cause you want to build these ginormous things. You need a data science platform, a platform where you can open a Python notebook, and just use all these fancy open source frameworks and create the models that you want, and then click on a button and deploy it. And it infinitely scales wherever you need it. And in many cases you just need the, what I call the applied AI services. You need the Lego sets, the Lego Death Star, Lego Eiffel Tower. So we have a suite of these sets for typical scenarios, whether it's cognitive services like, again, understanding images, or documents all the way to solving particular business problems. So an anomaly detection service, a demand forecasting service, that will be the equivalent of these Lego sets. So if this is the business problem that you're looking to solve, we have services out there where we can bring your data, call an API, train a model, get the model and use it in your production environment. 
So wherever you want to play, all the way into embedding this thing inside the applications, obviously, wherever you want to play, we have the tools for you to go and engage from infrastructure to SaaS at the top, and everything in the middle. >> So when you think about the data pipeline, and the data life cycle, and the specialized roles that came out of kind of the (indistinct) era if you will. I want to focus on two: developers and data scientists. So the developers, they hate dealing with infrastructure and they got to deal with infrastructure. Now they're being asked to secure the infrastructure, they just want to write code. And a data scientist, they're spending all their time trying to figure out, okay, what's the data quality? And they're wrangling data and they don't spend enough time doing what they want to do. So there's been a lack of collaboration. Have you seen that change, are these approaches allowing collaboration between data scientists and developers on a single platform? Can you talk about that a little bit? >> Yeah, that is a great question. One of the biggest sets of scars that I have on my back from building these platforms in other companies is exactly that. Every persona had a set of tools, and these tools didn't talk to each other and the handoff was painful. And most of the machine learning things evaporate or die on the floor because of this problem. It's very rarely that they are unsuccessful because the algorithm wasn't good enough. In most cases it's somebody builds something, and then you can't take it to production, you can't integrate it into your business application. You can't take the data out, train, create an endpoint and integrate it back, like it's too painful. So the way we are approaching this is focused on this problem exactly. 
We have a single set of tools, so that if you publish a model as a data scientist, developers, and even business analysts who are seeing an insight inside a business application, are able to consume it. We have a single model store, a single feature store, a single management experience across the various personas that need to play in this. And we spend a lot of time building, and borrowing a word that the Cerner folks used, and I really liked it, building insight highways to make it easier to bring these insights into where you need them inside applications, both inside our applications, inside our SaaS applications, but also inside custom third party and even first party applications. And this is where a lot of our focus goes to just because we have dealt with so much pain doing this inside our own SaaS that we now have built the tools, and we're making them available for others to make this process of building a machine learning outcome driven insight in your app easier. And it's not just the model development, and it's not just the deployment, it's the entire journey of taking the data, building the model, training it, deploying it, looking at the real data that comes from the app, and creating this feedback loop in a more efficient way. And that's our focus area. Exactly this problem. >> Well thank you for that. So, last week we had our Supercloud 2 event, and I had Juan Loaiza on and he spent a lot of time talking about how open Oracle is in its philosophy, and I got a lot of feedback. They were like, Oracle open, I don't really think, but the truth is if you think about the Oracle database, it never met a hardware platform that it didn't like. So in that sense it's open. So, but my point is, a big part of machine learning and AI is driven by open source tools, frameworks, what's your open source strategy? What do you support from an open source standpoint? 
>> So I'm a strong believer that you don't actually know, nobody knows where the next leapfrog or the next industry shifting innovation in AI is going to come from. If you look six months ago, nobody foresaw DALL-E, the magical text to image generation and the explosion it brought into art and design type of experiences. If you look six weeks ago, I don't think anybody had seen ChatGPT, and what it can do for a whole bunch of industries. So to me, assuming that a customer or partner or developer would want to lock themselves into only the tools that a specific vendor can produce is ridiculous. 'Cause nobody knows, if anybody claims that they know where the innovation is going to come from in a year or two, let alone in five or 10, they're just wrong or lying. So our strategy for Oracle is to, I call this the Netflix of AI. So if you think about Netflix, they produced a bunch of high quality shows on their own. A few years ago it was House of Cards. Last month my wife and I binge watched Ginny & Georgia, but they also curated a lot of shows that they found around the world and brought them to their customers. So it started with things like Seinfeld or Friends and most recently it was Squid Game, and there's a famous Israeli TV series called Fauda that Netflix brought in, and they brought it as is and they gave it the Netflix value. So you have captioning and you have the ability to speed up the movie and you have it inside your app, and you can download it and watch it offline and everything, but nobody at Netflix was involved in the production of these first seasons. Now if these things are a hit and they're great, then the third season or the fourth season will get the full Netflix production value, high value budget, high value location shooting or whatever. But you as a customer, you don't care whether the producer and director, and screenplay writer is a Netflix employee or is somebody else's employee. It is fulfilled by Netflix. 
I believe that we will become, or we are looking to become the Netflix of AI. We are building a bunch of AI in a bunch of places where we think it's important and we have some competitive advantage like healthcare with the Cerner partnership or whatnot. But I want to bring the best AI software and hardware to OCI and do a fulfillment by Oracle on that. So you'll get the Oracle security and identity and single bill and everything you'd expect from a company like Oracle. But we don't have to be building the data science, and the models for everything. So this means both open source, we recently announced a partnership with Anaconda, the leading provider of Python distribution in the data science ecosystem, where we are doing a joint strategic partnership of bringing all the goodness to Oracle customers, as well as being in the process of doing the same with Nvidia, and all those software libraries, not just the hardware, but also other stuff like Triton, and also healthcare specific stuff, as well as other ISVs, other AI leading ISVs that we are in the process of partnering with to get their stuff into OCI and into Oracle so that you can truly consume the best AI hardware, and the best AI software in the world on Oracle. 'Cause that is what I believe our customers would want, the ability to choose from any open source engine, and honestly from any ISV type of solution that is AI powered and they want to use it in their experiences. >> So you mentioned ChatGPT, I want to talk about some of the innovations that are coming. As an AI expert, you see ChatGPT on the one hand, I'm sure you weren't surprised. On the other hand, maybe the reaction in the market, and the hype is somewhat surprising. You know, they say that we tend to overhype things in the early stages and underhype them long term, you kind of use the internet as example. What's your take on that premise? >> So. 
I think that this type of technology is going to be an inflection point in how software is being developed. I truly believe this. I think this is an internet style moment, and the way software interfaces, software applications are being developed will dramatically change over the next year, two or three because of this type of technologies. I think there will be industries that will be shifted. I think education is a good example. I saw this thing opened on my son's laptop. So I think education is going to be transformed. Design industry like images or whatever, it's already been transformed. But I think that for mass adoption, like beyond the hype, beyond the peak of inflated expectations, if I'm using Gartner terminology, I think certain things need to go and happen. One is this thing needs to become more reliable. So right now it is a complete black box that sometimes produces magic, and sometimes produces just nonsense. And it needs to have better explainability and better lineage to, how did you get to this answer? 'Cause I think enterprises are going to really care about the things that they surface with the customers or use internally. So I think that is one thing that's going to come out. And the other thing that's going to come out is I think there are going to be industry specific large language models or industry specific ChatGPTs. Something like how OpenAI did Copilot for writing code. I think we will start seeing this type of apps solving for specific business problems, understanding contracts, understanding healthcare, writing doctor's notes on behalf of doctors so they don't have to spend time manually recording and analyzing conversations. And I think that would become the sweet spot of this thing. There will be companies, whether it's OpenAI or Microsoft or Google or hopefully Oracle that will use this type of technology to solve for specific very high value business needs. And I think this will change how interfaces happen. 
So going back to your expense report, the world of, I'm going to go into an app, and I'm going to click on seven buttons in order to get some job done, like this world is gone. Like I'm going to say, hey, please do this and that. And I expect an answer to come out. I've seen a recent demo about marketing and sales. So a customer sends an email that is interested in something and then a ChatGPT powered thing just produces the answer. I think this is how the world is going to evolve. Like yes, there's a ton of hype, yes, it looks like magic and right now it is magic, but it's not yet productive for most enterprise scenarios. But in the next 6, 12, 24 months, this will start getting more dependable, and it's going to change how these industries are being managed. Like I think it's an internet level revolution. That's my take. >> It's very interesting. And it's going to change the way in which we have. Instead of accessing the data center through APIs, we're going to access it through natural language processing and that opens up technology to a huge audience. Last question, is a two part question. And the first part is what you guys are working on from the futures, but the second part of the question is, we got data scientists and developers in our audience. They love the new shiny toy. So give us a little glimpse of what you're working on in the future, and what would you say to them to persuade them to check out Oracle's AI services? >> Yep. So I think there's two main things that we're doing, one is around healthcare. With a new recent acquisition, we are spending a significant effort around revolutionizing healthcare with AI. Of course many scenarios from patient care using computer vision and cameras through automating, and making better insurance claims to research and pharma. We are making the best models from leading organizations, and internal ones, available for hospitals and researchers, and insurance providers everywhere. 
And we truly are looking to become the leader in AI for healthcare. So I think that's a huge focus area. And the second part is, again, going back to the enterprise AI angle. Like we want to, if you have a business problem that you want to apply AI to solve, we want to be your platform. Like you could use others if you want to build everything complicated and whatnot. We have a platform for that as well. But like, if you want to apply AI to solve a business problem, we want to be your platform. We want to be the, again, the Netflix of AI kind of a thing where we are the place for the greatest AI innovations accessible to any developer, any business analyst, any user, any data scientist on Oracle Cloud. And we're making a significant effort on these two fronts as well as developing a lot of the missing pieces, and building blocks that we see are needed in this space to make truly like a great experience for developers and data scientists. And what would I recommend? Get started, try it out. We actually have a shameless sales plug here. We have a free tier for all of our AI services. So it typically costs you nothing. I would highly recommend to just go, and try these things out. Go play with it. If you are a Python-wielding developer, and you want to try a little bit of AutoML, go down that path. If you're not even there and you're just like, hey, I have these customer feedback things and I want to try out, if I can understand them and apply AI and visualize, and do some cool stuff, we have services for that. My recommendation is, and I think ChatGPT got us there 'cause I see people that have nothing to do with AI, and can't even spell AI going and trying it out. I think this is the time. Go play with these things, go play with these technologies and find what AI can do to you or for you. And I think Oracle is a great place to start playing with these things. >> Elad, thank you. Appreciate you sharing your vision of making Oracle the Netflix of AI. 
Love that and really appreciate your time. >> Awesome. Thank you. Thank you for having me. >> Okay. Thanks for watching this Cube conversation. This is Dave Vellante. We'll see you next time. (gentle music playing)

Published Date : Jan 24 2023

Juan Loaiza, Oracle | Building the Mission Critical Supercloud

(upbeat music) >> Welcome back to Supercloud two where we're gathering a number of industry luminaries to discuss the future of cloud services. And we'll be focusing on various real world practitioners today, their challenges, their opportunities with an emphasis on data, self-service infrastructure and how organizations are evolving their data and cloud strategies to prepare for that next era of digital innovation. And we really believe that support for multiple cloud estates is a first step of any Supercloud. And in that regard Oracle surprised some folks with its Azure collaboration on the Oracle database and Exadata database services. And to discuss the challenges of developing a mission critical Supercloud we welcome Juan Loaiza, who's the executive vice president of Mission Critical Database Technologies at Oracle. Juan, you're a many-time CUBE alum so welcome back to the show. Great to see you. >> Great to see you, and happy to be here with you. >> Yeah, thank you. So a lot of people felt that Oracle was resistant to multicloud strategies and preferred to really have everything run just on the Oracle cloud infrastructure, OCI, and maybe that was a misperception, maybe you guys were misunderstood or maybe you had a change of heart. Take us through the decision to support multiple cloud platforms. >> Now we've supported multiple cloud platforms for many years, so I think that was probably a misperception. Oracle database, we partnered up with Amazon very early on in their cloud when they had kind of the first cloud out there. And we had Oracle database running on their cloud. We have backup, we have a lot of stuff running. So, yeah, part of the philosophy of Oracle has always been we partner with every platform. We're very open, we started with SQL and APIs. As we develop new technologies we push them into the SQL standard. So that's always been part of the ecosystem at Oracle. That's how we think we get an advantage by being more open. 
I think if we try to create this isolated little world it actually hurts us and hurts customers. So for us it's a win-win to be open across the clouds. >> So Supercloud is this concept that we put forth to describe a platform, or some people think it's an architecture, if you have an opinion I'd love to hear it, but it provides a programmatically consistent set of services that are hosted on heterogeneous cloud providers. And so we look at the Oracle database service for Azure as fitting within this definition. In your view, is this accurate? >> Yeah, I would broaden it. I'd see a little bit more than that. We just think that services should be available from everywhere, right? So, I mean, it's a little bit like if you go back to the pre-internet world, there were things like AOL and CompuServe and those were kind of islands. And if you were on AOL, you really didn't have access to anything on CompuServe and vice versa. And the cloud world has evolved a little bit like that. And we just think that's the wrong model. These clouds are part of the world and they need to be interconnected like all the rest of the world. It's been a long time with telephones, internet, everything, everything's interconnected. Everything should work seamlessly together. So that's what we believe: if you're running in one cloud and you're running, let's say, an application in one cloud and you want to use a service from another cloud, it should be completely simple to do that. It shouldn't be, I can only use what's in AOL or CompuServe or whatever else. It should not be isolated. >> Well, we got a long way to go before that Nirvana exists but one example is the Oracle database service with Azure. So what exactly does that service provide? I'm interested in how consistent the service experience is across clouds. Did you create a purpose-built PaaS layer to achieve this common experience? Or is it off the shelf Terraform? Is there unique value in the PaaS layer? 
Let's dig into some of those questions. I know I just threw six at you. >> Yeah, I mean, so what this is, is what we're trying to do is very simple. Which is, for example, starting with the Oracle database we want to make that seamless to use from anywhere you're running. Whether it's on-prem, on some other cloud, anywhere else you should be able to seamlessly use the Oracle database and it should look like the internet. There's no friction. There's not a lot of hoops you got to jump just because you're trying to use a database that isn't local to you. So it's pretty straightforward. And in terms of things like Azure, it's not easy to do because all these clouds have a lot of kind of very unique technologies. So what we've done is at Oracle is we've said, "Okay we're going to make Oracle database look exactly like if it was running on Azure." That means we'll use the Azure security systems, the identity management systems, the networking, there's things like monitoring and management. So we'll push all these technologies. For example, when we have monitoring event or we have alerts we'll push those into the Azure console. So as a user, it looks to you exactly as if that Oracle database was running inside Azure. Also, the networking is a big challenge across these clouds. So we've basically made that whole thing seamless. So we create the super high bandwidth network between Azure and Oracle. We make sure that's extremely low latency, under two milliseconds round trip. It's all within the local metro region. So it's very fast, very high bandwidth, very low latency. And we take care establishing the links and making sure that it's secure and all that kind of stuff. So at a high level, it looks to you like the database is--even the look and feel of the screens. 
It's the Azure colors, it's the Azure buttons it's the Azure layout of the screens so it looks like you're running there and we take care of all the technical details underlying that which there's a lot which has taken a lot of work to make it work seamlessly. >> In the magic of that abstraction. Juan, does it happen at the PaaS layer? Could you take us inside that a little bit? Is there intelligence in there that helps you deal with latency or are there any kind of purpose-built functions for this service? >> You could think of it as... I mean it happens at a lot of different layers. It happens at the identity management layer, it happens at the networking layer, it happens at the database layer, it happens at the monitoring layer, at the management layer. So all those things have been integrated. So it's not one thing that you just go and do. You have to integrate all these different services together. You can access files in Azure from the Oracle database. Again, that's completely seamless. You, it's just like if it was local to our cloud you get your Azure files in your kind of S3 equivalent. So yeah, the, it's not one thing. There's a whole lot of pieces to the ecosystem. And what we've done is we've worked on each piece separately to make sure that it's completely seamless and transparent so you don't have to think about it, it just works. >> So you kind of answered my next question which is one of the technical hurdles. It sounds like the technical hurdles are that integration across the entire stack. That's the sort of architecture that you've built. What was the catalyst for this service? >> Yeah, the catalyst is just fulfilling our vision of an open cloud world. It's really like I said, Oracle, from the very beginning has been believed in open standards. Customers should be able to have choice customers should be able to use whatever they want from wherever they want. 
And we saw that, you know, in the new world of cloud that had broken down. Everybody had their own authentication system, management system, monitoring system, networking system, configuration system. And it became very difficult. There was a lot of friction to using services across cloud. So we said, "Well, okay we can fix that." It's work, it's a significant amount of work but we know how to do it and let's just go do it and make it easy for customers. >> So given Oracle is really your main focus is on mission critical workloads. You talked about this low latency network, I mean but you still have physical distances, so how are you managing that latency? What's the experience been for customers across Azure and OCI? >> Yeah, so it, it's a good point. I mean, latency can be an issue. So the good thing about clouds is we have a lot of cloud data centers. We have dozens and dozens of cloud data centers around the world. And Azure has dozens and dozens of cloud data centers. And in most cases, they're in the same metro region because there's kind of natural metro regions within each country that you want to put your cloud data centers in. So most of our data centers are actually very close to the Azure data centers. There's the kind of northern Virginia, there's London, there's Tokyo, I mean, there's natural places where everybody puts their data centers, Seoul et cetera. And so that's the real key. So that allows us to put in a very high bandwidth and low latency network. The real problems with latency come when you're trying to go a long physical distance. If you're trying to connect, you know, across the Pacific or you know across the country or something like that, then you can get in trouble with latency. Within the same metro region, it's extremely fast. It tends to be around one, you know, at the highest two milliseconds, and that's roundtrip through all the routers and connections and gateways and everything else. 
With everything taken into consideration, what we guarantee is it's always less than two milliseconds which is a very low latency time. So that tends to not be a problem because it's extremely low latency. >> I was going to ask you about less than two milliseconds. So, earlier in the program we had Jack Greenfield who runs architecture for Walmart, and he was explaining what we call their Supercloud, and it runs across Azure, GCP, and their on-prem. They have this thing called the triplet model. So my question to you is, are you in situations where you're guaranteeing that less than two milliseconds, do you have situations where you're bringing, you know, Exadata Cloud at Customer on-prem to achieve that? Or is this just across clouds? >> Yeah, in this case, we're talking public cloud data center to public cloud data center. >> Oh okay. >> So Azure public cloud data center to Oracle Public Cloud data center. They're in the same metro region. We set up the connections, we do all the technology to make it seamless. And from a customer point of view they don't really see the network. Also, remember that SQL is actually designed to have very low bandwidth and latency requirements. So it is a language. So you don't go to the database and say do this one little thing for me. You send it a SQL statement that can actually access lots of data while in the database. So the real latency requirement of a SQL database is within the database. So I need to access all that data fast. So I need very fast access to storage, very fast access across nodes. That's what Exadata gives you. But you send one request and that request can do a huge amount of work and then return one answer. And that's kind of the design point of SQL. So SQL has inherently low bandwidth requirements, it was used back in the eighties when we used to have 10 megabit networks and the biggest companies in the world ran back then. So right now we're talking over hundreds of gigabits. 
So it's really not much of a challenge. When you're designed to run on 10 megabit and I say, okay, I'm going to give you 10,000 times what you were designed for, it's a pretty low hurdle to jump. >> What about the deployment models? How do you handle this? Is it a single global instance across clouds, or do you sort of instantiate in each, so you've got Exadata in Azure and Exadata in OCI? What does the deployment model look like? >> It's pretty straightforward. So the customer decides where they want to run their application and database. So there are natural places where people go. If you're in Tokyo, you're going to choose the local Tokyo data centers for both, you know, Microsoft and Oracle. If you're in London, you're going to do that. If you're in California you're going to choose maybe San Jose, something like that. So a customer just chooses. We both have data centers in that metro region. So they create their service on Azure and then they go to our console, which looks just like an Azure console, and say, all right, create me a database. And then we choose the closest Oracle data center, which is generally a few miles away, and then it all gets created. So from a customer point of view, it's very straightforward. >> I'm always in awe about how simple you make things sound. All right, what about security? You talked a little bit before about identity access, how you're sort of abstracting the Azure capabilities away so that you've simplified it for your customers, but are there any other specific security things that you need to do? How much did you have to abstract the underlying primitives of Azure or OCI to present that common experience to customers? >> Yeah, so there are really two big things. One is the identity management. Like my name is X on Azure and I have this set of privileges. Oracle has its own identity management system, right? So what we didn't want is that you have to kind of bridge these things yourself. It's a giant pain to do that.
So we actually do what we call federate across these identity management systems. So you put your credentials into Azure and then they automatically get to use the exact same credentials and identity in the Oracle cloud. So again, you don't have to think about it, it just works. And then the second part is the whole bridging of the network. So within a cloud you generally have a virtual network that's private to your company. And so at Oracle, we bridge the private network that you created in, for example, Azure to the private network that we create for you in Oracle. So it is still a private network without you having to do a whole bunch of work. So it's just like if you were in your own data center: other people can't get into your network. So it's secured at the network level, it's secured at the identity management and encryption level. And again, we did a lot of work to make that seamless for customers, and they don't have to worry about it because we did the work. That's really as simple as it gets. >> That's what Supercloud's supposed to be all about. All right, we were talking earlier about sort of the misperception around multicloud and your view of open, I think, which is you run the Oracle database wherever the customer wants to run it. So you've got this database service across OCI and Azure. Customers today run Oracle database in AWS. You've got HeatWave, MySQL HeatWave, that you announced on AWS. Google touts a bare metal offering where you can run Oracle on GCP. Do you see a day when you extend an OCI-Azure-like situation across multiple clouds? Would that bring benefits to customers, or will the world of database generally remain largely fenced, with maybe a few exceptions like what you're doing with OCI and Azure? I'm particularly interested in your thoughts on egress fees as maybe one of the reasons that there is a barrier to this happening and why maybe these stovepipes exist today and in the future. What are your thoughts on that?
>> Yeah, we're very open to working with everyone else out there. Like I said, we've always been big believers that customers should have choice and you should be able to run wherever you want. So that's been kind of a founding principle of Oracle. We have Azure, we did a partnership with them, we're open to doing other partnerships, and you're going to see other things coming down the pipe. On the topic of egress, yeah, the large egress fees, it's pretty obvious what goes on with that. Various vendors like to have large egress fees because they want to keep things kind of locked into their cloud. So it's not a very customer friendly thing to do. And I think everybody recognizes that it's really trying to kind of coerce or put a lot of friction on moving data out of a particular cloud. And that's not what we do. We have very, very low egress fees. So we don't really do that and we don't think anybody else should do that. But I think customers, at the end of the day, will win that battle. They're going to have to go back to their vendor and say, well, I have choice in clouds, and if you're going to impose these limits on me, maybe I'll make a different choice. So that's ultimately how these things get resolved. >> So do you think other cloud providers are going to take a page out of what you're doing with Azure and provide similar solutions? >> Yeah, well, I think customers want it. I mean, I've talked to a lot of customers, and this is what they want, right? I mean, there's really no doubt no customer wants to be locked into a single ecosystem. There's nobody out there that wants that. And as for the competition, when customers start seeing an open ecosystem evolving they're going to be like, okay, I'd rather go there than the closed ecosystem, and that's going to put pressure on the closed ecosystems. So that's the nature of competition. That's what ultimately will tip the balance on these things.
>> So Juan, even though you have this capability of distributing a workload across multiple clouds as in our Supercloud premise it's still something that's relatively new. It's a big decision that maybe many people might consider somewhat of a risk. So I'm curious who's driving the decisions for your initial customers? What do they want to get out of it? What's the decision point there? >> Yeah, I mean, this is generally driven by customers that want a specific technology in a cloud. I think the risk, I haven't seen a lot of people worry too much about the risk. Everybody involved in this is a very well known, very reputable firm. I mean, Oracle's been around for 40 years. We run most of the world's largest companies. I think customers understand we're not going to build a solution that's going to put their technology and their business at risk. And the same thing with Azure and others. So I don't see customers too worried about this is a risky move because it's really not. And you know, everybody understands networking at the end the day networking works. I mean, how does the internet work? It's a known quantity. It's not like it's some brand new invention. What we're really doing is breaking down the barriers to interconnecting things. Automating 'em, making 'em easy. So there's not a whole lot of risk here for customers. And like I said, every single customer in the world loves an open ecosystem. It's just not a question. If you go to a customer would you rather put your technology or your business to run on a closed ecosystem or an open system? It's kind of not even worth asking a question. It's a no-brainer. >> All right, so we got to go. My last question. What do you think of the term "Supercloud"? You think it'll stick? >> We'll see. There's a lot of terms out there and it's always fun to see which terms stick. It's a cool term. I like it, but the decision makers are actually the public, what sticks and what doesn't. It's very hard to predict. 
>> Yeah well, it's been a lot of fun having you on, Juan. Really appreciate your time and always good to see you. >> All right, Dave, thanks a lot. It's always fun to talk to you. >> You bet. All right, keep it right there. More Supercloud two content from theCUBE Community Dave Vellante for John Furrier. We'll be right back. (upbeat music)

Published Date : Jan 12 2023

Analyst Predictions 2023: The Future of Data Management

(upbeat music) >> Hello, this is Dave Vellante with theCUBE, and one of the most gratifying aspects of my role as a host of "theCUBE TV" is I get to cover a wide range of topics. And quite often, we're able to bring to our program a level of expertise that allows us to more deeply explore and unpack some of the topics that we cover throughout the year. And one of our favorite topics, of course, is data. Now, in 2021, after being in isolation for the better part of two years, a group of industry analysts met up at AWS re:Invent and started a collaboration to look at the trends in data and predict what some likely outcomes will be for the coming year. And it resulted in a very popular session that we had last year focused on the future of data management. And I'm very excited and pleased to tell you that the 2023 edition of that predictions episode is back, and with me are five outstanding market analysts: Sanjeev Mohan of SanjMo, Tony Baer of dbInsight, Carl Olofson from IDC, Dave Menninger from Ventana Research, and Doug Henschen, VP and Principal Analyst at Constellation Research. Now, what is it that we're calling you guys? A data pack like the rat pack? No, no, no, no, that's not it. It's the data crowd, the data crowd, and the crowd includes some of the best minds in the data analyst community. They'll discuss how data management is evolving and what listeners should prepare for in 2023. Guys, welcome back. Great to see you. >> Good to be here. >> Thank you. >> Thanks, Dave. (Tony and Dave faintly speak) >> All right, before we get into 2023 predictions, we thought it'd be good to do a look back at how we did in 2022 and give a transparent assessment of those predictions. So, let's get right into it. We're going to bring these up here, the predictions from 2022. They're color-coded red, yellow, and green to signify the degree of accuracy. And I'm pleased to report there's no red. Well, maybe some of you will want to debate that grading system.
But as always, we want to be open, so you can decide for yourselves. So, we're going to ask each analyst to review their 2022 prediction and explain their rating and what evidence they have that led them to their conclusion. So, Sanjeev, please kick it off. Your prediction was data governance becomes key. I know that's going to knock you guys over, but elaborate, because you had more detail when you double click on that. >> Yeah, absolutely. Thank you so much, Dave, for having us on the show today. And we self-graded ourselves. I could have very easily made my prediction from last year green, but I mentioned why I left it as yellow. I totally fully believe that data governance was in a renaissance in 2022. And why do I say that? You have to look no further than AWS launching its own data catalog called DataZone. Before that, mid-year, we saw Unity Catalog from Databricks went GA. So, overall, I saw there was tremendous movement. When you see these big players launching a new data catalog, you know that they want to be in this space. And this space is highly critical to everything that I feel we will talk about in today's call. Also, if you look at established players, I spoke at Collibra's conference, data.world, work closely with Alation, Informatica, a bunch of other companies, they all added tremendous new capabilities. So, it did become key. The reason I left it as yellow is because I had made a prediction that Collibra would go IPO, and it did not. And I don't think anyone is going IPO right now. The market is really, really down, the funding in VC IPO market. But other than that, data governance had a banner year in 2022. >> Yeah. Well, thank you for that. And of course, you saw data clean rooms being announced at AWS re:Invent, so more evidence. And I like how the fact that you included in your predictions some things that were binary, so you dinged yourself there. So, good job. Okay, Tony Baer, you're up next. Data mesh hits reality check. 
As you see here, you've given yourself a bright green thumbs up. (Tony laughing) Okay. Let's hear why you feel that was the case. What do you mean by reality check? >> Okay. Thanks, Dave, for having us back again. This is something I just wrote and just tried to get away from, and this just a topic just won't go away. I did speak with a number of folks, early adopters and non-adopters during the year. And I did find that basically that it pretty much validated what I was expecting, which was that there was a lot more, this has now become a front burner issue. And if I had any doubt in my mind, the evidence I would point to is what was originally intended to be a throwaway post on LinkedIn, which I just quickly scribbled down the night before leaving for re:Invent. I was packing at the time, and for some reason, I was doing Google search on data mesh. And I happened to have tripped across this ridiculous article, I will not say where, because it doesn't deserve any publicity, about the eight (Dave laughing) best data mesh software companies of 2022. (Tony laughing) One of my predictions was that you'd see data mesh washing. And I just quickly just hopped on that maybe three sentences and wrote it at about a couple minutes saying this is hogwash, essentially. (laughs) And that just reun... And then, I left for re:Invent. And the next night, when I got into my Vegas hotel room, I clicked on my computer. I saw a 15,000 hits on that post, which was the most hits of any single post I put all year. And the responses were wildly pro and con. So, it pretty much validates my expectation in that data mesh really did hit a lot more scrutiny over this past year. >> Yeah, thank you for that. I remember that article. I remember rolling my eyes when I saw it, and then I recently, (Tony laughing) I talked to Walmart and they actually invoked Martin Fowler and they said that they're working through their data mesh. 
So, it takes a really lot of thought, and it really, as we've talked about, is really as much an organizational construct. You're not buying data mesh >> Bingo. >> to your point. Okay. Thank you, Tony. Carl Olofson, here we go. You've graded yourself a yellow in the prediction of graph databases. Take off. Please elaborate. >> Yeah, sure. So, I realized in looking at the prediction that it seemed to imply that graph databases could be a major factor in the data world in 2022, which obviously didn't become the case. It was an error on my part in that I should have said it in the right context. It's really a three to five-year time period that graph databases will really become significant, because they still need accepted methodologies that can be applied in a business context as well as proper tools in order for people to be able to use them seriously. But I stand by the idea that it is taking off, because for one thing, Neo4j, which is the leading independent graph database provider, had a very good year. And also, we're seeing interesting developments in terms of things like AWS with Neptune and with Oracle providing graph support in Oracle database this past year. Those things are, as I said, growing gradually. There are other companies like TigerGraph and so forth, that deserve watching as well. But as far as becoming mainstream, it's going to be a few years before we get all the elements together to make that happen. Like any new technology, you have to create an environment in which ordinary people without a whole ton of technical training can actually apply the technology to solve business problems. >> Yeah, thank you for that. These specialized databases, graph databases, time series databases, you see them embedded into mainstream data platforms, but there's a place for these specialized databases, I would suspect we're going to see new types of databases emerge with all this cloud sprawl that we have and maybe to the edge. 
>> Well, part of it is that it's not as specialized as you might think it. You can apply graphs to great many workloads and use cases. It's just that people have yet to fully explore and discover what those are. >> Yeah. >> And so, it's going to be a process. (laughs) >> All right, Dave Menninger, streaming data permeates the landscape. You gave yourself a yellow. Why? >> Well, I couldn't think of a appropriate combination of yellow and green. Maybe I should have used chartreuse, (Dave laughing) but I was probably a little hard on myself making it yellow. This is another type of specialized data processing like Carl was talking about graph databases is a stream processing, and nearly every data platform offers streaming capabilities now. Often, it's based on Kafka. If you look at Confluent, their revenues have grown at more than 50%, continue to grow at more than 50% a year. They're expected to do more than half a billion dollars in revenue this year. But the thing that hasn't happened yet, and to be honest, they didn't necessarily expect it to happen in one year, is that streaming hasn't become the default way in which we deal with data. It's still a sidecar to data at rest. And I do expect that we'll continue to see streaming become more and more mainstream. I do expect perhaps in the five-year timeframe that we will first deal with data as streaming and then at rest, but the worlds are starting to merge. And we even see some vendors bringing products to market, such as K2View, Hazelcast, and RisingWave Labs. So, in addition to all those core data platform vendors adding these capabilities, there are new vendors approaching this market as well. >> I like the tough grading system, and it's not trivial. And when you talk to practitioners doing this stuff, there's still some complications in the data pipeline. And so, but I think, you're right, it probably was a yellow plus. Doug Henschen, data lakehouses will emerge as dominant. 
When you talk to people about lakehouses, practitioners, they all use that term. They certainly use the term data lake, but now, they're using lakehouse more and more. What are your thoughts here? Why the green? What's your evidence there? >> Well, I think I was accurate. I spoke about it specifically as something that vendors would be pursuing. And we saw yet more lakehouse advocacy in 2022. Google introduced its BigLake service alongside BigQuery. Salesforce introduced Genie, which is really a lakehouse architecture. And it was a safe prediction to say vendors are going to be pursuing this, in that AWS, Cloudera, Databricks, Microsoft, Oracle, SAP, Salesforce now, IBM, all advocate this idea of a single platform for all of your data. Now, the trend was also supported in that we saw a big embrace of Apache Iceberg in 2022. That's a structured table format. It's used with these lakehouse platforms. It's open, so it ensures portability and it also ensures performance. And that's a structured table that helps with the warehouse side performance. But among those announcements, Snowflake, Google, Cloudera, SAP, Salesforce, IBM, all embraced Iceberg. But keep in mind, again, I'm talking about this as something that vendors are pursuing as their approach. So, they're advocating it to end users. It's very cutting edge. I'd say the top, leading edge, 5% of companies have really embraced the lakehouse. I think we're now seeing the fast followers, the next 20 to 25% of firms embracing this idea and embracing a lakehouse architecture. I recall Christian Kleinerman at the big Snowflake event last summer, making the announcement about Iceberg, and he asked for a show of hands: for any of you in the audience at the keynote, have you heard of Iceberg? And just a smattering of hands went up. So, the vendors are ahead of the curve. They're pushing this trend, and we're now seeing a little bit more mainstream uptake. >> Good. Doug, I was there.
It was you, me, and I think, two other hands were up. That was just humorous. (Doug laughing) All right, well, so I liked the fact that we had some yellow and some green. When you think about these things, there's the prediction itself. Did it come true or not? There are the sub predictions that you guys make, and of course, the degree of difficulty. So, thank you for that open assessment. All right, let's get into the 2023 predictions. Let's bring up the predictions. Sanjeev, you're going first. You've got a prediction around unified metadata. What's the prediction, please? >> So, my prediction is that metadata space is currently a mess. It needs to get unified. There are too many use cases of metadata, which are being addressed by disparate systems. For example, data quality has become really big in the last couple of years, data observability, the whole catalog space is actually, people don't like to use the word data catalog anymore, because data catalog sounds like it's a catalog, a museum, if you may, of metadata that you go and admire. So, what I'm saying is that in 2023, we will see that metadata will become the driving force behind things like data ops, things like orchestration of tasks using metadata, not rules. Not saying that if this fails, then do this, if this succeeds, go do that. But it's like getting to the metadata level, and then making a decision as to what to orchestrate, what to automate, how to do data quality check, data observability. So, this space is starting to gel, and I see there'll be more maturation in the metadata space. Even security privacy, some of these topics, which are handled separately. And I'm just talking about data security and data privacy. I'm not talking about infrastructure security. These also need to merge into a unified metadata management piece with some knowledge graph, semantic layer on top, so you can do analytics on it. So, it's no longer something that sits on the side, it's limited in its scope. 
It is actually the very engine, the very glue that is going to connect data producers and consumers. >> Great. Thank you for that. Doug. Doug Henschen, any thoughts on what Sanjeev just said? Do you agree? Do you disagree? >> Well, I agree with many aspects of what he says. I think, there's a huge opportunity for consolidation and streamlining of these as aspects of governance. Last year, Sanjeev, you said something like, we'll see more people using catalogs than BI. And I have to disagree. I don't think this is a category that's headed for mainstream adoption. It's a behind the scenes activity for the wonky few, or better yet, companies want machine learning and automation to take care of these messy details. We've seen these waves of management technologies, some of the latest data observability, customer data platform, but they failed to sweep away all the earlier investments in data quality and master data management. So, yes, I hope the latest tech offers, glimmers that there's going to be a better, cleaner way of addressing these things. But to my mind, the business leaders, including the CIO, only want to spend as much time and effort and money and resources on these sorts of things to avoid getting breached, ending up in headlines, getting fired or going to jail. So, vendors bring on the ML and AI smarts and the automation of these sorts of activities. >> So, if I may say something, the reason why we have this dichotomy between data catalog and the BI vendors is because data catalogs are very soon, not going to be standalone products, in my opinion. They're going to get embedded. So, when you use a BI tool, you'll actually use the catalog to find out what is it that you want to do, whether you are looking for data or you're looking for an existing dashboard. So, the catalog becomes embedded into the BI tool. >> Hey, Dave Menninger, sometimes you have some data in your back pocket. Do you have any stats (chuckles) on this topic? 
>> No, I'm glad you asked, because I'm going to... Now, data catalogs are something that's interesting. Sanjeev made a statement that data catalogs are falling out of favor. I don't care what you call them. They're valuable to organizations. Our research shows that organizations that have adequate data catalog technologies are three times more likely to express satisfaction with their analytics for just the reasons that Sanjeev was talking about. You can find what you want, you know you're getting the right information, you know whether or not it's trusted. So, those are good things. So, we expect to see the capabilities, whether it's embedded or separate. We expect to see those capabilities continue to permeate the market. >> And a lot of those catalogs are driven now by machine learning and things. So, they're learning from those patterns of usage by people when people use the data. (airy laughs) >> All right. Okay. Thank you, guys. All right. Let's move on to the next one. Tony Baer, let's bring up the predictions. You got something in here about the modern data stack. We need to rethink it. Is the modern data stack getting long in the tooth? Is it not so modern anymore? >> I think, in a way, it's got almost too modern. It's gotten too, I don't know if it's being long in the tooth, but it is getting long. The modern data stack has traditionally been defined as basically you have the data platform, which would be the operational database and the data warehouse. And in between, you have all the tools that are necessary to essentially get that data from the operational realm, or the streaming realm for that matter, into basically the data warehouse, or as we might be seeing more and more, the data lakehouse. And I think what's important here is that we have seen a lot of progress, and this would be in the cloud, with the SaaS services.
And especially you see that in the modern data stack, which is like all these players, not just the MongoDBs or the Oracles or the Amazons, have their database platforms. You see the Informaticas, the Fivetrans, and all the other players there have their own SaaS services. And within those SaaS services, you get a certain degree of simplicity, which is it takes all the housekeeping off the shoulders of the customers. That's a good thing. The problem is that what we're getting to unfortunately is what I would call lots of islands of simplicity, which means that it leaves it (Dave laughing) to the customer to have to integrate or put all that stuff together. It's a complex tool chain. And so, what we really need to think about here is that we have too many pieces. And going back to the discussion of catalogs, it's like we have so many catalogs out there, which one do we use? 'Cause chances are most organizations do not rely on a single catalog at this point. What I'm calling on all the data providers or all the SaaS service providers to do is to literally get it together and essentially make this modern data stack less of a stack, make it more of a blending of an end-to-end solution. And that can come in a number of different ways. Part of it is that data platform providers have been adding services that are adjacent. And there are some very good examples of this. We've seen progress over the past year or so. For instance, MongoDB integrating search. It's a very common, I guess, sort of tool that the applications that are developed on MongoDB use, so MongoDB then built it into the database rather than requiring an extra Elasticsearch or OpenSearch stack. Amazon just... AWS just did the zero-ETL, which is a first step towards simplifying the process of going from Aurora to Redshift. You've seen the same thing with Google, BigQuery integrating basically streaming pipelines. And you're seeing also a lot of movement in database machine learning.
So, there are some good moves in this direction. I expect to see more of them this year. Part of it's from basically the SaaS platforms adding some functionality. But I also see more importantly, because you're never going to get... This is like asking your data team and your developers, herding cats, to standardize on the same tool. In most organizations, that is not going to happen. So, take a look at the most popular combinations of tools and start to come up with some pre-built integrations and pre-built orchestrations, and offer some promotional pricing, maybe not quite two for, but in other words, get two products for the price of two services or for the price of one and a half. I see a lot of potential for this. And it's to me, if the class was to simplify things, this is the next logical step and I expect to see more of this here. >> Yeah, and you see in Oracle MySQL HeatWave yet another example of eliminating that ETL. Carl Olofson, today, if you think about the data stack and the application stack, they're largely separate. Do you have any thoughts on how that's going to play out? Does that play into this prediction? What do you think? >> Well, I think that the... I really like Tony's phrase, islands of simplification. It really says (Tony chuckles) what's going on here, which is that all these different vendors you ask about, about how these stacks work. All these different vendors have their own stack vision. And you can... One application group is going to use one, and another application group is going to use another. And some people will say, let's go to, like you go to an Informatica conference and they say, we should be the center of your universe, but you can't connect everything in your universe to Informatica, so you need to use other things. So, the challenge is how do we make those things work together? As Tony has said, and I totally agree, we're never going to get to the point where people standardize on one organizing system.
So, the alternative is to have metadata that can be shared amongst those systems and protocols that allow those systems to coordinate their operations. This is standard stuff. It's not easy. But the motive for the vendors is that they can become more active critical players in the enterprise. And of course, the motive for the customer is that things will run better and more completely. So, I've been looking at this in terms of two kinds of metadata. One is the meaning metadata, which says what data can be put together. The other is the operational metadata, which says basically where did it come from? Who created it? What's its current state? What's the security level? Et cetera, et cetera, et cetera. The good news is the operational stuff can actually be done automatically, whereas the meaning stuff requires some human intervention. And as we've already heard from, was it Doug, I think, people are disinclined to put a lot of definition into meaning metadata. So, that may be the harder one, but coordination is key. This problem has been with us forever, but with the addition of new data sources, with streaming data with data in different formats, the whole thing has, it's been like what a customer of mine used to say, "I understand your product can make my system run faster, but right now I just feel I'm putting my problems on roller skates. (chuckles) I don't need that to accelerate what's already not working." >> Excellent. Okay, Carl, let's stay with you. I remember in the early days of the big data movement, Hadoop movement, NoSQL was the big thing. And I remember Amr Awadallah said to us in theCUBE that SQL is the killer app for big data. So, your prediction here, if we bring that up is SQL is back. Please elaborate. >> Yeah. So, of course, some people would say, well, it never left. 
Actually, that's probably closer to true, but in the perception of the marketplace, there's been all this noise about alternative ways of storing, retrieving data, whether it's in key value stores or document databases and so forth. We're getting a lot of messaging that for a while had persuaded people that, oh, we're not going to do analytics in SQL anymore. We're going to use Spark for everything, except that only a handful of people know how to use Spark. Oh, well, that's a problem. Well, how about, and for ordinary conventional business analytics, Spark is like an over-engineered solution to the problem. SQL works just great. What's happened in the past couple years, and what's going to continue to happen is that SQL is insinuating itself into everything we're seeing. We're seeing all the major data lake providers offering SQL support, whether it's Databricks or... And of course, Snowflake is loving this, because that is what they do, and their success certainly points to the success of SQL, even MongoDB. And we were all, I think, at the MongoDB conference where on one day, we hear SQL is dead. They're not teaching SQL in schools anymore, and this kind of thing. And then, a couple days later at the same conference, they announced we're adding a new analytic capability based on SQL. But didn't you just say SQL is dead? So, the reality is that SQL is better understood than most other methods of, certainly of retrieving and finding data in a data collection, no matter whether it happens to be relational or non-relational. And even in systems that are very non-relational, such as graph and document databases, their query languages are being built or extended to resemble SQL, because SQL is something people understand. >> Now, you remember when we were in high school and you had to take the... You're debating in the class and you were forced to take one side and defend it. 
So, I was at a Vertica conference one time up on stage with Curt Monash, and I had to take the NoSQL, the world is changing, paradigm shift side. And so just to be controversial, I said to him, Curt Monash, I said, who really needs ACID compliance anyway? Tony Baer. And so, (chuckles) of course, his head exploded, but what are your thoughts (guests laughing) on all this? >> Well, my first thought is congratulations, Dave, for surviving being up on stage with Curt Monash. >> Amen. (group laughing) >> I definitely would concur with Carl. We actually are definitely seeing a SQL renaissance and if there's any proof of the pudding here, I see lakehouse as being the icing on the cake. As Doug had predicted last year, now, (clears throat) for the record, I think, Doug was about a year ahead of time in his predictions that this year is really the year that I see (clears throat) the lakehouse ecosystems really firming up. You saw the first shots last year. But anyway, on this, data lakes will not go away. I've actually, I'm on the home stretch of doing a market, a landscape on the lakehouse. And lakehouse will not replace data lakes in terms of that. There is the need for those data scientists who do know Python, who know Spark, to go in there and basically do their thing without all the restrictions or the constraints of a pre-built, pre-designed table structure. I get that. Same thing for developing models. But on the other hand, there is huge need. Basically, (clears throat) maybe MongoDB was saying that we're not teaching SQL anymore. Well, maybe we have an oversupply of SQL developers. Well, I'm being facetious there, but there is a huge skills base in SQL. Analytics have been built on SQL. Then came the lakehouse, and why this really helps to fuel a SQL revival is that the core need in the data lake, what brought on the lakehouse was not so much SQL, it was a need for ACID. And what was the best way to do it? It was through a relational table structure. 
So, the whole idea of ACID in the lakehouse was not to turn it into a transaction database, but to make the data trusted, secure, and more granularly governed, where you could govern down to column and row level, which you really could not do in a data lake or a file system. So, while lakehouse can be queried in a manner, you can go in there with Python or whatever, it's built on a relational table structure. And so, for that end, for those types of data lakes, it becomes the end state. You cannot bypass that table structure as I learned the hard way during my research. So, the bottom line I'd say here is that lakehouse is proof that we're starting to see the revenge of the SQL nerds. (Dave chuckles) >> Excellent. Okay, let's bring back up the predictions. Dave Menninger, this one's really thought-provoking and interesting. We're hearing things like data as code, new data applications, machines actually generating plans with no human involvement. And your prediction is the definition of data is expanding. What do you mean by that? >> So, I think, for too long, we've thought about data as the, I would say facts that we collect, the readings off of devices and things like that, but data on its own is really insufficient. Organizations need to manipulate that data and examine derivatives of the data to really understand what's happening in their organization, why has it happened, and to project what might happen in the future. And my comment is that these data derivatives need to be supported and managed just like the data needs to be managed. We can't treat this as entirely separate. Think about all the governance discussions we've had. Think about the metadata discussions we've had. If you separate these things, now you've got more moving parts. We're talking about simplicity and simplifying the stack. So, if these things are treated separately, it creates much more complexity. 
I also think it creates a little bit of a myopic view on the part of the IT organizations that are acquiring these technologies. They need to think more broadly. So, for instance, metrics. Metric stores are becoming much more common part of the tooling that's part of a data platform. Similarly, feature stores are gaining traction. So, those are designed to promote the reuse and consistency across the AI and ML initiatives. The elements that are used in developing an AI or ML model. And let me go back to metrics and just clarify what I mean by that. So, any type of formula involving the data points. I'm distinguishing metrics from features that are used in AI and ML models. And the data platforms themselves are increasingly managing the models as an element of data. So, just like figuring out how to calculate a metric. Well, if you're going to have the features associated with an AI and ML model, you probably need to be managing the model that's associated with those features. The other element where I see expansion is around external data. Organizations for decades have been focused on the data that they generate within their own organization. We see more and more of these platforms acquiring and publishing data to external third-party sources, whether they're within some sort of a partner ecosystem or whether it's a commercial distribution of that information. And our research shows that when organizations use external data, they derive even more benefits from the various analyses that they're conducting. And the last great frontier in my opinion on this expanding world of data is the world of driver-based planning. Very few of the major data platform providers provide these capabilities today. These are the types of things you would do in a spreadsheet. And we all know the issues associated with spreadsheets. They're hard to govern, they're error-prone. 
And so, if we can take that type of analysis, collecting the occupancy of a rental property, the projected rise in rental rates, the fluctuations perhaps in occupancy, the interest rates associated with financing that property, we can project forward. And that's a very common thing to do. What the income might look like from that property, the income, the expenses, we can plan and purchase things appropriately. So, I think, we need this broader purview and I'm beginning to see some of those things happen. And the evidence today I would say, is more focused around the metric stores and the feature stores starting to see vendors offer those capabilities. And we're starting to see the ML ops elements of managing the AI and ML models find their way closer to the data platforms as well. >> Very interesting. When I hear metrics, I think of KPIs, I think of data apps, orchestrate people and places and things to optimize around a set of KPIs. It sounds like a metadata challenge more... Somebody once predicted they'll have more metadata than data. Carl, what are your thoughts on this prediction? >> Yeah, I think that what Dave is describing as data derivatives is in a way, another word for what I was calling operational metadata, which is not about the data itself, but how it's used, where it came from, what the rules are governing it, and that kind of thing. If you have a rich enough set of those things, then not only can you do a model of how well your vacation property rental may do in terms of income, but also how well your application that's measuring that is doing for you. In other words, how many times have I used it, how much data have I used and what is the relationship between the data that I've used and the benefits that I've derived from using it? Well, we don't have ways of doing that. What's interesting to me is that folks in the content world are way ahead of us here, because they have always tracked their content using these kinds of attributes. 
Where did it come from? When was it created, when was it modified? Who modified it? And so on and so forth. We need to do more of that with the structured data that we have, so that we can track how it's used. And also, it tells us how well we're doing with it. Is it really benefiting us? Are we being efficient? Are there improvements in processes that we need to consider? Because maybe data gets created and then it isn't used or it gets used, but it gets altered in some way that actually misleads people. (laughs) So, we need the mechanisms to be able to do that. So, I would say that that's... And I'd say that it's true that we need that stuff. I think, that starting to expand is probably the right way to put it. It's going to be expanding for some time. I think, we're still a distance from having all that stuff really working together. >> Maybe we should say it's gestating. (Dave and Carl laughing) >> Sorry, if I may- >> Sanjeev, yeah, I was going to say this... Sanjeev, please comment. This sounds to me like it supports Zhamak Dehghani's principles, but please. >> Absolutely. So, whether we call it data mesh or not, I'm not getting into that conversation, (Dave chuckles) but data (audio breaking) (Tony laughing) everything that I'm hearing what Dave is saying, Carl, this is the year when data products will start to take off. I'm not saying they'll become mainstream. They may take a couple of years to become so, but this is data products, all this thing about vacation rentals and how is it doing, that data is coming from different sources. I'm packaging it into our data product. And to Carl's point, there's a whole operational metadata associated with it. The idea is for organizations to see things like developer productivity, how many releases am I doing of this? What data products are most popular? 
I'm actually right now in the process of formulating this concept that just like we had data catalogs, we are very soon going to be requiring a data products catalog. So, I can discover these data products. I'm not just creating data products left, right, and center. I need to know, do they already exist? What is the usage? If no one is using a data product, maybe I want to retire it and save cost. But this is a data product. Now, there's an associated thing that is also getting debated quite a bit called data contracts. And a data contract to me is literally just formalization of all these aspects of a product. How do you use it? What is the SLA on it, what is the quality that I am prescribing? So, data product, in my opinion, shifts the conversation to the consumers or to the business people. Up to this point when, Dave, you're talking about data and all of data discovery, curation, it is very data producer-centric. So, I think, we'll see a shift more into the consumer space. >> Yeah. Dave, can I just jump in there just very quickly there, which is that what Sanjeev has been saying there, this is really central to what Zhamak has been talking about. It's basically about making, one, data products are about the lifecycle management of data. Metadata is just elemental to that. And essentially, one of the things that she calls for is making data products discoverable. That's exactly what Sanjeev was talking about. >> By the way, did everyone just notice how Sanjeev just snuck in another prediction there? So, we've got- >> Yeah. (group laughing) >> But you- >> Can we also say that he snuck in, I think, the term that we'll remember today, which is metadata museums. >> Yeah, but- >> Yeah. >> And also comment to, Tony, to your last year's prediction, you're really talking about it's not something that you're going to buy from a vendor. >> No. >> It's very specific >> Mm-hmm. >> to an organization, their own data product. So, touche on that one. Okay, last prediction. 
Let's bring them up. Doug Henschen, BI analytics is headed to embedding. What does that mean? >> Well, we all know that conventional BI dashboarding reporting is really commoditized from a vendor perspective. It never enjoyed truly mainstream adoption. Always that 25% of employees are really using these things. I'm seeing rising interest in embedding concise analytics at the point of decision or better still, using analytics as triggers for automation and workflows, and not even necessitating human interaction with visualizations, for example, if we have confidence in the analytics. So, leading companies are pushing for next generation applications, part of this low-code, no-code movement we've seen. And they want to build that decision support right into the app. So, the analytic is right there. Leading enterprise apps vendors, Salesforce, SAP, Microsoft, Oracle, they're all building smart apps with the analytics predictions, even recommendations built into these applications. And I think, the progressive BI analytics vendors are supporting this idea of driving insight to action, not necessarily necessitating humans interacting with it if there's confidence. So, we want prediction, we want embedding, we want automation. This low-code, no-code development movement is very important to bringing the analytics to where people are doing their work. We got to move beyond the, what I call swivel chair integration, between where people do their work and going off to separate reports and dashboards, and having to interpret and analyze before you can go back and take action. >> And Dave Menninger, today, if you want analytics or you want to absorb what's happening in the business, you typically got to go ask an expert, and then wait. So, what are your thoughts on Doug's prediction? >> I'm in total agreement with Doug. I'm going to say that collectively... So, how did we get here? I'm going to say collectively as an industry, we made a mistake. 
We made BI and analytics separate from the operational systems. Now, okay, it wasn't really a mistake. We were limited by the technology available at the time. Decades ago, we had to separate these two systems, so that the analytics didn't impact the operations. You don't want the operations preventing you from being able to do a transaction. But we've gone beyond that now. We can bring these two systems and worlds together and organizations recognize that need to change. As Doug said, the majority of the workforce and the majority of organizations doesn't have access to analytics. That's wrong. (chuckles) We've got to change that. And one of the ways that's going to change is with embedded analytics. 2/3 of organizations recognize that embedded analytics are important and it even ranks higher in importance than AI and ML in those organizations. So, it's interesting. This is a really important topic to the organizations that are consuming these technologies. The good news is it works. Organizations that have embraced embedded analytics are more comfortable with self-service than those that have not, as opposed to turning somebody loose, in the wild with the data. They're given a guided path to the data. And the research shows that 65% of organizations that have adopted embedded analytics are comfortable with self-service compared with just 40% of organizations that are turning people loose in an ad hoc way with the data. So, totally behind Doug's predictions. >> Can I just break in with something here, a comment on what Dave said about what Doug said, which (laughs) is that I totally agree with what you said about embedded analytics. And at IDC, we made a prediction in our future intelligence, future of intelligence service three years ago that this was going to happen. And the thing that we're waiting for is for developers to build... You have to write the applications to work that way. It just doesn't happen automagically. 
Developers have to write applications that reference analytic data and apply it while they're running. And that could involve simple things like complex queries against the live data, which is through something that I've been calling analytic transaction processing. Or it could be through something more sophisticated that involves AI operations as Doug has been suggesting, where the result is enacted pretty much automatically unless the scores are too low and you need to have a human being look at it. So, I think that that is definitely something we've been watching for. I'm not sure how soon it will come, because it seems to take a long time for people to change their thinking. But I think, as Dave was saying, once they do and they apply these principles in their application development, the rewards are great. >> Yeah, this is very much, I would say, very consistent with what we were talking about, I was talking about before, about basically rethinking the modern data stack and going into more of an end-to-end solution. I think, that what we're talking about clearly here is operational analytics. There'll still be a need for your data scientists to go offline just in their data lakes to do all that very exploratory and that deep modeling. But clearly, it just makes sense to bring operational analytics into where people work into their workspace and further flatten that modern data stack. >> But with all this metadata and all this intelligence, we're talking about injecting AI into applications, it does seem like we're entering a new era of not only data, but new era of apps. Today, most applications are about filling forms out or codifying processes and require a human input. 
And it seems like there's enough data now and enough intelligence in the system that the system can actually pull data from, whether it's the transaction system, e-commerce, the supply chain, ERP, and actually do something with that data without human involvement, present it to humans. Do you guys see this as a new frontier? >> I think, that's certainly- >> Very much so, but it's going to take a while, as Carl said. You have to design it, you have to get the prediction into the system, you have to get the analytics at the point of decision, and it has to be relevant to that decision point. >> And I also recall basically a lot of the ERP vendors back like 10 years ago, were promising that. And the fact that we're still looking at the promises shows just how difficult, how much of a challenge it is to get to what Doug's saying. >> One element that could be applied in this case is (indistinct) architecture. If applications are developed that are event-driven rather than following the script or sequence that some programmer or designer had preconceived, then you'll have much more flexible applications. You can inject decisions at various points using this technology much more easily. It's a completely different way of writing applications. And it actually involves a lot more data, which is why we should all like it. (laughs) But in the end (Tony laughing) it's more stable, it's easier to manage, easier to maintain, and it's actually more efficient, which is the result of an MIT study from about 10 years ago, and still, we are not seeing this come to fruition in most business applications. >> And do you think it's going to require a new type of data platform database? Today, data's all far-flung. We see that's all over the clouds and at the edge. Today, you cache- >> We need a super cloud. >> You cache that data, you're throwing into memory. I mentioned MySQL HeatWave. 
There are other examples where it's a brute force approach, but maybe we need new ways of laying data out on disk and new database architectures, and just when we thought we had it all figured out. >> Well, without referring to disk, which to my mind, is almost like talking about cave painting. I think, that (Dave laughing) all the things that have been mentioned by all of us today are elements of what I'm talking about. In other words, the whole improvement of the data mesh, the improvement of metadata across the board and improvement of the ability to track data and judge its freshness the way we judge the freshness of a melon or something like that, to determine whether we can still use it. Is it still good? That kind of thing. Bringing together data from multiple sources dynamically and real-time requires all the things we've been talking about. All the predictions that we've talked about today add up to elements that can make this happen. >> Well, guys, it's always tremendous to get these wonderful minds together and get your insights, and I love how it shapes the outcome here of the predictions, and let's see how we did. We're going to leave it there. I want to thank Sanjeev, Tony, Carl, David, and Doug. Really appreciate the collaboration and thought that you guys put into these sessions. Really, thank you. >> Thank you. >> Thanks, Dave. >> Thank you for having us. >> Thanks. >> Thank you. >> All right, this is Dave Vellante for theCUBE, signing off for now. Follow these guys on social media. Look for coverage on siliconangle.com, theCUBE.net. Thank you for watching. (upbeat music)

Published Date : Jan 11 2023



AWS re:Invent Show Wrap | AWS re:Invent 2022

>> Welcome back to re:Invent 2022. We're wrapping up four days, well, one evening and three solid days, wall-to-wall, of CUBE coverage. I'm Dave Vellante. John Furrier's birthday is today. He's on a plane to London to go see his nephew get married. His great sister Janet, awesome family, the Furriers, spanning the globe. And John, I know you wanted to be here. You're watching in Newark, or you were waiting to get on the plane, so all the best to you. Happy birthday. One year the Amazon PR people brought a cake out to celebrate John's birthday, because he's always here at AWS re:Invent on his birthday. So, I'm really pleased to have two really special guests: former CUBE host, CUBE alum, great Wikibon contributor, Stu Miniman, now with Red Hat. Stu, good to see you again. >> Great to be here, Dave. Yeah, I was here for that cake. The Twitterverse was really helping to celebrate John's birthday today, and always great to be here with you and with this awesome event this week. >> And friend of theCUBE, many-time CUBE guest, often CUBE contributor, here as a CUBE analyst this week, has his own consultancy, Sarbjeet Johal. Great to see you. Thanks for coming on. >> Good to see you, Dave. Great to see you, Stu. I'm always happy to participate in these discussions, and I enjoy the discussion every time. >> So, this is kind of cool, because usually the last day is a getaway day, and this is a getaway day, but this place is still packed. I mean, yeah, it's definitely lighter. You can at least walk and not get slammed. But Sarbjeet, I'm going to start with you. I wanted to have you at the tail end here, because you participated in the analyst sessions, you've been watching this event from the first moment, and now you've got four days of the Kool-Aid injection. But you're also talking to customers, developers, partners, the ecosystem. Where do you want to go? What are your big takeaways? >> I think the big takeaway is that Amazon's innovation machine is 
chugging along. They are. I was listening to some of the sessions when I was back in my room at nine. They're filling the holes in some areas, but in some areas they're moving forward. There's a lot to fix still. It doesn't seem like we are done with the cloud or that the innovation is done. Now we are building at the millisecond level. So, where do you go next? There's a lot of room to grow on the storage side, on the network side, the improvements we need, and also making sure that the software fits the hardware. There's specialized hardware for certain software. So, there was a lot of talk around that, and I attended some of those sessions where I asked questions around that. We have a specialized database for each kind of workload, specialized processors for each kind of workload. Yeah, the Graviton section. And actually, the one interesting thing, before I forget: I asked why there are so many databases, and about the egress costs and all that stuff. Are you guys thinking about reducing that? The answer was no, egress cost is not a big showstopper for many of the customers. But the takeaway from all that little discussion with the folks who build these products was that the plethora of choice is given to the customers to make them feel that there's no vendor lock-in. So, if you are using some open source software, it can be on the platform side or on the database side, you have that option at AWS. >> So, there's a lot there, because I always thought that AWS is the mother of all lock-ins, but it's got an ecosystem, and we're going to talk about... >> Exactly. >> We'll talk about, Stu, what's working within AWS when you talk to customers, and where are the challenges? >> Yeah, I've got a comment on open source, Dave, of course, because, I mean, 
look, we criticized Amazon for years about their lack of contribution. They've gotten better. They're doing more in open source. But is Amazon the mother of all lock-ins? Many times, absolutely. There are certain people inside Amazon, I'm saying, many of us talk cloud native, they're like, well, let's do Amazon native, which means your full stack is things from Amazon, and do things the way that we want to do things. And I talk to a lot of customers, they use more than one cloud, Dave, and therefore, for certain things, absolutely, I want to leverage the innovation that Amazon has brought. I do think we're past building all the main building blocks. In many ways, we are in day two. Yes, Amazon is fanatically customer-focused and will always stay that way, but there wasn't anything that jumped out at me last year or this year that was like, wow, new category, whole new way of thinking about something. Werner Vogels last year, Dave, said, we have over 200 services, and if we listened to you, the customer, we'd have over two thousand. His session this week actually got some great buzz from my friends in the serverless ecosystem. They love some of the things tying together. We're using data, the next flywheel that we're going to see for the next 10 years. Amazon's at the center of the cloud ecosystem in the IT world, so there's a lot of good things here. And to your point, Dave, the ecosystem. One of the things I always look at is, was there a booth where they're all going to be crying in their beer after Amazon made an announcement? There was not a tech vendor that I saw this week that was like, oh gosh, there was an announcement and all of a sudden our business is gone. Where I did hear some rumbling is Amazon might be the next GSI to really move forward, and we've seen all the GSIs pushing really deep into supporting cloud, bringing workloads to the cloud, and there's a little bit of rumbling as to that balance between what Amazon will do and their 
go-to-market. >>So a couple things. I think we all agree that a lot of the announcements here today were taping seams, as I call it. And as it relates to the mother of all lock-in, the reason why I say that, it's obviously very much a pejorative. Compare Oracle, a company you know really well, with Amazon's lock-in. Amazon's lock-in is about bringing this ecosystem together so that you actually have choice within the house, so you don't have to leave. You know, there's a lot to eat at the table. You look at Oracle's ecosystem, it's like, yeah, you know, Oracle is Oracle's ecosystem. So that is how I think they do lock in customers, by incenting them not to leave because there's so much choice. >>Dave, I agree with you a thousand percent. I mean, I'm here, I'm a good partner of AWS, and all of the partners here want to be successful with Amazon, and Amazon is open to that. It's not "our way or get out," which Oracle tries. How much do you extract from the overall IT budget? Are you a YouTube, where you give the people that help you create a large sum of the money? YouTube hasn't been all that profitable. Amazon, I think, is doing a good balance where the ecosystem makes money. We used to talk, Dave, about how many dollars VMware makes versus their ecosystem. I think Amazon is a much bigger VMware 2.0. >>We used to talk about it all the time, that for every dollar spent on VMware licenses, 15 or 12 or 20 were spent in the ecosystem. I would think the ratio is even higher here, Sarbjeet. And at Oracle, I would say it's, I don't know, 1 to 0.5 maybe. I don't know. >>But I want to pick up on your discussion about the ecosystem. The partner ecosystem is robust and strong because it's wider. I was not saying that there's no lock-in with Amazon, right? With AWS there's lock-in. There's lock-in with everything. There's lock-in with open source as well. But the point is that the circle is
so big you don't feel locked in. But they're playing smart as well. They're bringing in the software, the platforms from open source. They're picking up those packages and saying, we'll bring it in and cater that to you through AWS, make it better, make it perform better, and also throw in their custom chips on top of that: hey, this MySQL runs better here. So, like, what do you do? I said, oh, Oracle, because it's Oracle's product, if you will, right? So I think they're firing on all cylinders, from their go-to-market strategy, from their engineering, and they're listening to customers very closely. And that has side effects as well. Listening to customers creates a sprawl of services. They have so many services, and I criticized them last year for calling everything a new service. I said, don't call it a new service, it's a feature of an existing service. >>Sure, a lot of features. >>A lot of features. >>Is egress... are egress costs a real problem, or is it just the on-prem guys picking at the scab? I mean, what do you hear from customers? >>So, I mean, Dave, I look at what Corey Quinn talks about all the time, and Amazon's charges on that are more expensive than any of the other cloud providers. And partly because Amazon is, you know, probably not a word they'd use, they are dominant when it comes to the infrastructure space, and therefore they do want to make it a little bit harder to do that. They can get away with it. We've seen some of the cloud providers have special partnerships where you can actually leave and you're not going to be charged. And Amazon, they've been a little bit more flexible, but absolutely, I've heard customers say that they wish... >>Some good punning and tongue-in-cheek stuff. What else you got? Lay it on us. >>Okay. This year I think the focus was on the ops side. It's shifting gradually. This was more focused on the ops side. There was less talk of developers from the main stage, from
from all sorts of quadrants, if you will, from all the keynotes, right? So even Werner this morning, he had a little bit. His job is to rally up the builders, right? So he talks about "go build." AWS Pipes I thought was kind of cool. Then, like, making Glue easier, I thought that was good. I know some folks don't use that. I couldn't attend the whole session, but I heard parts in between. So it is really adopt or die. I've been pro-cloud for the last 10 years, and I think it's the best model for technology consumption, because of economies of scale, but more importantly because of division of labor, because of specialization. Because you can't afford to hire the best security people, the best Arm chip designers. Actually, I came up with a bumper sticker. You guys talked about bumper stickers. I came up with this in the last couple of weeks: innovation favors scale. They have scale, they have innovation. So that's where the innovation is. And again, they actually say the market sets the price. You as a customer don't set the price, the vendor doesn't set the price, the market sets the price. So if somebody's complaining about their margins or egress and all that, I think that's BS. I have a few more notes on the partner side, if you concur. >>Yeah, Dave, just coming back to some of this commentary about, like, can Amazon actually enable something we used to call community clouds. You hear companies like Goldman and NASDAQ and the like, where industries will actually be able to share data and expand the usage, and Amazon's going to help drive that API economy forward some. So it's good to see those things, because we all know that all of us together are smarter than any single company. So again, some of that's open source, but some of that is, you know, I think Amazon
is allowing innovation to thrive. >>I think the word you're looking for is supercloud. >>Well, yeah. I mean, Dave, if you want to go there with the supercloud... >>Because there's a metaphor for exactly what you described: NASDAQ, Goldman Sachs, and a number of other companies. And a few weeks ago, the Berkeley Sky Computing paper, you know, that's a form of supercloud. Dave Linthicum calls it metacloud. >>I'm not really careful. I mean, I go back to the challenge we've been working at for a decade, which is the distributed architecture. If you talk about AI architectures: what lives in the cloud, what lives at the edge, where do we train things, where do we do inference? Locations should matter a lot less. Amazon, I didn't hear a lot about it at this show, but when they came out with Local Zones and, oh my gosh, Outposts, all the things that Amazon is building to push out to the edge, and also enabling that technology and software, and the partner ecosystem helps expand that and pull it in. It's no longer, you know, Dave, it was Hotel California: all of the data eventually is going to end up in the public cloud and get locked in. I don't think that's going to be the case. We know that there will be so much data out at the edge. Amazon absolutely is super important there. Some of those examples we're giving, it's not necessarily multi-cloud, but there's collaboration happening. Like in the healthcare world, universities and hospitals can all share what they're doing, regardless of where they live. >>Well, Stephen Armstrong in the analyst session did say that, you know, we're going to talk about multi-cloud. We're not going to lead with it necessarily, but we are going to actually talk about it. And that's different, to your points too, than "in the fullness of time all the data will be in the cloud." That's a new narrative. But go ahead. >>Yeah, actually, Amazon is a leader in the cloud, so
if they push the cloud, even if they don't say AWS or Amazon with it, they benefit from it, right? And the narrative is that way, and the proof is there. So again, innovation favors scale. There are chips being made for high scale, there's software being tweaked for high scale. You, as a Bank of America or Ford or Chrysler, as a typical enterprise, cannot afford to do those things in-house. Cloud providers can. I'm not saying just AWS. Google Cloud is there, the Azure guys are there, and a few others who are behind them. And you guys are there as well. IBM, by the way, congratulations. Red Hat, I know, but IBM won the award. Very good partner. But yeah, people are dragging their feet. People usually do on change. They are in denial. They drag their feet and then they cave in. IBM dragged their feet, they caved in. Dell dragged their feet, they caved in. >>You mean, by dragging, cloud deniers? >>Cloud deniers, right. Server huggers, I call them. But they actually are sitting in the Amazon cloud marketplace. Everybody is buying stuff from there. The marketplace is the new model. Okay, Amazon created the marketplace for B2C. They are leading the marketplace for B2B as well, on the technology side, and other people are copying it. So there are multiple marketplaces now. It's like if you're in mobile app development, there are two main platforms, Android and Apple. You first write the application for Apple, then for Android next. Same here. As a technology provider, as an ISV, you put your stuff on AWS first, then you go anywhere else. >>Yeah, they are later. The enterprise app store is what we've wanted for a long time. The question is, is Amazon alone the enterprise app store, or are they a partner in a larger portfolio? Because there's a lot of SaaS companies out there that play into what we need. >>Well, you're talking about the future, but I just want to make a point about
the past. You're talking about dragging their feet, because the Cube's been following this. And Stu, you remember this: in 2013, IBM actually got in a big fight with Amazon over the CIA deal, and it all became public. Judge Wheeler eviscerated IBM, and IBM ended up buying SoftLayer, and then we know what happened there. And Joe Tucci thought the cloud was Mozy, right? So it's just amazing to see. We have "booksellers," you know, VMware called them booksellers, wasn't it? Now all of them are talking about how great their partnerships are. It's amazing. Like you said, Sarbjeet, IBM with the GSI Partnership of the Year. But what you guys were just talking about was the future, and that's what I wanted to get to, because Amazon's been leading the way. I was listening to Werner this morning, and that just reminded me of back in the days when we used to listen to IBM educate us, give us a master class on system design and decoupled systems and I/O and everything else. Now Amazon is the master educator. And it got me thinking, how long will that last? Will they go the way of the other incumbents? Will they be disrupted, or will they keep innovating? Maybe it's going to take 10 or 20 years, I don't know. >>Yeah, I mean, Dave, you actually did some research, I believe it was a year or so ago: what will stop Amazon? And the one thing that worries me a little bit is the two-pizza teams. When you have over 200 two-pizza teams, the amount of things that each one of those groups needs to take care of is more than any human could take care of. People burn out, they run out of people. How many Amazonians only last two or three years and then leave because it is tough? I've bumped into plenty of friends of mine that have been six, ten years at Amazon and love it, but it is a tough culture, and they are driving. Werner's keynote, I thought, did look to, from a product standpoint, you could say,
tape over some of the seams, some of those solutions to go beyond just a single product and bring them together and leverage data. So there are some signs that they might be able to get past some of those limitations. But I still worry that, structurally and culturally, there could be some challenges for Amazon to keep the momentum going, especially with the global economic impact that we are likely to see in the next year. >>Bring us home. >>I think on the future side, we could talk about the vendors all day, right? To serve the community out there, I think we should talk about what's the future of technology consumption from the consumer side. So from the supplier side, just a quick note: I think the only danger AWS has is that, you know, the feds are going after them. You know, too big, like, we will break you up. And that can cause some disruption there. Other than that, I think they have some more steam to go for a few more years at least, before we start thinking about, oh, this thing is falling apart, or anything like that. So they have momentum and it's continuing. >>I think Amazon retail, by the way, is going to get disrupted before AWS. Yeah, go ahead. >>From the buyer's side, I think the future of technology consumption is based on pay-per-use. They are actually turning all their services... they are sort of becoming serverless behind the scenes, right? Of all the analytics services, they had one service left, and they did that this year. So every service is serverless. That means you pay exactly for the amount you use: the compute, the IOPS, the storage, all these three layers. Of course, network, we talked about the egress stuff, and that's a problem there because of the network design, mainly because Google has a flatter design and they have lower cost. So they are designing their services in a way that you don't waste any resources as a buyer. So for example, a very simple
example: earlier in the cloud, you would get a VM, right? That's how we started. And you might use 20 percent of the VM, and 80 percent is getting wasted. That's not happening now. That has been reduced for the most part. So now your VM grows as you grow the usage, and if you go higher than the tier you picked, they will charge you; otherwise they will not charge you extra. That's why there are still a lot of instances, like many different types, and you have to pick one. I think the future is that those instances will go away. The instance will be formed for you on the fly. So that is the future: serverless. >>All right, give us your bumper sticker, Stu, and then Sarbjeet, and I'll give you my quick one and then we'll wrap. >>Yeah, so just, Dave, to play off of Sarbjeet and to wrap it up, you actually wrote about it in your preview post for here: serverless. We're talking about how developers think about things, and Amazon in many ways is the new default server for the cloud. And containerization fits into the whole serverless paradigm. It's the space that I live in every day here, and I was happy to see over the last few years, with serverless and containers, there's a blurring of the line. And Sarbjeet, we're still going to see VMs for a long time. >>Yeah, yeah, we will see that. >>So give us your bumper sticker. >>My number six is "innovation favors scale." That's my bumper sticker, and Amazon has that. But I also want everybody else, like the viewers, to take a look at Google Cloud as well as IBM and others. Maybe they have better price-to-performance there for certain workloads. And by the way, one vendor cannot do it alone. We know that for sure. The market is so big, there's a lot of room for the Red Hats of the world and the Microsofts of the world to innovate. So keep an eye on them. We need the competition, actually, and competition will keep us in a place where the market sets the price. One vendor
doesn't. So the only danger is if AWS is a monopoly. Then I will be worried. >>I think ecosystems are the hallmark of a great cloud company, and Amazon's got the biggest and baddest ecosystem. And I think the other thing to watch for is industries building on top of the cloud. You mentioned Goldman Sachs, NASDAQ, Capital One, and Warner Media. All these industries are building their own clouds, and that's where the real money is going to be made in the latter half of the 2020s. All right, we're a wrap. This is Dave Vellante. I want to first of all say thanks to our great sponsors: AWS for having us here, this is our 10th year; AMD, sponsoring the Cube here as well; Accenture, sponsoring a third set upstairs on the fifth floor; and all the ecosystem partners that came on the Cube this week and supported our mission for free content. Our content is always free. We try to give more to the community than we take back. So go to thecube.net and you'll see all these videos. Go to siliconangle.com for all the news, and wikibon.com, where I publish a weekly Breaking Analysis series. I want to thank our amazing crew here. You guys, we have probably 30, 35 people, unbelievable. Our awesome hosts: John Walls, Paul Gillin, Lisa Martin, Savannah Peterson, and John Furrier, who's on a plane. We appreciate Andrew and Leonard in our ear, and all of our crew in Palo Alto, Boston, and across the country. Thank you so much, really appreciate it. All right, we are a wrap at AWS re:Invent 2022. We'll see you in two weeks at Palo Alto Ignite, back here in Vegas. Thanks for watching the Cube, the leader in enterprise and emerging tech coverage. [Music]

Published Date : Dec 2 2022

Roland Lee & Hawn Nguyen Loughren | AWS re:Invent 2022 - Global Startup Program

>>Good afternoon everybody. I'm John Walls, and welcome back to our coverage here on the Cube of AWS re:Invent 22. We are bringing you another segment with the Global Startup Program, which is part of the AWS Startup Showcase, and it's a pleasure to welcome two new guests here to the showcase. First, immediately to my right, Hawn Nguyen Loughren. Good to see you, Hawn. >>Good to see you. >>The leader of Enterprise Solutions Architecture at AWS. And on the far right, Roland Lee, who is the co-founder and CEO of Heimdall Data. Roland, good to see you. >>Great to be here. >>All right, good. Thanks for joining us. Well, first off, for those at home who may not be familiar with Heimdall: what do you do? Why are you here? I'll let you take it from there. >>Well, we're one of the sponsors here at AWS, and it's great to be here. We offer a data access layer in the form of a proxy, and what it does is provide complete visibility and the capability to enhance the interaction between the application and one's current database. As a result, the customer will improve database scale, database security, and availability. And all these features don't require any application changes. So that's sort of our marketing pitch, if you will: all these types of features to improve the experience of managing a database without any application changes. >>And where does the cloud come into play for you, then? >>So we started out actually helping customers on premises, and a lot of enterprise customers are moving over to the cloud, so it was just a natural progression to do that. And so AWS, which is a key part of this, partners with us to help solve customer problems, especially on the database side, as application performance tends to have issues in the interaction between the application and the database, and we're solving that issue. >>Right. So Hawn, I mean, Roland just touched on it, about on-prem, right? 
There are still some kickers and screamers out there that haven't bought in, or they're about to, but you're about to get 'em, I'm sure. But talk about that conversion, or that transition, if you would, from going on-prem into a hybrid environment or into the bigger cloud environment, and how difficult it sometimes is to get people to make that kind of a leap. >>Well, I would say that a lot of customers want to focus more on product innovation and experimentation, and having to manage servers and patching takes away from the initiatives they're trying to pursue. So with AWS, we take on the undifferentiated heavy lifting so that they can focus on product innovation. And one of the areas, talking about Heimdall, is the database side: we provide Amazon RDS, which is a database service, and also Aurora, to give them that lift so they don't have to worry about patching servers and provisioning servers as well. >>Right. So Roland, can you get the idea across to people very simply: let us take care of the hard stuff, and that will free you up to do your product innovation, to do your experimentation, to really free up your team, basically, to do the fun stuff, and let us sweat over the details. Right? >>Exactly. Our motto is: why build when you can buy? A lot of it has to do with offering value in terms of price and features that are going to benefit a team. Large companies like amazon.com and Google have huge teams that can build data access layers and proxies. What we're trying to do here is commercialize those, because those are built in-house and not readily available for customers to use. You need some type of interface between the application and the database, and we provide that. So: why build when you can buy? >>Well, I was gonna say, why Heimdall, right? I mean, what's your special sauce? 
Because everybody's got something, obviously, a market differentiator that you're bringing into play here. You started to touch on it a little bit there for me, but dive a little deeper. I mean, what is it that you're bringing to the table with AWS that you think puts you above the crowd? >>Well, let me give you a use case. In typical events like, let's say, Black Friday, where there's a surge in traffic that can overwhelm the database, the Heimdall data access layer, the database proxy, provides an auto-scaling distributed architecture such that it can absorb those surges in traffic and help scale the database while keeping the data fresh and up to date. So basically, for traffic based on season or time of day, we can adjust automatically. And all these types of features that we offer, most notably automated query caching and read/write split for ACID compliance, don't require any code changes. Those typically require the application developer to make changes, so we're saving months, maybe years, of development and maintenance. >>Yeah, and a lot of gray hairs too, right? You're solving a lot of problems there. What about database trends just in general, Hawn, if you will? I mean, this is your space, right? We're hearing from Heimdall in terms of the solutions they're providing, but what are you seeing at the macro level in terms of what people are doing and thinking about the database and how it relates to the cloud? >>Some of the things that we're seeing: there's an explosion of data, relevant data that customers need to be able to consume and also process. With that explosion of data, we also see customers trying to modernize their applications through microservices, which does change the design patterns of the applications, what we call the data access patterns. 
So again, going back to that undifferentiated heavy lifting, we do have something called purpose-built databases, right? It's the right tool for the right purpose. So it depends on what their RPO and RTO are, and their data access pattern. Is it BASE? Is it ACID? We want to be able to provide them the options to build and also innovate. That's why we have Amazon RDS, we also have Redshift, we also have Aurora, et cetera. Redshift is more on the BI side, but usually when you ingest the data, you have some level of processing to get more insight. So that's why customers are moving more towards managed services, so that they get that lift and can then focus on product and innovation. >>Yeah. Have we caught up, or are we catching up, to this tsunami of data to begin with? Because, I mean, that was it, what, seven, eight years ago, when data was becoming king, with reams and reams of it, and, you know, you can't handle it, right? Are we now able to manage that process and manage that flow and get the right data into the right hands at the right time? Are we doing better with that? >>I would say that the amount of data that we're ingesting definitely has grown. And so with the scalability and agility of the cloud, we're able to, I would say, adapt to the rapid changes and ingestion of the data. That's why we have things like Aurora Serverless, to auto scale, so they can run MySQL or Postgres. And then, like what Heimdall is trying to do, they basically don't have to do any code changes. It would be a data migration. They still use the same underlying database and mechanisms, but here we're providing them at scale in the cloud. >>Yeah. Are proxies a must-have for all databases? I mean, is that essential these days? >>Well, good question, John. I would say yes. 
This is often built in-house, as I mentioned. Large companies do build some type of data access layer or proxy, or some utilize an ORM, an object-relational mapper, to do it. And again, what we're trying to do is put this out into the market, commercially speaking, so that it can be readily used by all customers rather than being built from scratch every time. >>You know, what I didn't ask you, Roland, was how does AWS come into play for you, then? In startup mode, the focus that they've had on startups in general, but on you in particular. I mean, talk about that partnership, that relationship, and the value that you're extracting from it. >>The AWS partnership has been absolutely wonderful. The collaboration... they have one of the best managed database services, with the value it adds in terms of durability and manageability. What Heimdall Data does is complement Amazon RDS and Amazon Redshift very well, in the sense that we're not replacing the database. What we're doing is allowing the customer to get the most out of the managed database service, whether it be Redshift, Aurora Serverless, or RDS, all without code changes. The analogy that I would give, John, is a car. A race car may be very fast, but it takes a driver to get to those fast speeds. We're the driver. The Heimdall proxy provides that intelligence so that you can get the most out of that database engine. >>And Hawn, if you would then touch on, first off, AWS and the emphasis that you have put on startups. You're obviously kind of putting your money where your mouth is, right, with the way you've encouraged and nurtured that environment. And then about Heimdall in general, where you see this going, or where you want to take this in the next, say, 12 months, 18 months. 
>>I think it's more of a better-together story of how we can basically co-sell with our partners, right? And basically focusing on helping our customers drive that innovation and collaboration. With Heimdall as an independent software vendor, an ISV, customers can leverage that through the Marketplace, where it integrates very nicely with AWS. So that gives them that lift, and it goes back to the undifferentiated heavy lifting on the Heimdall proxy side, if you will, because then you have this proxy in the middle that helps them with their SQL performance. I've seen use cases where customers have some legacy system and may not have time to modernize the application, so they use this as a lift to keep going as they try to modernize. But I've also seen customers who are trying to use it as a way to get that performance lift because they have third-party software whose code they cannot change. Putting this in there helps optimize their lines of business, whatever that is, maybe an online store or whatever. So I would say it's a better-together type of story. >>Yeah, there's gotta be a song in there somewhere. So peek around the corner, if you would, be the headlights here in terms of 12, 18 months. I mean, what's next to solve, right? You've slayed a few dragons along the way, but there are others, I'm sure, as always happens with innovation in this space. Just when you solve a problem, you have to deal with others that pop up, maybe as unintended consequences, or at least a new challenge. So what would that be in your world right now? What do you see occupying your sleepless nights for the next year or so? >>Well, for Heimdall Data, it's all about improving database performance and scale. And those workloads change. We have OLTP, we have OLAP, and with artificial intelligence, ML. 
There are different types of traffic profiles, and we're focused on improving those data profiles. It could be unstructured or structured. Right now we're focused on structured data, which is relational databases, but there's a lot of opportunity to improve the performance of data. >>Well, you're driving the car, you've got a good navigator, and I think the GPS is working. So keep up the good work, and thank you for sharing the time today. >>Thank you. >>Thank you, John. >>Do appreciate it. All right, you are watching the Cube. We continue our coverage here from AWS re:Invent 22. The Cube, of course, the leader in high tech coverage.

Published Date : Nov 30 2022

ENTITIES

Entity	Category	Confidence
John Walls	PERSON	0.99+
AWS	ORGANIZATION	0.99+
John	PERSON	0.99+
Hyundai	ORGANIZATION	0.99+
Rolin Lee	PERSON	0.99+
Google	ORGANIZATION	0.99+
12	QUANTITY	0.99+
Roland	PERSON	0.99+
Heim Doll Data	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Heim Doll	ORGANIZATION	0.99+
Sohan	PERSON	0.99+
Roan	PERSON	0.99+
First	QUANTITY	0.99+
Roy	PERSON	0.99+
Black Friday	EVENT	0.99+
18 months	QUANTITY	0.99+
MySQL	TITLE	0.99+
Heim	ORGANIZATION	0.99+
today	DATE	0.98+
amazon.com	ORGANIZATION	0.98+
first	QUANTITY	0.98+
next year	DATE	0.97+
seven	DATE	0.97+
Hawn Nguyen Loughren	PERSON	0.97+
two new guests	QUANTITY	0.97+
SQL	TITLE	0.96+
Roland Lee	PERSON	0.96+
12 months	QUANTITY	0.96+
one	QUANTITY	0.95+
Han	PERSON	0.94+
aws	ORGANIZATION	0.94+
Rediff	ORGANIZATION	0.89+
OLA	ORGANIZATION	0.89+
Hein	ORGANIZATION	0.85+
OnPrem	ORGANIZATION	0.83+
Hfi	ORGANIZATION	0.82+
Reinvent 22	COMMERCIAL_ITEM	0.81+
eight years ago	DATE	0.79+
Redshift	TITLE	0.79+
Redshift	ORGANIZATION	0.76+
Heim doll	ORGANIZATION	0.73+
22	TITLE	0.72+
Aurora	ORGANIZATION	0.71+
Postgres	TITLE	0.66+
Global Startup Program	TITLE	0.66+
Start Showcase	EVENT	0.62+
Heindel	PERSON	0.59+
Aurora Serverless	TITLE	0.57+
Invent 2022	TITLE	0.49+
Global Startup Program	OTHER	0.47+
Hunt	PERSON	0.41+
ReadWrite	ORGANIZATION	0.4+
Reinvent	COMMERCIAL_ITEM	0.36+

Evan Kaplan, InfluxData | AWS re:invent 2022

>>Hey everyone. Welcome to Las Vegas. The Cube is here, live at the Venetian Expo Center for AWS re:Invent 2022. Amazing attendance. This is day one of our coverage. Lisa Martin here with Dave Vellante. Dave, it's great to see so many people back. We've been having great conversations already. We have wall-to-wall coverage for the next three and a half days. When we talk to companies, customers, every company has to be a data company. And one of the things I think we learned in the pandemic is that access to real-time data and real-time analytics is no longer a nice-to-have; it is a differentiator and a competitive... >>All about data. I mean, you know, I love the topic, and it's got so many dimensions and such texture. Can't get enough of data. >>I know. We have a great guest joining us. One of our alumni is back: Evan Kaplan, the CEO of InfluxData. Evan, thank you so much for joining us. Welcome back to the Cube. >>Thanks for having me. It's great to be here. >>So here we are, day one. I was telling you before we went live, we're nice and fresh hosts. Talk to us about what's new at InfluxData since the last time we saw you at re:Invent. >>That's great. So first of all, we should acknowledge what's going on here. This is pretty exciting. I know there was a show last year, but this feels like the first post-Covid show. A lot of energy, a lot of attention, despite a difficult economy. You guys were commenting in the lead-in on big data. I think, you know, if we were to talk about big data five, six years ago, what would we be talking about? We'd be talking about Hadoop, we'd be talking about Cloudera, we'd be talking about Hortonworks, we'd be talking about big data lakes, data stores. I think what's happened is this interesting dynamic of, let's call it, if you will, the secularization of data, in which it breaks into different fields, almost a taxonomy.
You've got this set of search data, you've got this observability data, you've got graph data, you've got document data, and now you have time series data. >>And what you're seeing in the market is this incredible capability by developers, with a mostly open source dynamic driving it, to assemble data platforms that aren't unicellular, that aren't just built on Hadoop or Oracle or Postgres or MySQL, but in fact represent different data types. So for us, what we care about is time series. We care about anything that happens in time, where time can be the primary measurement, which, if you think about it, is a huge proportion of real data. 'Cause when you think about what drives AI, you think about what happened, what happened, what happened, what happened, what's going to happen. That's the functional thing. But what happened is always defined by a period, a measurement, a time. And so what's new for us is we've developed this new open source engine called IOx. It's basically a refresh of the whole database: a columnar database that uses Apache Arrow, Parquet and DataFusion and turns it into a super powerful real-time analytics platform. It was already pretty real time before, but it's increasingly so now, and it adds SQL capability and infinite cardinality. And so it handles bigger data sets, but importantly, not just bigger but faster data. So that's primarily what we're talking about at the show. >>So how does that affect where you can play in the marketplace? I mean, how does it affect your total available market, your customer opportunities? >>Great question. I think it's really an interesting market in that you've got all of these different approaches to database, whether you take data warehouses from Snowflake or, arguably, Databricks also, or you take these individual database companies like Mongo, Influx, Neo4j, Elastic, and people like that.
I think the commonality you see across them is that many of them, if not all of them, are based on some sort of open source dynamic. So I think that is an intractable trend that will continue on. But in terms of the broader database market, our expanded total available market, lots of these things are coming together in interesting ways. And so the wave that we want to ride, because it's all big data and it's all increasingly fast data and it's all machine learning and AI, is really around that measurement issue, that instrumentation: the idea that if you're going to build any sophisticated system, it starts with instrumentation, and the journey is defined by instrumentation. So we view ourselves as that instrumentation tooling for understanding complex systems. >>I have a quick follow-up. Why did you say arguably Databricks? I mean, the open source ethos? >>Well, I was saying arguably Databricks 'cause of Spark. I mean, it's a great company and it's based on Spark, but there's quite a gap between Spark and what Databricks is today. And in some ways Databricks, from the outside looking in, looks a lot like Snowflake to me. It looks a lot like a really sophisticated data warehouse with a lot of post-processing capabilities. >>And with an open source... >>Less than a core database. >>Yeah. Right, right, right. Yeah, I totally agree. Okay, thank you for that. >>That part was not arguably like they're not a good company. >>No, no. They've got great momentum, and I'm just curious. Absolutely. >>So talk a little bit about IOx and what it is enabling you guys to achieve from a competitive advantage perspective. The key differentiators, give us that scoop. >>So if you think about it, our old storage engine was called TSM, also open sourced, right?
And IOx is open sourced. The old storage engine was really built around time series measurements, particularly metrics, lots of metrics, handling those at scale and making it super easy for developers to use. But our old data engine only supported either a custom graphical UI that you'd build yourself on top of it or a dashboarding tool like Grafana or Chronograf or things like that. With IOx, two or three interventions were important. One is we'll now support things like Tableau and Microsoft Power BI, and so you're taking that same data that was available for instrumentation and now you're using it for business intelligence also. So that became super important, and it kind of answers your question about the expanded market: it expands the market. The second thing is, when you're dealing with time series data, you're dealing with this concept of cardinality, which is, and I don't know if you're familiar with it, the idea that it's a multiplication of measurements in a table. And so the more measurements you want and the more series you have, you have this really expanding, exponential set that can choke a database off. And we've designed IOx to handle what we call infinite cardinality, where you don't even have to think about that from a design point of view. And then lastly, query performance is just dramatically better. And so it's pretty exciting. >>So with the unlimited cardinality, basically you could identify relationships between data in different databases. Is that right? >>Between the same database but different measurements, different tables, yeah. So you could say, I want to look at the way the noise levels perform in this room according to 400 different locations on 25 different days, over seven months of the year. And each one is a measurement. Each one adds to cardinality.
And you can say, I want to search on Tuesdays in December for what the noise level is at 2:21 PM, and you get a very quick response. That kind of instrumentation is critical to smarter systems. >>How are you able to process that data at a performance level that doesn't bring the database to its knees? What's the secret sauce behind that? >>It's a columnar database. It's built on Parquet and Apache Arrow. But suffice to say, without a much longer conversation, it's an architecture that's really built for pulling that kind of data. If you know the data is time series and you're looking for a time measurement, you already have the ability to optimize pretty dramatically. >>So it's that purpose-built aspect of it. >>It's the purpose-built aspect. You couldn't take Postgres and do the same thing. >>Right? Because a lot of vendors say, oh yeah, we have time series now. >>Yeah. Right. And they do. >>Yeah. But... >>The founding of the company came because Paul Dix was working on Wall Street building time series databases on HBase, on MySQL, on other platforms, and realized that every time we do it, we have to rewrite the code. We build a bunch of application logic to handle all these. We have customers that are adding hundreds of millions to billions of points a second. You think about all those data points: you're talking about an ingest level that databases just aren't designed for. Right? And it's not just us; our competitors also build good time series databases. And so the category is really emergent. >>Yeah, sure. Talk about a favorite customer story that you think really articulates the value of what Influx is doing, especially with IOx. >>Yeah, sure.
And I love this story because, you know, Tesla may not be in favor because of the latest Elon Musk escapades, but we've had about a four-year relationship with Tesla, where they built their Powerwall technology around recording that: seeing your device, seeing the stuff, seeing the charging on your car. It's all captured in Influx databases that are reporting from Powerwalls and mega power packs all over the world. And they report to a central place at Tesla's headquarters, and it reports out to your phone so you can see it. And what's really cool about this to me is I've got two Tesla cars and I've got Tesla solar roof tiles. So I watch this data all the time. So it's a great customer story. And actually, if you go on our website, you can see I did an hour interview with the engineer that designed the system, 'cause the system is super impressive and I just think it's really cool. Plus, you know, it's all the good green stuff that we really appreciate, supporting sustainability, right? >>Right, right. Talk about it from a what's-in-it-for-me perspective as a customer. With the change to IOx, what are some of the key features of it and the key values in it for customers like Tesla, and other industry customers as well? >>Well, so it's relatively new. It just arrived in our cloud product. So Tesla's not using it today. We have a first set of customers starting to use it. It's in open source, so it's a very popular project in the open source world. But the key issues are really the stuff that we've kind of covered here, which is a broad SQL environment. So accessing all those SQL developers: the same people who code against Snowflake's data warehouse or Databricks or Postgres can now code against Influx and open up the BI market. It's the cardinality, it's the performance. It's really an architecture. It's the next gen.
We've been doing this for six years; it's the next generation of everything we've seen about how you make time series be super performant. And that's only relevant because more and more things are becoming real time as we develop smarter and smarter systems. The journey is pretty clear. You instrument the system, you let it run, you watch for anomalies, you correct those anomalies, you re-instrument the system. You do that 4 billion times, you have a self-driving car. You do that 55 times, you have a better podcast that is handling its audio better, right? So everything is on that journey of getting smarter and smarter. >>So you guys are the big committers to IOx, right? >>Yes. >>Talk about how you support and develop the surrounding developer community, how you get that flywheel effect going. >>I mean, it's actually really kind of, let's call it, more art than science. First of all, you come up with an architecture that really resonates for developers. And Paul Dix, our founder, really is a developer's developer. And so he started talking in the community about an architecture that uses Apache Parquet, which is, you know, becoming the standard now for file formats, that uses Apache Arrow for directing queries and things like that, and uses DataFusion, and said what this thing needs is a columnar database that sits behind all of this stuff and integrates it. And he started talking about it two years ago, and then he started publishing IOx commits in GitHub. And slowly, over time, on Hacker News and elsewhere, other people go, oh yeah, this is fundamentally right. >>It addresses the problems that people have with things like ClickHouse or plain databases or Coast, and they go, okay, this is the right architecture at the right time.
Not different than the original Influx, not different than what Elastic hit on, not different than what Confluent with Kafka hit on in their time: you build an audience of people who are committed to understanding this kind of stuff, and they become committers and they become the core. And you build out from it. And so we chose to have an MIT open source license. It's not some secondary license. Competitors can use it, and competitors can use it against us. >>One of the things I know that InfluxData talks about is the time to awesome, which I love, but what does that mean? What is the time to awesome for a developer? >>It comes from that original story where Paul would have to write six months of application logic and stuff to build a time series based application. And so Paul's notion was, and this was based on the original Mongo, which was very successful because it was very easy to use relative to most databases. So Paul developed this commitment, this idea that I quickly joined on, which was: hey, it should be relatively quick for a developer to build something of import to solve a problem. It should be able to happen very quickly. So it's got a schemaless background, so you don't have to know the schema beforehand. It does some things that make it really easy to feel powerful as a developer quickly. And if you think about that journey, if you feel powerful with a tool quickly, then you'll go deeper and deeper and deeper, and pretty soon you're taking that tool with you wherever you go. It becomes the tool of choice as you go to that next job or that next application. And so that's a fundamental way we think about it. To be honest with you, we haven't always delivered perfectly on that. It's generally in our DNA. So we do pretty well, but I always feel like we can do better. >>So if you were to put a bumper sticker on one of your Teslas about InfluxData, what would it say?
>>By the way, I'm not rich. It just happens to be that we have two Teslas, and we have for a while; we just committed to that. So ask the question again. Sorry. >>A bumper sticker on InfluxData. What would it say? How would I understand it? >>It would be time to awesome. It would be that phrase, time to awesome. Right. >>Love that. >>Yeah, I love it. >>Excellent. Time to awesome. Evan, thank you so much for joining Dave and me on the program. >>It's really fun. >>Great to have you on, Evan. Well, great to have you back talking about what you guys are doing and helping organizations like Tesla and others really transform their businesses, which is all about business transformation these days. We appreciate your insights. >>That's great. Thank you. >>For our guest and Dave Vellante, I'm Lisa Martin. You're watching the Cube, the leader in emerging and enterprise tech coverage. We'll be right back with our next guest.
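
The cardinality point from the interview (a "multiplication of measurements") can be illustrated with a small sketch. This is only an illustration of the general idea, not a description of how IOx or any InfluxData product works internally. The tag names (`location`, `day`, `month`, `sensor_model`) are hypothetical, and the sketch assumes each dimension in the noise-level example varies independently, so the count is an upper bound on the number of distinct series.

```python
from math import prod


def series_cardinality(tag_values: dict[str, list[str]]) -> int:
    """Upper bound on distinct series: product of distinct values per tag."""
    return prod(len(set(values)) for values in tag_values.values())


# Hypothetical tags modeled on the noise-level example in the interview:
# 400 locations, 25 days, 7 months, each treated as an independent dimension.
noise_tags = {
    "location": [f"loc-{i}" for i in range(400)],
    "day": [f"day-{i}" for i in range(25)],
    "month": [f"month-{i}" for i in range(7)],
}

base = series_cardinality(noise_tags)

# Adding one more tag multiplies the series count rather than adding to it.
noise_tags["sensor_model"] = ["a", "b", "c"]
with_extra_tag = series_cardinality(noise_tags)

print(base, with_extra_tag)
```

Under these assumptions, each new tag scales the series count by its number of distinct values, which is why high-cardinality data can put pressure on a database that indexes every series individually.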

Published Date : Nov 29 2022


ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Evan Kaplan	PERSON	0.99+
six months	QUANTITY	0.99+
Evan	PERSON	0.99+
Tesla	ORGANIZATION	0.99+
Influx Data	ORGANIZATION	0.99+
Paul	PERSON	0.99+
55 times	QUANTITY	0.99+
two	QUANTITY	0.99+
2:21 PM	DATE	0.99+
Las Vegas	LOCATION	0.99+
Dave Ante	PERSON	0.99+
Paul Dicks	PERSON	0.99+
six years	QUANTITY	0.99+
last year	DATE	0.99+
hundreds of millions	QUANTITY	0.99+
Mongo Influx	ORGANIZATION	0.99+
4 billion times	QUANTITY	0.99+
Two	QUANTITY	0.99+
December	DATE	0.99+
Microsoft	ORGANIZATION	0.99+
Influxed	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
Influx	ORGANIZATION	0.99+
IOx	TITLE	0.99+
MySQL	TITLE	0.99+
three	QUANTITY	0.99+
Tuesdays	DATE	0.99+
each one	QUANTITY	0.98+
400 different locations	QUANTITY	0.98+
25 different days	QUANTITY	0.98+
first set	QUANTITY	0.98+
an hour	QUANTITY	0.98+
First	QUANTITY	0.98+
six years ago	DATE	0.98+
The Cube	TITLE	0.98+
One	QUANTITY	0.98+
Neo Forge	ORGANIZATION	0.98+
second thing	QUANTITY	0.98+
Each one	QUANTITY	0.98+
Paul Ds	PERSON	0.97+
IOx	ORGANIZATION	0.97+
today	DATE	0.97+
Teslas	ORGANIZATION	0.97+
MIT	ORGANIZATION	0.96+
Postgres	ORGANIZATION	0.96+
over seven months	QUANTITY	0.96+
one	QUANTITY	0.96+
five	DATE	0.96+
Venetian Expo Center	LOCATION	0.95+
Big Data Lakes	ORGANIZATION	0.95+
Cloudera	ORGANIZATION	0.94+
Columbia	LOCATION	0.94+
InfluxData	ORGANIZATION	0.94+
Wall Street	LOCATION	0.93+
SQL	TITLE	0.92+
Elastic	TITLE	0.92+
Data Bricks	ORGANIZATION	0.92+
Hacker News	TITLE	0.92+
two years ago	DATE	0.91+
Oracle	ORGANIZATION	0.91+
AWS Reinvent 2022	EVENT	0.91+
Elon Musker	PERSON	0.9+
Snowflake	ORGANIZATION	0.9+
Reinvent	ORGANIZATION	0.89+
billions of points a second	QUANTITY	0.89+
four year	QUANTITY	0.88+
Chronograph	TITLE	0.88+
Confluent	TITLE	0.87+
Spark	TITLE	0.86+
Apache	ORGANIZATION	0.86+
Snowflake	TITLE	0.85+
Grafana	TITLE	0.85+
GitHub	ORGANIZATION	0.84+

Breaking Analysis: re:Invent 2022 marks the next chapter in data & cloud

From the Cube studios in Palo Alto and Boston, bringing you data-driven insights from the Cube and ETR, this is Breaking Analysis with Dave Vellante. The ascendancy of AWS under the leadership of Andy Jassy was marked by a tsunami of data and corresponding cloud services to leverage that data. Now, those services mainly came in the form of primitives, i.e. basic building blocks that were used by developers to create more sophisticated capabilities. AWS in the 2020s, being led by CEO Adam Selipsky, will be marked by four high-level trends in our opinion: one, a rush of data that will dwarf anything we've previously seen; two, a doubling or even tripling down on the basic elements of cloud: compute, storage, database, security, etc.; three, a greater emphasis on end-to-end integration of AWS services to simplify and accelerate customer adoption of cloud; and four, significantly deeper business integration of cloud beyond IT, as an underlying element of organizational operations. Hello and welcome to this week's Wikibon Cube Insights powered by ETR. In this Breaking Analysis we extract and analyze nuggets from John Furrier's annual sit-down with the CEO of AWS. We'll share data from ETR and other sources to set the context for the market and competition in cloud, and we'll give you our glimpse of what to expect at re:Invent in 2022.
Now, before we get into the core of our analysis: Alibaba has announced earnings (they always announce after the big three, you know, a month later), and we've updated our Q3/November hyperscale computing forecast for the year, as seen here. We're not going to spend a lot of time on this, as most of you have seen the bulk of it already, but suffice to say Alibaba's cloud business is hitting that same macro trend that we're seeing across the board, but with a more substantial slowdown than we expected and more substantial than its peers. They're facing China headwinds and they've been restructuring the cloud business, and it's led to significantly slower growth, in the low double digits as opposed to where we had it at 15%. This puts our year-end estimate for 2022 revenue at $161 billion, still a healthy 34% growth, with AWS surpassing $80 billion in 2022 revenue. Now, on a related note, one of the big themes in cloud that we've been reporting on is how customers are optimizing their cloud spend. It's a technique they use when the economy looks a little shaky. Here's a graphic that we pulled from AWS's website which shows the various pricing plans at a high level. As you know, they're much more granular and more sophisticated than that, but for simplicity we'll just keep it here. Basically there are four levels. The first one here is on demand, i.e. pay by the drink. Now we're going to jump down to what we've labeled as number two, spot instances. That's like the right place at the right time: I can use that extra capacity in the moment. The third is reserved instances, or RIs, where I pay up front to get a discount. And the fourth is sort of optimized savings plans, where customers commit to a one- or three-year term for a better price. Now, you'll notice we labeled the choices in a different order than AWS presented them on its website, and that's because we believe the order we chose is the natural progression for customers. They start on demand, they maybe experiment with
spot instances, they move to reserved instances when the cloud bill becomes too onerous, and if you're large enough you lock in for one or three years. Okay, the interesting thing is the order in which AWS presents them. We believe that on-demand accounts for the majority of AWS customer spending. Now, if you think about it, those on-demand customers are also at-risk customers. Yeah, sure, there are some switching costs, like egress and learning curve, but many customers have multiple clouds and they've got experience, so they're kind of already up the learning curve. And if you're not married to AWS with a longer-term commitment, there's less friction to switch. Now, AWS here presents the most attractive plan from a financial perspective second, after on demand, and it's also the plan that makes the greatest commitment from a lock-in standpoint. Now, in fairness to AWS, it's also true that there is a trend towards subscription-based pricing, and we have some data on that. This chart is from an ETR drill-down survey; the N is 300.
Pay attention to the bars on the right. The left side is sort of busy, but the pink is subscription, and you can see the trend upward. The light blue is consumption-based or on-demand-based pricing, and you can see there's a steady trend toward subscription. Now, we'll dig into this in a later episode of Breaking Analysis, but we'll share with you some tidbits. With the data that ETR provides, you can select which segment (IaaS and PaaS, or you can go up the stack, etc.). When you choose IaaS and PaaS, 44% of customers either prefer or are required to use on-demand pricing, whereas around 40% of customers say they either prefer or are required to use subscription pricing. Again, that's for IaaS. Now, the further you move up the stack, the more prominent subscription pricing becomes, often with 60% or more for the software-based offerings that require or prefer subscription. And interestingly, cybersecurity tracks along with software at around 60% that prefer subscription. It's likely because, as with software, you're not shutting down your cyber protection on demand. All right, let's get into the expectations for re:Invent, and we're going to start with an observation on data. In his 2018 book Seeing Digital, author David Moschella made the point that whereas most companies apply data on the periphery of their business, kind of as an add-on function, successful data companies like Google and Amazon and Facebook have placed data at the core of their operations. They've operationalized data and they apply machine intelligence to that foundational element. Why is this? The fact is, it's not easy to do what the internet giants have done: very, very sophisticated engineering and cultural discipline. And this brings us to re:Invent 2022 and the future of cloud. Machine learning and AI will increasingly be infused into applications. We believe the data stack and the application stack are coming together as organizations build data apps and data products. Data
expertise is moving from the domain of highly specialized individuals to everyday business people, and we are just at the cusp of this trend. This will, in our view, be a massive theme of not only re:Invent 22 but of cloud in the 2020s. The vision of data mesh, we believe, Zhamak Dehghani's principles, will be realized in this decade. Now, what we'd like to do is share with you a glimpse of the thinking of Adam Selipsky from his sit-down with John Furrier. Each year John has a one-on-one conversation with the CEO of AWS. He's been doing this for years, and the outcome is a better understanding of the directional thinking of the leader of the number one cloud platform. So we're now going to share some direct quotes. I'm going to run through them with some commentary and then bring in some ETR data to analyze the market implications. Here we go. This is from Selipsky, quote: IT in general and data are moving from departments into becoming intrinsic parts of how businesses function. Okay, we're talking here about deeper business integration. Let's go on to the next one, quote: in time we'll stop talking about people who have the word analyst (we inserted data; he meant data analyst) in their title; rather, we'll have hundreds of millions of people who analyze data as part of their day-to-day job, most of whom will not have the word analyst anywhere in their title. We're talking about graphic designers and pizza shop owners and product managers, and data scientists as well. He threw that in; I'm going to come back to that. Very interesting. So he's talking here about democratizing data, operationalizing data. Next quote: customers need to be able to take an end-to-end integrated view of their entire data journey, from ingestion to storage to harmonizing the data to being able to query it, doing business intelligence and human-based analysis, and being able to collaborate and share data, and we've been putting together (we being Amazon) a broad suite of tools, from database to analytics to
business intelligence, to help customers with that. And this last statement, it's true: Amazon has a lot of tools, and you know, they're beginning to become more and more integrated. But again, under Jassy there was not a lot of emphasis on that end-to-end integrated view. We believe it's clear from these statements that Selipsky's customer interactions are leading him to underscore that the time has come for this capability. Okay, continuing, quote: if you have data in one place, you shouldn't have to move it every time you want to analyze that data (couldn't agree more); it would be much better if you could leave that data in place and avoid all the ETL, which has become a nasty three-letter word; more and more we're building capabilities where you can query that data in place, end quote. Okay, this we see a lot in the marketplace: Oracle with MySQL HeatWave, the entire trend toward converged database, Snowflake, [ __ ] extending their platforms into transaction and analytics respectively, and so forth. A lot of the partners are doing things in that vein as well. Let's go to the next quote: the other phenomenon is infusing machine learning into all those capabilities. Yes, the comments from the Moschella graphic come into play here: infusing AI and machine intelligence everywhere. Next one, quote: it's not a data cloud, it's not a separate cloud, it's a series of broad but integrated capabilities to help you manage the end-to-end life cycle of your data. There you go: we, AWS, are the cloud. We're going to come back to that in a moment as well. The next set of comments is around data; very interesting here. Quote: data governance is a huge issue; really what customers need is to find the right balance for their organization between access to data and control, and if you provide too much access, then you're nervous that your data is going to end up in places that it shouldn't, viewed by people who shouldn't be viewing it, and you feel like you lack security around that data, and by the way, what happens then is
people overreact and they lock it down so that almost nobody can see it. It's those handcuffs. Is data an asset or a liability? We've talked about that for years. Okay, very well put by Selipsky, but this is a gap, in our view, within AWS today, and we're hoping that they close it at re:Invent. It's not easy to share data in a safe way within AWS today outside of your organization, so we're going to look for that at re:Invent 2022. Now all this leads to the following statement by Selipsky, quote: data clean room is a really interesting area, and I think there's a lot of different industries in which clean rooms are applicable. I think that clean rooms are an interesting way of enabling multiple parties to share and collaborate on the data while completely respecting each party's rights and their privacy mandate. Okay, again, this is a gap currently within AWS today, in our view, and we know Snowflake is well down this path, and Databricks with Delta Sharing is also on this curve. So AWS has to address this and demonstrate this end-to-end data integration and the ability to safely share data, in our view. Now let's bring in some ETR spending data to put some context around these comments, with reference points in the form of AWS itself and its competitors and partners. Here's a chart from ETR that shows Net Score, or spending momentum, on the x-axis and overlap, or pervasiveness in the survey... sorry, let me go back. The Net Score is on the y-axis, and overlap, or pervasiveness in the survey, is on the x-axis. So spending momentum by pervasiveness, okay, or share within the data set. The table that's inserted there with the reds and the greens informs us how the dots are positioned, so it's Net Score and then the shared N's that determine how the plots are placed. Now we've filtered the data on the three big data segments: analytics, database, and machine learning/AI. And we've only selected one company with fewer than 100 N's in the survey, and that's Databricks. You'll
see why in a moment. The red dotted line indicates highly elevated customer spend at 40 percent. Now, as usual, Snowflake outperforms all players on the y-axis with a Net Score of 63 percent, off the charts. All three big U.S. cloud players are above that line, with Microsoft and AWS dominating the x-axis, so it's very impressive that they have such spending momentum and they're so large. And you see a number of other emerging data players like Grafana and Datadog; MongoDB's there in the mix. And then more established data players like Splunk and Tableau. Now you've got Cisco, who, you know, it's adjacent to their core networking business, but they're definitely into, you know, the analytics business. Then the really established players in data like Informatica, IBM, and Oracle, all with strong presence but, you'll notice, in the red from the momentum standpoint. Now what you're going to see in a moment is we put red highlights around Databricks, Snowflake, and AWS. Why? Let's bring that back up and we'll explain. Let's bring that back up, Alex, if you would. There's no way AWS is going to hit the brakes on innovating at the base service level, what we called primitives earlier. Selipsky told Furrier as much in their sit-down: that AWS will serve the technical user and data science community, the traditional domain of Databricks, and at the same time address the end-to-end integration, data sharing, and business line requirements that Snowflake is positioned to serve. Now people often ask Snowflake and Databricks, how will you compete with the likes of AWS? And we know the answer: focus on data exclusively; they have their multi-cloud plays. Perhaps the more interesting question is how will AWS compete with the likes of specialists like Snowflake and Databricks? And the answer is depicted here in this chart. AWS is going to serve both the technical and developer communities and the data science audience, and through end-to-end integrations and future services that simplify
the data journey, they're going to serve the business lines as well. But the nuance is in all the other dots, in the hundreds or hundreds of thousands, that are not shown here, and that's the AWS ecosystem. You can see AWS has earned the status of the number one cloud platform that everyone wants to partner with. As they say, it has over a hundred thousand partners, and that ecosystem, combined with these capabilities that we're discussing, while perhaps behind in areas like data sharing and integrated governance, can wildly succeed by offering the capabilities and leveraging its ecosystem. Now for their part, the Snowflakes of the world have to stay focused on the mission: build the best products possible and develop their own ecosystems to compete and attract the mind share of both developers and business users. And that's why it's so interesting to hear Selipsky basically say it's not a separate cloud, it's a set of integrated services. Well, Snowflake is, in our view, building a supercloud on top of AWS, Azure, and Google. When great products meet great sales and marketing, good things can happen, so it will be really fun to watch what AWS announces in this area at re:Invent. All right, one other topic that Selipsky talked about was the correlation between serverless and container adoption. And, you know, I don't know if this gets into... certainly their hybrid plays, maybe it starts to get into their multi-cloud, we'll see. But we have some data on this. So again, we're talking about the correlation between serverless and container adoption, but before we get into that, let's go back to 2017 and listen to what Andy Jassy said on theCUBE about serverless. Play the clip. Very, very earliest days of AWS, Jeff used to say a lot: if I were starting Amazon today, I'd have built it on top of AWS. We didn't have all the capability and all the functionality at that very moment, but he knew what was coming and he saw what people were still able to accomplish even with where the services were at that point. I
think the same thing is true here with Lambda, which is, I think if Amazon were starting today, it's a given they would build it on the cloud, and I think with a lot of the applications that comprise Amazon's consumer business, we would build those on our serverless capabilities. Now we still have plenty of capabilities and features and functionality we need to add to Lambda and our various serverless services, so that may not be true from the get-go right now. But I think if you look at the hundreds of thousands of customers who are building on top of Lambda, and lots of real applications, you know, FINRA has built a good chunk of their market watch application on top of Lambda, and Thomson Reuters has built, you know, one of their key analytics apps. Like, people are building real, serious things on top of Lambda, and the pace of iteration you'll see there will increase as well, and I really believe that to be true over the next year or two. So that was years ago, when Jassy gave a road map that serverless was going to be a key developer platform going forward. And Selipsky referenced the correlation between serverless and containers in the Furrier sit-down, so we wanted to test that within the ETR data set. Now here's a screen grab of the view across 1300 respondents from the October ETR survey, and what we've done here is we've isolated on the cloud computing segment. Okay, so you can see right there, cloud computing segment. Now we've taken the functions from Google, AWS Lambda, and Microsoft Azure Functions, all the serverless offerings, and we've got Net Score on the vertical axis. Oh, by the way, 40 is highly elevated, remember that. And then on the horizontal axis we have the presence in the data set, or overlap. Okay, that's relative to each other. So remember 40: all these guys are above that 40 mark. Okay, so you see that. Now, this is just for serverless, and what we're going to do is we're going to turn on containers
to see the correlation and see what happens. So watch what happens when we click on container. Boom, everything moves to the right. You can see all three move to the right. Google drops a little bit, but all the others... now, the filtered N drops as well, so you don't have as many people that are aggressively leaning into both, but all three move to the right. So watch again: containers off, and then containers on. Containers off, containers on. So you can see a really major correlation between containers and serverless. Okay, so to get a better understanding of what that means, I called my friend and former CUBE co-host Stu Miniman. What he said was people generally used to think of VMs, containers, and serverless as distinctly different architectures, but the lines are beginning to blur. Serverless makes things simpler for developers who don't want to worry about underlying infrastructure. As Selipsky and the data from ETR indicate, serverless and containers are coming together. But as Stu and I discussed, there's a spectrum where on the left you have kind of native cloud VMs, in the middle you've got AWS Fargate, and the rightmost anchor is Lambda, AWS Lambda. Now traditionally in the cloud, if you wanted to use containers, developers would have to build a container image, they'd have to select and deploy the EC2 instances that they wanted to use, they'd have to allocate a certain amount of memory and then fence off the apps in a virtual machine, and then run the EC2 instances against the apps and then pay for all those EC2 resources. Now with AWS Fargate you can run containerized apps with less infrastructure management, but you still have some, you know, things that you can do with the infrastructure. So with Fargate, what you do is you'd build the container images, then you'd allocate your memory and compute resources, then run the app and pay for the resources only when they're used. So Fargate lets you control the runtime environment while at the same time
simplifying the infrastructure management. You don't have to worry about isolating the app and other stuff like choosing server types and patching; AWS does all that for you. Then there's Lambda. With Lambda you don't have to worry about any of the underlying server infrastructure; you're just running code as functions. So the developer spends their time worrying about the applications and the functions that you're calling. The point is there's a movement, and we saw it in the data, towards simplifying the development environment and allowing the cloud vendor, AWS in this case, to do more of the underlying management. Now some folks will still want to turn knobs and dials, but increasingly we're going to see more higher-level service adoption. Now re:Invent is always a fire hose of content, so let's do a rapid rundown of what to expect. We talked about optimizing data and the organization. We talked about cloud optimization. There'll be a lot of talk on the show floor about best practices and customers sharing data. Selipsky is leading AWS into the next phase of growth, and that means moving beyond IT transformation into deeper business integration and organizational transformation. Not just digital transformation, organizational transformation. So he's leading a multi-vector strategy: serving the traditional peeps who want fine-grained access to core services, so we'll see continued innovation in compute, storage, AI, etc., and simplification through integration and horizontal apps further up the stack. Amazon Connect is an example that's often cited. Now as we've reported many times, Databricks is moving from its stronghold realm of data science into business intelligence and analytics, where Snowflake is coming from its data analytics stronghold and moving into the world of data science. AWS is going down a path of Snowflake meets Databricks with an underlying cloud IaaS and PaaS layer that puts these three companies on a very interesting trajectory. And you can expect AWS to go right
after the data sharing opportunity, and in doing so it will have to address data governance; they go hand in hand. Okay, price performance: that is a topic that will never go away. And something that we haven't mentioned today is silicon. It's an area we've covered extensively on Breaking Analysis, from Nitro to Graviton to the AWS acquisition of Annapurna, its secret weapon, and new specialized capabilities like Inferentia and Trainium. We'd expect something more at re:Invent, maybe new Graviton instances. David Floyer, our colleague, said he's expecting at some point a complete system on a chip, an SoC, from AWS, and maybe an Arm-based server to eventually include high-speed CXL connections to devices and memories, all to address next-gen applications, data-intensive applications with low power requirements and lower cost overall. Now of course, every year Swami gives his usual update on machine learning and AI, building on Amazon's years of SageMaker innovation. Perhaps a focus on conversational AI, or better support for vision, and maybe better integration across Amazon's portfolio of, you know, large language models, neural networks, generative AI, really infusing AI everywhere. Of course security is always high on the list at re:Invent, and Amazon even has re:Inforce, a conference dedicated to security. Now here we'd like to see more on supply chain security and perhaps how AWS can help there, as well as tooling to make the CIO's life easier. But the key so far is AWS is much more partner friendly in the security space than, say, for instance, Microsoft traditionally, so firms like Okta, CrowdStrike, and Palo Alto have plenty of room to play in the AWS ecosystem. We'd expect, of course, to hear something about ESG. It's an important topic, and hopefully we'll hear not only how AWS is helping the environment, that's important, but also how they help customers save money and drive inclusion and diversity. Again, very important topics. And finally, to come back to it, re:Invent is an ecosystem event.
It's the Super Bowl of tech events, and the ecosystem will be out in full force. Every tech company on the planet will have a presence, and theCUBE will be featuring many of the partners from the show floor, as well as AWS execs and of course our own independent analysis. So you'll definitely want to tune into thecube.net and check out our re:Invent coverage. We start Monday evening and then we go wall to wall through Thursday. Hopefully my voice will come back. We have three sets at the show and our entire team will be there, so please reach out or stop by and say hello. All right, we're going to leave it there for today. Many thanks to Stu Miniman and David Floyer for the input to today's episode, and of course John Furrier for extracting the signal from the noise in a sit-down with Adam Selipsky. Thanks to Alex Meyerson, who is on production and manages the podcast, and Ken Schiffman as well. Kristen Martin and Cheryl Knight helped get the word out on social and of course in our newsletters. Rob Hof is our editor-in-chief over at SiliconANGLE and does some great editing. Thanks to all of you. Remember, all these episodes are available as podcasts wherever you listen. You can pop in the headphones, go for a walk, just search "Breaking Analysis podcast." I publish each week on wikibon.com and siliconangle.com, or you can email me at david.vellante@siliconangle.com or DM me @dvellante, or please comment on our LinkedIn posts. And do check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching. We'll see you at re:Invent, or we'll see you next time on Breaking Analysis. [Music]

Published Date : Nov 26 2022

ENTITIES

Entity	Category	Confidence
David michella	PERSON	0.99+
Alex Meyerson	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Alibaba	ORGANIZATION	0.99+
one	QUANTITY	0.99+
Dave vellante	PERSON	0.99+
David floyer	PERSON	0.99+
Kristen Martin	PERSON	0.99+
John	PERSON	0.99+
sixty percent	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
Adam solipski	PERSON	0.99+
John Furrier	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
2022	DATE	0.99+
Andy jassy	PERSON	0.99+
Google	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
hundreds	QUANTITY	0.99+
2017	DATE	0.99+
Palo Alto	LOCATION	0.99+
40 percent	QUANTITY	0.99+
alibaba	ORGANIZATION	0.99+
Lambda	TITLE	0.99+
63 percent	QUANTITY	0.99+
1300 respondents	QUANTITY	0.99+
Super Bowl	EVENT	0.99+
80 billion	QUANTITY	0.99+
John furrier	PERSON	0.99+
Thursday	DATE	0.99+
Cisco	ORGANIZATION	0.99+
three years	QUANTITY	0.99+
Monday evening	DATE	0.99+
Jesse	PERSON	0.99+
Stu miniman	PERSON	0.99+
siliconangle.com	OTHER	0.99+
October	DATE	0.99+
thecube.net	OTHER	0.99+
fourth	QUANTITY	0.99+
a month later	DATE	0.99+
third	QUANTITY	0.99+
hundreds of thousands	QUANTITY	0.99+
fargate	ORGANIZATION	0.99+

Deepthi Sigireddi, PlanetScale | KubeCon + CloudNativeCon NA 2022

(upbeat intro music) >> Good afternoon, fellow tech nerds. My name is Savannah Peterson, coming to you from theCube's Remote Studio here in Motown, Detroit, Michigan where we are at KubeCon. John, this is our 12th interview of the day. How are you feeling? >> I'm feeling fresh as the first interview. (Savannah laughs) As always. >> That delivery really implied a level of freshness. >> Let's go! No, this is only Day 1. In three days, reinvent. We go hardcore. These are great events. We get so much great content. The conversations are amazing. The guests are awesome. They're technical, they're smart, and they're making the difference in the future. So, this next segment about Scale MySQL should be awesome. >> I am very excited to introduce our next guest who actually has a Twitter handle that I think most people, at least of my gender in this industry would love to have. She is @ATechGirl. So you can go ahead and tweet her and tell her how great this interview is while we're live. Please welcome Deepthi Sigireddi. Thank you so much for being here with us. >> Thank you for having me. >> You're feeding us in. You've got two talks you're giving while we're here. >> Yes, yes. So tomorrow we will be talking about VTR, myself and one of the other maintainers of Vitess and on Friday we have the Vitess Maintainer Talk. All graduated projects get a maintainer talk. >> Wow, so you are like KubeCon VIP celebrity. >> Well, I hope so. >> Well, you're a maintainer and technical lead, also software engineer at PlanetScale. But talk about the graduation process, what that means to the project and the people involved. >> So Vitess graduated in 2019 and there are strict criteria for graduation and you don't just have to meet the minimum, you sort of have to overperform on the graduation criteria. 
Some of which are like there must be at least two large production deploys and people from those companies have to go in front of the CNCF committee that approves these things and say that, "Yes, this project is critical to our business." >> A lot of peer review, a lot of deployment success. >> Yes. >> Good consistency in the code. >> Deepthi: Community diversity. >> All that. >> All those things. >> Talk about the importance of this project. What is the top story that people should know about around the project? Why it exists, why it's important, why it's relevant, why it's cool. How would you answer that? >> So MySQL is now 30 years old and yet they are still- >> Makes me feel a little sidebar. (Deepthi laughs) Yeah. >> And yet even though there are many other newer databases, it continues to be used at many of the largest internet scale companies. And some of them, for example, Slack, GitHub, Square, they have grown to a level where they could not have if they had tried to do it with the vanilla MySQL that they started with, and the only reason they are where they are is Vitess. So that is I think the number one thing people should know about Vitess. >> And the origination story, the notes say, "Came from YouTube." >> Yes. So the way Vitess started was that YouTube was having problems with their MySQL deployment and they got tired of dealing with the site being down. So the founders of Vitess decided that they had to do something about it and they started building Vitess which started as a pretty small codebase with limited features, and over time they built sharding and all of the other things that we have today. >> Well, this is exciting Savannah because we've seen this industry. Like with Facebook, when they started, everyone built their own stuff. MySQL was a great- >> Oh gosh, and everyone wanted to build it their way, reinventing the wheel. >> And MySQL was great. And then as it kind of broke when it grew, it got retrofitted. 
So, it was constantly being scaled up to the point where now you guys, if I get this right, said, "Hey, we're going to work on this. We're going to make it next-gen." So it's kind of like next-gen MySQL. Almost. >> Yes, yes. I would say that's pretty accurate, yeah. So there are still large companies which run their own MySQL and they have scaled it in their own way, but Vitess happens to be an open source way of scaling MySQL that people can adopt without having to build all of their own tooling around it. >> Speaking of that and growing, you just announced a new version today. >> Yes, yes. >> Tell us about that. >> The focus in this version was to make Vitess easier to use and to deploy. So in the past, there was one glaring gap in Vitess which was that Vitess did not automatically detect and repair MySQL level failures. With this release, we've actually closed that gap. And what that means for people using Vitess is that they will actually spend less time dealing with outages manually, or less human intervention, More automated recovery is what it means. The other thing we've released today is a new web UI. Vitess had a very old web UI, ugly, hard to maintain. Nobody liked it. But it was functional, except we couldn't add anything new to it because it was so old. So, the backend functionality kept advancing but the front end was kind of frozen. Now we have a next generation UI to which in upcoming releases we can add more and more functionality. >> So, it's extensible. They add things in. >> Deepthi: Oh yes, of course. Yeah. >> Awesome. What's the biggest thing that you like about the new situation? Is it more contributors are on board the UI? What's the fresh new impact that's happening in the community? What's getting you excited about with the current project? And the UI's great 'cause usability is important. >> Deepthi: Right. >> Scalability is important. 
>> I think Vitess solved the scalability problem way early and only now we are really grappling with the usability problem. So the hope and the desire is to make Vitess autopilot so that you reduce human intervention to a minimum once you deploy it. Obviously, you have to go through the process of deploying it. But once you've deployed it, it should just run itself. >> Runs at scale. So, the scale's huge? >> Deepthi: Yes. >> How many contributors are involved in the project? Can you give some numbers? Do you have any handy that you can speak to? >> Right. So, CNCF actually tracks these statistics for all the projects and we consolidated some numbers for the last two full calendar years, 2020 and 2021. We had over 400 contributors and 200 plus of them contributed code and the others contributed documentation, issues, website changes, and things like that. So that gives- >> How about downloads? Download's good? >> Oh, okay. So we started publishing the current official Vitess Docker Image in 2018. And by October of 2020, we had about 3.8 million downloads. And by August of 2021, we had 5.2 million. And today, we have had over 10 million downloads- >> Wow! >> Of the main image. >> Starting to see a minute of that hockey stick that we all like to see. Seems like you're very clearly a community-first leader and it seems like that's in PlanetScale's and Vitess's DNA. Is that how the whole company culture views it? Would you say it's community-first business? >> PlanetScale is very much committed to Vitess as an open source project and to serving the Vitess community. So as part of my role at PlanetScale, some of the things I do are helping new contributors whether they are from PlanetScale or from outside PlanetScale. A number of PlanetScale engineers who don't work full-time on Vitess still contribute bug fixes and features to Vitess. We spend a significant amount of our energy helping users in our community Slack. 
The releases we do are mainly for the benefit of the community and PlanetScale is making those releases because for PlanetScale... Within PlanetScale, we actually do separate releases versus the public ones. >> One of the things that's coming up here at the show is deploying on Kubernetes. What does that look like? Everyone wants ease of use. Are you guys easy to use? >> Yes, yes. So PlanetScale also open sourced a Kubernetes operator for Vitess that people outside PlanetScale are using to run their production deployments of Vitess. Prior to that, there were Vitess users who actually built their own Kubernetes deployments of Vitess and they are still running those, but new users and new adopters of Vitess tend to use the Kubernetes operator that we are publishing. >> And you guys are the managed service for Vitess for people. That's the business model for PlanetScale. >> Correct. So PlanetScale has a serverless database on demand which is built on Vitess. So if someone's starting something new and they just need a database, you sign up. It takes 30 seconds to get a database. Connect to it and start doing things with it. Versus if you are a large enterprise and you have a huge database deployment, you can migrate to PlanetScale, import all of your existing data, cut over with minimal downtime and then go, and then PlanetScale manages that. >> And why would they do that? What's the use case for that? Save time new development team or refactoring? >> Save time not being able to hire people with the skills to run it in-house. Not wanting to invest engineering resources in what businesses think is not their core competency. They want to focus on their business value. >> So, this is database as a service in whatever they're doing without adding more costs. >> Right. >> And speed. Okay, cool. How's that going? >> It's going well. >> Any feedback from customers in terms of why, are there any benefit statements you see popping out? What are the big... 
What's the big aha when they... When people realize what they have here, what's the aha moment for them? Do they go, "Wow, this is awesome. It's so easy. Push a button. Migrate." Or is it... >> All of those. And people have actually seen cost savings when they've migrated from Amazon RDS to PlanetScale and we have testimonials from people who've said that, "It was so easy to use PlanetScale. Why would we try to do it ourselves?" >> It's the best thing a customer could say, right? We're all about being painkillers and solving some sort of problem. I think that that's a great opportunity to let you show off some of your customers. So, who is receiving this benefit? 'Cause I know PlanetScale specifically is for a certain style of business. >> Hmm. We have a list of customers on the website. >> Savannah: I was going to say you have a really- >> John: She's a software engineer. She's not marketing. >> You did sexy. >> You're doing a great job as much as marketing. >> So the reason I am bringing this up is because it's clear this is a solution for companies like Square, SoundCloud, Etsy, Jordan, and other exciting brands. So when you're talking about companies at scale, these companies are very much at scale, which is awesome. >> Yeah. >> What's next? What do you guys see the future for the project? >> I think we talked about that a little bit already. So, usability is a big thing. We did the new UI. It's not complete, right? Because over the last four years we've built more features into the backend which you can't yet access from the UI. So we want people to be able to use things like online schema changes, which is a big feature of Vitess, from the UI: doing schema changes without downtime. So, schema management from the UI. Vitess has something called VReplication which is the core technology that enables sharding. And right now you can monitor your sharding status from the UI, but you can't actually start sharding from the UI. 
So more of the administrative functions we want to enable from the UI. >> John: Awesome. >> Last question. What are you personally most excited about this week being here with our wonderful community? >> I always enjoy being at KubeCon. This is my fifth or sixth in-person and I've done a couple of virtual ones. >> Savannah: Awesome. >> Because of the energy, because you get to meet people in person whom previously you've only met in Slack or maybe in a monthly community Zoom calls. We always have people come to our project booth. We have a project booth here for Vitess. People come to the company booth. PlanetScale has a booth. People come to our talks, ask questions. We end up having design discussions, architecture discussions. We get feedback on what is important to the people who show up here. That always informs what we do with the project in future releases. >> Perfect answer. I already mentioned that you can get a hold and in touch with Deepthi through her wonderful Twitter handle. Is there any other website or anything you want to shout out here before I do our close? >> vitess.io. V-I-T-E-S-S dot I-O is the Vitess website and planetscale.com is the PlanetScale website. >> Deepthi Sigireddi, thank you so much for being on the show with us today. John, thanks for keeping me company as always. >> You're welcome. >> And thank all of you for tuning into theCUBE. We will be here in Detroit, Michigan all week live from KubeCon and we hope to see you there. (gentle upbeat music)

Published Date : Oct 27 2022

ENTITIES

Entity	Category	Confidence
Savannah	PERSON	0.99+
John	PERSON	0.99+
Savannah Peterson	PERSON	0.99+
Deepthi	PERSON	0.99+
August of 2021	DATE	0.99+
YouTube	ORGANIZATION	0.99+
October of 2020	DATE	0.99+
2019	DATE	0.99+
30 seconds	QUANTITY	0.99+
Etsy	ORGANIZATION	0.99+
5.2 million	QUANTITY	0.99+
Friday	DATE	0.99+
Square	ORGANIZATION	0.99+
2021	DATE	0.99+
fifth	QUANTITY	0.99+
2020	DATE	0.99+
sixth	QUANTITY	0.99+
2018	DATE	0.99+
Deepthi Sigireddi	PERSON	0.99+
SoundCloud	ORGANIZATION	0.99+
Vitess	ORGANIZATION	0.99+
MySQL	TITLE	0.99+
Jordan	ORGANIZATION	0.99+
GitHub	ORGANIZATION	0.99+
CNCF	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
12th interview	QUANTITY	0.99+
tomorrow	DATE	0.99+
today	DATE	0.99+
30 years	QUANTITY	0.99+
Detroit, Michigan	LOCATION	0.99+
over 400 contributors	QUANTITY	0.99+
Slack	ORGANIZATION	0.98+
CloudNativeCon	EVENT	0.98+
first interview	QUANTITY	0.98+
KubeCon	EVENT	0.98+
PlanetScale	ORGANIZATION	0.98+
Amazon	ORGANIZATION	0.98+
Day 1	QUANTITY	0.97+
200 plus	QUANTITY	0.97+
One	QUANTITY	0.97+
Motown, Detroit, Michigan	LOCATION	0.97+
Vitess	TITLE	0.97+
vitess.io	OTHER	0.96+
about 3.8 million downloads	QUANTITY	0.96+
one	QUANTITY	0.95+
three days	QUANTITY	0.94+
over 10 million downloads	QUANTITY	0.94+
Scale MySQL	TITLE	0.94+
Kubernetes	TITLE	0.93+
this week	DATE	0.93+
two talks	QUANTITY	0.92+
Twitter	ORGANIZATION	0.91+
Slack	TITLE	0.9+
planetscale.com	OTHER	0.89+
first business	QUANTITY	0.86+
NA 2022	EVENT	0.84+

Digging into HeatWave ML Performance

(upbeat music) >> Hello everyone. This is Dave Vellante. We're diving into the deep end with AMD and Oracle on the topic of MySQL HeatWave performance. And we want to explore the important issues around machine learning. As applications become more data intensive and machine intelligence continues to evolve, workloads increasingly are seeing a major shift where data and AI are being infused into applications. And having a database that simplifies the convergence of transaction and analytics data, without the need to context switch and move data out of and into different data stores, and that eliminates the need to perform extensive ETL operations, is becoming an industry trend that customers are demanding. At the same time, workloads are becoming more automated and intelligent. And to explore these issues further, we're happy to have back in theCUBE Nipun Agarwal, who's the Senior Vice President of MySQL HeatWave, and Kumaran Siva, who's the Corporate Vice President, Strategic Business Development at AMD. Gents, hello again. Welcome back. >> Hello. Hi Dave. >> Thank you, Dave. >> Okay. Nipun, obviously machine learning has become a must-have for analytics offerings. It's integrated into MySQL HeatWave. Why did you take this approach and not the specialized database approach that many competitors take, right tool for the right job? >> Right. So, there are a lot of customers of MySQL who have the need to run machine learning on the data which is stored in the MySQL database. So in the past, customers would need to extract the data out of MySQL and they would take it to a specialized service for running machine learning. Now, there are multiple reasons we decided to incorporate machine learning inside the database. One, customers don't need to move the data. And if they don't need to move the data, it is more secure, because it's protected by the same access control mechanisms as the rest of the data. There is no need for customers to manage multiple services.
But in addition to that, when we run the machine learning inside the database, customers are able to leverage the same service, the same hardware, which has been provisioned for OLTP and analytics, and use machine learning capabilities at no additional charge. So from a customer's perspective, they get the benefit that it is a single database. They don't need to manage multiple services. And it is offered at no additional charge. And then another aspect, which is based on the IP, the work we have done, is that it is also significantly faster than what customers would get by having a separate service. >> Just to follow up on that. How are you seeing customers use HeatWave's machine learning capabilities today? How is that evolving? >> Right. So one of the things which, you know, customers very often want to do is to train their models based on the data. Now, one of the things is that data in a database, or in a transaction database, changes quite rapidly. So we have introduced support for auto machine learning as a part of HeatWave ML. And what it does is that it fully automates the process of training. And this is something which is very important to database users, very important to MySQL users, that they don't really want to hire data scientists or specialists for doing training. So that's the first part, that training in HeatWave ML is fully automated. It doesn't require the user to provide any specific parameters, just the source data and the task which they want to train. The second aspect is that the training is really fast. And because the training is really fast, the benefit is that customers can retrain quite often. They can make sure that the model is up to date with any changes which have been made to their transaction database. And as a result of the models being up to date, the accuracy of the prediction is high. Right? So that's the first aspect, which is training.
The second aspect is inference, which customers run once they have the models trained. And the third thing, which perhaps has been the most sought-after request from the MySQL customers, is the ability to provide explanations. So, HeatWave ML provides explanations for any model which has been generated or trained by HeatWave ML. So these are the three capabilities: training, inference and explanations. And this whole process is completely automated, doesn't require a specialist or a data scientist. >> Yeah, that's nice. I mean, training is obviously very popular today. I've said inference, I think, is going to explode in the coming decade. And then of course, explainable AI is a very important issue. Kumaran, what are the relevant capabilities of the AMD chips that are used in OCI to support HeatWave ML? Are they different from, say, the specs for HeatWave in general? >> So, actually they aren't. And this is one of the key features of this architecture, or this implementation, that is really exciting. With HeatWave ML, you're using the same CPU. And by the way, it's not a GPU, it's a CPU, for all three of the functions that Nipun just talked about: inference, training and explanation, all done on CPU. You know, bigger picture, with the capabilities we bring here we're really providing a balance, you know, between the CPU cores, memory and the networking. And what that allows you to do here is be able to feed the CPU cores appropriately. And within the cores, we have these AVX extensions. With the Zen 2 and Zen 3 cores, we had AVX2, and then with the Zen 4 core coming out we're going to have AVX-512. But with that balance of being able to bring in the data and utilize the high memory bandwidth, and then use the computation to its maximum, we're able to provide, you know, enough AI processing that we are able to get the job done.
And then we're able to fit into that larger pipeline that we build out here with HeatWave. >> Got it. Nipun, you know, every time you and I have a conversation we've got to talk benchmarks. So you've done machine learning benchmarks with HeatWave. You might even be the first in the industry to publish, you know, transparent, open ML benchmarks on GitHub. I mean, I wouldn't know for sure, but I've not seen that as common. Can you describe the benchmarks and the data sets that you used here? >> Sure. So what we did was we took a bunch of open data sets for two categories of tasks: classification and regression. So we took about a dozen data sets for classification and about six for regression. So to give an example, the kind of data sets we used for classification are like the airlines data set, Higgs, census, bank, right? So these are open data sets. And what we did was, on these data sets, we did a comparison of what it would take to train using HeatWave ML. And then the other service we compared with is Redshift ML. So, there were two observations. One is that with HeatWave ML, the user does not need to provide any tuning parameters, right? HeatWave ML, using AutoML, fully generates a trained model. It figures out what are the right algorithms, what are the right features, what are the right hyperparameters, and sets them, right? So there is no need for any manual intervention. Not so the case with Redshift ML. The second thing is the performance, right? So take the performance of HeatWave ML in aggregate on these 12 data sets for classification and the six data sets for regression. On average, it is 25 times faster than Redshift ML. And note that Redshift ML in turn involves SageMaker, right? So on average, HeatWave ML provides 25 times better performance for training. And the other point to note is that there is no need for any human intervention. That's fully automated.
But in the case of Redshift ML, many of these data sets did not even complete in the set duration. If you look at price performance, one of the things again I want to highlight is that, because of the fact that AMD does pretty well in all kinds of workloads, users are able to use the same cluster for analytics, for OLTP or for machine learning. So there is no additional cost for customers to run HeatWave ML if they have provisioned HeatWave. But assume a user is provisioning a HeatWave cluster only to run HeatWave ML, right? Even in that case, the price performance advantage of HeatWave ML over Redshift ML is 97 times, right? So 25 times faster at 1% of the cost compared to Redshift ML. And all these scripts and all this information are available on GitHub for customers to try, to modify, and to see what advantages they would get on their workloads. >> Every time I hear these numbers, I shake my head. I mean, they're just so overwhelming. And so we'll see how the competition responds when, and if, they respond. But thank you for sharing those results. Kumaran, can you elaborate on how the specs that you talked about earlier contribute to HeatWave ML's, you know, benchmark results? I'm particularly interested in scalability. You know, typically things degrade as you push the system harder. What are you seeing? >> No, I think it's good. Look, yeah, those numbers just blow my mind too. That's crazy good performance. So look, from an AMD perspective, we have really built an architecture. Like if you think about the chiplet architecture to begin with, it is fundamentally, you know, kind of scaling by design, right? And one of the things that we've done here is been able to work with the HeatWave team and the HeatWave ML team, and then been able, within the CPU package itself, to scale up to make very efficient use of all of the cores.
And then, of course, work with them on how you go between nodes. So you can have these very large systems that can run ML very, very efficiently. So it's really, you know, building on the building blocks of the chiplet architecture and how scaling happens there. >> Yeah. So you're saying it's near-linear scaling, essentially. >> So, let Nipun comment on that. >> Yeah. >> Is it... So, how about as cluster sizes grow, Nipun? >> Right. >> What happens there? >> So one of the design points for HeatWave is a scale-out architecture, right? So as you said, as we add more data sets or increase the size of the data, or we add to the number of nodes in the cluster, we want the performance to scale. So we show that we have a near-linear scale factor, or near-linear scalability, for SQL workloads, and in the case of HeatWave ML as well. As users add more nodes to the cluster, so as the size of the cluster grows, the performance of HeatWave ML improves. So I was giving you this example that HeatWave ML is 25 times faster compared to Redshift ML. Well, that was on a cluster size of two. If you increase the cluster size of HeatWave ML to a larger number, and I think the number is 16, the performance advantage over Redshift ML increases from 25 times faster to 45 times faster. So what that means is that on a cluster size of 16 nodes, HeatWave ML is 45 times faster for training these, again, dozen data sets. So this shows that HeatWave ML scales better than the competition. >> So you're saying adding nodes offsets any management complexity that you would think of as getting in the way. Is that right? >> Right. So one is the management complexity, which is why, with features like elasticity, customers can scale up or scale down, you know, very easily. The second aspect is, okay, what gives us this advantage, right, of scalability? Or how are we able to scale? Now, the techniques which we use for HeatWave ML scalability are a bit different from what we use for SQL processing.
So in the case of HeatWave ML, there are really, you know, two trade-offs which we have to be careful about. One is the accuracy. Because we want to provide better performance for machine learning without compromising on the accuracy. So accuracy would require more synchronization if you have multiple threads. But if you have too much synchronization, that can reduce the degree of parallelism that we get. Right? So we have to strike a fine balance. So what we do is that in HeatWave ML, there are different phases of training, like algorithm selection, feature selection, hyperparameter tuning. Each of these phases is analyzed. And for instance, one of the techniques we use is that if you're trying to figure out what's the optimal hyperparameter to be used, we start with the search space. And then each of the VMs gets a part of the search space. And then we synchronize only when needed, right? So these are some of the techniques which we have developed over the years. And there are actually papers, research publications, filed on this. And this is what we do to achieve good scalability. And what that means for the customer is that if they have some amount of training time and they want to make it better, they can just provision a larger cluster and they will get better performance. >> Got it. Thank you. Kumaran, when I think of machine learning, machine intelligence, AI, I think GPU, but you're not using GPUs. So how are you able to get this type of performance, or price performance, without using GPUs? >> Yeah, definitely. So yeah, that's a good point. Think about what is going on here and consider the whole pipeline that Nipun has just described, in terms of how you get, you know, your training, your algorithms, and using the MySQL pieces of it to get to the point where the AI can be effective. In that process, what happens is you have a lot of memory transactions. A lot of memory bandwidth comes into play.
And then bringing all that data together, feeding the actual complex that does the AI calculations, that in itself could be the bottleneck, right? And you can have multiple bottlenecks along the way. And I think what you see in the AMD architecture for EPYC for this use case is the balance. And the fact that you are able to do the pre-processing, the AI, and then the post-processing all kind of seamlessly together, that has a huge value. And that goes back to what Nipun was saying: using the same infrastructure gets you the better TCO, but it also gets you better performance. And that's because of the fact that you're bringing the data to the computation. So the computation in this case is not strictly the bottleneck. It's really about how you pull together what you need to do the AI computation. And that's probably a more, you know, common case. And so, you know, I think you're going to at least start to see this, especially for inference applications. But in this case we're doing inference, explanation and training, all using the CPU in the same OCI infrastructure. >> Interesting. Now Nipun, is the secret sauce for HeatWave ML performance different from what you and I have discussed before with HeatWave generally? Is there some, you know, engine additive that you're putting in? >> Right. Yes. The secret sauce is indeed different, right? Just the way I was saying for SQL processing, the reason we get very good performance and price performance is because we have come up with new algorithms which help the SQL processing scale out. Similarly for HeatWave ML, we have come up with new IP, new algorithms. One example is that we use meta-learned proxy models, right? That's the technique we use for automating the training process, right? So think of these meta-learned proxy models as, you know, using machine learning for machine learning training. And this is IP which we developed.
And again, we have published the results and the techniques. But having these kinds of techniques is what gives us better performance. Similarly, another thing which we use is adaptive sampling. You can have a large data set, but we intelligently sample to figure out how we can train on a small subset without compromising on the accuracy. So, yes, there are many techniques that we have developed specifically for machine learning, which is what gives us the better performance, better price performance, and also better scalability. >> What about MySQL Autopilot? Is there anything that differs from HeatWave ML that is relevant? >> Okay. Interesting you should ask. So think of MySQL Autopilot as an application using machine learning. MySQL Autopilot uses machine learning to automate various aspects of the database service. So for instance, if you want to figure out what's the right partitioning scheme to partition the data in memory, we use machine learning techniques to figure out what's the best column, based on the user's workload, to partition the data in memory. Or given a workload, if you want to figure out what is the right cluster size to provision, that's something we use MySQL Autopilot for. And I want to highlight that we are not aware of any other database service which provides this level of machine learning based automation, which customers get with MySQL Autopilot. >> Hmm. Interesting. Okay. Last question for both of you. What are you guys working on next? What can customers expect from this collaboration, specifically in this space? Maybe Nipun, you can start and then Kumaran can bring us home. >> Sure. So there are two things we are working on. One is, based on the feedback we have gotten from customers, we are going to keep making the machine learning capabilities richer in HeatWave ML. That's one dimension.
And the second thing, which Kumaran was alluding to earlier, is that we are looking at the next generation of processors coming from AMD. And we will be seeing how we can benefit more from these processors, whether it's the size of the L3 cache, the memory bandwidth, the network bandwidth, and such, or the newer AVX. And we will make sure that we leverage all the greatness which the new generation of processors will offer. >> It's like an engineering playground. Kumaran, let's give you the final word. >> No, that's great. Now look, with the Zen 4 CPU cores, we're also bringing in AVX-512 instruction capability. Now our implementation is a little different. It was in Rome and Milan too, where we use a double-pump implementation. What that means is, you know, we take two cycles to do these instructions. But the key thing there is we don't lower the speed of the CPU. So there are no noisy neighbor effects. And it's something that OCI and the HeatWave team have taken full advantage of. And so, as we go out in time and we see the Zen 4 core, we see up to 96 CPUs, and that's going to work really well. So we're collaborating closely with OCI and with the HeatWave team here to make sure that we can take advantage of that. And we're also going to upgrade the memory subsystem to get to 12 channels of DDR5. So, you know, there should be a fairly significant boost in absolute performance. But just as importantly, in TCO value for the customers, the end customers who are going to adopt this great service. >> I love the relentless innovation, guys. Thanks so much for your time. We're going to have to leave it there. Appreciate it. >> Thank you, David. >> Thank you, David. >> Okay. Thank you for watching this special presentation on theCUBE. Your leader in enterprise and emerging tech coverage.
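
Editor's note: the benchmark figures quoted in this conversation (25 times faster training, a 97 times price-performance advantage, "1% of the cost") can be related to each other with a little arithmetic. The Python sketch below is illustrative only and is not taken from Oracle's published GitHub scripts. It assumes that "price performance" means the total cost to finish the same training job; that definition and the function names are assumptions, not something stated in the interview.

```python
# Illustrative arithmetic only. Assumption: "price performance advantage"
# is the ratio of total cost to complete the same training job
# (cost = hourly price * elapsed hours). Function names are hypothetical.

def cost_fraction(price_performance_advantage: float) -> float:
    """Fraction of the competitor's total job cost that is paid."""
    return 1.0 / price_performance_advantage


def implied_hourly_price_ratio(price_performance_advantage: float,
                               speedup: float) -> float:
    """Competitor's hourly price divided by the hourly price paid,
    given total cost = hourly price * time and time ratio = speedup."""
    return price_performance_advantage / speedup


if __name__ == "__main__":
    speedup = 25.0                      # quoted: 25 times faster training
    price_performance_advantage = 97.0  # quoted: 97 times price performance

    print(f"cost fraction: {cost_fraction(price_performance_advantage):.1%}")
    print("implied hourly price ratio: "
          f"{implied_hourly_price_ratio(price_performance_advantage, speedup):.2f}x")
```

Under that assumption, 1/97 works out to roughly 1% of the cost, which is consistent with the figure quoted, and the remaining gap after the 25 times speedup corresponds to an hourly price difference of a little under four times.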

Published Date : Sep 14 2022


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Rome	LOCATION	0.99+
Dave	PERSON	0.99+
David	PERSON	0.99+
OCI	ORGANIZATION	0.99+
Nipun Agarwal	PERSON	0.99+
Milan	LOCATION	0.99+
45 times	QUANTITY	0.99+
25 times	QUANTITY	0.99+
12 channels	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
AMD	ORGANIZATION	0.99+
Zen 4	COMMERCIAL_ITEM	0.99+
Kumaran	PERSON	0.99+
HeatWave	ORGANIZATION	0.99+
Zen 3	COMMERCIAL_ITEM	0.99+
second aspect	QUANTITY	0.99+
Kumaran Siva	PERSON	0.99+
12 data sets	QUANTITY	0.99+
first aspect	QUANTITY	0.99+
97 times	QUANTITY	0.99+
Zen 2	COMMERCIAL_ITEM	0.99+
both	QUANTITY	0.99+
first	QUANTITY	0.99+
One	QUANTITY	0.99+
two things	QUANTITY	0.99+
one	QUANTITY	0.99+
Each	QUANTITY	0.99+
1%	QUANTITY	0.99+
two cycles	QUANTITY	0.99+
three capabilities	QUANTITY	0.99+
third thing	QUANTITY	0.99+
each	QUANTITY	0.99+
AVX 2	COMMERCIAL_ITEM	0.99+
AVX 512	COMMERCIAL_ITEM	0.99+
second thing	QUANTITY	0.99+
Redshift ML	TITLE	0.99+
six data sets	QUANTITY	0.98+
HeatWave	TITLE	0.98+
mySQL Autopilot	TITLE	0.98+
two	QUANTITY	0.98+
Nipun	PERSON	0.98+
two categories	QUANTITY	0.98+
mySQL	TITLE	0.98+
two observations	QUANTITY	0.98+
first part	QUANTITY	0.98+
mySQL autopilot	TITLE	0.98+
three	QUANTITY	0.97+
SQL	TITLE	0.97+
One example	QUANTITY	0.97+
single database	QUANTITY	0.95+
16	QUANTITY	0.95+
today	DATE	0.95+
about six	QUANTITY	0.95+
HeatWaves	ORGANIZATION	0.94+
about a dozen data sets	QUANTITY	0.94+
16 nodes	QUANTITY	0.93+
mySQL HeatWave	TITLE	0.93+

AMD Oracle Partnership Elevates MySQL HeatWave

(upbeat music) >> For those of you who've been following the cloud database space, you know that MySQL HeatWave has been on a technology tear over the last 24 months, with Oracle claiming record-breaking benchmarks relative to other database platforms. So far, those benchmarks remain industry leading, as competitors have chosen not to respond, perhaps because they don't feel the need to, or maybe they don't feel that doing so would serve their interest. Regardless, the HeatWave team at Oracle has been very aggressive about its performance claims, making lots of noise, challenging the competition to respond, publishing their scripts to GitHub. So far, there are no takers, but customers seem to be picking up on these moves by Oracle, and it's likely the performance numbers resonate with them. Now, the other area we want to explore, which we haven't thus far, is the engine behind HeatWave, and that is AMD. AMD's EPYC processors have been the powerhouse on OCI, running MySQL HeatWave since day one. And today we're going to explore how these two technology companies are working together to deliver these performance gains and some compelling TCO metrics. In fact, a recent Wikibon analysis from senior analyst Marc Staimer made some TCO comparisons in OLAP workloads relative to AWS, Snowflake, GCP, and Azure databases. You can find that research on wikibon.com. And with that, let me introduce today's guests, Nipun Agarwal, senior vice president of MySQL HeatWave, and Kumaran Siva, who's the corporate vice president for strategic business development at AMD. Welcome to theCUBE, gentlemen. >> Welcome. Thank you. >> Thank you, Dave. >> Hey Nipun, you and I have talked a lot about this. You've been on theCUBE a number of times talking about MySQL HeatWave. But for viewers who may not have seen those episodes, maybe you could give us an overview of HeatWave and how it's different from competitive cloud database offerings. >> Sure.
So MySQL HeatWave is a fully managed MySQL database service offering from Oracle. It's a single database, which can be used to run transactional processing, analytics and machine learning workloads. So, in the past, MySQL has been designed and optimized for transaction processing. So customers of MySQL, when they had to run analytics or machine learning, would need to extract the data out of MySQL into some other database or service to run analytics or machine learning. MySQL HeatWave offers a single database for running all kinds of workloads, so customers don't need to extract data into some other database. In addition to being a single database, MySQL HeatWave is also very performant compared to one-off databases, and it is also very price competitive. So the advantages are: single database, very performant, and very good price performance. >> Yes. And you've published some pretty impressive price performance numbers against competitors. Maybe you could describe those benchmarks and highlight some of the results, please. >> Sure. So one thing to note is that the performance of any database is going to vary. The performance advantage is going to vary based on the size of the data and the specific workloads. So the mileage varies, that's the first thing to know. So what we have done is, we have published multiple benchmarks. So we have benchmarks on TPC-H or TPC-DS, and we have benchmarks on different data sizes, because based on the customer's workload, the mileage is going to vary. So we want to give customers a broad range of comparisons so that they can decide for themselves. So in a specific case, where we are running on a 30 terabyte TPC-H workload, HeatWave has about 18 times better price performance compared to Redshift, about 33 times better price performance compared to Snowflake, and 42 times better price performance compared to Google BigQuery. So, this is on 30 terabyte TPC-H.
Now, if the data size is different, or the workload is different, the characteristics may vary slightly, but this is just to give a flavor of the kind of performance advantage MySQL HeatWave offers. >> And then my last question before we bring in Kumaran. We've talked about the secret sauce being the tight integration between hardware and software, but would you add anything to that? What is that secret sauce in HeatWave that enables you to achieve these performance results, and what does it mean for customers? >> So there are three parts to this. One is that HeatWave has been designed with a scale-out architecture in mind. So we have invented and implemented new algorithms for scale-out query processing for analytics. The second aspect is that HeatWave has been really optimized for cloud, commodity cloud, and that's where AMD comes in. So for instance, many of the partitioning schemes we have for processing in HeatWave, we optimize them for the L3 cache of the AMD processor. The thing which is very important to our customers is not just the sheer performance but the price performance, and that's where we have had a very good partnership with AMD, because not only does AMD help us provide very good performance, but also the price performance, right? And for all these numbers which I was showing, a big part of it is because we are running on AMD, which provides very good price performance. So that's the second aspect. And the third aspect is MySQL Autopilot, which provides machine learning based automation. So it's really these three things which give us this performance advantage: a combination of new algorithms designed for scale-out query processing, optimization for commodity cloud hardware, specifically AMD processors, and third, MySQL Autopilot. >> Great, thank you. So that's a good segue for AMD and Kumaran. So Kumaran, what is AMD bringing to the table?
What are, for instance, the relevant specs of the chips that are used in Oracle Cloud Infrastructure, and what makes them unique? >> Yeah, thanks Dave. That's a good question. So, OCI is a great customer of ours. They use what we call the top-of-stack devices, meaning that they have the highest core count and they are also very, very fast cores. So these are currently Zen 3 cores. I think the HeatWave product is right now deployed on Zen 2, but will shortly be on the Zen 3 core as well. But we provide, in the case of OCI, 64 cores. So those are the largest devices that we build. What actually happens is, because of this large number of CPUs in a single package, and therefore the increased density of the node, you end up with this fantastic TCO equation, and the cost per performance for deployed services like HeatWave actually ends up being extraordinarily competitive. And that's a big part of the contribution that we're bringing in here. >> So Zen is the AMD microarchitecture, which you introduced, I think, in 2017, and it's the basis for EPYC, which is sort of the enterprise grade that you really attacked the enterprise with. Maybe you could elaborate a little bit, double click on how your chips contribute specifically to HeatWave's price performance results. >> Yeah, absolutely. So in the case of HeatWave, as Nipun alluded to, we have very large L3 caches, right? So in our very, very top-end parts, just like the Milan-X devices, we can go all the way up to like 768 megabytes of L3 cache. And that gives you just enormous performance gains. And that's part of what we're seeing with HeatWave today. And note that they're currently on the second-generation Rome-based product, so it's a 7002-based product line running with the 64 cores. But as time goes on, they'll be adopting the next-generation Milan as well.
And the other part of it too is, as our chiplet architecture has evolved, you know, from the first-generation Naples way back in 2017, we went from having multiple memory domains and a sort of NUMA architecture at the time, and today we've really optimized that architecture. We use a common I/O die that has all of the memory channels attached to it. And what that means is that these scale-out applications like HeatWave are able to really scale very efficiently as they go from a small domain of CPUs to, for example, the entire chip, all 64 cores. That scaling has been a key focus for AMD, and being able to design and build architectures that can take advantage of that, and then have applications like HeatWave that scale so well on it, has been a key aim of ours. >> And Zen 3, moving up the Italian countryside. Nipun, you've taken the somewhat unusual step of posting the benchmark parameters, making them public on GitHub. Now, HeatWave is relatively new. So people felt that when Oracle gained ownership of MySQL it would let it wilt on the vine in favor of Oracle Database, so you lost some ground, and now you're getting very aggressive with HeatWave. What's the reason for publishing those benchmark parameters on GitHub? >> So, the main reason for us to publish price performance numbers for HeatWave is to communicate to our customers a sense of what benefits they're going to get when they use HeatWave. But we want to be very transparent because, as I said, the performance advantages for the customers may vary based on the data size and based on the specific workloads. So one of the reasons for us to publish all these scripts on GitHub is for transparency. So we want customers to take a look at the scripts, know what we have done, and be confident that we stand by the numbers which we are publishing, and they're very welcome to try these numbers themselves.
In fact, we have had customers who have downloaded the scripts from GitHub and run them on our service to kind of validate. The second aspect is in some cases, they may be some deviations from what we are publishing versus what the customer would like to run in the production deployments so it provides an easy way, for customers to take the scripts, modify them in some ways which may suit their real world scenario and run to see what the performance advantages are. So that's the main reason, first, is transparency, so the customers can see what we are doing, because of the comparison, and B, if they want to modify it to suit their needs, and then see what is the performance of HeatWave, they're very welcome to do so. >> So have customers done that? Have they taken the benchmarks? And I mean, if I were a competitor, honestly, I wouldn't get into that food fight because of the impressive performance, but unless I had to, I mean, have customers picked up on that, Nipun? >> Absolutely. In fact, we have had many customers who have benchmarked the performance of MySQL HeatWave, with other services. And the fact that the scripts are available, gives them a very good starting point, and then they've also tweaked those queries in some cases, to see what the Delta would be. And in some cases, customers got back to us saying, hey the performance advantage of HeatWave is actually slightly higher than what was published and what is the reason. And the reason was, when the customers were trying, they were trying on the latest version of the service, and our benchmark results were posted let's say, two months back. So the service had improved in those two to three months and customers actually saw better performance. So yes, absolutely. We have seen customers download the scripts, try them and also modify them to some extent and then do the comparison of HeatWave with other services. >> Interesting. Maybe a question for both of you how is the competition responding to this? 
They haven't said, "Hey, we're going to come up with our own benchmarks." Which is very common, you oftentimes see that. Although, for instance, Snowflake hasn't responded to Databricks, so that's not their game, but if the customers are actually putting a lot of faith in the benchmarks and actually using that for buying decisions, then it's inevitable. But how have you seen the competition respond to the MySQL HeatWave and AMD combo? >> So maybe I can take the first crack from the database service standpoint. When customers have more choice, it is invariably advantageous for the customer because then the competition is going to react, right? So the way we have seen the reaction is that we do believe that the other database services are going to take a closer eye to the price performance, right? Because if you're offering such good price performance, the vendors are already looking at it. And, you know, instances where they have offered let's say discount to the customers, to kind of at least like close the gap to some extent. And the second thing would be in terms of the capability. So like one of the things which I should have mentioned even early on, is that not only does MySQL HeatWave on AMD provide very good price performance, say on like a small cluster, but it's all the way up to a cluster size of 64 nodes, which has about 1000 cores. So the point is that HeatWave performs very well, both on a small system, as well as a huge scale out. And this is again, one of those things which is a differentiation compared to other services, so we expect that even other database services will have to improve their offerings to provide the same good scale factor, which customers are now starting to expect with MySQL HeatWave. >> Kumaran, anything you'd add to that? I mean, you guys are an arms dealer, you love all your OEMs, but at the same time, you've got chip competitors, silicon competitors. 
How do you see the competitive-- >> I'd say the broader answer and the big picture for AMD, we're very maniacally focused on our customers, right? And OCI and Oracle are huge and important customers for us, and this particular use case is extremely interesting both in that it takes advantage very well of our architecture and it pulls out some of the value that AMD brings. I think from a big picture standpoint, our aim is to execute, to build, to bring out generations of CPUs, kind of, you know, do what we say and say, sorry, say what we do and do what we say. And from that point of view, we're hitting the schedules that we say, and being able to bring out the latest technology and bring it in a TCO value proposition that generationally keeps OCI and HeatWave ahead. That's the crux of our partnership here. >> Yeah, the execution's been obvious for the last several years. Kumaran, staying with you, how would you characterize the collaboration between the AMD engineers and the HeatWave engineering team? How do you guys work together? >> No, I'd say we're in a very, very deep collaboration. So, there's a few aspects where we've actually been working together very closely on the code and being able to optimize for both the large L3 cache that AMD has, and so to be able to take advantage of that. And then also, to be able to take advantage of the scaling. So going between, you know, our architecture is chiplet-based, so we have these, the CPU cores on, we call 'em CCDs, and the inter-CCD communication, there's opportunities to optimize at an application level and that's something we've been engaged with. In the broader engagement, we are going back now for multiple generations with OCI, and there's a lot of input that now kind of resonates in the product line itself. And so we value this very close collaboration with HeatWave and OCI. >> Yeah, and the cadence, Nipun, you and I have talked about this quite a bit. The cadence has been quite rapid. 
It's like this constant cycle, every couple of months I turn around, there's something new on HeatWave. But a question again, for both of you, what new things do you think that organizations, customers, are going to be able to do with MySQL HeatWave if you could look out next 12 to 18 months, is there anything you can share at this time about future collaborations? >> Right, look, 12 to 18 months is a long time. There's going to be a lot of innovation, a lot of new capabilities coming out in MySQL HeatWave. But even based on what we are currently offering, and the trend we are seeing is that customers are bringing more classes of workloads. So we started off with OLTP for MySQL, then it went to analytics. Then we increased it to mixed workloads, and now we offer like machine learning as well. So one is we are seeing more and more classes of workloads come to MySQL HeatWave. And the second is the scale, the kind of data volumes people are using HeatWave for, to process these mixed workloads, analytics, machine learning, OLTP, that's increasing. Now, along the way we are making it simpler to use, we are making it more cost effective to use. So for instance, last time when we talked, we had introduced this real time elasticity and that's something which is a very, very popular feature because customers want the ability to be able to scale out or scale down very efficiently. That's something we provided. We provided support for compression. So all of these capabilities are making it more efficient for customers to run a larger part of their workloads on MySQL HeatWave, and we will continue to make it richer in the next 12 to 18 months. >> Thank you. Kumaran, anything you'd add to that, we'll give you the last word as we got to wrap it. >> No, absolutely. So, you know, next 12 to 18 months we will have our Zen 4 CPUs out. So this could potentially go into the next generation of the OCI infrastructure. 
This would be with the Genoa and then Bergamo CPUs taking us to 96 and 128 cores with 12 channels of DDR5. This capability, you know, when applied to an application like HeatWave, you can see that it'll open up another order of magnitude potentially of use cases, right? And we're excited to see what customers can do with that. It certainly will make, kind of the, this service, and the cloud in general, that this cloud migration, I think even more attractive. So we're pretty excited to see how things evolve in this period of time. >> Yeah, the innovations are coming together. Guys, thanks so much, we got to leave it there, really appreciate your time. >> Thank you. >> All right, and thank you for watching this special Cube conversation, this is Dave Vellante, and we'll see you next time. (soft calm music)

Published Date : Sep 14 2022


Breaking Analysis: We Have the Data…What Private Tech Companies Don’t Tell you About Their Business

>> From The Cube Studios in Palo Alto and Boston, bringing you data driven insights from The Cube at ETR. This is "Breaking Analysis" with Dave Vellante. >> The reverse momentum in tech stocks caused by rising interest rates, less attractive discounted cash flow models, and more tepid forward guidance, can be easily measured by public market valuations. And while there's lots of discussion about the impact on private companies and cash runway and 409A valuations, measuring the performance of non-public companies isn't as easy. IPOs have dried up and public statements by private companies, of course, they accentuate the good and they kind of hide the bad. Real data, unless you're an insider, is hard to find. Hello and welcome to this week's "Wikibon Cube Insights" powered by ETR. In this "Breaking Analysis", we unlock some of the secrets that non-public, emerging tech companies may or may not be sharing. And we do this by introducing you to a capability from ETR that we've not exposed you to over the past couple of years, it's called the Emerging Technologies Survey, and it is packed with sentiment data and performance data based on surveys of more than a thousand CIOs and IT buyers covering more than 400 companies. And we've invited back our colleague, Erik Bradley of ETR to help explain the survey and the data that we're going to cover today. Erik, this survey is something that I've not personally spent much time on, but I'm blown away at the data. It's really unique and detailed. First of all, welcome. Good to see you again. >> Great to see you too, Dave, and I'm really happy to be talking about the ETS or the Emerging Technology Survey. Even our own clients of constituents probably don't spend as much time in here as they should. >> Yeah, because there's so much in the mainstream, but let's pull up a slide to bring out the survey composition. Tell us about the study. How often do you run it? What's the background and the methodology? 
>> Yeah, you were just spot on the way you were talking about the private tech companies out there. So what we did is we decided to take all the vendors that we track that are not yet public and move 'em over to the ETS. And there isn't a lot of information out there. If you're not in Silicon (indistinct), you're not going to get this stuff. So PitchBook and Tech Crunch are two out there that gives some data on these guys. But what we really wanted to do was go out to our community. We have 6,000, ITDMs in our community. We wanted to ask them, "Are you aware of these companies? And if so, are you allocating any resources to them? Are you planning to evaluate them," and really just kind of figure out what we can do. So this particular survey, as you can see, 1000 plus responses, over 450 vendors that we track. And essentially what we're trying to do here is talk about your evaluation and awareness of these companies and also your utilization. And also if you're not utilizing 'em, then we can also figure out your sales conversion or churn. So this is interesting, not only for the ITDMs themselves to figure out what their peers are evaluating and what they should put in POCs against the big guys when contracts come up. But it's also really interesting for the tech vendors themselves to see how they're performing. >> And you can see 2/3 of the respondents are director level of above. You got 28% is C-suite. There is of course a North America bias, 70, 75% is North America. But these smaller companies, you know, that's when they start doing business. So, okay. We're going to do a couple of things here today. First, we're going to give you the big picture across the sectors that ETR covers within the ETS survey. And then we're going to look at the high and low sentiment for the larger private companies. And then we're going to do the same for the smaller private companies, the ones that don't have as much mindshare. 
And then I'm going to put those two groups together and we're going to look at two dimensions, actually three dimensions, which companies are being evaluated the most. Second, companies are getting the most usage and adoption of their offerings. And then third, which companies are seeing the highest churn rates, which of course is a silent killer of companies. And then finally, we're going to look at the sentiment and mindshare for two key areas that we like to cover often here on "Breaking Analysis", security and data. And data comprises database, including data warehousing, and then big data analytics is the second part of data. And then machine learning and AI is the third section within data that we're going to look at. Now, one other thing before we get into it, ETR very often will include open source offerings in the mix, even though they're not companies like TensorFlow or Kubernetes, for example. And we'll call that out during this discussion. The reason this is done is for context, because everyone is using open source. It is the heart of innovation and many business models are super glued to an open source offering, like take MariaDB, for example. There's the foundation and then there's with the open source code and then there, of course, the company that sells services around the offering. Okay, so let's first look at the highest and lowest sentiment among these private firms, the ones that have the highest mindshare. So they're naturally going to be somewhat larger. And we do this on two dimensions, sentiment on the vertical axis and mindshare on the horizontal axis and note the open source tool, see Kubernetes, Postgres, Kafka, TensorFlow, Jenkins, Grafana, et cetera. So Erik, please explain what we're looking at here, how it's derived and what the data tells us. >> Certainly, so there is a lot here, so we're going to break it down first of all by explaining just what mindshare and net sentiment is. You explain the axis. 
We have so many evaluation metrics, but we need to aggregate them into one so that way we can rank against each other. Net sentiment is really the aggregation of all the positive and subtracting out the negative. So the net sentiment is a very quick way of looking at where these companies stand versus their peers in their sectors and sub sectors. Mindshare is basically the awareness of them, which is good for very early stage companies. And you'll see some names on here that are obviously been around for a very long time. And they're clearly be the bigger on the axis on the outside. Kubernetes, for instance, as you mentioned, is open source. This de facto standard for all container orchestration, and it should be that far up into the right, because that's what everyone's using. In fact, the open source leaders are so prevalent in the emerging technology survey that we break them out later in our analysis, 'cause it's really not fair to include them and compare them to the actual companies that are providing the support and the security around that open source technology. But no survey, no analysis, no research would be complete without including these open source tech. So what we're looking at here, if I can just get away from the open source names, we see other things like Databricks and OneTrust . They're repeating as top net sentiment performers here. And then also the design vendors. People don't spend a lot of time on 'em, but Miro and Figma. This is their third survey in a row where they're just dominating that sentiment overall. And Adobe should probably take note of that because they're really coming after them. But Databricks, we all know probably would've been a public company by now if the market hadn't turned, but you can see just how dominant they are in a survey of nothing but private companies. And we'll see that again when we talk about the database later. 
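The net sentiment Erik describes above, the positives aggregated and the negatives subtracted out, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the conversation does not give ETR's actual formula, so the function name, the normalization by total citations, and the counts below are all hypothetical.

```python
def net_sentiment(positive: int, negative: int, total: int) -> float:
    """Share of positive citations minus share of negative citations.

    Hypothetical formula: the transcript only says positives are
    aggregated and negatives subtracted; dividing by the total number
    of citations is an assumption made here so vendors are comparable.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (positive - negative) / total


# Made-up counts for a single vendor, for illustration only.
score = net_sentiment(positive=60, negative=15, total=100)
print(f"net sentiment: {score:.2f}")
```

Normalizing by the total keeps a heavily cited vendor from outranking a lightly cited one purely on volume, which is why mindshare is plotted as a separate axis.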
>> And I'll just add, so you see Automation Anywhere on there, the big UiPath competitor company that was not able to get to the public markets. They've been trying. Snyk, Peter McKay's company, they've raised a bunch of money, big security player. They're doing some really interesting things in developer security, helping developers secure the data flow, H2O.ai, Dataiku AI company. We saw them at the Snowflake Summit. Redis Labs, Netskope in security. So a lot of names that we know that ultimately we think are probably going to be hitting the public market. Okay, here's the same view for private companies with less mindshare, Erik. Take us through this one. >> On the previous slide too real quickly, I wanted to pull out SecurityScorecard and we'll get back into it. But this is a newcomer, that I couldn't believe how strong their data was, but we'll bring that up in a second. Now, when we go to the ones of lower mindshare, it's interesting to talk about open source, right? Kubernetes was all the way on the top right. Everyone uses containers. Here we see Istio up there. Not everyone is using service mesh as much. And that's why Istio is in the smaller breakout. But still when you talk about net sentiment, it's about the leader, it's the highest one there is. So really interesting to point out. Then we see other names like Collibra in the data side really performing well. And again, as always security, very well represented here. We have Aqua, Wiz, Armis, which is a standout in this survey this time around. They do IoT security. I hadn't even heard of them until I started digging into the data here. And I couldn't believe how well they were doing. And then of course you have AnyScale, which is doing second best in this, and the best name in the survey, Hugging Face, which is a machine learning AI tool. Also doing really well on net sentiment, but they're not as far along on that axis of mindshare just yet. 
So these are again, emerging companies that might not be as well represented in the enterprise as they will be in a couple of years. >> Hugging Face sounds like something you do with your two year old. Like you said, you see high performers, AnyScale do machine learning and you mentioned them. They came out of Berkeley. Collibra Governance, InfluxData is on there. InfluxDB's a time series database. And yeah, of course, Alex, if you bring that back up, you get a big group of red dots, right? That's the bad zone, I guess, which Sisense does vis, Yellowbrick Data is an MPP database. How should we interpret the red dots, Erik? I mean, is it necessarily a bad thing? Could it be misinterpreted? What's your take on that? >> Sure, well, let me just explain the definition of it first from a data science perspective, right? We're a data company first. So the gray dots that you're seeing that aren't named, that's the mean, that's the average. So in order for you to be on this chart, you have to be at least one standard deviation above or below that average. So that gray is where we're saying, "Hey, this is where the lump of average comes in. This is where everyone normally stands." So you either have to be an outperformer or an underperformer to even show up in this analysis. So by definition, yes, the red dots are bad. You're at least one standard deviation below the average of your peers. It's not where you want to be. And if you're on the lower left, not only are you not performing well from a utilization or an actual usage rate, but people don't even know who you are. So that's a problem, obviously. And the VCs and the PEs out there that are backing these companies, they're the ones who mostly are interested in this data. >> Yeah. Oh, that's great explanation. Thank you for that. No, nice benchmarking there and yeah, you don't want to be in the red. All right, let's get into the next segment here. Here we're going to look at evaluation rates, adoption and the all important churn. 
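The one-standard-deviation cut that Erik explains above for the green, gray and red dots can be sketched as follows. This is a minimal illustration, not ETR's method: the `classify` helper, the use of a population standard deviation, and the vendor scores are all assumptions made for the example.

```python
from statistics import mean, pstdev


def classify(scores: dict[str, float]) -> dict[str, str]:
    """Label each vendor relative to the peer average.

    "green" = at least one standard deviation above the mean,
    "red"   = at least one standard deviation below the mean,
    "gray"  = within one standard deviation (the unnamed dots).
    Using the population standard deviation is an assumption.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    labels = {}
    for name, score in scores.items():
        if sigma == 0:
            labels[name] = "gray"  # no spread, so nobody stands out
        elif score >= mu + sigma:
            labels[name] = "green"
        elif score <= mu - sigma:
            labels[name] = "red"
        else:
            labels[name] = "gray"
    return labels


# Hypothetical net sentiment scores for five unnamed vendors.
sample = {"A": 0.9, "B": 0.5, "C": 0.5, "D": 0.5, "E": 0.1}
print(classify(sample))
```

With these made-up scores only vendors A and E fall outside the band, which mirrors the point made in the conversation: a company has to be an outperformer or an underperformer to be named on the chart at all.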
First new evaluations. Let's bring up that slide. And Erik, take us through this. >> So essentially I just want to explain what evaluation means is that people will cite that they either plan to evaluate the company or they're currently evaluating. So that means we're aware of 'em and we are choosing to do a POC of them. And then we'll see later how that turns into utilization, which is what a company wants to see, awareness, evaluation, and then actually utilizing them. That's sort of the life cycle for these emerging companies. So what we're seeing here, again, with very high evaluation rates. H2O, we mentioned. SecurityScorecard jumped up again. Chargebee, Snyk, Salt Security, Armis. A lot of security names are up here, Aqua, Netskope, which God has been around forever. I still can't believe it's in an Emerging Technology Survey But so many of these names fall in data and security again, which is why we decided to pick those out Dave. And on the lower side, Vena, Acton, those unfortunately took the dubious award of the lowest evaluations in our survey, but I prefer to focus on the positive. So SecurityScorecard, again, real standout in this one, they're in a security assessment space, basically. They'll come in and assess for you how your security hygiene is. And it's an area of a real interest right now amongst our ITDM community. >> Yeah, I mean, I think those, and then Arctic Wolf is up there too. They're doing managed services. You had mentioned Netskope. Yeah, okay. All right, let's look at now adoption. These are the companies whose offerings are being used the most and are above that standard deviation in the green. Take us through this, Erik. >> Sure, yet again, what we're looking at is, okay, we went from awareness, we went to evaluation. Now it's about utilization, which means a survey respondent's going to state "Yes, we evaluated and we plan to utilize it" or "It's already in our enterprise and we're actually allocating further resources to it." 
Not surprising, again, a lot of open source, the reason why, it's free. So it's really easy to grow your utilization on something that's free. But as you and I both know, as Red Hat proved, there's a lot of money to be made once the open source is adopted, right? You need the governance, you need the security, you need the support wrapped around it. So here we're seeing Kubernetes, Postgres, Apache Kafka, Jenkins, Grafana. These are all open source based names. But if we're looking at names that are non open source, we're going to see Databricks, Automation Anywhere, Rubrik all have the highest mindshare. So these are the names, not surprisingly, all names that probably should have been public by now. Everyone's expecting an IPO imminently. These are the names that have the highest mindshare. If we talk about the highest utilization rates, again, Miro and Figma pop up, and I know they're not household names, but they are just dominant in this survey. These are applications that are meant for design software and, again, they're going after an Autodesk or a CAD or Adobe type of thing. It is just dominant how high the utilization rates are here, which again is something Adobe should be paying attention to. And then you'll see a little bit lower, but also interesting, we see Collibra again, we see Hugging Face again. And these are names that are obviously in the data governance, ML, AI side. So we're seeing a ton of data, a ton of security and Rubrik was interesting in this one, too, high utilization and high mindshare. We know how pervasive they are in the enterprise already. >> Erik, Alex, keep that up for a second, if you would. So yeah, you mentioned Rubrik. Cohesity's not on there. They're sort of the big one. We're going to talk about them in a moment. Puppet is interesting to me because you remember the early days of that sort of space, you had Puppet and Chef and then you had Ansible. Red Hat bought Ansible and then Ansible really took off. 
So it's interesting to see Puppet on there as well. Okay. So now let's look at the churn because this one is where you don't want to be. It's, of course, all red 'cause churn is bad. Take us through this, Erik. >> Yeah, definitely don't want to be here and I don't love to dwell on the negative. So we won't spend as much time. But to your point, there's one thing I want to point out that think it's important. So you see Rubrik in the same spot, but Rubrik has so many citations in our survey that it actually would make sense that they're both being high utilization and churn just because they're so well represented. They have such a high overall representation in our survey. And the reason I call that out is Cohesity. Cohesity has an extremely high churn rate here about 17% and unlike Rubrik, they were not on the utilization side. So Rubrik is seeing both, Cohesity is not. It's not being utilized, but it's seeing a high churn. So that's the way you can look at this data and say, "Hm." Same thing with Puppet. You noticed that it was on the other slide. It's also on this one. So basically what it means is a lot of people are giving Puppet a shot, but it's starting to churn, which means it's not as sticky as we would like. One that was surprising on here for me was Tanium. It's kind of jumbled in there. It's hard to see in the middle, but Tanium, I was very surprised to see as high of a churn because what I do hear from our end user community is that people that use it, like it. It really kind of spreads into not only vulnerability management, but also that endpoint detection and response side. So I was surprised by that one, mostly to see Tanium in here. Mural, again, was another one of those application design softwares that's seeing a very high churn as well. >> So you're saying if you're in both... Alex, bring that back up if you would. So if you're in both like MariaDB is for example, I think, yeah, they're in both. 
They're both green in the previous one and red here, that's not as bad. You mentioned Rubrik is going to be in both. Cohesity is a bit of a concern. Cohesity just brought on Sanjay Poonen. So this could be a go to market issue, right? I mean, 'cause Cohesity has got a great product and they got really happy customers. So they're just maybe having to figure out, okay, what's the right ideal customer profile and Sanjay Poonen, I guarantee, is going to have that company cranking. I mean they had been doing very well on the surveys and had fallen off of a bit. The other interesting things wondering the previous survey I saw Cvent, which is an event platform. My only reason I pay attention to that is 'cause we actually have an event platform. We don't sell it separately. We bundle it as part of our offerings. And you see Hopin on here. Hopin raised a billion dollars during the pandemic. And we were like, "Wow, that's going to blow up." And so you see Hopin on the churn and you didn't see 'em in the previous chart, but that's sort of interesting. Like you said, let's not kind of dwell on the negative, but you really don't. You know, churn is a real big concern. Okay, now we're going to drill down into two sectors, security and data. Where data comprises three areas, database and data warehousing, machine learning and AI and big data analytics. So first let's take a look at the security sector. Now this is interesting because not only is it a sector drill down, but also gives an indicator of how much money the firm has raised, which is the size of that bubble. And to tell us if a company is punching above its weight and efficiently using its venture capital. Erik, take us through this slide. Explain the dots, the size of the dots. Set this up please. >> Yeah. 
So again, the axis is still the same, net sentiment and mindshare, but what we've done this time is we've taken publicly available information on how much capital company is raised and that'll be the size of the circle you see around the name. And then whether it's green or red is basically saying relative to the amount of money they've raised, how are they doing in our data? So when you see a Netskope, which has been around forever, raised a lot of money, that's why you're going to see them more leading towards red, 'cause it's just been around forever and kind of would expect it. Versus a name like SecurityScorecard, which is only raised a little bit of money and it's actually performing just as well, if not better than a name, like a Netskope. OneTrust doing absolutely incredible right now. BeyondTrust. We've seen the issues with Okta, right. So those are two names that play in that space that obviously are probably getting some looks about what's going on right now. Wiz, we've all heard about right? So raised a ton of money. It's doing well on net sentiment, but the mindshare isn't as well as you'd want, which is why you're going to see a little bit of that red versus a name like Aqua, which is doing container and application security. And hasn't raised as much money, but is really neck and neck with a name like Wiz. So that is why on a relative basis, you'll see that more green. As we all know, information security is never going away. But as we'll get to later in the program, Dave, I'm not sure in this current market environment, if people are as willing to do POCs and switch away from their security provider, right. There's a little bit of tepidness out there, a little trepidation. So right now we're seeing overall a slight pause, a slight cooling in overall evaluations on the security side versus historical levels a year ago. >> Now let's stay on here for a second. So a couple things I want to point out. So it's interesting. 
Now Snyk has raised over, I think, $800 million, but you can see them, they're high on the vertical and the horizontal. But now compare that to Lacework. It's hard to see, but they're kind of buried in the middle there. That's the biggest dot in this whole thing. I think I'm interpreting this correctly. They've raised over a billion dollars. It's a Mike Speiser company. He was the founding investor in Snowflake. So people watch that very closely, but that's an example of where they're not punching above their weight. They recently had a layoff and they've got to fine-tune things, but I'm still confident they're going to do well, 'cause they're approaching security as a data problem, and people are probably having trouble getting their arms around that. And then again, I see Arctic Wolf. They're not red, they're not green, but they've raised a fair amount of money, and it's showing up to the right and at a decent level there. And a couple of the other ones that you mentioned, Netskope. Yeah, they've raised a lot of money, but they're actually performing where you want. What you don't want is where Lacework is, right. They've got some work to do to really take advantage of the money that they raised last November and prior to that. >> Yeah, if you're seeing that more neutral color, like you're calling out with an Arctic Wolf, that means relative to their peers, this is where they should be. It's when you're seeing that red on a Lacework where we all know, wow, you raised a ton of money and your mindshare isn't where it should be. Your net sentiment is not where it should be comparatively. And then you see these great standouts, like Salt Security and SecurityScorecard and Abnormal. You know they haven't raised that much money yet, but their net sentiment's higher and their mindshare's doing well. So basically, in a nutshell, if you're a PE or a VC and you see a small green circle, then you're doing well. It means you made a good investment. 
>> Some of these guys, I don't know, but you see these small green circles. Those are the ones you want to start digging into and maybe help them catch a wave. Okay, let's get into the data discussion. And again, three areas: database slash data warehousing, big data analytics, and ML/AI. First, we're going to look at the database sector. So Alex, thank you for bringing that up. Alright, take us through this, Erik. Actually, let me just say PostgreSQL. I got to ask you about this. It shows some funding, but that actually could be a mix of EDB, the company that commercializes Postgres, and Postgres the open source database, which is a transaction system and kind of an open source Oracle. You see MariaDB is a database, but an open source database. But the company has raised over $200 million and they filed an S-4. So Erik, it looks like this might be a little bit of a mashup of companies and open source products. Help us understand this. >> Yeah, it's tough when you start dealing with the open source side, and I'll be honest with you, there is a little bit of a mashup here. There are certain names here that are a hundred percent for-profit companies. And then there are others that are obviously open source based, like Redis is open source, but Redis Labs is the one trying to monetize the support around it. So you're a hundred percent accurate on this slide. I think one of the things here that's important to note, though, is just how important open source is to data. If you're going to be going to any of these areas, it's going to be open source based to begin with. And Neo4j is one I want to call out here. It's not one everyone's familiar with, but it's basically a graph database, which is a name that we're seeing on the net sentiment side actually really, really high. When you think about it, it's the third overall net sentiment for a niche database play. 
It's not as big on the mindshare 'cause its use cases aren't as common, but it's the third biggest play on net sentiment, which I found really interesting on this slide. >> And again, so MariaDB, as I said, they filed an S-4, I think $50 million in revenue, that might even be ARR. So they're not huge, but they're getting there. And by the way, MariaDB, if you don't know, was the company that was formed the day that Oracle bought Sun, in which they got MySQL, and MariaDB has done a really good job of replacing a lot of MySQL instances. Oracle has responded with MySQL HeatWave, which was kind of the Oracle version of MySQL. So there's some interesting battles going on there. If you think about the LAMP stack, the M in the LAMP stack was MySQL. And so now it's all MariaDB replacing that MySQL for a large part. And then you see again, the red, you know, you got to have some concerns about there. Aerospike's been around for a long time. SingleStore changed their name a couple years ago, last year. Yellowbrick Data, Firebolt was kind of going after Snowflake for a while, but yeah, you want to get out of that red zone. So they got some work to do. >> And Dave, real quick for the people that aren't aware, I just want to let them know that we can cut this data with the public company data as well. So we can cross over this with that because some of these names are competing with the larger public company names as well. So we can go ahead and cross-reference like a MariaDB with a Mongo, for instance, or something of that nature. So it's not in this slide, but at another point we can certainly explain on a relative basis how these private names are doing compared to the other ones as well. >> All right, let's take a quick look at analytics. Alex, bring that up if you would. Go ahead, Erik. >> Yeah, I mean, essentially here, I can't see it on my screen, my apologies. I just kind of went to blank on that. So gimme one second to catch up. >> So I could set it up while you're doing that. 
You got Grafana up and to the right. I mean, this is huge, right? >> Got it, thank you. I lost my screen there for a second. Yep. Again, open source name Grafana, absolutely up and to the right. But as we know, Grafana Labs is actually picking up a lot of speed based on Grafana, of course. And I think we might actually hear some noise from them coming this year. The names that are actually a little bit more disappointing that I want to call out are names like ThoughtSpot. It's been around forever. Their mindshare of course is second best here, but based on the amount of time they've been around and the amount of money they've raised, it's not actually outperforming the way it should be. We're seeing Moogsoft obviously make some waves. That's very high net sentiment for that company. It's, you know, what, third, fourth position overall in this entire area. Other names like Fivetran and Matillion are doing well. Fivetran, even though it's got a high net sentiment, again, it's raised so much money that we would've expected a little bit more at this point. I know you know this space extremely well, but basically what we're looking at here, and to the bottom left, you're going to see some names with a lot of red, large circles that really just aren't performing that well. InfluxData, however, second highest net sentiment. And it's really pretty early on in this stage, and the feedback we're getting on this name is the use cases are great, the efficacy's great. And I think it's one to watch out for. >> InfluxData, time series database. The other interesting things I just noticed here, you got Tamr on here, which is that little small green. Those are the ones we were saying before, look for those guys. They might be some of the interesting companies out there. And then Observe, Jeremy Burton's company. They do observability on top of Snowflake, not green, but kind of in that gray. So that's kind of cool. Monte Carlo is another one, they're sort of slightly green. 
They are doing some really interesting things in data and data mesh. So yeah, okay. So I can spend all day on this stuff, Erik, phenomenal data. I got to get back and really dig in. Let's end with machine learning and AI. Now this chart is similar in its dimensions, of course, except for the money raised. We're not showing that size of the bubble, but AI is so hot. We wanted to cover that here. Erik, explain this please. Why TensorFlow is highlighted, and walk us through this chart. >> Yeah, it's funny yet again, right? Another open source name, TensorFlow, being up there. And I just want to explain, we do break out machine learning and AI as its own sector. A lot of this of course really is intertwined with the data side, but it is its own area. And one of the things I think that's most important here to break out is Databricks. We started to cover Databricks in machine learning and AI. That company has grown into much, much more than that. So I do want to state to you, Dave, and also the audience out there, that moving forward, we're going to be moving Databricks out of only the ML/AI into other sectors, so we can kind of value them against their peers a little bit better. But in this instance, you could just see how dominant they are in this area. And one thing that's not here, but I do want to point out, is that we have the ability to break this down by industry vertical and organization size. And when I break this down into Fortune 500 and Fortune 1000, both Databricks and TensorFlow are even better than you see here. So it's quite interesting to see that the names that are succeeding are also succeeding with the largest organizations in the world. And as we know, large organizations means large budgets. So this is one area that I just thought was really interesting to point out, that as we break the data down by vertical, these two names still are the outstanding players. >> I just also want to call out H2O.ai. 
They're getting a lot of buzz in the marketplace and I'm seeing them a lot more. Anaconda, another one. Dataiku consistently popping up. DataRobot is also interesting because of all the kerfuffle that's going on there. The Cube guy, Cube alum, Chris Lynch stepped down as executive chairman. All this stuff came out about how the executives were taking money off the table and didn't allow the employees to participate in that money-raising deal. So that's pissed a lot of people off. And so they're now going through some kind of uncomfortable things, which is unfortunate because DataRobot, I noticed, we haven't covered them that much in "Breaking Analysis", but I've noticed them oftentimes, Erik, in the surveys doing really well. So you would think that company has a lot of potential. But yeah, it's an important space that we're going to continue to watch. Let me ask you, Erik, can you contextualize this from a time series standpoint? I mean, how has this changed over time? >> Yeah, again, not shown here, but in the data. I'm sorry, go ahead. >> No, I'm sorry. What I meant, I should have interjected. In other words, you would think in a downturn that these emerging companies would be less interesting to buyers 'cause they're more risky. What have you seen? >> Yeah, and it was interesting, before we went live, you and I were having this conversation about "Is the downturn stopping people from evaluating these private companies or not," right. In a larger sense, that's really what we're doing here. How are these private companies doing when it comes down to the actual practitioners? The people with the budget, the people with the decision making. And so what I did is, we have historical data as you know, I went back to the Emerging Technology Survey we did in November of '21, right at the crest, right before the market started to really fall and everything kind of started to fall apart there. 
And what I noticed is on the security side, very much so, we're seeing fewer evaluations than we were in November '21. So I broke it down. On cloud security, net sentiment went from 21% to 16% from November '21. That's a pretty big drop. And again, net sentiment is our one aggregate metric for overall positivity, meaning utilization and actual evaluation of the name. Again in database, we saw it drop a little bit from 19% to 13%. However, in analytics we actually saw it stay steady. So it's pretty interesting that yes, cloud security and security in general is always going to be important. But right now we're seeing less overall net sentiment in that space. But within analytics, we're seeing steady with growing mindshare. And also to your point earlier in machine learning, AI, we're seeing steady net sentiment and mindshare has grown a whopping 25% to 30%. So despite the downturn, we're seeing more awareness of these companies in analytics and machine learning and a steady, actual utilization of them. I can't say the same in security and database. They're actually shrinking a little bit since the end of last year. >> You know it's interesting, we were on a round table, Erik does these round tables with CISOs and CIOs, and I remember one time you had asked the question, "How do you think about some of these emerging tech companies?" And one of the executives said, "I always include somebody in the bottom left of the Gartner Magic Quadrant in my RFPs." I think he said, "That's how I found," I don't know, it was Zscaler or something like that, years before anybody ever knew of them, "because they're going to help me get to the next level." So it's interesting to see, Erik, in these sectors, how they're holding up in many cases. >> Yeah. It's a very important part for the actual IT practitioners themselves. There's always contracts coming up and you always have to worry about your next round of negotiations. And that's one of the roles these guys play. 
You have to do a POC when contracts come up, but it's also their job to stay on top of the new technology. You can't fall behind. Like, everyone's a software company now. Everyone's a tech company, no matter what you're doing. So these guys have to stay on top of it. And that's what this ETS can do. You can go in here and look and say, "All right, I'm going to evaluate their technology," and it could be twofold. It might be that you're ready to upgrade your technology and they're actually pushing the envelope, or it simply might be I'm using them as a negotiation ploy. So when I go back to the big guy who I have full intentions of writing that contract to, at least I have some negotiation leverage. >> Erik, we got to leave it there. I could spend all day. I'm going to definitely dig into this on my own time. Thank you for introducing this, really appreciate your time today. >> I always enjoy it, Dave, and I hope everyone out there has a great holiday weekend. Enjoy the rest of the summer. And, you know, I love to talk data. So anytime you want, just point the camera on me and I'll start talking data. >> You got it. I also want to thank the team at ETR, not only Erik, but Darren Bramen, who's a data scientist, really helped prepare this data, the entire team over at ETR. I cannot tell you how much additional data there is. We are just scratching the surface in this "Breaking Analysis". So great job, guys. I want to thank Alex Myerson, who's on production and manages the podcast. Ken Shifman as well, who's just coming back from VMware Explore. Kristen Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our editor in chief over at SiliconANGLE. Does some great editing for us. Thank you, all of you guys. Remember, these episodes are all available as podcasts, wherever you listen. All you got to do is just search "Breaking Analysis" podcast. I publish each week on wikibon.com and siliconangle.com. 
Or you can email me to get in touch david.vellante@siliconangle.com. You can DM me at dvellante or comment on my LinkedIn posts and please do check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for Erik Bradley and The Cube Insights powered by ETR. Thanks for watching. Be well. And we'll see you next time on "Breaking Analysis". (upbeat music)

Published Date : Sep 7 2022



David Linthicum, Deloitte US | Supercloud22

(bright music) >> "Supermetafragilisticexpialadotious." What's in a name? In an homage to the inimitable Charles Fitzgerald, we've chosen this title for today's session because of all the buzz surrounding "supercloud," a term that we introduced last year to signify a major architectural trend and shift that's occurring in the technology industry. Since that time, we've published numerous videos and articles on the topic, and on August 9th, kicked off "Supercloud22," an open industry event designed to advance the supercloud conversation, gathering input from more than 30 experienced technologists and business leaders in "The Cube" and broader technology community. We're talking about individuals like Benoit Dageville, Kit Colbert, Ali Ghodsi, Mohit Aron, David McJannet, and dozens of other experts. And today, we're pleased to welcome David Linthicum, who's a Chief Strategy Officer of Cloud Services at Deloitte Consulting. David is a technology visionary, a technical CTO. He's an author and a frequently sought after keynote speaker at high profile conferences like "VMware Explore" next week. David Linthicum, welcome back to "The Cube." Good to see you again. >> Oh, it's great to be here. Thanks for the invitation. Thanks for having me. >> Yeah, you're very welcome. Okay, so this topic of supercloud, what you call metacloud, has created a lot of interest. VMware calls it cross-cloud services, Snowflake calls it their data cloud, there's a lot of different names, but recently, you published a piece in "InfoWorld" where you said the following. "I really don't care what we call it, and I really don't care if I put my own buzzword into the mix. However, this does not change the fact that metacloud is perhaps the most important architectural evolution occurring right now, and we need to get this right out of the gate. If we do that, who cares what it's named?" So very cool. 
And you also mentioned in a recent article that you don't like to put new terms out in the wild without defining them. So what is a metacloud, or what we call supercloud? What's your definition? >> Yeah, and again, I don't care what people call it. The reality is it's the ability to have a layer of cross-cloud services. It sits above existing public cloud providers. So the idea here is that instead of building different security systems, different governance systems, different operational systems in each specific cloud provider, using whatever native features they provide, we're trying to do that in a cross-cloud way. So in other words, we're pushing out data integration, security, all these other things that we have to take care of as part of deploying a particular cloud provider. And in a multicloud scenario, we're building those in and between the clouds. And so we've been tracking this for about five years. We understood that multicloud is not necessarily about the particular public cloud providers, it's about things that you build in and between the clouds. >> Got it, okay. So I want to come back to that, to the definition, but I want to tie us to the so-called multicloud. You guys did a survey recently. We've said that multicloud was mostly a symptom of multi-vendor, Shadow Cloud, M&A, and only recently has become a strategic imperative. Now, Deloitte published a survey recently entitled "Closing the Cloud Strategy, Technology, Innovation Gap," and I'd like to explore that a little bit. And so in that survey, you showed data. What I liked about it is you went beyond what we all know, right? The old, "Our research shows that on average, X number of clouds are used at an individual company." I mean, you had that too, but you really went deeper. You identified why companies are using multiple clouds, and you developed different categories of practitioners across 500 survey respondents. 
But the reasons were very clear for "why multicloud," as this becomes more strategic. Service choice scale, negotiating leverage, improved business resiliency, minimizing lock-in, interoperability of data, et cetera. So my question to you, David, is what's the problem supercloud or metacloud solves, and what's different from multicloud? >> That's a great question. The reality is that if we're... Well, supercloud or metacloud, whatever, is really something that exists above a multicloud, but I kind of view them as the same thing. It's an architectural pattern. We can name it anything. But the reality is that if we're moving to these multicloud environments, we're doing so to leverage best of breed things. In other words, best of breed technology to provide the innovators within the company to take the business to the next level, and we determine that in the survey. And so if we're looking at what a multicloud provides, it's the ability to provide different choices of different services or piece parts that allows us to build anything that we need to do. And so what we found in the survey and what we found in just practice in dealing with our clients is that ultimately, the value of cloud computing is going to be the innovation aspects. In other words, the ability to take the company to the next level from being more innovative and more disruptive in the marketplace that they're in. And the only way to do that, instead of basically leveraging the services of a particular walled garden of a single public cloud provider, is to cast a wider net and get out and leverage all kinds of services to make these happen. So if you think about that, that's basically how multicloud has evolved. In other words, it wasn't planned. They didn't say, "We're going to go do a multicloud." It was different developers and innovators in the company that went off and leveraged these cloud services, sometimes with the consent of IT leadership, sometimes not. 
And now we have these multitudes of different services that we're leveraging. And so many of these enterprises are going from 1000 to, say, 3000 services under management. That creates a complexity problem. We have a problem of heterogeneity, different platforms, different tools, different services, different AI technology, database technology, things like that. So the metacloud, or the supercloud, or whatever you want to call it, is the ability to deal with that complexity on the complexity's terms. And so instead of building all these various things that we have to do individually in each of the cloud providers, we're trying to do so within a cross-cloud service layer. We're trying to create this layer of technology, which removes us from dealing with the complexity of the underlying multicloud services and makes it manageable. Because right now, I think we're getting to a point of complexity where we just can't operate it within the budgetary limits that we have right now. We can't keep the number of skills around, the number of operators around, to keep these things going. We're going to have to get creative in terms of how we manage these things, how we manage a multicloud. And that's where the supercloud, metacloud, whatever they want to call it, comes in. >> Yeah, and as John Furrier likes to say, in IT, we tend to solve complexity with more complexity, and that's not what we're talking about here. We're talking about simplifying, and you talked about the abstraction layer, and then it sounds like I'm inferring more. There's value that's added on top of that. And then you also said the hyperscalers are in a walled garden. So I've been asked, why aren't the hyperscalers superclouds? And I've said, essentially, they want to put your data into their cloud and keep it there. Now, that doesn't mean they won't eventually get into that. 
We've seen examples a little bit, Outposts, Anthos, Azure Arc, but the hyperscalers really aren't building superclouds or metaclouds, at least today, are they? >> No, they're not. And I always have the prediction for every major cloud conference that this is the conference where the hyperscaler is going to figure out some sort of a multicloud or cross-cloud strategy. In other words, building services that are able to operate across clouds. That really has never happened. It has happened in dribs and drabs, and you just mentioned a few examples of that, but the ability to own the space, to understand that we're not going to be the center of the universe in how people are going to leverage it, it's going to be multiple things, including legacy systems and other cloud providers, and even industry clouds that are emerging these days, and SaaS providers, and all these things. So we're going to assist you in dealing with complexity, and we're going to provide the core services of being there. That hasn't happened yet. And they may be worried about conflicting with their market, and the messaging is a bit different, even actively pushing back on the concept of multicloud, but the reality is the market's going to take them there. So in other words, if enough of their customers are asking for this and asking that they take the lead in building these cross-cloud technologies, even if they're participating in the stack and not being the stack, it's too compelling a market not to drag a lot of the existing public cloud providers there. >> Well, it's going to be interesting to see how that plays out, David, because I never say never when it comes to a company like AWS, and we've seen how fast they move. And at the same time, they don't want to be commoditized. There's the layer underneath all this infrastructure, and they got this ecosystem that's adding all this tremendous value. 
But I want to ask you, what are the essential elements of supercloud, coming back to the definition, if you will, and what's different about metacloud, as you call it, from plain old SaaS or PaaS? What are the key elements there? >> Well, the key elements would be holistic management of all of the IT infrastructure. So even though it's sitting above a multicloud, I view metacloud, supercloud as the ability to also manage your existing legacy systems, your existing security stack, your existing network operations, basically everything that exists under the purview of IT. If you think about it, we're moving our infrastructure into the clouds, and we're probably going to hit a saturation point of about 70%. And really, if the supercloud, metacloud, which is going to be expensive to build for most of the enterprises, is going to be worth it, it needs to support these things holistically. So it needs to have all the services that are going to be shareable across the different providers, and also existing legacy systems, and also edge computing, and IoT, and all these very diverse systems that we're building there right now. So if complexity is a core challenge to operating these things at scale and to securing these things at scale, we have to have commonality in terms of security architecture and technology, commonality in terms of our directory services, commonality in terms of network operations, commonality in terms of cloud operations, commonality in terms of FinOps. All these things should exist in some holistic cross-cloud layer that sits above all this complexity. And you pointed out something very profound. In other words, that is going to mean that we're hiding a lot of the existing cloud providers in terms of their interfaces and dashboards and things like that that we're dealing with today, their APIs. But the reality is that if we're able to manage these things at scale, the public cloud providers are going to benefit greatly from that. 
They're going to sell more services because people are going to find they're able to leverage them easier. And so in other words, if we're removing the complexity wall, which many in the industry are calling it right now, then suddenly we're moving from, say, the 25 to 30% migrated in the cloud, which most enterprises are today, to 50, 60, 70%. And we're able to do this at scale, and we're doing it at scale because we're providing some architectural optimization through the supercloud, metacloud layer. >> Okay, thanks for that. David, I just want to tap your CTO brain for a minute. At "Supercloud22," we came up with these three deployment models. Kit Colbert put forth the idea that one model would be your control planes running in one cloud, let's say AWS, but it interacts with and can manage and deploy on other clouds, the Kubernetes Cluster Management System. The second one, Mohit Aron from Cohesity laid out, where you instantiate the stack on different clouds and different cloud regions, and then you create a layer, a common interface across those. And then Snowflake was the third deployment model where it's a single global instance, it's one instantiation, and basically building out their own cloud across these regions. Help us parse through that. Do those seem like reasonable deployment models to you? Do you have any thoughts on that? >> Yeah, I mean, that's a distributed computing trick we've been doing, which is, in essence, an agent of the supercloud that's carrying out some of the cloud native functions on that particular cloud, but is, in essence, a slave to the metacloud, or the supercloud, whatever, that's able to run across the various cloud providers. In other words, when it wants to access a service, it may not go directly to that service. It goes directly to the control plane, and that control plane is responsible... Very much like Kubernetes and Docker works, that control plane is responsible for reaching out and leveraging those native services. 
I think that that's thinking that's a step in the right direction. I think these things unto themselves, at least initially, are going to be a very complex array of technology. Even though we're trying to remove complexity, the supercloud unto itself, in terms of the ability to build this thing that's able to operate at scale across-cloud, is going to be a collection of many different technologies that are interfacing with the public cloud providers in different ways. And so we can start putting these meta architectures together, and I certainly have written and spoke about this for years, but initially, this is going to be something that may escape the detail or the holistic nature of these meta architectures that people are floating around right now. >> Yeah, so I want to stay on this, because anytime I get a CTO brain, I like to... I'm not an engineer, but I've been around a long time, so I know a lot of buzzwords and have absorbed a lot over the years, but so you take those, the second two models, the Mohit instantiate on each cloud and each cloud region versus the Snowflake approach. I asked Benoit Dageville, "Does that mean if I'm in an AWS east region and I want to do a query on Azure West, I can do that without moving data?" And he said, "Yes and no." And the answer was really, "No, we actually take a subset of that data," so there's the latency problem. From those deployment model standpoints, what are the trade-offs that you see in terms of instantiating the stack on each individual cloud versus that single instance? Is there a benefit of the single instance for governance and security and simplicity, but a trade-off on latency, or am I overthinking this? >> Yeah, you hit it on the nose. The reality is that the trade-off is going to be latency and performance. 
If we get wiggy with the distributed nature, like the distributed data example you just provided, we have to basically separate the queries and communicate with the databases on each instance, and then reassemble the result set that goes back to the people who are requesting it. And so we can do caching systems and things like that. But the reality is, if it's a distributed system, we're going to have latency and bandwidth issues that are going to be limiting us. And also security issues, because if we're moving lots of information over the open internet, or even private circuits, those are going to be attack vectors that hackers can leverage. You have to keep that in mind. We're trying to reduce those attack vectors. So it would be, in many instances, and I think we have to think about this, that we're going to keep the data in the same physical region for just that. So in other words, it's going to provide the best performance and also the most simplistic access to dealing with security. And so we're not, in essence, thinking about where the data's going, how it's moving across things, things like that. So the challenge is going to be is when you're dealing with a supercloud or metacloud is, when do you make those decisions? And I think, in many instances, even though we're leveraging multiple databases across multiple regions and multiple public cloud providers, and that's the idea of it, we're still going to localize the data for performance reasons. I mean, I just wrote a blog in "InfoWorld" a couple of months ago and talked about, people who are trying to distribute data across different public cloud providers for different reasons, distribute an application development system, things like that, you can do it. With enough time and money, you can do anything. I think the challenge is going to be operating that thing, and also providing a viable business return based on the application. 
And so while it may look like a good science experiment, and it's cool unto itself as an architect, the reality is the more pragmatic approach is going to be to leave it in a single region on a single cloud. >> Very interesting. The other reason I like to talk to companies like Deloitte and experienced people like you is 'cause I can get... You're agnostic, right? I mean, you're technology agnostic, vendor agnostic. So I want to come back with another question, which is, how do you deal with what I call the lowest common denominator problem? What I mean by that is if one cloud has, let's say, a superior service... Let's take an example of Nitro and Graviton. AWS seems to be ahead on that, but let's say some other cloud isn't quite there yet, and you're building a supercloud or a metacloud. How do you rationalize that? Does it have to be like a caravan in the army where you slow down so all the slowest trucks can keep up, or are there ways to adjudicate that that are advantageous to hide that deficiency? >> Yeah, and that's a great thing about leveraging a supercloud or a metacloud is we're putting that management in a single layer. So as far as a user or even a developer on those systems, they shouldn't worry about the performance that may come back, because we're dealing with the... You hit the nail on the head with that one. The slowest component is the one that dictates performance. And so we have to have some sort of a performance management layer. We're also making dynamic decisions to move data, to move processing, from one server to the other to try to minimize the amount of latency that's coming from a single component. 
So the great thing about that is we're putting that volatility into a single domain, and it's making architectural decisions in terms of where something will run and where it's getting its data from, things are stored, things like that, based on the performance feedback that's coming back from the various cloud services that are under management. And so if you're running across clouds, it becomes even more interesting, because ultimately, you're going to make some architectural choices on the fly in terms of where that stuff runs based on the active dynamic performance that that public cloud provider is providing. So in other words, we may find that it automatically shut down a database service, say MySQL, on one cloud instance, and moved it to a MySQL instance on another public cloud provider because there was some sort of a performance issue that it couldn't work around. And by the way, it does so dynamically. Away from you making that decision, it's making that decision on your behalf. Again, this is a matter of abstraction, removing complexity, and dealing with complexity through abstraction and automation, and this is... That would be an example of fixing something with automation, self-healing. >> When you meet with some of the public cloud providers and they talk about on-prem private cloud, the general narrative from the hyperscalers is, "Well, that's not a cloud." Should on-prem be inclusive of supercloud, metacloud? >> Absolutely, I mean, and they're selling private cloud instances with the edge cloud that they're selling. The reality is that we're going to have to keep a certain amount of our infrastructure, including private clouds, on premise. 
It's something that's shrinking as a market share, and it's going to be tougher and tougher to justify as the public cloud providers become better and better at what they do, but we certainly have edge clouds now, and hyperscalers have examples of that where they run an instance of their public cloud infrastructure on premise on physical hardware and software. And the reality is, too, we have data centers and we have systems that just won't go away for another 20 or 30 years. They're just too sticky. They're uneconomically viable to move into the cloud. That's the core thing. It's not that we can't do it. The fact of the matter is we shouldn't do it, because there's not going to be an economic... There's not going to be an economic incentive of making that happen. So if we're going to create this meta layer or this infrastructure which is going to run across clouds, and everybody agrees on, that's what the supercloud is, we have to include the on-premise systems, including private clouds, including legacy systems. And by the way, include the rising number of IoT systems that are out there, and edge-based systems out there. So we're managing it using the same infrastructure into cloud services. So they have metadata systems and they have specialized services, and service finance and retail and things like doing risk analytics. So it gets them further down that path, but not necessarily giving them a SaaS application where they're forced into all of the business processes. We're giving you piece parts. So we'll give you 1000 different parts that are related to the finance industry. You can assemble anything you need, but the thing is, it's not going to be like building it from scratch. We're going to give you risk analytics, we're giving you the financial analytics, all these things that you can leverage within your applications how you want to leverage them. We'll maintain them. So in other words, you don't have to maintain 'em just like a cloud service. 
And suddenly, we can build applications in a couple of weeks that used to take a couple of months, in some cases, a couple of years. So that seems to be a large take of it moving forward. So get it up in the supercloud. Those become just other services that are under managed... That are under management on the supercloud, the metacloud. So we're able to take those services, abstract them, assemble them, use them in different applications. And the ability to manage where those services are originated versus where they're consumed is going to be managed by the supercloud layer, which, you're dealing with the governance, the service governance, the security systems, the directory systems, identity access management, things like that. They're going to get you further along down the pike, and that comes back as real value. If I'm able to build something in two weeks that used to take me two months, and I'm able to give my creators in the organization the ability to move faster, that's a real advantage. And suddenly, we are going to be valued by our digital footprint, our ability to do things in a creative and innovative way. And so organizations are able to move that fast, leveraging cloud computing for what it should be leveraged, as a true force multiplier for the business. They're going to win the game. They're going to get the most value. They're going to be around in 20 years, the others won't. >> David Linthicum, always love talking. You have a dangerous combination of business and technology expertise. Let's tease. "VMware Explore" next week, you're giving a keynote, if they're going to be there. Which day are you? >> Tuesday. Tuesday, 11 o'clock. >> All right, that's a big day. Tuesday, 11 o'clock. And David, please do stop by "The Cube." We're in Moscone West. Love to get you on and continue this conversation. I got 100 more questions for you. Really appreciate your time. >> I always love talking to people at "The Cube." Thank you very much. 
>> All right, and thanks for watching our ongoing coverage of "Supercloud22" on "The Cube," your leader in enterprise tech and emerging tech coverage. (bright music)

Published Date : Aug 24 2022

Domenic Ravita, SingleStore | AWS Summit New York 2022

(digital music) >> And we're back live in New York. It's theCUBE. It's not SNL, it's better than SNL. Lisa Martin and John Furrier here with about 10,000 to 12,000 folks. (John chuckles) There is a ton of energy here. There's a ton of interest in what's going on. But one of the things that we know that AWS is really well-known for is its massive ecosystem. And one of its ecosystem partners is joining us. Please welcome Domenic Ravita, the VP of Product Marketing from SingleStore. Domenic, great to have you on the program. >> Well, thank you. Glad to be here. >> It's a nice opening, wasn't it? (Lisa and John laughing) >> I love SNL. Who doesn't? >> Right? I know. So some big news came out today. >> Yes. >> Funding. Good number. Talk to us a little bit about that before we dig in to SingleStore and what you guys are doing with AWS. >> Right, yeah. Thank you. We announced this morning our latest round, 116 million. We're really grateful to our customers and our investors and the partners and employees and making SingleStore a success to go on this journey of, really, to fulfill our mission to unify and simplify modern, real time data. >> So talk to us about SingleStore. Give us the value prop, the key differentiators, 'cause obviously customers have choice. Help us understand where you're nailing it. >> SingleStore is all about, what we like to say, the moments that matter. When you have an analytical question about what's happening in the moment, SingleStore is your best way to solve that cost-effectively. So that is for, in the case of Thorn, where they're helping to protect and save children from online trafficking or in the case of True Digital, which early in the pandemic, was a company in Southeast Asia that used anonymized phone pings to identify real time population density changes and movements across Thailand to have a proactive response. So really real time data in the moment can help to save lives quite literally. 
But also it does things that are just good commercially that gives you an advantage like what we do with Uber to help real time pricing and things like this. >> It's interesting this data intensity happening right now. We were talking earlier on theCUBE with another guest and we said, "Why is it happening now?" The big data has been around since the Hadoop days. That was hard to work with, then data lakes kicked in. But we seem to be, in the past year, everyone's now aware like, "Wow, I got a lot of data." Is it the pandemic? Now we're seeing customers understand the consequences. So how do you look at that? Because is it just timing, evolution? Are they now getting it or is the technology better? Is machine learning better? What's the forces driving the massive data growth acceleration in terms of implementing and getting stuff out, done? (chuckles) >> We think it's the confluence of a lot of those things you mentioned there. First of all, we just celebrated the 15-year anniversary of the iPhone, so that is like wallpaper now. It's just faded into our daily lives. We don't even think of that as a separate thing. So there's an expectation that we all have instant information and not just for the consumer interactions, for the business interactions. That permeates everything. I think COVID with the pandemic forced everyone, every business to try to move to digital first and so that put pressure on the digital service economy to mature even faster and to be digital first. That is what drives what we call data intensity. And more generally, the economic phenomenon is the data intensive era. It's a continuous competition and game for customers. In every moment in every location, in every dimension, the more data that you have, the better value prop you can give. And so SingleStore is uniquely positioned to and focused on solving this problem of data intensity by bringing and unifying data together. >> What's the big customer success story? 
Can you share any examples that highlight that? What are some cool things that are happening that can illustrate this new, I won't say bit that's been flipped, that's been happening for a while, but can you share some cutting edge customer successes? >> It's happening across a lot of industries. So I would say first in financial services, FinTech. FinTech is always at the leading edge of these kind of technology adoptions for speeds and things like that. So we have a customer named IEX Cloud and they're focused on providing real time financial data as an API. So it's a data product, API-first. They're providing a lot of historical information on instruments and that sort of thing, as well as real time trending information. So they have customers like Seeking Alpha, for instance, who are providing real time updates on massive, massive data sets. They looked at lots of different ways to do this and there's the traditional, transactional OLTP database and then maybe if you want to scale an API like theirs, you might have a separate in-memory cache and then yet another database for analytics. And so we bring all that together and simplify that and the benefit of simplification, but it's also this unification and lower latency. Another example is GE who basically uses us to bring together lots of financial information to provide quicker close to the end-of-month process across many different systems. >> So we think about special purpose databases, you mentioned one of the customers having those. We were in the keynote this morning where AWS is like, "We have the broadest set of special purpose databases," but you're saying the industry can't afford them anymore. Why, and what makes SingleStore unique in terms of what you deliver? >> It goes back to this data intensity, in that the new business models that are coming out now are all about giving you this instant context and that's all data-driven and it's digital and it's also analytical. 
And so the reason that you can't afford to do this, otherwise, is data's getting so big. Moving that data gets expensive, 'cause in the cloud you pay for every byte you store, every byte you process, every byte you move. So data movement is a cost in dollars and cents. It's a cost in time. It's also a cost in skill sets. So when you have many different specialized data sets or database technologies, you need skilled people to manage those. So that's why we think the industry needs to be simplified and then that's why you're seeing this unification trend across the database industry and other parts of the stack happening. With AWS, I mean, they've been a great partner of ours for years since we launched our first cloud database product and their perspective is a little bit different. They're offering choice of the specialty, 'cause many people build this way. But if you're going after real time data, you need to bring it. They also offer SingleStore as a service on AWS. We offer it that way. It's in the AWS Marketplace. So it's easily consumable that way. >> Access to real time data is no longer a nice-to-have for any company, it's table stakes. We saw that especially in the last 20 months or so with companies that needed to pivot so quickly. What is it about SingleStore that delivers, that you talked about moments that matter? Talk about the access to real time data. How that's a differentiator as well? >> I think businesses need to be where their customers are and in the moments their customers are interacting. So that is the real time business-driver. As far as technology wise, it's not easy to do this. And you think about what makes a database fast? A major way of what makes it fast is how you store the data. And so since 2014, when we first released this, what Gartner called at the time, hybrid transaction/analytical processing or HTAP, where we brought transactional data and analytical data together. 
Fast forward five years to 2019, we released this innovation called Universal Storage, which does that in a single unified table type. Why that matters is because, I would say, basically cost efficiency and better speed. Again, because you pay for the storage and you pay for the movement. If you're not duplicating that data, moving it across different stores, you're going to have a better experience. >> One of the things you guys pioneered is unifying workloads. You mentioned some of the things you've done. Others are now doing it. Snowflake, Google and others. What does that mean for you guys? I mean, 'cause are they copying you? Are they trying to meet the functionality? >> I think. >> I mean, unification. I mean, people want to just store things and make it, get all the table stakes, check boxes, compliance, security and just keep coding and keep building. >> We think it's actually great 'cause they're validating what we've been seeing in the market for years. And obviously, they see that it's needed by customers. And so we welcome them to the party in terms of bringing these unified workloads together. >> Is it easy or hard? >> It's a difficult thing. We started this in 2014. And we now have lots of production workloads on this. So we know where all the production edge cases are and that capability is also a building block towards a broader, expansive set of capabilities that we've moved onto that next phase and tomorrow actually we have an event called, The Real Time Data Revolution, excuse me, where we're announcing what's in that new product of ours. >> Is that a physical event or virtual? >> It's a virtual event. >> So we'll get the URL on the show notes, or if you know, just go to the new site. >> Absolutely. SingleStore Real Time Data Revolution, you'll find it. >> Can you tease us with the top three takeaways from Revolution tomorrow? >> So like I said, what makes a database fast? 
It's the storage and we completed that functionality three years ago with Universal Storage. What we're now doing for this next phase of the evolution is making enterprise features available and Workspaces is one of the foundational capabilities there. What SingleStore Workspaces does is it allows you to have this isolation of compute between your different workloads. So that's often a concern to new users to SingleStore. How can I combine transactions and analytics together? That seems like something that might be not a good thing. Well, there are multiple ways we've been doing that with resource governance, workload management. Workspaces offers another management capability and it's also flexible in that you can scale those workloads independently, or if you have a multi-tenant application, you can segment your application, your customer tenant workloads by each workspace. Another capability we're releasing is called Wasm, which is W-A-S-M, WebAssembly. This is something that's really growing in the open source community and SingleStore's contributing to that open source CNCF project with WASI and Wasm. Where it's been mentioned mostly in the last few years has been in the browser as a more efficient way to run code in the browser. We're adapting that technology to allow you to run any language of your choice in the database and why that's important, again, it's for data movement. As data gets large in petabyte sizes, you can't move it in and out of Pandas in Python. >> Great innovation. That's real valuable. >> So we call this Code Engine with Wasm and- >> What do you call it? >> Code Engine Powered by Wasm. >> Wow. Wow. And that's open source? >> We contribute to the Wasm open source community. >> But you guys have a service that you- >> Yes. It's our implementation and our database. But Wasm allows you to have code that's portable, so any sort of runtime, which is... At release- >> You move the code, not the data. >> Exactly. >> With the compute. 
(chuckles) >> That's right, bring the compute to the data is what we say. >> You mentioned a whole bunch of great customer examples, GE, Uber, Thorn, you talked about IEX Cloud. When you're in customer conversations, are you dealing mostly with customers that are looking to you to help replace an existing database that was struggling from a performance perspective? Or are you working with startups who are looking to build a product on SingleStore? Is it both? >> It is a mix of both. I would say among SaaS scale up companies, their API, for instance, is their product or their SaaS application is their product. So quite literally, we're the data engine and the database powering their scale to be able to sign that next big customer or to at least sleep at night to know that it's not going to crash if they sign that next big customer. So in those cases, we're mainly replacing a lot of databases like MySQL, Postgres, where they're typically starting, but more and more we're finding, it's free to start with SingleStore. You can run it in production for free. And in our developer community, we see a lot of customers running in that way. We have a really interesting community member who has a Minecraft server analytics that he's building based on that SingleStore free tier. In the enterprise, it's different, because there are many incumbent databases there. So it typically is a case where there is a, maybe a new product offering, they're maybe delivering a FinTech API or a new SaaS digital offering, again, to better participate in this digital service economy and they're looking for a better price performance for that real time experience in the app. That's typically the starting point, but there are replacements of traditional incumbent databases as well. >> How has the customer conversation evolved the last couple of years? 
As we talked about, one of the things we learned in the pandemic was access to real time data and those moments that matter isn't a nice-to-have anymore for businesses. There was that forced march to digital. We saw the survivors, we're seeing the thrivers, but want to get your perspective on that. From the customers, how has the conversation evolved or elevated, escalated within an organization as every company has to be a data company? >> It really depends on their business strategy, how they are adapting or how they have adapted to this new digital first orientation and what does that mean for them in the direct interaction with their customers and partners. Often, what it means is they realize that they need to take advantage of using more data in the customer and partner interaction and when they come to those new ideas for new product introductions, they find that it's complicated and expensive to build in the old way. And if you're going to have these real time interactions, interactive applications, APIs, with all this context, you're going to have to find a better, more cost-effective approach to get that to market faster, but also not to have a big sprawling database technology infrastructure. We find that in those situations, we're replacing four or five different database technologies. A specialized database for key value, a specialized database for search- >> Because there's no unification before? Is that one of the reasons? >> I think it's an awareness thing. I think technology awareness takes a little bit of time, that there's a new way to do things. I think the old saying about, "Don't pave cow paths when the car..." You could build a straight road and pave it. You don't have to pave along the cow path. I think that's the natural course of technology adoption and so as more- >> And the pandemic, too, highlighted a lot of the things, like, "Do we really need that?" (chuckles) "Who's going to service that?" >> That's right. 
>> So it's an awakening moment there where it's like, "Hey, let's look at what's working." >> That's right. >> Double down on it. >> Absolutely. >> What are you excited about new round of funding? We talked about, obviously, probably investments in key growth areas, but what excites you about being part of SingleStore and being a partner of AWS? >> SingleStore is super exciting. I've been in this industry a long time as an engineer and an engineering leader. At the time, we were MemSQL, came into SingleStore. And just that unification and simplification, the systems that I had built as a system engineer and helped architect did the job. They could get the speed and scale you needed to do track and trace kinds of use cases in real time, but it was a big trade off you had to make in terms of the complexity, the skill sets you needed and the cost and just hard to maintain. What excites me most about SingleStore is that it really feels like the iPhone moment for databases because it's not something you asked for, but once your friend has it and shows it to you, why would you have three different devices in your pocket with a flip phone, a calculator? (Lisa and Domenic chuckle) Remember these days? >> Yes. >> And a Blackberry pager. (all chuckling) You just suddenly- >> Or a computer. That's in there. >> That's right. So you just suddenly started using iPhone and that is sort of the moment. It feels like we're at it in the database market where there's a growing awareness and those announcements you mentioned show that others are seeing the same. >> And your point earlier about the iPhone throwing off a lot of data. So now you have data explosions at levels that are unprecedented, we've never seen before and the fact that you want to have that iPhone moment, too, as a database. >> Absolutely. >> Great stuff. >> The other part of your question, what excites us about AWS. AWS has been a great partner since the beginning. 
I mean, when we first released our database, it was the cloud database. It was on AWS by customer demand. That's where our customers were. That's where they were building other applications. And now we have integrations with other native services like AWS Glue and we're in the Marketplace. We've expanded, that said we are a multi-cloud system. We are available in any cloud of your choice and on premise and in hybrid. So we're multi-cloud, hybrid and SaaS distribution. >> Got it. All right. >> Got it. So the event is tomorrow, Revolution. Where can folks go to register? What time does it start? >> 1:00 PM Eastern and- >> 1:00 PM. Eastern. >> Just Google SingleStore Real Time Data Revolution and you'll find it. Love for everyone to join us. >> All right. We look forward to it. Domenic, thank you so much for joining us, talking about SingleStore, the value prop, the differentiators, the validation that's happening in the market and what you guys are doing with AWS. We appreciate it. >> Thanks so much for having me. >> Our pleasure. For Domenic Ravita and John Furrier, I'm Lisa Martin. You're watching theCUBE, live from New York at AWS Summit 22. John and I are going to be back after a short break, so come back. (digital pulsing music)

Published Date : Jul 14 2022

Breaking Analysis: Snowflake Summit 2022...All About Apps & Monetization

>> From theCUBE studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is "Breaking Analysis" with Dave Vellante. >> Snowflake Summit 2022 underscored that the ecosystem excitement which was once forming around Hadoop is being reborn, escalated and coalescing around Snowflake's data cloud. What was once seen as a simpler cloud data warehouse and good marketing with the data cloud is evolving rapidly with new workloads, a vertical industry focus, data applications, monetization, and more. The question is, will the promise of data be fulfilled this time around, or is it same wine, new bottle? Hello, and welcome to this week's Wikibon CUBE Insights powered by ETR. In this "Breaking Analysis," we'll talk about the event, the announcements that Snowflake made that are of greatest interest, the major themes of the show, what was hype and what was real, the competition, and some concerns that remain in many parts of the ecosystem and pockets of customers. First let's look at the overall event. It was held at Caesars Forum. Not my favorite venue, but I'll tell you it was packed. Fire marshal full, as we sometimes say. Nearly 10,000 people attended the event. Here's Snowflake's CMO Denise Persson on theCUBE describing how this event has evolved. >> Yeah, two, three years ago, we were about 1800 people at a Hilton in San Francisco. We had about 40 partners attending. This week we're close to 10,000 attendees here. Almost 10,000 people online as well, and over 200 partners here on the show floor. >> Now, those numbers from 2019 remind me of the early days of Hadoop World, which was put on by Cloudera but then Cloudera handed off the event to O'Reilly as this article that we've inserted, if you bring back that slide would say. The headline almost got it right. Hadoop World was a failure, but it didn't have to be.
Snowflake has filled the void created by O'Reilly when it first killed Hadoop World, and killed the name and then killed Strata. Now, ironically, the momentum and excitement from Hadoop's early days, it probably could have stayed with Cloudera but the beginning of the end was when they gave the conference over to O'Reilly. We can't imagine Frank Slootman handing the keys to the kingdom to a third party. Serious business was done at this event. I'm talking substantive deals. Salespeople from a host sponsor and the ecosystems that support these events, they love physical. They really don't like virtual because physical belly to belly means relationship building, pipeline, and deals. And that was blatantly obvious at this show. And in fairness, that was true at all theCUBE events that we've done this year, but this one was more vibrant because of its attendance and the action in the ecosystem. Ecosystem is a hallmark of a cloud company, and that's what Snowflake is. We asked Frank Slootman on theCUBE, was this ecosystem evolution by design or did Snowflake just kind of stumble into it? Here's what he said. >> Well, when you are a data cloud and you have data, people want to do things with that data. They don't want to just run data operations, populate dashboards, run reports. Pretty soon they want to build applications and after they build applications, they want to build businesses on it. So it goes on and on and on. So it drives your development to enable more and more functionality on that data cloud. Didn't start out that way, you know, we were very, very much focused on data operations. Then it becomes application development and then it becomes, hey, we're developing whole businesses on this platform. So similar to what happened to Facebook in many ways. >> So it sounds like it was maybe a little bit of both.
The Facebook analogy is interesting because Facebook is a walled garden, as is Snowflake, but when you come into that garden, you have assurances that things are going to work in a very specific way because a set of standards and protocols is being enforced by a steward, i.e. Snowflake. This means things run better inside of Snowflake than if you try to do all the integration yourself. Now, maybe over time, an open source version of that will come out but if you wait for that, you're going to be left behind. That said, Snowflake has made moves to make its platform more accommodating to open source tooling in many of its announcements this week. Now, I'm not going to do a deep dive on the announcements. Matt Sulkins from Monte Carlo wrote a decent summary of the keynotes and a number of analysts like Sanjeev Mohan, Tony Baer and others are posting some deeper analysis on these innovations, and so we'll point to those. I'll say a few things though. Unistore extends the type of data that can live in the Snowflake data cloud. It's enabled by a new feature called hybrid tables, a new table type in Snowflake. One of the big knocks against Snowflake was it couldn't handle transaction data. Several database companies are creating this notion of a hybrid where both analytic and transactional workloads can live in the same data store. Oracle's doing this for example, with MySQL HeatWave and there are many others. We saw Mongo earlier this month add an analytics capability to its transaction system. Mongo also added SQL, which was kind of interesting. Here's what Constellation Research analyst Doug Henschen said about Snowflake's moves into transaction data. Play the clip. >> Well with Unistore, they're reaching out and trying to bring transactional data in.
Hey, don't limit this to analytical information and there's other ways to do that like CDC and streaming but they're very closely tying that again to that marketplace, with the idea of bring your data over here and you can monetize it. Don't just leave it in that transactional database. So another reach to a broader play across a big community that they're building. >> And you're also seeing Snowflake expand its workload types in its unique way and through Snowpark and its Streamlit acquisition, enabling Python so that native apps can be built in the data cloud and benefit from all that structure and the features that Snowflake has built in. Hence that Facebook analogy, or maybe the App Store, the Apple App Store as I propose as well. Python support also widens the aperture for machine intelligence workloads. We asked Snowflake senior VP of product, Christian Kleinerman, which announcements he thought were the most impactful. And despite the who's your favorite child nature of the question, he did answer. Here's what he said. >> I think the native applications is the one that looks like, eh, I don't know about it on the surface but it has the biggest potential to change everything. That's to create an entire ecosystem of solutions for within a company or across companies that I don't know that we know what's possible. >> Snowflake also announced support for Apache Iceberg, which is a new open table format standard that's emerging. So you're seeing Snowflake respond to these concerns about its lack of openness, and they're building optionality into their cloud.
They also showed some cost optimization tools both from Snowflake itself and from the ecosystem, notably Capital One which launched a software business on top of Snowflake focused on optimizing cost and eventually the rollout of data management capabilities, and all kinds of features that Snowflake announced at the show around governance, cross cloud, what we call super cloud, a new security workload, and they reemphasized their ability to read non-native on-prem data into Snowflake through partnerships with Dell and Pure and a lot more. Let's hear from some of the analysts that came on theCUBE this week at Snowflake Summit to see what they said about the announcements and their takeaways from the event. This is Dave Menninger, Sanjeev Mohan, and Tony Baer, roll the clip. >> Our research shows that the majority of organizations, the majority of people do not have access to analytics. And so a couple of the things they've announced I think address those or help to address those issues very directly. So Snowpark and support for Python and other languages is a way for organizations to embed analytics into different business processes. And so I think that'll be really beneficial to try and get analytics into more people's hands. And I also think that the native applications as part of the marketplace is another way to get applications into people's hands rather than just analytical tools. Because most people in the organization are not analysts. They're doing some line of business function. They're HR managers, they're marketing people, they're sales people, they're finance people, right? They're not sitting there mucking around in the data, they're doing a job and they need analytics in that job. >> Primarily, I think it is to counteract this whole notion that once you move data into Snowflake, it's a proprietary format.
So I think that's how it started but it's hugely beneficial to the customers, to the users because now if you have a large amount of data in Parquet files you can leave it on S3, but then, using the Apache Iceberg table format in Snowflake, you get all the benefits of Snowflake's optimizer. So for example, you get the micro partitioning, you get the metadata. And in a single query, you can join, you can do select from a Snowflake table union and select from an Iceberg table and you can do stored procedures, user-defined functions. So I think what they've done is extremely interesting. Iceberg by itself still does not have multi-table transactional capabilities. So if I'm running a workload, I might be touching 10 different tables. So if I use Apache Iceberg in a raw format, they don't have it, but Snowflake does. So the way I see it is Snowflake is adding more and more capabilities right into the database. So for example, they've gone ahead and added security and privacy. So you can now create policies and do even cell level masking, dynamic masking, but most organizations have more than Snowflake. So what we are starting to see all around here is that there's a whole series of data catalog companies, a bunch of companies that are doing dynamic data masking, security and governance, data observability which is not a space Snowflake has gone into. So there's a whole ecosystem of companies that is mushrooming. Although, you know, so they're using the native capabilities of Snowflake but they are at a level higher. So if you have a data lake and a cloud data warehouse and you have other like relational databases, you can run these cross platform capabilities in that layer. So that way, you know, Snowflake's done a great job of enabling that ecosystem. >> I think it's like the last mile, essentially.
In other words, it's like, okay, you have folks that are basically that are very comfortable with Tableau but you do have developers who don't want to have to shell out to a separate tool. And so this is where Snowflake is essentially working to address that constituency. To Sanjeev's point, and I think part of it, this kind of plays into it is what makes this different from the Hadoop era is the fact that all these capabilities, you know, a lot of vendors are taking it very seriously to put this native. Now, obviously Snowflake acquired Streamlit. So we can expect that the Streamlit capabilities are going to be native. >> I want to share a little bit about the higher level thinking at Snowflake, here's a chart from Frank Slootman's keynote. It's his version of the modern data stack, if you will. Now, Snowflake of course, was built on the public cloud. If there were no AWS, there would be no Snowflake. Now, they're all about bringing data and live data and expanding the types of data, including structured, we just heard about that, unstructured, geospatial, and the list is going to continue on and on. Eventually I think it's going to bleed into the edge if we can figure out what to do with that edge data. Executing on new workloads is a big deal. They started with data sharing and they recently added security and they've essentially created a PaaS layer. We call it a SuperPaaS layer, if you will, to attract application developers. Snowflake has a developer-focused event coming up in November and they've extended the marketplace with 1300 native apps listings. And at the top, that's the holy grail, monetization. We always talk about building data products and we saw a lot of that at this event, very, very impressive and unique. Now here's the thing. There's a lot of talk in the press, in the Wall Street and the broader community about consumption-based pricing and concerns over Snowflake's visibility and its forecast and how analytics may be discretionary. 
But if you're a company building apps in Snowflake and monetizing like Capital One intends to do, and you're now selling in the marketplace, that is not discretionary, unless of course your costs are greater than your revenue for that service, in which case it's going to fail anyway. But the point is we're entering a new era where data apps and data products are beginning to be built and Snowflake is attempting to make the data cloud the de facto place where you're going to build them. In our view they're well ahead in that journey. Okay, let's talk about some of the bigger themes that we heard at the event. Bringing apps to the data instead of moving the data to the apps, this was a constant refrain and one that certainly makes sense from a physics point of view. But having a single source of data that is discoverable, sharable and governed with increasingly robust ecosystem options, it doesn't have to be moved. Sometimes it may have to be moved if you're going across regions, but that's unique and a differentiator for Snowflake in our view. I mean, I'm yet to see a data ecosystem that is as rich and growing as fast as the Snowflake ecosystem. Monetization, we talked about that, industry clouds, financial services, healthcare, retail, and media, all front and center at the event. My understanding is that Frank Slootman was a major force behind this shift, this development and go to market focus on verticals. It's really an attempt, and he talked about this in his keynote to align with the customer mission ultimately align with their objectives which not surprisingly, are increasingly monetizing with data as a differentiating ingredient. We heard a ton about data mesh, there were numerous presentations about the topic.
And I'll say this, if you map the seven pillars Snowflake talks about, Benoit Dageville talked about this in his keynote, but if you map those into Zhamak Dehghani's data mesh framework and the four principles, they align better than most of the data mesh washing that I've seen. The seven pillars, all data, all workloads, global architecture, self-managed, programmable, marketplace and governance. Those are the seven pillars that he talked about in his keynote. All data, well, maybe with hybrid tables that becomes more of a reality. Global architecture means the data is globally distributed. It's not necessarily physically in one place. Self-managed is key. Self-service infrastructure is one of Zhamak's four principles. And then inherent governance. Zhamak talks about computational, what I'll call automated governance, built in. And with all the talk about monetization, that aligns with the second principle which is data as product. So while it's not a pure hit and to its credit, by the way, Snowflake doesn't use data mesh in its messaging anymore. But by the way, its customers do, several customers talked about it. Geico, JPMC, and a number of other customers and partners are using the term and using it pretty closely to the concepts put forth by Zhamak Dehghani. But back to the point, they essentially, Snowflake that is, is building a proprietary system that substantially addresses some, if not many of the goals of data mesh. Okay, back to the list, supercloud, that's our term. We saw lots of examples of clouds on top of clouds that are architected to span multiple clouds, not just run on individual clouds as separate services. And this includes Snowflake's data cloud itself but a number of ecosystem partners that are headed in a very similar direction. Snowflake still talks about data sharing but now it uses the term collaboration in its high level messaging, which is I think smart. Data sharing is kind of a geeky term.
And also this is an attempt by Snowflake to differentiate from everyone else that's saying, hey, we do data sharing too. And finally Snowflake doesn't say data marketplace anymore. It's now marketplace, accounting for its application market. Okay, let's take a quick look at the competitive landscape via this ETR X-Y graph. The vertical axis, remember, is net score or spending momentum and the x-axis is penetration, pervasiveness in the data center. That's what ETR calls overlap. Snowflake continues to lead on the vertical axis. They guided conservatively last quarter, remember, so I wouldn't be surprised if that lofty height, even though it's well down from its earlier levels but I wouldn't be surprised if it ticks down again a bit in the July survey, which will be in the field shortly. Databricks is a key competitor obviously, with strong spending momentum, as you can see. We didn't draw it here but we usually draw that 40% line or red line at 40%, anything above that is considered elevated. So you can see Databricks is quite elevated. But it doesn't have the market presence of Snowflake. It didn't get to IPO during the bubble and it doesn't have nearly as deep and capable go-to market machinery. Now, they're getting better and they're getting some attention in the market, nonetheless. But as a private company, you just naturally, more people are aware of Snowflake. Some analysts, Tony Baer in particular, believe Mongo and Snowflake are on a bit of a collision course long term. I actually can see his point. You know, I mean, they're both platforms, they're both about data. It's long ways off, but you can see them sort of in a similar path. They talk about kind of similar aspirations and visions even though they're in quite different markets today but they're definitely participating in similar TAM. The cloud players are probably the biggest or definitely the biggest partners and probably the biggest competitors to Snowflake. And then there's always Oracle.
Doesn't have the spending velocity of the others but it's got strong market presence. It owns a cloud and it knows a thing about data and it definitely is a go-to market machine. Okay, we're going to end on some of the things that we heard in the ecosystem. 'Cause look, we've heard before how particular technologies, enterprise data warehouse, data hubs, MDM, data lakes, Hadoop, et cetera, were going to solve all of our data problems and of course they didn't. And in fact, sometimes they create more problems that allow vendors to push more incremental technology to solve the problems that they created. Like tools and platforms to clean up the no-schema-on-write nature of data lakes or data swamps. But here are some of the things that I heard firsthand from some customers and partners. First thing is, they said to me that they're having a hard time keeping up sometimes with the pace of Snowflake. It reminds me of AWS in 2014, 2015 timeframe. You remember that fire hose of announcements which causes increased complexity for customers and partners. I talked to several customers that said, well, yeah this is all well and good but I still need skilled people to understand all these tools that I'm integrating in the ecosystem, the catalogs, the machine learning observability. A number of customers said, I just can't use one governance tool, I need multiple governance tools and a lot of other technologies as well, and they're concerned that that's going to drive up their cost and their complexity. I heard other concerns from the ecosystem that it used to be sort of clear as to where they could add value you know, when Snowflake was just a better data warehouse. But to point number one, they're either concerned that they'll be left behind or they're concerned that they'll be subsumed. Look, I mean, just like we tell AWS customers and partners, you got to move fast, you got to keep innovating. If you don't, you're going to be left.
Either, if you're a customer, you're going to be left behind by your competitor, or if you're a partner, somebody else is going to get there or AWS is going to solve the problem for you. Okay, and there were a number of skeptical practitioners, really thoughtful and experienced data pros that suggested that they've seen this movie before. That's hence the same wine, new bottle. Well, this time around I certainly hope not given all the energy and investment that is going into this ecosystem. And the fact is Snowflake is unquestionably making it easier to put data to work. They built on AWS so you didn't have to worry about provisioning compute and storage and networking and scaling. Snowflake is optimizing its platform to take advantage of things like Graviton so you don't have to, and they're doing some of their own optimization tools. The ecosystem is building optimization tools so that's all good. And the firm belief is the less expensive it is, the more data will get brought into the data cloud. And they're building a data platform on which their ecosystem can build and run data applications, aka data products without having to worry about all the hard work that needs to get done to make data discoverable, shareable, and governed. And unlike the last 10 years, you don't have to be a zookeeper and integrate all the animals in the Hadoop zoo. Okay, that's it for today, thanks for watching. Thanks to my colleague, Stephanie Chan who helps research "Breaking Analysis" topics. Sometimes Alex Myerson is on production and manages the podcasts. Kristin Martin and Cheryl Knight help get the word out on social and in our newsletters, and Rob Hof is our editor in chief over at SiliconANGLE, and Hailey does some wonderful editing, thanks to all. Remember, all these episodes are available as podcasts wherever you listen. All you got to do is search Breaking Analysis Podcasts.
I publish each week on wikibon.com and siliconangle.com and you can email me at David.Vellante@siliconangle.com or DM me @DVellante. If you got something interesting, I'll respond. If you don't, I'm sorry I won't. Or comment on my LinkedIn post. Please check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching, and we'll see you next time. (upbeat music)

Published Date : Jun 18 2022

Data Power Panel V3

(upbeat music) >> The stampede to cloud and massive VC investments has led to the emergence of a new generation of object store based data lakes. And with them two important trends, actually three important trends. First, a new category that combines data lakes and data warehouses aka the lakehouse has emerged as a leading contender to be the data platform of the future. And this novelty touts the ability to address data engineering, data science, and data warehouse workloads on a single shared data platform. The other major trend we've seen is query engines and broader data fabric virtualization platforms have embraced NextGen data lakes as platforms for SQL centric business intelligence workloads, reducing, or some would even claim eliminating the need for separate data warehouses. Pretty bold. However, cloud data warehouses have added complementary technologies to bridge the gaps with lakehouses. And the third is many, if not most customers are embracing the so-called data fabric or data mesh architectures. They're looking at data lakes as a fundamental component of their strategies, and they're trying to evolve them to be more capable, hence the interest in lakehouse, but at the same time, they don't want to, or can't abandon their data warehouse estate. As such we see a battle royale brewing between cloud data warehouses and cloud lakehouses. Is it possible to do it all with one cloud-centric analytical data platform? Well, we're going to find out. My name is Dave Vellante and welcome to the data platforms power panel on theCUBE. Our next episode in a series where we gather some of the industry's top analysts to talk about one of our favorite topics, data. In today's session, we'll discuss trends, emerging options, and the trade offs of various approaches and we'll name names. Joining us today are Sanjeev Mohan, who's the principal at SanjMo, Tony Baer, principal at dbInsight.
And Doug Henschen is the vice president and principal analyst at Constellation Research. Guys, welcome back to theCUBE. Great to see you again. >> Thanks, guys. Thank you. >> Thank you. >> So it's early June and we're gearing up with two major conferences, there's several database conferences, but two in particular that we're very interested in, Snowflake Summit and Databricks Data and AI Summit. Doug let's start off with you and then Tony and Sanjeev, if you could kindly weigh in. Where did this all start, Doug? The notion of lakehouse. And let's talk about what exactly we mean by lakehouse. Go ahead. >> Yeah, well you nailed it in your intro. One platform to address BI data science, data engineering, fewer platforms, less cost, less complexity, very compelling. You can credit Databricks for coining the term lakehouse back in 2020, but it's really a much older idea. You can go back to Cloudera introducing their Impala database in 2012. That was a database on top of Hadoop. And indeed in that last decade, by the middle of that last decade, there were several SQL on Hadoop products, open standards like Apache Drill. And at the same time, the database vendors were trying to respond to this interest in machine learning and the data science. So they were adding SQL extensions, the likes of Hudi and Vertica were adding SQL extensions to support the data science. But then later in that decade with the shift to cloud and object storage, you saw the vendor shift to this whole cloud, and object storage idea. So you have in the database camp Snowflake introduce Snowpark to try to address the data science needs. They introduced that in 2020 and last year they announced support for Python. You also had Oracle, SAP jumped on this lakehouse idea last year, supporting both the lake and warehouse single vendor, not necessarily quite single platform. Google very recently also jumped on the bandwagon.
And then you also mentioned, the SQL engine camp, the Dremios, the Ahanas, the Starbursts, really doing two things, a fabric for distributed access to many data sources, but also very firmly planting that idea that you can just have the lake and we'll help you do the BI workloads on that. And then of course, the data lake camp with the Databricks and Clouderas providing a warehouse style deployments on top of their lake platforms. >> Okay, thanks, Doug. I'd be remiss those of you who know me know that I typically write my own intros. This time my colleagues fed me a lot of that material. So thank you. You guys make it easy. But Tony, give us your thoughts on this intro. >> Right. Well, I very much agree with both of you, which may not make for the most exciting television in terms of that it has been an evolution just like Doug said. I mean, for instance, just to give an example when Teradata bought Aster Data it was initially seen as a hardware platform play. In the end, it was basically, it was all those Aster functions that made a lot of sort of big data analytics accessible to SQL. (clears throat) And so what I really see just in a more simpler definition or functional definition, the data lakehouse is really an attempt by the data lake folks to make the data lake friendlier territory to the SQL folks, and also to get into friendly territory, to all the data stewards, who are basically concerned about the sprawl and the lack of control in governance in the data lake. So it's really kind of a continuing of an ongoing trend that being said, there's no action without counter action. And of course, at the other end of the spectrum, we also see a lot of the data warehouses starting to add things like in-database machine learning. So they're certainly not surrendering without a fight.
Again, as Doug was mentioning, this has been part of a continual blending of platforms that we've seen over the years, that we first saw in the Hadoop years with SQL on Hadoop and data warehouses starting to reach out to cloud storage, or I should say HDFS, and then with the cloud, going cloud native and therefore trying to break the silos down even further. >> Now, thank you. And Sanjeev, data lakes, when we first heard about them, they were such a compelling name, and then we realized all the problems associated with them. So pick it up from there. What would you add to Doug and Tony? >> I would say, these are excellent points that Doug and Tony have brought to light. The concept of lakehouse was going on, to your point, Dave, a long time ago, long before the term was invented. For example, Uber was trying to do a mix of Hadoop and Vertica because what they really needed were transactional capabilities that Hadoop did not have. So they weren't calling it the lakehouse, they were using multiple technologies, but now they're able to collapse it into a single data store that we call lakehouse. Data lakes are excellent at batch processing large volumes of data, but they don't have the real time capabilities such as change data capture, doing inserts and updates. So this is why lakehouses have become so important, because they give us these transactional capabilities. >> Great. So I'm interested, the name is great, lakehouse. The concept is powerful, but I get concerned that there's a lot of marketing hype behind it. So I want to examine that a bit deeper. How mature is the concept of lakehouse? Are there practical examples that really exist in the real world that are driving business results for practitioners? Tony, maybe you could kick that off. >> Well, put it this way. I think what's interesting is that both data lakes and data warehouses each had to extend themselves. 
If you believe the Databricks hype, it's that this was just a natural extension of the data lake. In point of fact, Databricks had to go outside its core technology of Spark to make the lakehouse possible. And it's a very similar type of thing on the part of the data warehouse folks, in terms of that they've had to go beyond SQL. In the case of Databricks, there have been a number of incremental improvements to Delta Lake, to basically make the table format more performant, for instance. But the other thing, I think the most dramatic change in all that is in their SQL engine, and they had to essentially pretty much abandon Spark SQL because really, in and of itself, Spark SQL is essentially a stopgap solution. And if they wanted to really address that crowd, they had to totally reinvent SQL, or at least their SQL engine. And so Databricks SQL is not Spark SQL, it is not Spark, it's basically SQL that's adapted to run in a Spark environment, but the underlying engine is C++, it's not Scala or anything like that. So Databricks had to take a major detour outside of its core platform to do this. So to answer your question, this is not mature, because even though the idea of blending platforms has been going on for well over a decade, I would say that the current iteration is still fairly immature. And in the cloud, I could see a further evolution of this, because if you think through cloud native architecture where you're essentially abstracting compute from data, there is no reason why, if let's say you are dealing with the same data targets, say cloud object storage, you might not apportion the task to different compute engines. 
And so therefore you could have, for instance, let's say you're Google, you could have BigQuery perform basically the types of analytics, the SQL analytics that would be associated with the data warehouse, and you could have BigQuery ML that does some in-database machine learning, but at the same time for another part of the query, which might involve, let's say, some deep learning, just for example, you might go out to, let's say, the serverless Spark service or Dataproc. And there's no reason why Google could not blend all those into a coherent offering that's basically all triggered through microservices. And I just gave Google as an example, but you could generalize that with all the other cloud or all the other third party vendors. So I think we're still very early in the game in terms of maturity of data lakehouses. >> Thanks, Tony. So Sanjeev, is this all hype? What are your thoughts? >> It's not hype, but completely agree. It's not mature yet. Lakehouses have still a lot of work to do, so what I'm now starting to see is that the world is dividing into two camps. On one hand, there are people who don't want to deal with the operational aspects of vast amounts of data. They are the ones who are going for BigQuery, Redshift, Snowflake, Synapse, and so on, because they want the platform to handle all the data modeling, access control, performance enhancements, but there is a trade-off. If you go with these platforms, then you are giving up on vendor neutrality. On the other side are those who have engineering skills. They want the independence. In other words, they don't want vendor lock-in. They want to transform their data into any number of use cases, especially data science, machine learning use cases. What they want is agility via open file formats using any compute engine. So why do I say lakehouses are not mature? Well, cloud data warehouses, they provide you an excellent user experience. That is the main reason why Snowflake took off. 
If you have thousands of tables, it takes minutes to get them started, uploaded into your warehouse and start experimentation. Table formats are far more resonating with the community than file formats. But once the cost of the cloud data warehouse goes up, then organizations start exploring lakehouses. But the problem is lakehouses still need to do a lot of work on metadata. Apache Hive was a fantastic first attempt at it. Even today Apache Hive is still very strong, but it's all technical metadata and it has so many different restrictions. That's why we see Databricks is investing into something called Unity Catalog. Hopefully we'll hear more about Unity Catalog at the end of the month. But there's a second problem I just want to mention, and that is lack of standards. All these open source vendors, they're running what I call ego projects. You see on LinkedIn, they're constantly battling with each other, but the end user doesn't care. The end user wants a problem to be solved. They want to use Trino, Dremio, Spark from EMR, Databricks, Ahana, Dask, Flink, Athena. But the problem is that we don't have common standards. >> Right. Thanks. So Doug, I worry sometimes. I mean, I look at the space, we've debated for years, best of breed versus the full suite. You see AWS with whatever, 12-plus different data stores and different APIs and primitives. You got Oracle putting everything into its database. It's actually done some interesting things with MySQL HeatWave, so maybe there's proof points there, but Snowflake is really good at data warehouse, simplifying data warehouse. Databricks, really good at making lakehouses actually more functional. Can one platform do it all? >> Well, in a word, you can't be best of breed at all things. I think the upshot of that cogent analysis from Sanjeev there: the vendors coming out of the database tradition, they excel at the SQL. 
They're extending it into data science, but when it comes to unstructured data, data science, ML, AI, it's often a compromise. The data lake crowd, the Databricks and such, they've struggled to completely displace the data warehouse. When it really gets to the tough SLAs, they acknowledge that there's still a role for the warehouse. Maybe you can size down the warehouse and offload some of the BI workloads, and maybe some of these SQL engines are good for ad hoc, to minimize data movement. But really when you get to the deep service-level requirements, the high concurrency, the high query workloads, you end up creating something that's warehouse-like. >> Where do you guys think this market is headed? What's going to take hold? Which projects are going to fade away? You got some things in Apache projects like Hudi and Iceberg, where do they fit, Sanjeev? Do you have any thoughts on that? >> So thank you, Dave. So I feel that table formats are starting to mature. There is a lot of work that's being done. We will not have a single product or single platform. We'll have a mixture. So I see a lot of Apache Iceberg in the news. Apache Iceberg is really innovating. Their focus is on a table format, but then Delta and Apache Hudi are doing a lot of deep engineering work. For example, how do you handle high concurrency when there are multiple writes going on? Do you version your Parquet files, or how do you do your upserts, basically? So different focus. At the end of the day, the end user will decide what is the right platform, but we are going to have multiple formats living with us for a long time. >> Doug, is Iceberg, in your view, something that's going to address some of those gaps in standards that Sanjeev was talking about earlier? >> Yeah, Delta Lake, Hudi, Iceberg, they all address this need for consistency and scalability. Delta Lake is open technically, but open for access. I don't hear about Delta Lake in any worlds but Databricks. I'm hearing a lot of buzz about Apache Iceberg. 
End users want an open performance standard. And most recently Google embraced Iceberg for its recent BigLake, their stab at supporting both lakes and warehouses on one conjoined platform. >> And Tony, of course, you remember the early days of the sort of big data movement. You had MapR, which was the most closed. You had Hortonworks, the most open. You had Cloudera in between. There was always this kind of contest as to who's the most open. Does that matter? Are we going to see a repeat of that here? >> I think it's spheres of influence, I think, and Doug very much was kind of referring to this. I would call it kind of like the MongoDB syndrome, which is that you have... and I'm talking about MongoDB before they changed their license, an open source project, but very much associated with MongoDB, which basically pretty much controlled most of the contributions and made the decisions. And I think Databricks has the same ironclad hold on Delta Lake, but still the market has pretty much associated Delta Lake as the Databricks open source project. I mean, Iceberg is probably further advanced than Hudi in terms of mind share. And so what I see that breaking down to is essentially the Databricks open source versus the everything else open source, the community open source. So I see it's a very similar type of breakdown that I see repeating itself here. >> So by the way, Mongo has a conference next week, another data platform that is kind of not really relevant to this discussion totally. But in a sense it is, because there's a lot of discussion on earnings calls these last couple of weeks about consumption and who's exposed. Obviously people are concerned about Snowflake's consumption model. Mongo is maybe less exposed because Atlas is prominent in the portfolio, blah, blah, blah. But I wanted to bring up the little bit of controversy that we saw come out of the Snowflake earnings call, where the Evercore analyst asked Frank Slootman about discretionary spend. 
And Frank basically said, look, we're not discretionary. We are deeply operationalized. Whereas he kind of poo-pooed the lakehouse or the data lake, et cetera, saying, oh yeah, data scientists will pull files out and play with them. That's really not our business. Do any of you have comments on that? Help us swing through that controversy. Who wants to take that one? >> Let's put it this way. The SQL folks are from Venus and the data scientists are from Mars. So it really comes down to sort of that type of perception. The fact is that, traditionally with analytics, it was very SQL oriented and basically the quants were kind of off in their corner, where they're using SAS or where they're using Teradata. There's really a great leveler today, which is that, I mean, basic Python has become arguably one of the most popular programming languages, depending on what month you're looking at, at the TIOBE index. And of course, obviously SQL is, as I tell the MongoDB folks, SQL is not going away. You have a large skills base out there. And so basically I see this breaking down to essentially, you're going to have each group that's going to have its own natural preferences for its home turf. And the fact that, let's say, the Python and Scala folks are using Databricks does not make them any less operational or mission critical than the SQL folks. >> Anybody else want to chime in on that one? >> Yeah, I totally agree with that. Python support in Snowflake is very nascent. With all of Snowpark, all of the things outside of SQL, they're very much relying on partners to make things possible and make data science possible. And it's very early days. I think the bottom line, what we're going to see is each of these camps is going to keep working on doing better at the thing that they don't do today, or they're new to, but they're not going to nail it. They're not going to be best of breed on both sides. 
So the SQL centric companies and shops are going to do more data science on their database centric platform. The data science driven companies might be doing more BI on their lakes with those vendors, and the companies that have highly distributed data, they're going to add fabrics, and maybe offload more of their BI onto those engines, like Dremio and Starburst. >> So I've asked you this before, but I'll ask you, Sanjeev. 'Cause Snowflake and Databricks are such great examples, 'cause you have the data engineering crowd trying to go into data warehousing and you have the data warehousing guys trying to go into the lake territory. Snowflake has $5 billion on the balance sheet and I've asked you before, I ask you again, doesn't there have to be a semantic layer between these two worlds? Does Snowflake go out and do M&A and maybe buy AtScale or a Datameer? Or is that just sort of a band-aid? What are your thoughts on that, Sanjeev? >> I think the semantic layer is the metadata. The business metadata is extremely important. At the end of the day, the business folks, they'd rather go to the business metadata than have to figure out, for example, like let's say, I want to update somebody's email address and we have a lot of overhead with data residency laws and all that. I want my platform to give me the business metadata so I can write my business logic without having to worry about which database, which location. So having that semantic layer is extremely important. In fact, now we are taking it to the next level. Now we are saying that it's not just a semantic layer, it's all my KPIs, all my calculations. So how can I make those calculations independent of the compute engine, independent of the BI tool, and make them fungible? So more disaggregation of the stack, but it gives us more best of breed products that the customers have to worry about. >> So I want to ask you about the stack, the modern data stack, if you will. 
And we always talk about injecting machine intelligence, AI into applications, making them more data driven. But when you look at the application development stack, it's separate, the database tends to be separate from the data and analytics stack. Do those two worlds have to come together in the modern data world? And what does that look like organizationally? >> So organizationally, even technically, I think it is starting to happen. Microservices architecture was a first attempt to bring the application and the data world together, but they are fundamentally different things. For example, if an application crashes, that's horrible, but Kubernetes will self heal and it'll bring the application back up. But if a database crashes and corrupts your data, we have a huge problem. So that's why they have traditionally been two different stacks. They are starting to come together, especially with data ops, for instance, versioning of the way we write business logic. It used to be that business logic was highly embedded into our database of choice, but now we are disaggregating that using GitHub, CI/CD, the whole DevOps tool chain. So data is catching up to the way applications are. >> We also have databases, the trans-analytical databases, and that's a little bit of what the story is with MongoDB next week, with adding more analytical capabilities. But I think companies that talk about that are always careful to couch it as operational analytics, not the warehouse level workloads. So we're making progress, but I think there's always going to be, or there will long be, a separate analytical data platform. >> Until data mesh takes over. (all laughing) Not opening a can of worms. >> Well, but wait, I know it's out of scope here, but wouldn't data mesh say, hey, do take your best of breed, to Doug's earlier point. 
You can't be best of breed at everything. Wouldn't data mesh advocate, data lakes, do your data lake thing, data warehouse, do your data warehouse thing, then you're just a node on the mesh. (Tony laughs) Now you need separate data stores and you need separate teams. >> To my point. >> I think, I mean, put it this way. (laughs) Data mesh itself is a logical view of the world. The data mesh is not necessarily on the lake or on the warehouse. I think for me, the fear there is more in terms of the silos of governance that could happen and the siloed views of the world, how we redefine. And that's why I want to go back to something Sanjeev said, which is that it's going to be raising the importance of the semantic layer. Now that opens a couple of Pandora's boxes here, which is one, does Snowflake dare go into that space, or do they risk alienating their partner ecosystem, which is a key part of their whole appeal, which is best of breed? They're kind of in the same situation that Informatica was in in the early 2000s, when Informatica briefly flirted with analytic applications and realized that was not a good idea, and that they needed to double down on their core, which was data integration. The other thing, though, that it raises the importance of, and this is where the best of breed comes in, is the data fabric. My contention is that whether you employ data mesh practice or not, if you do employ data mesh, you need data fabric. If you deploy data fabric, you don't necessarily need to practice data mesh. 
But data fabric at its core, and admittedly it's a category that's still very poorly defined and evolving, but at its core, we're talking about a common metadata backplane, something that we used to talk about with master data management. This would be something that would be more, what I would say, mutable, that would be more evolving, basically using, let's say, machine learning so that we don't have to predefine rules or predefine what the world looks like. But so I think in the long run, what this really means is that whichever way we implement, on whichever physical platform we implement, we need to all be speaking the same metadata language. And I think at the end of the day, regardless of whether it's a lake, warehouse or a lakehouse, we need common metadata. >> Doug, can I come back to something you pointed out? Those talking about bringing analytic and transaction databases together, you had talked about operationalizing those and the caution there. Educate me on MySQL HeatWave. I was surprised when Oracle put so much effort in that, and you may or may not be familiar with it, but a lot of folks have talked about that. Now it's got nowhere in the market, no market share, but we've seen a lot of these benchmarks from Oracle. How real is that, bringing together those two worlds and eliminating ETL? >> Yeah, I have to defer on that one. That's my colleague, Holger Mueller. He wrote the report on that. He's way deep on it and I'm not going to mock him. >> I wonder if that is something, how real that is or if it's just Oracle marketing, anybody have any thoughts on that? >> I'm pretty familiar with HeatWave. It's essentially Oracle doing what, I mean, there's kind of a parallel with what Google's doing with AlloyDB. It's an operational database that will have some embedded analytics. And it's also something which I expect to start seeing with MongoDB. 
And I think basically, Doug and Sanjeev were kind of referring to this before, about the operational analytics that are basically embedded within an operational database. The idea here is that the last thing you want to do with an operational database is slow it down. So you're not going to be doing very complex deep learning or anything like that, but you might be doing things like classification, you might be doing some predictives. In other words, we've just concluded a transaction with this customer, but was it less than what we were expecting? What does that mean in terms of, is this customer likely to churn? I think we're going to be seeing a lot of that. And I think that's a lot of what MySQL HeatWave is all about. As for whether Oracle has any presence in the market now, it's still a pretty new announcement, but the other thing that kind of goes against Oracle, (laughs) that they have to battle against, is that even though they own MySQL and run the open source project, in terms of the actual commercial implementation it's associated with everybody else. And the popular perception has been that MySQL has been basically kind of like a sidelight for Oracle. And so it's on Oracle's shoulders to prove that they're damn serious about it. >> There's no coincidence that MariaDB was launched the day that Oracle acquired Sun. Sanjeev, I wonder if we could come back to a topic that we discussed earlier, which is this notion of consumption. Obviously Wall Street's very concerned about it. Snowflake dropped prices last week. I've always felt like, hey, the consumption model is the right model. I can dial it down when I need to. Of course, the street freaks out. What are your thoughts on just pricing, the consumption model? What's the right model for companies, for customers? >> Consumption model is here to stay. 
What I would like to see, and I think is an ideal situation and actually plays into the lakehouse concept, is that I have my data in some open format, maybe it's Parquet or CSV or JSON, Avro, and I can bring whatever engine is the best engine for my workloads, bring it on, pay for consumption, and then shut it down. And by the way, that could be Cloudera. We don't talk about Cloudera very much, but it could be one business unit wants to use Athena. Another business unit wants to use some other, Trino let's say, or Dremio. So every business unit is working on the same data set, see, that's critical, but that data set is maybe in their VPC and they bring any compute engine, you pay for the use, shut it down. Then you're getting value and you're only paying for consumption. It's not like, I left a cluster running by mistake, so there have to be guardrails. The reason FinOps is so big is because it's very easy for me to run a Cartesian join in the cloud and get a $10,000 bill. >> This looks like it's been sort of a victim of its own success in some ways. They made it so easy to spin up single node instances, multi node instances. And back in the day when compute was scarce and costly, those database engines optimized every last bit so they could get as much workload as possible out of every instance. Today, it's really easy to spin up a new node, a new multi node cluster. So that freedom has meant many more nodes that aren't necessarily getting that utilization. So Snowflake has been doing a lot to add reporting, monitoring, dashboards around the utilization of all the nodes and multi node instances that have spun up. And meanwhile, we're seeing some of the traditional on-prem databases that are moving into the cloud, trying to offer that freedom. And I think they're going to have that same discovery, that the cost surprises are going to follow as they make it easy to spin up new instances. 
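To put rough numbers on the Cartesian join point above, here is a minimal Python sketch of the kind of pre-flight guardrail a FinOps practice might apply. Everything in it is an assumption chosen for illustration: the function names, the row counts, the 100 bytes per output row, and the flat $5 per terabyte price are hypothetical and do not reflect any vendor's actual pricing or API.

```python
def cross_join_rows(left_rows: int, right_rows: int) -> int:
    """A Cartesian (cross) join emits every pairing of a left row with a right row."""
    return left_rows * right_rows


def estimated_cost_usd(rows: int, bytes_per_row: int, usd_per_tb: float) -> float:
    """Rough cost of producing `rows` rows at a flat, hypothetical per-terabyte price."""
    terabytes = rows * bytes_per_row / 1e12  # decimal terabytes
    return terabytes * usd_per_tb


def within_budget(left_rows: int, right_rows: int, bytes_per_row: int,
                  usd_per_tb: float, budget_usd: float) -> bool:
    """Guardrail: True only if the estimated join cost fits inside the budget."""
    rows = cross_join_rows(left_rows, right_rows)
    return estimated_cost_usd(rows, bytes_per_row, usd_per_tb) <= budget_usd


# Hypothetical example: a 2-million-row table crossed with a 10-million-row table.
rows = cross_join_rows(2_000_000, 10_000_000)
cost = estimated_cost_usd(rows, bytes_per_row=100, usd_per_tb=5.0)
print(rows, cost)
print(within_budget(2_000_000, 10_000_000, 100, 5.0, budget_usd=1_000.0))
```

Under these made-up inputs, two modest tables multiply out to 20 trillion rows, which is why a missing join condition can turn into a bill of the size mentioned above, and why a check like `within_budget` would typically run before the query is submitted rather than after.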
>> Yeah, a lot of money went into this market over the last decade, separating compute from storage, moving to the cloud. I'm glad you mentioned Cloudera, Sanjeev, 'cause they got it all started, the kind of big data movement. We don't talk about them that much. Sometimes I wonder if it's because when they merged Hortonworks and Cloudera, they dead ended both platforms, but then they did invest in a more modern platform. But what's the future of Cloudera? What are you seeing out there? >> Cloudera has a good product. I have to say the problem in our space is that there're way too many companies, there's way too much noise. We are expecting the end users to parse it out or we're expecting analyst firms to boil it down. So I think marketing becomes a big problem. As far as technology is concerned, I think Cloudera did turn themselves around, and Tony, I know you talk to them quite frequently. I think they have had quite a comprehensive offering for a long time, actually. They've created Kudu, so they got operational, they have Hadoop, they have an operational data warehouse, they've migrated to the cloud. They are in hybrid multi-cloud environments. A lot of cloud data warehouses are not hybrid. They're only in the cloud. >> Right. I think what Cloudera has done most successfully has been in the transition to the cloud and the fact that they're giving their customers more on-ramps to it, more hybrid on-ramps. So I give them a lot of credit there. They also have been trying to position themselves as being the most price friendly, in terms of that we will put more guardrails and governors on it. I mean, part of that could be spin. But on the other hand, they don't have the same vested interest in compute cycles as, say, AWS would have with EMR. That being said, yes, Cloudera does it, I think its most powerful appeal, sort of, it almost sounds in a way, I don't want to cast them as a legacy system. 
But the fact is they do have a huge landed legacy on-prem and still significant potential to land and expand that to the cloud. That being said, even though Cloudera is multifunction, I think it certainly has its strengths and weaknesses. And the fact is that yes, Cloudera has an operational database or an operational data store that's kind of like the outgrowth of HBase, but Cloudera is still primarily known for the deep analytics. Nobody's going to buy Cloudera or Cloudera Data Platform strictly for the operational database. They may use it as an add-on, just in the same way that a lot of customers have used, let's say, Teradata basically to do some machine learning or, let's say, Snowflake to parse through JSON. Again, it's not an indictment or anything like that, but the fact is obviously they do have their strengths and their weaknesses. I think their greatest opportunity is with their existing base because that base has a lot invested and vested. And the fact is they do have a hybrid path that a lot of the others lack. >> And of course being on the quarterly shot clock was not a good place to be, under the microscope, for Cloudera, and now they at least can refactor the business accordingly. I'm glad you mentioned hybrid too. We saw Snowflake last month did a deal with Dell whereby non-native Snowflake data could access on-prem object store from Dell. They announced a similar thing with Pure Storage. What do you guys make of that? Is that just... How significant will that be? Will customers actually do that? I think they're using either materialized views or external tables. >> There are data rated and residency requirements. There are desires to have these platforms in your own data center. 
And finally they capitulated. I mean, Frank Slootman is famous for saying to be very focused, and earlier, not many months ago, they called going on-prem a distraction, but clearly there's enough demand, and certainly government contracts, any company that has data residency requirements, it's a real need. So they finally addressed it. >> Yeah, I'll bet dollars to donuts, there was an EBC session and some big customer said, if you don't do this, we ain't doing business with you. And that was like, okay, we'll do it. >> So Dave, I have to say, earlier on you had brought up this point, how Frank Slootman was poo-pooing data science workloads. On your show, about a year or so ago, he said, we are never going to go on-prem. He burnt that bridge. (Tony laughs) That was on your show. >> I remember exactly the statement because it was interesting. He said, we're never going to do the halfway house. And I think what he meant is we're not going to bring the Snowflake architecture to run on-prem because it defeats the elasticity of the cloud. So this was kind of a capitulation in a way. But I think it still preserves his original intent, sort of, I don't know. >> The point here is that every vendor will poo-poo whatever they don't have until they do have it. >> Yes. >> And then it'd be like, oh, we are all in, we've always been doing this. We have always supported this and now we are doing it better than others. >> Look, it was the same type of shock wave that we felt basically when AWS, at the last moment at one of their re:Invents, said, oh, by the way, we're going to introduce Outposts. And the analyst group is typically pre-briefed about a week or two ahead under NDA and that was not part of it. And when they dropped, they just casually dropped that in the analyst session. It's like, you could have heard the sound of lots of analysts changing their diapers at that point. >> (laughs) I remember that. 
And props to Andy Jassy, who once, many times actually, told us, never say never when it comes to AWS. So guys, I know we got to run. We got some hard stops. Maybe you could each give us your final thoughts, Doug start us off and then-- >> Sure. Well, we've got the Snowflake Summit coming up. I'll be looking for customers that are really doing data science, that are really employing Python through Snowflake, through Snowpark. And then a couple weeks later, we've got Databricks with their Data and AI Summit in San Francisco. I'll be looking for customers that are really doing considerable BI workloads. Last year I did a market overview of this analytical data platform space, 14 vendors, eight of them claim to support lakehouse, both sides of the camp. Databricks, their top customer that they could cite was unnamed. It had 32 concurrent users doing 15,000 queries per hour. That's good but it's not up to the most demanding BI SQL workloads. And they acknowledged that and said they need to keep working on that. Snowflake, asked for their biggest data science customer, cited Kabura, 400 terabytes, 8,500 users, 400,000 data engineering jobs per day. I took the data engineering jobs to be probably SQL centric, ETL style transformation work. So I want to see the real use of Python, how much Snowpark has grown as a way to support data science. >> Great. Tony. >> Actually of all things. And certainly, I'll also be looking for similar things to what Doug is saying, but I think sort of like, kind of out of left field, I'm interested to see what MongoDB is going to start to say about operational analytics, 'cause I mean, they're into this conquer the world strategy. We can be all things to all people. Okay, if that's the case, what's going to be the case with basically putting in some inline analytics, what are you going to be doing with your query engine? So that's actually kind of an interesting thing we're looking for next week. >> Great. Sanjeev. 
>> So I'll be at MongoDB World, Snowflake and Databricks and very interested in seeing, but since Tony brought up MongoDB, I see that even the databases are shifting tremendously. They are addressing both the HTAP use case, online transactional and analytical. I'm also seeing that these databases started in, let's say in case of MySQL HeatWave, as relational or in MongoDB as document, but now they've added graph, they've added time series, they've added geospatial and they just keep adding more and more data structures and really making these databases multifunctional. So very interesting. >> It gets back to our discussion of best of breed, versus all in one. And it's likely Mongo's path or part of their strategy of course, is through developers. They're very developer focused. So we'll be looking for that. And guys, I'll be there as well. I'm hoping that we maybe have some extra time on theCUBE, so please stop by and we can maybe chat a little bit. Guys as always, fantastic. Thank you so much, Doug, Tony, Sanjeev, and let's do this again. >> It's been a pleasure. >> All right and thank you for watching. This is Dave Vellante for theCUBE and the excellent analysts. We'll see you next time. (upbeat music)

Published Date : Jun 2 2022


Nick Van Wiggeren, PlanetScale | Kubecon + Cloudnativecon Europe 2022

>> Narrator: theCUBE presents KubeCon and CloudNativeCon Europe 2022, brought to you by Red Hat, the Cloud Native Computing Foundation and its ecosystem partners. >> Welcome to Valencia, Spain, KubeCon, CloudNativeCon Europe 2022. I'm Keith Townsend, your host. And we're continuing the conversations around ecosystem cloud native, 7,500 people here, 170-plus show floor sponsors. It is, for open source conferences, I think, the destination. I might even premise that this may be, this may eventually roll to the biggest tech conference in the industry, maybe outside of AWS re:Invent. My next guest is Nick van Wiggeren. >> Wiggeren. >> VP engineering of PlanetScale. Nick, I'm going to start off the conversation right off the bat: PlanetScale, cloud native database, why do we need another database? >> Well, why don't you need another database? I mean, are you happy with yours? Is anyone happy with theirs? >> That's a good question. I don't think anyone is quite happy with, I don't know, I've never seen an excited database user, except for guys with really (murmurs) guys with great beards. >> Yeah. >> Keith: Or guys with gray hair maybe. >> Yeah. Outside of the dungeon I think... >> Keith: Right. >> No one really is happy with their database, and that's what we're here to change. We're not just building the database, we're actually building the whole kind of start to finish experience, so that people can get more done. >> So what do you mean by getting more done? Because MySQL has been the underpinnings of like massive cloud database deployments. >> 100% >> It has been the de-facto standard. >> Nick: Yep. >> For cloud databases. >> Nick: Yep. >> What is PlanetScale doing in enabling us to do that I can't do with something like a MySQL or a SQL Server? >> Great question. So we are MySQL compatible. So under the hood it's a lot of the MySQL you know and love. But on top of that we've layered workflows, we've layered scalability, we've layered serverless. 
So that you can get all of the parts of the MySQL, that dependability, the thing that people have used for 20, 30 years, right? People don't even know a world before MySQL. But then you also get this ability to make schema changes faster. So you can kind of do your work quicker get to the business objectives faster. You can scale farther. So when you get to your MySQL and you say, well, can we handle adding this one feature on top? Can we handle the user growth we've got? You don't have to worry about that either. So it's kind of the best of both worlds. We've got one foot in history and we've got one foot in the new kind of cloud native database world. We want to give everyone the best of both. >> So when I think of serverless because that's the buzzy word. >> Yeah. >> But when I think of serverless I think about developers being able to write code. >> Yep. >> Deploy the code, not worry about VM sizes. >> Yep. >> Amount of disk space. >> Yep. >> CPU, et cetera. But we're talking about databases. >> Yep. >> I got to describe what type of disk I want to use. I got to describe the performance levels. >> Yep. >> I got all the descriptive stuff that I have to do about infrastructures. Databases are not... >> Yep. >> Keith: Serverless. >> Yep. >> They're the furthest thing from it. >> So despite what the name may say, I can guarantee you PlanetScale, your PlanetScale database does run on at least one server, usually more than one. But the idea is exactly what you said. So especially when you're starting off, when you're first beginning your, let's say database journey. That's a word I use a lot. The furthest thing from your mind is, how many CPUs do I need? How many disk I/Os do I need? How much memory do I need? What we want you to be able to do is get started on focusing on shipping your code, right? The same way that Lambda, the same way that Kubernetes, and all of these other cloud native technologies just help people get done what they want to get done. 
PlanetScale is the same way, you want a database, you sign up, you click two buttons, you've got a database. We'll handle scaling the disk as you grow, we'll handle giving you more resources. And when you get to a spot where you're really starting to think about, my database has got hundreds of gigabytes or petabytes, terabytes, that's when we'll start to talk to you a little bit more about, hey, you know it really does run on a server, we've got to help you with the capacity planning, but there's no reason people should have to do that up front. I mean, that stinks. When you want to use a database you want to use a database. You don't want to use, 747 with 27 different knobs. You just want to get going. >> So, also when I think of serverless and cloud native, I think of stateless. >> Yep. >> Now there's state with databases, help me reconcile like, when you say it's cloud native. >> Nick: Yep. >> How is it cloud native when I think of cloud native as stateless? >> Yeah. So it's cloud native because it exists where you want it in the cloud, right? No matter where you've deployed your application on your own cloud, on a public cloud, or something like that, our job is to meet you and match the same level of velocity and the same level of change that you've got on your kind of cloud native setup. So there's a lot of state, right? We are your state and that's a big responsibility. And so what we want to do is, we want to let you experiment with the rest of the stateless workloads, and be right there next to you so that you can kind of get done what you need to get done. >> All right. So this concept of clicking two buttons... >> Nick: Yeah. >> And deploying, it's a database. >> Nick: Yep. >> It has to run somewhere. So let's say that I'm in AWS. >> Nick: Yep. >> And I have AWS VPC. What does it look like from a developer's perspective to consume the service? >> Yeah. So we've got a couple of different offerings, and AWS is a great example. 
So at the very kind of the most basic database unit you click, you get an endpoint, a host name, a password, and the username. You feed that right into your application and it's TLS secure and stuff like that, goes right into the database no problem. As you grow larger and larger, we can use things like AWS PrivateLink and stuff like that, to actually start to integrate more with your AWS environment, all the way over to what we call PlanetScale Managed. Which is where we actually deploy your data plane in your AWS account. So you give us some permissions and we kind of create a sub-account and stuff like that. And we can actually start sending pods, and whole clusters and stuff like that into your AWS account, give you a PrivateLink, so that everything looks like it's kind of wrapped up in your ownership but you still get the same kind of PlanetScale cloud experience, cloud native experience. >> So how do I make calls to the database? I mean, do I have to install a new... >> Nick: Great question. >> Like agent, or do some weird SQL configuration on my end? Or like what's the experience? >> Nope, we just need MySQL. Same way you'd go install MySQL if you're on a Mac, or apt install MySQL on a Linux PC, you just username, password, database name, and stuff like that, you feed that into your app and it just works. >> All right. So databases are typically security. >> Nick: Yep. >> When my security person. >> Nick: Yep. >> Sees a new database. >> Nick: Yep. >> Oh, they get excited. They're like, oh my job... >> Nick: I bet they do. >> My job just got real easy. I can find like eight or nine different findings. >> Right. >> How do you help me with compliance? >> Yeah. >> And answering these tough security questions from security? >> Great question. So security's at the core of what we do, right? We've got security people ourselves. We do the same thing for all the new vendors that we onboard. So we invest a lot. 
For example, the only way you can connect to a PlanetScale database even if you're using PrivateLink, even if you're not touching the public internet at all, is over TLS secured endpoint, right? From the very first day, the very first beta that we had we knew not a single byte goes over the internet that's not encrypted. It's encrypted at rest, we have audit logging, we do a ton internally as well to make sure that, what's happening to your database is something you can find out. The favorite thing that I think though is all your schema changes are tracked on PlanetScale, because we provide an entire workflow for your schema changes. We actually have like a GitHub pull request style thing, your security folks can actually look and say, what changes were made to the database day in and day out. They can go back and there's a full history of that log. So you actually have, I think better security than a lot of other databases where you've got to build all these tools and stuff like that, it's all built into PlanetScale. >> So, we started out the conversation with two clicks but I'm a developer. >> Nick: Yeah. >> And I'm developing a service at scale. >> Yep. >> I want to have a SaaS offering. How do I automate the deployment of the database and the management of the database across multiple customers? >> Yeah, so everything is API driven. We've got an API that you can use to provision databases, to make schema changes, to make whatever changes you want to that database. We have an API that powers our website, the same API that customers can use to kind of automate any part of the workflow that they want. There's actually someone who did a talk earlier using, I think, www.crossplane.io, or they can use Kubernetes custom resource definitions to provision PlanetScale databases completely automatically. So you can even do it as part of your standard deployment workflow. Just create a PlanetScale database, create a password, inject it in your app, all automatically. 
>> So Nick, as I'm thinking about scale. >> Yep. >> I'm thinking about multiple customers. >> Nick: Yep. >> I have a successful product. >> Nick: Yep. >> And now these customers are coming to me with different requirements. One customer wants to upgrade once every quarter, another one, it's like, you know what? Just bring it on. Like bring the schema changes on. >> Yep. >> I want the latest features, et cetera. >> Nick: Right. >> How do I manage that with PlanetScale? When I'm thinking about MySQL it's a little, that can be a little difficult. >> Nick: Yeah. >> But how does PlanetScale help me solve that problem? >> Yeah. So, again I think it's that same workflow engine that we've built. So every database has its own kind of deploy queue, its own migration system. So you can automate all these processes and say, on this database, I want to change this schema this way, on this database I'm going to hold off. You can use our API to drive a view into like, well, what's the schema on this database? What's schema on this database? What version am I running on this database? And you can actually bring all that in. And if you were really successful you'd have this single pane of glass where you can see what's the status of all my databases and how are they doing, all powered by kind of the PlanetScale API. >> So we can't talk about databases without talking about backup. >> Nick: Yep. >> And recovery. >> Yep. >> How do I back this thing up and make sure that I can fall back? If someone deleted a table. >> Nick: Yep. >> It happens all the time in production. >> Nick: Yeah, 100%. >> How do I recover from it? >> So there's two pieces to this, and I'm going to talk about two different ways that we can help you solve this problem. One of them is, every PlanetScale database comes with backups built in and we test them fairly often, right? We use these backups. We actually give you a free daily backup on every database 'cause it's important to us as well. 
We want to be able to restore from backup, we want to be able to do failovers and stuff like that, all that is handled automatically. The other thing though is this feature that we launched in March called the PlanetScale Rewind. And what Rewind is, is actually a schema migration undo button. So let's say, you're a developer you're dropping a table or a column, you mean to drop this, but you drop the other one on accident, or you thought this column was unused but it wasn't. You know when you do something wrong, you cause an incident and you get that sick feeling in your stomach. >> Oh, I'm sorry. I've pulled a drive that was written not ready file and it was horrible. >> Exactly. And you kind of start to go, oh man, what am I going to do next? Everyone watching this right now is probably squirming in their seat a bit, you know the feeling. >> Yeah, I know the feeling >> Well, PlanetScale gives you an undo button. So you can click, undo migration, for 30 minutes after you do the migration and we'll revert your schema with all the data in it back to what your database looked like before you did that migration. Drop a column on accident, drop a table on accident, click the Rewind button, there's all the data there. And, the new writes that you've taken while that's happened are there as well. So it's not just a restore to a point in time backup. It's actually that we've replicated your writes, sent them to both the old and the new schema, and we can get you right back to where you started, downtime solved. >> Both: So. >> Nick: Go ahead. >> DBAs are DBAs, whether they've become now reformed DBAs that are cloud architects, but they're DBAs. So there's a couple of things that they're going to want to know, one, how do I get my zero back up in my hands? >> Yeah. >> I want my, it's MySQL data. >> Nick: Yeah. >> I want my MySQL backup. >> Yeah. So you can just take backups off the database yourself the same way that you're doing today, right? 
mysqldump, MySQL backup, and all those kinds of things. If you don't trust PlanetScale, and look, I'm all about backups, right? You want them in two different data centers on different mediums, you can just add on your own backup tools that you have right now and also use that. I'd like you to trust that PlanetScale has the backups as well. But if you want to keep doing that and run your own system, we're totally cool with that as well. In fact, I'd go as far as to say, I recommend it. You never have too many backups. >> So in a moment we're going to run Kube clock. So get your... >> Okay, all right. >> You know, stand tall. >> All right. >> I'll get ready. I'm going to... >> Nick: I'm tall, I'm tall. >> We're both tall. The last question before Kube clock. >> Nick: Yeah. >> It is, let's talk a little nerd knobs. >> Nick: Okay. >> The reformed DBA. >> Nick: Yeah. >> They want, they're like, oh, this query ran a little bit slow. I know I can squeeze a little bit more out of that. >> Nick: Yeah. >> Who do they talk to? >> Yeah. So that's a great question. So we provide you some insights on the product itself, right? So you can take a look and see how are my queries performing and stuff like that. Our goal, our job is to surface to you all the metrics that you need to make that decision. 'Cause at the end of the day, a reformed DBA or not it is still a skill to analyze the performance of a MySQL query, run an EXPLAIN, kind of figure all that out. We can't do all of that for you. So we want to give you the information you need either knowledge or you know, stuff to learn whatever it is because some of it does have to come back to, what's my schema? What's my query? And how can I optimize it? I'm missing an index and stuff like that. >> All right. So, you're early adopter of the Kube clock. >> Okay. >> I have to, people say they're ready. >> Nick: Ooh, okay. >> All the time people say they're ready. >> Nick: Woo. >> But I'm not quite sure that they're ready. 
>> Nick: Well, now I'm nervous. >> So are you ready? >> Do I have any other choice? >> No, you don't. >> Nick: Then I am. >> But are you ready? >> Sure, let's go. >> All right. Start the Kube clock. (upbeat music) >> Nick: All right, what do you want me to do? >> Go. >> All right. >> You said you were ready. >> I'm ready, all right, I'm ready. All right. >> Okay, I'll reset. I'll give you, I'll give, see people say they're ready. >> All right. You're right. You're right. >> Start the Kube clock, go. >> Okay. Are you happy with how your database works? Are you happy with the velocity? Are you happy with what your engineers and what your teams can do with their database? >> Follow the dream not the... Well, follow the green... >> You got to be. >> Not the dream. >> You got to be able to deliver. At the end of the day you got to deliver what the business wants. It's not about performance. >> You got to crawl before you go. You got to crawl, you got to crawl. >> It's not just about is my query fast, it's not just about is my query right, it's about, are my customers getting what they want? >> You're here, you deserve a seat at the table. >> And that's what PlanetScale provides, right? PlanetScale... >> Keith: Ten more seconds. >> PlanetScale is a tool for getting done what you need to get done as a business. That's what we're here for. Ultimately, we want to be the best database for developing software. >> Keith: Two, one. >> That's it. End it there. >> Nick, you took a shot, I'm buying it. Great job. You know, this is fun. Our jobs are complex. >> Yep. >> Databases are hard. >> Yep. >> It is the, where your organization keeps the most valuable assets that you have. >> Nick: A 100%. >> And we are having these tough conversations. >> Nick: Yep. >> Here in Valencia, you're talking to the leader in tech coverage. From Valencia, Spain, I'm Keith Townsend, and you're watching theCUBE, the leader in high tech coverage. (upbeat music)

Published Date : May 20 2022


Jim Walker, Cockroach Labs & Christian Hüning, finleap connect | Kubecon + Cloudnativecon EU 2022

>> (bright music) >> Narrator: theCUBE presents KubeCon and CloudNativeCon Europe 2022, brought to you by Red Hat, the Cloud Native Computing Foundation and its ecosystem partners. >> Now what we're opening. Welcome to Valencia, Spain in KubeCon CloudNativeCon Europe 2022. I'm Keith Townsend, along with my host, Paul Gillin, who is the senior editor for architecture at SiliconANGLE, Paul. >> Keith you've been asking me questions all these last two days. Let me ask you one. You're a traveling man. You go to a lot of conferences. What's different about this one. >> You know what, we're just talking about that pre-conference, open source conferences are usually pretty intimate. This is big. 7,500 people talking about complex topics, all in one big area. And then it's, I got to say it's overwhelming. It's way more. It's not focused on a single company's product or messaging. It is about a whole ecosystem, very different show. >> And certainly some of the best t-shirts I've ever seen. And our first guest, Jim has one of the better ones. >> I mean a big cockroach come on, right. >> Jim Walker, principal product evangelist at CockroachDB and Christian Hüning, tech director of cloud technologies at finleap connect, a financial services company that's based out of Germany, now offering services in four countries now. >> Basically all over Europe. >> Okay. >> But we are in three countries with offices. >> So you're a CockroachDB customer and I got to ask the obvious question. Databases are hard and started the company in 2015 CockroachDB, been a customer since 2019, I understand. Why take the risk on a four year old database. I mean that just sounds like a world of risk and trouble. >> So it was in 2018 when we joined the company back then and we did this cloud native transformation, that was our task basically. 
We had a very limited amount of time and we were faced with a legacy infrastructure and we needed something that would run in a cloud native way and just blend in with everything else we had. And the idea was to go all in with Kubernetes. Though early days, a lot of things were alpha beta, and we were running on MySQL back then. >> Yeah. >> On a VM, kind of small setup. And then we were looking for something that we could just deploy in Kubernetes, alongside with everything else. And we had the stack and we had to duplicate it many times. So also to maintain that we wanted to do it all the same like with GitOps and everything and Cockroach delivered that proposition. So that was why we evaluated the risk of relatively early adopting that solution with the proposition of having something that's truly cloud native and really blends in with everything else we do in the same way was something we considered, and then we jumped the leap of faith and >> The fin leap of faith >> The fin leap of faith. Exactly. And we were not dissatisfied. >> So talk to me a little bit about the challenges because when we think of MySQL, MySQL scales to amazing sizes, it is the de facto database for many cloud based architectures. What problems were you running into with MySQL? >> We were running into the problem that we essentially, as a fintech company, we are regulated and we have companies, customers that really value running things like on-prem, private cloud, on-prem is a bit of a bad word, maybe. So it's private cloud, hybrid cloud, private cloud in our own data centers in Frankfurt. And we needed to run it in there. So we wanted to somehow manage that and with, so all of the managed solutions were off the table, so we couldn't use them. So we needed something that ran in Kubernetes because we only wanted to maintain Kubernetes. We're a small team, didn't want to use also like full blown VM solution, of sorts. So that was that. 
And the other thing was, we needed something that was HA distributable somehow. So we also looked into other solutions back at the time, like Vitess, which is also prominent for having a MySQL compliant interface and great solution. We also got it to work, but we figured, this is from the scale, and from the sheer amount of maintenance it would need, we couldn't deliver that, we were too small for that. So that's where then Cockroach just fitted in nicely by being able to distribute, be HA, be resilient against failure, but also be able to scale out because we had this problem with a single MySQL deployment to not really, as it grew, as the data amounts grew, we had trouble to operatively keep that under control. >> So Jim, every time someone comes to me and says, I have a new database, I think we don't need it, yet another database. >> Right. >> What problem, or how does CockroachDB go about solving the types of problems that Christian had? >> Yeah. I mean, Christian laid out why it exists. I mean, look guys, building a database isn't easy. If it was easy, we'd have a database for every application, but you know, Michael Stonebraker, kind of godfather of all database says it himself, it takes seven, eight years for a database to fully gestate to be something that's like enterprise ready and kind of, be relied upon. We've been building for about seven, eight years. I mean, I'm thankful for people like Christian to join us early on to help us kind of like troubleshoot and go through some things. We're building a database, it's not easy. You're right. But building a distributed system is also not easy. And so for us, if you look at what's going on in just infrastructure in general, what's happening in Kubernetes, like this whole space is Kubernetes. It's all about automation. How do I automate scale? How do I automate resilience out of the entire equation of what we're actually doing? I don't want to have to think about active passive systems. 
I don't want to think about sharding a database. Sure you can scale MySQL. You know how many people it takes to run three or four shards of a MySQL database? That's not automation. And I tell you what, this world right now with the advances in data how hard it is to find people who actually understand infrastructure to hire them. This is why this automation is happening, because our systems are more complex. So we started from the very beginning to be something that was very different. This is a cloud native database. This is built with the same exact principles that are in Kubernetes. In fact, like Kubernetes it's kind of a spawn of Borg, the back end of Google. We are inspired by Spanner. I mean, this was started by three engineers that worked at Google, were frustrated they didn't have the tools they had at Google. So they built something that was, outside of Google. And how do we give that kind of Google like infrastructure for everybody. And that's, the advent of Cockroach and kind of why we're doing, what we're doing. >> As your database has matured, you're now beginning a transition or you're in a transition to a serverless version. How are you doing that without disrupting the experience for existing customers? And why go serverless at all? >> Yeah, it's interesting. So, you know, serverless was, it was kind of an R&D project for us. And when we first started on a path, because I think you know, ultimately what we would love to do for the database is let's not even think about database, Keith. Like, I don't want to think about the database. What we're building too is, we want a SQL API in the cloud. That's it. I don't want to think about scale. I don't want to think about upgrades. I literally like, that stuff should just go away. That's what we need, right. As developers, I don't want to think about isolation levels or like, you know, give me DML and I want to be able to communicate. 
And for us the realization of that vision is like, if we're going to put a database on the planet for everybody to actually use it, we have to be really, really efficient. And serverless, which I believe really should be infrastructure-less, because I don't think we should be thinking just about servers. We've got to think about, how do I take the context of regions out of this thing? How do I take the context of cloud providers out of what we're talking about? Let's just not think about that. Let's just code against something. Serverless was the answer. Now we've been building for about a year and a half. We launched a serverless version of Cockroach last October and we did it so that everybody in the public could have a free version of a database. And that's what serverless allows us to do. It's all consumption based up to certain limits and then you pay. But I think ultimately, and we spoke a little bit about this at the very beginning, I think for ISVs, people who are building software today, the serverless vision gets really interesting, because I think what's on the mind of the CTO is, how do I drive down my cost to the cloud provider? And if we can basically drive down costs through making things multi-tenant and super efficient, and then optimizing how much compute we use, spinning things down to zero and back up and auto scaling, these sorts of things in our software, we can start to make changes in the way that people are thinking about spend with the cloud provider. And ultimately we did that so we could do things for free. >> So, Jim, I think I disagree, Christian, I'm sorry, Jim. I think I disagree with you just a little bit. Christian, I think the biggest challenge facing CTOs is people. >> True. >> Getting the people to worry about cost and spend and implementation. So as you hear the concepts of CockroachDB moving to a serverless model, and you're a large customer, how does that make you think or react on the people side of your resources? 
>> Well, I can say that from the people side of resources, luckily Cockroach is our smallest problem. So we always said it's an operator's dream, because that was the part that just worked for us, so. >> And it's worked as you have scaled it? Without you having ... >> Yeah. I mean, we use it in a bit of a, we do not really scale out Cockroach, like, really large. It's more that we use it with the enterprise features of encryption in the stack, which our customers then demand. If they do so, we have the SaaS offering and we also do dedicated stacks. So by having a fully cloud native solution on top of Kubernetes as the foundational layer, we can just use that and stamp it out and deploy it. >> How does that translate into services you can provide your customers? Are there services you can provide customers that you couldn't have, if you were running, say, MySQL? >> No, what we do is, we run this, so the SaaS offering runs in our hybrid private cloud. And the other thing that we offer is that we run the entire stack at a cloud provider of their choosing. So if they are on AWS, they give us an AWS account, we put it in there. Theoretically, we could then also talk about using the serverless variant, if they like, but it's not strictly required for us. >> So Christian, talk to me about that provisioning process, because if I had a MySQL deployment before, I can imagine how putting that into a cloud native type of repeatable CI/CD pipeline or Ansible script could be difficult. Talk to me about that. How does CockroachDB enable you to create new onboarding experiences for your customers? >> So what we do is, we use Helm charts all over the place, as probably everybody else does. And then each application team has their parts of the services, they've packaged them into Helm charts, they've wrapped those in a super chart that gets wrapped into the super, super chart for the entire stack. 
And then at the right place, somewhere in between, Cockroach is added, where it's a dependency. And as they just offer a Helm chart, that's as easy as it gets. And then what the teams do is they have an init job that, once you deploy all that, would spin up. And as soon as Cockroach is ready, it's just the same reconcile loop as everything. It will then provision users, set up the database schema, do all that, and initialize initial data sets that might be required for a new setup. So with that setup, we can spin up a new cluster and then deploy that stack chart in there. And it takes some time. And then it's done. >> So talk to me about life cycle management. Because when I have one database, I have one schema. When I have a lot of databases, I have a lot of different schemas. How do you keep your stack consistent across customers? >> That is basically part of the same story. We have GitOps all over the place. So we have this repository, we see the super Helm chart versions, and we maintain like minus three versions and ensure that we update the customers and keep them up to date. It's part of the contract sometimes, down to the schedule of the customer at times. And Cockroach nicely supports these updates too, with these migrations in the background, the schema migrations in the background. So we use, in our case, in that integration, SQLAlchemy, which is also nicely supported. So that was also part of the story: going from MySQL to Postgres was supported by the ORM, these kinds of things. So the skill approach together with the ease of Helm charts and the background migrations of the schema makes for very seamless upgrade operations. Before that we had to have downtime. >> That's right, you could have online schema changes. Upgrading the database uses the same concept of rolling upgrades that you have in Kubernetes. It's just cloud native. It just fits that same context, I think. >> Christian: It became a no-brainer. >> Yeah. >> Yeah. 
>> Jim, you mentioned the idea of a SQL API in the cloud, that's really interesting. Why does such a thing not exist? >> Because it's really difficult to build. You know, SQL API, what does that mean? Like, okay. Where does that endpoint live? Is there one in California, one on the east coast, one in Europe, one in Asia? Okay. And I'm asking that endpoint for data. Where does that data live? Can you control where data lives on the planet? Because ultimately what we're fighting in software today in a lot of these situations is the speed of light. And so how do you intelligently place data on this planet? So that, you know, when you're asking for data, when you're maybe home, it's a different latency than when you're here in Valencia. Does that data follow you and move with you? These are really, really difficult problems to solve. And I think that we're at this moment in time in software engineering where we're solving some really interesting things, because we are butting up against this speed of light problem. And ultimately that's one of the biggest challenges. But underneath, it has to have all this automation, like the ease with which we can scale this database, like the always-on resilience, the way that we can upgrade the entire thing with just rolling upgrades. The cloud native concepts are really what's enabling us to do things at global scale; it's automation. >> Let's talk about that speed of light and global scale. There's no better conference for speed of light, for scale, than KubeCon. Any predictions coming out of the show? >> It's less a prediction for me and more of an observation, you guys. Like, look at two years ago, when we were here in Barcelona at KubeCon EU, it was a lot of hype. It was a lot of hype, a lot of people walking around, curious, fascinated. This is reality. The conversations that I'm having with people today, there's a reality. There's people really doing it, they're becoming cloud native. 
And to me, I think what we're going to see over the next two to three years is people starting to adopt this kind of distributed mindset. And it permeates not just within infrastructure but it goes up into the stack. We'll start to see many more developers using Go and these kinds of threaded languages, because I think that distributed mindset, if it starts at the chip all the way to the fingertip of the person clicking and you're distributed everywhere in between, is extremely powerful. And I think that's what Finleap, I mean, that's exactly what the team is doing. And I think there's a lot of value and a lot of power in that. >> Jim, Christian, thank you so much for coming on theCUBE and sharing your story. You know what, we're past the hype cycle of Kubernetes, I agree. I was a nonbeliever in Kubernetes two, three years ago. It was mostly hype. We're looking at customers from Microsoft, Finleap and competitors doing amazing things with this platform and cloud native in general. Stay tuned for more coverage of KubeCon from Valencia, Spain. I'm Keith Townsend, along with Paul Gillin, and you're watching theCUBE, the leader in high tech coverage. (bright music)

Published Date : May 19 2022


ENTITIES

Entity	Category	Confidence
Jim	PERSON	0.99+
Paul Gillin	PERSON	0.99+
Jim Walker	PERSON	0.99+
California	LOCATION	0.99+
Keith Townsend	PERSON	0.99+
Michael Stonebraker	PERSON	0.99+
2018	DATE	0.99+
Germany	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
2015	DATE	0.99+
Frankfurt	LOCATION	0.99+
Keith	PERSON	0.99+
Europe	LOCATION	0.99+
seven	QUANTITY	0.99+
Red Hat	ORGANIZATION	0.99+
Cockroach Labs	ORGANIZATION	0.99+
Christia	PERSON	0.99+
Barcelona	LOCATION	0.99+
Google	ORGANIZATION	0.99+
Valencia	LOCATION	0.99+
Asia	LOCATION	0.99+
Christian	PERSON	0.99+
Finleap Connect	ORGANIZATION	0.99+
MySQL	TITLE	0.99+
Kubernetes	TITLE	0.99+
Valencia, Spain	LOCATION	0.99+
three	QUANTITY	0.99+
two years ago	DATE	0.99+
Finleap	ORGANIZATION	0.99+
three engineers	QUANTITY	0.99+
three countries	QUANTITY	0.99+
first guest	QUANTITY	0.99+
SQL API	TITLE	0.99+
Paul	PERSON	0.99+
Kubecon	ORGANIZATION	0.98+
last October	DATE	0.98+
eight years	QUANTITY	0.98+
2022	DATE	0.98+
each application	QUANTITY	0.98+
four countries	QUANTITY	0.98+
one database	QUANTITY	0.98+
one	QUANTITY	0.98+
2019	DATE	0.98+
three years ago	DATE	0.98+
CockroachDB	ORGANIZATION	0.98+
one schema	QUANTITY	0.98+
Christian Huning	PERSON	0.97+
about a year and a half	QUANTITY	0.97+
two	DATE	0.96+
first	QUANTITY	0.96+
Christian Hüning	PERSON	0.94+
today	DATE	0.94+
about seven	QUANTITY	0.93+
Cloudnativecon	ORGANIZATION	0.93+
three years	QUANTITY	0.93+

Analyst Power Panel: Future of Database Platforms

(upbeat music) >> Once a staid and boring business dominated by IBM, Oracle, and at the time newcomer Microsoft, along with a handful of wannabes, the database business has exploded in the past decade and has become a staple of financial excellence, customer experience, analytic advantage, competitive strategy, growth initiatives, visualizations, not to mention compliance, security, privacy and dozens of other important use cases and initiatives. And on the vendor side of the house, we've seen the rapid ascendancy of cloud databases, most notably from Snowflake, whose massive raises leading up to its IPO in late 2020 sparked a spate of interest and VC investment in the separation of compute and storage and all that elastic resource stuff in the cloud. The company joined AWS, Azure and Google to popularize cloud databases, which have become a linchpin of competitive strategies for technology suppliers. And if I get you to put your data in my database and in my cloud, and I keep innovating, I'm going to build a moat and achieve a hugely attractive lifetime customer value in a really amazing marginal economics dynamic that is going to fund my future. And I'll be able to sell other adjacent services, not just compute and storage, but machine learning and inference and training and all kinds of stuff, dozens of lucrative cloud offerings. Meanwhile, the database leader, Oracle, has invested massive amounts of money to maintain its lead. It's building on its position as the king of mission critical workloads and making typical Oracle-like claims against the competition, most recently just yesterday with another announcement around MySQL HeatWave, an extension of MySQL that is compatible with on-premises MySQL and is setting new standards in price performance. We're seeing a dramatic divergence in strategies across the database spectrum. On the far left, we see Amazon with more than a dozen database offerings, each with its own API and primitives. 
AWS is taking a right tool for the right job approach, often building on open source platforms and creating services that it offers to customers to solve very specific problems for developers. And on the other side of the line, we see Oracle, which is taking the Swiss Army Knife approach, converging database functionality, enabling analytic and transactional workloads to run in the same data store, eliminating the need to ETL, at the same time adding capabilities into its platform like automation and machine learning. Welcome to this database Power Panel. My name is Dave Vellante, and I'm so excited to bring together some of the most respected industry analysts in the community. Today we're going to assess what's happening in the market. We're going to dig into the competitive landscape and explore the future of database and database platforms and decode what it means to customers. Let me take a moment to welcome our guest analysts today. Matt Kimball is a vice president and principal analyst at Moor Insights and Strategy. He knows products, he knows industry, he's got real world IT expertise, and he's got all the angles, 25 plus years of experience in all kinds of great background. Matt, welcome. Thanks very much for coming on theCUBE. Holger Mueller, friend of theCUBE, vice president and principal analyst at Constellation Research, in-depth knowledge on applications, application development, knows developers. He's worked at SAP and Oracle. And then Bob Evans is Chief Content Officer and co-founder of the Acceleration Economy, founder and principal of Cloud Wars. Covers all kinds of industry topics and great insights. He's got awesome videos, these three minute hits. If you haven't seen 'em, check them out. He knows cloud companies, his Cloud Wars minutes are fantastic. And then of course, Marc Staimer is the founder of Dragon Slayer Research, a frequent contributor and guest analyst at Wikibon. 
He's got a wide ranging knowledge across IT products, knows technology really well, can go deep. And then of course, Ron Westfall, Senior Analyst and Research Director at Futurum Research, great all around product trends knowledge. Can take, you know, technical dives and really understands competitive angles, knows Redshift, Snowflake, and many others. Gents, thanks so much for taking the time to join us in theCUBE today. It's great to have you on, good to see you. >> Good to be here, thanks for having us. >> Thanks, Dave. >> All right, let's start with an around the horn and briefly, if each of you would describe, you know, anything I missed in your areas of expertise and then you answer the following question, how would you describe the state of the database, state of platform market today? Matt Kimball, please start. >> Oh, I hate going first, but that's okay. How would I describe the world today? I would just in one sentence, I would say, I'm glad I'm not in IT anymore, right? So, you know, it is a complex and dangerous world out there. And I don't envy IT folks who have to support, you know, these modernization and transformation efforts that are going on within the enterprise. It used to be, you mentioned it, Dave, you would argue about IBM versus Oracle versus this newcomer in the database space called Microsoft. And don't forget Sybase back in the day, but you know, now it's not just, which SQL vendor am I going to go with? It's all of these different, divergent data types that have to be taken, they have to be merged together, synthesized. And somehow I have to do that cleanly and use this to drive strategic decisions for my business. That is not easy. So, you know, you have to look at it from the perspective of the business user. It's great for them because as a DevOps person, or as an analyst, I have so much flexibility and I have this thing called the cloud now where I can go get services immediately. 
As an IT person or a DBA, I am calling up prevention hotlines 24 hours a day, because I don't know how I'm going to be able to support the business. And as an Oracle or a Microsoft or some of the cloud providers and cloud databases out there, I'm licking my chops because, you know, my market is expanding and expanding every day. >> Great, thank you for that, Matt. Holger, how do you see the world these days? You always have a good perspective on things, share with us. >> Well, I think it's the best time to be in IT, I'm not sure what Matt is talking about. (laughing) It's easier than ever, right? The direction is going to cloud. Kubernetes has won, Google has the best AI for now, right? So things are easier than ever before. You made commitments for five plus years on hardware, networking and so on, on premise, and I got gray hair worrying it was the wrong decision. No, just kidding. But you kind of both sides, just to be controversial, make it interesting, right. So yeah, no, I think the interesting thing specifically with databases, right? We have this big suite versus best of breed, right? Obviously innovation, like you mentioned with Snowflake and others happening in the cloud, the cloud vendors server, where to save of their databases. And then we have one of the few survivors of the old guard, as Evans likes to call them, which is Oracle, who's doing well with their traditional database. And now, which is really interesting, remarkable from that, because with Oracle it was always the power of one: have one database, add more to it, make it what I call the universal database. And now this new HeatWave offering is coming on the MySQL open source side. So they're getting the second (indistinct) right? So it's interesting that older players, traditional players who still are in the market, are diversifying their offerings. Something we don't see so much from the traditional tools, from Oracle, on the Microsoft side or the IBM side these days. 
>> Great, thank you Holger. Bob Evans, you've covered this business for a while. You've worked at, you know, a number of different outlets and companies and you cover the competition, how do you see things? >> Dave, you know, the other angle to look at this from is from the customer side, right? You've now got CEOs who are in any sort of business across all sorts of industries, and they understand that their future success is going to be dependent on their ability to become a digital company, to understand data, to use it the right way. So as you outlined, Dave, I think in your intro there, it is a fantastic time to be in the database business. And I think we've got a lot of new buyers and influencers coming in. They don't know all this history about IBM and Microsoft and Oracle and you know, whoever else. So I think they're going to take a long, hard look, Dave, at some of these results and at who is able to help these companies, not just serve up the best technology, but who's going to be able to help their business move into the digital future. So it's a fascinating time now from every perspective. >> Great points, Bob. I mean, digital transformation has gone from buzzword to imperative. Mr. Staimer, how do you see things? >> I see things a little bit differently than my peers here in that I see the database market being segmented. There's all the different kinds of databases that people are looking at for different kinds of data, and then there are databases in the cloud. And so database as a cloud service, I view very differently than databases, because the traditional way of implementing a database is changing and it's changing rapidly. So one of the premises that you stated earlier on was that you viewed Oracle as a database company. I don't view Oracle as a database company anymore. 
I view Oracle as a cloud company that happens to have a significant expertise and specialty in databases, and they still sell database software in the traditional way, but ultimately they're a cloud company. So database cloud services from my point of view is a very distinct market from databases. >> Okay, well, you gave us some good meat on the bone to talk about that. Last but not least-- >> Dave, did Marc just say Oracle's a cloud company? >> Yeah. (laughing) Take away the database, it would be interesting to have that discussion, but let's let Ron jump in here. Ron, give us your take. >> That's a great segue. I think it's truly the era of the cloud database, that's something that's rising. And the key trends that come with it include, for example, elastic scaling. That is the ability to scale on demand, to right size workloads according to customer requirements. And also I think it's going to increase the prioritization for high availability. That is, the player who can provide the highest availability is going to have, I think, a great deal of success in this emerging market. And also I anticipate that there will be more consolidation across platforms in order to enable cost savings for customers, and that's something that's always going to be important. And I think we'll see more of that over the horizon. And then finally security, security will be more important than ever. We've seen a spike (indistinct), we certainly have seen geopolitically originated cybersecurity concerns. And as a result, I see database security becoming all the more important. >> Great, thank you. Okay, let me share some data with you guys. I'm going to throw this at you and see what you think. We have this awesome data partner called Enterprise Technology Research, ETR. They do these quarterly surveys and each period, with dozens of industry segments, they track client spending, customer spending. 
And this is the database, data warehouse sector, okay, so it's a taxonomy, so it's not perfect, but it's a big kind of chunk. They essentially ask customers within a category and by a specific vendor, are you spending more or less on the platform? And then they subtract the lesses from the mores and they derive a metric called net score. It's like NPS, it's a measure of spending velocity. It's more complicated and granular than that, but that's the basis and that's the vertical axis. The horizontal axis is what they call market share, it's not like IDC market share, it's just pervasiveness in the data set. And so there are a couple of things that stand out here that we can use as reference points. The first is the momentum of Snowflake. They've been off the charts for over two years now. Anything above that dotted red line, that 40%, is considered by ETR to be highly elevated, and Snowflake's even way above that. And I think it's probably not sustainable. We're going to see in the next April survey, next month from those guys, when it comes out. And then you see AWS and Microsoft, they're really pervasive on the horizontal axis and highly elevated, Google falls behind them. And then you've got a number of well funded players. You've got Cockroach Labs, Mongo, Redis, MariaDB, which of course is a fork of MySQL started almost as a protest at Oracle when they acquired Sun and they got MySQL, and you can see a number of others. Now Oracle, who's the leading database player, despite what Marc Staimer says, we know, (laughs) and they're a cloud player (laughing) who happens to be a leading database player. They dominate in the mission critical space, we know that they're the king of that sector, but you can see here that they're kind of legacy, right? They've been around a long time, they've got a big install base. So they don't have the spending momentum on the vertical axis. 
Now remember, this really doesn't capture spending levels, so that understates Oracle, but nonetheless. So it's not a complete picture, like SAP for instance is not in here, no Hana. I think people are actually buying it, but it doesn't show up here, (laughs) but it does give an indication of momentum and presence. So Bob Evans, I'm going to start with you. You've commented on many of these companies, you know, what does this data tell you? >> Yeah, you know, Dave, I think all these compilations of things like that are interesting, and the folks at ETR do some good work, but I think as you said, it's a snapshot, sort of a two-dimensional thing of a rapidly changing, three-dimensional world. You know, the incidence at which some of these companies are mentioned versus the volume that happens. I think it's, you know, with Oracle, and I'm not going to declare my religious affiliation, either as cloud company or database company, you know, they're all of those things and more, and I think some of our old language of how we classify companies is just not relevant anymore. But I want to add something in here too: the autonomous database from Oracle, nobody else has done that. So either Oracle is crazy, they've tried out a technology that nobody other than them is interested in, or they're onto something that nobody else can match. So to me, Dave, within Oracle, trying to identify how they're doing there, I would watch autonomous database growth too, because right, it's either going to be a big plan and it breaks through, or it's going to be caught behind. And the Snowflake phenomenon as you mentioned, that is a rare, rare bird who comes up and can grow 100% at a billion dollar revenue level like that. So now they've had a chance to come in, scare the crap out of everybody, rock the market with something totally new, the data cloud. 
Will the bigger companies be able to catch up and offer a compelling alternative, or is Snowflake going to continue to be this outlier? It's a fascinating time. >> Really interesting points there. Holger, I want to ask you, I mean, I've talked to, certainly I'm sure you guys have too, the founders of Snowflake that came out of Oracle, and they don't apologize. They say, "Hey, we're not going to do all that complicated stuff that Oracle does, we were trying to keep it real simple." But at the same time, you know, they don't do sophisticated workload management. They don't do complex joins. They're kind of relying on the ecosystems. So when you look at the data like this and the various momentums, and we talked about the diverging strategies, what does this say to you? >> Well, it is a great point. And I think Snowflake is an example of how the cloud can turbo charge a well understood concept, in this case, the data warehouse, right? You move that and you put it on steroids, and you see, for some players who've been big in data warehouse, like Teradata, as an example, here in San Diego, what could have been for them right in that part. The interesting thing, the problem though, is the cloud hides a lot of complexity too, which you can scale really well as you attract lots of customers to go there. And you don't have to build things like what Bob said, right? One of the fascinating things, right, nobody's answering Oracle on the autonomous database. I don't think it is that they cannot, they just have different priorities or the database is not such a priority. I would dare to say that's the case for IBM and Microsoft right now at the moment. And the cloud vendors, you just hide that, right, through scripts and through scale, because you support thousands of customers and you can deal with a little more complexity, right? It's not against them. Whereas if you have to run it yourself, very different story, right? 
You want to have the autonomous parts, you want to have the powerful tools to do things. >> Thank you. And so Matt, I want to go to you, you said up front, you know, it's just complicated if you're in IT, it's a complicated situation and you've been on the customer side. And if you're a buyer, it's obviously, it's like Holger said, "Cloud's supposed to make this stuff easier, but the simpler it gets, the more complicated it gets." So where do you place your bets? Or I guess more importantly, how do you decide where to place your bets? >> Yeah, it's a good question. And to what Bob and Holger said, you know, around autonomous database, I think, you know, as I play kind of armchair psychologist, if you will, corporate psychologist, I look at what Oracle is doing and, you know, databases are where they've made their mark and that's their strong position, right? So it makes sense, if you're making an entry into this cloud and you really want to kind of build momentum, you go with what you're good at, right? So that's kind of the strength of Oracle. Let's put a lot of focus on that. They do a lot more than database, don't get me wrong, but you know, I'm going to shore up my strength and then kind of pivot from there. With regards to, you know, what IT looks at and what I would look at, you know, as an IT director or somebody who is, you know, trying to consume services from these different cloud providers: first and foremost, I go with what I know, right? Let's not forget IT is a conservative group. And when we look at, you know, all the different permutations of database types out there, SQL, NoSQL, all the different types of NoSQL, those are largely being deployed by business users that are looking for agility or businesses that are looking for agility. You know, the reason why MongoDB is so popular is because of DevOps, right? It's a great platform to develop on and that's where it kind of gained its traction. 
But as an IT person, I want to go with what I know, where my muscle memory is, and that's my first position. And so as I evaluate different cloud service providers and cloud databases, I look for, you know, what I know and what I've invested in and where my muscle memory is. Is there enough there and do I have enough belief that that company or that service is going to be able to take me to, you know, where I see my organization in five years from a data management perspective, from a business perspective, are they going to be there? And if they are, then I'm a little bit more willing to make that investment, but, you know, if I'm kind of going in this blind or if I'm cloud native, you know, that's where the Snowflakes of the world become very attractive to me. >> Thank you. So Marc, I asked Andy Jassy in theCUBE one time, you have all these, you know, data stores and different APIs and primitives and you know, very granular, what's the strategy there? And he said, "Hey, that allows us, as the market changes, to be more flexible. If we start building abstraction layers, it's harder for us." I think also it was a good time-to-market advantage, but let me ask you, I described earlier on that spectrum from AWS to Oracle. We just saw yesterday, Oracle announced, I think, the third major enhancement in like 15 months to MySQL HeatWave. What do you make of that announcement? How do you think it impacts the competitive landscape, particularly as it relates to, you know, converging transaction and analytics, eliminating ETL? I know you have some thoughts on this. >> So let me back up for a second and defend my cloud statement about Oracle for a moment. (laughing) AWS did a great job in developing the cloud market in general and everything in the cloud market. I mean, I give them lots of kudos on that. And a lot of what they did is they took open source software and they rent it to people who use their cloud. 
So I give 'em lots of credit, they dominate the market. Oracle was late to the cloud market. In fact, they actually poo-pooed it initially, if you look at some of Larry Ellison's statements, they said, "Oh, it's never going to take off." And then they did a 180 turn, and they said, "Oh, we're going to embrace the cloud." And they really have, but when you're late to a market, you've got to be compelling. And this ties into the announcement yesterday, but let's deal with this compelling. To be compelling from a user point of view, you got to be twice as fast, offer twice as much functionality, at half the cost. That's generally what compelling is, if you're going to capture market share from the leaders who established the market. It's very difficult to capture market share in a new market for yourself. And you're right. I mean, Bob was correct on this and Holgar and Matt in which you look at Oracle, and they did a great job of leveraging their database to move into this market, give 'em lots of kudos for that too. But yesterday they announced, as you said, the third innovation release and the pace is just amazing of what they're doing on these releases on HeatWave that ties together initially MySQL with an integrated built-in analytics engine, so a data warehouse built in. And then they added automation with autopilot, and now they've added machine learning to it, and it's all in the same service. It's not something you can buy and put on your premise unless you buy their Cloud@Customer stuff. But generally it's a cloud offering, so it's compellingly better as far as the integration. You don't buy multiple services, you buy one and it's lower cost than any of the other services, but more importantly, it's faster, which again, give 'em credit for, they have more integration of a product. They can tie things together in a way that nobody else does. There's no additional services, ETL services like Glue in AWS. 
So from that perspective, they're getting better performance, fewer services, lower cost. Hmm, they're aiming at the compelling side again. So from a customer point of view it's compelling. Matt, you wanted to say something there. >> Yeah, I want to kind of, on what you just said there Marc, and this is something I've found really interesting, you know. The traditional way that you look at software and, you know, purchasing software and IT is, you look at either best of breed solutions and you have to work on the backend to integrate them all and make them all work well. And generally, you know, the big hit against the, you know, we have one integrated offering is that, you lose capability or you lose depth of features, right. And to what you were saying, you know, that's the thing I found interesting about what Oracle is doing is they're building in depth as they kind of, you know, build that service. It's not like you're losing a lot of capabilities, because you're going to one integrated service versus having to use A versus B versus C, and I love that idea. >> You're right. Yeah, not only you're not losing, but you're gaining functionality that you can't get by integrating a lot of these. I mean, I can take Snowflake and integrate it in with machine learning, but I also have to integrate in with a transactional database. So I've got to have connectors between all of this, which means I'm adding time. And what it comes down to at the end of the day is expertise, effort, time, and cost. And so what I see the difference from the Oracle announcements is they're aiming at reducing all of that by increasing performance as well. Correct me if I'm wrong on that but that's what I saw at the announcement yesterday. >> You know, Marc, one thing though Marc, it's funny you say that because I started out saying, you know, I'm glad I'm not 19 anymore. 
And the reason is because of exactly what you said, it's almost like there's a pseudo level of witchcraft that's required to support the modern data environment right in the enterprise. And I need simpler, faster, better. That's what I need, you know, I am no longer wearing pocket protectors. I have turned from, you know, break, fix kind of person, to you know, business consultant. And I need that point and click simplicity, but I can't sacrifice, you know, a depth of features of functionality on the backend as I play that consultancy role. >> So, Ron, I want to bring in Ron, you know, it's funny. So Matt, you mentioned Mongo, I often say, if Oracle mentions you, you're on the map. We saw them yesterday Ron, (laughing) they hammered Redshift's AutoML, they took swipes at Snowflake, a little bit of BigQuery. What were your thoughts on that? Do you agree with what these guys are saying in terms of HeatWave's capabilities? >> Yes, Dave, I think that's an excellent question. And fundamentally I do agree. And the question is why, and I think it's important to know that all of the Oracle data is backed by the fact that they're using benchmarks. For example, all of the ML and all of the TPC benchmarks, including all the scripts, all the configs and all the detail are posted on GitHub. So anybody can look at these results and they're fully transparent and replicate them themselves. If you don't agree with this data, then by all means challenge it. And we have not really seen that in all of the new updates in HeatWave over the last 15 months. And as a result, when it comes to these, you know, fundamentals in looking at the competitive landscape, which I think gives validity to outcomes such as Oracle being able to deliver 4.8 times better price performance than Redshift. As well as for example, 14.4 times better price performance than Snowflake, and also 12.9 times better price performance than BigQuery. And so that is, you know, looking at the quantitative side of things. 
But again, I think, you know, to Marc's point and to Matt's point, there are also qualitative aspects that clearly differentiate the Oracle proposition, from my perspective. For example now the MySQL HeatWave ML capabilities are native, they're built in, and they also support things such as completion criteria. And as a result, that enables them to show that hey, when you're using Redshift ML for example, you're having to also use their SageMaker tool and it's running on a meter. And so, you know, nobody really wants to be running on a meter when, you know, executing these incredibly complex tasks. And likewise, when it comes to Snowflake, they have to use a third party capability. They don't have the built in, it's not native. So the user, to the point that he's having to spend more time and it increases complexity to use auto ML capabilities across the Snowflake platform. And also, I think it also applies to other important features such as data sampling, for example, with the HeatWave ML, it's intelligent sampling that's being implemented. Whereas in contrast, we're seeing Redshift using random sampling. And again, Snowflake, you're having to use a third party library in order to achieve the same capabilities. So I think the differentiation is crystal clear. I think it definitely is refreshing. It's showing that this is where true value can be assigned. And if you don't agree with it, by all means challenge the data. >> Yeah, I want to come to the benchmarks in a minute. By the way, you know, the gentleman who's the Oracle's architect, he did a great job on the call yesterday explaining what you have to do. I thought that was quite impressive. But Bob, I know you follow the financials pretty closely and on the earnings call earlier this month, Ellison said that, "We're going to see HeatWave on AWS." And the skeptic in me said, oh, they must not be getting people to come to OCI. 
And then they, you remember this chart they showed yesterday that showed the growth of HeatWave on OCI. But of course there was no data on there, it was just sort of, you know, lines up and to the right. So what do you guys think of that? (Marc laughs) Does it signal Bob, desperation by Oracle that they can't get traction on OCI, or is it just really a smart TAM expansion move? What do you think? >> Yeah, Dave, that's a great question. You know, along the way there, and you know, just inside of that was something that Ellison said on the earnings call that spoke to a different sort of philosophy or mindset, almost Marc, where he said, "We're going to make this multicloud," right? With a lot of their other cloud stuff, if you wanted to use any of Oracle's cloud software, you had to use Oracle's infrastructure, OCI, there was no other way out of it. But this one, but I thought it was a classic Ellison line. He said, "Well, we're making this available on AWS. We're making this available, you know, on Snowflake because we're going after those users. And once they see what can be done here." So he's looking at it, I guess you could say, it's a concession to customers because they want multi-cloud. The other way to look at it, it's a hunting expedition and it's one of those uniquely I think Oracle ways. He said up front, right, he doesn't say, "Well, there's a big market, there's a lot for everybody, we just want our slice." Said, "No, we are going after Amazon, we're going after Redshift, we're going after Aurora. We're going after these users of Snowflake and so on." And I think it's really fairly refreshing these days to hear somebody say that, because now if I'm a buyer, I can look at that and say, you know, to Marc's point, "Do they measure up, do they crack that threshold ceiling? Or is this just going to be more pain than a few dollars savings is worth?" But you look at those numbers that Ron pointed out and that we all saw in that chart. 
I've never seen Dave, anything like that. In a substantive market, a new player coming in here, and being able to establish differences that are four, seven, eight, 10, 12 times better than competition. And as new buyers look at that, they're going to say, "What the hell are we doing paying, you know, five times more to get a poor result? What's going on here?" So I think this is going to rattle people and force a harder, closer look at what these alternatives are. >> I wonder if the guy, thank you. Let's just skip ahead of the benchmarks guys, bring up the next slide, let's skip ahead a little bit here, which talks to the benchmarks and the benchmarking if we can. You know, David Floyer, the sort of semiretired, you know, Wikibon analyst said, "Dave, this is going to force Amazon and others, Snowflake," he said, "To rethink actually how they architect databases." And this is kind of a compilation of some of the data that they shared. They went after Redshift mostly, (laughs) but also, you know, as I say, Snowflake, BigQuery. And, like I said, you can always tell which companies are doing well, 'cause Oracle will come after you, but they're on the radar here. (laughing) Holgar should we take this stuff seriously? I mean, or is it, you know, a grain of salt? What are your thoughts here? >> I think you have to take it seriously. I mean, that's a great question, great point on that. Because like Ron said, "If there's a flaw in a benchmark, we know this database traditionally, right?" If anybody came up with that, everybody will be, "Oh, you put the wrong benchmark, it wasn't audited right, let us do it again," and so on. We don't see this happening, right? So kudos to Oracle for being aggressive, differentiated, and seeming to have impeccable benchmarks. But what we really see, I think in my view is that the classic and we can talk about this in 100 years, right? Is the suite versus best of breed, right? And the key question of the suite, because the suite's always slower, right? 
No matter at which level of the stack, you have the suite, then the best of breed that will come up with something new, use a cloud, put the data warehouse on steroids and so on. The important thing is that you have to assess as a buyer what is the speed of my suite vendor. And that's what you guys mentioned before as well, right? Marc said that and so on, "Like, this is a third release in one year of the HeatWave team, right?" So everybody in the database open source Marc, and there's so many MySQL spinoffs to certain point is put on shine on the speed of (indistinct) team, putting out fundamental changes. And the beauty of that is right, is so inherent to the Oracle value proposition. Larry's vision of building the IBM of the 21st century, right from the silicon, from the chip all the way across the seven stacks to the click of the user. And that's what makes the database what Bob was saying, "Tied to the OCI infrastructure," because designed for that, it runs uniquely better for that, that's why we see the cross connect to Microsoft. HeatWave so it's different, right? Because HeatWave runs on cheap hardware, right? Which is the bread and butter x86 scale of any cloud provider, right? So Oracle probably needs it to scale OCI in a different category, not the expensive side, but also allow us to do what we said before, the multicloud capability, which ultimately CIOs really want, because data gravity is real, you want to operate where that is. If you have a fast, innovative offering, which gives you more functionality and the R&D speed is really impressive for the space, puts away bad results, then it's a good bet to look at. >> Yeah, so you're saying, that suite versus best of breed. I just want to sort of play back then Marc a comment. That suite versus best of breed, there's always been that trade off. If I understand you Holgar you're saying that somehow Oracle has magically cut through that trade off and they're giving you the best of both. 
>> It's the developing velocity, right? The provision of important features, which matter to buyers of the suite vendor, eclipses the best of breed vendor, then the best of breed vendor is in the hell of a potential job. >> Yeah, go ahead Marc. >> Yeah and I want to add on what Holgar just said there. I mean the worst job in the data center is data movement, moving the data sucks. I don't care who you are, nobody likes it. You never get any kudos for doing it well, and you always get the ah craps, when things go wrong. So it's in- >> In the data center Marc all the time across data centers, across cloud. That's where the bleeding comes. >> It's right, you get beat up all the time. So nobody likes to move data, ever. So what you're looking at with what they announce with HeatWave and what I love about HeatWave is it doesn't matter when you started with it, you get all the additional features they announce it's part of the service, all the time. But they don't have to move any of the data. You want to analyze the data that's in your transactional, MySQL database, it's there. You want to do machine learning models, it's there, there's no data movement. The data movement is the key thing, and they just eliminate that, in so many ways. And the other thing I wanted to talk about is on the benchmarks. As great as those benchmarks are, they're really conservative 'cause they're underestimating the cost of that data movement. The ETLs, the other services, everything's left out. It's just comparing HeatWave, MySQL cloud service with HeatWave versus Redshift, not Redshift and Aurora and Glue, Redshift and Redshift ML and SageMaker, it's just Redshift. >> Yeah, so what you're saying is what Oracle's doing is saying, "Okay, we're going to run MySQL HeatWave benchmarks on analytics against Redshift, and then we're going to run 'em in transaction against Aurora." >> Right. 
>> But if you really had to look at what you would have to do with the ETL, you'd have to buy two different data stores and all the infrastructure around that, and that goes away so. >> Due to the nature of the competition, they're running narrow best of breed benchmarks. There is no suite level benchmark (Dave laughs) because they created something new. >> Well, that's your earlier point, they're beating best of breed with a suite. So that's, I guess to Floyer's earlier point, "That's going to shake things up." But I want to come back to Bob Evans, 'cause I want to tap your Cloud Wars mojo before we wrap. And line up the horses, you got AWS, you got Microsoft, Google and Oracle. Now they all own their own cloud. Snowflake, Mongo, Couchbase, Redis, Cockroach by the way they're all doing very well. They run in the cloud as do many others. I think you guys all saw the Andreessen, you know, commentary from Sarah Wang and company, talking about the cost of goods sold impact of cloud. So owning your own cloud has to be an advantage because other guys like Snowflake have to pay cloud vendors and negotiate down versus having the whole enchilada, Safra Catz's dream. Bob, how do you think this is going to impact the market long term? >> Well, Dave, that's a great question about, you know, how this is all going to play out. If I could mention three things, one, Frank Slootman has done a fantastic job with Snowflake. Really good company before he got there, but since he's been there, the growth mindset, the discipline, the rigor and the phenomenon of what Snowflake has done has forced all these bigger companies to really accelerate what they're doing. And again, it's an example of how this intense competition makes all the different cloud vendors better and it provides enormous value to customers. Second thing I wanted to mention here was look at the Adam Selipsky effect at AWS, took over in the middle of May, and in Q2, Q3, Q4, AWS's growth rate accelerated. 
And in each of those three quarters, they grew faster than Microsoft's cloud, which has not happened in two or three years, so they're closing the gap on Microsoft. The third thing, Dave, in this, you know, incredibly intense competitive nature here, look at Larry Ellison, right? He's got his, you know, the product that for the last two or three years, he said, "It's going to help determine the future of the company, autonomous database." You would think he's the last person in the world who's going to bring in, you know, in some ways another database to think about there, but he has put, you know, his whole effort and energy behind this. The investments Oracle's made, he's riding this horse really hard. So it's not just a technology achievement, but it's also an investment priority for Oracle going forward. And I think it's going to form a lot of how they position themselves to this new breed of buyer with a new type of need and expectations from IT. So I just think the next two or three years are going to be fantastic for people who are lucky enough to get to do the sorts of things that we do. >> You know, it's a great point you made about AWS. Back in 2018 Q3, they were doing about 7.4 billion a quarter and they were growing in the mid forties. They dropped down to like 29% Q4, 2020, I'm looking at the data now. They popped back up last quarter, last reported quarter to 40%, that is 17.8 billion, so they more than doubled and they accelerated their growth rate. (laughs) So maybe that portends, people are concerned about Snowflake right now decelerating growth. You know, maybe that's going to be different. By the way, I think Snowflake has a different strategy, the whole data cloud thing, data sharing. They're not trying to necessarily take Oracle head on, which is going to make this next 10 years, really interesting. All right, we got to go, last question. 30 seconds or less, what can we expect from the future of data platforms? Matt, please start. 
>> I have to go first again? You're killing me, Dave. (laughing) In the next few years, I think you're going to see the major players continue to meet customers where they are, right. Every organization, every environment is, you know, kind of, we use these words bespoke and snowflake, pardon the pun, but snowflakes, right. But you know, they're all opinionated and unique and what's great as an IT person is, you know, there is a service for me regardless of where I am on my journey, in my data management journey. I think you're going to continue to see with regards specifically to Oracle, I think you're going to see the company continue along this path of being all things to all people, if you will, or all organizations without sacrificing, you know, kind of richness of features and sacrificing who they are, right. Look, they are the data kings, right? I mean, they've been a database leader for an awful long time. I don't see that going away any time soon and I love the innovative spirit they've brought in with HeatWave. >> All right, great thank you. Okay, 30 seconds, Holgar go. >> Yeah, I mean, the interesting thing that we see is really that trend to autonomous as Oracle calls it, or self-driving software, right? So the database will have to do more things than just store the data and support the DBA. It will have to show it can wide insights, the whole upside, it will be able to show to one machine learning. We haven't really talked about that. How in just exciting what kind of use case we can get of machine learning running real time on data as it changes, right? So, which is part of the E5 announcement, right? So we'll see more of that self-driving nature in the database space. And because you said we can promote it, right. Check out my report about HeatWave latest release where I post in oracle.com. >> Great, thank you for that. And Bob Evans, please. You're great at quick hits, hit us. >> Dave, thanks. 
I really enjoyed getting to hear everybody's opinion here today and I think what's going to happen too. I think there's a new generation of buyers, a new set of CXO influencers in here. And I think what Oracle's done with this, MySQL HeatWave, those benchmarks that Ron talked about so eloquently here that is going to become something that forces other companies, not just try to get incrementally better. I think we're going to see a massive new wave of innovation to try to play catch up. So I really take my hat off to Oracle's achievement from going to, push everybody to be better. >> Excellent. Marc Staimer, what do you say? >> Sure, I'm going to leverage off of something Matt said earlier, "Those companies that are going to develop faster, cheaper, simpler products that are going to solve customer problems, IT problems are the ones that are going to succeed, or the ones who are going to grow. The one who are just focused on the technology are going to fall by the wayside." So those who can solve more problems, do it more elegantly and do it for less money are going to do great. So Oracle's going down that path today, Snowflake's going down that path. They're trying to do more integration with third party, but as a result, aiming at that simpler, faster, cheaper mentality is where you're going to continue to see this market go. >> Amen brother Marc. >> Thank you, Ron Westfall, we'll give you the last word, bring us home. >> Well, thank you. And I'm loving it. I see a wave of innovation across the entire cloud database ecosystem and Oracle is fueling it. We are seeing it, with the native integration of auto ML capabilities, elastic scaling, lower entry price points, et cetera. And this is just going to be great news for buyers, but also developers and increased use of open APIs. And so I think that is really the key takeaways. Just we're going to see a lot of great innovation on the horizon here. 
>> Guys, fantastic insights, one of the best power panels I've ever done. Love to have you back. Thanks so much for coming on today. >> Great job, Dave, thank you. >> All right, and thank you for watching. This is Dave Vellante for theCube and we'll see you next time. (soft music)

Published Date : Mar 31 2022



Breaking Analysis: Snowflake’s Wild Ride

from the cube studios in palo alto in boston bringing you data driven insights from the cube and etr this is breaking analysis with dave vellante snowflake they love the stock at 400 and hated at 165 that's the nature of the business i guess especially in this crazy cycle over the last two years of lockdowns free money exploding demand and now rising inflation and rates but with the fed providing some clarity on its actions the time has come to really dig into the fundamentals of companies and there's no tech company that's more fun to analyze than snowflake hello and welcome to this week's wikibon cube insights powered by etr in this breaking analysis we look at the action of snowflake stock since its ipo why it's behaved the way it has how some sharp traders are looking at the stock and most importantly what customer demand looks like the stock has really provided some great theater since its ipo i know people who got in at 120 before the open and i know lots of people who kind of held their noses and bought the stock on day one at over 300 a day when it closed at around 240 that first day of trading snowflake hit 164 this week it's all-time low as a public company as my college roommate chip simonton a long time trader told me when great companies trade at all times time lows because of panic it's worth taking a shot he did now of course the stock could go lower there's geopolitical risk and the stock with a 64 billion market cap is expensive for a company that's forecast to do around 2 billion in product revenue this year and remember i don't recommend stocks you shouldn't take my advice and my comments you got to do your own research but i have lots of data and i have opinions and i'm willing to share that with you stocks like snowflake crowdstrike z-scaler octa and companies like this are highly volatile when markets are moving up they're going to move up faster than the mean when they're declining they're going to drop more severely and that's clearly what's 
happened to snowflake so with a company like this you when you see panic selling you'll also see panic buying sometimes like we we've seen with this name it went from 220 to 320 in a very short period earlier snowflake put in a short-term bottom this week and many traders feel the issue was oversold so they bought okay but not everyone felt this way and you can see this in the headlines snowflake hits low but cloud stocks rise and we're going to come back to that is it a buy don't buy the dip buy the dip and what snowflake investors can learn from microsoft and from the street.com snow stock is sliding on the back of ill-conceived guidance and to that i would say that conservative guidance these days is anything but ill-conceived now let's unpack all this a bit and to do so i reached out to ivana delevska who has been on this program before she's with spear invest a female-led etf that goes deep into understanding supply chains she came on breaking analysis and laid out her thesis to buy the dip on snowflake this is a while ago she told me currently spear still likes snowflake and has doubled its position let me share her analysis she called out two drivers for the downside interest rates you know rising of course in snowflakes guidance which my own publication called weak in that previous chart that i just showed you so let's dig into that a bit snowflake guided for product revenues of 67 year on year which was below buy side expectations but i believe within sell side consensus regardless the guide was nuanced and driven by snowflake's decision to pass along price efficiencies to customers from optimizing processor price performance predominantly from aws's graviton too this is going to hit snowflakes revenue a net of about a hundred million dollars this year but the timing's not precise because it's going to hit 165 million but they're going to make up 65 million in increased demand frank slootman on the earnings call made this very clear he said quote this is 
not philanthropy this stimulates demand classic slootman the point is spear and other bulls believe that this will result in a gain for snowflake over the medium term and we would agree price goes down roi gets better you throw more projects at snowflakes customers going to buy more snowflake and when that happens and it gives the company an advantage as they continue to build their moat it's a longer term bet on cloud and data which are good bets now some of this could also be competitive pressures there have been you know studies that are out there from competitors attacking snowflakes pricing and price performance and they make comparisons oracle's been pretty aggressive as have others but so far the company's customers continue to consume now at a very fast rate now on on this front what can we learn from microsoft that applies to snowflake that's the headline here from benzinga so the article quoted a wealth manager named josh brown talking about what happened to microsoft after the dot-com bubble burst and how they quadrupled earnings over the next decade and the stock went sideways suggesting the same thing could happen to snowflake now i'd like to make a couple of comments here first at the time microsoft was a 23 billion dollar company and it had a monopoly and was already highly profitable steve ballmer became the ceo of microsoft right after the dot-com bubble burst and he hugged onto windows for dear life and lived off of microsoft's pc software monopoly microsoft became an extremely profitable and remarkably uninteresting caretaker of a pc in on-prem software estate during balmer's tenure so i just don't see the comparison as relevant snowflake you know they're going to make struggle for other reasons but that one didn't really resonate with me what's interesting is this chart it poses the question do cloud and data markets behave differently it's a chart that shows aws growth rates over time and superimposes the revenue in the red in q1 2018 aws 
generated 5.4 billion dollars in revenue and that was growing at the time at nearly a 50 percent rate now that rate as you can see decelerated quite significantly as aws grew to a 50 billion dollar run rate company that down below where you see it bottoms now it makes sense right law of large numbers you can't keep growing that fast when you get that big well oops look what happened in 2021 aws's growth rate bottoms in the high 20s and then rockets back up to 40 percent this past quarter as aws surpasses a 70 billion dollar run rate so you have to ask is cloud different is data different is cloud data different or data cloud different let's put it in the snowflake parlance can cloud because of its consumption model and the speed of innovation and ecosystem depth and breadth enable snowflake to exhibit lots of variability in its growth rates versus say a progressive and somewhat linear decline as the company grows revenue which is what you would expect historically and part of the answer relates to its market size here's a chart we've shared before with some additions it's our version of snowflake's total available market their tam with snowflake's version that blue data cloud thing superimposed on the right it shows the various layers of market opportunity that we came up with that snowflake and others we think have in front of them emerging from the disruption of legacy data lakes and data warehouses to what snowflake refers to as its data cloud we think about the data mesh concept and decentralized data architectures with domain ownership and data product and service builders as consistent with snowflake's data cloud vision where snowflake data stores are nodes they're just simply discoverable nodes on the mesh you could have you know databricks data lakes you know s3 buckets on that mesh it doesn't matter they can be discovered they can be shared and of course they're governed in a federated model now in snowflake's model it's all inside the snowflake data cloud
that's fine then you go to the out years it gets a little fuzzy you know from edge locations and ai inference it becomes massive and decision making occurs in real time where machines and machine data take over the world instead of you know clicks and keystrokes sounds out there but it's real and how exactly snowflake plays there at this point is unclear but one thing's for sure there'll be a lot of data and it's going to find its way into snowflake you know snowflake's not a real-time engine it's an analytical system it's moving into the realm of data science and you know we've talked about the need for you know a semantic layer between those two worlds of analytics and data science but expanding the scope further out we think that snowflake has a big role to play in this future and the future is massive okay check you got the big tam now as someone that looks at companies through a fundamentals prism you've got to look obviously at the markets and the tam which we just did but you also want to understand customers and it's not hard to find snowflake customers capital one disney micron allianz sainsbury's sonos and hundreds of other companies i've talked to snowflake customers who have also been customers of oracle teradata ibm netezza vertica serious database practitioners and they tell me it's consistent snowflake is different they say it's simpler it's more agile it's less complicated to secure and it's disruptive to their traditional ways of doing data management now of course there are naysayers i've spoken to a number of analysts that feel snowflake is deficient in areas like workload management and of course complex joins and it's too specialized in a world where we're seeing the convergence of analytics and transactional workloads our own david floyer believes that what oracle is doing with mysql heatwave is radically disruptive to many of the database architectures and blows away anything out there and he believes that snowflake and the likes of aws are
going to have to respond now the other criticism here is that snowflake is not architected for real-time inference where a lot of that edge activity is going to happen it's a multi-hundred billion dollar market and so look snowflake has a ton of competition that's the other thing all the major cloud players have very capable and competitive database platforms even though they all partner with snowflake except oracle of course but companies like databricks have garnered tons of vc and other vc funded companies have raised billions of dollars to do this kind of elastic consumption based separate compute from storage stuff so you have to always keep an open mind and be aware of potential blind spots for these companies but to the criticisms i would say look snowflake they got there first and watch their ecosystem it's a real key to its continued success snowflake's not going to go it alone and it's going to use its ecosystem partners to expand its reach and accelerate the network effects and fill those gaps and it will acquire its stock is valuable so it should be doing that just as it did with streamlit a zero revenue company that it bought for 800 million dollars in stock and cash just recently streamlit is an open source python library that gets snowflake further deeper into that data science space that databricks space and look watch what snowflake is doing with snowpark it's an api library for processing data and building data intensive applications we've talked about snowflake essentially becoming the super cloud and building this sort of paas-like layer across clouds rather than trying to do it all themselves it seems snowflake is really staring at the api economy and building its ecosystem to plug those holes so let's come back to the customers here's a chart that shows snowflake's customer spending momentum or net score on the top line that's the vertical axis and pervasiveness in the data or market share in that bottom brown line snowflake
has unprecedented net scores and held them up for many many quarters as you can see here going back you know a couple years all leading to its expanded market penetration and measured as pervasiveness of so-called market share within the etr survey it's not like idc market share it's pervasiveness in the data set now i'll say this i don't see how this is sustainable i've been waiting for this to moderate i wouldn't be surprised to see snowflake come back to earth a little bit i think they'll clearly still be highly elevated based on the data that i've seen but i could see in one or more of the etr surveys this year this starting to moderate as they get big it's just it has to happen but i would again expect them to have a high spending velocity score but i think we're going to see snowflake you know maybe porpoise a bit here meaning you know it moderates it comes back up it's just really hard to sustain this pace of momentum and hire train retain and scale without absorbing some friction and some headwinds that's going to slow you down but back to the aws growth example it's entirely possible that we could see a similar dynamic with snowflake that you saw with aws and you kind of see it with salesforce and servicenow very successful large entrenched companies and it's very possible that snowflake could pull back moderate and then accelerate that growth even though people are concerned about the moderated guidance of 80 percent growth yeah that's the new definition of tepid i guess look i like to look at some other metrics the one that really caught you know my attention was the remaining performance obligations this last quarter rpo snowflake's is up to something like 2.6 billion and that is a forward-looking indicator of future revenues so i'd like to see that growing and it's growing at a fast pace so you're going to see some ups and downs with snowflake i have no doubt but i think things are
still looking pretty solid for the company growth companies like snowflake and okta and zscaler those other ones that i mentioned earlier have probably been repriced and refactored by investors while there's always going to be market and of course geopolitical risk especially in these times fundamentals matter you've got huge market well capitalized you got a leadership position great products and strong customer adoption you also have a great team team is something else that we look for we haven't touched on that but i'll leave you with this thought everyone knows about frank slootman mike scarpelli and what they've accomplished in their years of working together that's why the stock you know at ipo was so overvalued they had seen these guys do it before slootman just documented all this in his book amp it up which gives great insight into the history of you know that pair and the teams that they've built the companies that they've built how he thinks about building companies and markets and how you know total available markets super important but the whole philosophy and culture that he's building in his management style but you got to wonder right how long is this guy going to keep going what keeps him motivated you know i asked him that one time here's what he said why i mean are you in this for the sport what's the story here uh actually that that's not a bad way of characterizing it i think i am in it uh you know for the sport uh you know the only way to become the best version of yourself is to be uh to be under the gun and uh you know every single day and that's that's certainly uh what we are it sort of has its own rewards building great products building great companies uh you know regardless of you know uh what the spoils may be uh it has its own rewards and i i it's hard for people like us to get off the field and uh you know hang it up so here we are so there you have it he's in it for the sport how great is that he
loves building companies and in my opinion that's how frank slootman thinks about success it's not about money money's the byproduct of success as earl nightingale would say success is the progressive realization of a worthy ideal i love that quote building great companies building products that change the world changing people's lives with data and insights creating jobs creating life-altering wealth opportunities not for himself but for thousands of employees and partners i'd say that's a pretty worthy ideal and i hope frank slootman sticks with it for a while okay that's it for today thanks to stephanie chan for the background research she does for breaking analysis alex meyerson on production kristen martin and cheryl knight on social with rob hoff on siliconangle and thanks to ivana delevska of spear invest and my friend chip symington for the angles from the money side of things remember all these episodes are available as podcasts just search breaking analysis podcast i publish weekly on wikibon.com and siliconangle.com and don't forget to check out etr.plus for all the survey data you can reach me @dvellante or david.vellante@siliconangle.com and this is dave vellante for cube insights powered by etr be safe stay well and we'll see you next time [Music]

Published Date : Mar 18 2022

ENTITIES

Entity	Category	Confidence
microsoft	ORGANIZATION	0.99+
josh brown	PERSON	0.99+
alex meyerson	PERSON	0.99+
thousands	QUANTITY	0.99+
80 percent	QUANTITY	0.99+
2021	DATE	0.99+
slootman	PERSON	0.99+
rob hoff	PERSON	0.99+
67 year	QUANTITY	0.99+
5.4 billion dollars	QUANTITY	0.99+
50 billion dollar	QUANTITY	0.99+
64 billion	QUANTITY	0.99+
800 million dollars	QUANTITY	0.99+
165 million	QUANTITY	0.99+
23 billion dollar	QUANTITY	0.99+
stephanie chan	PERSON	0.99+
david floyer	PERSON	0.99+
ivana delevska	PERSON	0.99+
steve ballmer	PERSON	0.99+
this year	DATE	0.99+
2.6 billion	QUANTITY	0.99+
frank slootman	PERSON	0.99+
mike scarpelli	PERSON	0.99+
billions of dollars	QUANTITY	0.99+
oracle	ORGANIZATION	0.99+
earl nightingale	PERSON	0.99+
two drivers	QUANTITY	0.99+
multi-hundred billion dollar	QUANTITY	0.99+
david.velante	OTHER	0.98+
boston	LOCATION	0.98+
dave vellante	PERSON	0.98+
one	QUANTITY	0.98+
about a hundred million dollars	QUANTITY	0.98+
120	QUANTITY	0.98+
aws	ORGANIZATION	0.98+
Snowflake’s Wild Ride	TITLE	0.98+
frank slootman	PERSON	0.98+
siliconangle.com	OTHER	0.98+
this week	DATE	0.98+
around 2 billion	QUANTITY	0.98+
70 billion dollar	QUANTITY	0.97+
400	QUANTITY	0.97+
320	QUANTITY	0.97+
q1 2018	DATE	0.97+
kristen martin	PERSON	0.97+
220	QUANTITY	0.97+
chip symington	PERSON	0.96+
first	QUANTITY	0.96+
benzinga	ORGANIZATION	0.96+
164	QUANTITY	0.96+
over 300 a day	QUANTITY	0.96+
first day	QUANTITY	0.95+
earth	LOCATION	0.95+
windows	TITLE	0.95+
two worlds	QUANTITY	0.95+
past quarter	DATE	0.95+
165	QUANTITY	0.94+
disney	ORGANIZATION	0.94+
65 million	QUANTITY	0.94+
simonton	LOCATION	0.94+
python	TITLE	0.94+
street.com	OTHER	0.93+
a lot of data	QUANTITY	0.92+
last quarter	DATE	0.92+
cheryl knight	PERSON	0.92+
today	DATE	0.92+
50 rate	QUANTITY	0.91+
day one	QUANTITY	0.9+
zero revenue	QUANTITY	0.9+
devolante	OTHER	0.9+
tons	QUANTITY	0.89+
wikibon.com	OTHER	0.88+
one time	QUANTITY	0.88+
hundreds of other companies	QUANTITY	0.88+
etr	ORGANIZATION	0.87+
single day	QUANTITY	0.86+
balmer	PERSON	0.85+
around 240	QUANTITY	0.85+
ipo	ORGANIZATION	0.85+
20s	QUANTITY	0.84+
lots of data	QUANTITY	0.83+

Pete Lilley and Ben Bromhead, Instaclustr | CUBE Conversation

(upbeat music) >> Hello, and welcome to this "CUBE" conversation. I'm John Furrier, host of "theCUBE", here in Palo Alto, California, beginning in 2022, kicking off the new year with a great conversation. We're with folks from down under, two co-founders of Instaclustr: Peter Lilley, CEO, and Ben Bromhead, the CTO of Instaclustr, who's been on "theCUBE" before, in 2018 at Amazon re:Invent. Gentlemen, thanks for coming on "theCUBE". Thanks for piping in from Down Under into Palo Alto. >> Thanks, John, it's really good to be here, I'm looking forward to the conversation. >> So, I love the name, Instaclustr. It conjures up cloud, cloud scale, modern application, serverless. It just gives me a feel of things coming together. Spin me up a cluster of these kinds of feelings. The cloud is here, open source is growing, that's what you guys are in the middle of. Take a minute to explain what you guys do real quick and this open source cloud intersection that's just going supernova right now. >> Yeah, yeah, yeah. So, Instaclustr is on a mission to really enable the world's ambitions to use open source technology. And we do that specifically at the data layer. And we primarily do that through what we call our platform offering. And think of it as a super easy, super scalable, super reliable way to adopt open source technologies at the data layer, to build cutting edge applications in the cloud. Today used by customers all over the world. We started the business in Australia but we've very quickly become a global business. But we are the business that sits behind some of the most successful brands that are building massively scalable cloud based applications. And you're right. We sit at a real intersection of kind of four things. One is open source adoption which is an incredibly powerful journey and wave that's driving the future direction of IT. You've got managed services or managed operations and moving those onto a platform like Instaclustr.
You've got the adoption of cloud and cloud as a wave, like open source is a wave. And then you've got the growth of data, everything is data-driven these days. And data is just excellent for businesses and our customers. And in a lot of cases when we work with our customers on Instaclustr today, the application and the data, the data is the business. >> Ben, I want to get your thoughts as a CTO because open source, and technology, and cloud, has been a real game changer. If you go back prior to cloud, open source is very awesome, still great, freedom, we've got code, it's just the scale of open source. And then cloud came along, changed the game, so, open source. And then new business models became, so commercial open source software is now an industry. It's not just open source, "Hey, free software." And then maybe a Red Hat's out there, or someone like a Red Hat, have some premium support. There's been innovation on the business model side. So, matching technology innovation with the business model has been a big change in the past many, many years. And this past year in particular that's been key. And open source, open core, these are the things that people are talking about. License changes, this is a big discussion. Because you could be on the wrong side of history if you make the wrong decision here. >> Yeah, yeah, definitely. I think it's also worth, I guess, taking a step back and understanding a little bit about why have people gravitated towards open source and the cloud? Beyond kind of the hippie freedoms of, I can see the code and I have ownership, and everything's free and great. And I think the reason why it's really taken off in a commercial setting, in an enterprise setting is velocity. How much easier is it to go reach and grab an open-source tool? That you can download, you can grab, you can compile yourself, you can make it work the way you want it to, to solve a problem here and now.
Versus the old school way of doing it which is, I have to go download a trial version. Oh no, some of the features are locked. I've got to go talk to procurement or a salesperson to kind of go and solve the problem that I have. And then I've got to get that approved by my own purchasing department. And do we have budget? And all of a sudden it's way, way, way harder to solve the problem in front of you as an engineer. Whereas with open source I just go grab it and I move on. I've achieved something for the day. >> Basically all that friction that comes, you got a problem to solve, oh, open-source, I'm going to just get a hammer and hammer that nail. Wait, whoa, whoa. I got to stand in line, I got to jump over hoops, I got to do all these things. This is the hassle and friction. >> Exactly, and this is why it's often called one of the most impressive things about that. And I think on the cloud side it's the same thing, but for hardware, and capability, and compute, and memory. Previously, if you wanted to compute, oh, you're going to lodge a ticket. You've got to ask someone to rack a server in a data center. You've got to deal with three different departments. Oh my goodness. How painful is that just to get a server up to go run and do something? That's just pulling your hair out. Whereas with the cloud, that's an API call or clicking a few buttons on a console and off you go. You combine those two things. And I would say that software engineers are probably the most productive they've ever been in the last 20 years. I know sometimes it doesn't look like that but their ability to solve problems in front of them, especially using external stuff is way, way, way better. >> Peter: I think when you put those two things together you get an- >> The fact of the matter is they are productive. They're putting security into the code right in the CICD pipeline. So, this is highly agile right now.
So, coders are highly productive and efficient in changing the way people are rolling out applications. So, the game is over, open source has won, open core is winning. And this is where the people are confused. This is why I got you guys here. What's the difference between open source and open core? What's the big deal? Why is it so important? >> Yeah, no, great question. So, really the difference between open source and open core, it comes down to, really it's a business model. So, open core contains open-source software, that's a hundred percent true. So, usually what will happen is a company will take a project that is open source, that has an existing community around it, or they've built it, or they've contributed it, or however that genesis has happened. And then what they'll do is they'll look at all the edges around that open-source project. And they'll think, what are some enterprise features that don't exist in the open-source project that we can build ourselves? And then sprinkle those around the edges and sell that as a proprietary offering. So, what you get is you get the core functionality is powered by an open-source project. And quite often the code is identical. But there's all these kinds of little features around the outside that might make it a little bit easier to use in an enterprise environment. Or might make it a bit easier to do some operations side of things. And they'll charge you a license for that. So, you end up in a situation where you might have adopted the open source project, but then now if you want a feature X, Y, or Z, you then need to go and fork over some money and go into that whole licensing kind of contract. So, that's the core difference between open core and open-source, right? Open core, it's got all these little proprietary bits kind of sprinkled around the outside. >> So, how would you describe your platform for your customers? Obviously, you guys are succeeding, your growth is great, we're going to get to that in a second.
But as you guys have been steadily expanding the platform of open source data technologies, what is the main solution that you guys are offering customers? Managing open source technologies? What's the main value that you guys bring to the customer? >> Yeah, definitely. So, really the main value that we bring to the customer is we allow them to, I guess, successfully adopt open source databases or database technologies without having to go down that open core path. Open core can be quite attractive, but what it does is you end up with all these mini-Oracles, still having to pay the toll in terms of license fees. What we do, however, is we take those open-source projects and we deliver that as a database as a service on our managed platform. So, we take care of all the operations, the pain, the care, the feeding, patch management, backups. Everything that you need to do, whether you're running it yourself or getting someone else to run it, we'll take care of that for you. But we do it with the pure upstream open source version. So, that means you get full flexibility, full portability. And more importantly you're not paying those expensive license fees. Plus it's easy and it just works. You get that full cloud native experience and you get your database right now when you need it. >> And basically you guys solve the problem of one, I got this legacy or existing licensed technology I've got to pay for. And it may not be enabling modern applications, and they don't have a team to go do all the work (laughing). Or some companies have like a whole army of people just embedded in open-source, that's very rare. So, it sounds like you guys do both. Did I get that right, is that right? >> Yeah, definitely. So, we definitely enable it if you don't have that capability yourself. We are the outsourced option to that. It's obviously a lot more than that but it's one of those pressures that companies nowadays face.
And if we take it back to that concept of developer velocity, you really want them working on your core business problems. You don't want them having to fight database infrastructure. So, you've also got the opportunity cost of having your existing engineers working on running this stuff themselves. Or running a proprietary or an open core solution themselves, when really you should be outsourcing preferably to Instaclustr. But hey, let's be honest, you should be outsourcing it to anyone so that your engineers can be focusing on your core business problems. And really letting them work on the things that make you money. >> That's very smart. You guys have a great business model. Because one of the things we've been reporting on "theCUBE" on SiliconANGLE as well, is that the database market is becoming so diverse for the right reasons. Databases are everywhere now and code is becoming horizontally scalable for the cloud but vertically specialized with machine learning. So, you're seeing applications and new databases, no one database rules the world anymore. It's not about Oracle anymore, or anything else. So, open source fits nicely into this kind of platform view. How do you guys decide which technologies go into the platform that you support? >> Yeah, great question. So, we certainly live in a world of, I call it polyglot persistence. But a simple way of referring to that is the right tool for the right job. And so, we really live in this world where engineers will reach for a database that solves a specific problem and solves it well. As you mentioned, companies, they're no longer Oracle shops, or they're no longer MySQL shops. You'll quite often see services or applications of teams using two or three different databases to solve different challenges. And so, what we do at Instaclustr is we really look at what are the technologies that our existing customers are using, and using side-by-side with, say, some of the existing Instaclustr offerings.
We take great lead from that. We also look at what are the different projects out there that are solving use cases that we don't address at the moment. So, it's very use case driven. Whether it's, "Hey, we need something that's better at, say, time series." Or we need something that's a little bit better at transactional workloads. Or something a bit of a better fit for a case, right? And we work with those. And I think importantly, we also have this view that in a world of polyglot persistence, you've also got data integration challenges. So, how do you keep data safe between these two different database types? So, we're also looking at how do we integrate those better and support our users on that particular journey. So, it really comes down to one, listening to your customers, seeing what's out there and what's the right use case for a given technology and then we look to adopt that. >> That's great, Ben, machine learning is completely on fire right now. People love it, they want more of it. AI everything, everyone's putting AI on every label. If it does any automation, it's magic, it's AI. So, really, we know what's happening, it's just really database work and machine learning under the covers. Pete, the business model here has completely changed too, because now with open source as a platform you have more scale, you have differentiation opportunities. I'm sure business is doing great. Give us an update on the business side of Instaclustr. What's clicking for you guys, what's working? What's the success trajectory look like? >> Yeah, it's been an amazing journey for us. When you think about it we were founded in 2013, so, we're eight years into our journey. When we started the business we were focused entirely on Cassandra. But as Ben talked about, we've gone and diversified those technologies onto the platform, that common experience that we offer customers.
So, you can adopt any one of a number of open source technologies in a highly integrated way and really, really grow off the back of that. It's driving some phenomenal growth in our business and we've really enjoyed growth rates that have been 70, 80, 100% year on year since we started the business. And that's led to an enormous scale and opportunities for us to invest further in the platform, invest further in additional technologies in a really highly opinionated way. I think Ben talked about those integrations, and that becomes incredibly complex as you have many, many kinds of offerings on the platform. So, Instaclustr is much more targeted in terms of how we want to take our business forward and the growth opportunity before us. We think about being deeply expert and deeply capable in a smaller subset of technologies. But those which actually integrate and interoperate for customers so they can build solutions for their applications. But do that on Instaclustr using its platform with a common experience. And, so we've grown to 270 people now around the world. We started in Australia, we've got a strong presence in the US. We recently acquired a business called credativ in Europe, which was a PostgreSQL specialist organization. And that was because, as Ben said before, talking about those technologies we bring onto our platform. PostgreSQL, huge market, disrupting Oracle, exactly the right place that we want to be as Instaclustr with pure open source offerings. We brought them into the Instaclustr family in March this year and we did that to accelerate it on our platform. And so, we think about that. We think about future technologies on our platform, and what we can introduce to provide an even greater and richer experience. Cadence is new to our platform.
Super exciting for us because not only is it something that provides workflow as code, as an open source experience, but as a glue technology to build a complex business technology for applications. It actually drives workloads across Cassandra, PostgreSQL and Kafka, which are kind of core technologies on our platform. Super exciting for us, a big market. Interesting kind of group of adopters. You've got Uber kind of leading the charge there with that and us partnering with them now. We see that as a massive growth opportunity for our business. And as we introduce analytics capabilities, exploration, visibility features into the platform all built on open source. So, you can build a complete top to bottom data services layer using open source technology for your platform. We think that's an incredibly exciting part of the business and a great opportunity for us. >> Opportunities to raise money, more acquisitions on the horizon? >> Well, I think acquisitions where it makes sense. I talked about credativ, where we looked at credativ, we knew that PostgreSQL was new to our market, and we were coming into that market reasonably late. So, the way we thought about that from a strategy perspective was we wanted to accelerate the richness of the capability on our platform that we introduced and that became GA late last year. So, we think about when we're selecting that kind of technology, that's the perfect opportunity to consider an acquisition for us. So, as we look at what we're going to introduce in the platform over the next sort of two, three, four years, that sort of thinking frames what we would do from an acquisition perspective. I think the other way we think about acquisitions is new markets. So, thinking about globally entering, say into the Japanese market. Does that make sense because of any language requirements to be able to support customers?
'Cause one of the things that's really, really important to us is the platform is fantastic for scaling, growing, deploying, running, operating this very powerful open source technology. But so too is the importance of having deep operational open source expertise backing and being there to call on if a customer's having an application issue. And that kind of drives the need for us to have in country kind of market support. And so, when we think about those sort of opportunities, I think we think about acquisition there as another string to the bow in terms of getting presence in a particular or an emerging market that we're interested in. >> Awesome, Ben, final question to you is, on the technology front what do you see this year emerging? A lot of changes in 2021. We've got another year of pandemic situation going on. Hopefully it goes by fast. Hopefully it won't be three years, but again, who knows? But you're seeing the cloud open source actually taking a tailwind from the pandemic. New opportunities, companies are refreshing, they have to, they're forced. There's going to be a lot more changes. What do you see from a tech perspective in open-source, open core, and in general for large companies as open source continues to power the innovation? >> So, definitely the pandemic has been a tailwind, particularly for those companies adopting the cloud. I think it's forced a lot of their hands as well. Their five-year plans have certainly become two or three year plans around moving to the cloud. And certainly, that contest for talent means that you really want to be keeping your engineers focused on core things. So, definitely I think we're going to see a continuation of that. We're going to see the continuation of open source dominating when it comes to a database and the database market, the same with cloud. I think we're going to see the gradual march towards different adoption models within the cloud. So, serverless, right?
I think we're going to see that kind of slowly mature. I think it's still a little bit early in the hype cycle there, but we're going to start to see that mature. On the ML, AI side of things as well, people have been talking about it for the last three or four years. And I'm sure to people in the industry, they're like, "Oh, we're over that." But I think on the broader industry we're still quite early in that particular cycle as people figure out, how do they use the data that they've got? How do they use that? How do they train models on that? How do they serve inference on that? And how do they unlock other things lower down in their data stack as well when it comes to ML and AI, right? We're seeing great research papers come out on AI-powered indexes, right? So, the AI is actually speeding up queries, let alone actually solving business problems. So, I think we're going to see more and more of that kind of come out. I think we're going to see more and more process capabilities and organizational responses to this explosion of data. I'm super excited to see people talking about concepts and organizational concepts like data mesh. I think that's going to be fundamental as we move forward and have to manage the complexities of dealing with this. So, it's an old industry, data, when you think about it. As soon as you had computers you had data, and it's an old industry from that perspective. But I feel like we're only just getting started and it's just heating up. So, we're super excited to see what 2022 holds for us. >> Every company will be an open source AI company. It has to be no matter what. (Ben laughing) Well, thanks for sharing the data, Pete and Ben, the co-founders of Instaclustr. We'll get our "CUBE" AI working on this data we got today from you guys. Thanks for sharing, great stuff. Thanks for sharing the open core perspective. We really appreciate it and congratulations on your success. 
Companies do need more Instaclustrs out there, and you guys are doing a great job. Thanks for coming on, I appreciate it. >> Thanks John, cheers mate. >> Thanks John. >> It's "theCUBE" Conversation here at Palo Alto. I'm John Furrier, thanks for watching. (bright music)

Published Date : Jan 7 2022

ENTITIES

Entity	Category	Confidence
Peter Lilley	PERSON	0.99+
Australia	LOCATION	0.99+
2013	DATE	0.99+
Ben	PERSON	0.99+
John	PERSON	0.99+
70	QUANTITY	0.99+
Ben Bromhead	PERSON	0.99+
two	QUANTITY	0.99+
John Furrier	PERSON	0.99+
five-year	QUANTITY	0.99+
Peter	PERSON	0.99+
Europe	LOCATION	0.99+
Pete	PERSON	0.99+
Palo Alto	LOCATION	0.99+
2021	DATE	0.99+
Pete Lilley	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
US	LOCATION	0.99+
three	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
eight years	QUANTITY	0.99+
two things	QUANTITY	0.99+
2022	DATE	0.99+
PostgreSQL	ORGANIZATION	0.99+
three year	QUANTITY	0.99+
four years	QUANTITY	0.99+
270 people	QUANTITY	0.99+
Instaclustr	ORGANIZATION	0.99+
Today	DATE	0.99+
Palo Alto, California	LOCATION	0.99+
2018	DATE	0.98+
three years	QUANTITY	0.98+
today	DATE	0.98+
both	QUANTITY	0.98+
80	QUANTITY	0.98+
Oracles	ORGANIZATION	0.98+
One	QUANTITY	0.97+
one	QUANTITY	0.97+
100 year	QUANTITY	0.97+
Cassandra	TITLE	0.97+
March this year	DATE	0.96+
Kafka	TITLE	0.96+
MySQL	TITLE	0.96+
second	QUANTITY	0.95+
Intaclustr	ORGANIZATION	0.95+
PostgreSQL	TITLE	0.94+
hundred percent	QUANTITY	0.93+
pandemic	EVENT	0.93+
two co-founders	QUANTITY	0.92+
past year	DATE	0.91+
SiliconANGLE	ORGANIZATION	0.9+
late last year	DATE	0.9+
theCUBE	ORGANIZATION	0.9+
credativ	ORGANIZATION	0.88+
Amazon	ORGANIZATION	0.86+
three different databases	QUANTITY	0.86+
last 20 years	DATE	0.84+
this year	DATE	0.83+
Instaclustr	TITLE	0.74+

Search Results for MySQL: