The Truth About MySQL HeatWave

>>When Oracle acquired MySQL via the Sun acquisition, nobody really thought the company would put much effort into the platform, preferring to put all the wood behind its leading Oracle Database arrow, pun intended. But two years ago, Oracle surprised many folks by announcing MySQL HeatWave, a new database as a service with a massively parallel hybrid columnar in-memory architecture that brings together transactional and analytic data in a single platform. Welcome to our latest database Power Panel on theCUBE. My name is Dave Vellante, and today we're gonna discuss Oracle's MySQL HeatWave with a who's who of cloud database industry analysts. Holger Mueller is with Constellation Research. Marc Staimer is the Dragon Slayer and a Wikibon contributor. And Ron Westfall is with Futurum Research. Gentlemen, welcome back to theCUBE. Always a pleasure to have you on.
>>Thanks for having us.
>>Great to be here.
>>So we've had a number of deep dive interviews on theCUBE with Nipun Agarwal. You guys know him? He's a senior vice president of MySQL HeatWave development at Oracle. I think you just saw him at Oracle CloudWorld, and he's come on to describe what I'll call shock-and-awe feature additions to HeatWave. The company's clearly putting R&D into the platform, and I think at CloudWorld we saw something like the fifth major release since 2020, when they first announced MySQL HeatWave. So just listing a few: they've brought in analytics and machine learning, and they've got Autopilot, which is machine-learning-based automation, on top of the basic OLTP functionality of the database. And it's been interesting to watch Oracle's converged database strategy. We've contrasted that amongst ourselves. Love to get your thoughts on Amazon's get-the-right-tool-for-the-right-job approach.
>>Are they gonna have to change that? You know, Amazon's got the specialized databases, and both companies are doing well.
It just shows there are a lot of ways to skin a cat, 'cause you see some traction in the market in both approaches. So today we're gonna focus on the latest HeatWave announcements, and we're gonna talk about multicloud with a native MySQL HeatWave implementation, which is available on AWS, and MySQL HeatWave for Azure via the Oracle-Microsoft interconnect. This is kind of cool hybrid action that they've got going. Sometimes we call it supercloud. And then we're gonna dive into MySQL HeatWave Lakehouse, which allows users to process and query data across MySQL databases and HeatWave databases, as well as object stores. So HeatWave has been announced on AWS and Azure, and they're available now, and Lakehouse, I believe, is in beta, and I think it's coming out in the second half of next year. So again, all of our guests are fresh off of Oracle CloudWorld in Las Vegas, so they've got the latest scoop. Guys, I'm done talking. Let's get into it. Marc, maybe you could start us off. What's your opinion of MySQL HeatWave's competitive position? When you think about what AWS is doing, and Google, you know, we had Google Cloud Next recently and heard about all their data innovations. Obviously Azure's got a big portfolio, and Snowflake's doing well in the market. What's your take?
>>Well, first let's look at it from the point of view that AWS is the market leader in cloud and cloud services. They own somewhere between 30 and 50% of the market, depending on who you read. And then you have Azure as number two, and after that it falls off. There's GCP, Google Cloud Platform, which is further down the list, and then Oracle and IBM and Alibaba. So when you look at AWS and Azure and say, hey, these are the market leaders in the cloud, then you start looking at it and saying: if I'm going to provide a service that competes with a service they have, and I can make it available in their cloud, it means that I can be more competitive.
And if I'm compelling, and compelling means at least twice the performance or functionality or both at half the price, I should be able to gain market share.
>>And that's what Oracle's done. They've taken a superior product in MySQL HeatWave, which is faster and lower cost and does more for a lot less at the end of the day, and they make it available to the users of those clouds. You avoid this little thing called egress fees, you avoid the issue of having to migrate from one cloud to another, and suddenly you have a very compelling offer. So I look at what Oracle's doing with MySQL and it feels like, I'm gonna use a war term, a flanking maneuver on their competition. They're offering a better service on their platforms.
>>All right, so thank you for that. Holger, we've seen this sort of cadence, I referenced it up front a little bit: they sat on MySQL for a decade, then all of a sudden we see this rush of announcements. Why did it take so long? And more importantly, is Oracle developing the right features that cloud database customers are looking for, in your view?
>>Yeah, great question. But first of all, in your intro you said they added analytics, right? Analytics is kind of a marketing buzzword. Reports can be analytics, right? The interesting thing, the first thing they did, is they crossed the chasm between OLTP and OLAP in the same database, right? So that's a major engineering feat, very much what customers want, and it's all about creating value for customers, which I think is part of why they're going into multicloud and why they're adding these capabilities. And certainly with the AI capabilities, it's kind of getting into an autonomous, self-driving field. Now with the lakehouse capabilities, they're meeting customers where they are, and, like Marc has talked about, there are the egress costs in the cloud.
So that's a significant advantage: creating value for customers, and that's what matters at the end of the day.
>>And I believe strongly that long term it's gonna be the ones who create better value for customers who will get more of their money. From that perspective, why did it take them so long? I think it's a great question. I think largely, and you mentioned the gentleman, Nipun, it's largely down to who leads a product. I used to build products too, so maybe I'm fooling myself a little here, but that made the difference in my view, right? Since he's been in charge, he's been building things faster than the rest of the competition in the MySQL space, which in hindsight, though we thought it was in a hot and smoking innovation phase, was kind of a little self-complacent when it comes to the traditional borders of where people think things are separated: between OLTP and OLAP, or, as an example, JSON support, right? Structured documents versus unstructured documents or databases, and all of that has been collapsed and brought together to build a more powerful database for customers.
>>So I mean, when Oracle talks about the competitors, I always say, if Oracle talks about you, it knows you're doing well. So they talk a lot about AWS, talk a little bit about Snowflake, sort of Google, and they have partnerships with Azure. So I'm presuming that MySQL HeatWave was really in response to what they were seeing from those big competitors. But then you had MariaDB coming out the day that Oracle acquired Sun, launching and going after the MySQL base. So I'm interested, and we'll talk about this later, in what you guys think about AWS and Google and Azure and Snowflake and how they're gonna respond.
But before I do that, Ron, I want to ask you: you can get pretty technical, and you've probably seen the benchmarks.
>>I know you have. Oracle makes a big deal out of it, publishes its benchmarks, makes them transparent on GitHub. Larry Ellison talked about this in his keynote at CloudWorld. What do the benchmarks show in general? I mean, when you're new to the market, you've gotta have a story. Like Marc was saying, you've gotta be 2x the performance at half the cost, or you're not gonna get any market share. And, you know, oftentimes companies don't publish benchmarks when they're leading. They do it when they need to gain share. So what do you make of the benchmarks? Were there any results that were surprising to you? Have they been challenged by the competitors? Is it just a bunch of desperate benchmarketing to make some noise in the market, or are they real? What's your view?
>>Well, from my perspective, I think they have validity. And to your point, I believe that when it comes to competitor responses, that has not really happened. Nobody has pulled down the information that's on GitHub and said, oh, here are our price-performance results, and they counter Oracle's. In fact, I think part of the reason why that hasn't happened is the risk: if Oracle's coming out and saying, hey, we can deliver 17 times better query performance versus, say, Snowflake when it comes to the Lakehouse platform, and Snowflake turns around and says it's actually only 15 times better query performance, that's not exactly an effective maneuver. And so I think this is really to Oracle's credit, and I think it's refreshing, because these differentiators are significant. We're not talking, you know, 1.2% differences.
We're talking 17-fold differences, we're talking six-fold differences, depending on where the spotlight is being shined and so forth.
>>And so I think this is something that's almost too good to believe at first blush. If I'm a cloud database decision maker, I really have to prioritize this. I really would pay a lot more attention to this. And that's why I posed the question to Oracle and others: okay, if these differentiators are so significant, why isn't the needle moving a bit more? And it's for some of the usual reasons. One is really deep discounting coming from the other players. That's really kind of marketing 101; it's something you need to do when there's a real competitive threat, to keep a customer in your own customer base. Plus there's the usual fear and uncertainty about moving from one platform to another. But I think the traction, the momentum, is shifting in Oracle's favor. I think we saw that in the Q1 results, for example, where Oracle cloud grew 44% and generated 4.8 billion in revenue, if I recall correctly. And so all of these are demonstrating that Oracle is making, I think, many of the right moves. Publishing these figures for anybody to look at from their own perspective is something that is, I think, good for the market, and I think it's just gonna continue to pay dividends for Oracle down the horizon as competition intensifies. So if I were in...
>>Dave, can I interject something on what Ron just said there?
>>Yeah, please go ahead.
>>A couple things here. One, discounting, which is a common practice when you have a real threat, as Ron pointed out, isn't going to help much in this situation, simply because you can't discount to the point where you improve your performance, and the performance is a huge differentiator.
You may be able to get your price down, but the problem that most of them have is they don't have an integrated product or service. They don't have an integrated OLTP, OLAP, ML, and data lake. Even if you cut out two of them, they don't have any of them integrated. They have multiple services that require separate integration, and that can't be overcome with discounting. And you have to pay for each one of these. And oh, by the way, as you grow, the discounts go away. So that's a minor, important detail.
>>So that's a TCO question, Marc, right? And I know you look at this a lot. If I had that kind of price-performance advantage, I would be pounding TCO, especially if I need two separate databases to do the job that one can do. The TCO numbers are gonna be off the chart, or maybe down the chart, which is what you want. Have you looked at this, and how does it compare with the big cloud guys, for example?
>>I've looked at it in depth. In fact, I'm working on another TCO study in this arena, but you can find one on Wikibon in which I compared TCO for MySQL HeatWave versus Aurora plus Redshift plus ML plus Glue. I've compared it against GCP's services, Azure's services, and Snowflake with other services. And there's just no comparison. The TCO differences are huge. More importantly, the TCO per performance is huge. We're talking in some cases multiple orders of magnitude, but at least an order of magnitude difference. So discounting isn't gonna help you much at the end of the day. It's only going to lower your cost a little, but it doesn't improve the automation, it doesn't improve the performance, it doesn't improve the time to insight, it doesn't improve all those things that you want out of a database or multiple databases, because you
>>Can't discount yourself to a higher value proposition.
>>So what about, I wonder, Holger, if you could chime in on the developer angle. You've followed that market.
How do these innovations from HeatWave fit in? I think you've used the term developer velocity; I've heard you use that before. Yeah, I mean, look, Oracle owns Java, okay, so it's, you know, the most popular programming language in the world, blah, blah, blah. But does it have the minds and hearts of developers, and where does HeatWave fit into that equation?
>>I think HeatWave is quickly gaining mindshare on the developer side, right? It's not the traditional NoSQL database which grew up there. There's a traditional mistrust of Oracle among developers about what happens to open source when it gets acquired, like in the case of Oracle with Java and with MySQL, right? But we know it's not a good competitive strategy to bank on Oracle screwing up, because it hasn't worked, not on Java and not on MySQL, right? And for developers, once you get to know a technology product and you can do more with it, it becomes kind of like a Swiss Army knife, and you can build more use cases, you can build more powerful applications. That's super, super important, because you don't have to get certified in multiple databases. You're fast at getting things done, you achieve higher developer velocity, and the managers are happy because they don't have to license more things, send you to more trainings, or have more risk of something not being delivered, right?
>>So really we see the suite versus best-of-breed play happening here, which in general was already happening before with Oracle's flagship database versus those of Amazon, as an example, right?
And now the interesting thing is, every step of the way Oracle was always a one-database company, there can be only one, and they're now generally talking about HeatWave, and that's a two-database company with different market spaces but the same value proposition of integrating more things very, very quickly to have a universal database, what they call the converged database, for all the needs of an enterprise to run certain application use cases. And that's what's attractive to developers.
>>It's ironic, isn't it? I mean, you know, the rumor was that TK, Thomas Kurian, left Oracle 'cause he wanted to put Oracle Database on other clouds and other places, and maybe that was the rift. I'm sure there were other things, but Oracle clearly is now trying to expand its TAM, Ron, with HeatWave, into AWS and into Azure. How do you think Oracle's gonna do? You were at CloudWorld. What was the sentiment from customers and the independent analysts? Is this just Oracle trying to screw with the competition, create a little diversion? Or is this serious business for Oracle? What do you think?
>>No, I think it has legs. I think it's definitely, again, a tribute to Oracle's overall ability to differentiate not only MySQL HeatWave but its overall portfolio. And I think the fact that they have the alliance with Azure in place is definitely demonstrating their commitment to meeting the multicloud needs of their customers, as is what we pointed to in terms of the fact that they're now offering MySQL capabilities natively within AWS, and that it can outperform AWS's own offering. And I think this is all demonstrating that Oracle is not letting up; they're not resting on their laurels. Clearly we're living in a multicloud world, so why not make it easier for customers to use cloud databases according to their own specific needs?
And I think, to Holger's point, that definitely aligns with being able to bring on more application developers to leverage these capabilities.
>>I think one important announcement that's related to all this was the JSON relational duality capability, where now it's a lot easier for application developers to use a language that they're very familiar with, JSON, and not have to worry about going into relational databases to store their JSON application coding. So this is, I think, an example of the innovation that's enhancing the overall Oracle portfolio, and certainly all the work with machine learning is paying dividends as well. And as a result, I see Oracle continuing to make the inroads that we pointed to. But I agree with Marc: the short-term discounting is just a stall tactic. It doesn't negate the fact that Oracle is able not only to deliver price-performance differentiators that are dramatic, but also to meet a wide range of needs for customers out there that aren't limited to price-performance considerations.
>>Being able to support multicloud according to customer needs. Being able to reach out to the application developer community and address a very specific challenge that has plagued them for many years now. So, bringing it all together, I see this as just enabling Oracle to ring true with customers. The customers that were there, basically all of them, even though not all of them are going to be saying the same things, were giving positive feedback. And likewise, I think the analyst community is seeing this. It's always refreshing to be able to talk to customers directly, and at Oracle CloudWorld there was a litany of them, and so this is just a difference maker, as is being able to talk to strategic partners. The NVIDIA partnership, I think, is also a testament to Oracle's ongoing ability to make the ecosystem more user-friendly for the customers out there.
>>Yeah, it's interesting. When you get these all-in-one tools, you know, the Swiss Army knife, you expect that it's not able to be best of breed. That's the kind of surprising thing that I'm hearing about HeatWave. I want to talk about Lakehouse, because when I think of lakehouse, I think Databricks, and to my knowledge Databricks hasn't been in the sights of Oracle yet. Maybe they're next. But Oracle claims that MySQL HeatWave Lakehouse is a breakthrough in terms of capacity and performance. Marc, what are your thoughts on that? Can you double-click on Lakehouse and Oracle's claims for things like query performance and data loading? What does it mean for the market? Is Oracle really leading in the lakehouse competitive landscape? What are your thoughts?
>>Well, the name of the game is: what are the problems you're solving for the customer? More importantly, are those problems urgent or important? If they're urgent, customers wanna solve them now. If they're important, they might get around to them. So you look at what they're doing with Lakehouse, or previous to that machine learning, or previous to that automation, or previous to that OLAP with OLTP, and they're merging all this capability together. If you look at Snowflake or Databricks, they're tackling one problem. You look at MySQL HeatWave, they're tackling multiple problems. So when you say, yeah, their queries are much better against the lakehouse, it's in combination with other analytics, in combination with OLTP, and the fact that there are no ETLs. So you're getting all this done in real time. It's doing the query across everything in real time.
>>You're solving multiple user and developer problems, you're increasing their ability to get insight faster, you're getting shorter response times. So yeah, they really are solving urgent problems for customers. And by putting it where the customer lives, this is the brilliance of actually being multicloud.
And I know I'm backing up here a second, but by making it work in AWS and Azure, where people already live, where they already have applications, what they're saying is: we're bringing it to you. You don't have to come to us to get these benefits, this value. Overall, I think it's a brilliant strategy. I give Nipun Agarwal huge, huge kudos for what he's doing there. So yes, what they're doing with the lakehouse is going to put Databricks and Snowflake and everyone else on notice, for that matter.
>>Well, those are the guys, Holger, you and I have talked about this, those are the guys that are doing sort of the best of breed. They're really focused, and they tend to do well, at least out of the gate. Now you've got Oracle's converged philosophy, obviously with Oracle Database. We've seen that; now it's kicking into gear with HeatWave. This whole thing of suites versus best of breed: I mean, in the long term, customers tend to migrate towards suites, but the new shiny toy tends to get the growth. How do you think this is gonna play out in cloud database?
>>Well, it's the never-ending story in software, right? Suite versus best of breed, and so far in the long run suites have always won, right? And sometimes they struggle, because the inherent problem of suites is that you build something larger, it has more complexity, and that means your cycles to get everything working together, to integrate it, test it, roll it out, certify it, whatever it is, take you longer, right? And that's not the case here. The fascinating part of the effort around MySQL HeatWave is that the team is out-executing the previous best-of-breed players while bringing something together. Now, whether they can maintain that pace, that's something to be seen.
But the strategy, like what Marc was saying, of bringing the software to the data is of course interesting and unique, and totally un-Oracle-like in the past, right?
>>Yeah. But it had to be in your database on OCI. And that's an interesting part. The interesting thing on the Lakehouse side is, right, there are three key benefits of a lakehouse. The first one is better reporting and analytics: bring more rich information together. Like, take the case of SiliconANGLE, right? We want to see engagement for this video, we want to know what's happening. That's a mixed transactional and video media use case, right? A typical lakehouse use case. The next one is to build richer applications, transactional applications which have video and these elements in there, which are the engaging ones. And the third one, and that's where I'm a little critical and concerned, is that it's really the base platform for artificial intelligence, right? To run deep learning, to run things automatically, because you have all the data in one place and can query it in one way.
>>And that's where Oracle, and I know that Ron talked about NVIDIA for a moment, doesn't have the strongest story. Nonetheless, the two other main use cases of the lakehouse are very strong. My only concern is that 450 terabytes sounds low; it's an arbitrary limitation. Yeah, it sounds big for a start, and it's the first release, so they can make that bigger. You don't want your lakehouse to be limited to terabyte sizes, or even petabyte sizes, because you want to have the certainty that you can put everything in there that you think might be relevant, without knowing what questions to ask, and then query those questions.
>>Yeah. And you know, in the early days of no schema on write, it just became a mess. But now technology has evolved to allow us to actually get more value out of that data. Data lake, data swamp, it's, you know, now much more logical.
But in a moment I want to come back to how you think the competitors are gonna respond. Are they gonna have to do more of a converged approach, AWS in particular? But before I do, Ron, I want to ask you a question about Autopilot, because I heard Larry Ellison's keynote and he was talking about how most security issues are human errors, and with autonomy and Autonomous Database and things like Autopilot, we take care of that. It's like autonomous vehicles: they're gonna be safer. And I went, well, maybe, maybe someday. So Oracle really tries to emphasize this. Every time you see an announcement from Oracle, they talk about new autonomous capabilities. How legit is it? Do people care? What's new for HeatWave Lakehouse? How much of a differentiator, Ron, do you really think Autopilot is in this cloud database space?
>>Yeah, I think it will definitely enhance the overall proposition. I don't think people are gonna buy Lakehouse exclusively because of Autopilot capabilities, but when they look at the overall picture, I think it will be an added capability bonus to Oracle's benefit. And yeah, I think it's one of these age-old questions: how much do you automate, and what is the balance to strike? And I think we all understand with the autonomous car analogy that there are limitations to being able to use that. However, I think it's a tool that basically every organization out there needs to at least have, or at least evaluate, because it helps with ease of use, and it helps make automation more balanced in terms of being able to test: all right, let's automate this process and see if it works well, then we can go on and switch on Autopilot for other processes.
>>And then that allows, for example, the specialists to spend more time on business use cases versus manual maintenance of the cloud database and so forth. So I think that actually is a legitimate value proposition. I think it's just gonna be on a case-by-case basis. Some organizations are gonna be more aggressive with putting automation throughout their processes and throughout their organization. Others are gonna be more cautious. But it's gonna be, again, something that will help the overall Oracle proposition. It's something that I think will be used with caution by many organizations, but other organizations are gonna say, hey, great, this is something that's really answering a real problem: easing the use of these databases, but also being able to better handle the automation capabilities and benefits that come with it, without having a major screwup happen in the process of transitioning to more automated capabilities.
>>Now, I didn't attend CloudWorld. It's just been too many red-eyes recently, so I passed. But one of the things I like to do at those events is talk to customers, you know, in the spirit of the truth. You have the hallway track, you talk to customers, and they say, hey, here's the good, the bad, and the ugly. So did you guys talk to any MySQL HeatWave customers at CloudWorld? And what did you learn? I don't know, Marc, did you have any luck having some private conversations?
>>Yeah, I had quite a few private conversations. One thing before I get to that: I want to disagree with one point Ron made. I do believe there are customers out there buying the MySQL HeatWave service because of Autopilot.
Because Autopilot is really revolutionary in many ways for the MySQL developer, in that it auto-provisions, it auto-parallel-loads, it auto-places data, it does auto shape prediction. It can tell you which machine learning models are gonna give you your best results. And candidly, I've yet to meet a DBA who didn't wanna give up pedantic tasks that are a pain in the kazoo, which they'd rather not do, as long as it's done right for them. So yes, I do think people are buying it because of Autopilot, and that's based on some of the conversations I had with customers at Oracle CloudWorld.
>>In fact, it was like, yeah, that's great, yeah, we get fantastic performance, but this really makes my life easier. And I've yet to meet a DBA who didn't want to make their life easier. And it does. So yeah, I've talked to a few of them. They were excited. I asked them if they ran into any bugs, and whether there were any difficulties in moving to it, and the answer was no in both cases. It's interesting to note that MySQL is the most popular database on the planet. Well, some will argue that it's neck and neck with SQL Server, but if you add in MariaDB and Percona, which are forks of MySQL, then yeah, far and away it's the most popular. And as a result of that, everybody for the most part has a MySQL database somewhere in their organization. So this is a brilliant situation for anybody going after MySQL, but especially for HeatWave. And the customers I talked to love it. I didn't find anybody complaining about it.
>>And what about the migration? We talked about TCO earlier. Does your TCO analysis include the migration cost, or do you kind of conveniently leave that out, or what?
>>Well, when you look at migration costs, there are different kinds of migration costs. By the way, the worst job in the data center is the data migration manager. Forget it, no other job is as bad as that one. You get no attaboys for doing it right.
And then when you screw up, oh boy. So in real terms, anything that can limit data migration is a good thing. And when you look at the data lake, that limits data migration. So if you're already a MySQL user, this is pure MySQL as far as you're concerned. It's just a simple transition from one to the other. You may wanna make sure nothing broke and all your tables are correct and your schemas are okay, but it's all the same. So it's a simple migration. It's pretty much a non-event, right? When you migrate data from an OLTP to an OLAP, that's an ETL, and that's gonna take time.
>>But you don't have to do that with MySQL HeatWave. So that's gone. When you start talking about machine learning, again, you may have an ETL, you may not, depending on the circumstances, but again, with MySQL HeatWave, you don't, and you don't have duplicate storage. You don't have to copy it from one storage container to another to be able to use it in a different database, which, by the way, ultimately adds much more cost than just the other service. So yeah, I looked at the migration, and again, the users I talked to said it was a non-event. It was literally moving from one physical machine to another. If they had a new version of MySQL running on something else and just wanted to migrate it over, or just hook it up, or just connect it to the data, it worked just fine.
>>Okay, so it sounds like you guys feel, and we've certainly heard this, my colleague David Floyer, the semi-retired David Floyer, was always very high on HeatWave, so I think it's got some real legitimacy here, coming from a standing start. But I wanna talk about the competition and how they're likely to respond. I mean, if you're AWS and HeatWave is now in your cloud, there are some good aspects of that. The database guys might not like that, but the infrastructure guys probably love it.
Hey, more ways to sell, you know, EC2 and Graviton. But the database guys in AWS are going to respond. They're going to say, hey, we've got Redshift, we've got AQUA. What are your thoughts on not only how that's going to resonate with customers, but I'm interested in what you guys think. I never say never about AWS, you know. Are they going to try to build, in your view, a converged OLAP and OLTP database? You know, Snowflake is taking an ecosystem approach. They've added transactional capabilities to the portfolio, so they're not standing still. What do you guys see in the competitive landscape in that regard going forward? Maybe Holger, you could start us off, and anybody else who wants to can chime in. >>Happy to. You mentioned Snowflake last, so we'll start there. I think Snowflake is imitating that strategy, right? That's building out from the original data warehouse in the cloud to really a proposition to have other data available there, because AI is relevant for everybody. Ultimately people keep data in the cloud for ultimately running AI. So you see the same suite-level kind of strategy. It's going to be a little harder because of the original positioning. How much would people know that you're doing other stuff? And as a former developer and manager of developers, I just don't see the speed happening at Snowflake at the moment to become really competitive with Oracle. On the flip side, putting my Oracle hat on for a moment, back to you, Marc and Ron, right? What could Oracle still add? Because the big, big things, right, the traditional chasms in the database world, they have built everything, right? >>So I really scratched my head and gave Nipun a hard time at CloudWorld, saying, like, what could you be building? And he was very conservative: let's get the Lakehouse thing done, it's going to be spring next year, right? And for AWS it's really hard, because the AWS value proposition is these small innovation teams, right?
They build two-pizza teams, which can be fed by two pizzas, not large teams, right? And you need large teams to build these suites with lots of functionality, to make sure they work together. They're consistent, they have the same UX on the administration side, they can be consumed the same way, they have the same API registry. I can't even stop listing where the synergy comes into play with a suite. So it's going to be really, really hard for them to change that. But AWS is super pragmatic. They always pride themselves that they listen to customers. If they learn from customers that the suite is a proposition, I would not be surprised if AWS tries to bring things closer together. >>Yeah. Well, can we talk about multicloud? Again, Oracle is very on Oracle, as you said before, but let's look forward, you know, half a year or a year. What do you think about Oracle's moves in multicloud in terms of what kind of penetration they're going to have in the marketplace? You saw a lot of presentations at CloudWorld. You know, we've looked pretty closely at the Microsoft Azure deal. I think that's really interesting. I've called it a little bit of early days of a super cloud. What impact do you think this is going to have on the marketplace? And think about it within Oracle's customer base; I have no doubt they'll do great there. But what about beyond its existing install base? What do you guys think? >>Ron, do you want to jump on that? Go ahead. Go ahead, Ron. >>That's an excellent point. I think it aligns with what we've been talking about in terms of Lakehouse. I think Lakehouse will enable Oracle to pull more customers, more MySQL customers, onto the Oracle platforms. And I think we're seeing all the signs pointing toward Oracle being able to make more inroads into the overall market.
And that includes garnering customers from the leaders. In other words, because they are coming in as an innovator, an alternative to the AWS proposition and the Google Cloud proposition, they have less to lose, and as a result they can really drive the multi-cloud messaging to resonate with not only their existing customers but also, to that question Dave's posing, actually garner customers onto their platform. And that includes naturally MySQL but also OCI and so forth. So that's how I'm seeing this playing out. I think, again, Oracle's reporting is indicating that, and I think what we saw at Oracle CloudWorld is definitely validating the idea that Oracle can make more waves in the overall market in this regard. >>You know, I've floated this idea of super cloud. It's kind of tongue in cheek, but I think there is some merit to it in terms of building on top of hyperscale infrastructure and abstracting some of that complexity. And one of the things that I'm most interested in is industry clouds and Oracle's acquisition of Cerner. I was struck by Larry Ellison's keynote. It was like, I don't know, an hour and a half, and an hour and 15 minutes was focused on healthcare transformation. Well, >>So vertical, >>Right? And so, yeah, Oracle's got some industry chops, and then you think about what they're building with not only OCI, but then you've got MySQL, which you can now run in dedicated regions. You've got ADB on Exadata Cloud@Customer, which you can put on-prem in your data center, and you look at what the other hyperscalers are doing. I say other hyperscalers; I've always said Oracle's not really a hyperscaler, but they've got a cloud, so they're in the game. But you can't get, you know, BigQuery on-prem. You look at Outposts, it's very limited in terms of the database support, and again, that will evolve.
But now Oracle's announced Alloy, where you can white-label their cloud. So I'm interested in what you guys think about these moves, especially the industry cloud. We see, you know, Walmart is doing sort of their own cloud. You've got Goldman Sachs doing a cloud. What do you guys think about that, and what role does Oracle play? Any thoughts? >>Yeah, let me jump on that for a moment. Now, especially with MySQL, by making that available in multiple clouds, what they're doing follows the philosophy they've had in the past with Cloud@Customer: taking the application and the data and putting it where the customer lives. If it's on premise, it's on premise. If it's in the cloud, it's in the cloud. By making MySQL HeatWave essentially plug-compatible with any other MySQL as far as your database is concerned, and then giving you that integration with OLAP and ML and data lake and everything else, what you've got is a compelling offering. You're making it easier for the customer to use. So the way I look at the difference between MySQL and the Oracle Database, MySQL is going to capture more market share for them. >>You're not going to find a lot of new users for the Oracle Database. Yeah, there are always going to be new users, don't get me wrong, but it's not going to be huge growth. Whereas MySQL HeatWave is probably going to be a major growth engine for Oracle going forward. Not just in their own cloud, but in AWS and in Azure, and on premise over time; eventually it'll get there. It's not there now, but it will be. They're doing the right thing on that basis. When you talk about multicloud, they're taking the services and making them available where the customer wants them, not forcing them to go where you want them, if that makes sense. And as far as where they're going in the future, I think they're going to take a page out of what they've done with the Oracle Database.
They'll add things like JSON and XML and time series and spatial. Over time they'll make it a complete converged database, like they did with the Oracle Database. The difference being the Oracle Database will scale bigger and will have more transactions and be somewhat faster. And MySQL will be for anyone who's not on the Oracle Database. They're not stupid, that's for sure. >>They've done JSON already, right? But I give you that they could add graph and time series to HeatWave, right. Yeah, that's something, absolutely right. >>That's sort of a logical move, right? >>Right. But let's not kid ourselves, right? I mean, it has worked in Oracle's favor, right? 10x, 20x the amount of R&D which is in the MySQL space has been poured into trying to snatch workloads away from Oracle, starting with IBM 30 years ago, and 20 years ago Microsoft, and it didn't work, right? Database applications are extremely sticky. When they run, you don't want to touch them and move them, right? So that doesn't mean that HeatWave is not an attractive offering, but it will be net new things, right? And what works in MySQL HeatWave's favor a little bit is that it's not the massive enterprise applications, where you might be only running 30% on Oracle, but the connections and the interfaces into that are like 70, 80% of your enterprise. >>You take it out and it's like the spaghetti ball, where you say, ah, no, I really don't want to do all that. Right? You don't have that massive part with the MySQL HeatWave kind of databases, which are smaller and more tactical in comparison. But still, I don't see them taking so much share. They will be growing because of an attractive value proposition. Quickly on the multi-cloud, right? I think it's not really multi-cloud if you give people the chance to run your offering on different clouds, right? You can run it there.
The multi-cloud advantage is when the uber offering comes out, which allows you to do things across those installations, right? I can migrate data, I can query data across clouds, something like Google has done with BigQuery Omni, I can run predictive models or even make AI models in different places and distribute them, right? And Oracle is paving the road for that by being available on these clouds. But the multi-cloud capability of a database which knows it's running on different clouds, that is still yet to be built there. >>Yeah. And >>The problem with that >>That's the super cloud concept that I floated, and I've always said Snowflake, with a single global instance, is sort of, you know, headed in that direction and maybe has a lead. What's the issue with that, Marc? >>Yeah, the problem with that version of multi-cloud is that clouds charge egress fees. As long as they charge egress fees to move data between clouds, it's going to be very difficult to do a real multi-cloud implementation. Even Snowflake, which runs multi-cloud, has to pass on the egress fees to their customer when data moves between clouds. And that's really expensive. I mean, there is one customer I talked to who is beta testing MySQL HeatWave on AWS for them. The only reason they didn't want to do that until it was running on AWS is that the egress fees were so great to move it to OCI that they couldn't afford it. Yeah. Egress fees are the big issue, but >>But Marc, the point might be you might want to run a remote query and only get the result set back, right, which is much tinier, which has been the answer before for low latency between the clouds, a problem which we sometimes still have but mostly don't have. Right? And I think in general, with egress fees coming down based on Oracle's general egress fee move, it's very hard to justify those, right? But it's not about moving data as a multi-cloud high-value use case.
It's about doing intelligent things with that data, right? Putting it into other places, replicating it, and, I'm saying the same thing you said before, running remote queries on that, analyzing it, running AI on it, running AI models on that. That's the interesting thing. Cross-administered in the same way. Taking things out, making sure compliance happens. Making sure that when Ron says I don't want to be American anymore, I want to be in the European cloud, it gets migrated, right? So those are the interesting high-value use cases, which are really, really hard for enterprises to have developers program by hand, and which they would love to have out of the box, and that's the innovation yet to come, which we have yet to see. But the first step to get there is that your software runs in multiple clouds, and that's what Oracle's doing so well with MySQL. >>Guys. Amazing. >>Go ahead. Yeah. >>Yeah. >>For example, >>Amazing amount of data knowledge and brain power in this market. Guys, I really want to thank you for coming on theCUBE. Ron, Holger, Marc, always a pleasure to have you on. Really appreciate your time. >>Well all the last names we're very happy for Romanic last and moderator. Thanks, Dave, for moderating us. >>All right, we'll see you guys around. Safe travels to all, and thank you for watching this power panel, The Truth About MySQL HeatWave, on theCUBE, your leader in enterprise and emerging tech coverage.

Published Date : Nov 1 2022



Analyst Power Panel: Future of Database Platforms

(upbeat music) >> Once a staid and boring business dominated by IBM, Oracle, and at the time newcomer Microsoft, along with a handful of wannabes, the database business has exploded in the past decade and has become a staple of financial excellence, customer experience, analytic advantage, competitive strategy, growth initiatives, visualizations, not to mention compliance, security, privacy and dozens of other important use cases and initiatives. And on the vendor's side of the house, we've seen the rapid ascendancy of cloud databases. Most notably from Snowflake, whose massive raises leading up to its IPO in late 2020 sparked a spate of interest and VC investment in the separation of compute and storage and all that elastic resource stuff in the cloud. The company joined AWS, Azure and Google to popularize cloud databases, which have become a linchpin of competitive strategies for technology suppliers. And if I get you to put your data in my database and in my cloud, and I keep innovating, I'm going to build a moat and achieve a hugely attractive lifetime customer value in a really amazing marginal economics dynamic that is going to fund my future. And I'll be able to sell other adjacent services, not just compute and storage, but machine learning and inference and training and all kinds of stuff, dozens of lucrative cloud offerings. Meanwhile, the database leader, Oracle, has invested massive amounts of money to maintain its lead. It's building on its position as the king of mission-critical workloads and making typical Oracle-like claims against the competition. Most recently, just yesterday, with another announcement around MySQL HeatWave, an extension of MySQL that is compatible with on-premises MySQL and is setting new standards in price performance. We're seeing a dramatic divergence in strategies across the database spectrum. On the far left, we see Amazon with more than a dozen database offerings, each with its own API and primitives.
AWS is taking a right tool for the right job approach, often building on open source platforms and creating services that it offers to customers to solve very specific problems for developers. And on the other side of the line, we see Oracle, which is taking the Swiss Army Knife approach, converging database functionality, enabling analytic and transactional workloads to run in the same data store, eliminating the need to ETL, at the same time adding capabilities into its platform like automation and machine learning. Welcome to this database Power Panel. My name is Dave Vellante, and I'm so excited to bring together some of the most respected industry analysts in the community. Today we're going to assess what's happening in the market. We're going to dig into the competitive landscape and explore the future of database and database platforms and decode what it means to customers. Let me take a moment to welcome our guest analysts today. Matt Kimball is a vice president and principal analyst at Moor Insights and Strategy. Matt knows products, he knows industry, he's got real-world IT expertise, and he's got all the angles, 25-plus years of experience and all kinds of great background. Matt, welcome. Thanks very much for coming on theCUBE. Holgar Mueller, friend of theCUBE, vice president and principal analyst at Constellation Research, has in-depth knowledge on applications and application development, and knows developers. He's worked at SAP and Oracle. And then Bob Evans is Chief Content Officer and co-founder of the Acceleration Economy, and founder and principal of Cloud Wars. He covers all kinds of industry topics and has great insights. He's got awesome videos, these three-minute hits. If you haven't seen them, check them out. He knows cloud companies, and his Cloud Wars minutes are fantastic. And then of course, Marc Staimer is the founder of Dragon Slayer Research, a frequent contributor and guest analyst at Wikibon.
He's got wide-ranging knowledge across IT products, knows technology really well, can go deep. And then of course, Ron Westfall, Senior Analyst and Research Director at Futurum Research, great all-around product trends knowledge. He can take, you know, technical dives and really understands competitive angles, knows Redshift, Snowflake, and many others. Gents, thanks so much for taking the time to join us on theCUBE today. It's great to have you on, good to see you. >> Good to be here, thanks for having us. >> Thanks, Dave. >> All right, let's start with an around the horn, and briefly, if each of you would describe, you know, anything I missed in your areas of expertise, and then answer the following question: how would you describe the state of the database, the state of the platform market today? Matt Kimball, please start. >> Oh, I hate going first, but that's okay. How would I describe the world today? In one sentence, I would say, I'm glad I'm not in IT anymore, right? So, you know, it is a complex and dangerous world out there. And I don't envy the IT folks who have to support, you know, these modernization and transformation efforts that are going on within the enterprise. It used to be, you mentioned it, Dave, you would argue about IBM versus Oracle versus this newcomer in the database space called Microsoft. And don't forget Sybase back in the day, but you know, now it's not just, which SQL vendor am I going to go with? It's all of these different, divergent data types that have to be taken, they have to be merged together, synthesized. And somehow I have to do that cleanly and use this to drive strategic decisions for my business. That is not easy. So, you know, you have to look at it from the perspective of the business user. It's great for them because as a DevOps person, or as an analyst, I have so much flexibility and I have this thing called the cloud now where I can go get services immediately.
As an IT person or a DBA, I am calling up prevention hotlines 24 hours a day, because I don't know how I'm going to be able to support the business. And as an Oracle or a Microsoft or some of the cloud providers and cloud databases out there, I'm licking my chops because, you know, my market is expanding and expanding every day. >> Great, thank you for that, Matt. Holgar, how do you see the world these days? You always have a good perspective on things, share with us. >> Well, I think it's the best time to be in IT, I'm not sure what Matt is talking about. (laughing) It's easier than ever, right? The direction is going to cloud. Kubernetes has won, Google has the best AI for now, right? So things are easier than ever before. You made commitments for five-plus years on hardware, networking and so on, on premise, and I got gray hair worrying it was the wrong decision. No, just kidding. But you kind of need both sides, just to be controversial, to make it interesting, right. So yeah, no, I think the interesting thing is specifically with databases, right? We have this big suite versus best of breed, right? Obviously innovation, like you mentioned with Snowflake and others happening in the cloud, the cloud vendors server, where to save of their databases. And then we have one of the few survivors of the old guard, as Evans likes to call them, which is Oracle, who's doing well with their traditional database. And now, which is really interesting and remarkable, because with Oracle it was always the power of one: have one database, add more to it, make it what I call the universal database. And now this new HeatWave offering is coming on the MySQL open source side. So they're getting the second (indistinct) right? So it's interesting that older players, traditional players who still are in the market, are diversifying their offerings. Something we don't see so much from the traditional tools from Oracle on the Microsoft side or the IBM side these days.
>> Great, thank you Holgar. Bob Evans, you've covered this business for a while. You've worked at, you know, a number of different outlets and companies and you cover the competition, how do you see things? >> Dave, you know, the other angle to look at this from is from the customer side, right? You got now CEOs who are any sort of business across all sorts of industries, and they understand that their future success is going to be dependent on their ability to become a digital company, to understand data, to use it the right way. So as you outline Dave, I think in your intro there, it is a fantastic time to be in the database business. And I think we've got a lot of new buyers and influencers coming in. They don't know all this history about IBM and Microsoft and Oracle and you know, whoever else. So I think they're going to take a long, hard look, Dave, at some of these results and who is able to help these companies not serve up the best technology, but who's going to be able to help their business move into the digital future. So it's a fascinating time now from every perspective. >> Great points, Bob. I mean, digital transformation has gone from buzzword to imperative. Mr. Staimer, how do you see things? >> I see things a little bit differently than my peers here in that I see the database market being segmented. There's all the different kinds of databases that people are looking at for different kinds of data, and then there is databases in the cloud. And so database as cloud service, I view very differently than databases because the traditional way of implementing a database is changing and it's changing rapidly. So one of the premises that you stated earlier on was that you viewed Oracle as a database company. I don't view Oracle as a database company anymore. 
I view Oracle as a cloud company that happens to have a significant expertise and specialty in databases, and they still sell database software in the traditional way, but ultimately they're a cloud company. So database cloud services from my point of view is a very distinct market from databases. >> Okay, well, you gave us some good meat on the bone to talk about that. Last but not least-- >> Dave did Marc, just say Oracle's a cloud company? >> Yeah. (laughing) Take away the database, it would be interesting to have that discussion, but let's let Ron jump in here. Ron, give us your take. >> That's a great segue. I think it's truly the era of the cloud database, that's something that's rising. And the key trends that come with it include for example, elastic scaling. That is the ability to scale on demand, to right size workloads according to customer requirements. And also I think it's going to increase the prioritization for high availability. That is the player who can provide the highest availability is going to have, I think, a great deal of success in this emerging market. And also I anticipate that there will be more consolidation across platforms in order to enable cost savings for customers, and that's something that's always going to be important. And I think we'll see more of that over the horizon. And then finally security, security will be more important than ever. We've seen a spike (indistinct), we certainly have seen geopolitical originated cybersecurity concerns. And as a result, I see database security becoming all the more important. >> Great, thank you. Okay, let me share some data with you guys. I'm going to throw this at you and see what you think. We have this awesome data partner called Enterprise Technology Research, ETR. They do these quarterly surveys and each period with dozens of industry segments, they track clients spending, customer spending. 
And this is the database, data warehouse sector, okay, so it's taxonomy, so it's not perfect, but it's a big kind of chunk. They essentially ask customers, within a category and by specific vendor, are you spending more or less on the platform? And then they subtract the lesses from the mores and they derive a metric called net score. It's like NPS, it's a measure of spending velocity. It's more complicated and granular than that, but that's the basis and that's the vertical axis. The horizontal axis is what they call market share. It's not like IDC market share, it's just pervasiveness in the data set. And so there are a couple of things that stand out here that we can use as reference points. The first is the momentum of Snowflake. They've been off the charts for over two years now. Anything above that dotted red line, that 40%, is considered by ETR to be highly elevated, and Snowflake's even way above that. And I think it's probably not sustainable. We're going to see in the next April survey, next month from those guys, when it comes out. And then you see AWS and Microsoft; they're really pervasive on the horizontal axis and highly elevated. Google falls behind them. And then you've got a number of well-funded players. You've got Cockroach Labs, Mongo, Redis, MariaDB, which of course is a fork of MySQL, started almost as a protest against Oracle when they acquired Sun and got MySQL, and you can see a number of others. Now Oracle, who's the leading database player, despite what Marc Staimer says, we know, (laughs) and they're a cloud player (laughing) who happens to be a leading database player. They dominate in the mission-critical space, we know that they're the king of that sector, but you can see here that they're kind of legacy, right? They've been around a long time, they've got a big install base. So they don't have the spending momentum on the vertical axis.
Now remember, this doesn't capture spending levels, so that understates Oracle, but nonetheless. So it's not a complete picture; SAP, for instance, is not in here, no HANA. I think people are actually buying it, but it doesn't show up here, (laughs) but it does give an indication of momentum and presence. So Bob Evans, I'm going to start with you. You've commented on many of these companies, you know, what does this data tell you? >> Yeah, you know, Dave, I think all these compilations of things like that are interesting, and the folks at ETR do some good work, but I think as you said, it's a snapshot, sort of a two-dimensional thing of a rapidly changing, three-dimensional world. You know, the incidence at which some of these companies are mentioned versus the volume that happens. I think it's, you know, with Oracle, and I'm not going to declare my religious affiliation, either as cloud company or database company, you know, they're all of those things and more, and I think some of our old language of how we classify companies is just not relevant anymore. But I want to add something in here too: the autonomous database from Oracle, nobody else has done that. So either Oracle is crazy, they've tried out a technology that nobody other than them is interested in, or they're onto something that nobody else can match. So to me, Dave, within Oracle, trying to identify how they're doing there, I would watch autonomous database growth too, because right, it's either going to be a big plan and it breaks through, or it's going to be caught behind. And the Snowflake phenomenon, as you mentioned, that is a rare, rare bird who comes up and can grow 100% at a billion dollar revenue level like that. So now they've had a chance to come in, scare the crap out of everybody, rock the market with something totally new, the data cloud.
Will the bigger companies be able to catch up and offer a compelling alternative, or is Snowflake going to continue to be this outlier. It's a fascinating time. >> Really, interesting points there. Holgar, I want to ask you, I mean, I've talked to certainly I'm sure you guys have too, the founders of Snowflake that came out of Oracle and they actually, they don't apologize. They say, "Hey, we're not going to do all that complicated stuff that Oracle does, we were trying to keep it real simple." But at the same time, you know, they don't do sophisticated workload management. They don't do complex joins. They're kind of relying on the ecosystems. So when you look at the data like this and the various momentums, and we talked about the diverging strategies, what does this say to you? >> Well, it is a great point. And I think Snowflake is an example of how the cloud can turbo charge a well understood concept, in this case, the data warehouse, right? You move that and you put it on steroids and you see like for some players who've been big in data warehouse, like Teradata, as an example, here in San Diego, what could have been for them right in that part. The interesting thing, the problem though is the cloud hides a lot of complexity too, which you can scale really well as you attract lots of customers to go there. And you don't have to build things like what Bob said, right? One of the fascinating things, right, nobody's answering Oracle on the autonomous database. I don't think it's that they cannot, they just have different priorities or the database is not such a priority. I would dare to say that it's for IBM and Microsoft right now at the moment. And the cloud vendors, you just hide that right through scripts and through scale because you support thousands of customers and you can deal with a little more complexity, right? It's not against them. Whereas if you have to run it yourself, very different story, right? 
You want to have the autonomous parts, you want to have the powerful tools to do things. >> Thank you. And so Matt, I want to go to you, you've said up front, you know, it's just complicated if you're in IT, it's a complicated situation and you've been on the customer side. And if you're a buyer, it's obviously, it's like Holgar said, "Cloud's supposed to make this stuff easier, but the simpler it gets the more complicated it gets." So where do you place your bets? Or I guess more importantly, how do you decide where to place your bets? >> Yeah, it's a good question. And to what Bob and Holgar said, you know, around autonomous database, I think, you know, part of, as I, you know, play kind of armchair psychologist, if you will, corporate psychologist, I look at what Oracle is doing and, you know, databases are where they've made their mark and it's kind of, that's their strong position, right? So it makes sense if you're making an entry into this cloud and you really want to kind of build momentum, you go with what you're good at, right? So that's kind of the strength of Oracle. Let's put a lot of focus on that. They do a lot more than database, don't get me wrong, but you know, I'm going to shore up my strength and then kind of pivot from there. With regards to, you know, what IT looks at and what I would look at you know as an IT director or somebody who is, you know, trying to consume services from these different cloud providers. First and foremost, I go with what I know, right? Let's not forget IT is a conservative group. And when we look at, you know, all the different permutations of database types out there, SQL, NoSQL, all the different types of NoSQL, those are largely being deployed by business users that are looking for agility or businesses that are looking for agility. You know, the reason why MongoDB is so popular is because of DevOps, right? It's a great platform to develop on and that's where it kind of gained its traction. 
But as an IT person, I want to go with what I know, where my muscle memory is, and that's my first position. And so as I evaluate different cloud service providers and cloud databases, I look for, you know, what I know and what I've invested in and where my muscle memory is. Is there enough there and do I have enough belief that that company or that service is going to be able to take me to, you know, where I see my organization in five years from a data management perspective, from a business perspective, are they going to be there? And if they are, then I'm a little bit more willing to make that investment, but it is, you know, if I'm kind of going in this blind or if I'm cloud native, you know, that's where the Snowflakes of the world become very attractive to me. >> Thank you. So Marc, I asked Andy Jassy in theCube one time, you have all these, you know, data stores and different APIs and primitives and you know, very granular, what's the strategy there? And he said, "Hey, that allows us as the market changes, it allows us to be more flexible. If we start building abstraction layers, it's harder for us." I think also it was not a good time to market advantage, but let me ask you, I described earlier on that spectrum from AWS to Oracle. We just saw yesterday, Oracle announced, I think the third major enhancement in like 15 months to MySQL HeatWave, what do you make of that announcement? How do you think it impacts the competitive landscape, particularly as it relates to, you know, converging transaction and analytics, eliminating ETL, I know you have some thoughts on this. >> So let me back up for a second and defend my cloud statement about Oracle for a moment. (laughing) AWS did a great job in developing the cloud market in general and everything in the cloud market. I mean, I give them lots of kudos on that. And a lot of what they did is they took open source software and they rent it to people who use their cloud. 
So I give 'em lots of credit, they dominate the market. Oracle was late to the cloud market. In fact, they actually poo-pooed it initially, if you look at some of Larry Ellison's statements, they said, "Oh, it's never going to take off." And then they did a 180 turn, and they said, "Oh, we're going to embrace the cloud." And they really have, but when you're late to a market, you've got to be compelling. And this ties into the announcement yesterday, but let's deal with this compelling. To be compelling from a user point of view, you got to be twice as fast, offer twice as much functionality, at half the cost. That's generally what compelling is if you're going to capture market share from the leaders who established the market. It's very difficult to capture market share in a new market for yourself. And you're right. I mean, Bob was correct on this and Holgar and Matt in which you look at Oracle, and they did a great job of leveraging their database to move into this market, give 'em lots of kudos for that too. But yesterday they announced, as you said, the third innovation release and the pace is just amazing of what they're doing on these releases on HeatWave that ties together initially MySQL with an integrated built-in analytics engine, so a data warehouse built in. And then they added automation with autopilot, and now they've added machine learning to it, and it's all in the same service. It's not something you can buy and put on your premises unless you buy their Cloud@Customer stuff. But generally it's a cloud offering, so it's compellingly better as far as the integration. You don't buy multiple services, you buy one and it's lower cost than any of the other services, but more importantly, it's faster, which again, give 'em credit for, they have more integration of a product. They can tie things together in a way that nobody else does. There's no additional services, ETL services like Glue in AWS. 
So from that perspective, they're getting better performance, fewer services, lower cost. Hmm, they're aiming at the compelling side again. So from a customer point of view it's compelling. Matt, you wanted to say something there. >> Yeah, I want to kind of, on what you just said there Marc, and this is something I've found really interesting, you know. The traditional way that you look at software and, you know, purchasing software and IT is, you look at either best of breed solutions and you have to work on the backend to integrate them all and make them all work well. And generally, you know, the big hit against the, you know, we have one integrated offering is that, you lose capability or you lose depth of features, right. And to what you were saying, you know, that's the thing I found interesting about what Oracle is doing is they're building in depth as they kind of, you know, build that service. It's not like you're losing a lot of capabilities, because you're going to one integrated service versus having to use A versus B versus C, and I love that idea. >> You're right. Yeah, not only you're not losing, but you're gaining functionality that you can't get by integrating a lot of these. I mean, I can take Snowflake and integrate it in with machine learning, but I also have to integrate in with a transactional database. So I've got to have connectors between all of this, which means I'm adding time. And what it comes down to at the end of the day is expertise, effort, time, and cost. And so what I see the difference from the Oracle announcements is they're aiming at reducing all of that by increasing performance as well. Correct me if I'm wrong on that but that's what I saw at the announcement yesterday. >> You know, Marc, one thing though Marc, it's funny you say that because I started out saying, you know, I'm glad I'm not 19 anymore. 
And the reason is because of exactly what you said, it's almost like there's a pseudo level of witchcraft that's required to support the modern data environment right in the enterprise. And I need simpler, faster, better. That's what I need, you know, I am no longer wearing pocket protectors. I have turned from, you know, break, fix kind of person, to you know, business consultant. And I need that point and click simplicity, but I can't sacrifice, you know, a depth of features of functionality on the backend as I play that consultancy role. >> So, Ron, I want to bring in Ron, you know, it's funny. So Matt, you mentioned Mongo, I often say, if Oracle mentions you, you're on the map. We saw them yesterday Ron, (laughing) they hammered Redshift's AutoML, they took swipes at Snowflake, a little bit of BigQuery. What were your thoughts on that? Do you agree with what these guys are saying in terms of HeatWave's capabilities? >> Yes, Dave, I think that's an excellent question. And fundamentally I do agree. And the question is why, and I think it's important to know that all of the Oracle data is backed by the fact that they're using benchmarks. For example, all of the ML and all of the TPC benchmarks, including all the scripts, all the configs and all the detail are posted on GitHub. So anybody can look at these results and they're fully transparent and replicate them themselves. If you don't agree with this data, then by all means challenge it. And we have not really seen that in all of the new updates in HeatWave over the last 15 months. And as a result, when it comes to these, you know, fundamentals in looking at the competitive landscape, which I think gives validity to outcomes such as Oracle being able to deliver 4.8 times better price performance than Redshift. As well as for example, 14.4 times better price performance than Snowflake, and also 12.9 times better price performance than BigQuery. And so that is, you know, looking at the quantitative side of things. 
But again, I think, you know, to Marc's point and to Matt's point, there are also qualitative aspects that clearly differentiate the Oracle proposition, from my perspective. For example now the MySQL HeatWave ML capabilities are native, they're built in, and they also support things such as completion criteria. And as a result, that enables them to show that hey, when you're using Redshift ML for example, you're having to also use their SageMaker tool and it's running on a meter. And so, you know, nobody really wants to be running on a meter when, you know, executing these incredibly complex tasks. And likewise, when it comes to Snowflake, they have to use a third party capability. They don't have the built in, it's not native. So the user, to the point that he's having to spend more time and it increases complexity to use auto ML capabilities across the Snowflake platform. And also, I think it also applies to other important features such as data sampling, for example, with the HeatWave ML, it's intelligent sampling that's being implemented. Whereas in contrast, we're seeing Redshift using random sampling. And again, Snowflake, you're having to use a third party library in order to achieve the same capabilities. So I think the differentiation is crystal clear. I think it definitely is refreshing. It's showing that this is where true value can be assigned. And if you don't agree with it, by all means challenge the data. >> Yeah, I want to come to the benchmarks in a minute. By the way, you know, the gentleman who's the Oracle's architect, he did a great job on the call yesterday explaining what you have to do. I thought that was quite impressive. But Bob, I know you follow the financials pretty closely and on the earnings call earlier this month, Ellison said that, "We're going to see HeatWave on AWS." And the skeptic in me said, oh, they must not be getting people to come to OCI. 
And then they, you remember this chart they showed yesterday that showed the growth of HeatWave on OCI. But of course there was no data on there, it was just sort of, you know, lines up and to the right. So what do you guys think of that? (Marc laughs) Does it signal Bob, desperation by Oracle that they can't get traction on OCI, or is it just really a smart TAM expansion move? What do you think? >> Yeah, Dave, that's a great question. You know, along the way there, and you know, just inside of that was something that Ellison said on the earnings call that spoke to a different sort of philosophy or mindset, almost Marc, where he said, "We're going to make this multicloud," right? With a lot of their other cloud stuff, if you wanted to use any of Oracle's cloud software, you had to use Oracle's infrastructure, OCI, there was no other way out of it. But this one, but I thought it was a classic Ellison line. He said, "Well, we're making this available on AWS. We're making this available, you know, on Snowflake because we're going after those users. And once they see what can be done here." So he's looking at it, I guess you could say, it's a concession to customers because they want multi-cloud. The other way to look at it, it's a hunting expedition and it's one of those uniquely I think Oracle ways. He said up front, right, he doesn't say, "Well, there's a big market, there's a lot for everybody, we just want our slice." He said, "No, we are going after Amazon, we're going after Redshift, we're going after Aurora. We're going after these users of Snowflake and so on." And I think it's really fairly refreshing these days to hear somebody say that, because now if I'm a buyer, I can look at that and say, you know, to Marc's point, "Do they measure up, do they crack that threshold ceiling? Or is this just going to be more pain than a few dollars savings is worth?" But you look at those numbers that Ron pointed out and that we all saw in that chart. 
I've never seen Dave, anything like that. In a substantive market, a new player coming in here, and being able to establish differences that are four, seven, eight, 10, 12 times better than competition. And as new buyers look at that, they're going to say, "What the hell are we doing paying, you know, five times more to get a poor result? What's going on here?" So I think this is going to rattle people and force a harder, closer look at what these alternatives are. >> I wonder if the guy, thank you. Let's just skip ahead of the benchmarks guys, bring up the next slide, let's skip ahead a little bit here, which talks to the benchmarks and the benchmarking if we can. You know, David Floyer, the sort of semiretired, you know, Wikibon analyst said, "Dave, this is going to force Amazon and others, Snowflake," he said, "To rethink actually how they architect databases." And this is kind of a compilation of some of the data that they shared. They went after Redshift mostly, (laughs) but also, you know, as I say, Snowflake, BigQuery. And, like I said, you can always tell which companies are doing well, 'cause Oracle will come after you, but they're on the radar here. (laughing) Holgar should we take this stuff seriously? I mean, or is it, you know, a grain of salt? What are your thoughts here? >> I think you have to take it seriously. I mean, that's a great question, great point on that. Because like Ron said, if there's a flaw in a benchmark, we know this database traditionally, right? If anybody came up with that, everybody will be, "Oh, you put the wrong benchmark, it wasn't audited right, let us do it again," and so on. We don't see this happening, right? So kudos to Oracle to be aggressive, differentiated, and seeming to have impeccable benchmarks. But what we really see, I think in my view is that the classic and we can talk about this in 100 years, right? Is the suite versus best of breed, right? And the key question of the suite, because the suite's always slower, right? 
No matter at which level of the stack, you have the suite, then the best of breed that will come up with something new, use a cloud, put the data warehouse on steroids and so on. The important thing is that you have to assess as a buyer what is the speed of my suite vendor. And that's what you guys mentioned before as well, right? Marc said that and so on, "Like, this is a third release in one year of the HeatWave team, right?" So everybody in the database open source Marc, and there's so many MySQL spinoffs to certain point is put on shine on the speed of (indistinct) team, putting out fundamental changes. And the beauty of that is right, is so inherent to the Oracle value proposition. Larry's vision of building the IBM of the 21st century, right from the silicon, from the chip all the way across the seven stacks to the click of the user. And that's what makes the database what Bob was saying, "Tied to the OCI infrastructure," because designed for that, it runs uniquely better for that, that's why we see the cross connect to Microsoft. HeatWave so it's different, right? Because HeatWave runs on cheap hardware, right? Which is the bread and butter x86 scale of any cloud provider, right? So Oracle probably needs it to scale OCI in a different category, not the expensive side, but also allow us to do what we said before, the multicloud capability, which ultimately CIOs really want, because data gravity is real, you want to operate where that is. If you have a fast, innovative offering, which gives you more functionality and the R and D speed is really impressive for the space, puts away bad results, then it's a good bet to look at. >> Yeah, so you're saying, that suite versus best of breed. I just want to sort of play back then Marc a comment. That suite versus best of breed, there's always been that trade off. If I understand you Holgar you're saying that somehow Oracle has magically cut through that trade off and they're giving you the best of both. 
>> It's the developing velocity, right? The provision of important features, which matter to buyers of the suite vendor, eclipses the best of breed vendor, then the best of breed vendor is in the hell of a potential job. >> Yeah, go ahead Marc. >> Yeah and I want to add on what Holgar just said there. I mean the worst job in the data center is data movement, moving the data sucks. I don't care who you are, nobody likes it. You never get any kudos for doing it well, and you always get the ah craps, when things go wrong. So it's in- >> In the data center Marc all the time across data centers, across cloud. That's where the bleeding comes. >> It's right, you get beat up all the time. So nobody likes to move data, ever. So what you're looking at with what they announce with HeatWave and what I love about HeatWave is it doesn't matter when you started with it, you get all the additional features they announce it's part of the service, all the time. But they don't have to move any of the data. You want to analyze the data that's in your transactional, MySQL database, it's there. You want to do machine learning models, it's there, there's no data movement. The data movement is the key thing, and they just eliminate that, in so many ways. And the other thing I wanted to talk about is on the benchmarks. As great as those benchmarks are, they're really conservative 'cause they're underestimating the cost of that data movement. The ETLs, the other services, everything's left out. It's just comparing HeatWave, MySQL cloud service with HeatWave versus Redshift, not Redshift and Aurora and Glue, Redshift and Redshift ML and SageMaker, it's just Redshift. >> Yeah, so what you're saying is what Oracle's doing is saying, "Okay, we're going to run MySQL HeatWave benchmarks on analytics against Redshift, and then we're going to run 'em in transaction against Aurora." >> Right. 
>> But if you really had to look at what you would have to do with the ETL, you'd have to buy two different data stores and all the infrastructure around that, and that goes away so. >> Due to the nature of the competition, they're running narrow best of breed benchmarks. There is no suite level benchmark (Dave laughs) because they created something new. >> Well that's you're the earlier point they're beating best of breed with a suite. So that's, I guess to Floyer's earlier point, "That's going to shake things up." But I want to come back to Bob Evans, 'cause I want to tap your Cloud Wars mojo before we wrap. And line up the horses, you got AWS, you got Microsoft, Google and Oracle. Now they all own their own cloud. Snowflake, Mongo, Couchbase, Redis, Cockroach by the way they're all doing very well. They run in the cloud as do many others. I think you guys all saw the Andreessen, you know, commentary from Sarah Wang and company, to talk about the cost of goods sold impact of cloud. So owning your own cloud has to be an advantage because other guys like Snowflake have to pay cloud vendors and negotiate down versus having the whole enchilada, Safra Catz's dream. Bob, how do you think this is going to impact the market long term? >> Well, Dave, that's a great question about, you know, how this is all going to play out. If I could mention three things, one, Frank Slootman has done a fantastic job with Snowflake. Really good company before he got there, but since he's been there, the growth mindset, the discipline, the rigor and the phenomenon of what Snowflake has done has forced all these bigger companies to really accelerate what they're doing. And again, it's an example of how this intense competition makes all the different cloud vendors better and it provides enormous value to customers. Second thing I wanted to mention here was look at the Adam Selipsky effect at AWS, took over in the middle of May, and in Q2, Q3, Q4, AWS's growth rate accelerated. 
And in each of those three quarters, they grew faster than Microsoft's cloud, which has not happened in two or three years, so they're closing the gap on Microsoft. The third thing, Dave, in this, you know, incredibly intense competitive nature here, look at Larry Ellison, right? He's got his, you know, the product that for the last two or three years, he said, "It's going to help determine the future of the company, autonomous database." You would think he's the last person in the world who's going to bring in, you know, in some ways another database to think about there, but he has put, you know, his whole effort and energy behind this. The investments Oracle's made, he's riding this horse really hard. So it's not just a technology achievement, but it's also an investment priority for Oracle going forward. And I think it's going to form a lot of how they position themselves to this new breed of buyer with a new type of need and expectations from IT. So I just think the next two or three years are going to be fantastic for people who are lucky enough to get to do the sorts of things that we do. >> You know, it's a great point you made about AWS. Back in 2018 Q3, they were doing about 7.4 billion a quarter and they were growing in the mid forties. They dropped down to like 29% Q4, 2020, I'm looking at the data now. They popped back up last quarter, last reported quarter to 40%, that is 17.8 billion, so they more than doubled and they accelerated their growth rate. (laughs) So maybe that portends, people are concerned about Snowflake right now decelerating growth. You know, maybe that's going to be different. By the way, I think Snowflake has a different strategy, the whole data cloud thing, data sharing. They're not trying to necessarily take Oracle head on, which is going to make this next 10 years, really interesting. All right, we got to go, last question. 30 seconds or less, what can we expect from the future of data platforms? Matt, please start. 
>> I have to go first again? You're killing me, Dave. (laughing) In the next few years, I think you're going to see the major players continue to meet customers where they are, right. Every organization, every environment is, you know, kind of, we use these words bespoke in Snowflake, pardon the pun, but Snowflakes, right. But you know, they're all opinionated and unique and what's great as an IT person is, you know, there is a service for me regardless of where I am on my journey, in my data management journey. I think you're going to continue to see with regards specifically to Oracle, I think you're going to see the company continue along this path of being all things to all people, if you will, or all organizations without sacrificing, you know, kind of richness of features and sacrificing who they are, right. Look, they are the data kings, right? I mean, they've been a database leader for an awful long time. I don't see that going away any time soon and I love the innovative spirit they've brought in with HeatWave. >> All right, great thank you. Okay, 30 seconds, Holgar go. >> Yeah, I mean, the interesting thing that we see is really that trend to autonomous as Oracle calls it, or self-driving software, right? So the database will have to do more things than just store the data and support the DBA. It will have to show it can provide insights, the whole upside, it will be able to show it can run machine learning. We haven't really talked about that. How exciting it is, what kind of use cases we can get out of machine learning running real time on data as it changes, right? So, which is part of the E5 announcement, right? So we'll see more of that self-driving nature in the database space. And because you said we can promote it, right. Check out my report about the HeatWave latest release, which is posted on oracle.com. >> Great, thank you for that. And Bob Evans, please. You're great at quick hits, hit us. >> Dave, thanks. 
I really enjoyed getting to hear everybody's opinion here today and I think what's going to happen too. I think there's a new generation of buyers, a new set of CXO influencers in here. And I think what Oracle's done with this, MySQL HeatWave, those benchmarks that Ron talked about so eloquently here that is going to become something that forces other companies, not just try to get incrementally better. I think we're going to see a massive new wave of innovation to try to play catch up. So I really take my hat off to Oracle's achievement from going to, push everybody to be better. >> Excellent. Marc Staimer, what do you say? >> Sure, I'm going to leverage off of something Matt said earlier, "Those companies that are going to develop faster, cheaper, simpler products that are going to solve customer problems, IT problems are the ones that are going to succeed, or the ones who are going to grow. The one who are just focused on the technology are going to fall by the wayside." So those who can solve more problems, do it more elegantly and do it for less money are going to do great. So Oracle's going down that path today, Snowflake's going down that path. They're trying to do more integration with third party, but as a result, aiming at that simpler, faster, cheaper mentality is where you're going to continue to see this market go. >> Amen brother Marc. >> Thank you, Ron Westfall, we'll give you the last word, bring us home. >> Well, thank you. And I'm loving it. I see a wave of innovation across the entire cloud database ecosystem and Oracle is fueling it. We are seeing it, with the native integration of auto ML capabilities, elastic scaling, lower entry price points, et cetera. And this is just going to be great news for buyers, but also developers and increased use of open APIs. And so I think that is really the key takeaways. Just we're going to see a lot of great innovation on the horizon here. 
>> Guys, fantastic insights, one of the best power panels I've ever done. Love to have you back. Thanks so much for coming on today. >> Great job, Dave, thank you. >> All right, and thank you for watching. This is Dave Vellante for theCube and we'll see you next time. (soft music)

Published Date : Mar 31 2022



Analyst Predictions 2022: The Future of Data Management

[Music] in the 2010s organizations became keenly aware that data would become the key ingredient in driving competitive advantage differentiation and growth but to this day putting data to work remains a difficult challenge for many if not most organizations now as the cloud matures it has become a game changer for data practitioners by making cheap storage and massive processing power readily accessible we've also seen better tooling in the form of data workflows streaming machine intelligence ai developer tools security observability automation new databases and the like these innovations they accelerate data proficiency but at the same time they had complexity for practitioners data lakes data hubs data warehouses data marts data fabrics data meshes data catalogs data oceans are forming they're evolving and exploding onto the scene so in an effort to bring perspective to the sea of optionality we've brought together the brightest minds in the data analyst community to discuss how data management is morphing and what practitioners should expect in 2022 and beyond hello everyone my name is dave vellante with the cube and i'd like to welcome you to a special cube presentation analyst predictions 2022 the future of data management we've gathered six of the best analysts in data and data management who are going to present and discuss their top predictions and trends for 2022 in the first half of this decade let me introduce our six power panelists sanjeev mohan is former gartner analyst and principal at sanjamo tony bear is principal at db insight carl olufsen is well-known research vice president with idc dave meninger is senior vice president and research director at ventana research brad shimon chief analyst at ai platforms analytics and data management at omnia and doug henschen vice president and principal analyst at constellation research gentlemen welcome to the program and thanks for coming on thecube today great to be here thank you all right here's the 
format we're going to use. I, as moderator, am going to call on each analyst separately, who will then deliver their prediction or megatrend. Then, in the interest of time management and pace, two analysts will have the opportunity to comment. If we have more time, we'll elongate it. But let's get started right away. Sanjeev Mohan, please kick it off. You want to talk about governance. Go ahead, sir.

Thank you, Dave. I believe that data governance, which we've been talking about for many years, is now not only going to be mainstream, it's going to be table stakes. For all the things that you mentioned, data oceans, data lakes, lakehouses, data fabrics, meshes, the common glue is metadata. If we don't understand what data we have and are governing, there is no way we can manage it. We saw Informatica go public last year after a hiatus of six years. I'm predicting that this year we'll see some more companies go public. My bet is on Collibra, most likely, and maybe Alation. I'm also predicting that the scope of data governance is going to expand beyond just data. It's not just data and reports. We are going to see more transformations, like Spark jobs, Python, even Airflow. We're going to see more streaming data, from the Kafka Schema Registry, for example. We will see AI models become part of this whole governance suite. So the governance suite is going to be very comprehensive, with very detailed lineage and impact analysis, and it will even expand into data quality. We've already seen that happen with some of the tools, where they are buying smaller companies and bringing in data quality monitoring and integrating it with metadata management, data catalogs, and also data access governance. So once the data governance platforms become the key entry point into these modern architectures, I'm predicting that the number of users of a data catalog is going to exceed that of a BI tool. That will take time, and we
already see that trajectory right now. If you look at BI tools, I would say there are 100 users of a BI tool to one of a data catalog, and I see that evening out over a period of time. At some point data catalogs will really become the main way for us to access data. The data catalog will help us visualize data, but if we want to do more in-depth analysis, it'll be the jumping-off point into the BI tool or the data science tool. That is the journey I see for the data governance products.

Excellent, thank you. Some comments? Maybe Doug. A lot of things to weigh in on there. Maybe you could comment.

Yeah, Sanjeev, I think you're spot on with a lot of the trends. The one disagreement: I think it's really still far from mainstream. As you say, we've been talking about this for years. It's like God, motherhood, apple pie. Everyone agrees it's important, but too few organizations are really practicing good governance, because it's hard and because the incentives have been lacking. I think one thing that deserves mention in this context is ESG mandates and guidelines. These are environmental, social, and governance regs and guidelines. We've seen the environmental regs and guidelines imposed in industries, particularly the carbon-intensive industries. We've seen the social mandates, particularly diversity, imposed on suppliers by companies that are leading on this topic. We've seen governance guidelines now being imposed by banks and investors. So these ESGs are presenting new carrots and sticks, and it's going to demand more solid data. It's going to demand more detailed and solid reporting and tighter governance. But we're still far from mainstream adoption. We have a lot of best-of-breed niche players in the space. I think the signs that it's going to be more mainstream are starting with things like Azure Purview and Google Dataplex. The big cloud platform players seem to be upping the ante and starting to address governance.
Excellent, thank you, Doug. Brad, I wonder if you could chime in as well.

Yeah, I would love to be a believer in data catalogs, but to Doug's point, I think it's going to take some more pressure for that to happen. I recall metadata being something every enterprise thought they were going to get under control when we were working on service-oriented architecture back in the '90s, and that didn't happen quite the way we anticipated. To Sanjeev's point, it's because it is really complex and really difficult to do. My hope is that we won't sort of, how do we put this, fade out into this nebulous nebula of domain catalogs that are specific to individual use cases, like Purview for getting data quality right, or like data governance and cybersecurity, and that instead we have some tooling that can actually be adaptive in gathering metadata to create something I know is important to you, Sanjeev, and that is this idea of observability. If you can get enough metadata without moving your data around, but understanding the entirety of a system that's running on this data, you can do a lot to help with the governance that Doug is talking about.

So I just want to add that data governance, like many other initiatives, did not succeed. Even AI went into an AI winter, but that's a different topic. A lot of these things did not succeed because, to your point, the incentives were not there. I remember when Sarbanes-Oxley came onto the scene. If a bank did not do Sarbanes-Oxley, they were very happy to pay a million-dollar fine. That was like pocket change for them, instead of doing the right thing. But I think the stakes are much higher now. With GDPR, the floodgates opened. Now California has CCPA, but even CCPA is being superseded by CPRA, which is much more GDPR-like. So we are very rapidly entering a space where pretty much every major country in the world is coming up with its own compliance and regulatory requirements.
Data residency is becoming really important, and I think we are going to reach a stage where it won't be optional anymore, whether we like it or not. I think the reason data catalogs were not successful in the past is that we did not have the right focus on adoption. We were focused on features, and these features were disconnected and very hard for business to adopt. These were built by IT people for IT departments to take a look at technical metadata, not business metadata. Today the tables have turned. CDOs are driving this initiative, and regulatory compliance is beating down hard, so I think the time might be right.

Yeah, so guys, we have to move on here, but there's some real meat on the bone here. Sanjeev, I like the fact that you called out Collibra and Alation, so we can look back a year from now and say, okay, he made the call, he stuck it. And then the ratio of BI tools to data catalogs, that's another sort of measurement that we can take, even though there's some skepticism there. That's something that we can watch. I wonder if someday we'll have more metadata than data. But I want to move to Tony Baer. You want to talk about data mesh, and coming off of governance, I mean, wow, the whole concept of data mesh is decentralized data, and then governance becomes a nightmare there. But take it away, Tony.

We'll put it this way: data mesh, the idea at least as proposed by ThoughtWorks, was unleashed a couple of years ago, and the press has been almost uniformly uncritical. A good reason for that is all the problems that Sanjeev and Doug and Brad were just speaking about, which is that we have all this data out there and we don't know what to do about it. Now, that's not a new problem. That was a problem when we had enterprise data warehouses. It was a problem when we had our Hadoop data clusters. It's even more of a problem now that the data's
out in the cloud, where the data is not only your data lake, is not only S3. It's all over the place, and it also includes streaming, which I know we'll be talking about later. So the data mesh was a response to that, the idea being that we need to ask who the folks are that really know best about governance: it's the domain experts. So data mesh was basically an architectural pattern and a process. My prediction for this year is that data mesh is going to hit cold, hard reality. Because if you do a Google search, the published work, the articles and the like, have been largely pretty uncritical so far, basically treating it as a very revolutionary new idea. I don't think it's that revolutionary, because we've talked about ideas like this. Brad, you and I met years ago when we were talking about SOA and decentralizing all of this, but that was at the application level. Now we're talking about it at the data level, and now we have microservices. So there's this thought of, oh, if we manage apps in cloud native through microservices, why don't we think of data in the same way? My sense this year, and this has been a very active search term if you look at Google search trends, is that enterprises are now going to look at this seriously, and as they look at it seriously, it's going to attract its first real hard scrutiny. It's going to attract its first backlash. That's not necessarily a bad thing. It means that it's being taken seriously. The reason why I think you'll start to see the cold, hard light of day shine on data mesh is that it's still a work in progress. This idea is basically a couple of years old, and there are still some pretty major gaps. The biggest gap is in the area of federated governance. Now, federated governance itself is not a new issue. With federated governance, we're trying to figure out
how we can strike the balance between consistent enterprise policy and consistent enterprise governance on the one hand, and the groups that understand the data on the other. How do we balance the two? There's a huge gap there in practice and knowledge. Also, to a lesser extent, there's a technology gap, which is in the self-service technologies that will help teams govern data through the full life cycle: from selecting the data, to building the pipelines, to determining your access control, to looking at quality, to looking at whether the data is fresh or whether it's trending off course. So my prediction is that it will receive its first harsh scrutiny this year. You are going to see some enterprises declare premature victory when they build some federated query implementations. You're going to see vendors start to data-mesh-wash their products. Anybody in the data management space, whether it's a pipelining tool, whether it's ELT, whether it's a catalog or a federated query tool, they're all going to be promoting how they support this. Hopefully nobody is going to call themselves a data mesh tool, because data mesh is not a technology. We're going to see one other thing come out of this, and this harks back to the metadata and the catalogs that Sanjeev was talking about, which is that there's going to be a renewed focus on metadata, and I think that's going to spur interest in data fabrics. Now, data fabrics are pretty vaguely defined, but if we just take the most elemental definition, which is a common metadata backplane, I think that if anybody is going to get serious about data mesh, they need to look at a data fabric,
because at the end of the day we all need to read from the same sheet of music.

So thank you, Tony. Dave Menninger, one of the things that people like about data mesh is that it pretty crisply articulates some of the flaws in today's organizational approaches to data. What are your thoughts on this?

Well, I think we have to start by defining data mesh, right? The term is already getting corrupted. Tony said it's going to see the cold, hard light of day, and there's a problem right now in that there are a number of overlapping terms that are similar but not identical. We've got data virtualization, data fabric, data federation. So I think it's not really clear what each vendor means by these terms. I see data mesh and data fabric becoming quite popular. I've interpreted data mesh as referring primarily to the governance aspects, as originally intended and specified, but that's not the way I see vendors using it. I see vendors using it much more to mean data fabric and data virtualization. So I'm going to comment on the group of those things. I think the group of those things is going to happen. They're going to become more robust. Our research suggests that a quarter of organizations are already using virtualized access to their data lakes, and another half, so a total of three quarters, will eventually be accessing their data lakes using some sort of virtualized access. Again, whether you define it as mesh or fabric or virtualization isn't really the point here. The point is this notion that there are different elements of data, metadata, and governance within an organization that all need to be managed collectively. The interesting thing is when you look at the satisfaction rates of those organizations using virtualization versus those that are not: it's almost double. 79% of organizations that were using
virtualized access expressed satisfaction with their access to the data lake. Only 39% expressed satisfaction if they weren't using virtualized access.

So thank you, Dave. Sanjeev, we've just got a couple of minutes on this topic, but I know you're speaking, or maybe you've spoken already, on a panel with Zhamak Dehghani, who sort of invented the concept. Governance obviously is a big sticking point, but what are your thoughts on this? You are on mute.

So my message to Zhamak and to the community is, as opposed to what Dave said, let's not define it. We spent the whole year defining it. There are four principles: domain, product, data infrastructure, and governance. Let's take it to the next level. I get a lot of questions on what the difference is between data fabric and data mesh, and I'm like, I can't compare the two, because data mesh is a business concept and data fabric is a data integration pattern. How do you compare the two? You have to bring data mesh a level down. So, to Tony's point, I'm on a warpath in 2022 to take it down to: what does a data product look like? How do we handle shared data across domains and govern it? I think what we are going to see more of in 2022 is the operationalization of data mesh.

I think we could have a whole hour on this topic, couldn't we? Maybe we should do that. But let's move to Carl. So Carl, you're a database guy. You've been around that block for a while now. You want to talk about graph databases. Bring it on.

Oh yeah, okay, thanks. So I regard graph database as basically the next truly revolutionary database management technology. Looking forward, for the graph database market, which of course we haven't defined yet, so obviously I have a little wiggle room in what I'm about to say, I expect that this market will grow by about 600 percent over the next 10 years. Now, 10 years is a long time, but over the next five years we expect to see gradual growth as people start to learn how to use it. The
problem is not that it's not useful; it's that people don't know how to use it. So let me explain, before I go any further, what a graph database is, because some of the folks on the call may not know. A graph database organizes data according to a mathematical structure called a graph. A graph has elements called nodes and edges. A data element drops into a node, and the nodes are connected by edges. The edges connect one node to another node. Combinations of edges create structures that you can analyze to determine how things are related. In some cases the nodes and edges can have properties attached to them, which add informative material that makes the graph richer. That's called a property graph. There are two principal use cases for graph databases. There are semantic graphs, which are used to break down human language text into semantic structures so that you can search it, organize it, and answer complicated questions. A lot of AI is aimed at semantic graphs. The other kind is the property graph that I just mentioned, which has a dazzling number of use cases. I want to point out that as I talk about this, people are probably wondering, well, we have relational databases, isn't that good enough? A relational database supports what I call definitional relationships. That means you define the relationships in a fixed structure, and the data drops into that structure. There's a foreign key value that relates one table to another, and that value is fixed. You don't change it. If you change it, the database becomes unstable, and it's not clear what you're looking at. In a graph database, the system is designed to handle change, so that it can reflect the true state of the things it's being used to track. So let me just give you some examples of use cases for this. They include entity resolution, data lineage, social media analysis, customer 360, fraud prevention. There's cybersecurity. There's supply
chain, which is a big one actually. There's explainable AI, and this is going to become important too, because a lot of people are adopting AI, but they want a system after the fact to say, how did the AI system come to that conclusion? How did it make that recommendation? Right now we don't have really good ways of tracking that. Machine learning in general. Social network analysis, I already mentioned that. And then we've got data governance, data compliance, risk management. We've got recommendation, we've got personalization. Anti-money laundering, that's another big one. Identity and access management. Network and IT operations is already becoming a key one, where you have actually mapped out your operation, whatever it is, your data center, and you can track what's going on as things happen there. Root cause analysis. Fraud detection is a huge one. A number of major credit card companies use graph databases for fraud detection. Risk analysis, tracking and tracing, churn analysis, next best action, what-if analysis, impact analysis, entity resolution. And I would add a few other things to this list. Metadata management. So Sanjeev, here you go, this is your engine. I was in metadata management for quite a while in my past life, and one of the things I found was that none of the data management technologies that were available to us could efficiently handle metadata, because of the kinds of structures that result from it. But graphs can. Graphs can do things like say, this term in this context means this, but in that context it means that. Things like that. And in fact, logistics management and supply chain too, because it handles recursive relationships. By recursive relationships I mean objects that own other objects that are of the same type. You can do things like bill of materials, so like parts explosion. You can do an HR analysis: who reports to whom, how many levels up the chain, and that kind of thing. You
can do that with relational databases, yes, but it takes a lot of programming. In fact, you can do almost any of these things with relational databases, but the problem is you have to program it. It's not supported in the database. And whenever you have to program something, that means you can't trace it, you can't define it, you can't publish it in terms of its functionality, and it's really, really hard to maintain over time.

So Carl, thank you. I wonder if we could bring Brad in. Brad, I'm sitting there wondering, okay, is this incremental to the market? Is it disruptive, a replacement? What are your thoughts on this space?

It's already disrupted the market. I mean, like Carl said, go to any bank and ask them, are you using graph databases to get fraud detection under control? And they'll say, absolutely, that's the only way to solve this problem. And it is, frankly. It's the only way to solve a lot of the problems that Carl mentioned. And that is, I think, its Achilles heel in some ways. Because it's like finding the best way to cross the seven bridges of Königsberg. It's always going to kind of be tied to those use cases, because it's really special and it's really unique. And because it's special and unique, it still unfortunately kind of stands apart from the rest of the community that's building, let's say, AI outcomes, as the great example here. Graph databases and AI, as Carl mentioned, are like chocolate and peanut butter, but technologically they don't know how to talk to one another. They're completely different. You can't just stand up SQL and query them. You've got to learn, what is it, Carl? SPARQL or Cypher? Yeah, thank you. To actually get to the data in there. And if you're going to scale that graph database, especially a property graph, if you're going to do something really complex, like try to understand all of the metadata in your organization, you might just end
up with a graph database winter, like we had the AI winter, simply because you run out of performance to make the thing happen. So I think it's already disrupted the market, but we need to treat it like a first-class citizen in the data analytics and AI community. We need to bring it into the fold. We need to equip it with the tools it needs to do the magic it does, and to do it not just for specialized use cases but for everything. Because I'm with Carl, I think it's absolutely revolutionary.

I had also identified the principal Achilles heel of the technology, which is scaling. When these things get large and complex enough that they spill over what a single server can handle, you start to have difficulties, because the relationships span things that have to be resolved over a network, and then you get network latency, and that slows the system down. So that's still a problem to be solved.

Sanjeev, any quick thoughts on this? I mean, I think metadata on the word cloud is going to be the largest font, but what are your thoughts here?

I want to step away, so people don't associate me with only metadata. So I want to talk about something slightly different. db-engines.com has done an amazing job. I think almost everyone knows that they chronicle all the major databases that are in use today. In January of 2022, there are 381 databases on its ranked list of databases. The largest category is RDBMS. The second largest category is actually divided into two, property graphs and RDF graphs. These two together make up the second largest number of databases. So, talking about Achilles heels, this is a problem. The problem is that there are so many graph databases to choose from. They come in different shapes and forms. To Brad's point, there are so many query languages. In RDBMS it's SQL, end of story. Here we've got Cypher, we've got Gremlin, we've got GQL, and then the proprietary languages. So I think there's a
lot of disparity in this space.

Excellent, all excellent points, Sanjeev, I must say. And that is a problem. The languages need to be sorted and standardized, and people need to have a road map as to what they can do with it. Because, as you say, you can do so many things, and so many of those things are unrelated, that you sort of say, well, what do we use this for? I'm reminded of a saying I learned a bunch of years ago, when somebody said that the digital computer is the only tool man has ever devised that has no particular purpose.

All right, guys, we've got to move on to Dave Menninger. We've heard about streaming. Your prediction is in that realm, so please take it away.

Sure. So I like to say that historical databases are going to become a thing of the past. But I don't mean that they're going to go away. That's not my point. I mean, we need historical databases, but streaming data is going to become the default way in which we operate with data. So in the next, say, three to five years, I would expect the data platforms, and we're using the term data platforms to represent the evolution of databases and data lakes, to incorporate these streaming capabilities. We're going to process data as it streams into an organization, and then it's going to roll off into historical databases. So historical databases don't go away, but they become a thing of the past: they store the data that occurred previously. And as data is occurring, we're going to be processing it, we're going to be analyzing it, we're going to be acting on it. We only ever ended up with historical databases because we were limited by the technology that was available to us. Data doesn't occur in batches, but we processed it in batches because that was the best we could do. And it wasn't bad, and we've continued to improve. But streaming data today is still the exception. It's not the rule, right? There are projects within organizations that deal
with streaming data, but it's not the default way in which we deal with data yet. So that's my prediction: this is going to change. We're going to have streaming data be the default way in which we deal with data. How you label it, what you call it, maybe these databases and data platforms just evolve to be able to handle it, but we're going to deal with data in a different way. Our research shows that already about half of the participants in our analytics and data benchmark research are using streaming data, and another third are planning to use streaming technologies. So that gets us to about eight out of ten organizations that need to use this technology. That doesn't mean they have to use it throughout the whole organization, but it's pretty widespread in its use today and has continued to grow. If you think about the consumerization of IT, we've all been conditioned to expect immediate access to information, immediate responsiveness. We want to know if an item is on the shelf at our local retail store so we can go in and pick it up right now. That's the world we live in, and that's spilling over into the enterprise IT world, where we have to provide those same types of capabilities. So that's my prediction: historical databases become a thing of the past, and streaming data becomes the default way in which we operate with data.

All right, thank you, David. Well, so what say you, Carl, a guy who's followed historical databases for a long time?

Well, one thing: actually, every database is historical, because as soon as you put data in it, it's now history. It no longer reflects the present state of things. Even if that history is only a millisecond old, it's still history. But I would say, I mean, I know you're trying to be a little bit provocative in saying this, Dave, because you know as well as I do that people still need to do their taxes, they still need to do accounting, they still need to run
general ledger programs and things like that. That all involves historical data. That's not going to go away unless you want to go to jail, so you're going to have to deal with that. But as far as the leading-edge functionality, I'm totally with you on that. I'm just kind of wondering if this requires a change in the way that we perceive applications in order to truly be manifested, a rethinking of the way applications work, saying that an application should respond instantly, as soon as the state of things changes. What do you say about that?

I think that's true. I think we do have to think about things differently. It's not the way we designed systems in the past. We're seeing more and more systems designed that way, but again, it's not the default. And I agree 100% with you that we do need historical databases. That's clear. And even some of those historical databases will be used in conjunction with the streaming data, right?

So absolutely. I mean, let's take the data warehouse example, where you're using the data warehouse as context and the streaming data as the present. You're saying, here's a sequence of things that's happening right now. Have we seen that sequence before? What does that pattern look like in past situations, and can we learn from that?

So Tony Baer, I wonder if you could comment. When you think about real-time inferencing at the edge, for instance, which is something that a lot of people talk about, a lot of what we're discussing here in this segment looks like it's got great potential. What are your thoughts?

Yeah, well, I think you nailed it, you hit it right on the head there. What I'm seeing, and I'm going to split this one down the middle, is that I don't see streaming as the default. What I see is streaming and transaction databases and
analytics data, data warehouses, data lakes, whatever, are converging. And what allows us technically to converge is cloud native architecture, where you can basically distribute things. So you can have a node here that's doing the real-time processing, and that's also, and this is where it leads in, maybe doing some of that real-time predictive analytics to take a look at, well, look, we're looking at this customer journey: what's happening with what the customer is doing right now, and how this is correlated with what other customers are doing. So the thing is that in the cloud you can basically partition this, and because of the speed of the infrastructure, you can bring these together and orchestrate them in a sort of loosely coupled manner. The other part is that the use cases are demanding it, and this part goes back to what Dave is saying. When you look at customer 360, when you look at, let's say, smart utility grids, when you look at any type of operational problem, it has a real-time component and it has a historical component, and having predictives. So my sense here is that technically we can bring this together through the cloud, and I think the use case is that we can apply some real-time predictive analytics on these streams and feed this into the transactions, so that when we make a decision in terms of what to do as a result of a transaction, we have this real-time input.

Sanjeev, did you have a comment?

Yeah, I was just going to say that, to this point, we have to think of streaming very differently. In the historical databases, we used to bring the data and store the data, and then we used to run rules on top, aggregations and all. But in the case of streaming, the mindset changes, because the rules, normally the inference, all of that is
fixed, but the data is constantly changing. So it's a completely reversed way of thinking about and building applications on top of that.

So Dave Menninger, there seemed to be some disagreement about the default. What kind of time frame are you thinking about? Is it end of decade that it becomes the default? What would you pin?

I think between five and ten years this becomes the reality. I think it'll be more and more common between now and then, but it becomes the default. And I also want, Sanjeev, at some point, maybe in one of our subsequent conversations, to talk about governing streaming data, because that's a whole other set of challenges.

We've also talked about it rather in two dimensions, historical and streaming, and there's lots of low-latency, micro-batch, sub-second processing that's not quite streaming but in many cases is fast enough. We're seeing a lot of adoption of near real time, not quite real time, as good enough for many applications.

Because nobody's really taking the hardware dimension of this information... like, how do we... That'll just happen, Carl.

So near real time, maybe before you lose the customer, however you define that, right? Okay, let's move on to Brad. Brad, you want to talk about automation, AI, the pipeline. People feel like, hey, we can just automate everything. What's your prediction?

Yeah, I'm an AI aficionado, so apologies in advance for that. But I think that we've been seeing automation at play within AI for some time now, and it's helped us do a lot of things, especially for practitioners that are building AI outcomes in the enterprise. It's helped them to fill skills gaps, it's helped them to speed development, and it's helped them to actually make AI better, because in some ways it provides some swim lanes, for example with technologies like AutoML, and can auto-document and create that sort of transparency that
we talked about a little bit earlier. But I think there's an interesting kind of convergence happening with this idea of automation, and that is that we've had the automation that started happening for practitioners; it's trying to move outside of the traditional bounds of things like, I'm just trying to get my features, I'm just trying to pick the right algorithm, I'm just trying to build the right model. And it's expanding across that full life cycle of building an AI outcome, to start at the very beginning of data and to then continue on to the end, which is this continuous delivery and continuous automation of that outcome to make sure it's right and it hasn't drifted and stuff like that. And because of that, because it's become kind of powerful, we're starting to actually see this weird thing happen where the practitioners are starting to converge with the users. And that is to say that, okay, if I'm in Tableau right now, I can stand up Salesforce Einstein Discovery and it will automatically create a nice predictive algorithm for me, given the data that I pull in. But what's starting to happen, and we're seeing this from the companies that create business software, so Salesforce, Oracle, SAP and others, is that they're starting to actually use these same ideals and a lot of deep learning to basically stand up these out-of-the-box, flip-a-switch, and you've got an AI outcome at the ready for business users. And I'm very much, you know, I think that that's the way that it's going to go, and what it means is that AI is slowly disappearing. And I don't think that's a bad thing. I think if anything, what we're going to see in 2022 and maybe into 2023 is this sort of rush to put this idea of disappearing AI into practice and have as many of these solutions in the enterprise as possible. You can see, like for example, SAP is going to roll out this quarter this thing called adaptive recommendation services, which
basically is a cold start AI outcome that can work across a whole bunch of different vertical markets and use cases. It's just a recommendation engine for whatever you need it to do in the line of business. So basically, you're an SAP user, you look up to turn on your software one day, and you're a sales professional, let's say, and suddenly you have a recommendation for customer churn. It's going, that's great. Well, I don't know, I think that's terrifying in some ways. I think it is the future, that AI is going to disappear like that, but I am absolutely terrified of it, because I think that what it really does is it calls attention to a lot of the issues that we already see around AI, specific to this idea of what we like to call at Omdia responsible AI, which is, you know, how do you build an AI outcome that is free of bias, that is inclusive, that is fair, that is safe, that is secure, that is auditable, et cetera, et cetera. That takes a lot of work to do. And so if you imagine a customer that's just a Salesforce customer, let's say, and they're turning on Einstein Discovery within their sales software, you need some guidance to make sure that when you flip that switch, the outcome you're going to get is correct. And that's going to take some work. And so I think we're going to see this "let's roll this out," and suddenly there's going to be a lot of problems, a lot of pushback that we're going to see. And some of that's going to come from GDPR and others that Sanjeev was mentioning earlier. A lot of it's going to come from internal CSR requirements within companies that are saying, hey, whoa, hold up, we can't do this all at once, let's take the slow route, let's make AI automated in a smart way. And that's going to take time. >> Yeah, so a couple of predictions there that I heard. I mean, AI essentially disappears, it becomes invisible, maybe if I can restate that. And then, if I understand it correctly, Brad, you're saying there's a backlash in
the near term. People can say, oh, slow down, let's automate what we can. Those attributes that you talked about are non-trivial to achieve. Is that why you're a bit of a skeptic? >> Yeah, I think that we don't have any sort of standards that companies can look to and understand. And certainly within these companies, especially those that haven't already stood up an internal data science team, they don't have the knowledge to understand that when they flip that switch for an automated AI outcome, it's going to do what they think it's going to do. And so we need some sort of standard methodology and best practices that every company that's going to consume this invisible AI can make use of. And one of the things that, you know, Google kicked off a few years back that's picking up some momentum, and the companies I just mentioned are starting to use it, is this idea of model cards, where at least you have some transparency about what these things are doing. You know, so like for the SAP example, we know, for example, that it's a convolutional neural network with a long short-term memory model that it's using. We know that it only works on Roman English, and therefore me as a consumer can say, oh well, I know that I need to do this internationally, so I should not just turn this on today. >> Great, thank you. Carl, can you add anything, any context here? >> Yeah, we've talked about some of the things Brad mentioned here at IDC in our Future of Intelligence group, regarding in particular the moral and legal implications of having a fully automated, you know, AI-driven system. Because we already know, and we've seen, that AI systems are biased by the data that they get, right? So if they get data that pushes them in a certain direction... I think there was a story last week about an HR system that was recommending promotions for white people over Black people, because in the past, you know, white people were promoted and more productive than
Black people, but it had no context as to why, which is, you know, because Black people were being historically discriminated against. But the system doesn't know that. So, you know, you have to be aware of that. And I think that at the very least, there should be controls when a decision has either a moral or a legal implication, when you really need a human judgment. It could lay out the options for you, but a person actually needs to authorize that action. And I also think that we always will have to be vigilant regarding the kind of data we use to train our systems to make sure that it doesn't introduce unintended biases. And to some extent they always will, so we'll always be chasing after them. >> Absolutely, Carl. Yeah, I think that what you have to bear in mind as a consumer of AI is that it is a reflection of us, and we are a very flawed species. And so if you look at all the really fantastic, magical-looking super models we see, like GPT-3 and 4 that's coming out, they're xenophobic and hateful, because the data that they're built upon, and the algorithms, and the people that build them are us. So AI is a reflection of us. We need to keep that in mind. >> Yeah, the AI is biased because humans are biased. All right, great. Okay, let's move on. Doug Henschen, you know, a lot of people said that data lake, that term's not going to live on, but it appears to have some legs here. You want to talk about lake house? Bring it on. >> Yes, I do. My prediction is that lake house, and this idea of a combined data warehouse and data lake platform, is going to emerge as the dominant data management offering. I say offering; that doesn't mean it's going to be the dominant thing that organizations have out there, but it's going to be the predominant vendor offering in 2022.
Now, heading into 2021, we already had Cloudera, Databricks, Microsoft, Snowflake as proponents. In 2021, SAP, Oracle, and several of these fabric, virtualization, mesh vendors joined the bandwagon. The promise is that you have one platform that manages your structured, unstructured and semi-structured information, and it addresses both the BI and analytics needs and the data science needs. The real promise there is simplicity and lower cost. But I think end users have to answer a few questions. The first is, does your organization really have a center of data gravity, or is the data highly distributed? Multiple data warehouses, multiple data lakes, on-premises, cloud. If it's very distributed and, you know, you have difficulty consolidating, and that's not really a goal for you, then maybe that single platform is unrealistic and not likely to add value to you. You know, also the fabric and virtualization vendors, the mesh idea, that's where, if you have this highly distributed situation, that might be a better path forward. The second question: if you are looking at one of these lake house offerings, you are looking at consolidating, simplifying, bringing together to a single platform, you have to make sure that it meets both the warehouse need and the data lake need. So you have vendors like Databricks, Microsoft with Azure Synapse, new really to the data warehouse space, and they're having to prove that these data warehouse capabilities on their platforms can meet the scaling requirements, can meet the user and query concurrency requirements, meet those tight SLAs. And then on the other hand, you have the Oracle, SAP, Snowflake, the data warehouse folks coming into the data science world, and they have to prove that they can manage the unstructured information and meet the needs of the data scientists. I'm seeing a lot of the lake house offerings from the warehouse crowd managing that unstructured information in columns and rows, and some of these vendors, Snowflake in particular, is
really relying on partners for the data science needs. So you really got to look at a lake house offering and make sure that it meets both the warehouse and the data lake requirement. >> Well, thank you, Doug. Well, Tony, if those two worlds are going to come together, as Doug was saying, the analytics and the data science world, does there need to be some kind of semantic layer in between? I don't know, weigh in on this topic if you would. >> Oh, didn't we talk about data fabrics before? Common metadata layer. Actually, I'm almost tempted to say let's declare victory and go home, in that this has actually been going on for a while. I actually agree with, you know, much of what Doug is saying there, which is that, I mean, I remember as far back as, I think it was like 2014, I was doing a study, you know, I was still at Ovum, predecessor of Omdia, looking at all these specialized databases that were coming up and seeing that, you know, there's overlap at the edges, but yet there was still going to be a reason at the time that you would have, let's say, a document database for JSON, you'd have a relational database for transactions and for data warehouse, and you had, you know, basically something at that time that resembles Hadoop for what we're considering a data lake. Fast forward, and the thing is, what I was saying at the time is that you're seeing basically, you know, sort of blending at the edges. That I was saying like about five or six years ago. And the lake house is essentially, you know, the current manifestation of that idea. There is a dichotomy in terms of, you know, it's the old argument: do we centralize this all, you know, in a single place, or do we virtualize? And I think it's always going to be a yin and yang. There's never going to be a single silver bullet. I do see that there are also going to be questions, and these are points that Doug raised, you know,
what do you need, you know, for your performance characteristics? Do you need, for instance, high concurrency? Do you need the ability to do some very sophisticated joins? Or is your requirement more to be able to distribute our processing, you know, as far as possible, to essentially do a kind of brute force approach? All these approaches are valid based on, you know, the use case. I just see that essentially the lake house is the culmination of, it's just a relatively new term introduced by Databricks a couple of years ago. This is the culmination of basically what's been a long-time trend. And what we see in the cloud is that as we start seeing data warehouses, as a checkbox item, say, hey, we can basically source data in cloud storage, in S3, Azure Blob Store, you know, whatever, as long as it's in certain formats, like, you know, Parquet or CSV or something like that, you know, I see that as becoming kind of a checkbox item. So to that extent, I think that the lake house, depending on how you define it, is already reality. And in some cases, maybe new terminology, but not a whole heck of a lot new under the sun. >> Yeah, and Dave Menninger, I mean, a lot of this, thank you Tony, but a lot of this is going to come down to, you know, vendor marketing, right? Some people try to co-opt the term. We talked about data mesh washing. What are your thoughts on this? >> Yeah, so I used the term data platform earlier, and part of the reason I use that term is that it's more vendor neutral. We've tried to sort of stay out of the vendor terminology patenting world, right? Whether the term lake house is what sticks or not, the concept is certainly going to stick, and we have some data to back it up. About a quarter of organizations that are using data lakes today already incorporate data warehouse functionality into it, so they
consider their data lake house and data warehouse one and the same. About a quarter of organizations, a little less, but about a quarter of organizations feed the data lake from the data warehouse, and about a quarter of organizations feed the data warehouse from the data lake. So it's pretty obvious that three quarters of organizations need to bring this stuff together, right? The need is there, the need is apparent, the technology is going to continue to converge. I like to talk about, you know, you've got data lakes over here at one end, and I'm not going to talk about why people thought data lakes were a bad idea, because they thought you just throw stuff in a server and you ignore it, right? That's not what a data lake is. So you've got data lake people over here, and you've got database people over here, data warehouse people over here. Database vendors are adding data lake capabilities, and data lake vendors are adding data warehouse capabilities, so it's obvious that they're going to meet in the middle. I mean, I think it's like Tony says, I think we should declare victory and go home. >> And so, just a follow-up on that, so are you saying the specialized lake and the specialized warehouse, do they go away? I mean, Tony, data mesh practitioners would say, or advocates would say, well, they could all live as just a node on the mesh. But based on what Dave just said, are we going to see those all morph together? >> Well, number one, as I was saying before, there's always going to be this sort of, you know, centrifugal force or this tug of war between do we centralize the data, do we virtualize? And the fact is, I don't think there's ever going to be any single answer. I think in terms of data mesh, data mesh has nothing to do with how you physically implement the data. You could have a data mesh basically on a data warehouse. It's just that, you know, the difference being is that if we use the same, you know, physical data
store, but everybody's logically, you know, basically governing it differently, you know. A data mesh is basically, it's not a technology, it's a process, it's a governance process. So essentially, you know, I basically see that, as I was saying before, this is basically the culmination of a long-time trend. We're essentially seeing a lot of blurring, but there are going to be cases where, for instance, if I need, let's say, high concurrency or something like that, there are certain things that I'm not going to be able to efficiently get out of a data lake, you know, where basically I'm doing a system where I'm just doing really brute forcing, very fast file scanning and that type of thing. So I think there always will be some delineations, but I would agree with Dave and with Doug that we are seeing basically a confluence of requirements, that we need to essentially have basically the abilities of a data lake and a data warehouse. These need to come together. >> So I think what we're likely to see is organizations look for a converged platform that can handle both sides for their center of data gravity. The mesh and the fabric vendors, the fabric virtualization vendors, they're all on board with the idea of this converged platform, and they're saying, hey, we'll handle all the edge cases of the stuff that isn't in that center of data gravity, that is off distributed in a cloud or at a remote location. So you can have that single platform for the center of your data, and then bring in virtualization, mesh, what have you, for reaching out to the distributed data. >> Bingo. As they basically said, people are happy when they virtualize data. >> I think yes, at this point. But to Dave Menninger's point, you know, they are converging. Snowflake has introduced support for unstructured data. So now we are literally splitting hairs. Now what Databricks is saying is that, aha, but it's
easier to go from data lake to data warehouse than it is from data warehouse to data lake. So I think we're getting into semantics, but we've already seen these two converge. >> So it takes something like AWS, who's got, what, 15 data stores. Are they going to have 15 converged data stores? That's going to be interesting to watch. All right, guys, I'm going to go down the list and do like a one word each, and you guys, each of the analysts, if you would just add a very brief sort of course correction for me. So Sanjeev, I mean, governance is going to be, maybe it's the dog that wags the tail now, I mean, it's coming to the fore. All this ransomware stuff, we really didn't talk much about security. But what's the one word in your prediction that you would leave us with on governance? >> It's going to be mainstream. >> Mainstream, okay. Tony Baer, mesh washing is what I wrote down. That's what we're going to see in 2022, a little reality check. You want to add to that? >> Reality check is, I hope that no vendor, you know, jumps the shark and calls their offering a data mesh product. >> Yeah, let's hope that doesn't happen. If they do, we're going to call them out. Carl, I mean, graph databases, thank you for sharing some, you know, high growth metrics. I know it's early days, but magic is what I took away from that. It's the magic database. >> Yeah, I would actually, I've said this to people too, I kind of look at it as a Swiss Army knife of data, because you can pretty much do anything you want with it. It doesn't mean you should. I mean, that's definitely the case that if you're, you know, managing things that are in a fixed schematic relationship, probably a relational database is a better choice. There are, you know, times when the document database is a better choice. It can handle those things, but it may not be the best choice for that use case. But for a great many, especially the new emerging use cases I listed, it's the best choice. >> Thank you.
And Dave Menninger, thank you by the way for bringing the data in. I like how you supported all your comments with some data points. But streaming data becomes the sort of default paradigm, if you will. What would you add? >> Yeah, I would say think fast, right? That's the world we live in. You got to think fast. >> Fast, love it. And Brad Shimmin, I love it. I mean, on the one hand I was saying, okay, great, I'm afraid I might get disrupted by one of these internet giants who are AI experts, so I'm going to be able to buy instead of build AI. But then again, you know, I've got some real issues, there's a potential backlash there. So give us your bumper sticker. >> Yeah, I would say, going with Dave, think fast and also think slow, to talk about the book that everyone talks about. I would say really that this is all about trust, trust in the idea of automation and of a transparent, invisible AI across the enterprise. But verify, verify before you do anything. >> And then Doug Henschen, I mean, look, I think the trend is your friend here on this prediction, with lake house really becoming dominant. I liked the way you set up that notion of, you know, the data warehouse folks coming at it from the analytics perspective, but then you got the data science worlds coming together. I still feel as though there's this piece in the middle that we're missing, but your final thoughts, we'll give you the last word. >> Well, I think the idea of consolidation and simplification always prevails. That's why the appeal of a single platform is going to be there. We've already seen that with, you know, Hadoop platforms moving toward cloud, moving toward object storage, and object storage becoming really the common storage point, whether it's a lake or a warehouse. And that second point, I think ESG mandates are going to come in alongside GDPR and things like that to up the ante for good governance. >> Yeah, thank you for calling that out. Okay folks, hey, that's all the
time that we have here. Your experience and depth of understanding on these key issues in data and data management were really on point, and they were on display today. I want to thank you for your contributions. Really appreciate your time. >> Enjoyed it. >> Thank you. >> Now, in addition to this video, we're going to be making available transcripts of the discussion. We're going to do clips of this as well, we're going to put them out on social media. I'll write this up and publish the discussion on wikibon.com and siliconangle.com. No doubt several of the analysts on the panel will take the opportunity to publish written content, social commentary, or both. I want to thank the power panelists, and thanks for watching this special Cube presentation. This is Dave Vellante. Be well, and we'll see you next time. [Music]

Published Date : Jan 8 2022

Predictions 2022: Top Analysts See the Future of Data

(bright music) >> In the 2010s, organizations became keenly aware that data would become the key ingredient to driving competitive advantage, differentiation, and growth. But to this day, putting data to work remains a difficult challenge for many, if not most organizations. Now, as the cloud matures, it has become a game changer for data practitioners by making cheap storage and massive processing power readily accessible. We've also seen better tooling in the form of data workflows, streaming, machine intelligence, AI, developer tools, security, observability, automation, new databases and the like. These innovations accelerate data proficiency, but at the same time, they add complexity for practitioners. Data lakes, data hubs, data warehouses, data marts, data fabrics, data meshes, data catalogs, data oceans are forming, they're evolving and exploding onto the scene. So in an effort to bring perspective to the sea of optionality, we've brought together the brightest minds in the data analyst community to discuss how data management is morphing and what practitioners should expect in 2022 and beyond. Hello everyone, my name is Dave Vellante with theCUBE, and I'd like to welcome you to a special Cube presentation, Analyst Predictions 2022: The Future of Data Management. We've gathered six of the best analysts in data and data management who are going to present and discuss their top predictions and trends for 2022 and the first half of this decade. Let me introduce our six power panelists. Sanjeev Mohan is former Gartner Analyst and Principal at SanjMo. Tony Baer, principal at dbInsight, Carl Olofson is well-known Research Vice President with IDC, Dave Menninger is Senior Vice President and Research Director at Ventana Research, Brad Shimmin, Chief Analyst, AI Platforms, Analytics and Data Management at Omdia and Doug Henschen, Vice President and Principal Analyst at Constellation Research. 
Gentlemen, welcome to the program and thanks for coming on theCUBE today. >> Great to be here. >> Thank you. >> All right, here's the format we're going to use. I as moderator, I'm going to call on each analyst separately who then will deliver their prediction or mega trend, and then in the interest of time management and pace, two analysts will have the opportunity to comment. If we have more time, we'll elongate it, but let's get started right away. Sanjeev Mohan, please kick it off. You want to talk about governance, go ahead sir. >> Thank you Dave. I believe that data governance, which we've been talking about for many years, is now not only going to be mainstream, it's going to be table stakes. And all the things that you mentioned, you know, the data oceans, data lakes, lake houses, data fabrics, meshes, the common glue is metadata. If we don't understand what data we have and we are not governing it, there is no way we can manage it. So we saw Informatica go public last year after a hiatus of six years. I'm predicting that this year we see some more companies go public. My bet is on Collibra, most likely, and maybe Alation we'll see go public this year. I'm also predicting that the scope of data governance is going to expand beyond just data. It's not just data and reports. We are going to see more transformations like Spark jobs, Python, even Airflow. We're going to see more of streaming data. So from Kafka Schema Registry, for example. We will see AI models become part of this whole governance suite. So the governance suite is going to be very comprehensive, very detailed lineage, impact analysis, and then even expand into data quality. We've already seen that happen with some of the tools where they are buying these smaller companies and bringing in data quality monitoring and integrating it with metadata management, data catalogs, also data access governance. 
So what we are going to see is that once the data governance platforms become the key entry point into these modern architectures, I'm predicting that the usage, the number of users of a data catalog, is going to exceed that of a BI tool. That will take time, and we've already seen that trajectory. Right now if you look at BI tools, I would say there are a hundred users of a BI tool to one of a data catalog. And I see that evening out over a period of time, and at some point data catalogs will really become the main way for us to access data. The data catalog will help us visualize data, but if we want to do more in-depth analysis, it'll be the jumping off point into the BI tool, the data science tool, and that is the journey I see for the data governance products. >> Excellent, thank you. Some comments. Maybe Doug, a lot of things to weigh in on there, maybe you can comment. >> Yeah, Sanjeev, I think you're spot on on a lot of the trends. The one disagreement: I think it's really still far from mainstream. As you say, we've been talking about this for years, it's like God, motherhood, apple pie, everyone agrees it's important, but too few organizations are really practicing good governance because it's hard and because the incentives have been lacking. I think one thing that deserves mention in this context is ESG mandates and guidelines, these are environmental, social and governance regs and guidelines. We've seen the environmental regs and guidelines imposed on industries, particularly the carbon-intensive industries. We've seen the social mandates, particularly diversity, imposed on suppliers by companies that are leading on this topic. We've seen governance guidelines now being imposed by banks on investors. So these ESGs are presenting new carrots and sticks, and it's going to demand more solid data. It's going to demand more detailed reporting and solid reporting, tighter governance. But we're still far from mainstream adoption. 
We have a lot of, you know, best of breed niche players in the space. I think the signs that it's going to be more mainstream are starting with things like Azure Purview, Google Dataplex; the big cloud platform players seem to be upping the ante and starting to address governance. >> Excellent, thank you Doug. Brad, I wonder if you could chime in as well. >> Yeah, I would love to be a believer in data catalogs. But to Doug's point, I think that it's going to take some more pressure for that to happen. I recall metadata being something every enterprise thought they were going to get under control when we were working on service oriented architecture back in the nineties, and that didn't happen quite the way we anticipated. And so to Sanjeev's point, it's because it is really complex and really difficult to do. My hope is that, you know, we won't sort of, how do I put this? Fade out into this nebula of domain catalogs that are specific to individual use cases, like Purview for getting data quality right, or like data governance and cybersecurity. And instead we have some tooling that can actually be adaptive to gather metadata to create something. And I know it's important to you, Sanjeev, and that is this idea of observability. If you can get enough metadata without moving your data around, but understanding the entirety of a system that's running on this data, you can do a lot. So to help with the governance that Doug is talking about. >> So I just want to add that data governance, like many other initiatives, did not succeed; even AI went into an AI winter, but that's a different topic. But a lot of these things did not succeed because, to your point, the incentives were not there. I remember when Sarbanes-Oxley had come onto the scene, if a bank did not do Sarbanes-Oxley, they were very happy to pay a million dollar fine. That was like, you know, pocket change for them instead of doing the right thing. But I think the stakes are much higher now. 
With GDPR, the flood gates opened. Now, you know, California, you know, has CCPA, but even CCPA is being outdated by CPRA, which is much more GDPR-like. So we are very rapidly entering a space where pretty much every major country in the world is coming up with its own compliance and regulatory requirements; data residency is becoming really important. And I think we are going to reach a stage where it won't be optional anymore, whether we like it or not. And I think the reason data catalogs were not successful in the past is because we did not have the right focus on adoption. We were focused on features, and these features were disconnected, very hard for business to adopt. These were built by IT people for IT departments to take a look at technical metadata, not business metadata. Today the tables have turned. CDOs are driving this initiative, regulatory compliances are beating down hard, so I think the time might be right. >> Yeah so guys, we have to move on here. But there's some real meat on the bone here, Sanjeev. I like the fact that you called out Collibra and Alation, so we can look back a year from now and say, okay, he made the call, he stuck it. And then the ratio of BI tools to data catalogs, that's another sort of measurement that we can take, even though with some skepticism there, that's something that we can watch. And I wonder if someday we'll have more metadata than data. But I want to move to Tony Baer, you want to talk about data mesh, and speaking, you know, coming off of governance. I mean, wow, you know, the whole concept of data mesh is decentralized data, and then governance becomes, you know, a nightmare there, but take it away, Tony. >> Well, put it this way, data mesh, you know, the idea at least as proposed by ThoughtWorks. You know, basically it was at least a couple of years ago, and the press has been almost uniformly uncritical. 
A good reason for that is all the problems that basically Sanjeev and Doug and Brad were just speaking about, which is that we have all this data out there and we don't know what to do about it. Now, that's not a new problem. That was a problem we had in enterprise data warehouses, it was a problem when we had Hadoop data clusters, it's even more of a problem now that data is out in the cloud where the data is not only your data lake, is not only S3, it's all over the place. And it's also including streaming, which I know we'll be talking about later. So the data mesh was a response to that, the idea being, you know, who are the folks that really know best about governance? It's the domain experts. So basically data mesh was an architectural pattern and a process. My prediction for this year is that data mesh is going to hit cold hard reality. Because if you do a Google search, basically the published work, the articles on data mesh have been largely, you know, pretty uncritical so far. Basically lauding it as being a very revolutionary new idea. I don't think it's that revolutionary because we've talked about ideas like this. Brad, now you and I met years ago when we were talking about SOA and decentralizing all of this, but it was at the application level. Now we're talking about it at the data level. And now we have microservices. So there's this thought of, if we're deconstructing apps in cloud native to microservices, why don't we think of data in the same way? My sense this year is that, you know, this has been a very active search if you look at Google search trends, is that now companies, like enterprises, are going to look at this seriously. And as they look at it seriously, it's going to attract its first real hard scrutiny, it's going to attract its first backlash. That's not necessarily a bad thing. It means that it's being taken seriously.
The reason why I think that you'll start to see basically the cold hard light of day shine on data mesh is that it's still a work in progress. You know, this idea is basically a couple of years old and there are still some pretty major gaps. The biggest gap is in the area of federated governance. Now federated governance itself is not a new issue. With federated governance, we started figuring out, like, how can we basically strike the balance between, let's say, consistent enterprise policy, consistent enterprise governance, and yet the groups that understand the data, you know, how do we basically sort of balance the two? There's a huge gap there in practice and knowledge. Also to a lesser extent, there's a technology gap which is basically in the self-service technologies that will help teams essentially govern data. You know, basically through the full life cycle, from selecting the data, from, you know, building the pipelines, from, you know, determining your access control, looking at quality, looking at basically whether the data is fresh or whether it's trending off course. So my prediction is that it will receive the first harsh scrutiny this year. You are going to see some organizations and enterprises declare premature victory when they build some federated query implementations. You're going to see vendors start to data-mesh-wash their products. Anybody in the data management space, whether it's basically a pipelining tool, whether it's basically ELT, whether it's a catalog or federated query tool, they're all going to be, like, you know, basically promoting the fact of how they support this. Hopefully nobody's going to call themselves a data mesh tool because data mesh is not a technology. We're going to see one other thing come out of this.
And this harks back to the metadata that Sanjeev was talking about and the catalogs that he was talking about. Which is that there's going to be a new focus, a renewed focus on metadata. And I think that's going to spur interest in data fabrics. Now data fabrics are pretty vaguely defined, but if we just take the most elemental definition, which is a common metadata back plane, I think that if anybody is going to get serious about data mesh, they need to look at the data fabric because we all, at the end of the day, need to speak, you know, need to read from the same sheet of music. >> So thank you Tony. Dave Menninger, I mean, one of the things that people like about data mesh is it pretty crisply articulates some of the flaws in today's organizational approaches to data. What are your thoughts on this? >> Well, I think we have to start by defining data mesh, right? The term is already getting corrupted, right? Tony said it's going to see the cold hard light of day. And there's a problem right now that there are a number of overlapping terms that are similar but not identical. So we've got data virtualization, data fabric, excuse me for a second. (clears throat) Sorry about that. Data virtualization, data fabric, data federation, right? So I think that it's not really clear what each vendor means by these terms. I see data mesh and data fabric becoming quite popular. I've interpreted data mesh as referring primarily to the governance aspects as originally intended and specified. But that's not the way I see vendors using it. I see vendors using it much more to mean data fabric and data virtualization. So I'm going to comment on the group of those things. I think the group of those things is going to happen. They're going to happen, they're going to become more robust.
Our research suggests that a quarter of organizations are already using virtualized access to their data lakes and another half, so a total of three quarters, will eventually be accessing their data lakes using some sort of virtualized access. Again, whether you define it as mesh or fabric or virtualization isn't really the point here. But this notion that there are different elements of data, metadata and governance within an organization that all need to be managed collectively. The interesting thing is when you look at the satisfaction rates of those organizations using virtualization versus those that are not, it's almost double, 68% of organizations, I'm sorry, 79% of organizations that were using virtualized access express satisfaction with their access to the data lake. Only 39% express satisfaction if they weren't using virtualized access. >> Oh thank you Dave. Sanjeev, we've just got a couple of minutes on this topic, but I know you're speaking or maybe you've already spoken on a panel with (indistinct) who sort of invented the concept. Governance obviously is a big sticking point, but what are your thoughts on this? You're on mute. (panelist chuckling) >> So my message to (indistinct) and to the community is as opposed to what they said, let's not define it. We spent a whole year defining it, there are four principles, domain, product, data infrastructure, and governance. Let's take it to the next level. I get a lot of questions on what is the difference between data fabric and data mesh? And I'm like, I can't compare the two because data mesh is a business concept, data fabric is a data integration pattern. How do you compare the two? You have to bring data mesh a level down. So to Tony's point, I'm on a warpath in 2022 to take it down to what does a data product look like? How do we handle shared data across domains and governance? And I think we are going to see more of that in 2022, which is the "operationalization" of data mesh.
>> I think we could have a whole hour on this topic, couldn't we? Maybe we should do that. But let's move to Carl. So Carl, you're a database guy, you've been around that block for a while now, you want to talk about graph databases, bring it on. >> Oh yeah. Okay thanks. So I regard graph database as basically the next truly revolutionary database management technology. I'm looking forward for the graph database market, which of course we haven't defined yet. So obviously I have a little wiggle room in what I'm about to say. But this market will grow by about 600% over the next 10 years. Now, 10 years is a long time. But over the next five years, we expect to see gradual growth as people start to learn how to use it. The problem is not that it's not useful, it's that people don't know how to use it. So let me explain before I go any further what a graph database is because some of the folks on the call may not know what it is. A graph database organizes data according to a mathematical structure called a graph. The graph has elements called nodes and edges. So a data element drops into a node, the nodes are connected by edges, the edges connect one node to another node. Combinations of edges create structures that you can analyze to determine how things are related. In some cases, the nodes and edges can have properties attached to them which add additional informative material that makes it richer, that's called a property graph. There are two principal use cases for graph databases. There's semantic graphs, which are used to break down human language texts into the semantic structures. Then you can search it, organize it and answer complicated questions. A lot of AI is aimed at semantic graphs. Another kind is the property graph that I just mentioned, which has a dazzling number of use cases. I want to just point out as I talk about this, people are probably wondering, well, we have relational databases, isn't that good enough?
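To make the nodes, edges and properties Carl describes concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not the data model or API of any graph database product; the class names, the `neighbors` helper and the sample customer, card and merchant records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data element; the properties dict is what makes this a property graph."""
    node_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, labeled connection from one node to another."""
    source: str
    target: str
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical toy container: nodes keyed by id, edges kept in a list."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)

    def add_edge(self, source, target, label, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, label, properties))

    def neighbors(self, node_id, label=None):
        """Ids of nodes reached by outgoing edges, optionally filtered by label."""
        return [
            e.target
            for e in self.edges
            if e.source == node_id and (label is None or e.label == label)
        ]


# Made-up sample data: a customer owns a card, the card paid a merchant.
graph = PropertyGraph()
graph.add_node("alice", kind="customer")
graph.add_node("card-1", kind="credit_card")
graph.add_node("shop-9", kind="merchant")
graph.add_edge("alice", "card-1", "OWNS")
graph.add_edge("card-1", "shop-9", "PAID", amount=42.50)
```

Following `OWNS` and then `PAID` edges from a customer is the kind of relationship traversal that the fraud detection and Customer 360 use cases rely on; a real graph database would add indexing, persistence and a query language on top of this structure.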
So a relational database defines... It supports what I call definitional relationships. That means you define the relationships in a fixed structure. The data drops into that structure, there's a value, a foreign key value, that relates one table to another and that value is fixed. You don't change it. If you change it, the database becomes unstable, it's not clear what you're looking at. In a graph database, the system is designed to handle change so that it can reflect the true state of the things that it's being used to track. So let me just give you some examples of use cases for this. They include entity resolution, data lineage, social media analysis, Customer 360, fraud prevention. There's cybersecurity, supply chain is a big one actually. There is explainable AI and this is going to become important too because a lot of people are adopting AI. But they want a system after the fact to say, how did the AI system come to that conclusion? How did it make that recommendation? Right now we don't have really good ways of tracking that. Machine learning in general, social network, I already mentioned that. And then we've got, oh gosh, we've got data governance, data compliance, risk management. We've got recommendation, we've got personalization, anti money laundering, that's another big one, identity and access management, network and IT operations is already becoming a key one where you actually have mapped out your operation, you know, whatever it is, your data center and you can track what's going on as things happen there, root cause analysis, fraud detection is a huge one. A number of major credit card companies use graph databases for fraud detection, risk analysis, tracking and tracing, churn analysis, next best action, what-if analysis, impact analysis, entity resolution and I would add one other thing or just a few other things to this list, metadata management. So Sanjeev, here you go, this is your engine.
Because I was in metadata management for quite a while in my past life. And one of the things I found was that none of the data management technologies that were available to us could efficiently handle metadata because of the kinds of structures that result from it, but graphs can, okay? Graphs can do things like say, this term in this context means this, but in that context, it means that, okay? Things like that. And in fact, logistics management, supply chain. And also because it handles recursive relationships, by recursive relationships I mean objects that own other objects that are of the same type. You can do things like bills of materials, you know, so like parts explosion. Or you can do an HR analysis, who reports to whom, how many levels up the chain and that kind of thing. You can do that with relational databases, but it takes a lot of programming. In fact, you can do almost any of these things with relational databases, but the problem is, you have to program it. It's not supported in the database. And whenever you have to program something, that means you can't trace it, you can't define it. You can't publish it in terms of its functionality and it's really, really hard to maintain over time. >> Carl, thank you. I wonder if we could bring Brad in, I mean. Brad, I'm sitting here wondering, okay, is this incremental to the market? Is it disruptive and replacement? What are your thoughts on this space? >> It's already disrupted the market. I mean, like Carl said, go to any bank and ask them are you using graph databases to get fraud detection under control? And they'll say, absolutely, that's the only way to solve this problem. And it is frankly. And it's the only way to solve a lot of the problems that Carl mentioned. And that is, I think, its Achilles' heel in some ways. Because, you know, it's like finding the best way to cross the seven bridges of Koenigsberg.
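The recursive relationships Carl mentions (who reports to whom, and the parts explosion of a bill of materials) can be sketched in a few lines of Python. This shows only the traversal logic itself, which is the part he says you end up hand-programming around a relational database; it is not any database's built-in feature. The employee names, the part names and both function names are made up for illustration, and the bill of materials is assumed to contain no cycles.

```python
# Hypothetical reporting structure: employee -> direct manager (None at the top).
reports_to = {"dana": "carol", "carol": "bob", "bob": "alice", "alice": None}


def management_chain(employee, reports_to):
    """Walk up the hierarchy and return every manager above the employee."""
    chain = []
    seen = {employee}
    manager = reports_to.get(employee)
    while manager is not None:
        if manager in seen:
            raise ValueError("cycle detected in reporting structure")
        chain.append(manager)
        seen.add(manager)
        manager = reports_to.get(manager)
    return chain


# Hypothetical bill of materials: assembly -> {component: quantity per unit}.
bill_of_materials = {
    "bicycle": {"wheel": 2, "frame": 1},
    "wheel": {"spoke": 32, "rim": 1},
}


def explode_parts(part, quantity, bom):
    """Recursively expand an assembly into total counts of its leaf parts.

    Assumes the bill of materials is acyclic.
    """
    components = bom.get(part)
    if not components:
        return {part: quantity}  # a leaf part has no sub-components
    totals = {}
    for child, per_unit in components.items():
        for leaf, count in explode_parts(child, quantity * per_unit, bom).items():
            totals[leaf] = totals.get(leaf, 0) + count
    return totals
```

Here `len(management_chain("dana", reports_to))` answers "how many levels up the chain", and `explode_parts("bicycle", 1, bill_of_materials)` is the parts explosion for one bicycle.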
You know, it's always going to kind of be tied to those use cases because it's really special and it's really unique and because it's special and it's unique, it still unfortunately kind of stands apart from the rest of the community that's building, let's say AI outcomes, as a great example here. Graph databases and AI, as Carl mentioned, are like chocolate and peanut butter. But technologically, they don't know how to talk to one another, they're completely different. And you know, you can't just stand up SQL and query them. You've got to learn, what is it, Carl? SPARQL. Yeah, thank you, to actually get to the data in there. And if you're going to scale that data, that graph database, especially a property graph, if you're going to do something really complex, like try to understand, you know, all of the metadata in your organization, you might just end up with, you know, a graph database winter like we had the AI winter simply because you run out of performance to make the thing happen. So, I think it's already disrupted, but we need to like treat it like a first-class citizen in the data analytics and AI community. We need to bring it into the fold. We need to equip it with the tools it needs to do the magic it does and to do it not just for specialized use cases, but for everything. 'Cause I'm with Carl. I think it's absolutely revolutionary. >> Brad identified the principal Achilles' heel of the technology, which is scaling. When these things get large and complex enough that they spill over what a single server can handle, you start to have difficulties because the relationships span things that have to be resolved over a network and then you get network latency and that slows the system down. So that's still a problem to be solved. >> Sanjeev, any quick thoughts on this? I mean, I think metadata on the word cloud is going to be the largest font, but what are your thoughts here?
>> I want to (indistinct) So people don't associate me with only metadata, so I want to talk about something slightly different. dbengines.com has done an amazing job. I think almost everyone knows that they chronicle all the major databases that are in use today. In January of 2022, there are 381 databases on a ranked list of databases. The largest category is RDBMS. The second largest category is actually divided into two, property graphs and RDF graphs. These two together make up the second largest number of databases. So talking about Achilles' heel, this is a problem. The problem is that there's so many graph databases to choose from. They come in different shapes and forms. To Brad's point, there's so many query languages. In RDBMS, it's SQL. We know the story, but here we've got Cypher, we've got Gremlin, we've got GQL and then there are proprietary languages. So I think there's a lot of disparity in this space. >> Well, excellent. All excellent points, Sanjeev, if I must say. And that is a problem, that the languages need to be sorted and standardized. People need to have a roadmap as to what they can do with it. Because as you say, you can do so many things. And so many of those things are unrelated that you sort of say, well, what do we use this for? And I'm reminded of the saying I learned a bunch of years ago. And somebody said that the digital computer is the only tool man has ever devised that has no particular purpose. (panelists chuckle) >> All right guys, we got to move on to Dave Menninger. We've heard about streaming. Your prediction is in that realm, so please take it away. >> Sure. So I like to say that historical databases are going to become a thing of the past. By that I don't mean that they're going to go away, that's not my point. I mean, we need historical databases, but streaming data is going to become the default way in which we operate with data.
So in the next, say, three to five years, I would expect that data platforms, and we're using the term data platforms to represent the evolution of databases and data lakes, that the data platforms will incorporate these streaming capabilities. We're going to process data as it streams into an organization and then it's going to roll off into a historical database. So historical databases don't go away, but they become a thing of the past. They store the data that occurred previously. And as data is occurring, we're going to be processing it, we're going to be analyzing it, we're going to be acting on it. I mean we only ever ended up with historical databases because we were limited by the technology that was available to us. Data doesn't occur in batches. But we processed it in batches because that was the best we could do. And it wasn't bad and we've continued to improve and we've improved and we've improved. But streaming data today is still the exception. It's not the rule, right? There are projects within organizations that deal with streaming data. But it's not the default way in which we deal with data yet. And so that's my prediction, that this is going to change, we're going to have streaming data be the default way in which we deal with data, however you label it and whatever you call it. You know, maybe these databases and data platforms just evolve to be able to handle it. But we're going to deal with data in a different way. And our research shows that already, about half of the participants in our analytics and data benchmark research are using streaming data. You know, another third are planning to use streaming technologies. So that gets us to about eight out of 10 organizations that need to use this technology. And that doesn't mean they have to use it throughout the whole organization, but it's pretty widespread in its use today and has continued to grow.
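The pattern Dave describes, acting on each record as it arrives and only then letting it roll off into a historical store, can be sketched as follows. This is a toy illustration under stated assumptions: a plain Python list stands in for a live event feed, another list stands in for the historical database, and the rolling-average rule, the `process_stream` name and the sample values are all made up. It does not represent any particular streaming platform.

```python
from collections import deque


def process_stream(events, threshold, window_size=3):
    """Act on each event as it arrives, then roll it off into history.

    The rule (rolling average above a threshold) is fixed up front;
    the data keeps changing underneath it.
    """
    window = deque(maxlen=window_size)  # only the most recent values
    historical_store = []               # stand-in for a historical database
    alerts = []
    for event in events:
        window.append(event["value"])
        rolling_avg = sum(window) / len(window)
        if rolling_avg > threshold:
            # The decision is made here, before the event is stored.
            alerts.append((event["id"], rolling_avg))
        historical_store.append(event)
    return alerts, historical_store


# Made-up readings standing in for events arriving over time.
events = [{"id": i, "value": v} for i, v in enumerate([10, 12, 50, 60, 11])]
alerts, history = process_stream(events, threshold=30)
```

A batch design would instead load all five events into the store first and run the rule afterwards; the difference is when the decision is made, not whether the data is kept.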
If you think about the consumerization of IT, we've all been conditioned to expect immediate access to information, immediate responsiveness. You know, we want to know if an item is on the shelf at our local retail store and we can go in and pick it up right now. You know, that's the world we live in and that's spilling over into the enterprise IT world. We have to provide those same types of capabilities. So that's my prediction, historical databases become a thing of the past, streaming data becomes the default way in which we operate with data. >> All right thank you David. Well, so what say you, Carl, the guy who has followed historical databases for a long time? >> Well, one thing actually, every database is historical because as soon as you put data in it, it's now history. It no longer reflects the present state of things. Even if that history is only a millisecond old, it's still history. But I would say, I mean, I know you're trying to be a little bit provocative in saying this Dave 'cause you know, as well as I do that people still need to do their taxes, they still need to do accounting, they still need to run general ledger programs and things like that. That all involves historical data. That's not going to go away unless you want to go to jail. So you're going to have to deal with that. But as far as the leading edge functionality, I'm totally with you on that. And I'm just, you know, I'm just kind of wondering if this requires a change in the way that we perceive applications in order to truly be manifested and rethinking the way applications work. Saying that an application should respond instantly, as soon as the state of things changes. What do you say about that? >> I think that's true. I think we do have to think about things differently. It's not the way we designed systems in the past. We're seeing more and more systems designed that way. But again, it's not the default.
And I agree 100% with you that we do need historical databases, you know, that's clear. And even some of those historical databases will be used in conjunction with the streaming data, right? >> Absolutely. I mean, you know, let's take the data warehouse example where you're using the data warehouse as its context and the streaming data as the present and you're saying, here's the sequence of things that's happening right now. Have we seen that sequence before? And where? What does that pattern look like in past situations? And can we learn from that? >> So Tony Baer, I wonder if you could comment? I mean, when you think about, you know, real time inferencing at the edge, for instance, which is something that a lot of people talk about, a lot of what we're discussing here in this segment, it looks like it's got a great potential. What are your thoughts? >> Yeah, I mean, I think you nailed it right. You know, you hit it right on the head there. What I'm seeing, and I'm going to split this one down the middle, is that I don't see that basically streaming is the default. What I see is streaming and basically transaction databases and analytics data, you know, data warehouses, data lakes, whatever, are converging. And what allows us technically to converge is cloud native architecture, where you can basically distribute things. So you can have a node here that's doing the real-time processing, that's also doing... And this is where it leads in, or maybe doing some of that real time predictive analytics to take a look at, well look, we're looking at this customer journey, what's happening with what the customer is doing right now, and this is correlated with what other customers are doing. So the thing is that in the cloud, you can basically partition this and because of basically the speed of the infrastructure then you can basically bring these together and kind of orchestrate them in sort of a loosely coupled manner.
The other part is what the use cases are demanding, and this goes back to what Dave is saying. Is that, you know, when you look at Customer 360, when you look at, let's say, Smart Utility products, when you look at any type of operational problem, it has a real time component and it has an historical component. And so, like, you know, my sense here is that technically we can bring this together through the cloud. And I think the use case is that we can apply some real time sort of predictive analytics on these streams and feed this into the transactions so that when we make a decision in terms of what to do as a result of a transaction, we have this real-time input. >> Sanjeev, did you have a comment? >> Yeah, I was just going to say that to Dave's point, you know, we have to think of streaming very differently because in the historical databases, we used to bring the data and store the data and then we used to run rules on top, aggregations and all. But in case of streaming, the mindset changes because the rules, normally the inference, all of that is fixed, but the data is constantly changing. So it's a completely reversed way of thinking and building applications on top of that. >> So Dave Menninger, there seems to be some disagreement about the default. What kind of timeframe are you thinking about? Is this end of decade it becomes the default? What would you pin? >> I think around, you know, between five to 10 years, I think this becomes the reality. >> I think it's... >> It'll be more and more common between now and then, but it becomes the default. And I also want Sanjeev at some point, maybe in one of our subsequent conversations, we need to talk about governing streaming data. 'Cause that's a whole nother set of challenges.
>> We've also talked about it rather in two dimensions, historical and streaming, and there's lots of low latency, micro batch, sub-second, that's not quite streaming, but in many cases it's fast enough and we're seeing a lot of adoption of near real time, not quite real-time, as good enough for many applications. (indistinct cross talk from panelists) >> Because nobody's really taking the hardware dimension (mumbles). >> That'll just happen, Carl. (panelists laughing) >> So near real time. But maybe before you lose the customer, however we define that, right? Okay, let's move on to Brad. Brad, you want to talk about automation, AI, the pipeline, people feel like, hey, we can just automate everything. What's your prediction? >> Yeah I'm an AI aficionado so apologies in advance for that. But, you know, I think that we've been seeing automation play within AI for some time now. And it's helped us do a lot of things especially for practitioners that are building AI outcomes in the enterprise. It's helped them to fill skills gaps, it's helped them to speed development and it's helped them to actually make AI better. 'Cause it, you know, in some ways provides some swim lanes and, for example, with technologies like AutoML, can auto-document and create that sort of transparency that we talked about a little bit earlier. But I think there's an interesting kind of convergence happening with this idea of automation. And that is that we've had the automation that started happening for practitioners, it's trying to move outside of the traditional bounds of things like I'm just trying to get my features, I'm just trying to pick the right algorithm, I'm just trying to build the right model and it's expanding across that full life cycle of building an AI outcome, to start at the very beginning of data and to then continue on to the end, which is this continuous delivery and continuous automation of that outcome to make sure it's right and it hasn't drifted and stuff like that.
And because of that, because it's become kind of powerful, we're starting to actually see this weird thing happen where the practitioners are starting to converge with the users. And that is to say that, okay, if I'm in Tableau right now, I can stand up Salesforce Einstein Discovery, and it will automatically create a nice predictive algorithm for me given the data that I pull in. But what's starting to happen, and we're seeing this from the companies that create business software, so Salesforce, Oracle, SAP, and others, is that they're starting to actually use these same ideas and a lot of deep learning (chuckles) to basically stand up these out of the box, flip-a-switch, and you've got an AI outcome at the ready for business users. And I am very much, you know, I think that's the way that it's going to go and what it means is that AI is slowly disappearing. And I don't think that's a bad thing. I think if anything, what we're going to see in 2022 and maybe into 2023 is this sort of rush to put this idea of disappearing AI into practice and have as many of these solutions in the enterprise as possible. You can see, like for example, SAP is going to roll out this quarter this thing called adaptive recommendation services, which basically is a cold start AI outcome that can work across a whole bunch of different vertical markets and use cases. It's just a recommendation engine for whatever you need to do in the line of business. So basically, you're an SAP user, you turn on your software one day, you're a sales professional let's say, and suddenly you have a recommendation for customer churn. Boom! It's going, that's great. Well, I don't know, I think that's terrifying.
In some ways I think it is the future that AI is going to disappear like that, but I'm absolutely terrified of it because I think that what it really does is it calls attention to a lot of the issues that we already see around AI, specific to this idea of what we like to call at Omdia, responsible AI. Which is, you know, how do you build an AI outcome that is free of bias, that is inclusive, that is fair, that is safe, that is secure, that is auditable, et cetera, et cetera, et cetera, et cetera. It'd take a lot of work to do. And so if you imagine a customer that's just a Salesforce customer let's say, and they're turning on Einstein Discovery within their sales software, you need some guidance to make sure that when you flip that switch, the outcome you're going to get is correct. And that's going to take some work. And so, I think we're going to see this move, let's roll this out, and suddenly there's going to be a lot of problems, a lot of pushback that we're going to see. And some of that's going to come from GDPR and others that Sanjeev was mentioning earlier. A lot of it is going to come from internal CSR requirements within companies that are saying, "Hey, hey, whoa, hold up, we can't do this all at once. "Let's take the slow route, "let's make AI automated in a smart way." And that's going to take time. >> Yeah, so a couple of predictions there that I heard. AI simply disappears, it becomes invisible. Maybe if I can restate that. And then if I understand it correctly, Brad you're saying there's a backlash in the near term. You'd be able to say, oh, slow down. Let's automate what we can. Those attributes that you talked about are non trivial to achieve, is that why you're a bit of a skeptic? >> Yeah. I think that we don't have any sort of standards that companies can look to and understand.
And certainly, within these companies, especially those that haven't already stood up an internal data science team, they don't have the knowledge to understand when they flip that switch for an automated AI outcome that it's going to do what they think it's going to do. And so we need some sort of standard methodology and practice, best practices that every company that's going to consume this invisible AI can make use of. And one of the things that, you know, Google kicked off a few years back, that's picking up some momentum and that the companies I just mentioned are starting to use, is this idea of model cards where at least you have some transparency about what these things are doing. You know, so like for the SAP example, we know, for example, if it's a convolutional neural network with a long short-term memory model that it's using, we know that it only works on Roman English and therefore me as a consumer can say, "Oh, well I know that I need to do this internationally. "So I should not just turn this on today." >> Thank you. Carl could you add anything, any context here? >> Yeah, we've talked about some of the things Brad mentioned here at IDC and our future of intelligence group regarding in particular, the moral and legal implications of having a fully automated, you know, AI driven system. Because we already know, and we've seen that AI systems are biased by the data that they get, right? So if they get data that pushes them in a certain direction, I think there was a story last week about an HR system that was recommending promotions for White people over Black people, because in the past, you know, White people were promoted and more productive than Black people, but it had no context as to why which is, you know, because they were being historically discriminated, Black people were being historically discriminated against, but the system doesn't know that. So, you know, you have to be aware of that.
And I think that at the very least, there should be controls when a decision has either a moral or legal implication. When you really need a human judgment, it could lay out the options for you. But a person actually needs to authorize that action. And I also think that we always will have to be vigilant regarding the kind of data we use to train our systems to make sure that it doesn't introduce unintended biases. To some extent, they always will. So we'll always be chasing after them. But that's (indistinct). >> Absolutely Carl, yeah. I think that what you have to bear in mind as a consumer of AI is that it is a reflection of us and we are a very flawed species. And so if you look at all of the really fantastic, magical looking supermodels we see like GPT-3 and four, that's coming out, they're xenophobic and hateful because the people that the data that's built upon them and the algorithms and the people that build them are us. So AI is a reflection of us. We need to keep that in mind. >> Yeah, where the AI is biased 'cause humans are biased. All right, great. All right let's move on. Doug you mentioned, you know, lot of people that said that data lake, that term is not going to live on but here's to be, have some lakes here. You want to talk about lake house, bring it on. >> Yes, I do. My prediction is that lake house and this idea of a combined data warehouse and data lake platform is going to emerge as the dominant data management offering. I say offering that doesn't mean it's going to be the dominant thing that organizations have out there, but it's going to be the predominant vendor offering in 2022. Now heading into 2021, we already had Cloudera, Databricks, Microsoft, Snowflake as proponents, in 2021, SAP, Oracle, and several of all of these fabric virtualization/mesh vendors joined the bandwagon. The promise is that you have one platform that manages your structured, unstructured and semi-structured information. 
And it addresses both the BI analytics needs and the data science needs. The real promise there is simplicity and lower cost. But I think end users have to answer a few questions. The first is, does your organization really have a center of data gravity or is the data highly distributed? Multiple data warehouses, multiple data lakes, on premises, cloud. If it's very distributed and you'd have difficulty consolidating and that's not really a goal for you, then maybe that single platform is unrealistic and not likely to add value to you. You know, also the fabric and virtualization vendors, the mesh idea, that's where if you have this highly distributed situation, that might be a better path forward. The second question, if you are looking at one of these lake house offerings, you are looking at consolidating, simplifying, bringing together to a single platform. You have to make sure that it meets both the warehouse need and the data lake need. So you have vendors like Databricks, Microsoft with Azure Synapse. New really to the data warehouse space and they're having to prove that these data warehouse capabilities on their platforms can meet the scaling requirements, can meet the user and query concurrency requirements. Meet those tight SLAs. And then on the other hand, you have the Oracle, SAP, Snowflake, the data warehouse folks coming into the data science world, and they have to prove that they can manage the unstructured information and meet the needs of the data scientists. I'm seeing a lot of the lake house offerings from the warehouse crowd, managing that unstructured information in columns and rows. And some of these vendors, Snowflake in particular, is really relying on partners for the data science needs. So you really got to look at a lake house offering and make sure that it meets both the warehouse and the data lake requirement. >> Thank you Doug. 
Well Tony, if those two worlds are going to come together, as Doug was saying, the analytics and the data science world, does it need to be some kind of semantic layer in between? I don't know. Where are you in on this topic? >> (chuckles) Oh, didn't we talk about data fabrics before? Common metadata layer (chuckles). Actually, I'm almost tempted to say let's declare victory and go home. And that this has actually been going on for a while. I actually agree with, you know, much of what Doug is saying there. Which is that, I mean I remember as far back as I think it was like 2014, I was doing a study. I was still at Ovum, (indistinct) Omdia, looking at all these specialized databases that were coming up and seeing that, you know, there's overlap at the edges. But yet, there was still going to be a reason at the time that you would have, let's say a document database for JSON, you'd have a relational database for transactions and for data warehouse and you had basically something at that time that resembled Hadoop for what we consider your data lake. Fast forward, and the thing is, what I was seeing at the time is that they were sort of blending at the edges. That was, say, about five to six years ago. And the lake house is essentially the current manifestation of that idea. There is a dichotomy in terms of, you know, it's the old argument, do we centralize this all you know in a single place or do we virtualize? And I think it's always going to be a yin and yang and there's never going to be a single silver bullet. I do see that there are also going to be questions and these are points that Doug raised. That you know, what do you need for your performance there, or for your query performance characteristics? Do you need for instance high concurrency? 
Do you need the ability to do some very sophisticated joins, or is your requirement more to be able to distribute your processing, you know, as far as possible, to essentially do a kind of a brute force approach? All these approaches are valid based on the use case. I just see that essentially that the lake house is the culmination of it's nothing. It's a relatively new term introduced by Databricks a couple of years ago. This is the culmination of basically what's been a long time trend. And what we see in the cloud is that as we start seeing data warehouses, as a checkbox item, say, "Hey, we can basically source data in cloud storage, in S3, Azure Blob Store, you know, whatever, as long as it's in certain formats, like, you know, Parquet or CSV or something like that." I see that as becoming kind of a checkbox item. So to that extent, I think that the lake house, depending on how you define it, is already reality. And in some cases, maybe new terminology, but not a whole heck of a lot new under the sun. >> Yeah. And Dave Menninger, I mean a lot of these, thank you Tony, but a lot of this is going to come down to, you know, vendor marketing, right? Some people just kind of co-opt the term, we talked about you know, data mesh washing, what are your thoughts on this? (laughing) >> Yeah, so I used the term data platform earlier. And part of the reason I use that term is that it's more vendor neutral. We've tried to sort of stay out of the vendor terminology patenting world, right? Whether the term lake house is what sticks or not, the concept is certainly going to stick. And we have some data to back it up. About a quarter of organizations that are using data lakes today, already incorporate data warehouse functionality into it. 
So they consider their data lake house and data warehouse one and the same, about a quarter of organizations, a little less, but about a quarter of organizations feed the data lake from the data warehouse and about a quarter of organizations feed the data warehouse from the data lake. So it's pretty obvious that three quarters of organizations need to bring this stuff together, right? The need is there, the need is apparent. The technology is going to continue to converge. I like to talk about it, you know, you've got data lakes over here at one end, and I'm not going to talk about why people thought data lakes were a bad idea because they thought you just throw stuff in a server and you ignore it, right? That's not what a data lake is. So you've got data lake people over here and you've got database people over here, data warehouse people over here, database vendors are adding data lake capabilities and data lake vendors are adding data warehouse capabilities. So it's obvious that they're going to meet in the middle. I mean, I think it's like Tony says, I think we should declare victory and go home. >> As hell. So just a follow-up on that, so are you saying the specialized lake and the specialized warehouse, do they go away? I mean, Tony data mesh practitioners would say or advocates would say, well, they could all live. It's just a node on the mesh. But based on what Dave just said, are we going to see those all morphed together? >> Well, number one, as I was saying before, there's always going to be this sort of, you know, centrifugal force or this tug of war between do we centralize the data, do we virtualize? And the fact is I don't think that there's ever going to be any single answer. I think in terms of data mesh, data mesh has nothing to do with how you physically implement the data. You could have a data mesh basically on a data warehouse. 
It's just that, you know, the difference being is that if we use the same physical data store, but everybody's logically you know, basically governing it differently, you know? Data mesh, in essence, it's not a technology, it's processes, it's governance process. So essentially, you know, I basically see that, you know, as I was saying before that this is basically the culmination of a long time trend we're essentially seeing a lot of blurring, but there are going to be cases where, for instance, if I need, let's say like, upserts, I need like high concurrency or something like that. There are certain things that I'm not going to be able to efficiently get out of a data lake. And, you know, I'm doing a system where I'm just doing really brute forcing very fast file scanning and that type of thing. So I think there always will be some delineations, but I would agree with Dave and with Doug, that we are seeing basically a confluence of requirements that we need to essentially have basically either the element, you know, the ability of a data lake and the data warehouse, these need to come together, so I think. >> I think what we're likely to see is organizations look for a converged platform that can handle both sides for their center of data gravity, the mesh and the fabric virtualization vendors, they're all on board with the idea of this converged platform and they're saying, "Hey, we'll handle all the edge cases of the stuff that isn't in that center of data gravity but that is off distributed in a cloud or at a remote location." So you can have that single platform for the center of your data and then bring in virtualization, mesh, what have you, for reaching out to the distributed data. >> As Dave basically said, people are happy when they virtualized data. >> I think we have at this point, but to Dave Menninger's point, they are converging, Snowflake has introduced support for unstructured data. So obviously literally splitting hairs here. 
Now what Databricks is saying is that "aha, but it's easier to go from data lake to data warehouse than it is from databases to data lake." So I think we're getting into semantics, but we're already seeing these two converge. >> So take somebody like AWS has got what? 15 data stores. Are they going to 15 converged data stores? This is going to be interesting to watch. All right, guys, I'm going to go down the list and do, like, one word each, and you guys, each of the analysts, if you would just add a very brief sort of course correction for me. So Sanjeev, I mean, governance is going to be... Maybe it's the dog that wags the tail now. I mean, it's coming to the fore, all this ransomware stuff, which you really didn't talk much about security, but what's the one word in your prediction that you would leave us with on governance? >> It's going to be mainstream. >> Mainstream. Okay. Tony Baer, mesh washing is what I wrote down. That's what we're going to see in 2022, a little reality check, you want to add to that? >> Reality check, 'cause I hope that no vendor jumps the shark and calls their offering a data mesh product. >> Yeah, let's hope that doesn't happen. If they do, we're going to call them out. Carl, I mean, graph databases, thank you for sharing some high growth metrics. I know it's early days, but magic is what I took away from that, so magic database. >> Yeah, I would actually, I've said this to people too. I kind of look at it as a Swiss Army knife of data because you can pretty much do anything you want with it. That doesn't mean you should. I mean, there's definitely the case that if you're managing things that are in fixed schematic relationship, probably a relational database is a better choice. There are times when the document database is a better choice. It can handle those things, but maybe not. It may not be the best choice for that use case. 
But for a great many, especially with the new emerging use cases I listed, it's the best choice. >> Thank you. And Dave Menninger, thank you by the way, for bringing the data in, I like how you supported all your comments with some data points. But streaming data becomes the sort of default paradigm, if you will, what would you add? >> Yeah, I would say think fast, right? That's the world we live in, you got to think fast. >> Think fast, love it. And Brad Shimmin, love it. I mean, on the one hand I was saying, okay, great. I'm afraid I might get disrupted by one of these internet giants who are AI experts. I'm going to be able to buy instead of build AI. But then again, you know, I've got some real issues. There's a potential backlash there. So give us your bumper sticker. >> I would say, going with Dave, think fast and also think slow to talk about the book that everyone talks about. I would say really that this is all about trust, trust in the idea of automation and a transparent and visible AI across the enterprise. And verify, verify before you do anything. >> And then Doug Henschen, I mean, I think the trend is your friend here on this prediction with lake house is really becoming dominant. I liked the way you set up that notion of, you know, the data warehouse folks coming at it from the analytics perspective and then you get the data science worlds coming together. I still feel as though there's this piece in the middle that we're missing, but your, your final thoughts will give you the (indistinct). >> I think the idea of consolidation and simplification always prevails. That's why the appeal of a single platform is going to be there. We've already seen that with, you know, Hadoop platforms and moving toward cloud, moving toward object storage and object storage, becoming really the common storage point for whether it's a lake or a warehouse. 
And that second point, I think ESG mandates are going to come in alongside GDPR and things like that to up the ante for good governance. >> Yeah, thank you for calling that out. Okay folks, hey that's all the time that we have here, your experience and depth of understanding on these key issues on data and data management really on point and they were on display today. I want to thank you for your contributions. Really appreciate your time. >> Enjoyed it. >> Thank you. >> Thanks for having me. >> In addition to this video, we're going to be making available transcripts of the discussion. We're going to do clips of this as well we're going to put them out on social media. I'll write this up and publish the discussion on wikibon.com and siliconangle.com. No doubt, several of the analysts on the panel will take the opportunity to publish written content, social commentary or both. I want to thank the power panelists and thanks for watching this special CUBE presentation. This is Dave Vellante, be well and we'll see you next time. (bright music)

Published Date : Jan 7 2022


Maria Colgan & Gerald Venzl, Oracle | June CUBEconversation

(upbeat music) Developers have become the new king makers in the world of digital and cloud. The rise of containers and microservices has accelerated the transition to cloud native applications. A lot of people will talk about application architecture and the related paradigms and the benefits they bring for the process of writing and delivering new apps. But a major challenge continues to be, the how and the what when it comes to accessing, processing and getting insights from the massive amounts of data that we have to deal with in today's world. And with me are two experts from the data management world who will share with us how they think about the best techniques and practices based on what they see at large organizations who are working with data and developing so-called data-driven apps. Please welcome Maria Colgan and Gerald Venzl, two distinguished product managers from Oracle. Folks, welcome, thanks so much for coming on. >> Thanks for having us Dave. >> Thank you very much for having us. >> Okay, Maria let's start with you. So, we throw around this term data-driven, data-driven applications. What are we really talking about there? >> So data-driven applications are applications that work on a diverse set of data. So anything from spatial to sensor data, document data as well as your usual transaction processing data. And what they're going to do is they'll generate value from that data in very different ways to a traditional application. So for example, they may use machine learning, they are able to do product recommendations in the middle of a transaction. Or we could use graph to be able to identify an influencer within the community so we can target them with a specific promotion. It could also use spatial data to be able to help find the nearest stores to a particular customer. 
And because these apps are deployed on multiple platforms, everything from mobile devices as well as standard browsers, they need a data platform that's going to be both secure, reliable and scalable. >> Well, so when you think about how the workloads are shifting I mean, we're not talking about, you know it's not anymore a world of just your ERP or your HCM or your CRM, you know kind of the traditional operational systems. You really are seeing an explosion of these new data oriented apps. You're seeing, you know, modeling in the cloud, you are going to see more and more inferencing, inferencing at the edge. But Maria maybe you could talk a little bit about sort of the benefits that customers are seeing from developing these types of applications. I mean, why should people care about data-driven apps? >> Oh, for sure, there's massive benefits to them. I mean, probably the most obvious one for any business regardless of the industry, is that they not only allow you to understand what your customers are up to, but they allow you to be able to anticipate those customer's needs. So that helps businesses maintain that competitive edge and retain their customers. But it also helps them make data-driven decisions in real time based on actual data rather than on somebody's gut feeling or basing those decisions on historical data. So for example, you can do real-time price adjustments on products based on demand and so forth, that kind of thing. So it really changes the way people do business today. >> So Gerald, you think about the narrative in the industry everybody wants to be a platform player all your customers they are becoming software companies, they are becoming platform players. Everybody wants to be like, you know name a company that is huge trillion dollar market cap or whatever, and those are data-driven companies. And so it would seem to me that data-driven applications, there's nobody, no company really shouldn't be data-driven. Do you buy that? 
>> Yeah, absolutely. I mean, data-driven, and that naturally the whole industry is data-driven, right? It's like we all have information technologies about processing data and deriving information out of it. But when it comes to app development I think there is a big push to kind of like we have to do machine learning in our applications, we have to get insights from data. And when you actually look back a bit and take a step back, you see that there's of course many different kinds of applications out there as well that's not to be forgotten, right? So there is a usual front end user interfaces where really the application all it does is just entering some piece of information that's stored somewhere or perhaps a microservice that's not attached to a data tier at all but just receives or asks calls (indistinct). So I think it's not necessarily so important for every developer to kind of go on a bandwagon that they have to be data-driven. But I think it's equally important for those applications and those developers that build applications, that drive the business, that make business critical decisions as Maria mentioned before. Those guys should take really a close look into what data-driven apps means and what the data tier can actually give to them. Because what we see also happening a lot is that a lot of the things that are well known and out there just ready to use are being reimplemented in the applications. And for those applications, they essentially just ended up spending more time writing code that was already there and then have to maintain and debug the code as well rather than just going to market faster. >> Gerald can you talk to the prevailing approaches that developers take to build data-driven applications? What are the ones that you see? Let's dig into that a little bit more and maybe differentiate the different approaches and talk about that? >> Yeah, absolutely. 
I think right now the industry is like in two camps, it's like sort of a religious war going on that you'll see often happening with different architectures and so forth going on. So we have single purpose databases or data management technologies. Which are technologies that are as the name suggests built around a single purpose. So it's like, you know a typical example would be your ordinary key-value store. And a key-value store all it does is it allows you to store and retrieve a piece of data whatever that may be really, really fast but it doesn't really go beyond that. And then the other side of the house or the other camp would be multimodal databases, multimodal data management technologies. Those are technologies that allow you to store different types of data, different formats of data in the same technology in the same system alongside. And, you know, when you look at the geographics out there of what we have from technology, is pretty much any relational database or any database really has evolved into such a multimodal database. Whether that's MySQL that allows you to store JSON alongside relational or even a MongoDB that allows you to do or gives you native graph support since (mumbles) and as well alongside the JSON support. >> Well, it's clearly a trend in the industry. We've talked about this a lot in The Cube. We know where Oracle stands on this. I mean, you just mentioned MySQL but I mean, Oracle Databases you've been extending, you've mentioned JSON, we've got blockchain now in there you're infusing, you know ML and AI into the database, graph database capabilities, you know on and on and on. We talked a lot about we compared that to Amazon which is kind of the right tool, the right job approach. So maybe you could talk about, you know, your point of view, the benefits for developers of using that converged database if I can use that word approach being able to store multiple data formats? 
Why do you feel like that's a better approach? >> Yeah, I think on a high level it comes down to complexity. You are actually avoiding additional complexity, right? So not every use case that you have necessarily warrants to have yet another data management technology or yet another specially built technology for managing that data, right? It's like many use cases that we see out there happily want to just store a JSON document, a piece of JSON, in a database and then perhaps retrieve it again afterwards so write some simple queries over it. And you really don't have to get a new database technology or a NoSQL database into the mix if you already have some to just fulfill that exact use case. You could just happily store that information as well in the database you already have. And what it really comes down to is the learning curve for developers, right? So it's like, as you use the same technology to store other types of data, you don't have to learn a new technology, you don't have to associate yourself with new and learn new drivers. You don't have to find new frameworks and you don't have to know how to necessarily operate or best model your data for that database. You can essentially just reuse your knowledge of the technology as well as the libraries and code you have already built in house perhaps in another application, perhaps, you know framework that you used against the same technology because it is still the same technology. So, kind of all comes down again to avoiding complexity rather than not fragmenting you know, the many different technologies we have. If you were to look at the different data formats that are out there today it's like, you know, you would end up with many different databases just to store them if you were to fully religiously follow the single purpose best built technology for every use case paradigm, right? 
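As a rough illustration of "store that information in the database you already have," here is a minimal sketch that keeps a JSON document next to ordinary relational columns. It uses Python's built-in `sqlite3` and `json` modules purely as a stand-in; the `orders` table, its columns, and the helper names are assumptions made for this example, and it is not how Oracle Database or MySQL expose their JSON features.

```python
import json
import sqlite3

# An in-memory SQLite database stands in for "the database you already have".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"   # ordinary relational column
    " details TEXT NOT NULL"     # JSON document stored as text
    ")"
)


def add_order(conn, customer, details):
    """Insert one order, serializing the nested document to JSON text."""
    cur = conn.execute(
        "INSERT INTO orders (customer, details) VALUES (?, ?)",
        (customer, json.dumps(details)),
    )
    return cur.lastrowid


def get_order(conn, order_id):
    """Fetch one order and parse its JSON document back into Python objects."""
    row = conn.execute(
        "SELECT customer, details FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if row is None:
        return None
    customer, details = row
    return {"customer": customer, "details": json.loads(details)}


order_id = add_order(
    conn, "acme", {"items": [{"sku": "A-100", "qty": 2}], "gift": False}
)
order = get_order(conn, order_id)
```

The same connection, driver, and query habits cover both the relational column and the document, which is the learning-curve argument being made here; a database with native JSON support would also let you query inside the document rather than parsing it in application code.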
And then you would just end up having to manage many different databases more than actually focusing on your app and getting value to your business or to your user. >> Okay, so I get that and I buy that by the way. I mean, especially if you're a larger organization and you've got all these projects going on but before we go back to Maria, Gerald, I want to just, I want to push on that a little bit. Because the counter to that argument would be in the analogy. And I wonder if you, I'd love for you to, you know knock this analogy off the blocks. The counter would be okay, Oracle is the Swiss Army knife and it's got, you know, all in one. But sometimes I need that specialized long screwdriver and I go into my toolbox and I grab that. It's better than the screwdriver in my Swiss Army knife. Why, are you the Swiss Army knife of databases? Or are you the all-in-one have that best of breed screwdriver for me? How do you think about that? >> Yeah, that's a fantastic question, right? And I think it's first of all, you have to separate between Oracle the company that has actually multiple data management technologies and databases out there as you said before, right? And Oracle Database. And I think Oracle Database is definitely a Swiss Army knife has many capabilities of since the last 40 years, you know that we've seen object support coming that's still in the Oracle Database today. We have seen XML coming, it's still in the Oracle Database, graph, spatial, et cetera. And so you have many different ways of managing your data and then on top of that going into the converge, not only do we allow you to store the different data model in there but we actually allow you also to, you apply all the security policies and so forth on top of it something Maria can talk more about the mission around converged database. I would also argue though that for some aspects, we do actually have to or add a screwdriver that you talked about as well. 
So especially in the relational world people get very quickly hung up on this idea that, oh, if you only do rows and columns, well, that's kind of what you put down on disk. And that was never true, it's the relational model is actually a logical model. What's probably being put down on disk is blocks that align themselves nice with block storage and always has been. So that allows you to actually model and process the data sort of differently. And one common example or one good example that we have that we introduced a couple of years ago was when columnar databases were very strong and you know, the competition came it's like, yeah, we have in-memory column stores now, they're so much better. And we were like, well, orienting the data row-based or column-based really doesn't matter in the sense that we store them as blocks on disks. And so we introduced the in-memory technology which gives you an in-memory columnar representation of your data as well alongside your relational. So there is an example where you go like, well, actually you know, if you have this use case of the columnar analytics all in-memory, I would argue Oracle Database is also that screwdriver you want to go down to and gives you that capability. Because not only gives you representation in columnar, but also which many people then forget all the analytic power on top of SQL. It's one thing to store your data columnar, it's a completely different story to actually be able to run analytics on top of that and having all the built-in functionalities and stuff that you want to do with the data on top of it as you analyze it. >> You know, that's a great example, the columnar 'cause I remember there was like a lot of hype around it. Oh, it's the Oracle killer, you know, at Vertica. Vertica is still around but, you know it never really hit escape velocity. But you know, good product, good company, whatever. Netezza, it kind of got buried inside of IBM. 
ParAccel kind of became, you know, Redshift with that deal so that kind of went away. Teradata bought a company, I forget which company it bought but. So that hype kind of dissipated and now it's like, oh yeah, columnar. It's kind of like in-memory, we've had in-memory databases ever since we've had databases, you know, it's kind of a feature not a sector. But anyway, Maria, let's come back to you. You've got a lot of customer experience. And you speak with a lot of companies, you know during your time at Oracle. What else are you seeing in terms of the benefits to this approach that might not be so intuitive and obvious right away? >> I think one of the biggest benefits to having a multi-model, multi-workload or as we call it a converged database, is the fact that you can get greater data synergy from it. In other words, you can utilize all these different techniques and data models to get better value out of that data. So things like being able to do real-time machine learning, fraud detection inside a transaction or being able to do a product recommendation by accessing three different data models. So for example, if I'm trying to recommend a product for you Dave, I might use graph analytics to be able to figure out your community. Not just your friends, but other people on our system who look and behave just like you. Once I know that community then I can go over and see what products they bought by looking up our product catalog which may be stored as JSON. And then on top of that I can then see using the key-value what products inside that catalog those community members gave a five star rating to. So that way I can really pinpoint the right product for you. And I can do all of that in one transaction inside the database without having to transform that data into different models or God forbid, access different systems to be able to get all of that information. So it really simplifies how we can generate that value from the data. 
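The product recommendation Maria describes combines three lookups: a graph step to find a community, a JSON catalog lookup, and a key-value check for five-star ratings. A minimal sketch of that flow is below. It uses plain in-memory Python structures rather than any Oracle API, and every name in it (`FOLLOWS`, `CATALOG_JSON`, `RATINGS`, `community`, `recommend`) and all of the sample data are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical "graph" data: who is connected to whom (adjacency sets).
FOLLOWS = {
    "dave": {"alice", "bob"},
    "alice": {"dave", "carol"},
    "bob": {"dave"},
    "carol": {"alice"},
}

# Hypothetical product catalog stored as JSON documents.
CATALOG_JSON = json.dumps([
    {"sku": "P1", "name": "Trail shoes", "category": "outdoor"},
    {"sku": "P2", "name": "Rain jacket", "category": "outdoor"},
    {"sku": "P3", "name": "Desk lamp", "category": "home"},
])

# Hypothetical key-value ratings: (user, sku) -> stars.
RATINGS = {
    ("alice", "P1"): 5,
    ("bob", "P1"): 5,
    ("bob", "P3"): 4,
    ("carol", "P2"): 5,
}


def community(user, hops=2):
    """Graph step: users reachable from `user` within `hops` edges."""
    seen = {user}
    frontier = {user}
    for _ in range(hops):
        # Expand one hop and drop anyone already visited.
        frontier = {n for u in frontier for n in FOLLOWS.get(u, ())} - seen
        seen |= frontier
    return seen - {user}


def recommend(user):
    """Rank catalog products by five-star ratings from the user's community."""
    members = community(user)
    # Key-value step: count five-star ratings given by community members.
    counts = {}
    for (rater, sku), stars in RATINGS.items():
        if rater in members and stars == 5:
            counts[sku] = counts.get(sku, 0) + 1
    # JSON step: resolve SKUs to product names via the catalog documents.
    catalog = {doc["sku"]: doc for doc in json.loads(CATALOG_JSON)}
    ranked = sorted(counts, key=lambda sku: (-counts[sku], sku))
    return [catalog[sku]["name"] for sku in ranked]


print(recommend("dave"))
```

In a converged database the three steps would run against one system inside one transaction; the sketch only shows how the pieces fit together logically.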
And of course, the other thing our customers love is when it comes to deploying data-driven apps, when you do it on a converged database it's much simpler because it is that standard data platform. So you're not having to manage multiple independent single purpose databases. You're not having to implement the security and the high availability policies, you know across a bunch of different diverse platforms. All of that can be done much simpler with a converged database 'cause the DBA team of course, is going to just use that standard set of tools to manage, monitor and secure those systems. >> Thank you for that. And you know, it's interesting, you talk about simplification and you are in Juan's organization so you've big focus on mission critical. And so one of the things that I think is often overlooked well, we talk about all the time is recovery. And if things are simpler, recovery is faster and easier. And so it's kind of the hallmark of Oracle is like the gold standard of the toughest apps, the most mission critical apps. But I wanted to get to the cloud Maria. So because everything is going to the cloud, right? Not all workloads are going to the cloud but everybody is talking about the cloud. Everybody has cloud first mentality and so yes, it's a hybrid world. But the natural next question is how do you think the cloud fits into this world of data-driven apps? >> I think just like any app that you're developing, the cloud helps to accelerate that development. And of course the deployment of these data-driven applications. 'Cause if you think about it, the developer is instantly able to provision a converged database that Oracle will automatically manage and look after for them. But what's great about doing something like that if you use like our autonomous database service is that it comes in different flavors. 
So you can get autonomous transaction processing, data warehousing or autonomous JSON so that the developer is going to get a database that's been optimized for their specific use case, whatever they are trying to solve. And it's also going to contain all of that great functionality and capabilities that we've been talking about. So what that really means to the developer though is as the project evolves and inevitably the business needs change a little, there's no need to panic when one of those changes comes in because your converged database or your autonomous database has all of those additional capabilities. So you can simply utilize those to be able to address those evolving changes in the project. 'Cause let's face it, none of us normally know exactly what we need to build right at the very beginning. And on top of that they also kind of get a built-in buddy in the cloud, especially in the autonomous database. And that buddy comes in the form of built-in workload optimizations. So with the autonomous database we do things like automatic indexing where we're using machine learning to be that buddy for the developer. So what it'll do is it'll monitor the workload and see what kind of queries are being run on that system. And then it will actually determine if there are indexes that should be built to help improve the performance of that application. And not only does it build those indexes but it verifies that they help improve the performance before publishing them to the application. So by the time the developer is finished with that app and it's ready to be deployed, it's actually also been optimized by the developer's buddy, the Oracle autonomous database. So, you know, it's a really nice helping hand for developers when they're building any app, especially data-driven apps. >> I like how you sort of gave us, you know the truth here is you don't always know where you're going when you're building an app. 
It's like it goes from you are trying to build it and they will come to start building it and we'll figure out where it's going to go. With Agile that's kind of how it works. But so I wonder, can you give some examples of maybe customers or maybe genericize them if you need to. Data-driven apps in the cloud where customers were able to drive more efficiency, where the cloud buddy allowed the customers to do more with less? >> No, we have tons of these but I'll try and keep it to just a couple. One that comes to mind straight away is retrace. These folks built a blockchain app in the Oracle Cloud that allows manufacturers to actually share the supply chain with the consumer. So the consumer can see exactly, who made their product? Using what raw materials? Where they were sourced from? How it was done? All of that is visible to the consumer. And in order to be able to share that they had to work on a very diverse set of data. So they had everything from JSON documents to images as well as your traditional transactions in there. And they store all of that information inside the Oracle autonomous database, they were able to build their app and deploy it on the cloud. And they were able to do all of that very, very quickly. So, you know, that ability to work on multiple different data types in a single database really helped them build that product and get it to market in a very short amount of time. Another customer that's doing something really, really interesting is MindSense. So these guys operate the largest mines in Canada, Chile, and Peru. But what they do is they put these x-ray devices on the massive mechanical shovels that are at the cove or at the mine face. And what that does is it senses the contents of the buckets inside these mining machines. And it's looking to see at that content, to see how it can optimize the processing of the ore inside in that bucket. 
So they're looking to minimize the amount of power and water that it's going to take to process that. And also of course, minimize the amount of waste that's going to come out of that project. So all of that sensor data is sent into an autonomous database where it's going to be processed by a whole host of different users. So everything from the mine engineers to the geo scientists, to even their own data scientists utilize that data to drive their business forward. And what I love about these guys is they're not happy with building just one app. MindSense actually use our built-in low-code development environment, APEX, that comes as part of the autonomous database and they actually produce applications constantly for different aspects of their business using that technology. And it's actually able to accelerate those new apps to the business. It takes them now just a couple of days or weeks to produce an app instead of months or years to build those new apps. >> Great, thank you for that Maria. Gerald, I'm going to push you again. So, I said upfront and talked about microservices and the cloud and containers and you know, anybody in the developer space follows that very closely. But some of the things that we've been talking about here people might look at that and say, well, they're kind of antithetical to microservices. This is Oracle's monolithic approach. But when you think about the benefits of microservices, people want freedom of choice, technology choice, seen as a big advantage of microservices and containers. How do you address such an argument? >> Yeah, that's an excellent question and I get that quite often. With the microservices architecture in general, as I said before with architectures, Linux distributions, et cetera, it's kind of always a bit of like there's an academic approach and there's a pragmatic approach. And when you look at the microservices, the original definitions that came out in the early 2010s. 
They actually never said that each microservice has to have a database. And they also never said that if a microservice has a database, you have to use a different technology for each microservice. Just like they never said, you have to write a microservice in a different programming language, right? So where I'm going with this is like, yes you know, sometimes when you look at some vendors out there, some niche players, they push this message or they jump on this academic approach of like each microservice has the best tool at hand or I'd use a different database for your purpose, et cetera. Which almost often comes across like us. You know, we want to stay part of the conversation. Nothing stops a developer from, you know using a multimodal database for the microservice and just using that as a document store, right? Or just using that as a relational database. And, you know, sometimes I mean, it was actually something that happened that was really interesting yesterday I don't know whether you follow Dave or not. But Facebook had an outage yesterday, right? And Facebook is one of those companies that are seen as the Silicon Valley, you know know how to do microservices companies. And when you add through the outage, well, what happened, right? Some unfortunate logical error with configuration as a force that took a database cluster down. So, you know, there you have it where you go like, well, maybe not every microservice is actually in fact talking to its own database or its own special purpose database. I think there, you know, well, what we should, the industry should be focusing much more on this argument of which technology to use? What's the right tool for a job? Is more to ask themselves, what business problem actually are we trying to solve? And therefore what's the right approach and the right technology for this. And so therefore, just as I said before, you know multimodal databases they do have strong benefits. 
They have many built-in functionalities that are already there and they allow you to reduce this complexity of having to know many different technologies, right? And so it's not only to store different data models, either, you know, treat a multi-model database as a JSON document store or a relational database, and most databases have been multi-model for 20-plus years. But it's also actually that, if you store that data together, you can perhaps actually derive additional value for somebody else but perhaps not for your application. But like for example, if you were to use Oracle Database you can actually write queries on top of all of that data. It doesn't really matter for our query engine whether the data is formatted as JSON or the data is formatted in rows and columns; you can just run a query over it. And that's actually very powerful for those guys that have to, you know get the reporting done at the end of the day, the end of the week. And for those guys that are the data scientists that want to figure out, you know which product performed really well or can we tweak something here and there. When you look into that space you still see a huge divergence between the guys that put data in, kind of the OLTP style, and guys that try to derive new insights. And there's still a lot of ETL going around and, you know we have big data technologies, some of them came and went and some of them came in and are still around like Apache Spark which is still like a SQL engine on top of any of your data, kind of going back to the same concept. And so I will say that, you know, for developers when we look at microservices it's like, first of all, is the argument you were making because the vendor or the technology you want to use tells you this argument or, you know, you kind of want to have an argument to use a specific technology? 
Or is it really more because it is the best technology to use for this given use case, for this given application that you have? And if so there's of course, also nothing wrong with using a single purpose technology either, right? >> Yeah, I mean, whenever I talk about Oracle I always come back to the most important applications, the mission critical. It's very difficult to architect databases with microservices and containers. You have to be really, really careful. And so and again, it comes back to what we were talking before about with Maria, the complexity and the recovery. But Gerald I want to stay with you for a minute. So there's other data management technologies popping out there. I mean, I've seen some people saying, okay just leave the data in an S3 bucket. We can query that, then we've got some magic sauce to do that. And so why are you optimistic about you know, traditional database technology going forward? >> I would say because of the history of databases. So one thing that once struck me when I came to Oracle and then got to meet great people like Juan Loaiza and Andy Mendelsohn who had been here for a long, long time. I came to the realization that relational databases have been around for about 45 years now. And, you know, I was like, I'm too young to have been around then, right? So I was like, what else was around 45 years ago? It's like just the tech stack that we have today. It's like, how does this look? Well, Linux only came out in 93. Well, databases predate Linux by a lot. And as I started digging I saw a lot of technologies come and go, right? And you mentioned before like the technologies, the data management systems that we had that came and went like the columnar databases or XML databases, object databases. And even before relational databases, before Codd gave us the relational model, there were apparently these network stores, network databases, which to some extent look very similar to JSON documents. 
They were about storing data in a hierarchical format. And, you know when you then start actually reading the Codd paper and diving a little bit more into the relational model, there's I think one important crux in there that most of the industry keeps forgetting or hasn't been around to even know. And that is that when Codd created the relational model, he actually focused not so much on the application putting the data in, but on future users and applications still being able to make sense out of the data, right? And that's kind of like I said before, we had those network models, we had XML databases, you have JSON document stores. And the one thing that they all have in common is that the application that puts the data in decides the structure of the data. And that's all well and good if you have an application and the developer writing that application. It can become really tricky when 10 years later you still want to look at that data and the application or the developer is no longer around, then you go like, what does this all mean? Where is the structure defined? What is this attribute? What does it mean? How does it correlate to others? And the one thing that people tend to forget is that it's actually the data that's here to stay, not so much the applications. Ideally, every company wants to store every single byte of data that they have because there might be future value in it. Economically it may not always make sense, though that's now much more feasible than just years ago. But if you could, why wouldn't you want to store all your data, right? And sometimes you actually have to store the data for seven years or whatever because the laws require you to. And so coming back then, you know, like 10 years from now, and looking at the data and making sense of that data can actually become a lot more difficult and a lot more challenging than having to first figure out how we store this data for general use. 
And that kind of was what the relational model was all about. We decompose the data structures into tables and columns with relationships amongst each other so therefore between each other. So that therefore if somebody wants to, you know typical example would be well you store some purchases from your web store, right? There's a customer attribute in it. There's some credit card payment information in it, just some product information on what the customer bought. Well, in the relational model if you just want to figure out which products were sold on a given day or week, you just would query the payment and products table to get the sense out of it. You don't need to touch the customer and so forth. And with the hierarchical model you have to first sit down and understand how is the structure, what is the customer? Where is the payment? You know, does the document start with the payment or does it start with the customer? Where do I find this information? And then in the very early days those databases even struggled to then not having to scan all the documents to get the data out. So coming back to your question a bit, I apologize for going on here. But you know, it's like relational databases have been around for 45 years. I actually argue it's one of the most successful software technologies that we have out there when you look in the overall industry, right? 45 years is like, in IT terms it's like from a star being the ones who are going supernova. You have said it before that many technologies coming and went, right? And just want to add a more really interesting example by the way is Hadoop and HDFS, right? They kind of gave us this additional promise of like, you know, the 2010s like 2012, 2013 the hype of Hadoop and so forth and (mumbles) and HDFS. And people are just like, just put everything into HDFS and worry about the data later, right? And we can query it and map reduce it and whatever. 
And we had customers actually coming to us, they were like, great we have half a petabyte of data on an HDFS cluster and we have no clue what's stored in there. How do we figure this out? What are we going to do now? Now you had a big data cleansing problem. And so I think that is why databases and also data modeling is something that will not go away anytime soon. And I think databases and database technologies are here to stay for quite a while. Because many people don't think about what's happening to the data five years from now. And many of the niche players also, and also frankly even Amazon you know, following with this single purpose thing, it's like, just use the right tool for the job for your application, right? Just put the data in there the way you want it. And it's like, okay, so you use technologies all over the place and then five years from now you have your data fragmented everywhere in different formats and, you know inconsistencies, and, and, and. And those are usually, when you come back to these data-driven, business critical decision applications, the worst case scenario you can have, right? Because now you need an army of people to actually do data cleansing. And it's not a coincidence that data science has become very, very popular in recent years as we kind of went on with this proliferation of different database or data management technologies, some of which are not even databases. But I think I'll leave it at that. >> It's an interesting talk track because you're right. I mean, no schema on write was alluring, but it definitely created some problems. It also created an entire, you know you referenced the hyper specialized roles and the data cleansing component. I mean, maybe technology will eventually solve that problem but it hasn't, at least up until now. Okay, last question, Maria maybe you could start off and Gerald if you want to chime in as well it'd be great. 
I mean, it's interesting to watch this industry when Oracle sort of won the top database mantle. I mean, I watched it, I saw it. It was, remember it was Informix and it was (indistinct) too and of course, Microsoft you got to give them credit with SQL Server, but Oracle won the database wars. And then everything got kind of quiet for a while, database was sort of boring. And then it exploded, you know, all the, you know NoSQL and the key-value stores and the cloud databases and this is really a hot area now. And when we looked at Oracle we said, okay, Oracle it's all about Oracle Database, but we've seen the kind of resurgence in MySQL which everybody thought, you know once Oracle bought Sun they were going to kill MySQL. But now we see you investing in HeatWave, TimesTen, we talked about in-memory databases before. So where do those fit in Maria in the grand scheme? How should we think about Oracle's database portfolio? >> So there's lots of places where you'd use those different things. 'Cause just like any other industry there are going to be new and boutique use cases that are going to benefit from a more specialized product or single purpose product. So good examples off the top of my head of the kind of systems that would benefit from that would be things like a stock exchange system or a telephone exchange system. Both of those are latency critical transaction processing applications where they need microsecond response times. And that's going to exceed perhaps what you might normally get or deploy with a converged database. And so Oracle's TimesTen database, our in-memory database, is perfect for those kinds of applications. But there's also a host of MySQL applications out there today and you said it yourself there Dave, HeatWave is a great place to provision and deploy those kinds of applications because it's going to run 100 times faster than AWS (mumbles). 
So, you know, there really is a place in the market and in our customer's systems and the needs they have for all of these different members of our database family here at Oracle. >> Yeah, well, the internet is basically running on the LAMP stack so I don't see MySQL going away. All right Gerald, will give you the final word, bring us home. >> Oh, thank you very much. Yeah, I mean, as Maria said, I think it comes back to what we discussed before. There are obviously still needs for special technologies or different technologies than a relational database or multi-model database. Oracle has actually many more databases than people may first think of. Not only the three that we have already mentioned but there's even SP so the Oracle's NoSQL database. And, you know, on a high level Oracle is a data management company, right? And we want to give our customers the best tools and the best technology to manage all of their data. And therefore there has to be a need, or there should be a part of the business, that also focuses on these highly specialized systems and these highly specialized technologies that address those use cases. And I think it makes perfect sense. It's like, you know, when the customer comes to Oracle they're not only getting this, take this one product you know, and if you don't like it your problem, but actually you have choice, right? And choice allows you to make a decision based on what's best for you and not necessarily best for the vendor you're talking to. >> Well guys, really appreciate your time today and your insights. Maria, Gerald, thanks so much for coming on The Cube. >> Thank you very much for having us. >> And thanks for watching this Cube conversation this is Dave Vellante and we'll see you next time. (upbeat music)

Published Date : Jun 24 2021

Maria Colgan & Gerald Venzl, Oracle | June CUBEconversation

(upbeat music) >> It'll be five, four, three and then silent two, one, and then you guys just follow my lead. We're just making some last minute adjustments. Like I said, we're down two hands today. So, you good Alex? Okay, are you guys ready? >> I'm ready. >> Ready. >> I got to get get one note here. >> So I noticed Maria you stopped anyway, so I have time. >> Just so they know Dave and the Boston Studio, are they both kind of concurrently be on film even when they're not speaking or will only the speaker be on film for like if Gerald's drawing while Maria is talking about-- >> Sorry but then I missed one part of my onboarding spiel. There should be, if you go into gallery there should be a label. There should be something labeled Boston live switch feed. If you pin that gallery view you'll see what our program currently being recorded is. So any time you don't see yourself on that feed is an excellent time to take a drink of water, scratch your nose, check your notes. Do whatever you got to do off screen. >> Can you give us a three shot, Alex? >> Yes, there it is. >> And then go to me, just give me a one-shot to Dave. So when I'm here you guys can take a drink or whatever >> That makes sense? >> Yeah. >> Excellent, I will get my recordings restarted and we'll open up when Dave's ready. >> All right, you guys ready? >> Ready. >> All right Steve, you go on mute. >> Okay, on me in 5, 4, 3. Developers have become the new king makers in the world of digital and cloud. The rise of containers and microservices has accelerated the transition to cloud native applications. A lot of people will talk about application architecture and the related paradigms and the benefits they bring for the process of writing and delivering new apps. But a major challenge continues to be, the how and the what when it comes to accessing, processing and getting insights from the massive amounts of data that we have to deal with in today's world. 
And with me are two experts from the data management world who will share with us how they think about the best techniques and practices based on what they see at large organizations who are working with data and developing so-called data-driven apps. Please welcome Maria Colgan and Gerald Venzl, two distinguished product managers from Oracle. Folks, welcome, thanks so much for coming on. >> Thanks for having us Dave. >> Thank you very much for having us. >> Okay, Maria let's start with you. So, we throw around this term data-driven, data-driven applications. What are we really talking about there? >> So data-driven applications are applications that work on a diverse set of data. So anything from spatial to sensor data, document data as well as your usual transaction processing data. And what they're going to do is they'll generate value from that data in very different ways to a traditional application. So for example, they may use machine learning, so they are able to do product recommendations in the middle of a transaction. Or we could use graph to be able to identify an influencer within the community so we can target them with a specific promotion. It could also use spatial data to be able to help find the nearest stores to a particular customer. And because these apps are deployed on multiple platforms, everything from mobile devices as well as standard browsers, they need a data platform that's going to be secure, reliable and scalable. >> Well, so when you think about how the workloads are shifting I mean, we're not talking about, you know it's not anymore a world of just your ERP or your HCM or your CRM, you know kind of the traditional operational systems. You really are seeing an explosion of these new data oriented apps. You're seeing, you know, modeling in the cloud, you are going to see more and more inferencing, inferencing at the edge. 
But Maria maybe you could talk a little bit about sort of the benefits that customers are seeing from developing these types of applications. I mean, why should people care about data-driven apps? >> Oh, for sure, there's massive benefits to them. I mean, probably the most obvious one for any business regardless of the industry, is that they not only allow you to understand what your customers are up to, but they allow you to be able to anticipate those customer's needs. So that helps businesses maintain that competitive edge and retain their customers. But it also helps them make data-driven decisions in real time based on actual data rather than on somebody's gut feeling or basing those decisions on historical data. So for example, you can do real-time price adjustments on products based on demand and so forth, that kind of thing. So it really changes the way people do business today. >> So Gerald, you think about the narrative in the industry everybody wants to be a platform player all your customers they are becoming software companies, they are becoming platform players. Everybody wants to be like, you know name a company that is huge trillion dollar market cap or whatever, and those are data-driven companies. And so it would seem to me that data-driven applications, there's nobody, no company really shouldn't be data-driven. Do you buy that? >> Yeah, absolutely. I mean, data-driven, and that naturally the whole industry is data-driven, right? It's like we all have information technologies about processing data and deriving information out of it. But when it comes to app development I think there is a big push to kind of like we have to do machine learning in our applications, we have to get insights from data. And when you actually look back a bit and take a step back, you see that there's of course many different kinds of applications out there as well that's not to be forgotten, right? 
So there are the usual front end user interfaces where really all the application does is just enter some piece of information that's stored somewhere or perhaps a microservice that's not attached to a data tier at all but just receives or makes calls (indistinct). So I think it's not necessarily so important for every developer to kind of go on a bandwagon that they have to be data-driven. But I think it's equally important for those applications and those developers that build applications, that drive the business, that make business critical decisions as Maria mentioned before. Those guys should take really a close look into what data-driven apps means and what the data tier can actually give to them. Because what we see also happening a lot is that a lot of the things that are well known and out there just ready to use are being reimplemented in the applications. And for those applications, they essentially just end up spending more time writing code that is already there and then have to maintain and debug the code as well rather than just going to market faster. >> Gerald can you talk to the prevailing approaches that developers take to build data-driven applications? What are the ones that you see? Let's dig into that a little bit more and maybe differentiate the different approaches and talk about that? >> Yeah, absolutely. I think right now the industry is like in two camps, it's like sort of a religious war going on that you'll see often happening with different architectures and so forth going on. So we have single purpose databases or data management technologies. Which are technologies that are as the name suggests built around a single purpose. So it's like, you know a typical example would be your ordinary key-value store. And a key-value store all it does is it allows you to store and retrieve a piece of data whatever that may be really, really fast but it doesn't really go beyond that. 
And then the other side of the house or the other camp would be multi-model databases, multi-model data management technologies. Those are technologies that allow you to store different types of data, different formats of data in the same technology in the same system alongside each other. And, you know, when you look at the demographics out there of what we have from technology, pretty much any relational database or any database really has evolved into such a multi-model database. Whether that's MySQL that allows you to store JSON alongside relational or even a MongoDB that gives you native graph support since (mumbles) as well alongside the JSON support. >> Well, it's clearly a trend in the industry. We've talked about this a lot in The Cube. We know where Oracle stands on this. I mean, you just mentioned MySQL but I mean, Oracle Database you've been extending, you've mentioned JSON, we've got blockchain now in there, you're infusing, you know ML and AI into the database, graph database capabilities, you know on and on and on. We talked a lot about it, we compared that to Amazon which is kind of the right tool, the right job approach. So maybe you could talk about, you know, your point of view, the benefits for developers of using that converged database, if I can use that word, approach, being able to store multiple data formats? Why do you feel like that's a better approach? >> Yeah, I think on a high level it comes down to complexity. You are actually avoiding additional complexity, right? So not every use case that you have necessarily warrants having yet another data management technology or yet another specially built technology for managing that data, right? It's like many use cases that we see out there happily want to just store a JSON document, a piece of JSON, in a database and then perhaps retrieve it again afterwards or write some simple queries over it. 
And you really don't have to get a new database technology or a NoSQL database into the mix if you already have some to just fulfill that exact use case. You could just happily store that information as well in the database you already have. And what it really comes down to is the learning curve for developers, right? So it's like, as you use the same technology to store other types of data, you don't have to learn a new technology, you don't have to associate yourself with new and learn new drivers. You don't have to find new frameworks and you don't have to know how to necessarily operate or best model your data for that database. You can essentially just reuse your knowledge of the technology as well as the libraries and code you have already built in house perhaps in another application, perhaps, you know framework that you used against the same technology because it is still the same technology. So, kind of all comes down again to avoiding complexity rather than not fragmenting you know, the many different technologies we have. If you were to look at the different data formats that are out there today it's like, you know, you would end up with many different databases just to store them if you were to fully religiously follow the single purpose best built technology for every use case paradigm, right? And then you would just end up having to manage many different databases more than actually focusing on your app and getting value to your business or to your user. >> Okay, so I get that and I buy that by the way. I mean, especially if you're a larger organization and you've got all these projects going on but before we go back to Maria, Gerald, I want to just, I want to push on that a little bit. Because the counter to that argument would be in the analogy. And I wonder if you, I'd love for you to, you know knock this analogy off the blocks. The counter would be okay, Oracle is the Swiss Army knife and it's got, you know, all in one. 
But sometimes I need that specialized long screwdriver and I go into my toolbox and I grab that. It's better than the screwdriver in my Swiss Army knife. Why, are you the Swiss Army knife of databases? Or are you the all-in-one that has that best of breed screwdriver for me? How do you think about that? >> Yeah, that's a fantastic question, right? And I think first of all, you have to separate between Oracle the company that has actually multiple data management technologies and databases out there as you said before, right? And Oracle Database. And I think Oracle Database is definitely a Swiss Army knife, it has gained many capabilities over the last 40 years, you know we've seen object support coming, that's still in the Oracle Database today. We have seen XML coming, it's still in the Oracle Database, graph, spatial, et cetera. And so you have many different ways of managing your data and then on top of that going into the converged, not only do we allow you to store the different data models in there but we actually allow you also to apply all the security policies and so forth on top of it, something Maria can talk more about, the mission around converged database. I would also argue though that for some aspects, we do actually have that screwdriver that you talked about as well. So especially in the relational world people get very quickly hung up on this idea that, oh, if you only do rows and columns, well, that's kind of what you put down on disk. And that was never true, the relational model is actually a logical model. What's actually being put down on disk is blocks that align themselves nicely with block storage and always has been. So that allows you to actually model and process the data sort of differently. 
And one common example or one good example that we have that we introduced a couple of years ago was when columnar databases were very strong and you know, the competition came and it's like, yeah, we have In-Memory columnar stores now, they're so much better. And we were like, well, orienting the data row-based or column-based really doesn't matter in the sense that we store them as blocks on disks. And so we introduced the In-Memory technology which gives you an In-Memory columnar representation of your data as well alongside your relational. So there is an example where you go like, well, actually you know, if you have this use case of columnar analytics all In-Memory, I would argue Oracle Database is also that screwdriver you want to go down to and gives you that capability. Because not only does it give you a representation in columnar, but also, which many people then forget, all the analytic power on top of SQL. It's one thing to store your data columnar, it's a completely different story to actually be able to run analytics on top of that and having all the built-in functionalities and stuff that you want to do with the data on top of it as you analyze it. >> You know, that's a great example, the columnar, 'cause I remember there was like a lot of hype around it. Oh, it's the Oracle killer, you know, Vertica. Vertica is still around but, you know it never really hit escape velocity. But you know, good product, good company, whatever. Netezza, it kind of got buried inside of IBM. ParAccel kind of became, you know, Redshift with that deal so that kind of went away. Teradata bought a company, I forget which company it bought but. So that hype kind of dissipated and now it's like, oh yeah, columnar. It's kind of like In-Memory, we've had In-Memory databases ever since we've had databases you know, it's kind of a feature not a sector. But anyway, Maria, let's come back to you. You've got a lot of customer experience. 
And you speak with a lot of companies, you know during your time at Oracle. What else are you seeing in terms of the benefits to this approach that might not be so intuitive and obvious right away? >> I think one of the biggest benefits to having a multimodel multiworkload or as we call it a converged database, is the fact that you can get greater data synergy from it. In other words, you can utilize all these different techniques and data models to get better value out of that data. So things like being able to do real-time machine learning, fraud detection inside a transaction or being able to do a product recommendation by accessing three different data models. So for example, if I'm trying to recommend a product for you Dave, I might use graph analytics to be able to figure out your community. Not just your friends, but other people on our system who look and behave just like you. Once I know that community then I can go over and see what products they bought by looking up our product catalog which may be stored as JSON. And then on top of that I can then see using the key-value what products inside that catalog those community members gave a five star rating to. So that way I can really pinpoint the right product for you. And I can do all of that in one transaction inside the database without having to transform that data into different models or God forbid, access different systems to be able to get all of that information. So it really simplifies how we can generate that value from the data. And of course, the other thing our customers love is when it comes to deploying data-driven apps, when you do it on a converged database it's much simpler because it is that standard data platform. So you're not having to manage multiple independent single purpose databases. You're not having to implement the security and the high availability policies, you know across a bunch of different diverse platforms. 
All of that can be done much simpler with a converged database 'cause the DBA team of course, is going to just use that standard set of tools to manage, monitor and secure those systems. >> Thank you for that. And you know, it's interesting, you talk about simplification and you are in Juan's organization so you've big focus on mission critical. And so one of the things that I think is often overlooked well, we talk about all the time is recovery. And if things are simpler, recovery is faster and easier. And so it's kind of the hallmark of Oracle is like the gold standard of the toughest apps, the most mission critical apps. But I wanted to get to the cloud Maria. So because everything is going to the cloud, right? Not all workloads are going to the cloud but everybody is talking about the cloud. Everybody has cloud first mentality and so yes, it's a hybrid world. But the natural next question is how do you think the cloud fits into this world of data-driven apps? >> I think just like any app that you're developing, the cloud helps to accelerate that development. And of course the deployment of these data-driven applications. 'Cause if you think about it, the developer is instantly able to provision a converged database that Oracle will automatically manage and look after for them. But what's great about doing something like that if you use like our autonomous database service is that it comes in different flavors. So you can get autonomous transaction processing, data warehousing or autonomous JSON so that the developer is going to get a database that's been optimized for their specific use case, whatever they are trying to solve. And it's also going to contain all of that great functionality and capabilities that we've been talking about. 
So what that really means to the developer though is as the project evolves and inevitably the business needs change a little, there's no need to panic when one of those changes comes in because your converged database or your autonomous database has all of those additional capabilities. So you can simply utilize those to able to address those evolving changes in the project. 'Cause let's face it, none of us normally know exactly what we need to build right at the very beginning. And on top of that they also kind of get a built-in buddy in the cloud, especially in the autonomous database. And that buddy comes in the form of built-in workload optimizations. So with the autonomous database we do things like automatic indexing where we're using machine learning to be that buddy for the developer. So what it'll do is it'll monitor the workload and see what kind of queries are being run on that system. And then it will actually determine if there are indexes that should be built to help improve the performance of that application. And not only does it bill those indexes but it verifies that they help improve the performance before publishing it to the application. So by the time the developer is finished with that app and it's ready to be deployed, it's actually also been optimized by the developers buddy, the Oracle autonomous database. So, you know, it's a really nice helping hand for developers when they're building any app especially data-driven apps. >> I like how you sort of gave us, you know the truth here is you don't always know where you're going when you're building an app. It's like it goes from you are trying to build it and they will come to start building it and we'll figure out where it's going to go. With Agile that's kind of how it works. But so I wonder, can you give some examples of maybe customers or maybe genericize them if you need to. 
Data-driven apps in the cloud where customers were able to drive more efficiency, where the cloud buddy allowed the customers to do more with less? >> No, we have tons of these but I'll try and keep it to just a couple. One that comes to mind straight away is retrace. These folks built a blockchain app in the Oracle Cloud that allows manufacturers to actually share the supply chain with the consumer. So the consumer can see exactly, who made their product? Using what raw materials? Where they were sourced from? How it was done? All of that is visible to the consumer. And in order to be able to share that they had to work on a very diverse set of data. So they had everything from JSON documents to images as well as your traditional transactions in there. And they store all of that information inside the Oracle autonomous database, they were able to build their app and deploy it on the cloud. And they were able to do all of that very, very quickly. So, you know, that ability to work on multiple different data types in a single database really helped them build that product and get it to market in a very short amount of time. Another customer that's doing something really, really interesting is MindSense. So these guys operate the largest mines in Canada, Chile, and Peru. But what they do is they put these x-ray devices on the massive mechanical shovels that are at the cove or at the mine face. And what that does is it senses the contents of the buckets inside these mining machines. And it's looking to see at that content, to see how it can optimize the processing of the ore inside in that bucket. So they're looking to minimize the amount of power and water that it's going to take to process that. And also of course, minimize the amount of waste that's going to come out of that project. So all of that sensor data is sent into an autonomous database where it's going to be processed by a whole host of different users. 
So everything from the mine engineers to the geo scientists, to even their own data scientists utilize that data to drive their business forward. And what I love about these guys is they're not happy with building just one app. MindSense actually use our built-in low core development environment, APEX that comes as part of the autonomous database and they actually produce applications constantly for different aspects of their business using that technology. And it's actually able to accelerate those new apps to the business. It takes them now just a couple of days or weeks to produce an app instead of months or years to build those new apps. >> Great, thank you for that Maria. Gerald, I'm going to push you again. So, I said upfront and talked about microservices and the cloud and containers and you know, anybody in the developer space follows that very closely. But some of the things that we've been talking about here people might look at that and say, well, they're kind of antithetical to microservices. This is our Oracles monolithic approach. But when you think about the benefits of microservices, people want freedom of choice, technology choice, seen as a big advantage of microservices and containers. How do you address such an argument? >> Yeah, that's an excellent question and I get that quite often. The microservices architecture in general as I said before had architectures, Linux distributions, et cetera. It's kind of always a bit of like there's an academic approach and there's a pragmatic approach. And when you look at the microservices the original definitions that came out at the early 2010s. They actually never said that each microservice has to have a database. And they also never said that if a microservice has a database, you have to use a different technology for each microservice. Just like they never said, you have to write a microservice in a different programming language, right? 
So where I'm going with this is like, yes you know, sometimes when you look at some vendors out there, some niche players, they push this message or they jump on this academic approach of like each microservice has the best tool at hand or I'd use a different database for your purpose, et cetera. Which almost often comes across like us. You know, we want to stay part of the conversation. Nothing stops a developer from, you know using a multimodal database for the microservice and just using that as a document store, right? Or just using that as a relational database. And, you know, sometimes I mean, it was actually something that happened that was really interesting yesterday I don't know whether you follow Dave or not. But Facebook had an outage yesterday, right? And Facebook is one of those companies that are seen as the Silicon Valley, you know know how to do microservices companies. And when you add through the outage, well, what happened, right? Some unfortunate logical error with configuration as a force that took a database cluster down. So, you know, there you have it where you go like, well, maybe not every microservice is actually in fact talking to its own database or its own special purpose database. I think there, you know, well, what we should, the industry should be focusing much more on this argument of which technology to use? What's the right tool for a job? Is more to ask themselves, what business problem actually are we trying to solve? And therefore what's the right approach and the right technology for this. And so therefore, just as I said before, you know multimodal databases they do have strong benefits. They have many built-in functionalities that are already there and they allow you to reduce this complexity of having to know many different technologies, right? 
And so it's not only to store different data models either you know, treat a multimodal database as a chasing documents store or a relational database but most databases are multimodal since 20 plus years. But it's also actually being able to perhaps if you store that data together, you can perhaps actually derive additional value for somebody else but perhaps not for your application. But like for example, if you were to use Oracle Database you can actually write queries on top of all of that data. It doesn't really matter for our query engine whether it's the data is format that then chase or the data is formatted in rows and columns you can just rather than query over it. And that's actually very powerful for those guys that have to, you know get the reporting done the end of the day, the end of the week. And for those guys that are the data scientists that they want to figure out, you know which product performed really well or can we tweak something here and there. When you look into that space you still see a huge divergence between the guys to put data in kind of the altarpiece style and guys that try to derive new insights. And there's still a lot of ETL going around and, you know we have big data technologies that some of them come and went and some of them came in that are still around like Apache Spark which is still like a SQL engine on top of any of your data kind of going back to the same concept. And so I will say that, you know, for developers when we look at microservices it's like, first of all, is the argument you were making because the vendor or the technology you want to use tells you this argument or, you know, you kind of want to have an argument to use a specific technology? Or is it really more because it is the best technology, to best use for this given use case for this given application that you have? And if so there's of course, also nothing wrong to use a single purpose technology either, right? 
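To make Gerald's point about querying JSON and relational data together a bit more concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in, which is an assumption made for illustration: the conversation is about Oracle Database, whose JSON syntax differs, and the sketch relies on SQLite's JSON functions being available in the local build. The table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical schema: relational order rows next to JSON product documents.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER);
    CREATE TABLE catalog (product_id INTEGER PRIMARY KEY, doc TEXT);
    INSERT INTO catalog VALUES
        (1, '{"name": "boots", "price": 80.0}'),
        (2, '{"name": "scarf", "price": 20.0}');
    INSERT INTO orders VALUES (10, 1, 2), (11, 2, 1), (12, 1, 1);
""")

def revenue_by_product(conn):
    """Join relational order rows with fields pulled out of JSON documents."""
    return conn.execute("""
        SELECT json_extract(c.doc, '$.name') AS name,
               SUM(o.quantity * json_extract(c.doc, '$.price')) AS revenue
        FROM orders AS o
        JOIN catalog AS c ON c.product_id = o.product_id
        GROUP BY name
        ORDER BY name
    """).fetchall()

# One SQL statement reads both the rows-and-columns data and the JSON documents.
for name, revenue in revenue_by_product(conn):
    print(name, revenue)
```

The design choice the sketch illustrates is that the reporting query needs no ETL step and no second system: the same engine reads the `orders` rows and the JSON in `catalog.doc` within a single statement.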
>> Yeah, I mean, whenever I talk about Oracle I always come back to the most important applications, the mission critical. It's very difficult to architect databases with microservices and containers. You have to be really, really careful. And so and again, it comes back to what we were talking before about with Maria that the complexity and the recovery. But Gerald I want to stay with you for a minute. So there's other data management technologies popping out there. I mean, I've seen some people saying, okay just leave the data in an S3 bucket. We can query that, then we've got some magic sauce to do that. And so why are you optimistic about you know, traditional database technology going forward? >> I would say because of the history of databases. So one thing that once struck me when I came to Oracle and then got to meet great people like Juan Luis and Andy Mendelsohn who had been here for a long, long time. I come to realization that relational databases are around for about 45 years now. And, you know, I was like, I'm too young to have been around then, right? So I was like, what else was around 45 years? It's like just the tech stack that we have today. It's like, how does this look like? Well, Linux only came out in 93. Well, databases pre-date Linux a lot rather than as I started digging I saw a lot of technologies come and go, right? And you mentioned before like the technologies that data management systems that we had that came and went like the columnar databases or XML databases, object databases. And even before relational databases before Cot gave us the relational model there were apparently these networks stores network databases which to some extent look very similar to adjacent documents. There wasn't a harder storing data and a hierarchy to format. 
And, you know when you then start actually reading the Cot paper and diving a little bit more into the relation model, that's I think one important crux in there that most of the industry keeps forgetting or it hasn't been around to even know. And that is that when Cot created the relational model, he actually focused not so much on the application putting the data in, but on future users and applications still being able to making sense out of the data, right? And that's kind of like I said before we had those network models, we had XML databases you have adjacent documents stores. And the one thing that they all have along with it is like the application that puts the data in decides the structure of the data. And that's all well and good if you had an application of the developer writing an application. It can become really tricky when 10 years later you still want to look at that data and the application that the developer is no longer around then you go like, what does this all mean? Where is the structure defined? What is this attribute? What does it mean? How does it correlate to others? And the one thing that people tend to forget is that it's actually the data that's here to stay not someone who does the applications where it is. Ideally, every company wants to store every single byte of data that they have because there might be future value in it. Economically may not make sense that's now much more feasible than just years ago. But if you could, why wouldn't you want to store all your data, right? And sometimes you actually have to store the data for seven years or whatever because the laws require you to. And so coming back then and you know, like 10 years from now and looking at the data and going like making sense of that data can actually become a lot more difficult and a lot more challenging than having to first figure out and how we store this data for general use. And that kind of was what the relational model was all about. 
We decompose the data structures into tables and columns with relationships amongst each other so therefore between each other. So that therefore if somebody wants to, you know typical example would be well you store some purchases from your web store, right? There's a customer attribute in it. There's some credit card payment information in it, just some product information on what the customer bought. Well, in the relational model if you just want to figure out which products were sold on a given day or week, you just would query the payment and products table to get the sense out of it. You don't need to touch the customer and so forth. And with the hierarchical model you have to first sit down and understand how is the structure, what is the customer? Where is the payment? You know, does the document start with the payment or does it start with the customer? Where do I find this information? And then in the very early days those databases even struggled to then not having to scan all the documents to get the data out. So coming back to your question a bit, I apologize for going on here. But you know, it's like relational databases have been around for 45 years. I actually argue it's one of the most successful software technologies that we have out there when you look in the overall industry, right? 45 years is like, in IT terms it's like from a star being the ones who are going supernova. You have said it before that many technologies coming and went, right? And just want to add a more really interesting example by the way is Hadoop and HDFS, right? They kind of gave us this additional promise of like, you know, the 2010s like 2012, 2013 the hype of Hadoop and so forth and (mumbles) and HDFS. And people are just like, just put everything into HDFS and worry about the data later, right? And we can query it and map reduce it and whatever. 
And we had customers actually coming to us they were like, great we have half a petabyte of data on an HDFS cluster and we have no clue what's stored in there. How do we figure this out? What are we going to do now? Now you had a big data cleansing problem. And so I think that is why databases and also data modeling is something that will not go away anytime soon. And I think databases and database technologies are here for quite a while to stay. Because many of those are people they don't think about what's happening to the data five years from now. And many of the niche players also and also frankly even Amazon you know, following with this single purpose thing is like, just use the right tool for the job for your application, right? Just pull in the data there the way you wanted. And it's like, okay, so you use technologies all over the place and then five years from now you have your data fragmented everywhere in different formats and, you know inconsistencies, and, and, and. And those are usually when you come back to this data-driven business critical business decision applications the worst case scenario you can have, right? Because now you need an army of people to actually do data cleansing. And there's not a coincidence that data science has become very, very popular the last recent years as we kind of went on with this proliferation of different database or data management technologies some of those are not even database. But I think I leave it at that. >> It's an interesting talk track because you're right. I mean, no schema on right was alluring, but it definitely created some problems. It also created an entire, you know you referenced the hyper specialized roles and did the data cleansing component. I mean, maybe technology will eventually solve that problem but it hasn't up at least up tonight. Okay, last question, Maria maybe you could start off and Gerald if you want to chime in as well it'd be great. 
I mean, it's interesting to watch this industry when Oracle sort of won the top database mantle. I mean, I watched it, I saw it. It was, remember it was Informix and it was (indistinct) too and of course, Microsoft you got to give them credit with SQL server, but Oracle won the database wars. And then everything got kind of quiet for awhile database was sort of boring. And then it exploded, you know, all the, you know not only SQL and the key-value stores and the cloud databases and this is really a hot area now. And when we looked at Oracle we said, okay, Oracle it's all about Oracle Database, but we've seen the kind of resurgence in MySQL which everybody thought, you know once Oracle bought Sun they were going to kill MySQL. But now we see you investing in HeatWave, TimesTen, we talked about In-Memory databases before. So where do those fit in Maria in the grand scheme? How should we think about Oracle's database portfolio? >> So there's lots of places where you'd use those different things. 'Cause just like any other industry there are going to be new and boutique use cases that are going to benefit from a more specialized product or single purpose product. So good examples off the top of my head of the kind of systems that would benefit from that would be things like a stock exchange system or a telephone exchange system. Both of those are latency critical transaction processing applications where they need microsecond response times. And that's going to exceed perhaps what you might normally get or deploy with a converged database. And so Oracle's TimesTen database our In-Memory database is perfect for those kinds of applications. But there's also a host of MySQL applications out there today and you said it yourself there Dave, HeatWave is a great place to provision and deploy those kinds of applications because it's going to run 100 times faster than AWS (mumbles). 
So, you know, there really is a place in the market, and in our customers' systems and the needs they have, for all of these different members of our database family here at Oracle. >> Yeah, well, the internet is basically running on the LAMP stack, so I don't see MySQL going away. All right, Gerald, we'll give you the final word, bring us home. >> Oh, thank you very much. Yeah, I mean, as Maria said, I think it comes back to what we discussed before. There are obviously still needs for special technologies, or different technologies than a relational database or multimodal database. Oracle actually has many more databases than people may first think of. Not only the three that we have already mentioned, but there's even SP, so Oracle's NoSQL database. And, you know, on a high level Oracle is a data management company, right? And we want to give our customers the best tools and the best technology to manage all of their data. And therefore there has to be, or there should be, a part of the business that also focuses on these highly specialized systems and technologies that address those use cases. And I think it makes perfect sense. It's like, you know, when the customer comes to Oracle, it's not just "take this one product, and if you don't like it, your problem," but actually you have choice, right? And choice allows you to make a decision based on what's best for you and not necessarily best for the vendor you're talking to. >> Well guys, really appreciate your time today and your insights. Maria, Gerald, thanks so much for coming on The Cube. >> Thank you very much for having us. >> And thanks for watching this Cube conversation. This is Dave Vellante and we'll see you next time. (upbeat music)

Published Date : Jun 24 2021

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Gerald Venzl	PERSON	0.99+
Andy Mendelsohn	PERSON	0.99+
Maria	PERSON	0.99+
Dave	PERSON	0.99+
Chile	LOCATION	0.99+
Maria Colgan	PERSON	0.99+
Peru	LOCATION	0.99+
100 times	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
Gerald	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Canada	LOCATION	0.99+
seven years	QUANTITY	0.99+
Juan Luis	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Steve	PERSON	0.99+
five star	QUANTITY	0.99+
Swiss Army	ORGANIZATION	0.99+
Alex	PERSON	0.99+
Facebook	ORGANIZATION	0.99+
MySQL	TITLE	0.99+
one note	QUANTITY	0.99+
yesterday	DATE	0.99+
two hands	QUANTITY	0.99+
three	QUANTITY	0.99+
two experts	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Linux	TITLE	0.99+
Teradata	ORGANIZATION	0.99+
each microservice	QUANTITY	0.99+
Hadoop	TITLE	0.99+
45 years	QUANTITY	0.99+
Oracles	ORGANIZATION	0.99+
early 2010s	DATE	0.99+
today	DATE	0.99+
one-shot	QUANTITY	0.99+
five	QUANTITY	0.99+
one good example	QUANTITY	0.99+
Sun	ORGANIZATION	0.99+
tonight	DATE	0.99+
first	QUANTITY	0.99+

Andy Mendelsohn, Oracle | CUBE Conversation, March 2021

The cloud has dramatically changed the way providers think about delivering database technologies. Not only has cloud-first become a mandate for many, if not most, but customers are demanding more capabilities from their technology vendors. Examples include a substantially similar experience for cloud and on-prem workloads, increased automation, and a never-ending quest for more secure platforms. Broadly, there are two prevailing models that have emerged. One is to provide highly specialized database products that focus on optimizing for a specific workload signature. The other end of the spectrum combines technologies in a converged platform to satisfy the needs of a much broader set of use cases. And with me to get a perspective on these and other issues is Andy Mendelsohn, executive vice president at Oracle, the world's leading database company. Andy leads database server technologies. Hello Andy, thanks for coming on. >> Hey Dave, glad to be here. >> Okay, so we saw the recent announcements. This is kind of your baby, around next-generation Autonomous Data Warehouse. Maybe you could take us through the path you took from the original cloud data warehouses to where we are today. >> Yeah, when we first brought Autonomous Database out, we were basically a second-generation technology at that point. You know, we decided that what customers wanted was, at the push of a button, to provision the really powerful Oracle Database technology that they've been using for years, and we did that with Autonomous Database. And beyond that, we provided a very unique capability around self-tuning, self-driving of the database, which is something the first-generation vendors didn't provide. And this is really important because customers today are, you know, developers and data analysts who, at the push of a button, build out their data warehouses, but they're not experts in tuning. And so what we thought was really important is that customers get great
performance out of the box, and that's one of the really unique things about Autonomous Data Warehouse, and Autonomous Database in particular. And then this latest generation that we just came out with also answers the questions we got from the data analysts and developers. They said, you know, it's really great that I can press a button and provision this very powerful data warehouse infrastructure, or database infrastructure, from Oracle. But if I'm an analyst, I want data. So it's still hard for me to go and get data from various data sources, transform them, clean them up, and get them to a place where I can start querying the data. Now I still need data engineers to help me do that. And so in the new release we said, okay, we want to give data analysts, data engineers, data scientists, and developers a true self-service experience where they can do their job completely without bringing in any engineers from their IT organization. And so that's what this new version is all about. >> Yeah, awesome. I mean, look, years ago you guys identified the IT labor problem, and you've been focused on putting your R&D into solving that problem for customers, so we're really starting to see that hit now. Now, Gartner recently did some analysis. They ranked and rated some of the more popular cloud databases, and Oracle did very well, particularly in operational categories. I mean, on the operational side and the mission-critical stuff, you smoked everybody. We had Marc Staimer and David Floyer on, and our big takeaways were that you're again dominating in those mission-critical workloads, that dominance continues, but your approach of converging functionality really differs from some others that we saw. I mean, obviously when you get high ratings from Gartner you're pretty stoked about that, but what do you think contributed to those rankings, and what are you finding specifically in customer interactions?
>> Yeah, so Gartner does a lot of its analysis based on talking to customers, finding out how these products that sound great on paper actually work in practice. And I think that's one of the places where Oracle Database technology really shines. It solves real-world problems. It's been doing it for a long time, and as we've moved that technology into the cloud, that continues. The differentiation we've built up over the years really stands out. You know, you look at Amazon's databases: they generally take some open source technology that isn't that new, it could be 30 years old, 25 years old, and they put it up on the cloud and they say, oh, it's cloud native, it's great. But in fact it's the same old technology that doesn't really compete; it's, you know, a decade behind Oracle's database technology. So I think the Gartner analysis really showed that sort of thing quite clearly. >> Yeah, so let's talk about that a little bit, because one of the things I've learned over the last many years of following this business is there are a lot of ways to skin a cat. And cloud database vendors, you mentioned AWS, you know, look at Snowflake, with their right-tool-for-the-right-job approach, they're going to say that their specialty databases, their focused databases, are better than your converged approach, which they think of as, you know, a Swiss Army knife. What's your take on that? >> Yeah, well, the converged approach is something of course we've been working on for a long time. The idea is pretty simple. Think about your smartphone. If you can think back over 10 years ago, you used to have a camcorder and a camera and a messaging device and also a dumb phone, and all those different devices got converged into what we now call the smartphone. Why did the smartphone win? It's just simply much more productive for you to carry one device around that is actually
best of breed in all the different categories, instead of lots of separate devices. And that's what we're doing with converged database. Over the years we've been able to build out technologies that are really good at transaction processing, at analytics for data warehousing; now we're working on JSON technologies, graph technologies. The other vendors basically can't do this. I mean, it's much easier to build a specialty database that does one thing than to build out a converged database that does N things really well, and that's what we've been doing for years. And again, it's based on technology that we've invested in for quite a long time, and it's something that I think customers and developers and analysts find to be a much more productive way of doing their jobs. >> It's very unique, and not common at all, to see a technology that's been around as long as Oracle Database morph into a more modern platform. I mean, you mentioned AWS leverages open source a lot. Snowflake would say, hey, we are born in the cloud, and they are. I think Google BigQuery would be another good example. But I want to get your take on this notion of born in the cloud. Those folks would say, well, we're superior to Oracle because, you know, they started decades ago, not necessarily as native cloud services. How have you been able to address that? I know cloud-first is kind of the buzzword, but how have you made that sort of transparent or irrelevant to users? Because you are cloud-first. Maybe you could talk about how you've been able to achieve that and convince us that you actually really are cloud native now. >> You know, one of the things we sort of like pointing out is that Oracle very uniquely has had this scale-out technology for running all kinds of workloads, not just analytic workloads, which is what you see out in the cloud there, but we can also scale out transaction processing
workloads. Now, that was another one of the reasons we do so well in, for example, the Gartner analysis for operational workloads. And that technology is really valuable as we went to cloud; it lets us do some really unique things. And the most obvious unique thing we have is something we like to call cloud-native instant elasticity. With our technology, if you want to provision some amount of compute to run your workloads, you can provision exactly what you need. If you need 17 CPUs to get your job done, you provision 17 CPUs when you provision your Autonomous Database. Our competitors who claim to be born in the cloud, like Snowflake and Amazon, still use this archaic way of provisioning servers based on shapes. Snowflake says, which shape cluster do you want? You want 16, you want 32, you want 64? It goes up by a power of 2, which means, compared to what Oracle does, you have to provision up to twice as much CPU as you really need. So if you really need 17, they make you provision 32. If you really need 33, they make you provision 64.
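The arithmetic behind the "17 becomes 32, 33 becomes 64" claim can be sketched in a few lines of Python. This is only an illustration of the speaker's description, not any vendor's actual sizing logic: the function names, the starting shape of 16, and the strict doubling rule are all assumptions made for the example.

```python
def smallest_shape_for(cpus_needed: int, smallest_shape: int = 16) -> int:
    """Return the smallest power-of-two style shape that covers cpus_needed.

    Hypothetical model: shapes start at `smallest_shape` and double each step
    (16, 32, 64, ...), as described in the conversation.
    """
    if cpus_needed < 1:
        raise ValueError("cpus_needed must be at least 1")
    shape = smallest_shape
    while shape < cpus_needed:
        shape *= 2
    return shape


def overprovision_ratio(cpus_needed: int) -> float:
    """Ratio of provisioned CPUs to needed CPUs under the shape model."""
    return smallest_shape_for(cpus_needed) / cpus_needed


if __name__ == "__main__":
    for need in (17, 33):
        shape = smallest_shape_for(need)
        ratio = overprovision_ratio(need)
        print(f"need {need} CPUs -> shape {shape} ({ratio:.2f}x the requirement)")
```

Under this model the ratio approaches, but never reaches, two when the requirement sits just above a shape boundary, which matches the "up to twice as much" wording. With exact provisioning the ratio is always one.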
So this is not a cloud-native experience at all; it's an archaic way of doing things. And we like to point out that with our instant elasticity we can go from 17 to 18 to 19, whatever you want. Plus we have something called auto scale, so you can set your baseline to be 17, let's say, but we will automatically, based on your workload, scale you up to three times that, so in this case 51. And because of that true elasticity, we are really the only ones that can deliver a true pay-as-you-go, just-pay-for-what-you-need kind of capability, which is certainly what Amazon was talking about when they first called their cloud elastic. But it turns out for database services these guys still do this archaic thing with shapes. So that's a really good example of where we're quite a bit better than the other guys, and much more cloud native than the other guys. >> I want to follow up on that, just stay here for a second, because you're basically saying we have better granularity than the so-called cloud-native guys. Now, you mentioned Snowflake: you've got the shapes, you've got to choose which shape you want. It sounds like Redshift is the same, and of course I know the way in which Amazon separates compute from storage is largely a tiering exercise, so it's not as smooth as you might expect, but nonetheless it's good. How is it that you were able to achieve this with a database that was born, you know, many decades ago? I mean, what is it from a technical standpoint, an R&D standpoint, that you were able to do? Did you design that in in the 1980s? How did you get here? >> Yeah, well, it's a combination of interesting technologies. So Autonomous Database has the Oracle Database software, and that software is running on a very powerful infrastructure optimized for database, based on the Exadata technology that we've had on-prem for many years. We brought that to the
cloud, and that technology is a scale-out infrastructure that supports, you know, thousands of CPUs. And then we use our multi-tenant technology, which is a way of sharing large infrastructures amongst separate clients, and we divide it up dynamically on the fly. So if there are thousands of CPUs, you know, this guy wants 20 and this one wants 30, we divide it up and give them exactly what they need. And if they want to grow, we just take some extra CPUs that are in reserve and we give them to them instantly. So that's a very different way of doing things than a shape-based approach, where what Snowflake and Amazon do under the covers is give you a real physical server, or a cluster, and that's how they provision. If you want to grow, they give you another big physical cluster, which takes a long time to get populated with data and get working. We just have that one infrastructure that we're sharing among lots of users, and we just give you a little extra capacity. It's done instantly; there's no need for data to be moved to populate the new clusters that Snowflake or Amazon are provisioning for you. So it's a very different way of doing things. >> And you're able to do that because of the tight integration, you mentioned Exadata, between the hardware and software. David Floyer calls it the iPhone of enterprise. Sometimes you get some grief for that, but it's not a bad metaphor. But is that really the sort of secret? >> Well, the big secret under the covers is this Exadata technology, our Real Application Clusters scale-out technologies, our multi-tenant technologies. So these are things we've been working on for a long time, and they are very mature, very powerful technologies, and they really provide very unique benefits in a cloud world where people want things to happen instantly and they want them to work well for any kind of workload. You know, that's why we
talk about being converged. We can do mixed workloads; you can do transactions and analytics all on the same data. The other guys can't do that. They're really good at, like you said, a narrow workload, like I can do analytics, or I can do graph, or I can do JSON, but they can't really do the combination, which is what real-world applications are like. They're not purely one thing versus another. >> Right, thank you for that. So one of the questions people want to know is, can Oracle attract new customers that aren't existing Oracle customers? So maybe you could talk about that, and why should somebody who's not an existing Oracle customer think about using Autonomous Database? >> Yeah, that's a really good question. Oracle, if you look at our customer base, has a lot of really large enterprises. The biggest banks and the biggest telcos, they run Oracle, they run their businesses on Oracle, and these guys are sort of the most conservative of the bunch out there, and they are moving to cloud at a somewhat slower rate than the smaller companies. And so if you look at who's using Autonomous Database now, it's actually the smaller companies. The same type of people that first decided Amazon was an interesting cloud 10 years ago, they're also using our technologies, and it's for the same reason. They don't have large IT organizations, they don't have large numbers of engineers to engineer their infrastructure, and that's why cloud is so attractive to them. And Autonomous Database on top of cloud is really attractive as well, because information is the lifeblood of every organization, and if they can empower their analysts to get their job done without lots of help from IT organizations, they're going to do it. And that's really what's made Autonomous Database really interesting. The whole self-driving nature is very attractive to the smaller shops that don't have a lot of
sophisticated IT expertise. >> All right, let's talk about developers. You guys are the stewards of the Java community, so obviously, you know, probably the most popular programming language out there. But when I think of developers, I think of guys in hoodies pounding away, but when I think of Oracle developers, I might think of maybe an app dev team inside some of those large customers that you talked about. Why would developers and/or analysts be interested in using Oracle as opposed to some of those more focused, narrow-use databases that we were talking about earlier? >> Yeah, so if you're a developer, you want to get your job done as fast as possible, and so having a database that gives you the most productive application development experience is important to you. We've been talking about converged database off and on. So if I'm a developer and I have a given job to do, a converged database that lets me do a combination of analytics and transactions, and do a little JSON and a little graph, all in one, is a much more productive place to go. Because if I don't have something like that, then I'm stuck taking my application and breaking it up into pieces. You know, this piece I'm going to run on, say, Aurora on Amazon, and this piece I have to run on the graph database, and here's some JSON, I've got to run that on some document database. And then I have to move the data around; the data gets sort of fragmented between these databases, and I have to do all this data integration and whatever. With a converged database, I have a much simpler world where I can just use one technology stack. I can get my job done, and then I'm future-proof against change. You know, requirements change all the time. So you build the initial version of the application and your users say, you know, this is not what I want, I want something else. And it turns out that something else often is, well, I want analytics, and you used
something like, you know, a document store technology that has really poor analytic capabilities, and so you have to take that data and move it to another database. With our converged approach you don't have to do that. You're already in a place where everything works; everything that you could possibly need in the future is going to be there as well. And so for developers, I think converged is the right way to go. Plus, for people who are what we call citizen developers, you know, like the data analysts, they code, they write a little code occasionally, but they're really after getting value out of the data, we have this really fabulous no-code/low-code tool called APEX. And APEX is again a very mature technology; it's been around for years, and it lets somebody who's just a data analyst, who knows a little SQL but doesn't want to write code, get their job done really fast. And we've published some benchmarks on our website showing that basically you can get the job done 20 to 40 times faster using a no-code/low-code tool like APEX versus just writing lots of traditional code. >> I'm glad you brought up APEX. We recently interviewed one of your former colleagues, Amit Zavery, and all he would talk about is low code, no code. And then in the APEX announcement you said something to the effect of, coding should be the exception, not the rule. Did you mean that? What do you mean by that? >> Yeah, so APEX is a tool that people use with our database technology for building what we call data-driven applications. So if you've got a bunch of data and you want to get some value out of it, you want to build maybe dashboards or more sophisticated reports, APEX is an incredible tool for doing that. And it's modern: it builds applications that look great on your smartphone, and it automatically renders that same user interface on a bigger device like a laptop or desktop as well. And it's one of
these things that the people that use it just go bonkers with. It's a viral technology: they get really excited about how productive they've been using it, and they tell all their friends. And I think we decided, I guess about a year ago when we came up with this APEX service, that we really want to start going bigger on the marketing around it, because it's very unique. Nobody else has anything quite like it, and again it just adds value to the whole developer productivity story around an Oracle database. So that's why we have the APEX service now, and we also have APEX available with every Oracle database on the cloud. >> Got it. I want to ask you about some of the features around 21c. There are a lot of them that you announced earlier this year. Maybe you could tease out some of the top things that we should be paying attention to in 21c. >> Yeah, sure. So one of the ways to look at 21c is we're continuing down this path of a converged database, and so one of the marquee features in 21c is something we call blockchain tables. So what is blockchain? Well, blockchain is the technology that's under the covers behind Bitcoin. It's a way of creating a tamper-proof data store that was used by the original Bitcoin algorithms. Well, developers actually like having tamper-proof data objects in databases too. And so what we decided to do was say, well, if I create a SQL table in an Oracle database, what if there's a new option that just says I want that table implemented using blockchain technology, to make the table tamper proof and fully audited, etc.? And so we just did that. So in 21c you can now get basically another feature of the converged database that says, give me a SQL table, I can do everything, I can query it, I can insert rows into it, but it's tamper proof: I can't ever update it, I can't delete rows from it. Amazon did their usual thing: they took, again, some open source technology and they said, hey, we
got this great thing called Quantum Ledger Database, and it does blockchain tables. But if you want to do blockchain tables in any of their other databases, you're out of luck; they don't have it. You have to go move the data into this new thing. And it's again showing the problem with this proprietary approach of having specialty databases versus just having one converged database that does it all. So that's the blockchain table feature. We did a bunch of other things. The one I think is worth mentioning the most is support for persistent memory. A lot of people out there haven't noticed this very interesting technology that Intel shipped a couple of years ago called Optane data center memory. What it is, basically, is a hybrid of flash memory, which is persistent memory, and standard DRAM, which is not persistent, meaning you can't store a database in DRAM. And so with this persistent memory you can basically have a database stored persistently in memory all the time. So it's a very innovative new technology, and from a database standpoint it's a very disruptive technology to the database market, because now you can have an in-memory database, basically, all the time, 24/7.
And so 21c is the first database out there that has native support for this new kind of persistent memory technology, and we think it's really important, so we're actually making it available to our 19c customers as well. And that's another technology I'd call out that we think is very unique. We're way ahead of the game there, and we're going to continue investing in that space moving forward as well. >> Yeah, so that layer in between DRAM and persistent flash, that's a great innovation, and game changing from a performance standpoint and actually in the way you write applications. But I've got to ask you: I and all the analysts were on with Juan recently, Juan Loaiza, listening to that introduction of blockchain, and everybody wants to know, is Safra going to start putting Bitcoin on the Oracle balance sheet? Are you about to make that leap? >> Yeah, that's a good question. Who knows? I can't comment on speculation. >> Ah, that would be interesting. Okay, last question, then we've got to go. Look, the narrative on Oracle is you're expensive and you're mean, you know, it's hard to do business with you. Do you care? Are you doing things to maybe change that perception in the cloud? >> Yeah, I think we've made a very conscious decision that as we move to the cloud, we're offering a totally new business model on the cloud that is a cloud-native model. You pay for what you use, you have everyday low prices, and you don't have to negotiate with some salesman for months to get a good price. So yeah, we'd really like the message to get out there that those of you who think you know what Oracle's all about, and how it might be to work with Oracle from your on-premises days, should really check out how Oracle is now on the cloud. We have this Autonomous Database technology, really easy to use, really simple; any analyst can get value out of the data without any help from any other engineers. It's very unique. It's the same
technology you're used to, but now it's delivered in a way that's much easier to consume and much lower cost. And so yeah, you should definitely take a look at what we've got out there on the cloud, and it's all free to try out. We've got this free tier: you can provision free VMs, free databases, free APEX, whatever you want, and try it out and see what you think. >> Well, thanks for that. I was kidding about mean; I have a lot of friends at Oracle, some relatives as well. And thanks, Andy, for coming on The Cube today. It's really great to talk to you. >> Yeah, it's my pleasure. >> And thanks for watching. This is Dave Vellante. We'll see you next time.
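Earlier in this conversation, blockchain tables are described as ordinary tables that accept queries and inserts but can never be updated or deleted from. A rough way to picture that tamper-evident, insert-only behavior is a hash chain, where each row stores a hash that covers the previous row. The Python sketch below is a toy illustration under that assumption only. It is not Oracle's implementation, and the class and method names (`AppendOnlyLedger`, `insert`, `query`, `verify`) are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first row


def _chain_hash(prev_hash: str, payload: str) -> str:
    """Hash a row's payload together with the previous row's hash."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class AppendOnlyLedger:
    """Toy insert-only table: rows can be added and read, never changed."""

    def __init__(self) -> None:
        self._rows = []

    def insert(self, data: dict) -> str:
        """Append a row and return its chained hash."""
        prev_hash = self._rows[-1]["hash"] if self._rows else GENESIS_HASH
        payload = json.dumps(data, sort_keys=True)
        row_hash = _chain_hash(prev_hash, payload)
        self._rows.append(
            {"data": payload, "prev_hash": prev_hash, "hash": row_hash}
        )
        return row_hash

    def query(self) -> list:
        """Return copies of all row contents, in insertion order."""
        return [json.loads(row["data"]) for row in self._rows]

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed row breaks it."""
        prev_hash = GENESIS_HASH
        for row in self._rows:
            expected = _chain_hash(prev_hash, row["data"])
            if row["prev_hash"] != prev_hash or row["hash"] != expected:
                return False
            prev_hash = row["hash"]
        return True
```

The class deliberately exposes no update or delete method, so the only supported operations are the ones named in the interview. If someone alters stored data behind the API, `verify()` recomputes the chain and reports the mismatch.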

Published Date : Mar 29 2021

ENTITIES

Entity	Category	Confidence
Andy Mendelsohn	PERSON	0.99+
amazon	ORGANIZATION	0.99+
March 2021	DATE	0.99+
20	QUANTITY	0.99+
gartner	ORGANIZATION	0.99+
oracle	ORGANIZATION	0.99+
apex	TITLE	0.99+
juan loyza	PERSON	0.99+
first database	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
david floyer	PERSON	0.99+
two prevailing models	QUANTITY	0.99+
twice	QUANTITY	0.98+
dave vellante	PERSON	0.98+
today	DATE	0.98+
first generation	QUANTITY	0.98+
10 years ago	DATE	0.98+
thousands of cpus	QUANTITY	0.98+
decades ago	DATE	0.98+
40 times	QUANTITY	0.98+
51	OTHER	0.97+
25 years old	QUANTITY	0.97+
30	QUANTITY	0.96+
first	QUANTITY	0.96+
andy mendelson	PERSON	0.96+
17	OTHER	0.96+
1980s	DATE	0.96+
second generation	QUANTITY	0.96+
33	OTHER	0.96+
one	QUANTITY	0.96+
30 years old	QUANTITY	0.96+
json	ORGANIZATION	0.95+
earlier this year	DATE	0.95+
one device	QUANTITY	0.94+
amit xavery	PERSON	0.94+
mark stamer	PERSON	0.92+
google	ORGANIZATION	0.91+
years	DATE	0.91+
32	OTHER	0.9+
about a year ago	DATE	0.9+
oracle	TITLE	0.9+
over 10 years ago	DATE	0.89+
safra	ORGANIZATION	0.88+
16	OTHER	0.88+
one thing	QUANTITY	0.87+
many decades ago	DATE	0.87+
lot of people	QUANTITY	0.83+

Breaking Analysis: Unpacking Oracle’s Autonomous Data Warehouse Announcement

(upbeat music) >> On February 19th of this year, Barron's dropped an article declaring Oracle a cloud giant, and the article explained why the stock was a buy. Investors took notice and the stock ran up 18% over the next nine trading days, and it peaked on March 9th, the day before Oracle announced its latest earnings. The company beat consensus on both top line and EPS last quarter, but investors did not like Oracle's tepid guidance and the stock pulled back. But it's still, as you can see, well above its pre-Barron's article price. What does all this mean? Is Oracle a cloud giant? What are its growth prospects? Now, many parts of Oracle's business are growing, including Fusion ERP, Fusion HCM, and NetSuite; we're talking deep into the double digits, 20-plus percent growth. Its on-prem legacy license business, however, continues to decline, and that moderates the overall company growth because that on-prem business is so large. So overall, Oracle's growing in the low single digits. Now, what stands out about Oracle is its recurring revenue model. That figure, the company says, now represents 73% of its revenue, and that's going to continue to grow. Now, two other things stood out on the earnings call to us. First, Oracle plans on increasing its CapEx by 50% in the coming quarter. That's a lot. Now, it's still far less than AWS, Google, or Microsoft spend on capital, but it's a meaningful data point. Second, Oracle's consumption revenue for Autonomous Database and Oracle Cloud Infrastructure, or OCI, grew at 64% and 139% respectively, and these two factors, combined with the CapEx spend, suggest that the company has real momentum. I mean, look, it's possible that the CapEx announcement is just optics, in that they're front-loading some spend to show the Street that it's a player in cloud, but I don't think so. Oracle's Safra Catz is usually pretty disciplined when it comes to spending. 
Now today on March 17th, Oracle announced updates to its Autonomous Data Warehouse, and with me is David Floyer, who has extensively researched Oracle over the years, and today we're going to unpack the Oracle Autonomous Data Warehouse (ADW) announcement: what it means to customers, but we also want to dig into Oracle's strategy. We want to compare it to some other prominent database vendors, specifically AWS and Snowflake. David Floyer, welcome back to The Cube, thanks for making some time for me. >> Thank you, Vellante, great pleasure to be here. >> All right, I want to get into the news, but I want to start with this idea of the autonomous database, which Oracle's announcement today is building on. Oracle uses the analogy of a self-driving car. It's obviously a powerful metaphor, as they call it the self-driving database, and my takeaway is that this means the system automatically provisions, it upgrades, it does all the patching for you, it tunes itself. Oracle claims that all of this reduces labor costs, or admin costs, by 90%. So I ask you, is this the right interpretation of what Oracle means by autonomous database? And is it real? >> Is that the right interpretation? It's a nice analogy. It's a test to that analogy, isn't it? I would put it this way: the first stage of the Autonomous Data Warehouse was to do the things that you talked about, which was the tuning, the provisioning, all of that sort of thing. The second stage is actually, I think, more interesting, in that what they're focusing on is making it easy to use for the end user, eliminating the requirement for IT staff to be there to help in the actual using of it. And that is a very big step for them, but an absolutely vital step, because all of the competition is focusing on ease of use, ease of use, ease of use, and cheapness of being able to manage and deploy. So I think that is the really important area that Oracle has focused on, and it seems to have done so very well.
>> So in your view, is this, I mean you don't really hear a lot of other companies talking about this analogy of the self-driving database, is this unique? Is it differentiable for Oracle? If so, why, or maybe you could help us understand that a little bit better. >> Well, the whole strategy is unique in its breadth. It has really brought a whole number of things together and made it, of its type, the best. So it has a whole number of data sources and database types. So it's got a very broad range of different ways that you can look at the data, and the second thing that is also excellent is it's a platform. It is fully self-provisioned and its functionality is very, very broad indeed. The quality of the original SQL and the query languages, etc., is very, very good indeed, and its ability to do joins, for example, is excellent. So all of the building blocks are there, together with its sharing of the same data with OLTP and inference and in-memory databases as well. Altogether, the breadth of what they have is unique and very, very powerful. >> I want to come back to this, but let's get into the news a little bit and the announcement. I mean, it seems like what's new in the Autonomous Data Warehouse piece for Oracle is new tooling around four areas. So Andy Mendelsohn, the head of this group, he's sort of the guy whose baby this is, he talked about four things. My takeaway: faster, simpler loads; simplified transforms; autonomous machine learning models, which are facilitating, what do you call it, citizen data science; and then faster time to insights. So tooling to make those four things happen. What's your take and takeaways on the news? >> I think those are all correct. I would add the ease of use in terms of being able to drag and drop; the user interface has been dramatically improved.
Again, I think those are strategically actually more important. The others are all useful and good components of it, but strategically, I think the ease of use, the use of APEX for example, is more important. And, >> Why are they more important strategically? >> Because they focus on the end user's capability. For example, one of the other things that they've started to introduce is Python, together with their spatial databases, for example. It is really important that you reach out to developers as they are, with the tools they want to use. So those types of ease-of-use things are respecting what the end users use. For example, they haven't come out with anything like Qlik or Tableau. They've left that there for that marketplace, for the end user to use what they like best. >> You mean, they're not trying to compete with those two tools. They indeed had a laundry list of stuff that they supported: Talend, Tableau, Looker, Qlik, Informatica, IBM, I had IBM there. So their claim was, hey, we're open. But so that's smart. That's just, hey, they realized that people use these tools. >> They're not trying to exclude other people; they want to be a platform and be an ecosystem for the end users. >> Okay, so Mendelsohn, who made the announcement, said that Oracle's the smartphone of databases, and I actually think Ellison kind of used that, or maybe that was us; I thought he did, like the iPhone of databases, when he announced Exadata way back when, the integrated hardware and software. But is that how you see it, is Oracle the smartphone of databases? >> It is, I mean, they are trying to own the complete stack, the hardware with Exadata all the way up to the databases: the data warehouses, the OLTP databases, the inference databases. They're trying to own the complete stack from top to bottom, and that's what makes the autonomous process possible. You can make it autonomous when you control all of that.
Take away all of the requirements for IT in the business itself. So it's democratizing the use of data warehouses. It is pushing it out to the lines of business, and it's simplifying it and making it possible to push out so that they can own their own data. They can manage their own data and they do not need an IT person from headquarters to help them. >> Let's stay on this a little bit more, and then I want to go into some of the competitive stuff, because Mendelsohn mentioned AWS several times. One of the things that struck me, he said, hey, we're basically one API, 'cause we're doing analytics in the cloud, we're doing data in the cloud, we're doing integration in the cloud, and that's sort of a big part of the value proposition. He made some comparisons to Redshift. Of course, I would say, if you can't find a workload where you beat your big competitor then you shouldn't be in this business. So I take those things with a grain of salt, but one of the other things that caught my attention is that migrating from on-prem to Oracle Cloud was very simple, and I think he might've made some comparisons to other platforms. And this to me is important because he also brought in that Gartner data. We looked at that Gartner data when they came out with it; in the operational database class, Oracle smoked everybody. They were like way ahead, and the reason why I think that's important is because, let's face it, when you look at what's moving into AWS, the mission-critical workloads, the high-performance, high-criticality OLTP stuff, that's not moving in droves. And you've made the point often that for companies with their own cloud, particularly Oracle, and you've mentioned this about IBM for certain, DB2 for instance, there should be a lower-risk environment for customers moving from on-prem to their cloud, because you could do, I don't think you could get Oracle RAC on AWS.
For example, I don't think Exadata is running in AWS data centers, and so that like-for-like component is going to facilitate migration. What's your take on all that spiel? >> I think that's absolutely right. Your crown jewels, the most expensive and the most valuable applications, the mission-critical applications, the ones that have got to take a beating and keep on ticking, those types of applications are where Oracle really shines. They own a very high percentage of those mission-critical workloads, and if you're going to AWS, for example, you have the choice of either migrating to Oracle on AWS, and that is frankly not a good fit at all. There are a lot of constraints to running large systems on AWS, large mission-critical systems. So that's not an option, and then the option, of course, that AWS will push is move to Aurora, change your way of writing applications, make them tiny little pieces and stitch them all together with microservices, and that's okay if you're a small organization, but that has got a lot of problems in its own right. Because then you, the user, have to stitch all those pieces together, and you're responsible for testing it and you're responsible for looking after it. And that, as you grow, becomes a bigger and bigger overhead. So AWS, in my opinion, needs to move towards a tier-one database of its own, and it's not in that position at the moment. >> Interesting, okay. So, let's talk about the competitive landscape and the choices that customers have. As I said, Mendelsohn mentioned AWS many times, and Larry on the calls often takes shots; that's a compliment, to me. When Larry Ellison calls you out, that means you've made it, you're doing well.
We've seen it over the years, whether it's IBM or Workday or Salesforce, even though Salesforce is a big Oracle customer. And AWS, as we know, is an Oracle customer as well, even though AWS tells us they're off Oracle; when you peel the onion >> Five years should be great, some of the workers >> Well, as I said, I believe they're still using Oracle in certain workloads. Anyway, we digress. So AWS, though, they take a different approach, and I want to push on this a little bit with database. It's got more than a dozen, I think, purpose-built databases. They take this kind of right-tool-for-the-right-job approach, whereas Oracle is converging all this function into a single database: SQL, JSON, graph databases, machine learning, blockchain. I'd love to talk more about blockchain if we have time, but it seems to me that the right tool for the right job, purpose-built, very granular down to the primitives and APIs, that seems to me to be a pretty viable approach versus kind of a Swiss Army knife approach. How do you compare the two? >> Yes, and it is, for many programmers who are very interested, for example, in graph databases or in time series databases. They are looking for a cheap database that will do the job for a particular project, and that, for the programmer or for that individual piece of work, is a very sensible way of doing it and they pay for ads on it's clear cloud dynamics. The challenge, as you have more and more data and as you're building up your data warehouse and your data lakes, is that you do not want to have to move data from one place to another place. So for example, if you've got Aurora, you have to move the data, and it's a pretty complicated thing to do, to move it to Redshift. It's five or six steps to do that, and each of those costs money and each of those takes time. More importantly, they take time.
The Oracle approach is a single database in terms of all the pieces. Obviously you have multiple databases, you have different OLTP databases and data warehouse databases, but it's a single architecture and a single design, which means that all of the work in terms of moving stuff from one place to another place is within Oracle itself. It's Oracle that's doing that work for you, and as you grow, that becomes very, very important. To me, a very, very important cost saving. There's the overhead of all those different ones, and the databases themselves originated as open source, and they've done very well with it, and then there's a large revenue stream behind the, >> The AWS, you mean? >> Yes, the original databases in AWS, and they've done a lot of work in terms of making it set with the panels, etc. But for a larger organization, especially very large ones, and certainly if they want to combine, for example, data warehouse with the OLTP and the inference, which is in my opinion a very good thing that they should be trying to do, then that is incredibly difficult to do with AWS, and in my opinion, AWS has to invest enormously to make the whole ecosystem much better. >> Okay, so innovation required there, maybe that's part of the TAM expansion strategy, but just to sort of digress for a second. So it seems like, and by the way, there are others that are taking this converged approach. It seems like that is a trend. I mean, you certainly see it with SingleStore. I mean, the name sort of implies that, formerly MemSQL. I think Monte Zweben of Splice Machine is probably headed in a similar direction, embedding AI. And Microsoft's kind of interesting. It seems like Microsoft is willing to build this abstraction layer that hides that complexity of the different tooling. AWS thus far has not taken that approach, and then sort of looking at Snowflake, Snowflake's got a completely different, I think Snowflake's trying to do something completely different.
I don't think they're necessarily trying to take Oracle head-on. I mean, they're certainly trying to just, I guess, let's talk about this. Snowflake simplified EDW, that's clear. Zero to Snowflake in 90 minutes. It's got this data cloud vision. So you sign on to this Snowflake, and speaking of layers, they're abstracting the complexity of the underlying cloud. That's what the data cloud vision is all about. They talk about this Global Mesh, but they've not done a good job of explaining what the heck it is. We've been pushing them on that, but we got, >> Aspirational at the moment >> Well, I guess, yeah, it seems that way. And so, conceptually, it's I think very powerful, but in reality, what Snowflake is doing with data sharing, a lot of reading, it's probably mostly read-only, and I say mostly read-only, oh, there you go. You'll get better, but it's mostly read, and so you're able to share the data, it's governed. I mean, it's actually quite genius how they've implemented this with its simplicity. It is a caching architecture. We've talked about that, we can geek out about that. There's good, there's bad, there's ugly, but generally speaking, I guess my premise here, and I would love your thoughts, is Snowflake's trying to do something different. It's trying to be not just another data warehouse. It's not just trying to compete with data lakes. It's trying to create this data cloud to facilitate data sharing, put data in the hands of business owners, in terms of product builders, data product builders. That's a different vision than anything I've seen thus far. Your thoughts? >> I agree, and even more, going further, being a place where people can sell data. Put it up and make it available to whoever needs it, and making it so simple that it can be shared across the country and across the world. I think it's a very powerful vision indeed.
The challenge they have is that the pieces at the moment are very, very easy to use, but the quality in terms of, for example, joins, I mentioned the joins were very powerful in Oracle. They don't try and do joins. They say >> They being Snowflake. Yeah, they don't even write it. They would say use another, Postgres >> Yeah. >> Database to do that. >> Yeah, so then they have a long way to go. >> Complex joins anyway, maybe simple joins, yeah. >> Complex joins, so they have a long way to go in terms of the functionality of their product, and also, in my opinion, they should be going to have more types of databases inside it, including OLTP, and they can do that. They have obviously got a great market cap and they can do that by acquisition as well as they can >> They've started. I think they support JSON, right. >> Do they support JSON? And graph, I think there's a graph database that's either coming or it's there, I can't keep all that stuff in my head, but there's no reason they can't go in that direction. I mean, in speaking to the founders of Snowflake, they were like, look, we're kind of new. We would focus on simple. A lot of them came from Oracle, so they know databases and they know how hard it is to do things like facilitate complex joins and do complex workload management, and so they said, let's just simplify, we'll put it in the cloud and it will spin up a separate data warehouse. It's a virtual data warehouse every time you want one. So that's how they handle those things. So different philosophy, but again, coming back to some of the mission-critical work and some of the larger Oracle customers, they said they have a thousand autonomous database customers. I think it was autonomous database, not ADW, but anyway, a few stood out: Aon, Lyft, I think Deloitte stood out, and obviously hundreds more. So we have people who misunderstand Oracle, I think. They've got a big install base.
They invest in R and D, and they talk about lock-in, sure, but the CIOs that I talk to, and you talk to, David, they're looking for business value. I would say that 75 to 80% of them will gravitate toward business value over the fear of lock-in, and I think at the end of the day, they feel like, you know what? If our business is performing, it's a better business decision, it's a better business case. >> I fully agree. They've been very difficult to do business with in the past. Everybody's in dread of the >> The audit. >> The knock on the door from the auditor. >> Right. >> And that, from a purchasing point of view, has been a really bad experience for many, many customers. The users of the database itself are very happy indeed. I mean, you talk to them and they understand what they're paying for. They understand the value, and in terms of availability and all of the tools for complex multi-dimensional types of applications, it's pretty well the only game in town. It's only DB2 and SQL that had any hope of doing >> Doing Microsoft, Microsoft SQL, right. >> Okay, SQL >> Which, okay, yeah, definitely competitive for sure. DB2, no, IBM, look, IBM lost its dominant position in database. They kind of ceded that. Oracle had to fight hard to win it. It wasn't obvious in the 80s who was going to be the database king, and Oracle had to fight. And to me, I always tell people the difference is that the chairman of Oracle is also the CTO. They spend money on R and D and they throw off a ton of cash. I want to say something about, >> I was just going to make one extra point. The simplicity and the capability of their cloud versions of all of this is incredibly good. They are better in terms of spending on what you need or what you use, much better than AWS, for example, or anybody else. So they have really come full circle in terms of attractiveness in a cloud environment. >> You mean charging you for what you consume. Yeah, Mendelsohn talked about that.
He made a big point about the granularity: you pay for only what you need. If you need 33 CPUs, with the other databases you've got shapes; if you need 33, you've got to go to 64. I know that's true for everyone. I'm not sure if that's true too for Snowflake. It may be, I've got to dig into that a little bit, but maybe >> Yes, Snowflake has got a front end to hide behind. >> Right, but I do want to push on that a little bit, because I want to go look at their pricing strategies, because I still think they make you buy, I may be wrong. I thought they still make you do a one-year or two-year or three-year term. I don't know if you can just turn it off at any time. They might allow it, I should hold off. I'll do some more research on that, but I wanted to make a point about the audits, you mentioned audits before. A big mistake that a lot of Oracle customers have made many times, and we've written about this, is in negotiating with Oracle; you've got to bring your best and your brightest when you negotiate with Oracle. Some of the things that people didn't pay attention to, and I think they've sort of caught onto this, is that Oracle's SOW adjudicates over the MSA. A lot of legal departments and procurement departments say, oh, do we have an MSA with Oracle? Yes, you do, okay, great. And because they think that if they have an MSA, they can rubber-stamp it, but the SOW really dictates, and Oracle's got you there, and they're really smart about that. So you've got to bring your best and your brightest and you've got to really negotiate hard with Oracle, or you get in trouble. >> Sure. >> So it is what it is, but coming back to Oracle, let's sort of wrap on this. Dominant position in mission-critical, we saw that from the Gartner research, especially for operational; giant customer base; there's this cloud-first notion; there's investing in R and D; open, we'll put a question mark around that, but hey, they're doing some cool stuff with Michael stuff.
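The "need 33, buy 64" granularity point above is simple arithmetic, and a minimal sketch may make it concrete. The shape sizes below (powers of two) and both function names are illustrative assumptions, not any vendor's actual catalog or API.

```python
# Hypothetical comparison of per-CPU billing versus fixed "shape" billing.
# The shape list is an assumption for illustration, not a real price sheet.

def cpus_billed_granular(needed: int) -> int:
    """Per-CPU billing: you are billed for exactly what you request."""
    return needed

def cpus_billed_fixed_shapes(needed: int, shapes=(1, 2, 4, 8, 16, 32, 64, 128)) -> int:
    """Fixed-shape billing: round up to the smallest shape that fits."""
    for size in sorted(shapes):
        if size >= needed:
            return size
    raise ValueError("no shape is large enough for the request")

needed = 33
granular = cpus_billed_granular(needed)
fixed = cpus_billed_fixed_shapes(needed)
# Share of extra capacity paid for but not requested.
overprovision = (fixed - granular) / granular
print(granular, fixed, f"{overprovision:.0%}")
```

Under these assumed shapes, a 33-CPU workload lands on the 64-CPU shape, so nearly half of what is billed goes unused.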
>> Ecosystem, I put that, ecosystem, they're promoting their ecosystem. >> Yeah, and look, I mean, for a lot of their customers, we've talked to many, they say, look, at the end of the day, this saves us money and we don't have to migrate. >> Yeah. >> So interesting, so I'll give you the last word. We started sort of focusing on the announcement. So what do you want to leave us with? >> My last word is that there are platforms with a certain key application or key parts of the infrastructure, which I think can differentiate themselves from the Azures or the AWSes, and Oracle owns one of those. SAP might be another one, but there are certain platforms which are big enough and important enough that they, in my opinion, will succeed in that cloud strategy. >> Great, David, thanks so much, appreciate your insights. >> Good to be here. >> Thank you for watching everybody, this is Dave Vellante for The Cube. We'll see you next time. (upbeat music)

Published Date : Mar 17 2021


ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Mendelsohn	PERSON	0.99+
Andy Mendelsohn	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
David Floyer	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
IBM	ORGANIZATION	0.99+
March 9th	DATE	0.99+
February 19th	DATE	0.99+
five	QUANTITY	0.99+
Deloitte	ORGANIZATION	0.99+
75	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
Larry Ellison	PERSON	0.99+
Mendelssohn	PERSON	0.99+
two	QUANTITY	0.99+
each	QUANTITY	0.99+
90%	QUANTITY	0.99+
one-year	QUANTITY	0.99+
Gartner	ORGANIZATION	0.99+
73%	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+
two tools	QUANTITY	0.99+
Michael	PERSON	0.99+
64%	QUANTITY	0.99+
two factors	QUANTITY	0.99+
more than a dozen	QUANTITY	0.99+
last quarter	DATE	0.99+
SQL	TITLE	0.99+

Rahul Pathak, AWS | AWS re:Invent 2020

>>From around the globe, it's the Cube with digital coverage of AWS re:Invent 2020, sponsored by Intel and AWS. Welcome back to the Cube's ongoing coverage of AWS re:Invent virtual. The Cube has gone virtual along with most events these days, or all events, and continues to bring our digital coverage of re:Invent. With me is Rahul Pathak, who is the vice president of analytics at AWS. Rahul, it's great to see you again. Welcome, and thanks for joining the program. >>Hey Dave, great to see you too, and always a pleasure. Thanks for having me on. >>You're very welcome. Before we get into your leadership discussion, I want to talk about some of the things that AWS has announced in the early parts of re:Invent. I want to start with Glue Elastic Views. Very notable announcement, allowing people to, you know, essentially share data across different data stores. Maybe tell us a little bit more about Glue Elastic Views, kind of where the name came from and what the implication is. >>Sure. So, yeah, we're really excited about Glue Elastic Views, and as you mentioned, the idea is to make it easy for customers to combine and use data from a variety of different sources and pull them together into one or many targets. And the reason for it is that we're really seeing customers adopt what we're calling a lake house architecture, which is, at its core, a data lake for making sense of data and integrating it across different silos, typically integrated with a data warehouse, and not just that, but also a range of other purpose-built stores like Aurora for relational workloads or DynamoDB for non-relational ones. And while customers typically get a lot of benefit from using purpose-built stores, because you get the best possible functionality, performance and scale for a given use case, you often want to combine data across them to get a holistic view of what's happening in your business or with your customers.
And before Glue Elastic Views, customers would have to either use ETL or data integration software, or they'd have to write custom code that could be complex to manage, and that could be error-prone and tough to change. And so, with Elastic Views, you can now use SQL to define a view across multiple data sources and pick one or many targets. And then the system will actually monitor the sources for changes and propagate them into the targets in near real time. And it manages the entire pipeline and can notify operators if anything changes. And so the components of the name are pretty straightforward. Glue is our serverless ETL and data integration service, and Glue Elastic Views is about data integration. They're views because you can define these virtual tables using SQL, and elastic because it's serverless and will scale up and down to deal with the propagation of changes. So we're really excited about it, and customers are as well. >>Okay, great. So my understanding is I'm going to be able to take what's called, in the parlance, materialized views, which in my layperson's terms assumes I'm going to run a query on the database and take that subset. And then I'm going to be able to copy that and move it to another data store. And then you're going to automatically keep track of the changes and keep everything up to date. Is that right? >>Yes. That's exactly right. So you can imagine you had a product catalog, for example, that's being updated in DynamoDB, and you can create a view that will move that to Amazon Elasticsearch Service. You could search through a current version of your catalog, and we will monitor your DynamoDB tables for any changes and make sure those are all propagated in real time. And all of that is taken care of for our customers as soon as they define the view, and the data will just be kept in sync as long as the view's in effect.
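The product catalog example above, a source table whose changes are propagated into a search target for as long as the view is in effect, can be sketched in a few lines. This is a minimal in-memory illustration of the materialized-view idea only; `SourceTable`, `MaterializedView`, and `in_stock_summary` are hypothetical names, and none of this is the Glue Elastic Views API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]

@dataclass
class SourceTable:
    """Stand-in for a source store that emits a change event on every write."""
    rows: Dict[str, Row] = field(default_factory=dict)
    subscribers: List[Callable[[str, Optional[Row]], None]] = field(default_factory=list)

    def put(self, key: str, row: Row) -> None:
        self.rows[key] = row
        for notify in self.subscribers:
            notify(key, row)

    def delete(self, key: str) -> None:
        self.rows.pop(key, None)
        for notify in self.subscribers:
            notify(key, None)  # None signals a deletion

@dataclass
class MaterializedView:
    """Keeps a target dict in sync with a source through a projection function."""
    project: Callable[[Row], Optional[Row]]
    target: Dict[str, Row] = field(default_factory=dict)

    def attach(self, source: SourceTable) -> None:
        # Initial load of existing rows, then subscribe to future changes.
        for key, row in source.rows.items():
            self.on_change(key, row)
        source.subscribers.append(self.on_change)

    def on_change(self, key: str, row: Optional[Row]) -> None:
        projected = self.project(row) if row is not None else None
        if projected is None:
            self.target.pop(key, None)  # row deleted or filtered out of the view
        else:
            self.target[key] = projected

def in_stock_summary(row: Row) -> Optional[Row]:
    """The 'view definition': only in-stock items, only name and price."""
    if not row["in_stock"]:
        return None
    return {"name": row["name"], "price": row["price"]}

catalog = SourceTable()
catalog.put("sku-1", {"name": "kettle", "price": 30, "in_stock": True})

search_index = MaterializedView(project=in_stock_summary)
search_index.attach(catalog)

catalog.put("sku-2", {"name": "mug", "price": 8, "in_stock": True})
catalog.put("sku-1", {"name": "kettle", "price": 30, "in_stock": False})
print(search_index.target)
```

The target never has to be refreshed by hand: each write to the source is pushed through the view definition, which is the property the conversation attributes to the managed service.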
>>I see this being really valuable for a person who's building, I like to think in terms of data services or data products that are going to help me, you know, monetize my business. Maybe it's as simple as a dashboard, but maybe it's actually a product. You know, it might be some content that I want to develop, and I've got transaction systems, I've got unstructured data, maybe in a NoSQL database, and I want to actually combine those, build new products, and I want to do that quickly. So take me through what I would have to do. You sort of alluded to it with, you know, a lot of ETL, but take me through in a little bit more detail how I would do that before this innovation. And maybe you could give us a sense as to what the possibilities are with Glue Elastic Views? >>Sure. So, before we announced Elastic Views, a customer would typically have to think about using ETL software, so they'd have to write an ETL pipeline that would extract data periodically from a range of sources. They'd then have to write transformation code that would do things like match up types, make sure you didn't have any invalid values, and then you would combine it and periodically write that into a target. And so once you've got that pipeline set up, you've got to monitor it. If you see an unusual spike in data volume, you might have to add more resources to the pipeline to make it complete on time. And then, if anything changed in either the source or the destination that prevented that data from flowing in the way you would expect, you'd have to manually figure that out and have data quality checks and all of that in place to make sure everything kept working. But with Elastic Views it just gets much simpler. So instead of having to write custom transformation code, you write a view using SQL, and SQL is, you know, widely popular with data analysts and folks that work with data, as you well know.
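For contrast, the hand-written pipeline described above (extract periodically from several sources, match up types, drop invalid values, combine, write into a target) might look roughly like the following. It is a toy sketch with made-up field names (`customer`, `amount`) and in-memory lists standing in for real sources; scheduling, monitoring and scaling, the parts the speaker says you also have to own, are deliberately left out.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]

def extract(sources: Iterable[Iterable[Row]]) -> Iterator[Row]:
    """Pull every row from every source (in practice: periodic queries)."""
    for rows in sources:
        yield from rows

def transform(rows: Iterable[Row]) -> List[Row]:
    """Match up types and drop rows with missing or invalid values."""
    clean: List[Row] = []
    for row in rows:
        try:
            customer = str(row["customer"])
            amount = float(row["amount"])
        except (KeyError, TypeError, ValueError):
            continue  # missing field or value that cannot be coerced
        if amount < 0:
            continue  # treated as invalid in this sketch
        clean.append({"customer": customer, "amount": amount})
    return clean

def load(rows: Iterable[Row], target: Dict[str, float]) -> None:
    """Combine rows into the target as a running total per customer."""
    for row in rows:
        target[row["customer"]] = target.get(row["customer"], 0.0) + row["amount"]

def run_batch(sources: Iterable[Iterable[Row]], target: Dict[str, float]) -> int:
    """One scheduled run of the pipeline; returns the number of rows loaded."""
    rows = transform(extract(sources))
    load(rows, target)
    return len(rows)

orders_db = [{"customer": "acme", "amount": "10.5"}, {"customer": "acme", "amount": "oops"}]
event_log = [{"customer": 42, "amount": 4}, {"customer": "zed", "amount": -1}]

warehouse: Dict[str, float] = {}
loaded = run_batch([orders_db, event_log], warehouse)
print(loaded, warehouse)
```

Every rule here (type coercion, what counts as invalid, how rows are combined) is custom code someone has to maintain, which is the burden a declarative SQL view is meant to remove.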
And so you can define that view in SQL. The view will look across multiple sources, and then you pick your destination, and then Glue Elastic Views essentially monitors the source for changes, as well as both the source and the destination for any issues like, for example, did the schema change, did the shape of the data change, is something briefly unavailable. It can monitor all of that and handle any errors that it can recover from automatically. Or if it can't, say someone dropped an important table in the source that was part of your view, you can actually get alerted and notified to take some action, to prevent bad data from getting through your system or to prevent your pipeline from breaking without your knowledge. And then the final piece is the elasticity of it. It will automatically deal with adding more resources if, for example, say you had a spiky day in the markets, maybe you're building a financial services application, and you needed to add more resources to process those changes into your targets more quickly. The system would handle that for you. And then, if you're monetizing data services on the back end, you've got a range of options for folks subscribing to those targets. So we've got capabilities like our AWS Data Exchange, where people can exchange and monetize data sets. So it allows this end-to-end flow in a much more straightforward way than was possible before. >>Awesome. So a lot of automation, especially if something goes wrong. So if something goes wrong, you can automatically recover. And if, for whatever reason, you can't, what happens? You quiesce the system and let the operator know: hey, there's an issue, you've got to go fix it. How does that work? >>Yes, exactly right. So if we can recover, say, for example, that for a short period of time you can't read the target database, the system will keep trying until it can get through. But say someone dropped a column from your source.
That was a key part of your ultimate view and destination. You just can't proceed at that point. So the pipeline stops, and then we notify using APIs or an SMS alert so that programmatic action can be taken. So this effectively provides a really great way to enforce the integrity of data that's going between the sources and the targets. >>All right, make it kindergarten-proof. So let's talk about another innovation. You guys announced QuickSight Q, kind of speaking to the machine in my natural language, but give us some more detail there. What is QuickSight Q and how do I interact with it? What kind of questions can I ask it? >>So QuickSight Q is essentially a deep-learning-based semantic model of your data that allows you to ask natural language questions in your dashboard, so you'll get a search bar in your QuickSight dashboard, and QuickSight is our serverless BI service that makes it really easy to provide rich dashboards to whoever needs them in the organization. And what Q does is it automatically develops relationships between the entities in your data, and it's able to actually reason about the questions you ask. So unlike earlier natural language systems, where you have to predefine your models and you have to predefine all the calculations that you might ask the system to do on your behalf, Q can actually figure it out. So you can say, show me the top five categories for sales in California, and it'll look in your data and figure out what that is, and it will present you with how it parsed that question, and then will, inline, in seconds, pop up a dashboard of what you asked and actually automatically try and pick a chart or visualization for that data that makes sense. And you could then start to refine it further and say, how does this compare to what happened in New York? And it will be able to figure out that you're trying to overlay those two data sets and it'll add them.
And unlike other systems, it doesn't need to have all of those things predefined. It's able to reason about it because it's building a model of what your data means on the fly, and we pre-trained it across a variety of different domains, so you can ask a question about sales or HR or any of that. Another great part of Q is that when it presents to you what it's parsed, you're actually able to correct it if it needs it and provide feedback to the system. So, for example, if it got something slightly off, you could actually select from a drop-down, and then it will remember your selection for the next time, and it will get better as you use it. >>I saw a demo in Swami's keynote on December 8. Basically, you were able to ask QuickSight Q the same question but in different ways, you know, like "compare California and New York," and then the data comes up, or "give me the top five," and then California, New York, the same exact data. So is that how I can kind of check and see if the answer that I'm getting back is correct, by asking different questions? I don't have to know the schema, is what you're saying. I don't have to have knowledge of that as the user. I can triangulate from different angles and then look and see if that's correct. Is that how you verify, or are there other ways? >>So that's one way to verify. You could definitely ask the same question a couple of different ways and ensure you're seeing the same results. I think the third option would be to, you know, potentially click and drill and filter down into that data through the dashboard. And then, you know, the other step would be at data ingestion time. Typically, data pipelines will have some quality controls. But when you're interacting with Q, I think the ability to ask the question multiple ways and make sure that you're getting the same result is a perfectly reasonable way to validate.
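
The cross-check described in that answer (ask the same question two different ways and confirm the results agree) can be sketched outside of QuickSight Q. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module rather than Q itself, and the `sales` table, its columns, and the function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory sales table (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (state TEXT, category TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("CA", "books", 120),
            ("CA", "toys", 80),
            ("CA", "books", 30),
            ("CA", "garden", 200),
            ("NY", "books", 500),
            ("NY", "toys", 10),
        ],
    )
    return conn


def top_categories_sql(conn, state, limit=5):
    """Formulation 1: let the database aggregate, sort, and truncate."""
    return conn.execute(
        "SELECT category, SUM(amount) AS total FROM sales "
        "WHERE state = ? GROUP BY category "
        "ORDER BY total DESC, category LIMIT ?",
        (state, limit),
    ).fetchall()


def top_categories_manual(conn, state, limit=5):
    """Formulation 2: fetch raw rows and aggregate in Python."""
    totals = {}
    for category, amount in conn.execute(
        "SELECT category, amount FROM sales WHERE state = ?", (state,)
    ):
        totals[category] = totals.get(category, 0) + amount
    # Sort by total descending, then by name so ties are deterministic.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]


def answers_agree(conn, state, limit=5):
    """Return True when both formulations produce the same ranking."""
    return top_categories_sql(conn, state, limit) == top_categories_manual(
        conn, state, limit
    )
```

If the two formulations ever disagree, that is the cue to drill into the underlying rows, which is the second validation route mentioned in the answer.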
>>You know what I like about that answer that you just gave, and I wonder if I could get your opinion on this, because you've been in this business for a while and you work with a lot of customers. If you think about our operational systems, you know, things like sales or ERP systems, we've contextualized them. In other words, the business lines have injected context into the system. I mean, they kind of own it, if you will. They own the data, and I put that in quotes, but they do. They feel like they're responsible for it. There's not this constant argument, because it's their data. It seems to me that if you look back over the last 10 years, a lot of the data architecture has been sort of genericized. In other words, the experts, whether it's the data engineer or the quality engineer, don't really have the business context. But with the example that you just gave of the drill-down to verify that the answer is correct, it seems to me, just in listening again to Swami's keynote the other day, that you're really trying to put data in the hands of business users who have the context and the domain knowledge. And that seems to me to be a change in mindset that we're gonna see evolve over the next decade. I wonder if you could give me your thoughts on that change in the data architecture and data mindset. >>David, I think you're absolutely right. I mean, we see this across all the customers that we speak with. There's an increasing desire to get data broadly distributed into the hands of the organization in a well-governed and controlled way. But customers want to give data to the folks that know what it means and know how they can take action on it to do something for the business, whether that's finding a new opportunity or looking for efficiencies.
And I think, you know, we're seeing that increasingly. Especially given the unpredictability that we've all gone through in 2020, customers are realizing that they need to get a lot more agile, and they need to get a lot more data about their business and their customers, because you've got to find ways to adapt quickly. And you know, that's not gonna change anytime in the future. >>And I've said many times on theCUBE, you know, our industry, the technology industry, used to be all about the products, and in the last decade it was really platforms, whether it's SaaS platforms or AWS cloud platforms. And it seems like innovation in the coming years, in many respects, is gonna come from the ecosystem and the ability to share data. We've had some examples today. But you hit on, you know, one of the key challenges, which of course is security and governance. And can you automate that, if you will, and protect, you know, the users from doing things that, you know, whether it's data access or corporate edicts for governance and compliance? How are you handling that challenge? >>That's a great question, and it's something that I really emphasized in my leadership session. The notion of what customers are doing, and what we're seeing, is that there's the lake house architectural concept. So you've got a data lake and purpose-built stores, and customers are looking for easy data movement across those. And so we have things like Glue Elastic Views or some of the other Glue features we announced. But they're also looking for unified governance, and that's why we built AWS Lake Formation. And the idea here is that it can quickly discover and catalog customer data assets and then allows customers to define granular access policies centrally around that data. And once you have defined that, it then sets customers free to give broader access to the data, because they've put the guardrails in place. They've put the protections in place.
So, you know, you can tag columns as being private so nobody can see them, and we announced a couple of new capabilities where you can provide row-based control. So only a certain set of users can see certain rows in the data, whereas a different set of users might only be able to see, you know, a different set. And so, by creating this fine-grained but unified governance model, this actually sets customers free to give broader access to the data, because they know that their policies and compliance requirements are being met, and it gets them out of the way of the analyst or someone who can actually use the data to drive some value for the business. >>Right. They can really focus on driving value. And I always talk about monetization. However, monetization could be, you know, a generic term. It could be saving lives, the mission of the business or the organization. I meant to ask you about Q: can customers embed, uh, QuickSight Q into their own apps? >>Yes, absolutely. So one of QuickSight's key strengths is its embeddability. And then it's also serverless, so you could embed it at a really massive scale. And so we see customers, for example, like Blackboard, that's embedding QuickSight dashboards into information it's providing to thousands of educators, to provide data on the effectiveness of online learning, for example. And you could embed Q into that capability. So it's a really cool way to give a broad set of people the ability to ask questions of data without requiring them to be fluent in things like SQL. >>If I could ask you a question: we've talked a little bit about data movement. I think last year at re:Invent you guys announced RA3. I think it made general availability this year. And I remember Andy speaking about it, talking about, you know, the importance of having big enough pipes when you're moving, you know, data around. Of course, you're doing tiering.
You also announced AQUA, the Advanced Query Accelerator, which is kind of bringing the compute to the data, I guess is how I would think about that, reducing that movement. But then we're talking about, you know, Glue Elastic Views, where you're copying and moving data. How are you ensuring, you know, maintaining that maximum performance for your customers? I mean, I know it's an architectural question, but as an analytics professional, you have to be comfortable that that infrastructure is there. So what's AWS's general philosophy in that regard? >>So there's a few ways that we think about this, and you're absolutely right. I think as data volumes are going up, and we're seeing customers going from terabytes to petabytes and even people heading into the exabyte range, there's really a need to deliver performance at scale. And you know, the reality of customer architectures is that customers will use purpose-built systems for different best-in-class use cases. And, you know, if you're trying to do a one-size-fits-all thing, you're inevitably going to end up compromising somewhere. And so the reality is that customers will have more data, they're gonna want to get it to more people, and they're gonna want their analytics to be fast and cost-effective. And so we look at strategies to enable all of this. So, for example, Glue Elastic Views: it's about moving data, but it's about moving data efficiently. So what we do is we allow customers to define a view that represents the subset of their data they care about, and then we only look to move changes as efficiently as possible. So you're reducing the amount of data that needs to get moved and making sure it's focused on the essential. Similarly, with AQUA, what we've done, as you mentioned, is we've taken the compute down to the storage layer, and we're using our Nitro chips to help with things like compression and encryption.
And then we have FPGAs inline to allow filtering and aggregation operations. So again, you're trying to quickly and effectively get through as much data as you can so that you're only sending back what's relevant to the query that's being processed. And that again leads to more performance. If you can avoid reading a byte, you're going to speed up your queries. And that's what AQUA is trying to do. It's trying to push those operations down so that you're really reducing data as close to its origin as possible and focusing on what's essential. And that's what we're applying across our analytics portfolio. I would say one other piece we're focused on with performance is really about innovating across the stack. So you mentioned network performance. You know, we've got 100 gigabits per second throughput now with the next-gen instances, and then with things like Graviton2 you're able to drive better price performance for customers for general-purpose workloads. So it's really innovating at all layers. >>It's amazing to watch. I mean, you guys, it's an incredible engineering challenge as you build this hyper-distributed system that's now, of course, going to the edge. I wanna come back to something you mentioned, and I do wanna hit on your leadership session as well. But you mentioned the one-size-fits-all system. And I've asked Andy Jassy about this, and I've had a discussion with many folks about that. And of course, you mentioned the challenges: you're gonna have to make trade-offs if it's one size fits all. The flip side of that is, okay, it's simple, you know, the Swiss Army knife of databases, for example. But your philosophy at Amazon is you wanna have fine-grained access to the primitives, so that in case the market changes, you're able to move quickly. So that puts more pressure on you to then simplify. You're not gonna build this big hairball abstraction layer. That's not what you're gonna do.
Uh, you know, I think about layers and layers of paint. I live in a very old house. So that's not your approach. So it puts greater pressure on you to constantly listen to your customers, and they're always saying, "Hey, I want to simplify, simplify, simplify." We certainly heard that again in Swami's presentation the other day, all about, you know, minimizing complexity. So that really is your trade-off. It puts pressure on Amazon engineering to continue to raise the bar on simplification. Is that a fair statement? >>Yeah, I think so. I mean, you know, I think any time we can do work so our customers don't have to, I think that's a win for both of us, you know, because I think we're delivering more value, and it makes it easier for our customers to get value from their data. We absolutely believe in using the right tool for the right job. And you know, you talked about an old house. You're not gonna build or renovate a house with a Swiss Army knife. It's just the wrong tool. It might work for small projects, but you're going to need something more specialized to handle the things that matter. And that's really what we see with that set of capabilities. So we want to provide customers with the best of both worlds. We want to give them purpose-built tools so they don't have to compromise on performance or scale or functionality. And then we want to make it easy to use these together, whether it's about data movement or things like federated queries, so you can reach into each of them through a single query and through a unified governance model. So it's all about stitching those together. >>Yeah, so far you've been on the right side of history. I think it serves you well and your customers well. I wanna come back to your leadership session. What else could you tell us about, you know, what you covered there?
>>So we've actually had a bunch of innovations on the analytics stack. So some of the highlights are in EMR, which is our managed Spark and Hadoop service. We've been able to achieve 1.7x better performance than open source with our Spark runtime. So we've invested heavily in performance, and now EMR is also available for customers who are running in a containerized environment. So we announced EMR on EKS, and then an integrated development environment and studio for EMR called EMR Studio. So we're making it easier both for people at the infrastructure layer to run EMR on their EKS environments and make it available within their organizations, but also simplifying life for data analysts and folks working with data, so they can operate in that studio and not have to mess with the details of the clusters underneath. And then there's a bunch of innovation in Redshift. We talked about AQUA already, but then we also announced data sharing for Redshift. So this makes it easy for Redshift clusters to share data with other clusters without putting any load on the central producer cluster. And this also speaks to the theme of simplifying getting data from point A to point B, so you could have central producer environments publishing data, which represents the source of truth, say, into other departments within the organization. And they can query the data and use it. It's always up to date, but it doesn't put any load on the producers, and that enables these really powerful data sharing and downstream data monetization capabilities like you've mentioned. In addition, like Swami mentioned in his keynote, there's Redshift ML, so you can now essentially train and run models that were built in SageMaker and optimized from within your Redshift clusters. And then we've also automated all of the performance tuning that's possible in Redshift.
So we've really invested heavily in price performance, and now we've automated all of the things that make Redshift the best-in-class data warehouse service from a price-performance perspective, up to three times better than others. But customers can just set Redshift to auto, and it'll handle workload management, data compression and data distribution. So we're making it easier to access all of that performance. And then the other big one was in Lake Formation. We announced three new capabilities. One is transactions, so enabling consistent ACID transactions on data lakes so you can do things like inserts and updates and deletes. We announced row-based filtering for fine-grained access control in that unified governance model, and then automated storage optimization for data lakes. So if customers are dealing with unoptimized small files that are coming off streaming systems, for example, Lake Formation can auto-compact those under the covers, and you can get a 7-to-8x performance boost. It's been a busy year for analytics. >>I'll say. Great, great job, Rahul. Thanks so much for coming back on theCUBE and, you know, sharing the innovations, and, uh, great to see you again. And good luck in the coming year. >>Well, thank you very much. Great to be here. Great to see you. And I hope we get to see each other in person again. >>I hope so. All right. And thank you for watching, everybody. This is Dave Vellante for theCUBE. We'll be right back right after this short break.
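
The governance idea that comes up twice in the interview (tag some columns as private, and let each group of users see only certain rows) can be illustrated with a small sketch. This is not the Lake Formation API; it is a plain-Python model under stated assumptions, and `AccessPolicy`, `hidden_columns`, `row_filters`, and `visible_rows` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Row = Dict[str, object]


@dataclass
class AccessPolicy:
    """Central policy: hidden columns plus one row predicate per user group."""
    hidden_columns: Set[str] = field(default_factory=set)
    row_filters: Dict[str, Callable[[Row], bool]] = field(default_factory=dict)


def visible_rows(rows: List[Row], policy: AccessPolicy, group: str) -> List[Row]:
    """Return only the rows and columns that the given group may see."""
    predicate = policy.row_filters.get(group)
    if predicate is None:
        # Default deny: a group with no row filter sees nothing.
        return []
    return [
        {col: val for col, val in row.items() if col not in policy.hidden_columns}
        for row in rows
        if predicate(row)
    ]


# Hypothetical data and policy for illustration.
ORDERS = [
    {"region": "west", "amount": 100, "ssn": "111-11-1111"},
    {"region": "east", "amount": 250, "ssn": "222-22-2222"},
]

POLICY = AccessPolicy(
    hidden_columns={"ssn"},
    row_filters={
        "west_analysts": lambda row: row["region"] == "west",
        "east_analysts": lambda row: row["region"] == "east",
    },
)
```

The design point matches the one made in the interview: the policy is defined once and centrally, so the same data set can be opened to more people without each consumer re-implementing the rules.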

Published Date : Dec 10 2020

SUMMARY :

It's great to see you again. They have Great co two and always a pleasure. to, you know, essentially share data across different And so the you know the components of the name are pretty straightforward. And then you're gonna automatically keep track of the changes and keep everything up to date. So you can imagine. services or data products that are gonna help me, you know, monetize my business. that prevented that data from flowing in the way you would expect it, you'd have toe manually, And if for whatever reason, you can't what happens? So if we can recover, say, for example, you can you know that for a So let's talk about another innovation. that you might ask the system to do on your behalf. but in different ways, you know, like compare California in New York or and then the data comes then the you know, the other step would be at data ingestion Time. But the example that you just gave it the drill down to verify that the answer is correct. And I think, you know, we're seeing that increasingly, You know the users from doing things that you know, whether it's data access But the you know, the notion of what customers are doing and what we're seeing is that admission of the business or the or the organization I meant to ask you about acute customers And on then it's also serverless, so you could embed it at a really massive But then we're talking about, you know, glue, elastic views you're copying and moving And you know, the reality of customer architectures is that customers will use purpose built So that puts more pressure on you to then really what we see with that, you know, with that set of capabilities. I think it serves you well on your customers. speaks to the theme of simplifying getting data from point A to point B so you could have central in the Cube and, you know, sharing the innovations and, uh, great to see you again. thank you very much. And thank you for watching everybody says Dave Volonte for the Cube will be right back right after

ENTITIES

Entity	Category	Confidence
Rahul Pathak	PERSON	0.99+
Andy Jassy	PERSON	0.99+
AWS	ORGANIZATION	0.99+
David	PERSON	0.99+
California	LOCATION	0.99+
New York	LOCATION	0.99+
Andy	PERSON	0.99+
Swiss Army	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
December 8	DATE	0.99+
Dave Volonte	PERSON	0.99+
last year	DATE	0.99+
2020	DATE	0.99+
third option	QUANTITY	0.99+
Swami	PERSON	0.99+
each	QUANTITY	0.99+
both	QUANTITY	0.99+
A. W	PERSON	0.99+
this year	DATE	0.99+
10 instances	QUANTITY	0.98+
A three	COMMERCIAL_ITEM	0.98+
78 x	QUANTITY	0.98+
two petabytes	QUANTITY	0.98+
five	QUANTITY	0.97+
Amazon Engineering	ORGANIZATION	0.97+
Red Shift ML	TITLE	0.97+
Formacion	ORGANIZATION	0.97+
11	QUANTITY	0.96+
one	QUANTITY	0.96+
one way	QUANTITY	0.96+
Intel	ORGANIZATION	0.96+
One	QUANTITY	0.96+
five categories	QUANTITY	0.94+
Aqua	ORGANIZATION	0.93+
Elasticsearch	TITLE	0.93+
terabytes	QUANTITY	0.93+
both worlds	QUANTITY	0.93+
next decade	DATE	0.92+
two data sets	QUANTITY	0.91+
Lake Formacion	ORGANIZATION	0.9+
single query	QUANTITY	0.9+
Data Lake	ORGANIZATION	0.89+
thousands of educators	QUANTITY	0.89+
Both stores	QUANTITY	0.88+
Thio	PERSON	0.88+
agile	TITLE	0.88+
Cuba	LOCATION	0.87+
dynamodb	ORGANIZATION	0.86+
1.7 x	QUANTITY	0.86+
Swamis	PERSON	0.84+
EMR	TITLE	0.82+
one size	QUANTITY	0.82+
Red Shift	TITLE	0.82+
up to three X	QUANTITY	0.82+
100 gigabits per second	QUANTITY	0.82+
Marnie	PERSON	0.79+
last decade	DATE	0.79+
reinvent 2020	EVENT	0.74+
Invent	EVENT	0.74+
last 10 years	DATE	0.74+
Cube	COMMERCIAL_ITEM	0.74+
today	DATE	0.74+
A Ro	EVENT	0.71+
three new capabilities	QUANTITY	0.71+
two	QUANTITY	0.7+
E T Elling	PERSON	0.69+
Eso	ORGANIZATION	0.66+
Aqua	TITLE	0.64+
Cube	ORGANIZATION	0.63+
Query	COMMERCIAL_ITEM	0.63+
SAS	ORGANIZATION	0.62+
Aurora	ORGANIZATION	0.61+
Lake House	ORGANIZATION	0.6+
Sequel	TITLE	0.58+
P.	PERSON	0.56+

Bill McGee, Trend Micro | AWS re:Invent 2019

>>Live from Las Vegas, it's theCUBE, covering AWS re:Invent 2019. Brought to you by Amazon Web Services and Intel, along with its ecosystem partners. >>Okay, welcome back, everyone. theCUBE coverage, Las Vegas, live action. It's AWS re:Invent 2019, the third day of a massive show. We're in our seventh year of the eight years of re:Invent, documenting the history and the rise and the changing landscape of the business. I'm John Furrier with Stu Miniman, my co-host. Our next guest is Bill McGee, senior vice president and general manager of the Hybrid Cloud Security group within Trend Micro. So you're now the lead executive of hybrid cloud security in there. Welcome to theCUBE. >>And I've been to every re:Invent, every single one. >>Congratulations. Thank you. >>Thank you. Nice to be here. >>So, eight years, what's changed in your mind? Real quick. >>Uh, wow. Yeah, certainly the amount of adoption is now massive, mainstream. You don't have the question, "Should I go to the cloud?" It's all about how and how much. Probably the biggest change we've seen is how it's really being embraced all around the world. We're a global company. We saw initially a US, Australia, UK type of focus. Now it's all over the place, and it's really relevant everywhere. >>You know, at least from my standpoint, and I have enough friends of mine in the security industry, when we first started coming to the show, I mean, security was here. But security is now so front and center in the discussion of cloud that they had a whole show for it. So, you know, give us the 2019 view of security inside the broader hybrid cloud discussion here at re:Invent. >>Sure. Let me tell you a couple of things, kind of what we're seeing within our customer base and then what matters from a security perspective. So we see, you know, some organizations doing cloud migration, moving workloads to the cloud in various forms. I had a couple of meetings yesterday.
One called it evacuating their data center. The other one was celebrating that two weeks ago they closed their data center, so that's a big step. Windows and Linux workloads are moving to the cloud, and that's really about changing existing security controls to work better in the cloud. But certainly what a lot of these cloud builders are here for is, you know, developing cloud-native applications. Originally, back seven or eight years ago, that was on top of what now seem like pretty simple services like S3 and EC2. Now you've got containers and serverless and other platforms that people are using. And then the last thing: a lot of companies are establishing a cloud center of excellence, and they're trying to optimize the use of the cloud. They still have compliance requirements that they need to achieve. So these are what we see happening, and really the challenge for the customer is, how do we secure all this? How do we secure the aggressive cloud-native application development? How do we help a customer achieve compliance easily from a cloud center of excellence? So that's where we see us fitting. And we made a big announcement a couple of weeks ago about a new platform that we've created, which I would love to talk about. >>Love that. Let's dig into that. But first, we were at re:Inforce, Amazon's first security conference. Dave Vellante and I were talking about cloud security versus on-prem security and what's happening here, and had a conversation with someone who was close to the CIA. I can't say his or her name. And they said cloud has changed the game for them because their cost line was pretty much flat, but the demand for missions was scaling. So we're seeing that same dynamic. You were referring to it earlier, that the cost of data centers is kind of flat, but the demand for new application stuff has happened, so there's a real increased demand for apps. This is the real driver of how people are flexing and deploying technology.
So security becomes really the built-in conversation. Quick comment on that dynamic, and what do you recommend? >>Well, so here's a couple of things we've seen. Really, you know, again, we've been doing private security for about a decade, and really it was primarily focused on one service of AWS, which is EC2. Now, that's a pretty darn big service and widely used within their customer base. There's now 170 services, I think, is the most recent number. So the developers are embracing all these new services. We acquired a new capability in October, a company called Cloud Conformity, based in Sydney, Australia, very focused on AWS, which analyzes implementations against the AWS Well-Architected Framework. So the first step we see for customers is you've gotta get visibility into use of the cloud for the security team. What services are being used? Then, can you set up a set of security guardrails to allow those services to be used in a secure manner? Then we help our customers turn to more detailed, specialized protection of EC2 or containers or serverless. So that's what we've recognized ourselves. We had to create a very modest version of what Amazon has created themselves, which is a platform that allows builders to connect to and choose what security services they want. >>How broad is your service base across all the services? Do you guys pick and choose among them all? What are the main ones? What do you highlight? >>So, yeah, I'll give you the ones where we provide a very large breadth of protection. So there's what we're calling the Cloud One Conformity service. That's this technology we acquired a couple of months ago. It cuts across about 70 services right now and gives you visibility of potential security configuration errors that you have in your environment. Now, if it's in a dev team, maybe that's not such a big deal. But if it's in production, that is a big deal.
Even better, you can scan your CloudFormation templates on the way to being live. Then we have a set of specialized protection that, you know, will run on a workload and protect it, protect a containerized environment, or a library that can sit within a serverless application. That's kind of how we look at it. >>All right. So, Bill, one of the things about going more and more to the cloud for customers is that there's that shared responsibility model. We know that security is everyone's responsibility. It needs to be built in from the ground up. How are your customers doing with that shift? And are they understanding what they need to do? There have been some pretty visible, like, "Oh wait, I really had to configure that? I should have thought about that." Amazon's trying to close the gap on some of those. >>We've seen a big positive change over the years. Initially, I would say that there was what I would call a naive perception that the cloud was magic, that it was perfectly secure and that I don't have to worry about it, right? Amazon did the industry a real favor by establishing the shared responsibility model and making crystal clear what they've got covered that you don't need to worry about anymore as a customer, and then what are the capabilities you still need to worry about. They've delivered a set of security tools that help their customers, and then they rely on partners like us to deliver a set of more in-depth tools to, you know, specialized markets. >>You actually used a word that we've been talking about a lot this week: naive. So we said, there's, you know, the one-letter difference between being cloud native and being cloud naive there. What does it mean to be cloud native in the security world? >>Well, I would say, first, the most important thing in every customer's mind is: I don't care how good the security capabilities are that you're helping me with.
If you're going to slow down the improvements that I've just made to my development lifecycle, I'm not interested. So that is the most important thing: are you able to inject your security technology and allow the customer to deliver at the rate that they're currently delivering, or continuing to improve? That is by far the most important thing. Then it's: are your controls fitting into an environment in a way that is as easy as possible for the customer? One part that's been very critical for us is that we've been a lead adopter of the AWS Marketplace, allowing customers to procure security technology easily. They don't actually have to talk to us to buy our product. That's pretty revolutionary. >>Talk about the number of breaches that are going on. What's changed with you guys over the years? Because new vectors are coming out, and there's more surface area. Obviously, it's been discussed. What's changed most in your view? >>I'll tell you what we're worried about and what we expect to see, although I would say the evidence is early. The reality is that our traditional data centers were so porous at runtime, in terms of the infrastructure and vulnerabilities, that it was relatively easy for attackers to get in. The cloud has actually improved the level of security because of automation and fewer configuration errors. Unfortunately, what we expect is for attackers to move to the developers, to move to the dev pipeline, injecting code not at runtime but earlier in the lifecycle. We've seen evidence of container images up on Docker Hub getting infected and then developers just pulling them in without thinking about it. That's where attackers are going to move: to the dev pipeline. And we need to move some of our security technology to the dev pipeline to help customers defend themselves. >>What about international and geo issues around compliance? How is that changing the game? Is it slowing it down or accelerating it? Can you talk about that dynamic with regions?
>>Sure. You know, the US is the most innovative market and the most risk-taking market, and therefore people moved to the cloud quite bravely over this decade. In some of the markets, so for example, we're a Japanese-headquartered company, and in general Japanese companies, you know, really take a lot of considerations into account before they make that type of big bet. But now we're seeing it. We're seeing auto manufacturers embrace the cloud. So I think it was a struggle for us in the early days, how regional the adoption of cloud was. That's not the case anymore. It's really a relevant conversation in every one of our markets. >>Bill, thank you for coming on theCUBE and sharing your insights on hybrid cloud security. Got to ask you to end the segment: what is going on for you this year? I see hybrid's in your title. Operating models, cloud as the center of gravity, cloud going to the edge or the data center, just the operating model. What's on your mind this year? What are you trying to accomplish? What are you excited about? >>We're really excited about this product announcement we made, called Cloud One. And what Cloud One is, is a set of security services which customers can access through common access, common building infrastructure, and common cloud account management, and choose what to use. You know, Andy put it pretty well in his keynote, where he talked about how he doesn't think of AWS as a Swiss Army knife. He thinks of it as a specialized set of tools that builders get to adopt. We want to create a set of security tools in a similar way, where customers can choose which of these specialized security services they want to adopt. >>Bill, great pleasure to meet you and have this conversation. A pro in the security area, an entrepreneur who sold his company to Trend Micro. This is the hybrid world. It's all about the cloud operating model, about agility and getting things done with application developers.
This is theCUBE, bringing all the data from re:Invent. Stay tuned for more coverage after this short break.
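
Editor's sketch, not from the interview: the point above about developers pulling infected container images from Docker Hub "without thinking about it" suggests one possible guardrail for a dev pipeline, which is to reject image references that are not pinned to an immutable digest. The Python below is a minimal sketch under that assumption. The names `is_pinned` and `unpinned_images` are hypothetical, and digest pinning is an illustration chosen here, not something the speaker or Trend Micro describes. It only checks the format of the reference; it does not inspect or verify image contents.

```python
import re

# A reference counts as pinned only if it ends in an immutable sha256 digest,
# e.g. "nginx@sha256:<64 hex characters>". Mutable tags such as "nginx:latest"
# can be repointed at different content later, so they are treated as unpinned.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image_ref: str) -> bool:
    """Return True if the image reference ends with a sha256 digest."""
    return bool(DIGEST_RE.search(image_ref))


def unpinned_images(image_refs):
    """Return the references that a pipeline check might flag for review."""
    return [ref for ref in image_refs if not is_pinned(ref)]
```

A pipeline step could call `unpinned_images` on the image references found in its build files and fail the build if the returned list is not empty.
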

Published Date : Dec 6 2019

Phil Finucane, Express Scripts | Mayfield People First Network

>> Narrator: From Sand Hill Road, in the heart of Silicon Valley, it's theCUBE, presenting the People First Network, insights from entrepreneurs and tech leaders. >> Hello and welcome to a special CUBE conversation, I'm John Furrier with theCUBE. We're here at Mayfield Fund on Sand Hill Road, venture capital investing, here for the People First co-created production by theCUBE and Mayfield. Next to us, Phil Finucane, who's the former CTO of Express Scripts and has held a variety of other roles. Went to Stanford, Stanford alum. >> Mm hmm. >> Good to see you, thanks for joining me for this interview. >> Thank you, thank you for having me. >> So, before we get into some of the specifics, talk about your career, you're a former CTO of Express Scripts >> Yep. >> What are some of the other journeys that you've had? Talk about your roles. >> Yeah, I've had sort of a varied career. I started off as just a computer coder, a contract coder, in the mid-90s. I sort of stumbled into it, not because I had a computer science background, but because when you start coding, sort of for fun in Silicon Valley in the mid-90s, there were just lots of jobs and I was lucky to have great mentors along the way. In 2003, I joined Yahoo and came in as the lead engineer, sort of the ops guy and the build and release guy for the login and registration team at Yahoo, so I went from being just a coder to being somebody who knew how to run and build big systems and manage them all around the world. That was in the day when everything was bare metal and I could go to a data center and actually look at my machine and say, "Wow, that one's mine," right? And you know, sort of progressed from there to being the architect, by the time that I left, for some of the big social initiatives at Yahoo. On my way out, the YOS, the initiative to try to build Facebook in I think 2007, 2008 to try to take them on. That didn't work out too well, but it was definitely a formative experience in my career. 
From there I went to Zynga, where I was the CTO for Farmville. Was really, really good at getting middle-aged women in the Midwest to come play our game, and you know, was there for >> And it was highly, >> About three years >> high growth, Farmville >> Huge growth >> Took off like a rocket ship. >> Yeah, you know, over the 10 quarters I worked on the game we had over a billion dollars in revenue and that was, you know, the Zynga IPO'd on the back of that, right? And we weren't the only game, but we were certainly >> That was one of the big games >> The big whale, us and poker were the two that really drove the value in Zynga at that point. After that, I went to American Express, where I worked in a division that sort of sat off on the side of American Express focusing on stored value products. I was the chief architect for that division. Stored value products and international currency exchange. So, you know, at one point, I was in charge of both a pre-paid platform and American Express's traveler's checks platform, believe it or not, a thing that still exists. Although it's not heavily used any more. And you know, finally, I went to Express Scripts, where I spent the last three years as the CTO for that org. >> It's interesting, you've got a very unique background, because you know, you've seen the web scale, talk about bare metal Yahoo days, I mean, I remember those days vividly, you know, dealing with database schemas, I mean certainly the scale of Yahoo front page, never mind the different services that they had, which by the way, silo-like, they had databases >> Very, oh totally >> So building a registration and identity system must've been like, really stitching together a core part of Yahoo, I mean, what a Herculean task that must've been. >> Yeah, it was a lot of fun. I learned a lot, you know, we, it was my first experience in figuring out how to deal with security around the web. 
You know, we had, at the beginning, some vulnerabilities here and there, as time went on, our standards around interacting around the web got better and better. Obviously, Yahoo has run into trouble around that in subsequent years, but it was definitely a big learning experience, being involved in you know, the development of the OAuth 2.0 spec and all of that, I was sort of sitting there advising the folks who were, you know, in the middle of that, doing all the work. >> And that became such a standard as we know, tokens, dealing with tokens and SaaS. Really drove a lot of the SaaS mobile generation that did cloud, which becomes kind of that next generation so you had, you know Web 1.0, Web 2.0, then you had the cloud era, cloud 2.0, now they're goin' DevOps and apps. I want to get your thought, and you throw crypto in there just for fun, of dealing with blockchain and then token economics and new kinds of paradigms that are coming online >> It's amazing how far we've come in those years, right? I mean I look at the database that was built inside of Yahoo and this predated me, you know, this was back to circa 1996, I think, but you know, big massively scalable databases that were needed just because the traditional relational database just wouldn't work at that scale, and Yahoo was one of the first to sort of discover that. And now you look at the database technologies that are out there today that take some of those core concepts and just extend them so much further and they're so much easier to access, to use, to run, operate, all of those things than back in the days of Yahoo's UDB, and it's amazing just to see how far we've come. >> Phil, I want to get your thoughts, because you know, talking about Yahoo and just your experiences and even today, at that time it was like changing the airplane's engine at 35,000 feet, it's really difficult. 
A lot of corporate enterprises right now are having that same kind of feeling with digital, and digital transformation, I'd say it's a cliche, but it is true, this impact, the role that data is playing, not just for value creation but also cybersecurity, which could put a company out of business, so there's all kinds of looming things that are opportunities and challenges, that are sizable, huge tasks that were once relegated to the full stack developers and the full web scalers, now the lonely CIO with the anemic enterprise staff has to turn around on a dime. Staff up, build a stack, build commodity, scale out, this is pretty massive, and not a lot of people are talking about this. What's your view on this? Because this is super important. >> Yeah it is, and you know, so I had kind of a shock, moving from working my whole career here in Silicon Valley and then going to American Express, which you know, is very similar in a lot of ways to Express Scripts, and the sort of corporate mindset around, "What is technology?" There is this notion that everything is IT and here in the valley, IT is you know, internal networks and laptops and those sorts of things, the stuff that's required to make your enterprise run internally. There, IT is all of your infrastructure, right? And IT is a service organization, it's not the competitive advantage in your industry, right? And so both of the places that I've gone have had really forward-thinking leaders that have wanted to change the way that their enterprise operates around technology, and move away from IT to technology, to thinking about engineering as a core competency. 
And that's a huge change, not only for the CIO >> You're saying they did have that vision >> They had the vision, but they didn't know how to get there, so my charter coming in and you know, others who were on the teams around me, our charter was to come in and help build a real engineering organization as opposed to an IT org that's very vendor-oriented, you know, that's dependent on third parties to tell you the right thing or the wrong thing, you know that hires consultants to come in and help set up architecture standards, because we couldn't do that on our own, we're not the experts on this side. You know, that's sort of the mindset in many old school companies, right? That needs, that I think needs to change. This notion that software is eating the world is still not something that people have gotten their heads around in many companies, right? >> And data's washing out old business models, so if software's eating the world, data's the tsunami that's coming in and going to take out the beach and the people there. >> Right. And so it's like, all of these things, it's one thing for, you know, a forward-thinking CEO like Tim Wentworth at Express Scripts, who was responsible for bringing me and the group in, you know, those kinds of folks, it's one thing to know that you have to make that transition it's another thing to have a sense of what that means for an engineering team, and all the more for the rest of the organization to be able to get behind it. I mean, people you know, I don't know any number of business partners who've been used to, just sort of taking a spec, throwing it over the wall, and saying, "Come back to me in two years when you're done." That's not how effective organizations work around technology. 
>> Let's drill into that, because one of the things that's cultural, I mean I do some of the interviews of theCUBE, I talk to leaders all the time like yourself, the theme keeps coming back, it's culture, it's process, technology, all those things you talk about, but culture is the number one issue people point to, saying, "That's the reason why "something did or didn't happen." >> Correct. >> So, you talk about throwing it over the fence, that's waterfall, so you think about the old waterfall methodology, agile, well documented, but the mindset of product thinking is a really novel concept to corporate America Not to Silicon Valley, and entrepreneurs, they got to launch a product, not roll out SAP over two years, right, or something they used to be doing. So that's a cultural mindset shift. >> It's difficult for folks, even if they want to get on board to come along some of the time. One of the real big successes we had early on at Express Scripts was, you know, transitioning our teams to Agile wasn't difficult, what was difficult was getting business partners to sort of come along and be actively engaged in that product development mindset and lifecycle and all those sorts of things. And you know, we had one partner in particular, we were migrating from a really old, really clunky customer care application that you know had taken years and years to build, took on average, a new agent took six weeks to get trained on it because it was so complex and it's Oracle Forms and you know, every field in the database was a field on this thing, and there were green screens to do the stuff that you couldn't do in Oracle Forms, so and we wanted to rebuild the application. We tried to get them to come along and say, "Okay, we're going to do it in really small chunks," but business partners were like, "No, we can't afford "to have our agents swiveling between two applications." 
And so finally after we got our first sort of full-feature complete, we begged to go into a call center, you know with our business partners, and sit down with a few agents and just have them use it and see if it looked like it worked, if it did the right thing, and it was amazing seeing the business partner go, over the course of an hour, from "I can't be engaged in this, "I don't want an agent swiveling, "I don't want to be, you know, delivering partial applications "I want the whole thing." to, "Oh my god, it works way better, "the design is much nicer, the agents seem to like it," you know, "Here are the next things we should work on, "These are the things we got wrong." They immediately pivoted, and it wasn't, it was because they're the experts, they know how to run their business, they know what's important in their call centers, they know what their agents need, and they had just never seen the movie before, they just had no concept you could work that way. >> So this is actually interesting, 'cause what you're saying is, a new thing, foreign to the business partners, the tech team's on board, being Agile, building product, they have to, they can't just hear the feature benefits, they got to feel it. >> Yeah, they have to see it >> This seems to be the experience of success before they can move. Is that a success you think culturally, something that people have to be mindful of? >> It's absolutely something you have to be mindful of. And that was just the first step down the path. I mean, that team made a number of mistakes that folks here I think in the valley wouldn't normally make, you know. Over-committing and getting themselves into deep water by trying to get too much done and actually getting less accomplished in the process because of it and you know, the engagement around using data to actually figure out what's the next feature that we build. 
When you've got this enormous application to migrate, you should probably have some insight as to you know, feature by feature, what are you going to work on next? And that was a real challenge, 'cause there's a culture of expertise-driven, you know being subject-matter driven, expertise driven as opposed to being data driven about how do you >> Let's talk about data-driven. We had an interview earlier this morning with another luminary here at the Mayfield 50th conference celebration that they're having, and he said, "Data is the new feedback mechanism." and his point was, is that if you treat the Agile as an R&D exercise from a data standpoint. Not from a product but get it out there, get the data circulating in, it's critical in formulation of the next >> It is, yeah, it's absolutely critical. That was the eye opener for me going to Zynga. Zynga had an incredible, probably still does have, an incredible product culture that every single thing gets rolled out behind an experiment. And so you know, that's great from an operational perspective, because it allows you to, you know, move quickly and roll things out in small increments and when it doesn't work, you can just shut it off but it's not some huge catastrophe. But it's also critical because it allows you to see what's working and what's not and the flip side of that is, some humility of the people developing the products that their ideas are not going to work sometimes just because you know this domain well doesn't mean that you're necessarily going to be the expert on exactly how everything is going to play out. And so you have to have this ability to go out, try stuff, let it fail, use that, hopefully you fail quickly, you learn what's not working and use that to inform what's the next step down the path that you take, right? And Agile plays into it, but that's for me, that's the big transition that corporations really have to struggle with, and it's hard. 
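
Editor's sketch, not from the interview: the practice described above, where "every single thing gets rolled out behind an experiment" and a change that does not work can simply be shut off, is often implemented with a feature flag plus deterministic user bucketing. The Python below is a minimal illustration under that assumption. The `Experiment` class, its fields, and the hashing scheme are hypothetical and are not a description of Zynga's actual system.

```python
import hashlib


class Experiment:
    """A feature rolled out to a percentage of users, with a kill switch."""

    def __init__(self, name: str, rollout_percent: int, enabled: bool = True):
        self.name = name
        self.rollout_percent = rollout_percent  # 0..100
        self.enabled = enabled  # set to False to shut the feature off

    def bucket(self, user_id: str) -> int:
        # Hash the experiment name together with the user id so the same user
        # always lands in the same bucket (0..99) for a given experiment.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100

    def is_on_for(self, user_id: str) -> bool:
        # The kill switch wins over the rollout percentage.
        return self.enabled and self.bucket(user_id) < self.rollout_percent
```

Starting with a small `rollout_percent` and raising it as the data comes in corresponds to the small increments the speaker describes, and setting `enabled` to `False` corresponds to "you can just shut it off".
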
>> You know you're, been there done that, seen multiple waves of innovation, want to bring up something to kind of get you going here. You see this classically in the old school 90s, 80s day. Product management, product people and sales people. They're always buttin' heads, you know? Product marketing, marketing people want this sales and marketing want this, product people buttin' heads, but now with Agile, the engineering focus has been the front lines. People are building engineering teams in house. They're building custom stacks for whatever reasons, the apps are getting smarter. The engineers are getting closer to the edge, the customer if you will. How do you help companies, or how do you advise companies to think about the relationship between a product-centric culture and a sales-centric culture? Because sometimes you have companies that are all about the customer-centric, customer-centric customer-centric, product-centric and sometimes if you try to put 'em together there's always going to be an alpha-beta kind of thing there and that's the balance in this. What's your take on this? Seems to be a cutting edge topic >> Yeah, well, so you know, one of the last big initiatives that I worked on at Express Scripts. Express Scripts has the, to my knowledge, the largest automated home delivery pharmacy in the world. It's amazing if you walk into one of our pharmacies where automation is packaging and filling prescriptions and packaging and shipping and doing all of that stuff. And we've built so much efficiency into the process that we've started getting slack in the system. Every year, you're trying to figure out how to make something work better and you know, have better automation around it. And so, you know, what do you do with all of that slack? The sales team can't sign up enough new customers for Express Scripts to actually fill that capacity. And so they create a division of commoditizing this, basically white labeling your pharmacy. 
We called it Pharmacy as a Platform, exposing APIs to third parties who might want to come along and hey, Phil's pharmacy can now fill branded prescriptions to get sent to you in your home, right? And so that's a fantastic vision, but there's a real struggle between engineering who had all these legacy stacks that we needed to figure out how to move to be able to really live up to this, you know the core of Express Scripts was our members and not somebody else's members. And so there's a lot of rewiring at the core that needs to be done. An operations team, a product team that's, you know, running these home delivery pharmacies, and a sales team that wants to go off and sell all over the place, right? And so, you know, early on, we started off and the sales team tried to sell, like six different deals that all required different parts of the vision, but you know, they weren't really, there was no real roadmap to figure out how do you get from where we're at to the end, and we could've done any of those things, but trying to do them all at once was going to be a trainwreck. And so, you know, we stubbed our toes a couple of times along the way, but I think it just came down to having a conversation and trying to be as transparent as possible on all sides, in all sides. To you know, try to get to a place where we could be effective in delivering on the vision. The vision was right. Everybody was doing all of the right things. But if you haven't actually, with so much of this stuff, if you haven't seen the movie, if you haven't worked this way before, there's nothing I can tell you that's going to make it work magically for you tomorrow. You have to just get this together and work in small increments to figure out how to get there. >> You got to go through spring training, you got to do the reps. >> Yep, absolutely. 
>> All right, so on your career, as you look at what you've done in your career, and what people outside are looking at right now, you got startups trying to compete and get a market position. You have other existing suppliers who could be the old guard, retooling and replatforming, refactoring, whatever the buzz word you want to use. And then the ultimate customer who wants to consume and have the ability of having custom personalization, data analytics, unlimited elastic capability with resources for their solution. How, what advice would you give to the startup, to the supplier, and to the customer to survive this next transition of cloud 2.0, you know and data tsunami, and all the opportunities that are coming? Because if they don't, they'll be challenged a startup goes out of business, a supplier gets displaced. >> Right, I mean, well, so the startup, I don't know if I have good advice for the startup. Startups in general have to find a market that actually works for them. And so, you know, I don't know that I've got some secret key that allows startups to be effective other than don't run out of money, try to figure out how to build effectively to get you to the point where you're, you know, where you're going to win. One of my earliest, one of the earliest jobs I had in my career, I came into a startup, and I tried, one of the founders had written the initial version of the code base. I, as a headstrong engineer, was convinced that he had done horrible work, and so I sort of holed up for like, six to eight weeks doing a hundred hours a week trying to rewrite the entire code base while getting nothing done for the startup. You know, in the end, that was the one job I've ever been fired from, and I should've been fired, because, you know, honestly as a startup, you shouldn't worry about perfection from an engineering perspective. You should figure out how to try to find your marketplace. 
Everybody has tech debt, you can fix that as time goes on, the startup needs to figure out how to be viable more than anything else. As far as suppliers go, you know, I don't know, it's interesting, you know, I sort of look at corporate America and there are many many companies that really rely heavily on their vendors to tell them how to do things. They don't trust in their own internal engineering ability. And then there are the ones, like the teams I have built at AmEx and Express Scripts, that really do want to learn it all and be independent. I would say, identify when you walk into somebody's shop which they are and sell to them appropriately. You know, I've been a Splunk customer for a long time, I love Splunk. But the Splunk sales team early on at Express Scripts tried to come in and sell me on a whole bunch of stuff that Splunk was just not good at, right? >> And you knew that. >> And I knew that, because I've been a hands-on customer ever since Zynga, right? I know what it's good at, and I love it as a tool, but you know, it's not the Swiss Army knife. It can't do everything. >> Well now you got SignalFx, so now you can get the observability you need. >> Exactly, right? So yeah, I, you know, I would say, you know, for those kinds of companies, it's important to go in and understand what your customer is, you know, what your customer is asking for and respond to them appropriately. And in some cases, they're going to need your expertise, either because they're building towards it or they haven't gotten there yet, and some cases, one of the things that I have done with teams of mine in the past, was it with AppDynamics at Express Scripts, excuse me at AmEx, five or six years ago, they were sold on, you know, bringing in AppDynamics as a monitoring tool, I actually made them not bring it in, because they didn't know what they didn't know. 
I made them go build some basic monitoring, you know, using some open source tools, just to get some background, and then, you know, once they did, we ended up bringing AppDynamics in, but doing it in a way that they were accretive to what we were trying to accomplish and not just this thing that was going to solve all of our problems. >> And so that brings up the whole off-the-shelf general purpose software model that you were referring to. The old model was lean on your vendors. They're supplying you, and because you don't have the staff to do it yourself. That's changing, do you think that's changing? >> It is, it's changing, but again, I think there's a lot of places where people nominally want to go there, but don't know how to get there, and so, you know, people are stubbing their toes left and right. If you're doing it with this mindset of, we're constantly getting better and we're learning and it's okay to make mistakes as long as we move forward, >> It's okay to stub your toe as long as you don't cut an artery open. >> Yeah, that's true, yeah exactly >> You don't want to bleed out, that's a cybersecurity hack >> That's true, that's true. But for me a lot of the time that just comes down to how long are you waiting before you stub your toe? If you're, you know, if you wait two years before you actually try to launch something, the odds of you cutting your leg off are much higher than >> Well I want to get into the failure thing, so I think stubbing your toe brings up this notion of risk management, learning what to try, what not to do, take experiments to try to your, which is a great example. Before you get there, you mentioned suppliers. One of the things we hear and I want to get your thoughts on, is that, a lot of CIOs and C-sos, and CBOs, or whatever title is the acronym, they're trying to reduce the number of suppliers. They don't want more tools, right? 
They don't necessarily want another tool for the tool's sake or they might want to replatform, what does that even mean? So, we're hearing in our interviews and our discussions with practitioners, "Hey, I want to get my suppliers down, "and by the way, I want to be API driven, "so I want to start getting to a mode "where I'm dictating the relationship to suppliers." How do you respond to that? Do you see that as aspirational, real dynamic, or fiction? >> It's a good goal to give motivation, I believe it. For me, I approach the problem a little differently. I'm a big believer, well, so, because I've seen this pattern of this next tool is going to be the one that consolidates three things and it's going to be the right answer and instead of eliminating three and getting down to one, you have four, because you need to unwire this new thing, there's a lot of time and effort required to get rid of, you know, your old technology stack, and move to the new one, right? I've seen that especially coming from, the CISO for Express Scripts is an amazing guy, and you know, was definitely trying to head down that path but we stubbed our toes, we ran into problems in trying to figure out, you know, how do you move from one set of networking gear to the next set? How do you deal with, you know, all of the virus protection and all the other, there's a huge variety of tools. >> So it's not just technical debt, it's disruption >> It's disruption to the existing stack, and you've got to move from old to new, so my philosophy has always been, with technical debt, when you're in debt, and I think technical debt really does operate in a lot of ways like real debt, right? Probably good to have some of it. If you're completely debt-free, that's, I've never been in that place before. >> You're comfortable. You might not be moving, >> Exactly, right? But with that technical debt, you know, there's two ways to pay down your debt. 
You can scrimp and save and put more money into debt principal payments as opposed to spending on other new things, or, well and/or, build productive capacity. So a huge focus for me for the engineering teams that we've built, and this is not anything new to the folks in this area, but, you know, always think about an arms race, where you're getting 1% better every day. The aggregation of marginal gains and investing in internal improvements so that your team is doubling productivity every year, which is something that's really possible for, you know, some of these engineering organizations, is the way that you deal with that, right? If you get to the point where your team is really, really productive, they can go through and eliminate all the old legacy technology. >> That's actually great advice, and it's interesting, because a lot of people just get hung up on one thing. Operating something, and then growing something, and you can have different management styles and different techniques for both, the growth team, the operating team. You're kind of bringing in and saying, we can do both. Operate with growth in mind, to 1% better approach. >> Right, you know, and for me, it's been an interesting journey, you know. I started off as the engineer and then the architect, who was always focused on just the technology, the design of the system in production. Sort of learned from there that you had to be good at the you know, all the systems that get code from a developer's desktop into production, that's a whole interrelated system that's not isolated from your production system. And then from there, it has to be the engineering team that you build has to be effective as well. And so, I've moved from being very technology-centric to somebody who says, "Okay, I have to start "with getting the team right "and getting the culture right if we're ever going to "be able to get the technology to a good place." Mind you, I still love the technology. 
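
Editor's sketch, not from the interview: the speaker mentions both "getting 1% better every day" and a team "doubling productivity every year". A quick calculation shows how these two figures relate if the gains are assumed to compound multiplicatively, which is an assumption made here for illustration; the speaker does not state a model. The function names are hypothetical.

```python
def compound(rate_per_period: float, periods: int) -> float:
    """Total growth factor after compounding a fixed rate over the periods."""
    return (1 + rate_per_period) ** periods


def rate_to_double(periods: int) -> float:
    """Per-period rate needed for the growth factor to reach exactly 2."""
    return 2 ** (1 / periods) - 1


# Assumption: 365 improvement periods per year (one per calendar day).
yearly_factor = compound(0.01, 365)  # roughly 37.8x
daily_rate_needed = rate_to_double(365)  # roughly 0.0019, i.e. about 0.19% per day
```

Under this assumption, a literal 1% daily gain would far exceed doubling, and doubling in a year needs only about 0.19% per day, or about 1.3% per week (`rate_to_double(52)`). Read that way, "1% better every day" is best taken as a slogan for steady small gains rather than a measured rate.
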
I'm still an architect at my core, but I've come to this realization that good technology and bad teams will get crushed by bad technologies and good teams. Because now I've seen that a couple of places, where you have old but evolving technology stacks that have gone from low availability and poor performance and low ability to get new features into production to a place where you're fixing all of that at a high rate. It starts with the team. >> You're bringing us some core Silicon Valley ethos to the IT conversation, because what you're talking about is "I'll fund an A team with a B plan any day "over a B team with an A plan." >> Right. >> And where this makes sense, I think is true, is that to your point about debt, A teams know how to manage it. >> Yeah. >> So this is kind of what you're getting at here. >> Right. >> You can take that same ethos, so it's the Agile enterprise. >> Yeah, it is >> That's what we're talking about. Okay, so hypothetical final point I want to chat with you about. Let's just say you and I were startin' a company. We're chief architects, you're the chief architect, I'm a coder, what are we doing? Do I code from horizontally scalable cloud, certainly cloud native, how would you think about building, we have an app in mind, all of our requirements defined, it's going to be data-centric, it's going to be game change and have community, it might have some crypto in there, who knows, but it's going to be fun. How do we scale this out to be really fast? How would you architect this? >> Yeah, well, you know, I do start in the cloud. I go to AWS or Azure or any of the offerings that are out there, and you know, leverage everything that they have that's already wired up already for you. I mean the thing that we've seen in the evolution of software and production systems over the last, well, forever, is you get more and more leverage every day, every year, right? 
And so, if you and I are startin' a new company, let's go use the tools that are there to do the things that we shouldn't be wasting our time on. Let's focus on the value for our company as much as we can. Don't over-architect. I think premature optimization is a thing that, you know, I learned early on is a real problem. You should, you know >> Give an example, what that would look like. >> I've seen >> Database scale decisions done with no scale >> Correct, yeah, you know? You go off >> Let's pick this! It's the most scalable database, well we have no users yet. >> Right, you know you build the super complicated caching architecture or you know, you go design the most critical part of the system out of the gate, you know, using Assembly. You use C++ or, you use a low level language when a high level language with your three users would be just fine, right? You can get the work done in a fraction of the time. >> And get the business logic down, the IP, >> Solve the problem when it becomes a problem. Like, it's, you know, I've, any number of times, I've run into systems, I've built systems where you have some issue that you run into, and you have to go back and redesign some chunk of the system. In my experience, I'm really bad at predicting, and I think engineers are really bad at predicting what are going to be the problem areas until you run into them, so just go as simple as you can out of the gate, you know. Use as many tools as you can to solve problems that, you know, maybe as an engineer, I want to go rebuild everything from scratch every time. I get the inclination. But it's >> It's a knee-jerk reaction to do that but you stay your course. Don't over-provision, overthink it, just start taking steps toward the destination, the vision you want to go to, and get better, operate >> Solve the problem you have when it shows up. >> So growth mindset, execute, solve the problems when they're there.
>> Right, and initially the problem that you have is finding a market, you know, not building the greatest platform in the world, right? >> Find a market, exactly. >> Right? >> Phil, thanks for taking the time >> Thank you very much, appreciate it. >> Appreciate the insights. Hey, we're here for the People First, Mayfield's 50th celebration, 50 years in business. It's a CUBE co-production, I'm John Furrier, thanks for watching >> Thanks John. (outro music)

Published Date : Sep 11 2019


ENTITIES

Entity	Category	Confidence
Tim Wentworth	PERSON	0.99+
Phil Finucane	PERSON	0.99+
Zynga	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
2003	DATE	0.99+
AmEx	ORGANIZATION	0.99+
Yahoo	ORGANIZATION	0.99+
six	QUANTITY	0.99+
Silicon Valley	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
six weeks	QUANTITY	0.99+
American Express	ORGANIZATION	0.99+
2008	DATE	0.99+
35,000 feet	QUANTITY	0.99+
Splunk	ORGANIZATION	0.99+
Swiss Army	ORGANIZATION	0.99+
2007	DATE	0.99+
Mayfield	ORGANIZATION	0.99+
Phil	PERSON	0.99+
two applications	QUANTITY	0.99+
John	PERSON	0.99+
three users	QUANTITY	0.99+
People First	ORGANIZATION	0.99+
five	DATE	0.99+
two	QUANTITY	0.99+
1%	QUANTITY	0.99+
Express Scripts	ORGANIZATION	0.99+
six different deals	QUANTITY	0.99+
one partner	QUANTITY	0.99+
Mayfield Fund	ORGANIZATION	0.99+
three	QUANTITY	0.99+
Facebook	ORGANIZATION	0.99+
four	QUANTITY	0.99+
Oracle Forms	TITLE	0.99+
one	QUANTITY	0.99+
AppDynamics	ORGANIZATION	0.99+
both	QUANTITY	0.99+
two ways	QUANTITY	0.99+
first experience	QUANTITY	0.99+
two years	QUANTITY	0.99+
theCube	ORGANIZATION	0.99+
mid-90s	DATE	0.98+
50 years	QUANTITY	0.98+
tomorrow	DATE	0.98+
eight weeks	QUANTITY	0.98+
over a billion dollars	QUANTITY	0.98+
first	QUANTITY	0.97+
one point	QUANTITY	0.97+
six years ago	DATE	0.97+
one thing	QUANTITY	0.97+
theCUBE	ORGANIZATION	0.97+
three things	QUANTITY	0.97+
One	QUANTITY	0.96+
CUBE	ORGANIZATION	0.96+

Show Wrap | VeeamON 2019

Live from Miami Beach, Florida, 2019. Brought to you by Veeam. We're back. This is theCUBE, the leader in live tech coverage. We're here in Miami. This is a wrap of VeeamON 2019, two days of coverage. I'm Dave Vellante with my co-host Peter Burris. Our third year covering VeeamON. We started in New Orleans. We've seen, you know, Veeam go from what they called at this show act one to act two, and we talked two years ago, you know, at our first VeeamON, about the ascendancy of Veeam being so tightly tied to the rise of virtualization. And now we heard this year act two being cloud, multi-cloud, and we heard a number of announcements that are in support of that. We're going to talk about that. But Peter, there were three key announcements this week. One was the billion dollar, you know, milestone. They actually, you know, they finally hit a billion dollars. I've been talking about it for a while. It's now official: a billion dollars on a trailing 12-month basis. They're a profitable company, Veeam, and a focused billion dollar... Yeah, I think that's really a very focused... I mean, they do some M&A, but not a lot of M&A, and that's because of NIH. I mean, you know, these guys, they trust themselves to write code. It's also sustained that simple value proposition, right? And that's a fundamental dogma, I think. I think it's fair to say. We heard the announcement of the With Veeam API infrastructure, which is key. We're going to talk about that. I think there were two companies they announced partnerships with, Nutanix with Mine, and ExaGrid, both taking advantage of that. There will be others. Ken Ringdahl just told us, you know, maybe 10 to 12. It's not going to be an enormous number, at least for secondary storage. Yeah, but that'll knock down a large portion of the infrastructure market. And then Veeam Availability Orchestrator version 2, which allows you to do fast backups, recover from backups without having to go to a replicated, you know, off-site, and some other
capabilities they call dynamic documentation and automated testing, and some DevOps capabilities. So, you know, people seem pretty excited about that. It wasn't a sea of announcements like you see at some of these things, which I think, Peter, speaks to the degree of focus that you were just mentioning. You know, they're not about bragging rights and the number of announcements that they can make. You know, it's really all about extending that platform. A lot of incremental announcements. Ratmir told us, not a big roadmap company, even though he did show a roadmap today, but the roadmap he showed was a lot of near-term functional improvement. So very function rich. You know, the tagline of "It just works." But let's see, I think this is the first time you and I have done VeeamON together. I've been here. Your impressions? Look, I love wandering the halls and talking to the actual attendees and seeing what they have to say. So I spent about an hour, hour and a half, just doing some work in one of the hallways here, and one of the reasons I do that is because it's an opportunity to hear what the attendees and the customers are talking about and what's important to them. You go to a lot of these shows and everybody's buzzing about one or another product announcement. You go here and everybody's talking about the problems that they're solving. And I think that's one of the reasons why we didn't have this frenzy of product announcements like we have in so many other places, because a lot of companies want the focus to be on them. I think what we heard here, or what I heard here, somewhat different, was again customers trying to solve their problems and Veeam creating an opportunity for them to talk in terms of some of the new directions and some of the new products that are being introduced. But the focus stayed on the customer and the problems they're trying to solve, and to my mind that's what successful companies focus on. Yeah, and I come back to
this notion of With Veeam, the whole API integration, cloud, hybrid cloud, the edge. Veeam wants to be, and they laid this vision out, you know, certainly last year, and even started the year before, essentially that backup capability, data protection capability, across wherever your data lives, you know, on-prem, in cloud. Now, they really are focused on backup and data protection. They even say backup's where it starts. A lot of other companies don't even use the term backup: "No, it's not about backup, it's about data management and data protection." So it's interesting that Veeam is really focused on backup, and when you do what you did and talk to the customers, "What do you use Veeam for?" "Backup, backup, backup, backup." And so they're not over-rotating to that vision. Now, many of their competitors are going hard after that and doing some great marketing. So the competitive dynamics are very interesting. Now you've got Cohesity, you've got Rubrik, doing really well with positioning as a modern architecture, and Veeam, definitely not a legacy company, their business is growing. You've got Commvault, you've got Dell EMC, Veritas, IBM, you know, trying to hit single-digit growth, trying not to decline. I mean, IBM in particular declined and really had to do a deal with Catalogic to stop Veeam from eating its market share. That's really what that deal was all about. You saw Dell EMC kind of take its eye off the ball when it merged. With Dell EMC, you know, it was the leader in purpose-built backup appliances. It's made some announcements recently to try to get, you know, it's got some really good stuff, back in the game, right? So, you know, you don't ever count those guys out. Commvault's approached it differently. They've got a large install base. You know, Veritas went through private equity, and so they had some other challenges, but again, they're investing. And so it's a big market. You know, people are gonna go fight hard for it. And then
with the outside funding that's come in, it's really upped the game. Now, a lot of that funding is gonna go to promotion, which again comes back to your point about focused R&D. Really, really important to focus R&D on things that customers want, that are gonna solve a business problem. So if you go back, and just to take your segmentation, we can kind of look at it in a couple of perhaps simple ways. You've got Veeam and companies like Veeam, who saw the whole virtualization, and the need to do a better job of supporting and protecting and replicating and backing up virtualized resources, all hitting the market pretty hard. And then you have the Dell EMCs and a lot of the other companies that you mentioned, trying to sustain or keep pace with those guys. And then you have the new guys, the Druvas and whatnot, who are talking about just cross-cloud, multi-cloud backup. On top of that you have, and it's something we talked about with a couple of guests, the security guys are looking at this and saying, "Wait a minute, you know, data is data, and protection and security are going to be increasingly difficult to separate, because data is going to move and I have to be able to move security with the data." It's going to be an inevitability. We're talking about a cloud that allows us to do more distribution of data, because we're gonna do more distribution of work, and the security is gonna have to move with the data. So the security guys are gonna get in this. The networking guys are gonna be asked some questions about the opportunity. You've got the old guard, who is more focused on devices and managing and backing up devices, trying to get back in. You've got the new guys who are saying, "Let's lead the act two before, you know, the Veeams get there." It's gonna be an extremely complex market, but all of it's gonna boil down to this simple fact: I'm gonna distribute data in response to the work that needs to be performed, and how am I going to manage the
digital assets that I have, to make that easy so that it doesn't explode? And for all of these companies, at some point, kind of the next phase of this is going to be not protecting data, but can I turn it into a digital asset? So here's what I saw. I saw them talking about the idea of, you know, "What we're gonna protect locally." I'll suggest that over the course of the next couple of years it's going to be, "We're gonna do, you know, data asset management with protection," where the actual act of protecting it is similar to the act of defining it as an asset. So being able to, you know, use a snapshot for a lot of different uses, already happening now, but adding services, you know, a consistent set of services on top of that, through With Veeam and other resources, allows them to do that, and then move more of what's today regarded as replication function into that protection side of things. A lot more support locally, because that's where the services are going to be. Having the services or not having the services, it's really going to be an essential question, because we're gonna move more of this data out to where the work is going to be performed. We often talk about customers having to place bets, but the vendors are having to place bets as well. They're obviously betting on multi-cloud, but juxtapose, for example, what Veeam's doing. It was interesting to hear Ken Ringdahl. He answered your question about whether it was through M&A, and he answered in an M&A context, or maybe organic development around more security functions, and he kind of said, "Never say never," but the engineering team is really focused on backup and data protection and what they call data management. Juxtapose that now with, say, for instance, what Datrium is doing. Ex-Data Domain guys built their own file system, trying to bring both primary and secondary stores together. Yeah, which I like, and I think it's really powerful. Veeam's taking a different approach. They're saying,
and with Veeam APIs, "We're gonna partner with Pure, we're gonna partner with Cisco, we're gonna partner with Nutanix." So different approach, and they're gonna obviously, you know, claim the same capability: "Hey, we can do that too." You know, Datrium saying, "Well, we can do that too, with just one mousetrap," you know, the integration points, et cetera. So it's gonna really be interesting to see how that all shakes out. That word seamless, you know, I said it sometimes triggers me. If it really is seamless, you know, Veeam has a go-to-market advantage relative to, you know, the Swiss Army knife approach. If it's not seamless, then, you know, a Datrium approach will have an advantage from a product standpoint. You and I both know there's so much more to success than just having a great product. Absolutely. You know, and you mentioned it, but, you know, it's interesting, one of the thoughts about the roadmap, the practical roadmap, because Veeam's altered its roadmap in response to customer demand, quite frankly very successfully, and, you know, you've got to applaud them for doing so. But one of the things we heard was, "Look, we don't want to over-promise on the engineering front, because you've got a certain number of engineers and a certain engineering capacity. Focus them on things that are creating value for the problems you're trying to solve." The same thing's true within a lot of user shops. You don't want to throw a whole bunch of new function and new requirements at a bunch of guys who are still themselves trying to evolve from backing up devices to now actually protecting data. And so there's a natural evolution that's going to take place, and I think Veeam has done a pretty good job of keeping their finger on what that pulse is. It's what can be invented, but also what can be innovated, if we think of innovation as the customer adopting and applying it and embedding it and changing their activities around it. And I think Veeam's done
a pretty good job of navigating, you know, what can customers really do right now, not getting too far ahead. For a lot of these guys, the natural tendency, when you come from a product perspective, is you say, "Put more into the product and, you know, get the better check marks and, you know, have the better stat sheet, the better fact sheet." And I think Veeam is taking a simpler approach, almost an Apple-like approach in an enterprise sense, and saying, "Look, give them what they can handle, give them what they can use, give them what's going to generate value, and as they master that, give them a little bit more." You said Apple. It reminds me of early EMC days, when EMC brought out, you know, its Symmetrix. It would connect, you know, AIX, Solaris, Unisys, obviously the IBM mainframe. It had all the optionality, all the connectivity, and that's kind of what Veeam is. And then the features that it announced were really practical. They clearly solved the problem. Now, since then, you know, EMC's evolved into the checkbox: "We have more features than anybody." That's what happens when you have the customer base everybody wants, right? And they say, "Check, we have that. Thin provisioning? We have that too." And, you know, "We're gonna freeze the market." That's, you know, a much more mature company. In their defense, it's also in response to an increasingly specialized and complex customer base. They're trying to cover all the bases and, you know, competitive guys eating at them. Absolutely, absolutely, and the sales guys saying, "Hey, we need something." And they've done a great job of doing that. But Veeam is very, very focused on the optionality. For years they wouldn't talk about bare metal, and a couple of years ago at VeeamON the big thing was, "Hey, we said for years that we're only virtualization. Well, guess what? Now we do bare metal." That was sort of the big announcement one year. So they're very judicious about how they allocate their
R&D, you know, capital, and you're seeing that, you know, translate into function that actually gets used. Yeah, I think it's a key point. I think your analogy with EMC is actually really good, Dave, because if you go back thirty years, when EMC first started getting going, what was the problem? Controllers on mainframes and minicomputers were getting incredibly complex. You know, the DASD controllers, and the amount of processing that was being put into the microcode was just overwhelming most people's ability to deal with it. And so EMC came along and said, "Well, if that's the problem, can we fix it? We put cache in; that'll just make this whole system simpler." And then they stayed true to that for a number of years, and they turned into a beer mark. And it's interesting, I think it is a good analogy, because what is the problem? The problem is data's going to be more distributed, it's going to be more central to a company's mission, it's gonna be used by more functions and repurposed into more applications that have a greater diversity of RTO and RPO. And as a consequence, they seem to be saying, "We're going to do our best to push as much function to that protect side of things, local, as we possibly can, so that people who aren't PhDs in computer science can perform a real business service by making all that stuff work. And then we'll at the same time work very closely with third parties who can bring specialization of that secondary storage to bear as the specialization increases," because it's going to increase. And the other, you know, kind of MBA case study example that I would point to is the early days of Veritas, when Jeremy Burton was running marketing at Veritas. He sort of coined the language, Jeremy, calling it the no-hardware agenda. Pure software, a lot of function, and they, you know, rose to a couple of billion dollars, you know, in revenue. You know, very, very successful. Now they have the big install base that everybody wants to eat. It's
just, again, reminiscent. The pure software company: they're not shipping boxes, they're not shipping appliances, they're not selling direct, they're a pure channel play. There's a big TAM to just continue to do virtualization. Like, the big question is, will their focus on what they're currently doing translate into focus on multi-cloud? And here at this conference they're claiming yes. We've heard nothing that suggests that they won't be able to, but there's a lot of new players out there who are looking at that space and saying, "You know what, I can do that too." And there's gonna be a lot of invention, a lot of investment, and, you know, there's good reasons to suspect that Veeam's gonna be able to evolve successfully, but there are a few areas where I think they're gonna have to focus more time. A big part of a CEO's job is TAM expansion, and, you know, right now they're, you know, a billion out of fifteen, let's call it. So there's a long way to go. But as you point out, multi-cloud appears like it's gonna be lucrative, and there's a lot of different companies coming at it from different angles. You guys tell me, we look at it as this big blob. Yeah, this is gonna be incredibly specialized, very fragmented. I mean, you've got Cisco coming at it from a networking perspective, Red Hat coming from a PaaS perspective, Google, you know, partnering with everybody, Amazon right now ignoring it, but you can guarantee they're gonna be awesome, and Microsoft has to be in it because of the huge estate of on-prem, you know, software. And there's a dozen security guys who are gonna be looking at this and saying, "Oh look, data in motion, that's mine." ServiceNow is going to get its pieces. So very interesting how that's all gonna shake out. Okay, so wrap it up, Peter. You know, kind of summarize your thoughts on the space, VeeamON. So, first VeeamON for me. A lot of customers that were talking about solving complex problems during their digital business transformation. That's always good to hear. Got to a billion
dollars. That's a great milestone for any software company. Good reasons to expect that Veeam is going to evolve into a company like Veritas, like one of the big guys. This is a company that's got legs. And I think the final one that I'd say, not just got legs, but that they've got what it takes to be able to effect this transition. They've probably got the execution chops. Look, we had a user on here who effectively said, if you're a CIO and you're not using Veeam, you're not competent. And, you know, that's not a bad testimonial when you come down to it. Yeah, and then the one thing that we have not talked about, which shines through, is culture. Yeah. You know, this company has a culture that is a winning culture. It's a fun culture. There's an accountability associated with it, and a very customer orientation. Absolutely. So that's the winning formula. It's been fun sort of watching these guys grow and interacting with a number of their customers. And you saw a couple years ago Veeam saying, "Okay, we're going enterprise." It ain't so easy to just say, "We're going enterprise." But interestingly, even though they've somewhat retrenched from that messaging, they're having success in the enterprise, clearly, with their partnerships with guys like HPE and Cisco and NetApp and others. And so they're just gonna let it bake a little bit and go from their position of strength, which is, you know, kind of SMB. "Do more simply with your protection environment" is not a bad story for a company of any size. Right, right. Okay, Peter, hey, it's been great working with you. Thank you. And thank you for watching. Guys, great job, awesome. Go to siliconangle.com, you'll see all the news. thecube.net is where we host all these videos, and you'll see wikibon.com has all the research. Peter recently wrote a great piece on data protection and how that market's evolving. Check out our Twitter, @theCUBE and @theCUBE365 Twitter handles. You'll see all
kinds of clips coming out of this show and other shows. Let's see, we've got a lot coming up. Good for you. And what do you think? So I think you're seeing, as I said before, a very practical approach to gaining a foothold and maintaining and growing in a market. I like the business model. This company has been somewhat opaque, you know, European-based, you know, the Russian founders, and most of its business is outside of the US, and I think they're really coming into the mainstream now, and theCUBE helps make it more transparent. Yeah, absolutely, right? Because you can ask the questions of people and, you know, you get all kinds of different answers. And we're able to have, you know, independents on, you know, guys like Justin, firms like the 451 guys, you know, Gartner coming on, and it's fun to have those guys. So it's been great. Thank you for watching theCUBE. Go to thecube.net, check out the events that are coming up. We've got a huge, huge season. May and June are our busiest months. Take a slight break in July, although, you know, we'll be cranking this summer as well. So thank you for watching, everybody. We're out. Dave Vellante for Peter Burris. We'll see you next time.

Published Date : May 22 2019

**Summary and Sentiment Analysis are not shown because of improper transcript**

ENTITIES

Entity	Category	Confidence
Nutanix	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
Peter	PERSON	0.99+
Jeremy Burton	PERSON	0.99+
Dave Volante	PERSON	0.99+
Miami	LOCATION	0.99+
New Orleans	LOCATION	0.99+
Veritas	ORGANIZATION	0.99+
FEMA	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
two companies	QUANTITY	0.99+
Swiss Army	ORGANIZATION	0.99+
ratmir	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Jeremy	PERSON	0.99+
12-month	QUANTITY	0.99+
US	LOCATION	0.99+
three key announcements	QUANTITY	0.99+
NIH	ORGANIZATION	0.99+
billion dollar	QUANTITY	0.99+
June	DATE	0.99+
billion dollars	QUANTITY	0.99+
Jim	PERSON	0.99+
Google	ORGANIZATION	0.99+
July	DATE	0.99+
two days	QUANTITY	0.99+
Apple	ORGANIZATION	0.99+
EMC	ORGANIZATION	0.99+
Dave	PERSON	0.98+
Gartner	ORGANIZATION	0.98+
2019	DATE	0.98+
two years ago	DATE	0.98+
last year	DATE	0.98+
NetApp	ORGANIZATION	0.98+
Justin	PERSON	0.98+
both	QUANTITY	0.98+
Dell EMC	ORGANIZATION	0.98+
first time	QUANTITY	0.97+
one	QUANTITY	0.97+
beam	ORGANIZATION	0.97+
Russian	OTHER	0.97+
this week	DATE	0.97+
four years	QUANTITY	0.97+
third year	QUANTITY	0.97+
about an hour	QUANTITY	0.97+
12	QUANTITY	0.96+
today	DATE	0.95+
thirty years	QUANTITY	0.95+
this year	DATE	0.95+
ken ring doll	PERSON	0.95+
10	QUANTITY	0.95+
May	DATE	0.94+
Dhruv	ORGANIZATION	0.93+
Veeam	ORGANIZATION	0.93+
M&A	TITLE	0.93+
HPE	ORGANIZATION	0.93+
M&A	ORGANIZATION	0.93+
apple	ORGANIZATION	0.92+
fifteen	QUANTITY	0.91+
first	QUANTITY	0.91+
a lot of new players	QUANTITY	0.91+
Miami Beach Florida	LOCATION	0.91+
billion dollars	QUANTITY	0.91+
MCS	ORGANIZATION	0.9+
a dozen security guys	QUANTITY	0.89+
next couple of years	DATE	0.89+
Comicon	ORGANIZATION	0.87+
CommVault	ORGANIZATION	0.87+
a billion	QUANTITY	0.87+
one thing	QUANTITY	0.86+
Twitter	ORGANIZATION	0.85+

David Tucker, SingleWire | Cisco Live US 2018

>> Live from Orlando, Florida, it's theCUBE covering Cisco Live 2018. Brought to you by Cisco, NetApp, and theCUBE's Ecosystem partners. (techy music playing) >> Welcome back, I'm Stu Miniman and we're getting to the end of three days of live, wall-to-wall coverage here at Cisco Live 2018, and you're watching theCUBE. Happy to welcome to the program first time guest David Tucker, who's the SVP of business development, with SingleWire, thanks so much for joining us. >> I'm glad to be here, Stu. >> All right, so David, you've worked for Cisco, you're now a partner with them. Give us a little bit about your background and tell us, for those who don't know, what SingleWire's mission is. >> Okay, my background is almost 40 years in the voice space, and then I joined SingleWire about two... Two and a half years ago, after leaving Cisco after 18 years. So, SingleWire is an ISV that partnered with Cisco right after 9/11 when the Department of the Interior came to Cisco and said, "We have to evacuate all of our buildings in less than 10 minutes, give us a solution." Our product, InformaCast, was born after that event. So, what we do is we work with Cisco's UC collaboration portfolio, and when a customer needs to get critical communications out to employees, such as a weather alert, such as a safety announcement, in a hurry and they need to reach all their employees no matter where they may be, that's where we come in. >> Yeah, so David, it's an interesting dichotomy. On the one hand, you know, when I go to my kids and I say, you know, "You have this device"-- >> Right. >> It's for the internet, the apps, and you know, text and everything... They don't actually use voice on it all that much. >> Right. >> On the other hand, you know, I go to some of the cloud shows and things like that, you know, voice is the new interface. I've got an Alexa and I got a Google Home and I talk to Siri and Cortana and everything like that, so people are programming again for voice.
Give us your perspective on, you know, just voice in general and then I definitely want to get into, you know, your company's mission and how some of that stuff works. >> Well, couple thoughts... First off, when you need to communicate to people and get their attention there's nothing better than audio with voice, live audio. You know, if there's an emergency in a building, for example, if you hear it in your ear through, you know, a speaker, through a phone, through your cell phone, it gets your attention. If it's a text or an image I may look at it later and it's too late. From a user interface perspective, when you walk into a building or you now need to communicate that there's an event going on, what better way other than to speak it? "I need help." "Man down." "There's an armed gunman in the building." "There's a chemical spill in the factory." Using voice is a great way to communicate and to trigger events is a natural way of doing it. So, voice is still very powerful in terms of, you know, that human to human interaction. >> Yeah, absolutely, so tell us a little bit about how do these mass notifications work? You know, I've been, I travel a bunch, you know, for these environments and I've been places where all of a sudden everybody's phones start making those weird sounds because you know, flood alerts are going on. You know, "Thunderstorm, stay in place." You know, sometimes when there are either manmade or nature made events, you know, how does this work? Is this tied to the phone network, is it, you know, how does this fit in? 
>> Well, think of us as kind of a Swiss Army knife piece of software where a customer may have all kinds of ways to trigger an event, and it could be as simple as a panic button under a desk or a panic button on a Cisco phone, it could be tied to the weather service, such as a tornado bearing down on your building, and then we have to then communicate out to reach all the employees through their mobile interface, through digital signage, through overhead speakers, through their Cisco phone, to be able to reach everybody. So, it's kind of that any-to-any kind of model because you never know where people are going to be, and so that's really what SingleWire really does. Very common in schools, K through 12, obviously. A lot of universities, but there's now a workflow component of this as well. "I need to be able to tell the employees that my network is down." "There's a chemical spill in the factory. How do I tell the right people not to drive through that chemical spill?" So, that's what we really do, we're that Swiss Army knife of really making sure people know, whatever event it may be, as quickly as possible, and seconds is what it really comes down to. >> All right, and David, I heard you say this can be like a feature as part of a Cisco phone, so I assume that the software that goes in... Can you speak a little bit about, you know, how much is just... You have pre-made packages for verticals and how much is it... You know, we're here in the DevNet zone or you know, people coding and you know, putting together their own packages or you know, building you as part of a custom application. >> Well, we're one of the very first Cisco Ecosystem partners going back into 2001. Our system is API forward, so we've been working with Cisco APIs into the phone and into Call Manager since that timeframe, because we need to be able to talk to the speaker, be able to put something on the display of the phone. 
We need to be able to talk to Call Manager to know where people are, who's on what phone, what building, what floor they're on, and then we also have APIs on our side. So, we're also like an Ecosystem partner of Cisco but talking to another ecosystem of speaker vendors and of digital sign vendors. We're working now with the intent-based networking solutions, so when they're having problems in the network and they need to be able to communicate to certain people that there's a network issue, you know, who do you tell? So, you know, plug into that ecosystem, as well. So, that's our livelihood is DevNet. >> That's fascinating, yeah, it's great. No longer the pagers going off, right? >> That's right. >> Now that they hear that little voice on the weekend when they're not even on call anymore. (laughing) >> That's right, that's right. We're going to find you, yeah. (laughing) >> It sounds like you've got a really interesting, diverse customer set. I wonder if you have any interesting customer examples or things that would help us, you know, really put a face on these solutions. >> Our common, our core applications deal around safety, safety in the workplace, safety in a government building, safety in a school, but our customers have taken our product and used it for all different kinds of things. We have a school system, for example, that connected us to their restrooms because they wanted to be able to save water, and after hours we'd turn off the water supply so the toilets aren't running all the time. >> Right. >> We have an ice cream company that connected us to their ammonia sensors because they use ammonia to clean the tanks to make sure there's no Listeria in there. So, that basically is an application where now, you know, that sensor goes off, we tell the people on the forklifts not to drive through this particular area. 
So, we have another hospital in California, Laguna Hospital, it's a hospital that focuses on patients with dementia, and they put RFID tags on the patient, and when they get up and they wander the hallways they may get lost and they get confused. We can detect that that patient is now in this particular wing. We now play back over the closest speaker the voice of their loved one telling them to go back to their room, which they tend to respond to. So, those are some interesting use cases how we tied Cisco technology to ours. >> Yeah, that's fascinating, David, have you been to a few of these Cisco Lives in the past? >> In my career quite a few, yes. >> Yeah, it had been a couple of years since I'd been here. There's great energy here, especially being, you know, here in the DevNet zone. What do you see that's different today in networking in general and at Cisco Live specifically? >> It continues to be a evolution to more open connectivity. In the early days it was all about Cisco and it was very much, "Sell my box." Now it's selling solutions and the resellers and the partners here that are selling those solutions look to multi-vendor type environments, and Cisco's openness to that has really changed over the years, and I was involved in the very beginning days of the Ecosystem and the collab group, and it was really about starting that ball going. But you know, this is a totally different environment now and it's really amazing to feel the energy of people willing to work together and create something unique. So, that's the big change I see. >> Yeah, absolutely, I think back to... You know, the line you'd hear is like, "Well, Cisco is the standard," so that's kind of the way people do things, but that community and openness, it's making progress. Not saying that everything that Cisco does is 100% open, but it is... We were talking to Susie Wee earlier and some of the other people in DevNet is... Remember when Cisco had, like, one API-- >> Oh, yeah. 
>> For a product, now you know, there are a lot of solutions and they're building and they're collaborating with the community. >> Right, right. >> All right, Dave, I want to just give you a final word as to, you know, things you're seeing out there, what's exciting you the most and any final takeaways from the show? >> Well, I think networking continues to evolve. I think UC and collaboration is really still in the forefront of how it can affect how businesses operate, and you know, with the extension of video in meetings, you know, I see that's going to continue to be very exciting for both Cisco and for customers. You know, what we do is really on the forefront of people's minds now with safety and we're just on the tip of the iceberg of customer opportunities, so you know, if somebody's out there, he knows a Cisco customer, give us a call. >> All right, well David Tucker, thank you so much. SingleWire, absolutely, security of the network has been top of mind, security of people, absolutely, critically, and great to see some of those things also spreading into lots of other ecosystems and building interesting solutions. So, be back with lots more coverage here at Cisco Live 2018. I'm Stu Miniman and thanks so much for watching theCUBE. (techy music playing)
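
Editor's note: the "any-to-any" model David describes in the interview above, where any trigger (a panic button, a weather feed, a sensor) can reach any endpoint (phones, speakers, signage), can be pictured as a small fan-out router. The sketch below is illustrative only and is not InformaCast's API; the `Notifier` class, its method names, and the channel names are all hypothetical.

```python
class Notifier:
    """Hypothetical fan-out router: one triggered event reaches many channels."""

    def __init__(self):
        self.channels = {}  # channel name -> function that delivers a message
        self.routes = {}    # event type -> ordered list of channel names

    def add_channel(self, name, deliver):
        self.channels[name] = deliver

    def route(self, event_type, *channel_names):
        # Refuse routes to channels that were never registered.
        for name in channel_names:
            if name not in self.channels:
                raise KeyError(f"unknown channel: {name}")
        self.routes.setdefault(event_type, []).extend(channel_names)

    def trigger(self, event_type, message):
        # Deliver the message to every channel routed for this event type
        # and report which channels were reached.
        delivered = []
        for name in self.routes.get(event_type, []):
            self.channels[name](message)
            delivered.append(name)
        return delivered


if __name__ == "__main__":
    notifier = Notifier()
    notifier.add_channel("desk_phone", lambda m: print("phone display:", m))
    notifier.add_channel("overhead_speaker", lambda m: print("speaker:", m))
    notifier.route("tornado_warning", "desk_phone", "overhead_speaker")
    notifier.trigger("tornado_warning", "Take shelter now")
```

Keeping triggers and channels in separate tables is what makes the model any-to-any: a new sensor or a new endpoint type can be added without touching the other side.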

Published Date : Jun 13 2018


ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Cisco	ORGANIZATION	0.99+
David Tucker	PERSON	0.99+
Dave	PERSON	0.99+
Susie Wee	PERSON	0.99+
2001	DATE	0.99+
Stu Miniman	PERSON	0.99+
Siri	TITLE	0.99+
100%	QUANTITY	0.99+
18 years	QUANTITY	0.99+
California	LOCATION	0.99+
Cortana	TITLE	0.99+
Orlando, Florida	LOCATION	0.99+
Swiss Army	ORGANIZATION	0.99+
NetApp	ORGANIZATION	0.99+
Stu	PERSON	0.99+
three days	QUANTITY	0.99+
SingleWire	ORGANIZATION	0.99+
Singlewire	ORGANIZATION	0.98+
theCUBE	ORGANIZATION	0.98+
Two and a half years ago	DATE	0.98+
10 minutes	QUANTITY	0.97+
Department of Interior	ORGANIZATION	0.97+
first time	QUANTITY	0.97+
both	QUANTITY	0.97+
Cisco Live 2018	EVENT	0.96+
Laguna Hospital	ORGANIZATION	0.96+
Alexa	TITLE	0.96+
almost 40 years	QUANTITY	0.95+
First	QUANTITY	0.94+
first	QUANTITY	0.94+
less than	QUANTITY	0.93+
12	QUANTITY	0.92+
today	DATE	0.92+
InformaCast	ORGANIZATION	0.9+
Wisteria	ORGANIZATION	0.9+
Cisco Ecosystem	ORGANIZATION	0.85+
one	QUANTITY	0.85+
andard	PERSON	0.79+
UC	ORGANIZATION	0.78+
9/11	DATE	0.78+
DevNet	TITLE	0.68+
Cisco Live	ORGANIZATION	0.63+
DevNet	ORGANIZATION	0.59+
ISV	ORGANIZATION	0.59+
2018	DATE	0.57+
US	LOCATION	0.54+
couple	QUANTITY	0.53+
Google Home	COMMERCIAL_ITEM	0.53+
about two	DATE	0.53+
Cisco Live	EVENT	0.51+

Kellyn Pot'Vin Gorman, Delphix - Data Platforms 2017 - #DataPlatforms2017

>> Announcer: Live from the Wigwam in Phoenix, Arizona. It's theCUBE covering Data Platforms 2017. Brought to you by Qubole. >> Hey welcome back everybody. Jeff Frick here with theCUBE. We're at the historic Wigwam Resort. 99 years young just outside of Phoenix. At Data Platforms 2017. I'm Jeff Frick here with George Gilbert from Wikibon who's co-hosting with me all day. Getting to the end of the day. And we're excited to have our next guest. She is Kellyn Gorman. The technical intelligence manager and also the office of the CTO at Delphix, welcome. >> Yes, thank you, thank you so much. >> Absolutely, so what is Delphix for people that aren't familiar with Delphix? >> Most of us realize that the database and data in general is the bottleneck and Delphix completely revolutionizes that. We remove it from being the bottleneck by virtualizing data. >> So you must love this show. >> Oh I do, I do. I'm hearing all about all kinds of new terms that we can take advantage of. >> Right, cloud-native and separate, you know and I think just the whole concept of atomic computing. Breaking down, removing storage from server. Breaking it down into smaller parts. Sounds like it fits right into kind of your guys' wheelhouse. >> Yeah, I kind of want to containerize it all and be able to move it everywhere. But I love it. Yeah. >> So what do you think of this whole concept of Data Ops? We've been talking about Dev Ops for, I don't know how long... How long have we been talking about Dev Ops George? Five years? Six years? A while? >> Yeah a while (small chuckle) >> But now... >> Actually maybe eight years. >> Jeff: You're dating yourself George. (all laugh) Now we're talking about Data Ops, right? And there's a lot of talk of Data Ops. So this is the first time I've really heard it coined in such a way where it really becomes the primary driver in the way that you basically deliver value inside your organization. >> Oh absolutely. You know I come from the database realm. 
I was a DBA for over two decades and Dev Ops was a hard sell to a lot of DBAs. They didn't want to hear about it. I tried to introduce it over and over. The idea of automating and taking us kind of out of this manual intervention. That introduced many times human error. So Dev Ops was a huge step forward getting that out of there. But the database, and data in general, was still this bottleneck. So Data Ops is the idea that you automate all of this and if you virtualize that data we found with Delphix that removed that last hurdle. And that was my, I guess my session was on virtualizing big data. The idea that I could take any kind of structured or unstructured file and virtualize that as well and instead of deploying it to multiple environments, I was able to deploy it once and actually do IO on demand. >> So let's peel the onion on that a little bit. What does it mean to virtualize data? And how does that break databases' bottleneck on the application? >> Well right now, when you talk about a relational data or any kind of legacy data store, people are duplicating that through archaic processes. So if we talk about Oracle they're using things like Data Pump. They're using transportable tablespaces. These are very cumbersome, they take a very long time. Especially with the introduction of the cloud, there's much room for failure. It's not made for that, especially as the network is our last bottleneck. Is what we're also feeling too for many of these folks. When we introduce big data, many of these environments many of these, I guess you'd say projects came out of open source. They were done as a need, as a necessity to fulfill. And they've got a lot of moving pieces. And to be able to containerize that and then deploy it once and then virtualize it so instead of let's say you have 16 gigs that you need to duplicate here and over and over again. Especially if you're going on-prem or to the cloud. 
That I'm able to do it once and then do that IO on demand and go back to a gold copy, a central location. And it makes it look like it's there. I was able to deploy a 16 gig file to multiple environments in less than a minute. And then each of those developers each have their own environment. Each tester has their own and they actually have a read write full robust copy. That's amazing to folks. All of a sudden, they're not held back by it. >> So our infrastructure analysts and our Wikibon research CTO David Floyer, if I'm understanding this correctly, talks about this where it's almost like a snapshot. >> Absolutely >> And it's a read write snapshot although you're probably not going to merge it back into the original. And this way Dev tests and whoever else wants to operate on live data can do that. >> Absolutely, it's full read write, what we call data version control. We've always had version control at the code level. You may have had it at the actual server level. But you've rarely ever had it at the data level for the database or with flat files. What I used was the cms.gov data. It's available to everyone, it's public data. And we realized that these files were quite large and cumbersome. And I was able to reproduce it and enhance what they were doing at TIME magazine. And create a use case that made sense to a lot of people. Things that they're seeing in their real world environments. >> So, tell us more, elaborate how dev ops expands on this, I'm sorry, not dev ops, data ops. How, take that as an example and generalize it some more so that we see how if DBAs were a bottleneck. How they now can become an enabler? >> One it's getting them to raise new skills. Many DBAs think that their value relies on those archaic processes. "It's going to take me three weeks to do this." So I have three weeks of value. Instead of saying "I am going to be able to do this in one day" and those other resources are now also valuable because they're doing their jobs. 
We're also seeing that data was seen as the centralized point. People were trying to come up with these point solutions to them. We're able to take that out completely. And people are able to embrace agility. They have agile environments now. Dev Ops means that they're able to automate that very easily instead of having that stopping point of constantly hitting a data and saying "I've got to take time to refresh this." "How am I going to refresh it?" "Can I do just certain..." We hear about this all the time with testing. When I go to testing summits, they are trying to create synchronized virtualized data. They're creating test data sets that they have to manage. It may not be the same as production where I can actually create a container of the entire developmental production environment. And refresh that back. And people are working on their full product. There's no room for error that you're seeing. Where you would have that if you were just taking a piece of it. Or if you were able to just grab just one tier of that environment because the data was too large before. >> So would the automation part be a generation of snapshot, one or more snapshots. And then the sort of orchestration distribution to get it to the intended audiences? >> Yes, and we would use >> Okay. >> things like Jenkins through Chef, normal dev ops tools work along with this. Along with command line utilities that are part of our product. To allow people to just create what they would create normally. But many times it's been siloed and like I said, work around that data. We've included the data as part of that. That they can deploy it just as fast. >> So a lot of the conversation here this morning was really about put the data all in this through your or pick your favorite public cloud to enable access to all the applications to the APIs, through all different types of things. How does that impact kind of what you guys do in terms of conceptually? 
>> If you're able to containerize that it makes you capable of deploying to multiple clouds. Which is what we're finding. About 60% of our customers are in more than one cloud, two to five exactly. As we're dealing with that and recognizing that it's kind of like looking at your cloud environments. Like your phone providers. People see something shiny and new a better price point, lesser dollar. We're able to provide that one by saving all that storage space. It's virtualized, it's not taking a lot of disk space. Second of all, we're seeing them say "You know, I'm going to go over to Google." Oh guess what? This project says they need the data and they need to actually take the data source over to Amazon now. We're able to do that very easily. And we do it from multi tier. Flat files, the data, legacy data sources as well as our application tier. >> Now, when you're doing these snapshots, my understanding if I'm getting it right, is it's like a, it's not a full Xerox. It's more like the Delta. Like if someone's doing test dev they have some portion of the source of truth, and as they make changes to it, it grows to include the edits until they're done, in which case then the whole thing is blown away. >> It depends on the technology you're looking at. Ours is able to trap that. So when we're talking about a virtual database, we're using the native recovery mechanisms. To kind of think of it as a perpetual recovery state inside our Delphix engine. So those changes are going on and then you have your VDBs that are a snapshot in time that they're working on. >> Oh so like you take a snapshot and then it's like a journal >> the transactional data from the logs is continually applied. Of course it's different depending on each technology. So we do it differently for Sybase versus Oracle versus SQL Server and so on and so forth. Virtual files when we talk about flat files are different as well. Your parent, you take an exact snapshot of it. 
But it's really just projecting that NFS mount to another place. So that mount, if you replace those files, or update them of course, then you would be able to refresh and create a new shot of those files. So somebody said "We refresh these files every single night." You would be able to then refresh and project them out to the new place. >> Oh so you're, it's almost like you're sub-classing them... >> Yes. >> Okay, interesting... When you go into a company that's got a big data initiative, where do you fit in the discussion, in the sequence how do you position the value add relative to the data platform that it's sort of the center of the priority of getting it a platform in place? >> Well, that's what's so interesting about this is that we haven't really talked to a lot of big data companies. We've been very relational over a period of time. But our product is very much a Swiss Army knife. It will work on flat files. We've been doing it for multi tier environments forever. It's that our customers are now going "I have 96 petabytes in Oracle. I'm about to move over to big data." so I was able to go out and say how would I do this in a big data environment? And I found this use case being used by TIME magazine and then created my environment. And did it off of Amazon. But it was just a use case. It was just a proof of concept that I built to show and demonstrate that. Yeah, my guys back at the office are going "Kellyn when you're done with it, you can just deliver it back to us." (laughing) >> Jeff: Alright Kellyn. Well thank you for taking a few minutes to stop by and pretty interesting story. Everything's getting virtualized: machines, databases... >> Soon us! >> And our data. >> Soon George! >> Right, not me George... (George laughs) Alright, thanks again Kellyn >> Thank you so much. >> for stopping by. Alright I'm with George Gilbert. I'm Jeff Frick you're watching theCUBE from Data Platforms 2017 in Phoenix, Arizona. Thanks for watching. 
(upbeat electronic music)
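
Editor's note: the read-write virtual copies discussed in the interview above (a shared gold copy, with each developer or tester storing only the blocks they change) follow the general copy-on-write pattern. The sketch below illustrates that general pattern under simplifying assumptions (fixed-size blocks, same-size overwrites); it is not Delphix's implementation, and the `GoldCopy` and `VirtualCopy` names are hypothetical.

```python
class GoldCopy:
    """Shared, read-only source data split into fixed-size blocks."""

    def __init__(self, data: bytes, block_size: int = 4):
        self.block_size = block_size
        self.blocks = [
            data[i:i + block_size] for i in range(0, len(data), block_size)
        ]


class VirtualCopy:
    """A read-write view that stores only the blocks it has changed."""

    def __init__(self, gold: GoldCopy):
        self.gold = gold
        self.changed = {}  # block index -> this copy's private replacement

    def read_block(self, index: int) -> bytes:
        # Unchanged blocks are fetched from the shared gold copy on demand.
        return self.changed.get(index, self.gold.blocks[index])

    def write_block(self, index: int, block: bytes) -> None:
        # Simplifying assumption: a write replaces a whole block of equal size.
        if len(block) != len(self.gold.blocks[index]):
            raise ValueError("replacement block must keep the original size")
        self.changed[index] = block

    def read_all(self) -> bytes:
        return b"".join(
            self.read_block(i) for i in range(len(self.gold.blocks))
        )

    def private_bytes(self) -> int:
        # Storage this copy consumes beyond the shared gold copy.
        return sum(len(block) for block in self.changed.values())


if __name__ == "__main__":
    gold = GoldCopy(b"AAAABBBBCCCC")
    dev, tester = VirtualCopy(gold), VirtualCopy(gold)
    dev.write_block(1, b"XXXX")
    print(dev.read_all(), tester.read_all(), dev.private_bytes())
```

Creating another `VirtualCopy` costs almost nothing because no data is duplicated up front, which is the intuition behind provisioning a large file to many environments quickly.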

Published Date : May 26 2017


ENTITIES

Entity	Category	Confidence
George Gilbert	PERSON	0.99+
Jeff	PERSON	0.99+
Kellyn Gorman	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Kellyn	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
George	PERSON	0.99+
two	QUANTITY	0.99+
three weeks	QUANTITY	0.99+
16 gig	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
Phoenix	LOCATION	0.99+
Five years	QUANTITY	0.99+
eight years	QUANTITY	0.99+
Six years	QUANTITY	0.99+
16 gigs	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
less than a minute	QUANTITY	0.99+
each	QUANTITY	0.99+
99 years	QUANTITY	0.99+
Xerox	ORGANIZATION	0.99+
Phoenix, Arizona	LOCATION	0.99+
Delphix	ORGANIZATION	0.99+
Swiss Army	ORGANIZATION	0.99+
96 petabytes	QUANTITY	0.98+
David Floyer	PERSON	0.98+
About 60%	QUANTITY	0.98+
Each tester	QUANTITY	0.98+
Wikibon	ORGANIZATION	0.98+
more than one cloud	QUANTITY	0.98+
Second	QUANTITY	0.98+
one day	QUANTITY	0.98+
first time	QUANTITY	0.97+
TIME	TITLE	0.97+
five	QUANTITY	0.97+
Ops	TITLE	0.96+
each technology	QUANTITY	0.96+
Qubole	PERSON	0.96+
CTO	PERSON	0.95+
one tier	QUANTITY	0.94+
theCUBE	ORGANIZATION	0.94+
Chev	TITLE	0.93+
#DataPlatforms2017	EVENT	0.92+
Dev Ops	TITLE	0.91+
this morning	DATE	0.89+
Kellyn Pot'Vin Gorman	PERSON	0.88+
over two decades	QUANTITY	0.87+
one	QUANTITY	0.82+
Delphix	TITLE	0.81+
One	QUANTITY	0.77+
Datapump	ORGANIZATION	0.75+
Wigwam Resort	LOCATION	0.75+
Ops	ORGANIZATION	0.73+
single night	QUANTITY	0.72+
Jenkins	TITLE	0.71+
Wigwam	LOCATION	0.71+
Sequal	ORGANIZATION	0.7+
Data	TITLE	0.66+
Platforms	EVENT	0.65+
Data Platforms 2017	EVENT	0.64+
SEPRATE	PERSON	0.63+
cms.gov	OTHER	0.56+
Cybase	ORGANIZATION	0.56+
Cloud-	ORGANIZATION	0.55+
Delta	ORGANIZATION	0.54+
Data Ops	ORGANIZATION	0.52+
2017	DATE	0.44+

Vikram Bhambri, Dell EMC - Dell EMC World 2017

>> Narrator: Live from Las Vegas, it's theCUBE. Covering Dell EMC World 2017, brought to you by Dell EMC. >> Okay, welcome back everyone, we are live in Las Vegas for Dell EMC World 2017. This is theCUBE's eighth year of coverage of what was once EMC World, now it's Dell EMC World 2017. I'm John Furrier at SiliconANGLE, and also my cohost from SiliconANGLE, Paul Gillin. Our next guest is Vikram Bhambri, who is the Vice President of Product Management at Dell EMC. Formerly with Microsoft Azure, knows cloud, knows ViPR, knows the management, knows storage up and down, the Emerging Technologies Group, formerly of EMC. Good to see you on theCUBE again. >> Good to see you guys again. >> Okay, so Elastic Compute, this is going to be the game changer. We're so excited about one of our favorite interviews was your colleague we had on earlier. Unstructured data, object store, is becoming super valuable. And it was once the throwaway, "Yeah, store, later late." Now with apps and data driven enterprises having access to data is the value proposition that they're all driving towards. >> Absolutely. >> Where are you guys with making that happen and bringing that data to life? >> So, when I think about object storage in general, people talk about it's the S3 protocol, or it's the object protocol versus the file protocol. I think the conversation is not about that. The conversation is about data of the universe is increasing and it's increasing tremendously. We're talking about 44 zettabytes of data by 2020. You need an easier way to consume, store, that data in a meaningful way, and not only just that but being able to derive meaningful insights out of that either when the data is coming in or when the data is stored on a periodic basis being able to drive value. So having access to the data at any point of time, anywhere, is the most important aspect of it. And with ECS we've been able to actually attack the market from both sides. 
Whether it's talking about moving data from higher cost storage arrays or higher performance tiers down to a more accessible, more cheap storage that is available geographically, that's one market. And then also you have tons of data that's available on the tape drive but that data is so difficult to access, so not available. And if you want to go put that tape back on an actual active system the turnaround time is so long. So being able to turn all of that storage into an active storage system that's accessible all the time is the real value proposition that we have to talk about. >> Well now help me understand this because we have all these different ways to make sense of unstructured data now. We have NoSQL databases, we have JSON, we have HDFS, and we've got object storage. Where does it fit into the hierarchy of making sense of unstructured data? >> The simplest way to think about it is we talk about a data ocean, with the amount of data that's growing. Having the capability to store data that is in a global content repository. That is accessible-- >> Meaning one massive repository. >> One massive repository. And not necessarily in one data center, right? It's spread across multiple data centers, it's accessible, available with a single, global namespace, regardless of whether you're trying to access data from location A or location B. But having that data be available through a single global namespace is the key value proposition that object storage brings to bear. The other part is the economics that we're able to provide consistently better than what the public clouds are able to offer. You're talking about anywhere between 30 to 48% cheaper TCO than what public clouds are able to offer, in your own data center with all the constraints that you want to like upload to it, whether it's regulated environments. Whether you're talking about country specific clouds and such, that's where it fits well together. 
But, exposing that same data out whether through HDFS or a file is where ECS differentiated itself from other cloud platforms. Yes, you can go to a Hadoop cluster and do a separate data processing but then you're creating more copies of the same data that you have in your primary storage. So things like that essentially help position object as the global content repository where you can just dump and forget about, about the storage needs. >> Vikram I want to ask you about the elastic cloud storage, as you mentioned, ECS, it's been around for a couple of years. You just announced an ECS, Elastic Cloud Storage, dedicated cloud. Can you tell me what that is and more about that because some people think of elastic they think Amazon, "I'll just throw it in object storage in the cloud." What are you guys doing specifically 'cause you have this hybrid offering. >> Absolutely. >> What is this about, can you explain that? >> Yeah, so if you look at, there are two extremes, or two paradigms that people are attracted by. On one side you have public clouds which give you the ease of use, you just swipe your credit card and you're in business. You don't have to worry about the infrastructure, you don't have to worry about, like, "Where my data is going to be stored?" It's just there. And then on the other side you have regulated environments or you just have environments where you cannot move to public clouds so customers end up putting in ECS, or other object storage for that matter, though ECS is the best. >> John: Biased, but that's okay. >> Yeah, now we are starting to see customers they're saying, "Can I have the best of both worlds? Can I have a situation where I like the ease of use of the public cloud but I don't want to be in a shared bathtub environment. I don't want to be in a public cloud environment. I like the privacy that you are able to provide me with this ECS in my own data center but I don't want to take on the infrastructure management." 
So for those customers we have launched ECS dedicated cloud service. And this is specifically targeted for scenarios where customers have maybe one data center, two data centers, but they want to use the full strength and the capabilities of ECS. So what we're telling them we will actually put their bought ECS in our data centers, ECS team will operate and manage that environment for the customer but they're the only dedicated customer on that cloud. So that means they have their own environment-- >> It's completely secure for their data. >> Vikram: Exactly. >> No multi tenant issues at all. >> No, and you can have either partial capabilities in our data center, or you can fully host in our data center. So you can do various permutations and combinations thus giving customers a lot of flexibility of starting with one point and moving to the other. Lets them start with a private cloud, they want to move to a hybrid version they can move that, or if they start from the hybrid and they want to go back to their own data centers they can do that as well. >> Let's change gears and talk about IoT. You guys had launched Project Nautilus, we also heard that from your boss earlier, two days ago. What is that about? Explain, specifically, what is Project Nautilus? >> So as I was mentioning earlier there is a whole universe of data that is now being generated by these IoT devices. Whether you're talking about connected cars, you're talking about wind sensors, you're talking about anything that collects a piece of data that needs to be not only stored but people want to do realtime analysis on that dataset. And today people end up using a combination of 10 different things. They're using Kafka, Spark, HDFS, Cassandra, DAS storage to build together a makeshift solution, that sort of works but doesn't really. Or you end up, like, if you're in the public cloud you'll end up using some implementation of Lambda Architecture. 
But the challenge there is you're storing the same data in a few different places, and not only that, there is no consistent way of managing data and processing data effectively. So Project Nautilus is our attempt to essentially streamline all of that. Allow the stream of data that's coming from these IoT devices to be processed in realtime, or in batch, in the same solution. And then once you've done that processing you essentially push that data down to a tier, whether it's Isilon or ECS, depending on the use case that you are trying to do. So it simplifies the whole story on realtime analytics, and you don't want to do it in a closed source way. What we've done is we've created this new paradigm, or new primitive, called streaming storage, and we are open sourcing it as Project Pravega, which is in the Apache Foundation. We want the whole community, just like there is a common sense of awareness for object and file, we want that same thing for streaming storage-- >> So you guys are active in open source. Explain quickly, many might not know that. Talk about that. >> So, yeah, as I mentioned, Project Pravega is something we announced at the Flink Forward conference. It's a streaming storage layer which is completely open source in the Apache Foundation, and we just open sourced it today. And it gives customers the capability to contribute code to it, take their version, or they can do whatever they want to do, like build additional innovation on top. And the goal is to make streaming storage a common paradigm like everything else. And in addition we're partnering with another open source component. There is a company called data Artisans based out of Berlin, Germany, and they have a project called Flink, and we're working with them pretty closely to bring Nautilus to fruition. >> theCUBE was there by the way, we covered Flink Forward again, one of the-- >> Paul: True streaming engine. >> Very good, very big open source project.
>> Yeah, we were talking with Jeff Woodrow earlier about software defined storage, self driving storage as he calls it. >> Where does ECS fit in the self driving storage? Is this an important part of what you're doing right now or is it a different use? >> Yeah, our vision right from the beginning was that when we built this next generation of object storage system, it has to be software first. Not only software first, where a customer can choose their commodity hardware to bring to bear or we can supply the commodity hardware, but over time build intelligence in that layer of software so that you can move data off smartly, from SSDs to more SATA based drives. Or you can bring in smarts around the metadata search capabilities that we've introduced recently. Because you have now billions of billions of records that are being stored on ECS. You want ease of search for what specifically you're looking for, so we introduced metadata search capability. So it's taking the storage system and all of the data services that were usually outside of the platform, and making them part of the core platform itself. >> Are you working with Elasticsearch? >> Yes, we are using Elasticsearch more to enable customers who want to get insights about ECS itself. And Nautilus, of course, is also going to integrate with Elasticsearch as well. >> Vikram, let's wrap this up. Thank you for coming on theCUBE. Bottom line, what's the bottom line message, quickly, summarize the value proposition, why customers should be using ECS, what's the big aha moment, what's the proposition? >> I would say the value proposition is very simple. Sometimes people talk about lots of complex terms, but it's very simple. Sustainable, low cost storage for storing a wide variety of content in a global content repository is the key value proposition. >> And used for application developers to tap into? The whole dev ops, data as code, infrastructure as code movement.
>> Yeah, what we have seen in the majority of the use cases is customers start with one use case of archiving. And then they very quickly realize that it's like a Swiss Army knife. You start with archiving, then you move on to application development, more modern applications, or cloud native application development. And now with IoT and Nautilus being able to leverage data from these IoT devices onto these-- >> As I said two days ago, I think this is a huge, important area for agile developers. Having access to data in less than a hundred milliseconds, from any place in the world, is going to be table stakes. >> ECS, or in general object storage, has to be part of every important conversation that is happening about digital IT transformation. >> It sounds like eventually most of the data's going to end up there. >> Absolutely. >> Okay, so I'll put ya on the spot. When are we going to be seeing data in less than a hundred milliseconds from any database anywhere in the fabric of a company, for a developer to call a data ocean and give me data back from any database, from any transaction, in less than a hundred milliseconds? Can we do that today? >> We can do that today, it's available today. The challenge is how quickly enterprises are adopting the technology. >> John: So they got to architect it? >> Yeah. >> They have to architect it. >> Paul: If it's all on Isilon. >> They can pull it, they can cloud pull it down from Isilon to ECS. >> True. >> Yeah. >> Speed, low latency, is the key to success. Congratulations. >> Thank you so much. >> And I love this new object store, love this tier two value proposition. It's so much more compelling for developers, certainly in cloud native. >> Vikram: Absolutely. >> Vikram, here on theCUBE, bringing you more action from Las Vegas. We'll be right back as day three coverage continues here at Dell EMC World 2017. I'm John Furrier with Paul Gillin, we'll be right back.

Published Date : May 10 2017


ENTITIES

Entity	Category	Confidence
Jeff Woodrow	PERSON	0.99+
Paul	PERSON	0.99+
John	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Paul Gillan	PERSON	0.99+
Vikram Bhambri	PERSON	0.99+
Vikram	PERSON	0.99+
John Furrier	PERSON	0.99+
Paul Gillin	PERSON	0.99+
EMC	ORGANIZATION	0.99+
Emerging Technologies Group	ORGANIZATION	0.99+
2020	DATE	0.99+
Las Vegas	LOCATION	0.99+
less than a hundred milliseconds	QUANTITY	0.99+
Dell EMC	ORGANIZATION	0.99+
two extremes	QUANTITY	0.99+
Apache Foundation	ORGANIZATION	0.99+
two paradigms	QUANTITY	0.99+
Isilon	ORGANIZATION	0.99+
eighth year	QUANTITY	0.99+
both sides	QUANTITY	0.99+
Swiss Army	ORGANIZATION	0.99+
Flink	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
today	DATE	0.99+
two days ago	DATE	0.99+
one	QUANTITY	0.98+
Nautilus	ORGANIZATION	0.98+
30	QUANTITY	0.98+
Lambda Architecture	TITLE	0.98+
48%	QUANTITY	0.98+
two data centers	QUANTITY	0.98+
10 different things	QUANTITY	0.98+
SiliconANGLE	ORGANIZATION	0.98+
one data center	QUANTITY	0.98+
Elasticsearch	TITLE	0.98+
NoSQL	TITLE	0.97+
ECS	TITLE	0.97+
single	QUANTITY	0.97+
Kafka	TITLE	0.97+
both worlds	QUANTITY	0.97+
ECS	ORGANIZATION	0.97+
one point	QUANTITY	0.97+
one side	QUANTITY	0.97+
one market	QUANTITY	0.96+
first	QUANTITY	0.96+
Speak	TITLE	0.96+
Cassandra	TITLE	0.95+
Dell EMC World 2017	EVENT	0.94+
VIPRE	ORGANIZATION	0.94+
billions of billions of records	QUANTITY	0.93+
Project Nautilus	ORGANIZATION	0.92+
Vikram	ORGANIZATION	0.92+
day three	QUANTITY	0.91+
JSON	TITLE	0.91+
Berlin, Germany	LOCATION	0.9+
tons of data	QUANTITY	0.89+
EMC World 2017	EVENT	0.88+
data Artisans	ORGANIZATION	0.86+
HDFS	TITLE	0.84+
tier two	QUANTITY	0.83+
theCUBE	ORGANIZATION	0.82+
S3	OTHER	0.82+
44 zettabytes	QUANTITY	0.82+
Project Nautilus	TITLE	0.8+
Project Pravega	ORGANIZATION	0.78+

Scott Gnau | DataWorks Summit Europe 2017

(soothing technological music) >> Announcer: Live from Munich, Germany, it's theCUBE. Covering Dataworks Summit Europe 2017. Brought to you by Hortonworks. (soft technological music) >> Okay, welcome back everyone, we're here in Munich, Germany for Dataworks Summit 2017, formerly Hadoop Summit, powered by Hortonworks. It's their event, but now called Dataworks because data is at the center of the value proposition, Hadoop plus Airal Data and storage. I'm John, with my cohost David. Our next guest is Scott Gnau, he's the CTO of Hortonworks, joining us again from the keynote stage. Good to see you again. >> Thanks for having me back, great to be here. >> Good having you back. Let's get down and dirty and get technical. I'm super excited about the conversations that are happening in the industry right now for a variety of reasons. One is you can't get more excited about what's happening in the data business. Machine learning and AI have really brought up the hype around, to me is human America, people can visualize AI and see the self-driving cars and understand how software's powering all this. But still it's data driven, and Hadoop is extending into data, seeing that natural extension, and Cloudera has filed their S-1 to go public. So it brings back the conversations of this open source community that's been doin' all this work in the big data industry, originally riding in on the horse of Hadoop. You guys have an update to your Hadoop data platform, which we'll get to in a second, but I want to ask you about a lot of stories around Hadoop. I say Hadoop was the first horse that everyone rode in on in the big data industry... When I say big data, I mean like DevOps, Cloud, the whole open source ethos, but it's evolving, it's not being replaced. So I want you to clarify your position on this because we were just talkin' about some of the false premises, a lot of stories being written about the demise of Hadoop, long live Hadoop. >> Yeah, well, how long do we have?
(laughing) I think you hit it first, we're at Dataworks Summit 2017, and it was previously Hadoop Summit. We rebranded it to really recognize that there's this bigger thing going on and it's not just Hadoop. Hadoop is a big contributor, a big driver, a very important part of the ecosystem, but it's more than that. It's really about being able to manage and deliver analytic content on all data across that data's lifecycle, from when it gets created at the edge, to its moving through networks, to its being landed and stored in a cluster, to analytics being run and decisions going back out. It's that entire lifecycle, and you mentioned some of the megatrends, and I talked about this in the opening keynote this morning. With AI and streaming and IoT, all of these things kind of converging are creating a much larger problem set and, frankly, opportunity for us as an industry to go solve. So that's the context that we're really looking-- >> And there's real demand there. This is not like, I mean there's certainly a hype factor on AI, but IoT is real. You have data now, not just as a back office concept, you have a front-facing, business centric... I mean there's real customer demand here. >> There's real customer demand and it really creates the ability to dramatically change a business. A simple example that I used onstage this morning is think about the electric utility business. I live in Southern California. 25 years ago, by the way I studied to be an electrical engineer, 20 years ago, 30 years ago, that business, not entirely simple, was about building a big power plant and distributing electrons out to all the consumers of electrons. One direction, and optimization of that grid, that network and that business was very hard, and there were billions of dollars at stake. Fast forward to today, now you've still got those generating plants online, but you've also got folks like me generating their own power and putting it back into the grid. So now you've got bidirectional electrons.
The optimization is totally different. Then how do you figure out how most effectively to create capacity and distribute that capacity, because created capacity that's not consumed is 100% spoiled. So it's a huge data problem, but it's a huge data problem meeting IoT, right? Devices, smart meter devices out at the edge, creating data and doing it in realtime. A cloud blew over, my generating capacity on my roof went down, so I've got to pull from the grid. Combining all of that data to make realtime decisions, we're talking hundreds of billions of dollars, and it's being done today in an industry that's not a high-tech Silicon Valley kind of industry. Electric utilities are taking advantage of this technology today. >> So we were talking off-camera about, you know, some commentary that Hadoop has failed, and obviously you take exception to that, and you also made the point it's not just about Hadoop, but in a way it is, because Hadoop was the catalyst of all this. Why has Hadoop not failed in your view? >> Well, because we have customers, and you know, the great thing about conferences like this is we're actually able to get a lot of folks to come in and talk about what they're doing with the technology and how they're driving business benefit, and share that business benefit with their colleagues, so we see that business benefit coming along. You know, in any hype cycle people can go down a path, maybe they had false expectations early on. You know, six years ago we were talking about, hey, is open source Hadoop going to come along and replace EDW? Complete fallacy, right? What I talked about in that opportunity, being able to store all kinds of disparate data, being able to manage and maneuver analytics in real time, that value proposition is very different than some of the legacy tech.
So if you view it as, hey, this thing is going to replace that thing, okay, maybe not, but the point is it's very successful for what it is-- >> Just to clarify what you just said there, you guys never took that position. Cloudera did, with their Impala, that was their initial one. You could give me that, you don't agree with that? >> Publicly they would say, oh, it's not a replacement, but you're right, I mean the actions were maybe designed to do that. >> And set in the marketplace that that might be one of the outcomes. >> Yeah, but they pivoted quickly when they realized that was a failed strategy, but that became a premise that people locked in on. >> If that becomes your yardstick for measuring, then so-- >> Oh, but wouldn't you agree that Hadoop in many respects was designed to solve some of the problems that EDW never could? >> Exactly. So, you know, again, when you think about the variety of data, when you think about the analytic content, doing time series analysis is very hard to do in a relational model, so it's a new tool in the workbench to go solve analytic problems. And so when you look at it from that perspective, and I used the utility example, the manufacturing example, financial, consumer finance, telco, all of these companies are using this technology, leveraging this technology to solve problems they couldn't solve, or frankly to build new businesses that they couldn't build before because they didn't have access to that real time-- >> And so money did shift from pouring money into the EDW with limited returns, because you were at the steep part or the flat part of the S-curve, to hey, let's put it over here in this so-called big data thing, and that's why the market, I think, was conditioned to sort of come to that simple conclusion. But dollars, the spending did shift, did it not?
>> Yeah, I mean if you subscribe to that herd mentality, and you know, the net increase, the net new expenditure in the new technology is always going to outpace the growth of the existing, kind of plateaued technologies. That's just math. >> The growth yes, but not the size, not the absolute dollars, and so you have a lot of companies right now struggling in the traditional legacy space and you've got this rocket ship going in-- >> And again, I think if you think about the converging forces that are out there, in addition to, you know, IoT and streaming, frankly Hadoop is an enabler of AI. When you think about the success of AI and machine learning, it's about having massive, massive, massive amounts of data, right? And I think back 25 years ago, my first data mart was 30 gigabytes and we thought that was all the data in the world. Now it fits on your phone. So when you think about just having the utter capacity and the ability to actually process that capacity of data, these are technology breakthroughs that have been driven in the core open source Hadoop community. When combined with the ability to execute in clouds and ephemeral kinds of workloads, you combine all that stuff together, and now instead of going to a capital committee for 20 million dollars for a bunch of hardware to do an exabyte kind of study, where you may not get an answer that means anything, you can now spin that up in the cloud and for a couple of thousand dollars get the answer, take that answer and go build a new system of insight that's going to drive your business. And this is a whole new area of opportunity, enabled by the convergence of all that. >> So I agree, I mean it's absurd to say Hadoop and big data have failed, it's crazy.
Okay, but despite the growth, I've called it profitless prosperity. Can the industry fund itself? I mean, you've got to make big bets, YARN, Tez, different clouds. How does the industry turn into one that is profitable and growing? >> Well, I mean, obviously it creates new business models and new ways of monetizing software and deploying software. You know, one of the key things that is core to our belief system is that really leveraging and working with and nurturing the community is going to be a key success factor for our business, right? Nurturing that innovation in collaboration across the community to keep up with the rate and pace of change is one of the aspects of being relevant as a business, and then obviously creating a great service experience for our customers, so that they know that they can depend on enterprise-class support, enterprise-class security and governance, and operational management in the cloud and on-prem. Creating that value proposition, along with the advanced and accelerated delivery of innovation, is where I think, you know, we kind of intersect uniquely in the industry. >> And one of the things that I think people point out, and I have this conversation all the time with people who try to squint through, you know, the Wall Street implications of the value proposition of the industry and this and that, and I want to get your thoughts on it, because open source, in this era that we're living in today, is bringing so much value outside of just Hortonworks, your company. Dave made a comment on the intro package we're doing that the practitioners are getting a lot of value, people out in the field, so these are the white spaces of value, and they're actually transformative. Can you give some examples where things are getting done that are of real value, as use cases that you guys can highlight? I think that's the unwritten story that no one thought about, that rising tide floating all boats happening.
>> Yeah, I mean, the white space use cases, so you have some of those use cases, and again it really involves kind of integrating legacy, traditional transactional information, right, very valuable information about a company, its operations, its customers, its products and all this kind of thing, and being able to combine that with the ability to do real-time sensor management, and ultimately have a technology stack that enables the connection of all of those sources of data for an analytic. And that's an important differentiation. You know, for the first 25 years of my career it was all about, let's pull all this data into a place and then let's do something with it, and then we can push analytics back. Not an entirely bad model, but a model that breaks in the world of IoT connected devices. There just, frankly, isn't enough money to spend on bandwidth to make that happen, and as fast as the speed of light is, it creates latency, so those decisions aren't going to be able to be made in time. So we're seeing it even in traditional industries, I mentioned the utility business, think about manufacturing, oil and gas, right, sensors everywhere. Being able to take advantage, not of collecting all the sensor data centrally and all of that, but being able to actually create analytics based on sensor data and put those analytics out at the sensors to make real-time decisions that can affect hundreds of millions of dollars of production or equipment, those are the use cases that we're seeing be deployed today, and that's complete white space that was unavailable before. >> Yeah, and customer demand too. I mean Dave and I were also debating about this not being a new trend, this is just big data happening. The customers are demanding production workloads, so you've seen a lot more of a forcing function driven by the customer. And you guys have some news I want to get to and get your thoughts on, HDP, Hortonworks Data Platform 2.6. What's the key news there? It's real time, you were talking about real time.
>> Yeah, it's about real time, flexibility and choice, you know, motherhood and apple pie. >> And the major highlights of that upgrade? >> So the upgrades are really inside of Hive. We now have operational analytic query capabilities where you get tactical response times, second, sub-second kind of response time. >> You know, Hadoop and Hive weren't previously known for that kind of a tactical response. We've been able to now add inside of that technology the ability to serve that workload. We have customers who are building these white space applications who have hundreds or thousands of users or applications that depend on consistency of very quick analytic response time, and we now deliver that inside the platform. What's really cool about it, in addition to the fact that it works, is that we did it inside of Hive, so we didn't create yet another project or yet another thing that a customer has to integrate to or rewrite their application for. So any Hive-based application can now take advantage of this performance enhancement, and that's part of our thinking of it as a platform. The second thing inside of that that we've done, that really caters to those kinds of workloads, is we've really enhanced the ability to do incremental data acquisition, right, whether it be streaming, whether it be batch upserts, right, or the SQL person doing upserts, being able to do that data maintenance in an ACID-compliant fashion, completely automatically and behind the scenes, so that those applications again can just kind of run without any heavy lifting. >> Just a data in motion kind of thing going on? >> Right, it's anywhere from data in motion even to batch to mini batch and anywhere kind of in between, but when we're doing those incremental data loads, you know, it's easy to get the same file twice by mistake. You don't want to double count, you want to have sanctity of the transactions. We now handle that inside of Hive with ACID compliance.
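[Editor's illustration] The point above about receiving the same file twice without double counting can be sketched in a few lines. This is a minimal sketch in plain Python, not Hive and not Hortonworks code: the `Reading` record, its `txn_id` key, and the `upsert` helper are all hypothetical names chosen for illustration. It assumes each record carries a unique key, so that replaying a batch overwrites rows instead of appending duplicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Hypothetical smart-meter record; txn_id is an assumed unique key."""
    txn_id: str
    meter_id: str
    kwh: float


def upsert(table: dict, batch: list) -> int:
    """Merge a batch into the table keyed by txn_id.

    Returns the number of rows that were newly inserted. Rows whose
    key already exists are overwritten, so replaying a batch is harmless.
    """
    inserted = 0
    for row in batch:
        if row.txn_id not in table:
            inserted += 1
        table[row.txn_id] = row  # insert, or update in place
    return inserted


def total_kwh(table: dict) -> float:
    """Sum consumption across all rows currently in the table."""
    return sum(row.kwh for row in table.values())


table = {}
batch = [Reading("t1", "m1", 1.5), Reading("t2", "m1", 2.0)]

first = upsert(table, batch)   # initial load
second = upsert(table, batch)  # the same file delivered a second time
```

After both loads the table still holds two rows and the total is unchanged, which is the "no double count" property described in the interview. A real system would also need atomicity and isolation across concurrent writers, which this sketch does not attempt.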
>> So a layperson question for the CTO, if I may. You mentioned Hadoop was not known for a sort of real-time response, you just mentioned ACID, it was never in the early days known for a sort of ACID, you know, compliance. Others would say, you know, Hadoop, the original big data platform, is not designed for the matrix math of AI, for example. Are these misconceptions? And like Tim Berners-Lee, when we met Tim Berners-Lee on web 2.0, he said this is what the web was designed for. Would you say the same thing about Hadoop? >> Yeah. Ultimately, from my perspective, and kind of netting it out, Hadoop was designed for the easy acquisition of data, the easy onboarding of data, and then once you've onboarded that data, it also was known for enabling new kinds of analytics that could be plugged in, certainly starting out with MapReduce and HDFS before. But the whole idea is I now have a flexible way to easily acquire data in its native form, without having to apply schema, without having to have any formatting distort it. I can get it exactly as it was and store it, and then I can apply whatever schema, whatever rules, whatever analytics on top of that that I want. So the center of gravity, in my mind, has really moved up to YARN, which enables a multi-tenancy approach to having pluggable, multiple different kinds of file formats and pluggable different kinds of analytics and data access methods, whether it be SQL, whether it be machine learning, whether it be HBase lookup and indexing, and anywhere kind of in between. It's that Swiss Army knife, as it were, for handling all of this new stuff that is changing every second. As we sit here, data has changed. >> And just a quick follow-up, if I can, just a clarification. So you said new types of analytics that can be plugged in by design, because of its openness, is that right?
>> By design, because of its openness and the flexibility that the platform was built for. In addition, on the performance side, we've also got a new update to Spark, and usability, consumability and collaboration for data scientists using the latest versions of Spark inside the platform. We've got a whole lot of other features and functions that our customers have asked for, and then on the flexibility and choice side, it's available on public cloud infrastructure as a service, public cloud platform as a service, on-prem x86 and, net new, on-prem with Power. >> Just one final question for you. As the industry evolves, what are some of the key areas that open source can pivot to that really take advantage of the machine learning and AI trends going on? Because you start to see that really increase the narrative around the importance of data, and a lot of people are scratching their heads going, okay, I need to do the back office, to set up my IT to have all that great stuff, all those open source projects, all that, the Hadoop data platform, but then I've got to get down and dirty. I might do multiple clouds, there's hybrid cloud going on, I might want to leverage Mesos, you know, cool containers and Kubernetes and microservices, and almost DevOps. Where's that transition happening? As a CTO, how do you talk to customers about this transition, this evolution of how the data business is getting more and more mainstream?
>> Yeah, I mean, I think the big thing that people had to get over is we've reversed polarity from, again, 30 years of, I want a stack vendor to have an integrated stack of everything, plug-and-play, it's integrated, and it might not be a hundred percent what I want, but there's the cost leverage that I get out of the stack versus what I'm going to go do that's perfect. In this world it's the opposite, it's about enabling the ecosystem. And by the way, it's a combination of open source and proprietary software, you know, some of our partners have proprietary software, that's okay. But it's really about enabling the ecosystem, and I think the biggest service that we as an open source community can do is to continue to keep that standard kernel for the platform and make it very usable and very easy for many apps and software providers and other folks. >> A thousand flowers bloom kind of concept, and that's what you've done with the white spaces, as these use cases are evolving very rapidly, and then the bigger apps are kind of settling into a workload with realtime. >> Yeah, all the time. You know, think about the next generation of IT professional, the next generation of business professional. They grew up with iPhones, they grew up in a mini app world. I mean, I download an app, I'm going to try it, it's a widget, boom, and it's going to help me get something done, but it's not a big stack that I'm going to spend 30 years implementing. And if I like it, then I want to take those widgets and connect them together to do things that I haven't been able to do before, and that's how this ecosystem is really-- >> Great DevOps culture, very agile, that's their mindset. So Scott, congratulations on your 2.6 upgrade and >> Scott: We're thrilled about it.
>> Great stuff. ACID compliance, really big deal. Again, these little compliance things are important in the enterprise. Great. All right, thanks for coming on theCUBE. Dataworks in Munich, Germany. I'm John, thanks for watching. More coverage live here in Germany after this short break.

Published Date : Apr 5 2017


ENTITIES

Entity	Category	Confidence
Scott	PERSON	0.99+
100%	QUANTITY	0.99+
John	PERSON	0.99+
David	PERSON	0.99+
Dave	PERSON	0.99+
Germany	LOCATION	0.99+
Southern California	LOCATION	0.99+
30 years	QUANTITY	0.99+
30 gigabytes	QUANTITY	0.99+
Scott Gnau	PERSON	0.99+
hundreds	QUANTITY	0.99+
Hortonworks	ORGANIZATION	0.99+
Swiss Army	ORGANIZATION	0.99+
six years ago years ago	DATE	0.99+
America	LOCATION	0.99+
25 years ago	DATE	0.99+
Hadoop	TITLE	0.99+
Munich, Germany	LOCATION	0.99+
today	DATE	0.98+
Dataworks Summit 2017	EVENT	0.98+
30 years ago	DATE	0.98+
two points	QUANTITY	0.98+
iphones	COMMERCIAL_ITEM	0.98+
telco	ORGANIZATION	0.98+
Hadoop	ORGANIZATION	0.98+
hundred percent	QUANTITY	0.98+
billions of dollars	QUANTITY	0.98+
first 25 years	QUANTITY	0.97+
DevOps	TITLE	0.97+
hundreds of millions of dollars	QUANTITY	0.97+
20 years ago	DATE	0.97+
20 millioin dollars	QUANTITY	0.97+
twice	QUANTITY	0.97+
DataWorks Summit	EVENT	0.97+
first	QUANTITY	0.97+
one	QUANTITY	0.97+
One	QUANTITY	0.96+
second thing	QUANTITY	0.96+
Tim Berners-lee	PERSON	0.96+
Silicon Valley	LOCATION	0.96+
Munich	LOCATION	0.96+
Hadoop Summit	EVENT	0.96+
One direction	QUANTITY	0.96+
first horse	QUANTITY	0.95+
first data	QUANTITY	0.95+
Dataworks	ORGANIZATION	0.94+
second	QUANTITY	0.92+
Cloud	TITLE	0.92+
EDW	ORGANIZATION	0.85+
2017	EVENT	0.85+
couple of thousand dollars	QUANTITY	0.84+
Dataworks Summit Europe 2017	EVENT	0.84+
MapReduce	TITLE	0.84+
thousands of users	QUANTITY	0.83+
lot of folks	QUANTITY	0.83+
this morning	DATE	0.8+
S1	TITLE	0.79+
Europe	LOCATION	0.78+
A thousand flower bloom	QUANTITY	0.78+
2.6	OTHER	0.76+
apps	QUANTITY	0.73+
