Analyst Predictions 2023: The Future of Data Management

(upbeat music) >> Hello, this is Dave Vellante with theCUBE, and one of the most gratifying aspects of my role as a host of "theCUBE TV" is I get to cover a wide range of topics. And quite often, we're able to bring to our program a level of expertise that allows us to more deeply explore and unpack some of the topics that we cover throughout the year. And one of our favorite topics, of course, is data. Now, in 2021, after being in isolation for the better part of two years, a group of industry analysts met up at AWS re:Invent and started a collaboration to look at the trends in data and predict what some likely outcomes will be for the coming year. And it resulted in a very popular session that we had last year focused on the future of data management. And I'm very excited and pleased to tell you that the 2023 edition of that predictions episode is back, and with me are five outstanding market analysts: Sanjeev Mohan of SanjMo, Tony Baer of dbInsight, Carl Olofson from IDC, Dave Menninger from Ventana Research, and Doug Henschen, VP and Principal Analyst at Constellation Research. Now, what is it that we're calling you guys? A data pack like the rat pack? No, no, no, no, that's not it. It's the data crowd, the data crowd, and the crowd includes some of the best minds in the data analyst community. They'll discuss how data management is evolving and what listeners should prepare for in 2023. Guys, welcome back. Great to see you. >> Good to be here. >> Thank you. >> Thanks, Dave. (Tony and Dave faintly speak) >> All right, before we get into 2023 predictions, we thought it'd be good to do a look back at how we did in 2022 and give a transparent assessment of those predictions. So, let's get right into it. We're going to bring these up here, the predictions from 2022, they're color-coded red, yellow, and green to signify the degree of accuracy. And I'm pleased to report there's no red. Well, maybe some of you will want to debate that grading system.
But as always, we want to be open, so you can decide for yourselves. So, we're going to ask each analyst to review their 2022 prediction and explain their rating and what evidence they have that led them to their conclusion. So, Sanjeev, please kick it off. Your prediction was data governance becomes key. I know that's going to knock you guys over, but elaborate, because you had more detail when you double click on that. >> Yeah, absolutely. Thank you so much, Dave, for having us on the show today. And we graded ourselves. I could have very easily made my prediction from last year green, but I'll mention why I left it as yellow. I fully believe that data governance was in a renaissance in 2022. And why do I say that? You have to look no further than AWS launching its own data catalog called DataZone. Before that, mid-year, we saw Unity Catalog from Databricks go GA. So, overall, I saw there was tremendous movement. When you see these big players launching a new data catalog, you know that they want to be in this space. And this space is highly critical to everything that I feel we will talk about in today's call. Also, if you look at established players, I spoke at Collibra's conference, data.world, work closely with Alation, Informatica, a bunch of other companies; they all added tremendous new capabilities. So, it did become key. The reason I left it as yellow is because I had made a prediction that Collibra would go IPO, and it did not. And I don't think anyone is going IPO right now. The market is really, really down, the funding in the VC and IPO market. But other than that, data governance had a banner year in 2022. >> Yeah. Well, thank you for that. And of course, you saw data clean rooms being announced at AWS re:Invent, so more evidence. And I like the fact that you included in your predictions some things that were binary, so you dinged yourself there. So, good job. Okay, Tony Baer, you're up next. Data mesh hits reality check.
As you see here, you've given yourself a bright green thumbs up. (Tony laughing) Okay. Let's hear why you feel that was the case. What do you mean by reality check? >> Okay. Thanks, Dave, for having us back again. This is something I wrote about and tried to get away from, and this is a topic that just won't go away. I did speak with a number of folks, early adopters and non-adopters, during the year. And I did find that it pretty much validated what I was expecting, which was that this has now become a front-burner issue. And if I had any doubt in my mind, the evidence I would point to is what was originally intended to be a throwaway post on LinkedIn, which I just quickly scribbled down the night before leaving for re:Invent. I was packing at the time, and for some reason, I was doing a Google search on data mesh. And I happened to have tripped across this ridiculous article, I will not say where, because it doesn't deserve any publicity, about the eight (Dave laughing) best data mesh software companies of 2022. (Tony laughing) One of my predictions was that you'd see data mesh washing. And I quickly hopped on that, wrote maybe three sentences in about a couple of minutes, saying this is hogwash, essentially. (laughs) And that just reun... And then, I left for re:Invent. And the next night, when I got into my Vegas hotel room, I clicked on my computer. I saw 15,000 hits on that post, which was the most hits of any single post I put up all year. And the responses were wildly pro and con. So, it pretty much validates my expectation in that data mesh really did hit a lot more scrutiny over this past year. >> Yeah, thank you for that. I remember that article. I remember rolling my eyes when I saw it, and then I recently, (Tony laughing) I talked to Walmart and they actually invoked Martin Fowler and they said that they're working through their data mesh.
So, it takes a lot of thought, and it really, as we've talked about, is as much an organizational construct. You're not buying data mesh >> Bingo. >> to your point. Okay. Thank you, Tony. Carl Olofson, here we go. You've graded yourself a yellow on the prediction that graph databases take off. Please elaborate. >> Yeah, sure. So, I realized in looking at the prediction that it seemed to imply that graph databases could be a major factor in the data world in 2022, which obviously didn't become the case. It was an error on my part in that I should have said it in the right context. It's really a three to five-year time period that graph databases will really become significant, because they still need accepted methodologies that can be applied in a business context as well as proper tools in order for people to be able to use them seriously. But I stand by the idea that it is taking off, because for one thing, Neo4j, which is the leading independent graph database provider, had a very good year. And also, we're seeing interesting developments in terms of things like AWS with Neptune and with Oracle providing graph support in Oracle Database this past year. Those things are, as I said, growing gradually. There are other companies like TigerGraph and so forth, that deserve watching as well. But as far as becoming mainstream, it's going to be a few years before we get all the elements together to make that happen. Like any new technology, you have to create an environment in which ordinary people without a whole ton of technical training can actually apply the technology to solve business problems. >> Yeah, thank you for that. These specialized databases, graph databases, time series databases, you see them embedded into mainstream data platforms, but there's a place for these specialized databases. I would suspect we're going to see new types of databases emerge with all this cloud sprawl that we have and maybe to the edge.
>> Well, part of it is that it's not as specialized as you might think. You can apply graphs to a great many workloads and use cases. It's just that people have yet to fully explore and discover what those are. >> Yeah. >> And so, it's going to be a process. (laughs) >> All right, Dave Menninger, streaming data permeates the landscape. You gave yourself a yellow. Why? >> Well, I couldn't think of an appropriate combination of yellow and green. Maybe I should have used chartreuse, (Dave laughing) but I was probably a little hard on myself making it yellow. This is another type of specialized data processing, like the graph databases Carl was talking about: stream processing. And nearly every data platform offers streaming capabilities now. Often, it's based on Kafka. If you look at Confluent, their revenues have grown at more than 50%, continue to grow at more than 50% a year. They're expected to do more than half a billion dollars in revenue this year. But the thing that hasn't happened yet, and to be honest, I didn't necessarily expect it to happen in one year, is that streaming hasn't become the default way in which we deal with data. It's still a sidecar to data at rest. And I do expect that we'll continue to see streaming become more and more mainstream. I do expect perhaps in the five-year timeframe that we will first deal with data as streaming and then at rest, but the worlds are starting to merge. And we even see some vendors bringing products to market, such as K2View, Hazelcast, and RisingWave Labs. So, in addition to all those core data platform vendors adding these capabilities, there are new vendors approaching this market as well. >> I like the tough grading system, and it's not trivial. And when you talk to practitioners doing this stuff, there's still some complications in the data pipeline. But I think, you're right, it probably was a yellow plus. Doug Henschen, data lakehouses will emerge as dominant.
When you talk to people about lakehouses, practitioners, they all use that term. They certainly use the term data lake, but now, they're using lakehouse more and more. What are your thoughts here? Why the green? What's your evidence there? >> Well, I think, I was accurate. I spoke about it specifically as something that vendors would be pursuing. And we saw yet more lakehouse advocacy in 2022. Google introduced its BigLake service alongside BigQuery. Salesforce introduced Genie, which is really a lakehouse architecture. And it was a safe prediction to say vendors are going to be pursuing this in that AWS, Cloudera, Databricks, Microsoft, Oracle, SAP, Salesforce now, IBM, all advocate this idea of a single platform for all of your data. Now, the trend was also supported in 2022, in that we saw a big embrace of Apache Iceberg. That's a structured table format. It's used with these lakehouse platforms. It's open, so it ensures portability and it also ensures performance. And that's a structured table that helps with the warehouse side performance. But among those announcements, Snowflake, Google, Cloudera, SAP, Salesforce, IBM, all embraced Iceberg. But keep in mind, again, I'm talking about this as something that vendors are pursuing as their approach. So, they're advocating it to end users. It's very cutting edge. I'd say the top, leading edge, 5% of companies have really embraced the lakehouse. I think, we're now seeing the fast followers, the next 20 to 25% of firms embracing this idea and embracing a lakehouse architecture. I recall Christian Kleinerman at the big Snowflake event last summer, making the announcement about Iceberg, and he asked for a show of hands: for any of you in the audience at the keynote, have you heard of Iceberg? And just a smattering of hands went up. So, the vendors are ahead of the curve. They're pushing this trend, and we're now seeing a little bit more mainstream uptake. >> Good. Doug, I was there.
It was you, me, and I think, two other hands were up. That was just humorous. (Doug laughing) All right, well, so I liked the fact that we had some yellow and some green. When you think about these things, there's the prediction itself. Did it come true or not? There are the sub predictions that you guys make, and of course, the degree of difficulty. So, thank you for that open assessment. All right, let's get into the 2023 predictions. Let's bring up the predictions. Sanjeev, you're going first. You've got a prediction around unified metadata. What's the prediction, please? >> So, my prediction is that the metadata space is currently a mess. It needs to get unified. There are too many use cases of metadata, which are being addressed by disparate systems. For example, data quality has become really big in the last couple of years, data observability, the whole catalog space. Actually, people don't like to use the word data catalog anymore, because data catalog sounds like it's a catalog, a museum, if you may, of metadata that you go and admire. So, what I'm saying is that in 2023, we will see that metadata will become the driving force behind things like data ops, things like orchestration of tasks using metadata, not rules. Not saying that if this fails, then do this, if this succeeds, go do that. But it's like getting to the metadata level, and then making a decision as to what to orchestrate, what to automate, how to do data quality check, data observability. So, this space is starting to gel, and I see there'll be more maturation in the metadata space. Even security and privacy, some of these topics, which are handled separately. And I'm just talking about data security and data privacy. I'm not talking about infrastructure security. These also need to merge into a unified metadata management piece with some knowledge graph, semantic layer on top, so you can do analytics on it. So, it's no longer something that sits on the side, it's limited in its scope.
It is actually the very engine, the very glue that is going to connect data producers and consumers. >> Great. Thank you for that. Doug. Doug Henschen, any thoughts on what Sanjeev just said? Do you agree? Do you disagree? >> Well, I agree with many aspects of what he says. I think, there's a huge opportunity for consolidation and streamlining of these aspects of governance. Last year, Sanjeev, you said something like, we'll see more people using catalogs than BI. And I have to disagree. I don't think this is a category that's headed for mainstream adoption. It's a behind the scenes activity for the wonky few, or better yet, companies want machine learning and automation to take care of these messy details. We've seen these waves of management technologies, some of the latest being data observability and customer data platforms, but they failed to sweep away all the earlier investments in data quality and master data management. So, yes, I hope the latest tech offers glimmers that there's going to be a better, cleaner way of addressing these things. But to my mind, the business leaders, including the CIO, only want to spend as much time and effort and money and resources on these sorts of things as it takes to avoid getting breached, ending up in headlines, getting fired or going to jail. So, vendors, bring on the ML and AI smarts and the automation of these sorts of activities. >> So, if I may say something, the reason why we have this dichotomy between data catalog and the BI vendors is because data catalogs are very soon not going to be standalone products, in my opinion. They're going to get embedded. So, when you use a BI tool, you'll actually use the catalog to find out what it is that you want to do, whether you are looking for data or you're looking for an existing dashboard. So, the catalog becomes embedded into the BI tool. >> Hey, Dave Menninger, sometimes you have some data in your back pocket. Do you have any stats (chuckles) on this topic?
>> No, I'm glad you asked, because I'm going to... Now, data catalogs are something that's interesting. Sanjeev made a statement that data catalogs are falling out of favor. I don't care what you call them. They're valuable to organizations. Our research shows that organizations that have adequate data catalog technologies are three times more likely to express satisfaction with their analytics for just the reasons that Sanjeev was talking about. You can find what you want, you know you're getting the right information, you know whether or not it's trusted. So, those are good things. So, we expect to see the capabilities, whether it's embedded or separate. We expect to see those capabilities continue to permeate the market. >> And a lot of those catalogs are driven now by machine learning and things. So, they're learning from those patterns of usage by people when people use the data. (airy laughs) >> All right. Okay. Thank you, guys. All right. Let's move on to the next one. Tony Baer, let's bring up the predictions. You got something in here about the modern data stack. We need to rethink it. Is the modern data stack getting long in the tooth? Is it not so modern anymore? >> I think, in a way, it's gotten almost too modern. I don't know if it's getting long in the tooth, but it is getting long. The modern data stack, it's traditionally been defined as basically you have the data platform, which would be the operational database and the data warehouse. And in between, you have all the tools that are necessary to essentially get that data from the operational realm or the streaming realm for that matter into basically the data warehouse, or as we might be seeing more and more, the data lakehouse. And I think, what's important here is that we have seen a lot of progress, and this would be in the cloud, with the SaaS services.
And especially you see that in the modern data stack, which is like all these players, not just the MongoDBs or the Oracles or the Amazons have their database platforms. You see the Informaticas, the Fivetrans, and all the other players there have their own SaaS services. And within those SaaS services, you get a certain degree of simplicity, which is it takes all the housekeeping off the shoulders of the customers. That's a good thing. The problem is that what we're getting to unfortunately is what I would call lots of islands of simplicity, which means that it leaves it (Dave laughing) to the customer to have to integrate or put all that stuff together. It's a complex tool chain. And so, what we really need to think about here is that we have too many pieces. And going back to the discussion of catalogs, it's like we have so many catalogs out there, which one do we use? 'Cause chances are most organizations do not rely on a single catalog at this point. What I'm calling on all the data providers or all the SaaS service providers to do is to literally get it together and essentially make this modern data stack less of a stack, make it more of a blending of an end-to-end solution. And that can come in a number of different ways. Part of it is that data platform providers have been adding services that are adjacent. And there's some very good examples of this. We've seen progress over the past year or so. For instance, MongoDB integrating search. It's a very common, I guess, sort of tool that the applications that are developed on MongoDB use, so MongoDB then built it into the database rather than requiring an extra Elasticsearch or OpenSearch stack. Amazon just... AWS just did the zero-ETL, which is a first step towards simplifying the process of going from Aurora to Redshift. You've seen the same thing with Google BigQuery integrating basically streaming pipelines. And you're seeing also a lot of movement in in-database machine learning.
So, there's some good moves in this direction. I expect to see more of them this year. Part of it is basically the SaaS platforms adding some functionality. But I also see more importantly, because you're never going to get... This is like herding cats: asking your data team and your developers to standardize on the same tool. In most organizations, that is not going to happen. So, take a look at the most popular combinations of tools and start to come up with some pre-built integrations and pre-built orchestrations, and offer some promotional pricing, maybe not quite two for one, but in other words, get two services for the price of one and a half. I see a lot of potential for this. And to me, if the cloud was meant to simplify things, this is the next logical step and I expect to see more of this here. >> Yeah, and you see in Oracle, MySQL HeatWave, yet another example of eliminating that ETL. Carl Olofson, today, if you think about the data stack and the application stack, they're largely separate. Do you have any thoughts on how that's going to play out? Does that play into this prediction? What do you think? >> Well, I think, that the... I really like Tony's phrase, islands of simplification. It really says (Tony chuckles) what's going on here. You asked about how these stacks work. All these different vendors have their own stack vision. And you can... One application group is going to use one, and another application group is going to use another. And some people will say, let's go to, like you go to an Informatica conference and they say, we should be the center of your universe, but you can't connect everything in your universe to Informatica, so you need to use other things. So, the challenge is how do we make those things work together? As Tony has said, and I totally agree, we're never going to get to the point where people standardize on one organizing system.
So, the alternative is to have metadata that can be shared amongst those systems and protocols that allow those systems to coordinate their operations. This is standard stuff. It's not easy. But the motive for the vendors is that they can become more active critical players in the enterprise. And of course, the motive for the customer is that things will run better and more completely. So, I've been looking at this in terms of two kinds of metadata. One is the meaning metadata, which says what data can be put together. The other is the operational metadata, which says basically where did it come from? Who created it? What's its current state? What's the security level? Et cetera, et cetera, et cetera. The good news is the operational stuff can actually be done automatically, whereas the meaning stuff requires some human intervention. And as we've already heard from, was it Doug, I think, people are disinclined to put a lot of definition into meaning metadata. So, that may be the harder one, but coordination is key. This problem has been with us forever, but with the addition of new data sources, with streaming data, with data in different formats, the whole thing has, it's been like what a customer of mine used to say, "I understand your product can make my system run faster, but right now I just feel I'm putting my problems on roller skates. (chuckles) I don't need that to accelerate what's already not working." >> Excellent. Okay, Carl, let's stay with you. I remember in the early days of the big data movement, Hadoop movement, NoSQL was the big thing. And I remember Amr Awadallah said to us on theCUBE that SQL is the killer app for big data. So, your prediction here, if we bring that up, is SQL is back. Please elaborate. >> Yeah. So, of course, some people would say, well, it never left.
Actually, that's probably closer to true, but in the perception of the marketplace, there's been all this noise about alternative ways of storing, retrieving data, whether it's in key value stores or document databases and so forth. We're getting a lot of messaging that for a while had persuaded people that, oh, we're not going to do analytics in SQL anymore. We're going to use Spark for everything, except that only a handful of people know how to use Spark. Oh, well, that's a problem. And for ordinary conventional business analytics, Spark is like an over-engineered solution to the problem. SQL works just great. What's happened in the past couple years, and what's going to continue to happen is that SQL is insinuating itself into everything we're seeing. We're seeing all the major data lake providers offering SQL support, whether it's Databricks or... And of course, Snowflake is loving this, because that is what they do, and their success certainly points to the success of SQL. Even MongoDB. And we were all, I think, at the MongoDB conference where on one day, we hear SQL is dead. They're not teaching SQL in schools anymore, and this kind of thing. And then, a couple days later at the same conference, they announced we're adding a new analytic capability based on SQL. But didn't you just say SQL is dead? So, the reality is that SQL is better understood than most other methods, certainly of retrieving and finding data in a data collection, no matter whether it happens to be relational or non-relational. And even in systems that are very non-relational, such as graph and document databases, their query languages are being built or extended to resemble SQL, because SQL is something people understand. >> Now, you remember when we were in high school and you had to take the... You're debating in class and you were forced to take one side and defend it.
So, I was at a Vertica conference one time up on stage with Curt Monash, and I had to take the NoSQL, the-world-is-changing, paradigm-shift side. And so just to be controversial, I said to him, Curt Monash, I said, who really needs ACID compliance anyway? Tony Baer, (chuckles) of course, his head exploded, but what are your thoughts (guests laughing) on all this? >> Well, my first thought is congratulations, Dave, for surviving being up on stage with Curt Monash. >> Amen. (group laughing) >> I definitely would concur with Carl. We actually are definitely seeing a SQL renaissance and if there's any proof of the pudding here, I see the lakehouse as being the icing on the cake. As Doug had predicted last year, now, (clears throat) for the record, I think, Doug was about a year ahead of time in his predictions, in that this year is really the year that I see (clears throat) the lakehouse ecosystems really firming up. You saw the first shots last year. But anyway, on this, data lakes will not go away. I'm actually on the home stretch of doing a market landscape on the lakehouse. And the lakehouse will not replace data lakes in terms of that. There is the need for those data scientists who do know Python, who know Spark, to go in there and basically do their thing without all the restrictions or the constraints of a pre-built, pre-designed table structure. I get that. Same thing for developing models. But on the other hand, there is huge need. Basically, (clears throat) maybe MongoDB was saying that we're not teaching SQL anymore. Well, maybe we have an oversupply of SQL developers. Well, I'm being facetious there, but there is a huge skills base in SQL. Analytics have been built on SQL. Then came the lakehouse, and why this really helps to fuel a SQL revival is that the core need in the data lake, what brought on the lakehouse, was not so much SQL, it was a need for ACID. And what was the best way to do it? It was through a relational table structure.
So, the whole idea of ACID in the lakehouse was not to turn it into a transaction database, but to make the data trusted, secure, and more granularly governed, where you could govern down to column and row level, which you really could not do in a data lake or a file system. So, while the lakehouse can be queried in a manner, you can go in there with Python or whatever, it's built on a relational table structure. And so, for that end, for those types of data lakes, it becomes the end state. You cannot bypass that table structure as I learned the hard way during my research. So, the bottom line I'd say here is that the lakehouse is proof that we're starting to see the revenge of the SQL nerds. (Dave chuckles) >> Excellent. Okay, let's bring back up the predictions. Dave Menninger, this one's really thought-provoking and interesting. We're hearing things like data as code, new data applications, machines actually generating plans with no human involvement. And your prediction is the definition of data is expanding. What do you mean by that? >> So, I think, for too long, we've thought about data as the, I would say, facts that we collect, the readings off of devices and things like that, but data on its own is really insufficient. Organizations need to manipulate that data and examine derivatives of the data to really understand what's happening in their organization, why it has happened, and to project what might happen in the future. And my comment is that these data derivatives need to be supported and managed just like the data needs to be managed. We can't treat this as entirely separate. Think about all the governance discussions we've had. Think about the metadata discussions we've had. If you separate these things, now you've got more moving parts. We're talking about simplicity and simplifying the stack. So, if these things are treated separately, it creates much more complexity.
I also think it creates a little bit of a myopic view on the part of the IT organizations that are acquiring these technologies. They need to think more broadly. So, for instance, metrics. Metric stores are becoming a much more common part of the tooling that's part of a data platform. Similarly, feature stores are gaining traction. So, those are designed to promote the reuse and consistency, across the AI and ML initiatives, of the elements that are used in developing an AI or ML model. And let me go back to metrics and just clarify what I mean by that. So, any type of formula involving the data points. I'm distinguishing metrics from features that are used in AI and ML models. And the data platforms themselves are increasingly managing the models as an element of data. So, just like figuring out how to calculate a metric. Well, if you're going to have the features associated with an AI and ML model, you probably need to be managing the model that's associated with those features. The other element where I see expansion is around external data. Organizations for decades have been focused on the data that they generate within their own organization. We see more and more of these platforms acquiring and publishing data to external third-party sources, whether they're within some sort of a partner ecosystem or whether it's a commercial distribution of that information. And our research shows that when organizations use external data, they derive even more benefits from the various analyses that they're conducting. And the last great frontier in my opinion on this expanding world of data is the world of driver-based planning. Very few of the major data platform providers provide these capabilities today. These are the types of things you would do in a spreadsheet. And we all know the issues associated with spreadsheets. They're hard to govern, they're error-prone.
And so, if we can take that type of analysis, collecting the occupancy of a rental property, the projected rise in rental rates, the fluctuations perhaps in occupancy, the interest rates associated with financing that property, we can project forward. And that's a very common thing to do. What the income might look like from that property, the expenses, we can plan and purchase things appropriately. So, I think, we need this broader purview and I'm beginning to see some of those things happen. And the evidence today I would say, is more focused around the metric stores and the feature stores, starting to see vendors offer those capabilities. And we're starting to see the ML ops elements of managing the AI and ML models find their way closer to the data platforms as well. >> Very interesting. When I hear metrics, I think of KPIs, I think of data apps that orchestrate people and places and things to optimize around a set of KPIs. It sounds like a metadata challenge more... Somebody once predicted they'll have more metadata than data. Carl, what are your thoughts on this prediction? >> Yeah, I think that what Dave is describing as data derivatives is in a way, another word for what I was calling operational metadata, which is not about the data itself, but how it's used, where it came from, what the rules are governing it, and that kind of thing. If you have a rich enough set of those things, then not only can you do a model of how well your vacation property rental may do in terms of income, but also how well your application that's measuring that is doing for you. In other words, how many times have I used it, how much data have I used and what is the relationship between the data that I've used and the benefits that I've derived from using it? Well, we don't have ways of doing that. What's interesting to me is that folks in the content world are way ahead of us here, because they have always tracked their content using these kinds of attributes.
Where did it come from? When was it created, when was it modified? Who modified it? And so on and so forth. We need to do more of that with the structured data that we have, so that we can track how it's used. And also, it tells us how well we're doing with it. Is it really benefiting us? Are we being efficient? Are there improvements in processes that we need to consider? Because maybe data gets created and then it isn't used or it gets used, but it gets altered in some way that actually misleads people. (laughs) So, we need the mechanisms to be able to do that. So, I would say that that's... And I'd say that it's true that we need that stuff. I think, that starting to expand is probably the right way to put it. It's going to be expanding for some time. I think, we're still a distance from having all that stuff really working together. >> Maybe we should say it's gestating. (Dave and Carl laughing) >> Sorry, if I may- >> Sanjeev, yeah, I was going to say this... Sanjeev, please comment. This sounds to me like it supports Zhamak Dehghani's principles, but please. >> Absolutely. So, whether we call it data mesh or not, I'm not getting into that conversation, (Dave chuckles) but data (audio breaking) (Tony laughing) everything that I'm hearing what Dave is saying, Carl, this is the year when data products will start to take off. I'm not saying they'll become mainstream. They may take a couple of years to become so, but this is data products, all this thing about vacation rentals and how is it doing, that data is coming from different sources. I'm packaging it into our data product. And to Carl's point, there's a whole operational metadata associated with it. The idea is for organizations to see things like developer productivity, how many releases am I doing of this? What data products are most popular? 
I'm actually right now in the process of formulating this concept that just like we had data catalogs, we are very soon going to be requiring a data products catalog. So, I can discover these data products. I'm not just creating data products left, right, and center. I need to know, do they already exist? What is the usage? If no one is using a data product, maybe I want to retire and save cost. But this is a data product. Now, there's an associated thing that is also getting debated quite a bit called data contracts. And a data contract to me is literally just formalization of all these aspects of a product. How do you use it? What is the SLA on it, what is the quality that I am prescribing? So, data product, in my opinion, shifts the conversation to the consumers or to the business people. Up to this point when, Dave, you're talking about data and all of data discovery curation is very data producer-centric. So, I think, we'll see a shift more into the consumer space. >> Yeah. Dave, can I just jump in there just very quickly there, which is that what Sanjeev has been saying there, this is really central to what Zhamak has been talking about. It's basically about making, one, data products are about the lifecycle management of data. Metadata is just elemental to that. And essentially, one of the things that she calls for is making data products discoverable. That's exactly what Sanjeev was talking about. >> By the way, did everyone just notice how Sanjeev just snuck in another prediction there? So, we've got- >> Yeah. (group laughing) >> But you- >> Can we also say that he snuck in, I think, the term that we'll remember today, which is metadata museums. >> Yeah, but- >> Yeah. >> And also comment to, Tony, to your last year's prediction, you're really talking about it's not something that you're going to buy from a vendor. >> No. >> It's very specific >> Mm-hmm. >> to an organization, their own data product. So, touche on that one. Okay, last prediction. 
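Sanjeev's description of a data contract (how the product is used, its SLA, and the quality the producer prescribes) maps naturally onto a small, checkable structure. The following is a minimal sketch, assuming a contract with just a freshness SLA and a completeness threshold; the class, the field names, and the `check_contract` helper are hypothetical and are not drawn from any data contract standard or tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract for one data product; all fields are illustrative."""
    product: str
    usage: str                  # how consumers are expected to use the product
    freshness_sla_hours: float  # maximum data age the producer promises
    min_completeness: float     # promised share of non-missing values (0..1)


def check_contract(contract: DataContract,
                   hours_since_refresh: float,
                   completeness: float) -> List[str]:
    """Return human-readable violations; an empty list means the contract holds."""
    problems = []
    if hours_since_refresh > contract.freshness_sla_hours:
        problems.append(
            f"{contract.product}: data is {hours_since_refresh}h old, "
            f"SLA is {contract.freshness_sla_hours}h"
        )
    if completeness < contract.min_completeness:
        problems.append(
            f"{contract.product}: completeness {completeness:.2f} is below "
            f"the promised {contract.min_completeness:.2f}"
        )
    return problems


if __name__ == "__main__":
    contract = DataContract("vacation_rentals", "read-only reporting", 24, 0.98)
    for line in check_contract(contract, hours_since_refresh=30, completeness=0.99):
        print(line)
```

Expressing the contract as data rather than as a document is what lets a consumer, or a data products catalog, verify it automatically.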
Let's bring them up. Doug Henschen, BI analytics is headed to embedding. What does that mean? >> Well, we all know that conventional BI dashboarding reporting is really commoditized from a vendor perspective. It never enjoyed truly mainstream adoption. Always that 25% of employees are really using these things. I'm seeing rising interest in embedding concise analytics at the point of decision or better still, using analytics as triggers for automation and workflows, and not even necessitating human interaction with visualizations, for example, if we have confidence in the analytics. So, leading companies are pushing for next generation applications, part of this low-code, no-code movement we've seen. And they want to build that decision support right into the app. So, the analytic is right there. Leading enterprise apps vendors, Salesforce, SAP, Microsoft, Oracle, they're all building smart apps with the analytics predictions, even recommendations built into these applications. And I think, the progressive BI analytics vendors are supporting this idea of driving insight to action, not necessarily necessitating humans interacting with it if there's confidence. So, we want prediction, we want embedding, we want automation. This low-code, no-code development movement is very important to bringing the analytics to where people are doing their work. We got to move beyond the, what I call swivel chair integration, between where people do their work and going off to separate reports and dashboards, and having to interpret and analyze before you can go back and take action. >> And Dave Menninger, today, if you want analytics or you want to absorb what's happening in the business, you typically got to go ask an expert, and then wait. So, what are your thoughts on Doug's prediction? >> I'm in total agreement with Doug. I'm going to say that collectively... So, how did we get here? I'm going to say collectively as an industry, we made a mistake. 
We made BI and analytics separate from the operational systems. Now, okay, it wasn't really a mistake. We were limited by the technology available at the time. Decades ago, we had to separate these two systems, so that the analytics didn't impact the operations. You don't want the operations preventing you from being able to do a transaction. But we've gone beyond that now. We can bring these two systems and worlds together and organizations recognize that need to change. As Doug said, the majority of the workforce and the majority of organizations don't have access to analytics. That's wrong. (chuckles) We've got to change that. And one of the ways that's going to change is with embedded analytics. 2/3 of organizations recognize that embedded analytics are important and it even ranks higher in importance than AI and ML in those organizations. So, it's interesting. This is a really important topic to the organizations that are consuming these technologies. The good news is it works. Organizations that have embraced embedded analytics are more comfortable with self-service than those that have not, as opposed to turning somebody loose, in the wild with the data. They're given a guided path to the data. And the research shows that 65% of organizations that have adopted embedded analytics are comfortable with self-service compared with just 40% of organizations that are turning people loose in an ad hoc way with the data. So, totally behind Doug's predictions. >> Can I just break in with something here, a comment on what Dave said about what Doug said, which (laughs) is that I totally agree with what you said about embedded analytics. And at IDC, we made a prediction in our future intelligence, future of intelligence service three years ago that this was going to happen. And the thing that we're waiting for is for developers to build... You have to write the applications to work that way. It just doesn't happen automagically. 
Developers have to write applications that reference analytic data and apply it while they're running. And that could involve simple things like complex queries against the live data, which is through something that I've been calling analytic transaction processing. Or it could be through something more sophisticated that involves AI operations as Doug has been suggesting, where the result is enacted pretty much automatically unless the scores are too low and you need to have a human being look at it. So, I think that that is definitely something we've been watching for. I'm not sure how soon it will come, because it seems to take a long time for people to change their thinking. But I think, as Dave was saying, once they do and they apply these principles in their application development, the rewards are great. >> Yeah, this is very much, I would say, very consistent with what we were talking about, I was talking about before, about basically rethinking the modern data stack and going into more of an end-to-end solution. I think, that what we're talking about clearly here is operational analytics. There'll still be a need for your data scientists to go offline just in their data lakes to do all that very exploratory and that deep modeling. But clearly, it just makes sense to bring operational analytics into where people work into their workspace and further flatten that modern data stack. >> But with all this metadata and all this intelligence, we're talking about injecting AI into applications, it does seem like we're entering a new era of not only data, but new era of apps. Today, most applications are about filling forms out or codifying processes and require a human input. 
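Carl's point above, that a result can be enacted automatically unless the score is too low and a human needs to look at it, amounts to a confidence gate in the application. A minimal sketch follows; the function name, the 0.9 default threshold, and the "auto"/"review" labels are assumptions made for illustration, not anything the panel specified.

```python
from typing import Dict, List, Tuple


def route_predictions(scored: List[Tuple[str, float]],
                      min_confidence: float = 0.9) -> Dict[str, List[str]]:
    """Split scored items into those applied automatically and those sent to a person.

    `scored` holds (item_id, confidence) pairs from some upstream model;
    the threshold is a hypothetical business setting.
    """
    routed: Dict[str, List[str]] = {"auto": [], "review": []}
    for item_id, score in scored:
        key = "auto" if score >= min_confidence else "review"
        routed[key].append(item_id)
    return routed


if __name__ == "__main__":
    print(route_predictions([("invoice-1", 0.97), ("invoice-2", 0.41)]))
```

The threshold is the one knob that trades automation volume against human review effort, so in practice it would be tuned per decision type rather than hard-coded.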
And it seems like there's enough data now and enough intelligence in the system that the system can actually pull data from, whether it's the transaction system, e-commerce, the supply chain, ERP, and actually do something with that data without human involvement, present it to humans. Do you guys see this as a new frontier? >> I think, that's certainly- >> Very much so, but it's going to take a while, as Carl said. You have to design it, you have to get the prediction into the system, you have to get the analytics at the point of decision has to be relevant to that decision point. >> And I also recall basically a lot of the ERP vendors back like 10 years ago, were promising that. And the fact that we're still looking at the promises shows just how difficult, how much of a challenge it is to get to what Doug's saying. >> One element that could be applied in this case is (indistinct) architecture. If applications are developed that are event-driven rather than following the script or sequence that some programmer or designer had preconceived, then you'll have much more flexible applications. You can inject decisions at various points using this technology much more easily. It's a completely different way of writing applications. And it actually involves a lot more data, which is why we should all like it. (laughs) But in the end (Tony laughing) it's more stable, it's easier to manage, easier to maintain, and it's actually more efficient, which is the result of an MIT study from about 10 years ago, and still, we are not seeing this come to fruition in most business applications. >> And do you think it's going to require a new type of data platform database? Today, data's all far-flung. We see that's all over the clouds and at the edge. Today, you cache- >> We need a super cloud. >> You cache that data, you're throwing into memory. I mentioned, MySQL HeatWave. 
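The event-driven style described above, where decisions can be injected at various points instead of following a preconceived script, can be illustrated with a tiny in-process publish/subscribe bus. This is a sketch only: the `EventBus` class, the event names, and the handlers are hypothetical, and a production system would typically use a message broker rather than an in-memory dictionary.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class EventBus:
    """Minimal in-process event bus (illustrative, not a real framework)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        """Register a handler; new decision points are added without touching publishers."""
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, event: Event) -> None:
        """Deliver the event to every handler registered for its type, in order."""
        for handler in list(self._handlers.get(event_type, [])):
            handler(event)


if __name__ == "__main__":
    bus = EventBus()
    bus.subscribe("order_placed", lambda e: print("reserve stock for", e["sku"]))
    # A decision step injected later, without changing the code that publishes orders.
    bus.subscribe("order_placed", lambda e: print("score fraud risk for", e["sku"]))
    bus.publish("order_placed", {"sku": "A1"})
```

The publisher does not know which handlers exist, which is why an analytic or AI-driven decision can be added as one more subscriber.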
There are other examples where it's a brute force approach, but maybe we need new ways of laying data out on disk and new database architectures, and just when we thought we had it all figured out. >> Well, without referring to disk, which to my mind, is almost like talking about cave painting. I think, that (Dave laughing) all the things that have been mentioned by all of us today are elements of what I'm talking about. In other words, the whole improvement of the data mesh, the improvement of metadata across the board and improvement of the ability to track data and judge its freshness the way we judge the freshness of a melon or something like that, to determine whether we can still use it. Is it still good? That kind of thing. Bringing together data from multiple sources dynamically and real-time requires all the things we've been talking about. All the predictions that we've talked about today add up to elements that can make this happen. >> Well, guys, it's always tremendous to get these wonderful minds together and get your insights, and I love how it shapes the outcome here of the predictions, and let's see how we did. We're going to leave it there. I want to thank Sanjeev, Tony, Carl, David, and Doug. Really appreciate the collaboration and thought that you guys put into these sessions. Really, thank you. >> Thank you. >> Thanks, Dave. >> Thank you for having us. >> Thanks. >> Thank you. >> All right, this is Dave Vellante for theCUBE, signing off for now. Follow these guys on social media. Look for coverage on siliconangle.com, theCUBE.net. Thank you for watching. (upbeat music)

Published Date : Jan 11 2023

Manish Gupta, Redis Labs | Spark Summit East 2017

>> Announcer: Live from Boston, Massachusetts, it's theCUBE, covering Spark Summit East 2017. Brought to you by Databricks. Now, here are your hosts Dave Vellante and George Gilbert. >> Welcome back to snowy Boston, everybody. This is theCUBE, the leader in live tech coverage. We're here at Spark Summit East, hashtag SparkSummit. Manish Gupta is here, he's the CMO at Redis Labs. Manish, welcome to theCUBE. >> Thank you, good to be here. >> So, you know, 10 years ago you say you're in the database business and everybody would yawn. Now you're the life of the party. >> Yeah, the world has changed. I think the party has lots and lots of players. We are happy to be on the top of that heap. >> It is a crowded space, so how does Redis Labs differentiate? >> Redis Labs is the company behind the massively popular open source Redis, and Redis became popular because of its performance primarily, and then simplicity. Developers could very easily run up an instance of Redis, solve some very hairy problems, and time to market was a big issue for them. Redis Enterprise took that forward and enabled it to be mission critical, ready for the largest workloads, ready for things that the enterprises need in a highly distributed clustered environment. So they have resilience and they benefit from the performance of Redis. >> And your claim to fame, as you say, is that top-gun performance, you guys will talk about some of the benchmarks later. We're talking about use cases like fraud detection, as example. Obviously ad serving would be another one. But add some color to that if you would. >> Redis is whatever you need to make real time real, Redis plays a very important role. It is able to deliver millions of operations per second with sub-millisecond latency, and that's the hallmark. 
With data structures that comprise Redis, you can solve the problems in a way, and the reason you can get that performance is because the data structures take some very complex issues and simplify the operation. Depending on the use case, you could use one of the data structures, you can mix and match the data structures, so that's the power of a Redis. We're used for IoT, for machine learning, for metering of billing and telecommunications environment, for personalization, for ad serving with companies like Groupon and others, and the list goes on and on. >> Yeah, you've got a big list on your website of all your customers, so you can check that out. Let's get the business model piece out of the way. Everybody's always fascinated. Okay, you got open source, how do you make money? How does Redis make money? >> Yeah, you know, we believe strategically fostering the growth of open source is foundational in our business model, and we invest heavily both R&D and marketing to do that. On top of that, to enable enterprise success and deployment of Redis, we have the mission critical, highly available Redis Enterprise offerings. Our monetization is entirely based on the Redis Enterprise platform, which takes advantage of the data structures and performance of core Redis, but layers on top management and the capabilities that make things like auto-recovery, auto-sorting, management much, much easier for the enterprise. We make that available in four deployment models. The enterprise can select us as Redis cloud, which runs on a public infrastructure on any of the four major platforms. We also allow for the enterprise to select a VPC environment in their own private clouds. They can also get software and self-manage that, or get our software and we can manage it for them. Four deployment options are the modalities in other ways where the enterprise customers help us monetize. >> When you said four major platforms, you meant cloud platforms? >> That's right. 
AWS, >> So, AWS, Azure >> Azure, Google, and IBM. >> Is IBM software, got there in the fourth, alright. >> That's right, all four. >> Go to the whip IBM. Go ahead, George. >> Along the lines of the business model, and we were sort of starting to talk about this earlier offline, you're just one component in building an application, and there's always this challenge of, well, I can manage my component better than anyone else, but it's got to fit with a bunch of other vendors' components. How do you make that seamless to the customer so that it's not defaulting over to a cloud vendor who has to build all the components themselves to make it work together? >> Certainly, you know, database is an integral part of your stack, of your application stack, but it is a stack, so there are other components. Redis and Redis Labs has a very, very large ecosystem within which we operate. We work closely with others for interfaces, for connectors, for interoperability, and that's a sustained environment that we invest in on a continuous basis. >> How do you handle application consistency? A lot of in the NoSQL world, even in the AWS world, you hear about eventual consistency, but in the real-time world, there's a need for more rigorous, what's your philosophy there, how do you approach that? >> I think that's an issue that many NoSQL vendors have not been able to crack. Redis Labs has been at the forefront of that. We are taking an approach, and we are offering what we call tuneable consistency. Depending on the economics and the business model and the use case, the needs of consistency vary. In some cases, you do need immediate consistency. In other cases, you don't ever need consistency. And to give that flexibility to the customer is very important, so we've taken the approach where you can go from loose consistency to what we call strong eventual consistency. 
That approach is based on a fairly well trusted architecture and approach called CRDT, Conflict-free Replicated Data Type. That approach allows us to, regardless of what the cluster magnitude or the distribution looks like geographically, we can deliver strong eventual consistency which meets the needs of majority of the customers. >> What are you seeing in terms of, you know, also in that a discussion about ACID properties, and how many workloads really need ACID properties. What are you seeing now as you get more cloud native workloads and more NoSQL oriented workloads in terms of the requirement for those ACID properties? >> First of all, we truly believe and agree that not all environments require ACID support. Having said that, to be a truly credible database, you must support ACID, and we do. Redis is ACID-compliant, supports ACID, and Redis Labs certainly supports that. >> I remember on a stage once with Curt Monash, I'm sure you know Curt, right? Very famous database person. And he basically had a similar answer. But you would say that increasingly there are workloads that, the growth workloads don't necessarily require that, is that fair statement? >> That's a fair statement I would say. >> Dave: Great, good. >> There's a trade-off, though, when you talked about strong eventual consistency, potentially you have to wait for, presumably, a quorum of the partitions, I'm getting really technical here, but in other words, you've got a copy of the data here-- >> Dave: Good CMO question. (laughing) >> But your value proposition to the customers, we get this stuff done fast, but if you have to wait for a couple other servers to make sure that they've got the update, that can slow things way down. How does that trade-off work? >> I think that's part of the power of our architecture. 
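The CRDT idea Manish refers to above can be shown with the simplest textbook example, a grow-only counter (G-Counter): each replica increments only its own slot, and merging takes the per-replica maximum, so replicas converge no matter the order or number of merges. This is a generic illustration of the technique and makes no claim about how Redis Enterprise implements it; the class and method names are chosen here for the example.

```python
from typing import Dict


class GCounter:
    """Grow-only counter CRDT (textbook form, not Redis's implementation)."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}  # per-replica increment totals

    def increment(self, amount: int = 1) -> None:
        """A replica only ever increases its own slot."""
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        """Element-wise max: commutative, associative, and idempotent."""
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


if __name__ == "__main__":
    east, west = GCounter("us-east"), GCounter("eu-west")
    east.increment(3)   # writes accepted locally, with no cross-region wait
    west.increment(2)
    east.merge(west)
    west.merge(east)
    print(east.value(), west.value())
```

Because the merge is idempotent and order-independent, each region can accept writes without coordinating with the others, which bears on the latency trade-off George raises in the exchange that follows.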
We have a nothing shared, single proxy architecture where all of the replication, the disaster recovery, and the consistency management of the back end is handled by the proxy, and we ensure that the performance is not degraded when you are working through the consistency challenges, and that's where significant amount of IP is in the development of that proxy. >> I'll take that as a, let's go into it even more offline. >> Manish: Sounds good. >> And I have some other CMO questions, if I may. A lot of young companies like yours, especially in open source world, when they go to get the word out, they rely on their community, their open source community, and that's the core, and that makes a lot of sense, it's their peeps. As you become, grow more into enterprise grade apps and workloads, how do you extend beyond that? What is Redis Labs doing to sort of reach that C-Suite, are you even trying to reach that C-Suite up level to messaging? How do you as a CMO deal with those challenges? >> Maybe I'll begin by talking about our personas that matter to us in the ecosystem. The enterprise level, the architects, the developers, are the primary target, which we try to influence in early part of the decision cycle, it's at the architectural level. The ultimate teams that manage, run, and operate the infrastructure is certainly the DevOps, or the operations teams, and we spend time there. All along for some of the enterprise engagements, CIOs, chief data officers, and CTOs tend to play a very important role in the decisions and the selection process, and so, we do influence and interact with the C-Suite quite heavily. What the power of the open source gives us is that groundswell of love for Redis. Literally you can walk around a developer environment, such as the Spark Summit here, and you'll find people wearing Redis Geek shirts. 
And we get emails from Kazakhstan and strange places from all over the world where we don't necessarily have a sales force, and requesting t-shirts, "send us stickers." Because people love Redis, and the word of mouth, that ground level love for the technology enables the decisions to be so much easier and smoother. We're not convincing, it's not a philosophical battle anymore. It's simply about the use case and the solution where Redis Enterprise fits or doesn't fit. >> Okay, so it really is that core developer community that are your advocates, and they're able to internally sell to the C-Suite. A lot of times the C-Suite, not the CTO so much, but certainly the CIO, CDO are like, "Yeah, yeah, they're geekin' out on some new hot thing. What's the business impact?" Do you get that question a lot, and how do you address it? >> I think then you get to some of the very basic tools, ROI calculators and the value proposition. For the C-level, the message is very simple. We are the least risky bet. We are the best long-term proposition, and we are the best cost answer for their implementation. Particularly as the needs are increasingly becoming more real-time in nature, they are not batch processed. Yes, there will always be some of that, but as the workloads are becoming, there is a need for faster processing, there is a need for quick insights, and real-time is not a moniker anymore, right. Real-time truly needs to be delivered today. And so, I think those three propositions for the C-Suite are resonating very well. >> Let's talk about ROI calculators for a second. I love talking about it because it underscores what a company feels as though its core value proposition is. I would think with Redis Labs part of the value proposition is you are enabling new types of workloads and new types of, whether it's sources of revenue or productivity. 
And these are generally telephone numbers as compared to some of the cost savings head to head to your competition, which of course you want to stress as well because the CFO cares about the CapEx. What do you emphasize in that, and we don't have to get into the calculator itself, but in the conceptual model, what's the emphasis? Is it on those sort of business value attributes, is it on the sort of cost-savings? How do you translate performance into that business value? A lot of questions there, but if you could summarize, that'd be great. >> Well, I think you can think of it in three dimensions. The very first one is, does the performance support the use case or the solution that is required? That's the very first one. The second piece that fits in it, and that's in our books, that's operations per second and the latency. The second piece is the cost side, and that has two components to it. The first component is, what are the compute requirements? So, what is the infrastructure underneath that has to support it? And the efficiency that Redis and Redis Enterprise has is dramatically superior to the alternatives. And so, the economics show up. To run a million operations per second, we can do that on two nodes as opposed to alternative, which might need 50 nodes or 300 nodes. >> You can utilize your assets on the floor much better than maybe the competition can. >> This is where the data structures come into play quite a bit. That's one part of-- >> Dave: That's one part of the cost. >> Yeah. The other part of the cost is the human cost. >> Dave: People, yeah. >> And because, and this goes back to the open source, because the people available with the talent and the competency and appreciation for Redis, it's easy to procure those people, and your cost of acquisition and deploying goes down quite a bit. So, there's a human cost to it. The third dimension to this whole equation is time to market. And time to market is measured in many ways. 
Is it lost revenue if it takes you longer to get there? And Redis consistently from multiple analysts' reports gets top ranking for fastest way to get to market because of how simple it is. Beyond performance, simplicity is a second hallmark. >> That's a benefit acceleration, and you can quantify that. >> Absolutely, absolutely. And that's a revenue parameter, right. >> For years, people have been saying this Cambrian explosion of databases is unsustainable, and sort of in response we've gotten a squaring of the Cambrian explosion. The question is, with your sort of very flexible, I don't want to get too geeky, 'cause Dave'll cut me off, but the idea that you can accommodate time series and all these different ways of, all these different types of data, are we approaching a situation where customers can start consolidating their database choices and have fewer vendors, fewer products in their landscape? >> I think not only are we getting there, but we must get there. You've got over 300 databases in the marketplace, and imagine a CIO or an architect trying to have to sort through that to make a decision, it's difficult, and you certainly cannot support it from a training standpoint or from an investment, CapEx, and all that standpoint. What we have done with Redis is introduce something called Redis Modules. We released that at the last RedisConf in May in San Francisco. And the Redis Module is a very simple concept but a very powerful concept. It's an API which can be utilized to take an existing development effort, written in C/C++, that can be ported onto the Redis data structures. This gives you the flexibility without having to reinvent the wheel every single time to take that investment, port it on top of Redis, and you get the performance, and now Redis becomes a multi-model database. And I'm going to get to your answer of how do you address the multiple needs so you don't need multiple databases. 
To give you some examples, since the introduction of Redis Modules, we have now over 50 modules that have been published by a variety of places, not just Redis Labs. To indicate how simple and how powerful this model is. We took Lucene and developed the world's fastest full-text search engine as a module. We have very recently introduced Redis machine learning as a module that works with Spark ML and serves as a great serving layer in the machine learning domain. Just two very simple examples, but work that's being done ported over onto Redis data structures and now you have ability to do some very powerful things because of what Redis is. And this is the way future's going to be. I think every database is trying to offer multi-functionality to be multi-model in nature, but instead of doing it one step at a time, this approach gives us the ability to leverage the entire ecosystem. >> Your point being consolidation's inevitable in this business as well. >> Manish: Architectural consolidation. >> Yes, but also you would think, company consolidation, isn't that going to follow? What do you make of the market, and tell me, if you look back on the database market and what Oracle was able to achieve in the face of, maybe not as many players, but you had Sybase and Informix, and certainly DB2's still around, and SQL Server's still around, but Oracle won, and maybe it was SQL standards that. It's great to be lucky and good. Can we learn from that, or is this a whole different world? Are there similarities, and how do you, how do you see that consolidation potentially shaking out, if you agree that there will be consolidation? >> Yeah, there has to be, first and foremost, an architectural approach that solves the OPEX, CAPEX challenge for the enterprise. But beyond that, no industry can sustain the diversity and the fragmentation that exists in database world. I think there will always be new things coming out, of universities particularly. 
There's great innovation and research happening, and that is required to augment. But at the end of the day, the commercial enterprises cannot be of the fragmented volume that we have today in the database world, so there is going to be some consolidation, and it's not unnatural. I think it's natural, it's expected, time will tell what that looks like. We've seen some of our competitors acquire smaller companies to add graph functionality, to add search functionality. We just don't think that's the level of consolidation that really moves the needle for the industry. It's got to be at a higher level of consolidation. >> I don't want to, don't take this the wrong way, don't hate me for saying it, but is Oracle sort of the enemy, if I can say that? I mean, it's like, no, okay. >> Depends how you define enemy. >> I'm not going to go do many of the workloads that you're talking about on Oracle, despite what Larry tells me at Oracle OpenWorld. And I'm not going to make Oracle my choice for any of the workloads that you guys are working on. I guess in terms, I mean, everybody who's in the database business looks at that and says, "Hey, we can do it cheaper, better, more productively," but, could you respond to that, and what do you make of Amazon's moves in the database world? Does that concern you? >> We think of Amazon and Oracle as two very different philosophies, if you can use that word. The approach we have taken is really a forward-looking approach and philosophy. We believe that the needs of the market need to be solved in new ways, and new ways should not be encumbered by old approaches. We're not trying to go and replicate what was done in the SQL world or in a relational database world. Our approach is how do you deliver a multi-model database that has the real-time attribute attached to it in a way that requires very limited compute horsepower and very few resources to manage?
You take all of those things as kind of the core philosophy, which is a forward-looking philosophy. We are definitely not trying to replicate what an Oracle used to be. AWS I think is a very different animal. >> Dave: Interesting, though. >> They have defined the cloud, and I think play a very important role. We are a strong partner of theirs, much of our traffic runs on AWS infrastructure, certainly also on other clouds. I think AWS is one to watch in how they evolve. They have database offerings, including Redis offerings. However, we fully recognize, and the industry recognizes that that's not to the same capability as Redis Enterprise. It's open-source Redis managed by AWS, and that's fine as a cache, but you cannot persist, and you really cannot have a multi-model capability that's a full database in that approach. >> And you're in the marketplace. >> Manish: We are in the marketplace. >> Obviously. >> And actually, we announced earlier, a few weeks ago, that you can buy and get Redis cloud access, which is Redis Enterprise cloud, on AWS through the integrated billing approach on their marketplace. You can have an AWS account and get our service, the true Redis Enterprise service. >> And as a software company, you'd figure, okay, the cloud infrastructure's a service, we don't care what infrastructure it runs on. Whatever the customer wants, but you see AWS making these moves up-market, you got to obviously be paying attention to that. >> Manish: Certainly, certainly. >> Go ahead, last question. >> Interesting that you were saying that to solve this problem of proliferation of choice it has to be multi-model with speed and low resource requirement.
If I were to interpret that from an old-style database perspective, it would be you're going to get, the multi-model is something you are addressing now, with the extensibility, but the speed means taking out that abstraction layer that was the query optimizer sort of and working almost at the storage layer, or having an option to do that. Would that be a fair way to say? >> No, I don't think that necessarily needs to be the case. For us, speed translates from the simplicity and the power of the data structures. Instead of having to serialize, deserialize before you process data in a Spark context, or instead of having to look for data that is perhaps not put in sorted sets for a use case that you might be doing, running a query on, if the data is already handled through one of the data structures, you now have a much faster query time, you now have the ability to reach the data in the right approach. And again, this is NoSQL, right, so it's schema-less on write, and it sets your schema as you want it to be on read. We marry that with the data structures, and that gives you the ultimate speed. >> We have to leave it there, but Manish, I'll give you the last word. Things we should be paying attention to for Redis Labs this year, events, announcements? >> I think the big thing I would leave the audience with is RedisConf 2017. It's May 31 to June 2 in San Francisco. We are expecting over 1,000 people. The brightest minds around Redis of the database world will be there, and anybody who is considering deploying the next generation database should attend. >> Dave: Where are you doing that? >> It's the Marriott Marquis in San Francisco. >> Great, is that on Howard Street, across from the--? >> It is right across from Moscone. >> Great, awesome location. People know it, easy to get to. Well, congratulations on the success. We'll be lookin' for outputs from that event, and hope to see you again on theCUBE. >> Thank you, enjoyed the conversation. >> Alright, good.
Keep it right there, everybody, we'll be back with our next guest. This is theCUBE, we're live from Spark Summit East. Be right back. (upbeat electronic rock music)
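
Manish's point that data already held in a sorted structure can be reached faster than data that has to be searched at query time can be illustrated with a small sketch. This is a toy written in plain Python, not Redis code and not how Redis implements sorted sets; the class name `SortedScores`, the helper `scan_by_score`, and the sample events are all hypothetical names introduced only for illustration. It also ignores updates to an existing member, which a real sorted set would handle.

```python
import bisect


class SortedScores:
    """Toy stand-in for a sorted set: members are kept ordered by score on write."""

    def __init__(self):
        self._scores = []   # scores, always in ascending order
        self._members = []  # members, aligned index-for-index with _scores

    def add(self, member, score):
        # Pay the ordering cost once, when the data is written.
        i = bisect.bisect_right(self._scores, score)
        self._scores.insert(i, score)
        self._members.insert(i, member)

    def range_by_score(self, low, high):
        # Binary-search both ends of the range instead of touching every entry.
        start = bisect.bisect_left(self._scores, low)
        end = bisect.bisect_right(self._scores, high)
        return self._members[start:end]


def scan_by_score(records, low, high):
    """Baseline for unsorted data: filter every record, then sort the matches."""
    hits = [(score, member) for member, score in records if low <= score <= high]
    return [member for score, member in sorted(hits)]


# Hypothetical sample data: (event name, timestamp used as the score).
events = [("logout", 1730), ("click", 1705), ("login", 1700), ("purchase", 1712)]

index = SortedScores()
for member, score in events:
    index.add(member, score)

print(index.range_by_score(1704, 1715))     # ['click', 'purchase']
print(scan_by_score(events, 1704, 1715))    # ['click', 'purchase']
```

Both calls return the same members; the difference is where the work happens. The sorted structure does its ordering at write time, so a range read is two binary searches plus a slice, while the baseline has to examine every record on each query.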

Published Date : Feb 9 2017

SUMMARY :

Brought to you by Databricks. Manish Gupta is here, he's the CMO at Redis Labs. So, you know, 10 years ago you say We are happy to be on the top of that heap. Redis Labs is the company behind But add some color to that if you would. and the reason you can get that performance Let's get the business model piece out of the way. We also allow for the enterprise to select a VPC environment That's right. Google, and IBM. Go to the whip IBM. Along the lines of the business model, Certainly, you know, database is an integral part and the use case, the needs of consistency vary. in terms of the requirement for those acid properties? you must support acid, and we do. the growth workloads don't necessarily require that, Dave: Good CMO question. but if you have to wait for a couple other servers and the consistency management of the back end and that's the core, and that makes and the word of mouth, that ground level love but certainly the CIO, CDO are like, For the C-level, the message is very simple. part of the value proposition is you are enabling That's the very first one. much better than maybe the competition can. This is where the data structures of the cost. The other part of the cost is the human cost. and the competency and appreciation for Redis, And that's a revenue parameter, right. but the idea that you can accommodate time series We released that at the last RedisConf in this business as well. and tell me, if you look back on the database market that really moves the needle for the industry. but is Oracle sort of the enemy, if I can say that. for any of the workloads that you guys are working on. We believe that the needs of the market and that's fine as a cache, but you cannot persist, the true Redis Enterprise service. okay, the cloud infrastructures are service, the multi-model is something you are addressing now, and the power of the data structures. but Manish, I'll give you the last word. 
of the database world will be there, and hope to see you again on theCUBE. This is theCUBE, we're live from Spark Summit East.

ENTITIES

Entity	Category	Confidence
Amazon	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
George Gilbert	PERSON	0.99+
Dave	PERSON	0.99+
AWS	ORGANIZATION	0.99+
George	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Howard Street	LOCATION	0.99+
Curt	PERSON	0.99+
second piece	QUANTITY	0.99+
San Francisco	LOCATION	0.99+
Redis Labs	ORGANIZATION	0.99+
Manish Gupta	PERSON	0.99+
two nodes	QUANTITY	0.99+
Redis	ORGANIZATION	0.99+
two components	QUANTITY	0.99+
two	QUANTITY	0.99+
San Franciso	LOCATION	0.99+
Larry	PERSON	0.99+
Manish	PERSON	0.99+
first component	QUANTITY	0.99+
Boston, Massachusetts	LOCATION	0.99+
over 50 modules	QUANTITY	0.99+
June 2	DATE	0.99+
May 31	DATE	0.99+
Google	ORGANIZATION	0.99+
Boston	LOCATION	0.99+
Curt Monash	PERSON	0.99+
May	DATE	0.99+
millions	QUANTITY	0.99+
third dimension	QUANTITY	0.98+
50 nodes	QUANTITY	0.98+
Moscone	LOCATION	0.98+
fourth	QUANTITY	0.98+
Redis Enterprise	TITLE	0.98+
300 nodes	QUANTITY	0.98+
Redis	TITLE	0.98+
Kazakhstan	LOCATION	0.98+
over 1,000 people	QUANTITY	0.98+
one part	QUANTITY	0.98+
both	QUANTITY	0.98+
one step	QUANTITY	0.97+
C-Suite	TITLE	0.97+
Marriott Marquis	ORGANIZATION	0.97+
second hallmark	QUANTITY	0.97+
10 years ago	DATE	0.97+
Spark Summit East 2017	EVENT	0.97+
Groupon	ORGANIZATION	0.97+
first one	QUANTITY	0.97+
CDO	TITLE	0.97+
over 300 databases	QUANTITY	0.96+
SQL Server	TITLE	0.96+
Redis Enterprise cloud	TITLE	0.96+
