Analyst Predictions 2023: The Future of Data Management

(upbeat music) >> Hello, this is Dave Vellante with theCUBE, and one of the most gratifying aspects of my role as a host of "theCUBE TV" is I get to cover a wide range of topics. And quite often, we're able to bring to our program a level of expertise that allows us to more deeply explore and unpack some of the topics that we cover throughout the year. And one of our favorite topics, of course, is data. Now, in 2021, after being in isolation for the better part of two years, a group of industry analysts met up at AWS re:Invent and started a collaboration to look at the trends in data and predict what some likely outcomes will be for the coming year. And it resulted in a very popular session that we had last year focused on the future of data management. And I'm very excited and pleased to tell you that the 2023 edition of that predictions episode is back, and with me are five outstanding market analysts, Sanjeev Mohan of SanjMo, Tony Baer of dbInsight, Carl Olofson from IDC, Dave Menninger from Ventana Research, and Doug Henschen, VP and Principal Analyst at Constellation Research. Now, what is it that we're calling you, guys? A data pack like the rat pack? No, no, no, no, that's not it. It's the data crowd, the data crowd, and the crowd includes some of the best minds in the data analyst community. They'll discuss how data management is evolving and what listeners should prepare for in 2023. Guys, welcome back. Great to see you. >> Good to be here. >> Thank you. >> Thanks, Dave. (Tony and Dave faintly speak) >> All right, before we get into 2023 predictions, we thought it'd be good to do a look back at how we did in 2022 and give a transparent assessment of those predictions. So, let's get right into it. We're going to bring these up here, the predictions from 2022, they're color-coded red, yellow, and green to signify the degree of accuracy. And I'm pleased to report there's no red. Well, maybe some of you will want to debate that grading system. 
But as always, we want to be open, so you can decide for yourselves. So, we're going to ask each analyst to review their 2022 prediction and explain their rating and what evidence they have that led them to their conclusion. So, Sanjeev, please kick it off. Your prediction was data governance becomes key. I know that's going to knock you guys over, but elaborate, because you had more detail when you double click on that. >> Yeah, absolutely. Thank you so much, Dave, for having us on the show today. And we self-graded ourselves. I could have very easily made my prediction from last year green, but I mentioned why I left it as yellow. I totally fully believe that data governance was in a renaissance in 2022. And why do I say that? You have to look no further than AWS launching its own data catalog called DataZone. Before that, mid-year, we saw Unity Catalog from Databricks go GA. So, overall, I saw there was tremendous movement. When you see these big players launching a new data catalog, you know that they want to be in this space. And this space is highly critical to everything that I feel we will talk about in today's call. Also, if you look at established players, I spoke at Collibra's conference, data.world, worked closely with Alation, Informatica, a bunch of other companies, they all added tremendous new capabilities. So, it did become key. The reason I left it as yellow is because I had made a prediction that Collibra would go IPO, and it did not. And I don't think anyone is going IPO right now. The market is really, really down, the funding in VC IPO market. But other than that, data governance had a banner year in 2022. >> Yeah. Well, thank you for that. And of course, you saw data clean rooms being announced at AWS re:Invent, so more evidence. And I like the fact that you included in your predictions some things that were binary, so you dinged yourself there. So, good job. Okay, Tony Baer, you're up next. Data mesh hits reality check. 
As you see here, you've given yourself a bright green thumbs up. (Tony laughing) Okay. Let's hear why you feel that was the case. What do you mean by reality check? >> Okay. Thanks, Dave, for having us back again. This is something I just wrote and just tried to get away from, and this is just a topic that just won't go away. I did speak with a number of folks, early adopters and non-adopters during the year. And I did find that basically it pretty much validated what I was expecting, which was that there was a lot more, this has now become a front burner issue. And if I had any doubt in my mind, the evidence I would point to is what was originally intended to be a throwaway post on LinkedIn, which I just quickly scribbled down the night before leaving for re:Invent. I was packing at the time, and for some reason, I was doing a Google search on data mesh. And I happened to have tripped across this ridiculous article, I will not say where, because it doesn't deserve any publicity, about the eight (Dave laughing) best data mesh software companies of 2022. (Tony laughing) One of my predictions was that you'd see data mesh washing. And I just quickly hopped on that, maybe three sentences, and wrote it in about a couple of minutes, saying this is hogwash, essentially. (laughs) And that just reun... And then, I left for re:Invent. And the next night, when I got into my Vegas hotel room, I clicked on my computer. I saw 15,000 hits on that post, which was the most hits of any single post I put up all year. And the responses were wildly pro and con. So, it pretty much validates my expectation in that data mesh really did hit a lot more scrutiny over this past year. >> Yeah, thank you for that. I remember that article. I remember rolling my eyes when I saw it, and then I recently, (Tony laughing) I talked to Walmart and they actually invoked Martin Fowler and they said that they're working through their data mesh. 
So, it takes really a lot of thought, and it really, as we've talked about, is really as much an organizational construct. You're not buying data mesh >> Bingo. >> to your point. Okay. Thank you, Tony. Carl Olofson, here we go. You've graded yourself a yellow in the prediction of graph databases take off. Please elaborate. >> Yeah, sure. So, I realized in looking at the prediction that it seemed to imply that graph databases could be a major factor in the data world in 2022, which obviously didn't become the case. It was an error on my part in that I should have said it in the right context. It's really a three to five-year time period that graph databases will really become significant, because they still need accepted methodologies that can be applied in a business context as well as proper tools in order for people to be able to use them seriously. But I stand by the idea that it is taking off, because for one thing, Neo4j, which is the leading independent graph database provider, had a very good year. And also, we're seeing interesting developments in terms of things like AWS with Neptune and with Oracle providing graph support in Oracle database this past year. Those things are, as I said, growing gradually. There are other companies like TigerGraph and so forth, that deserve watching as well. But as far as becoming mainstream, it's going to be a few years before we get all the elements together to make that happen. Like any new technology, you have to create an environment in which ordinary people without a whole ton of technical training can actually apply the technology to solve business problems. >> Yeah, thank you for that. These specialized databases, graph databases, time series databases, you see them embedded into mainstream data platforms, but there's a place for these specialized databases. I would suspect we're going to see new types of databases emerge with all this cloud sprawl that we have and maybe to the edge. 
>> Well, part of it is that it's not as specialized as you might think. You can apply graphs to a great many workloads and use cases. It's just that people have yet to fully explore and discover what those are. >> Yeah. >> And so, it's going to be a process. (laughs) >> All right, Dave Menninger, streaming data permeates the landscape. You gave yourself a yellow. Why? >> Well, I couldn't think of an appropriate combination of yellow and green. Maybe I should have used chartreuse, (Dave laughing) but I was probably a little hard on myself making it yellow. This is another type of specialized data processing, like Carl was talking about with graph databases: stream processing. And nearly every data platform offers streaming capabilities now. Often, it's based on Kafka. If you look at Confluent, their revenues have grown at more than 50%, continue to grow at more than 50% a year. They're expected to do more than half a billion dollars in revenue this year. But the thing that hasn't happened yet, and to be honest, I didn't necessarily expect it to happen in one year, is that streaming hasn't become the default way in which we deal with data. It's still a sidecar to data at rest. And I do expect that we'll continue to see streaming become more and more mainstream. I do expect perhaps in the five-year timeframe that we will first deal with data as streaming and then at rest, but the worlds are starting to merge. And we even see some vendors bringing products to market, such as K2View, Hazelcast, and RisingWave Labs. So, in addition to all those core data platform vendors adding these capabilities, there are new vendors approaching this market as well. >> I like the tough grading system, and it's not trivial. And when you talk to practitioners doing this stuff, there's still some complications in the data pipeline. And so, but I think, you're right, it probably was a yellow plus. Doug Henschen, data lakehouses will emerge as dominant. 
When you talk to people about lakehouses, practitioners, they all use that term. They certainly use the term data lake, but now, they're using lakehouse more and more. What are your thoughts here? Why the green? What's your evidence there? >> Well, I think, I was accurate. I spoke about it specifically as something that vendors would be pursuing. And we saw yet more lakehouse advocacy in 2022. Google introduced its BigLake service alongside BigQuery. Salesforce introduced Genie, which is really a lakehouse architecture. And it was a safe prediction to say vendors are going to be pursuing this in that AWS, Cloudera, Databricks, Microsoft, Oracle, SAP, Salesforce now, IBM, all advocate this idea of a single platform for all of your data. Now, the trend was also supported in 2022, in that we saw a big embrace of Apache Iceberg. That's a structured table format. It's used with these lakehouse platforms. It's open, so it ensures portability and it also ensures performance. And that's a structured table that helps with the warehouse side performance. But among those announcements, Snowflake, Google, Cloudera, SAP, Salesforce, IBM, all embraced Iceberg. But keep in mind, again, I'm talking about this as something that vendors are pursuing as their approach. So, they're advocating to end users. It's very cutting edge. I'd say the top, leading edge, 5% of companies have really embraced the lakehouse. I think, we're now seeing the fast followers, the next 20 to 25% of firms embracing this idea and embracing a lakehouse architecture. I recall Christian Kleinerman at the big Snowflake event last summer, making the announcement about Iceberg, and he asked for a show of hands for any of you in the audience at the keynote, have you heard of Iceberg? And just a smattering of hands went up. So, the vendors are ahead of the curve. They're pushing this trend, and we're now seeing a little bit more mainstream uptake. >> Good. Doug, I was there. 
It was you, me, and I think, two other hands were up. That was just humorous. (Doug laughing) All right, well, so I liked the fact that we had some yellow and some green. When you think about these things, there's the prediction itself. Did it come true or not? There are the sub predictions that you guys make, and of course, the degree of difficulty. So, thank you for that open assessment. All right, let's get into the 2023 predictions. Let's bring up the predictions. Sanjeev, you're going first. You've got a prediction around unified metadata. What's the prediction, please? >> So, my prediction is that metadata space is currently a mess. It needs to get unified. There are too many use cases of metadata, which are being addressed by disparate systems. For example, data quality has become really big in the last couple of years, data observability, the whole catalog space is actually, people don't like to use the word data catalog anymore, because data catalog sounds like it's a catalog, a museum, if you may, of metadata that you go and admire. So, what I'm saying is that in 2023, we will see that metadata will become the driving force behind things like data ops, things like orchestration of tasks using metadata, not rules. Not saying that if this fails, then do this, if this succeeds, go do that. But it's like getting to the metadata level, and then making a decision as to what to orchestrate, what to automate, how to do data quality check, data observability. So, this space is starting to gel, and I see there'll be more maturation in the metadata space. Even security privacy, some of these topics, which are handled separately. And I'm just talking about data security and data privacy. I'm not talking about infrastructure security. These also need to merge into a unified metadata management piece with some knowledge graph, semantic layer on top, so you can do analytics on it. So, it's no longer something that sits on the side, it's limited in its scope. 
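A minimal sketch of the idea Sanjeev describes here, deciding what to orchestrate by reading metadata rather than by hard-wiring "if this fails, then do this" rules between jobs. Everything in it is an assumption made for illustration: the `DatasetMetadata` record, its fields, the thresholds, and the action names are hypothetical and do not come from any product mentioned in the conversation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record describing one dataset."""
    name: str
    hours_since_refresh: float   # operational metadata: how stale the data is
    freshness_sla_hours: float   # agreed maximum staleness
    null_fraction: float         # observed share of missing values
    max_null_fraction: float     # tolerated share of missing values
    contains_pii: bool           # privacy classification


def plan_actions(meta: DatasetMetadata) -> List[str]:
    """Choose tasks from the dataset's metadata instead of a fixed job chain."""
    actions = []
    if meta.hours_since_refresh > meta.freshness_sla_hours:
        actions.append("refresh")
    if meta.null_fraction > meta.max_null_fraction:
        actions.append("quality_check")
    if meta.contains_pii:
        actions.append("apply_masking")
    return actions


orders = DatasetMetadata("orders", 30, 24, 0.01, 0.05, True)
print(plan_actions(orders))
```

The point of the sketch is only that freshness, quality, and privacy decisions all read from one shared record, which is the unification being argued for.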
It is actually the very engine, the very glue that is going to connect data producers and consumers. >> Great. Thank you for that. Doug. Doug Henschen, any thoughts on what Sanjeev just said? Do you agree? Do you disagree? >> Well, I agree with many aspects of what he says. I think, there's a huge opportunity for consolidation and streamlining of these aspects of governance. Last year, Sanjeev, you said something like, we'll see more people using catalogs than BI. And I have to disagree. I don't think this is a category that's headed for mainstream adoption. It's a behind the scenes activity for the wonky few, or better yet, companies want machine learning and automation to take care of these messy details. We've seen these waves of management technologies, some of the latest data observability, customer data platform, but they failed to sweep away all the earlier investments in data quality and master data management. So, yes, I hope the latest tech offers glimmers that there's going to be a better, cleaner way of addressing these things. But to my mind, the business leaders, including the CIO, only want to spend as much time and effort and money and resources on these sorts of things to avoid getting breached, ending up in headlines, getting fired or going to jail. So, vendors bring on the ML and AI smarts and the automation of these sorts of activities. >> So, if I may say something, the reason why we have this dichotomy between data catalog and the BI vendors is because data catalogs are very soon, not going to be standalone products, in my opinion. They're going to get embedded. So, when you use a BI tool, you'll actually use the catalog to find out what is it that you want to do, whether you are looking for data or you're looking for an existing dashboard. So, the catalog becomes embedded into the BI tool. >> Hey, Dave Menninger, sometimes you have some data in your back pocket. Do you have any stats (chuckles) on this topic? 
>> No, I'm glad you asked, because I'm going to... Now, data catalogs are something that's interesting. Sanjeev made a statement that data catalogs are falling out of favor. I don't care what you call them. They're valuable to organizations. Our research shows that organizations that have adequate data catalog technologies are three times more likely to express satisfaction with their analytics for just the reasons that Sanjeev was talking about. You can find what you want, you know you're getting the right information, you know whether or not it's trusted. So, those are good things. So, we expect to see the capabilities, whether it's embedded or separate. We expect to see those capabilities continue to permeate the market. >> And a lot of those catalogs are driven now by machine learning and things. So, they're learning from those patterns of usage by people when people use the data. (airy laughs) >> All right. Okay. Thank you, guys. All right. Let's move on to the next one. Tony Baer, let's bring up the predictions. You got something in here about the modern data stack. We need to rethink it. Is the modern data stack getting long in the tooth? Is it not so modern anymore? >> I think, in a way, it's gotten almost too modern. It's gotten too, I don't know if it's being long in the tooth, but it is getting long. The modern data stack, it's traditionally been defined as basically you have the data platform, which would be the operational database and the data warehouse. And in between, you have all the tools that are necessary to essentially get that data from the operational realm or the streaming realm for that matter into basically the data warehouse, or as we might be seeing more and more, the data lakehouse. And I think, what's important here is that, or I think, we have seen a lot of progress, and this would be in the cloud, is with the SaaS services. 
And especially you see that in the modern data stack, which is like all these players, not just the MongoDBs or the Oracles or the Amazons have their database platforms. You see the Informaticas and all the other players there, the Fivetrans, have their own SaaS services. And within those SaaS services, you get a certain degree of simplicity, which is it takes all the housekeeping off the shoulders of the customers. That's a good thing. The problem is that what we're getting to unfortunately is what I would call lots of islands of simplicity, which means that it leaves it (Dave laughing) to the customer to have to integrate or put all that stuff together. It's a complex tool chain. And so, what we really need to think about here, we have too many pieces. And going back to the discussion of catalogs, it's like we have so many catalogs out there, which one do we use? 'Cause chances are most organizations do not rely on a single catalog at this point. What I'm calling on all the data providers or all the SaaS service providers, is to literally get it together and essentially make this modern data stack less of a stack, make it more of a blending of an end-to-end solution. And that can come in a number of different ways. Part of it is where data platform providers have been adding services that are adjacent. And there's some very good examples of this. We've seen progress over the past year or so. For instance, MongoDB integrating search. It's a very common, I guess, sort of tool that basically, that the applications that are developed on MongoDB use, so MongoDB then built it into the database rather than requiring an extra Elasticsearch or OpenSearch stack. Amazon just... AWS just did the zero-ETL, which is a first step towards simplifying the process of going from Aurora to Redshift. You've seen the same thing with Google, BigQuery integrating basically streaming pipelines. And you're seeing also a lot of movement in database machine learning. 
So, there's some good moves in this direction. I expect to see more of this this year. Part of it's from basically the SaaS platforms adding some functionality. But I also see more importantly, because you're never going to get... This is like asking your data team and your developers, herding cats to standardizing on the same tool. In most organizations, that is not going to happen. So, take a look at the most popular combinations of tools and start to come up with some pre-built integrations and pre-built orchestrations, and offer some promotional pricing, maybe not quite two-for-one, but in other words, get two products or two services for the price of one and a half. I see a lot of potential for this. And it's to me, if the cloud was to simplify things, this is the next logical step and I expect to see more of this here. >> Yeah, and you see in Oracle, MySQL HeatWave, yet another example of eliminating that ETL. Carl Olofson, today, if you think about the data stack and the application stack, they're largely separate. Do you have any thoughts on how that's going to play out? Does that play into this prediction? What do you think? >> Well, I think, that the... I really like Tony's phrase, islands of simplification. It really says (Tony chuckles) what's going on here, which is that all these different vendors you ask about, about how these stacks work. All these different vendors have their own stack vision. And you can... One application group is going to use one, and another application group is going to use another. And some people will say, let's go to, like you go to an Informatica conference and they say, we should be the center of your universe, but you can't connect everything in your universe to Informatica, so you need to use other things. So, the challenge is how do we make those things work together? As Tony has said, and I totally agree, we're never going to get to the point where people standardize on one organizing system. 
So, the alternative is to have metadata that can be shared amongst those systems and protocols that allow those systems to coordinate their operations. This is standard stuff. It's not easy. But the motive for the vendors is that they can become more active critical players in the enterprise. And of course, the motive for the customer is that things will run better and more completely. So, I've been looking at this in terms of two kinds of metadata. One is the meaning metadata, which says what data can be put together. The other is the operational metadata, which says basically where did it come from? Who created it? What's its current state? What's the security level? Et cetera, et cetera, et cetera. The good news is the operational stuff can actually be done automatically, whereas the meaning stuff requires some human intervention. And as we've already heard from, was it Doug, I think, people are disinclined to put a lot of definition into meaning metadata. So, that may be the harder one, but coordination is key. This problem has been with us forever, but with the addition of new data sources, with streaming data with data in different formats, the whole thing has, it's been like what a customer of mine used to say, "I understand your product can make my system run faster, but right now I just feel I'm putting my problems on roller skates. (chuckles) I don't need that to accelerate what's already not working." >> Excellent. Okay, Carl, let's stay with you. I remember in the early days of the big data movement, Hadoop movement, NoSQL was the big thing. And I remember Amr Awadallah said to us in theCUBE that SQL is the killer app for big data. So, your prediction here, if we bring that up is SQL is back. Please elaborate. >> Yeah. So, of course, some people would say, well, it never left. 
Actually, that's probably closer to true, but in the perception of the marketplace, there's been all this noise about alternative ways of storing, retrieving data, whether it's in key value stores or document databases and so forth. We're getting a lot of messaging that for a while had persuaded people that, oh, we're not going to do analytics in SQL anymore. We're going to use Spark for everything, except that only a handful of people know how to use Spark. Oh, well, that's a problem. Well, how about, and for ordinary conventional business analytics, Spark is like an over-engineered solution to the problem. SQL works just great. What's happened in the past couple years, and what's going to continue to happen is that SQL is insinuating itself into everything we're seeing. We're seeing all the major data lake providers offering SQL support, whether it's Databricks or... And of course, Snowflake is loving this, because that is what they do, and their success certainly points to the success of SQL, even MongoDB. And we were all, I think, at the MongoDB conference where on one day, we hear SQL is dead. They're not teaching SQL in schools anymore, and this kind of thing. And then, a couple days later at the same conference, they announced we're adding a new analytic capability based on SQL. But didn't you just say SQL is dead? So, the reality is that SQL is better understood than most other methods of certainly of retrieving and finding data in a data collection, no matter whether it happens to be relational or non-relational. And even in systems that are very non-relational, such as graph and document databases, their query languages are being built or extended to resemble SQL, because SQL is something people understand. >> Now, you remember when we were in high school and you had to take the... You're debating in the class and you were forced to take one side and defend it. 
So, I was at a Vertica conference one time up on stage with Curt Monash, and I had to take the NoSQL, the world is changing paradigm shift. And so just to be controversial, I said to him, Curt Monash, I said, who really needs ACID compliance anyway? Tony Baer. And so, (chuckles) of course, his head exploded, but what are your thoughts (guests laughing) on all this? >> Well, my first thought is congratulations, Dave, for surviving being up on stage with Curt Monash. >> Amen. (group laughing) >> I definitely would concur with Carl. We actually are definitely seeing a SQL renaissance and if there's any proof of the pudding here, I see lakehouse as being icing on the cake. As Doug had predicted last year, now, (clears throat) for the record, I think, Doug was about a year ahead of time in his predictions that this year is really the year that I see (clears throat) the lakehouse ecosystems really firming up. You saw the first shots last year. But anyway, on this, data lakes will not go away. I've actually, I'm on the home stretch of doing a market, a landscape on the lakehouse. And lakehouse will not replace data lakes in terms of that. There is the need for those, data scientists who do know Python, who know Spark, to go in there and basically do their thing without all the restrictions or the constraints of a pre-built, pre-designed table structure. I get that. Same thing for developing models. But on the other hand, there is huge need. Basically, (clears throat) maybe MongoDB was saying that we're not teaching SQL anymore. Well, maybe we have an oversupply of SQL developers. Well, I'm being facetious there, but there is a huge skills base in SQL. Analytics have been built on SQL. Then came the lakehouse, and why this really helps to fuel a SQL revival is that the core need in the data lake, what brought on the lakehouse was not so much SQL, it was a need for ACID. And what was the best way to do it? It was through a relational table structure. 
So, the whole idea of ACID in the lakehouse was not to turn it into a transaction database, but to make the data trusted, secure, and more granularly governed, where you could govern down to column and row level, which you really could not do in a data lake or a file system. So, while lakehouse can be queried in a manner, you can go in there with Python or whatever, it's built on a relational table structure. And so, for that end, for those types of data lakes, it becomes the end state. You cannot bypass that table structure as I learned the hard way during my research. So, the bottom line I'd say here is that lakehouse is proof that we're starting to see the revenge of the SQL nerds. (Dave chuckles) >> Excellent. Okay, let's bring back up the predictions. Dave Menninger, this one's really thought-provoking and interesting. We're hearing things like data as code, new data applications, machines actually generating plans with no human involvement. And your prediction is the definition of data is expanding. What do you mean by that? >> So, I think, for too long, we've thought about data as the, I would say facts that we collect, the readings off of devices and things like that, but data on its own is really insufficient. Organizations need to manipulate that data and examine derivatives of the data to really understand what's happening in their organization, why has it happened, and to project what might happen in the future. And my comment is that these data derivatives need to be supported and managed just like the data needs to be managed. We can't treat this as entirely separate. Think about all the governance discussions we've had. Think about the metadata discussions we've had. If you separate these things, now you've got more moving parts. We're talking about simplicity and simplifying the stack. So, if these things are treated separately, it creates much more complexity. 
I also think it creates a little bit of a myopic view on the part of the IT organizations that are acquiring these technologies. They need to think more broadly. So, for instance, metrics. Metric stores are becoming a much more common part of the tooling that's part of a data platform. Similarly, feature stores are gaining traction. So, those are designed to promote the reuse and consistency across the AI and ML initiatives, the elements that are used in developing an AI or ML model. And let me go back to metrics and just clarify what I mean by that. So, any type of formula involving the data points. I'm distinguishing metrics from features that are used in AI and ML models. And the data platforms themselves are increasingly managing the models as an element of data. So, just like figuring out how to calculate a metric. Well, if you're going to have the features associated with an AI and ML model, you probably need to be managing the model that's associated with those features. The other element where I see expansion is around external data. Organizations for decades have been focused on the data that they generate within their own organization. We see more and more of these platforms acquiring and publishing data to external third-party sources, whether they're within some sort of a partner ecosystem or whether it's a commercial distribution of that information. And our research shows that when organizations use external data, they derive even more benefits from the various analyses that they're conducting. And the last great frontier in my opinion on this expanding world of data is the world of driver-based planning. Very few of the major data platform providers provide these capabilities today. These are the types of things you would do in a spreadsheet. And we all know the issues associated with spreadsheets. They're hard to govern, they're error-prone. 
And so, if we can take that type of analysis, collecting the occupancy of a rental property, the projected rise in rental rates, the fluctuations perhaps in occupancy, the interest rates associated with financing that property, we can project forward. And that's a very common thing to do. What the income might look like from that property, the expenses, we can plan and purchase things appropriately. So, I think, we need this broader purview and I'm beginning to see some of those things happen. And the evidence today I would say, is more focused around the metric stores and the feature stores starting to see vendors offer those capabilities. And we're starting to see the MLOps elements of managing the AI and ML models find their way closer to the data platforms as well. >> Very interesting. When I hear metrics, I think of KPIs, I think of data apps, orchestrate people and places and things to optimize around a set of KPIs. It sounds like a metadata challenge more... Somebody once predicted they'll have more metadata than data. Carl, what are your thoughts on this prediction? >> Yeah, I think that what Dave is describing as data derivatives is in a way, another word for what I was calling operational metadata, which is not about the data itself, but how it's used, where it came from, what the rules are governing it, and that kind of thing. If you have a rich enough set of those things, then not only can you do a model of how well your vacation property rental may do in terms of income, but also how well your application that's measuring that is doing for you. In other words, how many times have I used it, how much data have I used and what is the relationship between the data that I've used and the benefits that I've derived from using it? Well, we don't have ways of doing that. What's interesting to me is that folks in the content world are way ahead of us here, because they have always tracked their content using these kinds of attributes. 
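The rental-property example Dave Menninger gives above can be sketched as a small driver-based projection. This is an illustration under stated assumptions, not a description of any vendor's planning feature: the function name, the interest-only loan, the constant occupancy, and every number below are hypothetical.

```python
def project_rental_income(monthly_rent, occupancy, rent_growth,
                          loan_balance, interest_rate, years):
    """Project yearly income from a few drivers.

    Assumes an interest-only loan and constant occupancy, purely to keep
    the sketch short; a real plan would model both over time.
    """
    rows = []
    for year in range(1, years + 1):
        gross = monthly_rent * 12 * occupancy    # rent actually collected
        interest = loan_balance * interest_rate  # yearly financing cost
        rows.append({
            "year": year,
            "gross": round(gross, 2),
            "interest": round(interest, 2),
            "net": round(gross - interest, 2),
        })
        monthly_rent *= 1 + rent_growth          # apply projected rent rise
    return rows


# Hypothetical drivers: $2,000 rent, 90% occupancy, 5% yearly rent growth,
# $200,000 financed at 6%.
for row in project_rental_income(2000, 0.9, 0.05, 200_000, 0.06, 3):
    print(row)
```

Keeping the drivers as named inputs, rather than as cells in a spreadsheet, is what would let a platform govern and version them like any other data.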
Where did it come from? When was it created, when was it modified? Who modified it? And so on and so forth. We need to do more of that with the structured data that we have, so that we can track how it's used. And also, it tells us how well we're doing with it. Is it really benefiting us? Are we being efficient? Are there improvements in processes that we need to consider? Because maybe data gets created and then it isn't used or it gets used, but it gets altered in some way that actually misleads people. (laughs) So, we need the mechanisms to be able to do that. So, I would say that that's... And I'd say that it's true that we need that stuff. I think, that starting to expand is probably the right way to put it. It's going to be expanding for some time. I think, we're still a distance from having all that stuff really working together. >> Maybe we should say it's gestating. (Dave and Carl laughing) >> Sorry, if I may- >> Sanjeev, yeah, I was going to say this... Sanjeev, please comment. This sounds to me like it supports Zhamak Dehghani's principles, but please. >> Absolutely. So, whether we call it data mesh or not, I'm not getting into that conversation, (Dave chuckles) but data (audio breaking) (Tony laughing) everything that I'm hearing what Dave is saying, Carl, this is the year when data products will start to take off. I'm not saying they'll become mainstream. They may take a couple of years to become so, but this is data products, all this thing about vacation rentals and how is it doing, that data is coming from different sources. I'm packaging it into our data product. And to Carl's point, there's a whole operational metadata associated with it. The idea is for organizations to see things like developer productivity, how many releases am I doing of this? What data products are most popular? 
I'm actually right now in the process of formulating this concept that just like we had data catalogs, we are very soon going to be requiring a data products catalog. So, I can discover these data products. I'm not just creating data products left, right, and center. I need to know, do they already exist? What is the usage? If no one is using a data product, maybe I want to retire and save cost. But this is a data product. Now, there's an associated thing that is also getting debated quite a bit called data contracts. And a data contract to me is literally just formalization of all these aspects of a product. How do you use it? What is the SLA on it, what is the quality that I am prescribing? So, data product, in my opinion, shifts the conversation to the consumers or to the business people. Up to this point when, Dave, you're talking about data and all of data discovery curation is very data producer-centric. So, I think, we'll see a shift more into the consumer space. >> Yeah. Dave, can I just jump in there just very quickly there, which is that what Sanjeev has been saying there, this is really central to what Zhamak has been talking about. It's basically about making, one, data products are about the lifecycle management of data. Metadata is just elemental to that. And essentially, one of the things that she calls for is making data products discoverable. That's exactly what Sanjeev was talking about. >> By the way, did everyone just notice how Sanjeev just snuck in another prediction there? So, we've got- >> Yeah. (group laughing) >> But you- >> Can we also say that he snuck in, I think, the term that we'll remember today, which is metadata museums. >> Yeah, but- >> Yeah. >> And also comment to, Tony, to your last year's prediction, you're really talking about it's not something that you're going to buy from a vendor. >> No. >> It's very specific >> Mm-hmm. >> to an organization, their own data product. So, touche on that one. Okay, last prediction. 
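Editor's sketch: Sanjeev describes a data contract as a formalization of how a data product is used, what its SLA is, and what quality is prescribed. One way to picture that, purely as an illustration, is a small Python record plus a check. Every name here (DataContract, check_contract, the field names, the completeness measure) is a hypothetical assumption for this sketch and does not come from the panel or from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract for one data product; all field names are illustrative."""
    product_name: str
    owner: str
    usage_notes: str            # "How do you use it?"
    freshness_sla_hours: float  # "What is the SLA on it?"
    min_completeness: float     # "What is the quality that I am prescribing?"
    required_columns: tuple = ()


def check_contract(contract, rows, age_hours):
    """Return a list of violations; an empty list means the contract holds."""
    violations = []

    # SLA check: the data must be no older than the promised freshness window.
    if age_hours > contract.freshness_sla_hours:
        violations.append(
            f"stale: {age_hours}h old, SLA is {contract.freshness_sla_hours}h"
        )

    # Quality check: share of required cells that are actually populated.
    total = len(rows) * len(contract.required_columns)
    filled = sum(
        1
        for row in rows
        for col in contract.required_columns
        if row.get(col) is not None
    )
    completeness = filled / total if total else 0.0
    if completeness < contract.min_completeness:
        violations.append(
            f"completeness {completeness:.2f} below {contract.min_completeness:.2f}"
        )
    return violations
```

A consumer-facing catalog could run a check like this on each refresh and publish the result next to the product, which is in the spirit of the shift toward the consumer that Sanjeev describes. The thresholds themselves would be whatever the producer and consumers agree on.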
Let's bring them up. Doug Henschen, BI analytics is headed to embedding. What does that mean? >> Well, we all know that conventional BI dashboarding reporting is really commoditized from a vendor perspective. It never enjoyed truly mainstream adoption. Always that 25% of employees are really using these things. I'm seeing rising interest in embedding concise analytics at the point of decision or better still, using analytics as triggers for automation and workflows, and not even necessitating human interaction with visualizations, for example, if we have confidence in the analytics. So, leading companies are pushing for next generation applications, part of this low-code, no-code movement we've seen. And they want to build that decision support right into the app. So, the analytic is right there. Leading enterprise apps vendors, Salesforce, SAP, Microsoft, Oracle, they're all building smart apps with the analytics predictions, even recommendations built into these applications. And I think, the progressive BI analytics vendors are supporting this idea of driving insight to action, not necessarily necessitating humans interacting with it if there's confidence. So, we want prediction, we want embedding, we want automation. This low-code, no-code development movement is very important to bringing the analytics to where people are doing their work. We got to move beyond the, what I call swivel chair integration, between where people do their work and going off to separate reports and dashboards, and having to interpret and analyze before you can go back and take action. >> And Dave Menninger, today, if you want analytics or you want to absorb what's happening in the business, you typically got to go ask an expert, and then wait. So, what are your thoughts on Doug's prediction? >> I'm in total agreement with Doug. I'm going to say that collectively... So, how did we get here? I'm going to say collectively as an industry, we made a mistake. 
We made BI and analytics separate from the operational systems. Now, okay, it wasn't really a mistake. We were limited by the technology available at the time. Decades ago, we had to separate these two systems, so that the analytics didn't impact the operations. You don't want the operations preventing you from being able to do a transaction. But we've gone beyond that now. We can bring these two systems and worlds together and organizations recognize that need to change. As Doug said, the majority of the workforce and the majority of organizations doesn't have access to analytics. That's wrong. (chuckles) We've got to change that. And one of the ways that's going to change is with embedded analytics. 2/3 of organizations recognize that embedded analytics are important and it even ranks higher in importance than AI and ML in those organizations. So, it's interesting. This is a really important topic to the organizations that are consuming these technologies. The good news is it works. Organizations that have embraced embedded analytics are more comfortable with self-service than those that have not, as opposed to turning somebody loose, in the wild with the data. They're given a guided path to the data. And the research shows that 65% of organizations that have adopted embedded analytics are comfortable with self-service compared with just 40% of organizations that are turning people loose in an ad hoc way with the data. So, totally behind Doug's predictions. >> Can I just break in with something here, a comment on what Dave said about what Doug said, which (laughs) is that I totally agree with what you said about embedded analytics. And at IDC, we made a prediction in our future intelligence, future of intelligence service three years ago that this was going to happen. And the thing that we're waiting for is for developers to build... You have to write the applications to work that way. It just doesn't happen automagically. 
Developers have to write applications that reference analytic data and apply it while they're running. And that could involve simple things like complex queries against the live data, which is through something that I've been calling analytic transaction processing. Or it could be through something more sophisticated that involves AI operations as Doug has been suggesting, where the result is enacted pretty much automatically unless the scores are too low and you need to have a human being look at it. So, I think that that is definitely something we've been watching for. I'm not sure how soon it will come, because it seems to take a long time for people to change their thinking. But I think, as Dave was saying, once they do and they apply these principles in their application development, the rewards are great. >> Yeah, this is very much, I would say, very consistent with what we were talking about, I was talking about before, about basically rethinking the modern data stack and going into more of an end-to-end solution. I think, that what we're talking about clearly here is operational analytics. There'll still be a need for your data scientists to go offline just in their data lakes to do all that very exploratory and that deep modeling. But clearly, it just makes sense to bring operational analytics into where people work into their workspace and further flatten that modern data stack. >> But with all this metadata and all this intelligence, we're talking about injecting AI into applications, it does seem like we're entering a new era of not only data, but new era of apps. Today, most applications are about filling forms out or codifying processes and require a human input. 
And it seems like there's enough data now and enough intelligence in the system that the system can actually pull data from, whether it's the transaction system, e-commerce, the supply chain, ERP, and actually do something with that data without human involvement, present it to humans. Do you guys see this as a new frontier? >> I think, that's certainly- >> Very much so, but it's going to take a while, as Carl said. You have to design it, you have to get the prediction into the system, you have to get the analytics at the point of decision has to be relevant to that decision point. >> And I also recall basically a lot of the ERP vendors back like 10 years ago, were promising that. And the fact that we're still looking at the promises shows just how difficult, how much of a challenge it is to get to what Doug's saying. >> One element that could be applied in this case is (indistinct) architecture. If applications are developed that are event-driven rather than following the script or sequence that some programmer or designer had preconceived, then you'll have much more flexible applications. You can inject decisions at various points using this technology much more easily. It's a completely different way of writing applications. And it actually involves a lot more data, which is why we should all like it. (laughs) But in the end (Tony laughing) it's more stable, it's easier to manage, easier to maintain, and it's actually more efficient, which is the result of an MIT study from about 10 years ago, and still, we are not seeing this come to fruition in most business applications. >> And do you think it's going to require a new type of data platform database? Today, data's all far-flung. We see that's all over the clouds and at the edge. Today, you cache- >> We need a super cloud. >> You cache that data, you're throwing into memory. I mentioned, MySQL HeatWave. 
There are other examples where it's a brute force approach, but maybe we need new ways of laying data out on disk and new database architectures, and just when we thought we had it all figured out. >> Well, without referring to disk, which to my mind, is almost like talking about cave painting. I think, that (Dave laughing) all the things that have been mentioned by all of us today are elements of what I'm talking about. In other words, the whole improvement of the data mesh, the improvement of metadata across the board and improvement of the ability to track data and judge its freshness the way we judge the freshness of a melon or something like that, to determine whether we can still use it. Is it still good? That kind of thing. Bringing together data from multiple sources dynamically and real-time requires all the things we've been talking about. All the predictions that we've talked about today add up to elements that can make this happen. >> Well, guys, it's always tremendous to get these wonderful minds together and get your insights, and I love how it shapes the outcome here of the predictions, and let's see how we did. We're going to leave it there. I want to thank Sanjeev, Tony, Carl, David, and Doug. Really appreciate the collaboration and thought that you guys put into these sessions. Really, thank you. >> Thank you. >> Thanks, Dave. >> Thank you for having us. >> Thanks. >> Thank you. >> All right, this is Dave Vellante for theCUBE, signing off for now. Follow these guys on social media. Look for coverage on siliconangle.com, theCUBE.net. Thank you for watching. (upbeat music)
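Editor's sketch: in the embedded analytics discussion, Carl describes applications where an analytic result "is enacted pretty much automatically unless the scores are too low and you need to have a human being look at it." A minimal Python illustration of that gating idea follows. The names Prediction and route, the 0-to-1 score scale, and the 0.8 default threshold are all assumptions made for this sketch, not anything the panelists specified.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical output of an embedded model: a proposed action and a confidence score."""
    action: str
    score: float  # assumed to be a confidence in the range [0, 1]


def route(prediction, threshold=0.8):
    """Enact the action automatically when confidence is high; otherwise send it to a person.

    The threshold is an illustrative default; a real application would tune it
    per decision point.
    """
    if not 0.0 <= prediction.score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if prediction.score >= threshold:
        return ("auto", prediction.action)
    return ("review", prediction.action)
```

For example, route(Prediction("reorder_stock", 0.93)) would take the automatic path, while a score of 0.41 on the same call would be queued for human review. The design choice is simply that the application, not a separate dashboard, carries the decision rule, which is the point Doug and Carl both make.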

Published Date : Jan 11 2023

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Doug Henschen	PERSON	0.99+
Dave Menninger	PERSON	0.99+
Doug	PERSON	0.99+
Carl	PERSON	0.99+
Carl Olofson	PERSON	0.99+
Tony Baer	PERSON	0.99+
Tony	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Collibra	ORGANIZATION	0.99+
Curt Monash	PERSON	0.99+
Sanjeev Mohan	PERSON	0.99+
Christian Kleinerman	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Sanjeev	PERSON	0.99+
Constellation Research	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Ventana Research	ORGANIZATION	0.99+
2022	DATE	0.99+
Hazelcast	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
25%	QUANTITY	0.99+
2021	DATE	0.99+
last year	DATE	0.99+
65%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
today	DATE	0.99+
five-year	QUANTITY	0.99+
TigerGraph	ORGANIZATION	0.99+
Databricks	ORGANIZATION	0.99+
two services	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
David	PERSON	0.99+
RisingWave Labs	ORGANIZATION	0.99+

theCUBE Insights | Snowflake Summit 2022

(upbeat music) >> Hey everyone, welcome back to theCUBE's three day coverage of Snowflake Summit 22. Lisa Martin here with Dave Vellante. We have been here as I said for three days. Dave, we have had an amazing three days. The energy, the momentum, the number of people still here speaks volumes for- >> Yeah, I was just saying, you look back, theCUBE, when it started, early days was a big part of the Hadoop ecosystem. You know Cloudera kind of got it started, the whole big data movement, it was awesome energy, and that whole ecosystem has been, I think, just hoovered into the Snowflake ecosystem. They've taken over as the data company, the data cloud, I mean, that was Cloudera, it could have been Cloudera, and now they didn't, they missed it, it was a variety of factors, but Snowflake has nailed it. And now it's theirs to lose. Benoit talked about that on our previous segment, how he knew that technically Hadoop was too complex, and was going to fail, and they didn't know it was going to do this. They were going to turn their company into what we see here. But the event itself, Lisa, is almost 10,000 people, the right people, people are doing business, we've had a number of people tell us that they're booking deals. That's why people come to face-to-face shows, right? That's the criticism of virtual. It takes too long to close business. Salespeople want to be belly-to-belly. And this is a belly-to-belly show. >> It absolutely is. When you and I were trying to get into the keynote on Tuesday, we finally got in standing room only, multiple overflow rooms, and we're even hearing that, so this is day four of the summit for them, there are still queues to get into breakout sessions. The momentum, but the appetite for this flywheel, and what they're creating, but also they're involving this massively growing ecosystem in its evolution. It's that synergy was really very much heard, and echoed throughout pretty much all of our segments the last couple days. 
>> Yeah, it was amazing actually. So we like to go, we want to be in the front row in the keynotes, we're taking notes, we always do that. Sometimes we listen remotely, but when you listen remotely, you miss some things. When you're there, you can see the executives, you can feel their energy, you can chit chat to them on the side, be seen, whatever. And it was crazy, we couldn't get in. So we had to do our thing, and sneak our way in, and "Hey, we're media." "Oh yeah, come on in." And then no, they were taking us to a breakout room. We had to sneak in a side door, got like the last two seats, and wow, I'm glad we were in there because it gave us a better sense. When you're in the remote watching rooms you just can't get a sense of the energy. That's why I like to be there, I know you do too. And then to your point about ecosystem. So we've said many times that what Snowflake is developing is what we call supercloud. It's not just a SaaS, it's not just a cloud database, it's a new layer that they're creating. And so what are the attributes of that layer? Well, it hides the underlying complexity of the underlying primitives of the cloud. We've said that ad nauseam, and it adds new value on top. Well, what's that value that they're adding? Well, they're adding value of being able to share data, collaborate, have data that's governed, and secure, globally. And now the other hallmark of a cloud company is ecosystem. And so they're building that ecosystem much more rapidly than we saw at ServiceNow, which is Slootman's previous company. And the key to me is they've launched an application development platform, essentially a super PaaS, so that you can develop applications on top of the data cloud. And we're hearing tons about monetization. Duh, you could actually make money with data. You can package data into data products, and data services, or feed data products and services, and actually sell that in a cloud, in a supercloud. That's exactly what's happening here. 
So that's critical. I think my one question mark if I had to lay one out, is the other hallmark of a cloud is startup, startups come into that cloud. And I think we're seeing that, maybe not at the pace that AWS did, it's a little different. Snowflake are, they're whale hunters. They're after big companies. But it looks to me like they're relying on the ecosystem to be the startup innovators. That's the important thing about cloud, cloud brings scale. It definitely brings lower cost 'cause you're eliminating all this undifferentiated labor, but it also brings innovation through startups. So unlike AWS, who sold to the startups directly, and startups built businesses on AWS, and by paying AWS, it's a little bit indirect, but it's actually happening where startups in the ecosystem are building products on the data cloud, and that ultimately is going to drive value for customers, and money for Snowflake, and ultimately AWS, and Google, and Azure. The other thing I would say is the criticism or concern that the cost of goods sold for cloud are going to be so high that it's going to force people to come back on-prem. I think it's a step in the wrong direction. I think cloud, and the cloud operating model is here to stay. I think it's going to be very difficult to replicate that on-prem. I don't think you can do cloud without cloud, and we'll see what the edge brings. >> Curious what your thoughts are. We were just at Dell Technologies World a month or so ago when the big announcement, the Snowflake partnership there, cloud native companies recognizing, ah, there's still a lot of data that lives on-prem. Given that, and everything that we've heard the last couple of days, what are your thoughts around that and their partnerships there? >> So Dell is, I think finally, now maybe they weren't publicly talking like this, but certainly their marketing was defensive. 
But in the last year or so, Dell has really embraced cloud, not just the cloud operating model, Dell has said, "Look, we can build value on top of all these hyperscalers." And we saw some examples at Dell Tech World of them stepping their toe into supercloud. Project Alpine is an example, and there are others. And then of course the Snowflake deal, where Snowflake and Dell got together, I asked Frank Slootman how that deal came about. And 'cause I said, "Did the customer get you into a headlock?" 'Cause I presume that was the case. Customer said, "You got to do this or we're not going to do business with you." He said, "Well, no, not really. Michael and I had a chat, and that's how it started." Which was my other scenario, and that's exactly what happened I guess. The point being that those worlds are coming together. And so what it means for Dell is as they embrace cloud, as they develop supercloud capabilities, they're going to do a lot of business. Dell for sure knows how to sell, they know how to execute. What I would be doing if I were Dell, is I would be trying to substantially replicate what's happening in the cloud on-prem with on-prem data. So what happens with that Snowflake deal is, it's read-only data, you read the data into the cloud, the compute is in the cloud. And I should've asked Terry this, I mean Benoit. Can there be an architecture on-prem? We've seen that Vertica has one, it's called Vertica Eon where you separate compute from storage. It doesn't have unlimited elasticity, but you can grow, compute, and storage independently, and have a lot more. With Dell doing APEX on demand, it's cloudlike, they could begin to develop a little mini data cloud, or a big data cloud within on-prem that connects to the public cloud. So what Snowflake is missing, a big part of their TAM that they're missing is the on-prem. The Dell and Pure deals are forays into that, but this on-prem is massive, and Dell is the on-prem poster child. 
So I think again what it means for them is they've got to continue to embrace it, they got to do more in software, more in data management, they got to push on APEX. And I'd say the same thing for HPE. I think they're both well behind this in terms of ecosystems. I mean they're not even close. But they have to start, and they got to start somewhere, and they've got resources to make it happen. >> You said in your breaking analysis that you published just a few days ago before the event that Snowflake plans to create a de facto standard in data platforms. What we heard from our guests on this program, your mainstage session with Frank Slootman. Still think that? >> I do. I think it more than I believed it coming in. And the reason I called it that is because I am a super fan of Zhamak Dehghani and her data mesh. And what her vision is, it's kind of the Immaculate Conception, where she wants everything to be open, open standards, and those don't exist today. And I think she perfectly realizes the practicality of de facto standards are going to get to market, and add value sooner than open standards. Now open standards over time, and I'll come back to that, may occur, but that's clear to me what Snowflake is creating, is the de facto standard for data platforms, the data cloud, the supercloud. And what's most impressive, or I think really important, is they're layering applications now on top of that. The metric to me, and I don't know if we can even count this, but VMware used to use it. For every dollar spent on VMware license, $15 was spent in the ecosystem. It started at 1 to 1.5, 1 to 2, 1 to 10, 1 to 15, I think it went up to 1 to 30 at the max. I don't know how they counted that, but it's countable. Reasonable people can make estimates like that. And I think as the ecosystem grows, what Snowflake's doing is it's in many respects modeling the cloud, what the cloud has. Cloud has ecosystems, we talked about startups, and the cloud also has optionality. 
And optionality means open source. So what you saw with Apache Iceberg is we're going to extend to open technologies. What you saw with Hybrid tables is we're going to extend to new workloads like transactions. The other thing about Snowflake that's really impressive is you're seeing the vertical focus. Financial services, healthcare, retail, media and entertainment. It's very rare for a company in this tenure, they're only 10 years old, to really start going vertical with their go-to-market, and building expertise around that. I think what's going to happen is the GSIs are going to come in, they love to eat at the trough, the trough here is maybe not big enough for them yet, but it will be. And they're going to start to align with the GSIs, and they're going to do really well within those industries, connecting people, collaborating with data. But I think it's a killer strategy, but they're executing on it. >> Right, and we heard a lot of great customer stories from all of those four verticals that you talked about, and then some, that that direction and that pivot from a customer perspective, from a sales and marketing perspective is all aligned. And that was kind of one of the themes as well that Frank talked about in his keynote is mission alignment, mission alignment with customers, but also with the ecosystem. And I feel that I heard that with every customer conversation, with every partner conversation, and Snowflake conversation that we had over the last I think 36 segments, Dave. >> Yeah, I mean, yeah, it's the power of many versus the resources of one. And even though Snowflake tell you they have $5 billion in cash, and assets on the balance sheet, and that's fine, that's nothing compared to what an ecosystem has. And Amazon's part of that ecosystem. Azure is part of that ecosystem. Google is part of that ecosystem. Those companies have huge resources, and Snowflake it seems has figured out how to tap those resources, and build value on top of it. 
To me they're doing a better job than a lot of the cloud databases out there. They don't necessarily have a better database, in fact, I could argue that their database is less functional. And I would argue that actually in many cases. Their database is less functional if you just want a database. But if you want a data cloud, and an ecosystem, and develop applications on top of that, and to be able to monetize, that's unique, and that is a moat that they're building that is highly differentiable, and being able to do that relatively easily. I mean, I think they overstate the simplicity with which that is being done. We talked to some customers who said, he didn't say same wine, new bottle. I did ask him that, about Hadoop complexity. And he said, "No, it's not that bad." But you still got to put this stuff together. And I think in the early parts of a market that are immature, people get really excited because it's so much easier than what was previous. So my other question is, okay, what's somebody working on now, that's looking at what Snowflake's doing and saying, I can improve on that. And what's going to be really interesting to see is, can they improve on it in a way, and can they raise enough capital such that they can disrupt, or is Snowflake going to keep staying paranoid, 'cause they got good leaders, and keep executing? And then I think the other wild card is edge. Snowflake doesn't really have an edge strategy right now. I think they will develop one. >> Through the ecosystem? >> And I don't think they're missing the boat, and they'll do it through the ecosystem, exactly. I don't think they're missing the boat, I think they're just like, "Well, we don't know what to do today." It's all distributed data, and it's ephemeral, and nobody's storing the data. You know anything that comes back to the cloud, we get. But new architectures are emerging on the edge that are going to bring new economics. 
There's new silicon, you see what's happening with Apple, and the M1, the M1 Ultra, and the new systems that they've just developed. What Tesla is doing with custom silicon, and amazing things, and programmability of the Arm model. So it's early days, but semiconductors are the mainspring of innovation in this industry. Without chips, you got nothing. And when you get innovations in silicon, it drives innovations in software, because developers go, "Wow, I can do that now?" I can do things in parallel, I can do things faster, I can do things more simply, and programmable at scale. So that's happening. And that's going to bring a new set of economics that the premise is that will eventually bleed into the data center. It will, it always does. And I guess the other thing is every 15 years or so, the world gets disrupted, the tech world. We're about 15, 16 years in now to the cloud. So at this point, everybody's like, "Wow this is insurmountable, this is all we'll ever see. Everything that's ever been invented, this is the model of the future." We know that's not the case. I don't know how it's going to get disrupted, but I think edge is going to be part of that. It could be public policy. Governments could come in and take big tech on, seems like Lina Khan wants to do that. So that's what makes this industry so fun. >> Never a dull moment, Dave. This has been a great three days hosting this show with you. We've uncovered a lot. Your breaking analysis was great to get me prepared for the show. If you haven't seen it, check it out on siliconangle.com. Thanks, Dave, I appreciate all of your insights. >> Thank you, Lisa, it's been a pleasure working with you. >> Always good to work with you. >> Awesome, great job. >> Likewise. Great job to the team. >> Yes, thank you to our awesome production team. They've kept us going for three days. >> Yes, and the team back, Kristin, and Cheryl, and everybody back at the office. >> Exactly, it takes a village. 
For Dave Vellante, I am Lisa Martin. We are wrappin' up three days of wall-to-wall coverage at Snowflake Summit 22 from Vegas. Thanks for watching guys, we'll see you soon. (upbeat music)

Published Date : Jun 17 2022

ENTITIES

Entity	Category	Confidence
Frank Slootman	PERSON	0.99+
Michael	PERSON	0.99+
Kristin	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Cheryl	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Frank	PERSON	0.99+
Terry	PERSON	0.99+
Lisa	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Zhamak Dehghani	PERSON	0.99+
Dell	ORGANIZATION	0.99+
$15	QUANTITY	0.99+
$5 billion	QUANTITY	0.99+
Vertica	ORGANIZATION	0.99+
Tuesday	DATE	0.99+
Vegas	LOCATION	0.99+
Benoit	PERSON	0.99+
three days	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Tesla	ORGANIZATION	0.99+
Apache Iceberg	ORGANIZATION	0.99+
three day	QUANTITY	0.99+
Snowflake Summit 22	EVENT	0.99+
last year	DATE	0.99+
Apple	ORGANIZATION	0.99+
three days	QUANTITY	0.99+
1	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+
15	QUANTITY	0.98+
36 segments	QUANTITY	0.98+
30	QUANTITY	0.98+
1.5	QUANTITY	0.98+
M1 Ultra	COMMERCIAL_ITEM	0.98+
10	QUANTITY	0.98+
today	DATE	0.98+
theCUBE	ORGANIZATION	0.97+
siliconangle.com	OTHER	0.97+
both	QUANTITY	0.97+
Snowflake Summit 2022	EVENT	0.97+
2	QUANTITY	0.96+
Cloudera	ORGANIZATION	0.96+
M1	COMMERCIAL_ITEM	0.94+
Vertica Eon	ORGANIZATION	0.94+
two seats	QUANTITY	0.94+
Dell Tech World	ORGANIZATION	0.92+
few days ago	DATE	0.92+
one question	QUANTITY	0.91+
one	QUANTITY	0.91+
ServiceNow	ORGANIZATION	0.91+
up	QUANTITY	0.9+
VMware	ORGANIZATION	0.9+
10 years old	QUANTITY	0.89+
TAM	ORGANIZATION	0.87+
four verticals	QUANTITY	0.85+
almost 10,000 people	QUANTITY	0.84+
a month or so ago	DATE	0.83+
last couple of days	DATE	0.82+

Tony Baer, Doug Henschen and Sanjeev Mohan, Couchbase | Couchbase Application Modernization

(upbeat music) >> Welcome to this CUBE Power Panel where we're going to talk about application modernization, also success templates, and take a look at some new survey data to see how CIOs are thinking about digital transformation, as we get deeper into the post isolation economy. And with me are three familiar VIP guests to CUBE audiences. Tony Baer, the principal at dbInsight, Doug Henschen, VP and principal analyst at Constellation Research, and Sanjeev Mohan, principal at SanjMo. Guys, good to see you again, welcome back. >> Thank you. >> Glad to be here. >> Thanks for having us. >> Glad to be here. >> All right, Doug. Let's get started with you. You know, this recent survey, which was commissioned by Couchbase, 650 CIOs and CTOs, and IT practitioners. So obviously very IT heavy. They responded to the following question, "In response to the pandemic, my organization accelerated our application modernization strategy," and of course, an overwhelming majority, 94%, agreed or strongly agreed. So I'm sure, Doug, that you're not shocked by that, but in the same survey, modernizing existing technologies was second only behind cyber security as the top investment priority this year. Doug, bring us into your world and tell us the trends that you're seeing with the clients and customers you work with in their modernization initiatives. >> Well, the survey, of course, is spot on. You know, any Constellation Research analyst, any systems integrator will tell you that we saw more transformation work in the last two years than in the prior six to eight years. A lot of it was forced, you know, a lot of movement to the cloud, a lot of process improvement, a lot of automation work, but transformation is aspirational and not every company can be a leader. 
You know, at Constellation, we focus our research on those market leaders and that's only, you know, the top 5% of companies that are really innovating, that are really disrupting their markets and we try to share that with companies that want to be fast followers, that these are the next 20 to 25% of companies that don't want to get left behind, but don't want to hit some of the same roadblocks and you know, pioneering pitfalls that the real leaders are encountering when they're harnessing new technologies. So the rest of the companies, you know, the cautious adopters, the laggards, many of them fall by the wayside, that's certainly what we saw during the pandemic. Who are these leaders? You know, the old saw examples that people cite are the Amazons, the Teslas, the Airbnbs, the Ubers and Lyfts, but new examples are emerging every year. And as a consumer, you immediately recognize these transformed experiences. One of my favorite examples from the pandemic is Rocket Mortgage. No disclaimer required, I don't own stock and they're not a client, but when I wanted to take advantage of those record low mortgage interest rates, I called my current bank and some, you know, stalwart, very established conventional banks, I'm talking to you Bank of America, Citibank, and they were taking days and weeks to get back to me. Rocket Mortgage had the locked in commitment that day, very proactive, consistent communications across web, mobile, email, all customer touchpoints. I closed in a matter of weeks, an entirely digital, seamless process. This is back in the gloves and masks days and the loan officer came, parked in our driveway, wiped down an iPad, handed us that iPad, we signed all those documents digitally, completely electronic workflow. The only wet signatures required were those demanded by the state. So it's easy to spot these transformed experiences. 
You know, Rocket had most of that in place before the pandemic, and that's why they captured 8% of the national mortgage market by 2020 and they're on track to hit 10% here in 2022. >> Yeah, those are great examples. I mean, I'm not a shareholder either, but I am a customer. I even went through the same thing in the pandemic. It was all done in digital, it was a piece of cake, and I happened to have to do another one with a different firm and stuck with that firm for a variety of reasons and it was night and day. So to your point, it was a forced march to digital. If you were there beforehand, you had a real advantage, it could accelerate your lead during the pandemic. Okay, now Tony Baer. Mr. Baer, I understand you're skeptical about all this buzz around digital transformation. So in that same survey, the data shows that the majority of respondents said that their digital initiatives were largely reactive to outside forces, the pandemic, compliance changes, et cetera. But at the same time, they indicated that the results, while somewhat mixed, were generally positive. So why are you skeptical? >> The reason being, and by the way, I have nothing against application modernization. The problem... I think the problem, as I've said, is it often gets conflated with digital transformation and digital transformation itself has become such a buzzword and so overused that it's really hard, if not impossible, to pin down (coughs) what digital transformation actually means. And very often what you'll hear from, let's say a C level, you know, (mumbles) we want to run like Google regardless of whether or not that goal is realistic, you know, for that organization (coughs). The thing is that we've been using, you know, businesses have been using digital data since the days of the mainframe, since the... Sorry, that data has been digital. 
What really has changed though, is just the degree of how businesses interact with their customers, their partners, with the whole rest of the ecosystem and how their business... And how in many cases, you take a look at the auto industry, the nature of the business, you know, is changing. So there is real change afoot, the question is, I think we need to get more specific in our goals. And when you look at it, if we can boil it down to a couple, maybe, you know, boil it down like really over simplistically, it's really all about connectedness. No, I'm not saying connectivity 'cause that's more of a physical thing, but connectedness. Being connected to your customer, being connected to your supplier, being connected to the, you know, to the whole landscape that you operate in. And of course today we have many more channels with which we operate, you know, with customers. And in fact also if you take a look at what's happening in the automotive industry, for instance, I was just reading an interview with Bill Ford, you know, their... Ford is now rapidly ramping up their electric, you know, their electric vehicle strategy. And what they realize is it's not just a change of technology, you know, it is a change in their business, it's a change in terms of the relationship they have with their customer. Their customers have traditionally been automotive dealers who... And the automotive dealers have, you know, traditionally and in many cases by state law now have been the ones who own the relationship with the end customer. But when you go to an electric vehicle, the product becomes a lot more of a software product. And in turn, that means that Ford would have much more direct interaction with its end customers. So that's really what it's all about. It's about, you know, connectedness, it's also about the ability to act, you know, we can say agility, it's about ability not just to react, but to anticipate and act. And so... 
And of course with all the proliferation, you know, the explosion of data sources and connectivity out there and the cloud, which allows much more, you know, access to compute, it changes the whole nature of the ball game. The fact is that we have to avoid being overwhelmed by this and make our goals more, I guess, tangible, more strictly defined. >> Yeah, now... You know, great points there. And I want to just bring in some survey data, again, two thirds of the respondents said their digital strategies were set by IT and only 26% by the C-suite, 8% by the line of business. Now, this was largely a survey of CIOs and CTOs, but, wow, doesn't seem like the right mix. It's to Doug's point about, you know, leaders and laggards. My guess is that Rocket Mortgage, their digital strategy was led by the chief digital officer potentially. But at the same time, you would think, Tony, that application modernization is a prerequisite for digital transformation. But I want to go to Sanjeev and this one in the survey. And respondents said that on average, they want 58% of their IT spend to be in the public cloud three years down the road. Now, again, this is CIOs and CTOs, but (mumbles), but that's a big number. And there was no ambiguity because the question wasn't worded as cloud, it was worded as public cloud. So Sanjeev, what do you make of that? What's your feeling on cloud as flexible architecture? What does this all mean to you? >> Dave, 58% of IT spend in the cloud is a huge change from today. Today, most estimates peg cloud IT spend to be somewhere around five to 15%. So what this number tells us is that the cloud journey is still in its early days, so we should buckle up. We ain't seen nothing yet, but let me add some color to this. CIOs and CTOs may be ramping up their cloud deployment, but they still have a lot of problems to solve. 
I can tell you from my previous experience, for example, when I was at Gartner, I used to talk to a lot of customers who were in a rush to move into the cloud. So if we were to plot, let's say a maturity model, typically a maturity model in any discipline in IT would have something like crawl, walk, run. So what I was noticing was that these organizations were jumping straight to run because in the pandemic, they were under the gun to quickly deploy into the cloud. So now they're kind of coming back down to, you know, to crawl, walk, run. So basically they did what they had to do under the circumstances, but now they're starting to resolve some of the very, very important issues. For example, security, data privacy, governance, observability, these are all very big ticket items. Another huge problem that now we are noticing more than we've ever seen, are the rising costs. Cloud makes it so easy to onboard new use cases, but it leads to all kinds of unexpected increases and spikes in your operating expenses. So what we are seeing is that organizations are now getting smarter about where the workloads should be deployed. And sometimes it may be in more than one cloud. Multi-cloud is no longer an aspirational thing. So that is a huge trend that we are seeing and that's why you see there's so much increased planning to spend money in public cloud. We do have some issues that we still need to resolve. For example, multi-cloud sounds great, but we still need some sort of single pane of glass, control plane so we can have some fungibility and move workloads around. And some of this may also not be in public cloud, some workloads may actually be done in a more hybrid environment. >> Yeah, definitely. I call it Supercloud. People wince sometimes-- >> Supercloud. >> At that term, but it's above multi-cloud, it floats, you know, on top. But so you clearly identified some potholes. 
So I want to talk about the evolution of the application experience 'cause there's some potholes there too. 81% of the respondents in that survey said, "Our development teams are embracing the cloud and other technologies faster than the rest of the organization can adopt and manage them." And that was an interesting finding to me because you'd think that infrastructure as code and designing in security and containers and Kubernetes would be a great thing for organizations, and it is I'm sure in terms of developer productivity, but what do you make of this? Does the modernization path also have some potholes, Sanjeev? What are those? >> So, first of all, Dave, you mentioned in your previous question, there's no ambiguity, it's a public cloud. This one, I feel it has quite a bit of ambiguity because it talks about cloud and other technologies, that sort of opens up the kimono, it's like that's everything. Also, it says that the rest of the organization is not able to adopt and manage. Adoption is a business function, management is an IT function. So I feel this question is a bit loaded. We know that app modernization is here to stay, developing in the cloud removes a lot of traditional barriers of procuring and instantiating infrastructure. In addition, developers today have so many more advanced tools. So they're able to develop the application faster because they have like low-code/no-code options, they have notebooks to write the machine learning code, they have the entire DevOps CI/CD tool chain that makes it easy to version control and push changes. But there are potholes. For example, are developers really interested in fixing data quality problems, or data privacy, data access, data governance? How about monitoring? I doubt developers want to get encumbered with all of these operationalization management pieces. Developers are very keen to deliver new functionality. 
So what we are now seeing is that it is left to the data team to figure out all of these operationalization, productionization things that the developers have... You know, are not truly interested in that. So which actually takes me to this topic that, Dave, you've been quite actively covering and we've been talking about, see, the whole data mesh. >> Yeah, I was going to say, it's going to solve all those data quality problems, Sanjeev. You know, I'm a sucker for data mesh. (laughing) >> Yeah, I know, but see, what's going to happen with data mesh is that developers are now going to have more domain resident power to develop these applications. What happens to all of the data curation, governance, quality that, you know, a central team used to do? So there's a lot of open ended questions that still need to be answered. >> Yeah, that gets automated, Tony, right? With computational governance. So-- >> Of course. >> It's not trivial, it's not trivial, but I'm still an optimist, by the end of the decade we'll start to get there. Doug, I want to go to you again and talk about the business case. We all remember, you know, the business case for modernization that is... We remember the Y2K, there was a big IT spending binge and this was before the (mumbles) of the enterprise, right? CIOs, they'd be asked to develop new applications and the business maybe helps pay for it or offset the cost with the initial work and deployment, then IT got stuck managing the sprawling portfolio for years. And a lot of the apps had limited adoption or only served a few users, so there were big pushes toward rationalizing the portfolio at that time, you know? So do I modernize, they had to make a decision, consolidate, do I sunset? You know, it was all based on value. So what's happening today and how are businesses making the case to modernize, are they going through a similar rationalization exercise, Doug? 
>> Well, the Y2K era experience that you talked about was back in the days of, you know, throw the requirements over the wall and then we had waterfall development that lasted months, in some cases years. We see today's most successful companies building cross functional teams. You know, the C-suite, the line of business, the operations, the data and analytics teams, the IT, everybody has a seat at the table to lead innovation and modernization initiatives and they don't start, the most successful companies don't start by talking about technology, they start by envisioning a business outcome, by envisioning a transformed customer experience. You hear the example of Amazon writing the press release for the product or service it wants to deliver and then it works backwards to create it. You got to work backwards to determine the tech that will get you there. What's very clear though, is that you can't transform or modernize by lifting and shifting the legacy mess into the cloud. That doesn't give you the seamless processes, that doesn't give you data driven personalization, it doesn't give you a connected and consistent customer experience, whether it's online or mobile, you know, bots, chat, phone, everything that we have today that requires a modern, scalable cloud-native approach and agile delivery, an iterative experience where you're collaborating with this cross-functional team and course correct, again, making sure you're on track to what's needed. >> Yeah. Now, Tony, both Doug and Sanjeev have been, you know, talking about what I'm going to call this IT and business schism, and we've all done surveys. One of the things I'd love to see Couchbase do in future surveys is not only survey the IT heavy, but also survey the business heavy and see what they say about who's leading the digital transformation and who's in charge of the customer experience. Do you have any thoughts on that, Tony? >> Well, there's no question... 
I mean, it's kind of like, you know, the more things change. I mean, we've been talking about that IT and the business have to get together, we talked about this back during, and Doug, you probably remember this, back during the Y2K ERP days, is that you need these cross functional teams, we've been seeing this. I think what's happening today though, is that, you know, back in the Y2K era, we were basically going into like our bedrock systems and having to totally re-engineer them. And today what we're looking at is that, okay, those bedrock systems, the ones that basically are keeping the lights on, okay, those are there, we're not going to mess with that, but on top of that, that's where we're going to innovate. And that gives us a chance to be more, you know, more directed and therefore we can bring these related domains together. I mean, that's why just kind of, you know, talk... Where Sanjeev brought up the term of data mesh, I've been a bit of a cynic about data mesh, but I do think that where it can work is where we bring a bunch of these connected teams together, teams that have some sort of shared context, so it's everybody that's... Every team that's working, let's say around the customer, for instance, which could be, you know, in marketing, it could be in sales, order processing in some cases, you know, in logistics and delivery. So I think that's where I think we... You know, there's some hope and the fact is that with all the advanced, you know, basically the low-code/no-code tools, there are ways to bring some of these other players, you know, into the process who previously had to... Were sort of, you know, more at the end of like a, you know, kind of a... Sort of like they throw it over the wall type process. So I do believe, but despite all my cynicism, I do believe there's some hope. >> Thank you. Okay, last question. And maybe all of you could answer this. Maybe, Sanjeev, you can start it off and then Doug and Tony can chime in. 
In the survey, about a half, nearly half of the 650 respondents said they could tangibly show their organizations improved customer experiences that were realized from digital projects in the last 12 months. Now, again, not surprising, but we've been talking about digital experiences, but there's a long way to go judging from our pandemic customer experiences. And we, again, you know, some were great, some were terrible. And so, you know, and some actually got worse, right? Will that improve? When and how will it improve? Where do 5G and things like that fit in, in terms of improving customer outcomes? Maybe, Sanjeev, you could start us off here. And by the way, plug any research that you're working on in this sort of area, please do. >> Thank you, Dave. As a resident optimist on this call, I'll get us started and then I'm sure Doug and Tony will have interesting counterpoints. So I'm a technology fan boy, I have to admit, I am in awe of all these new companies and how they have been able to rise up and handle extreme scale. In this time that we are speaking on this show, these food delivery companies would have probably handled tens of thousands of orders in minutes. So these concurrent orders, delivery, customer support, geospatial location intelligence, all of this has really become commonplace now. It used to be that, you know, large companies like Apple would be able to handle all of these supply chain issues, disruptions that we've been facing. But now in my opinion, I think we are seeing this in, Doug mentioned Rocket Mortgage. So we've seen it in FinTech and shopping apps. So we've seen the same scale and it's more than 5G. It includes things like... Even in the public cloud, we have much more efficient, better hardware, which can do like deep learning networks much more efficiently. So machine learning, a lot of natural language processing, being able to handle unstructured data. 
So in my opinion, it's quite phenomenal to see how technology has actually come to the rescue and as, you know, billions of us have gone online over the last two years. >> Yeah, so, Doug, to Sanjeev's point, he's saying, basically, you ain't seen nothing yet. What are your thoughts here, your final thoughts? >> Well, yeah, I mean, there's some incredible technologies coming including 5G, but you know, it's only going to pave the cow path if the underlying app, if the underlying process is clunky. You have to modernize, take advantage of, you know, serverless scalability, autonomous optimization, advanced data science. There's lots of cutting edge capabilities out there today, but you know, lifting and shifting, you got to get your hands dirty and actually modernize on that data front. I mentioned my research this year, I'm doing a lot of in depth looks at some of the analytical data platforms. You know, these lake houses, we've had some conversations about that, and helping companies to harness their data, to have a more personalized and predictive and proactive experience. So, you know, we're talking about the Snowflakes and Databricks and Googles and Teradata and Vertica and Yellowbrick and that's the research I'm focusing on this year. >> Yeah, your point about paving the cow path is right on, especially over the pandemic, a lot of the processes were unknown. But you saw this with RPA, paving the cow path only got you so far. And so, you know, great points there. Tony, you get the last word, bring us home. >> Well, I'll put it this way. I think there's a lot of hope in terms of that the new generation of developers that are coming in are a lot more savvy about things like data. And I think also the new generation of people in the business are realizing that we need to have data as a core competence. So I do have optimism there that the fact is, I think there is a much greater consciousness within both the business side and the technical. 
The technology side of the organization, of the importance of data and how to approach that. And so I'd like to just end on that note. >> Yeah, excellent. And I think you're right. Putting data at the core is critical. Data mesh, I think, very well describes the problem and (mumbles) credit lays out a solution, just the technology's not there yet, nor are the standards. Anyway, I want to thank the panelists here. Amazing. You guys are always so much fun to work with and love to have you back in the future. And thank you for joining today's broadcast brought to you by Couchbase. By the way, check out Couchbase on the road this summer at their application modernization summits, they're making up for two years of shut in and coming to you. So you got to go to couchbase.com/roadshow to find a city near you where you can meet face to face. In a moment, Ravi Mayuram, the chief technology officer of Couchbase, will join me. You're watching theCUBE, the leader in high tech enterprise coverage. (bright music)

Published Date : May 19 2022

SUMMARY :

Guys, good to see you again, welcome back. but in the same survey, So the rest of the companies, you know, and I happened to have to do another one it's also about the ability to act, So Sanjeev, what do you make of that? Dave, 58% of IT spend in the cloud I call it Supercloud. it floats, you know, on topic. Also, it says that the say, it's going to solve that still need to be answered. Yeah, That gets automated, Tony, right? And a lot of the apps had limited adoption is that you can't transform or modernize One of the things I'd love to see and the business has to get together, nearly half of the 650 respondents and how they have been able to rise up you ain't seen nothing yet. and that's the research paving the cow path only got you so far. in terms of that the new and love to have you back in the future.

ENTITIES

Entity	Category	Confidence
Doug	PERSON	0.99+
Tony	PERSON	0.99+
Ravi Mayuram	PERSON	0.99+
Apple	ORGANIZATION	0.99+
Tony Bear	PERSON	0.99+
Dave	PERSON	0.99+
Doug Henschen	PERSON	0.99+
Bank of America	ORGANIZATION	0.99+
Tony Baer	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Ford	ORGANIZATION	0.99+
iPad	COMMERCIAL_ITEM	0.99+
Sanjeev Mohan	PERSON	0.99+
Sanjeev	PERSON	0.99+
Teradata	ORGANIZATION	0.99+
94%	QUANTITY	0.99+
Vertica	ORGANIZATION	0.99+
58%	QUANTITY	0.99+
Constellation Research	ORGANIZATION	0.99+
Yellowbrick	ORGANIZATION	0.99+
8%	QUANTITY	0.99+
2022	DATE	0.99+
today	DATE	0.99+
City Bank	ORGANIZATION	0.99+
Bill Ford	PERSON	0.99+
two years	QUANTITY	0.99+
Googles	ORGANIZATION	0.99+
81%	QUANTITY	0.99+
10%	QUANTITY	0.99+
DB InSight	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Today	DATE	0.99+
2020	DATE	0.99+
Couchbase	ORGANIZATION	0.99+
Snowflakes	ORGANIZATION	0.99+
5%	QUANTITY	0.98+
650 CIOs	QUANTITY	0.98+
Amazons	ORGANIZATION	0.98+
both	QUANTITY	0.98+
One	QUANTITY	0.98+
Lyfts	ORGANIZATION	0.98+
second	QUANTITY	0.98+
SanjMo	ORGANIZATION	0.98+
26%	QUANTITY	0.98+
Ubers	ORGANIZATION	0.98+
three years	QUANTITY	0.98+
650 respondents	QUANTITY	0.98+
pandemic	EVENT	0.97+
this year	DATE	0.97+
15%	QUANTITY	0.97+
Rocket	ORGANIZATION	0.97+
more than one cloud	QUANTITY	0.97+
25%	QUANTITY	0.97+
Tony bear	PERSON	0.97+
around five	QUANTITY	0.96+
two thirds	QUANTITY	0.96+
about a half	QUANTITY	0.96+

Wrap with Stu Miniman | Red Hat Summit 2022

(bright music) >> Okay, we're back in theCUBE. We said we were signing off for the night, but during the hallway track, we ran into old friend Stu Miniman who is the Director of Market Insights at Red Hat. Stu, friend of theCUBE, done thousands of CUBE interviews. >> Dave, it's great to be here. Thanks for pulling me on, you and I hosted Red Hat Summit before. It's great to see Paul here. I was actually, I was talking to some of the Red Hatters walking around Boston. It's great to have an event here. Boston's got strong presence and I understand, I think it was either the first or second year, they had it over... What's the building they're tearing down right down the road here? Was that the World Trade Center? I think that's where they actually held it, the first time they were here. We hosted theCUBE >> So they moved up. >> at the Hynes Convention Center. We did theCUBE for summit at the BCEC next door. And of course, with the pandemic being what it was, we're a little smaller, nice intimate event here. It's great to be able to roam the hall, see a whole bunch of people and lots watching online. >> It's great, it's around the same size as those, remember those Vertica Big Data events that we used to have here. And I like that, you were commenting on the theater in the round this morning for the keynotes, that was good. And the keynotes being compressed, I think, is real value for the attendees, you know? 'Cause people come to these events, they want to see each other, you know? They want to... It's like the band getting back together. And so when you're stuck in the keynote room, it's like, "Oh, it's okay, it's time to go." >> I don't know that any of us are used to sitting, at home where I could just click to another tab or pause it or run for, do something for the family, or a quick bio break. It's the three-hour keynote I hope has been retired. 
>> But it's an interesting point though, that the virtual event really is driving the physical and this, the way Red Hat marketed this event was very much around the virtual attendee. Physical was almost an afterthought, so. >> Right, this is an invite only for in-person. So you're absolutely right. It's optimizing the things that are being streamed, the online audience is the big audience. And we're just happy to be in here to clap and do some things, see around what you're doing. >> Wonderful to see that becoming the norm. >> I think, like, virtual, Stu, you know this well, when virtual first came in, nobody had a clue with what they were doing. It was really hard. They tried different things, they tried to take the physical and just jam it into the virtual. That didn't work, they tried doing fun things. They would bring in a famous person or a comedian. And that kind of worked, I guess, but everybody showed up for that and then left. And I think they're trying to figure out what this hybrid thing is. I've seen it both ways. I've seen situations like this, where they're really sensitive to the virtual. I've seen others where that's the FOMO of the physical, people want physical. So, yeah, I think it depends. I mean, re:Invent last year was heavy physical. >> Yeah, with 15,000 people there. >> Pretty long keynotes, you know? So maybe Amazon can get away with it, but I think most companies aren't going to be able to. So what is the market telling you? What are these insights? >> So Dave, just talking about Amazon, obviously, the world I live in is cloud and that discussion of cloud, the journey that customers are going on is where we're spending a lot of the discussions. So, it was great to hear in the keynote, talked about our deep partnerships with the cloud providers and what we're doing to help people with, you like to call it super cloud, some call it hybrid, or multi-cloud... >> New name. (crosstalk) Meta-Cloud, come on. 
>> All right, you know if Che's my executive, so it's wonderful. >> Love it. >> But we'll see, if I could put on my VR Goggles and that will help me move things. But I love like the partnership announcement with General Motors today because not every company has the needs of software driven electric vehicles all over the place. But the technology that we build for them actually has ramifications everywhere. We've been working to take Kubernetes and make it smaller over time. So things that we do at the edge benefit the cloud, benefit what we do in the data center, it's that advancement of science and technology just lifts all boats. >> So what's your take on all this? The EV and software on wheels. I mean, Tesla obviously has a huge lead. It's kind of like the Amazon of vehicles, right? It's sort of inspired a whole new wave of innovation. Now you've got every automobile manufacturer kind of going after it. Is the future of vehicles something you followed or something you have an opinion on, Stu? >> Absolutely. It's driving innovation in some ways, the way DOS drove innovation on the desktop, if you remember the 640K DOS limit, for years, that was... The software developers came up with some amazing ways to work within that 640K limit. Then when it was gone, we got bloatware, but it actually does enforce a level of discipline on you to try to figure out how to make software run better, run more efficiently. And that has upstream impacts on the enterprise products. >> Well, right. So following your analogy, you talk about the enablement to the desktop, Linux was a huge influence on allowing the individual person to write code and write software, and what's happening in the EV, it's a software platform. All of these innovations that we're seeing across industries, it's how is software transforming things. We go back to the Marc Andreessen, software's eating the world, open source is the way that software is developed. Who's at the intersection of all those? 
We think we have a nice part to play in that. I loved tha- Dave, I don't know if you caught at the end of the keynote, Matt Hicks basically said, "Our mission isn't just to write enterprise software. "Our mission is based off of open source because open source unlocks innovation for the world." And that's one of the things that drew me to Red Hat, it's not just tech in good places, but allowing underrepresented, different countries to participate in what's happening with software. And we can all move that ball forward. >> Well, can we declare victory for open source because it's not just open source products, but everything that's developed today, whether proprietary or open has open source in it. >> Paul, I agree. Open source is the development model period, today. Are there some places that there's proprietary? Absolutely. But I had a discussion with Deepak Singh who's been on theCUBE many times. He said like, our default is, we start with open source code. I mean, even Amazon when you start talking about that. >> I said this, the $70 billion business on open source. >> Exactly. >> Necessarily give it back, but that say, Hey, this is... All's fair in tech and more. >> It is interesting how the managed service model has sort of rescued open source, open source companies, that were trying to do the Red Hat model. No one's ever really successfully duplicated the Red Hat model. A lot of companies were floundering and failing. And then the managed service option came along. And so now they're all cloud service providers. >> So the only thing I'd say is that there are some other peers we have in the industry that are built off open source they're doing okay. The recent example, GitLab and Hashicorp, both went public. Hashi is doing some managed services, but it's not the majority of their product. Look at a company like Mongo, they've heavily pivoted toward the managed service. It is where we see the largest growth in our area. 
The products that we have again with Amazon, with Microsoft, huge growth, lots of interest. It's one of the things I spend most of my time talking on. >> I think Databricks is another interesting example 'cause Cloudera was the now company and they had the sort of open core, and then they had the proprietary piece, and that obviously didn't work. Databricks, when they developed Spark out of Berkeley, everybody thought they were going to do kind of a similar model. Instead, they went all in for managed services. And it's really worked well, I think they were ahead of that curve and you're seeing it now, it's what customers want. >> Well, I mean, Dave, you cover the database market pretty heavily. How many different open source database options are there today? And that's one of the things we're solving. When you look at what is Red Hat doing in the cloud? Okay, I've got lots of databases. Well, we have something called, it's Red Hat Open Database Access, which is from a developer, I don't want to have to think about, I've got six different databases, which one, where's the repository? How does all that happen? We give that consistency, it's tied into OpenShift, so it can help abstract some of those pieces. We've got same Kafka streaming and we've got APIs. So it's frameworks and enablers to help bridge that gap between the complexity that's out there, in the cloud and for the developer tool chain. >> That's a really important role you guys play though because you had this proliferation, you mentioned Mongo. So many others, Presto and Starburst, et cetera, so many other open source options out there now. And companies, developers want to work with multiple databases within the same application. And you have a role in making that easy. >> Yeah, so and that is, if you talk about the question I get all the time is, what's next for Kubernetes? Dave, you and I did a preview for KubeCon and it's automation and simplicity that we need to be.
It's not enough to just say, "Hey, we've got APIs." It's like Dave, we used to say, "We've got standards? Great." Everybody's implementation was a little bit different. So we have API Sprawl today. So it's building that ecosystem. You've been talking to a number of our partners. We are very active in the community and trying to do things that can lift up the community, help the developers, help that cloud native ecosystem, help our customers move faster. >> Yeah API's better than scripts, but they got to be managed, right? So, and that's really what you guys are doing that's different. You're not trying to own everything, right? It's sort of antithetical to how billions and trillions are made in the IT industry. >> I remember a few years ago we talked here, and you look at the size that Red Hat is. And the question is, could Red Hat have monetized more if the model was a little different? It's like, well maybe, but that's not the why. I love that they actually had Simon Sinek come in and work with Red Hat and that open, unlocks the world. Like that's the core, it's the why. When I join, they're like, here's a book of Red Hat, you can get it online and that why of what we do, so we never have to think of how do we get there. We did an acquisition in the security space a year ago, StackRox, took us a year, it's open source. Stackrox.io, it's community driven, open source project there because we could have said, "Oh, well, yeah, it's kind of open source and there's pieces that are open source, but we want it to be fully open source." You just talked to Gunnar about how he's RHEL nine, based off CentOS stream, and now developing out in the open with that model, so. >> Well, you were always a big fan of Whitehurst culture book, right? It makes a difference. >> The open organization and right, Red Hat? That culture is special. It's definitely interesting. So first of all, most companies are built with the hierarchy in mind. 
Had a friend of mine that when he joined Red Hat, he's like, I don't understand, it's almost like you have like lots of individual contractors, all doing their things 'cause Red Hat works on thousands of projects. But I remember talking to Rackspace years ago when OpenStack was a thing and they're like, "How do you figure out what to work on?" "Oh, well we hired great people and they work on what's important to them." And I'm like, "That doesn't sound like a business." And he is like, "Well, we struggle sometimes to that balance." Red Hat has found that balance because we work on a lot of different projects and there are people inside Red Hat that are, you know, they care more about the project than they do the business, but there's the overall view as to where we participate and where we productize because we're not creating IP because it's all an open source. So it's the monetizations, the relationships we have our customers, the ecosystems that we build. And so that is special. And I'll tell you that my line has been Red Hat on the inside is even more Red Hat. The debates and the discussions are brutal. I mean, technical people tearing things apart, questioning things and you can't be thin skinned. And the other thing is, what's great is new people. I've talked to so many people that started at Red Hat as interns and will stay for seven, eight years. And they come there and they have as much of a seat at the table, and when I talk to new people, your job, is if you don't understand something or you think we might be able to do it differently, you better speak up because we want your opinion and we'll take that, everybody takes that into consideration. It's not like, does the decision go all the way up to this executive? And it's like, no, it's done more at the team. >> The cultural contrast between that and your parent, IBM, couldn't be more dramatic. 
And we talked earlier with Paul Cormier about has IBM really walked the walk when it comes to leaving Red Hat alone. Naturally he said, "Yes." Well what's your perspective. >> Yeah, are there some big blue people across the street or something I heard that did this event, but look, do we interact with IBM? Of course. One of the reasons that IBM and IBM Services, both products and services should be able to help get us breadth in the marketplace. There are times that we go arm and arm into customer meetings and there are times that customers tell us, "I like Red Hat, I don't like IBM." And there's other ones that have been like, "Well, I'm a long time IBM, I'm not sure about Red Hat." And we have to be able to meet all of those customers where they are. But from my standpoint, I've got a Red Hat badge, I've got a Red Hat email, I've got Red Hat benefits. So we are fiercely independent. And you know, Paul, we've done blogs and there's lots of articles been written is, Red Hat will stay Red Hat. I didn't happen to catch Arvin I know was on CNBC today and talking at their event, but I'm sure Red Hat got mentioned, but... >> Well, he talks about Red Hat all time. >> But in his call he's talking backwards. >> It's interesting that he's not here, greeting this audience, right? It's again, almost by design, right? >> But maybe that's supposed to be... >> Hundreds of yards away. >> And one of the questions being in the cloud group is I'm not out pitching IBM Cloud, you know? If a customer comes to me and asks about, we have a deep partnership and IBM will be happy to tell you about our integrations, as opposed to, I'm happy to go into a deep discussion of what we're doing with Google, Amazon, and Microsoft. So that's how we do it. It's very different Dave, from you and I watch really closely the VMware-EMC, VMware-Dell, and how that relationship. This one is different. 
We are owned by IBM, but we mostly, does IBM fund initiatives and have certain strategic things that are done? Absolutely. But we maintain Red Hat. >> But there are similarities. I mean, VMware crowd didn't want to talk about EMC, but they had to, they were kind of forced to. Whereas, you're not being forced to. >> And then once Dell came in there, it was joint product development. >> I always thought a spin-in would've been more effective, of course, Michael Dell and Egon wouldn't have gotten their $40 billion out. But I think a spin-in was more natural based on where they were going. And it would've been, I think, a more dominant position in the marketplace. They would've had more software, but again, financially it wouldn't have made as much sense, but that whole dynamic is different. I mean, but people said they were going to look at VMware as a model and it's been largely different because remember, VMware of course was a separate company, now is a fully separate company. Red Hat was integrated, we thought, okay, are they going to get blue washed? We're watching and watching, and watching, you had said, well, if the Red Hat culture isn't permeating IBM, then it's a failure. And I don't know if that's happening, but it's definitely... >> I think a long time for that. >> It's definitely been preserved. >> I mean, Dave, I know I read one article at the beginning of the year, it was, can Arvin make IBM Microsoft Junior? Follow the same turnaround that Satya Nadella drove over there. IBM, I think, is making some progress, I mean, I read and watch what you and the team are all writing about it. And I'll withhold judgment on IBM. Obviously, there's certain financial things that we'd love to see IBM succeed. We worry about our business. We do our thing and IBM shares our results and they've been solid, so. >> Microsoft had such massive cash flow that even Ballmer couldn't screw it up. Well, I mean, this is true, right?
I mean, you think about how irrelevant Microsoft was in the conversation during his tenure and yet they never got really... They maintained a position so that when Nadella came in, they were able to reascend and now are becoming that dominant player. I mean, IBM just doesn't have that cash flow and that luxury, but I mean, if he pulls it off, he'll be the CEO of the decade. >> You mentioned partners earlier, big concern when the acquisition was first announced, was that the Dells and the HPs and the such wouldn't want to work with Red Hat anymore, you've sort of been here through that transition. Is that an issue? >> Not that I've seen, no. I mean, the hardware suppliers, the ISVs, the GSIs are all very important. It was great to see, I think you had Accenture on theCUBE today, obviously very important partner as we go to the cloud. IBM's another important partner, not only for IBM Cloud, but IBM Services, deep partnership with Azure and AWS. So those partners and from a technology standpoint, the cloud native ecosystem, we talked about, it's not just a Red Hat product. I constantly have to talk about, look, we have a lot of pieces, but your developers are going to have other tools that they're going to use and the security space. There is no such thing as a silver bullet. So I've been having some great conversations here already this week with some of our partners that are helping us to round out that whole solution, help our customers because it has to be, it's an ecosystem. And we're one of the drivers to help that move forward. >> Well, I mean, we were at Dell Tech World last week, and there's a lot of talk about DevSecOps and DevOps and Dell being more developer friendly. Obviously they got a long way to go, but you can't take that posture and not have a relationship with Red Hat. If all you got is Pivotal and VMware, and Tanzu... >> I was thrilled to hear the OpenShift mention in the keynote when they talked about what they were doing.
>> How could you not, how could you have any credibility if you're just like, Oh, Pivotal, Pivotal, Pivotal, Tanzu, Tanzu. Tanzu is doing its thing. And it's a smart strategy. >> VMware is also a partner of ours, but we would hope that with VMware being independent, that does open the door for us to do more with them. >> Yeah, because you guys have had a weird relationship with them, under ownership of EMC and then Dell, right? And then the whole IBM thing. But it's just a different world now. Ecosystems are forming and reforming, and Dell's building out its own cloud and it's got to have... Look at Amazon, I wrote about this. I said, "Can you envision the day where Dell actually offers competitive products in its suite, in its service offering?" I mean, it's hard to see, they're not there yet. They're not even close. And they have this high say/do ratio, or really it's a low say/do, they say high say/do, but look at what they did with Nutanix. You look over- (chuckles) would tell if it's the Cisco relationship. So it's got to get better at that. And it will, I really do believe. That's new thinking and same thing with HPE. And, I don't know about Lenovo, that's not as much of an ecosystem play, but certainly Dell and HPE. >> Absolutely. Michael Dell would always love to poke at HPE and HP really went very far down the path of their own products. They went away from their services organization that used to be more like IBM, that would offer lots of different offerings and very much, it was HP Invent. Well, if we didn't invent it, you're not getting it from us. So Dell, we'll see, as you said, the ecosystems are definitely forming, converging and going in lots of different directions. >> But your position is, Hey, we're here, we're here to help. >> Yeah, we're here. We have customers, one of the best proof points I have is the solution that we have with Amazon.
Amazon doesn't do the engineering work to make us a native offering if they don't have the customer demand because Amazon's driven off of data. So they came to us, they worked with us. It's a lot of work to be able to make that happen, but you want to make it frictionless for customers so that they can adopt that. That's a long path. >> All right, so evening event, there's a customer event this evening upstairs in the lobby. Microsoft is having a little shindig, and then there are a lot of customer dinners going on. So Stu, we'll see you out there tonight. >> All right, thank you. >> We're watching a brewing somewhere. >> Keynotes tomorrow, a lot of good sessions and enablement, and yeah, it's great to be in person to be able to bump some people, meet some people and, Hey, a year and a half in, I'm still meeting a lot of my peers in person for the first time. >> Yeah, and that's kind of weird, isn't it? Imagine. And then we kick off tomorrow at 10:00 AM. Actually, Stephanie Chiras is coming on. There she is in the background. She's always a great guest and maybe do a little kickoff and have some fun tomorrow. So this is Dave Vellante for Stu Miniman, Paul Gillin, who's my co-host. You're watching theCUBE's coverage of Red Hat Summit 2022. We'll see you tomorrow. (bright music)

Published Date : May 11 2022



Breaking Analysis: What you May not Know About the Dell Snowflake Deal

>> From theCUBE Studios in Palo Alto and Boston, bringing you Data Driven Insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. >> In the pre-cloud era hardware companies would run benchmarks, showing how database and/or application performance ran better on their systems relative to competitors or previous generation boxes. And they would make a big deal out of it. And the independent software vendors, you know, they'd do a little golf clap, if you will, in the form of a joint press release. It became a game of leapfrog amongst hardware competitors. That was pretty commonplace over the years. The Dell Snowflake Deal underscores that the value proposition between hardware companies and ISVs is changing and has much more to do with distribution channels, volumes and the amount of data that lives On-Prem in various storage platforms. For cloud native ISVs like Snowflake, they're realizing that despite their Cloud only dogma they have to grit their teeth and deal with On-premises data or risk getting shut out of evolving architectures. Hello and welcome to this week's Wikibon Cube Insights powered by ETR. In this breaking analysis, we unpack what little is known about the Snowflake announcement from Dell Technologies World and discuss the implications of a changing Cloud landscape. We'll also share some new data for Cloud and Database platforms from ETR that shows Snowflake has actually entered the Earth's orbit when it comes to spending momentum on its platform. Now, before we get into the news I want you to listen to Frank Slootman's answer to my question as to whether or not Snowflake would ever architect the platform to run On-Prem, because it's doable technically. Here's what he said, play the clip. >> Forget it, this will only work in the Public Cloud. Because it's, this is how the utility model works, right. I think everybody is coming through this realization, right? I mean, excuses are running out at this point.
You know, we think that it'll, people will come to the Public Cloud a lot sooner than we will ever come to the Private Cloud. It's not that we can't run a private Cloud. It just diminishes the potential and the value that we bring. >> So you may be asking yourselves how do you square that circle? Because basically the Dell Snowflake announcement is about bringing Snowflake to the private cloud, right? Or is it? Let's get into the news and we'll find out. Here's what we know at Dell Technologies World. One of the more buzzy announcements was the, by the way this was a very well attended event. I should say about, I would say, 8,000 people by my estimates. But anyway, one of the more buzzy announcements was Snowflake can now run analytics on Non-native Snowflake data that lives On-prem in a Dell object store, Dell's ECS to start with. And eventually its software defined object store. Here's Snowflake's Clark Patterson describing how it works this past week on theCUBE. Play the clip. The way it works is I can now access Non-native Snowflake data using what, materialized views, external tables? How does that work? >> Some combination of the, all the above. So we've had in Snowflake, a capability called External Tables, which you refer to, it goes hand in hand with this notion of external stages. Basically, through the combination of those two capabilities, it's a metadata layer on data, wherever it resides. So customers have actually used this in Snowflake for data lake data outside of Snowflake in the Cloud, up until this point. So it's effectively an extension of that functionality into the Dell On-Premises world, so that we can tap into those things. So we use the external stages to expose all the metadata about what's in the Dell environment. And then we build external tables in Snowflake. So that data looks like it is in Snowflake.
And then the experience for the analyst or whomever it is, is exactly as though that data lives in the Snowflake world. >> So as Clark explained, this capability of External tables has been around in the Cloud for a while, mainly to suck data out of Cloud data lakes. Snowflake External Tables use file level metadata, for instance, the name of the file and the versioning so that it can be queried in a stage. A stage is just an external location outside of Snowflake. It could be an S3 bucket or an Azure Blob and soon it will be a Dell object store. And in using this feature, the Dell data looks like it lives inside of Snowflake. And Clark, essentially, he's correct to say that to an analyst it looks exactly like the data is in Snowflake, but, uh, not exactly. The data's read only, which means you can't do what are called DML operations. DML stands for Data Manipulation Language and allows for things like inserting data into tables or deleting and modifying existing data. But the data can be queried. However, the performance of those queries to External Tables will almost certainly be slower. Now users can build things like materialized views which are going to speed things up a bit, but at the end of the day, it's going to run faster in the Cloud. And you can be almost certain that's where Snowflake wants it to run, but some organizations can't or won't move data into the Cloud for a variety of reasons, data sovereignty, compliance, security policies, culture, you know, whatever. So data can remain in place On-prem, or it can be moved into the Public Cloud with this new announcement. Now, the compute today presumably is going to be done in the Public Cloud. I don't know where else it's going to be done. They really didn't talk about the compute side of things. Remember, one of Snowflake's early innovations was to separate compute from storage. And what that gave them is you could more efficiently scale with unlimited resources when you needed them.
And you could shut off the compute when you don't need it. If you need more storage you didn't have to buy more compute and vice versa. So everybody in the industry has copied that including AWS with Redshift, although as we've reported not as elegantly as Snowflake did. Redshift's more of a storage tiering solution which minimizes the compute required but you can't really shut it off. And there are companies like Vertica with Eon Mode that have enabled this capability to be done On-prem, you know, but of course in that instance you don't have unlimited elastic compute scale On-Prem, but with solutions like Dell Apex and HPE GreenLake, you can certainly, you can start to simulate that Cloud elasticity On-prem. I mean, it's not unlimited but it sort of gets you there. According to a Dell Snowflake joint statement, the companies, quote, will pursue product integrations and joint go to market efforts in the second half of 2022. So that's a little vague and kind of benign. It's not really clear when this is going to be available based on that statement from the two firms, but, you know, we're left wondering will Dell develop an On-Prem compute capability and enable queries to run locally, maybe as part of an extended Apex offering? I mean, we don't know, really not sure there's even a market for that, but it's probably a good bet that again, Snowflake wants that data to land in the Snowflake Data Cloud. Kind of makes you wonder how this deal came about. You heard Slootman earlier. Snowflake has always been pretty dogmatic about getting data into its native Snowflake format to enable the best performance as we talked about but also data sharing and governance. But you could imagine that data architects, they're building out their data mesh, we've reported on this quite extensively, and their data fabric and those visions around that.
And they're probably telling Snowflake, Hey, if you want to be a strategic partner of ours you're going to have to be more inclusive of our data, that for whatever reason we're not putting in your Cloud. So Snowflake had to kind of hold its nose and capitulate. Now the good news is it further opens up Snowflake's TAM, the total available market. It's obviously good marketing posture. And ultimately it provides an on ramp to the Cloud. And we're going to come back to that shortly but let's look a little deeper into what's happening with data platforms and to do that we'll bring in some ETR data. Now, let me just say as companies like Dell, IBM, Cisco, HPE, Lenovo, Pure and others build out their hybrid Clouds, the cold hard fact is not only do they have to replicate the Cloud Operating Model. You will hear them talk about that a lot, but they got to do that. And that's critical from a user experience, but in order to gain that flywheel momentum they need to build a robust ecosystem that goes beyond their proprietary portfolios. And, you know, honestly they're really not even in the first inning, most companies, and for the likes of Snowflake to sort of flip this, they've had to recognize that not everything is moving into the Cloud. Now, let's bring up the next slide. One of the big areas of discussion at Dell Tech World was Apex. That's essentially Dell's nascent as a service offering. Apex is infrastructure as a Service Cloud On-prem and obviously has the vision of connecting to the Cloud and across Clouds and out to the Edge. And it's no secret that database is one of the most important ingredients of infrastructure as a service generally, in Cloud Infrastructure specifically. So this chart here shows the ETR data for data platforms inside of Dell accounts. So the beauty of the ETR platform is you can cut data a million different ways. So we cut it. We said, okay, give us the Cloud platforms inside Dell accounts, how are they performing?
Now, this is a two dimensional graphic. You got net score or spending momentum on the vertical axis and what ETR now calls Overlap, formerly called Market Share, which is a measure of pervasiveness in the survey. That's on the horizontal axis. That red dotted line at 40% represents highly elevated spending on the Y. The table insert shows the raw data for how the dots are positioned. Now, the first call out here is Snowflake. According to ETR, quote, after 13 straight surveys of astounding net scores, Snowflake has finally broken the trend with its net score dropping below the 70% mark among all respondents. Now, as you know, net score is measured by asking customers: are you adding the platform new? That's the lime green in the bar that's pointing from Snowflake in the graph. Or are you increasing spend by 6% or more? That's the forest green. Is spending flat? That's the gray. Is your spend decreasing by 6% or worse? That's the pinkish. Or are you decommissioning the platform? That's the bright red, which is essentially zero for Snowflake. Subtract the reds from the greens and you get a net score. Now, what's somewhat interesting is that Snowflake's net score overall in the survey is 68, which is still huge, just under 70%, but its net score inside the Dell account base drops to the low sixties. Nonetheless, this chart tells you why Snowflake, its highly elevated spending momentum combined with an increasing presence in the market over the past two years, makes it a perfect initial data platform partner for Dell. Now, in the Ford versus Ferrari dynamic that's going on between the likes of Dell's Apex and HPE GreenLake, database deals are going to become increasingly important beyond what we're seeing with this recent Snowflake deal. Now notice, by the way, HPE is positioned on this graph with its acquisition of MapR, which is now part of HPE Ezmeral.
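The net score arithmetic described above (greens minus reds, with flat spend ignored) can be sketched in a few lines of Python. This is a minimal illustration, not ETR's actual methodology or code: the class and function names are made up for this example, the percentages are hypothetical and are not ETR survey data, and treating the 40% line as a strict "greater than" cutoff is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpendBreakdown:
    """Share of survey respondents, in percent, per spending bucket (hypothetical type)."""
    adopting_new: int     # lime green: adding the platform new
    increasing: int       # forest green: spend up 6% or more
    flat: int             # gray: spend roughly flat
    decreasing: int       # pinkish: spend down 6% or worse
    decommissioning: int  # bright red: retiring the platform


# The red dotted line on the chart; the strict comparison below is an assumption.
HIGHLY_ELEVATED = 40


def net_score(b: SpendBreakdown) -> int:
    """Greens minus reds; the flat bucket does not move the score."""
    return (b.adopting_new + b.increasing) - (b.decreasing + b.decommissioning)


def is_highly_elevated(score: int) -> bool:
    """True when the score sits above the 40% line."""
    return score > HIGHLY_ELEVATED


# Hypothetical percentages for illustration only, NOT ETR survey data.
example = SpendBreakdown(adopting_new=20, increasing=45, flat=30,
                         decreasing=4, decommissioning=1)
print(net_score(example))
print(is_highly_elevated(net_score(example)))
```

With these made-up inputs the greens total 65 and the reds total 5, so the sketch reports a net score of 60, which would sit above the 40% line.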
But if these companies want to be taken seriously as Cloud players, they need to further expand their database affinity to compete ideally spinning up databases as part of their super Clouds. We'll come back to that that span multiple Clouds and include Edge data platforms. We're a long ways off from that. But look, there's Mongo, there's Couchbase, MariaDB, Cloudera or Redis. All of those should be on the short list in my view and why not Microsoft? And what about Oracle? Look, that's to be continued on maybe as a future topic in a, in a Breaking Analysis but I'll leave you with this. There are a lot of people like John Furrier who believe that Dell is playing with fire in the Snowflake deal because he sees it as a one way ticket to the Cloud. He calls it a one way door sometimes listen to what he said this past week. >> I would say that that's a dangerous game because we've seen that movie before, VMware and AWS. >> Yeah, but that we've talked about this don't you think that was the right move for VMware? >> At the time, but if you don't nurture the relationship AWS will take all those customers ultimately from VMware. >> Okay, so what does the data say about what John just said? How is VMware actually doing in Cloud after its early missteps and then its subsequent embracing of AWS and other Clouds. Here's that same XY graphic spending momentum on the Y and pervasiveness on the X and the same table insert that plots the dots and the, in the breakdown of Dell's net score granularity. You see that at the bottom of the chart in those colors. So as usual, you see Azure and AWS up and to the right with Google well behind in a distant third, but still in the mix. So very impressive for Microsoft and AWS to have both that market presence in such elevated spending momentum. But the story here in context is that the VMware Cloud on AWS and VMware's On-Prem Cloud like VMware Cloud Foundation VCF they're doing pretty well in the market. 
Look at HPE, gaining some traction in Cloud. And remember, you may not think HPE and Dell and VCF are true Cloud, but these are customers answering the survey, so their perspective matters more than the purist view. And the bad news is the Dell Cloud is not setting the world on fire from a momentum standpoint on the vertical axis, but it's above the line of zero, and compared to Dell's overall net score of 20, you can see it's got some work to do. Okay, so overall Dell's got a pretty solid net score, you know, positive 20, but as I say, their Cloud perception needs to improve. Look, Apex has to be the Dell Cloud brand, not Dell reselling VMware. And that requires more maturity of Apex: its feature sets, its selling partners, its compensation models, and its ecosystem. And I think Dell clearly understands that. I think they're pretty open about that. Now this includes partners that go beyond being just sellers; it has to include more tech offerings in the marketplace. And actually they've got to build out a marketplace, like a Cloud platform. So they've got a lot of work to do there. And look, you've got Oracle coming up. I mean, they're actually just below the magic 40% line, which is pretty impressive. And we've been telling you for years, you can hate Oracle all you want. You can hate its price, its closed system, all of that, its red stack, sure. You can say it's legacy. You can say it's old and outdated, blah, blah, blah. You can say Oracle is irrelevant, in trouble. You are dead wrong. When it comes to mission-critical workloads, Oracle is the king of the hill. They're a founder-led company that knows exactly what it's doing, and they're showing Cloud momentum. Okay, the last point is that while Microsoft, AWS, and Google have major presence, as shown on the X axis, VMware and Oracle now have more than a hundred citations in the survey. You can see that on the insert in the right-most column.
And IBM had better keep the momentum from last quarter going, or it won't be long before they get passed by Dell and HP in Cloud. So look, John might be right. And I would think Snowflake quietly agrees that this Dell deal is all about access to Dell's customers and their data, so they can hoover it into the Snowflake Data Cloud. But the data, right now anyway, doesn't suggest that's happening with VMware. Oh, by the way, we're keeping a close eye on NetApp, who last September inked a similar deal to VMware Cloud on AWS, to see how that fares. Okay, let's wrap with some closing thoughts on what this deal means. We learned a lot from the Cloud generally and AWS specifically: two-pizza teams, working backwards, customer obsession. We talk about flywheel all the time, and we've been talking today about marketplaces. These have all become common parlance and often fundamental narratives within strategic plans, investor decks, and customer presentations. Cloud ecosystems are different. They take both competition and partnerships to new heights. You know, when I look at as-a-service offerings like Apex, GreenLake, and similar services, and I see the vendor noise, or hear the vendor noise, that's being made around them, I kind of shake my head and ask, you know, which movie were these companies watching last decade? I really wish we would've seen these initiatives start to roll out in 2015, three years before AWS announced Outposts, not three years after. But hey, the good news is that not only was Outposts a wake-up call for the On-Prem crowd, but it's showing how difficult it is to build a platform like Outposts and bring it to On-Premises. I mean, Outposts isn't currently even a rounding error in the marketplace. It really doesn't do much in terms of database support and support of other services. And, you know, it's unclear where that is going. And I don't think it has much momentum. And so the Hybrid Cloud vendors, they've had time to figure it out.
But now it's game on. Companies like Dell, they're promising a consistent experience from On-Prem into the Cloud, across Clouds, and out to the Edge. They call it MultiCloud, which by the way, in my view, has really been multi-vendor. Chuck Whitten, who's the new co-COO of Dell, called it Multi-Cloud by default. (laughing) That's really, I think, an accurate description of that. I call this new world Super Cloud. To me, it's different than MultiCloud. It's a layer that runs on top of hyperscale infrastructure and kind of hides the underlying complexity of the Cloud, its APIs, its primitives. And it stretches not only across Clouds but out to the Edge. That's a big vision, and that's going to require some seriously intense engineering to build out. It's also going to require partnerships that go beyond the portfolios of companies like Dell, like their own proprietary stacks, if you will. It's going to have to replicate the Cloud Operating Model, and to do that, you're going to need more and more deals like Snowflake, and even deeper than Snowflake, not just in database. Sure, you'll need to have a catalog of databases that run in your On-Prem and Hybrid and Super Cloud, but also other services that customers can tap. I mean, can you imagine a day when Dell offers and embraces a directly competitive service inside of Apex? I have trouble envisioning that, you know, not with their historical posture. You think about companies like, you know, Nutanix or Cisco, where those relationships cooled quite quickly. But, you know, look, think about it. That's what AWS does. It offers, for instance, Redshift and Snowflake side by side, happily, and the Redshift guys, they probably hate Snowflake. I wouldn't blame them, but the EC2 folks, they love them. And Adam Selipsky understands that ISVs like Snowflake are a key part of the Cloud ecosystem.
Again, I have a hard time envisioning that occurring with Dell or even HPE, you know, maybe less so with HPE. But what does this imply? That the Edge will allow companies like Dell to do a reach-around on the Cloud and somehow create a new type of model that begrudgingly accommodates the Public Cloud but drafts off the new momentum of the Edge, which right now to these companies is kind of mostly telco and retail? It's hard to see that happening. I think it's got to evolve in a more comprehensive and inclusive fashion. What's much more likely is companies like Dell are going to substantially replicate that Cloud Operating Model for the pieces that they own, pieces that they control, which admittedly are big pieces of the market. But unless they're able to really tap that ecosystem magic, they're not going to be able to grow much beyond their existing install bases. You take that lime green we showed you earlier, that new adoption metric from ETR, as an example. By my estimates, AWS and Azure are capturing new accounts at a rate three to five times faster than Dell and HPE. And in the more mature US and EMEA markets, it's probably more like 10X. And a major reason is the Cloud's robust ecosystem and the optionality and simplicity of transaction that that is bringing to customers. Now, Dell for its part is a hundred-billion-dollar revenue company, and it has the capability to drive that kind of dynamic if it can pivot its partner ecosystem mindset from kind of resellers to Cloud services and technology optionality. Okay, that's it for now. Thanks to my colleague Stephanie Chan, who helped research topics for Breaking Analysis. Alex Myerson is on the production team. Kristen Martin and Cheryl Knight, and Rob Hof on editorial, they helped get the word out, and thanks to Jordan Anderson for the new Breaking Analysis branding and graphics package. Remember, these episodes are all available as podcasts wherever you listen.
All you do is search Breaking Analysis podcast. You can check out ETR's website at etr.ai. We publish a full report every week on wikibon.com and siliconangle.com. You want to get in touch? Email dave.vellante@siliconangle.com. You can DM me @dvellante. You can make a comment on our LinkedIn posts. This is Dave Vellante for theCUBE Insights, powered by ETR. Have a great week, stay safe, be well. And we'll see you next time. (upbeat music)

Published Date : May 7 2022

ENTITIES

Entity	Category	Confidence
Jordan Anderson	PERSON	0.99+
Stephanie Chan	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Dell	ORGANIZATION	0.99+
Clark Patterson	PERSON	0.99+
Alex Myerson	PERSON	0.99+
Dave Vellante	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Rob Hof	PERSON	0.99+
Lenovo	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
John	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
2015	DATE	0.99+
Google	ORGANIZATION	0.99+
Cheryl Knight	PERSON	0.99+
Clark	PERSON	0.99+
HP	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
Boston	LOCATION	0.99+
HPE	ORGANIZATION	0.99+
6%	QUANTITY	0.99+
Ford	ORGANIZATION	0.99+
three	QUANTITY	0.99+
40%	QUANTITY	0.99+
Chuck Whitten	PERSON	0.99+
VMware	ORGANIZATION	0.99+
Nutanix	ORGANIZATION	0.99+
Kristen Martin	PERSON	0.99+
Ferrari	ORGANIZATION	0.99+
Adam Selipsky	PERSON	0.99+
Earth	LOCATION	0.99+
13 straight surveys	QUANTITY	0.99+
70%	QUANTITY	0.99+
first	QUANTITY	0.99+
68	QUANTITY	0.99+
last quarter	DATE	0.99+
Redshift	TITLE	0.99+
siliconangle.com	OTHER	0.99+
theCUBE Studios	ORGANIZATION	0.99+
Snowflake	EVENT	0.99+
Snowflake	TITLE	0.99+
8,000 people	QUANTITY	0.99+
both	QUANTITY	0.99+
20	QUANTITY	0.99+
VCF	ORGANIZATION	0.99+
Snowflake	ORGANIZATION	0.99+

Neil Fowler, Micro Focus & Sabina Joseph, AWS | AWS re:Invent 2021

>>Welcome back to theCUBE's continuous live coverage of AWS re:Invent 2021, live from Las Vegas. I'm Lisa Martin. And it's so great to say that we are doing, with AWS and its massive ecosystem of partners, one of the most important hybrid tech events of the year. We've got two sets, over a hundred guests, two remote studios, lots going on. I've got an alumni back with me and a new guest. Please welcome back Sabina Joseph, the GM of technology partners at AWS, and Neil Fowler joins her; he's the GM of Micro Focus AMC. And you're going to tell me what AMC stands for. >>Application modernization and >>Connectivity. I love it. Awesome, guys. It's great to see you again in person. Thank you for having us. It's great to have the buzz. I know it's gonna be a little bit hard to hear, but great to have. AWS has done a phenomenal job of getting everyone in here safely. I want to give them kudos for that. Sabina, talk to me. It's been a while since I've seen you in person, but talk to me about your current role at AWS. What's going on? >>Yeah, so I'm the general manager for technology partnerships globally out of the Americas. We also help partners out of EMEA and APAC grow in the Americas. And one of the great examples of a successful partnership is Micro Focus, with their solutions across application modernization, security, database services, mainframes. >>And so from your perspective, through your lens, how do you think they're performing as a partner? Yes. >>So, um, first of all, kudos to Neil and the entire Micro Focus team. They have done a great job leaning in with a cloud-first strategy with SaaS solutions on AWS, and these solutions help customers across application modernization, application delivery, security, cyber resiliency, database services, and also IT performance management. And we've been working with them now for a few years.
And in fact, today we have actually 400 customer wins together, congratulations, and then also eight-digit annual recurring revenue. They have six active listings in Marketplace, and all of this is really helping customers move their workloads and modernize their workloads into AWS. >>We've seen such an acceleration, Neil, in digital transformation and cloud adoption. The pandemic has really been a forcing function for that. There are some silver linings, but talk to me about some of the things that you've seen at Micro Focus in the last 20 months or so. And how have you helped those 400 customers, you know, getting to that big ARR, how are you helping them with that acceleration? >>Well, I think, as you're saying, there have been lots of changes in the last 12 to 18 months, some of it brought on by the pandemic and the change in business, having to respond and deliver solutions more quickly to the market, as well as remote working. So optimizing costs in the economic environment, but also being more dynamic, it really has caused businesses to have to do something different just to be able to survive and serve their customers better. That was a >>Big thing that we saw in the very beginning. It was that survival mode. And then of course it wasn't too long before we started seeing those survivors really start to thrive. And you started seeing who were going to be the winners of tomorrow. 'Cause the thing is, every company these days is a data company. If it's not, it's going to be passed up by a competitor that's right there in the rearview mirror. >>For sure. And so we've got, you know, organizations running mainframes, you know, older applications, legacy applications, modernization. Where are most industries in terms of adopting the mindset, first of all, that they need to change? Well, I think across the whole industry, I mean, it doesn't matter whether it's retail.
I mean, if you think about airlines, when the pandemic hit, business went down, so unless they've got that elastic nature, that elasticity, to respond to it... But everyone had to bring in new services, new offerings very quickly. So it's the ability to innovate in their environments and bring more solutions to their customers in a really fast way. You know, they couldn't just sit there and work with what they had. They had to move forward just to be able to stay in business, but also be able to reduce the costs of what they're trying to do. So running and transforming at the same time. >>Absolutely. And so how can organizations integrate existing core applications with new technologies to really be able to thrive in today's dynamic market? >>We look at modernization overall. We think of it in sort of three different ways: application, process, and infrastructure. So with a move to cloud, that's the infrastructure modernization; they've immediately got access to far more scalable, dynamic, elastic compute resources, as well as all the technology platforms they have around. And then if you look at the application side, that's where the Micro Focus platform comes in. We can help customers actually move those applications forward in terms of making them available through APIs, maybe as a journey to microservices and cloud native. But once that core business logic and that data is available, it can be integrated into artificial intelligence and machine learning to actually round out the whole solution. So the final part of that is the process modernization. As they're developing these applications with new tools, new ranges in terms of where they can deploy on the AWS platform, they can automate the build, deployment, and operations, so that all those existing applications are running on a contemporary platform with full access to the technologies that are available.
>>That's fantastic and so necessary for businesses in any industry. So can you talk about some of the different business units of Micro Focus? Are there any in particular that you want to call out? >>Yeah, so we work with them across all of their business units, but some of them that come to my mind: of course, Neil and team are doing a great job with application modernization and connectivity, really helping customers modernize their applications. And as customers are modernizing the applications, their cyber resiliency business unit is helping customers secure those applications. And then they also have their IT operations management bridge product listed in Marketplace. And then just since September, their Vertica business unit launched Vertica Accelerator on AWS. So I think they have a very holistic story to help customers >>On AWS. Talk to me a little bit, Neil, about cyber resiliency. We have seen such a dramatic change in cybersecurity and the threat landscape in the last 20 months. I think I saw a stat recently that ransomware was up almost 11X in the first half of 2021. Every company is a data company, and that data has got to be secure. It's no longer a nice-to-have. That is a core requirement. How are you helping customers achieve that cyber resiliency? >>Well, the thing is, I mean, as you say, it's across the whole spectrum of cyber, from identity and access management through data encryption, through data protection. As you say, it's not a nice-to-have capability. You really have to have an integrated solution to be able to manage and control access, and also to generate the events and log it if anyone tries to get into the systems, because, you know, by the time you've discovered something, it's too late. So you really need a combined solution with multi-factor authentication to really take it to that next level.
And I mean, with ransomware as a service, cyber criminals are getting so much more sophisticated and also more brazen. There's so much money in it that the security front is, I think, even more interesting now than it's ever been. Talk to me about some joint customers and how you've helped them, together with AWS, achieve some of those key outcomes that you were talking about earlier. Well, I think >>Obviously AWS as a platform has quite a lot of technology solutions going in. What we often find with our customers is a lot of them are coming from an existing on-prem solution, so they need that hybrid model. So as part of taking that forward, it's being able to have that integrated solution that allows them to work both on-prem and as part of the cloud, with it all being hooked up, even down to, as they're developing the applications, doing static code analysis to help those applications be more secure with things like Fortify on Demand, as well as integrating internet security platform for multifactor. So I think, as you know, it's a combination of being able to bridge between all the different technologies, but have one single view to protect the whole real estate, with multiple layers for both external and internal threats. So that's the other thing you also need to take into account: being able to protect all layers with a multi-layered approach. >>Absolutely. But you're right. The internal threat is something that we don't talk about as much, but that is obviously a substantial problem for organizations in most, if not all, industries. Sabina, talk to me a little bit about, let's kind of get into the responsibilities that you have a little bit more. You've got responsibility for multiple solution segments at AWS. You told me before we went live, you have 50 meetings this week. My goodness. And since day one, it's all good. It's fun, fun. It is. Talk to me about AWS's approach to partnering.
What does it look like? What are some of the things that you think are really critical components? Yeah. >>So as you may have heard, at Amazon and AWS, we always start with the customer. We work backwards. When we are launching our products, our programs, or services, we really go and ask the customers: What do you want us to develop? Where do you want us to focus the resources? It takes a lot of discipline to do that, but it's something where we really want to walk the talk, and we use the same approach with our partners. When we started to work with Micro Focus, we really wanted to make sure that what we are working on together is what customers want, because we firmly believe that once you lay that foundation of that solution, you can scale your business a lot more quickly. Your story is a lot simpler, and the customers are going to find a lot of value in what you are doing together. So it's really all about the customer for us. It is >>Absolutely critical, right? That's the whole point, the whole reason that we're here. Now, talk to me a little bit about maybe some cultural alignment with AWS, that customer-first customer obsession. It sounds like at Micro Focus it's very similar. >>Absolutely. I mean, the way that we always think about how we're building our products, it's all around customer-centric innovation. So that aspect of trying to make sure that we understand what the business, what the customers are trying to do, to then help develop and deliver solutions that meet that. And that combination of the way that we look at it, from that infrastructure modernization and the range of technologies that are available, and that relentless focus on making customers successful, is so key. But we have to make sure that that collaboration works, to make sure that the solutions align and we're helping customers get there together. >>In your customer conversations,
I imagine they've changed quite a bit during the pandemic, with so many things being escalated to the C-suite, to the board. How important is that cultural alignment between AWS and Micro Focus from your customers' perspective? Is it something that comes up fairly often? Well, >>I think when you actually get a mismatch in culture, it's more obvious. So I don't think that necessarily people are looking for it to say, I need organizations, but if you're not thinking the same way, you're not behaving the same way and actually partnering. I think that partnering part of it is really important because you're both working together to come up with that desired outcome. So I think it's more obvious when it isn't a good match, as opposed to looking for that particular side. But I think that's a really key aspect in the sense of working together to help that customer be successful. >>Right? That's a great point that you bring up, that it's probably more obvious when it isn't working than when it's beautifully aligned, falling into place, and really focused on that customer. So what are some of the things that attendees can feel and see and learn at the Micro Focus booth at this year's re:Invent, Neil? >>As well as obviously the key one around application modernization, where we're looking at mainframe modernization on the site, we've got the full range at the Micro Focus booth in terms of cyber resilience, as well as our, uh, ITOM, IT operations management, or ADM portfolios. So we've got a lot of technologies which you can learn about in the booth, interactively, as well as our experts, to understand how we can do all these things and work together as part of the AWS platform to be able to deliver those solutions. >>Excellent. I'm sure there will be a plethora of knowledge shared at the booth there. Last question, Neil, for you: talk to me about the vision going forward with the partnership.
What are some of the things that you're looking forward to as we end 2021 and go into hopefully what is a better year, 2022? >>You know, one of the key things, you know, one of my passionate areas, is helping our customers really look at building the platform of the future. We can help solve our customers' problems today, but we're really trying to create that innovation platform going forward. So again, that combination of the technologies that we can bring to help our customers, and the breadth and the investment that AWS continues making in the platform, those two in combination really help us help our customers not just solve today's problems, but really move forward to be the platform for innovation for the next decade. >>And that's really critical, that future-ready state that is so undefined most of the time. I mean, none of us saw the pandemic coming, all right? That was a complete shock. But to be able to partner together to help your customers really set up the foundation to be innovative as things happen that we can't even predict is really critical. So congratulations on your 400 customer wins and your eight-digit ARR. That's fantastic. Guys, thank you so much for joining us on theCUBE, talking about the Micro Focus AWS partnership and all of the successes that you guys have had. Great job. And I hope that you have cough drops and a lot of water this week. Sabina, I hope you do too. Guys, thanks for joining me. Pleasure. For my guests, I'm Lisa Martin. You're watching theCUBE, the global leader in live tech coverage.

Published Date : Nov 30 2021

ENTITIES

Entity	Category	Confidence
Neil	PERSON	0.99+
Sabina	PERSON	0.99+
Neil Fowler	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Americas	LOCATION	0.99+
APAC	ORGANIZATION	0.99+
Lisa Martin	PERSON	0.99+
Las Vegas	LOCATION	0.99+
400 customers	QUANTITY	0.99+
40 pound	QUANTITY	0.99+
September	DATE	0.99+
2022	DATE	0.99+
50 meetings	QUANTITY	0.99+
AMC	ORGANIZATION	0.99+
400 customer	QUANTITY	0.99+
Jo	PERSON	0.99+
Sabina Joseph	PERSON	0.99+
next decade	DATE	0.99+
Microfocus	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
both	QUANTITY	0.98+
this week	DATE	0.98+
Micro Focus	ORGANIZATION	0.98+
one	QUANTITY	0.97+
first	QUANTITY	0.97+
tomorrow	DATE	0.97+
One	QUANTITY	0.97+
today	DATE	0.97+
two combinations	QUANTITY	0.96+
first strategy	QUANTITY	0.95+
pandemic	EVENT	0.93+
one single view	QUANTITY	0.93+
2021	DATE	0.93+
Brunel	TITLE	0.93+
this year	DATE	0.93+
six active listings	QUANTITY	0.92+
EMEA	ORGANIZATION	0.92+
two sets	QUANTITY	0.89+
day one	QUANTITY	0.88+
eight digit	QUANTITY	0.87+
last 20 months	DATE	0.87+
ways	QUANTITY	0.82+
Kate	PERSON	0.82+
almost 11 X	QUANTITY	0.81+
three	QUANTITY	0.8+
over a hundred guests	QUANTITY	0.76+
first half of 2021	DATE	0.75+
18	QUANTITY	0.75+
Invent	EVENT	0.69+
eight digit	QUANTITY	0.68+
SAS	ORGANIZATION	0.63+
ransomware	TITLE	0.59+
months	DATE	0.49+
last 12	DATE	0.49+

Maria Colgan & Gerald Venzl, Oracle | June CUBEconversation

(upbeat music) Developers have become the new king makers in the world of digital and cloud. The rise of containers and microservices has accelerated the transition to cloud native applications. A lot of people will talk about application architecture and the related paradigms and the benefits they bring for the process of writing and delivering new apps. But a major challenge continues to be the how and the what when it comes to accessing, processing, and getting insights from the massive amounts of data that we have to deal with in today's world. And with me are two experts from the data management world who will share with us how they think about the best techniques and practices based on what they see at large organizations who are working with data and developing so-called data-driven apps. Please welcome Maria Colgan and Gerald Venzl, two distinguished product managers from Oracle. Folks, welcome, thanks so much for coming on. >> Thanks for having us, Dave. >> Thank you very much for having us. >> Okay, Maria, let's start with you. So, we throw around this term data-driven, data-driven applications. What are we really talking about there? >> So data-driven applications are applications that work on a diverse set of data. So anything from spatial to sensor data, document data, as well as your usual transaction processing data. And what they're going to do is they'll generate value from that data in very different ways to a traditional application. So for example, they may use machine learning, so they are able to do product recommendations in the middle of a transaction. Or we could use graph to be able to identify an influencer within the community so we can target them with a specific promotion. It could also use spatial data to be able to help find the nearest stores to a particular customer.
And because these apps are deployed on multiple platforms, everything from mobile devices as well as standard browsers, they need a data platform that's going to be both secure, reliable and scalable. >> Well, so when you think about how the workloads are shifting I mean, we're not talking about, you know it's not anymore a world of just your ERP or your HCM or your CRM, you know kind of the traditional operational systems. You really are seeing an explosion of these new data oriented apps. You're seeing, you know, modeling in the cloud, you are going to see more and more inferencing, inferencing at the edge. But Maria maybe you could talk a little bit about sort of the benefits that customers are seeing from developing these types of applications. I mean, why should people care about data-driven apps? >> Oh, for sure, there's massive benefits to them. I mean, probably the most obvious one for any business regardless of the industry, is that they not only allow you to understand what your customers are up to, but they allow you to be able to anticipate those customer's needs. So that helps businesses maintain that competitive edge and retain their customers. But it also helps them make data-driven decisions in real time based on actual data rather than on somebody's gut feeling or basing those decisions on historical data. So for example, you can do real-time price adjustments on products based on demand and so forth, that kind of thing. So it really changes the way people do business today. >> So Gerald, you think about the narrative in the industry everybody wants to be a platform player all your customers they are becoming software companies, they are becoming platform players. Everybody wants to be like, you know name a company that is huge trillion dollar market cap or whatever, and those are data-driven companies. And so it would seem to me that data-driven applications, there's nobody, no company really shouldn't be data-driven. Do you buy that? 
>> Yeah, absolutely. I mean, naturally the whole industry is data-driven, right? Information technology is all about processing data and deriving information out of it. But when it comes to app development, I think there is a big push to kind of like, we have to do machine learning in our applications, we have to get insights from data. And when you actually take a step back, you see that there's of course many different kinds of applications out there as well, and that's not to be forgotten, right? So there are the usual front-end user interfaces where really all the application does is enter some piece of information that's stored somewhere, or perhaps a microservice that's not attached to a data tier at all but just receives or asks calls (indistinct). So I think it's not necessarily so important for every developer to kind of jump on a bandwagon that they have to be data-driven. But I think it's equally important for those applications and those developers that build applications that drive the business, that make business-critical decisions as Maria mentioned before. Those guys should really take a close look into what data-driven apps means and what the data tier can actually give to them. Because what we also see happening a lot is that a lot of the things that are well known and out there, just ready to use, are being reimplemented in the applications. And for those applications, they essentially just end up spending more time writing code that was already there, and then have to maintain and debug that code as well, rather than just going to market faster. >> Gerald, can you talk to the prevailing approaches that developers take to build data-driven applications? What are the ones that you see? Let's dig into that a little bit more and maybe differentiate the different approaches and talk about that. >> Yeah, absolutely. 
I think right now the industry is in two camps. It's like sort of a religious war going on, which you'll often see happening with different architectures and so forth. So we have single-purpose databases or data management technologies, which are technologies that are, as the name suggests, built around a single purpose. So, you know, a typical example would be your ordinary key-value store. And all a key-value store does is allow you to store and retrieve a piece of data, whatever that may be, really, really fast, but it doesn't really go beyond that. And then the other side of the house, or the other camp, would be multimodal databases, multimodal data management technologies. Those are technologies that allow you to store different types of data, different formats of data, in the same technology, in the same system, alongside each other. And, you know, when you look at the landscape out there of what we have from technology, pretty much any relational database, or any database really, has evolved into such a multimodal database. Whether that's MySQL that allows you to store JSON alongside relational, or even a MongoDB that gives you native graph support since (mumbles) as well, alongside the JSON support. >> Well, it's clearly a trend in the industry. We've talked about this a lot on theCUBE. We know where Oracle stands on this. I mean, you just mentioned MySQL, but I mean, Oracle Database you've been extending. You've mentioned JSON, we've got blockchain now in there, you're infusing, you know, ML and AI into the database, graph database capabilities, you know, on and on and on. We talked a lot about how we compared that to Amazon, which is kind of the right tool for the right job approach. So maybe you could talk about, you know, your point of view, the benefits for developers of using that converged database, if I can use that word, approach, being able to store multiple data formats? 
Why do you feel like that's a better approach? >> Yeah, I think on a high level it comes down to complexity. You are actually avoiding additional complexity, right? So not every use case that you have necessarily warrants having yet another data management technology or yet another specially built technology for managing that data, right? Many use cases that we see out there happily want to just store a JSON document, a piece of JSON, in a database and then perhaps retrieve it again afterwards, or write some simple queries over it. And you really don't have to get a new database technology or a NoSQL database into the mix, if you already have a database, just to fulfill that exact use case. You could just happily store that information as well in the database you already have. And what it really comes down to is the learning curve for developers, right? So as you use the same technology to store other types of data, you don't have to learn a new technology, you don't have to familiarize yourself with and learn new drivers. You don't have to find new frameworks and you don't have to know how to necessarily operate or best model your data for that database. You can essentially just reuse your knowledge of the technology as well as the libraries and code you have already built in house, perhaps in another application, perhaps, you know, a framework that you used against the same technology, because it is still the same technology. So it kind of all comes down again to avoiding complexity and not fragmenting across, you know, the many different technologies we have. If you were to look at the different data formats that are out there today, you know, you would end up with many different databases just to store them if you were to fully, religiously follow the single-purpose, best-built technology for every use case paradigm, right? 
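[Editor's sketch, not part of the conversation.] The point about keeping an occasional JSON document in the database you already run can be illustrated with SQLite through Python's standard sqlite3 module. SQLite only stands in for a multimodal relational engine here; the orders table, its columns, and the sample data are hypothetical, and the sketch assumes the local SQLite build includes the JSON functions (json_extract).

```python
import json
import sqlite3

# One ordinary relational table; the "details" column holds a JSON document
# as text, stored alongside the regular columns. All names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " details TEXT NOT NULL)"
)

sample_orders = [
    (1, "alice", {"item": "laptop", "qty": 1, "gift": True}),
    (2, "bob", {"item": "mouse", "qty": 3, "gift": False}),
    (3, "alice", {"item": "cable", "qty": 2, "gift": False}),
]
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(oid, cust, json.dumps(doc)) for oid, cust, doc in sample_orders],
)


def items_for(customer):
    """Filter on a relational column and read fields out of the JSON document."""
    rows = conn.execute(
        "SELECT json_extract(details, '$.item'), json_extract(details, '$.qty')"
        " FROM orders WHERE customer = ? ORDER BY id",
        (customer,),
    )
    return rows.fetchall()
```

Calling items_for("alice") returns the item name and quantity of each of her orders, using the same driver, connection, and SQL dialect the application would already use for its relational data.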
And then you would just end up having to manage many different databases more than actually focusing on your app and getting value to your business or to your user. >> Okay, so I get that and I buy that, by the way. I mean, especially if you're a larger organization and you've got all these projects going on. But before we go back to Maria, Gerald, I want to push on that a little bit. Because the counter to that argument would be in the analogy. And I'd love for you to, you know, knock this analogy off the blocks. The counter would be, okay, Oracle is the Swiss Army knife and it's got, you know, all in one. But sometimes I need that specialized long screwdriver and I go into my toolbox and I grab that. It's better than the screwdriver in my Swiss Army knife. Are you the Swiss Army knife of databases? Or does the all-in-one have that best-of-breed screwdriver for me? How do you think about that? >> Yeah, that's a fantastic question, right? And I think first of all, you have to separate between Oracle the company, which actually has multiple data management technologies and databases out there, as you said before, right? And Oracle Database. And I think Oracle Database is definitely a Swiss Army knife with many capabilities. Over the last 40 years, you know, we've seen object support coming, and that's still in the Oracle Database today. We have seen XML coming, it's still in the Oracle Database, graph, spatial, et cetera. And so you have many different ways of managing your data. And then on top of that, going into converged, not only do we allow you to store the different data models in there, but we actually also allow you to apply all the security policies and so forth on top of it, something Maria can talk more about, the mission around converged database. I would also argue though that for some aspects, we do actually have that screwdriver that you talked about as well. 
So especially in the relational world, people get very quickly hung up on this idea that, oh, if you only do rows and columns, well, that's kind of what you put down on disk. And that was never true. The relational model is actually a logical model. What's actually being put down on disk is blocks that align themselves nicely with block storage, and it always has been. So that allows you to actually model and process the data sort of differently. And one good example that we introduced a couple of years ago was when columnar databases were very strong and, you know, the competition came and was like, yeah, we have in-memory column stores now, they're so much better. And we were like, well, orienting the data row-based or column-based really doesn't matter in the sense that we store them as blocks on disk. And so we introduced the In-Memory technology, which gives you an in-memory columnar representation of your data as well, alongside your relational. So there is an example where you go like, well, actually, you know, if you have this use case of columnar analytics all in memory, I would argue Oracle Database is also that screwdriver you want to go to, and it gives you that capability. Because it not only gives you the columnar representation, but also, which many people then forget, all the analytic power on top of SQL. It's one thing to store your data columnar, it's a completely different story to actually be able to run analytics on top of that and have all the built-in functionality and stuff that you want to do with the data on top of it as you analyze it. >> You know, that's a great example, columnar, 'cause I remember there was like a lot of hype around it. Oh, it's the Oracle killer, you know, Vertica. Vertica is still around but, you know, it never really hit escape velocity. But you know, good product, good company, whatever. Netezza, it kind of got buried inside of IBM. 
ParAccel kind of became, you know, Redshift with that deal, so that kind of went away. Teradata bought a company, I forget which company it bought. So that hype kind of dissipated and now it's like, oh yeah, columnar. It's kind of like in-memory: we've had in-memory databases ever since we've had databases, you know, it's kind of a feature, not a sector. But anyway, Maria, let's come back to you. You've got a lot of customer experience. And you've spoken with a lot of companies, you know, during your time at Oracle. What else are you seeing in terms of the benefits to this approach that might not be so intuitive and obvious right away? >> I think one of the biggest benefits to having a multimodel, multiworkload or, as we call it, a converged database, is the fact that you can get greater data synergy from it. In other words, you can utilize all these different techniques and data models to get better value out of that data. So things like being able to do real-time machine learning, fraud detection inside a transaction, or being able to do a product recommendation by accessing three different data models. So for example, if I'm trying to recommend a product for you, Dave, I might use graph analytics to figure out your community. Not just your friends, but other people on our system who look and behave just like you. Once I know that community, then I can go over and see what products they bought by looking up our product catalog, which may be stored as JSON. And then on top of that, I can see, using the key-value store, what products inside that catalog those community members gave a five-star rating to. So that way I can really pinpoint the right product for you. And I can do all of that in one transaction inside the database without having to transform that data into different models or, God forbid, access different systems to get all of that information. So it really simplifies how we can generate that value from the data. 
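[Editor's sketch, not part of the conversation.] The recommendation flow Maria describes (graph for the community, JSON for the catalog, key-value for the ratings, all answered by one database) can be approximated in a few lines. This is not Oracle's implementation: SQLite, reached through Python's standard sqlite3 module, stands in for a converged database, every table name and sample row is hypothetical, and the sketch assumes the SQLite build includes json_extract.

```python
import json
import sqlite3


def build_demo_db():
    """Create three hypothetical tables, one per data model in the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE similar_to (src TEXT, dst TEXT);   -- graph edges
        CREATE TABLE catalog (doc TEXT);                -- JSON product documents
        CREATE TABLE ratings (rater TEXT, sku TEXT, stars INTEGER,
                              PRIMARY KEY (rater, sku)); -- key-value style lookups
        """
    )
    conn.executemany(
        "INSERT INTO similar_to VALUES (?, ?)",
        [("dave", "erin"), ("erin", "frank"), ("gina", "hank")],
    )
    products = [
        {"sku": "A1", "name": "Trail Shoes"},
        {"sku": "B2", "name": "Rain Jacket"},
        {"sku": "C3", "name": "Tent"},
    ]
    conn.executemany(
        "INSERT INTO catalog VALUES (?)", [(json.dumps(p),) for p in products]
    )
    conn.executemany(
        "INSERT INTO ratings VALUES (?, ?, ?)",
        [("erin", "A1", 5), ("erin", "C3", 4), ("frank", "A1", 3),
         ("frank", "B2", 5), ("hank", "C3", 5)],
    )
    return conn


def recommend(conn, user, max_hops=2):
    """One SQL statement: walk the graph, filter ratings, read names from JSON."""
    sql = """
        WITH RECURSIVE community(member, hops) AS (
            SELECT dst, 1 FROM similar_to WHERE src = :user
            UNION
            SELECT s.dst, c.hops + 1
            FROM similar_to AS s JOIN community AS c ON s.src = c.member
            WHERE c.hops < :max_hops
        )
        SELECT DISTINCT json_extract(cat.doc, '$.name')
        FROM community AS com
        JOIN ratings AS r ON r.rater = com.member AND r.stars = 5
        JOIN catalog AS cat ON json_extract(cat.doc, '$.sku') = r.sku
        ORDER BY 1
    """
    rows = conn.execute(sql, {"user": user, "max_hops": max_hops})
    return [row[0] for row in rows]
```

With this sample data, recommend(build_demo_db(), "dave") walks two hops out from dave, keeps only five-star ratings from that community, and returns the matching product names from the JSON catalog. The design point is the one from the interview: no data is copied between systems or reshaped between models before it is joined.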
And of course, the other thing our customers love is when it comes to deploying data-driven apps, when you do it on a converged database it's much simpler because it is that standard data platform. So you're not having to manage multiple independent single purpose databases. You're not having to implement the security and the high availability policies, you know across a bunch of different diverse platforms. All of that can be done much simpler with a converged database 'cause the DBA team of course, is going to just use that standard set of tools to manage, monitor and secure those systems. >> Thank you for that. And you know, it's interesting, you talk about simplification and you are in Juan's organization so you've big focus on mission critical. And so one of the things that I think is often overlooked well, we talk about all the time is recovery. And if things are simpler, recovery is faster and easier. And so it's kind of the hallmark of Oracle is like the gold standard of the toughest apps, the most mission critical apps. But I wanted to get to the cloud Maria. So because everything is going to the cloud, right? Not all workloads are going to the cloud but everybody is talking about the cloud. Everybody has cloud first mentality and so yes, it's a hybrid world. But the natural next question is how do you think the cloud fits into this world of data-driven apps? >> I think just like any app that you're developing, the cloud helps to accelerate that development. And of course the deployment of these data-driven applications. 'Cause if you think about it, the developer is instantly able to provision a converged database that Oracle will automatically manage and look after for them. But what's great about doing something like that if you use like our autonomous database service is that it comes in different flavors. 
So you can get autonomous transaction processing, data warehousing or autonomous JSON, so that the developer is going to get a database that's been optimized for their specific use case, whatever they are trying to solve. And it's also going to contain all of that great functionality and capabilities that we've been talking about. So what that really means to the developer though is, as the project evolves and inevitably the business needs change a little, there's no need to panic when one of those changes comes in, because your converged database or your autonomous database has all of those additional capabilities. So you can simply utilize those to be able to address those evolving changes in the project. 'Cause let's face it, none of us normally know exactly what we need to build right at the very beginning. And on top of that they also kind of get a built-in buddy in the cloud, especially in the autonomous database. And that buddy comes in the form of built-in workload optimizations. So with the autonomous database we do things like automatic indexing, where we're using machine learning to be that buddy for the developer. So what it'll do is it'll monitor the workload and see what kind of queries are being run on that system. And then it will actually determine if there are indexes that should be built to help improve the performance of that application. And not only does it build those indexes, but it verifies that they help improve the performance before publishing them to the application. So by the time the developer is finished with that app and it's ready to be deployed, it's actually also been optimized by the developer's buddy, the Oracle autonomous database. So, you know, it's a really nice helping hand for developers when they're building any app, especially data-driven apps. >> I like how you sort of gave us, you know, the truth here, which is you don't always know where you're going when you're building an app. 
It goes from "build it and they will come" to "start building it and we'll figure out where it's going to go." With Agile that's kind of how it works. But so I wonder, can you give some examples of maybe customers, or maybe genericize them if you need to? Data-driven apps in the cloud where customers were able to drive more efficiency, where the cloud buddy allowed the customers to do more with less? >> Oh, we have tons of these but I'll try and keep it to just a couple. One that comes to mind straight away is Retraced. These folks built a blockchain app in the Oracle Cloud that allows manufacturers to actually share the supply chain with the consumer. So the consumer can see exactly who made their product, using what raw materials, where they were sourced from, how it was done. All of that is visible to the consumer. And in order to be able to share that, they had to work on a very diverse set of data. So they had everything from JSON documents to images as well as your traditional transactions in there. And they store all of that information inside the Oracle autonomous database, and they were able to build their app and deploy it on the cloud. And they were able to do all of that very, very quickly. So, you know, that ability to work on multiple different data types in a single database really helped them build that product and get it to market in a very short amount of time. Another customer that's doing something really, really interesting is MineSense. So these guys operate at the largest mines in Canada, Chile, and Peru. And what they do is they put these x-ray devices on the massive mechanical shovels that are at the coalface, or at the mine face. And what that does is it senses the contents of the buckets inside these mining machines. And it's looking at that content to see how it can optimize the processing of the ore inside that bucket. 
So they're looking to minimize the amount of power and water that it's going to take to process that. And also of course, minimize the amount of waste that's going to come out of that project. So all of that sensor data is sent into an autonomous database where it's going to be processed by a whole host of different users. So everyone from the mine engineers to the geoscientists, to even their own data scientists, utilizes that data to drive their business forward. And what I love about these guys is they're not happy with building just one app. MineSense actually uses our built-in low-code development environment, APEX, that comes as part of the autonomous database, and they actually produce applications constantly for different aspects of their business using that technology. And it's actually able to accelerate those new apps to the business. It takes them now just a couple of days or weeks to produce an app instead of months or years to build those new apps. >> Great, thank you for that, Maria. Gerald, I'm going to push you again. So, I said upfront and talked about microservices and the cloud and containers and, you know, anybody in the developer space follows that very closely. But some of the things that we've been talking about here, people might look at that and say, well, they're kind of antithetical to microservices. This is Oracle's monolithic approach. But when you think about the benefits of microservices, people want freedom of choice. Technology choice is seen as a big advantage of microservices and containers. How do you address such an argument? >> Yeah, that's an excellent question and I get that quite often. With the microservices architecture in general, as I said before with architectures, Linux distributions, et cetera, it's kind of always a bit like there's an academic approach and there's a pragmatic approach. And when you look at microservices, take the original definitions that came out in the early 2010s. 
They actually never said that each microservice has to have a database. And they also never said that if a microservice has a database, you have to use a different technology for each microservice. Just like they never said you have to write each microservice in a different programming language, right? So where I'm going with this is, yes, you know, sometimes when you look at some vendors out there, some niche players, they push this message or they jump on this academic approach of, like, each microservice has the best tool at hand, or use a different database for your purpose, et cetera. Which often comes across like, you know, "we want to stay part of the conversation." Nothing stops a developer from, you know, using a multimodal database for the microservice and just using that as a document store, right? Or just using that as a relational database. And, you know, there was actually something that happened that was really interesting yesterday, I don't know whether you followed it, Dave, or not. But Facebook had an outage yesterday, right? And Facebook is one of those companies that are seen as the Silicon Valley, you know, know-how-to-do-microservices companies. And when you read through the outage, well, what happened, right? Some unfortunate logical error with configuration and so forth that took a database cluster down. So, you know, there you have it, where you go like, well, maybe not every microservice is actually in fact talking to its own database or its own special-purpose database. I think, you know, what the industry should be focusing on, much more than this argument of which technology to use or what's the right tool for a job, is to ask themselves: what business problem are we actually trying to solve? And therefore what's the right approach and the right technology for this? And so therefore, just as I said before, you know, multimodal databases do have strong benefits. 
They have many built-in functionalities that are already there and they allow you to reduce this complexity of having to know many different technologies, right? And so it's not only about storing different data models, where, you know, you either treat a multimodal database as a JSON document store or as a relational database, and most databases have been multimodal for 20-plus years. It's also that if you store that data together, you can perhaps actually derive additional value for somebody else, if perhaps not for your application. Like for example, if you were to use Oracle Database, you can actually write queries on top of all of that data. It doesn't really matter for our query engine whether the data is formatted as JSON or the data is formatted in rows and columns, you can just run a query over it. And that's actually very powerful for those guys that have to, you know, get the reporting done at the end of the day, the end of the week. And for those guys that are the data scientists, who want to figure out, you know, which product performed really well, or can we tweak something here and there. When you look into that space, you still see a huge divergence between the guys that put data in, kind of the OLTP style, and the guys that try to derive new insights. And there's still a lot of ETL going around and, you know, we have big data technologies, some of which came and went and some of which came and are still around, like Apache Spark, which is still like a SQL engine on top of any of your data, kind of going back to the same concept. And so I will say that, you know, for developers, when we look at microservices it's like, first of all, are you making the argument because the vendor or the technology you want to use tells you this argument, or because, you know, you kind of want to have an argument to use a specific technology? 
Or is it really more because it is the best technology to use for this given use case, for this given application that you have? And if so, there's of course also nothing wrong with using a single-purpose technology either, right? >> Yeah, I mean, whenever I talk about Oracle I always come back to the most important applications, the mission critical. It's very difficult to architect databases with microservices and containers. You have to be really, really careful. And so again, it comes back to what we were talking about before with Maria, the complexity and the recovery. But Gerald, I want to stay with you for a minute. So there's other data management technologies popping up out there. I mean, I've seen some people saying, okay, just leave the data in an S3 bucket. We can query that, then we've got some magic sauce to do that. And so why are you optimistic about, you know, traditional database technology going forward? >> I would say because of the history of databases. So one thing that struck me when I came to Oracle and then got to meet great people like Juan Loaiza and Andy Mendelsohn, who had been here for a long, long time: I came to the realization that relational databases have been around for about 45 years now. And, you know, I was like, I'm too young to have been around then, right? So I was like, what else was around 45 years ago? Just take the tech stack that we have today. What did this look like? Well, Linux only came out in '93. Well, databases pre-date Linux a lot. And then as I started digging, I saw a lot of technologies come and go, right? And you mentioned before the data management systems that we had that came and went, like the columnar databases or XML databases, object databases. And even before relational databases, before Codd gave us the relational model, there were apparently these network stores, network databases, which to some extent look very similar to JSON documents. 
They were all storing data in a hierarchical format. And, you know, when you then start actually reading the Codd paper and diving a little bit more into the relational model, there's I think one important crux in there that most of the industry keeps forgetting or hasn't been around long enough to even know. And that is that when Codd created the relational model, he actually focused not so much on the application putting the data in, but on future users and applications still being able to make sense out of the data, right? And that's kind of like I said before: we had those network models, we had XML databases, you have JSON document stores. And the one thing that they all have in common is that the application that puts the data in decides the structure of the data. And that's all well and good if you have the developer writing the application. It can become really tricky when 10 years later you still want to look at that data and the application or the developer is no longer around. Then you go like, what does this all mean? Where is the structure defined? What is this attribute? What does it mean? How does it correlate to others? And the one thing that people tend to forget is that it's actually the data that's here to stay, not so much the applications. Ideally, every company wants to store every single byte of data that they have because there might be future value in it. Economically it may not always make sense, though that's now much more feasible than just years ago. But if you could, why wouldn't you want to store all your data, right? And sometimes you actually have to store the data for seven years or whatever because the laws require you to. And so coming back, you know, 10 years from now and looking at the data and making sense of that data can actually become a lot more difficult and a lot more challenging than having to first figure out how we store this data for general use. 
And that kind of was what the relational model was all about. We decompose the data structures into tables and columns with relationships between each other. So a typical example would be, well, you store some purchases from your web store, right? There's a customer attribute in it. There's some credit card payment information in it, there's some product information on what the customer bought. Well, in the relational model, if you just want to figure out which products were sold on a given day or week, you would just query the payment and products tables to get the sense out of it. You don't need to touch the customer and so forth. And with the hierarchical model you have to first sit down and understand how the structure is laid out. What is the customer? Where is the payment? You know, does the document start with the payment or does it start with the customer? Where do I find this information? And then in the very early days those databases even struggled to avoid having to scan all the documents to get the data out. So coming back to your question a bit, and I apologize for going on here. But you know, relational databases have been around for 45 years. I'd actually argue it's one of the most successful software technologies that we have out there when you look at the overall industry, right? 45 years, in IT terms, is like going from a star being born to going supernova. You have said it before, that many technologies came and went, right? And I just want to add one more really interesting example, by the way, which is Hadoop and HDFS, right? They kind of gave us this additional promise. You know, the 2010s, like 2012, 2013, the hype of Hadoop and so forth and (mumbles) and HDFS. And people were just like, just put everything into HDFS and worry about the data later, right? And we can query it and MapReduce it and whatever. 
And we had customers actually coming to us who were like, great, we have half a petabyte of data on an HDFS cluster and we have no clue what's stored in there. How do we figure this out? What are we going to do now? Now you had a big data cleansing problem. And so I think that is why databases and also data modeling are something that will not go away anytime soon. And I think databases and database technologies are here to stay for quite a while. Because many of those people don't think about what's happening to the data five years from now. And many of the niche players, and frankly even Amazon, you know, following this single-purpose thing, say, just use the right tool for the job for your application, right? Just put the data in there the way you want it. And it's like, okay, so you use technologies all over the place and then five years from now you have your data fragmented everywhere in different formats and, you know, inconsistencies, and so on. And when you come back to these data-driven, business-critical decision applications, that's usually the worst case scenario you can have, right? Because now you need an army of people to actually do data cleansing. And it's not a coincidence that data science has become very, very popular in recent years as we kind of went on with this proliferation of different database or data management technologies, some of which are not even databases. But I think I'll leave it at that. >> It's an interesting talk track because you're right. I mean, no schema on write was alluring, but it definitely created some problems. It also created an entire, you know, you referenced the hyper-specialized roles and the data cleansing component. I mean, maybe technology will eventually solve that problem but it hasn't, at least up to now. Okay, last question. Maria, maybe you could start off, and Gerald, if you want to chime in as well it'd be great. 
I mean, it's interesting to watch this industry since Oracle sort of won the top database mantle. I mean, I watched it, I saw it. Remember, it was Informix and it was (indistinct) too, and of course Microsoft, you've got to give them credit with SQL Server, but Oracle won the database wars. And then everything got kind of quiet for a while, database was sort of boring. And then it exploded, you know, with NoSQL and the key-value stores and the cloud databases, and this is really a hot area now. And when we looked at Oracle we said, okay, Oracle, it's all about Oracle Database, but we've seen the kind of resurgence in MySQL, which everybody thought, you know, once Oracle bought Sun, they were going to kill. But now we see you investing in HeatWave, TimesTen, and we talked about in-memory databases before. So where do those fit, Maria, in the grand scheme? How should we think about Oracle's database portfolio? >> So there's lots of places where you'd use those different things. 'Cause just like in any other industry, there are going to be new and boutique use cases that are going to benefit from a more specialized product or single-purpose product. So good examples off the top of my head of the kind of systems that would benefit from that would be things like a stock exchange system or a telephone exchange system. Both of those are latency-critical transaction processing applications where they need microsecond response times. And that's going to exceed perhaps what you might normally get or deploy with a converged database. And so Oracle's TimesTen database, our in-memory database, is perfect for those kinds of applications. But there's also a host of MySQL applications out there today and, you said it yourself there, Dave, HeatWave is a great place to provision and deploy those kinds of applications because it's going to run 100 times faster than AWS (mumbles). 
So, you know, there really is a place in the market and in our customers' systems and the needs they have for all of these different members of our database family here at Oracle. >> Yeah, well, the internet is basically running on the LAMP stack so I don't see MySQL going away. All right Gerald, we'll give you the final word, bring us home. >> Oh, thank you very much. Yeah, I mean, as Maria said, I think it comes back to what we discussed before. There are obviously still needs for special technologies or technologies other than a relational database or multimodal database. Oracle actually has many more databases than people may first think of. Not only the three that we have already mentioned but there's even SP, so the Oracle's NoSQL database. And, you know, on a high level Oracle is a data management company, right? And we want to give our customers the best tools and the best technology to manage all of their data. And therefore there has to be a need, or there should be a part of the business, that also focuses on these highly specialized systems and these highly specialized technologies that address those use cases. And I think it makes perfect sense. It's like, you know, when the customer comes to Oracle they're not only getting this "take this one product, you know, and if you don't like it, your problem," but actually you have choice, right? And choice allows you to make a decision based on what's best for you and not necessarily best for the vendor you're talking to. >> Well guys, really appreciate your time today and your insights. Maria, Gerald, thanks so much for coming on theCUBE. >> Thank you very much for having us. >> And thanks for watching this CUBE Conversation, this is Dave Vellante and we'll see you next time. (upbeat music)

Published Date : Jun 24 2021


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Gerald Venzl	PERSON	0.99+
Andy Mendelsohn	PERSON	0.99+
Maria	PERSON	0.99+
Chile	LOCATION	0.99+
Peru	LOCATION	0.99+
Maria Colgan	PERSON	0.99+
Canada	LOCATION	0.99+
Oracle	ORGANIZATION	0.99+
Gerald	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
seven years	QUANTITY	0.99+
IBM	ORGANIZATION	0.99+
Juan Luis	PERSON	0.99+
100 times	QUANTITY	0.99+
five star	QUANTITY	0.99+
Dave	PERSON	0.99+
Facebook	ORGANIZATION	0.99+
two experts	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Sun	ORGANIZATION	0.99+
45 years	QUANTITY	0.99+
MySQL	TITLE	0.99+
three	QUANTITY	0.99+
yesterday	DATE	0.99+
each microservice	QUANTITY	0.99+
Swiss Army	ORGANIZATION	0.99+
early 2010s	DATE	0.99+
Teradata	ORGANIZATION	0.99+
Linux	TITLE	0.99+
10 years later	DATE	0.99+
2012	DATE	0.99+
two camps	QUANTITY	0.99+
SQL	TITLE	0.99+
Both	QUANTITY	0.98+
Oracle Database	TITLE	0.98+
2010s	DATE	0.98+
TimesTen	ORGANIZATION	0.98+
Hadoop	TITLE	0.98+
first	QUANTITY	0.98+
Oracles	ORGANIZATION	0.98+
Vertica	ORGANIZATION	0.98+
tonight	DATE	0.98+
2013	DATE	0.98+

Maria Colgan & Gerald Venzl, Oracle | June CUBEconversation

(upbeat music) >> It'll be five, four, three and then silent two, one, and then you guys just follow my lead. We're just making some last minute adjustments. Like I said, we're down two hands today. So, you good Alex? Okay, are you guys ready? >> I'm ready. >> Ready. >> I got to get one note here. >> So I noticed Maria you stopped anyway, so I have time. >> Just so they know, Dave and the Boston Studio, will they both kind of concurrently be on film even when they're not speaking or will only the speaker be on film, for like if Gerald's drawing while Maria is talking about-- >> Sorry but then I missed one part of my onboarding spiel. There should be, if you go into gallery there should be a label. There should be something labeled Boston live switch feed. If you pin that gallery view you'll see what our program currently being recorded is. So any time you don't see yourself on that feed is an excellent time to take a drink of water, scratch your nose, check your notes. Do whatever you got to do off screen. >> Can you give us a three shot, Alex? >> Yes, there it is. >> And then go to me, just give me a one-shot to Dave. So when I'm here you guys can take a drink or whatever. >> That makes sense? >> Yeah. >> Excellent, I will get my recordings restarted and we'll open up when Dave's ready. >> All right, you guys ready? >> Ready. >> All right Steve, you go on mute. >> Okay, on me in 5, 4, 3. Developers have become the new king makers in the world of digital and cloud. The rise of containers and microservices has accelerated the transition to cloud native applications. A lot of people will talk about application architecture and the related paradigms and the benefits they bring for the process of writing and delivering new apps. But a major challenge continues to be the how and the what when it comes to accessing, processing and getting insights from the massive amounts of data that we have to deal with in today's world. 
And with me are two experts from the data management world who will share with us how they think about the best techniques and practices based on what they see at large organizations who are working with data and developing so-called data-driven apps. Please welcome Maria Colgan and Gerald Venzl, two distinguished product managers from Oracle. Folks, welcome, thanks so much for coming on. >> Thanks for having us Dave. >> Thank you very much for having us. >> Okay, Maria let's start with you. So, we throw around this term data-driven, data-driven applications. What are we really talking about there? >> So data-driven applications are applications that work on a diverse set of data. So anything from spatial to sensor data, document data as well as your usual transaction processing data. And what they're going to do is they'll generate value from that data in very different ways to a traditional application. So for example, they may use machine learning to be able to do product recommendations in the middle of a transaction. Or we could use graph to be able to identify an influencer within the community so we can target them with a specific promotion. It could also use spatial data to be able to help find the nearest stores to a particular customer. And because these apps are deployed on multiple platforms, everything from mobile devices as well as standard browsers, they need a data platform that's going to be secure, reliable and scalable. >> Well, so when you think about how the workloads are shifting, I mean, we're not talking about, you know, it's not anymore a world of just your ERP or your HCM or your CRM, you know, kind of the traditional operational systems. You really are seeing an explosion of these new data oriented apps. You're seeing, you know, modeling in the cloud, you are going to see more and more inferencing, inferencing at the edge. 
But Maria maybe you could talk a little bit about sort of the benefits that customers are seeing from developing these types of applications. I mean, why should people care about data-driven apps? >> Oh, for sure, there's massive benefits to them. I mean, probably the most obvious one for any business regardless of the industry, is that they not only allow you to understand what your customers are up to, but they allow you to be able to anticipate those customer's needs. So that helps businesses maintain that competitive edge and retain their customers. But it also helps them make data-driven decisions in real time based on actual data rather than on somebody's gut feeling or basing those decisions on historical data. So for example, you can do real-time price adjustments on products based on demand and so forth, that kind of thing. So it really changes the way people do business today. >> So Gerald, you think about the narrative in the industry everybody wants to be a platform player all your customers they are becoming software companies, they are becoming platform players. Everybody wants to be like, you know name a company that is huge trillion dollar market cap or whatever, and those are data-driven companies. And so it would seem to me that data-driven applications, there's nobody, no company really shouldn't be data-driven. Do you buy that? >> Yeah, absolutely. I mean, data-driven, and that naturally the whole industry is data-driven, right? It's like we all have information technologies about processing data and deriving information out of it. But when it comes to app development I think there is a big push to kind of like we have to do machine learning in our applications, we have to get insights from data. And when you actually look back a bit and take a step back, you see that there's of course many different kinds of applications out there as well that's not to be forgotten, right? 
So there are the usual front end user interfaces where really all the application does is just enter some piece of information that's stored somewhere, or perhaps a microservice that's not attached to a data tier at all but just receives or asks calls (indistinct). So I think it's not necessarily so important for every developer to kind of go on a bandwagon that they have to be data-driven. But I think it's equally important for those applications and those developers that build applications that drive the business, that make business critical decisions as Maria mentioned before. Those guys should really take a close look into what data-driven apps means and what the data tier can actually give to them. Because what we see also happening a lot is that a lot of the things that are well known and out there just ready to use are being reimplemented in the applications. And for those applications, they essentially just end up spending more time writing code that is already there and then have to maintain and debug that code as well, rather than just going to market faster. >> Gerald, can you talk to the prevailing approaches that developers take to build data-driven applications? What are the ones that you see? Let's dig into that a little bit more and maybe differentiate the different approaches and talk about that? >> Yeah, absolutely. I think right now the industry is like in two camps, it's like sort of a religious war going on that you'll see often happening with different architectures and so forth. So we have single purpose databases or data management technologies. Which are technologies that are, as the name suggests, built around a single purpose. So it's like, you know, a typical example would be your ordinary key-value store. And a key-value store, all it does is it allows you to store and retrieve a piece of data, whatever that may be, really, really fast but it doesn't really go beyond that. 
And then the other side of the house or the other camp would be multimodal databases, multimodal data management technologies. Those are technologies that allow you to store different types of data, different formats of data in the same technology, in the same system, alongside each other. And, you know, when you look at the geographics out there of what we have from technology, pretty much any relational database or any database really has evolved into such a multimodal database. Whether that's MySQL that allows you to store JSON alongside relational or even a MongoDB that gives you native graph support since (mumbles) alongside the JSON support. >> Well, it's clearly a trend in the industry. We've talked about this a lot in theCUBE. We know where Oracle stands on this. I mean, you just mentioned MySQL but I mean, Oracle Database you've been extending, you've mentioned JSON, we've got blockchain now in there, you're infusing, you know, ML and AI into the database, graph database capabilities, you know, on and on and on. We talked a lot about it, we compared that to Amazon which is kind of the right tool, the right job approach. So maybe you could talk about, you know, your point of view, the benefits for developers of using that converged database, if I can use that word, approach, being able to store multiple data formats? Why do you feel like that's a better approach? >> Yeah, I think on a high level it comes down to complexity. You are actually avoiding additional complexity, right? So not every use case that you have necessarily warrants having yet another data management technology or yet another special built technology for managing that data, right? It's like many use cases that we see out there happily want to just store a piece of a JSON document, a piece of JSON, in a database and then perhaps retrieve it again afterwards, so write some simple queries over it. 
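The point above about storing a piece of JSON in the database you already have, and running simple queries over it, can be sketched as follows. This is only an illustration, not anything shown in the conversation: it uses Python's bundled SQLite rather than Oracle Database or MySQL, it assumes the SQLite build includes the JSON functions (`json_extract`), and the `orders` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database: one relational table with an ordinary column
# and a JSON document stored alongside it (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        details  TEXT NOT NULL  -- JSON document kept as text
    )
    """
)
conn.executemany(
    "INSERT INTO orders (order_id, customer, details) VALUES (?, ?, ?)",
    [
        (1, "alice", '{"item": "laptop", "qty": 1, "gift": false}'),
        (2, "bob", '{"item": "mouse", "qty": 3, "gift": true}'),
    ],
)


def items_with_min_qty(min_qty):
    """Return (customer, item) pairs whose JSON 'qty' is at least min_qty."""
    return conn.execute(
        """
        SELECT customer, json_extract(details, '$.item')
        FROM orders
        WHERE json_extract(details, '$.qty') >= ?
        ORDER BY order_id
        """,
        (min_qty,),
    ).fetchall()


print(items_with_min_qty(2))
```

The same driver, connection, and SQL knowledge cover both the relational column and the document, which is the complexity argument being made here.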
And you really don't have to get a new database technology or a NoSQL database into the mix if you already have some to just fulfill that exact use case. You could just happily store that information as well in the database you already have. And what it really comes down to is the learning curve for developers, right? So it's like, as you use the same technology to store other types of data, you don't have to learn a new technology, you don't have to associate yourself with new and learn new drivers. You don't have to find new frameworks and you don't have to know how to necessarily operate or best model your data for that database. You can essentially just reuse your knowledge of the technology as well as the libraries and code you have already built in house perhaps in another application, perhaps, you know framework that you used against the same technology because it is still the same technology. So, kind of all comes down again to avoiding complexity rather than not fragmenting you know, the many different technologies we have. If you were to look at the different data formats that are out there today it's like, you know, you would end up with many different databases just to store them if you were to fully religiously follow the single purpose best built technology for every use case paradigm, right? And then you would just end up having to manage many different databases more than actually focusing on your app and getting value to your business or to your user. >> Okay, so I get that and I buy that by the way. I mean, especially if you're a larger organization and you've got all these projects going on but before we go back to Maria, Gerald, I want to just, I want to push on that a little bit. Because the counter to that argument would be in the analogy. And I wonder if you, I'd love for you to, you know knock this analogy off the blocks. The counter would be okay, Oracle is the Swiss Army knife and it's got, you know, all in one. 
But sometimes I need that specialized long screwdriver and I go into my toolbox and I grab that. It's better than the screwdriver in my Swiss Army knife. Why, are you the Swiss Army knife of databases? Or are you the all-in-one that has that best of breed screwdriver for me? How do you think about that? >> Yeah, that's a fantastic question, right? And I think, first of all, you have to separate between Oracle the company, which actually has multiple data management technologies and databases out there as you said before, right? And Oracle Database. And I think Oracle Database is definitely a Swiss Army knife; it has gained many capabilities over the last 40 years, you know. We've seen object support coming, and that's still in the Oracle Database today. We have seen XML coming, it's still in the Oracle Database, graph, spatial, et cetera. And so you have many different ways of managing your data, and then on top of that, going into the converged part, not only do we allow you to store the different data models in there but we actually also allow you to apply all the security policies and so forth on top of it, something Maria can talk more about, the mission around converged database. I would also argue though that for some aspects, we do actually have that screwdriver that you talked about as well. So especially in the relational world people get very quickly hung up on this idea that, oh, if you only do rows and columns, well, that's kind of what you put down on disk. And that was never true; the relational model is actually a logical model. What's actually being put down on disk is blocks that align themselves nicely with block storage, and it always has been. So that allows you to actually model and process the data sort of differently. 
And one common example or one good example that we have, that we introduced a couple of years ago, was when columnar databases were very strong and, you know, the competition came and it's like, yeah, we have In-Memory columnar stores now, they're so much better. And we were like, well, orienting the data row-based or column-based really doesn't matter in the sense that we store them as blocks on disks. And so we introduced the In-Memory technology which gives you an In-Memory columnar representation of your data as well, alongside your relational. So there is an example where you go like, well, actually, you know, if you have this use case of columnar analytics all In-Memory, I would argue Oracle Database is also that screwdriver you want to go down to and gives you that capability. Because it not only gives you the representation in columnar, but also, which many people then forget, all the analytic power on top of SQL. It's one thing to store your data columnar, it's a completely different story to actually be able to run analytics on top of that and have all the built-in functionalities and stuff that you want to do with the data on top of it as you analyze it. >> You know, that's a great example, the columnar, 'cause I remember there was like a lot of hype around it. Oh, it's the Oracle killer, you know, at Vertica. Vertica is still around but, you know, it never really hit escape velocity. But you know, good product, good company, whatever. Netezza, it kind of got buried inside of IBM. ParAccel kind of became, you know, Redshift with that deal so that kind of went away. Teradata bought a company, I forget which company it bought but. So that hype kind of dissipated and now it's like, oh yeah, columnar. It's kind of like In-Memory, we've had In-Memory databases ever since we've had databases, you know, it's kind of a feature not a sector. But anyway, Maria, let's come back to you. You've got a lot of customer experience. 
And you speak with a lot of companies, you know during your time at Oracle. What else are you seeing in terms of the benefits to this approach that might not be so intuitive and obvious right away? >> I think one of the biggest benefits to having a multimodel multiworkload or as we call it a converged database, is the fact that you can get greater data synergy from it. In other words, you can utilize all these different techniques and data models to get better value out of that data. So things like being able to do real-time machine learning, fraud detection inside a transaction or being able to do a product recommendation by accessing three different data models. So for example, if I'm trying to recommend a product for you Dave, I might use graph analytics to be able to figure out your community. Not just your friends, but other people on our system who look and behave just like you. Once I know that community then I can go over and see what products they bought by looking up our product catalog which may be stored as JSON. And then on top of that I can then see using the key-value what products inside that catalog those community members gave a five star rating to. So that way I can really pinpoint the right product for you. And I can do all of that in one transaction inside the database without having to transform that data into different models or God forbid, access different systems to be able to get all of that information. So it really simplifies how we can generate that value from the data. And of course, the other thing our customers love is when it comes to deploying data-driven apps, when you do it on a converged database it's much simpler because it is that standard data platform. So you're not having to manage multiple independent single purpose databases. You're not having to implement the security and the high availability policies, you know across a bunch of different diverse platforms. 
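The product recommendation described above, combining a graph lookup, a JSON product catalog, and key-value style ratings in one database query, can be sketched roughly like this. It is a simplified illustration and not Oracle's implementation: it uses Python's bundled SQLite (assuming its JSON functions are available), models the "community" as direct connections only, and all table names, columns, and sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Graph: who is connected to whom (edge list).
    CREATE TABLE follows (src TEXT, dst TEXT);
    -- Document: product catalog entries stored as JSON text.
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, doc TEXT);
    -- Key-value style: (user, product) -> star rating.
    CREATE TABLE ratings (
        user_name TEXT, product_id INTEGER, stars INTEGER,
        PRIMARY KEY (user_name, product_id)
    );
    """
)
conn.executemany("INSERT INTO follows VALUES (?, ?)",
                 [("dave", "ann"), ("dave", "raj"), ("zoe", "kim")])
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(1, '{"name": "headphones"}'),
                  (2, '{"name": "keyboard"}'),
                  (3, '{"name": "monitor"}')])
conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)",
                 [("ann", 1, 5), ("raj", 1, 5), ("raj", 2, 5),
                  ("ann", 2, 3), ("kim", 3, 5)])


def recommend(user):
    """Products that the user's community rated five stars, most popular first."""
    return conn.execute(
        """
        SELECT json_extract(p.doc, '$.name') AS name, COUNT(*) AS fans
        FROM follows f
        JOIN ratings r  ON r.user_name = f.dst AND r.stars = 5
        JOIN products p ON p.product_id = r.product_id
        WHERE f.src = ?
        GROUP BY p.product_id
        ORDER BY fans DESC, name
        """,
        (user,),
    ).fetchall()


print(recommend("dave"))
```

Because all three shapes of data sit in one system, the lookup is a single statement with no export or transformation step between stores, which is the synergy being claimed.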
All of that can be done much simpler with a converged database 'cause the DBA team of course, is going to just use that standard set of tools to manage, monitor and secure those systems. >> Thank you for that. And you know, it's interesting, you talk about simplification and you are in Juan's organization so you've big focus on mission critical. And so one of the things that I think is often overlooked well, we talk about all the time is recovery. And if things are simpler, recovery is faster and easier. And so it's kind of the hallmark of Oracle is like the gold standard of the toughest apps, the most mission critical apps. But I wanted to get to the cloud Maria. So because everything is going to the cloud, right? Not all workloads are going to the cloud but everybody is talking about the cloud. Everybody has cloud first mentality and so yes, it's a hybrid world. But the natural next question is how do you think the cloud fits into this world of data-driven apps? >> I think just like any app that you're developing, the cloud helps to accelerate that development. And of course the deployment of these data-driven applications. 'Cause if you think about it, the developer is instantly able to provision a converged database that Oracle will automatically manage and look after for them. But what's great about doing something like that if you use like our autonomous database service is that it comes in different flavors. So you can get autonomous transaction processing, data warehousing or autonomous JSON so that the developer is going to get a database that's been optimized for their specific use case, whatever they are trying to solve. And it's also going to contain all of that great functionality and capabilities that we've been talking about. 
So what that really means to the developer though is as the project evolves and inevitably the business needs change a little, there's no need to panic when one of those changes comes in because your converged database or your autonomous database has all of those additional capabilities. So you can simply utilize those to be able to address those evolving changes in the project. 'Cause let's face it, none of us normally know exactly what we need to build right at the very beginning. And on top of that they also kind of get a built-in buddy in the cloud, especially in the autonomous database. And that buddy comes in the form of built-in workload optimizations. So with the autonomous database we do things like automatic indexing where we're using machine learning to be that buddy for the developer. So what it'll do is it'll monitor the workload and see what kind of queries are being run on that system. And then it will actually determine if there are indexes that should be built to help improve the performance of that application. And not only does it build those indexes but it verifies that they help improve the performance before publishing them to the application. So by the time the developer is finished with that app and it's ready to be deployed, it's actually also been optimized by the developer's buddy, the Oracle autonomous database. So, you know, it's a really nice helping hand for developers when they're building any app, especially data-driven apps. >> I like how you sort of gave us, you know, the truth here, which is you don't always know where you're going when you're building an app. It's like it goes from "build it and they will come" to "start building it and we'll figure out where it's going to go." With Agile that's kind of how it works. But so I wonder, can you give some examples of maybe customers or maybe genericize them if you need to. 
Data-driven apps in the cloud where customers were able to drive more efficiency, where the cloud buddy allowed the customers to do more with less? >> No, we have tons of these but I'll try and keep it to just a couple. One that comes to mind straight away is Retraced. These folks built a blockchain app in the Oracle Cloud that allows manufacturers to actually share the supply chain with the consumer. So the consumer can see exactly who made their product, using what raw materials, where they were sourced from, and how it was done. All of that is visible to the consumer. And in order to be able to share that they had to work on a very diverse set of data. So they had everything from JSON documents to images as well as your traditional transactions in there. And they store all of that information inside the Oracle autonomous database, so they were able to build their app and deploy it on the cloud. And they were able to do all of that very, very quickly. So, you know, that ability to work on multiple different data types in a single database really helped them build that product and get it to market in a very short amount of time. Another customer that's doing something really, really interesting is MineSense. So these guys operate in the largest mines in Canada, Chile, and Peru. But what they do is they put these x-ray devices on the massive mechanical shovels that are at the coalface or at the mine face. And what that does is it senses the contents of the buckets inside these mining machines. And it's looking at that content to see how it can optimize the processing of the ore inside that bucket. So they're looking to minimize the amount of power and water that it's going to take to process that. And also of course, minimize the amount of waste that's going to come out of that project. So all of that sensor data is sent into an autonomous database where it's going to be processed by a whole host of different users. 
So everyone from the mine engineers to the geoscientists, to even their own data scientists utilize that data to drive their business forward. And what I love about these guys is they're not happy with building just one app. MineSense actually use our built-in low-code development environment, APEX, that comes as part of the autonomous database and they actually produce applications constantly for different aspects of their business using that technology. And it's actually able to accelerate getting those new apps to the business. It takes them now just a couple of days or weeks to produce an app instead of months or years to build those new apps. >> Great, thank you for that Maria. Gerald, I'm going to push you again. So, I said upfront and talked about microservices and the cloud and containers and you know, anybody in the developer space follows that very closely. But some of the things that we've been talking about here, people might look at that and say, well, they're kind of antithetical to microservices. This is Oracle's monolithic approach. But when you think about the benefits of microservices, people want freedom of choice, technology choice, seen as a big advantage of microservices and containers. How do you address such an argument? >> Yeah, that's an excellent question and I get that quite often. The microservices architecture in general, like, as I said before, architectures, Linux distributions, et cetera, it's kind of always a bit of like there's an academic approach and there's a pragmatic approach. And when you look at the microservices, the original definitions that came out in the early 2010s, they actually never said that each microservice has to have a database. And they also never said that if a microservice has a database, you have to use a different technology for each microservice. Just like they never said you have to write each microservice in a different programming language, right? 
So where I'm going with this is like, yes, you know, sometimes when you look at some vendors out there, some niche players, they push this message or they jump on this academic approach of like each microservice has the best tool at hand, or use a different database for each purpose, et cetera. Which often comes across like, you know, we want to stay part of the conversation. Nothing stops a developer from, you know, using a multimodal database for the microservice and just using that as a document store, right? Or just using that as a relational database. And, you know, sometimes, I mean, there was actually something that happened that was really interesting yesterday, I don't know whether you followed it, Dave, or not. But Facebook had an outage yesterday, right? And Facebook is one of those companies that are seen as the Silicon Valley, you know, know how to do microservices companies. And when you read through the outage, well, what happened, right? Some unfortunate logical error with configuration as a force that took a database cluster down. So, you know, there you have it where you go like, well, maybe not every microservice is actually in fact talking to its own database or its own special purpose database. I think, you know, what the industry should be focusing on, much more than this argument of which technology to use and what's the right tool for a job, is to ask themselves, what business problem are we actually trying to solve? And therefore what's the right approach and the right technology for this. And so therefore, just as I said before, you know, multimodal databases, they do have strong benefits. They have many built-in functionalities that are already there and they allow you to reduce this complexity of having to know many different technologies, right? 
And so it's not only about storing different data models, where you either, you know, treat a multimodal database as a JSON document store or as a relational database, though most databases have been multimodal for 20 plus years. But it's also actually that, if you store that data together, you can perhaps derive additional value for somebody else, though perhaps not for your application. But like for example, if you were to use Oracle Database you can actually write queries on top of all of that data. It doesn't really matter for our query engine whether the data is formatted in JSON or the data is formatted in rows and columns, you can just query over it. And that's actually very powerful for those guys that have to, you know, get the reporting done at the end of the day, the end of the week. And for those guys that are the data scientists that want to figure out, you know, which product performed really well or can we tweak something here and there. When you look into that space you still see a huge divergence between the guys that put data in, kind of the OLTP style, and the guys that try to derive new insights. And there's still a lot of ETL going around and, you know, we have big data technologies, some of which came and went and some of which came and are still around, like Apache Spark, which is still like a SQL engine on top of any of your data, kind of going back to the same concept. And so I will say that, you know, for developers when we look at microservices it's like, first of all, is the argument you are making because the vendor or the technology you want to use tells you this argument or, you know, because you kind of want to have an argument to use a specific technology? Or is it really more because it is the best technology to use for this given use case, for this given application that you have? And if so there's of course also nothing wrong with using a single purpose technology either, right? 
>> Yeah, I mean, whenever I talk about Oracle I always come back to the most important applications, the mission critical. It's very difficult to architect databases with microservices and containers. You have to be really, really careful. And so again, it comes back to what we were talking about before with Maria, the complexity and the recovery. But Gerald, I want to stay with you for a minute. So there's other data management technologies popping up out there. I mean, I've seen some people saying, okay, just leave the data in an S3 bucket. We can query that, and we've got some magic sauce to do that. And so why are you optimistic about, you know, traditional database technology going forward? >> I would say because of the history of databases. So one thing that once struck me when I came to Oracle and then got to meet great people like Juan Luis and Andy Mendelsohn, who had been here for a long, long time: I came to the realization that relational databases have been around for about 45 years now. And, you know, I was like, I'm too young to have been around then, right? So I was like, what else has been around for 45 years? It's like, just the tech stack that we have today, what does this look like? Well, Linux only came out in '93. Well, databases pre-date Linux by a lot. And as I started digging I saw a lot of technologies come and go, right? And you mentioned before the data management systems that we had that came and went, like the columnar databases or XML databases, object databases. And even before relational databases, before Codd gave us the relational model, there were apparently these network stores, network databases, which to some extent look very similar to JSON documents. They were a way of storing data in a hierarchical format. 
And, you know, when you then start actually reading the Codd paper and diving a little bit more into the relational model, there's I think one important crux in there that most of the industry keeps forgetting, or hasn't been around long enough to even know. And that is that when Codd created the relational model, he actually focused not so much on the application putting the data in, but on future users and applications still being able to make sense of the data, right? And that's kind of like I said before: we had those network models, we had XML databases, you have JSON document stores. And the one thing that they all have in common is that the application that puts the data in decides the structure of the data. And that's all well and good if you have an application and the developer writing that application. It can become really tricky when 10 years later you still want to look at that data and the application or the developer is no longer around. Then you go like, what does this all mean? Where is the structure defined? What is this attribute? What does it mean? How does it correlate to others? And the one thing that people tend to forget is that it's actually the data that's here to stay, not so much the applications. Ideally, every company wants to store every single byte of data that they have because there might be future value in it. Economically that may not always make sense, but it's now much more feasible than just years ago. But if you could, why wouldn't you want to store all your data, right? And sometimes you actually have to store the data for seven years or whatever because the laws require you to. And so coming back, you know, like 10 years from now and looking at the data and making sense of that data can actually become a lot more difficult and a lot more challenging than first figuring out how we store this data for general use. And that kind of was what the relational model was all about. 
We decompose the data structures into tables and columns with relationships between each other. So a typical example, you know, would be: you store some purchases from your web store, right? There's a customer attribute in it. There's some credit card payment information in it, there's some product information on what the customer bought. Well, in the relational model, if you just want to figure out which products were sold on a given day or week, you would just query the payment and products tables to get the sense out of it. You don't need to touch the customer and so forth. And with the hierarchical model you have to first sit down and understand how the data is structured: what is the customer? Where is the payment? You know, does the document start with the payment or does it start with the customer? Where do I find this information? And then in the very early days those databases even struggled to avoid scanning all the documents to get the data out. So coming back to your question a bit, I apologize for going on here. But you know, it's like relational databases have been around for 45 years. I would actually argue it's one of the most successful software technologies that we have out there when you look at the overall industry, right? 45 years, in IT terms, is like from a star being born to it going supernova. You have said it before that many technologies came and went, right? And I just want to add one more really interesting example, by the way, which is Hadoop and HDFS, right? They kind of gave us this additional promise in, you know, the 2010s, like 2012, 2013, the hype of Hadoop and so forth and (mumbles) and HDFS. And people were just like, just put everything into HDFS and worry about the data later, right? And we can query it and MapReduce it and whatever. 
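[Editor's illustration] The web-store example above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not anything shown in the interview: it uses Python's built-in `sqlite3` module as a stand-in for a relational database, and the table names, column names, and sample rows (`customers`, `products`, `payments`, `products_sold_on`) are all hypothetical. It only shows that "which products were sold on a given day" can be answered from the payments and products tables without touching the customer table.

```python
import sqlite3

# In-memory database standing in for a relational store (assumption:
# SQLite is used purely for illustration; all names below are hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE payments (
    payment_id  INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    product_id  INTEGER REFERENCES products(product_id),
    paid_on     TEXT,   -- ISO date, e.g. '2021-06-01'
    amount      REAL
);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO products  VALUES (10, 'Keyboard'), (11, 'Monitor');
INSERT INTO payments  VALUES
    (100, 1, 10, '2021-06-01',  49.0),
    (101, 2, 11, '2021-06-01', 199.0),
    (102, 1, 11, '2021-06-02', 199.0);
""")


def products_sold_on(day):
    """Return (product name, units sold) for one day.

    Only payments and products are joined; the customers table is never read.
    """
    return conn.execute(
        """
        SELECT p.name, COUNT(*)
        FROM payments AS pay
        JOIN products AS p ON p.product_id = pay.product_id
        WHERE pay.paid_on = ?
        GROUP BY p.name
        ORDER BY p.name
        """,
        (day,),
    ).fetchall()


print(products_sold_on("2021-06-01"))  # [('Keyboard', 1), ('Monitor', 1)]
```

In a document-per-purchase layout, the same question would first require knowing where the payment and product fields sit inside each document, which is the point being made above.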
And we had customers actually coming to us who were like, great, we have half a petabyte of data on an HDFS cluster and we have no clue what's stored in there. How do we figure this out? What are we going to do now? Now you had a big data cleansing problem. And so I think that is why databases and also data modeling are something that will not go away anytime soon. And I think databases and database technologies are here to stay for quite a while. Because many of those people don't think about what's happening to the data five years from now. And many of the niche players, and frankly even Amazon, you know, following this single-purpose thing, it's like, just use the right tool for the job for your application, right? Just put the data in there the way you want it. And it's like, okay, so you use technologies all over the place and then five years from now you have your data fragmented everywhere in different formats and, you know, inconsistencies, and, and, and. And when you come back to these data-driven, business-critical decision applications, that's usually the worst-case scenario you can have, right? Because now you need an army of people to actually do data cleansing. And it's not a coincidence that data science has become very, very popular in recent years as we kind of went on with this proliferation of different database or data management technologies, some of which are not even databases. But I think I'll leave it at that. >> It's an interesting talk track because you're right. I mean, no schema on write was alluring, but it definitely created some problems. It also created an entire, you know, you referenced the hyper-specialized roles and the data cleansing component. I mean, maybe technology will eventually solve that problem, but it hasn't, at least up to now. Okay, last question. Maria, maybe you could start off, and Gerald, if you want to chime in as well, it'd be great. 
I mean, it's interesting to watch this industry when Oracle sort of won the top database mantle. I mean, I watched it, I saw it. It was, remember it was Informix and it was (indistinct) too and of course, Microsoft you got to give them credit with SQL server, but Oracle won the database wars. And then everything got kind of quiet for awhile database was sort of boring. And then it exploded, you know, all the, you know not only SQL and the key-value stores and the cloud databases and this is really a hot area now. And when we looked at Oracle we said, okay, Oracle it's all about Oracle Database, but we've seen the kind of resurgence in MySQL which everybody thought, you know once Oracle bought Sun they were going to kill MySQL. But now we see you investing in HeatWave, TimesTen, we talked about In-Memory databases before. So where do those fit in Maria in the grand scheme? How should we think about Oracle's database portfolio? >> So there's lots of places where you'd use those different things. 'Cause just like any other industry there are going to be new and boutique use cases that are going to benefit from a more specialized product or single purpose product. So good examples off the top of my head of the kind of systems that would benefit from that would be things like a stock exchange system or a telephone exchange system. Both of those are latency critical transaction processing applications where they need microsecond response times. And that's going to exceed perhaps what you might normally get or deploy with a converged database. And so Oracle's TimesTen database our In-Memory database is perfect for those kinds of applications. But there's also a host of MySQL applications out there today and you said it yourself there Dave, HeatWave is a great place to provision and deploy those kinds of applications because it's going to run 100 times faster than AWS (mumbles). 
So, you know, there really is a place in the market, and in our customers' systems and the needs they have, for all of these different members of our database family here at Oracle. >> Yeah, well, the internet is basically running on the LAMP stack, so I don't see MySQL going away. All right Gerald, we'll give you the final word, bring us home. >> Oh, thank you very much. Yeah, I mean, as Maria said, I think it comes back to what we discussed before. There are obviously still needs for special technologies, or different technologies than a relational database or multimodal database. Oracle actually has many more databases than people may first think of. Not only the three that we have already mentioned, but there's even SP, so, Oracle's NoSQL database. And, you know, on a high level Oracle is a data management company, right? And we want to give our customers the best tools and the best technology to manage all of their data. And therefore there has to be, or there should be, a part of the business that also focuses on these highly specialized systems and these highly specialized technologies that address those use cases. And I think it makes perfect sense. It's like, you know, when the customer comes to Oracle they're not only getting this "take this one product, you know, and if you don't like it, your problem," but actually you have choice, right? And choice allows you to make a decision based on what's best for you and not necessarily best for the vendor you're talking to. >> Well guys, really appreciate your time today and your insights. Maria, Gerald, thanks so much for coming on theCUBE. >> Thank you very much for having us. >> And thanks for watching this CUBE conversation. This is Dave Vellante and we'll see you next time. (upbeat music)

Published Date : Jun 24 2021

SUMMARY :

and then you guys just follow my lead. So I noticed Maria you stopped anyway, So any time you don't So when I'm here you guys and we'll open up when Dave's ready. and the benefits they bring What are we really talking about there? the nearest stores to kind of the traditional So for example, you can do So Gerald, you think about to you at all but just receives or even a MongoDB that allows you to do ML and AI into the database, in the database you already have. and I buy that by the way. of since the last 40 years, you know the benefits to this approach is the fact that you can get And you know, it's And that buddy comes in the form of the truth here is you don't and deploy it on the cloud. and the cloud and containers and you know, is the argument you were making And so why are you because the laws require you to. And then it exploded, you and the needs they have in the lamp stack so I and the best technology to and your insights. we'll see you next time.

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Gerald Venzl	PERSON	0.99+
Andy Mendelsohn	PERSON	0.99+
Maria	PERSON	0.99+
Dave	PERSON	0.99+
Chile	LOCATION	0.99+
Maria Colgan	PERSON	0.99+
Peru	LOCATION	0.99+
100 times	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
Gerald	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Canada	LOCATION	0.99+
seven years	QUANTITY	0.99+
Juan Luis	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Steve	PERSON	0.99+
five star	QUANTITY	0.99+
Swiss Army	ORGANIZATION	0.99+
Alex	PERSON	0.99+
Facebook	ORGANIZATION	0.99+
MySQL	TITLE	0.99+
one note	QUANTITY	0.99+
yesterday	DATE	0.99+
two hands	QUANTITY	0.99+
three	QUANTITY	0.99+
two experts	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Linux	TITLE	0.99+
Teradata	ORGANIZATION	0.99+
each microservice	QUANTITY	0.99+
Hadoop	TITLE	0.99+
45 years	QUANTITY	0.99+
Oracles	ORGANIZATION	0.99+
early 2010s	DATE	0.99+
today	DATE	0.99+
one-shot	QUANTITY	0.99+
five	QUANTITY	0.99+
one good example	QUANTITY	0.99+
Sun	ORGANIZATION	0.99+
tonight	DATE	0.99+
first	QUANTITY	0.99+

CB Bohn, Principal Data Engineer, Microfocus | The Convergence of File and Object

>> Announcer: From around the globe, it's theCUBE. Presenting the Convergence of File and Object, brought to you by Pure Storage. >> Okay, now we're going to get the customer perspective on object, and we'll talk about the convergence of file and object, but really focusing on the object pieces. This is a content program that's being made possible by Pure Storage and it's co-created with theCUBE. Christopher CB Bohn is here. He's a lead architect for the MicroFocus enterprise data warehouse and principal data engineer at MicroFocus. CB, welcome, good to see you. >> Thanks Dave, good to be here. >> So tell us more about your role at Microfocus. It's a pan-Microfocus role, because we know the company is a multinational software firm. It acquired the software assets of HP, of course, including Vertica. Tell us where you fit. >> Yeah, so Microfocus is, you know, like you say, a wide, worldwide company that sells a lot of software products all over the place, to governments and so forth. And it also grows often by acquiring other companies. So there is the problem of integrating new companies and their data. And so what's happened over the years is that they've had a number of different discrete data systems, so you've got this data spread all over the place, and they've never been able to get a full, complete introspection on the entire business because of that. So my role was to come in and design a central data repository and an enterprise data warehouse that all reporting could be generated against. And so that's what we're doing, and we selected Vertica as the EDW system and Pure Storage FlashBlade as the communal repository. >> Okay, so you obviously had experience with Vertica in your previous role, so it's not like you were starting from scratch, but paint a picture of what life was like before you embarked on this sort of consolidated approach to your data warehouse. Was it just disparate data all over the place? A lot of M&A going on, where did the data live? 
>> Right, so again the data is all over the place, including under people's desks and just on dedicated, you know, their own private SQL Servers. A lot of data in Microfocus is run on SQL Server, which has pros and cons, 'cause that's a great transactional database but it's not really good for analytics, in my opinion. But a lot of stuff was running on that. They had one Vertica instance that was doing some select reporting. It wasn't a very powerful system, and it was what they call Vertica Enterprise Mode, where it had dedicated nodes which had the compute and storage in the same locus on each server, okay. So Vertica Eon Mode is a whole new world because it separates compute from storage. Okay, and at first it was implemented in AWS so that you could spin up, you know, different numbers of compute nodes and they all share the same communal storage. But there has been a demand for that kind of capability in an on-prem situation. Okay, so Pure Storage was the first vendor to come along and have an S3 emulation that was actually workable. And so Vertica worked with Pure Storage to make that all happen, and that's what we're using. >> Yeah, I know, back when we used to do face-to-face, we would be at, you know, Pure Accelerate, and Vertica was always there. I'd stop by the booth, see what they're doing, so tight integration there. And you mentioned Eon Mode and the ability to scale storage and compute independently. And I think Vertica is the only one, I know they were the first, I'm not sure anybody else does that both for cloud and on-prem. So how are you using Eon Mode? Are you both in AWS and on-prem, or are you exclusively cloud? Maybe you could describe that a little bit. >> Right, so there's a number of internal rules at Microfocus, you know, AWS is not approved for their business processes, at least not all of them. They really wanted to be on-prem, and all the transactional systems are on-prem. 
And so we wanted to have the analytics OLAP stuff close to the OLTP stuff, right? So that's why they're co-located very close to each other. And what's nice about this situation is that these are S3 objects; it's an S3 object store on the Pure FlashBlade. We could copy those over to AWS if we needed to, and we could spin up a version of Vertica there and keep going. It's like a tertiary DR strategy, 'cause we're actually setting up a second FlashBlade Vertica system geo-located elsewhere for backup. And we can get into it, if you want to talk about how the latest version of the Pure software for the FlashBlade allows synchronization across network boundaries between those FlashBlades, which is really nice, because if, you know, a giant sinkhole opens up under our colo facility and we lose that thing, then we just have to switch the DNS and we're back in business off the DR. And then the third one was, we could copy those objects over to AWS and be up and running there. So we're feeling pretty confident about being able to weather whatever comes along. >> Yeah, I'm actually very interested in that conversation, but before we go there: you mentioned you're going to have the OLAP close to the OLTP. Was that for latency reasons, data movement reasons, security, all of the above? >> Yeah, it's really all of the above, because, you know, we are operating under the same subnet. So to gain access to that data, you know, you'd have to be within that VPN environment. We didn't want to be going out over the public internet. Okay, and just for latency reasons also, you know, we have a lot of data and we're continually doing ETL processes into Vertica from our production data, transactional databases. >> Right, so they've got to be proximate. So I'm interested in, so you're using the Pure FlashBlade as an object store. Most people think, oh, object: simple but slow. Not the case for you, is that right? 
>> Not the case at all. >> Why is that? >> This thing, it's ripping. Well, you have to understand about Vertica and the way it stores data. It stores data in what they call storage containers. And those are immutable, okay, on disk, whether it's on AWS or if you had an Enterprise Mode Vertica. If you do an update or delete, it actually has to go and retrieve that object container from disk, and it destroys it and rebuilds it, okay, which is why you want to avoid updates and deletes with Vertica, because the way it gets its speed is by sorting and ordering and encoding the data on disk, so it can read it really fast. But if you do an operation where you're deleting or updating a record in the middle of that, then you've got to rebuild that entire thing. So that actually matches up really well with S3 object storage, because it's kind of the same way: it gets destroyed and rebuilt too, okay. So that matches up very well with Vertica, and we were able to design the system so that it's append-only. Now, we had some reports that we were running in SQL Server, okay, which were taking seven days. So we moved that to Vertica from SQL Server and we rewrote the queries, which had been written in T-SQL with a bunch of loops and so forth, and, this is amazing, it went from seven days to two seconds to generate this report. Which has tremendous value to the company, because they used to have this long cycle of seven days to get a new introspection into what they call the knowledge base. And now all of a sudden it's almost on demand, two seconds to generate it. That's great, and that's because of the way the data is stored. And the S3, you asked about, oh, you know, it's slow. Well, not in that context. Because what happens really with Vertica Eon Mode is that when you set up your compute nodes, they have local storage also, which is called the depot. It's kind of a cache, okay. 
So the data will be drawn from the Flash Blade and cached locally. And that was, it was thought when they designed that, oh you know it's that'll cut down on the latency. Okay but it turns out that if you have your compute nodes close meaning minimal hops to the Flash Blade that you can actually tell Vertica, you know don't even bother caching that stuff just read it directly on the fly from the from the Flash Blade and the performance is still really good. It depends on your situation. But I know for example a major telecom company that uses the same topologies we're talking about here they did the same thing. They just dropped the cache cause the Flash Blade was able to deliver the data fast enough. >> So that's, you're talking about that's speed of light issues and just the overhead of switching infrastructure is that, it's eliminated and so as a result you can go directly to the storage array? >> That's correct yeah, it's like, it's fast enough that it's almost as if it's local to the compute node. But every situation is different depending on your needs. If you've got like a few tables that are heavily used, then yeah put them in the cache because that'll be probably a little bit faster. But if you're have a lot of ad hoc queries that are going on, you know you may exceed the storage of the local cache and then you're better off having it just read directly from the, from the Flash Blade. >> Got it so it's >> Okay. >> It's an append only approach. So you're not >> Right >> Overwriting on a record, so but then what you have automatically re index and that's the intelligence of the system. how does that work? >> Oh this is where we did a little bit of magic. There's not really anything like magic but I'll tell you what it is I mean. ( Dave laughing) Vertica does not have indexes. They don't exist. Instead I told you earlier that it gets a speed by sorting and encoding the data on disk and ordering it right. 
So when you've got an append-only situation, the natural question is, well, if I have a unique record with, let's say, ID one, two, three, what happens if I append a new version of that? Well, the way Vertica operates is that there's a thing called a projection, which is actually like a materialized columnar data store. And you can have what they call a top-K projection, which says only put in this projection the records that meet a certain condition. So there's a field that we like to call a discriminator field, which is, okay, usually it's the latest update timestamp. So let's say we have record one, two, three and it had yesterday's date, and that's the latest version. Now a new version comes in. At load time, Vertica looks at that and then it looks in the projection and says, does this exist already? If it doesn't, then it adds it. If it does, then the new version now goes into that projection in its place, okay. And so what you end up having is a projection that is the latest snapshot of the data, which would be like, oh, that's the reality of what the table is today, okay. But inherent in that is that you now have a table that has all the change history of those records, which is awesome. >> Yeah. >> Because you often want to go back and revisit, you know, what happened. >> But that materialized view is the most current, and the system knows that, at least can (murmuring). >> Right, so we then create views that draw from that projection so that our users don't have to worry about any of that. They just say, select from this view, and they're getting the latest, greatest snapshot of what the reality of the data is right now. But if they want to go back and say, well, how did this data look two days ago? That's an easy query for them to do also. So they get the best of both worlds. >> So could you just plug any flash array into your system and achieve the same results, or is there anything really unique about Pure? 
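[Editor's illustration] The append-only pattern described above, where every version of a record is kept and users query a "latest snapshot", can be sketched as follows. This is a minimal sketch under stated assumptions: it uses Python's built-in `sqlite3` module and an ordinary SQL view, not Vertica's top-K projection syntax, and the names `staging_orders`, `orders_latest`, `append_version`, `latest_snapshot`, and `snapshot_as_of` are hypothetical. The `updated_at` column plays the role of the discriminator field mentioned in the interview.

```python
import sqlite3

# SQLite stands in for the warehouse here (assumption); Vertica would
# maintain a projection at load time rather than compute a view on read.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Append-only staging table: rows are only ever inserted, never updated.
CREATE TABLE staging_orders (
    order_id   INTEGER,
    status     TEXT,
    updated_at TEXT      -- discriminator field: latest update timestamp
);

-- Latest snapshot: for each order_id keep only the newest version.
CREATE VIEW orders_latest AS
SELECT s.order_id, s.status, s.updated_at
FROM staging_orders AS s
JOIN (
    SELECT order_id, MAX(updated_at) AS max_ts
    FROM staging_orders
    GROUP BY order_id
) AS m ON m.order_id = s.order_id AND m.max_ts = s.updated_at;
""")


def append_version(order_id, status, updated_at):
    """Record a new version of a row by appending, never by updating."""
    conn.execute(
        "INSERT INTO staging_orders VALUES (?, ?, ?)",
        (order_id, status, updated_at),
    )


def latest_snapshot():
    """What users normally query: the current state of every record."""
    return conn.execute(
        "SELECT order_id, status FROM orders_latest ORDER BY order_id"
    ).fetchall()


def snapshot_as_of(timestamp):
    """How the data looked at an earlier point, using the full history."""
    return conn.execute(
        """
        SELECT s.order_id, s.status
        FROM staging_orders AS s
        JOIN (
            SELECT order_id, MAX(updated_at) AS max_ts
            FROM staging_orders
            WHERE updated_at <= ?
            GROUP BY order_id
        ) AS m ON m.order_id = s.order_id AND m.max_ts = s.updated_at
        ORDER BY s.order_id
        """,
        (timestamp,),
    ).fetchall()


append_version(123, "open", "2021-04-26T09:00:00")
append_version(123, "shipped", "2021-04-27T15:30:00")  # newer version of 123
append_version(456, "open", "2021-04-27T10:00:00")

print(latest_snapshot())                      # [(123, 'shipped'), (456, 'open')]
print(snapshot_as_of("2021-04-26T23:59:59"))  # [(123, 'open')]
```

The design choice the sketch mirrors is that history is never destroyed: the "current" table is just a filtered reading of the log, so both the latest state and any earlier state come from the same rows.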
>> Yeah well they're the only ones that have got I think really dialed in the S3 object form because I don't think AWS actually publishes every last detail of that S3 spec. Okay so it had, there's a certain amount of reverse engineering they had to do I think. But they got it right. When we've, a couple maybe a year and a half ago or so there they were like at 99%, but now they worked with Vertica people to make sure that that object format was true to what it should be. So that it works just as if Vertica doesn't care, if it is on AWS or if it's on Pure Flash Blade because Pure did a really good job of dialing in that format and so Vertica doesn't care. It just knows S3, doesn't know what it doesn't care where it's going it just works. >> So the essentially vendor R and D abstracted that complexity so you didn't have to rewrite the application is that right? >> Right, so you know when Vertica ships it's software, you don't get a specific version for Pure or AWS, it's all in one package, and then when you configure it, it knows oh okay well, I'm just pointed at the, you know this port, on the Pure storage Flash Blade, and it just works. >> CB what's your data team look like? How is it evolving? You know a lot of customers I talked to they complain that they struggled to get value out of the data and they don't have the expertise, what does your team look like? How is it, is it changing or did the pandemic change things at all? I wonder if you could bring us up to date on that? >> Yeah but in some ways Microfocus has an advantage in that it's such a widely dispersed across the world company you know it's headquartered in the UK, but I deal with people I'm in the Bay Area, we have people in Mexico, Romania, India. >> Okay enough >> All over the place yeah all over the place. So when this started, it was actually a bigger project it got scaled back, it was almost to the point where it was going to be cut. 
Okay, but then we said, well, let's try to do almost a skunkworks type of thing with reduced staff. And so we're just like a handful; you could count the number of key people on this on one hand. But we got it all together, and it's been a dramatic transformation for the company. Now it's won approval and admiration from the highest echelons of this company, that, hey, this is really providing value. And the company is starting to get views into their business that they didn't have before. >> That's awesome. I mean, I've watched Microfocus for years. So to me, part of their DNA is private equity. I mean, they're sharp investors, they do great M&A. >> CB: Yeah. >> They know how to drive value and they're doing modern M&A, you know, we've seen what they did with SUSE, obviously driving value out of Vertica. They've got some really sharp financial people there. So they must have loved the skunkworks: fast ROI, you know, small denominator, big numerator. (laughing) >> Well, I think that in this case, smaller is better when you're doing development. You know, it's a too-many-cooks type of thing, and if you've got people who know what they're doing, you know, I've got a lot of experience with Vertica, I've been on the advisory board for Vertica for a long time. >> Right. >> And, you know, I was able to learn from people who had already done it. We're like the second or third company to do a Pure FlashBlade Vertica installation, but some of the best companies that had already done it are members of the advisory board also. So I learned from the best, and we were able to get this thing up and running quickly, and we've got, you know, a handful of other key people who know how to write SQL and so forth to get this up and running quickly. >> Yeah, so I mean, look, Pure is a fit. I mean, I sound like a fan boy, but Pure is all about simplicity, and so is object. 
So that means you don't have to, you know, worry about wrangling storage and worrying about LUNs and all that other nonsense and file names, but >> I have been burned by hardware in the past, you know, where, oh okay, they built to a price, and so they cheap out on stuff like fans or other things, and these components fail and the whole thing goes down. But this hardware is super good quality. And so I'm happy with the quality that we're getting. >> So CB, last question. What's next for you? Where do you want to take this initiative? >> Well, we are in the process now of, so, I designed a system to combine the best of the Kimball approach to data warehousing and the Inmon approach, okay. And what we do is we bring over all the data we've got and we put it into a pristine staging layer. Okay, like I said, because it's append-only, it's essentially a log of all the transactions that are happening in this company, just as they appear, okay. And then from the Kimball side of things, we're designing the data marts now. So that's what the end users actually interact with. So we're examining the transactional systems to say, how are these business objects created? What's the logic there? And we're recreating those logical models in Vertica. So we've done a handful of them so far, and it's working out really well. So going forward we've got a lot of work to do, to create just about every object that the company needs. >> CB, you're an awesome guest, really, always a pleasure talking to you, and >> Thank you. >> congratulations and good luck going forward, stay safe. >> Thank you, you too Dave. >> All right, thank you. And thank you for watching the Convergence of File and Object. This is Dave Vellante for theCUBE. (soft music)

Published Date : Apr 28 2021

SUMMARY :

brought to you by Pure Storage. but really focusing on the object pieces it acquired the software assets of HP all over the place to Okay so you obviously so that you could spin up you know and the ability to scale, and we can get into it if you want to talk security, all of the above. Yeah it's really all of the above Not the case for you is that right? And the S3 you asked about, storage of the local cache So you're not and that's the intelligence of the system. and that's the latest version. you know what it will happen to you. and the system knows that at least the data is right now. in the S3 object form and then when you configure it, I'm in the Bay Area, And the company is starting to get So to me they've always had loved the the Skunkworks, I've been on the advisory a lot of other, you know So that means you don't have to by hardware in the past you know, Where do you want to take this initiative? object that the company needs. congratulations and good And thank you for watching


George Lumpkin & Neil Mendelson, Oracle | CUBE Conversation, April 2021

(bright upbeat music) >> Hi, this is Dave Vellante. We're digging deeper into the world of database. You know, there are a lot of ways to skin a cat, and different vendors take different approaches, and we're reaching out to the technologists to get their perspective on the major trends that they're seeing in the market, 'cause we want to understand the different ways in which you can solve problems. So look, if you have thoughts and the technical chops on this topic, I'd love to interview you. Just ping me at @DVellante on Twitter; there are a lot of ways to get ahold of me. Anyway, we recently spoke with Andrew Mendelsohn, who is Oracle's EVP, and he's responsible for database server technologies. And we talked a lot about Oracle's ADW, Autonomous Data Warehouse. And we looked at the cloud database strategy that Oracle is taking and the company's plans and how they're different, maybe, from other solutions in the marketplace, but I wanted to dig deeper. And so today we have two members of Mendelsohn's team on theCUBE, and we're going to probe a little bit. George Lumpkin is the Vice President of Autonomous Data Warehouse. And Neil Mendelson is the VP of Modern Data Warehouse, that business for Oracle. They're both 20-year veterans of Oracle. When I reached out to Steve Savannah, who's been a colleague of mine for many years, he's always telling me how great Oracle is relative to the competition. So I said, okay, come on theCUBE and talk about this, give me your best people. And he said, whatever these two don't know about cloud data warehouse isn't worth knowing anyway. So with that said, gentlemen, welcome to theCUBE. Thanks so much for coming on. >> Thank you. >> Hey, glad to be here. >> So George, let's start with you. And maybe we could recap for some of the viewers who might not be familiar with the interview that I did with Andy. In your words, what exactly is an Autonomous Data Warehouse? Is this cloud native? Is it an Oracle buzzword? What is it?
>> Well, I mean, Autonomous Data Warehouse is Oracle's cloud data warehouse. It's a service that's built to allow business users to get more value from their data. That's what the cloud data warehouse market is. Autonomous Data Warehouse is absolutely cloud native. This is a huge misconception that people might have when they first hear about this service, because they think this is an Oracle database, right? Oracle makes databases. This is the same old database I knew from 10 years ago. And that's absolutely not true. We built a cloud native service for data warehousing, built it with cloud features. You know, if your understanding of the cloud data warehouse market is based upon how you thought things looked 10 years ago, like, Snowflake wouldn't have even existed, right? You can't base your understanding of Oracle upon that. We have a modern service that's highly elastic, provides cloud capabilities like online patching, and it's fully autonomous. It's really built for business users so they don't need to worry about administering their database. >> So I want to come back and actually ask you some questions about that, but let me follow up and talk about some of the evolution of the ADW. Where did you start? I think it was 2018, maybe. Where you came from, where you are today, maybe you can take us through the technological progression and maybe the path you took to get here. >> And so 2018 was when we released the service and made it generally available, but of course, you know, we started much earlier than that. And this was started within my product management team and other organizations. So we really sat down with a blank sheet of paper and we said, what should the data warehouse in the cloud look like? You know, let's put aside everything that Oracle does for its on-prem customers and think about how the cloud should be different.
And the first thing that we said was, well, you know, if Oracle writes the database software, and Oracle builds its own hardware, and Oracle has created its own cloud, why do we need customers to manage a database? And that's where the idea of autonomous database came from: that Oracle is managing the entire ecosystem. And therefore we built a database that we believe is far and away the simplest-to-use data warehouse in the market. And that's been our focus since we started in 2018. And that continues to be our focus, looking at more ways that we can make Autonomous Data Warehouse simpler and easier for business users to get more value out of their data. >> Awesome, one more question. And actually, Neil, you might want to chime in on this as well. So just from a technical perspective, you know, forget the marketing claims and all the BS. How do you compare ADW to the so-called born-in-the-cloud data warehouses? You mentioned Snowflake, you know, Redshift. Is Redshift born in the cloud? Well, it was ParAccel, but Amazon's done some good work around Redshift. I think BigQuery is maybe probably a better example, 'cause it, you know, like Snowflake, started in the cloud. But how do you compare ADW to some of these other so-called born-in-the-cloud data warehouses? >> I think part of this, you mentioned Redshift wasn't born in the cloud. It was, you know, a code base taken from a prior company that was an on-premises company. So they adapted it to the cloud, right? And you know, we have done, as George said, much of the same, which is, you know, our starting point was not, you know, another company's code base, but our starting point was our own code base. But as George said, it's less about the starting point and it's more about where you envision the end point, right? Which is that, you know, whatever your starting point is, I think we have a fundamentally different view of the endpoint.
Amazon talks about how they're literally built for, you know, a cloud built for developers, right? You know, builders, right? And you know, Oracle wasn't first in the infrastructure business; we entered through the applications business. And all of a sudden, you know, we began taking on hundreds of thousands, and even more, customers that were SaaS customers. Underneath was the database and all the infrastructure. One of the things that we took away from that was that we couldn't possibly hire enough people, DBAs, to manage all the infrastructure below our applications customers. So one of the things that influenced this is that, you know, customers expect SaaS applications to just take care of themselves, right? So we had to essentially modify the infrastructure to allow it to do so as well, right? And we're bringing that capability to those people who, you know, may or may not have an application, but their interest is, you know, more in this self-service, agility type of aspect. >> So it seems to me, and George was sort of alluding to this before. I mean, you mentioned Snowflake a couple of times, and then Neil, something you just said, I'm going to pick up on, is you've been around for a long time. And you know, when I talk to the Snowflake people, they know Oracle, a lot of them came from Oracle. They understand, I think, how you can't just build Oracle overnight and build in the capabilities that Oracle has, and the recovery. And you talk to customers and, you know, you are the gold standard of, you know, especially mission-critical databases, so I get that. But now you just sort of hit on it: it takes a lot of people and skill to run the database. So that's the problem that you're saying you were attacking. Am I getting that right? >> Right, right, so the people that you talked about who originally built Snowflake came from Oracle, but they came from Oracle more than a decade ago. So their context is over a decade old, right?
In the meantime, we've been busy, you know, building Autonomous and many other capabilities, right? Their view of Oracle is the view from more than 10 years ago, right? They're still adding capability. So a really good illustration of this is, Oracle, as you said, is the most capable system that's out there and has been for many years. We've been focusing on how do we simplify that and how do we use machine learning embedded within the system itself? Because core to the concept of autonomous is that inside is this machine learning system that's continually improving, right? That's the whole notion. Whereas in Snowflake's case, they're still adding functionality. Last year, they added masking, which, you know, is functionality they didn't have, but when they added the capability, they added it without, you know, the ability for a business user to actually take advantage of it. There's no capability for a business user to actually find the information that needs to be masked. And then after the information is found, you require a technical person to actually implement the mask. In Oracle's case, we've had masking and those capabilities for a long time; our focus was to be able to provide a simple tool that a business user can use, one that doesn't need technical or security experience. Find the data that needs to be masked, PII data, and then hit a button and have it masked for you. So, you know, without this notion of a strategy to move toward the system healing itself and managing itself, they're just going to continue. As they continue to add more capability, they will in turn add more complexity. What we're trying to do is take complexity out while others are adding it in; it's an ironic twist. >> It is an ironic twist. It is interesting to look at it. And I don't want to make this about Snowflake. But I mean, hey, I like what they're doing. I like them.
I know the management, they're growing like crazy, and, you know, the customers tell me, hey, this is really simple. And it's simple by design. I mean, to your point, over time it's going to get, you know, more and more complex. I was talking to Andy, I think it was Andy. He was saying, you know, they've got the different sizes, you know, they call them t-shirt sizes. And I was like, okay, I've got a small, I've got a medium and a large, maybe that's okay. But you guys would say, we give more granular, you know, scaling, I guess is the point there, right? I mean, George, I don't know if you can comment on that. It's just a different strategy. You've got a company that was founded, well, I guess, 2015, versus one that was founded in 1977. So you would think the latter has, you know, way more function than the former, but George, anything you'd add to this conversation? >> Yeah, I mean, I'm always amazed that there are these database systems that are perceived as cloud native and they do things like sell you database sizes by t-shirt sizes, as you described. I mean, if you look at Snowflake, it's small, medium, large, extra large, 2X-large, but they're all factors of two. You're getting a size of your database of two, four, eight, 16, 32, et cetera. Or if you look at AWS Redshift, you're buying your database by the nodes. You say, how many nodes do you want? And in both those cases, this is supposedly cloud native. This is saying, we have some hardware underneath our database and we need you, Mr. Customer, to tell us how many servers you want. That's not the way the cloud should work, right? And I think this is one of the things that we did with Autonomous Data Warehouse. We said, no, that's not how the rules should work. We still run our database on hardware, we still have nodes and servers. We should ask the customer, how many CPUs would you like for your data warehouse? You want 16? Sounds good. You want 18? Yeah, we can give you 18.
We're not, you know, we're not selling these to you in bundles of eight or bundles of six or powers of two. We'll sell you what you need. That's what cloud elasticity should be. Not this idea that, oh, we are a database that should be managed by IT. IT already knows about servers and nodes. Therefore it's okay if we tell people your cloud data warehouse runs on nodes. Within Oracle, as Neil said, we wouldn't. The data warehouse should be used by the people who want to actually analyze their data; it should be used by the business users. >> Well, and so the other piece of cloud native that has become popular is this idea of separating compute from storage and being able to scale those two independently of each other, which is pretty important, right? Because you don't want to have to pay for a chunk of compute if you only need the storage, and vice versa. Maybe you could talk about that, how you solve that problem, to the extent that you solve that problem. >> Absolutely, we do separate compute from storage with Autonomous Data Warehouse. When you come in, you say, I need 10 CPUs for my data warehouse and I need two terabytes of storage. Those are two independent decisions that you make. So they're not tied together in any way. And you are exactly right, Dave, this is how things should work in the cloud. You should pay for what you need, pay for what you use, not be constrained by having big sets of storage you have to use for a given amount of CPU, or vice versa. >> Okay, go ahead, Neil, please. >> Oh, just to add on to that, you know, the other aspect that comes into play is that, you know, your starting point is X, whatever that happens to be. Over time that changes. And we all know that workloads vary, right, throughout the day, throughout the month, throughout the year, by various events that occur: maybe the close of the year, close of business at the end of the quarter, or maybe, you know, holiday season for retailers and so forth.
So, you know, it's not only the starting point, but how do you actually manage the growth, right? Scaling up and scaling down, right? In our case, as George said, we abstracted that completely for the customer. We basically said, check a box, which is auto scale. So, if the system requires more resources, we'll apply more resources. And we do so instantaneously, without any downtime whatsoever, right? Because, you know, again, people think in terms of, these systems have now become business critical. So if they're business critical, you can't just shut down to expand. Imagine during the holiday season your business is ramping up. And then all of a sudden you have to scale, right? And your system either shuts down, reboots itself, right? Or it slows down to the point that it's a crawl and all your customers get frustrated. We don't do that. You click a button, auto scale, and we take care of it for you, smoothing out those lumps, right? Without any technical assistance. And again, if you look at Redshift, you look at all these various systems, they require technical assistance to be able to figure out not only your initial sizing, but how you scale out over time. >> Interesting, okay. So all that said, you know, a lot of companies are using Azure, AWS, Google for infrastructure. Why would these customers not just use their database? Why would they switch to Oracle or ADW? >> Well, I think Neil will probably add something. I want to start by saying a huge number of our existing Autonomous Data Warehouse customers today are customers of AWS and Azure. They are pulling data from AWS and Azure and bringing it into an Oracle Autonomous Data Warehouse. And we built features, you know, I'm focused on product management, we built features for that. And so it's perfectly viable, and it's almost commonplace, for the very largest enterprises to be doing that. But then coming to the question of why would they want to do it? I don't know, Neil, you want to take that?
>> Yeah, yeah, so one of the things that we've really seen emerge here is, you know, a data warehouse doesn't generate the transactions itself, right? So the data has to come from somewhere, right? And you ask yourself, well, where does the data come from? Well, in a lot of cases, that data is coming from applications, and increasingly SaaS applications, that the company has deployed. And those are, you know, HR applications, you know, CRM applications, you know, ERP applications and many vertical applications. In Oracle's case, what we've done is we say, okay, well, we have the application, this transactional thing, we have the infrastructure from the Autonomous Data Warehouse, why don't we just make it really, really easy? And if you're an Oracle applications customer that's already running on the Oracle cloud, we will essentially provide you the ability to create a data warehouse from that information, right? With a click, largely, either with a product and service or a quick start kit. You don't start from scratch, you start from where you are. And there are many cases where where you are has data. Very much as George mentioned before, telcos, banks, insurance companies, governments, all of the data that they want to analyze, a lot of that data, guess where it's coming from? It's coming from Oracle applications. So it makes sense to be able to have both the data that's generated and the data that's being analyzed close to the same place. Because at the end of the day, the payoff pitch for any form of analysis is not coming up with an insight, oh, I realized X, Y, Z, but it's rather putting the insight directly into production. And that's where, when you have this stuff spread all over God's green earth, trying to go from insight into action can take months, if not years.
The reason that a lot of customers are now turning to us is that they need to be much more agile, and they need to be able to turn that insight into action immediately without it being a science project. >> Okay, thank you for that. So let's tick them off. Like, what are the top things that customers can get from Oracle Autonomous Data Warehouse that they couldn't get from, say, a Snowflake or Redshift or BigQuery or SQL Server or something? I appreciate you guys' willingness to talk about the competition. Let's tick them off. What are the most important things that we should know about that they can't get elsewhere? >> So first, I mean, we already talked about a couple of what we think are really the major themes of Autonomous Data Warehouse. The service is autonomous. You don't need to worry about managing it; anyone can manage the data warehouse. The service is elastic. You can buy and pay for what you use. You know, those are just what we think of as being the general characteristics of Autonomous Data Warehouse. But you know, when you come to your question of, hey, what do we give that other vendors don't provide? I think the one angle where Autonomous Data Warehouse does a really good job, and Neil was just discussing this, is that it focuses on the business problems, right? We have years and years of experience with not just database security, but data security, right? You know, every cloud vendor can say, oh, we encrypt all your data, we have these compliance certifications, all of these things. And what they're saying is, we are securing your database, we are securing your database infrastructure. And Oracle of course has to do those as well. But where we go further is we say, hey, no, no, no, no, no, we know what business users want. They want to secure their data. What kind of data am I storing? Do I have PII data? Could you detect whether there's PII data and tell me about it in case some user loaded something that I wasn't aware of?
What kind of privileges did I give my users? Can you make sure that those privileges are right? And can you tell me if users were given privileges that they're not using? Maybe I need to take them away. These are the problems that Oracle's tackled in security over the last 20 years. It's really more about the business problem. Yeah, some other, oh, go ahead. >> Oh, I'm sorry, I've got so many questions for you guys. We'll get back to that, 'cause it sounds like there's a long list. (laughs) >> We have nowhere to go. (laughs) >> I want to pick up with George on something you said about elasticity. Is it true pay by the drink? Do you have consumption pricing? I mean, can I dial it up and dial it down whenever I want? How does that work? >> Yes, I mean, not to get into too many technical details, but you say, I want 14 CPUs, that's what your database runs at. You can change that default number anytime you want, online, right? You can say, okay, I'm coming up on my quarter end, I'm going to raise my database to 20 CPUs. We just do it on the fly. We just adjust the size. >> What about the other way? What about coming down? Can I go down to one? >> You can go down to one. >> And you're not going to charge me for 14 if I go down to one? >> No, if you set it down to one, you get charged for one, right? >> Okay, that's good, that's good. >> In the background, you know, we are also allowing levels of auto scaling. You say, hey, I want to be charged for 14, and Oracle, can you take care of all the scaling for me? So if a bunch of people jump on at 5:00 PM to run some queries, 'cause the executive said, hey, I need a report by tomorrow morning, we'll take care of that for you. We'll let you go beyond 14 and only charge you for exactly what you use for those extra CPUs beyond 14. >> Okay, thank you. Go ahead, Neil. >> And maybe, if we add, you know, Andy talked about this when he was on that show with you last week, right?
And you know, he talked about this concept of a converged database, but let me talk about it in the way that we see it from a business point of view, right? You know, business users are looking to, you know, ask a variety of questions, right? And those questions need to be able to relate to both, you know, the customer themselves and the relationships that the customer might have with others. You know, today we talk about the social network and who the influencers are within that, and then where they actually conduct business. Which is really, you know, increasingly, in every case, on some form of mobile device. So in that case, you want to be able to ask questions, which is not only, you know, who should I focus on, but who are the key influencers within this community, right? That could influence others? And does that happen in a particular place and time? Meaning, you know, let's say pre-COVID, it might happen at a coffee shop or somewhere else. We can answer all of those questions and more inside of the autonomous system without having to replicate the data out to one system that does graph and another system that does spatial, a third system that does this. It's like a business user says, wait a minute, come on, you're trying to tell me that I need a separate system and to replicate the data just to be able to understand location? The answer in many cases is yes, you have to have separate systems, to which a business person says, well, that's absurd. Can't I just do this all in one system? You can with Oracle. >> So look, I'm not trying to be the snarky journalist or analyst here, but I want to keep pushing on this issue. So here we are, it's 2021. It's April. We're like a third of the way through the year. And so far, nobody has come out and said, okay, we're going to deliver Autonomous Data Warehouse just like Oracle. So I asked myself, well, why is Oracle doing this? You guys answered, you know, to reduce the labor cost.
But I asked myself, is this how they're solving the problem of keeping relevant a database that spans five decades? And you guys said, no, no, this is cloud native, born in the cloud, you know, started essentially with a new mindset. But is this a trend that others are going to follow? You know, and if so, why haven't we seen it, this idea of self-driving databases? Why is it right now unique to Oracle? What's really going on here? >> So I think there's a really interesting thing that's happening; it's not visible outside of Oracle. It's very visible for those of us who work inside of the development organization. You know, if you look at Oracle, I mean, I think it's safe to presume Oracle has the largest database development organization on the planet, right? I mean, it's been the largest, most used database for the past two decades. And what's happened is we pivoted to building a cloud platform. We're not just building a database; we're taking all of these resources that we have, with all this expertise in building database software. We are saying, we now have to build the platform to run and manage the database software in the cloud, right? And it's a little bit like, you know, I think to make people relate to it a little better, there was a really good quote from Elon Musk a couple of years ago, talking about Tesla. Like, everyone looks at the car, right? Tesla, the car is really great. The hard part of this is building the factory, and that analogy holds for Oracle. What we're building is the cloud factory. And what we have transitioned to is, our database development organization is now building as robust a cloud as possible. So that, you know, when we increase the number of databases by 10X, we don't add 10X more cloud ops people to manage it. We are ramping up developers building features to automate the management of our cloud infrastructure. And with that automation, we get better availability, fewer errors, more security.
We give benefits to our cloud data warehouse customers with it. And I think this is something really important to realize, right? We build database software. We build, you know, an engineered system built for databases called Exadata, and we build a cloud platform. And these are really equal tiers in what we are building and developing today, in 2021, from the Oracle database development organization. >> Well, you mentioned Exadata. I want to shift gears here a little bit and talk about, we're seeing this hybrid cloud, on-premises clouds, they're finally gaining some traction. I've got to give props, Oracle's Cloud@Customer was really early to that game. I think it was the first, in my view anyway, true same-same vision. It took you guys a little while to get there, but it was the right vision. And the thing I always say about Oracle that people don't understand is, Oracle invests in R&D; your chairman is also the CTO. You guys are serious about technical investment, so, you know, that's where innovation comes from. And we heard during your recent earnings call, we heard some positive comments on this. So what's your take on delivering Autonomous Data Warehouse on-prem, and how do you compare with, say, Snowflake and AWS in that area? Snowflake, Frank Slootman, I've had him on record saying, we're not going to do that halfway house. Forget it, we are always going to be in the cloud. We're never going to do an on-prem installation. AWS, we'll see. To date, yeah, I don't think you can get Redshift, for instance, in Outposts, but maybe that'll come. But how do you see that emerging? What's your difference there? Maybe Neil, you could talk about that. >> Yeah, so, you know, I think, you know, customers in a lot of regulated industries, right, still have concerns about the public cloud. And I think that when you hear statements like, you know, we're never going to do, you know, on-prem. Well, Autonomous Cloud@Customer, it's not a classic on-prem solution.
What it is, it's a piece of our cloud delivered in your data center. It's still the cloud software. Oracle manages it, you know, the system itself manages itself, and we take care of that responsibility so you don't have to. The difference is that we can make that available in a public cloud as well as in a private cloud, right? And there are so many use cases, you know, that you can imagine, from a regulatory point of view or just from a comfort point of view, where customers want the ability to decide for themselves where to place this stuff, as compared to only having one option, right? And you know, you look at a lot of what's happening in the emerging world, where, you know, there are a lot of places in the world that may not have, you know, really, really high-speed internet connections to make, you know, a public cloud feasible. Well, in that case, whether you're talking about, you know, an oil rig or you're talking about something else, right? We can put that capability where it needs to be, close to the operation that you're talking about, irrespective of the deployment option. >> Well, let me just follow up on that, because I think it's interesting that, you know, Frank Slootman said that to me. I oftentimes, around AWS, I say, never say never, 'cause they'll surprise you, right? And I've learned that with Andy Jassy. But one of the things that seems difficult for on-prem would be to separate that compute from storage, because you have to actually physically move in resources. I think about Vertica Eon Mode. It's not quite the same-same. So, I mean, in that regard, maybe you're not the same-same. And maybe that dogma makes sense for some companies. For Oracle, obviously you've got a huge on-prem estate. Thoughts on that? >> So, you know, clearly, you know, typically what we'll do is that we'll provide additional hardware beyond what the customer might expect, and that allows them to use the capabilities of expansion, right?
We also have the ability to allow the customer to expand from their Cloud@Customer into the public cloud as well, and we have a lot of those situations. So we can provide a level of elasticity, even on-premises, by over-provisioning the systems while not charging the customer until they use it, only based on what they consume, right? Combined together with the ability for us to augment their usage in the public cloud as well, right? Where others, again, are constrained, right? Because they only have a single option. >> Right, well, you've got the capital resources to do that as well, which is not to be overlooked. Okay, I mean, I've blown our time here, but you guys are so awesome. (laughs) I appreciate the candor. So last question, and George, if you want to throw in a couple of those other tick boxes, you know, the differentiators, please feel free. But for both of you, if you can leave customers with the one key point or the top key points on how Oracle Autonomous Data Warehouse can really help them improve their business in the near term, what would they be? Maybe George, you could start, and then Neil, you bring us home. >> Yeah, I mean, I think that, as I said before, our starting point with Autonomous Data Warehouse is, how can we build a better customer experience in the cloud? And this continues throughout 2021, and I think that the big theme here is that business users should be able to get value directly from their data warehouses. We talked a few times about how a line-of-business user should be able to manage their own data, should be able to load their own data warehouse, should be able to start to work with their own data, should be able to build and run machine learning models against that data, with all of that built in and delivered in Autonomous Data Warehouse.
And we think that this is, you know, we see in our customer organizations large and small the light bulbs starting to go on about how easy the service is to use and how complete it is for helping business users get value from their data. >> And just adding on to what George said, the development organization has done a tremendous job of really simplifying this operation. We've also tried to do that on the business side. When a customer has an on-prem situation and they're looking at moving to the cloud, whether lift and shift or modernize, they're looking at cost, they're looking at risk, and they're looking at time. So one of the things we look at is how do we mitigate that? How do we mitigate the cost, the risk, and the time? Well, this week, I think, we announced our new Cloud Lift program, and with the Cloud Lift program Oracle will provide its cloud engineering resources around the world, and we will take the cost, the risk, and the time out of the equation. Oracle will work directly with the customer or the customer's partner of choice, maybe an Accenture or Deloitte, and we will move them, right? At little or no cost; in most cases there's no cost whatsoever, right? We mitigate the risk because we're taking the risk on. And we've built a lot of automated tools to make that go very quickly, right? And securely. And then finally, we do it in a very, very short amount of time as compared to what you would need to do with, you know, 'cause there is no Redshift on-premises. There is no Snowflake on-premises. You have to convert from what you already have to that, right? So as a company, beyond the technological barriers that George talked about, we're also trying to smooth the operation, so that a business itself can make a decision that not only do they not need the technical people to operate it, they won't need an entire consulting contract with millions of dollars in order to actually do the movement to the cloud. 
>> Well, guys, I really appreciate you coming on the program, and again, your candor in speaking openly about your approach and the competitors. It's great having you. Really, thank you for your time. >> Appreciate it. >> And thank you for watching, everybody. Look, if you guys want to come back and go toe to toe with these guys, say the word. You're always welcome to come on theCUBE. One thing's for sure, Oracle is serious when it comes to database. Thank you for watching. This is Dave Vellante. We'll see you next time. (bright music)

Published Date : Apr 7 2021


ENTITIES

Entity	Category	Confidence
Andy	PERSON	0.99+
George	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Andrew Mendelsohn	PERSON	0.99+
Neil	PERSON	0.99+
Neil Mendelson	PERSON	0.99+
Dave	PERSON	0.99+
George Lumpkin	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Deloitte	ORGANIZATION	0.99+
Steve Savannah	PERSON	0.99+
1977	DATE	0.99+
AWS	ORGANIZATION	0.99+
Frank Slootman	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
2015	DATE	0.99+
Andy Jassy	PERSON	0.99+
2018	DATE	0.99+
April	DATE	0.99+
100s	QUANTITY	0.99+
5:00 PM	DATE	0.99+
April 2021	DATE	0.99+
tomorrow morning	DATE	0.99+
Tesla	ORGANIZATION	0.99+
10 CPU	QUANTITY	0.99+
Last year	DATE	0.99+
Oracle Autonomous Data Warehouse	ORGANIZATION	0.99+

Chris Lynch, AtScale | CUBE Conversation, March 2021

>>Hello, and welcome to this CUBE Conversation. I'm John Furrier with theCUBE, here in Palo Alto, California, actually coming out of the pandemic this year. Hopefully we'll be back to real life soon. It's March, soon to be April, spring 2021. Got a great guest, Chris Lynch, who is executive chairman and CEO of AtScale, who took over at the helm of this company about two and a half years ago or so. Lots going on. Chris, great to see you, remotely, in Boston. We're here in Palo Alto. Great to see you. >>Great to see you as well, but hope to see you in person this spring. >>Yeah. I got to say, people are really missing real life. And I'm starting to see events coming back with vaccines out there, but a lot going on. I mean, Dave Vellante and I were just talking about how, when we first met you, the big data world was kicking ass and taking names. A lot's changed. Hadoop went the way it went. Vertica, the company you led, did extremely well, sold to HP, and continues to be a crown jewel for HPE. Now the world has changed in data, and with COVID more than ever, you're starting to see more and more people really doubling down. You can see who the winners and losers are. You're starting to see kind of the mega trend, and now you've got the edge and other things. So I want to get your take. AtScale took advantage of that pivot. You've been in charge. Give us the update. What's the current strategy of AtScale? >>Sure. Well, when I took the company over about two and a half years ago, it was very focused on accelerating Hadoop instances. And as you mentioned earlier, Hadoop has sort of plateaued, but the ability to take that semantic layer and deliver it in the cloud is actually even more relevant with the advent of Snowflake and Databricks and the emergence of Google BigQuery and Azure as the analytic platforms, in addition to Amazon, which obviously was the first mover in the space. 
So I would say that while people present big data as sort of a passing concept, I think it's been refined and matured, and companies are now digitizing their environment to take advantage of being able to deliver all of this big data in a way that they can get actionable insights, which I don't think has been the case through the early stages of the development of big data concepts. >>Yeah, Chris, we've always followed your career. You've been a strong operator, but you also see things a little bit early, get on the wave, and help companies turn around and also go public. A great career you've had. I got to ask you, in your opinion, because you can make sense of this for customers and make sure customers see the value proposition. So I got to ask you, in this new world of the semantic layer, you mentioned Snowflake, Amazon, and cloud scale is huge. Why is the semantic layer important? What is it, and why is it important for customers? What are they really buying with this? >>Well, they're buying a few things. They're buying freedom and choice, because we're multicloud. They're buying the ability to evolve their environments: evolution versus revolution, when they think about how they move forward in the next generation of their enterprise architecture. And the reason that you need the semantic layer, particularly in the cloud, is that we separate the source from the actual presentation of the data. So we allow data to stay where it is, but we create one logical view. That was important for legacy data workloads, but it's even more important in a world of hybrid compute models and multi-vendor cloud models. So you have one source of truth, consistency, consistent access, secure access, and actionable insights for all. And we deliver this with no code, and we allow you to turbocharge the stacks of Azure and Amazon Redshift and Google BigQuery while being able to use the data that you've created in your enterprise. 
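A minimal sketch of the idea described above, where a logical view is separated from the physical source so data can stay where it is, or move, without consumers noticing. This is an illustration only and is not AtScale's API: the `SemanticLayer` and `PhysicalSource` classes, the platform strings, and the table names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalSource:
    """Where a dataset actually lives (hypothetical names, for illustration)."""
    platform: str  # e.g. "on_prem", "snowflake", "bigquery"
    table: str     # table name on that platform


class SemanticLayer:
    """Maps stable logical names to whichever physical source holds the data."""

    def __init__(self):
        self._mapping = {}

    def register(self, logical_name, source):
        # Publish a logical name that BI tools and models can depend on.
        self._mapping[logical_name] = source

    def resolve(self, logical_name):
        # Consumers ask for the logical name; the layer finds the source.
        return self._mapping[logical_name]

    def migrate(self, logical_name, new_source):
        # Moving the data changes only the mapping, not the consumers.
        if logical_name not in self._mapping:
            raise KeyError(logical_name)
        self._mapping[logical_name] = new_source


layer = SemanticLayer()
layer.register("sales", PhysicalSource("on_prem", "dw.sales_fact"))
before = layer.resolve("sales")

# Later the table is moved to a cloud warehouse; "sales" stays the same.
layer.migrate("sales", PhysicalSource("snowflake", "analytics.sales_fact"))
after = layer.resolve("sales")

print(before.platform, "->", after.platform)
```

A real semantic layer would also carry metric definitions, access rules, and query rewriting. The sketch only shows why gradual migration becomes possible once consumers depend on the logical name rather than the physical location.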
So there's a demand for big data, and big data means being able to access all your data in one logical form, not pockets of data that are in the cloud, that are behind the firewall, that are constrained by vendor lock-in, but open access to all of the data to make the best decisions. >>So if I'm an enterprise and I'm used to on-premises data warehouses and data management, whether it's playing with Hadoop clusters or whatever, I see Snowflake, I see the cloud scale. How do I get my teams modernized? Because most companies actually have a hard time doing that. They've got to turn their existing IT into cloud powerhouses. That's what they want to do. So how do you get them there? What's the secret, in your opinion, to take a team and a company that's used to doing it on premises to the cloud? >>Sure. It's a great question. So as I mentioned before, it's the difference between evolution and revolution. Today, without AtScale, to do what you're suggesting is a revolution. And it's very difficult to perform heart surgery on the patient while he's running the Boston Marathon. That's the analog I would give you for trying to digitize your environment without this semantic layer, which allows you to first create a logical layer, a logical mapping of this information, so that you can gradually move data to the appropriate place. Without us, you're asked to go from one spot to another and do that while you're running your business. And that's what discourages companies, or creates tremendous risk, when digitizing your environment or moving to cloud. They have to be able to do it in a way that's non-disruptive to their business and seamless with respect to their current workflows. >>Now, Chris, I got to ask you, and I know you're probably not expecting this question, but most people don't know that you were also an investor before you were CEO, an angel investor as well. 
You did an angel investment deal with a company called DataRobot, and had a good outcome. And so you've seen the wave, you've seen kind of how it's progressed. You mentioned Snowflake earlier. As you look at those kinds of deals as they've evolved, you're seeing this acceleration with data science. What's your take on this? Because those companies that you've invested in have become successful or been acquired, and now you're operating AtScale as a company, and you've got to direct the company in the right direction. Where is that? Where are you taking this thing? >>Sure. It's a great question. So with respect to AI and ML and the investment that I made almost 10 years ago in DataRobot, I believed then, and I believe now more than ever, that AI is going to be the next step function in industrial productivity. And I think it's going to change the composition of our lives. I'm old enough to have been around when the web was commercialized, and to have seen the impact the internet has had on the world. I think that impact pales in comparison to what the application of AI to all walks of life is going to do. I think that within the next 24 months, companies that don't have an AI strategy will be shorted on Wall Street. I think every function, every vertical in the marketplace is going to be impacted by AI. >>And we're just seeing the infancy of mass adoption and application. When it comes to AtScale, I think we're going to be right in the middle of that. We're about the democratization of those AI and machine learning models. 
One of the interesting things we developed is this MLOps product, where, with your current BI tool, we're able to take machine learning models and ingest all the legacy BI data into those models, providing better, more accurate and precise models, and then republish that data back out to the BI tool of your choice, whether it be Tableau, Microsoft Power BI, or Excel. We don't care. >>So I got to ask you. Okay, the enterprises are easy targets, large enterprises. But with the virtualization of this world that we're living in with COVID, meaning virtual events, virtual meetings, virtual remote work, not true virtualization as we know it, IT virtualization, but the virtualization of life, small companies, even our size, theCUBE, are getting more data. So you start to see people becoming more data-full who are not used to dealing with data (indistinct). They see opportunities to pivot, leverage the data, and take advantage of cloud scale. McKinsey just put out a report that we covered. There's a trillion dollars of new TAM in innovation, new use cases around data. So a small company the size of theCUBE, SiliconANGLE, could be out there and innovate and build a use case. This is a new dynamic. This is something we're seeing, this mid-market opportunity, where people are starting to realize they can have a competitive advantage and disrupt the big guys and the incumbents. How do you see this mid-market opportunity, and how does AtScale fit into that? >>So as usual, you're spot on, John. And I think the living, breathing example is Snowflake. They brought analytics to the masses, and to small and medium enterprises that didn't necessarily have the technical resources to implement. And we're taking a page out of their book. 
We're beginning to deliver, at the end of this quarter, integrated solutions that map SME data with public market data and models, all integrated in their favorite SaaS applications, to make it simple and easy for them to get analytic insight and drive it into their business decisions. And we're very excited about it. And if we get a fraction of the adoption that Snowflake has, we'll be very successful and very happy with the results this year. >>Great to see you, Chris. I want to ask you one final question. As you look at companies coming out of the pandemic, growth strategies are going to be in play, and some projects are going to be canceled. There's pretty obvious evidence, exposed by working remote and everyone working at home, where you can start to see what worked and what wasn't working. So that's going to be clear. You're going to start to see a pattern of people doubling down on certain projects. AtScale as a company has a new trajectory. For folks that kind of knew the old company, or might not have the update, what is AtScale all about? What's the bumper sticker? What's the value proposition? What's working that you're doubling down on? >>We want to deliver advanced multi-dimensional analytics to customers in the cloud. And we want to do that without compromising on the complexity of analytics. And to do that, you have to deliver it in a seamless and easy-to-use way. And we figured out a way to do that by delivering it through the applications that they know and love today, whether it be their Salesforce or QuickBooks or, you name the SaaS application, we're going to turbocharge them with big data and machine learning in a way that's going to enhance their operations without increasing the complexity. So it's about delivering analytics in a way that customers can absorb, big customers and small customers alike. 
>>While I got you here, one final, final question, because you're such an expert at turnarounds, as well as growing companies that have a growth opportunity. There are three classes of companies that we see emerging from this new cloud scale model where data's involved. One is companies that are either rejuvenating their business model or pivoting. So they're looking at cost optimization, things of that nature. Class number two is innovation strategy, where they're using technology and data to build new use cases or change existing use cases for new capabilities. And finally, pioneers, pioneering net-new paradigms or categories. So each one has its own kind of profile. All are winning with data. As a former investor and now angel investor, and someone who's seen turnarounds and growing companies that are on the innovation wave, what's your takeaway from this? Because it's pretty miraculous if you think about what could happen in each one of those cases. There's an opportunity for all three categories with cloud and data. What's your personal take on that? >>So I think if you look at waves we've seen in the past, particularly the internet, it created a level of disruption that delivered basically a renewed playing field, so that the winners and losers really could be reset, based on their ability to absorb and leverage the new technology. I think it's the same with AI and ML. So I think it creates an opportunity for businesses that were laggards to catch up to, or even supersede, their competitors. I think it has that kind of an impact. So in my view, you're going to see, as big data and analytics and artificial intelligence mature and coalesce, vertical integration. 
So you're going to see companies that are full-stack businesses delivered through AI and cloud, that are completely new, created or rejuvenated based on leveraging these new fundamentals. >>So I think you're going to see a set of new businesses and business models that are created by this ubiquitous access to analytics and data. And you're going to see some laggards catch up, and you're going to see some of the people that say, hey, if it isn't broke, don't fix it, and they're going to go by the wayside, and it's going to happen very, very quickly. When we started this business, John, the cycle of innovation was five years. It's now under a year, maybe even five months. So it's like the difference between college and professional sports: same football game, but the speed of the game is completely different. And the speed of the game is accelerating. >>That's why the startup action's hot, and that's why startups are going from zero to 60, if you will, very quickly, highly accelerated. Great stuff. Chris Lynch, veteran of the industry, executive chairman and CEO of AtScale, here on theCUBE Conversation with John Furrier, the host. Thank you for watching. Chris, great to see you. Thanks for coming on. >>Great to see you, John, take care. Hope to see you soon. >>Okay. That's theCUBE Conversation. Thanks for watching.

Published Date : Mar 24 2021


ENTITIES

Entity	Category	Confidence
Chris	PERSON	0.99+
Chris Lynch	PERSON	0.99+
April spring, 2021	DATE	0.99+
Boston	LOCATION	0.99+
March 2021	DATE	0.99+
Amazon	ORGANIZATION	0.99+
John	PERSON	0.99+
Dave	PERSON	0.99+
Palo Alto	LOCATION	0.99+
Sean	PERSON	0.99+
March	DATE	0.99+
five months	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
five	QUANTITY	0.99+
Palo Alto, California	LOCATION	0.99+
this year	DATE	0.99+
60	QUANTITY	0.99+
One	QUANTITY	0.99+
zero	QUANTITY	0.99+
SAS	ORGANIZATION	0.99+
one final question	QUANTITY	0.98+
pandemic	EVENT	0.98+
Google	ORGANIZATION	0.98+
HPE	ORGANIZATION	0.98+
each one	QUANTITY	0.97+
HP	ORGANIZATION	0.97+
Duke	ORGANIZATION	0.97+
today	DATE	0.97+
Tableau	TITLE	0.96+
first	QUANTITY	0.95+
under a year	QUANTITY	0.94+
one final final question	QUANTITY	0.94+
QuickBooks	TITLE	0.94+
one	QUANTITY	0.93+
COVID	OTHER	0.93+
three	QUANTITY	0.93+
about two and a half years ago	DATE	0.92+
10 years ago	DATE	0.91+
BI Excel	TITLE	0.91+
about	DATE	0.91+
scale	ORGANIZATION	0.91+
two and a half years ago	DATE	0.89+
Salesforce	TITLE	0.87+
Boston marathon	EVENT	0.84+
Redshift	TITLE	0.84+
first mover	QUANTITY	0.83+
one logical form	QUANTITY	0.83+
one source	QUANTITY	0.83+
McKinsey	ORGANIZATION	0.8+
trillion dollars	QUANTITY	0.8+
next 24 months	DATE	0.8+
end	DATE	0.8+
EnLink	ORGANIZATION	0.79+
one spot	QUANTITY	0.75+
this quarter	DATE	0.71+
three classes	QUANTITY	0.7+
AtScale	ORGANIZATION	0.69+
Azure	TITLE	0.67+
Vertica	ORGANIZATION	0.63+
innovation	EVENT	0.59+
Volante	PERSON	0.56+
Azure	ORGANIZATION	0.54+
ML	ORGANIZATION	0.53+
two	QUANTITY	0.41+
COVID	TITLE	0.33+

Pure Storage Convergence of File and Object FULL SHOW V1

>> We're running what I would call a little miniseries, and we're exploring the convergence of file and object storage. What are the key trends? Why would you want to converge file and object? What are the use cases and architectural considerations? And importantly, what are the business drivers of UFFO, so-called unified fast file and object? In this program, you'll hear from Matt Burr, who is the GM of Pure's FlashBlade business. Then we'll bring in the perspectives of a solutions architect, Garrett Belsner, who's from CDW, and then the analyst angle with Scott Sinclair of the Enterprise Strategy Group, ESG. He'll share some cool data on our power panel. And then we'll wrap with a really interesting technical conversation with Chris Bond, CB Bond, who is a lead data architect at Micro Focus, and he's got a really cool use case to share with us. So sit back and enjoy the program. >> From around the globe, it's theCUBE, presenting the Convergence of File and Object, brought to you by Pure Storage. >> We're back with the Convergence of File and Object, a special program made possible by Pure Storage and co-created with theCUBE. So in this series, we're exploring that convergence between file and object storage. We're digging into the trends, the architectures, and some of the use cases for unified fast file and object storage, UFFO. With me is Matt Burr, who's the vice president and general manager of FlashBlade at Pure Storage. Hello, Matt, how you doing? >> I'm doing great. Morning, Dave, how are you? >> Good, thank you. Hey, let's start with a little 101, you know, kind of the basics. What is unified fast file and object? >> Yeah, so look, I think you've got to start with first principles, talking about the rise of unstructured data. When we think about unstructured data, you sort of think about the projections: 80% of data by 2025 is going to be unstructured data, whether that's machine-generated data or AI and ML type workloads. You start to sort of see this, I don't want to say it's a boom, but it's
sort of a renaissance for unstructured data, if you will. We move away from what we've traditionally thought of as general-purpose NAS and file shares to things that focus on fast object, taking advantage of S3, cloud-native applications that need to integrate with applications on site. AI workloads and ML workloads tend to look to share data across multiple data sets, and you really need a platform that can deliver both highly performant and scalable fast file and object from one system. >> So talk a little bit more about some of the drivers that bring forth that need to unify file and object. >> Yeah, I mean, look, there's a real challenge in managing bespoke infrastructure or architectures around general-purpose NAS and DAS, et cetera. If you think about how an architect looks at an application, they might say, well, okay, I need to have fast DAS storage proximal to the application. But that's going to require a tremendous amount of DAS, which is a tremendous amount of drives, right? Hard drives are historically pretty unwieldy to manage, because you're replacing them relatively consistently at multi-petabyte scale. So you start to look at things like the complexity of DAS, you start to look at the complexity of general-purpose NAS, and you start to look at, quite frankly, something that a lot of people don't really want to talk about anymore, but actual data center space, right? Consolidation matters. Something that's the size of a microwave, like a modern FlashBlade or a modern UFFO device, replaces something that might be the size of three or four or five refrigerators. >> So Matt, why is now the right time for this? I mean, for years nobody really paid much attention to object. S3 obviously changed that course. Most of the world's data is still stored in
file formats, and you get there with NFS or SMB. Why is now the time to think about unifying object and file? >> Well, because we're moving to things like a contactless society. The things that we're going to do are going to require a tremendous amount more compute power, network, and quite frankly storage throughput. I can give you two real primary examples here. Warehouses are being taken over by robots, if you will. It's not a war, it's sort of a friendly advancement in how do I store a box in a warehouse. We have a customer who focuses on large, big-box distribution warehousing, and a box that carried an object two weeks ago might have a different box size two weeks later. Well, that robot needs to know where the space is in the data center in order to put it, but it also needs to be able to process, hey, I don't want to put the thing that I'm going to access the most in the back of the warehouse, I'm going to put that thing in the front of the warehouse. All of those types of data are real time. You can think of the robot as almost an edge device that is processing unstructured data in real time, and it's object, right? So it's the emergence of these new types of workloads. And I'll give you the opposite example. The other end of the spectrum is ransomware. Today we'll talk to customers and they'll say, quite commonly, hey, anybody can sell me a backup device, I need something that can restore quickly. If you had the ability to restore at 270 terabytes an hour or 250 terabytes an hour, that's much faster. When you're dealing with a ransomware attack, you want to get your data back quickly. >> So I want to add, I was going to ask you about that later, but since you brought it up: what is the right, I guess call it architecture, for ransomware? I mean, explain how unified object and
file fit in. I get the fast recovery, but how would you recommend a customer go about architecting a ransomware-proof system? >> Yeah, well, with FlashBlade and with FlashArray there's an actual feature called SafeMode, and SafeMode protects the snapshots and the data from being a part of the ransomware event. What happens in a ransomware attack is you can't get access to your data, and the bad guy, the perpetrator, is basically saying, hey, I'm not going to give you access to your data until you pay me X in Bitcoin or whatever it might be, right? With SafeMode, those snapshots are protected outside of the ransomware blast zone, and you can bring back those snapshots. Because what's your alternative? If you're not doing something like that, your alternative is either to pay and unlock your data, or you have to start restoring from tape or slow disk, which could take you days or weeks to get your data back. So leveraging SafeMode in either the FlashArray or the FlashBlade product is a great way to go about architecting against ransomware. >> I've got to put my, I'm thinking like a customer now. So SafeMode, that's an immutable mode, right? You can't change the data. Can an administrator go in and change that mode? Can you turn it off? Do I still need an air gap, for example? What would you recommend there? >> Yeah, so there are still RBAC, role-based access control, policies around who can access that SafeMode and who can't. >> Right, okay. So anyway, subject for a different day. I want to bring up, if you don't object, a topic that I think used to be really front and center and is now becoming front and center again. I mean, Wikibon just produced a research
note forecasting the future of flash and hard drives, and those of you who follow us know we've done this for quite some time. If you could bring up the chart here, you can see this happening again. Originally we forecast the death of, quote-unquote, high-spin-speed disk drives, which is kind of an oxymoron. But you can see here on this chart, hard disk had a magnificent journey, but they peaked in manufacturing volume in 2010. And the reason why that is so important is that volumes now are steadily dropping. You can see that. We use Wright's Law to explain why this is a problem. Wright's Law essentially says that as your cumulative manufacturing volume doubles, your cost to manufacture declines by a constant percentage. Now, I won't go into too much detail on that, but suffice it to say that flash volumes are growing very rapidly and HDD volumes aren't. And so flash, because of consumer volumes, can take advantage of Wright's Law and that constant reduction, and that's what's really important for the next generation, which is always more expensive to build. And so this kind of marks the beginning of the end. Matt, what do you think? What does the future hold for spinning disk, in your view? >> Well, I can give you the answer on two levels. On a personal level, it's why I come to work every day: the eradication or extinction of an inefficient thing. I like to say that inefficiency is the bane of my existence, and I think hard drives are largely inefficient. And I'm willing to accept the long-standing argument that we've seen this transition in block, right, and we're starting to see it repeat itself in unstructured data. And I'm going to accept the argument that cost is a vector here, and it most certainly is, right? HDDs have been considerably cheaper than flash storage, even to this day, up to this point, right? But we're starting to
approach the point where you reach a 3x differentiator between the cost of an HDD and an SSD, and that really is the point in time when you begin to pick up a lot of volume and velocity. That tends to map directly to what you're seeing here, which is a slow decline, which I think is going to become even more rapid, probably starting around next year, where you start to see SSDs really replacing HDDs at a much more rapid clip, particularly on the unstructured data side. And it's largely around cost. The workloads that we talked about, robots and warehouses, or other types of advanced machine learning and artificial intelligence applications and workflows, require a degree of performance that a hard drive just can't deliver. We are seeing the creative, innovative disruption of an entire industry right before our eyes. It's a fun thing to live through. >> Yeah, and we would agree. I mean, the premise there is that it doesn't have to be less expensive. We think it will be by the second half, or early second half, of this decade. But even at, we think, around a 3x delta, the value of SSD relative to spinning disk is going to overwhelm. Just like with your laptop, it got to the point where you said, why would I ever have a spinning disk in my laptop? We see the same thing happening here. And we're talking about raw capacity. Put in compression and dedupe and everything else that you really can't do with spinning disks because of the performance issues, but can do with flash. Okay, let's come back to UFFO. Can we dig into the challenges specifically that this solves for customers? Give us some examples. >> Yeah, so if we think about the examples, the robotic one, I think, is the one that is the marker for
you know kind of the modern side of what we see here but what we're seeing from a trend perspective which you know not everybody's deploying robots right you know there's many companies that aren't going to be in either the robotic business or even thinking about you know sort of future oriented type things but what they are doing is green field applications are being built on object generally not on file and not on block and so you know the rise of object as sort of let's call it the next great protocol for you know for modern workloads right this is that modern application coming to the forefront and that could be anything from you know financial institutions you know right down through we've even seen it in oil and gas we're also seeing it across healthcare so you know as companies take the opportunity as industries take this opportunity to modernize you know they're modernizing not on things that are leveraging you know sort of archaic disk technology they're really focusing on object but they still have file workflows that they need to be able to support and so having the ability to be able to deliver those things from one device in a capacity orientation or a performance orientation while at the same time dramatically simplifying the overall administration of your environment both physically and non-physically is a key driver so the great thing about object is it's simple it's kind of a get put metaphor it scales out you know because it's got metadata associated with the data and it's cheap the drawback is you don't necessarily associate it with high performance and as well most applications don't you know speak in that language they speak in the language of file you know or as you
mentioned block so i see real opportunities here if i have some data that's not necessarily frequently accessed you know every day but yet i want to then whether end of quarter or whatever it is or machine learning i want to apply some ai to that data i want to bring it in and then apply a file format for performance reasons is that right maybe you could unpack that a little bit yeah so you know i think you described it well right but i don't think object necessarily has to be slow and nor does it have to be you know because when you think about you brought up a good point with metadata right being able to scale to billions of objects is of value right and i think people do traditionally associate object with slow but it's not necessarily slow anymore right we did a sort of unofficial survey of our customers and our employee base and when people described object they thought of it as like law firms and storing a word doc if you will and that's just you know i think that there's a lack of understanding or a misnomer around what modern object has become and performant object particularly at scale when we're talking about billions of objects you know that's the next frontier right is it at pace performance wise with you know the other protocols no but it's making leaps and bounds so you talked a little bit more about some of the verticals that you see i mean when i think of financial services i think transaction processing but of course they have tons of unstructured data are there any patterns you're seeing by vertical market we're you know we're not that's the interesting thing and you know as a company with a block heritage or a block dna those patterns were pretty easy to spot right there were a certain number of databases that you really needed to
support oracle sql some postgres work et cetera then kind of the modern databases around cassandra and things like that you knew that there were going to be vmware environments you know you could sort of see the trends and where things were going unstructured data is such a broader horizontal thing right so you know inside of oil and gas for example you have specific applications and bespoke infrastructures for those applications you know inside of media entertainment you know the same thing the trend that we're seeing the commonality that we're seeing is the modernization of you know object as a starting point for all the net new workloads within those industry verticals right that's the most common request we see is what's your object roadmap what's your object strategy you know where do you think object is going so there isn't any you know sort of there's no path it's really just kind of a wide open field in front of us with common requests across all industries so the amazing thing about pure just as kind of a little you know quasi armchair historian of the industry is pure was really the only company in many many years to be able to achieve escape velocity break through a billion dollars i mean 3par couldn't do it isilon couldn't do it compellent couldn't do it i could go on but pure was able to achieve that as an independent company and so you become a leader you look at the gartner magic quadrant you're a leader in there i mean if you've made it this far you've got to have some chops and so of course it's very competitive there are a number of other storage suppliers that have announced products that unify object and file so i'm interested in how pure differentiates why pure it's a great question and it's one that you know having been a long time puritan you know i take pride in answering and it's
actually a really simple answer it's business model innovation and technology right the technology that goes behind how we do what we do right and i don't mean the product right innovation is product but having a better support model for example or having on the business model side you know evergreen storage right where we sort of look at your relationship to us as a subscription right you know we're going to sort of take the thing that you've had and we're going to modernize that thing in place over time such that you're not rebuying that same you know terabyte or you know petabyte of storage that you've paid for over time so you know sort of three legs of the stool that have made you know pure clearly differentiated i think the market has recognized that you're right it's hard to break through to a billion dollars but i look forward to the day that you know we have two billion dollar products and i think with you know that rise in unstructured data growing to 80 percent by 2025 and you know the massive transition that you know you guys have noted in your hdd slide i think it's a huge opportunity for us on you know the other unstructured data side of the house you know the other thing i'd add matt i've talked to coz about this is it's simplicity first i've asked them why don't you do this why don't you do it and the answer is always the same is that adds complexity and we put simplicity for the customer ahead of everything else and i think that served you very very well what about the economics of unified file and object i mean if you bring in additional value presumably there's a cost to that but there's got to be also a business case behind it what kind of impact have you seen with customers yeah i mean look i'll go back to something i mentioned earlier which is just the reclamation of floor space and power and cooling right you know there's a
you know people want to search for kind of the sexier element if you will when it comes to looking at how you derive value from something but the reality is if you're reducing your power consumption by you know by a material percentage power bills matter in big data centers you know customers typically are facing you know a paradigm of well i want to go to the cloud but you know the cloud's ended up being more expensive than i thought it was going to be or you know i figured out what i can use in the cloud i thought it was going to be everything but it's not going to be everything so hybrid's where we're landing but i want to be out of the data center business and i don't want to have a team of 20 storage people to you know to administer my storage you know so there's sort of this very tangible value around you know hey if i could manage you know multiple petabytes with one full-time engineer because the system to your and coz's point was radically simpler to administer didn't require someone to be running around swapping drives all the time would that be a value the answer is yes 100 percent of the time right and then you start to look at okay all right well on the uffo side from a product perspective hey if i have to manage a you know bespoke environment for this application and a bespoke environment for this application and a bespoke environment for this application and a bespoke environment for this application i'm managing four different things and can i actually share data across those four different things there's ways to share data but for most customers it just gets too complex how do you even know what your gold.master copy is of data if you have it in four different places or you try to have it in four different places and it's four different siloed infrastructures so when you get to the sort of the side of you know how do you measure value in uffo it's actually
being able to have all of that data concentrated in one place so that you can share it from application to application got it i'm interested we've got a couple minutes left i'm interested in the update on flashblade you know generally but also i have a specific question i mean look getting file right is hard enough you just announced smb support for flashblade i'm interested in you know how that fits in i think it's kind of obvious with file and object converging but give us the update on flashblade and maybe you could address that specific question yeah so look i mean we're you know tremendously excited about the growth of flashblade you know we found workloads we never expected to find you know the rapid restore workload was one that was actually brought to us from a customer actually and has become you know one of our top two three four you know workloads so you know we're really happy with the trend we've seen in it and you know mapping back to you know thinking about hdds and ssds you know we're well on a path to building a billion dollar business here so you know we're very excited about that but to your point you know you don't just snap your fingers and get there right you know we've learned that doing file and object is harder than block because there's more things that you have to go do for one you're basically focused on three protocols smb nfs and s3 not necessarily in that order but to your point about smb you know we are on the path to releasing you know full native smb support in the system that will allow us to service customers we have a limitation with some customers today where they'll have an smb portion of their nfs workflow and we do great on the nfs side but you know we didn't have the ability to plug into the smb component of their workflow so that's going to open up a lot of opportunity for us on that front
and you know we continue to you know invest significantly across the board in areas like security which has you know become more than just a hot button you know today security's always been there but it feels like it's blazing hot today and so you know going through the next couple years we'll be looking at you know developing some you know pretty material security elements of the product as well so well on a path to a billion dollars is the net on that and you know we're fortunate to have smb here and we're looking forward to introducing that to those customers that have you know nfs workloads today with an smb component yeah nice tailwind good tam expansion strategy matt thanks so much really appreciate you coming on the program we appreciate you having us and thanks much dave good to see you [Music] okay we're back with the convergence of file and object in a power panel this is a special content program made possible by pure storage and co-created with the cube now in this series what we're doing is we're exploring the coming together of file and object storage trying to understand the trends that are driving this convergence the architectural considerations that users should be aware of and which use cases make the most sense for so-called unified fast file and object storage and with me are three great guests to unpack these issues garrett belsner is the data center solutions architect he's with cdw scott sinclair is a senior analyst at enterprise strategy group he's got deep experience on enterprise storage and brings that independent analyst perspective and matt burr is back with us gentlemen welcome to the program thank you hey scott let me start with you and get your perspective on what's going on the market with object the cloud a huge amount of unstructured data out there that lives in files give us your independent view of the trends that you're seeing out there well dave you know where to start i
mean surprise surprise data is growing but one of the big things that we've seen is we've been talking about data growth for what decades now but what's really fascinating is or changed is because of the digital economy digital business digital transformation whatever you call it now people are not just storing data they actually have to use it and so we see this in trends like analytics and artificial intelligence and what that does is it's just increasing the demand for not only consolidation of massive amounts of storage that we've seen for a while but also the demand for incredibly low latency access to that storage and i think that's one of the things that we're seeing that's driving this need for convergence as you put it of having multiple protocols consolidated onto one platform but also the need for high performance access to that data thank you for that a great setup i wrote down three topics that we're going to unpack as a result of that so garrett let me go to you maybe you can give us the perspective of what you see with customers is this like a push where customers are saying hey listen i need to converge my file and object or is it more a story where they're saying garrett i have this problem and then you see unified file and object as a solution yeah i think for us it's you know taking that consultative approach with our customers and really kind of hearing pain around some of the pipelines the way that they're going to market with data today and kind of what are the problems that they're seeing we're also seeing a lot of the change driven by the software vendors as well so really being able to support a disaggregated design where you're not having to upgrade and maintain everything as a single block has really been a place where we've seen a lot of customers pivot to where they have more flexibility as they need to maintain larger volumes of data and higher performance data having the ability to do that
separate from compute and cache and those other layers is really critical so matt i wonder if you could you know follow up on that so garrett was talking about this disaggregated design so i like it you know distributed cloud etc but then we're talking about bringing things together in one place right so square that circle how does this fit in with this hyper-distributed cloud edge that's getting built out yeah you know i mean i could give you the easy answer on that but i could also pass it back to garrett in the sense that you know garrett maybe it's important to talk about elastic and splunk and some of the things that you're seeing in that world and how that i think the answer to dave's question i think you can give a pretty qualified answer relative to what your customers are seeing oh that'd be great please yeah absolutely no problem at all so you know i think with splunk kind of moving from its traditional design and classic design whatever you want to call it up into smart store that was kind of one of the first that we saw kind of make that move towards kind of separating object out and i think you know a lot of that comes from their own move to the cloud and updating their code to basically take advantage of object in the cloud but we're starting to see you know with like vertica eon for example elastic other folks taking that same type of approach where in the past we were building out many 2u servers we were jamming them full of you know ssds and nvme drives that was great but it doesn't really scale and it kind of gets into that same problem that we see with you know hyper convergence a little bit where it's you know you're always adding something maybe that you didn't want to add so i think you know again being driven by software is really kind of where we're seeing the world open up there but that whole idea of just having that as a hub and a central place where you
can then leverage that out to other applications whether that's out to the edge for machine learning or ai applications to take advantage of it i think that's where that convergence really comes back in but i think like scott mentioned earlier it's really folks are now doing things with the data where before i think they were really storing it trying to figure out what are we going to actually do with it when we need to do something with it so this is making it possible yeah and dave if i could just sort of tack on to the end of garrett's answer there you know in particular vertica with eon mode the ability to leverage sharded subclusters gives you you know sort of an advantage in terms of being able to isolate performance hot spots and an advantage to that is being able to do that on a flashblade for example so sharded subclusters allow you to sort of say i'm you know i'm going to give prioritization to you know this particular element of my application and my data set but i can still share that data across those subclusters so you know as you see you know vertica advance with eon mode or you see splunk advance with smart store you know these are all sort of advancements that are you know it's a chicken and the egg thing they need faster storage they need you know sort of a consolidated data set and that's what sort of allows these things to drive forward yeah so vertica eon mode for those who don't know it's the ability to separate compute and storage and scale independently i think vertica if they're not the only one they're one of the only ones i think they might even be the only one that does that in the cloud and on-prem and that sort of plays into this distributed you know nature of this hyper-distributed cloud i sometimes call it and i'm interested in the data pipeline and i wonder scott if we could talk a little bit about that maybe where unified object and file i
mean i'm envisioning this distributed mesh and then you know uffo is sort of a node on that that i can tap when i need it but scott what are you seeing as the state of infrastructure as it relates to the data pipeline and the trends there yeah absolutely dave so when i think data pipeline i immediately gravitate to analytics or machine learning initiatives right and so one of the big things we see and this is an interesting trend it seems you know we continue to see increased investment in ai increased interest and as companies get started they think okay well what does that mean well i got to go hire a data scientist okay well that data scientist probably needs some infrastructure and what often happens in these environments is it ends up being a bespoke environment or a one-off environment and then over time organizations run into challenges and one of the big challenges is the data science team or people whose jobs are outside of it spend way too much time trying to get the infrastructure to keep up with their demands and predominantly around data performance so one of the ways organizations that especially have artificial intelligence workloads in production and we found this in our research have started mitigating that is by deploying flash all across the data pipeline we have data on this sorry to interrupt but yeah if you could bring up that chart that would be great so take us through this scott and share with us what we're looking at here yeah absolutely so dave i'm glad you brought this up so we did this study i want to say late last year one of the things we looked at was across artificial intelligence environments now one thing that you're not seeing on this slide is we went through and we asked all around the data pipeline and we saw flash everywhere but i thought this was really telling because this is around data lakes and when many people think
about the idea of a data lake they think about it as a repository it's a place where you keep maybe cold data and what we see here is especially within production environments a pervasive use of flash storage so i think that 69 percent of organizations are saying their data lake is mostly flash or all flash and i think we have zero percent that don't have any flash in that environment so organizations are finding out that flash is an essential technology to allow them to harness the value of their data so garrett and then matt i wonder if you could chime in as well we talk about digital transformation and i sometimes call it you know the covid forced march to digital transformation and i'm curious as to your perspective on things like machine learning and the adoption and scott you may have a perspective on this as well you know we had to pivot we had to get laptops we had to secure the end points you know and vdi those became super high priorities what happened to you know injecting ai into my applications and machine learning did that go on the back burner was that accelerated along with the need to digitally transform garrett i wonder if you could share with us what you saw with customers last year yeah i mean i think we definitely saw an acceleration i think folks in my market are still kind of figuring out how they inject that into more of a widely distributed business use case but again this data hub and allowing folks to now take advantage of this data that they've had in these data lakes for a long time i agree with scott i mean many of the data lakes that we have were somewhat flash accelerated but they were typically really made up of you know large capacity slower spinning near-line drives accelerated with some flash but i'm really starting to see folks now look at some of those older hadoop implementations and really leveraging new ways to look at how they consume data and many of those redesigned customers are coming to us
wanting to look at all flash solutions so we're definitely seeing it we're seeing an acceleration towards folks trying to figure out how to actually use it in more of a business sense now where before i feel it was a little bit more skunk works kind of people dealing with you know in a much smaller situation maybe in the executive offices trying to do some testing and things scott you're nodding away anything you can add in here yeah so first off it's great to get that confirmation that the stuff we're seeing in our research garrett's seeing you know out in the field and in the real world but you know as it relates to really the past year it's been really fascinating so one of the things we study at esg is i.t buying intentions what are initiatives that companies plan to invest in and at the beginning of 2020 we saw a heavy interest in machine learning initiatives then you transition to the middle of 2020 in the midst of covid some organizations continued on that path but a lot of them had to pivot right how do we get laptops to everyone how do we continue business in this new world well now as we enter into 2021 and hopefully we're coming out of this you know the pandemic era we're getting into a world where organizations are pivoting back towards these strategic investments around how do i maximize the usage of data and actually accelerating those because they've seen the importance of digital business initiatives over the past year yeah matt i mean when we exited 2019 we saw a narrowing of experimentation and our premise was you know that organizations are going to start now operationalizing all their digital transformation experiments and then we had a you know 10 month petri dish on digital so what are you seeing in this regard a 10 month petri dish is an interesting way to describe it you know we saw another there's another candidate for pivot in there around
ransomware as well right you know security entered into the mix which took people's attention away from some of this as well i mean look i'd like to bring this up just a level or two because what we're actually talking about here is progress right and progress is an inevitability you know whether you believe that it's by 2025 or you think it's 2035 or 2050 it doesn't matter we're on a forced march to the eradication of disk and that is happening in many ways you know due to some of the things that garrett was referring to and what scott was referring to in terms of what are customers demands for how they're going to actually leverage the data that they have and that brings me to kind of my final point on this which is we see customers in three phases there's the first phase where they say hey i have this large data store and i know there's value in there i don't know how to get to it or i have this large data store and i've started a project to get value out of it and we failed those could be customers that you know marched down the hadoop path early on and they got some value out of it but they realized that you know hdfs wasn't going to be a modern protocol going forward for any number of reasons you know the first being hey if i have gold.master how do i know that gold.4 is consistent with my gold.master so data consistency matters and then you have the sort of third group that says i have these large data sets i know how to extract value from them and i'm already on to the verticas the elastics you know the splunks etc i think those folks that latter group are the folks that kept their projects going because they were already extracting value from them the first two groups we're seeing sort of saying the second half of this year is when we're going to begin really picking up on these types of initiatives again well thank you
matt by the way for hitting the escape key because i think value from data really is what this is all about and there are some real blockers there that i kind of want to talk about you mentioned hdfs i mean we were very excited of course in the early days of hadoop many of the concepts were profound but at the end of the day it was too complicated we've got these hyper-specialized roles that are you know serving the business but it still takes too long it's too hard to get value from data and one of the blockers is infrastructure the complexity of that infrastructure really needs to be abstracted taken up a level we're starting to see this in cloud where you're seeing some of those abstraction layers being built from some of the cloud vendors but more importantly a lot of the vendors like pure are saying hey we can do that heavy lifting for you and we you know we have expertise in engineering to do cloud native so i'm wondering what you guys see maybe garrett you could start us off and others chime in as some of the blockers to getting value from data and how we're going to address those in the coming decade yeah i mean i think part of it we're solving here obviously with pure bringing you know flash to a market that traditionally was utilizing much slower media you know the other thing that i see that's very nice with flashblade for example is the ability to kind of do things you know once you get it set up a blade at a time i mean a lot of the things that we see from just kind of more of a you know simplistic approach to this like a lot of these teams don't have big budgets and being able to kind of break them down into almost a blade type chunk i think has really kind of allowed folks to get more projects and things off the ground because they don't have to buy a full expensive system to run these projects so that's helped a lot i think the wider use cases have helped a lot so matt mentioned
ransomware you know using safe mode as a place to help with ransomware has been a really big growth spot for us we've got a lot of customers very interested and excited about that and the other thing that i would say is bringing devops into data is another thing that we're seeing so kind of that push towards data ops and really kind of using automation and infrastructure as code as a way to now kind of drive things through the system the way that we've seen with automation through devops is really an area we're seeing a ton of growth with from a services perspective guys any other thoughts on that i mean i'll tee it up there we are seeing some bleeding edge which is somewhat counterintuitive especially from a cost standpoint organizational changes at some companies think of some of the internet companies that do music for instance and adding podcasts etc and those are different data products we're seeing them actually reorganize their data architectures to make them more distributed and actually put the domain heads the business heads in charge of the data and the data pipeline and that is maybe less efficient but it's again some of these bleeding edge what else are you guys seeing out there that might be some harbingers of the next decade i'll go first you know i think specific to the construct that you threw out dave one of the things that we're seeing is you know the application owner maybe it's the devops person but it's you know maybe it's the application owner through the devops person they're becoming more technical in their understanding of how infrastructure interfaces with their application i think you know what we're seeing on the flashblade side is we're having a lot more conversations with application people than just i.t people it doesn't mean that the i.t people aren't there the i.t people are still there for sure they have to deliver the service
etc but you know the days of i.t you know building up a catalog of services and a business owner subscribing to one of those services you know picking you know whatever sort of fits their need i think that's the construct that changes going forward the application owner is becoming much more prescriptive about how they want the infrastructure to fit into their application and that's a big change and for you know certainly folks like garrett and cdw you know they do a good job with this being able to sort of get to the application owner and bring those two sides together there's a tremendous amount of value there for us it's been a little bit of a retooling we've traditionally sold to the i.t side of the house and you know we've had to teach ourselves how to go talk the language of applications so you know i think you pointed out a good construct there and you know that application owner playing a much bigger role in what they're expecting from the performance of i.t infrastructure i think is a key change interesting i mean that definitely is a trend that's put you guys closer to the business where the infrastructure team is serving the business as opposed to sometimes i talk to data experts and they're frustrated especially data owners or data product builders who are frustrated that they feel like they have to beg the data pipeline team to get you know new data sources or get data out how about the edge you know maybe scott you can kick us off i mean we're seeing you know the emergence of edge use cases ai inferencing at the edge a lot of data at the edge what are you seeing there and how does this unified object i'll bring us back to that and file fit wow dave how much time do we have two minutes first of all scott why don't you just tell everybody what the edge is yeah
you got it figured out, all right. How much time do you have, Matt? At the end of the day, and that's a great question, if you take a step back, and I think it comes back, Dave, to something you mentioned, it's about extracting value from data. And what that means is, when you extract value from data, as Matt pointed out, the influencers or the users of data, the application owners, have more power because they're driving revenue now. So from an IT standpoint, it's not just, "Hey, here are the services you get, use them or lose them, don't throw a fit." It is, "No, I have to adapt. I have to follow what my application owners need." Now, when you bring that back to the edge, what it means is that data is not localized to the data center. I mean, we just went through a nearly 12-month period where the entire workforce for most of the companies in this country went distributed, and business continued. So if business is distributed, data is distributed. That means in the data center, that means at the edge, that means in the cloud, that means in tons of other places. And what it also means is you have to be able to extract and utilize data anywhere it may be. I think that's something we're going to continue to see. And if you think about key characteristics, we've talked about things like performance and scale for years, but we need to start rethinking them, because on one hand we need to get performance everywhere, but also in terms of scale, and this ties back to some of the other initiatives in getting value from data, there's something I call the massive success problem. One of the things we see, especially with workloads like machine learning, is businesses find success with them, and as soon as they do, they say, "Well, I need about 20 of these projects now." All of a sudden, that overburdens IT organizations, especially across
core and edge and cloud environments. And so, when you look at environments, the ability to meet performance and scale demands wherever it needs to be is something that's really important. >> So Dave, I'd like to tie together two things that I heard from Scott and Garrett that I think are important, and it's around this concept of scale. Some of us are old enough to remember the day when a 10-terabyte blast radius was too big of a blast radius for people to take on, or a terabyte of storage was considered to be an exemplary budget environment, right? Now we think of terabytes kind of like we used to think of gigabytes in some ways. Petabyte, you don't have to explain to anybody what a petabyte is anymore. And what's on the horizon, and it's not far, are exabyte-type data set workloads. And you start to think about what could be in that exabyte of data. We've talked about how you extract that value, we've talked about how you start, but if the scale is big, not everybody's going to start at a petabyte or an exabyte. To Garrett's point, the ability to start small and grow into these products, or excuse me, these projects, I think is a really fundamental concept here, because you're not going to just go, "I'm going to kick off a five-petabyte project." Whether you do that on disk or flash, it's going to be expensive, right? But if you could start at a couple hundred terabytes, not just as a proof of concept but as something that you know you could get predictable value out of, then you could say, "Hey, this either scales linearly or non-linearly in a way that I can then go map my investments to how I can go dig deeper into this." That's how these successful projects are going to start, because for the people that are starting with these very large, expansive greenfield projects at multi-petabyte scale,
it's gonna be hard to realize near-term value. >> Excellent. We gotta wrap, but Garrett, I wonder if you could close. When you look forward, you talk to customers, do you see this unification of file and object, is this an evolutionary trend? Is it something that is going to be a lever that customers use? How do you see it evolving over the next two, three years and beyond? >> Yeah, I mean, I think from our perspective, just from what we're seeing from the numbers within the market, the amount of growth that's happening with unstructured data is really just starting to finally hit this data deluge, or whatever you want to call it, that we've been talking about for so many years. It really does seem to now be becoming true as we start to see things scale out and folks settle into, "Okay, I'm going to use the cloud to start and maybe train my models, but now I'm going to get it back on-prem because of latency or security," or whatever the decision points are there. This is something that is not going to slow down. And I think folks like Pure, having the tools that they give us to use and bring to market with our customers, are really key and critical for us. So I see it as a huge growth area and a big focus for us moving forward. >> Guys, great job unpacking a topic that's been covered a little bit, but I think we covered some ground that is new. So thank you so much for those insights and that data. Really appreciate your time. >> Thanks, Dave. >> Thanks. >> Yeah, thanks, Dave. >> Okay, and thank you for watching the convergence of file and object. Keep it right there, right back after this short break. >> Innovation, impact, influence. Welcome to theCUBE. Disruptors, developers, and practitioners, learn from the voices of leaders who share their personal insights from the hottest digital events around the globe. Enjoy the best this community has to offer on theCUBE, your global leader in high-tech
digital coverage. [Music] >> Okay, now we're going to get the customer perspective on object, and we'll talk about the convergence of file and object, but really focusing on the object piece. This is a content program that's being made possible by Pure Storage, and it's co-created with theCUBE. Christopher "CB" Bohn is here. He's a lead architect for the Micro Focus enterprise data warehouse and principal data engineer at Micro Focus. CB, welcome. Good to see you. >> Thanks, Dave. Good to be here. >> So tell us more about your role at Micro Focus. It's a pan-Micro Focus role. Of course, we know the company is a multinational software firm and acquired the software assets of HP, of course including Vertica. Tell us where you fit. >> Yeah, so Micro Focus is, like I said, a worldwide company that sells a lot of software products all over the place, to governments and so forth. And it also grows, often by acquiring other companies, so there is the problem of integrating new companies and their data. What's happened over the years is that they've had a number of different discrete data systems, so you've got this data spread all over the place, and they've never been able to get a full, complete introspection on the entire business because of that. So my role was to come in and design a central data repository, an enterprise data warehouse that all reporting could be generated against. And so that's what we're doing, and we selected Vertica as the EDW system and Pure Storage FlashBlade as the communal repository. >> Okay, so you obviously had experience with Vertica in your previous role, so it's not like you were starting from scratch. But paint a picture of what life was like before you embarked on this sort of consolidated approach to your data warehouse. Was it just disparate data all over the place, a lot of M&A going on? Where did the data live? >> Right. So again, the data was all over the place, including under people's desks, in just dedicated, you know, their own
private SQL Servers. A lot of data in Micro Focus is run on SQL Server, which has pros and cons, because that's a great transactional database, but it's not really good for analytics, in my opinion. But a lot of stuff was running on that. They had one Vertica instance that was doing some select reporting. It wasn't a very powerful system, and it was what they call Vertica Enterprise Mode, where it had dedicated nodes which had the compute and storage in the same locus on each server, okay? So Vertica Eon Mode is a whole new world because it separates compute from storage. >> You mentioned Eon Mode and the ability to scale storage and compute independently. >> We wanted to have the analytics, OLAP stuff close to the OLTP stuff, right? So that's why they're co-located very close to each other. And what's nice about this situation is that these S3 objects, it's an S3 object store on the Pure FlashBlade, we could copy those over, if we needed to, to AWS, and we could spin up a version of Vertica there and keep going. It's like a tertiary DR strategy, because we're actually setting up a second FlashBlade Vertica system geo-located elsewhere for backup. And we can get into it, if you want to talk about how the latest version of the Pure software for the FlashBlade allows synchronization across network boundaries of those FlashBlades, which is really nice, because if a giant sinkhole opens up under our colo facility and we lose that thing, then we just have to switch the DNS and we're back in business off the DR. And then if that one was to go, we could copy those objects over to AWS and be up and running there. So we're feeling pretty confident about being able to weather whatever comes along. >> So you're using the Pure FlashBlade as an object store. Most people think, "Oh, object: simple, but slow." Not the case for you, is that right? >> Not the case at all. It's ripping. Well, you have to understand about Vertica
and the way it stores data. It stores data in what they call storage containers, and those are immutable, okay, on disk, whether it's on AWS or if you had an Enterprise Mode Vertica. If you do an update or delete, it actually has to go and retrieve that object container from disk, and it destroys it and rebuilds it, okay? Which is why you want to avoid updates and deletes with Vertica, because the way it gets its speed is by sorting and ordering and encoding the data on disk so it can read it really fast. But if you do an operation where you're deleting or updating a record in the middle of that, then you've got to rebuild that entire thing. So that actually matches up really well with S3 object storage, because it's kind of the same way: it gets destroyed and rebuilt too, okay? So that matches up very well with Vertica, and we were able to design this system so that it's append-only. Now, we had some reports that were running in SQL Server, okay, which were taking seven days. So we moved that to Vertica from SQL Server, and we rewrote the queries, which had been written in T-SQL with a bunch of loops and so forth, and, get this, this is amazing, it went from seven days to two seconds to generate this report, which has tremendous value to the company, because it would have to have this long cycle of seven days to get a new introspection in what they call their knowledge base, and now all of a sudden it's almost on demand, two seconds to generate it. That's great, and that's because of the way the data is stored. And the S3, you asked about, "Oh, is it slow?" Well, not in that context, because what happens with Vertica Eon Mode is that when you set up your compute nodes, they have local storage also, which is called the depot. It's kind of a cache, okay? So the data will be drawn from the flash and cached locally. And it was thought, when they designed that, "Oh, that'll cut down on the latency," okay? But it turns
out that if you have your compute nodes close, meaning minimal hops to the FlashBlade, you can actually tell Vertica, "Don't even bother caching that stuff, just read it directly on the fly from the FlashBlade," and the performance is still really good. It depends on your situation, but I know, for example, a major telecom company that uses the same topology as we're talking about here, and they did the same thing. They just dropped the cache, because the FlashBlade was able to deliver the data fast enough. >> So you're talking about speed-of-light issues and just the overhead of switching infrastructure, that gets eliminated, and so as a result you can go directly to the storage array? >> That's correct, yeah. It's fast enough that it's almost as if it's local to the compute node. But every situation is different, depending on your needs. If you've got a few tables that are heavily used, then yeah, put them in the cache, because that'll probably be a little bit faster. But if you have a lot of ad hoc queries going on, you may exceed the storage of the local cache, and then you're better off having it just read directly from the FlashBlade. >> Got it. Look, Pure's a fit. I mean, I sound like a fanboy, but Pure is all about simplicity, and so is object. So that means you don't have to worry about wrangling storage and worrying about LUNs and all that other nonsense, and file. >> I've been burned by hardware in the past, you know, where, "Oh, okay, they're building to a price," and so they cheap out on stuff like fans or other things, and these components fail and the whole thing goes down. But this hardware is super, super good quality, and so I'm happy with the quality that we're getting. >> So CB, last question. What's next for you? Where do you want to take this initiative? >> Well, we are in the process now of, so I
designed this system to combine the best of the Kimball approach to data warehousing and the Inmon approach, okay? And what we do is we bring over all the data we've got and we put it into a pristine staging layer, okay? Like I said, because it's append-only, it's essentially a log of all the transactions that are happening in this company, just as they appear, okay? And then from the Kimball side of things, we're designing the data marts now, so that's what the end users actually interact with. And so we're examining the transactional systems to say, "How are these business objects created? What's the logic there?" And we're recreating those logical models in Vertica. We've done a handful of them so far, and it's working out really well. So going forward, we've got a lot of work to do to create just about every object that the company needs. >> CB, you're an awesome guest. Really, always a pleasure talking to you. Thank you, congratulations, and good luck going forward. Stay safe. >> Thank you. [Music] >> Okay, let's summarize the convergence of file and object. First, I want to thank our guests, Matt Burr, Scott Sinclair, Garrett Belschner, and CB Bohn. I'm your host, Dave Vellante, and please allow me to briefly share some of the key takeaways from today's program. So first, as Scott Sinclair of ESG stated, surprise, surprise, data's growing. And Matt Burr, he helped us understand the growth of unstructured data. I mean, estimates indicate that the vast majority of data will be considered unstructured by mid-decade, 80% or so. And obviously, unstructured data is growing very, very rapidly. Now, of course, your definition of unstructured data, that may vary across a wide spectrum. I mean, there's video, there's audio, there's documents, there's spreadsheets, there's chat. I mean, these are generally considered unstructured data, but of course they all have some type of structure to them. You know, perhaps it's not as strict as a relational database, but
there's certainly metadata and certain structure to these types of use cases that I just mentioned. Now, the key to what Pure is promoting is this idea of unified fast file and object, UFFO. Look, object is great. It's inexpensive, it's simple, but historically it's been less performant, so good for archiving or cheap-and-deep types of examples. Organizations often use file for higher-performance workloads, and let's face it, most of the world's data lives in file formats. What Pure is doing is bringing together file and object by, for example, supporting multiple protocols, i.e., NFS, SMB, and S3. S3, of course, has really given new life to object over the past decade. Now, the key here is to essentially enable customers to have the best of both worlds, not having to trade off performance for object simplicity. And a key discussion point that we've had on the program has been the impact of flash on the long, slow death of spinning disk. Look, hard disk drives, they had a great run, but HDD volumes, they peaked in 2010, and flash, as you well know, has seen tremendous volume growth thanks to the consumption of flash in mobile devices and then, of course, its application into the enterprise. And that volume is just going to keep growing and growing and growing. The price of flash is coming down faster than that of HDD, so the writing's on the wall. It's just a matter of time. So flash is riding down that cost curve very, very aggressively, and HDD has essentially become a managed-decline business. Now, by bringing flash to object as part of the FlashBlade portfolio and allowing for multiple protocols, Pure hopes to eliminate the dissonance between file and object and simplify the choice. In other words, let the workload decide. If you have data in a file format, no problem. Pure can still bring the benefits of simplicity of object at scale to the table. So again, let the workload inform what the right strategy is, not the technical infrastructure. Now, Pure, of course, is not alone. There are others
supporting this multi-protocol strategy. And so we asked Matt Burr, why Pure? What's so special about you? And not surprisingly, in addition to the product innovation, he went right to Pure's business model advantages. I mean, for example, with its Evergreen support model, which was very disruptive in the marketplace. You know, frankly, Pure's entire business disrupted the traditional disk array model, which was fundamentally flawed. Pure forced the industry to respond, and when it achieved escape velocity and Pure went public, the entire industry had to react. And a big part of the Pure value prop, in addition to this business model innovation that we just discussed, is simplicity. Pure's keep-it-simple approach coincided perfectly with the ascendancy of cloud, where technology organizations needed cloud-like simplicity for certain workloads that were never going to move into the cloud. They're going to stay on-prem. Now, I'm going to come back to this, but allow me to bring in another concept that Garrett and CB really highlighted, and that is the complexity of the data pipeline. And what do I mean by that, and why is this important? So Scott Sinclair articulated, he implied, that the big challenge is organizations are data-full, but insights are scarce. A lot of data, not as many insights. It takes time, too much time, to get to those insights. So we heard from our guests that the complexity of the data pipeline was a barrier to getting to faster insights. Now, CB Bohn shared how he streamlined his data architecture using Vertica's Eon Mode, which allowed him to scale compute independently of storage, so that brought critical flexibility and improved economics at scale. And FlashBlade, of course, was the back-end storage for his data warehouse efforts. Now, the reason I think this is so important is that organizations are struggling to get insights from data, and the complexity associated with the data pipeline and data life cycles, let's face it, it's overwhelming
organizations. And the answer to this problem is a much longer and different discussion than unifying object and file. I could spend all day talking about that, but let's focus narrowly on the part of the issue that is related to file and object. So the situation here is that technology has not been serving the business the way it should. Rather, the formula is twisted in the world of data and big data and data architectures. The data team is mired in complex technical issues that impact the time to insights. Now, part of the answer is to abstract the underlying infrastructure complexity and create a layer with which the business can interact that accelerates instead of impedes innovation. And unifying file and object is a simple example of this, where the business team is not blocked by infrastructure nuance, like, does this data reside in a file or object format? Can I get to it quickly and inexpensively in a logical way, or is the infrastructure in a stovepipe and blocking me? So if you think about the prevailing sentiment of how the cloud is evolving to incorporate on-premises workloads that are hybrid, and configurations that are working across clouds, and now out to the edge, this idea of an abstraction layer that essentially hides the underlying infrastructure is a trend we're going to see evolve this decade. Now, is UFFO the be-all, end-all answer to solving all of our data pipeline challenges? No, no, of course not. But by bringing the simplicity and economics of object together with the ubiquity and performance of file, UFFO makes it a lot easier. It simplifies life for organizations that are evolving into digital businesses, which, by the way, is every business. So we see this as an evolutionary trend that further simplifies the underlying technology infrastructure and does a better job supporting the data flows for organizations, so they don't have to spend so much time worrying about the technology details that add little value to the business. Okay, so thanks for
watching the convergence of file and object, and thanks to Pure Storage for making this program possible. This is Dave Vellante for theCUBE. We'll see you next time. [Music]
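
CB's point earlier in the program, that storage containers are immutable and so an update or delete forces a full rebuild while an append does not, can be pictured with a small toy model. This is only a sketch under stated assumptions: the `Container` and `Table` classes are hypothetical illustrations written for this transcript and are not Vertica's or S3's actual API or on-disk format.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[int, str]


@dataclass(frozen=True)
class Container:
    """Hypothetical stand-in for an immutable, sorted storage container."""
    rows: Tuple[Row, ...]  # (key, value) pairs kept sorted by key


class Table:
    def __init__(self) -> None:
        self.containers: List[Container] = []
        self.rows_rewritten = 0  # rows written only to rebuild existing containers

    def append(self, batch: List[Row]) -> None:
        # Append-only load: write one new container, touch nothing else.
        self.containers.append(Container(tuple(sorted(batch))))

    def update(self, key: int, value: str) -> None:
        # A container cannot be changed in place, so any container holding
        # the key is discarded and rebuilt in full.
        for i, c in enumerate(self.containers):
            if any(k == key for k, _ in c.rows):
                rebuilt = tuple((k, value if k == key else v) for k, v in c.rows)
                self.containers[i] = Container(rebuilt)
                self.rows_rewritten += len(rebuilt)

    def scan(self) -> List[Row]:
        return [row for c in self.containers for row in c.rows]


table = Table()
table.append([(2, "b"), (1, "a"), (3, "c")])
table.append([(4, "d")])
print(table.rows_rewritten)  # prints 0: appends rebuilt nothing
table.update(2, "B")
print(table.rows_rewritten)  # prints 3: a one-row change rewrote the whole container
```

In this toy, the cost of an update grows with the size of the container it touches, which is one way to read why CB designed the staging layer to be append-only.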

Published Date : Feb 24 2021

SUMMARY :

on the nfs side um but you know we

ENTITIES

Entity	Category	Confidence
garrett belsner	PERSON	0.99+
matt burr	PERSON	0.99+
2010	DATE	0.99+
2050	DATE	0.99+
270 terabytes	QUANTITY	0.99+
seven days	QUANTITY	0.99+
2021	DATE	0.99+
scott sinclair	PERSON	0.99+
2035	DATE	0.99+
2019	DATE	0.99+
four	QUANTITY	0.99+
three	QUANTITY	0.99+
two seconds	QUANTITY	0.99+
2025	DATE	0.99+
matt burr	PERSON	0.99+
first phase	QUANTITY	0.99+
dave	PERSON	0.99+
dave vellante	PERSON	0.99+
scott sinclair	PERSON	0.99+
five	QUANTITY	0.99+
250 terabytes	QUANTITY	0.99+
10 terabyte	QUANTITY	0.99+
zero percent	QUANTITY	0.99+
100	QUANTITY	0.99+
steve	PERSON	0.99+
gary	PERSON	0.99+
two billion dollar	QUANTITY	0.99+
garrett	PERSON	0.99+
two minutes	QUANTITY	0.99+
two weeks later	DATE	0.99+
three topics	QUANTITY	0.99+
two sides	QUANTITY	0.99+
two weeks ago	DATE	0.99+
billion dollars	QUANTITY	0.99+
mid-decade 80	DATE	0.99+
today	DATE	0.99+
cdw	PERSON	0.98+
three phases	QUANTITY	0.98+
80	QUANTITY	0.98+
billions of objects	QUANTITY	0.98+
10 month	QUANTITY	0.98+
one device	QUANTITY	0.98+
an hour	QUANTITY	0.98+
one platform	QUANTITY	0.98+
scott	ORGANIZATION	0.97+
last year	DATE	0.97+
five petabyte	QUANTITY	0.97+
scott	PERSON	0.97+
cassandra	PERSON	0.97+
one	QUANTITY	0.97+
single block	QUANTITY	0.97+
one system	QUANTITY	0.97+
next decade	DATE	0.96+
tons of places	QUANTITY	0.96+
both worlds	QUANTITY	0.96+
vertica	TITLE	0.96+
matt	PERSON	0.96+
both	QUANTITY	0.96+
69 of organizations	QUANTITY	0.96+
billion dollars	QUANTITY	0.95+
pandemic	EVENT	0.95+
first	QUANTITY	0.95+
three great guests	QUANTITY	0.95+
next year	DATE	0.95+

DV Pure Storage 208

>> Thank you, sir. All right, you ready to roll? >> Ready. >> All right, we'll go ahead and go in five, four, three, two. >> Okay, let's summarize the convergence of file and object. First, I want to thank our guests, Matt Burr, Scott Sinclair, Garrett Belschner, and CB Bohn. I'm your host, Dave Vellante, and please allow me to briefly share some of the key takeaways from today's program. So first, as Scott Sinclair of ESG stated, surprise, surprise, data's growing. And Matt Burr, he helped us understand the growth of unstructured data. I mean, estimates indicate that the vast majority of data will be considered unstructured by mid-decade, 80% or so. And obviously, unstructured data is growing very, very rapidly. Now, of course, your definition of unstructured data, now that may vary across a wide spectrum. I mean, there's video, there's audio, there's documents, there's spreadsheets, there's chat. I mean, these are generally considered unstructured data, but of course they all have some type of structure to them. You know, perhaps it's not as strict as a relational database, but there's certainly metadata and certain structure to these types of use cases that I just mentioned. Now, the key to what Pure is promoting is this idea of unified fast file and object, U-F-F-O. Look, object is great, it's inexpensive, it's simple, but historically, it's been less performant, so good for archiving, or cheap and deep types of examples. Organizations often use file for higher performance workloads and let's face it, most of the world's data lives in file formats. What Pure is doing is bringing together file and object by, for example, supporting multiple protocols, i.e., NFS, SMB, and S3. S3, of course, has really given a new life to object over the past decade. Now, the key here is to essentially enable customers to have the best of both worlds, not having to trade off performance for object simplicity. 
And a key discussion point that we've had in the program has been the impact of Flash on the long, slow death of spinning disk. Look, hard disk drives, they had a great run, but HDD volumes, they peaked in 2010, and Flash, as you well know, has seen tremendous volume growth thanks to the consumption of Flash in mobile devices and then, of course, its application into the enterprise. And that volume is just going to keep growing and growing and growing. The price of Flash is coming down faster than that of HDD. So the writing's on the wall. It's just a matter of time. So Flash is riding down that cost curve very, very aggressively and HDD has essentially become a managed decline business. Now, by bringing Flash to object as part of the FlashBlade portfolio and allowing for multiple protocols, Pure hopes to eliminate the dissonance between file and object and simplify the choice. In other words, let the workload decide. If you have data in a file format, no problem. Pure can still bring the benefits of simplicity of object at scale to the table. So again, let the workload inform what the right strategy is, not the technical infrastructure. Now Pure, of course, is not alone. There are others supporting this multi-protocol strategy. And so we asked Matt Burr why Pure, what's so special about you? And not surprisingly, in addition to the product innovation, he went right to Pure's business model advantages. I mean, for example, with its Evergreen support model, which was very disruptive in the marketplace. You know, frankly, Pure's entire business disrupted the traditional disk array model, which was fundamentally flawed. Pure forced the industry to respond. And when it achieved escape velocity and Pure went public, the entire industry had to react. And a big part of the Pure value prop, in addition to this business model innovation that we just discussed, is simplicity. 
Pure's keep-it-simple approach coincided perfectly with the ascendancy of cloud, where technology organizations needed cloud-like simplicity for certain workloads that were never going to move into the cloud. They were going to stay on-prem. Now I'm going to come back to this, but allow me to bring in another concept that Garrett and CB really highlighted, and that is the complexity of the data pipeline. And what do I mean by that, and why is this important? So Scott Sinclair articulated, or he implied, that the big challenge is organizations, they're data full, but insights are scarce; a lot of data, not as many insights, and it takes time, too much time, to get to those insights. So we heard from our guests that the complexity of the data pipeline was a barrier to getting to faster insights. Now, CB Bohn shared how he streamlined his data architecture using Vertica's Eon Mode, which allowed him to scale compute independently of storage, so that brought critical flexibility and improved economics at scale. And FlashBlade, of course, was the backend storage for his data warehouse efforts. Now, the reason I think this is so important is that organizations are struggling to get insights from data and the complexity associated with the data pipeline and data lifecycles, let's face it, it's overwhelming organizations. And the answer to this problem is a much longer and different discussion than unifying object and file. That's, you know, I could spend all day talking about that, but let's focus narrowly on the part of the issue that is related to file and object. So the situation here is the technology has not been serving the business the way it should. Rather, the formula is twisted in the world of data and big data, and data architectures. The data team is mired in complex technical issues that impact the time to insights. 
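
The arrangement described above, compute scaling independently of a shared store, with an optional local cache on each compute node (the "depot" in CB's interview), might be sketched roughly as follows. Everything here is an illustrative assumption: `SharedStore`, `ComputeNode`, and `depot_size` are hypothetical names for a toy model and do not reflect Vertica's real configuration or interfaces.

```python
from collections import OrderedDict
from typing import Dict


class SharedStore:
    """Toy stand-in for the communal object store shared by all compute nodes."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self.reads = 0  # how many reads actually reached the store

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def get(self, key: str) -> bytes:
        self.reads += 1
        return self.objects[key]


class ComputeNode:
    """Compute with an optional local cache; depot_size=0 reads straight from the store."""

    def __init__(self, store: SharedStore, depot_size: int = 0) -> None:
        self.store = store
        self.depot_size = depot_size
        self.depot: "OrderedDict[str, bytes]" = OrderedDict()

    def read(self, key: str) -> bytes:
        if key in self.depot:
            self.depot.move_to_end(key)  # mark as most recently used
            return self.depot[key]
        data = self.store.get(key)
        if self.depot_size > 0:
            self.depot[key] = data
            if len(self.depot) > self.depot_size:
                self.depot.popitem(last=False)  # evict the least recently used entry
        return data


store = SharedStore()
store.put("sales/2021", b"rows...")

cached = ComputeNode(store, depot_size=8)
direct = ComputeNode(store)  # depot bypassed
extra = ComputeNode(store)   # adding compute adds no storage

for _ in range(3):
    cached.read("sales/2021")
reads_after_cached = store.reads  # only the first read reached the store

for _ in range(3):
    direct.read("sales/2021")
reads_after_direct = store.reads  # every direct read reached the store
```

Whether bypassing the cache is acceptable depends on how fast the shared store is relative to local disk, which is the trade-off CB describes for heavily used tables versus ad hoc queries.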
Now, part of the answer is to abstract the underlying infrastructure complexity and create a layer with which the business can interact that accelerates instead of impedes innovation. And unifying file and object is a simple example of this, where the business team is not blocked by infrastructure nuance, like does this data reside in a file or object format? Can I get to it quickly and inexpensively in a logical way, or is the infrastructure in a stovepipe and blocking me? So if you think about the prevailing sentiment of how the cloud is evolving to incorporate on-premises workloads that are hybrid, and configurations that are working across clouds, and now out to the edge, this idea of an abstraction layer that essentially hides the underlying infrastructure is a trend we're going to see evolve this decade. Now, is UFFO the be-all end-all answer to solving all of our data pipeline challenges? No, no, of course not. But by bringing the simplicity and economics of object together with the ubiquity and performance of file, UFFO makes it a lot easier. It simplifies life for organizations that are evolving into digital businesses, which, by the way, is every business. So, we see this as an evolutionary trend that further simplifies the underlying technology infrastructure and does a better job supporting the data flows for organizations so they don't have to spend so much time worrying about the technology details that add little value to the business. Okay, so thanks for watching the convergence of file and object and thanks to Pure Storage for making this program possible. This is Dave Vellante for theCUBE. We'll see you next time.
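
The abstraction-layer idea in the summary above, where the business asks for data by name and never has to know whether it sits in a file or an object format, could look something like this in miniature. It is a hedged sketch only: `FileBackend`, `ObjectBackend`, and `DataLayer` are hypothetical names, a temporary directory stands in for an NFS or SMB mount, and a dictionary stands in for an S3-style bucket.

```python
import tempfile
from pathlib import Path
from typing import Dict, Protocol


class Backend(Protocol):
    def read(self, name: str) -> bytes: ...


class FileBackend:
    """Data that lives as files under a directory (a file share in practice)."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def read(self, name: str) -> bytes:
        return (self.root / name).read_bytes()


class ObjectBackend:
    """In-memory dictionary standing in for an object bucket."""

    def __init__(self) -> None:
        self.bucket: Dict[str, bytes] = {}

    def read(self, name: str) -> bytes:
        return self.bucket[name]


class DataLayer:
    """What business-facing code talks to; callers never name a protocol."""

    def __init__(self) -> None:
        self.catalog: Dict[str, Backend] = {}

    def register(self, name: str, backend: Backend) -> None:
        self.catalog[name] = backend

    def read(self, name: str) -> bytes:
        return self.catalog[name].read(name)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "report.csv").write_bytes(b"region,revenue\n")
    files = FileBackend(root)

    objects = ObjectBackend()
    objects.bucket["clicks.json"] = b"{}"

    layer = DataLayer()
    layer.register("report.csv", files)
    layer.register("clicks.json", objects)

    # The caller uses the same call for both; the layer picks the backend.
    results = [layer.read(n) for n in ("report.csv", "clicks.json")]
```

The design choice is simply that the lookup from name to backend lives in one place, so moving a dataset between file and object storage would change the catalog entry rather than the calling code.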

Published Date : Feb 8 2021

SUMMARY :

All right, you ready to roll? in five, four, three, two. that impact the time to insights.

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Matt Burr	PERSON	0.99+
Scott Sinclair	PERSON	0.99+
Garrett Belsner	PERSON	0.99+
ESG	ORGANIZATION	0.99+
80%	QUANTITY	0.99+
five	QUANTITY	0.99+
CB Bonne	PERSON	0.99+
two	QUANTITY	0.99+
2010	DATE	0.99+
First	QUANTITY	0.99+
today	DATE	0.99+
first	QUANTITY	0.98+
four	QUANTITY	0.98+
three	QUANTITY	0.98+
both worlds	QUANTITY	0.98+
Flash	TITLE	0.97+
CB	PERSON	0.97+
Vertica	ORGANIZATION	0.97+
Pure Storage	ORGANIZATION	0.96+
Pure	ORGANIZATION	0.96+
Garrett	PERSON	0.96+
Evergreen	ORGANIZATION	0.86+
past decade	DATE	0.59+
UFFO	ORGANIZATION	0.59+
Pure Storage 208	COMMERCIAL_ITEM	0.59+
Pure	PERSON	0.58+
this decade	DATE	0.5+
FlashBlade	ORGANIZATION	0.43+
FlashBlade	TITLE	0.37+

Matt Burr, Scott Sinclair, Garrett Belschner | The Convergence of File and Object

>>From around the globe, presenting The Convergence of File and Object, brought to you by Pure Storage. Okay. >>We're back with the convergence of file and object and a power panel. This is a special content program made possible by Pure Storage and co-created with theCUBE. Now in this series, what we're doing is we're exploring the coming together of file and object storage, trying to understand the trends that are driving this convergence, the architectural considerations that users should be aware of, and which use cases make the most sense for so-called unified fast file and object storage. And with me are three great guests to unpack these issues. Garrett Belschner is a data center solutions architect with CDW. Scott Sinclair is a senior analyst at Enterprise Strategy Group. He's got deep experience in enterprise storage and brings that independent analyst perspective. And Matt Burr is back with us. Gentlemen, welcome to the program. >>Thank you. >>Hey Scott, let me start with you, uh, and get your perspective on what's going on in the market with object, the cloud, huge amounts of unstructured data out there. It lives in files. Give us your independent view of the trends that you're seeing out there. >>Well, Dave, you know, where to start? I mean, surprise, surprise, data's growing. Um, but one of the big things that we've seen is that we've been talking about data growth for, what, decades now, but what's really fascinating, what's changed, is because of the digital economy, digital business, digital transformation, whatever you call it, now people are not just storing data. They actually have to use it. And so we see this in trends like analytics and artificial intelligence. And what that does is it's just increasing the demand for not only consolidation of massive amounts of storage that we've seen for a while, but also the demand for incredibly low latency access to that storage. 
And I think that's one of the things that we're seeing that's driving this need for convergence, as you put it, of having multiple protocols consolidated onto one platform, but also the need for high performance access to that data. >>Thank you for that. A great setup. I wrote down three topics that we're going to unpack as a result of that. So Garrett, let me go to you. Maybe you can give us the perspective of what you see with customers. Is this like a push where customers are saying, Hey, listen, I need to converge my file and object? Or is it more a story where they're saying, Garrett, I have this problem, and then you see unified file and object as a solution? >>Yeah, I think for us, it's, you know, taking that consultative approach with our customers and really kind of hearing pain around some of the pipelines, the way that they're going to market with data today, and kind of what are the problems that they're seeing. We're also seeing a lot of the change driven by the software vendors as well. So really being able to support a disaggregated design where you're not having to upgrade and maintain everything as a single block has been a place where we've seen a lot of customers pivot to, where they have more flexibility. As they need to maintain larger volumes of data and higher performance data, having the ability to do that separate from compute and cache and some of those other layers is really critical. >>So, Matt, I wonder if you could follow up on that. So Garrett was talking about this disaggregated design, so I like it, you know, distributed cloud, et cetera, but then we're talking about bringing things together in one place, right? So square that circle. How does this fit in with this hyper-distributed cloud edge that's getting built out? >>Yeah. 
You know, I mean, I could give you the easy answer on that, but I can also pass it back to Garrett in the sense that, you know, Garrett, maybe it's important to talk about, um, Elastic and Splunk and some of the things that you're seeing in that world. I think you can give a pretty qualified answer to the question relative to what your customers are seeing. >>Oh, that'd be great, please. >>Yeah, absolutely. No problem at all. So, you know, I think with, um, Splunk kind of moving from its traditional design, classic design, whatever you want to call it, up into SmartStore, um, that was kind of one of the first that we saw make that move towards separating object out. And I think, you know, a lot of that comes from their own move to the cloud and updating their code to basically take advantage of object in the cloud. Um, but we're starting to see, you know, with Vertica Eon, for example, um, Elastic, other folks taking that same type of approach, where in the past we were building out many 2U servers. We were jamming them full of, uh, you know, SSDs and NVMe drives. Um, that was great, but it doesn't really scale. >>And it kind of gets into that same problem that we see with hyperconvergence a little bit, where, you know, you're always adding something maybe that you didn't want to add. Um, so I think, you know, again, being driven by software is really kind of where we're seeing the world open up there. Um, but that whole idea of just having that as a hub and a central place where you can then leverage that out to other applications, whether that's out to the edge for machine learning or AI applications to take advantage of it, I think that's where that convergence really comes back in. 
Um, but I think like Scott mentioned earlier, folks are now really doing things with the data, where before I think they were really storing it and trying to figure out, what are we going to actually do with it when we need to do something with it? So this is making it possible. >>Yeah. And Dave, if I could just sort of tack onto the end of Garrett's answer there, you know, in particular, Vertica with Eon Mode, the ability to leverage sharded subclusters gives you, um, you know, sort of an advantage in terms of being able to isolate performance hotspots, and there's an advantage to being able to do that on a FlashBlade, for example. So, um, sharded subclusters allow you to sort of say, I am going to give prioritization to, you know, this particular element of my application and my dataset, but I can still share that data across those subclusters. So, um, you know, as you see Vertica advance with Eon Mode, you see Splunk advance with SmartStore, um, you know, these are all sort of advancements that are, you know, it's a chicken and the egg thing. Um, they need faster storage, they need, you know, sort of a consolidated data set. Um, and that's what sort of allows these things to drive forward. >>Yes, with Vertica Eon Mode, it's the ability to separate compute and storage and scale independently. I think Vertica, if they're not the only one, they're one of the only ones, I think they might even be the only one that does that in the cloud and on prem, and that sort of plays into this distributed nature of this hyper-distributed cloud, as I sometimes call it. And I'm interested in the data pipeline. And I wonder, Scott, if we can talk a little bit about that, maybe where unified object and file fit. I mean, I'm envisioning this distributed mesh and then, you know, UFFO is sort of a node on that, that I can tap when I need it. 
But Scott, what are you seeing as the state of infrastructure as it relates to the data pipeline and the trends there? >>Yeah, absolutely, Dave. So when I think data pipeline, I immediately gravitate to analytics or machine learning initiatives, right? And so one of the big things we see, and this is an interesting trend, you know, we continue to see increased investment in AI, increased interest, and as companies get started, they think, okay, well, what does that mean? Well, I gotta go hire a data scientist. Okay. Well, that data scientist probably needs some infrastructure. And what often happens in these environments is it ends up being a bespoke environment or a one-off environment. And then over time organizations run into challenges. And one of the big challenges is the data science team, or people whose jobs are outside of IT, spend way too much time trying to get the infrastructure, um, to keep up with their demands, predominantly around data performance. So one of the ways organizations, especially those that have artificial intelligence workloads in production, and we found this in our research, have started mitigating that is by deploying flash all across the data pipeline. >>Yeah, we have data on this. Sorry to interrupt, but Pat, if you could bring up that chart, that would be great. Um, so take us through this, uh, Scott, and share with us what we're looking at here. >>Yeah, absolutely. So Dave, I'm glad you brought this up. So we did this study, um, I want to say late last year. Uh, one of the things we looked at was across artificial intelligence environments. Now, one thing that you're not seeing on this slide is we went through and we asked all around the data pipeline and we saw flash everywhere. But I thought this was really telling because this is around data lakes. 
And when many people think about the idea of a data lake, they think about it as a repository. It's a place where you keep maybe cold data. And what we see here is, especially within production environments, a pervasive use of flash storage. So I think that 69% of organizations are saying their data lake is mostly flash or all flash. And I think we had 0% that don't have any flash in that environment. So organizations are finding that flash is an essential technology to allow them to harness the value of their data. >>So Garrett, and then Matt, I wonder if you could chime in as well. We talk about digital transformation and I sometimes call it, you know, the COVID forced march to digital transformation. And I'm curious as to your perspective on things like machine learning and the adoption, um, and Scott, you may have a perspective on this as well. You know, we had to pivot, we had to get laptops, we had to secure the end points, you know, VDI, those became super high priorities. What happened to, you know, injecting AI into my applications, and machine learning? Did that go on the back burner? Was that accelerated along with the need to digitally transform? Uh, Garrett, I wonder if you could share with us what you saw with customers last year? >>Yeah. I mean, I think we definitely saw an acceleration. Um, I think folks in my market are still kind of figuring out how they inject that into more of a widely distributed business use case. Um, but again, this data hub is allowing folks to now take advantage of this data that they've had in these data lakes for a long time. I agree with Scott. 
I mean, many of the data lakes that we have were somewhat flash accelerated, but they were typically really made up of large capacity, uh, slower spinning nearline drives, um, accelerated with some flash. But I'm really starting to see folks now look at some of those older Hadoop implementations and really leveraging new ways to look at how they consume data. And many of those redesign customers are coming to us wanting to look at all-flash solutions. So we're definitely seeing it. And we're seeing an acceleration towards folks trying to figure out how to actually use it in more of a business sense now, where before I feel it was a little bit more skunkworks, kind of people dealing with it, uh, you know, in a much smaller situation, maybe in the executive offices, trying to do some testing and things. >>Scott, you're nodding away. Anything you can add in here? >>Yeah. So, well, first off, it's great to get that confirmation that the stuff we're seeing in our research, Garrett's seeing, you know, out in the field and in the real world. Um, but you know, as it relates to really the past year, it's been really fascinating. So one of the things we study at ESG is IT buying intentions. What are the initiatives that companies plan to invest in? And at the beginning of 2020, we saw heavy interest in machine learning initiatives. Then you transition to the middle of 2020, in the midst of COVID. Uh, some organizations continued on that path, but a lot of them had to pivot, right? How do we get laptops to everyone? How do we continue business in this new world? Well, now as we enter into 2021, and hopefully we're coming out of this, uh, you know, pandemic era, um, we're getting into a world where organizations are pivoting back towards these strategic investments around how do I maximize the usage of data, and actually accelerating those because they've seen the importance of digital business initiatives over the past year. 
>>Yeah, Matt, I mean, when we exited 2019, we saw a narrowing of experimentation, and our premise was, you know, that organizations are going to start now operationalizing all their digital transformation experiments. And then we had a 10-month Petri dish on digital. So what are you seeing in this regard? >>A 10-month Petri dish is an interesting way to describe it. Um, you know, we saw another candidate for pivot in there around ransomware as well, right? Um, you know, security entered into the mix, uh, which took people's attention away from some of this as well. I mean, look, I'd like to bring this up just a level or two, um, because what we're actually talking about here is progress, right? And progress is an inevitability. Um, you know, whether you believe that it's by 2025 or you think it's 2035 or 2050, it doesn't matter. We're on a forced march to the eradication of disk. And that is happening, uh, you know, in many ways, um, due to some of the things that Garrett was referring to and what Scott was referring to in terms of what our customers' demands are for how they're going to actually leverage the data that they have. >>And that brings me to kind of my final point on this, which is we see customers in three phases. There's the first phase where they say, Hey, I have this large data store, and I know there's value in there. I don't know how to get to it. Or, I have this large data store and I've started a project to get value out of it, and we failed. Those could be customers that, um, you know, marched down the Hadoop path early on. And they got some value out of it. Um, but they realized that, you know, HDFS wasn't going to be a modern protocol going forward for any number of reasons. 
You know, the first being, Hey, if I have gold dot master, how do I know that my gold dot four is consistent with my gold dot master? So data consistency matters. >>And then you have the sort of third group that says, I have these large datasets. I know how to extract value from them. And I'm already on to the Verticas, the Elastics, you know, the Splunks, et cetera. Um, I think that latter group are the folks that kept their projects going, because they were already extracting value from them. The first two groups, we're seeing them sort of saying the second half of this year is when we're going to begin really picking up on these types of initiatives again. >>Well, thank you, Matt, by the way, for hitting the escape key, because I think value from data really is what this is all about. And there are some real blockers there that I kind of want to talk about. You've mentioned HDFS. I mean, we were very excited, of course, in the early days of Hadoop. Many of the concepts were profound, but at the end of the day, it was too complicated. We've got these hyper-specialized roles that are serving the business, but it still takes too long. It's too hard to get value from data. And one of the blockers is infrastructure; the complexity of that infrastructure really needs to be abstracted, taken up a level. We're starting to see this in cloud, where you're seeing some of those abstraction layers being built from some of the cloud vendors, but more importantly, a lot of the vendors like Pure are saying, Hey, we can do that heavy lifting for you. Uh, and, you know, we have expertise in engineering to do cloud native. So I'm wondering what you guys see. Maybe Garrett, you could start us off, and the others can chime in, on some of the blockers, uh, to getting value from data and how we're going to address those in the coming decade. >>Yeah. 
I mean, I think part of it we're solving here obviously with Pure bringing, uh, you know, flash to a market that traditionally was utilizing much slower media. Um, you know, the other thing that I see that's very nice with FlashBlade, for example, is the ability to kind of do things, you know, once you get it set up, a blade at a time. I mean, a lot of the things that we see are from just kind of more of a simplistic approach to this. Like, a lot of these teams don't have big budgets, and being able to kind of break them down into almost a blade-type chunk, I think, has really kind of allowed folks to get more projects and things off the ground, because they don't have to buy a full expensive system to run these projects. Um, so that's helped a lot. >>I think the wider use cases have helped a lot. So, um, Matt mentioned ransomware. Um, you know, using SafeMode as a place to help with ransomware has been a really big growth spot for us. We've got a lot of customers very interested and excited about that. Um, and the other thing that I would say is bringing DevOps into data is another thing that we're seeing. So kind of that push towards DataOps, and really kind of using automation and infrastructure as code as a way to now kind of drive things through the system, the way that we've seen with automation through DevOps, is really an area we're seeing a ton of growth with from a services perspective. >>Guys, any other thoughts on that? I mean, I'll tee it up there. We are seeing some bleeding edge, which is somewhat counterintuitive, especially from a cost standpoint, organizational changes at some companies. Uh, think of some of the internet companies that do, uh, music, for instance, and adding podcasts, et cetera. And those are different data products. 
We're seeing them actually reorganize their data architectures to make them more distributed, uh, and actually put the domain heads, the business heads, in charge of the data and the data pipeline. And that is maybe less efficient, but it's, again, some of this bleeding edge. What else are you guys seeing out there that might be some harbinger of the next decade? >>Uh, I'll go first. Um, you know, I think specific to, um, the construct that you threw out, Dave, one of the things that we're seeing is, um, you know, the application owner, maybe it's the DevOps person, but, you know, maybe it's the application owner through the DevOps person, they're becoming more technical in their understanding of how infrastructure, um, interfaces with their application. I think, um, you know, what we're seeing on the FlashBlade side is we're having a lot more conversations with application people than, um, just IT people. It doesn't mean that the IT people aren't there. The IT people are still there for sure; they have to deliver the service, et cetera. Um, but you know, the days of IT, you know, building up a catalog of services and a business owner subscribing to one of those services, you know, picking, you know, whatever sort of fits their need, um, I think that's the construct that changes going forward. The application owner is becoming much more prescriptive about how they want the infrastructure to fit into their application. Um, and that's a big change. And for, um, you know, certainly folks like Garrett and CDW, um, you know, they do a good job with this, being able to sort of get to the application owner and bring those two sides together. There's a tremendous amount of value there. Uh, for us, it's been a little bit of a retooling. We've traditionally sold to the IT side of the house. 
And, um, you know, we've had to teach ourselves how to go talk the language of applications. So, um, you know, I think you pointed out a good construct there, and you know, that application owner playing a much bigger role in what they're expecting from the performance of IT infrastructure, I think, is a key change. >>Interesting. I mean, that definitely is a trend. That puts you guys closer to the business, where the infrastructure team is serving the business, as opposed to, sometimes I talk to data experts and they're frustrated, uh, especially data owners or data product builders, who are frustrated that they feel like they have to beg the data pipeline team to get, you know, new data sources or get data out. How about the edge? Um, you know, maybe Scott, you can kick us off. I mean, we're seeing, you know, the emergence of edge use cases, AI inferencing at the edge, lots of data at the edge. What are you seeing there, and how does this unified object, I'll bring us back to that, and file fit? >>Wow. Dave, how much time do we have? >>Um, tell me, first of all, Scott, why don't you just tell everybody what the edge is? Yeah. You've got it all figured out. >>How much time do you have? At the end of the day, that's a great question, right? If you take a step back, and I think it comes back to, Dave, something you mentioned, it's about extracting value from data. And what that means is, when you extract value from data, as Matt pointed out, the influencers or the users of data, the application owners, they have more power because they're driving revenue now. And so what that means is, from an IT standpoint, it's not just, Hey, here are the services you get, use them or lose them, or, you know, don't throw a fit. It is, no, I have to adapt. I have to follow what my application owners need. 
Now, when you bring that back to the edge, what it means is that data is not localized to the data center. I mean, we just went through a nearly 12-month period where the entire workforce for most of the companies in this country went distributed, and business continued. So if business is distributed, data is distributed. And that means in the data center, that means at the edge, that means in the cloud, and that means in all other places, and tons of places. And what it also means is you have to be able to extract and utilize data anywhere it may be. And I think that's something that we're going to continue to see. And I think it comes back to, you know, if you think about key characteristics, we've talked about, um, things like performance and scale for years, but we need to start rethinking it, because on one hand, we need to get performance everywhere. But also in terms of scale, and this ties back to some of the other initiatives and getting value from data, it's something I call the massive success problem. One of the things we see, especially with workloads like machine learning, is businesses find success with them. And as soon as they do, they say, well, I need about 20 of these projects now. Well, all of a sudden that overburdens IT organizations, especially across core and edge and cloud environments. And so when you look at environments, the ability to meet performance and scale demands wherever it needs to be is something that's really important. >>You know, Dave, I'd like to, um, just sort of tie together two things that, um, I think I heard from Scott and Garrett that I think are important, and it's around this concept of scale. Um, you know, some of us are old enough to remember the day when kind of a 10 terabyte blast radius was too big of a blast radius for people to take on, or a terabyte of storage was considered to be, um, you know, uh, an exemplary budget environment. Right. 
Um, now we sort of think of terabytes kind of like we used to think of gigabytes in some ways. Um, petabyte, like, you don't have to explain to anybody what a petabyte is anymore. Um, and you know, what's on the horizon, and it's not far, are exabyte-type dataset workloads. Um, and you start to think about what could be in that exabyte of data. >>We've talked about how you extract that value. And we've talked about sort of, um, how you start, but if the scale is big, not everybody's going to start at a petabyte or an exabyte. To Garrett's point, the ability to start small and grow into these products, or excuse me, these projects, I think is a really, um, fundamental concept here, because you're not going to just go say, I'm going to go kick off a five petabyte project. Whether you do that on disk or flash, it's going to be expensive, right? But if you could start at a couple of hundred terabytes, not just as a proof of concept, but as something that, you know, you could get predictable value out of, then you could say, Hey, this either scales linearly or non-linearly in a way that I can then go map my investments to how I can go dig deeper into this. That's how these successful projects are going to start, because the people that are starting with these very large, you know, sort of, um, expansive, you know, greenfield projects at multi-petabyte scale, it's gonna be hard to realize near-term value. >>Excellent. Uh, we gotta wrap, but Garrett, I wonder if you could close it. When you look forward, you talk to customers, do you see this unification of file and object? Is this an evolutionary trend? Is it something that is going to be a lever that customers use? How do you see it evolving over the next two, three years and beyond? 
>>Yeah, I mean, I think from our perspective, just from what we're seeing from the numbers within the market, the amount of growth that's happening with unstructured data is really just starting to finally hit this data deluge, or whatever you want to call it, that we've been talking about for so many years. Um, it really does seem to now be becoming true, um, as we start to see things scale out and really folks settle into, okay, I'm going to use the cloud to start and maybe train my models, but now I'm going to get it back on prem because of latency or security or whatever the, um, decision points are there. Um, this is something that is not going to slow down. And I think, you know, folks like Pure having the ability to have the tools that they give us, um, to use and bring to market with our customers are really key and critical for us. So I see it as a huge growth area and a big focus for us moving forward. >>Guys, great job unpacking a topic that, you know, is covered a little bit, but I think we covered some ground that is new. And so thank you so much for those insights and that data. Really appreciate your time. >>Thanks, Dave. >>Thanks. >>Yeah. Thanks, Dave. >>Okay. And thank you for watching the convergence of file and object. Keep it right there. Be right back after the short break.

Published Date : Jan 28 2021

SUMMARY :

of file and object brought to you by pure storage. And Matt Burr is back with us, gentlemen, welcome to the program. Hey Scott, let me, let me start with you, uh, and get your perspective on what's going on in the market with, but also the need for high performance access to that data. And then you see unified Yeah, I think, I think for us, it's, you know, taking that consultative approach with our customers and really kind design, so I like it, you know, distributed cloud, et cetera, you know, Garrett, maybe it's important to talk about, um, elastic and Splunk and some of the things that you're seeing Um, but we're starting to see, you know, with like Vertica Ian, so I think it, you know, again, being driven by software is really kind of where we're seeing the world I am, you know, I am going to give prioritization to, you know, this particular element of my application you know, it's a chicken and the egg thing. But, but Scott, what are you seeing as the state of infrastructure as it relates to the data It seems, you know, we continue to see increased investment in AI, Sorry to interrupt, but Pat, if you could bring up that, that chart, that would be great. So, so Dave, I'm glad you brought this up. We had to secure the end points, you know, uh, you know, in a much smaller situation, maybe in the executive offices trying to do some testing and things. Anything you can add in here. Garrett seeing, you know, out in the field and in the real world, um, but you know, in our premise was, you know, that that organizations are going to start now operationalizing all Um, you know, security entered into the mix, uh, which took people's attention away from some of this as well. Um, but they realized that, you know, HDFS, wasn't going to be a modern you know, the Splunks et cetera. Uh, and we, you know, we have expertise in engineering is the ability to kind of do things, you know, once you get it set up a blade at a time. 
um, you know, using safe mode as a, as a place to help with ransomware has been a really What else are you guys seeing out there that Um, but you know, the days of, of it, you know, building up a So, um, you know, I think you pointed out a good, a good, a good construct there, to get, you know, new data sources or get data out. And what that means is when you extract value from data, what it does And I think it comes back to, you know, if you think about key characteristics, considered to be, um, you know, uh, uh, an exemplary budget environment. you know, sort of, um, expansive, you know, Greenfield projects at multi petabyte scale, you talk to customers, do you see this unification of file and object? And I think, you know, folks like pure having the Guys, great job unpacking a topic that, you know, it's covered a little bit, but I think we, we covered some ground. Bright, bright back after the short break.

ENTITIES

Entity	Category	Confidence
Matt	PERSON	0.99+
Garrett	PERSON	0.99+
Scott	PERSON	0.99+
Gary	PERSON	0.99+
Scott Sinclair	PERSON	0.99+
Matt Burr	PERSON	0.99+
Dave	PERSON	0.99+
Garrett Belschner	PERSON	0.99+
2019	DATE	0.99+
2021	DATE	0.99+
Petr	PERSON	0.99+
69%	QUANTITY	0.99+
10 terabyte	QUANTITY	0.99+
first phase	QUANTITY	0.99+
10 month	QUANTITY	0.99+
last year	DATE	0.99+
10 months	QUANTITY	0.99+
five	QUANTITY	0.99+
0%	QUANTITY	0.99+
ESG	ORGANIZATION	0.99+
two sides	QUANTITY	0.99+
Pat	PERSON	0.99+
today	DATE	0.99+
next decade	DATE	0.98+
25	QUANTITY	0.98+
two	QUANTITY	0.98+
20	QUANTITY	0.98+
three phases	QUANTITY	0.98+
one	QUANTITY	0.98+
first	QUANTITY	0.98+
Vertica	ORGANIZATION	0.98+
2050	DATE	0.98+
third group	QUANTITY	0.98+
single block	QUANTITY	0.97+
one platform	QUANTITY	0.97+
three topics	QUANTITY	0.97+
five petabyte	QUANTITY	0.96+
March	DATE	0.95+
three great guests	QUANTITY	0.95+
late last year	DATE	0.95+
one place	QUANTITY	0.95+
one thing	QUANTITY	0.92+
past year	DATE	0.91+
Greenfield	ORGANIZATION	0.9+
CDW	PERSON	0.89+
CDW	ORGANIZATION	0.88+
35	QUANTITY	0.88+
pandemic	EVENT	0.87+
One	QUANTITY	0.87+
three years	QUANTITY	0.85+

Breaking Analysis: Five Questions About Snowflake’s Pending IPO

>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is "Breaking Analysis" with Dave Vellante. >> In June of this year, Snowflake filed a confidential document suggesting that it would do an IPO. Now of course, everybody knows about it, found out about it, and it had a $20 billion valuation. So, many in the community and the investment community and so forth are excited about this IPO. It could be the hottest one of the year, and we're getting a number of questions from investors and practitioners and the entire Wikibon, ETR and CUBE community. So, welcome everybody. This is Dave Vellante. This is "CUBE Insights" powered by ETR. In this Breaking Analysis, we're going to unpack five critical questions around Snowflake's IPO or pending IPO. And with me to discuss that is Erik Bradley. He's the Chief Engagement Strategist at ETR and he's also the Managing Director of VENN. Erik, thanks for coming on and great to see you as always. >> Great to see you too. Always enjoy being on the show. Thank you. >> Now for those of you who don't know Erik, VENN is a roundtable that he hosts and he brings in CIOs, IT practitioners, CSOs, data experts and they have an open and frank conversation, but it's private to ETR clients. But they know who the individual is, what their role is, what their title is, et cetera, and it's kind of an ask-me-anything. And I participated in one of them this past week. Outstanding. And we're going to share with you some of that. But let's bring up the agenda slide if we can here. And these are really some of the questions that we're getting from investors and others in the community. There's really five areas that we want to address. The first is what's happening in this enterprise data warehouse marketplace? The second thing is kind of a one area. What about the legacy EDW players like Oracle and Teradata and Netezza?
The third question we get a lot is can Snowflake compete with the big cloud players? Amazon, Google, Microsoft. I mean they're right there in the heart, in the thick of things there. And then what about that multi-cloud strategy? Is that viable? How much of a differentiator is that? And then we get a lot of questions on the TAM, meaning the total available market. How big is that market? Does it justify the valuation for Snowflake? Now, Erik, you've been doing this now. You've run a couple VENNs, you've been following this, you've done some other work that you've done with Eagle Alpha. What's your, just your initial sort of takeaway from all this work that you've been doing? >> Yeah, sure. So my first take on Snowflake was about two and a half years ago. I actually hosted them for one of my VENN interviews and my initial thought was impressed. So impressed. They were talking at the time about their ability to kind of make ease of use of a multi-cloud strategy. At the time although I was impressed, I did not expect the growth and the hyper growth that we have seen now. But, looking at the company in its current iteration, I understand where the hype is coming from. I mean, it's 12 and a half billion private valuation in the last round. The least confidential IPO (laughs) anyone's ever seen (Dave laughs) with a 15 to $20 billion valuation coming out, which is more than Teradata, Mongo and Cloudera combined. It's a great question. So obviously the success to this point is warranted, but we need to see what they're going to be able to do next. So I think the agenda you laid out is a great one and I'm looking forward to getting into some of those details. >> So let's start with what's happening in the marketplace and let's pull up a slide that I very much love to use. It's the classic X-Y. On the vertical axis here we show net score. And remember folks, net score is an indicator of spending momentum.
ETR every quarter does, like clockwork, a survey where they're asking people, "Essentially are you spending more or less?" They subtract the less from the more and come up with a net score. It's more complicated than that, but like NPS, it's a very simple and reliable methodology. That's the vertical axis. And the horizontal axis is what's called market share. Market share is the pervasiveness within the data set. So it's calculated by the number of mentions of the vendor divided by the number of mentions within that sector. And what we're showing here is the EDW sector. And we've pulled out a few companies that I want to talk about. So the big three, obviously Microsoft, AWS and Google. And you can see Microsoft has a huge presence far to the right. AWS, very, very strong. A lot of Redshift in there. And then they're pretty high on the vertical axis. And then Google, not as much share, but very solid in that. Close to 60% net score. And then you can see above all of them from a vertical standpoint is Snowflake with a 77.5% net score. You can see them in the upper right there in the green. One of the highest, Erik, in the entire data set. So, let's start with some sort of initial comments on the big guys and Snowflake. Your thoughts? >> Sure. Just first of all to comment on the data, what we're showing there is just the data warehousing sector, but Snowflake's actual net score is that high amongst the entire universe that we follow. Their data strength is unprecedented and we have forward-looking spending intention. So this bodes very well for them. Now, what you did say very accurately is there's a difference between their spending intentions on a net revenue level compared to AWS, Microsoft. No one's saying that this is an apples-to-apples comparison when it comes to actual revenue. So we have to be very cognizant of that. There is domination (laughs) quite frankly from AWS and from Azure.
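As a rough illustration of the two chart axes described above, here is a minimal Python sketch. It assumes only the simplified definitions given in the conversation: net score as the share of respondents spending more minus the share spending less, and market share as a vendor's mentions divided by all mentions in the sector. The function names and the sample responses are hypothetical, and the actual survey methodology is, as noted in the discussion, more complicated than this.

```python
from collections import Counter


def net_score(responses):
    """Simplified net score: percent spending more minus percent spending less.

    `responses` is a list of strings, each "more", "flat", or "less".
    This is an assumed, simplified reading of the metric, not the real formula.
    """
    total = len(responses)
    if total == 0:
        raise ValueError("need at least one survey response")
    counts = Counter(responses)
    return 100.0 * (counts["more"] - counts["less"]) / total


def market_share(vendor_mentions, sector_mentions):
    """Pervasiveness: vendor mentions as a percentage of all sector mentions."""
    if sector_mentions <= 0:
        raise ValueError("sector must have at least one mention")
    return 100.0 * vendor_mentions / sector_mentions


if __name__ == "__main__":
    # Hypothetical sample: 8 respondents spending more, 1 flat, 1 spending less.
    sample = ["more"] * 8 + ["flat"] + ["less"]
    print(f"net score: {net_score(sample):.1f}%")
    print(f"market share: {market_share(25, 200):.1f}%")
```

A vendor with few mentions can therefore sit high on the vertical axis while remaining far to the left on the horizontal one, which is the position described for Snowflake relative to Microsoft and AWS.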
And Snowflake is a necessary component for them not only to help facilitate a multi-cloud, but look what's happening right now in the US Congress, right? We have these tech leaders being grilled on their actual dominance. And one of the main concerns they have is the amount of data that they're collecting. So I think the environment is right to have another player like this. I think Snowflake really has a lot of longevity and our data is supporting that. And the commentary that we hear from our end users, the people that take the survey, is supporting that as well. >> Okay, and then let's stay on this X-Y slide for a moment. I want to just pull out a couple of other comments here, because one of the questions we're asking is, whither the legacy EDW players? So we've got in here, IBM, Oracle, you can see Teradata and then Hortonworks and MapR. We're going to talk a little bit about Hortonworks 'cause it's now Cloudera. We're going to talk a little bit about Hadoop and some of the data lakes. So you can see there they don't have nearly the net score momentum. Oracle obviously has a huge install base and is investing quite frankly in R&D and doing Exadata and it has its own cloud. So, it's got a lock on its customers and if it keeps investing and adding value, it's not going away. IBM with Netezza, there's really been some questions around their commitment to that base. And I know that a lot of the folks in the VENNs that we've talked to, Erik, have said, "Well, we're replacing Netezza." Frank Slootman has been very vocal about going after Teradata. And then we're going to talk a little bit about the Hadoop space. But, can you summarize for us your thoughts in your research and the commentary from your community, what's going on with the legacy guys? Are these guys cooked? Can they hang on? What's your take? >> Sure. We focus on this quite a bit actually.
So, I'm going to talk about it from the data perspective first, and then we'll go into some of the commentary and the panel. You even joined one yesterday. You know that it was touched upon. But, first on the data side, what we're noticing and capturing is a widening bifurcation between these cloud native and the legacy on-prem. It is undeniable. There is nothing that you can really refute. The data is concrete and it is getting worse. That gap is getting wider and wider and wider. Now, the one thing I will say is, nobody's going to rip out their legacy applications tomorrow. It takes years and years. So when you look at Teradata, right? Their market cap's only 2 billion, 2.3 billion. How much revenue growth do they need to stay where they are? Not much, right? No one's expecting them to grow 20%, which is what you're seeing on the left side of that screen. So when you look at the legacy versus the cloud native, there is very clear direction of what's happening. The one thing I would note from the data perspective is if you switched from net score or adoptions and you went to flat spending, you suddenly see Oracle and Teradata move over to that left a little bit, because again what I'm trying to say is I don't think they're going to catch up. No, but I also don't think they're going away tomorrow. These have large install bases, they have relationships. Now to kind of get into what you were saying about each particular one, IBM, they shut down Netezza. They shut it down and then they brought it back to life. How does that make you feel if you're the head of data architecture or you're DevOps and you're trying to build an application for a large company? I'm not going back to that. There's absolutely no way. Teradata on the other hand is known to be incredibly stable. They are known to just not fail. If you need to kind of re-architect or you do a migration, they work. Teradata also has a lot of compliance built in.
So if you're in financials, if you have a regulated business or industry, there's still some data sets that you're not going to move up to the cloud. Whether it's PII compliance or financial reasons, some of that stuff is still going to live on-prem. So Teradata still has a very good niche. And from what we're hearing from our panels, and this is a direct quote if you don't mind me looking off screen for one second. But this is a great one. Basically said, "Teradata is the only one from the legacy camp who is putting up a fight and not giving up." Basically from a CIO perspective, the rest of them aren't an option anymore. But Teradata is still fighting and that's great to hear. They have their own data as a service offering and listen, they're a small market cap compared to these other companies we're talking about. But, to summarize, the data is very clear. There is a widening bifurcation between the two camps. I do not think legacy will catch up. I think all net new workloads are moving to data as a service, moving to cloud native, moving to hosted, but there are still going to be some existing legacy on-prem applications that will be supported with these older databases. And of those, Oracle and Teradata are still viable options. >> I totally agree with you and my colleague David Floyer is actually quite high on Teradata Vantage because he really does believe that a key component, we're going to talk about the TAM in a minute, but a key component of the TAM he believes must include the on-premises workloads. And Frank Slootman has been very clear, "We're not doing on-prem, we're not doing this halfway house." And so that's an opportunity for companies like Teradata, certainly Oracle I would put it in that camp is putting up a fight. Vertica is another one. They're very small, but another one that's sort of battling it out from the old MPP world. But that's great. Let's go into some of the specifics.
Let's bring up here some of the specific commentary that we've curated here from the roundtables. I'm going to go through these and then ask you to comment. The first one is just, I mean, people are obviously very excited about Snowflake. It's easy to use, the whole thing zero to Snowflake in 90 minutes, but Snowflake is synonymous with cloud-native data warehousing. There are no equals. We heard that a lot from your VENN panelists. >> We certainly did. There was even more euphoria around Snowflake than I expected when we started hosting this series of data warehousing panels. And this particular gentleman that said that happens to be the global head of data architecture for a Fortune 100 financials company. And you mentioned earlier that we did a report alongside Eagle Alpha. And we noticed that among Fortune 100 companies that are also using the big three public cloud companies, Snowflake is growing market share faster than anyone else. They are positioned in a way where even if you're aligned with Azure, even if you're aligned with AWS, if you're a large company, they are gaining share right now. So that particular gentleman's comment was very interesting. He also made a comment that said, "Snowflake is the person who championed the idea that data warehousing is not dead yet. Use that old Monty Python line and you're not dead yet." And back in the day when Hadoop came along and the data lakes turned into a data swamp and everyone said, "We don't need warehousing anymore." Well, that turned out to be a head fake, right? Hadoop was an interesting technology, but it's a complex technology. And it ended up not really working the way people wanted it. I think Snowflake came in at that point at an opportune time and said, "No, data warehousing isn't dead. We just have to separate the compute from the storage layer and look at what I can do. That increases flexibility, security. It gives you that ability to run across multi-cloud."
So honestly the commentary has been nothing but positive. We can get into some of the commentary about people thinking that there's competition catching up to what they do, but there is no doubt that right now Snowflake is the name when it comes to data as a service. >> The other thing we heard a lot was ETL is going to get completely disrupted, you sort of embedded ETL. You heard one panelist say, "Well, it's interesting to see that guys like Informatica are talking about how fast they can run inside Snowflake." But Snowflake is making that easy. That data prep is sort of part of the package. And so that does not bode well for ETL vendors. >> It does not, right? So ETL is a legacy of on-prem databases and even when Hadoop came along, it still needed that extra layer to kind of work with the data. But this is really, really disrupting them. Now to Snowflake's credit, they partner well. All the ETL players are partnered with Snowflake, they're trying to play nice with them, but the writing's on the wall: as more and more of these applications and workloads move to the cloud, you don't need the ETL layer. Now, obviously that's going to affect Talend and Informatica the most. We had a recent comment, this was a CIO who basically said, "The most telling thing about the ETL players right now is every time you speak to them, all they talk about is how they work in a Snowflake architecture." That's the only metric that they talk about right now. And he said, "That's very telling." He basically put it as, it's their existential identity to be part of Snowflake. If they're not, they don't exist anymore. So it was interesting to have sort of a philosophical comment brought up in one of my roundtables.
But that's how important playing nice and finding a niche within this new data as a service is for ETL, but to be quite honest, they might be going the same way of, "Okay, let's figure out our niche on these on-prem workloads that are still there." I think over time we might see them maybe as an M&A possibility, whether it's Snowflake or one of these new up and comers, kind of bring them in and sort of take some of the technology that's useful and layer it in. But as a large market cap, solo existing niche, I just don't know how long ETL is for this world. >> Now, yeah. I mean, you're right that if it wasn't for the marketing, they're not fighting fashion. But >> No. >> really there are some challenges there. Now, there were some contrarians in the panel and they signaled some potential icebergs ahead. And I guarantee you're going to see this in Snowflake's red herring when we actually get it. Like we're going to see all the risks. One of the comments, I'll mention the two and then we can talk about it. "Their engineering advantage will fade over time." Essentially we're saying that people are going to copycat and we've seen that. And the other point is, "Hey, we might see some similar things that happened to Hadoop." The public cloud players giving away these offerings at zero cost. Essentially marginal cost of adding another service is near zero. So the cloud players will use their heft to compete. Your thoughts? >> Yeah, first of all, this is one of the reasons I love doing panels, right? Because we had three gentlemen on this panel that all had nothing but wonderful things to say. But you always get one. And this particular person is a CTO of a well known online public travel agency. We'll put it that way. And he said, "I'm going to be the contrarian here. I have seven different technologies from private companies that do the same thing that I'm evaluating." So that's the pressure from behind, right? The technology, they're going to catch up.
Right now Snowflake has the best engineering, which interestingly enough they took a lot of that engineering from IBM and Teradata if you actually go back and look at it, which was brought up in our panel as well. He said, "However, the engineering will catch up. They always do." Now from the other side they're getting squeezed because the big cloud players just say, "Hey, we can do this too. I can bundle it with all the other services I'm giving you and I can squeeze your pay. Pretty much give it away at cost." So I do think that there is a very valid concern. When you come out with a $20 billion IPO valuation, you need to warrant that. And when you see competitive pressures from both sides, from private emerging technologies and from the more dominant public cloud players, you're going to get squeezed there a little bit. And if pricing gets squeezed, it's going to be very, very important for Snowflake to continue to innovate. That comment you brought up about possibly being the next Cloudera was certainly the best sound bite that I got. And I'm going to use it as clickbait in future articles, because I think everyone who starts looking to buy Snowflake stock and they see that, they're going to need to take a look. But I would take that with a grain of salt. I don't think that's happening anytime soon, but what that particular CTO was referring to was if you don't innovate, the technology itself will become commoditized. And he believes that this technology will become commoditized. So therefore Snowflake has to continue to innovate. They have to find other layers to bring in. Whether that's through their massive war chest of cash they're about to have and M&A, whether that's them buying an analytics company, whether that's them buying an ETL layer, finding a way to provide more value as they move forward is going to be very important for them to justify this valuation going forward. >> And I want to comment on that.
The Cloudera, Hortonworks, MapRs, Hadoop, et cetera. I mean, there are dramatic differences obviously. I mean, that whole space was so hard, very difficult to stand up. You needed science project guys in lab coats to do it. It was very services intensive. As well, companies like Cloudera had to fund all these open source projects and it really squeezed their R&D. I think Snowflake is much more focused and you mentioned some of the background of their engineers, of course Oracle guys as well. However, you will see Amazon's going to trot out a ton of customers using their RA3 managed storage and their flash. I think it's the DC2 piece. They have a ton of action in the marketplace because it's just so easy. It's interesting, one of the comments, you asked this yesterday, was with regard to separating compute from storage, which of course is Snowflake's thing; they basically invented it, it was one of their claims to fame. The comment was what AWS has done to separate compute from storage for Redshift is largely a bolt-on. Which I thought was an interesting comment. I've had some other comments. My friend George Gilbert said, "Hey, despite claims to the contrary, AWS still hasn't separated storage from compute. What they have is really primitive." We got to dig into that some more, but you're seeing some data points that suggest there's copycatting going on. May not be as functional, but at the same time, Erik, like I was saying, good enough is maybe good enough in this space. >> Yeah, and especially with the enterprise, right? You see what Microsoft has done. Their technology is not as good as all the niche players, but it's good enough and I already have a Microsoft license. So, (laughs) you know, why am I going to move off of it? But I want to get back to the comment you mentioned too about that particular gentleman who made that comment about Redshift, that their separation is really more of a bolt-on than a true offering.
It's interesting because I know who these people are behind the scenes and he has a very strong relationship with AWS. So it was interesting to me that in the panel yesterday he said he switched from Redshift to Snowflake because of that and some other functionality issues. So there is no doubt from the end users that are buying this. And he's again a Fortune 100 financial organization. Not the same one we mentioned. That's a different one. But again, a Fortune 100 well known financials organization. He switched from AWS to Snowflake. So there is no doubt that right now they have the technological lead. And when you look at our ETR data platform, we have that adoption reasoning slide that you show. When you look at it, the number one reason that people are adopting Snowflake is their feature set and technological lead. They have that lead now. They have to maintain it. Now, another thing to bring up on this to think about is when you have large data sets like this, and as we're moving forward, you need to have machine learning capabilities layered into it, right? So they need to make sure that they're playing nicely with that. And now you could go open source with the Apache suite, but Google is doing so well with BigQuery and so well with their machine learning aspects. And although they don't speak enterprise well, they don't sell to the enterprise well, that's changing. I think they're somebody to really keep an eye on because their machine learning capabilities that are layered into BigQuery are impressive. Now, of course, Microsoft Azure has Databricks. They're layering that in, but this is an area where I think you're going to see maybe what's next. You have to have machine learning capabilities out of the box if you're going to do data as a service. Right now Snowflake doesn't really have that. Some of the other ones do.
So I had one of my guest panelists basically say to me, because of that, they ended up going with Google BigQuery because he was able to run a machine learning algorithm within hours of getting set up. Within hours. And he said that that kind of capability out of the box is what people are going to have to use going forward. So that's another thing we should dive into a little bit more. >> Let's get into that right now. Let's bring up the next slide which shows net score. Remember this is spending momentum across the major cloud players plus Snowflake. So you've got Snowflake on the left, Google, AWS and Microsoft. And it's showing three survey timeframes: last October, April '20, which is right in the middle of the pandemic, and then the most recent survey which has just taken place this month in July. And you can see Snowflake very, very high scores. Actually improving from the last October survey. Google, lower net scores, but still very strong. Want to come back to that and pick up on your comments. AWS dipping a little bit. I think what's happening here, we saw this yesterday with AWS's results. 30% growth. Awesome. Slight miss on the revenue side for AWS, but look, I mean massive. And they're so exposed to so many industries. So some of their industries have been pretty hard hit. Microsoft pretty interesting. A little softness there. But one of the things I wanted to pick up on, Erik, when you're talking about Google and BigQuery and its ML out of the box, was what we heard from a lot of the VENN participants. There's no question about it that Google technically I would say is one of Snowflake's biggest competitors because it's cloud native. Remember >> Yep. >> AWS did a license one time. License deal with ParAccel and had to sort of refactor the thing to be cloud native. And of course we know what's happening with Microsoft. They basically were on-prem and then they put stuff in the cloud and then all the updates happen in the cloud.
And then they pushed to on-prem. But they have that what Frank Slootman calls that halfway house, but BigQuery no question technically is very, very solid. But again, you see Snowflake right now anyway outpacing these guys in terms of momentum. >> Snowflake is outpacing everyone (laughs) across our entire survey universe. It really is impressive to see. And one of the things that they have going for them is they can connect all three. It's that multi-cloud ability, right? That portability that they bring to you is such an important piece for today's modern CIOs and data architects. They don't want vendor lock-in. They are afraid of vendor lock-in. And this ability to make their data portable and to do that with ease and the flexibility that they offer is a huge advantage right now. However, I think you're a hundred percent right. Google has been so focused on the engineering side and never really focusing on the enterprise sales side. That is why they're playing catch up. I think they can catch up. They're bringing in some really important enterprise salespeople with experience. They're starting to learn how to talk to enterprise, how to sell, how to support. And nobody can really doubt their engineering. How many open source projects have they given us, right? They invented Kubernetes and the entire container space. No one's really going to compete with them on that side if they learn how to sell it and support it. Yeah, right now they're behind. They're a distant third. Don't get me wrong. From a pure hosted ability, AWS is number one. Microsoft Azure sometimes looks like it's number one, but you have to recognize that a lot of that is simply because of their hosted 365. It's a SaaS app. It's not a true cloud type of infrastructure as a service. But Google is a distant third, but their technology is really, really great. And their ability to catch up is there.
And like you said, in the panels we were hearing a lot about how their machine learning capability is right out of the box. And that's where this is going. What's the point of having this huge data if you're not going to be supporting it on new application architecture? And all of those applications require machine learning. >> Awesome. So we're. And I totally agree with what you're saying about Google. They just don't have it figured out how to sell the enterprise yet. And a hundred percent AWS has the best cloud. I mean, hands down. But a very, very competitive market as we heard yesterday in front of Congress. Now we're on the point about, can Snowflake compete with the big cloud players? I want to show one more data point. So let's bring up, this is the same chart as we showed before, but it's new adoptions. And this is really telling. >> Yeah. >> You can see Snowflake with 34% in the yellow, new adoptions, down yes from previous surveys, but still significantly higher than the other players. Interesting to see Google showing momentum on new adoptions, AWS down on new adoptions. And again, exposed to a lot of industries that have been hard hit. And Microsoft actually quite low on new adoption. So this is very impressive for Snowflake. And I want to talk about the multi-cloud strategy now, Erik. This came up a lot. The VENN participants who are sort of fans of Snowflake said three things: It was really the flexibility, the security which is really interesting to me. And a lot of that had to do with the flexibility. The ability to easily set up roles and not have to waste a lot of time wrangling. And then the third was multi-cloud. And that was really something that came through heavily in the VENN. Didn't it? >> It really did. And again, I think it just comes down to, I don't think you can ever overstate how afraid these guys are of vendor lock-in. They can't have it. They don't want it.
And it's best practice to make sure your sensitive information is being kind of spread out a little bit. We all know that people don't trust Bezos. So if you're in certain industries, you're not going to use AWS at all, right? So yeah, this ability to have your data portability through multi-cloud is the number one reason I think people start looking at Snowflake. And to go to your point about the adoptions, it's very telling and it bodes well for them going forward. Most of the things that we're seeing right now are net new workloads. So let's go again back to the legacy side that we were talking about, the Teradatas, IBMs, Oracles. They still have the monolithic applications and the data that needs to support that, right? Like an old ERP type of thing. But anyone who's now building a new application, bringing something new to market, it's all net new workloads. There is no net new workload that is going to go to SAP or IBM. It's not going to happen. The net new workloads are going to the cloud. And that's why when you switch from net score to adoption, you see Snowflake really stand out because this is about new adoption for net new workloads. And that's really where they're driving everything. So I would just say that as this continues, as data as a service continues, I think Snowflake's only going to gain more and more share for all the reasons you stated. Now to get back to your comment about security. I was shocked by that. I really was. I did not expect these guys to say, "Oh, no. Snowflake enterprise security not a concern." So two panels ago, a gentleman from a Fortune 100 financials said, "Listen, it's very difficult to get us to sign off on something for security. Snowflake has passed it, it is enterprise ready, and we are going full steam ahead." Once they got that go ahead, there was no turning back. We gave it to our DevOps guys, we gave it to everyone and said, "Run with it." So, when a company that's big, I believe their Fortune rank is 28.
(laughs) So when a company that big says, "Yeah, you've got the green light. That we were okay with the internal compliance aspect, we're okay with the security aspect, this gives us multi-cloud portability, this gives us flexibility, ease of use." Honestly there's a really long runway ahead for Snowflake. >> Yeah, so the big question I have around the multi-cloud piece and I totally and I've been on record saying, "Look, if you're going looking for an agnostic multi-cloud, you're probably not going to go with the cloud vendor." (laughs) But I've also said that I think multi-cloud to date anyway has largely been a symptom as opposed to a strategy, but that's changing. But to your point about lock-in and also I think people are maybe looking at doing things across clouds, but I think that certainly it expands Snowflake's TAM and we're going to talk about that because they support multiple clouds and they're going to be the best at that. That's a mandate for them. The question I have is how much of complex joining are you going to be doing across clouds? And is that something that is just going to be too latency intensive? Is that really Snowflake's expertise? You're really trying to build that data layer. You're probably going to maybe use some kind of Postgres database for that. >> Right. >> I don't know. I need to dig into that, but that would be an opportunity from a TAM standpoint. I just don't know how real that is. >> Yeah, unfortunately I'm going to just be honest with this one. I don't think I have great expertise there and I wouldn't want to lead anyone a wrong direction. But from what I've heard from some of my VENN interview subjects, this is happening. So the data portability needs to be agnostic to the cloud. I do think that when you're saying, are there going to be real complex kind of workloads and applications? Yes, the answer is yes. And I think a lot of that has to do with some of the container architecture as well, right? 
If I can just pull data from one spot, spin it up for as long as I need and then just get rid of that container, that ephemeral layer of compute. It doesn't matter where the cloud lies. It really doesn't. I do think that multi-cloud is the way of the future. I know that the container workloads right now in the enterprise are still very small. I've heard people say like, "Yeah, I'm kicking the tires. We got 5%." That's going to grow. And if Snowflake can make themselves an integral part of that, then yes. I think that's one of those things where, I remember the guy said, "Snowflake has to continue to innovate. They have to find a way to grow this TAM." This is an area where they can do so. I think you're right about that, but as far as my expertise, on this one I'm going to be honest with you and say, I don't want to answer incorrectly. So you and I need to dig in a little bit on this one. >> Yeah, as it relates to question four, what's the viability of Snowflake's multi-cloud strategy? I'll say unquestionably supporting multiple clouds, very viable. Whether or not portability across clouds, multi-cloud joins, et cetera, TBD. So we'll keep digging into that. The last thing I want to focus on here is the last question, does Snowflake's TAM justify its $20 billion valuation? And you think about the data pipeline. You go from data acquisition to data prep. I mean, that really is where Snowflake shines. And then of course there's analysis. You've got to bring in EMI or AI and ML tools. That's not Snowflake's strength. And then you're obviously preparing that, serving that up to the business, visualization. So there are potential adjacencies that they could get into that they may or may not decide to. But so we put together this next chart which is kind of the TAM expansion opportunity. And I just want to briefly go through it. We published this stuff so you can go and look at all the fine print, but it kind of starts with the data lake disruption. 
You called it data swamp before. The Hadoop no schema on write. Basically the ROI of Hadoop became reduction of investment, as my friend Abby Meadow would say. But so they're kind of disrupting that data lake which really was a failure. And then really going after that enterprise data warehouse, which I have here as 10 billion. It's actually bigger than that. It's probably more like a $20 billion market. I'll update this slide. And then really what Snowflake is trying to do is be data as a service. A data layer across data stores, across clouds, really make it easy to ingest and prepare data and then serve the business with insights. And then ultimately this huge TAM around automated decision making, real-time analytics, automated business processes. I mean, that is potentially an enormous market. We got a couple of hundred billion. I mean, just huge. Your thoughts on their TAM? >> I agree. I'm not worried about their TAM and one of the reasons why, as I mentioned before, they are coming out with a whole lot of cash. (laughs) This is going to be a red hot IPO. They are going to have a lot of money to spend. And look at their management team. Who is leading the way? A very successful, wise, intelligent, acquisitive type of CEO. I think there is going to be M&A activity, and I believe that M&A activity is going to be 100% for the mindset of growing their TAM. The entire world is moving to data as a service. So let's take that as a backdrop. I'm going to go back to the panel we did yesterday. The first question we asked was, there was an understanding or a theory that when the virus pandemic hit, people wouldn't be taking on any sort of net new architecture. They're like, "Okay, I have Teradata, I have IBM. Let's just make sure the lights are on. Let's stick with it." Every single person I've asked, and that's now eight different experts, said to us, "Oh, no. Oh, no, no." There is the virus pandemic, the shift from work from home. 
Everything we're seeing right now has only accelerated and advanced our data as a service strategy in the cloud. We are building for scale, adopting cloud for data initiatives. So, across the board they have a great backdrop. So that's going to only continue, right? This is very new. We're in the early innings of this. So for their TAM, that's great because that's the core of what they do. Now on top of it you mentioned the type of things about, yeah, right now they don't have great machine learning. That could easily be acquired and built in. Right now they don't have an analytics layer. I for one would love to see these guys talk to Alteryx. Alteryx is red hot. We're seeing great data and great feedback on them. If they could do that business intelligence, that analytics layer on top of it, the entire suite as a service, I mean, come on. (laughs) Their TAM is expanding in my opinion. >> Yeah, your point about their leadership is right on. And I interviewed Frank Slootman right in the heart of the pandemic >> So impressed. >> and he said, "I'm investing in engineering almost sight unseen. More circumspect around sales." But I will caution people that a lot of people, I think, see what Slootman did with ServiceNow. And he came into ServiceNow. I have to tell you, they didn't have their unit economics right, they didn't have their sales model and marketing model. He cleaned that up. Took it from 120 million to 1.2 billion and really did an amazing job. People are looking for a repeat here. This is a totally different situation. ServiceNow drove a truck through BMC's installed base with IT help desk and then created this brilliant TAM expansion, a land-and-expand model. This is much different here. And Slootman also told me that he's a situational CEO. He doesn't have a playbook. And so that's what is most impressive and interesting about this. 
He's now up against the biggest competitors in the world: AWS, Google and Microsoft and dozens of other smaller startups that have raised a lot of money. Look at a company like Yellowbrick. They've raised, I don't know, $180 million. They've got a great team. Google, IBM, et cetera. So it's going to be really, really fun to watch. I'm super excited, Erik, but I'll tell you the data right now suggest they've got a great tailwind and if they can continue to execute, this is going to be really fun to watch. >> Yeah, certainly. I mean, when you come out and you are as impressive as Snowflake is, you get a target on your back. There's no doubt about it, right? So we said that they basically created the data as a service. That's going to invite competition. There's no doubt about it. And Yellowbrick is one that came up in the panel yesterday, where one of our CIOs was doing a proof of concept with them. We had about seven others mentioned as well that are startups that are in this space. However, none of them, despite their great valuation and their great funding, are going to have the kind of money and the market lead that Slootman is going to have, which Snowflake has as this comes out. And with what we're seeing in Congress right now, with some antitrust scrutiny around the large data that's being collected by AWS, Azure, Google, I'm not going to bet against this guy either. Right now I think he's got a lot of opportunity, there's a lot of additional layers and because he can basically develop this as a suite service, I think there's a lot of great opportunity ahead for this company. >> Yeah, and I guarantee that he understands well that customer acquisition cost and the lifetime value of the customer, the retention rates. Those are all things that he and Mike Scarpelli, his CFO, learned at ServiceNow. Not learned, perfected. (Erik laughs) Well Erik, really great conversation, awesome data. It's always a pleasure having you on. Thank you so much, my friend. 
I really appreciate it. >> I appreciate talking to you too. We'll do it again soon. And stay safe everyone out there. >> All right, and thank you for watching everybody this episode of "CUBE Insights" powered by ETR. This is Dave Vellante, and we'll see you next time. (soft music)

Published Date : Jul 31 2020

SUMMARY :

This is breaking analysis and he's also the Great to see you too. and others in the community. I did not expect the And the horizontal axis is And one of the main concerns they have and some of the data lakes. and the legacy on-prem. but a key component of the TAM And back in the day where of part of the package. and Informatica the most. I mean, you're right that if And the other point is, "Hey, and from the more dominant It's interesting one of the comments, that in the panel yesterday and it's ML out of the box the thing to be cloud native. That portability that they bring to you And I totally agree with what And a lot of that had to and the data that needs and they're going to be the best at that. I need to dig into that, I know that the container on here is the last question, and one of the reasons heart of the pandemic and if they can continue to execute, And Yellowbrick is one that and the lifetime value of the customer, I appreciate talking to you too. This is Dave Vellante, and

ENTITIES

Entity	Category	Confidence
IBM	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Frank Slootman	PERSON	0.99+
George Gilbert	PERSON	0.99+
Erik Bradley	PERSON	0.99+
Erik	PERSON	0.99+
Frank Slootman	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Mike Scarpelli	PERSON	0.99+
Google	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
David Floyd	PERSON	0.99+
Slootman	PERSON	0.99+
Teradata	ORGANIZATION	0.99+
Abby Meadow	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
100%	QUANTITY	0.99+
$180 million	QUANTITY	0.99+
$20 billion	QUANTITY	0.99+
Netezza	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
77.5%	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+
20%	QUANTITY	0.99+
10 billion	QUANTITY	0.99+
12 and a half billion	QUANTITY	0.99+
120 million	QUANTITY	0.99+
Oracles	ORGANIZATION	0.99+
one	QUANTITY	0.99+
two	QUANTITY	0.99+
Cloudera	ORGANIZATION	0.99+
Yellowbrick	ORGANIZATION	0.99+

Breaking Analysis: Living Digital: New Rules for Technology Events

From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation. >> You know, for years marketers have been pushing for more digital, especially with their big conferences. I've heard forward-thinking CMOs say the war will be won in digital. But the sales teams love the belly-to-belly interaction. So every year, once or even sometimes more often, big corporations have hosted gatherings of thousands or even tens of thousands of attendees. These events were like rock concerts. They had DJs in the hallway, thumping music, giant screens, beautiful pitches, highly produced videos, technical breakouts, food lines, private dinners, et cetera, all culminating in a customer appreciation event with a big-name band. Physical events are expensive, but they generate tons of leads for the host companies and their partner ecosystems. Well, then boom, coronavirus hits, and the marketing teams got what they wished for, right? Overnight, virtual events became a mandate. If you didn't have a solution, you were in big trouble because your leads from these large events just dried up. Hello everyone, this is Dave Vellante, and welcome to this week's CUBE Insights powered by ETR. ETR is entering its quiet period and I won't be able to share any new data for a couple of weeks. So rather than look back at the April survey, in this Breaking Analysis we thought we'd take a pause and really talk about the virtual event landscape and just a few of the things that we've learned in the past 120 days. Now, this isn't meant to be an exhaustive list, but we do want to call out a few important items that we see as critical in this new digital world. In the isolation economy, every company scrambled. They took one of three paths. First, companies either postponed their events to buy some time, think like Dell Technologies World, Google Cloud Next, cube convey, my MIT CBO event, et cetera. Or two, some companies flat-out canceled their events for the year until next year, like 
Snowflake and UiPath Forward. Number three, they scrambled to deploy a virtual event and they went forward. IBM Think did this, HPE Discover, SUSECON, AWS Summits, docker convey Monde a, PegaWorld, Vertica Big Data Conference, Oktane, SAP Sapphire, and hundreds of others pushed forward. So in this Breaking Analysis, I want to share some data from theCUBE, what we've learned not only in the last 120 days but in ten years of doing events, mostly physical, and we want to share the new rules of events and event marketing and beyond. So let's get right into it. Everyone knows events have gone virtual, and there are tons of people who could give you advice on improving your digital events, including us, and I will in this segment. But the first thing that everyone found out is they're going to attract far more people online with a free virtual event than they do with a paid physical event. So removing the time and the expense of travel dramatically increases the participation TAM, the total available market. Here's a tweet from Docker CEO Scott Johnston. He says that he's looking forward to welcoming 50,000 people to his event. This is based on registration data. Somewhere around 30,000 people logged into the live event, so Docker got 60% of the pre-event registrants to actually log in, which is outstanding. But there's a lot more to this story. I'll share some other stats that are worth mentioning. By the way, I got permission from Docker to share these numbers, not surprising because the event was a huge success for such a small company. In the end, they got nearly 83,000 registrations, and they continue to come in weeks after the event, which was held in late May. Now, marketers generally will cite 2 to 3 minutes as a respectable time on site for a web property. Docker logged-in users averaged almost four and a half minutes on site. That's the average. The bell curve saw superfans like this guy who was binge watching. So this brings me to rule number one: it's 
actually really easy to get people to sign up for free online events, but it's not so easy to keep them there. Now, I could talk all day about what Docker did right, and I'm going to bring some examples in during this segment, but the one thing Docker did was they did a call for papers, or a call for sessions, and that's a lot of work. But if you look at the DockerCon speaker list, the content is all community driven, not all, but mostly community driven. Docker had to break some eggs and reject some folks, but it also had a sponsor track, so it gave folks another avenue to participate. So big success for Docker. They definitely did it right. Which brings us to new rule number two: attention is precious. You've got to create high-quality content and realize that you have much less time with participants than if they were in person. Now, unfortunately, the Docker example is a bit of an outlier. It hasn't always been this pretty. Remember that scene in The Social Network, the movie, when Eduardo pulled the funding on the servers just to get Mark's attention? Remember how Jesse Eisenberg, the actor who played Zuckerberg, reacted? "Everybody else... we don't crash ever. If the servers are down for even a day, our entire reputation is irreversibly destroyed. The whole point..." Well, some of the big tech companies crashed their servers, and they say there's no such thing as bad press, but look what happened to SAP. SAP apologized publicly, and its CEO told people that they made a mistake in outsourcing their event platform. So this brings us to new rule number three: don't crash. Now, coming back to DockerCon for a second, here's a tweet from a developer who shared the network traffic profile of his network before and during DockerCon. You can see no glitches. I don't mean to pick on SAP. They owned the problem. And look, SAP had a huge attendance at its digital event, more than 200,000 people and over a million views. So wow, you'll kill me with that problem. But it 
underscores the importance of scaling. And SAP, you have to say, was not alone. There have been lots of fails from much smaller events. Here's an example that was really frustrating. You try to log in at 7:59, but the event doesn't start until 8:00 sharp. Really? Come on back in 60 seconds. And in another example, there was a slide failure. I mean, many of these virtual events are glorified webinars, so if you're going to rely on slideware, make sure the slides will render at scale. Maybe embed them into the video. But at least this company had a backup plan. Here's another example, and I've redacted the email because I'm not here to throw anyone under the bus. Well, kind of, but no reason to name names. You know who they are. But in this case, an old legacy webinar platform failed and they had to move to WebEx. And again, at least there was a backup plan. So it's been tough in a lot of these cases. Here's a tweet from Jason Reed that kind of sums it up. Now, what does he mean by vendors are not getting the job done, not enough creativity? Well, not only were platforms failing and not performing adequately, but the virtual experience is leaving many users unenthused. They're just one Alt-Tab away from something better if the virtual event fails to engage them. So new rule number four is: virtual events that look like webinars actually are webinars. I mean, in fairness, the industry had to pivot with no notice, but this is why I always tell people, start with the outcome that you want and work backwards. That'll inform you as to the content strategy and the new roles you need to assign. And make no mistake, there are new roles. There's no site inspection in virtual. And then you've got to figure out what you want the user experience to be. There's a whole lot to figure out. And this next one is a bit of a throwaway because it's so obvious and everyone talks about it, but I want to bring it up because it's important, because I'm 
amazed at how many virtual event speakers really haven't thought through their setup. You can look good, or at least less bad. Get those things called books and raise up the laptop. Figure out some better audio. Or better yet, get a good kit and send it to their home, with a nice camera and a solid mic, maybe clearer IFB comms for the ear. Spend some money to look good, just as you might go and buy a nice outfit. Even if you're a developer, put on a clean T-shirt. So rule number five: don't cheap out on production value. Get your guests a good setup and coach them up. It doesn't have to be over the top, just a bit thought out. Okay, one of the biggest mistakes I've seen is event organizers become enamored with a platform and the features of that platform that really don't support their objectives, kind of feature creep. Or they have so many competing objectives and masters that they're serving that they lose sight of the user experience, and then the event becomes a buffet of unused features rather than a buffet of engaging content. Now, many have told me, Dave, these virtual events are too long, there's too much content. I don't necessarily agree. I really think if you have something to say, you should say it, as long as you do it right and you keep people engaged. So I want to talk a little bit about two of the meatier events that we attended. One was Oktane20, hosted by Okta, the identity management security player, and then IBM Think 2020. They called it the Think Digital Event Experience. They both had multi-day events with lots of content. They both organized sessions by topic and made it pretty easy to find stuff, and all sessions had a reasonably consistent look and feel to them, which kind of helped the production value. IBM had content organized and categorized, which made things easy to find, and they both had good search. And with IBM, you could go directly from the list of topics right into the videos, which I really liked. Very easy and 
intuitive. And as you can see here in this Oktane video, they had a nice and very ambitious agenda that was really quite well organized, and things were pretty easy to find, as you can see with this crisp filtering on the left-hand side and really nice search. But one of the things that has been frustrating with most of the events that I've watched is you can't get to the sessions directly from the agenda. You've got to go back out through some linear path and find the content, and it's somewhat confusing. So I want to come back to the DockerCon example, because I think there were two things that I found interesting and useful with DockerCon. This guy George nailed it when he said this is how you display a virtual conference. What's relevant about this picture is you have multiple simultaneous sessions running live and concurrently, and you can pop in and out of them. You can easily see the sessions in this tile, and there's a red line, this linear clock that's running in real time to show you where you are in the event agenda versus the time of day. So I felt like with Docker, as a user, you're really connected to the event. You come to the site and there's a hero video. It's very easy to find the content. In fact, you can't miss it. It's not a sales pitch to get to the content. And then I really liked what George change was talking about in terms of the agenda and the tile layout. You can see they ran simultaneous sessions, at one point up to seven at once, and they gave their sponsors a track on the agenda, which is very easy to navigate. But what I really like as well is when you click on a tile, it takes you directly to the session video, and you can see the chat, which Docker preserved in the post-event mode. And you have this easy-to-follow agenda, and again, you can go directly to the session video and the chat from the agenda. So, many paths to find the content. I mean, something so simple as navigating directly from the agenda to the session, most events haven't 
done that. They make you back out in what I call this linear manner, and then go forward and find the sessions that you want, and then dive in. Now, maybe they're trying to simulate walking to a session in a Las Vegas Convention Center, because it takes about that long to figure out where most of these events and these sessions live. So rule number six is: make it easy to discover and consume content. Sounds so simple. Why is it not happening in most events? Okay, I'm running out of time, so I want to encapsulate a number of items in one idea that we talk about all the time at theCUBE. I ran a little survey the other day, and someone asked, does it really make sense to cram educational content, product content, partner content, customer content, rally content, and leadership content into the constrained confines of an arbitrary one- or two-day window? I thought that was an interesting comment. Now, it doesn't necessarily mean shorten up the virtual event, which a lot of people think should happen. People complain that these things are too long. Well, let me leave you with this: it's actually not just about events. What do I mean by that? Well, you know how everyone says that all companies are software companies, or every company is a SaaS company? Well, guess what? We believe that every company is a media company. In 2004, at the low point of its reputation, Microsoft launched Channel 9. It was named after the United Airlines Channel 9 that lets you listen in to the pilots and their unfiltered conversations. Kind of cool. Microsoft understood that having an authentic voice with which to communicate to developers and serve its community was a smart thing to do. And that is the key point. Channel 9 is about community. It's not about audience metrics or lead generation, both important things. But when Microsoft launched this site, it understood the leverage it gets out of its community of developers, and instead of treating them like leads, they created a site to help developers learn. So rule number seven is: get your 
best media mojo on. One of the biggest failures I see with physical events, and it's clearly carrying over to digital, is the failure to optimize the post-event opportunity and experience. So just like physical events, when the event is over, I see companies and their employees so burnt out after a virtual event, because they feel like they've just given birth. And what do they do after the event? They take some time off. They've got to recharge. And when they come back, they're swamped, and so they're on to the next project. It might be another event, it might be a webinar series or some regional summits or whatever. Now, it's interesting. It feels like all tech companies talk about these days is breaking down silos, but most of these parent and child events are disconnected silos. Sure, maybe the data around the events is consolidated into a marketing cloud so that you can nurture leads. Okay, that's fine. But what about the community? COVID has given us a great opportunity to reimagine how we serve communities. And one thing I'm certain about is that physical events are going to come back at some point in some form, but when they do, there's going to be a stronger digital component attached to them. Hybrids will emerge, and some will serve communities better than others. And in our opinion, the ones that do the best job in digital and serving their communities are going to win the marketing wars. So ask yourself, how are you serving your community? Are you serving it the best way that you can? Is lead conversion your number one metric? That's okay, there's nothing wrong with that. But how are your content consumption metrics looking? What are you measuring? What does your arc of content look like? What's your content and organic media strategy? What does your media stack look like? Media stack, you ask? What do you mean, Dave? Well, you nailed physical, and then you were forced to do virtual overnight. Eventually there's going to be a hybrid that emerges. So there's physical at the bottom, and then 
there's a virtual layer, and then you get this hybrid layer at some point. On top of that, at the very top of the stack, you've got apps, social media, you've got corporate content, you've got TV like Channel 9, you have video libraries, websites, you have tools for agile media, you've got media production and distribution tooling. Remember, customers will be entering from any one of these layers of that stack, and they'll be looking to you for guidance, inspiration, learning, vision, product knowledge, how-tos, et cetera, and you'd be delivering that primarily through content. So your media stack should be designed to serve your community. Event software? Yeah, sure, but it's much more than that. We believe that this stack will emerge not as a monolithic beast, but rather as a set of scalable cloud services and APIs. Think of PaaS for media that you can skin, yes, of course, but also one that you can control, add value to, integrate with other platforms, and fit to your business as your community demands. And remember, new roles are emerging as a result of this pandemic and the pivot to digital. Things are different, really, from most physical events, and it's very important to think about these roles. One of the important roles is this designer or UX developer that can actually do some coding and API integration. Think of it as a DevOps for digital organizations that's emerging. Organizations like yours will want self-service and sometimes out-of-the-box functionality and features, for sure, no question. But we believe that as a media producer, you will want to customize your media experience for your community, and this work will require new skills that you haven't really prioritized in the past. What do you think? What's your vision as to how this will all play out and unfold? Do you buy that all companies must become media companies, or at least media savvy, not in the sense of corp comms, but really as an organic media producer? Tweet me @dvellante, or email me at david.vellante@siliconangle.com, or 
comment on my LinkedIn post. We'll be back next week with some data from the ETR survey sphere. Thanks for watching this Wikibon CUBE Insights powered by ETR. This is Dave Vellante. We'll see you next time. (music)

Published Date : Jul 8 2020


ENTITIES

Entity	Category	Confidence
Dave Volante	PERSON	0.99+
Jason Reed	PERSON	0.99+
2004	DATE	0.99+
60%	QUANTITY	0.99+
United Airlines	ORGANIZATION	0.99+
Jesse Eisenberg	PERSON	0.99+
2	QUANTITY	0.99+
ten years	QUANTITY	0.99+
Palo Alto	LOCATION	0.99+
Zuckerberg	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Scott Johnson	PERSON	0.99+
George	PERSON	0.99+
Dhaka Khan	LOCATION	0.99+
50,000 people	QUANTITY	0.99+
David Galante	PERSON	0.99+
Dave	PERSON	0.99+
next week	DATE	0.99+
April	DATE	0.99+
60 seconds	QUANTITY	0.99+
more than 200,000 people	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
late May	DATE	0.99+
3 minutes	QUANTITY	0.99+
two things	QUANTITY	0.99+
next year	DATE	0.99+
Las Vegas Convention Center	LOCATION	0.99+
Boston	LOCATION	0.99+
tons of people	QUANTITY	0.98+
hundreds	QUANTITY	0.98+
over a million views	QUANTITY	0.98+
first thing	QUANTITY	0.98+
nearly 83,000 registrations	QUANTITY	0.98+
one idea	QUANTITY	0.98+
both	QUANTITY	0.97+
two-day	QUANTITY	0.97+
octane twenty20	EVENT	0.97+
tons of leads	QUANTITY	0.97+
almost four and a half minutes	QUANTITY	0.97+
AP	ORGANIZATION	0.97+
Dell	ORGANIZATION	0.96+
one	QUANTITY	0.96+
SAS	ORGANIZATION	0.95+
around 30,000 people	QUANTITY	0.94+
docker	ORGANIZATION	0.93+
channel 9	ORGANIZATION	0.93+
this week	DATE	0.93+
thousands	QUANTITY	0.91+
one point	QUANTITY	0.9+
CEO	PERSON	0.9+
devonté	PERSON	0.89+
first companies	QUANTITY	0.88+
a day	QUANTITY	0.88+
pandemic	EVENT	0.88+
kovat	ORGANIZATION	0.87+
8:00	DATE	0.87+
WebEx	TITLE	0.86+
number three	QUANTITY	0.86+
rule number three	OTHER	0.84+
MIT CBO	EVENT	0.83+
LinkedIn	ORGANIZATION	0.82+
tens of thousands of attendees	QUANTITY	0.82+
one thing	QUANTITY	0.82+
agile	TITLE	0.81+
six	OTHER	0.8+
every year	QUANTITY	0.8+
7:59	DATE	0.79+
one	OTHER	0.78+
Dave Allen	PERSON	0.78+
multi day	QUANTITY	0.75+
rule number	QUANTITY	0.75+
couple of weeks	QUANTITY	0.74+
docker	TITLE	0.73+
ETR	ORGANIZATION	0.73+
rule number four	QUANTITY	0.73+
a lot of work	QUANTITY	0.71+
rule number seven	QUANTITY	0.71+
up to seven	QUANTITY	0.7+

Rich Gaston, Micro Focus | Virtual Vertica BDC 2020

(upbeat music) >> Announcer: It's theCUBE covering the virtual Vertica Big Data Conference 2020 brought to you by Vertica. >> Welcome back to the Vertica Virtual Big Data Conference, BDC 2020. You know, it was supposed to be a physical event in Boston at the Encore. Vertica pivoted to a digital event, and we're pleased that theCUBE could participate because we've participated in every BDC since the inception. Rich Gaston this year is the global solutions architect for security risk and governance at Micro Focus. Rich, thanks for coming on, good to see you. >> Hey, thank you very much for having me. >> So you got a chewy title, man. You got a lot of stuff, a lot of hairy things in there. But maybe you can talk about your role as an architect in those spaces. >> Sure, absolutely. We handle a lot of different requests from the Global 2000 type of organization that will try to move various business processes, various application systems, databases, into new realms. Whether they're looking at opening up new business opportunities, whether they're looking at sharing data with partners securely, they might be migrating it to cloud applications, and doing migration into a Hybrid IT architecture. So we will take those large organizations and their existing installed base of technical platforms and data, users, and try to chart a course to the future, using Micro Focus technologies, but also partnering with other third parties out there in the ecosystem. So we have large, solid relationships with the big cloud vendors, and also with a lot of the big database vendors. Vertica's our in-house solution for big data and analytics, and we are one of the first integrated data security solutions with Vertica. We've had great success out in the customer base with Vertica as organizations have tried to add another layer of security around their data. 
So what we will try to emphasize is an enterprise-wide data security approach, where you're taking a look at data as it flows throughout the enterprise from its inception, where it's created, where it's ingested, all the way through the utilization of that data. And then to the other uses where we might be doing shared analytics with third parties. How do we do that in a secure way that maintains regulatory compliance, and that also keeps our company safe against data breach. >> A lot has changed since the early days of big data, certainly since the inception of Vertica. You know, it used to be big data, everyone was rushing to figure it out. You had a lot of skunkworks going on, and it was just like, figure out data. And then as organizations began to figure it out, they realized, wow, who's governing this stuff? A lot of shadow IT was going on, and then the CIO was called to sort of rein that back in. As well, you know, with all kinds of whatever, fake news, the hacking of elections, and so forth, the sense of heightened security has gone up dramatically. So I wonder if you can talk about the changes that have occurred in the last several years, and how you guys are responding. >> You know, it's a great question, and it's been an amazing journey because I was walking down the street here in my hometown of San Francisco at Christmastime years ago and I got a call from my bank, and they said, we want to inform you your card has been breached by Target, a hack at Target Corporation and they got your card, and they also got your PIN. And so you're going to need to get a new card, we're going to cancel this. Do you need some cash? I said, yeah, it's Christmastime so I need to do some shopping. And so they worked with me to make sure that I could get that cash, and then get the new card and the new PIN. And being a professional in the inside of the industry, I really questioned, how did they get the PIN? Tell me more about this.
And they said, well, we don't know the details, but you know, I'm sure you'll find out. And in fact, we did find out a lot about that breach and what it did to Target. The impact: a $250 million immediate impact, CIO gone, CEO gone. This was a big one in the industry, and it really woke a lot of people up to the different types of threats on the data that we're facing with our largest organizations. Not just financial data; medical data, personal data of all kinds. Flash forward to the Cambridge Analytica scandal that occurred where Facebook is handing off data, they're making a partnership agreement with someone they think they can trust, and then that is misused. And who's going to end up paying the cost of that? Well, it's going to be Facebook, to the tune of about five billion on that, plus some other fines that'll come along, and other costs that they're facing. So what we've seen over the course of the past several years has been an evolution from data breaches making the headlines to my customers coming to us and saying, help us neutralize the threat of this breach. Help us mitigate this risk, and manage this risk. What do we need to be doing, what are the best practices in the industry? Clearly what we're doing on the perimeter security, the application security and the platform security is not enough. We continue to have breaches, and we are the experts at that answer. The follow-on fascinating piece has been the regulators jumping in now. First in Europe, but now we see California enacting a law just this year. It came into place, and it is very stringent, and has a lot of deep protections that are really far-reaching around personal data of consumers. Look at jurisdictions like Australia, where fiduciary responsibility now goes to the Board of Directors. That's getting attention. For a regulated entity in Australia, if you're on the Board of Directors, you better have a plan for data security.
And if there is a breach, you need to follow protocols, or you personally will be liable. And that is a sea change that we're seeing out in the industry. So we're getting a lot of attention on both, how do we neutralize the risk of breach, but also how can we use software tools to maintain and support our regulatory compliance efforts as we work with, say, the largest money center bank out of New York. I've watched their audit year after year, and it's gotten more and more stringent, more and more specific, tell me more about this aspect of data security, tell me more about encryption, tell me more about money management. The auditors are getting better. And we're supporting our customers in that journey to provide better security for the data, to provide a better operational environment for them to be able to roll new services out with confidence that they're not going to get breached, and with confidence that they're not going to have a regulatory compliance fine or a nightmare in the press. And these are the major drivers that help us with Vertica sell together into large organizations to say, let's add some defense in depth to your data. And that's really a key concept in the security field, this concept of defense in depth. We apply that to the data itself by changing the actual data element of Rich Gaston, I will change that name into ciphertext, and that then yields a whole bunch of benefits throughout the organization as we deal with the lifecycle of that data. >> Okay, so a couple things I want to mention there. So first of all, totally board level topic, every board of directors should really have cyber and security as part of its agenda, and it does for the reasons that you mentioned. The other is, GDPR got it all started. I guess it was May 2018 that the penalties went into effect, and that just created a whole domino effect. You mentioned California enacting its own laws, which, you know, in some cases are even more stringent.
And you're seeing this all over the world. So I think one of the questions I have is, how do you approach all this variability? It seems to me, you can't just take a narrow approach. You have to have an end-to-end perspective on governance and risk and security, and the like. So are you able to do that? And if so, how so? >> Absolutely, I think one of the key areas in big data in particular, has been the concern that we have a schema, we have database tables, we have columns, and we have data, but we're not exactly sure what's in there. We have application developers that have been given sandbox space in our clusters, and what are they putting in there? So can we discover that data? We have those tools within Micro Focus to discover sensitive data within your data stores, but we can also protect that data, and then we'll track it. And what we really find is that when you protect, let's say, five billion rows of a customer database, we can now know what is being done with that data on a very fine-grained and granular basis, to say that this business process has a justified need to see the data in the clear, we're going to give them that authorization, they can decrypt the data. SecureData, my product, knows about that and tracks that, and can report on that and say at this date and time, Rich Gaston did the following thing to be able to pull data in the clear. And that could be then used to support the regulatory compliance responses and then audit to say, who really has access to this, and what really is that data? Then in GDPR, we're getting down into much more fine-grained decisions around who can get access to the data, and who cannot. And organizations are scrambling.
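The authorize-then-track flow described above (a business process is granted the right to see data in the clear, and every access is recorded with who and when) can be sketched in a few lines of Python. This is only an illustration of the idea, not the SecureData product or its API: the class name `AuditedVault`, its fields, and the user names are all hypothetical.

```python
from datetime import datetime, timezone


class AuditedVault:
    """Hypothetical gatekeeper: checks authorization and logs every attempt."""

    def __init__(self, authorized):
        self._authorized = set(authorized)  # users allowed to see clear text
        self.audit_log = []                 # one entry per access attempt

    def access(self, user, field, decrypt):
        """Run `decrypt` for `user` if authorized; record the attempt either way."""
        granted = user in self._authorized
        self.audit_log.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "field": field,
            "granted": granted,
        })
        if not granted:
            raise PermissionError(f"{user} may not read {field} in the clear")
        return decrypt()


# Usage: the lambda stands in for a real decryption call.
vault = AuditedVault(authorized={"rich.gaston"})
clear_value = vault.access("rich.gaston", "customer_name", lambda: "Rich")
```

Because denied attempts are logged as well as granted ones, the log can answer both questions an auditor tends to ask: who has access, and who tried to get it.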
One of the funny conversations that I had a couple years ago as GDPR came into place was, it seemed a couple of customers were taking this sort of brute-force approach of, we're going to move our analytics and all of our data to Europe, to European data centers because we believe that if we do this in the U.S., we're going to violate their law. But if we do it all in Europe, we'll be okay. And that simply was a short-term way of thinking about it. You really can't be moving your data around the globe to try to satisfy a particular jurisdiction. You have to apply the controls and the policies and put the software layers in place to make sure that anywhere that someone wants to get that data, that we have the ability to look at that transaction and say it is or is not authorized, and that we have a rock solid way of approaching that for audit and for compliance and risk management. And once you do that, then you really open up the organization to go back and use those tools the way they were meant to be used. We can use Vertica for AI, we can use Vertica for machine learning, and for all kinds of really cool use cases that are being done with IoT, with other kinds of cases that we're seeing that require data being managed at scale, but with security. And that's the challenge, I think, in the current era, is how do we do this in an elegant way? How do we do it in a way that's future proof when CCPA comes in? How can I lay this on as another layer of audit responsibility and control around my data so that I can satisfy those regulators as well as the folks over in Europe and Singapore and China and Turkey and Australia. It goes on and on. Each jurisdiction out there is now requiring audit. And like I mentioned, the audits are getting tougher. And if you read the news, the GDPR example I think is classic. They told us in 2016, it's coming. They told us in 2018, it's here.
They're telling us in 2020, we're serious about this, and here's the fines, and you better be aware that we're coming to audit you. And when we audit you, we're going to be asking some tough questions. If you can't answer those in a timely manner, then you're going to be facing some serious consequences, and I think that's what's getting attention. >> Yeah, so the whole big data thing started with Hadoop, and Hadoop is open, it's distributed, and it just created a real governance challenge. I want to talk about your solutions in this space. Can you tell us more about Micro Focus Voltage? I want to understand what it is, and then get into sort of how it works, and then I really want to understand how it's applied to Vertica. >> Yeah, absolutely, that's a great question. First of all, we were the originators of format preserving encryption, we developed some of the core basic research out of Stanford University that then became the company Voltage; that's still the brand name that we apply even though we're part of Micro Focus. So the lineage still goes back to Dr. Benet down at Stanford, one of my buddies there, and he's still at it doing amazing work in cryptography and moving the industry forward, and the science forward of cryptography. It's a very deep science, and we all want to have it peer-reviewed, we all want to be attacked, we all want it to be proved secure, that we're not selling something to a major money center bank that is potentially risky because it's obscure and we're private. So we have an open standard. For six years, we worked with the Department of Commerce to get our standard approved by NIST, the National Institute of Standards and Technology. They initially said, well, AES-256 is going to be fine. And we said, well, it's fine for certain use cases, but for your database, you don't want to change your schema, you don't want to have this increase in storage costs. What we want is format preserving encryption.
And what that does is turn my name, Rich, into a four-letter ciphertext. It can be reversed. The mathematics of that are fascinating, and really deep and amazing. But we really make that very simple for the end customer because we produce APIs. So these application programming interfaces can be accessed by applications in C or Java, C sharp, other languages. But they can also be accessed in a microservice manner via REST and web service APIs. And that's the core of our technical platform. We have an appliance-based approach, so we take a SecureData appliance, we'll put it on-prem, we'll make 50 of them if you're a big company like Verizon and you need to have these co-located around the globe, no problem; we can scale to the largest enterprise needs. But our typical customer will install several appliances and get going with a couple of environments like QA and Prod to be able to start getting encryption going inside their organization. Once the appliances are set up and installed, it takes just a couple of days of work for a typical technical staff to get done. Then you're up and running to be able to plug in the clients. Now what are the clients? Vertica's a huge one. Vertica's one of our most powerful client endpoints because you're able to now take that API, put it inside Vertica, it's all open on the internet. We can go and look at Vertica.com/secure data. You get all of our documentation on it. You understand how to use it very quickly. The APIs are super simple; they require three parameter inputs. It's a really basic approach to being able to protect and access data. And then it gets very deep from there because you have data like credit card numbers. Very different from a street address and we want to take a different approach to that. We have data like birthdate, and we want to be able to do analytics on dates. We have deep approaches on managing analytics on protected data like dates without having to put it in the clear.
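The property described above, a four-letter name becoming a four-letter ciphertext that can be reversed, can be illustrated with a toy Feistel construction over lowercase letters. To be clear about the assumptions: this is not FF1 and not the Voltage implementation, and it should not be used to protect real data. The function names `protect` and `access` simply borrow the vocabulary used in the interview, and the key, tweak, round count, and use of HMAC-SHA256 as the round function are choices made for this sketch only.

```python
import hashlib
import hmac

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
RADIX = len(ALPHABET)
ROUNDS = 10  # even, so the two halves end up back in their original positions


def _to_int(text):
    """Read a string over ALPHABET as a base-26 number."""
    value = 0
    for ch in text:
        value = value * RADIX + ALPHABET.index(ch)
    return value


def _to_str(value, length):
    """Write a number as a fixed-length base-26 string."""
    chars = []
    for _ in range(length):
        value, digit = divmod(value, RADIX)
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


def _round_value(key, tweak, round_no, half, modulus):
    """Keyed pseudorandom number derived from the other half of the text."""
    digest = hmac.new(key, tweak + bytes([round_no]) + half.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def protect(key, tweak, text):
    """Encrypt `text` into a string of the same length and alphabet."""
    if len(text) < 2:
        raise ValueError("toy cipher needs at least two characters")
    u = len(text) // 2
    v = len(text) - u
    a, b = text[:u], text[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v
        modulus = RADIX ** m
        c = (_to_int(a) + _round_value(key, tweak, i, b, modulus)) % modulus
        a, b = b, _to_str(c, m)
    return a + b


def access(key, tweak, text):
    """Reverse `protect` by undoing the rounds in the opposite order."""
    u = len(text) // 2
    v = len(text) - u
    a, b = text[:u], text[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        modulus = RADIX ** m
        c = (_to_int(b) - _round_value(key, tweak, i, a, modulus)) % modulus
        a, b = _to_str(c, m), a
    return a + b


# Usage: the ciphertext fits in the same four-character column as the name.
key = b"demo-key-not-for-production"
ciphertext = protect(key, b"first_name", "rich")
recovered = access(key, b"first_name", ciphertext)
```

Because the output has the same length and character set as the input, a column defined for the clear value can hold the protected value unchanged, which is the schema and storage argument made in the interview.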
So we've maintained a lead in the industry in terms of being an innovator of the FF1 standard, what we call FF1 is format preserving encryption. We license that to others in the industry, per our NIST agreement. So we're the owner, we're the operator of it, and others use our technology. And we're the original founders of that, and so we continue to sort of lead the industry by adding additional capabilities on top of FF1 that really differentiate us from our competitors. Then you look at our API presence. We can definitely run on Hadoop, but we also run in open systems. We run on mainframe, we run on mobile. So anywhere in the enterprise or in the cloud, anywhere you want to be able to put SecureData, and be able to access the protected data, we're going to be there and be able to support you there. >> Okay so, let's say I've talked to a lot of customers this week, and let's say I'm running in Eon mode. And I got some workload running in AWS, I've got some on-prem. I'm going to take an appliance or multiple appliances, I'm going to put it on-prem, but that will also secure my cloud workloads as part of a sort of shared responsibility model, for example? Or how does that work? >> No, that's absolutely correct. We're really flexible in that we can run on-prem or in the cloud as far as our crypto engine. The key management is really hard stuff. Cryptography is really hard stuff, and we take care of all that, so we've baked all that in, and we can run that for you as a service either in the cloud or on-prem on your small VMs. So really the lightweight footprint for me running my infrastructure. When I look at the organization like you just described, it's a classic example of where we fit because we will be able to protect that data. Let's say you're ingesting it from a third party, or from an operational system, you have a website that collects customer data. Someone has now registered as a new customer, and they're going to do e-commerce with you.
We'll take that data, and we'll protect it right at the point of capture. And we can now flow that through the organization and decrypt it at will on any platform that you have that you need us to be able to operate on. So let's say you wanted to pick that customer data from the operational transaction system, let's throw it into Eon, let's throw it into the cloud, let's do analytics there on that data, and we may need some decryption. We can place SecureData wherever you want to be able to service that use case. In most cases, what you're doing is a simple, tiny little atomic key fetch across a protected tunnel, your typical TLS pipe tunnel. And once that key is then cached within our client, we maintain all that technology for you. You don't have to know about key management or caching. We're good at that; that's our job. And then you'll be able to make those API calls to access or protect the data, and apply the authorization and authentication controls that you need to be able to service your security requirements. So you might have third parties having access to your Vertica clusters. That is a special need, and we can have that ability to say employees can get X, and the third party can get Y, and that's a really interesting use case we're seeing for shared analytics in the internet now. >> Yeah for sure, so you can set the policy how we want. You know, I have to ask you, in a perfect world, I would encrypt everything. But part of the reason why people don't is because of performance concerns. Can you talk about, and you touched upon it I think recently with your sort of atomic access, but can you talk about, and I know it's Vertica, it's Ferrari, etc, but anything that slows it down is going to be a concern. Are customers concerned about that? What are the performance implications of running encryption on Vertica? >> Great question there as well, and what we see is that we want to be able to apply scale where it's needed.
And so if you look at ingest platforms that we find, Vertica is commonly connected up to something like Kafka. Maybe StreamSets, maybe NiFi, there are a variety of different technologies that can route that data, pipe that data into Vertica at scale. SecureData is architected to go along with that architecture at the node or at the executor or at the lowest level operator level. And what I mean by that is that we don't have a bottleneck that everything has to go through one process or one box or one channel to be able to operate. We don't put an interceptor in between your data and coming and going. That's not our approach because those approaches are fragile and they're slow. So we typically want to focus on integrating our APIs natively within those pipeline processes that come into Vertica. Within the Vertica ingestion process itself, you can simply apply our protection when you do the COPY command in Vertica. So really basic simple use case that everybody is typically familiar with in Vertica land; be able to copy the data and put it into Vertica, and you simply say protect as part of the data. So my first name is coming in as part of this ingestion. I'll simply put the protect keyword in the syntax right in SQL; it's nothing other than just an extension to SQL. Very, very simple for the developer, easy to read, easy to write. And then you're going to provide the parameters that you need to say, oh the name is protected with this kind of a format. To differentiate it between a credit card number and an alphanumeric string, for example. So once you do that, you then have the ability to decrypt. Now, on decrypt, let's look at a couple different use cases. First within Vertica, we might be doing select statements within Vertica, we might be doing all kinds of jobs within Vertica that just operate at the SQL layer.
Again, just insert the word "access" into the Vertica select string and provide us with the data that you want to access, that's our word for decryption, that's our lingo. And we will then, at the Vertica level, harness the power of its CPU, its RAM, its horsepower at the node to be able to operate on that operator, the decryption request, if you will. So that gives us the speed and the ability to scale out. So if you start with two nodes of Vertica, we're going to operate at X number of hundreds of thousands of transactions a second, depending on what you're doing. Long strings are a little bit more intensive in terms of performance, but short strings like social security number are our sweet spot. So we operate very very high speed on that, and you won't notice the overhead with Vertica, per se, at the node level. When you scale Vertica up and you have 50 nodes, and you have large clusters of Vertica resources, then we scale with you. And we're not a bottleneck at any particular point. Everybody's operating independently, but they're all copies of each other, all doing the same operation. Fetch a key, do the work, go to sleep. >> Yeah, you know, I think this is, a lot of the customers have said to us this week that one of the reasons why they like Vertica is it's very mature, it's been around, it's got a lot of functionality, and of course, you know, look, security, I understand, is kind of table stakes, but it can also be a differentiator. You know, big enterprises that you sell to, they're asking for security assessments, SOC 2 reports, penetration testing, and I think I'm hearing, with the partnership here, you're sort of passing those with flying colors. Are you able to make security a differentiator, or is it just sort of everybody's kind of got to have good security? What are your thoughts on that? >> Well, there's good security, and then there's great security.
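The "fetch a key, do the work, go to sleep" pattern described above, where every node works independently instead of routing each value through one central box, can be sketched as follows. This is a simplified model under stated assumptions, not the SecureData client: `NodeKeyCache`, `fake_key_server`, and `protect_rows` are hypothetical names, the key server is simulated locally rather than reached over TLS, and the per-row "work" is a keyed tag standing in for real encryption.

```python
import hashlib
import hmac
import threading


class NodeKeyCache:
    """Hypothetical per-node cache: fetch each key once, then work locally."""

    def __init__(self, fetch_key):
        self._fetch_key = fetch_key  # callable(key_id) -> bytes
        self._keys = {}
        self._lock = threading.Lock()
        self.fetch_count = 0         # how many times the key server was contacted

    def get(self, key_id):
        with self._lock:
            if key_id not in self._keys:
                self._keys[key_id] = self._fetch_key(key_id)
                self.fetch_count += 1
            return self._keys[key_id]


def fake_key_server(key_id):
    """Stand-in for the one small key fetch made over a protected channel."""
    return hashlib.sha256(("master-secret:" + key_id).encode()).digest()


def protect_rows(cache, key_id, rows):
    """Do the per-row work on this node using the locally cached key."""
    key = cache.get(key_id)
    return [hmac.new(key, row.encode(), hashlib.sha256).hexdigest()[:12] for row in rows]


# Usage: two nodes process their own batches with no shared bottleneck.
node_a = NodeKeyCache(fake_key_server)
node_b = NodeKeyCache(fake_key_server)
batch_a = protect_rows(node_a, "ssn-key", ["123-45-6789", "987-65-4321"])
batch_b = protect_rows(node_b, "ssn-key", ["555-12-3456"])
```

In this model, adding nodes adds one extra key fetch per node and nothing else, which is why throughput can grow with the cluster rather than being capped by a single interceptor.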
And what I found with one of my money center bank customers based here in San Francisco was the concern around the insider access, when they had a large data store. And the concern that a DBA, a database administrator who has privilege to everything, could potentially exfiltrate data out of the organization, and in one fell swoop, create havoc for them because of the amount of data that was present in that data store, and the sensitivity of that data in the data store. So when you put Voltage encryption on top of Vertica, what you're doing now is that you're putting a layer in place that would prevent that kind of a breach. So you're looking at insider threats, you're looking at external threats, you're looking at also being able to pass your audit with flying colors. The audits are getting tougher. And when they say, tell me about your encryption, tell me about your authentication scheme, show me the access control list that says that this person can or cannot get access to something. They're asking tougher questions. That's where SecureData can come in and give you that quick answer of it's encrypted at rest. It's encrypted and protected while it's in use, and we can show you exactly who's had access to that data because it's tracked via a different layer, a different appliance. And I would even draw the analogy, many of our customers use a device called a hardware security module, an HSM. Now, these are fairly expensive devices that are invented for military applications and adopted by banks. And now they're really spreading out, and people say, do I need an HSM? Well, with SecureData, we certainly protect your crypto very very well. We have very very solid engineering. I'll stand on that any day of the week, but your auditor is going to want to ask a checkbox question. Do you have HSM? Yes or no. Because the auditor understands, it's another layer of protection.
And it provides me another tamper-evident layer of protection around your key management and your crypto. And we, as professionals in the industry, nod and say, that is worth it. That's an expensive option that you're going to add on, but your auditor's going to want it. If you're in financial services, you're dealing with PCI data, you're going to enjoy the checkbox that says, yes, I have HSMs and not get into some arcane conversation around, well no, but it's good enough. That's kind of the argument and conversation we get into when folks want to say, Vertica has great security, Vertica's fantastic on security. Why would I want SecureData as well? It's another layer of protection, and it's defense in depth for your data. When you believe in that, when you take security really seriously, and you're really paranoid, like a person like myself, then you're going to invest in those kinds of solutions that get you best-in-class results. >> So I'm hearing a data-centric approach to security. Security experts will tell you, you got to layer it. I often say, we live in a new world. We used to just build a moat around the queen, but the queen, she's leaving her castle in this world of distributed data. Rich, incredibly knowledgeable guest, and really appreciate you being on the front lines and sharing with us your knowledge about this important topic. So thanks for coming on theCUBE. >> Hey, thank you very much. >> You're welcome, and thanks for watching everybody. This is Dave Vellante for theCUBE, we're covering wall-to-wall coverage of the Virtual Vertica BDC, Big Data Conference. Remotely, digitally, thanks for watching. Keep it right there. We'll be right back right after this short break. (intense music)

Published Date : Mar 31 2020

SUMMARY :

Vertica Big Data Conference 2020 brought to you by Vertica. and we're pleased that The Cube could participate But maybe you can talk about your role And then to the other uses where we might be doing and how you guys are responding. and they said, we want to inform you your card and it does for the reasons that you mentioned. and put the software layers in place to make sure Yeah, so the whole big data thing started with Hadoop, So the lineage still goes back to Dr. Benet but that will also secure my cloud workloads as part of a and we can run that for you as a service but can you talk about, at the node to be able to operate on that operator, a lot of the customers have said to us this week and we can show you exactly who's had access to that data and really appreciate you being on the front lines of the Virtual Vertica BDC, Big Data Conference.

ENTITIES

Entity	Category	Confidence
Australia	LOCATION	0.99+
Europe	LOCATION	0.99+
Target	ORGANIZATION	0.99+
Verizon	ORGANIZATION	0.99+
Vertica	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
May 2018	DATE	0.99+
NIST	ORGANIZATION	0.99+
2016	DATE	0.99+
Boston	LOCATION	0.99+
2018	DATE	0.99+
San Francisco	LOCATION	0.99+
New York	LOCATION	0.99+
Target Corporation	ORGANIZATION	0.99+
$250 million	QUANTITY	0.99+
50	QUANTITY	0.99+
Rich Gaston	PERSON	0.99+
Singapore	LOCATION	0.99+
Turkey	LOCATION	0.99+
Ferrari	ORGANIZATION	0.99+
six years	QUANTITY	0.99+
2020	DATE	0.99+
one box	QUANTITY	0.99+
China	LOCATION	0.99+
C	TITLE	0.99+
Stanford University	ORGANIZATION	0.99+
Java	TITLE	0.99+
First	QUANTITY	0.99+
one	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
U.S.	LOCATION	0.99+
this week	DATE	0.99+
National Institute of Standards and Technology	ORGANIZATION	0.99+
Each jurisdiction	QUANTITY	0.99+
both	QUANTITY	0.99+
Vertica	TITLE	0.99+
Rich	PERSON	0.99+
this year	DATE	0.98+
Vertica Virtual Big Data Conference	EVENT	0.98+
one channel	QUANTITY	0.98+
one process	QUANTITY	0.98+
GDPR	TITLE	0.98+
SQL	TITLE	0.98+
five billion rows	QUANTITY	0.98+
about five billion	QUANTITY	0.97+
One	QUANTITY	0.97+
C sharp	TITLE	0.97+
Benet	PERSON	0.97+
first	QUANTITY	0.96+
four-letter	QUANTITY	0.96+
Vertica Big Data Conference 2020	EVENT	0.95+
Hadoop	TITLE	0.94+
Kafka	TITLE	0.94+
Micro Focus	ORGANIZATION	0.94+

Colin Mahony, Vertica at Micro Focus | Virtual Vertica BDC 2020

>> It's theCUBE, covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica. >> Hello, everybody. Welcome to the new normal. You're watching theCUBE and its remote coverage of the Vertica Big Data event, which has gone digital, gone virtual. My name is Dave Vellante, and I'm here with Colin Mahony, who's a senior vice president at Micro Focus and the GM of Vertica. Colin, well, strange times, but the show goes on. Great to see you again. >> Good to see you too, Dave. Yeah, strange times indeed. Obviously, safety first for everyone, so we made a decision to go virtual. I think it was absolutely the right call. We made it in advance of how things have transpired, but we're making the best of it, and we appreciate your time here, going virtual with us. >> Well, we're super excited to be here. As you know, theCUBE has been at every single BDC since its inception. It's a great event. You just presented the keynote to your audience. You know, it was remote. You didn't have that live vibe, and you have a lot of fans in the Vertica community. But could you feel the love? >> Yeah, you know, it's hard to feel the love virtually, but I'll tell you what. The silver lining in all this is that the reach we have for this event now is much broader than it would have been. You know, we brought this event back. It's been a few years since we've done it. We were super excited to do it in Boston, where it was supposed to be on location, but there wouldn't have been as many people who could participate. So the silver lining in all of this is that I think there's a lot of love out there. We're getting a lot of participants who otherwise would not have been able to participate in this, both live and through a lot of these assets that we're going to have available. So, you know, it's out there. We've got an amazing community of customers and practitioners with Vertica.
We've got so many who have been with us for a long time. We, of course, have a lot of new customers as well that we're welcoming, so it's exciting. >> Well, it's been a while since you've had the BDC event. A lot has transpired. You're now part of Micro Focus, but I know you and I know the Vertica team, and you guys have not stopped. You've kept the innovation going. We've been following the announcements, but bridge the gap between the last time we had coverage of this event and where we are today. A lot has changed. >> Oh, yeah, a lot has changed. I mean, you know, it's the software industry, right? So nothing stays the same. We constantly have to keep going. Probably the only thing that stays the same is the name Vertica. And, you know, we're now at Vertica 10, which is just a phenomenal release for us. So, you know, overall, the organization continues to grow. The dedication and commitment to this great platform of Vertica continues. Every single release we do, as you know, and this hasn't changed, is always about performance and scale and adding a whole bunch of new capabilities on that front. But it's also about our main roadmap and the direction that we're going towards. And I think one of the things that has been great about it is that we've stayed true to that. From day one we haven't tried to deviate too much and get into things that are too far outside our box. But we've really done, I think, a great job of extending Vertica into places where people need a lot of help. And with Vertica 10, and I know we're going to talk more about that, we've done a lot of that. It's super exciting for our customers, and all of this, of course, is driven by our customers. But back to the Big Data Conference. You know, everybody has been saying this for years: it was one of the best conferences they've been to, just because it's so real. It's developers giving tech talks, it's customers giving talks.
And we had more customers that wanted to give talks than we had slots to fill this year at the event, which is another benefit of going virtual: we can accommodate a little bit more, though it's obviously still a tight schedule. But it really was an opportunity for our community to come together and talk about not just Vertica, but how to deal with data. You know, we know the volumes aren't slowing down. We know the complexity isn't slowing down. The things that people want to do with AI and machine learning are moving forward at a rapid pace as well. There's a lot to talk about and share, and that's really a huge part of what we try to do with it. >> Well, let's get into some of that. Your customers are making bets. Micro Focus is actually making a bet on Vertica. I want to get your perspective on the waves that you're riding, and where are you placing your bets? >> Yeah, no, it's great. So, you know, I think one of the waves that we've been riding for a long time: obviously Vertica started out as a SQL platform for analytics, as a SQL database engine, a relational engine. But we always knew that was just sort of table stakes for what we wanted to do. People were going to trust us to put enormous amounts of data in our platform, and what we owe everyone in return is lots of analytics to take advantage of that data, and lots of tools and capabilities to shape that data and get it into the right format, for operational reporting but also, in this day and age, for machine learning and for some pretty advanced regressions and other kinds of techniques. So a huge part of Vertica 10 is just doubling down on that commitment to what we call in-database machine learning and AI. And to do that, you know, we know that we're not going to come up with the world's best algorithms, nor is that our focus. Our advantage is that we have this massively parallel platform to ingest, store, manage, and analyze the data.
So we made some announcements about incorporating PMML models into the product. We continue to deepen our Python integration, building off of a new open source project we started with Uber, which has been a great customer and partner on this. That is one of our great talks here at the event. So, you know, we're continuing to do that, and it turns out that when it comes to anything in analytics, and machine learning certainly, so much of what you have to do is actually prepare the data: shape the data, get the data in the right format, apply the model, fit the model, test the model, operationalize the model. And Vertica is a great platform to do that. So that's a huge bet that we're, um, continuing to ride on and take advantage of. And then there are some of the other things that we've just been seeing continue. I'll take object storage as an example. I think Hadoop, and what it ultimately pointed to, was a huge part of this, but there's just a massive disruption going on in the world around object storage. You know, we made several bets on S3 early. We created Vertica Eon Mode, which separates compute and storage. And so for us, that separation is not just about being able to take advantage of cloud economics, as we do, or the economics of object storage. It's also about being able to truly isolate workloads and start to set up the sort of platform that can do very autonomous things in the database, where the database could actually start self-analyzing without impacting many operational workloads. And so that continues with our partnership with Pure Storage on premise. We just announced that we're supporting Eon on Google Cloud now, in addition to Amazon, and we've got HDFS now being supported by our Eon Mode. So we continue to ride on that megatrend as well, and just the cloud in general, whether it's a public cloud or a private cloud on premise. 
Giving our customers the flexibility and choice to run wherever it makes sense for them is something that we are very committed to. From a flexibility standpoint, there are a lot of lock-in products out there. There are a lot of cloud-only products. Now more than ever, we're hearing from our customers that they want that flexibility to be able to run anywhere. They want the ease of use and simplicity of native cloud experiences, which we're giving them as well. >> I want to stay on that architectural component for a minute. You talk about how separating compute from storage is not just about economics. I mean, a part of it is that you can, you know, granularly scale compute separately from storage, as opposed to in chunks. It's more efficient, but you're saying there are other advantages in operational and workload specificity. Um, what is unique about Vertica in this regard, since many others separate compute from storage? What's different about Vertica? >> Yeah, I think, you know, there are a lot of differences in how we do it. It's one thing if you're a cloud-native company and you do it with a shared catalog, a key-value store that all of your customers are using, all on the same one. Frankly, it's probably more of a security concern than anything. But it's another thing when you give that capability to each customer on their own. They're fully protected. They're not sharing it with any other customers. And that's something we hear a lot about from our customers. They want to be able to separate compute and storage, but they want to be able to do this in their own environment, so that they know that in their data catalog there's no one else sharing that catalog, and there's no single point of failure. So, um, that's one huge advantage that we have. And frankly, I think it just comes from being a company that's operating on premise and, uh, up in the cloud. 
I think another huge advantage for us is that we don't know what object storage platform is going to win, nor do we necessarily have to. We designed Eon Mode so that it's an SDK. We started with S3, but it could be anything. It's HDFS, it's S3; who knows what object storage formats are going to be there. And then finally, beyond just the object storage, we're really one of the only database companies that actually allows our customers to natively operate on data in very different formats, like Parquet and ORC, if you're familiar with those from the Hadoop community. So we not only embrace this kind of object storage disruption, but we really embrace the different data formats. And what that means is that our customers that have data pipelines that are, you know, fully automated, putting this information in different places, don't have to completely reload everything to take advantage of the Vertica analytics. We can go where the data is, connect into it, and offer them a lot of different ways to take advantage of those analytics. So those are a couple of unique differences with Vertica, and again, I think our real advantage, in many ways, in not being a cloud-native platform is that we're very good at operating in different environments, with different formats and with formats changing over time. And I don't think a lot of the other companies out there are. I think many, particularly many of the SaaS companies, are scrambling. They even have challenges moving from, say, an Amazon environment to a Microsoft Azure environment with their offering, because they've got so much unique Band-Aid, excuse me, in the background just holding the system up that is native to any one of those. >> Good. I'm going to summarize what I'm hearing from you: you're the Ferrari of databases, which we've always known. You're object store agnostic. Um, it's the cloud experience that you can bring on-prem and to virtually any cloud, all the popular clouds, hybrid. 
You know, AWS, Azure, now Google, or on-prem, and in a variety of different data formats. And that, I think, you know, the combination of those, I think is unique in the marketplace. Um, before we get into the news, I want to ask you about data silos. You mentioned HDFS, where you and I met back in the early days of big data. You know, in some respects, Hadoop helped break down the silos by distributing the data and leaving it in place, and in other respects, it created data lakes, which became silos. And so we have yet all these other silos. People are trying to get to, ah, digital transformation, meaning putting data at their core, virtually obviously, and leaving it in place. What are your thoughts on that, in terms of being a data silo buster? How does Vertica weigh in there? >> Yeah, so, and you're absolutely right. I think even if you look at Hadoop, for all the new data that gets into Hadoop, in many ways it's created yet another large island of data that many organizations are struggling with, because it's separate from their core traditional data warehouse. It's separate from some of the operational systems that they have. And so there might be a lot of data in there, but they're still struggling with, how do I break it out of that large silo and/or combine it? Again, I think some of the things that Vertica does, and part of the announcement just in 10, is migration tools to make it really easy if you do want to move it from one platform to another, into Vertica. But you don't have to move it. You can actually take advantage of a lot of the data where it resides with Vertica, especially in the Hadoop realm, with our external table storage and with our built-in ORC and Parquet support, natively. So we're very pragmatic about how our customers go about this. Very few customers... many of them tried it with Hadoop and realized that didn't work. But very few customers want to wholesale just say, we're going to throw everything out. 
We're going to get rid of our data warehouse. We're going to hit the pause button and we're going to go from there. It's just not possible to do that. So we've spent a lot of time investing in the product to really work with them, to go where the data is and then seamlessly migrate when it makes sense to migrate. You mentioned the performance of Vertica, um, and you talked about it as the Ferrari. It definitely is. And one other thing that we're really proud of is that it actually is not a gas guzzler either. One of the things that we're seeing with a lot of the other cloud databases is that, pound for pound, on a tenth of the hardware with Vertica running up there, you get over 10x performance. We're seeing that a lot. So it's, ah, it's not just about the performance, but it's about the efficiency as well. And I think that efficiency is really important when it comes to silos, because there's just only so much horsepower out there. And it's easy for companies to play tricks in a lots-of-servers environment when they start up. But so many organizations in the cloud are, frankly, looking at the bills they're getting from these cloud workloads that are running, and they're really conscious of that. >> Yeah. The big energy companies love the gas guzzlers. A lot of cloud compute. But let's get into the news. Uh, 10.0: you shared it with the audience in your keynote. What are the highlights? What do we need to know? >> Yeah, so, you know, again doubling down on these megatrends, I'll start with machine learning and AI. We've done a lot of work to integrate so that you can take native PMML models, bring them into Vertica, run them massively parallel, and help shape, you know, your data and prepare it, and do all the work that we know is required to do machine learning. 
And for all the hype that there is around it, this is really, you know, what people want to do: a lot of unsupervised machine learning, whether it's for healthcare, fraud detection, financial services. So we've doubled down on that. We now also support things like TensorFlow. And, you know, as I mentioned, we're not going to come up with the best algorithms. Our job is really to ensure that those algorithms that people are coming up with can be incorporated, and that we can run them against massive data sets super efficiently. So that's number one. Number two, on object storage, we continue to support more object storage platforms for Eon Mode. In the cloud, we're expanding to Google GCP, Google's cloud, beyond just Amazon. On premise or in the cloud, now we're also supporting HDFS with Eon. Of course, we continue to have a great relationship with our partner Pure Storage on premise as well. We continue to invest in Eon Mode especially. I'm not going to go through all the different things here, but it's not just sort of, hey, you support this and then you move on. There are so many different things that we learn about API calls and how to save our customers money, and tricks on performance and things. The third area: we definitely continue to build on that flexibility of deployment, which is related to Eon Mode, with some of what I described, but it's also about simplicity. It's also about some of the migration tools that we've announced to make it easy to go from one platform to another. We have a great roadmap on ease of use, on security, on performance and scale. I mean, for us, those are the things that we're working on in every single release. We probably don't talk about them as much as we need to, but obviously they're critically important. And so we constantly look at every component in this product. You know, Version 10 is a huge release for any product, especially an analytic database platform. 
And so we're just constantly revisiting, you know, some of the code base and figuring out how we can do it in new and better ways. And that's a big part of 10 as well. >> I'm glad you brought up the machine intelligence, the machine learning and AI piece, because we would agree. Really, one of the things we've noticed is, you know, the new innovation cocktail: it's not being driven by Moore's law anymore. It's really a combination: you've collected all this data over the last 10 years through Hadoop and other data stores, object stores, et cetera, and now you're applying machine intelligence to that, and then you've got the cloud for scale. And of course, we talked about you bringing the cloud experience, whether it's on-prem or hybrid, et cetera. The reason why I think this is important, and I wanted to get your take on this, is because you do see a lot of emerging analytic databases that are cloud native. Yes, they do suck up, you know, a lot of compute, but they also add a lot of value. And I really wanted to understand how you guys play in that new trend, that sort of cloud database, high performance, bringing in machine learning and AI and ML tools and then, you know, turning data into insights. And what I'm hearing is that you play directly in that, and your differentiation is a lot of the things that we talked about, including the ability to do that on-prem and in the cloud and across clouds. >> Yeah, I mean, I think that's a great point. We are a great cloud database. We run very well on the three major clouds, and you could argue some of the other clouds as well in other parts of the world. Um, if you talk to our customers, and we have hundreds of customers who are running Vertica in the cloud, the experience is very good. I think it could always be better. We've invested a lot in taking advantage of the native cloud ecosystem, so that provisioning and managing Vertica is seamless when you're in that environment. We'll continue to do that. 
But Vertica, excuse me, as a cloud platform is phenomenal. And, um, you know, there's a lot of confusion out there. I think there's a lot of marketing dollars spent. I won't name many of the companies here; you know who they are. You know, the cloud-native data warehouse. And it's true, you know, they're software as a service. But if you talk to a lot of our customers, they're getting very good and very similar experiences with Vertica. We stop short of saying we're software as a service, because ultimately our customers have that control and flexibility. They're putting Vertica on whichever cloud they want to run it on and managing it. Stay tuned on that. I think you'll hear more from us about, you know, going even further. But, um, you know, we do really well in the cloud, and I think so much of Eon, and, you know, this has really been a sort of two-and-a-half-year endeavor for us, but so much of Eon was designed around the cloud, was designed around cloud data lakes, S3, separation of compute and storage. And if you look at the work that we're doing around containerization and a lot of these other elements, it just takes that to the next level. And, um, there's a lot of great work, so I think we're going to continue to get better at cloud. But I would argue that we're already, and have been for some time, very good at being a cloud analytic data platform. >> Well, since you opened the door, I've got to ask you. So, I hear you from a performance and architectural perspective, but you're also alluding to, I think, something else. I don't know what you can share with us. You said stay tuned on that. But I think you're talking about optionality, maybe different consumption models. Am I getting that right, and can you share? >> You're definitely getting that right. And actually, I'm glad you brought that up. I think a huge part of cloud also has nothing to do with the technology. 
I think it's how you consume the product. Some companies want to rent the product, and they want to rent it for a certain period of time. And so we allow our customers to do that. We have incredibly flexible models of how you provision and purchase our product, and I think that helps a lot. You know, I am opening the door, ah, a little bit. But look, we have customers that ask us whether we can offer them, you know, platforms to run on. We've had customers come to us and say, please take over systems, um, and offer something as a distribution. As I said, though, I think one thing that we've been really good at is focusing on what is our core and where we really offer value. But I can tell you that, um, we introduced something called the Vertica Advisor Tool this year. One of the things that the Advisor Tool does is collect information from our customer environments, on premise or in the cloud, and we run it through our own machine learning. We analyze the customer's environment and we make some recommendations automatically. And a lot of our customers have said to us, you know, it's funny, we've tried managed services, tried SaaS offerings, and you guys blow them away in terms of your ability to help us, like, automatically manage the Vertica environment and the system. Why don't you guys just take this product and convert it into a SaaS offering? So I won't go much further than that. But you can imagine that there's a lot of innovation and a lot of thought going into how we can do that. But there's no reason that we have to wait to do that. Today, being able to offer our customers, our on-premise customers, that same sort of experience from a managed capability is something that we spend a lot of time thinking about as well. So again, just back to the automation, that ease of use, the going above and beyond. It's really exciting to have an analytic platform, because we can do so much automation ourselves. 
And just like we're doing with the Vertica Advisor Tool, we're leveraging our own Kool-Aid or champagne, however you want to say it, to in fact tune up and solve, um, some optimization for our customers automatically, and I think you're going to see that continue. And I think that could work really well in a bunch of different models. >> Well, Colin, just on a personal note, I've always enjoyed our conversations. I've learned a lot from you over the years. I'm bummed that we can't hang out in Boston, but hopefully soon, uh, this will blow over. I loved last summer when we got together. We had the Vertica throwback. We had Stonebraker, Palmer, Lynch, and Mahony. We did a great series, and that was a lot of fun. So it's really a pleasure. And thanks so much. Stay safe out there and, uh, we'll talk to you soon. >> Yeah, you too, Dave. Stay safe. I really appreciate the opportunity, and, you know, this is what it's all about. It's, ah, it's a lot of fun. I know we're going to see each other in person soon, and it's the people in the community that really make this happen. So looking forward to that, but I really appreciate it. >> Alright. And thank you, everybody, for watching. This is theCUBE's coverage of the Vertica Big Data Conference, gone virtual, going digital. I'm Dave Vellante. We'll be right back right after this short break. >> Yeah.

Published Date : Mar 31 2020


Ben White, Domo | Virtual Vertica BDC 2020

>> Announcer: It's theCUBE covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica. >> Hi, everybody. Welcome to this digital coverage of the Vertica Big Data Conference. You're watching theCUBE and my name is Dave Vellante. It's my pleasure to invite in Ben White, who's the Senior Database Engineer at Domo. Ben, great to see you, man. Thanks for coming on. >> Great to be here and here. >> You know, as I said, you know, earlier when we were off-camera, I really was hoping I could meet you face-to-face in Boston this year, but hey, I'll take it, and, you know, our community really wants to hear from experts like yourself. But let's start with Domo as the company. Share with us what Domo does and what your role is there. >> Well, if I can go straight to the official, what Domo does is we provide, we process data at BI scale, we-we-we provide BI leverage at cloud scale in record time. And so what that means is, you know, we are a business-operating system where we provide a number of analytical abilities to companies of all sizes. But we do that at cloud scale and so I think that differentiates us quite a bit. >> So a lot of your work, if I understand it, and just in terms of understanding what Domo does, there's a lot of pressure in terms of being real-time. It's not, like, you sometimes don't know what's coming at you, so it's ad-hoc. I wonder if you could sort of talk about that, confirm that, maybe add a little color to it. >> Yeah, absolutely, absolutely. That's probably the biggest challenge there is to being, to operating Domo, is that it is an ad hoc environment. And certainly what that means is that you've got analysts and executives that are able to submit their own queries without very... With very few limitations. So from an engineering standpoint, the challenge in that, of course, is that you don't have this predictable dashboard to plan for, when it comes to performance planning. 
So it definitely presents some challenges for us that we've done some pretty unique things, I think, to address those. >> So it sounds like your background fits well with that. I understand your people have called you a database whisperer and an envelope pusher. What does that mean to a DBA in this day and age? >> The whisperer part is probably a lost art, in the sense that it's not really sustainable, right? The idea that, you know, whatever it is I'm able to do with the database, it has to be repeatable. And so that's really where analytics comes in, right? That's where pushing the envelope comes in. And in a lot of ways that's where Vertica comes in with this open architecture. And so as a person who has a reputation for saying, "I understand this is what our limitations should be, but I think we can do more." Having a platform like Vertica, with such an open architecture, kind of lets you push those limits quite a bit. >> I mean I've always felt like, you know, Vertica, when I first saw the stone breaker architecture and talked to some of the early founders, I always felt like it was the Ferrari of databases, certainly at the time. And it sounds like you guys use it in that regard. But talk a little bit more about how you use Vertica, why, you know, why MPP, why Vertica? You know, why-why can't you do this with RDBMS? Educate us, a little bit, on, sort of, the basics. >> For us it was, part of what I mentioned when we started, when we talked about the very nature of the Domo platform, where there's an incredible amount of resiliency required. And so Vertica, the MPP platform, of course, allows us to build individual database clusters that can perform best for the workload that might be assigned to them. So the open, the expandable, the... The-the ability to grow Vertica, right, as your base grows, those are all important factors, when you're choosing early on, right? Without a real idea of how growth would be or what it will look like. 
If you were kind of throwing something into the dark, you look at the Vertica platform and you can see, well, as I grow, I can, kind of, build with this, right? I can do some unique things with the platform in terms of this open architecture that will allow me to not have to make all my decisions today, right? (mutters) >> So, you're using Vertica, I know, at least in part, you're working with AWS as well, can you describe sort of your environment? Do you have anything on-prem, is everything in cloud? What's your set up look like? >> Sure, we have a hybrid cloud environment where we have a significant presence in public clouds and in our own private cloud. And so, yeah, having said that, we certainly have a really extensive presence, I would say, in AWS. So, they're definitely a partner of ours when it comes to providing the databases and the server power that we need to operate on. >> From a standpoint of engineering and architecting a database, what were some of the challenges that you faced when you had to create that hybrid architecture? What did you face and how did you overcome that? >> Well, you know, some of the... There were some things we faced in terms of, one, it made it easy that Vertica and AWS have their own... They play well together, we'll say that. And so, Vertica was designed to work on AWS. So that part of it took care of itself. Now our own private cloud and being able to connect that to our public cloud has been a part of our own engineering abilities. And again, I don't want to make little, make light of it, it's certainly not impossible. And so we... Some of the challenges that pertain to the database really were in the early days, that you mentioned, when we talked a little bit earlier about Vertica's most recent eon mode. And I'm sure you'll get to that. But when I think of early challenges, some of the early challenges were the architecture of enterprise mode. 
When I talk about all of these, this idea that we can have unique databases or database clusters of different sizes, or this elasticity, because really, if you know the enterprise architecture, that's not necessarily the enterprise architecture. So we had to do some unique things, I think, to overcome that, right, early. To get around the rigidness of enterprise. >> Yeah, I mean, I hear you. Right? Enterprise is complex and you like when things are hardened and fossilized but, in your ad hoc environment, that's not what you needed. So talk more about eon mode. What is eon mode for you and how do you apply it? What are some of the challenges and opportunities there, that you've found? >> So, the opportunities were certainly in this elastic architecture and the ability to separate in the storage, immediately meant that for some of the unique data paths that we wanted to take, right? We could do that fairly quickly. Certainly we could expand databases, right, quickly. More importantly, now you can reduce. Because previously, in the past, right, when I mentioned the enterprise architecture, the idea of growing a database in itself has its pain. As far as the time it takes to (mumbles) the data, and that. Then think about taking that database back down and (telephone interference). All of a sudden, with eon, right, we had this elasticity, where you could, kind of, start to think about auto scaling, where you can go up and down and maybe you could save some money or maybe you could improve performance or maybe you could meet demand, at a time where customers need it most, in a real way, right? So it's definitely a game changer in that regard. >> I always love to talk to the customers because I get to, you know, I hear from the vendor, what they say, and then I like to, sort of, validate it. So, you know, Vertica talks a lot about separating compute and storage, and they're not the only one, from an architectural standpoint who do that. But Vertica stresses it. 
They're the only one that does that with a hybrid architecture. They can do it on-prem, they can do it in the cloud. From your experience, well first of all, is that true? You may or may not know, but is that advantageous to you, and if so, why? >> Well, first of all, it's certainly true. Earlier in some of the original beta testing for the on-prem eon modes that we... I was able to participate in it and be aware of it. So it's certainly a reality, they, it's actually supported on Pure Storage with FlashBlade and it's quite impressive. You know, for who, who will that be for, tough one. It's probably Vertica's question that they're probably still answering, but I think, obviously, some enterprise users that probably have some hybrid cloud, right? They have some architecture, they have some hardware, that they themselves, want to make use of. We certainly would probably fit into one of their, you know, their market segments. That they would say that we might be the ones to look at on-prem eon mode. Again, the beauty of it is, the elasticity, right? The idea that you could have this... So a lot of times... So I want to go back real quick to separating compute. >> Sure. Great. >> You know, we start by separating it. And I like to think of it, maybe more of, like, decoupling. Because in a true way, it's not necessarily separated because ultimately, you're bringing the compute and the storage back together. But to be able to decouple it quickly, replace nodes, bring in nodes, that certainly fits, I think, what we were trying to do in building this kind of ecosystem that could respond to unknown of a customer query or of a customer demand. >> I see, thank you for that clarification because you're right, it's really not separating, it's decoupling. And that's important because you can scale them independently, but you still need compute and you still need storage to run your work load. But from a cost standpoint, you don't have to buy it in chunks. 
You can buy in granular segments for whatever your workload requires. Is that, is that the correct understanding? >> Yeah, and to, the ability to be able to reuse compute. So in the scenario of AWS or even in the scenario of your on-prem solution, you've got this data that's safe and secure in (mumbles) computer storage, but the compute that you have, you can reuse that, right? You could have a scenario that you have some query that needs more analytic, more-more fire power, more memory, more what have you that you have. And so you can kind of move between, and that's important, right? That's maybe more important than can I grow them separately. Can I, can I borrow it. Can I borrow that compute you're using for my (cuts out) and give it back? And you can do that, when you're so easily able to decouple the compute and put it where you want, right? And likewise, if you have a down period where customers aren't using it, you'd like to be able to not use that, if you no longer require it, you're not going to get it back. 'Cause it-it opened the door to a lot of those things that allowed performance and process department to meet up. >> I wonder if I can ask you a question, you mentioned Pure a couple of times, are you using Pure FlashBlade on-prem, is that correct? >> That is the solution that is supported, that is supported by Vertica for the on-prem. (cuts out) So at this point, we have been discussing with them about some of our own POCs for that. Before, again, we're back to the idea of how do we see ourselves using it? And so we certainly discuss the feasibility of bringing it in and giving it the (mumbles). But that's not something we're... Heavily on right now. >> And what is Domo for Domo? Tell us about that. >> Well it really started as this idea, even in the company, where we say, we should be using Domo in our everyday business. From the sales folk to the marketing folk, right. Everybody is going to use Domo, it's a business platform. 
For us in the engineering team, it was kind of like, well if we use Domo, say for instance, to be better as database engineers, now we've pointed Domo at itself, right? Vertica's running Domo in the background to some degree and then we turn around and say, "Hey Domo, how can we be better at running you?" So it became this kind of cool thing we'd play with. We're now able to put some, some methods together where we can actually do that, right. Where we can monitor using our platform, that's really good at processing large amounts of data and spitting out useful analytics, right. We take those analytics down, make recommendation changes at the-- For now, you've got Domo for Domo happening and it allows us to sit at home and work. Now, even when we have to, even before we had to. >> Well, you know, look. Look at us here. Right? We couldn't meet in Boston physically, we're now meeting remote. You're on a hot spot because you've got some weather in your satellite internet in Atlanta and we're having a great conversation. So, we're here with Ben White, who's a senior database engineer at Domo. I want to ask you about some of the envelope pushing that you've done around autonomous. You hear that word thrown around a lot. Means a lot of things to a lot of different people. How do you look at autonomous? And how does it fit with Eon and some of the other things you're doing? >> You know, I... Autonomous and the idea of autonomy is something that I don't even know if I'm ready to define. And so, even in my discussion, I often mention it as a road to it. Because exactly where it is, it's hard to pin down, because there's always this idea of how much trust do you give, right, to the system or how much, how much is truly autonomous? How much already is being intervened by us, the engineers. So I do hedge on using that. But on this road towards autonomy, when we look at, what we're, how we're using Domo.
And even what that really means for Vertica, because in a lot of my examples and a lot of the things that we've engineered at Domo, were designed to maybe overcome something that I thought was a limitation. And so many times as we've done that, Vertica has kind of met us. Like right after we've kind of engineered our architecture stuff, that we thought could help on our side, Vertica has a release that kind of addresses it. So, the autonomy idea and the idea that we could analyze metadata, make recommendations, and then execute those recommendations without intervention, is that road to autonomy. Once the database is properly able to do that, you could see in our ad hoc environment how that would be pretty useful, where with literally millions of queries every hour, trying to figure out what's the best, you know, profile. >> You know for- >> (overlapping) probably do a better job in that, than we could. >> For years I felt like IT folks sometimes were really, did not want that automation, they wanted the knobs to turn. But I wonder if you can comment. I feel as though the level of complexity now, with cloud, with on-prem, with, you know, hybrid, multicloud, the scale, the speed, the real time, it just gets, the pace is just too much for humans. And so, it's almost like the industry is going to have to capitulate to the machine. And then, really trust the machine. But I'm still sensing, from you, a little bit of hesitation there, but light at the end of the tunnel. I wonder if you can comment? >> Sure. I think the light at the end of the tunnel is even in the recent months and recent... We've really begun to incorporate more machine learning and artificial intelligence into the model, right. And back to what we're saying. So I do feel that we're getting closer to finding conditions that we don't know about.
Because right now our system is kind of a rule, rules-based system, where we've said, "Well these are the things we should be looking for, these are the things that we think are a problem." To mature to the point where the database is recognizing anomalies and taking on pattern (mutters). These are problems you didn't know happen. And that's kind of the next step, right. Identifying the things you didn't know. And that's the path we're on now. And it's probably more exciting even than, kind of, nailing down all the things you think you know. We figure out what we don't know yet. >> So I want to close with, I know you're a prominent member of the, a respected member of the Vertica Customer Advisory Board, and you know, without divulging anything confidential, what are the kinds of things that you want Vertica to do going forward? >> Oh, I think, some of the in-database autonomy. The ability to take some of the recommendations that we know we can derive from the metadata that already exists in the platform and start to execute some of the recommendations. And another thing we've talked about, and I've been pretty open about talking about it, is a new version of the database designer, I think, is something that I'm sure they're working on. Lightweight, something that can give us that database design without the overhead. Those are two things, I think, as they nail or basically the database designer, as they perfect that, they'll really have all the components in play to do in-database autonomy. And I think that's, to some degree, where they're heading. >> Nice. Well Ben, listen, I really appreciate you coming on. You're a thought leader, you're very open, open minded, Vertica is, you know, a really open community. I mean, they've always been quite transparent in terms of where they're going. It's just awesome to have guys like you on theCUBE to share with our community. So thank you so much and hopefully we can meet face-to-face shortly.
>> Absolutely. Well you stay safe in Boston, one of my favorite towns and so no doubt, when the doors get back open, I'll be coming down. Or coming up as it were. >> Take care. All right, and thank you for watching everybody. Dave Vellante with theCUBE, we're here covering the Virtual Vertica Big Data Conference. (electronic music)

Published Date : Mar 31 2020


ENTITIES

Entity	Category	Confidence
AWS	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Ben White	PERSON	0.99+
Boston	LOCATION	0.99+
Vertica	ORGANIZATION	0.99+
Atlanta	LOCATION	0.99+
Ferrari	ORGANIZATION	0.99+
Domo	ORGANIZATION	0.99+
Vertica Customer Advisory Board	ORGANIZATION	0.99+
Ben	PERSON	0.99+
two things	QUANTITY	0.98+
this year	DATE	0.98+
Vertica	TITLE	0.98+
theCUBE	ORGANIZATION	0.97+
Vertica Big Data Conference	EVENT	0.97+
Domo	TITLE	0.97+
Domo	PERSON	0.96+
Virtual Vertica Big Data Conference	EVENT	0.96+
Virtual Vertica Big Data Conference 2020	EVENT	0.96+
first	QUANTITY	0.95+
eon	TITLE	0.92+
one	QUANTITY	0.87+
today	DATE	0.87+
millions of queries	QUANTITY	0.84+
FlashBlade	TITLE	0.82+
Virtual Vertica	EVENT	0.75+
couple	QUANTITY	0.7+
Pure FlashBlade	COMMERCIAL_ITEM	0.58+
BDC 2020	EVENT	0.56+
MPP	TITLE	0.55+
times	QUANTITY	0.51+
RDBMS	TITLE	0.48+

Joy King, Vertica | Virtual Vertica BDC 2020

>> It's theCUBE, covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica. >> Welcome back, everybody. My name is Dave Vellante, and you're watching theCUBE's coverage of the Vertica Virtual Big Data Conference. theCUBE has been at every BDC, and it's our pleasure in these difficult times to be covering BDC as a virtual event. This digital program, really excited to have Joy King joining us. Joy is the vice president of product and go-to-market strategy at Vertica. And if that weren't enough, she also runs marketing and education. So, Joy, you're a multi-tool player. You've got the technical side and the marketing gene, so welcome to theCUBE. You're always a great guest. Love to have you on. >> Thank you so much, David. The pleasure, it really is. >> So I want to get in. You know, we'll have some time. We've been talking about the conference and the virtual event, but I really want to dig in to the product stuff. It's a big day for you guys. You announced 10.0. But before we get into the announcements, step back a little bit. You know, you guys are riding the waves. I've said to a number of our guests that Vertica has always been good at riding the wave, not only the initial MPP, but you embraced HDFS. You embraced data science and analytics and the cloud. So what are the trends that you see, the big waves that you're riding? >> Well, you're absolutely right, Dave. I mean, what I think is most interesting and important is because Vertica is, at its core, a true engineering culture founded by, well, a pretty famous guy, right, Dr. Stonebraker, who embedded that very technical Vertica engineering culture. It means that we don't pretend to know everything that's coming, but we are committed to embracing the technology trends, the innovations, things like that. We don't pretend to know it all. We just do it all. So right now, I think I see three big imminent trends that we are addressing.
And, for that matter, have been for a while, but that are particularly relevant right now. The first is a combination of, I guess, a disappointment in what Hadoop was able to deliver. I always feel a little guilty because she's a very reasonably capable elephant. She was designed to be HDFS, a highly distributed file store, but she can't be an entire zoo, so there's a lot of disappointment in the market, but a lot of data in HDFS. You combine that with some of the, well, not some, the explosion of cloud object storage. You're talking about even more data, but even more data silos. So data growth and data silos is trend one. Then what I would say trend two is the cloud reality. Cloud brings so many advantages. There are so many opportunities that public cloud computing delivers. But I think we've learned enough now to know that there's also some reality the cloud providers themselves, Dave, don't talk about very well. Because, is it more agile? Can you do things without having to manage your own data center? Of course you can. But the reality is it's a little more pricey than we expected. There are some security and privacy concerns. There are some workloads that can't go to the cloud, so hybrid and also multi-cloud deployments are the next trend, and they're mandatory. And then maybe the one that is the most exciting in terms of changing the world, and we could use a little change right now, is operationalizing machine learning. There's so much potential in the technology, but it somehow has been stuck for the most part in science projects and data science labs, and the time is now to operationalize it. Those are the three big trends that Vertica is focusing on right now. >> That's great. I wonder if I could ask you a couple questions about that.
I mean, I, like you, have a soft spot in my heart for the... and the thing about Hadoop that was, I think, profound was it got people thinking about, you know, bringing compute to the data and leaving data in place, and it really got people thinking about data-driven cultures. It didn't solve all the problems, but it collected a lot of data that we can now take your third trend and apply machine intelligence on top of that data. And then the cloud is really the ability to scale, and it gives you that agility, and it's really that cloud experience. It's not just the cloud itself, it's bringing the cloud experience to wherever the data lives. And I think that's what I'm hearing from you. Those are the three big superpowers of innovation today. >> That's exactly right. So, you know, I have to say I think we all know that data analytics, machine learning, none of that delivers real value unless the volume of data is there to be able to truly predict and influence the future. So the last 7 to 10 years has been correctly about collecting the data, getting the data into a common location, and HDFS was well designed for that. But we live in a capitalist world, and some companies stepped in and tried to make HDFS and the broader Hadoop ecosystem be the single solution to big data. It's not true. So now the key is, how do we take advantage of all of that data? And that's exactly what Vertica is focusing on. So as you know, we began our journey with Vertica back in the day in 2007 with our first release, and we saw the growth of Hadoop. So we announced many years ago Vertica SQL on Hadoop. The idea was to be able to deploy Vertica on Hadoop nodes and query the data in Hadoop. We wanted to help. Now with Vertica 10, we are also introducing Vertica in Eon mode, and we can talk more about that.
But Vertica in Eon mode for HDFS, this is a way to apply an ANSI SQL database management platform to HDFS infrastructure and data in HDFS file storage. And that is a great way to leverage the investment that so many companies have made in HDFS. And I think it's fair to the elephant to treat her well. >> Okay, let's get into the hard news on 10.0. You've got a mature stack, but what are the highlights of 10.0? And then we can drill into some of the technologies. >> Absolutely. So in 2018, Vertica announced Vertica in Eon mode, which is the separation of compute from storage. Now this is a great example of Vertica embracing innovation. Vertica was designed for on-premises data centers and bare metal servers, tightly coupled storage, DL380s from Hewlett Packard Enterprise, Dell, et cetera. But we saw that cloud computing was fundamentally changing data center architectures, and it made sense to separate compute from storage. So you add compute when you need compute. You add storage when you need storage. That's exactly what the clouds introduced, but it was only available in the cloud. So the first thing we did was architect Vertica in Eon mode, which is not a new product. This is really important. It's a deployment option. And in 2018 our customers had the opportunity to deploy their Vertica licenses in Eon mode on AWS. In September of 2019, we then broke an important record. We brought cloud architecture down to earth and we announced Vertica in Eon mode, so Vertica with communal or shared storage, leveraging Pure Storage FlashBlade, that gave us all the advantages of separating compute from storage. All of the workload isolation, the scale up, scale down, the ability to manage clusters. And we did that with an on-premises data center. And now, with Vertica 10, we are announcing Vertica in Eon mode on HDFS and Vertica in Eon mode on Google Cloud.
So what we've got here, in summary, is Vertica in Eon mode, multi-cloud and multiple on-premises data storage, and that gives us the opportunity to help our customers both with the hybrid and multi-cloud strategies they have and unifying their data silos. But Vertica 10 goes farther. >> Well, let me stop you there, because I just want to mention. So we talked to Joe Gonzalez at MassMutual, who essentially, he was brought in. And one of his tasks was to lead them into Eon mode. Why? Because they still had three separate data silos and they wanted to bring those together. They're investing heavily in technology. Joe is an expert, so they really put data at their core, and Eon mode was a key part of that because they're using S3. So that was a very important step for those guys. Carry on. What else do we need to know about? >> So one of the reasons, for example, that MassMutual is so excited about Eon mode is because of the operational advantages. You think about exactly what Joe told you about multiple clusters serving multiple use cases and maybe multiple divisions. And look, let's be clear. Marketing doesn't always get along with finance, and finance doesn't necessarily get along with ops, and IT is often caught in the middle. Vertica in Eon mode allows workload isolation, meaning allocating the compute resources that different use cases need without allowing them to interfere with other use cases, and allowing everybody to access the data. So it's a great way to bring the corporate world together but still protect them from each other. And that's one of the things that MassMutual is going to benefit from, as well as so many of our other customers. >> I also want to mention. So when I saw you last year at the Pure Storage Accelerate conference, you said, today we are the only company that separates compute from storage that runs on-prem and in the cloud. And I was like, I had to think about it. I've researched.
I still can't find anybody else who does. I want to mention you beat actually a number of the cloud players with that capability. So good job, and I think it's a differentiator, assuming that you're giving me that cloud experience and the licensing and the pricing capability. So I want to talk about that a little bit. >> Well, you're absolutely right. So let's be clear. There is no question that the public clouds introduced the separation of compute and storage and these advantages, but they do not have the ability or the interest to replicate that on premises. For Vertica, we were born to be software only. We make no money on underlying infrastructure. We don't charge as a package for the hardware underneath, so we are totally motivated to be independent of that and also to continuously optimize the software to be as efficient as possible. And we do the exact same thing, to your question, about licensing. Cloud providers charge for node instances. That's how they charge for their underlying infrastructure. Well, in some cases, if you're talking about a use case where you have a whole lot of data, but you don't necessarily have a lot of compute for that workload, it may make sense to pay per node. Then it's unlimited data. But what if you have a huge compute need on a relatively small data set? That's not so good. Vertica offers per node and per terabyte for our customers, depending on their use case. We also offer perpetual licenses for customers who want capital. But we also offer subscription for companies that say, nope, I have to have opex. And while this can certainly cause some complexity for our field organization, we know that it's all about choice, that everybody in today's world wants it personalized just for me. And that's exactly what we're doing with our pricing and licensing. >> So just to clarify, you're saying I can pay by the drink if I want to.
You're not going to force me necessarily into a term, or I can choose to have, you know, more predictable pricing. Is that correct? >> Well, so it's partially correct. First, Vertica subscription licensing is a fixed amount for the period of the subscription. We do that because so many of our customers cannot, and I'm one of them, by the way, cannot tell finance what the budget forecast is going to be for the quarter after I've spent it. You have to say what it's going to be before. So our subscription pricing is a fixed amount for a period of time. However, we do respect the fact that some companies do want usage-based pricing. So on AWS, you can use Vertica by the hour and you pay by the hour. We are about to launch the very same thing on Google Cloud. So for us, it's about what do you need? And we make it happen natively, directly with us or through AWS and Google Cloud. >> So I want to understand, so the fixed is some floor. And then if you want to surge above that, you can allow usage pricing, if you're on the cloud, correct? >> Well, you actually license your Vertica cluster by the hour on AWS and you run your cluster there. Or you can buy a license from Vertica for a fixed capacity or a fixed number of nodes and deploy it on the cloud. And then, if you want to add more nodes or add more capacity, you can. It's not usage based for the license that you bring to the cloud. But if you purchase through the cloud provider, it is usage based. >> Yeah, okay. And you guys are in the marketplace. Is that right? So, again, if I want opex, I can do that. I can choose to do that. >> That's awesome. Usage through the AWS Marketplace or, yeah, directly from Vertica. >> Because every small business who then goes to a salesforce management system knows this. Okay, great. I can pay by the month. Well, yeah, well, not really. Here's our three-year term in it, right? And it's very frustrating.
>> Well, and even in the public cloud you can pay by the hour, by the minute or whatever, but it becomes pretty obvious that you're better off if you have reserved instance types or committed amounts. That's why Vertica offers a subscription that says, hey, you want to have 100 terabytes for the next year? Here's what it will cost you. We do interval billing. You want to do monthly, quarterly, biannual, we'll do that. But we won't charge you for usage that you didn't even know you were using until after you get the bill. And frankly, that's something my finance team does not like. >> Yeah, I think, you know, I know this is kind of a wonky discussion, but so many people gloss over the licensing and the pricing, and I think my takeaway here is optionality. You know, pricing your way. That's great. Thank you for that clarification. Okay, so you've got Google Cloud. I want to talk about storage optionality. If I add them up, I've got S3, I've got, I'm presuming, Google now, and you've got Pure. >> Pure is an S3-compatible storage, yes. >> So Google object store? >> Like Google object store, Amazon S3 object store, HDFS, Pure Storage FlashBlade, which is an object store on-prem. And we are continuing on this path because ultimately we know that our customers need the option of having next-generation data center architecture, which is sort of shared or communal storage. So all the data is in one place. Workloads can be managed independently on that data, and that's exactly what we're doing. But we already have it in two public clouds and two on-premises deployment options today. And as you said, I did challenge you back when we saw each other at the conference. Today, Vertica is the only analytic data warehouse platform that offers that option on premises and in multiple public clouds. >> Okay, let's go back through the innovation cocktail, I'll call it. So it's the data, applying machine intelligence to that data.
And we've talked about scaling with cloud and some of the other advantages. Let's talk about the machine intelligence, the machine learning piece of it. What's your story there? Give us any updates on your embracing of tooling and the like. >> Well, quite a few years ago, we began building some native in-database machine learning algorithms into Vertica, and the reason we did that was we knew that the architecture of MPP columnar execution would dramatically improve performance. We also knew that a lot of people speak SQL, but at the time, not so many people spoke R or even Python. And so what if we could give access to machine learning in the database via SQL and deliver that kind of performance? So that's the journey we started out on. And then we realized that actually, machine learning is a lot more, as everybody knows, than just algorithms. So we then built in the full end-to-end machine learning functions, from data preparation to model training, model scoring and evaluation, all the way through to full deployment, and all of this, again, SQL accessible. You speak SQL, you speak to the data. And the other advantage of this approach was we realized that accuracy was compromised if you downsample. If you moved a portion of the data from a database to a specialty machine learning platform, you were challenged by accuracy and also what the industry is calling replicability. And that means if a model makes a decision like, let's say, credit scoring, and that decision is in any way challenged, well, you have to be able to replicate it to prove that you made the decision correctly. And there was a bit of a, you know, blow-up in the media not too long ago about a credit scoring decision that appeared to be gender biased. But unfortunately, because the model could not be replicated, there was no way to disprove that, and that was not a good thing. So all of this is built into Vertica, and with Vertica 10.
We've taken the next step, just like with Hadoop. We know that innovation happens within Vertica, but also outside of Vertica. We saw that data scientists really love their preferred languages, like Python. They love their tools and platforms, like TensorFlow. With Vertica 10, we now integrate even more with Python, which we have for a while, but we also add TensorFlow integration and PMML. What does that mean? It means that if you build and train a model external to Vertica, using the machine learning platform that you like, you can import that model into Vertica and run it on the full end-to-end process. But run it on all the data. No more accuracy challenges. MPP columnar execution, so it's blazing fast. And if somebody wants to know why a model made a decision, you can replicate that model, and you can explain why. Those are very powerful. And it's also another cultural unification, Dave. It unifies the business analyst community, who speak SQL, with the data scientist community, who love their tools like TensorFlow and Python. >> Well, I think, Joy, that's important because so much of machine intelligence and AI, there's a black box problem. You can't replicate the model. Then you do run into a potential gender bias. In the example that you're talking about there, and there are many, you know, let's say an individual is very wealthy. He goes for a mortgage and his wife goes for some credit. She gets rejected. He gets accepted. That's to say it's the same household, but there's bias in the model, that may be gender bias, that could be race bias. And so being able to replicate that and open it up and make the machine intelligence transparent is very, very important. >> It really is. And that replicability as well as accuracy, it's critical, because if you're downsampling and you're running models on different sets of data, things can get confusing. And yet you don't really have a choice.
Because if you're talking about petabytes of data and you need to export that data to a machine learning platform and then try to put it back and get the next set the next day, you're looking at way too much time. Doing it in the database, or training the model and then importing it into the database for production, that's what Vertica allows, and our customers are... Of course, you know, they are the ones that are sort of the trailblazers, they've always been, and this is the next step in blazing the ML trail. >> Joy, customers want analytics. They want full-function analytics. What are they pushing you for now? What are you delivering? What's your thought on that? >> Well, I would say the number one thing that our customers are demanding right now is deployment flexibility. What the CEO or the CFO mandated six months ago? Now, whatever that "thou shalt" is, is different. And what I tell them is it is impossible to know what you're going to be commanded to do or what options you might have in the future. The key is not having to choose, and they are very, very committed to that. We have a large telco customer who is multi-cloud as their commitment. Why multi-cloud? Well, because they see innovation available in different public clouds. They want to take advantage of all of them. They also, admittedly, see that there's the risk of lock-in, right? Like any vendor, they don't want that either, so they want multi-cloud. We have other customers who say, we have some workloads that make sense for the cloud and some that we absolutely cannot run in the cloud. But we want a unified analytics strategy, so they are adamant in focusing on deployment flexibility. That's what I'd say is first. Second, I would say, is the interest in operationalizing machine learning, but not necessarily forcing the analytics team to hammer the data science team about which tools are the best tools. That's probably number two.
And then I'd say number three, and it's because when you look at companies like Uber or The Trade Desk or AT&T or Cerner, performance at scale. When they say milliseconds, they think that's slow. When they say petabytes, they're like, yeah, that was yesterday. So performance at scale, good enough for Vertica is never good enough. And it's why we're constantly building at the core the next-generation execution engine, database designer, optimization engine, all that stuff. >> I want to also ask you. When I first started following Vertica, with theCUBE covering the BDC, one of the things I noticed in talking to customers and people in the community is that you have a community edition, a free edition, and it's not neutered. Have you maintained that ethos, you know, through the transitions into Micro Focus? And can you talk about that a little bit? >> Absolutely. Vertica Community Edition is Vertica. It's all of the Vertica functionality: geospatial, time series, pattern matching, machine learning, all of it. Vertica in Eon mode, Vertica in Enterprise mode. All of Vertica is in the Community Edition. The only limitation is one terabyte of data and three nodes, and it's free. Now, if you want commercial support, where you can file a support ticket and things like that, you do have to buy the license. But it's free, and people say, well, free for how long? Like our field, they've asked that, and I say forever, and they say, what do you mean forever? Because we want people to use Vertica for use cases that are small. They want to learn, they want to try, and we see no reason to limit that. And what we look for is when they're ready to grow, when they need the next set of data that goes beyond a terabyte or they need more compute than three nodes, then we're here for them. And it also brings up an important thing that I should remind you or tell you about, Dave.
If you haven't heard it, that's the Vertica Academy, academy.vertica.com. Well, what is that? That is self-paced, on-demand training, as well as Vertica Essentials certification. Training and certification means you have seven days with your hands on a Vertica cluster hosted in the cloud to go through all the certification. And guess what? All of that is free. Why would you give it for free? Because for us, empowering the market, giving the market the expertise, the learning they need to take advantage of Vertica, just like with Community Edition, is fundamental to our mission, because we see the advantage that Vertica can bring. And we want to make it possible for every company all around the world to take advantage of it. >> I love that ethos of Vertica. I mean, obviously great product. But it's not just the product. It's the business practices and really progressive pricing and embracing of all these trends and not running away from the waves but really leaning in. Joy, thanks so much. Great interview, really appreciate it. And I wish we could have been face to face in Boston, but I think it's the prudent thing to do. >> I promise you, Dave, we will, because the Vertica BDC in 2021 is already booked. So I will see you there. >> Joy King, thanks so much for coming on theCUBE. And thank you for watching. Remember, theCUBE is running this program in conjunction with the Virtual Vertica BDC. Go to vertica.com/bdc2020 for all the coverage, and keep it right there. This is Dave Vellante with theCUBE. We'll be right back.

Published Date : Mar 31 2020


ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Dave Vellante	PERSON	0.99+
September of 2019	DATE	0.99+
Joe Gonzalez	PERSON	0.99+
Dave	PERSON	0.99+
2007	DATE	0.99+
Dell	ORGANIZATION	0.99+
Joy King	PERSON	0.99+
Joe	PERSON	0.99+
Joy	PERSON	0.99+
Uber	ORGANIZATION	0.99+
2018	DATE	0.99+
Boston	LOCATION	0.99+
Vertical Academy Academy	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
seven days	QUANTITY	0.99+
one terabyte	QUANTITY	0.99+
python	TITLE	0.99+
three notes	QUANTITY	0.99+
Today	DATE	0.99+
Hewlett Packard Enterprises	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
BBC	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
100 terabytes	QUANTITY	0.99+
Ian Mode	PERSON	0.99+
six months ago	DATE	0.99+
Python	TITLE	0.99+
first release	QUANTITY	0.99+
1st 2nd	QUANTITY	0.99+
three year	QUANTITY	0.99+
Mass Mutual	ORGANIZATION	0.99+
Eight	QUANTITY	0.99+
next year	DATE	0.99+
Stone Breaker	PERSON	0.99+
first	QUANTITY	0.99+
one	QUANTITY	0.98+
America 10	TITLE	0.98+
King	PERSON	0.98+
today	DATE	0.98+
four terabyte	QUANTITY	0.97+
John Mode	PERSON	0.97+
Haas	PERSON	0.97+
yesterday	DATE	0.97+
first verdict	QUANTITY	0.96+
one place	QUANTITY	0.96+
s three	COMMERCIAL_ITEM	0.96+
single	QUANTITY	0.95+
first thing	QUANTITY	0.95+
One	QUANTITY	0.95+
both	QUANTITY	0.95+
Tensorflow	TITLE	0.95+
Hadoop	TITLE	0.95+
third trend	QUANTITY	0.94+
MPP Columbia	ORGANIZATION	0.94+
Hadoop	PERSON	0.94+
last last year	DATE	0.92+
three big trends	QUANTITY	0.92+
vertical 10	TITLE	0.92+
two public clouds	QUANTITY	0.92+
Pure Storage Accelerate conference	EVENT	0.91+
Andy	PERSON	0.9+
few years ago	DATE	0.9+
next day	DATE	0.9+
Mutual	ORGANIZATION	0.9+
Mode	PERSON	0.89+
telco	ORGANIZATION	0.89+
three big	QUANTITY	0.88+
eon	TITLE	0.88+
Verdict	PERSON	0.88+
three separate data	QUANTITY	0.88+
Cube	COMMERCIAL_ITEM	0.87+
petabytes	QUANTITY	0.87+
Google Cloud	TITLE	0.86+

Larry Lancaster, Zebrium | Virtual Vertica BDC 2020

>> Announcer: It's theCUBE! Covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica. >> Hi, everybody. Welcome back. You're watching theCUBE's coverage of the Vertica Virtual Big Data Conference. It was, of course, going to be in Boston at the Encore Hotel. Win big with big data with the new casino, but obviously Coronavirus has changed all that. Our hearts go out and we have empathy for those people who are struggling. We are going to continue our wall-to-wall coverage of this conference and we're here with Larry Lancaster, who's the founder and CTO of Zebrium. Larry, welcome to theCUBE. Thanks for coming on. >> Hi, thanks for having me. >> You're welcome. So first question, why did you start Zebrium? >> You know, I've been dealing with machine data a long time. So for those of you who don't know what that is, imagine servers or whatever goes on in a data center or in a SaaS shop. There's data coming out of those servers, out of those applications, and basically, you can build a lot of cool stuff on that. So there's a lot of metrics that come out and there's a lot of log files that come out. And so, I've built this... Basically spent my career building that sort of thing. So tools on top of that or products on top of that. The problem is that since at least log files are completely unstructured, it's always doing the same thing over and over again, which is going in and understanding the data and extracting the data and all that stuff. It's very time consuming. If you've done it like five times you don't want to do it again. So really, my idea was at this point, with machine learning where it's at, there's got to be a better way. So Zebrium was founded on the notion that we can just do all that automatically. We can take a pile of machine data, we can turn it into a database, and we can build stuff on top of that. And so the company is really all about bringing that value to the market. >> That's cool. 
I want to get into that, just to better understand who you're disrupting and understand that opportunity better. But before I do, tell us a little bit about your background. You've got kind of an interesting background. Lot of tech jobs. Give us some color there. >> Yeah, so I started in the Valley I guess 20 years ago, and when my son was born I left grad school. I was in grad school over at Berkeley, in biophysics. And I realized I needed to go get a job, so I ended up starting in software and I've been there ever since. I mean, I spent a lot of time at, I guess I cut my teeth at NetApp, which was a storage company. And then I co-founded a business called Glassbeam, which was kind of an ETL database company. And then after that I ended up at Nimble Storage. Another company, EMC, ended up buying the Glassbeam so I went over there, and then after Nimble, though, which is where I built the InfoSight platform. That's where I kind of... After that I was able to step back and take a year and a half and just go into my basement, actually, this is my kind of workspace here, and come up with the technology and actually build it so that I could go raise money and get a team together to build Zebrium. So that's really my career in a nutshell. >> And you've got Hello Kitty over your right shoulder, which is kind of cool. >> That's right. >> And then up to the left you've got your monitor, right? >> Well, I had it. It's over here, yeah. >> But it was great! Pull it out, pull it out, let me see it. So, okay, so you got that. So what do you do? You just sit there and code all night or what? >> Yeah, that's right. So Hello Kitty's over here. I have a daughter and she set up my workspace here on this side with Hello Kitty and so on. And over on this side, I've got my recliner where I basically lay it all the way back and then I pivot this thing down over my face and put my keyboard on my lap and I can just sit there for like 20 hours. It's great. Completely comfortable. >> That's cool. 
All right, better put that monitor back or our guys will yell at me. But so, obviously, we're talking to somebody with serious coding chops, and I'll also add that the Nimble InfoSight, I think it was one of the best pickups that HP, HPE, has had in a while. And the thing that interested me about that, Larry, is that the company was able to take that InfoSight and port it very quickly across its product lines. So that says to me it was a modern architecture, I'm sure API, microservices, and all those cool buzzwords, but the proof is in their ability to bring that IP to other parts of the portfolio. So, well done. >> Yeah, well thanks. Appreciate that. I mean, they've got a fantastic team there. And the other thing that helps is when you have the notion that you don't just build on top of the data, you extract the data, you structure it, you put that in a database, we used Vertica there for that, and then you build on top of that. Taking the time to build that layer is what lets you build a scalable platform. >> Yeah, so, why Vertica? I mean, Vertica's been around for a while. You remember you had the old RDBMS, Oracles, Db2s, SQL Server, and then the database was kind of a boring market. And then, all of a sudden, you had all of these MPP companies come out, a spate of them. They all got acquired, including Vertica. And they've all sort of disappeared and morphed into different brands, and Micro Focus has preserved the Vertica brand. But it seems like Vertica has been able to survive the transitions. Why Vertica? What was it about that platform that was unique and interested you? >> Well, I mean, so they're the first ones to build what I would call a real column store that's kind of market capable, right? So there was the C-Store project at Berkeley, which Stonebraker was involved in. And then that became sort of the seed from which Vertica was spawned. So you had this idea of, let's lay things out in a columnar way. 
And when I say columnar, I don't just mean that the data for every column is in a different set of files. What I mean by that is it takes full advantage of things like run-length encoding, and (unclear) encoding, and block compression, and so you end up with these massive orders-of-magnitude savings in terms of the data that's being pulled off of storage as well as it's moving through the pipeline internally in Vertica's query processing. So why am I saying all this? Because it was a fundamentally disruptive technology. I think column stores are ubiquitous now in analytics. And I think you could name maybe a couple of projects, which are mostly open source, that do something like Vertica does, but name me another one that's actually capable of serving an enterprise as a relational database. I still think Vertica is unique in being that one. >> Well, it's interesting because you're a startup. And so a lot of startups would say, okay, we're going with a born-in-the-cloud database. Now Vertica touts that, well look, we've embraced cloud. You know, we run in the cloud, we run on-prem, all different optionality. And you hear a lot of vendors say that, but a lot of times they're just taking their stack and stuffing it into the cloud. But, so why didn't you go with a cloud-native database and is Vertica able to, I mean, obviously, that's why you chose it, but I'm interested from a technologist's standpoint as to why you, again, made that choice given all these other choices out there. >> Right, I mean, again, I'm not, so... As I explained a column store, which I think is the appropriate definition, I'm not aware of another cloud-native-- >> Hm, okay. >> I'm aware of other cloud-native transactional databases, I'm not aware of one that has the analytics form factor and I've tried some of them. So it was not like I didn't look. 
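(Editor's illustration.) The run-length encoding mentioned in the column-store discussion above can be sketched in a few lines of Python. This is only a toy model of the general technique, not Vertica's storage format; the function names and the sample `severity` column are assumptions made for the example.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical column with long runs of repeated values,
# which is the case where run-length encoding pays off.
severity = ["INFO"] * 6 + ["WARN"] * 2 + ["INFO"] * 4

encoded = rle_encode(severity)
print(encoded)  # [('INFO', 6), ('WARN', 2), ('INFO', 4)]
assert rle_decode(encoded) == severity
```

Twelve stored values shrink to three pairs here, which hints at why a column laid out this way can mean far less data read from storage.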
What I was actually impressed with, and I think what let me move forward using Vertica in our stack, is the fact that Eon really is built from the ground up to be cloud-native. And so we've been using Eon almost ever since we started the work that we're doing. So I've been really happy with the performance and with the reliability of Eon. >> It's interesting. I've been saying for years that Vertica's a diamond in the rough and its previous owner didn't know what to do with it because it got distracted, and now Micro Focus seems to really see the value and is obviously putting some investments in there. >> Yeah. >> Tell me more about your business. Who are you disrupting? Are you kind of disrupting the do-it-yourself? Or is there sort of a big whale out there that you're going to go after? Add some color to that. >> Yeah, so our broader market is monitoring software, that's kind of the high-level category. So you have a lot of people in that market right now. Some of them are entrenched, large players, like Datadog would be a great example. Some of them are smaller upstarts. It's a pretty saturated market. But what's happened over the last, I'd say, two years is that there's been sort of a push towards what's called observability, in terms of at least how some of the products are architected, like Honeycomb, and how some of them are messaged. Most of them are messaged that way these days. And what that really means is there's been sort of an understanding that's developed that MTTR is really what people need to focus on to keep their customers happy. If you're a SaaS company, MTTR is going to be your bread and butter. And it's still measured in hours and days. And the biggest reason for that is because of what's called unknown unknowns. Because of complexity. Nowadays, applications are ten times as complex as they used to be. 
And what you end up with is a situation where if something is not new, if it's a known issue with a known symptom and a known root cause, then you can set up an automation for it. But the ones that really cost a lot of time in terms of service disruption are unknown unknowns. And now you've got to go dig into this massive mass of data. So observability is about making tools to help you do that, but it's still going to take you hours. And so our contention is, you need to automate the eyeball. The bottleneck is now the eyeball. And so you have to get away from this notion that a person's going to be able to do it infinitely more efficiently and recognize that you need automated help. When you get an alert, it shouldn't be, "Hey, something weird's happening. Now go dig in." It should be, "Here's a root cause and a symptom." And that should be proposed to you by a system that actually does the observing. That actually does the watching. And that's what Zebrium does. >> Yeah, that's awesome. I mean, you're right. The last thing you want is just another alert that says, "Go figure something out because there's a problem." So how does it work, Larry? In terms of what you built there. Can you take us inside the covers? >> Yeah, sure. So right now there's two kinds of data that we're ingesting. There's metrics and there's log files. Metrics, there's actually sort of a framework that's really popular in DevOps circles especially but it's becoming popular everywhere, which is called Prometheus. And it's a way of exporting metrics so that scrapers can collect them. And so if you go look at a typical stack, you'll find that most of the open source components and many of the closed source components are going to have exporters that export all their stats to Prometheus. So by supporting that stack we can bring in all of those metrics. And then there's also the log files. 
And so you've got host log files in a containerized environment, you've got container logs, and you've got application-specific logs, perhaps living on a host mount. And you want to pull all those back and you want to be able to say this log that I've collected here is associated with the same container on the same host that this metric is associated with. But now what? So once you've got that, you've got a pile of unstructured logs. So what we do is we take a look at those logs and we say, let's structure those into tables, right? So where I used to have a log message, if I look in my log file and I see it says something like, X happened five times, right? Well, that event type's going to occur again and it'll say, X happened six times or X happened three times. So if I see that as a human being, I can say, "Oh clearly, that's the same thing." And what's interesting here is the times that X happened, and that this number... I may want to know the numbers as a time series, the values of that column. And so you can imagine it as a table. So now I have a table for that event type and every time it happens, I get a row. And then I have a column with that number in it. And so now I can do any kind of analytics I want almost instantly across my... If I have all my event types structured that way, everything changes. You can do real anomaly detection and incident detection on top of that data. So that's really how we go about doing it. How we go about being able to do autonomous monitoring in a way that's effective. >> How do you handle doing that for, like, a bespoke app? Do you have to, does somebody have to build a connector to those apps? How do you handle that? >> Yeah, that's a really good question. So you're right. 
So if I go and install a typical log manager, there'll be connectors for different apps, and usually what that means is pulling in the stuff on the left, if you were to be looking at that log line, and it will be things like a timestamp, or a severity, or a function name, or various other things. And so the connector will know how to pull those apart and then the stuff to the right will be considered the message and that'll get indexed for search. And so our approach is we actually go in with machine learning and we structure that whole thing. So there's a table. And it's going to have columns called severity, and timestamp, and function name. And then it's going to have columns that correspond to the parameters that are in that event. And it'll have a name associated with the constant parts of that event. And so you end up with a situation where you've structured all of it automatically, so we don't need connectors. It'll work just as well on your home-grown app that has no connectors or parsers defined or anything. It'll work immediately just as well as it would work on anything else. And that's important, because you can't be asking people for connectors to their own applications. It just becomes, now they've got to stop what they're doing and go write code for you, for your platform, and they have to maintain it. It's just untenable. So you can be up and running with our service in three minutes. It'll just be monitoring those for you. >> That's awesome! I mean, that is really a breakthrough innovation. So, nice. Love to see that hittin' the market. Who do you sell to? What types of companies and what role within the company? >> Well, definitely there's two main sort of pushes that we've seen, or I should say pulls. One is from DevOps folks, SRE folks. So these are people who are tasked with monitoring an environment, basically. And then you've got people who are in engineering and they have a staging environment. And what they actually find valuable is... 
Because when we find an incident in a staging environment, yeah, half the time it's because they're tearing everything up and it's not release ready, whatever's in stage. That's fine, they know that. But the other half the time it's new bugs, it's issues, and they're finding issues. So it's kind of diverged. You have engineering users, and they don't have titles like QA, they're dev engineers or dev managers that are really interested. And then you've got DevOps and SRE people there (mumbles). >> And how do I consume your product? Is it SaaS... I sign up and you say within three minutes I'm up and running. I'm paying by the drink. >> Well, (laughs) right. So there's a couple ways. So, right. So the easiest way is if you use Kubernetes. So Kubernetes is what's called a container orchestrator. So these days, you know Docker and containers and all that, so now container orchestrators have become, I wouldn't say ubiquitous, but they're very popular now. So it's kind of on that inflection curve. I'm not exactly sure of the penetration but I'm going to say 30-40% probably of shops that we're interested in are using container orchestrators. So if you're using Kubernetes, basically you can install our Kubernetes chart, which basically means copying and pasting a URL and so on into your little admin panel there. And then it'll just start collecting all the logs and metrics and then you just log in on the website. And the way you do that is just go to our website and it'll show you how to sign up for the service and you'll get your little API key and link to the chart and you're off and running. You don't have to do anything else. You can add rules, you can add stuff, but you don't have to. You shouldn't have to, right? You should never have to do any more work. >> That's great. So it's a SaaS capability and I just pay for... How do you price it? >> Oh, right. So it's priced on volume, data volume. I don't want to go too much into it because I'm not the pricing guy. 
But what I'll say is that, as far as I know, it's as cheap or cheaper than any other log manager or metrics product. It's in that same neighborhood as the very low priced ones. Because right now, we're not trying to optimize for take. We're trying to make a healthy margin and get the value of autonomous monitoring out there. Right now, that's our priority. >> And it's running in the cloud, is that right? AWS West-- >> Yeah, that's right. Oh, I should've also pointed out that you can have a free account: if it's less than some number of gigabytes a day we're not going to charge. Yeah, so we run in AWS. We have a multi-tenant instance in AWS. And we have a Vertica Eon cluster behind that. And it's been working out really well. >> And on your freemium, have you used the Vertica Community Edition? Because they don't charge you for that, right? So is that how you do it or... >> No, no. We're, no, no. So, I don't want to go into that because I'm not the bizdev guy. But what I'll say is that if you're doing something that winds up being OEM-ish, you can work out the particulars with Vertica. It's not like you're going to just go pay retail and they won't let you distinguish between test, and prod, and paid, and all that. They'll work with you. Just call 'em up. >> Yeah, and that's why I brought it up, because Vertica, they have a community edition, which is not neutered. It runs Eon, it's just there's limits on clusters and storage. >> There's limits. >> But it's still fully functional though. >> So to your point, we want it multi-tenant. So it's big just because it's multi-tenant. We have hundreds of users on that (audio cuts out). >> And then, what's your partnership with Vertica like? Can we close on that and just describe that a little bit? >> What's it like. I mean, it's pleasant. >> Yeah, I mean (mumbles). >> You know what, so the important thing... Here's what's important. What's important is that I don't have to worry about that layer of our stack. 
When it comes to being able to get the performance I need, being able to get the economy of scale that I need, being able to get the absolute scale that I need, I've not been disappointed ever with Vertica. And frankly, being able to have ACID guarantees and everything else, like a normal mature database that can join lots of tables and still be fast, that's also necessary at scale. And so I feel like it was definitely the right choice to start with. >> Yeah, it's interesting. I remember in the early days of big data a lot of people said, "Who's going to need these ACID properties and all this complexity of databases?" And of course, ACID properties and SQL became the killer features and functions of these databases. >> Who didn't see that one coming, right? >> Yeah, right. And then, so you guys have done a big seed round. You've raised a little over $6 million and you got the product market fit down. You're ready to rock, right? >> Yeah, that's right. So we're doing a launch probably, well, when this airs it'll probably be the day before this airs. Basically, yeah. We've got people... Like literally in the last, I'd say, six to eight weeks, it's just been this sort of pique of interest. All of a sudden, everyone kind of gets what we're doing, realizes they need it, and we've got a solution that seems to meet expectations. So it's like... It's been an amazing... Let me just say this, it's been an amazing start to the year. I mean, at the same time, it's been really difficult for us but more difficult for some other people that haven't been able to go to work over the last couple of weeks and so on. But it's been a good start to the year, at least for our business. So... >> Well, Larry, congratulations on getting the company off the ground and thank you so much for coming on theCUBE and being part of the Virtual Vertica Big Data Conference. >> Thank you very much. >> All right, and thank you everybody for watching. This is Dave Vellante for theCUBE. 
Keep it right there. We're covering wall-to-wall Virtual Vertica BDC. You're watching theCUBE. (upbeat music)
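
(Editor's illustration.) The log structuring Larry describes in this interview, where lines like "X happened five times" and "X happened six times" collapse into one event type with the number as a column, can be sketched as follows. This is not Zebrium's method: the interview says Zebrium uses machine learning, while this stand-in uses a plain regular expression. The `structure` function, the `<num>` placeholder, and the sample log lines are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: the only variable parts of a line are numbers.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def structure(lines):
    """Group log lines by their constant text; each number becomes a column value."""
    tables = defaultdict(list)
    for line in lines:
        # The extracted numbers form one row of the event type's table.
        params = [float(n) if "." in n else int(n) for n in NUMBER.findall(line)]
        # The constant part of the line, with numbers masked, names the event type.
        event_type = NUMBER.sub("<num>", line)
        tables[event_type].append(params)
    return dict(tables)


logs = [
    "X happened 5 times",
    "X happened 6 times",
    "disk sda at 91.5 percent",
    "X happened 3 times",
]

tables = structure(logs)
for event_type, rows in tables.items():
    print(event_type, rows)
# X happened <num> times [[5], [6], [3]]
# disk sda at <num> percent [[91.5]]
```

Once every event type is a table like this, the per-event columns can be queried as time series, which is the property the interview ties to anomaly and incident detection.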

Published Date : Mar 31 2020


ENTITIES

Entity	Category	Confidence
Larry Lancaster	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Larry	PERSON	0.99+
Boston	LOCATION	0.99+
five times	QUANTITY	0.99+
three times	QUANTITY	0.99+
six times	QUANTITY	0.99+
EMC	ORGANIZATION	0.99+
six	QUANTITY	0.99+
Zebrium	ORGANIZATION	0.99+
20 hours	QUANTITY	0.99+
Glassbeam	ORGANIZATION	0.99+
Nedap	ORGANIZATION	0.99+
Vertica	ORGANIZATION	0.99+
Nimble	ORGANIZATION	0.99+
Nimble Storage	ORGANIZATION	0.99+
HP	ORGANIZATION	0.99+
HPE	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
a year and a half	QUANTITY	0.99+
Micro Focus	ORGANIZATION	0.99+
ten times	QUANTITY	0.99+
two kinds	QUANTITY	0.99+
two years	QUANTITY	0.99+
three minutes	QUANTITY	0.99+
first question	QUANTITY	0.99+
eight weeks	QUANTITY	0.98+
Stonebreaker	ORGANIZATION	0.98+
Prometheus	TITLE	0.98+
30-40%	QUANTITY	0.98+
Eon	ORGANIZATION	0.98+
hundred of users	QUANTITY	0.98+
One	QUANTITY	0.98+
Vertica Virtual Big Data Conference	EVENT	0.98+
Kubernetes	TITLE	0.97+
first fund	QUANTITY	0.97+
Virtual Vertica Big Data Conference 2020	EVENT	0.97+
AWB West	ORGANIZATION	0.97+
Virtual Vertica Big Data Conference	EVENT	0.97+
Honeycomb	ORGANIZATION	0.96+
SAS	ORGANIZATION	0.96+
20 years ago	DATE	0.96+
Both types	QUANTITY	0.95+
theCUBE	ORGANIZATION	0.95+
Datadog	ORGANIZATION	0.95+
two main	QUANTITY	0.94+
over $6 million dollars	QUANTITY	0.93+
Hello Kitty	ORGANIZATION	0.93+
SQL	TITLE	0.93+
Zebrium	PERSON	0.91+
Spoke	TITLE	0.89+
Encore Hotel	LOCATION	0.88+
InfoSight	ORGANIZATION	0.88+
Coronavirus	OTHER	0.88+
one	QUANTITY	0.86+
less	QUANTITY	0.85+
Oracles	ORGANIZATION	0.85+
2020	DATE	0.85+
CTO	PERSON	0.84+
Vertica	TITLE	0.82+
Nimble InfoSight	ORGANIZATION	0.81+

Ron Cormier, The Trade Desk | Virtual Vertica BDC 2020

>> Announcer: It's theCUBE, covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica. >> David: Hello, everybody, welcome to this special digital presentation of theCUBE. We're tracking the Vertica Virtual Big Data Conference. It's theCUBE's, I think, fifth year doing the BDC. We've been to every Big Data Conference that they've held and we're really excited to be helping with the digital component here in these interesting times. Ron Cormier is here, principal database engineer at The Trade Desk. Ron, great to see you. Thanks for coming on. >> Hi, David, my pleasure, good to see you as well. >> So we were talking a little bit about your background. You're basically a Vertica and database guru, but tell us about your role at The Trade Desk and then I want to get into a little bit about what The Trade Desk does. >> Sure, so I'm a principal database engineer at The Trade Desk. The Trade Desk was one of my customers when I was working at HP, as a member of the Vertica team, and I joined The Trade Desk in early 2016. And since then, I've been working on building out their Vertica capabilities and expanding the data warehouse footprint in an ever-growing database technology, data volume environment. >> And The Trade Desk is an ad tech firm and you specialize in real-time ad serving and pricing. And I guess real time, you know, people talk about real time a lot; we define real time as before you lose the customer. Maybe you can talk a little bit about, you know, The Trade Desk and the business, and maybe how you define real time. >> Totally, so to give everybody kind of a frame of reference: anytime you pull up your phone or your laptop and you go to a website or you use some app and you see an ad, what's happening behind the scenes is an auction is taking place. And people are bidding on the privilege to show you an ad. And across the open Internet, this happens seven to 13 million times per second. 
And so the whole auction dynamic and the display of the ad needs to happen really fast. So that's about as real time as it gets outside of high-frequency trading, as far as I'm aware. So The Trade Desk participates in those auctions. We bid on behalf of our customers, which are ad agencies, and the agencies represent brands, so the agencies are the Mad Men companies of the world and they have brands under their guidance, and so they give us budget to spend, to place the ads and to display them, and once the ads get displayed... So we bid on hundreds of thousands of auctions per second. Once we make those bids, anytime we do make a bid, some data flows into our data platform, which is powered by Vertica. And so we're getting hundreds of thousands of events per second. We have other events that flow into Vertica as well. And we clean them up, we aggregate them, and then we run reports on the data. And we run about 40,000 reports per day on behalf of our customers. The reports aren't as real time as I was talking about earlier, they're more batch oriented. Our customers like to see big chunks of time, like a whole day or a whole week or a whole month on a single report. So we wait for that time period to complete and then we run the reports on the results. >> So you have one of the largest commercial infrastructures in the big data sphere. Paint a picture for us. I understand you've got a couple of, like, 320-node clusters; we're talking about petabytes of data. But describe what your environment looks like. >> Sure, so like I said, we've been Vertica customers for a while. And we started out with a bunch of Enterprise clusters. So Enterprise mode is the traditional Vertica deployment where the compute and the storage are tightly coupled, all RAID arrays on the servers. And we had four of those and we were doing okay, but our volumes are ever increasing, and we wanted to store more data. 
And we wanted to run more reports in a shorter period of time; we wanted to keep pushing. So we had these four clusters, and then we started talking with Vertica about Eon mode. That's Vertica's separation of compute and storage, where the compute and the storage can be scaled independently: we can add storage without adding compute, or vice versa, or we can add both. So that was something we were very interested in, for a couple of reasons. One, on our Enterprise clusters we were running out of disk, and adding disk is expensive. In Enterprise Mode it's kind of a pain: you've got to add compute at the same time, so you kind of end up in an unbalanced place. In Eon mode, that problem gets a lot better. We can add disk, effectively infinite disk, because it's backed by S3. And we can add compute really easily to scale the number of things we run in parallel, the concurrency: just add a sub cluster. So there are two, US East and US West of Amazon, so reasonably diverse. And the real benefit is that we can stop nodes when we don't need them. Our workload is fairly lumpy, I call it. We're ingesting and aggregating all day, but after the day completes, we do the ingest and the aggregation for the final hour, and that needs to be completed. Once that's done, the number of reports that we need to run spikes up; it goes really high. So we spin up a bunch of extra compute on the fly, run those reports, and then spin it down, and we don't have to pay for that compute for the rest of the day. So Eon has been a nice boon for us for both those reasons. >> I'd love to explore Eon a little bit more. I mean, it's relatively new. I think Vertica announced Eon mode in 2018, so it's only been out there a couple of years.
So I'm curious, for the folks that haven't moved to Eon mode, which presumably they want to for the same reasons that you mentioned (why buy storage in chunks if you don't have to?), what were some of the challenges that you faced in going to Eon mode? What kind of things did you have to prepare for? Were there any out-of-scope expectations? Can you share that experience with us? >> Sure, so we were an early adopter. We participated in the beta program. I think it's fair to say we actually drove the requirements in a lot of ways, because we approached Vertica early on. So the challenges were what you'd expect any early adopter to go through: getting things working as expected. There were a number of cases I could touch upon. We found an inefficiency in the way that it accessed the data on S3: it was accessing the data too frequently, which ended up being expensive. So our S3 bill went up pretty significantly for a couple of months. That was a challenge, but we worked through it. Another area, where we recently made huge strides with Vertica, was the ability to stop and start nodes, to start them very quickly, and, when they start, to not interfere with any running queries. When we wanted to spin up a bunch of compute, there was a point in time when it would break certain queries that were already running. So that was a challenge. But again, the Vertica team has been quite responsive in solving these issues, and now that's behind us. For those looking to get started, there are a number of things to think about. Off the top of my head, there are new configuration items that you'll want to think about, like instance type.
Certainly Amazon has a variety of instances, and it's important to consider that one of Vertica's architectural advantages in this area is that Vertica has a caching layer on the instances themselves. What that does is, if we can keep the data in cache, what we've found is that the performance is basically the same as the performance of Enterprise Mode. So having a good-sized cache, when needed, can be a little worrying. So we went with the i3 instance types, which have a lot of local NVMe storage, so we can cache data and get good performance. That's one thing to think about. The number of nodes, the instance type, and certainly the number of shards is a technical item that needs to be considered. It's how the data gets distributed. It's sort of a layer on top of the segmentation that some Vertica engineers will be familiar with. And probably one of the big things that one needs to consider is how to get data into the database. If you have an existing database, there's no nice tool yet to suck all the data into an Eon database. I think they're working on that. But at the point we got there, we exported all our data out of the Enterprise clusters, dumped it out to S3, and then we had the Eon cluster suck that data in. >> So, awesome advice. Thank you for sharing that with the community. So at the end of the day, it sounds like you had some learning to do, some tweaking to do, and obviously had to figure out how to get the data in. At the end of the day, was it worth it? What was the business impact? >> Yeah, it definitely was worth it for us. Right now, we have four times the data in our Eon cluster that we have in our Enterprise clusters. We still run some Enterprise clusters. We started with four at the peak; now we're down to two. And we have the two Eon clusters.
I think our business would say it's been a huge win. We're doing things that we really never could have done before. Accessing that data on Enterprise would have been really difficult. It would have required nontrivial engineering to do things like daisy-chaining clusters together and then aggregating data across clusters, which would, again, be nontrivial. So we have all the data we want, we can continue to grow data, and we're running reports on seasonality, so our customers can compare their campaigns last year versus this year, which is something we just haven't been able to do in the past. So we grew the data vertically, and we've expanded the data horizontally as well: we're adding columns to our aggregates. We're enriching the data much more than we have in the past. So while we still have Enterprise kicking around, I'd say our Eon clusters are doing the majority of the heavy lifting. >> And the cloud was part of the enablement here, particularly with scale, is that right? And are you running certain... >> Definitely. >> And are you running on prem as well, or are you in a hybrid mode? Or is it all AWS? >> Great question. So yeah, when I've been speaking about Enterprise, I've been referring to on prem. We have physical machines in data centers. So yeah, we are running hybrid now, and so it's really hard to get an apples-to-apples direct comparison of Enterprise on prem versus Eon in the cloud. One thing that I touched upon in my presentation is this: if I try to get apples to apples, and I think about how I would run the entire workload on Enterprise or on Eon, if I had to run the entire thing on one or the other, I tried to think about how many CPU cores we would need to do that. And basically, it would be about the same number of cores, I think, for Enterprise on prem versus Eon in the cloud.
However, on Eon, half the cores only need to be running about six hours out of the day. So the other 18 hours, I can shut them down and not be paying for them, mostly. >> Interesting. Okay, so I've got to ask you. Notwithstanding the fact that you've got a lot invested in Vertica and a lot of experience there, there are a lot of, you know, emerging cloud databases. You know a lot about databases, not just Vertica. You're a database guru in many areas: traditional RDBMS, as well as MPP and new cloud databases. What is it about Vertica that works for you in this specific sweet spot that you've chosen? What's really the difference there? >> Yeah, so I think the key difference is the maturity. I am familiar with a number of other database platforms, in the cloud and otherwise, column stores specifically, that don't have the maturity that we're used to and that we need at our scale. So being able to specify alternate projections, so different sort orders on my data, is huge. And there are other platforms where we don't have that capability. Vertica is, of course, the original column store, and they've had time to build up a lead in terms of their maturity and features, and I think that other column stores, cloud or otherwise, are playing a little bit of catch-up in that regard. Of course, Vertica is playing catch-up on the cloud side. But if I had to pick whether I wanted to write a column store from scratch or a file system, like a cloud file system, from scratch, I think it would be easier to write the cloud file system. The column store is where the real smarts are. >> Interesting. Let's talk a little bit about some of the challenges you have in reporting. You have a very dynamic nature of reporting. Like I said, your clients want a time series; they don't just want a snapshot of a slice.
But at the same time, your reporting is probably pretty lumpy, a very dynamic, you know, demand curve. So first of all, is that accurate? Can you describe that dynamism, and how are you handling that? >> Yep, that's exactly right. It is lumpy, and that's the exact word that I use. So at the end of the UTC day, when UTC midnight rolls around, that's when we do the final ingest and the final aggregate, and then the queue of reports that need to run spikes. So the majority of those 40,000 reports that we run per day are run in the four to six hours after that spike. And so that's when we need to have all the compute come online, and that's what helps us answer all those queries as fast as possible. And that's a big reason why Eon is an advantage for us, because the rest of the day we don't necessarily need all that compute, and we can shut it down and not pay for it. >> So Ron, I wonder if you could share with us, just to sort of wrap here, where you want to take this. You're obviously very close to Vertica. Are you driving them hard on Eon mode? You mentioned before that the ability to load data into Eon mode would have been nice for you; I guess you're kind of over that hump. But what are the kinds of things, if Colin Mahony is here in the room, what are you telling him that you want the engineering team at Vertica to work on that would make your life better? >> I think the things that need the most attention near term are just smoothing out some of the edges, making it a little bit more seamless in terms of the cloud aspects. Our goal is to be able to start instances and have them join the cluster in less than five minutes. We're not quite there yet. If you look at some of the other cloud database platforms, they're beating that handily, so I know the team is working on that. Some of the other things are about control.
Like I mentioned, while we like control in the column store, we also want control on the cloud side of things, in terms of being able to dedicate some sub clusters to specific workloads. We can pin workloads against a specific sub cluster and take advantage of the cache that's over there. We can say, okay, this resource pool... I mean, the sub cluster is a relatively new concept for Vertica. So being able to have control of many things at the sub cluster level: resource pools, configuration parameters, and so on. >> Yeah, so I personally have always been impressed with Vertica and their ability to sort of ride the wave and adopt new trends. I mean, they do have a robust stack. It's been around, you know, 10-plus years. They've certainly embraced Hadoop, they're embracing machine learning, and we've been talking about the cloud. So I actually have a lot of confidence in them, especially when you compare them to the other mid-last-decade MPP column stores that came out. You know, Vertica is one of the few remaining, certainly as an independent brand. So I think that speaks to the team there and the engineering culture. But give us your final word, just final thoughts on your role, the company, Vertica, wherever you want to take it. >> Yeah, no, I mean, we're really appreciative, and we value the partners that we have, and so I think it's been a win-win. Like, our volumes are... I know that we have some data that got pulled into their test suite. So I think it's been a win-win for both sides, and it'll be a win for other Vertica customers and prospects, knowing that they're working with some of the highest volume, velocity, and variety data that (mumbles) >> Well, Ron, thanks for coming on. I wish we could have met face to face at the Encore in Boston. I think next year we'll be able to do that. But I appreciate that technology allows us to have these remote conversations. Stay safe, all the best to you and your family. And thanks again.
>> My pleasure, David, good speaking with you. >> And thank you for watching, everybody. This is theCUBE's coverage of the Vertica Virtual Big Data Conference. I'm Dave Vellante. We'll be right back right after this short break. (soft music)
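
A rough sketch of the cost comparison Ron describes in the interview: about the same number of cores either way, but on Eon roughly half of them run only about six hours a day. The core count below is a hypothetical placeholder (no fleet size is given in the interview), and the model counts compute core-hours only; it ignores storage, pricing differences between on prem and cloud, and the "mostly" in his caveat.

```python
def daily_core_hours(total_cores, always_on_fraction, burst_hours):
    """Core-hours per day when part of the fleet runs only during a burst window.

    always_on_fraction: share of cores that run all 24 hours.
    burst_hours: hours per day the remaining cores are running.
    """
    always_on = total_cores * always_on_fraction
    burst = total_cores - always_on
    return always_on * 24 + burst * burst_hours


TOTAL_CORES = 1000  # hypothetical; not a figure from the interview

# Enterprise on prem: every core is up around the clock.
enterprise_hours = daily_core_hours(TOTAL_CORES, 1.0, 24)

# Eon in the cloud: half the cores always on, half up about six hours a day.
eon_hours = daily_core_hours(TOTAL_CORES, 0.5, 6)

savings = 1 - eon_hours / enterprise_hours
print(f"Enterprise: {enterprise_hours:.0f} core-hours/day")
print(f"Eon:        {eon_hours:.0f} core-hours/day")
print(f"Compute core-hours saved: {savings:.1%}")
```

Under these assumptions the Eon fleet uses 15,000 of 24,000 core-hours per day, about 37.5% fewer, and that ratio does not depend on the total core count chosen.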

Published Date : Mar 31 2020


ENTITIES

Entity	Category	Confidence
Ron	PERSON	0.99+
David	PERSON	0.99+
Vertica	ORGANIZATION	0.99+
Ron Cormier	PERSON	0.99+
HP	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
last year	DATE	0.99+
AWS	ORGANIZATION	0.99+
40,000 reports	QUANTITY	0.99+
Boston	LOCATION	0.99+
18 hours	QUANTITY	0.99+
fifth year	QUANTITY	0.99+
US	LOCATION	0.99+
Dave volante	PERSON	0.99+
next year	DATE	0.99+
seven	QUANTITY	0.99+
both	QUANTITY	0.99+
One	QUANTITY	0.99+
2018	DATE	0.99+
less than five minutes	QUANTITY	0.99+
this year	DATE	0.99+
10 plus years	QUANTITY	0.99+
one	QUANTITY	0.99+
four	QUANTITY	0.99+
early 2016	DATE	0.98+
apples	ORGANIZATION	0.98+
two young clusters	QUANTITY	0.98+
two	QUANTITY	0.98+
both sides	QUANTITY	0.98+
about six hours	QUANTITY	0.98+
Cubes	ORGANIZATION	0.98+
six hours	QUANTITY	0.98+
US East	LOCATION	0.98+
Hp	ORGANIZATION	0.98+
Eon	ORGANIZATION	0.96+
S3	TITLE	0.95+
13 million times per second	QUANTITY	0.94+
half	QUANTITY	0.94+
prime	COMMERCIAL_ITEM	0.94+
four times	QUANTITY	0.92+
hundreds of thousands of auctions	QUANTITY	0.92+
mid last decade	DATE	0.89+
one thing	QUANTITY	0.88+
One thing	QUANTITY	0.87+
single report	QUANTITY	0.85+
couple reasons	QUANTITY	0.84+
four clusters	QUANTITY	0.83+
first graph	QUANTITY	0.81+
Vertica	TITLE	0.81+
hundreds of thousands of events per second	QUANTITY	0.8+
about 40,000 reports per day	QUANTITY	0.78+
Vertica Big Data conference 2020	EVENT	0.77+
320 node	QUANTITY	0.74+
a whole week	QUANTITY	0.72+
Vertica virtual Big Data	EVENT	0.7+

Joe Gonzalez, MassMutual | Virtual Vertica BDC 2020

(bright music) >> Announcer: It's theCUBE. Covering the Virtual Vertica Big Data Conference 2020, brought to you by Vertica. Hello everybody, welcome back to theCUBE's coverage of the Vertica Big Data Conference, the Virtual BDC. My name is Dave Vellante, and you're watching theCUBE. And we're here with Joe Gonzalez, who is a Vertica DBA at MassMutual Financial. Joe, thanks so much for coming on theCUBE. I'm sorry that we can't be face to face in Boston, but at least we're being responsible. So thank you for coming on. >> (laughs) Thank you for having me. It's nice to be here. >> Yeah, so let's set it up. We'll talk, you know, a little bit about MassMutual. Everybody knows it's a big financial firm, but what's your role there and kind of your mission? >> So my role is Vertica DBA. I was hired in January of last year to come on and manage their Vertica cluster. They'd been on Vertica for probably about a year and a half before that. They started out on an on-prem cluster and then moved to AWS Enterprise in the cloud, and brought me on just as they were considering transitioning over to Vertica's EON mode. And they didn't really have anybody dedicated to Vertica, nobody who really knew and understood the product. I'd been working with Vertica for probably about six, seven years at that point. I was looking for something new and landed a really good opportunity here with a great company. >> Yeah, you have a lot of experience in Vertica. You had a role in market research, so you're a data guy, right? I mean, that's really what you've been doing your entire career. >> I am. I've worked with Pitney Bowes in the postage industry, and I worked in healthcare auditing, after seven years in market research. And then I've been with MassMutual for a little over a year now, so yeah, quite a lot.
>> So tell us a little bit about what your objectives are at MassMutual, what you're doing with the platform, what applications you're supporting. Paint a picture for us, if you would. >> Certainly. So MassMutual just decided to make Vertica its enterprise data warehouse, so they've really bought into Vertica. And we're moving all of our data there; probably a good 80, 90% of MassMutual's data is going to be on the Vertica platform, in EON mode. And we have wide usage of that data across the corporation. Right now we're at about 50 terabytes and growing quickly, with a wide variety of users. So there are a lot of ETLs coming in overnight, loading a lot of data, transforming a lot of data. And a lot of reporting tools are using it, currently Tableau and MicroStrategy. We have Alteryx using it, and we also have APIs running against it throughout the day, 24/7, with people coming in, especially these days with, you know, some financial uncertainty going on. A lot of people are coming in and checking their 401(k)s, checking their insurance status and whatnot. So we have to handle a lot of concurrent traffic on top of the normal big queries. So it's quite a diverse cluster, and I'm glad they're really investing in using Vertica as their overall solution for this. >> Yeah, I mean, these days your 401(k) is like this, right? (laughing) Afraid to look. So I wonder, Joe, if you could share with our audience, for those who might not be as familiar with the history of Vertica, and specifically of MPP: historically you've had, you know, traditional RDBMS, whether it's Db2 or Oracle, and then you had a spate of companies that came out with this notion of MPP. Vertica is, I think, probably one of the few, if not the only, brands that survived. But what did that bring to the industry, and why is that important for people to understand, just in terms of, whatever it is, scale, performance, cost? Can you explain that?
>> To me, it actually brought scale at good cost. And that's why I've been a big proponent of Vertica ever since I started using it. There are a number, like you said, of different platforms where you can load and store and house big data. But the purpose of having that big data is not just for it to sit there, but to be used, and used in a variety of ways. And that's from, you know, something small, like the first installation I was on, which was about 10 terabytes. I've worked with data warehouses up to 100 terabytes, and, you know, there are Vertica installations with hundreds of petabytes on them. You want to be able to use that data, so you need a platform that's going to be able to access that data and get it to the clients, get it to the customers, as quickly as possible, without paying an arm and a leg for the privilege of doing so. And Vertica allows companies to do that: not only get their data to clients and, you know, in-company users quickly, but save money while doing so. >> So why couldn't I just use a traditional RDBMS? Why not just throw it all into Oracle? >> One, cost. Oracle is very expensive, while Vertica is a lot more affordable than that. But the column-store structure of Vertica allows for a lot more optimized queries. Some of the queries that you can run in Vertica in two, three, four seconds will take minutes and sometimes hours in an RDBMS like Oracle or SQL Server. They have the capability to store that amount of data, no question, but the usability really lacks when you start querying tables that are 180 billion rows, or tables in Vertica that are over 1,000 columns. Those will take hours to run on a traditional RDBMS, and running them in Vertica, I get my queries back in a sec.
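
To illustrate Joe's point about column stores and wide tables, here is a toy sketch that counts how many values each layout has to read in order to sum a single column. It is only a simplified model, not Vertica's actual storage engine (it ignores compression, sort orders, projections, and disk I/O), and the table sizes and function names are made up for illustration.

```python
NUM_ROWS = 1_000   # hypothetical table size
NUM_COLS = 50      # hypothetical table width

# Row layout: each record keeps all of its column values together.
rows = [[r * NUM_COLS + c for c in range(NUM_COLS)] for r in range(NUM_ROWS)]

# Column layout: the same values, but each column is stored contiguously.
columns = [list(col) for col in zip(*rows)]


def sum_column_row_store(rows, col):
    """Sum one column when whole rows must be read; returns (sum, values_read)."""
    total = 0
    values_read = 0
    for row in rows:
        values_read += len(row)  # the entire row is fetched to reach one field
        total += row[col]
    return total, values_read


def sum_column_column_store(columns, col):
    """Sum one column when only that column is read; returns (sum, values_read)."""
    data = columns[col]
    return sum(data), len(data)


row_total, row_read = sum_column_row_store(rows, 7)
col_total, col_read = sum_column_column_store(columns, 7)

print("same answer:", row_total == col_total)
print("values read, row layout:   ", row_read)
print("values read, column layout:", col_read)
print("ratio:", row_read // col_read)
```

In this model the row layout reads every value in the table while the column layout reads one column's worth, so the gap grows with the table's width, which is the situation Joe describes for tables with over 1,000 columns.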
>> You know what's interesting to me, Joe, and I wonder if you could comment: it seems that Vertica has done a good job of embracing, you know, riding the waves, whether it was HDFS and big data in the early part of the big data era, or machine learning and machine intelligence, whether it's, you know, TensorFlow and other data science tools. And the cloud is the other one, right? A lot of times cloud is super disruptive, particularly to companies that started on-prem, but it seems like Vertica somehow has been able to adopt and embrace some of these trends. First of all, from your standpoint as a customer, is that true? And why do you think that is? Is it architectural? Is it the mindset of the engineering? I wonder if you could comment on that. >> It's absolutely true. I started out, again, on an on-prem Vertica data warehouse, and we kind of, you know, rolled along with them. You know, more and more people have been using data, and they want to make it accessible to people on the web now. And, you know, having the option to provide that data from an on-prem solution or from AWS is key, and now Vertica is offering even a hybrid solution, if you want to keep some of your data behind a firewall, on-prem, and put some in the cloud as well. So Vertica has absolutely evolved along with the industry in ways that no other company really has, that I've seen. And I think the reason for it, and the reason I've stayed with Vertica and specifically have remained a Vertica DBA for the last seven years, is the way Vertica stays in touch with its customers. I've been working with the same people for the seven, eight years I've been using Vertica. They're family. I'm part of their family, and, you know, I'm good friends with some of these people. And they really are in tune not only with the customer but with what they're doing.
They really sit down with you and have those conversations about, you know, what are your needs? How can we make Vertica better? And they listen to their clients. You know, just having access to the engineers who develop Vertica, to be arranged on a phone call or whatnot, I've never had that with any other company. Vertica makes that available to their customers when they need it. So the personal touch is huge for them. >> That's good. It's always good to get the confirmation from the practitioners and not just hear it from the vendor. I want to ask you about the EON transition. You mentioned that MassMutual brought you in to help with that. What were some of the challenges that you faced, and how did you get over them? And why EON? You know, what was the goal, the outcome, and some of the challenges that you had to overcome? >> Right. So MassMutual had an interesting setup when I first came in. They had three different Vertica clusters to accommodate three different portions of their business. There were the data scientists, who use the data quite extensively in very large, very intense queries for their predictive analytics work and whatnot. There was a separate one for the APIs, which needed, you know, sub-second query response times. On the enterprise solution, they weren't always able to get the performance they needed, because the fast queries were being overrun by the larger queries that needed more resources. And then they had a third for starting to develop this enterprise data platform and, you know, looking into their future. The first challenge was bringing all three of those together, back into a single cluster, and allowing our users to have both the heavy queries and the API queries running at the same time, on the same platform, without having to completely separate them out onto different clusters.
EON really helps with that because it allows us to store that data in the S3 communal storage and have the main cluster set up to run the heavy queries. And then you can set up sub clusters that still point to that S3 data but separate out the compute, so that the APIs really have their own resources to run with and are not interfered with by the other processes. >> Okay, so I'm hearing a couple of things. One is you're sort of busting down data silos, so you're able to have a much more coherent view of your data, which I would imagine is critical. Certainly, companies like MassMutual have been around for 100 years, and so you've got all kinds of data dispersed. So to the extent that you can break down those silos, that's important. But also, being able to have granular increments of compute and storage, I guess, is what I'm hearing. What does that do for you? Does it make that more efficient? Are there other business benefits? Maybe you could elucidate. >> Well, one, cost is again a huge benefit. The cost of running three different clusters, even in AWS, in the enterprise solution was a little costly. You know, you had to have your dedicated servers here and there, so you're paying for, you know, 12, 15 different servers, for example. Whereas if we bring them all back into EON, I can run everything on a six-node production cluster. And, you know, when things are busy, I can spin up the three-node sub cluster for the APIs, only paid for when I need them, and then bring them back into the main cluster when things have slowed down a bit, and they can get the performance that they need. So that saves a ton on resource costs. You know, you're not paying for the storage, you're paying for one S3 bucket, and you're only paying for the nodes, the EC2 instances, that are up and running when you need them, and that is huge. And again, like you said, it gives us the ability to silo our data without having to completely separate our data into different storage areas.
Which is a big benefit. It gives us the ability to query everything from one single cluster without having to synchronize it to, you know, three different ones. So this one's going to have theirs, this one's going to have theirs, but everyone's still looking at the same data. And we replicate that in QA and Dev so that people can do it outside of production and do some testing as well. >> So EON is obviously a very important innovation. And of course, Vertica touts the difference between it and others who separate compute from storage. You know, they're not the only one that does that, but they are really, I think, the only one that does it for on-prem and virtually across clouds. So my question is, and I think you're doing a breakout session at the Virtual BDC. We were going to be in Boston; now we're doing it online. If I'm in the audience, imagine I'm a junior DBA at an organization that maybe doesn't have a Joe. I haven't been an expert for seven years. How hard is it for me? What do I need to do to get up to speed on EON? It sounds great, I want it. I'm going to save my company money, but I'm nervous, 'cause I've only been a Vertica DBA for, you know, a year, and I'm, you know, not as experienced as you. What are the things that I should be thinking about? Do I need to hire somebody? Do I need to bring in a consultant? Can I learn it myself? What would you advise? >> It's definitely easy enough that if you have at least a little bit of work experience, you can learn it yourself, okay? 'Cause the concepts are still there. There are some, you know, little nuances where you do need to be aware of certain changes between the Enterprise and EON editions.
But I would also say consult with your Vertica account manager. Let them bring in the right people from Vertica to help you get up to speed, and if you need to, there are also resources available as far as consultants go that will help you get up to speed very quickly. We did work together with Vertica and with one of their partners, Clarity, in helping us to understand EON better and set it up the right way. You know, how do we pick the number of shards for our data warehouse? They helped us evaluate all that and pick the right number of shards and the right number of nodes to get set up and going. And, you know, they helped us figure out the best ways to get our data over from the Enterprise edition into EON very quickly and very efficiently. So you can definitely do it yourself. >> I wanted to ask you about organizational, you know, issues, because guys like you, practitioners, always tell me, "Look, the technology comes and goes. That's kind of the easy part; we're good at that. It's the people, it's the processes, the skill sets." What does your, you know, team regime look like? And do you have any sort of ideal team makeup or advice? Is it two-pizza teams? What kind of skills? What kind of interaction and communication with senior leadership? I wonder if you could just give us some color on that. >> One of the things that makes me extremely proud to be working for MassMutual right now is that they do what a lot of companies have not been doing, and that is investing in IT. They have put a lot of thought, a lot of money, and a lot of support into setting up their enterprise data platform and putting Vertica at the center. And not only did they put the money into getting the software that they needed, like Vertica, you know, MicroStrategy, and all the other tools that we use with it, they put the money into the people. Our managers are extremely supportive of us.
We hired about 40 to 45 different people within a four-month time frame: data engineers, data analysts, data modelers, a nice mix of people who can help shape your data, bring the data in, and help the users use the data properly, and that allows me as the database administrator to make sure that they're doing what they're doing most efficiently, and to focus on my job. So you have to have that diversity among the different data skills in order to make your team successful. >> That's awesome. Kind of a side question, and it's really not Vertica's wheelhouse, but I'm curious. You know, in the early days of the big data movement, a lot of the data scientists would complain, and they still do, that "80% of my time is spent wrangling data." The tools for the data engineer, the data scientists, the database experts, they're all different. Is that changing? And to what degree is that changing? What inning are we in, just in terms of a more facile environment for all those roles? >> Again, I think it depends from company to company, you know, on what resources they make available to the data scientists. And we have a lot of data scientists at MassMutual. They're very much into doing a lot of machine learning, model training, and predictive analytics. And they are, you know, used to doing it outside of Vertica too, pulling that data out into Python and Scala, Spark, and tools like that. And they're also now just getting into using Vertica's in-database analytics and machine learning, which is a skill that, you know, definitely nobody else out there has. So having somebody who understands Vertica, like myself, and being able to train other people to use Vertica in the way that is most efficient for them, is key.
But also just having people who understand not only the tools that you're using, but how to model data, how to architect your tables, your schemas, the interaction between your tables and schemas and whatnot, you need to have that diversity in order to make this work. And our data scientists have benefited immensely from the structure that MassMutual put in place by our data management delivery team. >> That's great, I think I saw, somewhere in your background, that you've trained about 100 people in Vertica. Did I get that right? >> Yes, I've, since I started here, I've gone to our Boston location, our Springfield location, and our New York City location and trained, probably at this point, about 120, 140 of our Vertica users. And I'm trying to do, you know, a couple of follow-up sessions per year. >> So adoption, obviously, is a big goal of yours. Getting people to adopt the platform, but then more importantly, I guess, deliver business value and outcomes. >> Absolutely. >> Yeah, I wanted to ask you about encryption. You know, in the perfect world, everything would be encrypted, but there are trade-offs. Are you using encryption? What are you doing in that regard? >> We are actually just getting into that now due to the New York and the CCPA regulations that are now in place. We do have a lot of Personally Identifiable Information in our data store that does require encryption. So we are going through a months-long process that started in December, I think, it's actually a bit earlier than that, to start identifying all the columns, not only in our Vertica database, but in, you know, the other databases that we do use, you know, we have Postgres database, SQL Server, Teradata for the time being, until that moves into Vertica. And identify where that data sits, what downstream applications pull that data from the data sources and store it locally as well, and start encrypting that data. 
And because of the tight relationship between Voltage and Vertica, we settled on Voltage as the major platform to start doing that encryption. So we're going to be implementing that in Vertica probably within the next month or two, and roll it out to all the teams that have data that requires encryption. We're going to start rolling it out to the downstream application owners to make sure that they are encrypting the data as they get it pulled over. And we're also using another product for several other applications that don't mesh as well with Voltage. >> Voltage being Micro Focus's encryption solution, correct? >> Right, yes. >> Yes, of course, Micro Focus, for the audience, is the, it owns Vertica, and Vertica is a separate brand. So I want to ask you kind of close on what success looks like. You've been at this for a number of years, coming into MassMutual which was great to hear. I've had some past experience with MassMutual, it's an awesome company, I've been to the Springfield facility and in Boston as well, and I have great respect for them, and they've really always been a leader. So it's great to hear that they're investing in technology as a differentiator. What does success look like for you? Let's say you're at MassMutual for a few years, you're looking back, what does success look like? Go. >> A good question. It's changing every day just, you know, with more and more, you know, applications coming onboard, more and more data being pulled in, more uses being found for the data that we have. I think success for me is making sure that Vertica, first of all, is always up made, is always running at its most optimal to keep our users happy. I think when I started, you know, we had a lot of processes that were running, you know, six, seven hours, some of them were taking, you know, almost a day long, because they were so complicated, we've got those running in under an hour now, some of them running in a matter of minutes. 
I want to keep that optimization going for all of our processes. Like I said, there's a lot of users using this data. And it's been hard over the first year of me being here to get to all of them. And thankfully, you know, I'm getting a bit of help now, I have a couple of system DBAs that I'm training up to help out with these optimizations, you know, fixing queries, fixing projections to make sure that queries do run as quickly as possible. So getting that to its optimal stage is one. Two, getting our data encrypted and protected so that even if for whatever reason, somehow somebody breaks into our data, they're not going to be able to get anything at all, because our data is 100% protected. And I think more companies need to be focusing on that as well. And third, I want to see our data science teams using more and more of Vertica's in-database predictive analytics, in-database machine learning products, and really helping make their jobs more efficient by doing so. >> Joe, you're an awesome guest, I mean, we always, like I said, love having the practitioners on and getting the straight skinny. You're welcome back anytime, and as I say, I wish we could have met in Boston, maybe next year at the BDC. But it's great to have you online, and thanks for coming on theCUBE. >> And thank you for having me and hopefully we'll meet next year. >> Yeah, I hope so. And thank you everybody for watching that. Remember theCUBE is running concurrent with the Vertica Virtual BDC, it's vertica.com/bdc2020. If you want to check out all the keynotes, and all the breakout sessions, I'm Dave Vellante for theCUBE. We'll be going. More interviews, for people right there. Thanks for watching. (bright music)

Published Date : Mar 31 2020

Keynote Analysis | Virtual Vertica BDC 2020

(upbeat music) >> Narrator: It's theCUBE, covering the Virtual Vertica Big Data Conference 2020. Brought to you by Vertica. >> Dave Vellante: Hello everyone, and welcome to theCUBE's exclusive coverage of the Vertica Virtual Big Data Conference. You're watching theCUBE, the leader in digital event tech coverage. And we're broadcasting remotely from our studios in Palo Alto and Boston. And, we're pleased to be covering wall-to-wall this digital event. Now, as you know, originally BDC was scheduled this week at the new Encore Hotel and Casino in Boston. Their theme was "Win big with big data". Oh sorry, "Win big with data". That's right, got it. And, I know the community was really looking forward to that, you know, meet up. But look, we're making the best of it, given these uncertain times. We wish you and your families good health and safety. And this is the way that we're going to broadcast for the next several months. Now, we want to unpack Colin Mahony's keynote, but, before we do that, I want to give a little context on the market. First, theCUBE has covered every BDC since its inception, since the BDC's inception that is. It's a very intimate event, with a heavy emphasis on user content. Now, historically, the data engineers and DBAs in the Vertica community, they comprised the majority of the content at this event. And, that's going to be the same for this virtual, or digital, production. Now, theCUBE is going to be broadcasting for two days. What we're doing, is we're going to be concurrent with the Virtual BDC. We've got practitioners that are coming on the show, DBAs, data engineers, database gurus, we've got security experts coming on, and really a great line up. And, of course, we'll also be hearing from Vertica Execs, Colin Mahony himself right off the keynote, folks from product marketing, partners, and a number of experts, including some from Micro Focus, which is the, of course, owner of Vertica. 
But I want to take a moment to share a little bit about the history of Vertica. The company, as you know, was founded by Michael Stonebraker. And, Vertica started, really they started out as a SQL platform for analytics. It was the first, or at least one of the first, to really nail the MPP column store trend. Not only did Vertica have an early mover advantage in MPP, but the efficiency and scale of its software, relative to traditional DBMS, and also other MPP players, is underscored by the fact that Vertica, and the Vertica brand, really thrives to this day. But, I have to tell you, it wasn't without some pain. And, I'll talk a little bit about that, and really talk about how we got here today. So first, you know, you think about traditional transaction databases, like Oracle or IBM DB2, or even enterprise data warehouse platforms like Teradata. They were simply not purpose-built for big data. Vertica was. Along with a whole bunch of other players, like Netezza, which was bought by IBM, Aster Data, which is now Teradata, Actian, ParAccel, which was the basis for Redshift, Amazon's Redshift, Greenplum was bought, in the early days, by EMC. And, these companies were really designed to run as massively parallel systems that smoked traditional RDBMS and EDW for particular analytic applications. You know, back in the big data days, I often joked that, like an NFL draft, there was a run on MPP players, like when you see a run on pulling guards. You know, once one goes, they all start to fall. And that's what you saw with the MPP columnar stores, IBM, EMC, and then HP getting into the game. So, it was like 2011, and Leo Apotheker, he was the new CEO of HP. Frankly, he had no clue, in my opinion, what to do with Vertica, and totally missed one of the biggest trends of the last decade, the data trend, the big data trend. HP picked up Vertica for a song, it wasn't disclosed, but my guess is that it was around 200 million. 
So, rather than build a bunch of smart tuck-ins around Vertica, which I always call the diamond in the rough, Apotheker basically permanently altered HP for years. He kind of ruined HP, in my view, with a 12 billion dollar purchase of Autonomy, which turned out to be one of the biggest disasters in recent M&A history. HP was forced to spin merge, and ended up selling most of its software to Microsoft, Micro Focus. (laughs) Luckily, during its time at HP, CEO Meg Whitman, largely was distracted with what to do with the mess that she inherited from Apotheker. So, Vertica was left alone. Now, the upshot is Colin Mahony, who was then the GM of Vertica, and still is. By the way, he's really the CEO, and he just doesn't have the title, I actually think they should give that to him. But anyway, he's been at the helm the whole time. And Colin, as you'll see in our interview, is a rockstar, he's got technical and business chops, people love him in the community. Vertica's culture is really engineering driven and they're all about data. Despite the fact that Vertica is a 15-year-old company, they've really kept pace, and not been polluted by legacy baggage. Vertica, early on, embraced Hadoop and the whole open-source movement. And that helped give it tailwinds. It leaned heavily into cloud, as we're going to talk about further this week. And they got a good story around machine intelligence and AI. So, whereas many traditional database players are really getting hurt, and some are getting killed, by cloud database providers, Vertica's actually doing a pretty good job of servicing its install base, and is in a reasonable position to compete for new workloads. On its last earnings call, the Micro Focus CFO, Stephen Murdoch, he said they're investing 70 to 80 million dollars in two key growth areas, security and Vertica. Now, Micro Focus is running its SUSE play on these two parts of its business. 
What I mean by that, is they're investing and allowing them to be semi-autonomous, spending on R&D and go to market. And, they have no hardware agenda, unlike when Vertica was part of HP, or HPE, I guess HP, before the spin out. Now, let me come back to the big trend in the market today. And there's something going on around analytic databases in the cloud. You've got companies like Snowflake and AWS with Redshift, as we've reported numerous times, and they're doing quite well, they're gaining share, especially of new workloads that are emerging, particularly in the cloud native space. They combine scalable compute, storage, and machine learning, and, importantly, they're allowing customers to scale compute and storage independent of each other. Why is that important? Because you don't have to buy storage every time you buy compute, or vice versa, in chunks. So, if you can scale them independently, you've got granularity. Vertica is keeping pace. In talking to customers, Vertica is leaning heavily into the cloud, supporting all the major cloud platforms, as we heard from Colin earlier today, adding Google. And, while my research shows that Vertica has some work to do in cloud and cloud native, to simplify the experience, its more robust and mature stack, which supports many different environments, you know, deep SQL, ACID properties, and DNA that allows Vertica to compete with these cloud-native database suppliers. Now, Vertica might lose out in some of those native workloads. But, I have to say, my experience in talking with customers, if you're looking for a great MPP column store that scales and runs in the cloud, or on-prem, Vertica is in a very strong position. Vertica claims to be the only MPP columnar store to allow customers to scale compute and storage independently, both in the cloud and in hybrid environments on-prem, et cetera, cross clouds, as well. 
So, while Vertica may be at a disadvantage in a pure cloud native bake-off, its more robust and mature stack, combined with its multi-cloud strategy, gives Vertica a compelling set of advantages. So, we heard a lot of this from Colin Mahony, who announced Vertica 10.0 in his keynote. He really emphasized Vertica's multi-cloud affinity, its Eon Mode, which really allows that separation, or scaling of compute, independent of storage, both in the cloud and on-prem. Vertica 10, according to Mahony, is making big bets on in-database machine learning, he talked about that, AI, and along with some advanced regression techniques. He talked about PMML models, Python integration, which was actually something that they talked about doing with Uber and some other customers. Now, Mahony also stressed the trend toward object stores. And, Vertica now supports, let's see S3, with Eon, S3 Eon in Google Cloud, in addition to AWS, and then Pure and HDFS, as well, they all support Eon Mode. Mahony also stressed, as I mentioned earlier, a big commitment to on-prem and the whole cloud optionality thing. So 10.0, according to Colin Mahony, is all about really doubling down on these industry waves. As they say, enabling native PMML models, running them in Vertica, and really doing all the work that's required around ML and AI, they also announced support for TensorFlow. So, object store optionality is important, is what he talked about in Eon Mode, with the news of support for Google Cloud and, as well as HDFS. And finally, a big focus on deployment flexibility. Migration tools, which are a critical focus really on improving ease of use, and you hear this from a lot of customers. So, these are the critical aspects of Vertica 10.0, and an announcement that we're going to be unpacking all week, with some of the experts that I talked about. So, I'm going to close with this. My long-time co-host, John Furrier, and I have talked for some time about this new cocktail of innovation. 
No longer is Moore's law the, really, mainspring of innovation. It's now about taking all these data troves, bringing machine learning and AI into that data to extract insights, and then operationalizing those insights at scale, leveraging cloud. And, one of the things I always look for from cloud is, if you've got a cloud play, you can attract innovation in the form of startups. It's part of the success equation, certainly for AWS, and I think it's one of the challenges for a lot of the legacy on-prem players. Vertica, I think, has done a pretty good job in this regard. And, you know, we're going to look this week for evidence of that innovation. One of the interviews that I'm personally excited about this week, is a new-ish company, I would consider them a startup, called Zebrium. What they're doing, is they're applying AI to do autonomous log monitoring for IT ops. And, I'm interviewing Larry Lancaster, who's their CEO, this week, and I'm going to press him on why he chose to run on Vertica and not a cloud database. This guy is a hardcore tech guru and I want to hear his opinion. Okay, so keep it right there, stay with us. We're all over the Vertica Virtual Big Data Conference, covering in-depth interviews and following all the news. So, theCUBE is going to be interviewing these folks, two days, wall-to-wall coverage, so keep it right there. We're going to be right back with our next guest, right after this short break. This is Dave Vellante and you're watching theCUBE. (upbeat music)

Published Date : Mar 31 2020

Dan Woicke, Cerner Corporation | Virtual Vertica BDC 2020

(gentle electronic music) >> Hello, everybody, welcome back to the Virtual Vertica Big Data Conference. My name is Dave Vellante and you're watching theCUBE, the leader in digital coverage. This is the Virtual BDC, as I said, theCUBE has covered every Big Data Conference from the inception, and we're pleased to be a part of this, even though it's challenging times. I'm here with Dan Woicke, the senior director of CernerWorks Engineering. Dan, good to see ya, how are things where you are in the middle of the country? >> Good morning, challenging times, as usual. We're trying to adapt to having the kids at home, out of school, trying to figure out how they're supposed to get on their laptop and do virtual learning. We all have to adapt to it and figure out how to get by. >> Well, it sure would've been my pleasure to meet you face to face in Boston at the Encore Casino, hopefully next year we'll be able to make that happen. But let's talk about Cerner and CernerWorks Engineering, what is that all about? >> So, CernerWorks Engineering, we used to be part of what's called IP, or Intellectual Property, which is basically the organization at Cerner that does all of our software development. But what we did was we made a decision about five years ago to organize my team with CernerWorks which is the hosting side of Cerner. So, about 80% of our clients choose to have their domains hosted within one of the two Kansas City data centers. We have one in Lee's Summit, in south Kansas City, and then we have one on our main campus that's a brand new one in downtown, north Kansas City. About 80, so we have about 27,000 environments that we manage in the Kansas City data centers. So, what my team does is we develop software in order to make it easier for us to monitor, manage, and keep those clients healthy within our data centers. >> Got it. I mean, I think of Cerner as a real advanced health tech company. 
It's the combination of healthcare and technology, the collision of those two. But maybe describe a little bit more about Cerner's business. >> So we have, like I said, 27,000 facilities across the world. Growing each day, thank goodness. And, our goal is to ensure that we reduce errors and we digitize the entire medical records for all of our clients. And we do that by having a consulting practice, we do that by having engineering, and then we do that with my team, which manages those particular clients. And that's how we got introduced to the Vertica side as well, when we introduced them about seven years ago. We were actually able to take a tremendous leap forward in how we manage our clients. And I'd be more than happy to talk deeper about how we do that. >> Yeah, and as we get into it, I want to understand, healthcare is all about outcomes, about patient outcomes and you work back from there. IT, for years, has obviously been a contributor but removed, and somewhat indirect from those outcomes. But, in this day and age, especially in an organization like yours, it really starts with the outcomes. I wonder if you could ratify that and talk about what that means for Cerner. >> Sorry, are you talking about medical outcomes? >> Yeah, outcomes of your business. >> So, there's two different sides to Cerner, right? There's the medical side, the clinical side, which is obviously our main practice, and then there's the side that I manage, which is more of the operational side. Both are very important, but they go hand in hand together. On the operational side, the goal is to ensure that our clinicians are on the system, and they don't know they're on the system, right? Things are progressing, doctors don't want to be on the system, trust me. My job is to ensure they're having the most seamless experience possible while they're on the EMR and have it just be one of their side jobs as opposed to taking their attention away from the patients. That make sense? 
>> Yeah it does, I mean, EMR and meaningful use, around the Affordable Care Act, really dramatically changed the unit. I mean, people had to demonstrate in order to get paid, and so that became sort of an unfunded mandate for folks and you really had to respond to that, didn't you? >> We did, we did that about three to four years ago. And we had to help our clients get through what's called meaningful use, there was different stages of meaningful use. And what we did, is we have the website called the Lights On Network which is free to all of our clients. Once you get onto the website the Lights On Network, you can actually show how you're measured and whether or not you're actually completing the different necessary tasks in order to get those payments for meaningful use. And it also allows you to see what your performance is on your domain, how the clinicians are doing on the system, how many hours they're spending on the system, how many orders they're executing. All of that is completely free and visible to our clients on the Lights On Network. And that's actually backed by some of the Vertica software that we've invested in. >> Yeah, so before we get into that, it sounds like your mission, really, is just great user experiences for the people that are on the network. Full stop. >> We do. So, one of the things that we invented about 10 years ago is called RTMS Timers. They're called Response Time Measurement System. And it started off as a way of us proving that clients are actually using the system, and now it's turned into more of a user outcomes. What we do is we collect 2.5 billion timers per day across all of our clients across the world. And every single one of those records goes to the Vertica platform. And then we've also developed a system on that which allows us in real time to go and see whether or not they're deviating from their normal. 
So we do baselines every hour of the week and then if they're deviating from those baselines, we can immediately call a service center and have them engage the client before they call in. >> So, Dan, I wonder if you could paint a picture. By the way, that's awesome. I wonder if you could paint a picture of your analytics environment. What does it look like? Maybe give us a sense of the scale. >> Okay. So, I've been describing how we operate, our remote hosted clients in the two Kansas City data centers, but all the software that we write, we also help our client hosted agents as well. Not only do we take care of what's going on at the Kansas City data center, but we do write software to ensure that all of our clients are treated the same and we provide the same level of care and performance management across all those clients. So what we do is we have 90,000 agents that we have split across all these clients across the world. And every single hour, we're committing a billion rows to Vertica of operational data. So I talked a little bit about the RTMS timers, but we do things just like everyone else does for CPU, memory, Java Heap Stack. We can tell you how many concurrent users are on the system, I can tell you if there's an application that goes down unexpectedly, like a crash. I can tell you the response time from the network as most of us use Citrix at Cerner. And so what we do is we measure the amount of time it takes from the client side to PCs, it's sitting in the virtual data centers, sorry, in the hospitals, and then round trip to the Citrix servers that are sitting in the Kansas City data center. That's called the RTT, our round trip transactions. And what we've done is, over the last couple of years, what we've done is we've switched from just summarizing CPU and memory and all that high-level stuff, in order to go down to a user level. So, what are you doing, Dr. Smith, today? How many hours are you using the EMR? Have you experienced any slowness? 
Have you experienced any hourglass holding within your application? Have you experienced, unfortunately, maybe a crash? Have you experienced any slowness compared to your normal use case? And that's the step we've taken over the last few years, to go from summarization of high-level CPU memory, over to outcome metrics, which are what is really happening with a particular user. >> So, really granular views of how the system is being used and deep analytics on that. I wonder, go ahead, please. >> And, we weren't able to do that by summarizing things in traditional databases. You have to actually have the individual rows and you can't summarize information, you have to have individual metrics that point to exactly what's going on with a particular clinician. >> So, okay, the MPP architecture, the columnar store, the scalability of Vertica, that's what's key. That was my next question, let me take us back to the days of traditional RDBMS and then you brought in Vertica. Maybe you could give us a sense as to why, what that did for you, the before and after. >> Right. So, I'd been painting a picture going forward here about how traditionally, eight years ago, all we could do was summarize information. If CPU was going to go and jump up 8%, I could alarm the data center and say, hey, listen, CPU looks like it's higher, maybe an application's hanging more than it has been in the past. Things are a little slower, but I wouldn't be able to tell you who's affected. And that's where the whole thing has changed, when we brought Vertica in six years ago is that, we're able to take those 90,000 agents and commit a billion rows per hour operational data, and I can tell you exactly what's going on with each of our clinicians. Because you know, it's important for an entire domain to be healthy. But what about the 10 doctors that are experiencing frustration right now? 
If you're going to summarize that information and roll it up, you'll never know what those 10 doctors are experiencing and then guess what happens? They call the data center and complain, right? The squeaky wheels? We don't want that, we want to be able to show exactly who's experiencing a bad performance right now and be able to reach out to them before they call the help desk. >> So you're able to be proactive there, so you've gone from, Houston, we have a problem, we really can't tell you what it is, go figure it out, to, we see that there's an issue with these docs, or these users, and go figure that out and focus narrowly on where the problem is as opposed to trying to whack-a-mole. >> Exactly. And the other big thing that we've been able to do is correlation. So, we operate two gigantic data centers. And there's things that are shared, switches, network, shared storage, those things are shared. So if there is an issue that goes on with one of those pieces of equipment, it could affect multiple clients. Now that we have every row in Vertica, we have a new program in place called performance abnormality flags. And what we're able to do is provide a website in real time that goes through the entire stack from Citrix to network to database to back-end tier, all the way to the end-user desktop. And so if something was going to be related because we have a network switch going out of the data center or something's backing up slow, you can actually see which clients are on that switch, and, what we did five years ago before this, is we would deploy out five different teams to troubleshoot, right? Because five clients would call in, and they would all have the same problem. So, here you are having disparate teams trying to investigate why the same problem is happening. And now that we have all of the data within Vertica, we're able to show that in a real time fashion, through a very transparent dashboard. >> And so operational metrics throughout the stack, right? 
A game changer. >> It's very compact, right? I just label five different things, the stack from your end-user device all the way through the back-end to your database and all the way back. All that has to work properly, right? Including the network. >> How big is this, what are we talking about? However you measure it, terabytes, clusters. What can you share there? >> Sorry, you mean, the amount of data that we process within our data centers? >> Give us a fun fact. >> Absolute petabytes, yeah, for sure. And in Vertica right now we have two petabytes of data, and I purge it out every year, one year's worth of data within two different clusters. So we have the two different data centers I've been describing, what we've done is we've set Vertica up to be in both data centers, to be highly redundant, and then one of those is configured to do real-time analysis and correlation research, and then the other one is to provide service towards what I described earlier as our Lights On Network, so it's a very dedicated hardened cluster in one of our data centers to allow the Lights On Network to provide the transparency directly to our clients. So we want that one to be pristine, fast, and nobody touch it. As opposed to the other one, where, people are doing real-time, ad hoc queries, which sometimes aren't the best thing in the world. No matter what kind of database or how fast it is, people do bad things in databases and we just don't want that to affect what we show our clients in a transparent fashion. >> Yeah, I mean, for our audience, Vertica has always been aimed at these big, hairy, analytic problems, it's not for a tiny little data mart in a department, it's really the big scale problems. I wonder if I could ask you, so you guys, obviously, healthcare, with HIPAA and privacy, are you doing anything in the cloud, or is it all on-prem today? >> So, in the operational space that I manage, it's all on-premises, and that is changing. 
As I was describing earlier, we have an initiative to go to AWS and provide levels of service to countries like Sweden which does not want any operational data to leave that country's walls, whether it be operational data or whether it be PHI. And so, we have to be able to adapt into Vertica Eon Mode in order to provide the same services within Sweden. So obviously, Cerner's not going to go up and build a data center in every single country that requires us, so we're going to leverage our partnership with AWS to make this happen. >> Okay, so, I was going to ask you, so you're not running Eon Mode today, it's something that you're obviously interested in. AWS will allow you to keep the data locally in that region. In talking to a lot of practitioners, they're intrigued by this notion of being able to scale independently, storage from compute. They've said they wished that's a much more efficient way, I don't have to buy in chunks, if I'm out of storage, I don't have to buy compute, and vice-versa. So, maybe you could share with us what you're thinking, I know it's early days, but what's the logic behind the business case there? >> I think you're 100% correct in your assessment of taking compute away from storage. And, we do exactly what you say, we buy a server. And it has so much compute on it, and so much storage. And obviously, it's not scaled properly, right? Either storage runs out first or compute runs out first, but you're still paying big bucks for the entire server itself. So that's exactly why we're doing the POC right now for Eon Mode. And I sit on Vertica's TAB, the advisory board, and they've been doing a really good job of taking our requirements and listening to us, as to what we need. And that was probably number one or two on everybody's lists, was to separate storage from compute. And that's exactly what we're trying to do right now. >> Yeah, it's interesting, I've talked to some other customers that are on the customer advisory board. 
And Vertica is one of these companies that are pretty transparent about what goes on there. And I think that for the early adopters of Eon Mode there were some challenges with getting data into the new system, I know Vertica has been working on that very hard but you guys push Vertica pretty hard and from what I can tell, they listen. Your thoughts. >> They do listen, they do a great job. And even though the Big Data Conference is canceled, they're committed to having us go virtually to the CAB meeting on Monday, so I'm looking forward to that. They do listen to our requirements and they've been very very responsive. >> Nice. So, I wonder if you could give us some final thoughts as to where you want to take this thing. If you look down the road a year or two, what does success look like, Dan? >> That's a good question. Success means that we're a little bit more nimble as far as the different regions across the world that we can provide our services to. I want to do more correlation. I want to gather more information about what users are actually experiencing. I want to be able to have our phone never ring in our data center, I know that's a grand thought there. But I want to be able to look forward to measuring the data internally and reaching out to our clients when they have issues and then doing the proper correlation so that I can understand how things are intertwining if multiple clients are having an issue. That's the goal going forward. >> Well, in these trying times, during this crisis, it's critical that your operations are running smoothly. The last thing that organizations need right now, especially in healthcare, is disruption. So thank you for all the hard work that you and your teams are doing. I wish you and your family all the best. Stay safe, stay healthy, and thanks so much for coming on theCUBE. >> I really appreciate it, thanks for the opportunity. 
>> You're very welcome, and thank you, everybody, for watching, keep it right there, we'll be back with our next guest. This is Dave Vellante for theCUBE. Covering Virtual Vertica Big Data Conference. We'll be right back. (upbeat electronic music)
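
As a rough illustration of the correlation Dan describes in the interview (keeping every row so that slow clients can be grouped by the shared equipment they sit behind), here is a minimal Python sketch. The event rows, client names, switch mapping, and the 5-second threshold are all hypothetical; nothing here reflects Cerner's actual schema or any Vertica API.

```python
from collections import defaultdict

# Hypothetical row-level events: (client, user, response_seconds).
EVENTS = [
    ("client_a", "dr_1", 9.5),
    ("client_a", "dr_2", 1.2),
    ("client_b", "dr_3", 8.7),
    ("client_c", "dr_4", 1.0),
    ("client_d", "dr_5", 7.9),
]

# Hypothetical mapping of each client to the shared switch it sits behind.
CLIENT_SWITCH = {
    "client_a": "switch_1",
    "client_b": "switch_1",
    "client_c": "switch_2",
    "client_d": "switch_2",
}

SLOW_SECONDS = 5.0  # assumed threshold for a "slow" interaction


def slow_clients_by_switch(events, client_switch, threshold):
    """Group clients with at least one slow row by their shared switch."""
    affected = defaultdict(set)
    for client, _user, seconds in events:
        if seconds > threshold:
            affected[client_switch[client]].add(client)
    return affected


def shared_incidents(affected, min_clients=2):
    """Keep switches where several clients are slow at once (one likely root cause)."""
    return {
        switch: sorted(clients)
        for switch, clients in affected.items()
        if len(clients) >= min_clients
    }


affected = slow_clients_by_switch(EVENTS, CLIENT_SWITCH, SLOW_SECONDS)
incidents = shared_incidents(affected)
print(incidents)
```

With the rows kept individually, two slow clients behind the same switch surface as one incident rather than two separate help desk calls; a rolled-up average per data center would not show which clients share the problem.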

Published Date : Mar 31 2020

Search Results for Sizing and Configuring Vertica in Eon Mode for Different Use Cases: