Subbu Iyer, Aerospike | AWS re:Invent 2022

>>Hey everyone, welcome to theCUBE's coverage of AWS re:Invent 2022. Lisa Martin here with you with Subbu Iyer, one of our alumni who's now the CEO of Aerospike. Subbu, great to have you on the program. Thank you for joining us. >>Great as always to be on theCUBE, Lisa. Good to meet you. >>So, you know, every company these days has got to be a data company, whether it's a retailer, a manufacturer, a grocer, an automotive company. But for a lot of companies, data is underutilized, yet it's a huge asset that is value added. Why do you think companies are struggling so much to make data a value-added asset? >>Well, you know, we see this across the board when I talk to customers and prospects. There's a desire from the business, and from IT actually, to leverage data to really fuel newer applications, newer services, newer business lines, if you will, for companies. I think the struggle is, one, the plethora of data that is created. You know, surveys say that over the next three years, by 2025, around 175 zettabytes of data is gonna be created. And that's really a growth of north of 30% year over year. But the more important and interesting thing is that the real-time component of that data is actually growing at, you know, 35% CAGR. And what enterprises desire is decisions that are made in real time or near real time. >>And a lot of the challenges that exist today come down to this: one, the infrastructure that enterprises have in place was never built to actually manipulate data in real time. The second is really the ability to put something in place which can handle spikes yet be cost efficient, if you will. You can build for peak loads, but then it's very expensive to operate that particular service at normal loads. So how do you build something which actually works for you in both cases, so to speak?
And the last point that we see out there is, even if you're able to, you know, bring in all that data, you don't have the processing capability to run through that data. So as a result, most enterprises struggle with capturing the data, making decisions from it in real time, and really operating it at the cost point that they need to operate it at. >>You know, you bring up a great point with respect to real-time data access. And I think one of the things that we've learned the last couple of years is that access to real-time data is not a nice-to-have anymore. It's business critical for organizations in any industry. Talk about that as one of the challenges that organizations are facing. >>Yeah. When we started Aerospike, the company started with the premise that, number one, data is gonna grow exponentially. Two, when applications open up to the internet, there's gonna be a flood of users and demands on those applications. And that was true primarily, when we started the company, in the ad tech vertical. So ad tech was the first vertical where there was a lot of data both on the supply side and the demand side, from an inventory of ads that were available. And on the other hand, they had microseconds or milliseconds in which they could make a decision on which ad to put in front of you and me so that we would click or engage with that particular ad. But over the last three to five years, what we've seen is, as digitization has actually permeated every industry out there, the need to harness data in real time is pretty much present in every industry. >>Whether that's retail, whether that's financial services, telecommunications, e-commerce, gaming and entertainment, every industry has a desire. One, the innovative companies, the small companies rather, are innovating at a pace and standing up new businesses to compete with the larger companies in each of these verticals.
And the larger companies don't wanna be left behind. So they're standing up their own competing services or getting into new lines of business that really harness and are driven by real-time data. So there are these competing pressures. One, customer experience is paramount, and we as customers expect answers in an instant, in real time. And on the other hand, the way they make decisions is based on a large data set because, you know, larger data sets actually propel better decisions. So there are competing pressures here which essentially drive the need, one from a business perspective, two from a customer perspective, to harness all of this data in real time. So that's what's driving an incessant need to actually make decisions in real or near real time. >>You know, I think one of the things that's been in short supply over the last couple of years is patience. We do expect as consumers, whether we're in our business lives or our personal lives, that we're going to be given information and data that's relevant, that's personal, to help us make those real-time decisions. So having access to real-time data is really business critical for organizations across any industry. Talk about some of the main capabilities that modern data applications and data platforms need to have. What are some of the key capabilities of a modern data platform that need to be delivered to meet demanding customer expectations? >>So, you know, going back to your initial question, Lisa, around why data is really a high-value but underutilized or underleveraged asset. One of the reasons we see is that a lot of the data platforms that some of these applications were built on have been around for a decade plus, and they were never built for the needs of today, which is really driving a lot of data and driving insight in real time from a lot of data. So there are four major capabilities that we see that are essential ingredients of any modern data platform.
One is really the ability to, you know, operate at unlimited scale. So what we mean by that is really the ability to scale from gigabytes to even petabytes without any degradation in performance or latency or throughput. The second is really, you know, predictable performance. So can you actually deliver predictable performance as your data size grows, or your throughput grows, or the concurrent users on that application or service grow? >>It's really easy to build an application that operates at low scale or low throughput or low concurrency, but performance usually starts degrading as you start scaling one of these attributes. The third thing is the ability to operate an always-on, globally resilient application. And that requires a really robust data platform that can be up on a five-nines basis globally and can support global distribution, because a lot of these applications have global users. And the last point goes back to my first answer, which is, can you operate all of this at a cost point which is not prohibitive, but makes sense from a TCO perspective? 'Cause a lot of times what we see is people make choices of data platforms and, ironically, as their service or application becomes more successful and more users join their journey, the revenue starts going up, the user base starts going up, but the cost basis starts crossing over the revenue, and they're losing money on the service, ironically, as the service becomes more popular. So really: unlimited scale, predictable performance, always on on a globally resilient basis, and low TCO. These are the four essential capabilities of any modern data platform. >>So then talk to me, with those as the four main core functionalities of a modern data platform. How does Aerospike deliver that? >>So we were built, as I said, from day one to operate at unlimited scale and deliver predictable performance.
And then over the years, as we worked with customers, we built this incredible high-availability capability which helps us deliver the always-on, you know, operations. So we have customers who have been on the platform 10 years with no downtime, for example, right? So we are talking about an amazing continuum of high availability that we provide for customers who operate these, you know, globally resilient services. The key to our innovation here is what we call the hybrid memory architecture. So, you know, going a little bit technically deep here, essentially what we built out in our architecture is the ability on each node or each server to treat a bank of SSDs, or solid-state devices, as essentially extended memory. So you're getting memory performance, but you're accessing these SSDs. You're not paying memory prices, but you're getting memory performance. As a result of that, >>you can attach a lot more data to each node or each server in your distributed cluster. And when you scale that across a distributed cluster, you can do the same things with Aerospike at 60 to 80% lower server count and, as a result, 60 to 80% lower TCO compared to some of the other options that are available in the market. Then, as I said, that's the key starting point of the innovation. We layer around it capabilities like, you know, replication, change data notification, synchronous and asynchronous replication, and the ability to actually stretch a single cluster across multiple regions. So for example, if you're operating a global service, you can have a single Aerospike cluster with one node in San Francisco, one node in New York, and another one in London. And this would be seamlessly operating, so that, you know, this is strongly consistent. >>Very few NoSQL data platforms are strongly consistent, or if they are strongly consistent, they will actually suffer performance degradation.
And what strongly consistent means is, you know, all your data is always available, it's guaranteed to be available, and no data is lost at any time. So in this configuration that I talked about, if the node in London goes down, your application still continues to operate, right? Your users see no downtime, and when London comes up, it rejoins the cluster and everything is back to the way it was before London left the cluster, so to speak. So the ability to do this globally resilient, highly available kind of model is really, really powerful. A lot of our customers actually use that kind of a scenario, and we offer other deployment scenarios from a high-availability perspective. So everything starts with HMA, or hybrid memory architecture, and then we start building out a lot of these other capabilities around the platform. >>And then over the years, what our customers have guided us to do is, as they're putting together a modern data infrastructure, we don't live in a silo. So Aerospike gets deployed with other technologies like streaming technologies or analytics technologies. So we built connectors into Kafka and Pulsar, so that as you're ingesting data from a variety of data sources, you can ingest them at very high ingest speeds and store them persistently in Aerospike. Once the data is in Aerospike, you can actually run Spark jobs across that data in a multithreaded, parallel fashion to get real insight from that data at really high throughput and high speed. >>High throughput and high speed are incredibly important, especially as today's landscape is increasingly distributed: data centers, multiple public clouds, edge, IoT devices, the workforce embracing more and more hybrid these days. How are you helping customers to extract more value from data while also lowering costs? Go into some customer examples, 'cause I know you have some great ones.
>>Yeah, you know, I think we have built an amazing set of customers, and customers actually use us for some really mission-critical applications. So, you know, before I get into specific customer examples, let me talk to you about some of the use cases which we see out there. We see a lot of Aerospike being used in fraud detection. We see us being used in recommendation engines. We get used in customer data profiles, or customer profiles, customer 360 stores, you know, multiplayer gaming and entertainment. These are kind of the repeated use cases. Digital payments: we power most of the digital payment systems across the globe. From a specific example perspective, the first one I would love to talk about is PayPal. So if you use PayPal today, then when you're actually paying somebody, your transaction is, you know, being sent through Aerospike to really decide whether this is a fraudulent transaction or not. >>And when you do that, you know, you and I as customers are not gonna wait around for 10 seconds for PayPal to say yay or nay. We expect, you know, the decision to be made in an instant. So we are powering that fraud detection engine at PayPal for every transaction that goes through PayPal. Before us, you know, PayPal was missing out on about 2% of their SLAs, which was essentially millions of dollars which they were losing because, you know, they were letting transactions go through and taking the risk that it's not a fraudulent transaction. With Aerospike, they can now actually get a much better SLA, and the data set on which they compute the fraud score has gone up by, you know, several factors, by 30x if you will. So not only has the data size that is powering the fraud engine actually grown 30x with Aerospike, >>Yeah. >>but they're actually making decisions in an instant for, you know, 99.95% of their transactions. So that's, >>And that's what we expect as consumers, right?
We want to know that there's fraud detection on the swipe, regardless of who we're interacting with. >>Yes. And so that's a really powerful use case and, you know, it's a great customer success story. The other one I would talk about is really Wayfair, right? From retail and, you know, from e-commerce. So everybody knows Wayfair, a global leader in, you know, online home furnishings, and they use us to power their recommendations engine. And, you know, it's basically, if you're purchasing this, people who bought this also bought these five other things, so on and so forth. They have actually seen the cart size at checkout go up by up to 30% as a result of powering their recommendations engine through Aerospike. And they were able to do this while reducing the server count by 9x. So on one ninth of the servers that were there before Aerospike, they're now powering their recommendation engine and seeing cart size at checkout go up by 30%. Really, really powerful in terms of the business outcome and what we are able to, you know, drive at Wayfair. >>Hugely powerful as a business outcome. And that's also what the consumer wants. The consumer is expecting these days to have a very personalized, relevant experience: if I bought this, show me something else that's related to that. We have this expectation that needs to be really fueled by technology. >>Exactly. And you know, another great example, since you asked about customer stories: Adobe. Who doesn't know Adobe? You know, they're on a mission to deliver the best customer experience that they can, and they're talking about, you know, a great customer 360 experience at scale, and they're modernizing their entire edge compute infrastructure to support this with Aerospike. Going to Aerospike, basically what they have seen is their throughput go up by 70%, and their cost has been reduced by 3x.
So essentially they're doing it at one third of the cost while their annual data growth continues at, you know, north of 30%. So not only is their data growing, they're able to actually reduce their cost of delivering this great customer experience to one third and continue to deliver a great customer 360 experience at scale. A really, really powerful example of how you deliver customer 360 in a world which is dynamic and, you know, on a dataset which is constantly growing at north of 30% in this case. >>Those are three great examples, PayPal, Wayfair, Adobe, especially with Wayfair when you talk about increasing their cart checkout sizes, but also with Adobe increasing throughput by over 70%, I'm looking at my notes here, while data is growing at 32%. That's something that every organization has to contend with: data growth is continuing to scale and scale and scale. >>Yep. I'll give you a fun one here. So, you know, you may not have heard about this company. It's called Dream11 and it's a company based out of India, but it's a fun story because it's the world's largest fantasy sports platform and, you know, India is a nation which is cricket crazy. So when they have their premier league going on, you know, there's millions of users logged onto the Dream11 platform building their fantasy league teams and, you know, playing on that particular platform. It has a hundred million plus users on the platform, 5.5 million concurrent users, and they have been growing at 30%. So they are considered an amazing success story in terms of what they have accomplished and the way they have architected their platform to operate at scale.
And all of that is really powered by Aerospike. Think about that: they are able to deliver all of this and support a hundred million users, 5.5 million concurrent users, all with, you know, 99 plus percent of their transactions completing in less than one millisecond. Just an incredible success story. Not a brand that is, you know, world renowned, but at least from what we see out there, it's an amazing success story of operating at scale. >>Amazing success story, huge business outcomes. Last question for you, as we're almost out of time: talk a little bit about Aerospike and AWS, the partnership, Graviton2, better together. What are you guys doing together there? >>Great partnership. AWS has multiple layers in terms of partnerships. So you know, we engage with AWS at the executive level. They plan out the rollout of new instances in partnership with us, making sure that, you know, those instance types work well for us. And then we just released support for Aerospike on the Graviton platform, and we just announced a benchmark of Aerospike running on Graviton on AWS. And what we see out there with the benchmark is a 1.6x improvement in price performance and, you know, about an 18% increase in throughput while maintaining a 27% reduction in cost on Graviton. So this is an amazing story from a price-performance perspective, and from a performance-per-watt perspective for greater energy efficiency, which a lot of our customers are starting to talk to us about leveraging to further meet their sustainability targets. So a great story from Aerospike and AWS, not just from a partnership perspective on a technology and an executive level, but also in terms of what joint outcomes we are able to deliver for our customers. >>And it sounds like a great sustainability story.
I wish we had more time so we could talk about this, but thank you so much for talking about the main capabilities of a modern data platform, what's needed, why, and how you guys are delivering that. We appreciate your insights and appreciate your time. >>Thank you very much. I mean, if folks are at re:Invent next week or this week, come on and see us at our booth. We are in the data analytics pavilion. You can find us pretty easily. Would love to talk to you. >>Perfect. We'll send them there. Subbu, thank you so much for joining me on the program today. We appreciate your insights. >>Thank you, Lisa. >>I'm Lisa Martin. You're watching theCUBE's coverage of AWS re:Invent 2022. Thanks for watching.
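
A quick sanity check on the Graviton figures quoted in the interview: if throughput rises about 18% while cost falls about 27%, price performance (throughput per unit of cost) should improve by roughly 1.18 / 0.73. The sketch below assumes that simple definition of price performance and a baseline normalized to 1.0. The benchmark's own methodology is not described in the interview, and the function name is purely illustrative.

```python
def price_performance_gain(throughput_gain: float, cost_reduction: float) -> float:
    """Return the new-to-old ratio of throughput per unit of cost.

    Assumption (not from the interview): price performance is defined as
    throughput divided by cost, with the baseline normalized to 1.0.
    Both arguments are fractions, e.g. 0.18 for an 18% increase.
    """
    new_throughput = 1.0 + throughput_gain
    new_cost = 1.0 - cost_reduction
    if new_cost <= 0:
        raise ValueError("cost_reduction must be less than 1.0")
    return new_throughput / new_cost


# Figures as quoted in the interview: ~18% more throughput, ~27% lower cost.
gain = price_performance_gain(throughput_gain=0.18, cost_reduction=0.27)
print(f"price-performance improvement: about {gain:.1f}x")
```

Under these assumptions the result lands at about 1.6x, which is consistent with the price-performance improvement cited alongside the other two numbers.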

Published Date : Dec 7 2022

ML & AI Keynote Analysis | AWS re:Invent 2022

>>Hey, welcome back everyone. Day three of AWS re:Invent 2022. I'm John Furrier with Dave Vellante, co-host of theCUBE. Dave, 10 years for us; "the leader in high tech coverage" is our slogan. Now 10 years of re:Invent. We've been to every single one except the original, which we would've come to if Amazon actually marketed the event, but they didn't. It's more of a customer event. This is day three. It's the machine learning and AI keynote. Swami's up there. A lot of announcements. We're gonna break this down. We've got Andy Thurai here, vice president and principal analyst, Constellation Research. Andy, great to see you. You've been on theCUBE before, one of our analysts bringing the analysis and commentary to the keynote. This is your wheelhouse, AI. What do you think about Swami up there? I mean, he's awesome. We love him. Big fan of theCUBE, and we're fans of him, but he's got 13 announcements. >>A lot. A lot, >>A lot. >>So, well, first of all, thanks for having me here, and I'm glad to have both of you on the same show attacking me. I'm just kidding. But some of the announcements are really sort of game-changer announcements and some of them are like, meh, you know, just plugging the holes in what they have, and a lot of golf claps. Yeah. And you could have also noticed that when he was making the announcements, you know, from the clapping volume difference you could say which is better, right? But some of the announcements are really, really good. You know, particularly we talked about one of them: Microsoft got out in front of that, you know, having OpenAI in there, doing the large language models. And then they were going after that, you know, having the transformers available to them. And Amazon was a little bit weak in that area. They don't have a large language model.
So, you know, they are taking a different route, saying that, you know what, I'll help you train the large language model by yourself, customized models. So I can provide the necessary instances. I can provide the instances, volume, memory, the whole thing. Yeah. So you can train the model by yourself without depending on them, kind >>of thing. So Dave and Andy, I wanna get your thoughts, 'cause first of all, we've been following Amazon's deep bench on the infrastructure and PaaS. They've been doing a lot of machine learning and AI, a lot of data. It just seems that the sentiment is that there are other competitors doing a good job too. Like Google, Dave. And I've heard folks in the hallway, even here, ex-Amazonians, saying, hey, they train their models on Google, then they bring them up on SageMaker 'cause it's a better interface. So you've got Google making a play for being that data cloud. Microsoft's obviously putting in a great kind of package to make it turnkey. How do they really stand versus the competition, guys? >>Good question. So they, you know, each have their own uniqueness and their own variation that they take to the field, right? So for example, if you were to look at it, Microsoft is known for the industry-related things that they have been going after, you know, industry verticals and whatnot. So that's one of the things I looked at here. You know, they had this Omics announcement, particularly geared toward the healthcare genomics space. That's a huge space for HPC-related AI/ML applications. And they have put a lot of things together here in SageMaker and in their models, saying that, you know, how do you use this to do things like that? Like for example, drug discovery, genomics analysis, cancer treatment, the whole thing, right? That's huge volumes of data. So they're going into that healthcare area. Google has taken a different route. I mean, they want to make everything simple.
All I have to do is call an API, give it what I need, and then get it done. But Amazon wants to go to a much deeper level, saying that, you know what? I wanna provide everything you need. You can customize the whole thing for what you need. >>So to me, the big picture here is, and Swami references this, hey, we are a data company. He talked about books and how that informed them as to, you know, what books to place front and center. Here's the big picture. In my view, companies need to put data at the core of their business and they haven't. They've generally put humans at the core of their business, and data, and now machine learning, are at the outside and the periphery. Amazon, Google, Microsoft, Facebook have put data at their core. So the question is how do incumbent companies, and you mentioned some, Toyota, Capital One, Bristol Myers Squibb, I don't know, are those data companies? You know, we'll see, but the challenge is most companies don't have the resources, as you well know, Andy, to actually implement what Google and Facebook and others have. >>So how are they gonna do that? Well, they're gonna buy it, right? So are they gonna build it with tools, that's kind of, like you said, the Amazon approach, or are they gonna buy it from Microsoft and Google? I pulled some ETR data to say, okay, who are the top companies that are showing up in terms of spending? Who's spending with whom? AWS number one, Microsoft number two, Google number three, Databricks number four, just in terms of, you know, presence. And then it falls down: DataRobot, Anaconda, Dataiku. Oracle popped up actually, 'cause they're embedding a lot of AI into their products, and of course IBM, and then a lot of smaller companies. But do companies generally, customers, have the resources to do what it takes to implement AI into applications and into workflows? >>So a couple of things on that.
One is, when it comes to, I mean, it's no surprise that the top three are the hyperscalers, because they all want customers to bring their business to them to run the specific workloads. The next biggest workloads, as he was saying in his keynote, are two things. One is the AI/ML workloads and the other one is the heavy unstructured workloads that he was talking about. 80%, 90% of the data that's coming off is unstructured. So how do you analyze that? Such as the geospatial data he was talking about. The volumes of data you need to analyze, the deep neural nets you have to use, only a hyperscaler can do it, right? So it's no wonder all of them are on top. For the data, one of the things they announced, which not many people paid attention to, was the zero-ETL that they talked about. >>What that does is a little bit of a game-changing moment, in the sense that, for example, if you were to train on the data and the data is distributed everywhere, and you have to bring it all together to integrate it, it's a lot of work to do the ETL. So by taking Amazon Aurora and Redshift and combining them as zero or no ETL, and then having Apache Spark applications run on top of it, analytical applications, ML workloads, that's huge. So you don't have to move the data around; use the data where it is. >>I think you said it, they're basically filling holes, right? Yeah. They created this, you know, suite of tools, let's call it. You might say it's a mess. It's not a mess, because they're really powerful, but they're not well integrated, and now they're starting to take out the seams, as I say. >>Well yeah, it's a great point. And I would double down and say, look, I think that boring is good. You know, we had that phase in the Kubernetes hype cycle where it got boring and that was kind of like, boring is good. Boring means we're getting better, we're invisible.
That's infrastructure that's in the weeds, that's the in-between-the-toes details. It's the stuff that, you know, people have to get done. So, you know, you look at their 40 new data sources with Data Wrangler, 50 new AppFlow connectors, Redshift auto-copy. This is boring, good, important shit, Dave. The governance, you gotta get it, and the governance is gonna be key. So to me, this may not jump off the page. Adam's keynote also felt a little bit like, we gotta get these gaps done in a good way. So I think that's a very positive sign. >>Now going back to the bigger picture, I think the real question is, can there be another independent data cloud? And that's, to me, what I tried to get at in my story, and your Breaking Analysis kind of hit a home run on this: there's an interesting opportunity for an independent data cloud. Meaning something that isn't AWS, that isn't Google, that isn't one of the big three, that could sit in. And so let me give you an example. I had a conversation last night with a bunch of ex-Amazonian engineering teams that left. The conversation was interesting, Dave. They were talking like, well, Databricks and Snowflake are basically batch, okay, not transactional. And you look at Aerospike, I can see their booth here. Transactional databases are hot right now. Streaming data is different. Confluent's different than Databricks. Is Databricks good at hosting? >>No, Amazon's better. So you start to see these kinds of questions come up where, you know, Databricks is great, but maybe not good for this, that and the other thing. So you start to see the formation of swim lanes, or visibility into where people might sit in the ecosystem, but what came out was transactional, yep, and batch, the relationship there, and streaming real time versus, you know, the transactional data. So you're starting to see these new things emerge. Andy, what's your take on this? You're following this closely.
This seems to be the alpha nerd conversation, and it all points to who's gonna have the best data cloud, or data superclouds, as I call it. What's your take? >>Yes, the data cloud is important as well. But also the computation that goes on top of it too, right? Because when the data is unstructured data, and it's that huge an amount of data, it's going to be hard to do that with low, you know, compute power. But going back to your data point, the training of the AI/ML models requires the batch data, right? That's when you need all the historical data to train your models. And then after that, when you do inference, that's where you need the streaming real-time data available to you so you can make an inference. One of the things they also announced, which is somewhat interesting, is, you saw that they have like 700 different instances geared towards every single workload. And some of them run very specifically on Amazon's new chips, the Inferentia2 and the Trainium Trn1 chips, so they not only have specific instances but also run on a high-powered chip. And then if you have the data to support that, both for the training as well as for the inference, the efficiency, again, those numbers have to be proven. They claim that it could be anywhere between 40 to 60% faster. >>Well, so a couple things. You're definitely right. I mean, Snowflake started out as a data warehouse that was simpler, and it wasn't architected, you know, in its first wave, to do real-time inference; how could they? The second point is Snowflake's two or three years ahead when it comes to governance and data sharing. I mean, Amazon's doing what it always does. It's copying, you know, it's customer driven. 'Cause they probably walk into an account and the customer says, hey look, here's what Snowflake's doing for us. This stuff's kicking ass. And they go, oh, that's a good idea, let's do that too.
You saw that with separating compute from storage, which is their tiering. You saw it today with extending data sharing, Redshift data sharing. So how do Snowflake and Databricks approach this? They deal with ecosystem. They bring in ecosystem partners, they bring in open source tooling, and that's how they compete. I think there's unquestionably an opportunity for a data cloud. >>Yeah, I think the supercloud conversation, and then, you know, sky cloud with the Berkeley paper, and other folks talking about this kind of pre-multi-cloud era, I mean, that's what I would call us right now. We're kind of in the pre-era of multi-cloud, which by the way is not even defined yet. I think people use that term, Dave, to say, you know, some sort of magical thing that's happening. Yeah, people have multiple clouds. They end up there by default, not by design, as Dell likes to say, right? And they gotta deal with it. So it's more that they're inheriting multiple cloud environments. It's not necessarily what they want in the situation. So to me that is a big, big issue. >>Yeah, I mean, again, going back to your Snowflake and Databricks announcements, they're data companies. So that's how they made their mark in the market, saying that, you know, I do all those things, therefore I have to have your data, because it's seamless data. And Amazon is catching up with that with a lot of the announcements they made. How far it's gonna get traction, you know, is hard to say. >>Yeah, I mean, to me there's no doubt about it, Dave. I think what Swami is doing, if Amazon can corner the market on out-of-the-box ML and AI capabilities so that people have it easier, that's gonna be, at the end of the day, the telltale sign: can they fill in the gaps? Again, boring is good. Competition, I don't know; I mean, I'm not following the competition. Andy, this is a real question mark for me. I don't know where they stand.
Are they more comprehensive? Are they deeper? Do they have deeper services? I mean, obviously it shows in all the different, you know, capabilities. Where does Amazon stand? What's the process? >>So, particularly when it comes to the models, they're going at it from a different angle: you know, I will help you create the models. We talked about the zero-ETL and the whole data piece. We'll get the data sources in, we'll create the model, we'll move the whole model. We are talking about the MLOps teams here, right? And they have the whole functionality that they built in over the years. So essentially they want to become the platform where, when you come in, I'm the only platform you would use, from model training to deployment to inference to model versioning to management, the whole thing, and that's the angle they're trying to take. So it's a one-source platform. >>What about this idea of technical debt? Adrian Cockcroft was on yesterday. John, I know you talked to him as well. He said, look, Amazon's Legos. You wanna buy a toy for Christmas, you can go out and buy a toy, or do you wanna build one? If you buy a toy, in a couple years it could break, and what are you gonna do? You're gonna throw it out. But if part of your Lego needs to be extended, you extend it. So, you know, George Gilbert was saying, well, there's a lot of technical debt. Adrian was countering that. Does Amazon have technical debt, or is that Lego blocks analogy the right one? >>Well, I talked to him about the debt, and one of the things we talked about was, what do you optimize for, EC2 APIs or Kubernetes APIs? It depends on what team you're on. If you're on the runtime team, you're gonna optimize for Kubernetes, but EC2 is the resources you want to use. So the idea of 15 years of technical debt, I don't believe that. I think the APIs are still hardened.
The issue that he brings up that I think is relevant is that it's an 'and' situation, not an 'or'. You can have the bag of Legos, which is the primitives, and build a durable application platform: monitor it, customize it, work with it, build it. It's harder, but the outcome is durability and sustainability. Building a toy, or having a toy with those Legos glued together for you, you get to play with it, but it'll break over time. Then you gotta replace it. So there's gonna be a toy business and there's gonna be a Legos business. Make your own. >>So who are the toys in AI? >>Well, out of >>The box, and who's Legos? >>So you're asking about what toys Amazon is building? >>Or, yeah, I mean, Amazon clearly is Lego blocks. >>If people are gonna have out of the box, >>What about Google? What about Microsoft? Are they basically more building toys, more solutions? >>So Google is more of, you know, a building-solutions angle, like, you know, I give you an API kind of thing. But when it comes to vertical industry solutions, Microsoft is ahead, right? Because they have had years of industry experience. I mean, there are other, smaller clouds trying to do that too, IBM being an example, but, you know, now they are starting to go after the specific industry use cases. They think that through. For example, you know, the medical one we talked about, right? So they want to build the health lake, the security health lake that they're trying to build, which will be HIPAA compliant and will cover all the European regulations, the whole nine yards, and it'll help you, you know, personalize things as you need as well. For example, you know, if you go for a certain treatment, it could analyze you based on your genome profile, saying that, you know, the treatment for this particular person has to be individualized this way. But doing that requires enormous power, right?
So if you do applications like that, you could bring in a lot of them, whether healthcare, finance, or what have you, and then make it easy for them to use. >>What's the biggest mistake customers make when it comes to machine intelligence, AI, machine learning? >>So many things, right? I could start out with even the model. Basically, when you build a model, you should be able to figure out how long that model is effective. Because as good as you are at creating a model and going to the business and doing things the right way, there are people who leave the model in place much longer than needed. It's hurting your business more than it's helping. You know, it could be things like that. Or you are not building it responsibly, or related things. You have a bias in your model. There are so many issues. I don't know if I can pinpoint one, but there are many, many issues. Responsible AI, ethical AI. >>All right, well, we'll leave it there. You're watching theCUBE, the leader in high tech coverage, here at day three at re:Invent. I'm John Furrier, with Dave Vellante and Andy joining us here for the critical analysis and breaking down the commentary. We'll be right back with more coverage after this short break.

Published Date : Nov 30 2022

SUMMARY :

Ai. What do you think about Swami up there? A lot. of, you know, having the open AI in there, doing the large language models. So you got, Google's making a play for being that data cloud. So they, you know, each have their own uniqueness and the we variation that take it to have the resources as you well know, Andy, to actually implement what Google and they gonna build it with tools that's kind of like you said the Amazon approach or are they gonna buy it from Microsoft the neural deep neural net drug you ought to use, only hyperscale can do it, right? So you don't have to move around the data, use the data where it is, They created this, you know, It's the stuff that, you know, people we have to get done. And so let me give you an example. So you start to see these kinds of questions come up where, you know, it's going to be hard to do that with a low model, you know, compute power. was simpler and it's not architected, you know, in and it's first wave to do real time inference, I think people use that term, Dave, to say, you know, some sort of magical thing that's happening. you know, I do all those things, therefore you have, I had to have your data because it's a seamless data. the different, you know, capabilities. at a different angle that, you know, I will help you create the models we talked about the zero and you know, George Gilbert was saying, well, there's a lot of technical debt. Well, I talked to him about the debt and one of the things we talked about was what do you optimize for E two APIs or Kubernetes So Google is more of, you know, building solutions angle like, you know, I give you an API kind of thing. you know, it could be things like that. We'll be right back with more coverage after this short break.

ENTITIES

Entity	Category	Confidence
Jeff	PERSON	0.99+
George Gilbert	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Adrian	PERSON	0.99+
Dave	PERSON	0.99+
Andy	PERSON	0.99+
Google	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
Adrian Carro	PERSON	0.99+
Dave Volante	PERSON	0.99+
Andy Thra	PERSON	0.99+
90%	QUANTITY	0.99+
15 years	QUANTITY	0.99+
John	PERSON	0.99+
Adam	PERSON	0.99+
13 announcements	QUANTITY	0.99+
Lego	ORGANIZATION	0.99+
John Farmer	PERSON	0.99+
Dave Ante	PERSON	0.99+
two	QUANTITY	0.99+
10 years	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Dell	ORGANIZATION	0.99+
Legos	ORGANIZATION	0.99+
Bristol Myers Squibb	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Constellation Research	ORGANIZATION	0.99+
One	QUANTITY	0.99+
Christmas	EVENT	0.99+
second point	QUANTITY	0.99+
yesterday	DATE	0.99+
Anaconda	ORGANIZATION	0.99+
today	DATE	0.99+
Berkeley Paper	ORGANIZATION	0.99+
one	QUANTITY	0.99+
eight	QUANTITY	0.98+
700 different instances	QUANTITY	0.98+
three years	QUANTITY	0.98+
Swami	PERSON	0.98+
Aerospike	ORGANIZATION	0.98+
both	QUANTITY	0.98+
Snowflake	ORGANIZATION	0.98+
two things	QUANTITY	0.98+
60%	QUANTITY	0.98+

Subbu Iyer

>> And it'll be the fastest 15 minutes of your day from there. >> In three- >> We go Lisa. >> Wait. >> Yes >> Wait, wait, wait. I'm sorry I didn't pin the right speed. >> Yap, no, no rush. >> There we go. >> The beauty of not being live. >> I think, in the background. >> Fantastic, you all ready to go there, Lisa? >> Yeah. >> We are speeding around the horn and we are coming to you in five, four, three, two. >> Hey everyone, welcome to theCUBE's coverage of AWS re:Invent 2022. Lisa Martin here with you with Subbu Iyer, one of our alumni who's now the CEO of Aerospike. Subbu, great to have you on the program. Thank you for joining us. >> Great as always to be on theCUBE. Lisa, good to meet you. >> So, you know, every company these days has got to be a data company, whether it's a retailer, a manufacturer, a grocer, an automotive company. But for a lot of companies, data is underutilized, yet a huge asset that is value added. Why do you think companies are struggling so much to make data a value-added asset? >> Well, you know, we see this across the board. When I talk to customers and prospects, there is a desire from the business, and from IT actually, to leverage data to really fuel newer applications, newer services, newer business lines, if you will, for companies. I think the struggle is, one, the plethora of data that is created. Surveys say that over the next three years, you know, by 2025, data is going to be around 175 zettabytes, right? A hundred and seventy-five zettabytes of data is going to be created. And that's really a growth of north of 30% year over year. But the more important and interesting thing is that the real-time component of that data is actually growing at, you know, a 35% CAGR. And what enterprises desire is decisions that are made in real time or near real time. And a lot of the challenges that exist today are, first, that the infrastructure that enterprises have in place was never built to actually manipulate data in real time.
The second is really the ability to actually put something in place which can handle spikes yet be cost efficient, if you will. So you can build for really peak loads, but then it's very expensive to operate that particular service at normal loads. So how do you build something which actually works for you for both uses, so to speak? And the last point that we see out there is, even if you're able to, you know, bring in all that data, you don't have the processing capability to run through that data. So as a result, most enterprises struggle with, one, capturing the data, two, making decisions from it in real time, and, three, really operating it at the cost point that they need to operate it at. >> You know, you bring up a great point with respect to real-time data access. And I think one of the things that we've learned the last couple of years is that access to real-time data is not a nice-to-have anymore. It's business critical for organizations in any industry. Talk about that as one of the challenges that organizations are facing. >> Yeah, when we started Aerospike, right, when the company started, it started with the premise that, number one, data is going to grow exponentially. Two, when applications open up to the internet, there's going to be a flood of users and demands on those applications. And that was true primarily, when we started the company, in the ad tech vertical. So ad tech was the first vertical where there was a lot of data, both on the supply side and the demand side, from an inventory of ads that were available. And on the other hand, they had like microseconds or milliseconds in which they could make a decision on which ad to put in front of you and me so that we would click or engage with that particular ad. But over the last three to five years, what we've seen is, as digitization has actually permeated every industry out there, the need to harness data in real time is pretty much present in every industry.
Whether that's retail, whether that's financial services, telecommunications, e-commerce, gaming and entertainment, every industry has a desire. One, the innovative companies, the small companies rather, are innovating at a pace and standing up new businesses to compete with the larger companies in each of these verticals. And the larger companies don't want to be left behind. So they're standing up their own competing services or getting into new lines of business that really harness and are driven by real-time data. So there are compelling pressures here. One, you know, customer experience is paramount, and we as customers expect answers in, you know, an instant, in real time. And on the other hand, the way they make decisions is based on a large data set, because, you know, larger data sets actually propel better decisions. So there are competing pressures here which essentially drive the need, one from a business perspective, two from a customer perspective, to harness all of this data in real time. So that's what's driving an incessant need to actually make decisions in real or near real time. >> You know, I think one of the things that's been in short supply over the last couple of years is patience. We do expect as consumers, whether we're in our business lives or our personal lives, that we're going to be given information and data that's relevant and personal to help us make those real-time decisions. So having access to real-time data is really business critical for organizations across any industry. Talk about some of the main capabilities that modern data applications and data platforms need to have. What are some of the key capabilities of a modern data platform that need to be delivered to meet demanding customer expectations? >> So, you know, going back to your initial question, Lisa, around why is data really a high-value but underutilized or under-leveraged asset?
One of the reasons we see is that a lot of the data platforms that, you know, some of these applications were built on have been around for a decade plus. And they were never built for the needs of today, which is really driving a lot of data and driving insight in real time from a lot of data. So there are four major capabilities that we see that are essential ingredients of any modern data platform. One is really the ability to, you know, operate at unlimited scale. So what we mean by that is really the ability to scale from gigabytes to even petabytes without any degradation in performance or latency or throughput. The second is really, you know, predictable performance. So can you actually deliver predictable performance as your data size grows, or your throughput grows, or the concurrent users on that application or service grow? It's really easy to build an application that operates at low scale or low throughput or low concurrency, but performance usually starts degrading as you start scaling one of these attributes. The third thing is the ability to operate an always-on, globally resilient application. And that requires a really robust data platform that can be up on a five-nines basis globally and can support global distribution, because a lot of these applications have global users. And the last point goes back to my first answer, which is, can you operate all of this at a cost point which is not prohibitive but makes sense from a TCO perspective? 'Cause a lot of times what we see is people make choices of data platforms, and, ironically, as their service or applications become more successful and more users join their journey, the revenue starts going up, the user base starts going up, but the cost basis starts crossing over the revenue, and they're losing money on the service, ironically, as the service becomes more popular. So really: unlimited scale, predictable performance, always on on a globally resilient basis, and low TCO.
These are the four essential capabilities of any modern data platform. >> So then talk to me, with those as the four main core functionalities of a modern data platform, how does Aerospike deliver that? >> So we were built, as I said, from day one to operate at unlimited scale and deliver predictable performance. And then over the years, as we worked with customers, we built this incredible high-availability capability, which helps us deliver the always-on, you know, operations. So we have customers who have been on the platform 10 years with no downtime, for example, right? So we are talking about an amazing continuum of high availability that we provide for customers who operate these, you know, globally resilient services. The key to our innovation here is what we call the hybrid memory architecture. So, you know, going a little bit technically deep here, essentially what we built out in our architecture is the ability on each node or each server to treat a bank of SSDs, or solid-state devices, as essentially extended memory. So you're getting memory performance, but you're accessing these SSDs. You're not paying memory prices, but you're getting memory performance. As a result of that, you can attach a lot more data to each node or each server in a distributed cluster. And when you kind of scale that across basically a distributed cluster, you can do with Aerospike the same things at 60 to 80% lower server count and, as a result, 60 to 80% lower TCO compared to some of the other options that are available in the market. Then basically, as I said, that's the key kind of starting point to the innovation. We layer on capabilities like, you know, replication, change data notification, you know, synchronous and asynchronous replication, and the ability to actually stretch a single cluster across multiple regions.
So for example, if you're operating a global service, you can have a single Aerospike cluster with one node in San Francisco, one node in New York, and another one in London, and this would be basically seamlessly operating. So, you know, this is strongly consistent. Very few NoSQL data platforms are strongly consistent, or if they are strongly consistent, they will actually suffer performance degradation. And what strongly consistent means is, you know, all your data is always available, it's guaranteed to be available, and there is no data lost at any time. So in this configuration that I talked about, if the node in London goes down, your application still continues to operate, right? Your users see no kind of downtime, and, you know, when London comes up, it rejoins the cluster and everything is back to kind of the way it was before, you know, London left the cluster, so to speak. So the ability to do this globally resilient, highly available kind of model is really, really powerful. A lot of our customers actually use that kind of a scenario, and we offer other deployment scenarios from a high-availability perspective. So everything starts with HMA, or Hybrid Memory Architecture, and then we start building a lot of these other capabilities around the platform. And then over the years, what our customers have guided us to do is, as they're putting together a modern kind of data infrastructure, we don't live in a silo. So Aerospike gets deployed with other technologies like streaming technologies or analytics technologies. So we built connectors into Kafka and Pulsar, so that as you're ingesting data from a variety of data sources, you can ingest them at very high ingest speeds and store them persistently in Aerospike. Once the data is in Aerospike, you can actually run Spark jobs across that data in a multi-threaded, parallel fashion to get real insight from that data at really high throughput and high speed.
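The server-count claim in the hybrid memory architecture answer above can be made concrete with a little arithmetic. What follows is a minimal sketch under stated assumptions, not Aerospike's sizing method: the function `nodes_needed`, the data set size, the per-node capacities, and the replication factor are all hypothetical values chosen for illustration. It only shows why being able to attach more data to each node reduces the node count.

```python
import math


def nodes_needed(dataset_tb: float, usable_tb_per_node: float,
                 replication_factor: int = 2) -> int:
    """Smallest node count that can hold every replica of the data set.

    Hypothetical helper for illustration; it is not an Aerospike API.
    """
    total_tb = dataset_tb * replication_factor
    return math.ceil(total_tb / usable_tb_per_node)


# Hypothetical figures, chosen only to illustrate the arithmetic.
DATASET_TB = 100.0        # unique data to store
DRAM_TB_PER_NODE = 1.0    # node that keeps all data in DRAM
SSD_TB_PER_NODE = 4.0     # node that treats SSDs as extended memory

dram_nodes = nodes_needed(DATASET_TB, DRAM_TB_PER_NODE)
ssd_nodes = nodes_needed(DATASET_TB, SSD_TB_PER_NODE)
reduction = 1 - ssd_nodes / dram_nodes

print(f"DRAM-only nodes: {dram_nodes}")
print(f"SSD-backed nodes: {ssd_nodes}")
print(f"Server count reduction: {reduction:.0%}")
```

With these assumed capacities the SSD-backed cluster needs a quarter of the nodes, which falls inside the 60 to 80% range quoted in the interview; real ratios depend on hardware, record sizes, and throughput requirements that this sketch ignores.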
>> High throughput, high speed, incredibly important, especially as today's landscape is increasingly distributed. Data centers, multiple public clouds, edge, IoT devices, the workforce embracing more and more hybrid these days. How are you helping customers to extract more value from data while also lowering costs? Go into some customer examples, 'cause I know you have some great ones. >> Yeah, you know, I think we have built an amazing set of customers, and customers actually use us for some really mission-critical applications. So, you know, before I get into specific customer examples, let me talk to you about some of the use cases which we see out there. We see a lot of Aerospike being used in fraud detection. We see us being used in recommendations engines. We get used in customer data profiles, or customer profiles, Customer 360 stores, you know, multiplayer gaming and entertainment. These are kind of the repeated use cases. Digital payments: we power most of the digital payment systems across the globe. From a specific example perspective, the first one I would love to talk about is PayPal. So if you use PayPal today, then, you know, when you're actually paying somebody, your transaction is, you know, being sent through Aerospike to really decide whether this is a fraudulent transaction or not. And when you do that, you know, you and I as customers are not going to wait around for 10 seconds for PayPal to say yay or nay. We expect, you know, the decision to be made in an instant. So we are powering that fraud detection engine at PayPal for every transaction that goes through PayPal. Before us, you know, PayPal was missing out on about 2% of their SLAs, which was essentially millions of dollars which they were losing because, you know, they were letting transactions go through and taking the risk that it's not a fraudulent transaction.
With Aerospike they can now actually get a much better SLA and the data set on which they compute the fraud score has gone up by you know, several factors. So by 30X if you will. So not only has the data size that is powering the fraud engine actually gone up 30X with Aerospike but they're actually making decisions in an instant for, you know, 99.95% of their transactions. So that's- >> And that's what we expect as consumers, right? We want to know that there's fraud detection on the swipe regardless of who we're interacting with. >> Yes, and so that's a really powerful use case and you know, it's a great customer success story. The other one I would talk about is really Wayfair, right, from retail and you know from e-commerce. So everybody knows Wayfair global leader in really in online home furnishings and they use us to power their recommendations engine. And you know it's basically if you're purchasing this, people who bought this also bought these five other things, so on and so forth. They have actually seen their cart size at checkout go up by up to 30%, as a result of actually powering their recommendations engine through Aerospike. And they were able to do this by reducing the server count by 9X. So on one ninth of the servers that were there before Aerospike, they're now powering their recommendations engine and seeing cart size checkout go up by 30%. Really, really powerful in terms of the business outcome and what we are able to, you know, drive at Wayfair. >> Hugely powerful as a business outcome. And that's also what the consumer wants. The consumer is expecting these days to have a very personalized relevant experience that's going to show me if I bought this show me something else that's related to that. We have this expectation that needs to be really fueled by technology. >> Exactly, and you know, another great example you asked about you know, customer stories, Adobe. Who doesn't know Adobe, you know. 
They're on a mission to deliver the best customer experience that they can. And they're talking about, you know, a great Customer 360 experience at scale, and they're modernizing their entire edge compute infrastructure to support this with Aerospike. Going to Aerospike, basically what they have seen is their throughput go up by 70% while their cost has been reduced by 3X. So essentially doing it at one third of the cost while their annual data growth continues at, you know, north of 30%. So not only is their data growing, they're able to actually reduce their cost to deliver this great customer experience to one third and continue to deliver a great Customer 360 experience at scale. Really, really powerful example of how you deliver Customer 360 in a world which is dynamic and, you know, on a data set which is constantly growing at north of 30% in this case. >> Those are three great examples, PayPal, Wayfair, Adobe, talking about, especially with Wayfair, when you talk about increasing their cart checkout sizes, but also with Adobe increasing throughput by over 70%. I'm looking at my notes here. While data is growing at 32%, that's something that every organization has to contend with: data growth is continuing to scale and scale and scale. >> Yap, I'll give you a fun one here. So, you know, you may not have heard about this company. It's called Dream11, and it's a company based out of India, but it's a very, you know, it's a fun story because it's the world's largest fantasy sports platform. And you know, India is a nation which is cricket crazy. So you know, when they have their premier league going on, there are millions of users logged onto the Dream11 platform, building their fantasy league teams and, you know, playing on that particular platform. It has a hundred million users, a hundred million plus users on the platform, 5.5 million concurrent users, and they have been growing at 30%.
So they are considered an amazing success story in terms of what they have accomplished and the way they have architected their platform to operate at scale. And all of that is really powered by Aerospike. Think about that they're able to deliver all of this and support a hundred million users 5.5 million concurrent users all with, you know 99 plus percent of their transactions completing in less than one millisecond. Just incredible success story. Not a brand that is, you know, world renowned but at least you know from what we see out there it's an amazing success story of operating at scale. >> Amazing success story, huge business outcomes. Last question for you as we're almost out of time is talk a little bit about Aerospike AWS the partnership Graviton2 better together. What are you guys doing together there? >> Great partnership. AWS has multiple layers in terms of partnerships. So, you know, we engage with AWS at the executive level. They plan out, really roll out of new instances in partnership with us, making sure that, you know those instance types work well for us. And then we just released support for Aerospike on the Graviton platform and we just announced a benchmark of Aerospike running on Graviton on AWS. And what we see out there is with the benchmark a 1.6X improvement in price performance. And you know about 18% increase in throughput while maintaining a 27% reduction in cost, you know, on Graviton. So this is an amazing story from a price performance perspective, performance per watt for greater energy efficiencies, which basically a lot of our customers are starting to kind of talk to us about leveraging this to further meet their sustainability target. So great story from Aerospike and AWS not just from a partnership perspective on a technology and an executive level, but also in terms of what joint outcomes we are able to deliver for our customers. >> And it sounds like a great sustainability story. 
I wish we had more time so we would talk about this but thank you so much for talking about the main capabilities of a modern data platform, what's needed, why, and how you guys are delivering that. We appreciate your insights and appreciate your time. >> Thank you very much. I mean, if folks are at re:Invent next week or this week come on and see us at our booth and we are in the data analytics pavilion and you can find us pretty easily. Would love to talk to you. >> Perfect, we'll send them there. Subbu Iyer, thank you so much for joining me on the program today. We appreciate your insights. >> Thank you Lisa. >> I'm Lisa Martin, you're watching theCUBE's coverage of AWS re:Invent 2022. Thanks for watching. >> Clear- >> Clear cutting. >> Nice job, very nice job.
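The Graviton benchmark figures quoted in the interview above (about 18% more throughput, a 27% reduction in cost, and a 1.6X improvement in price performance) can be checked against each other. The sketch below assumes that price performance means throughput per unit of cost; the interview does not say how the benchmark defines it, so both that definition and the function name `price_performance_gain` are assumptions made here for illustration.

```python
def price_performance_gain(throughput_gain: float, cost_reduction: float) -> float:
    """Ratio of new to old throughput-per-cost.

    Assumes price performance = throughput / cost, which is an
    illustrative definition, not one stated in the interview.
    """
    return (1 + throughput_gain) / (1 - cost_reduction)


# Figures as quoted in the interview.
THROUGHPUT_GAIN = 0.18   # about 18% increase in throughput
COST_REDUCTION = 0.27    # 27% reduction in cost

gain = price_performance_gain(THROUGHPUT_GAIN, COST_REDUCTION)
print(f"Implied price-performance improvement: {gain:.2f}x")
```

Under that assumed definition the two percentages imply an improvement of roughly 1.6X, so the three quoted numbers are mutually consistent.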

Published Date : Nov 25 2022


Breaking Analysis: We Have the Data…What Private Tech Companies Don’t Tell you About Their Business

>> From The Cube Studios in Palo Alto and Boston, bringing you data driven insights from The Cube and ETR. This is "Breaking Analysis" with Dave Vellante. >> The reverse momentum in tech stocks caused by rising interest rates, less attractive discounted cash flow models, and more tepid forward guidance, can be easily measured by public market valuations. And while there's lots of discussion about the impact on private companies and cash runway and 409A valuations, measuring the performance of non-public companies isn't as easy. IPOs have dried up and public statements by private companies, of course, they accentuate the good and they kind of hide the bad. Real data, unless you're an insider, is hard to find. Hello and welcome to this week's "Wikibon Cube Insights" powered by ETR. In this "Breaking Analysis", we unlock some of the secrets that non-public, emerging tech companies may or may not be sharing. And we do this by introducing you to a capability from ETR that we've not exposed you to over the past couple of years, it's called the Emerging Technologies Survey, and it is packed with sentiment data and performance data based on surveys of more than a thousand CIOs and IT buyers covering more than 400 companies. And we've invited back our colleague, Erik Bradley of ETR to help explain the survey and the data that we're going to cover today. Erik, this survey is something that I've not personally spent much time on, but I'm blown away at the data. It's really unique and detailed. First of all, welcome. Good to see you again. >> Great to see you too, Dave, and I'm really happy to be talking about the ETS or the Emerging Technology Survey. Even our own clients or constituents probably don't spend as much time in here as they should. >> Yeah, because there's so much in the mainstream, but let's pull up a slide to bring out the survey composition. Tell us about the study. How often do you run it? What's the background and the methodology?
>> Yeah, you were just spot on the way you were talking about the private tech companies out there. So what we did is we decided to take all the vendors that we track that are not yet public and move 'em over to the ETS. And there isn't a lot of information out there. If you're not in Silicon (indistinct), you're not going to get this stuff. So PitchBook and TechCrunch are two out there that give some data on these guys. But what we really wanted to do was go out to our community. We have 6,000 ITDMs in our community. We wanted to ask them, "Are you aware of these companies? And if so, are you allocating any resources to them? Are you planning to evaluate them," and really just kind of figure out what we can do. So this particular survey, as you can see, 1000 plus responses, over 450 vendors that we track. And essentially what we're trying to do here is talk about your evaluation and awareness of these companies and also your utilization. And also if you're not utilizing 'em, then we can also figure out your sales conversion or churn. So this is interesting, not only for the ITDMs themselves to figure out what their peers are evaluating and what they should put in POCs against the big guys when contracts come up. But it's also really interesting for the tech vendors themselves to see how they're performing. >> And you can see 2/3 of the respondents are director level or above. You got 28% is C-suite. There is of course a North America bias, 70, 75% is North America. But these smaller companies, you know, that's when they start doing business. So, okay. We're going to do a couple of things here today. First, we're going to give you the big picture across the sectors that ETR covers within the ETS survey. And then we're going to look at the high and low sentiment for the larger private companies. And then we're going to do the same for the smaller private companies, the ones that don't have as much mindshare.
And then I'm going to put those two groups together and we're going to look at two dimensions, actually three dimensions. First, which companies are being evaluated the most. Second, which companies are getting the most usage and adoption of their offerings. And then third, which companies are seeing the highest churn rates, which of course is a silent killer of companies. And then finally, we're going to look at the sentiment and mindshare for two key areas that we like to cover often here on "Breaking Analysis", security and data. And data comprises database, including data warehousing, and then big data analytics is the second part of data. And then machine learning and AI is the third section within data that we're going to look at. Now, one other thing before we get into it, ETR very often will include open source offerings in the mix, even though they're not companies, like TensorFlow or Kubernetes, for example. And we'll call that out during this discussion. The reason this is done is for context, because everyone is using open source. It is the heart of innovation and many business models are super glued to an open source offering, like take MariaDB, for example. There's the foundation with the open source code, and then there's, of course, the company that sells services around the offering. Okay, so let's first look at the highest and lowest sentiment among these private firms, the ones that have the highest mindshare. So they're naturally going to be somewhat larger. And we do this on two dimensions, sentiment on the vertical axis and mindshare on the horizontal axis, and note the open source tools: see Kubernetes, Postgres, Kafka, TensorFlow, Jenkins, Grafana, et cetera. So Erik, please explain what we're looking at here, how it's derived and what the data tells us. >> Certainly, so there is a lot here, so we're going to break it down first of all by explaining just what mindshare and net sentiment is. You explained the axes.
We have so many evaluation metrics, but we need to aggregate them into one so that way we can rank against each other. Net sentiment is really the aggregation of all the positive and subtracting out the negative. So the net sentiment is a very quick way of looking at where these companies stand versus their peers in their sectors and sub sectors. Mindshare is basically the awareness of them, which is good for very early stage companies. And you'll see some names on here that have obviously been around for a very long time. And they're clearly going to be bigger on the axis, on the outside. Kubernetes, for instance, as you mentioned, is open source. It's the de facto standard for all container orchestration, and it should be that far up and to the right, because that's what everyone's using. In fact, the open source leaders are so prevalent in the emerging technology survey that we break them out later in our analysis, 'cause it's really not fair to include them and compare them to the actual companies that are providing the support and the security around that open source technology. But no survey, no analysis, no research would be complete without including this open source tech. So what we're looking at here, if I can just get away from the open source names, we see other things like Databricks and OneTrust. They're repeating as top net sentiment performers here. And then also the design vendors. People don't spend a lot of time on 'em, but Miro and Figma. This is their third survey in a row where they're just dominating that sentiment overall. And Adobe should probably take note of that because they're really coming after them. But Databricks, we all know probably would've been a public company by now if the market hadn't turned, but you can see just how dominant they are in a survey of nothing but private companies. And we'll see that again when we talk about the database later.
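As a rough illustration of the two axes Erik describes here (net sentiment as the positives aggregated with the negatives subtracted out, and mindshare as awareness), a minimal sketch follows. The response categories, the sample counts, and the choice to express net sentiment as a share of all responses are assumptions made for illustration; ETR's actual formula and weights are not given in the discussion.

```python
from collections import Counter

# Hypothetical response buckets; ETR's real categories are not listed in the transcript.
POSITIVE = {"plan_to_evaluate", "evaluating", "plan_to_utilize", "utilizing"}
NEGATIVE = {"evaluated_no_plan", "churned"}


def net_sentiment(responses: list[str]) -> float:
    """Share of positive responses minus share of negative responses."""
    if not responses:
        raise ValueError("no responses")
    counts = Counter(responses)
    positive = sum(counts[r] for r in POSITIVE)
    negative = sum(counts[r] for r in NEGATIVE)
    return (positive - negative) / len(responses)


def mindshare(aware: int, surveyed: int) -> float:
    """Fraction of surveyed respondents who are aware of the vendor."""
    return aware / surveyed


# Made-up sample: 100 respondents who know the vendor, out of 1,000 surveyed.
sample = ["utilizing"] * 30 + ["evaluating"] * 25 + ["churned"] * 10 + ["aware_only"] * 35
print(f"net sentiment: {net_sentiment(sample):.0%}")
print(f"mindshare: {mindshare(len(sample), 1000):.0%}")
```

In this toy sample, 55 positive and 10 negative citations out of 100 give a net sentiment of 45%, and 100 aware respondents out of 1,000 give a mindshare of 10%.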
>> And I'll just add, so you see Automation Anywhere on there, the big UiPath competitor, a company that was not able to get to the public markets. They've been trying. Snyk, Peter McKay's company, they've raised a bunch of money, big security player. They're doing some really interesting things in developer security, helping developers secure the data flow. H2O.ai, Dataiku, AI companies. We saw them at the Snowflake Summit. Redis Labs, Netskope in security. So a lot of names that we know that ultimately we think are probably going to be hitting the public market. Okay, here's the same view for private companies with less mindshare, Erik. Take us through this one. >> On the previous slide too real quickly, I wanted to pull out SecurityScorecard and we'll get back into it. But this is a newcomer, that I couldn't believe how strong their data was, but we'll bring that up in a second. Now, when we go to the ones of lower mindshare, it's interesting to talk about open source, right? Kubernetes was all the way on the top right. Everyone uses containers. Here we see Istio up there. Not everyone is using service mesh as much. And that's why Istio is in the smaller breakout. But still when you talk about net sentiment, it's about the leader, it's the highest one there is. So really interesting to point out. Then we see other names like Collibra in the data side really performing well. And again, as always security, very well represented here. We have Aqua, Wiz, Armis, which is a standout in this survey this time around. They do IoT security. I hadn't even heard of them until I started digging into the data here. And I couldn't believe how well they were doing. And then of course you have AnyScale, which is doing a second best in this and the best name in the survey Hugging Face, which is a machine learning AI tool. Also doing really well on a net sentiment, but they're not as far along on that axis of mindshare just yet.
So these are again, emerging companies that might not be as well represented in the enterprise as they will be in a couple of years. >> Hugging Face sounds like something you do with your two year old. Like you said, you see high performers, AnyScale do machine learning and you mentioned them. They came out of Berkeley. Collibra, governance. InfluxData is on there. InfluxDB's a time series database. And yeah, of course, Alex, if you bring that back up, you get a big group of red dots, right? That's the bad zone, I guess, which Sisense does vis, Yellowbrick Data is an MPP database. How should we interpret the red dots, Erik? I mean, is it necessarily a bad thing? Could it be misinterpreted? What's your take on that? >> Sure, well, let me just explain the definition of it first from a data science perspective, right? We're a data company first. So the gray dots that you're seeing that aren't named, that's the mean, that's the average. So in order for you to be on this chart, you have to be at least one standard deviation above or below that average. So that gray is where we're saying, "Hey, this is where the lump of average comes in. This is where everyone normally stands." So you either have to be an outperformer or an underperformer to even show up in this analysis. So by definition, yes, the red dots are bad. You're at least one standard deviation below the average of your peers. It's not where you want to be. And if you're on the lower left, not only are you not performing well from a utilization or an actual usage rate, but people don't even know who you are. So that's a problem, obviously. And the VCs and the PEs out there that are backing these companies, they're the ones who mostly are interested in this data. >> Yeah. Oh, that's great explanation. Thank you for that. No, nice benchmarking there and yeah, you don't want to be in the red. All right, let's get into the next segment here. We're going to look at evaluation rates, adoption and the all important churn.
First, new evaluations. Let's bring up that slide. And Erik, take us through this. >> So essentially I just want to explain what evaluation means is that people will cite that they either plan to evaluate the company or they're currently evaluating. So that means we're aware of 'em and we are choosing to do a POC of them. And then we'll see later how that turns into utilization, which is what a company wants to see, awareness, evaluation, and then actually utilizing them. That's sort of the life cycle for these emerging companies. So what we're seeing here, again, with very high evaluation rates. H2O, we mentioned. SecurityScorecard jumped up again. Chargebee, Snyk, Salt Security, Armis. A lot of security names are up here, Aqua, Netskope, which, God, has been around forever. I still can't believe it's in an Emerging Technology Survey. But so many of these names fall in data and security again, which is why we decided to pick those out, Dave. And on the lower side, Vena, Acton, those unfortunately took the dubious award of the lowest evaluations in our survey, but I prefer to focus on the positive. So SecurityScorecard, again, real standout in this one, they're in a security assessment space, basically. They'll come in and assess for you how your security hygiene is. And it's an area of a real interest right now amongst our ITDM community. >> Yeah, I mean, I think those, and then Arctic Wolf is up there too. They're doing managed services. You had mentioned Netskope. Yeah, okay. All right, let's look at now adoption. These are the companies whose offerings are being used the most and are above that standard deviation in the green. Take us through this, Erik. >> Sure, yet again, what we're looking at is, okay, we went from awareness, we went to evaluation. Now it's about utilization, which means a survey respondent's going to state "Yes, we evaluated and we plan to utilize it" or "It's already in our enterprise and we're actually allocating further resources to it."
Not surprising, again, a lot of open source, the reason why, it's free. So it's really easy to grow your utilization on something that's free. But as you and I both know, as Red Hat proved, there's a lot of money to be made once the open source is adopted, right? You need the governance, you need the security, you need the support wrapped around it. So here we're seeing Kubernetes, Postgres, Apache Kafka, Jenkins, Grafana. These are all open source based names. But if we're looking at names that are non open source, we're going to see Databricks, Automation Anywhere, Rubrik all have the highest mindshare. So these are the names, not surprisingly, all names that probably should have been public by now. Everyone's expecting an IPO imminently. These are the names that have the highest mindshare. If we talk about the highest utilization rates, again, Miro and Figma pop up, and I know they're not household names, but they are just dominant in this survey. These are applications that are meant for design software and, again, they're going after an Autodesk or a CAD or Adobe type of thing. It is just dominant how high the utilization rates are here, which again is something Adobe should be paying attention to. And then you'll see a little bit lower, but also interesting, we see Collibra again, we see Hugging Face again. And these are names that are obviously in the data governance, ML, AI side. So we're seeing a ton of data, a ton of security and Rubrik was interesting in this one, too, high utilization and high mindshare. We know how pervasive they are in the enterprise already. >> Erik, Alex, keep that up for a second, if you would. So yeah, you mentioned Rubrik. Cohesity's not on there. They're sort of the big one. We're going to talk about them in a moment. Puppet is interesting to me because you remember the early days of that sort of space, you had Puppet and Chef and then you had Ansible. Red Hat bought Ansible and then Ansible really took off. 
So it's interesting to see Puppet on there as well. Okay. So now let's look at the churn because this one is where you don't want to be. It's, of course, all red 'cause churn is bad. Take us through this, Erik. >> Yeah, definitely don't want to be here and I don't love to dwell on the negative. So we won't spend as much time. But to your point, there's one thing I want to point out that I think is important. So you see Rubrik in the same spot, but Rubrik has so many citations in our survey that it actually would make sense that they're showing both high utilization and churn just because they're so well represented. They have such a high overall representation in our survey. And the reason I call that out is Cohesity. Cohesity has an extremely high churn rate here, about 17%, and unlike Rubrik, they were not on the utilization side. So Rubrik is seeing both, Cohesity is not. It's not being utilized, but it's seeing a high churn. So that's the way you can look at this data and say, "Hm." Same thing with Puppet. You noticed that it was on the other slide. It's also on this one. So basically what it means is a lot of people are giving Puppet a shot, but it's starting to churn, which means it's not as sticky as we would like. One that was surprising on here for me was Tanium. It's kind of jumbled in there. It's hard to see in the middle, but Tanium, I was very surprised to see as high of a churn because what I do hear from our end user community is that people that use it, like it. It really kind of spreads into not only vulnerability management, but also that endpoint detection and response side. So I was surprised by that one, mostly to see Tanium in here. Mural, again, was another one of those application design softwares that's seeing a very high churn as well. >> So you're saying if you're in both... Alex, bring that back up if you would. So if you're in both like MariaDB is for example, I think, yeah, they're in both.
They're both green in the previous one and red here, that's not as bad. You mentioned Rubrik is going to be in both. Cohesity is a bit of a concern. Cohesity just brought on Sanjay Poonen. So this could be a go to market issue, right? I mean, 'cause Cohesity has got a great product and they got really happy customers. So they're just maybe having to figure out, okay, what's the right ideal customer profile and Sanjay Poonen, I guarantee, is going to have that company cranking. I mean they had been doing very well on the surveys and had fallen off a bit. The other interesting thing, wandering through the previous survey, I saw Cvent, which is an event platform. My only reason I pay attention to that is 'cause we actually have an event platform. We don't sell it separately. We bundle it as part of our offerings. And you see Hopin on here. Hopin raised a billion dollars during the pandemic. And we were like, "Wow, that's going to blow up." And so you see Hopin on the churn and you didn't see 'em in the previous chart, but that's sort of interesting. Like you said, let's not kind of dwell on the negative, but you really don't. You know, churn is a real big concern. Okay, now we're going to drill down into two sectors, security and data. Where data comprises three areas, database and data warehousing, machine learning and AI and big data analytics. So first let's take a look at the security sector. Now this is interesting because not only is it a sector drill down, but also gives an indicator of how much money the firm has raised, which is the size of that bubble. And to tell us if a company is punching above its weight and efficiently using its venture capital. Erik, take us through this slide. Explain the dots, the size of the dots. Set this up please. >> Yeah.
So again, the axis is still the same, net sentiment and mindshare, but what we've done this time is we've taken publicly available information on how much capital a company has raised and that'll be the size of the circle you see around the name. And then whether it's green or red is basically saying relative to the amount of money they've raised, how are they doing in our data? So when you see a Netskope, which has been around forever, raised a lot of money, that's why you're going to see them leaning more towards red, 'cause it's just been around forever and kind of would expect it. Versus a name like SecurityScorecard, which has only raised a little bit of money and it's actually performing just as well, if not better than a name, like a Netskope. OneTrust doing absolutely incredible right now. BeyondTrust. We've seen the issues with Okta, right. So those are two names that play in that space that obviously are probably getting some looks about what's going on right now. Wiz, we've all heard about right? So raised a ton of money. It's doing well on net sentiment, but the mindshare isn't as well as you'd want, which is why you're going to see a little bit of that red versus a name like Aqua, which is doing container and application security. And hasn't raised as much money, but is really neck and neck with a name like Wiz. So that is why on a relative basis, you'll see that more green. As we all know, information security is never going away. But as we'll get to later in the program, Dave, I'm not sure in this current market environment, if people are as willing to do POCs and switch away from their security provider, right. There's a little bit of tepidness out there, a little trepidation. So right now we're seeing overall a slight pause, a slight cooling in overall evaluations on the security side versus historical levels a year ago. >> Now let's stay on here for a second. So a couple things I want to point out. So it's interesting.
Now Snyk has raised over, I think $800 million but you can see them, they're high on the vertical and the horizontal, but now compare that to Lacework. It's hard to see, but they're kind of buried in the middle there. That's the biggest dot in this whole thing. I think I'm interpreting this correctly. They've raised over a billion dollars. It's a Mike Speiser company. He was the founding investor in Snowflake. So people watch that very closely, but that's an example of where they're not punching above their weight. They recently had a layoff and they got to fine tune things, but I'm still confident they're going to do well. 'Cause they're approaching security as a data problem, which is probably people having trouble getting their arms around that. And then again, I see Arctic Wolf. They're not red, they're not green, but they've raised a fair amount of money, but it's showing up to the right and decent level there. And a couple of the other ones that you mentioned, Netskope. Yeah, they've raised a lot of money, but they're actually performing where you want. What you don't want is where Lacework is, right. They've got some work to do to really take advantage of the money that they raised last November and prior to that. >> Yeah, if you're seeing that more neutral color, like you're calling out with an Arctic Wolf, like that means relative to their peers, this is where they should be. It's when you're seeing that red on a Lacework where we all know, wow, you raised a ton of money and your mindshare isn't where it should be. Your net sentiment is not where it should be comparatively. And then you see these great standouts, like Salt Security and SecurityScorecard and Abnormal. You know they haven't raised that much money yet, but their net sentiment's higher and their mindshare's doing well. So those basically in a nutshell, if you're a PE or a VC and you see a small green circle, then you're doing well, then it means you made a good investment.
>> Some of these guys, I don't know, but you see these small green circles. Those are the ones you want to start digging into and maybe help them catch a wave. Okay, let's get into the data discussion. And again, three areas, database slash data warehousing, big data analytics and ML AI. First, we're going to look at the database sector. So Alex, thank you for bringing that up. Alright, take us through this, Erik. Actually, let me just say PostgreSQL. I got to ask you about this. It shows some funding, but that actually could be a mix of EDB, the company that commercializes Postgres, and Postgres the open source database, which is a transaction system and kind of an open source Oracle. You see MariaDB is a database, but open source database. But the company, they've raised over $200 million and they filed an S-4. So Erik, it looks like this might be a little bit of a mashup of companies and open source products. Help us understand this. >> Yeah, it's tough when you start dealing with the open source side and I'll be honest with you, there is a little bit of a mashup here. There are certain names here that are a hundred percent for profit companies. And then there are others that are obviously open source based like Redis is open source, but Redis Labs is the one trying to monetize the support around it. So you're a hundred percent accurate on this slide. I think one of the things here that's important to note though, is just how important open source is to data. If you're going to be going to any of these areas, it's going to be open source based to begin with. And Neo4j is one I want to call out here. It's not one everyone's familiar with, but it's basically a graph database, which is a name that we're seeing on a net sentiment side actually really, really high. When you think about it, it's the third overall net sentiment for a niche database play.
It's not as big on the mindshare 'cause its use cases aren't as often, but third biggest play on net sentiment. I found really interesting on this slide. >> And again, so MariaDB, as I said, they filed an S-4 I think $50 million in revenue, that might even be ARR. So they're not huge, but they're getting there. And by the way, MariaDB, if you don't know, was the company that was formed the day that Oracle bought Sun in which they got MySQL and MariaDB has done a really good job of replacing a lot of MySQL instances. Oracle has responded with MySQL HeatWave, which was kind of the Oracle version of MySQL. So there's some interesting battles going on there. If you think about the LAMP stack, the M in the LAMP stack was MySQL. And so now it's all MariaDB replacing that MySQL for a large part. And then you see again, the red, you know, you got to have some concerns about there. Aerospike's been around for a long time. SingleStore changed their name a couple years ago, last year. Yellowbrick Data, Firebolt was kind of going after Snowflake for a while, but yeah, you want to get out of that red zone. So they got some work to do. >> And Dave, real quick for the people that aren't aware, I just want to let them know that we can cut this data with the public company data as well. So we can cross over this with that because some of these names are competing with the larger public company names as well. So we can go ahead and cross reference like a MariaDB with a Mongo, for instance, or something of that nature. So it's not in this slide, but at another point we can certainly explain on a relative basis how these private names are doing compared to the other ones as well. >> All right, let's take a quick look at analytics. Alex, bring that up if you would. Go ahead, Erik. >> Yeah, I mean, essentially here, I can't see it on my screen, my apologies. I just kind of went to blank on that. So gimme one second to catch up. >> So I could set it up while you're doing that.
You got Grafana up and to the right. I mean, this is huge right. >> Got it, thank you. I lost my screen there for a second. Yep. Again, open source name Grafana, absolutely up and to the right. But as we know, Grafana Labs is actually picking up a lot of speed based on Grafana, of course. And I think we might actually hear some noise from them coming this year. The names that are actually a little bit more disappointing that I want to call out are names like ThoughtSpot. It's been around forever. Their mindshare of course is second best here but based on the amount of time they've been around and the amount of money they've raised, it's not actually outperforming the way it should be. We're seeing Moogsoft obviously make some waves. That's very high net sentiment for that company. It's, you know, what, third, fourth position overall in this entire area. Another name like Fivetran, Matillion is doing well. Fivetran, even though it's got a high net sentiment, again, it's raised so much money that we would've expected a little bit more at this point. I know you know this space extremely well, but basically what we're looking at here and to the bottom left, you're going to see some names with a lot of red, large circles that really just aren't performing that well. InfluxData, however, second highest net sentiment. And it's really pretty early on in this stage and the feedback we're getting on this name is the use cases are great, the efficacy's great. And I think it's one to watch out for. >> InfluxData, time series database. The other interesting things I just noticed here, you got Tamr on here, which is that little small green. Those are the ones we were saying before, look for those guys. They might be some of the interesting companies out there and then Observe, Jeremy Burton's company. They do observability on top of Snowflake, not green, but kind of in that gray. So that's kind of cool. Monte Carlo is another one, they're sort of slightly green.
They are doing some really interesting things in data and data mesh. So yeah, okay. So I can spend all day on this stuff, Erik, phenomenal data. I got to get back and really dig in. Let's end with machine learning and AI. Now this chart it's similar in its dimensions, of course, except for the money raised. We're not showing that size of the bubble, but AI is so hot. We wanted to cover that here, Erik, explain this please. Why TensorFlow is highlighted and walk us through this chart. >> Yeah, it's funny yet again, right? Another open source name, TensorFlow being up there. And I just want to explain, we do break out machine learning, AI as its own sector. A lot of this of course really is intertwined with the data side, but it is its own area. And one of the things I think that's most important here to break out is Databricks. We started to cover Databricks in machine learning, AI. That company has grown into much, much more than that. So I do want to state to you Dave, and also the audience out there that moving forward, we're going to be moving Databricks out of only the ML/AI into other sectors. So we can kind of value them against their peers a little bit better. But in this instance, you could just see how dominant they are in this area. And one thing that's not here, but I do want to point out is that we have the ability to break this down by industry vertical, organization size. And when I break this down into Fortune 500 and Fortune 1000, both Databricks and TensorFlow are even better than you see here. So it's quite interesting to see that the names that are succeeding are also succeeding with the largest organizations in the world. And as we know, large organizations means large budgets. So this is one area that I just thought was really interesting to point out that as we break it down, the data by vertical, these two names still are the outstanding players. >> I just also want to call out H2O.ai.
They're getting a lot of buzz in the marketplace and I'm seeing them a lot more. Anaconda, another one. Dataiku consistently popping up. DataRobot is also interesting because all the kerfuffle that's going on there. The Cube guy, Cube alum, Chris Lynch stepped down as executive chairman. All this stuff came out about how the executives were taking money off the table and didn't allow the employees to participate in that money raising deal. So that's pissed a lot of people off. And so they're now going through some kind of uncomfortable things, which is unfortunate because DataRobot, I noticed, we haven't covered them that much in "Breaking Analysis", but I've noticed them oftentimes, Erik, in the surveys doing really well. So you would think that company has a lot of potential. But yeah, it's an important space that we're going to continue to watch. Let me ask you Erik, can you contextualize this from a time series standpoint? I mean, how is this changed over time? >> Yeah, again, not show here, but in the data. I'm sorry, go ahead. >> No, I'm sorry. What I meant, I should have interjected. In other words, you would think in a downturn that these emerging companies would be less interesting to buyers 'cause they're more risky. What have you seen? >> Yeah, and it was interesting before we went live, you and I were having this conversation about "Is the downturn stopping people from evaluating these private companies or not," right. In a larger sense, that's really what we're doing here. How are these private companies doing when it comes down to the actual practitioners? The people with the budget, the people with the decision making. And so what I did is, we have historical data as you know, I went back to the Emerging Technology Survey we did in November of 21, right at the crest right before the market started to really fall and everything kind of started to fall apart there. 
And what I noticed is on the security side, very much so, we're seeing less evaluations than we were in November 21. So I broke it down. On cloud security, net sentiment went from 21% to 16% from November '21. That's a pretty big drop. And again, that sentiment is our one aggregate metric for overall positivity, meaning utilization and actual evaluation of the name. Again in database, we saw it drop a little bit from 19% to 13%. However, in analytics we actually saw it stay steady. So it's pretty interesting that yes, cloud security and security in general is always going to be important. But right now we're seeing less overall net sentiment in that space. But within analytics, we're seeing steady with growing mindshare. And also to your point earlier in machine learning, AI, we're seeing steady net sentiment and mindshare has grown a whopping 25% to 30%. So despite the downturn, we're seeing more awareness of these companies in analytics and machine learning and a steady, actual utilization of them. I can't say the same in security and database. They're actually shrinking a little bit since the end of last year. >> You know it's interesting, we were on a round table, Erik does these round tables with CISOs and CIOs, and I remember one time you had asked the question, "How do you think about some of these emerging tech companies?" And one of the executives said, "I always include somebody in the bottom left of the Gartner Magic Quadrant in my RFPs. I think he said, "That's how I found," I don't know, it was Zscaler or something like that years before anybody ever knew of them "Because they're going to help me get to the next level." So it's interesting to see Erik in these sectors, how they're holding up in many cases. >> Yeah. It's a very important part for the actual IT practitioners themselves. There's always contracts coming up and you always have to worry about your next round of negotiations. And that's one of the roles these guys play. 
You have to do a POC when contracts come up, but it's also their job to stay on top of the new technology. You can't fall behind. Like everyone's a software company. Now everyone's a tech company, no matter what you're doing. So these guys have to stay in on top of it. And that's what this ETS can do. You can go in here and look and say, "All right, I'm going to evaluate their technology," and it could be twofold. It might be that you're ready to upgrade your technology and they're actually pushing the envelope or it simply might be I'm using them as a negotiation ploy. So when I go back to the big guy who I have full intentions of writing that contract to, at least I have some negotiation leverage. >> Erik, we got to leave it there. I could spend all day. I'm going to definitely dig into this on my own time. Thank you for introducing this, really appreciate your time today. >> I always enjoy it, Dave and I hope everyone out there has a great holiday weekend. Enjoy the rest of the summer. And, you know, I love to talk data. So anytime you want, just point the camera on me and I'll start talking data. >> You got it. I also want to thank the team at ETR, not only Erik, but Darren Bramen who's a data scientist, really helped prepare this data, the entire team over at ETR. I cannot tell you how much additional data there is. We are just scratching the surface in this "Breaking Analysis". So great job guys. I want to thank Alex Myerson. Who's on production and he manages the podcast. Ken Shifman as well, who's just coming back from VMware Explore. Kristen Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our editor in chief over at SiliconANGLE. Does some great editing for us. Thank you. All of you guys. Remember these episodes, they're all available as podcast, wherever you listen. All you got to do is just search "Breaking Analysis" podcast. I publish each week on wikibon.com and siliconangle.com. 
Or you can email me to get in touch david.vellante@siliconangle.com. You can DM me at dvellante or comment on my LinkedIn posts and please do check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for Erik Bradley and The Cube Insights powered by ETR. Thanks for watching. Be well. And we'll see you next time on "Breaking Analysis". (upbeat music)

Published Date : Sep 7 2022

SUMMARY :

bringing you data driven it's called the Emerging Great to see you too, Dave, so much in the mainstream, not only for the ITDMs themselves It is the heart of innovation So the net sentiment is a very So a lot of names that we And then of course you have AnyScale, That's the bad zone, I guess, So the gray dots that you're rates, adoption and the all And on the lower side, Vena, Acton, in the green. are in the enterprise already. So now let's look at the churn So that's the way you can look of dwell on the negative, So again, the axis is still the same, And a couple of the other And then you see these great standouts, Those are the ones you want to but Redis Labs is the one And by the way, MariaDB, So it's not in this slide, Alex, bring that up if you would. So gimme one second to catch up. So I could set it up but based on the amount of time Those are the ones we were saying before, And one of the things I think didn't allow the employees to here, but in the data. What have you seen? the market started to really And one of the executives said, And that's one of the Thank you for introducing this, just point the camera on me We are just scratching the surface

ENTITIES

Entity	Category	Confidence
Erik	PERSON	0.99+
Alex Myerson	PERSON	0.99+
Ken Shifman	PERSON	0.99+
Sanjay Poonen	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Dave	PERSON	0.99+
Erik Bradley	PERSON	0.99+
November 21	DATE	0.99+
Darren Bramen	PERSON	0.99+
Alex	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
Postgres	ORGANIZATION	0.99+
Databricks	ORGANIZATION	0.99+
Netskope	ORGANIZATION	0.99+
Adobe	ORGANIZATION	0.99+
Rob Hof	PERSON	0.99+
Fivetran	ORGANIZATION	0.99+
$50 million	QUANTITY	0.99+
21%	QUANTITY	0.99+
Chris Lynch	PERSON	0.99+
19%	QUANTITY	0.99+
Jeremy Burton	PERSON	0.99+
$800 million	QUANTITY	0.99+
6,000	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
Redis Labs	ORGANIZATION	0.99+
November '21	DATE	0.99+
ETR	ORGANIZATION	0.99+
First	QUANTITY	0.99+
25%	QUANTITY	0.99+
last year	DATE	0.99+
OneTrust	ORGANIZATION	0.99+
two dimensions	QUANTITY	0.99+
two groups	QUANTITY	0.99+
November of 21	DATE	0.99+
both	QUANTITY	0.99+
Boston	LOCATION	0.99+
more than 400 companies	QUANTITY	0.99+
Kristen Martin	PERSON	0.99+
MySQL	TITLE	0.99+
Moogsoft	ORGANIZATION	0.99+
The Cube	ORGANIZATION	0.99+
third	QUANTITY	0.99+
Grafana	ORGANIZATION	0.99+
H2O	ORGANIZATION	0.99+
Mike Speiser	PERSON	0.99+
david.vellante@siliconangle.com	OTHER	0.99+
second	QUANTITY	0.99+
two	QUANTITY	0.99+
first	QUANTITY	0.99+
28%	QUANTITY	0.99+
16%	QUANTITY	0.99+
Second	QUANTITY	0.99+

Breaking Analysis: Emerging Tech sees Notable Decline post Covid-19

>> Announcer: From theCUBE studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE conversation. >> As you may recall, coming into the second part of 2019 we reported, based on ETR Survey data, that there was a narrowing of spending on emerging tech and an unplugging of a lot of legacy systems. This was really because people were going from experimentation into operationalizing their digital initiatives. When COVID hit, conventional wisdom suggested that there would be a flight to safety. Now, interestingly, we reported with Eric Bradley, based on one of the Venns, that a lot of CIOs were still experimenting with emerging vendors. But this was very anecdotal. Today, we have more data, fresh data, from the ETR Emerging Technology Study on private companies, which really does suggest that there's a notable decline in experimentation, and that's affecting emerging technology vendors. Hi, everybody, this is Dave Vellante, and welcome to this week's Wikibon Cube Insights, powered by ETR. Once again, Sagar Kadakia is joining us. Sagar is the Director of Research at ETR. Sagar, good to see you. Thanks for coming on. >> Good to see you again. Thanks for having me, Dave. >> So, it's really important to point out, this Emerging Tech Study that you guys do, it's different from your quarterly Technology Spending Intention Survey. Take us through the methodology. Guys, maybe you could bring up the first chart. And, Sagar, walk us through how you guys approach this. >> No problem. So, a lot of the viewers are used to seeing a lot of the results from the Technology Spending Intention Survey, or the TSIS, as we call it. That study, as the title says, it really tracks spending intentions on more pervasive vendors, right, Microsoft, AWS, as an example. What we're going to look at today is our Emerging Technology Study, which we conduct biannually, in May and November. This study is a little bit different. 
We ask CIOs around evaluations, awareness, planned evaluations, so think of this as pre-spend, right. So that's a major differentiator from the TSIS. That, and this study really focuses on private emerging providers. We're really only focused on those really emerging private companies, say, like your Series B to Series G or H, whatever it may be, so, two big differences within those studies. And then today what we're really going to look at is the results from the Emerging Technology Study. Just a couple of quick things here. We had 811 CIOs participate, which represents about 380 billion in annual IT spend, so the results from this study matter. We had almost 75 Fortune 100s take it. So, again, we're really measuring how private emerging providers are doing in the largest organizations. And so today we're going to be reviewing notable sectors, but largely this survey tracks roughly 356 private technologies and frameworks. >> All right, guys, bring up the pie chart, the next slide. Now, Sagar, this is sort of a snapshot here, and it basically says that 44% of CIOs agree that COVID has decreased the organization's evaluation and utilization of emerging tech, despite what I mentioned, Eric Bradley's Venn, which suggested one CIO in particular said, "Hey, I always pick somebody in the lower left of the magic quadrant." But, again, this is a static view. I know we have some other data, but take us through this, and how this compares to other surveys that you've done. >> No problem. So let's start with the high level takeaways. And I'll actually kind of get into the point that Eric was debating, 'cause that point is true. It's just really how you kind of slice and dice the data to get to that. So, what you're looking at here, and what the overall takeaway from the Emerging Technology Study was, is, you know, you are going to see notable declines in POCs, or proofs of concept, and evaluations because of COVID-19. 
Even though we had been communicating for quite some time, you know, the last few months, that there's increasing pressure for companies to further digitize with COVID-19, there are IT budget constraints. There is a huge pivot in IT resources towards supporting remote employees, a decrease in risk tolerance, and so that's why what you're seeing here is a rather notable number of CIOs, 44%, that said that they are decreasing their organization's evaluation and utilization of private emerging providers. So that is notable. >> Now, as you pointed out, you guys run this survey a couple of times a year. So now let's look at the time series. Guys, if you bring up the next chart. We can see how the sentiment has changed since last year. And, of course, we're isolating here on some of larger companies. So, take us through what this data means. >> No problem. So, how do we quantify what we just saw in the prior slide? We saw 44% of CIOs indicating that they are going to be decreasing their evaluations. But what exactly does that mean? We can pretty much determine that by looking at a lot of the data that we captured through our Emerging Technology Study. There's a lot going on in this slide, but I'll walk you through it. What you're looking at here is Fortune 1000 organizations, so we've really isolated the data to those organizations that matter. So, let's start with the teal, kind of green line first, because I think it's a little bit easier to understand. What you're looking at, Fortune 1000 evaluations, both planned and current, okay? And you're looking at a time series, one year ago and six months ago. So, two of the answer options that we provide CIOs in this survey, right, think about the survey as a grid, where you have seven answer options going horizontally, and then 300-plus vendors and technologies going vertically. 
For any given vendor, they can essentially indicate one of these options, two of them being "I'm currently evaluating them" or "I plan to evaluate them in six months." So what you're looking at here is effectively the aggregate number, or the average number of Fortune 1000 evaluations. So if you look into May 2019, all the way on the left of that chart, that 24% roughly means that a quarter of the selections made by Fortune 1000 respondents in the survey were "plan to evaluate" or "currently evaluating." If you fast-forward six months, to the middle of the chart, November '19, it's roughly the same: for one in four technologies that Fortune 1000 respondents selected, they indicated that they plan to evaluate or are currently evaluating them. But now look at that big drop off going into May 2020, the 17%, right? So now one out of every six technologies, or one out of every six selections that they made, was an evaluation. So a very notable drop. And then if you look at the blue line, this is another answer option that we provided CIOs: I'm aware of the technology but I have no plans to evaluate. So this answer option essentially tracks awareness levels. If you look at the last six months, look at that big uptick from 44% to over 50%, right? So now, essentially one out of every two technologies, or private technologies that a CIO is aware of, they have no plans to evaluate. So this is going to have an impact on the general landscape, when we think about those private emerging providers. But there is one caveat, and, Dave, this is what you mentioned earlier, this is what Eric was talking about. The providers that are doing well are the ones that are work-from-home aligned. And so, just like a few years ago, we were really analyzing results based on are you cloud-native or are you Cloud-aligned, because those technologies are going to do the best, what we're seeing in the emerging space is now the same thing. 
Those emerging providers that enable organizations to maintain productivity for their employees, essentially allowing their employees to work remotely, those emerging providers are still doing well. And that is probably the second biggest takeaway from this study. >> So now what we're seeing here is this flight to perceived safety, which, to your point, Sagar, doesn't necessarily mean good news for all enterprise tech vendors, but certainly for those that are positioned for the work-from-home pivot. So now let's take a look at a couple of sectors. We'll start with information security. We've reported for years about how the perimeter's been broken down, and that more spend was going to shift from inside the moat to a distributed network, and that's clearly what's happened as a result of COVID. Guys, if you bring up the next chart. Sagar, you take us through this. >> No problem. And as you imagine, I think that the big theme here is zero trust. So, a couple of things here. And let me just explain this chart a little bit, because we're going to be going through a couple of these. What you're seeing on the X-axis here is effectively what we're classifying as near term growth opportunity from all customers. The way we measure that effectively is we look at all the evaluations, current evaluations, planned evaluations, we look at people who have evaluated and plan to utilize these vendors. The more indications you get on that the more to the top right you're going to be. The more indications you get around I'm aware of but I don't plan to evaluate, or I'm replacing this early-stage vendor, the further down and on the left you're going to be. So, on the X-axis you have near term growth opportunity from all customers, and on the Y-axis you have near term growth opportunity from, really, the biggest shops in the world, your Global 2000, your Forbes Private 225, like Cargill, as an example, and then, of course, your federal agencies. 
So you really want to be positioned up and to the right here. So, the big takeaway here is zero trust. So, just a couple of things on this slide when we think about zero trust. As organizations accelerate their Cloud and SaaS spend because of COVID-19, and, you know, what we were talking about earlier, Dave, remote work becomes the new normal, that perimeter security approach is losing appeal, because the perimeter's less defined, right? Apps and data are increasingly being stored in the Cloud. That, and employees are working remotely from everywhere, and they're accessing all of these items. And so what we're seeing now is a big move into zero trust. So, if we look at that chart again, what you're going to see in that upper right quadrant are a lot of identity and access management players. And look at the bifurcation in general. This is what we were talking about earlier in terms of the landscape not doing well. Most security vendors are in that red area, you know, in the middle to the bottom. But if you look at the top right, what are you seeing here? UnifyID, Auth0, WSO2, right, all identity and access management players. These are critical in your zero trust approach, and this is one of the few areas where we are seeing upticks. You also see here BitSight, Lucideus. So that's going to be security assessment. You're seeing Vectra and Netskope and Darktrace, and a few others here, in cloud security and IDPS, intrusion detection and prevention systems. So, very few sectors are seeing an uptick, very few security sectors actually look pretty good, based on opportunities that are coming. But, essentially, all of them are in that work-from-home aligned security stack, so to speak. >> Right, and of course, as we know, as we've been reporting, buyers have options, from both established companies and these emerging companies that are public, Okta, CrowdStrike, Zscaler. 
We've seen the work-from-home pivot benefit those guys, but even Palo Alto Networks, even Cisco, I asked (other speaker drowns out speech) last week, I said, "Hey, what about this pivot to work from home? What about this zero trust?" And he said, "Look, the reality is, yes, a big part of our portfolio is exposed to that traditional infrastructure, but we have options for zero trust as well." So, from a buyer's standpoint, that perceived flight to safety, you have a lot of established vendors, and that clearly is showing up in your data. Now, the other sector that we want to talk about is database. We've been reporting a lot on database, data warehouse. So, why don't you take us through the next graphic here, if you would. >> Sagar: No problem. So, our theme here is that Snowflake is really separating itself from the pack, and, again, you can see that here. Private database and data warehousing vendors really continue to impact a lot of their public peers, and Snowflake is leading the way. We expect Snowflake to gain momentum in the next few years. And, look, there's some rumors that they're IPOing soon. And so when we think about that set-up, we like it, because as organizations transition away from hybrid Cloud architectures to 100% or near-100% public Cloud, Snowflake is really going to benefit. So they look good. DataStax looks pretty good, right, that's resiliency, redundancy across data centers. So we kind of like them as well. Redis Labs and MariaDB look pretty good here on the opportunity side, but we are seeing a little bit of churn, so I think probably Snowflake and DataStax are probably our two favorites here. And again, when you think about Snowflake, we continue to think more pervasive vendors, like Teradata and Cloudera, and some of the other larger database firms, they're going to continue seeing wallet and market share losses due to some of these emerging providers. >> Yeah. 
If you could just keep that slide up for a second, I would point out, in many ways Snowflake is kind of a safer bet, you know, we talk about flight to safety, because they're well-funded, they're established. You can go from zero to Snowflake very quickly, that's sort of their mantra, if you will. But I want to point out and recognize that it is somewhat oranges and tangerines here, Snowflake being an analytical database. You take MariaDB, for instance, I look at that, anyway, as relational and operational. And then you mentioned DataStax. I would say Couchbase, Redis Labs, Aerospike. Cockroach is really a... key-value store. You've got some non-relational databases in there. But we're looking at the entire sector of databases, which has become a really interesting market. But again, some of those established players are going to do very well, and I would put Snowflake on that cusp. As you pointed out, Bloomberg broke the story, I think last week, that they were contemplating an IPO, which we've known for a while. >> Yeah. And just one last thing on that. We do like some of the more pervasive players, right. Obviously, AWS, all their products, Redshift and DynamoDB. Microsoft looks really good. It's just really some of the other legacy ones, like the Teradatas, the Oracles, the Hadoops, right, that are going to be impacted. And so the cloud providers look really good. >> So, the last decade has really brought forth this whole notion of DevOps, infrastructure as code, the whole API economy. And that's the piece we want to jump into now. And there are some real stand-outs here, you know, despite the early data that we showed you, where CIOs are less prone to look at emerging vendors. There are some, for instance, if you bring up the next chart, guys, like Hashi, that really are standing out, aren't they? >> That's right, Dave. So, again, what you're seeing here is you're seeing that bifurcation that we were talking about earlier. 
There are a lot of infrastructure software vendors that are not positioned well, but if you look at the ones at the top right that are positioned well... We have two kind of things on here, starting with infrastructure automation. We think a winner here is emerging with Terraform. Look all the way up to the right, how well-positioned they are, how many opportunities they're getting. And for the second straight survey now, Terraform is leading among their peers, Chef, Puppet, SaltStack. And they're leading their peers in so many different categories, notably on allocating more spend, which is obviously very important. For Chef, Puppet and SaltStack, which you can see a little bit below, probably a little bit higher than the middle, we are seeing some elevated churn levels. And so, really, Terraform looks like they're kind of separating themselves. And we've got this great quote from the CIO just a few months ago, on why Terraform is likely pulling away, and I'll read it out here quickly. "The Terraform tool creates an entire infrastructure in a box. Unlike vendors that use procedural languages, like Ansible and Chef, it will show you the infrastructure in the way you want it to be. You don't have to worry about the things that happen underneath." I know some companies where you can put your entire Amazon infrastructure through Terraform. If Amazon disappears, if your availability drops, load balancers, RDS, everything, you just run Terraform and everything will be created in 10 to 15 minutes. So that shows you the power of Terraform and why we think it's ranked better than some of the other vendors. >> Yeah, I think that really does sum it up. And, actually, guys, if you don't mind bringing that chart back up again. So, I'd point out, so, Mitchell Hashimoto, Hashi, really, I believe I'm correct, talking to Stu about this a little bit, he sort of led the Terraform project, which is an Open Source project, and, to your point, very easy to deploy. 
Chef, Puppet, Salt, they were largely disrupted by Cloud, because they're designed to automate deployment largely on-prem and DevOps, and now Terraform sort of packages everything up into a platform. So, Hashi actually makes money, and you'll see it on this slide, on things like Vault, which is kind of their security play. You see GitLab on here. That's really application tooling to deploy code. You see Docker containers, you know, Docker, really all about open source, and they've had great adoption. Docker's challenge has always been monetization. You see Turbonomic on here, which is application resource management. You can't go too deep on these things, but it's pretty deep within this sector. But we are comparing different types of companies, but just to give you a sense as to where the momentum is. All right, let's wrap here. So maybe some final thoughts, Sagar, on the Emerging Technology Study, and then what we can expect in the coming month here, on the update in the Technology Spending Intention Study, please. >> Yeah, no problem. One last thing on the zero trust side that has been a big issue that we didn't get to cover, is VPN spend. Our data is pointing out that, yes, even though VPN spend did increase the last few months because of remote work, we actually think that people are going to move away from that as they move onto zero trust. So just one last point on that, just in terms of overall thoughts, you know, again, as we cover it, you can see how bifurcated all these spaces are. Really, if we were to go sector by sector by sector, right, storage and blockchain and ML/AI and all that stuff, you would see there's a few or maybe one or two vendors doing well, and the majority of vendors are not seeing as many opportunities. And so, again, are you work-from-home aligned? Are you the best vendor of all the other emerging providers? And if you fit those two criteria then you will continue seeing POCs and evaluations. 
And if you don't fit that criteria, unfortunately, you're going to see fewer opportunities. So I think that's really the big takeaway on that. And then, just in terms of next steps, we're already transitioning now to our next Technology Spending Intention Survey. That launched last week. And so, again, we're going to start getting a feel for how CIOs are spending in 2H-20, right, so, for the back half of the year. And our question changes a little bit. We ask them, "How do you plan on spending in the back half of the year versus how you actually spent in the first half of the year, or 1H-20?" So, we're kind of tightening the screws, so to speak, and really getting an idea of what spend is going to look like in the back half, and we're also going to get some updates as it relates to budget impacts from COVID-19, as well as how vendor-relationships have changed, as well as business impacts, like layoffs and furloughs, and all that stuff. So we have a tremendous amount of data that's going to be coming in the next few weeks, and it should really prepare us for what to see over the summer and into the fall. >> Yeah, very excited, Sagar, to see that. I just wanted to double down on what you said about changes in networking. We've reported with you guys on MPLS networks, shifting to SD-WAN. But even VPN and SD-WAN are being called into question as the internet becomes the new private network. And so lots of changes there. And again, very excited to see updated data, return of post-COVID, as we exit this isolation economy. Really want to point out to folks that this is not a snapshot survey, right? This is an ongoing exercise that ETR runs, and grateful for our partnership with you guys. Check out ETR.plus, that's the ETR website. I publish weekly on Wikibon.com and SiliconANGLE.com. Sagar, thanks so much for coming on. Once again, great to have you. >> Thank you so much, for having me, Dave. I really appreciate it, as always. 
>> And thank you for watching this episode of theCube Insights, powered by ETR. This is Dave Vellante. We'll see you next time. (gentle music)
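
Editor's note: the evaluation share Sagar walks through in this episode (roughly 24% of Fortune 1000 selections in May 2019 falling to 17% in May 2020, alongside the "aware but no plans to evaluate" share rising from 44% to over 50%) is a simple ratio over survey selections. The snippet below is only a minimal sketch of that arithmetic. The answer-option labels, the function name `selection_shares`, and the sample responses are all hypothetical and made up for illustration; they are not ETR's actual options, method, or data.

```python
from collections import Counter

# Hypothetical labels: the transcript says there are seven answer options
# but names only a few, so these strings are assumptions for illustration.
EVALUATION_OPTIONS = {"currently evaluating", "plan to evaluate"}
AWARE_NO_PLANS = "aware, no plans to evaluate"


def selection_shares(selections):
    """Return (evaluation share, aware-but-no-plans share) of all selections."""
    counts = Counter(selections)
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0.0
    evaluating = sum(counts[option] for option in EVALUATION_OPTIONS)
    return evaluating / total, counts[AWARE_NO_PLANS] / total


# Made-up selections for one survey wave (not real survey data).
sample = (
    ["currently evaluating"] * 2
    + ["plan to evaluate"] * 1
    + [AWARE_NO_PLANS] * 6
    + ["not aware"] * 3
)

evaluation_share, aware_share = selection_shares(sample)
print(f"evaluation share: {evaluation_share:.0%}")
print(f"aware, no plans:  {aware_share:.0%}")
```

Running the same function on each survey wave and comparing the two shares over time would give the kind of time series described on the chart, under the assumptions stated above.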

Published Date : Jun 22 2020

SUMMARY :

leaders all around the world, Sagar is the Director of Research at ETR. Good to see you again. So, it's really important to point out, So, a lot of the viewers that COVID has decreased the of slice and dice the data So now let's look at the time series. by looking at a lot of the data is this flight to perceive safety, and on the Y-axis you have Now, the other sector that we and Snowflake is leading the way. And then you mentioned DataStax. And so the claw providers And that's the piece we "in the way you want it to be. but just to give you a sense and the majority of vendors are not seeing on what you said about Thank you so much, for having me, Dave. And thank you for watching this episode

ENTITIES

Entity	Category	Confidence
Sagar	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Eric	PERSON	0.99+
May 2019	DATE	0.99+
CISCO	ORGANIZATION	0.99+
Dave	PERSON	0.99+
two	QUANTITY	0.99+
May 2020	DATE	0.99+
Eric Bradley	PERSON	0.99+
Palo Alto	LOCATION	0.99+
Terraform	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Mitchell Hashimoto	PERSON	0.99+
100%	QUANTITY	0.99+
Zscaler	ORGANIZATION	0.99+
one	QUANTITY	0.99+
44%	QUANTITY	0.99+
ETR	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
last year	DATE	0.99+
November '19	DATE	0.99+
Palo Alto Networks	ORGANIZATION	0.99+
24%	QUANTITY	0.99+
10	QUANTITY	0.99+
17%	QUANTITY	0.99+
May	DATE	0.99+
Amazon	ORGANIZATION	0.99+
last week	DATE	0.99+
Redis Labs	ORGANIZATION	0.99+
Couchbase	ORGANIZATION	0.99+
Okta	ORGANIZATION	0.99+
Aerospike	ORGANIZATION	0.99+
COVID-19	OTHER	0.99+
Paradata	ORGANIZATION	0.99+
811 CIOs	QUANTITY	0.99+
Hashi	PERSON	0.99+
CrowdStrike	ORGANIZATION	0.99+
one caveat	QUANTITY	0.99+
November	DATE	0.99+
two criteria	QUANTITY	0.99+
Series G	OTHER	0.99+
Boston	LOCATION	0.99+
X-axis	ORGANIZATION	0.99+
today	DATE	0.99+
both	QUANTITY	0.99+
Bloomberg	ORGANIZATION	0.99+
Cloudera	ORGANIZATION	0.99+
DataStax	ORGANIZATION	0.99+
two kind	QUANTITY	0.99+
six months ago	DATE	0.99+
15 minutes	QUANTITY	0.99+
Today	DATE	0.99+
six months	QUANTITY	0.98+
Sagar Kadakia	PERSON	0.98+
about 380 billion	QUANTITY	0.98+
Oracles	ORGANIZATION	0.98+
one year ago	DATE	0.98+
MariaDB	TITLE	0.98+
over 50%	QUANTITY	0.98+
zero trust	QUANTITY	0.98+
two vendors	QUANTITY	0.98+
Series B	OTHER	0.98+
first chart	QUANTITY	0.98+

Brian Pawlowski, DriveScale | CUBEConversation, Sept 2018

(intense orchestral music) >> Hey welcome back everybody, Jeff Frick here with theCUBE. We're having a CUBE Conversation in our Palo Alto studios, getting a short little break between the madness of the conference season, which is fully upon us, and we're excited to have a long time industry veteran Brian Pawlowski, the CTO of DriveScale, joining us to talk about some of the crazy developments that continue to happen in this world that just advances, advances. Brian, great to see you. >> Good morning, Jeff, it's great to be here, I'm a bit, still trying to get used to the timezone after a long, long trip in Europe, but I'm glad to be here, I'm glad we finally were able to schedule this. >> Yes, it's never easy, (laughs) one of the secrets of our business is everyone is actually all together at conferences, it's hard to get 'em together when there's not that catalyst of a conference to bring everybody together. So give us the 101 on DriveScale. >> So, DriveScale. Let me start with, what is composable infrastructure? DriveScale provides a product for orchestrating disaggregated components on a high-performance fabric to allow you to spin up essentially your own private cloud, your own clusters for these modern applications, scale out applications. And I just said a bunch of gobbledygook, what does that mean? The DriveScale software is essentially an orchestration package that provides the ability to take compute nodes and storage nodes on a high-performance fabric and securely form multi-tenant architectures, much like you would in a cloud. When we think of application deployment, we think of a hundred nodes or 500 nodes. The applications we're looking at are things that our people are using for big data, machine learning, or AI, or, or these scale out databases. 
Things like Vertica, Aerospike, is important, DRAM, ESES, dBase database, and, this is an alternative to the standard way of deploying applications in a very static nature onto fixed physical resources, or into network storage coming from the likes of Network Appliance, sorry NetApp, and Dell EMC. It's the modern applications we're after, the big data applications for analytics. >> Right. So it's software that basically manages the orchestration of hardware, I mean of compute, storage, and network, so you can deploy big data analytics applications? >> Yes. >> Ah, at scale. >> It's absolutely focused on the orchestration part. The typical way applications that we're in pursuit of right now are deployed is on 500 physical bare metal nodes from, pick your vendor, of compute and storage that is all bundled together and then laid out into a physical deployment on the network. What we do is that you essentially disaggregate, separate compute, pure compute, no disks at all, storage into another layer, have the fabric, and we inventory it all and, much like vCenter for virtualization, for doing software deployment of applications, we do software deployment of scale out applications and a scale out cluster, so. >> Right. So you talked about using industry standard servers, industry standard storage, does the system accommodate different types of compute and CPUs, different types of storage? Whether it's high performance disks, or it's Flash, how does it accommodate those things? And if I'm trying to set up my big stack of hardware to then deploy your software to get it configured, what're some of the things I should be thinkin' about? >> That's actually, a great question, I'm going to try to hit three points. (clears throat) Absolutely. In fact, a core part of our orchestration layer is to essentially generalize the compute and storage components and the networking components of your data center, and do rule-based, constraint-based selection when creating a cluster. 
From your perspective when creating a cluster (coughs) you say "I want a hundred nodes, and I'm going to run this application on it, and I need this environment for the application." And this application is running on local, it thinks it's running local, bare metal, so. You say "A hundred nodes, eight cores each minimum, and I want 64 gig of memory minimum." It'll go out and look at the inventory and do a best match of the components there. You could have different products out there, we are compute agnostic, storage agnostic, you could have mix and match, we will basically do a best fit match of all of your available resources and then propose to you in a couple seconds back with the cluster you want, and then you just hit go, and it forms a cluster in a couple seconds. >> A virtual cluster within that inventory of assets that I-- >> A virtual cluster that-- Yes, out of the inventory of assets, except from the perspective of the application it looks like a physical cluster. This is the critical part of what we do, is that, somebody told me "It's like we have an extension cord between the storage and the compute nodes." They used this analogy yesterday and I said I was going to reuse it, so if they listen to this: Hey, I stole your analogy! We basically provide a long extension cord to the direct-attached storage, except we've separated out the storage from the compute. What's really cool about that, it was the second point of what you said is that you can mix and match. The mix and match occurs because one of the things you're doing with your compute and storage is refreshing your compute and storage at three to five year cycles, separately. When you have the old style model of combining compute and storage in what I'd call a captured DAS scenario. 
You are forced to do refreshes of both compute and persistent storage at the same time, it just becomes, it's an unmanageable position to be in, and separating out the components provides you a lot of flexibility from mixing and matching different types of components, doing rolling upgrades of the compute separate from the storage, and then also having different storage tiers that you can combine. SSD storage, the biggest tiers today are SSD storage and spinning disk storage, being able to either provide spinning disk, SSDs, solid-state storage, or a mixture of both for a hybrid deployment for an application without having to worry at purchase time about having to configure your box that way, we just basically do it on the fly. >> Right. So, and then obviously I can run multiple applications against that big stack of assets, and it's going to go ahead and parse the pieces out that I need for each application. >> We didn't even practice this beforehand, that was a great one too! (laughs) A key part of this is actually providing a secure multi-tenant environment, which is the phrase I use, because it's a common phrase. Our target customer is running multiple applications. 2010, when somebody was deploying big data, they were deploying Hadoop. Quickly, (snaps) think, what were the other things then? Nothing. It was Hadoop. Today it's 10 applications, all scale out, all having different requirements for the reference architecture for the amount of compute and storage. So, our orchestration layer basically allows you to provision separate virtual physical clusters in a secure, multi-tenant way, cryptographically secure, and you could encrypt the data too if you wanted; you could turn on encryption to get over-the-wire encryption along with data-at-rest encryption, think GDPR and stuff like that. But, the different clusters cannot interfere with each other's workloads, and because you're on a fully switched Ethernet fabric, they don't interfere with performance either. 
But that secure multi-tenant part is critical for the orchestration and management of multiple scale out clusters. >> So then, (light laugh) so in theory, if I'm doing this well, I can continually add capacity, I can upgrade my drives to SSDs, I can put in new CPUs as new great things come out into my big cloud, not my cloud, but my big bucket of resources, and then using your software continue to deploy those against applications as is most appropriate? >> Could we switch seats? (both laugh) Let me ask the questions. (laughing) No, because it's-- >> It sounds great, I just keep adding capacity, and then it redeploys based on the optimum, right? >> That's a great summary because the thing that we're-- the basic problem we're trying to solve is that... This is like the lesson from VMware, right? One lesson from VMware was, first it was, we had unused CPU resources, let's get those unused CPU cycles back. No CPU cycle shall go unused! Right? >> I thought that they needed to keep 50% overhead, just to make sure they didn't bump against the roof. But that's a different conversation. >> That's a little detail, (both laugh) that's a little detail. But anyway. The secondary effect was way more important. Once people decoupled their applications from physical purchase decisions and rolling out physical hardware, they stopped caring about any critical piece of hardware, they then found that the simplified management, the one button push software application deployment, was a critical enabler for business operations and business agility. So, we're trying to do what VMware did for that kind of captured legacy application deployments, we're trying to do that for essentially what has been historically, bare metal, big data application deployment, where people were... Seriously in 2012, 2010, 2012, after virtualization took over the data center, and the IT manager had his cup of coffee and he's layin' back goin' "Man, this is great, I have nothing else to worry about." 
Then there's a (knocks) and the guy comes in his office, or his cube, and goes "Whaddya want?!" and he goes "Well, I'd like you to deploy 500 bare metal nodes to run this thing called Hadoop." and he goes "Well, I'll just give you 500 virtualized instances." and he goes "Nope, not good enough! I want to start going back to bare metal." And since then it's gotten worse. So what we're trying to do is restore the balance in the universe, and apply for the scale out clusters what virtualization did for the legacy applications. Does that make a little bit of sense? >> Yeah! And is it heading in the other direction, right, towards the atomic? So if you're trying to break the units of compute and store down to the base, so you've got a unified baseline that you can apply more volume than maybe a particular feature set, in a particular CPU, or a particular, characteristic of a particular type of a storage? >> Right. >> This way you're doing it in software, and leveraging a whole bunch of it to satisfy, as you said kind of the meets min for that particular application. >> Yeah, absolutely. And I think, kind of critical about the timing of all this is that virtualization drove, very much, a model of commoditization of CPUs, once VMware hit there, people weren't deploying applications on particular platforms, they were deploying applications on a virtualized hardware model, and that was how applications were always thought about from then on. A lot of these scale out applications, not a lot of them, all of them, are designed to be hardware agnostic. They want to run on bare metal 'cause they're designed to run, when you deploy a bare metal application for scale out, like Apache Spark, it uses all of the CPU on the machine, you don't need virtualization because it will use all the CPU, it will use all the bandwidth and the disks underneath it. 
What we're doing is separating it out to provide lifecycle management between the two of them, but also allow you to change the configurations dynamically over time. But, this word of atomic kinda's a-- the disaggregation part is the first step for composability. You want to break it out, and I'll go here and say that the enterprise storage vendors got it right at one point, I mean, they did something good. When they broke out captured storage to the network and provided a separation of compute and storage, before virtualization, that was a step towards gaining control and a sane management approach to what are essentially very different technologies evolving at very different speeds. And then your comment about "So what if you want to basically replace spinning disks with SSDs?" That's easily done in a composable infrastructure because it's a virtual function, you're just using software, software-defined data center, you're using software, except for the set of applications that just slip past what was being done in the virtualized infrastructure, and the network storage infrastructure. >> Right. And this really supports kind of the trend that we see, which is the new age, which is "No, don't tell me what infrastructure I have, and then I'll build an app and try and make it fit." It's really app first, and the infrastructure has to support the app, and I don't really care as a developer and as a competitive business trying to get apps to satisfy my marketplace, the infrastructure, I'm just now assuming, is going to support whatever I build. This is how you enable that. >> Right. And very importantly, the people that are writing all of these apps, the tons of low apps, Apache-- by the way, there's so many Apache things, Apache Kafka, (laughing) Apache Spark, the Hadoops of the world, the NoSQL databases, >> Flinks, and Oracle, >> Cassandra, Vertica, things that we consider-- >> MongoDB, you got 'em all. MongoDB, right. 
Let's just keep rolling these things off our tongue. >> They're all CUBE alumni, so we've talked to 'em all. >> Oh, this is great. >> It's awesome. (laughs) >> And they're all brilliant technologists, right? And they have defined applications that are so, so good at what they do, but they didn't all get together beforehand and say, "Hey, by the way, how can we work together to make sure that when this is all deployed, and operating in pipelines, and in parallel, that from an IT management perspective, it all just plays well together?" They solved their particular problems, and when it was just one application being deployed no harm no foul, right? When it's 10 applications being deployed, and all of a sudden the line item for big data application starts creeping past five, six, approaching 10%, people start to get a little bit nervous about the operational cost, the management cost, deployability, I talked about lifecycle management, refreshes, tech refreshes, expansion, all these things that when it's a small thing over there in the corner, okay, I'll just ignore it for a while. Yeah. Do you remember the old adventure game pieces? (Jeff laughs) I'm dating myself. >> What's adventure game, I don't know? (laughs) >> Yeah, when you watered a plant, "Water, please! Water, please!" The plant, the plant in there looked pitiful, you gave it water and then it goes "Water! Water! Give me water!" Then it starts to attack, but. >> I'll have to look that one up. (both laugh) Alright so, before I let you go, you've been at this for a while, you've seen a lot of iterations. As you kind of look forward over the next little while, kind of what do you see as some of the next kind of big movements or kind of big developments as kind of the IT evolution, and every company's now an IT company, or software company continues? >> So, let's just say that this is a great time, why I joined DriveScale actually, a couple reasons. This is a great time for composable infrastructure. 
It's like "Why is composable infrastructure important now?" It does solve a lot of problems, you can deploy legacy applications over and stuff, but, they don't have any pain points per se, they're running in their virtualization infrastructure over here, the enterprise storage over here. >> And IBM still sells mainframes, right? So there's still stuff-- >> IBM still sells mainframes. >> There's still stuff runnin' on those boxes. >> Yes there is. (laughs) >> Just let it be, let it run. >> This came up in Europe. (laughs) >> And just let it run, but there's no pain point there, with these increasingly deployed scale out applications, 2004 when the clock speed limit was hit, and then everything went multi-core and then parallel applications became the norm, and then it became scale out applications for the Facebooks of the world, the Googles of the world, whatever. >> Amazon, et cetera. >> For their applications, that scale out is becoming the norm moving forward for application architecture, and application deployment. The more data that you process, the more scale out you need, and composable infrastructure is becoming a-- is a critical part of getting that under control, and getting you the flexibility and manageability to allow you to actually make sense of that deployment, in the IT center, in the large. And the second thing I want to mention is that, one thing is that Flash has emerged, and that's driven something called NVMe over Fabrics, essentially a high-performance fabric interconnect for providing essentially local latency to remote resources; that is part of the composable infrastructure story today, and you're basically accessing with the speed of local access to solid state memory, you're accessing it over the fabric, and all these things are coming together driving a set of applications that are becoming both increasingly important, and increasingly expensive to deploy. 
And composable infrastructure allows you to get a handle on controlling those costs, and making it a lot more manageable. >> That's a great summary. And clearly, the amount of data, that's going to be coming into these things is only going up, up, up, so. Great conversation Brian, again, we still got to go meet at Terún, later so. >> Yeah, we have to go, yes. >> We will make that happen with ya. >> Great restaurant in Palo Alto. >> Thanks for stoppin' by, and, really appreciate the conversation. >> Yeah, and if you need to buy DriveScale, I'm your guy. (both laughing) >> Alright, he's Brian, I'm Jeff, you're watching the CUBE Conversation from our Palo Alto studios. Thanks for watchin', we'll see you at a conference soon, I'm sure. See ya next time. (intense orchestral music)
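
The constraint-based selection Brian describes in the conversation ("a hundred nodes, eight cores each minimum, 64 gig of memory minimum", matched against an inventory) can be illustrated with a minimal Python sketch. This is not DriveScale's implementation or API: the `Node` fields, the `select_nodes` function, and the "tightest fit first" ordering are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A hypothetical inventory entry for one diskless compute node."""
    name: str
    cores: int
    memory_gb: int


def select_nodes(inventory, count, min_cores, min_memory_gb):
    """Pick `count` nodes that meet the minimums, tightest fit first.

    Assumption: "best fit" here means the smallest surplus over the
    requested minimums, so larger nodes stay free for bigger requests.
    """
    eligible = [
        n for n in inventory
        if n.cores >= min_cores and n.memory_gb >= min_memory_gb
    ]
    if len(eligible) < count:
        raise ValueError(
            f"only {len(eligible)} of {count} requested nodes are available"
        )
    # Sort by surplus cores, then surplus memory, then name for determinism.
    eligible.sort(
        key=lambda n: (n.cores - min_cores, n.memory_gb - min_memory_gb, n.name)
    )
    return eligible[:count]


# A toy inventory; a real one would hold hundreds of nodes.
inventory = [
    Node("a", 8, 64),
    Node("b", 16, 128),
    Node("c", 4, 64),    # too few cores
    Node("d", 8, 32),    # too little memory
    Node("e", 12, 64),
]

cluster = select_nodes(inventory, count=2, min_cores=8, min_memory_gb=64)
print([n.name for n in cluster])
```

Under these assumptions the proposal step is just a filter and a sort over the inventory; a real orchestrator would also have to weigh storage tiers, fabric topology, and tenancy, none of which this sketch models.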

Published Date : Sep 28 2018

ENTITIES

Entity	Category	Confidence
Brian Pawlowski	PERSON	0.99+
Jeff	PERSON	0.99+
Jeff Frick	PERSON	0.99+
Brian	PERSON	0.99+
50%	QUANTITY	0.99+
Europe	LOCATION	0.99+
10 applications	QUANTITY	0.99+
2012	DATE	0.99+
Palo Alto	LOCATION	0.99+
two	QUANTITY	0.99+
2010	DATE	0.99+
Sept 2018	DATE	0.99+
IBM	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
2004	DATE	0.99+
five year	QUANTITY	0.99+
three	QUANTITY	0.99+
500 nodes	QUANTITY	0.99+
One lesson	QUANTITY	0.99+
MongoDB	TITLE	0.99+
both	QUANTITY	0.99+
six	QUANTITY	0.99+
yesterday	DATE	0.99+
64 gig	QUANTITY	0.99+
eight cores	QUANTITY	0.99+
10%	QUANTITY	0.99+
Network Appliance	ORGANIZATION	0.98+
one application	QUANTITY	0.98+
first step	QUANTITY	0.98+
five	QUANTITY	0.98+
each application	QUANTITY	0.98+
second point	QUANTITY	0.98+
VMware	ORGANIZATION	0.97+
DriveScale	ORGANIZATION	0.97+
GDPR	TITLE	0.97+
101	QUANTITY	0.97+
today	DATE	0.97+
Cassandra	TITLE	0.97+
Today	DATE	0.96+
second thing	QUANTITY	0.96+
CUBE	ORGANIZATION	0.96+
one	QUANTITY	0.96+
NoSQL	TITLE	0.96+
each	QUANTITY	0.96+
Facebooks	ORGANIZATION	0.96+
one thing	QUANTITY	0.95+
one point	QUANTITY	0.95+
both laugh	QUANTITY	0.95+
first	QUANTITY	0.94+
Googles	ORGANIZATION	0.94+
Dell EMC	ORGANIZATION	0.94+
NetApp	ORGANIZATION	0.93+
Apache	ORGANIZATION	0.91+
three points	QUANTITY	0.91+
DriveScale	TITLE	0.88+
Terún	ORGANIZATION	0.88+
500 bare metal nodes	QUANTITY	0.88+
Flinks	TITLE	0.87+
Vertica	TITLE	0.86+
a hundred nodes	QUANTITY	0.85+
vCenter	TITLE	0.84+
CUBEConversation	EVENT	0.83+
couple seconds	QUANTITY	0.83+
500 physical bare metal nodes	QUANTITY	0.81+
couple	QUANTITY	0.81+
Aerospike	TITLE	0.78+
500 virtualized	QUANTITY	0.77+
hundred nodes	QUANTITY	0.76+
secondary	QUANTITY	0.76+
one button	QUANTITY	0.72+
Spark	TITLE	0.68+

Search Results for Aerospike: