Ed Walsh, ChaosSearch | CUBE Conversation May 2021

>>So-called big data promised to usher in a new era of innovation where companies competed on the basis of insights and agile decision making. There's little question that social media giants, search leaders and e-commerce companies benefited. They had the engineering shops and the execution capabilities to take troves of data and turn them into piles of money. But many organizations were not as successful. They invested heavily in data architectures, tooling and hyper-specialized experts to build out their data pipelines, yet they still struggle today to truly realize their vision. Data in their lakes is plentiful, but actionable insights, not so much. ChaosSearch is a cloud-based startup that wants to change this dynamic with a new approach designed to simplify and accelerate time to insights and dramatically lower cost. With us to discuss his company and its vision for the future is CUBE alum Ed Walsh. Ed, great to see you. Thanks for coming back on theCUBE.
>>I always love to be here. Thank you very much. It's always a warm welcome. Thank you.
>>All right, so give us the update. You guys have had some big funding rounds, you're making real progress on the tech and taking it to market. What's new with ChaosSearch?
>>Sure. Actually, a lot of good, exciting things have happened. In fact, just this month we announced some pretty exciting things. We unveiled what we consider the industry's first multi-model data lake platform. In fact, if you want to show the image you can, but basically we allow you to put your data in S3, and then what we do is activate that data: we do a full index of the data and make it available through open APIs. The key thing about that is it allows your end users to use the tools they're using today. So simply put your data in your cloud object storage: think Amazon S3 and Glacier, think of all the different data.
That's a natural act, and then we do the hard work. The key thing is you get one unified data lake, but with multi-model access. We expose APIs like the Elasticsearch API, so you can do things like search, or use Kibana to do log analytics, but you can also do things like SQL, use Tableau or Looker, or bring relational concepts into Kibana, things like joins in the data back end. It will also allow you to do machine learning, which is coming early next year. And what you get with that, because of the data lake philosophy, is no new transformations and none of the data movement. People typically land data in S3, and we're on the shoulders of giants with S3. There's not a better, more cost-effective platform. There's none more resilient. There's not a better queuing system out there, and it's on a cost curve that you can't beat. So people store a lot of data in S3, but what you have to do is ETL it out to other locations. What we do is allow you to literally keep it in place. We index in place. We write our hot index, a read-write index, allow you to go after that, and publish open APIs. What we avoid is the ETL process. Our index looks at the data and does full schema discovery and normalization, and we're able to give sample sets. Then the Refinery allows you to do advanced transformations using code. Think about using SQL or using regex to change that data, pull the data apart, hide things, and use role-based access to give that to the end user. But it's in a format that their tools understand: Kibana will use the Elasticsearch API, using Elasticsearch calls, but you can also use SQL and go directly after the data. By doing that, you get a data lake, but you haven't had to take the three weeks to three months to transform your data that everyone else makes you take. And you talk about the failure. The idea of data lakes was: put your data there, in a very scalable, resilient environment. Don't do transformation.
It was too hard to structure for databases and data warehouses. Put it there, and we'll show you how to get value out. That went largely undelivered. But we're that last mile. We do exactly that. Just put it in S3 and we activate it, and we activate it with APIs that the tools your analysts use today, or want to use in the future, understand. That is what's so powerful. So basically we're on the shoulders of giants with S3: put it there and we light it up, and that's really the last mile. It's this multi-model access, but it's also this lack of transformation. We can do all the transformation, but it's all done virtually and available immediately. You're not doing extended ETL projects with big teams moving around a lot of data in the enterprise. In fact, most of the time they land it in S3 and they move it somewhere, and they move it again. What we're saying is: now just leave it in place, and we'll index it and make it available.
>>So the reason that's interesting: the reason they wanted to move into S3 is that it was the original cloud object storage. It was a cheap bucket. Okay. But it's become much more than that. When you talk to customers, they're like, hey, I have all this data in S3. I want to do something with it. I want to apply machine intelligence. I want to search it. I want to do all these things. But you're right, I oftentimes have to move it to do that. So that's a huge value. Now, are you available in the AWS Marketplace yet?
>>You know, in fact, that was the other announcement to talk about. Our solution is 100% available on AWS Marketplace, which is great for clients because they can burn down their credits with Amazon.
>>Yeah, that's great news. Now let's talk a little bit more about data lakes. You know, the old tongue-in-cheek joke was that data lakes become data swamps. You sort of go: no schema on write, oh great, I can put everything into the lake. And then it's like, okay, now what?
So maybe double-click on that a little bit and provide a little more detail on your vision there and your philosophy.
>>So if you could put things in a data lake and get after it with your own tools, on Elastic or search, of course you'd do that, if you didn't have to go through all that. But everyone thinks it's the status quo. Everyone has to put it in some sort of schema in a database before they can get access to it, so what everyone does is move it someplace to do that. Now they're using 1970s and maybe 1980s technology, and they're saying, I'm going to put it in this database, it works on the cloud, and you can go after it. But you have to do all the same pain of transformation, which is what takes human time. We say time, cost and complexity. It takes time to do a transformation for an end user. It takes a lot of time. But it also takes a team's time to do it, with DBAs and data scientists to do exactly that. And it's not one thing going on. So it takes three weeks to three months in an enterprise. It's cost and complexity. With all these pipelines, for every data request you're trying to give them their own dataset. It ends up being data puddles all over. It might be in your data lake, but it's all separated. Hard to govern. Hard to manage. What we do is we stop that. We index in place. Your data is already in S3. Typically you're ETLing it out. You can continue doing that. We really are just one more use of the data. We do read-only access. We do not change that data, and you give us a place to write our index. It's a full read-write index. Once we've done that, the Refinery allows you to activate that data. It's immediately fully indexed and performant from Kibana. So you no longer have to take your data, move it, and build a pipeline into Elasticsearch, which becomes kind of brittle at scale. You have the scale of S3 but use the exact same tools you do today.
And what we find for log analytics, and it's a slightly different use case and value prop for log analytics than for BI or what we're doing with private companies, is that with logs we're saving clients 50 to 80% in hard dollars. They're going from very limited datasets to unlimited datasets, whatever they want to keep in S3 and Glacier. But they're also getting away from the brittle data layer, which is the Lucene environment. Any of those data layers hold you back, because it takes time to put data there, but more importantly, it becomes brittle at scale, whereas you don't have any of that scale issue when using S3 as your data lake.
>>So what are the big use cases, Ed? You mentioned log analytics. Maybe you can talk about that. And are there any others that are sort of forming in the marketplace? Any patterns that you see?
>>Because of the multi-model access, we can do a lot of different use cases, but we always work with clients on high-ROI use cases. Why? The big bang theory of do a data lake and put everything in it has just proven not to work, right? So the first use case we're focusing on is log analytics. Why? As with everything, there was a tipping point, right? People were on a model of save money here, invest here. It went quickly to, no, no, we're going cloud native, and we have to. And then on top of it, it was, how do we efficiently innovate? So the tipping point happened, and everyone's going cloud native. Once you go cloud native, the amount of machine-generated data that comes from the environment just explodes. You're not managing hundreds or thousands or maybe 10,000 endpoints, you're dealing with millions or billions, and you also need to get insights out of it. So logs become one of the things you can't keep up with.
I think I mentioned, we went to a group of end users, it was only 60 enterprise clients, and we asked them, what's your capture rate on logs, and what do you want it to be? About 80%, actually 78%, said, listen, we want to capture 80 to 100% of our logs. That would be the ideal: not everything, but we need most of it. And then we asked the same group, what are you doing? Well, 82% had less than 50%. They just can't keep up with it, and that's everything, including Elastic and Splunk. They work harder in the ETL process to narrow down and keep less and less data. Why? Because they can't handle the scale. We just say, land it there, don't transform, and we'll make it all available to you. So for log analytics, especially with cloud native, you need this type of technology and you need to stop. It feels so good when you stop hitting your head against the wall, right? This ETL process at this type of scale just doesn't work. So that's exactly what we're delivering, and that's using the Elastic API, but also using SQL to go after the same data representation. When we come out with machine learning, you can also do anomaly detection on the same data representation. So for a log analytics use case, SREs, DevOps, SecOps, it's a huge value prop. Now, the second use case: on the same platform, because it has SQL exposed, you can do what we call agile BI. Think about Looker, Tableau, Power BI, Metabase, all these toolsets that people want to give to end users in the business, who are coming back to the centralized team every single week asking for new datasets. And the team has to set up a dataset; they have to do an ETL process to give access to that data. Whereas with us, because of the way it's just landed in the bucket, if you have access to that with role-based access, I can literally get you access to that with your toolset, let's say Tableau or Looker.
You know, these different datasets, literally in five minutes, and now you're off and running. And if you want a new dataset, they give you another virtual view and you're off and running, but with full governance. It used to be that in BI you either had self-service or centralized. Self-service is kind of out of control, but we can move fast. With the centralized team, it takes me months, but at least I'm in control. We allow you to do both: fully governed, but self-service. Right?
>>I've got to have Looker, I've got to have Excel. And that's the trade-off on each of the pieces of the triangle, right?
>>And they make it sound easy: oh, just put in a data source and you're done. But the problem is you have to ETL the data source, and that's what takes the three weeks to three months in an enterprise. We do it virtually in five minutes. Now, the third is actually kind of a combination of the two. You'll love the beers-and-diapers story. Think about the early days of Teradata, where they looked at sales-out data for a business. They were able to look at all the sales-out data in a large relational environment, crunch all these numbers, and figure out that, by locating products differently in the store, they'd sell more. And they came up with the analogy everyone talked about, beers and diapers: if you put them together, you sell more. Why? Because in the afternoon, for anyone that has kids, you pick up diapers and you might want to grab a beer if you're home with the kids. But that analogy was 30 years ago. Now, what's the shelf space for a product company? It's the website. It's actually the data coming from there. It's actually the app logs, and you're not capturing them, because you can't in these environments. Or you're capturing the data, but everyone's telling you, you've got to do an ETL process to keep less data.
You've got to select, you've got to be very specific, because it's going to kill your budget. You can't do that with Elastic or Splunk; you've got to keep less data, and you don't even know what questions you're going to ask. With us, bring all the app logs. Just land them in S3 or Glacier. It's really shoulders of giants, right? There's not a better platform for cost-effectiveness, security, resilience or throughput. Think about what you can stream in; it's the best queuing platform I've ever seen in the industry. Just land it there. And it's also very cost effective. We also compress the data. So by doing that, now you match that up with an actually relatively small amount of relational data, and now you have the same kind of beers-and-diapers data. But instead it's: these users are using that use case, and our top users always start with this one, then they use that feature and that feature. Or: hey, we just did new pricing, and it's affecting these clients and those clients. By doing this, we get that. But you need that data, and people aren't able to capture it with the current platforms. A data lake, as long as you can make it available hot, is a way to do it. And that's what we're doing. But we're unique in that. Other people are making you ETL it and put it in a 1970s and 1980s data format called a schema. And we avoid that, because we basically make S3 a hot analytic environment.
>>Okay. So I want to land on that for a second, because I think sometimes people get confused. I know I do sometimes. With ChaosSearch, it's like, sometimes I don't know where to put you. I'm like, okay, observability, that seems to be a hot space. And of course log analytics is part of that. BI, agile BI, you called it. But there are players like Elasticsearch, there's Starburst, there's Datadog, Databricks, Dremio, Snowflake. I mean, where do you fit? What's the category, and how do you differentiate from players like that?
>>Yeah.
So we went about it fundamentally differently than everyone else. Six years ago, Tom Hazel and his band of merry men and women came up with it and designed it from scratch. They basically purpose-built it to make S3 a hot analytic environment with open APIs. By doing that, they kind of changed the game, so we deliver on the true promise: just put it there and I'll give you access to it. No one else does that. Everyone else makes you move the data and put it in a schema of some format to get to it. So if you look at Elasticsearch, why are we going after it? It just happens to be an easy one: logs are overwhelming. Once you go cloud native, you can't afford to put it in Lucene, the ELK stack. Elastic is Lucene; it's an inverted index. Start small, great. But once you grow, it's now not one server, it's five servers, 15 servers. You lose a server, and you're down for three days because you have to rebuild the whole thing. It becomes brittle at scale, and expensive. So you trade off: I'm going to keep less, either in retention or in data. So with Elastic, we have no Elastic under the covers, but we allow you to index the data in S3, and you can access it directly through a Kibana interface or an OpenSearch interface, the API.
>>So it's just APIs.
>>It's open APIs. And by doing that you've avoided a whole bunch of time, cost and complexity: the time of your team to do it, but also the time to results, the delays of doing that, and the cost. It's crazy. We're saving 50 to 80% in hard dollars while giving you unlimited retention where you were dramatically limited before. And we're a managed service; you don't have to manage that kind of clunky environment. Not when it starts small; when it starts small, it's great. But once it's at scale, that's a terrible environment to manage. That's why you end up with not one Elasticsearch cluster but dozens. I just talked to someone yesterday who had 125 Elasticsearch clusters because of the scale.
So anyway, that's Elastic. If you're using Elastic at scale and you're having problems with the trade-off of cost, time and scale, we become a natural fit, and you don't change what your end users do.
>>So the thing is, you know, people hear this and go, wow, that sounds so simple. Why doesn't everybody do this? The reason is it's not easy. You said Tom and his merry band. This is really hardcore tech, and it's not trivial what you've built. Let's talk about your secret sauce.
>>Yeah. So it is a patented technology. If you look at our component architecture, a large part, 90%, of the value add is actually S3. I've got to give S3 full kudos. They built a platform, and we're on the shoulders of giants. But what we did is purpose-build to make object storage a hot analytic database. So we have an index, like a database, and we basically index the data. Then you bring in the Refinery to be able to do all the advanced types of transformation, but all done virtually, because we're not changing the source of record, we're changing the virtual views. And then a fabric allows you to manage it and be fully elastic. So if we have big queries, because we have multiple clients with multiple use cases, each multiple petabytes, we're spinning up 1,800 different nodes to go after a particular environment. But even with all that, we're saving them 58%. It's really the patented technology that does this. It took us six years, by the way; that's what it takes to come up with this. I came upon it because I knew the founder. I've known Tom, Thomas Hazel, for a while, and his first thing was, he figured out the math, and the math worked out. It's deep tech, it's hard tech. But the key thing about it is we've been in market now for two years, with multiple use cases in production at scale. Now what we do on the roadmap is add APIs. So we have Elasticsearch, a natural proof point.
Now we're adding SQL, which allows us to open up new markets. So we believe we deliver on the true promise of data lakes. The promise of data lakes was: put it there, don't focus on transforming, it's just too hard, I'll get insights out. And that's exactly what we do. But we're the only ones that do that; everyone else makes you ETL it to other places. That's the innovation of the index and the Refinery, which allow you to index in place and give virtual views in place, at scale. And then the open APIs, to be honest, I think that's a game changer. Give me an open API, let me go after it. I don't know what tool I'm going to use next week. Every time we go into an account, they're not a Looker shop or a Tableau shop or a QuickSight shop; they're all of them, and they're just trying to keep up with the business. And then there's the ability to have role-based access, where you can actually say, hey, give them their own bucket, give them their own Refinery. As long as they have access to the data, they can do their own manipulation.
>>Just...
>>That's the true promise of data lakes. Once we come out with machine learning next year, now you're going to rip through the same data in S3, and the way we structure the data as matrices, it's a natural fit for things like TensorFlow and PyTorch. But that's going to be next year, just because it's a different persona. The underlying architecture has been built; what we're doing is taking it one use case at a time. So we work with our clients and say, it's not a big bang. Let's nail a use case that works well, with great ROI and great business value for a particular business unit, and let's move to the next. And that's how I think it's going to be. If you think about what Gartner talks about, if you think about what really got successful in data warehouses in the past, that's exactly it. It wasn't the big bang, it was, let's go and nail it for particular users.
And that's what we're doing now. Because it's multi-model, there's a bunch of different use cases, but even then we're focusing on these core things that are really hard to do with other, relational-only environments.
>>Yeah, I can see why you're stoked, because, well, you and I have talked about the API economy forever, and you've been in the storage world so long, you know what a nightmare it is to move data. We've got to jump, but I want to ask you, I want to be clear on this. So you're cloud native. I talked to Frank Slootman maybe a year ago and I asked him about on-prem, and he's like, no, we're never doing the halfway house. We are cloud all the way. I think you have a similar answer. What's your plan on hybrid?
>>There's nothing about the technology that says we can't go on-prem, but we are 100% cloud native; we're only in the public cloud. We believe that's the trend line. Everyone agrees with us. We're sticking there. That's where the opportunity is. And if you can run analytics, there's nothing better than getting to the public cloud, like Amazon. So yes, we're 100% cloud native. We love S3, and what would be a better place to put this? Just put it in S3 and we let you light it up. And then, I guess, if I'm going to add the commercial: buy it through Amazon Marketplace. We love that business model with Amazon. It's great.
>>Ed, thanks so much for coming back on theCUBE and participating in the Startup Showcase. Love having you, and best of luck. Really exciting.
>>Hey, thanks again, appreciate it.
>>All right, thank you for watching, everybody. This is Dave Vellante for theCUBE. Keep it right there.

Published Date : May 14 2021

ENTITIES

Entity	Category	Confidence
Dave Volonte	PERSON	0.99+
Ed Walsh	PERSON	0.99+
15 servers	QUANTITY	0.99+
80%	QUANTITY	0.99+
58%	QUANTITY	0.99+
three months	QUANTITY	0.99+
three weeks	QUANTITY	0.99+
May 2021	DATE	0.99+
two years	QUANTITY	0.99+
90%	QUANTITY	0.99+
Five servers	QUANTITY	0.99+
hundreds	QUANTITY	0.99+
1970s	DATE	0.99+
amazon	ORGANIZATION	0.99+
1980s	DATE	0.99+
yesterday	DATE	0.99+
five minutes	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
millions	QUANTITY	0.99+
S three	TITLE	0.99+
three days	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
six years	QUANTITY	0.99+
50	QUANTITY	0.99+
one server	QUANTITY	0.99+
Ed	PERSON	0.99+
Tom hazel	PERSON	0.99+
two	QUANTITY	0.99+
three weeks	QUANTITY	0.99+
78	QUANTITY	0.99+
S. three	LOCATION	0.99+
third	QUANTITY	0.99+
next year	DATE	0.99+
less than 50%	QUANTITY	0.99+
tom	PERSON	0.99+
billions	QUANTITY	0.99+
three	QUANTITY	0.99+
thousands	QUANTITY	0.99+
next week	DATE	0.99+
dozens	QUANTITY	0.99+
50-80	QUANTITY	0.98+
Six years ago	DATE	0.98+
125 elasticsearch clusters	QUANTITY	0.98+
both	QUANTITY	0.98+
a year ago	DATE	0.98+
early next year	DATE	0.97+
Tableau Sharp	ORGANIZATION	0.97+
Alex	PERSON	0.97+
today	DATE	0.97+
first	QUANTITY	0.97+
first thing	QUANTITY	0.96+
30 years ago	DATE	0.96+
each	QUANTITY	0.96+
one person	QUANTITY	0.96+
S. Tree	TITLE	0.96+
10,000 endpoints	QUANTITY	0.96+
second use	QUANTITY	0.95+
82	QUANTITY	0.95+
one thing	QUANTITY	0.94+
Tableau	TITLE	0.94+
60 enterprise clients	QUANTITY	0.93+
one	QUANTITY	0.93+
eight	QUANTITY	0.93+
1800 different nodes	QUANTITY	0.91+
excel	TITLE	0.9+
80 200 of our logs	QUANTITY	0.89+
this month	DATE	0.89+
S. Three	TITLE	0.88+
agile	TITLE	0.88+
ChaosSearch	ORGANIZATION	0.86+
S. Three	TITLE	0.86+
Dream EOS Snowflake	TITLE	0.85+
cabana	LOCATION	0.85+
100 cloud	QUANTITY	0.83+
a day	QUANTITY	0.81+
