Roland Cabana, Vault Systems | OpenStack Summit 2018

>> Announcer: Live from Vancouver, Canada, it's theCUBE, covering OpenStack Summit North America 2018. Brought to you by Red Hat, the OpenStack Foundation, and its ecosystem partners.

>> Welcome back, I'm Stu Miniman with my cohost John Troyer, and you're watching theCUBE's coverage of OpenStack Summit 2018 here in Vancouver. Happy to welcome first-time guest Roland Cabana, who is a DevOps Manager at Vault Systems out of Australia, but you come from a little bit more local. Thanks for joining us, Roland.

>> Thank you, thanks for having me. Yes, I'm actually born and raised in Vancouver, I moved to Australia a couple years ago. I realized the potential in Australian cloud providers, and I've been there ever since.

>> Alright, so one of the big things we talk about here at OpenStack of course is, you know, do people really build clouds with this stuff, where does it fit, how is it doing, so a nice lead-in to what does Vault Systems do, for the people who aren't aware.

>> Definitely, so yes, we do build cloud, a cloud, or many clouds, actually. And Vault Systems provides cloud services, infrastructure as a service, to the Australian Government. We do that because we are a certified cloud. We are certified to handle unclassified DLM data, and protected data. And what that means is the sensitive information that is gathered for the Australian citizens, and anything to do with big user-space data, is actually secured with certain controls set up by the Australian Government. The Australian Government body around this is called ASD, the Australian Signals Directorate, and they release a document called the ISM. And this document actually outlines 1,088 plus controls that dictate how a cloud should operate, how data should be handled inside of Australia.

>> Just to step back for a second, I took a quick look at your website, it's not like you're listed as the government OpenStack cloud there.
(Roland laughs) Could you give us, where does OpenStack fit into the overall discussion of the identity of the company, what your ultimate end-users think about how they're doing, help us kind of understand where this fits.

>> Yeah, for sure, and I mean the journey started long ago when we, actually our CEO, Rupert Taylor-Price, set out to handle a lot of government information, and tried to find this cloud provider that could handle it in the prescribed way that the Australian Signals Directorate needed it to be handled. So, he went to different vendors, different cloud platforms, and found out that you couldn't actually meet all the controls in this document using a proprietary cloud or using a proprietary platform to plot out your bare-metal hardware. So, eventually he found OpenStack and saw that there was a great opportunity to massage the code and change it, so that it would comply 100% to the Australian Signals Directorate.

>> Alright, so the keynotes this morning were talking about people that build, people that operate, you've got DevOps in your title, tell us a little about your role in working with OpenStack, specifically, in broader scope of your--

>> For sure, for sure, so in Vault Systems I'm the DevOps Manager, and so what I do, we run through a lot of tests in terms of our infrastructure. So, complying to those controls I had mentioned earlier, going through the rigmarole of making sure that all the different services that are provided on our platform comply to those specific standards, the specific use cases. So, as a DevOps Manager, I handle a lot of the pipelining in terms of where the code goes. I handle a lot of the logistics and operations. And so it actually extends beyond just operation and development, it actually extends into our policies. And so marrying all that stuff together is pretty much my role day-to-day.
I have a leg in the infrastructure team with the engineering and I also have a leg in with sort of the solutions architects and how they get feedback from different customers in terms of what we need and how would we architect that so it's safe and secure for government.

>> Roland, so since one of your parts of your remit is compliance, would you say that you're DevSecOps? Do you like that one or not?

>> Well I guess there's a few more buzzwords, and there's a few more roles I can throw in there but yeah, I guess yes. DevSecOps, there's a strong security posture that Vault holds, and we hold it to a higher standard than a lot of the other incumbents or a lot of platform providers, because we are actually very sensitive about how we handle this information for government. So, security's a big portion of it, and I think the company culture internally is actually centered around how we handle the security. A good example of this is, you know, internally we actually have controls about printing, you know, most modern companies today, they print pages, and you know it's an eco thing. It's an eco thing for us too, but at the same time there are controls around printed documents, and how sensitive those things are. And so, our position in the company is if that control exists because Australian Government decides that that's a sensitive matter, let's adopt that in our entire internal ecosystem.

>> There was a lot of talk this morning at the keynote both about upgrades, and I'm blanking on the name of the new feature, but also about Zuul and about upgrading OpenStack. You guys are a full upstream OpenStack expert cloud provider. How do you deal with upgrades, and what do you think the state of the OpenStack community is in terms of kind of upgrades, and maintenance, and day two kind of stuff?

>> Well I'll tell you the truth, the upgrade path for OpenStack is actually quite difficult.
I mean, there's a lot of moving parts, a lot of components that you have to be very specific in terms of how you upgrade to the next level. If you're not keeping in step with the next releases, you may fall behind and you can't upgrade, you know, Keystone from a Liberty all the way up to Ocata, right? You're basically stuck there. And so what we do is we try to figure out what the government needs, what are the features that are required. And, you know, it's also a conversation piece with government, because we don't have certain features in this particular release of OpenStack, it doesn't mean we're not going to support it. We're not going to move to the next version just because it's available, right? There's a lot of security involved in fusing our controls inside our distribution of OpenStack. I guess you can call it a distribution, on our build of OpenStack. But it's all based on a conversation that we start with the government. So, you know, if they need vGPUs for some reason, right, with the Queens release that's coming out, that's a conversation we're starting. And we will build into that functionality as we need it.

>> So, does that mean that you have different entities with different versions, and if so, how do you manage all of that?

>> Well, okay, so yes that's true. We do have different versions where we have a Liberty release, and we have an Ocata release, which is predominant in our infrastructure. And that's only because we started with the inception of the Liberty release before our certification process. A lot of the things that we work with government for is how do they progress through this cloud maturity model. And, you know, the forklift and shift is actually a problem when you're talking about releases.
But when you're talking about containerization, you're talking about Agile methodologies and things like that, it's less of a reliance on the version because you now have the ability to respawn that same application, migrate the data, and have everything live as you progress through different cloud platforms. And so, as OpenStack matures, this whole idea of the fast-forward idea of getting to the next release, because now they have an integration step, or they have a path to the next version even though you're two or three versions behind, because let's face it, most operators will not go to the latest and greatest, because there's a lot of issues you're going to face there. I mean, not that the software is bad, it's just that early adopters will come with early adopter problems. And, you know, you need that userbase. You need those forum conversations to be able to be safe and secure about, you know, whether or not you can handle those kinds of things. And there's no need for our particular users' user space to have those latest and greatest things unless there is an actual request.

>> Roland, you are an IaaS provider. How are you handling containers, or requests for containers from your customers?

>> Yes, containers is a big topic. There's a lot of maturity happening right now with government, in terms of what a container is, for example, what is orchestration with containers, how does my legacy application forklift and shift to a container? And so, we're handling it in stages, right, because we're working with government in their maturity. We don't do container services on the platform, but what we do is we open-source a lot of code that allows people to deploy, let's say a Terraform file, that creates a Docker host, you know, and we give them examples. A good segue into what we've just launched last week was our Vault Academy, which we are now training 3,000 government public servants on new cloud technologies.
We're not talking about how does an OS work, we're talking about infrastructure as code, we're talking about Kubernetes. We're talking about all these cool, fun things, all the way up to function as a service, right? And those kinds of capabilities is what's going to propel government in Australia moving forward in the future.

>> You hit on one of my hot buttons here. So functions as a service, do you have serverless deployed in your environment, or is it an education at this point?

>> It's an education at this point. Right now we have customers who would like to have that available as a native service in our cloud, but what we do is we concentrate on the controls and the infrastructure as a service platform first and foremost, just to make sure that it's secure and compliant. Everyone has the ability to deploy functions as a service on their platform, or on their accounts, or on their tenancies, and have that available to them through a different set of APIs.

>> Great. There's a whole bunch of open-source versions out there. Is that what they're doing? Do you have any preference toward the OpenWhisk, or Fn, or you know, Fission, all the different versions that are out there?

>> I guess, you know, you can sort of like, you know, pick your racehorse in that regard. Because it's still early days, and I think OpenFaaS is pretty much what I've been looking at recently, and it's just a discovery stage at this point. There are more mature customers who are coming in, some partners who are championing different technologies, so the great thing is that we can make sure our platform is secure and they can build on top of it.

>> So you brought up security again, one of the areas I wanted to poke at a little bit is your network. So, you being an IaaS provider, networking's critical, what are you doing from a networking standpoint? Is micro-segmentation part of your environment?

>> Definitely.
So natively to build in our cloud, the functions that we build in our cloud are all around security, obviously. Micro-segmentation's a big part of that, training people in terms of how micro-segmentation works from a forklift and shift perspective. And the network connectivity we have with the government is also a part of this whole model, right? And so, we use technologies like Mellanox, 400G fabric. We're BGP internally, so we're routing through the host, or routing to the host, and we have this... Well so in Australia there's this, there's service from the Department of Finance, they create this idea of an ICON network. And what it is, is actually a direct media fiber from the department directly to us. And that means, directly to the edge of our cloud and pipes right through into their tenancy. So essentially what happens is, this is true, true hybrid cloud. I'm not talking about going through gateways and stuff, I'm talking about I spin up an instance in the Vault cloud, and I can ping it from my desktop in my agency. Low latency, submillisecond direct fiber link, up to 100G.

>> Do you have certain programmability you're doing in your network? I know lots of service providers, they want to play and get in there, they're using, you know, new operating models.

>> Yes, I mean, we're using the... I draw a blank. There's a lot of technologies we're using for network, and the Cumulus networking OS is what we're using. That allows us to bring it in to our automation team, and actually use more of a DevOps tool to sort of create the deployment from a code perspective instead of having a lot of engineers hardcoding things right on the actual production systems. Which allows us to gate a lot of the changes, which is part of the security posture as well.
So, we were doing a lot of network offloading on the ConnectX-5 cards in the data center, we're using Cumulus Networks for bridging, we're working with Neutron to make sure that we have Neutron routers and making sure that that's secure and it's code reviewed. And, you know, there's a lot of moving parts there as well, and I think from a security standpoint and from a network functionality standpoint, we've come to a happy place in terms of providing the fastest network possible, and also the most secure and safe network as possible.

>> Roland, you're working directly with the upstream OpenStack projects, and it sounds like some others as well. You're not working with a vendor who's packaging it for you or supporting it. So that's a lot of responsibility on you and your team, I'm kind of curious how you work with the OpenStack community, and how you've seen the OpenStack community develop over the years.

>> Yeah, so I mean we have a lot of talented people in our company who actually have OpenStack as a passion, right? This is what they do, this is what they love. They've come from different companies who worked in OpenStack and have contributed a lot actually, to the community. And actually that segues into how we operate inside culturally in our company. Because if we do work with upstream code, and it doesn't have anything to do with the security compliance of the Australian Signals Directorate in general, we'd like to upstream that as much as possible and contribute back the code where it seems fit. Obviously, there's vendor mixes and things we have internally, and that's with the Mellanox and Cumulus stuff, but anything else beyond that is usually contributed up. Our team's actually very supportive of each other, we have network specialists, we have storage specialists. And it's a culture of learning, so there's a lot of synchronizations, a lot of synergies inside the company.
And I think that's part to do with the people who make up Vault Systems, and that whole camaraderie is actually propagated through our technology as well.

>> One of the big themes of the show this year has been broadening out of what's happening. We talked a little bit about containers already, Edge Computing is a big topic here. Either Edge, or some other areas, what are you looking for next from this ecosystem, or new areas that Vault is looking at poking at?

>> Well, I mean, a lot of the exciting things for me personally, I guess, I can't talk to Vault in general, but, 'cause there's a lot of engineers who have their own opinions of what they like to see, but with the Queens release with the vGPUs, something I'd like, that all's great, a long-term release cycle with the OpenStack Foundation would be great, or the OpenStack platform would be great. And that's just to keep in step with the next releases to make sure that we have the continuity, even though we're missing one release, there's a jump point.

>> Can you actually put a point on that, what that means for you. We talked to Mark Collier a little bit about it this morning but what you're looking for and why that's important.

>> Well, it comes down to user acceptance, right? So, I mean, let's say you have a new feature or a new project that's integrated through OpenStack. And, you know, some people find out that there's these new functions that are available. There's a lot of testing behind-the-scenes that has to happen before that can be vetted and exposed as part of our infrastructure as a service platform. And so, by the time that you get to the point where you have all the checks and balances, and marrying that next to the Australian controls that we have it's one year, two years, or you know, however it might be.
And you know by that time we're at the night of the release and so, you know, you do all that work, you want to make sure that you're not doing that work and refactoring it for the next release when you're ready to go live. And so, having that long-term release is actually what I'm really keen about. Having that point of, that jump point to the latest and greatest.

>> Well Roland, I think that's a great point. You know, it used to be we were on the 18-month cycle, OpenStack was more like a six-month cycle, so I absolutely understand why this is important, that I don't want to be tied to a release when I want to get a new function.

>> John: That's right.

>> Roland Cabana, thank you for the insight into Vault Systems and congrats on all the progress you have made. So for John Troyer, I'm Stu Miniman. Back here with lots more coverage from the OpenStack Summit 2018 in Vancouver, thanks for watching theCUBE. (upbeat music)
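
Roland's remark that Keystone cannot be taken straight from Liberty to a much later release, and his interest in a "jump point" between releases, can be illustrated with a short sketch. This is not Vault Systems' tooling and not an OpenStack utility. It only lists, in release order, the hops a stepwise upgrade would pass through, assuming one hop per named release and taking Ocata as the target release in his example. The function names `upgrade_path` and `releases_behind` are hypothetical, chosen for this illustration.

```python
from typing import List

# OpenStack release names in chronological order, Liberty through Queens.
RELEASES = ["liberty", "mitaka", "newton", "ocata", "pike", "queens"]


def upgrade_path(current: str, target: str) -> List[str]:
    """Return the releases to step through, in order, from current to target.

    Assumption for this sketch: every intermediate release is a separate hop.
    """
    start = RELEASES.index(current.lower())
    end = RELEASES.index(target.lower())
    if end < start:
        raise ValueError(f"{target} is older than {current}")
    return RELEASES[start + 1 : end + 1]


def releases_behind(current: str, latest: str = "queens") -> int:
    """Count how many releases separate a deployment from the latest one."""
    return len(upgrade_path(current, latest))


if __name__ == "__main__":
    # Hops between the two releases mentioned in the interview.
    print(upgrade_path("Liberty", "Ocata"))
    # How far an Ocata cloud sits behind Queens.
    print(releases_behind("Ocata"))
```

Under these assumptions a Liberty cloud passes through Mitaka and Newton before reaching Ocata, which is the gap a long-term release or a fast-forward path would be meant to shorten.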

Published Date : May 21 2018



Kevin Miller and Ed Walsh | AWS re:Invent 2022 - Global Startup Program

>> Hi everybody, welcome back to re:Invent 2022. This is theCUBE's exclusive coverage. We're here at the satellite set, it's up on the fifth floor of the Venetian Conference Center, and this is part of the Global Startup Program, the AWS Startup Showcase series that we've been running all through last year and into this year with AWS, featuring some of its global partners. Ed Walsh is here, he's the CEO of ChaosSearch, many-times CUBE alum, and Kevin Miller, who's also a CUBE alum, vice president and GM of S3 at AWS. Guys, good to see you again.

>> Yeah, great to see you, Dave.

>> Hi, Kevin. We call this our Super Bowl, so this must be like your, I don't know, World Cup. It's a pretty big event.

>> Yeah, it's the World Cup for sure.

>> So, a lot of S3 talk. You know, I mean, that's what got us all started in 2006.

>> Absolutely.

>> What's new in S3?

>> Yeah, it's been a great show. We've had a number of really interesting launches over the last few weeks and a few at the show as well. So, you know, we've been really focused on helping customers that are running mass-scale data lakes, whether it's structured or unstructured data. We actually announced, just an hour ago I think it was, a new capability to give customers cross-account access points for sharing data securely with other parts of the organization. And that's something that we'd heard from customers: as they are growing and have more data sets, and they're looking to get more out of their data, they are increasingly looking to enable multiple teams across their businesses to access those data sets securely, and that's what we provide with cross-account access points. We also launched yesterday our Multi-Region Access Point failover capabilities. And so again, this is where customers have data sets and they're using multiple regions for certain critical workloads; they're now able to use that to control the failover between different regions in AWS. And then one other launch I would just highlight is some
improvements we made to Storage Lens, which is really a very novel and unique capability to help customers really understand what storage they have, where, who's accessing it, when it's being accessed, and we added a bunch of new metrics. Storage Lens has been pretty exciting for a lot of customers. In fact, we looked at the data and saw that customers who have adopted Storage Lens, typically within six months they saved more than six times what they had invested in turning Storage Lens on. And certainly in this environment right now, we have a lot of customers for whom it's pretty top of mind. They're looking for ways to optimize their costs in the cloud and take some of those savings and be able to reinvest them in new innovation. So, pretty exciting with the Storage Lens launch.

>> I think what's interesting about S3 is that, you know, pre-cloud, object store was kind of a niche, right? And then of course you guys announced, you know, S3 in 2006, as I said, and okay, great, you know, cheap and deep storage, simple get and put. Now the conversation's about how to enable value from data, absolutely, analytics, and it's just a whole new world. And Ed, you've talked many times, I love the term, yeah, "we built ChaosSearch on the shoulders of giants," right? And so underlying that is S3, but the value that you can build on top of that has been key.

>> And I don't think we've talked about "shoulders of giants," but we've talked about how we literally, you know, we have a big vision, right? So hard to kind of solve the challenge of analytics at scale. We really focus on the, you know, big data coming into the environment, get analytics. So we talk about "on the shoulders of giants," obviously Isaac Newton's, you know, metaphor of "I learned from everything before and we layer on top." So really, when you talk about all the things that come from S3, I just smile, because we picked it up naturally. We went all in on S3, and this is where I think you're going, Dave, but everyone is. So let's just
cut to the chase. So, any of the data platforms you're using, S3 is what you're building on, but we did it a little bit differently. So at first people were using it as cold storage, like you said, and then they ETL it up into different platforms for analytics of different sorts. Now people are using it closer; they're doing caching layers and caching out, and that's where the attributes of scale or reliability are. What we did is we actually make S3 a database, so literally we have no persistence outside S3, and that kind of comes in. So it's working really well with clients, because most of the thing is we pick up all these attributes of scale and reliability, and it shows up in the clients' environments. And so when you launch all these new scalable things, we just see it. Like, our clients constantly comment. Like one of our biggest customers, a fintech in Europe, they go through Black Friday, and again, Black Friday's not one day, and they scale from, what is it, 58 terabytes a day, and they're going up to 187 terabytes a day, and we don't flinch. They say, "How do you do that?" Well, we built our platform on S3, as long as you can stream it to S3. So they're saying, "I can't overrun S3," and it's a natural play. So it's really nice, but we take on those attributes. But same thing, that's why we're able to, you know, help clients get, you know, really, you know, Equifax is a good example, maybe. They're able to consolidate 12 of their divisions on one platform. We couldn't have done that without the scale and the performance of what you can get from S3, but also they saved 90%. I'm able to do that, but that's really because the only persistence is S3 and what you guys are delivering. And then we really focus on "shoulders of giants": we're innovating on top of your platforms and bringing that out. So things like, you know, we have a unique data representation that makes it easy to ingest this data, because it's kind of coming at you, the four V's of big data. We allow you to do that,
make it performant on S3. So now you're doing hot analytics on S3 as if it's just a native database in memory, but there's no memory or SSD caching. And then multi-model: once you get it there, don't move it, leverage it in place. So, you know, Elasticsearch access, you know, Kibana, Grafana access, or SQL access with your tools. So we're seeing that constantly, but we always talk about "on the shoulders of giants." But even this week I get comments from our customers, like, "How did you do that?" And most of it is because we built on top of what you guys provided, so it's really working out pretty well.

>> And you know, we talk a lot about digital transformation. Of course, we had the pleasure of sitting down with Adam Selipsky prior; John Furrier flew to Seattle, sits down for his annual one-on-one with the AWS CEO, which is kind of cool. Yeah, it's good, it's like studying for the test, you know. And so, one of the interesting things he said was, you know, one of our challenges going forward is how do we go beyond digital transformation into business transformation. Like, okay, well, that's interesting. I was talking to a customer today, an AWS customer, and obviously others, because they're a 100-year-old company, and basically their business was, they call them like the Uber for servicing appliances. When your appliance breaks, you've got to get a person to service it, if it's out of warranty. You know, these guys do that. So they've got to basically have, you know, a network of technicians, yeah, and they've got to deal with the customers. No phone, right? So they had a completely, you know, that was a business transformation, right? They're becoming, you know, everybody says they're becoming a software company, but they're building it, of course, right on the cloud. So I wonder if you guys could each talk about what you're seeing in terms of changing, not only in the sort of IT and the digital transformation, but also the business transformation.

>> Yeah, no, I 100% agree that I think business
transformation is probably one of the top themes I'm hearing from customers of all sizes right now, even in this environment. I think customers are looking for: what can I do to drive top line, or, you know, improve bottom line, or just improve my customer experience, and really, you know, sort of have that effect where I'm helping customers get more done. And, you know, it is very tricky, because to do that successfully, the customers that are doing that successfully, I think, are really getting into the lines of business and figuring out, you know, it's probably a different skill set, possibly a different culture, different norms and practices and process. And so it's a lot more than just, like you said, a lot more than just the technology involved. But when we, you know, sort of liquidate it down into the data, that's where absolutely we see that as a critical function for lines of business to become more comfortable, first off knowing what data sets they have, what data they could access but possibly aren't today, and then starting to tap into those data sources. And then as that progresses, figuring out how to share and collaborate with data sets across a company, to, you know, correlate across those data sets and drive more insights. And then as all that's being done, of course, it's important to measure the results and be able to really see what effect this is having, and proving that effect. And certainly I've seen plenty of customers be able to show, you know, this is a percentage increase in top or bottom line. And so that pattern is playing out a lot, and actually a lot of how we think about where we're going with S3 is related to how do we make it easier for customers to do everything that I just described: to understand what data they have, to make it accessible. And, you know, it's great to have such a great ecosystem of partners that are then building on top of that and innovating to help customers connect really directly with the
businesses that they're running and driving those insights.

>> Well, and customers are hours today. One of the things I loved that Adam said, he said Amazon is strategically very, very patient, but tactically we're really impatient, and the customers out there are like, "How are you going to help me increase revenue? How are you going to help me cut costs?" You know, we were talking off camera about how, you know, software can actually help do that.

>> Yeah, it's deflationary. I love the quote, right? So software's deflationary as costs come up. How do you go drive it, also free up the team? And you nailed it. It's like, okay, everyone wants to save money, but they're not putting off these projects. In fact, the digital transformation or the business, it's actually moving forward, but they're getting a little bit bigger. But everyone's looking for creative ways to look at their architecture as it becomes larger and larger. We talked about a couple of those examples, but even things like observability: they want to give this tool set, this data, to all the developers, all their SREs, the same data to all the security team, and then to do that they need to find a way to architect that at scale and save money simultaneously. So we see constantly people who are pairing us up with some of these larger firms, like, keep your Datadog, keep your Splunk, use us to reduce the cost, and that one and one is actually cheaper than what you have. But then they use it either to save money, we're saving 50 to 80%, hard dollars, but more importantly to free up your team from the toil, and then they turn around and make that budget neutral, and then are allowed to get the same tools to more people across the org, because they're sometimes constrained on getting access to everyone.

>> Explain that a little bit more. Let's say I've got a Splunk or Datadog, I'm sifting through, you know, logs. How exactly do you help?

>> So it's pretty simple. I'll use the Datadog example. So let's say you're using Datadog for observability, so it's just your
developers, your SREs, managing environments. All these platforms are really good at being a monitoring and alerting type of tool. What they're not necessarily great at is keeping the data for longer periods, like the log data, the bigger data. That's where we're strong. What you see with a Datadog is, let's say you're using it to keep 30 days of logs, which is not enough. Let's say you're running an environment and you're finding a performance issue. You kind of want to look at last quarter, or last month, or maybe last Black Friday. So 30 days is not enough, but they'll charge you $2.80 a gigabyte. Don't focus on it, just $2.80. And then if you just turn the knob and keep seven days there, but keep two years of data on us, which is on S3, it goes down to 22 cents, plus our list price of 80 cents, so it goes to $1.02, compared to $2.80. So here's the thing: what they're able to do is just turn a knob and get more data. We do an integration so you can go right from Datadog or Grafana directly into our platform, so the user doesn't see it, but they save money. A lot of times they don't just save the money; they use that to go fund and get Datadog to a lot more people. Make sense? So it's creativity. They're looking at it, and they're looking at tools. We see the same thing with Grafana. If you look at the whole Grafana play, which is, hey, you can't put it in one place, but use Prometheus for metrics or traces, we fit well with logs. But they're using that to bring down their costs, because a lot of this data just really bogs down these applications. The alerting and monitoring tools are good at small data. They're not good at the big data, which is what we're really good at. And then the one and one is actually less than you paid for the one. So it works pretty well. >> So things are really unpredictable right now in the economy. You know, during the pandemic we sort of locked down, and then the stock market went crazy. We're like, okay, it's going to end, it's going to end, and then
it looked like it was going to end, and then, you know. But last year at re:Invent, it was just in that sweet spot before Omicron, so we tucked it in, which was awesome, right? It was a great, great event. We really, really missed one physical re:Invent, you know, which was very rare. So that's cool. But I've called it the slingshot economy. It feels like, you know, you're driving down the highway and you've got to hit the brakes, and then all of a sudden you're going, okay, we're through it. Oh no, you're going to hit the brakes again. >> Yeah. >> So it's very, very hard to predict. And I was listening to Jassy this morning. He was talking about, yeah, consumers, they're still spending, but what they're doing is they're shopping for more features. They might be, you know, buying a TV that's less expensive, you know, more value for the money. So okay, so hopefully the consumer spending will get us out of this, but you don't really know. You know, we don't seem to have the algorithms. We've never been through something like this before. So what are you guys seeing in terms of customer behavior, given that uncertainty? >> Well, one thing I would highlight, particularly going back to what we were just talking about as far as business and digital transformation: I think some customers are still appreciating the fact that where, you know, yesterday you may have had to put out some capital and commit to something for a large upfront expenditure, today the value is in being able to experiment and scale up, and then, most importantly, scale down, dynamically, based on: is the experiment working out? Am I seeing real value from it? And doing that on a time scale of a day or a week or a few months, that is so important right now. Because again, it gets to: I am looking for ways to innovate and to drive top-line growth, but I can't commit to a multi-year sort of set of costs to do that. And I think plenty of customers are finding that even a few
months of experimentation gives them some really valuable insight as far as: is this going to be successful or not? And so I think that again, of course with S3 and storage, from day one we've been elastic, pay for what you use. If you're not using the storage, you don't get charged for it. And I think that particularly right now, having the applications and the rest of the ecosystem around the storage and the data be able to scale up and scale down is just ever more important. >> And when people see that, typically they're looking to do more with it. So you usually find these little department projects, but they see a way to actually move faster and save money. I think it is a mix of those two. They're looking to expand it, which can be a nightmare for sales cycles, because they take longer. But people are asking, well, why don't you leverage this and go across divisions? So we do see people trying to leverage it. I don't think digital transformation is slowing down, but, to be honest, there are a lot more approvals at this point for everything. >> It is. You know, Adam had another great quote in his keynote. He said, "If you want to save money, the cloud's the place to do it." >> Absolutely. >> And I read an article recently, and I was looking through it, and it said this is the first time, you know, AWS has ever seen a downturn, because the cloud was too early back then. I'm like, you weren't paying attention in 2008, because that was the first major inflection point for cloud adoption, where CFOs said, okay, stop the capex, we're going to opex. And you saw the cloud take off. And then 2010 started this, you know, amazing cycle that we really haven't seen anything like, where they were doubling down on investments, and it was real, hardcore investment. It wasn't like 1998, '99, when it was all just going out the door for no clear reason. >> Yeah. >> So that foundation is now in place, and I think it makes a lot of sense, and it could be here for a while, where people are saying, hey, I want to
optimize, and I'm going to do that on the cloud. >> Yeah, no, I mean, obviously I certainly agree with Adam's quote. I think really that's been in AWS's DNA from day one, right? That ability to scale costs with the actual consumption, and paying for what you use. And I think that, you know, certainly moments like now are ones that can really motivate change in an organization, in a way that might not have been as palatable when it didn't feel like it was as necessary. >> Yeah. All right, we've got to go. I'll give you the last word. >> I think it's been a great event. I love all your announcements. I think this is wonderful. It's been a great show. In fact, how many people are here at re:Invent? >> North of 50,000. >> Yeah, I mean, I feel like it's as big, if not bigger, than 2019. People have said, "Ah, 2019 was a record when you count out all the professors." I don't know, it feels as big, if not bigger. So there's great energy. >> Yeah, it's quite amazing, and we're thrilled to be part of it. >> Guys, thanks for coming on theCUBE again. Really appreciate it. >> Face to face. >> All right, thank you for watching. This is Dave Vellante for theCUBE, your leader in enterprise and emerging tech coverage. We'll be right back.
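
The per-gigabyte comparison quoted in this conversation (30 days of logs in the monitoring tool versus seven days there plus two years archived on S3) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the three prices are the figures as spoken in the interview, while the constant and function names are made up for the example and are not taken from any vendor's price list or API.

```python
# Per-gigabyte figures as quoted in the interview (US dollars).
# The names below are illustrative assumptions, not vendor terminology.
MONITORING_30_DAY = 2.80   # 30 days of log retention in the monitoring tool
MONITORING_7_DAY = 0.22    # same tool with retention turned down to 7 days
ARCHIVE_LIST_PRICE = 0.80  # quoted list price for keeping two years on S3


def combined_cost_per_gb(short_retention: float, archive: float) -> float:
    """Cost of short retention in the tool plus long retention in the archive."""
    return short_retention + archive


def savings_fraction(before: float, after: float) -> float:
    """Fraction of the original per-gigabyte cost that is saved."""
    return (before - after) / before


combined = combined_cost_per_gb(MONITORING_7_DAY, ARCHIVE_LIST_PRICE)
savings = savings_fraction(MONITORING_30_DAY, combined)

print(f"combined: ${combined:.2f}/GB, savings: {savings:.0%}")
```

With the quoted numbers the combined cost comes to $1.02 per gigabyte, a saving of roughly 64%, which sits inside the "50 to 80%" range mentioned in the conversation.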

Published Date : Dec 7 2022


ENTITIES

Entity	Category	Confidence
Ed Walsh	PERSON	0.99+
Kevin Miller	PERSON	0.99+
two years	QUANTITY	0.99+
2006	DATE	0.99+
2008	DATE	0.99+
seven days	QUANTITY	0.99+
Adam	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
30 days	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
50	QUANTITY	0.99+
Adam Selipsky	PERSON	0.99+
Dave Vellante	PERSON	0.99+
two	QUANTITY	0.99+
eighty cents	QUANTITY	0.99+
Europe	LOCATION	0.99+
22 cents	QUANTITY	0.99+
Kevin	PERSON	0.99+
80 cents	QUANTITY	0.99+
Seattle	LOCATION	0.99+
12	QUANTITY	0.99+
2010	DATE	0.99+
Isaac Newton	PERSON	0.99+
Dave	PERSON	0.99+
Super Bowl	EVENT	0.99+
a day	QUANTITY	0.99+
Venetian Conference Center	LOCATION	0.99+
fifth floor	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
World Cup	EVENT	0.99+
last year	DATE	0.99+
last quarter	DATE	0.99+
yesterday	DATE	0.99+
S3	TITLE	0.99+
last month	DATE	0.99+
more than six times	QUANTITY	0.99+
2019	DATE	0.98+
Prometheus	TITLE	0.98+
six months	QUANTITY	0.98+
280	QUANTITY	0.98+
pandemic	EVENT	0.98+
Black Friday	EVENT	0.97+
an hour ago	DATE	0.97+
today	DATE	0.97+
58 terabytes a day	QUANTITY	0.97+
100 year old	QUANTITY	0.97+
this morning	DATE	0.97+
a week	QUANTITY	0.97+
Ed wallson	PERSON	0.97+
three	QUANTITY	0.96+
Equifax	ORGANIZATION	0.96+
Jassy	PERSON	0.96+
one platform	QUANTITY	0.96+
this year	DATE	0.96+
Grafana	TITLE	0.96+
one days	QUANTITY	0.95+
first time	QUANTITY	0.95+
one	QUANTITY	0.95+
black Friday	EVENT	0.93+
this week	DATE	0.92+
first major inflection	QUANTITY	0.91+
one place	QUANTITY	0.91+
SQL	TITLE	0.9+
last	DATE	0.89+
Store	TITLE	0.89+

Ed Walsh and Thomas Hazel, ChaosSearch

>> Welcome to theCUBE, I am Dave Vellante. And today we're going to explore the ebb and flow of data as it travels into the cloud and the data lake. The concept of data lakes was alluring when it was first coined last decade by CTO James Dixon. Rather than be limited to highly structured and curated data that lives in a relational database in the form of an expensive and rigid data warehouse or a data mart, a data lake is formed by flowing data from a variety of sources into a scalable repository, like, say, an S3 bucket that anyone can access and dive into. They can extract water, a.k.a. data, from that lake and analyze data that's much more fine-grained and less expensive to store at scale. The problem became that organizations started to dump everything into their data lakes with no schema on write, no metadata, no context, just shoving it into the data lake to figure out what's valuable at some point down the road. Kind of reminds you of your attic, right? Except this is an attic in the cloud. So it's too big to clean out over a weekend. Well look, it's 2021 and we should be solving this problem by now. A lot of folks are working on this, but often the solutions add other complexities for technology pros. So to understand this better, we're going to enlist the help of ChaosSearch CEO Ed Walsh, and Thomas Hazel, the CTO and Founder of ChaosSearch. We're also going to speak with Kevin Miller, who's the Vice President and General Manager of S3 at Amazon Web Services. And of course they manage the largest and deepest data lakes on the planet. And we'll hear from a customer to get their perspective on this problem and how to go about solving it, but let's get started. Ed, Thomas, great to see you. Thanks for coming on theCUBE. >> Likewise. >> Face to face, it's really good to be here. >> It is nice face to face. >> It's great. >> So, Ed, let me start with you. We've been talking about data lakes in the cloud forever.
Why is it still so difficult to extract value from those data lakes? >> Good question. I mean, data analytics at scale has always been a challenge, right? So, we're making some incremental changes. As you mentioned, we need to see some step function changes. In fact, it's the reason ChaosSearch was really founded. But if you look at it, it's the same challenge around a data warehouse or a data lake. Really it's not just flowing the data in, it's how to get insights out. So it kind of falls into a couple of areas, but the business side will always complain, and it's kind of uniform across everything in data lakes, everything in data warehousing. They'll say, "Hey, listen, I typically have to deal with a centralized team to do that data prep, because it's data scientists and DBAs." Most of the time, they're a centralized group. Sometimes they're in business units, but most of the time, because they're scarce resources, they're pooled together. And then it takes a lot of time. It's arduous, it's complicated, it's a rigid process to deal with the team, it's hard to add new data, it's also very hard to share data, and there's no way to do governance without locking it down. And of course they'd like to be more self-serve. So you hear that from the business side constantly. Now, underneath, there are some real technology issues: we haven't really changed the way we're doing data prep since the two thousands, right? So if you look at it, it falls into two big areas. One is how to do data prep. A request comes in from a business unit: I want to do X, Y, Z with this data, I want to use this type of tool set to do the following. Someone has to be smart about how to put that data in the right schema, as you mentioned. You have to put it in the right format so that the tool sets can analyze that data before you do anything. And then the second thing, and I'll come back to the first 'cause that's the biggest challenge.
But the second challenge is how these different data lakes and data warehouses are now persisting data, and the complexity of managing that data and also the cost of computing it. And I'll go through that. But basically the biggest thing is actually getting it from raw data. The rigidness and complexity that the business side sees is literally that someone has to do this ETL process: extract, transform, load. They're actually taking data. A request comes in: I need so much data put together in this type of way. They're literally physically duplicating data and putting it together in a schema. They're stitching together almost a data puddle for all these different requests. And what happens is, anytime they have to do that, someone has to do it. And very skilled resources are scant in the enterprise, right? So it's DBAs and data scientists. And then when they want new data, you give them a data set. They're always saying, what can I add to this data? Now that I've seen the reports, I want to add this data, more fresh. And the same process has to happen. This takes about 60% to 80% of the data scientists' and DBAs' time. It's kind of well-documented. And this is what actually stops the process. That's what is rigid. They have to be rigid because there's a process around that. That's the biggest challenge of doing this. And it takes an enterprise weeks or months. I always say three weeks or three months. And no one challenges beyond that. It also takes the same skill set of people that you want to drive digital transformation, data warehousing initiatives, monetization, being data driven. Those are all these data scientists and DBAs that they don't have enough of. So this is not only hurting you getting insights out of your data lake and data warehouses. This resource constraint is also hurting you actually getting-- >> So that smallest atomic unit is that team, that super specialized team, right? >> Right. >> Yeah. Okay.
So you guys talk about activating the data lake. >> Yep. >> For analytics. What's unique about that? What problems are you all solving? You know, when you guys created this magic sauce. >> No, and basically, there's a lot of things. I highlighted the biggest one, which is how to do the data prep, but there's also how you're persisting and using the data. In the end, there are a lot of challenges in how to get analytics at scale. And this is really why Thomas and I founded the team to go after this, but I'll try to say it simply. I'll try to compare and contrast what we do with what you do with maybe an Elastic cluster or a BI cluster. And if you look at it, what we do is simple: put your data in S3, don't move it, don't transform it. In fact, we're against data movement. What we do is we literally point at that data, and we index that data and make it available in a data representation from which you can give virtual views to end-users. And those virtual views are available immediately over petabytes of data. And it actually gets presented to the end-user as an open API. So if you're an Elasticsearch user, you can use all your Elasticsearch tools on this view. If you're a SQL user, Tableau, Looker, all the different tools. Same thing with machine learning next year. So what we do is we take it and make it very simple. Simply put it there. It's already there. Point us at it. We do the hard work of indexing and making it available. And then you publish it in the open API, and your users can use exactly what they do today. So, dramatically, I'll give you a before and after. So let's say you're doing Elasticsearch. You're doing log analytics at scale. They're landing their data in S3. And then they're doing ETL, physically duplicating and moving data, and typically deleting a lot of data to get it into a format that Elasticsearch can use. They're persisting it up in a data layer called Lucene.
It's physically sitting in memory, CPUs, SSDs, and it's not one of them, it's a bunch of those. In the cloud, you have to set them up, because they're persisting on EC2. They stand up seven by 24, which is not a very cost-effective way to do cloud computing. What we do in comparison to that is literally point at the same S3. In fact, you can run in complete parallel with the data that's being ETL'd out. We're just one more use case, read only, that allows you to take that data and make these virtual views. So we run in complete parallel, but what happens is we just give a virtual view to the end users. We don't need this persistence layer, this extra cost layer, this extra time, cost and complexity of doing that. So when you look at what happens in Elastic, they have a constraint, a trade-off between how much you can keep and how much you can afford to keep. And also it becomes unstable at times, because you have to build out a schema. It's on a server, and the more the schema scales out, guess what? You have to add more servers, which is very expensive. They're up seven by 24. And also they become brittle. You lose one node and the whole thing has to be put back together. We have none of that cost and complexity. We literally let you keep whatever you want. Whatever you want to keep in S3 is a single persistence, very cost effective. And on costs, we save 50 to 80%. Why? We don't go with the old paradigm of set it up on servers, spin them up for persistence, and keep them up 7 by 24. We're literally asking the cluster, what do you want to query? We bring up the right compute resources, and then we release those resources after the query is done. So we can do some queries that they can't imagine at scale, but we're able to do the exact same query at 50 to 80% savings. And they don't have to do any ETL, moving that data or managing that layer of persistence, which is not only expensive, it becomes brittle. And then, I'll be quick.
Once you go to BI, it's the same challenge, but with BI systems the requests are constantly coming from a business unit down to the centralized data team. Give me this flavor of data. I want to use this piece of, you know, this analytic tool on that data set. So they have to do all this pipelining. They're constantly saying, okay, I'll give you this data, this data, I'm duplicating that data, I'm moving it and stitching it together. And then the minute you want more data, they do the same process all over. We completely eliminate that. >> And those requests queue up. Thomas, he had me at "you don't have to move the data." That's kind of the exciting piece here, isn't it? >> Absolutely. No, I think, you know, the data lake philosophy has always been solid, right? The problem is we had that Hadoop hangover, right? Where, let's say, we were using that platform in a few too many ways. And so, I always believed in the data lake philosophy. When James came and coined that, I'm like, that's it. However, HDFS, that wasn't really a service. Cloud object storage is a service. The elasticity, the security, the durability, all those benefits are really why we founded on cloud storage as a first move. >> So Ed was talking, Thomas, about, you know, being able to shut off essentially the compute so you don't have to keep paying for it. But there are other vendors out there, like Snowflake, doing something similar, separating compute from storage. They're famous for that. And you have Databricks out there doing their lake house thing. Do you compete with those? How do you participate and how do you differentiate? >> Well, you know, you've heard this term data lakes, warehouse, now lake house. And so what everybody wants is simple in, easy in. However, the problem with data lakes was the complexity of out, of driving value. And I said, what if you have the easy in and the value out?
So if you look at, say, Snowflake as a warehousing solution, you have to do all that prep and data movement to get into that system. And then it's rigid, static. Now, Databricks, that lake house, has the exact same thing. Sure, they have a data lake philosophy, but their data ingestion is not data lake philosophy. So I said, what if we had that simple in, with a unique architecture and index technology, and made it virtually accessible, publishable dynamically at petabyte scale? And so our service connects to the customer's cloud storage. Stream the data in, set up what we call a live indexing stream, and then go to our data refinery and publish views that can be consumed via the Elastic API, using Kibana or Grafana, or, say, SQL with Looker or Tableau. And so we're getting the benefits of both sides: schema on read with schema-on-write performance. And if you can do that, that's the true promise of a data lake. You know, again, nothing against Hadoop, but schema on read with all that complexity of software was a little data swampy. >> Well, it got us started, okay. So we've got to give them props, but everybody I talk to has got this big bunch of Spark clusters and is now saying, all right, this doesn't scale, we're stuck. And so, you know, I'm a big fan of Zhamak Dehghani and her concept of the data mesh, and it's early days. But if you fast forward to the end of the decade, you know, what do you see as being the sort of critical components of this notion of, people call it data mesh, but to get the analytics stack? You're a visionary, Thomas. How do you see this thing playing out over the next decade? >> I love her thought leadership. To be honest, our core principles were her core principles now, 5, 6, 7 years ago. And so this idea of decentralized data, data as a product, self-serve, and federated computational governance, I mean, all that was our core principle. The trick is, how do you enable that mesh philosophy?
I can say we're mesh ready, meaning that we can participate in a way that very few products can participate. If there are gates on getting data into your system, the ETL, the schema management, that's a problem. My argument with the data mesh is that producers and consumers have the same rights. I want the consumers, the people, to choose how they want to consume that data, as well as the producer publishing it. I can say our data refinery is that answer. You know, shoot, I'd love to open up a standard, right? Where we can really talk about the producers and consumers and the rights each has. But I think she's right on the philosophy. I think as products mature in this cloud, in these data lake capabilities, the trick is those gates. If you have to structure up front, if you have to set up those pipelines, the chance of you getting your data into a mesh is the weeks and months that Ed was mentioning. >> Well, I think you're right. I think the problem with data mesh today is the lack of standards. You know, when you draw the conceptual diagrams, you've got a lot of lollipops, which are APIs, but they're all unique primitives. So there aren't standards by which, to your point, the consumer can take the data the way he or she wants it and build their own data products without having to tap people on the shoulder to say, how can I use this? Where does the data live? And being able to add their own data. >> You're exactly right. So I'm an organization, I'm generating data, and I courageously stream it into a lake. And then with the service, a ChaosSearch service, the data is discoverable and configurable by the consumer. Let's say you want to go to the corner store. I want to make a certain meal tonight. I want to pick and choose what I want, how I want it. Imagine if the data mesh truly can have that producer of information, you know, all the things you can buy at a grocery store, and what you want to make for dinner.
And if it's static, if you have to call up your producer to make the change, was it really a data mesh enabled service? I would argue not. >> Ed, bring us home. >> Well, maybe one more thing with this. >> Please, yeah. >> 'Cause some of this is, we're talking 2031, but largely these principles are what we have in production today, right? So even the self service, where you can actually have a business context on top of a data lake, we do that today. We talked about how we get rid of the physical ETL, which is 80% of the work, but the last 20% is done by this refinery, where you can do virtual views, the right RBAC, and do all the transformation needed and make it available. But also, you can actually give that as a role-based access service to your end-users, the actual analysts. And you don't have to be a data scientist or DBA. In the hands of a data scientist or DBA it's powerful, but the fact of the matter is you don't have to be one. In fact, all of our employees, regardless of seniority, whether they're in finance or in sales, actually go through and learn how to do this. So you don't have to be IT. So part of that is they can come up with their own view, which is one of the things about data lakes: the business unit wants to do it themselves. But more importantly, because they have that context of what they're trying to do, instead of queuing up a very specific request that takes weeks, they're able to do it themselves. >> And if I don't have to put it in different data stores and ETL it, I can do things in real time or near real time. And that's game changing and something we haven't been able to do ever. >> And then maybe just to wrap it up, listen, you know, 8 years ago, Thomas and his group of founders came up with the concept: how do you actually go after analytics at scale and solve the real problems? And it's not one thing, it's not just getting S3. It's all these different things.
And what we have in market today is the ability to literally just simply stream it to S3. What we do is automate the process of getting the data into a representation that you can now share and augment. And then we publish an open API, so you can actually use the tools you want. First use case, log analytics: hey, it's easy to just stream your logs in, and we give you Elasticsearch type of services. Same thing with SQL, and you'll see mainstream machine learning next year. So listen, I think we have the data lake, you know, 3.0 now, and we're just stretching our legs right now to have fun. >> Well, and you have to say log analytics. But I really do believe in this concept of building data products and data services, because I want to sell them, I want to monetize them, and being able to do that quickly and easily, so I can consume them, is the future. So guys, thanks so much for coming on the program. Really appreciate it.
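
The contrast drawn in this interview, between physically duplicating data through ETL and publishing a virtual view over data that stays where it is, can be illustrated with a toy sketch. This is not ChaosSearch's implementation or API; every name below (`RAW_LOGS`, `etl_copy`, `VirtualView`) is hypothetical, and an in-memory list stands in for objects in a bucket.

```python
from typing import Callable, Iterator

# Hypothetical log records standing in for raw objects left in place in a bucket.
RAW_LOGS: list[dict] = [
    {"ts": "2021-11-01T10:00:00Z", "service": "checkout", "status": 500},
    {"ts": "2021-11-01T10:00:05Z", "service": "search", "status": 200},
    {"ts": "2021-11-01T10:00:09Z", "service": "checkout", "status": 200},
]


def etl_copy(records: list[dict], fields: tuple[str, ...]) -> list[dict]:
    """ETL style: materialize a second, reshaped copy that must be rebuilt on change."""
    return [{name: record[name] for name in fields} for record in records]


class VirtualView:
    """View style: hold a reference to the source plus a recipe for reading it."""

    def __init__(
        self,
        source: list[dict],
        fields: tuple[str, ...],
        predicate: Callable[[dict], bool] = lambda record: True,
    ) -> None:
        self._source = source        # a reference, not a copy
        self._fields = fields        # projection applied at read time
        self._predicate = predicate  # filter applied at read time

    def __iter__(self) -> Iterator[dict]:
        for record in self._source:
            if self._predicate(record):
                yield {name: record[name] for name in self._fields}


errors = VirtualView(RAW_LOGS, ("ts", "service"), lambda r: r["status"] >= 500)
snapshot = etl_copy(RAW_LOGS, ("ts", "service", "status"))

# New data lands in the source after both the copy and the view were defined.
RAW_LOGS.append({"ts": "2021-11-01T10:01:00Z", "service": "search", "status": 503})

print(len(snapshot))      # the copy is stale until the ETL job runs again
print(len(list(errors)))  # the view sees the new record on its next read
```

The copy stays at its original size until someone reruns the job, while the view reflects the new record without any data being moved. A real system would rely on an index rather than a linear scan, which is the part this sketch deliberately leaves out.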

Published Date : Nov 15 2021


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Kevin Miller	PERSON	0.99+
Thomas	PERSON	0.99+
Ed	PERSON	0.99+
80%	QUANTITY	0.99+
Ed Walsh	PERSON	0.99+
50	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
James	PERSON	0.99+
Thomas Hazel	PERSON	0.99+
ChaosSearch	ORGANIZATION	0.99+
three months	QUANTITY	0.99+
Databricks	ORGANIZATION	0.99+
next year	DATE	0.99+
2021	DATE	0.99+
two thousands	QUANTITY	0.99+
three weeks	QUANTITY	0.99+
24	QUANTITY	0.99+
James Dixon	PERSON	0.99+
last decade	DATE	0.99+
7	QUANTITY	0.99+
second challenge	QUANTITY	0.99+
2031	DATE	0.99+
Zhamak Dehghani	PERSON	0.98+
S3	ORGANIZATION	0.98+
both sides	QUANTITY	0.98+
S3	TITLE	0.98+
8 years ago	DATE	0.98+
second thing	QUANTITY	0.98+
today	DATE	0.98+
about 60%	QUANTITY	0.98+
tonight	DATE	0.97+
first	QUANTITY	0.97+
Tableau	TITLE	0.97+
two big areas	QUANTITY	0.96+
one	QUANTITY	0.95+
SQL	TITLE	0.94+
seven	QUANTITY	0.94+
6	DATE	0.94+
CTO	PERSON	0.93+
CQL	TITLE	0.93+
7 years	DATE	0.93+
first move	QUANTITY	0.93+
next decade	DATE	0.92+
single	QUANTITY	0.91+
DBS	ORGANIZATION	0.9+
20%	QUANTITY	0.9+
one thing	QUANTITY	0.87+
5	DATE	0.87+
Hadoop	TITLE	0.87+
Looker	TITLE	0.8+
Grafana	TITLE	0.73+
DPA	ORGANIZATION	0.71+
one more thing	QUANTITY	0.71+
end of the	DATE	0.69+
Vice President	PERSON	0.65+
petabytes	QUANTITY	0.64+
Kibana	TITLE	0.62+
CEO	PERSON	0.57+
HDFS	ORGANIZATION	0.54+
house	ORGANIZATION	0.49+
theCUBE	ORGANIZATION	0.48+

Ed Walsh and Thomas Hazel V1

>>Welcome to theCUBE. I'm Dave Vellante. Today, we're going to explore the ebb and flow of data as it travels into the cloud and the data lake. The concept of data lakes was alluring when it was first coined last decade by CTO James Dixon. Rather than be limited to highly structured and curated data that lives in a relational database in the form of an expensive and rigid data warehouse or a data mart, a data lake is formed by flowing data from a variety of sources into a scalable repository, like, say, an S3 bucket that anyone can access and dive into. They can extract water, a.k.a. data, from that lake and analyze data that's much more fine-grained and less expensive to store at scale. The problem became that organizations started to dump everything into their data lakes with no schema on write, no metadata, no context, just shoving it into the data lake to figure out what's valuable at some point down the road. Kind of reminds you of your attic, right? Except this is an attic in the cloud. So it's too big to clean out over a weekend. Well look, it's 2021 and we should be solving this problem by now. A lot of folks are working on this, but often the solutions add other complexities for technology pros. So to understand this better, we're going to enlist the help of ChaosSearch CEO Ed Walsh and Thomas Hazel, the CTO and founder of ChaosSearch. We're also going to speak with Kevin Miller, who's the vice president and general manager of S3 at Amazon Web Services. And of course they manage the largest and deepest data lakes on the planet. And we'll hear from a customer to get their perspective on this problem and how to go about solving it, but let's get started. Ed, Thomas, great to see you. Thanks for coming on theCUBE. >>Likewise. >>Face to face, it's really good to be here. >>It is nice face to face. >>It's great. >>So, Ed, let me start with you. We've been talking about data lakes in the cloud forever. Why is it still so difficult to extract value from those data lakes? >>Good question.
I mean, data analytics at scale has always been a challenge, right? So we're making some incremental changes. As you mentioned, we need to see some step function changes. In fact, it's the reason ChaosSearch was really founded. But if you look at it, it's the same challenge around a data warehouse or a data lake. Really, it's not just flowing the data in, it's how to get insights out. So it kind of falls into a couple of areas, but the business side will always complain, and it's kind of uniform across everything in data lakes, everything in data warehousing. They'll say, hey, listen, I typically have to deal with a centralized team to do that data prep, because it's data scientists and DBAs. Most of the time they're a centralized group. Sometimes they're in business units, but most of the time, because they're scarce resources, they're pooled together. And then it takes a lot of time. It's arduous, it's complicated. It's a rigid process to deal with the team, and it's hard to add new data. It's also very hard to share data, and there's no way to do governance without locking it down. And of course they'd like to be more self-service. So you hear that from the business side constantly. Now, underneath, there are some real technology issues: we haven't really changed the way we're doing data prep since the two thousands, right? So if you look at it, it falls into two big areas. One is how to do data prep. A request comes in from a business unit: I want to do X, Y, Z with this data. I want to use this type of tool set to do the following. Someone has to be smart about how to put that data in the right schema, as you mentioned. You have to put it in the right format so that the tool sets can analyze that data before you do anything. And then secondly, and I'll come back to the first because that's the biggest challenge.
But the second challenge is how these different data lakes and data warehouses are persisting data, and the complexity of managing that data and also the cost of computing. And I'll go through that. But basically the biggest thing is actually getting it from raw data. The rigidness and complexity that the business side sees comes from the fact that literally someone has to do this ETL process: extract, transform, load. A data request comes in: I need so much data in this type of way. To put it together, they're literally physically duplicating data and putting it together in a schema. They're stitching together almost a data puddle for all these different requests. >>And what happens is anytime they have to do that, someone has to do it. And it's very skilled resources that are scant in the enterprise, right? So it's DBAs and data scientists. And then when they want new data, you give them a data set. They're always saying, well, can I add this data? Now that I've seen the reports, I want to add this data, more fresh. And the same process has to happen. This takes about 60 to 80% of the data scientists' and DBAs' time to do this work. It's kind of well-documented. And this is what actually stops the process. That's what is rigid. They have to be rigid because there's a process around that. That's the biggest challenge to doing this. And it takes, in the enterprise, weeks or months. I always say three weeks to three months, and no one challenges me on that. It also takes the same skill set of people that you want to drive digital transformation, data warehousing initiatives, monetization, being data driven: all these data scientists and DBAs that they don't have enough of. So this is not only hurting you getting insights out of your data lake or warehouse, it's also this resource constraint hurting you actually getting smarter. >>The atomic unit is that team, that super-specialized team. Right. Right. Yeah. Okay.
So you guys talk about activating the data lake for analytics. What's unique about that? What problems are you all solving? You know, when you guys created this magic sauce. >>No, and basically, there are a lot of things. I highlighted the biggest one, which is how to do the data prep, but also how you're persisting and using the data. But in the end, there are a lot of challenges in how to get analytics at scale. And this is really why Thomas founded the team to go after this. But I'll try to say it simply: what are we doing? I'll try to compare and contrast what we do with what you do with maybe an Elastic cluster or a BI cluster. And if you look at it, what we do is you simply put your data in S3. Don't move it, don't transform it. In fact, we're against data movement. What we do is we literally point at that data and we index that data and make it available in a data representation so that you can give virtual views to end users. >>And those virtual views are available immediately over petabytes of data. And it actually gets presented to the end user as an open API. So if you're an Elasticsearch user, you can use all your Elasticsearch tools on this view. If you're a SQL user, Tableau, Looker, all the different tools, same thing, with machine learning next year. So what we do is we make it very simple. Simply put it there. It's there already. Point us at it. We do the hard work of indexing and making it available. And then you publish in the open API so your users can use exactly what they do today. So I'll give you a before and after. Let's say you're doing Elasticsearch. You're doing log analytics at scale, and they're landing their data in S3. And then they're physically duplicating and moving data, and typically deleting a lot of data, to get it in a format that Elasticsearch can use. >>They're persisting it up in a data layer called Lucene.
It's physically sitting in memory, CPUs, SSDs. And it's not one of them, it's a bunch of those. In the cloud, you have to set them up because they're persisting on EC2. They stand up seven by 24, not a very cost-effective way to do cloud computing. What we do in comparison to that is literally point at the same S3. In fact, you can run completely in parallel with the data that's being ETL'd. We're just one more use case, read only, that allows you to get at that data and make these virtual views. So we run completely in parallel, but what happens is we just give a virtual view to the end users. We don't need this persistence layer, this extra cost layer, this extra time, cost and complexity of doing that. >>So what happens is when you look at what happens in Elastic, they have a constraint, a trade-off between how much you can keep and how much you can afford to keep. And also it becomes unstable at times because you have to build out a schema. It's on a server; the more the schema scales out, guess what, you have to add more servers, very expensive. They're up seven by 24. And also they become brittle. As you lose one node, the whole thing has to be put back together. We have none of that cost and complexity. We literally let you keep whatever you want to keep in S3, a single persistence, very cost effective. And on costs, we save 50 to 80%. Why? We don't go with the old paradigm of set it up on servers, spin them up for persistence and keep them up seven by 24. >>We're literally asking the cluster, what do you want to query? We bring up the right compute resources. And then we release those resources after the query is done. So we can do some queries that they can't imagine at scale, but we're able to do the exact same query at 50 to 80% savings. And they don't have to do any of the toil of moving that data or managing that layer of persistence, which is not only expensive, it becomes brittle.
And then, I'll be quick, once you go to BI, it's the same challenge, but with the BI systems the requests are constantly coming from a business unit down to the centralized data team. Give me this flavor of data. I want to use this analytic tool on that data set. So they have to do all this pipelining. They're constantly saying, okay, I'll give you this data, this data. I'm duplicating that data, I'm moving it and stitching it together. And then the minute you want more data, they do the same process all over. We completely eliminate that. >>The questions queue up. Thomas, it hit me: you don't have to move the data. That's kind of the exciting piece here, isn't it? >>Absolutely. No, I think, you know, the data lake philosophy has always been solid, right? The problem is we had that Hadoop hangover, right? Where, let's say, we were using that platform in a little too many variety of ways. And so I always believed in the data lake philosophy. When James came and coined that, I'm like, that's it. However, HDFS, that wasn't really a service. Cloud object storage is a service, and the elasticity, the security and the durability, all those benefits are really why we founded on cloud object storage as a first move. >>So we were talking, Thomas, about being able to shut off essentially the compute so you don't have to keep paying for it, but there are other vendors out there, like Snowflake, doing something similar, separating compute from storage; they're famous for that. And yet Databricks is out there doing their lakehouse thing. Do you compete with those? How do you participate and how do you differentiate? >>I know you've heard these terms: data lakes, warehouse, now lakehouse. And so what everybody wants is simple in, easy in. However, the problem with data lakes was the complexity of driving value out. And I said, what if you have the easy in and the value out?
So if you look at, say, Snowflake as a warehousing solution, you have to do all that prep and data movement to get into that system. And it's rigid, static. Now, Databricks, that lakehouse has the exact same thing. Now, sure, they have a data lake philosophy, but their data ingestion is not data lake philosophy. So I said, what if we had that simple in with a unique architecture and index technology, and make it virtually accessible, publishable dynamically at petabyte scale? And so our service connects to the customer's cloud storage. You stream the data in, set up what we call a live indexing stream, and then go to our data refinery and publish views that can be consumed via the Elastic API, using Kibana or Grafana, or as SQL tables with Looker or, say, Tableau. And so we're getting the benefits of both sides: schema on read with schema-on-write performance. And if you can do that, that's the true promise of a data lake. You know, again, nothing against Hadoop, but schema on read with all that complexity of software was what made it a little bit of a data swamp. >>Got to start somewhere. Okay. So we've got to give Hadoop props, but everybody I talk to has got this big bunch of Spark clusters now, saying, all right, this doesn't scale, we're stuck. And so, you know, I'm a big fan of Zhamak Dehghani and her concept of the data mesh, and it's early days. But if you fast forward to the end of the decade, you know, what do you see as being the sort of critical components of this notion of, you know, people call it data mesh, but you've got the analytics stack. You're a visionary, Thomas, how do you see this thing playing out over the next-- >>I love her thought leadership. To be honest, our core principles were her core principles now, you know, five, six, seven years ago. And so this idea of, you know, decentralized data, data as a product, self-serve, and federated computational governance, I mean, all of that was our core principle.
The trick is how do you enable that mesh philosophy? I could say we're mesh ready, meaning that, you know, we can participate in a way that very few products can participate. If there are gates on getting data into your system, the ETL, the schema management, that gets in the way. My argument with the data mesh is that producers and consumers should have the same rights. I want the consumer to be able to choose how they want to consume that data, as well as the producer publishing it. I can say our data refinery is that answer. You know, shoot, I'd love to open up a standard, right, where we can really talk about the producers and consumers and the rights each of them has. But I think she's right on the philosophy. I think as products mature in this cloud, in these data lake capabilities, the trick is those gates. If you have to structure up front, get ETL and those pipelines in place, you know, the chance of you getting your data into a mesh is the weeks and months that Ed was mentioning. >>Well, I think you're right. I think the problem with data mesh today is the lack of standards. You know, when you draw the conceptual diagrams, you've got a lot of lollipops, which are APIs, but they're all, you know, unique primitives. So there aren't standards by which, to your point, the consumer can take the data the way he or she wants it and build their own data products without having to tap people on the shoulder to say, how can I use this? Where does the data live? And being able to add their own-- >>You're exactly right. So I'm an organization, I'm generating data, I stream it to a lake. And then with the ChaosSearch service, the data is discoverable and configurable by the consumer. Let's say you want to go to the corner store. You know, I want to make a certain meal tonight. I want to pick and choose what I want, how I want it.
Imagine if the data mesh truly can have that: the producer of information offers you all the things you can buy at a grocery store, and you choose what you want to make for dinner. And if it's static, if you have to call up your producer to do the change, was it really a data mesh enabled service? I would argue not. >>Bring us home. >>Well, maybe one more thing with this, because some of this is, we're talking 2031, but largely these principles are what we have in production today, right? So even the self-service where you can actually have business context on top of a data lake, we do that today. We talked about how we get rid of the physical ETL, which is 80% of the work, but the last 20% is done by this refinery, where you can do virtual views, the RBAC, and do all the transformation you need and make it available. But also you can actually give that as a role-based access service to your end users, actual analysts, and you don't have to be a data scientist or DBA. In the hands of a data scientist or DBA it's powerful, but the fact of the matter is, you don't have to be one. All of our employees, regardless of seniority, whether they're in finance or in sales, actually go through and learn how to do this. So you don't have to be IT. So part of that is they can come up with their own view, which is one of the things about data lakes: the business unit wants to do it themselves. But more importantly, because they have that context of what they're trying to do, instead of queuing up a very specific request that takes weeks, they're able to do it themselves. >>And to find out that, without different data stores and ETL, I can do things in real time or near real time. And that's game changing and something we haven't been able to do, ever. >>And then maybe just to wrap it up, listen, you know, eight years ago a group of founders came up with the concept: how do you actually get after analytics at scale and solve the real problems?
And it's not one thing, it's not just getting S3, it's all these different things. And what we have in market today is the ability to literally just simply stream it to S3. By the way, simply, what we do is automate the process of getting the data into a representation that you can now share and augment. And then we publish open APIs, so you can actually use the tools you want. First use case, log analytics: hey, it's easy to just stream your logs in and we give you Elasticsearch-type services. Same thing with SQL, and you'll see mainstream machine learning next year. So listen, I think we have the data lake, you know, 3.0 now, and we're just stretching our legs right now. >>Well, and you have to say it's log analytics. But I really do believe in this concept of building data products and data services, because I want to sell them, I want to monetize them, and being able to do that quickly and easily, so that others can consume them, is the future. So guys, thanks so much for coming on the program. Really appreciate it. All right. In a moment, Kevin Miller of Amazon Web Services joins me. You're watching theCUBE, your leader in high tech coverage.

Published Date : Nov 2 2021

Thomas Hazel, ChaosSearch & Jeremy Foran, BAI Communications | AWS Startup Showcase

(upbeat music) >> Hey everyone, I'm John Furrier with theCUBE. We're here in Palo Alto, California for a remote interview and session for theCUBE presents AWS Startup Showcase, the next big thing in AI, security and life sciences. I'm John Furrier. We're here with a great segment on cloud, the next big thing in cloud, with ChaosSearch: Thomas Hazel, Chief Technology and Science Officer of ChaosSearch, joined by Jeremy Foran, the head of data analytics, the bad boy of data analytics as they say, at BAI Communications. Jeremy, Thomas, great to have you on. >> Great to be here. >> Pleasure to be here. >> So we're going to be talking about applying large scale log analytics to building the future of the transit industry. Obviously telco's a big part of that: smart cities, you name the use case, self-driving trucks, cars, you name it, everything's now edge. The edge is super valuable, it's a new kind of last mile if you will, it's moving fast, it's mobile. This is a huge deal. Let's get into it, Thomas. What's the big story around this session? >> Well, we provide a unique ability to take all that edge data and drive it into a data lake offering where we provide data analytics, both in logs and BI, and coming out with ML this year into next. So our unique play is transforming customers' cloud object storage into an analytical platform. And really, I think with BAI it's log analytics specifically, where, you know, there's a lot of data streams from all those devices going into a lake, and we transform their lake into analytics for driving, I guess, operational analysis. >> You know, Jeremy, I remember back in the day, I'm old enough to remember when the edge was the remote switch or campus hub or something. And then even on the telco side, there was no wifi back in 2000 and, you know, someone was driving in a car and if you got any signal, you were lucky. Now you've got, you know, no perimeter, you have unlimited connectivity everywhere.
This has opened up more of an omnichannel data problem. How do you see that world? Because you've still got more devices pushing out at this edge and it's getting super local, right? Even on the body, even on people in the car. So certainly a lot of change on the infrastructure side. What does that pose as a data challenge? >> Yeah, I would say that, you know, users always want more: more bandwidth, more performance, and that requires us to create more systems that require more complexity to deliver that user experience that we're very proud of. And that complexity means, you know, exponentially more data. And so on one of the wifi networks we offer in the Toronto subway system, T-connect, you know, we see 100,000 to 200,000 unique users a day, and you can imagine just the amount of infrastructure needed to support that so that everyone has a seamless experience and can get their news and emails and even stream media while they're waiting for the subway. >> So you guys provide state of the art infrastructure for cell, wifi, broadcast, radio, IP networks; basically, I mean, I call it the smart city kind of go-to. But that's basically anything involving kind of that edge piece. This is a huge thing. So as smart cities are on the table, and you're seeing 5G being called more of an enterprise app where it's feeding large, dense areas of people, this is now a new modern version of what I would call the smart city blueprint. What's changed in your mind on this whole modernization of this smart city infrastructure concept? What's new? What's cutting edge? >> Yeah. I would say that, you know, there was an explosion of data and a lot of our insights aren't coming from one system anymore. They're coming from collecting data from all of the different pieces, the different infrastructure, whether that's your fiber infrastructure or your wireless infrastructure, and then to solve problems you need to correlate data across those systems.
So we're seeing more and more technologies that allow you to do that correlation. And that's really where we're finding tons of value, right? >> Thomas, take us through what you guys do as a product, the value proposition, the secret sauce, and why I'm here with Jeremy. Why is this conversation important for the folks watching? What's the connection between ChaosSearch and BAI Communications? >> Well, it's data, right? And lots of it. So our unique platform allows people like Jeremy to stream all this data, right? In, you know, today's world, terabytes go to petabytes really easily, billions go to trillions really easily, and so providing the analysis of that data for their operations is challenging, particularly based on technology and architectures that have been around for a long time. So what we do here at ChaosSearch is give BAI the ability to stream all these devices, all these services into one centralized data lake on their cloud object storage, where we connect to that cloud object storage and transform it into an analytical database to do, in this case, log analytics, and do it seamlessly, easily, where a new workload, a new stream just streams into that lake. And we, as a service, take over: we discover, we index it and publish well-known open APIs and visualization so that they can focus on their business, not all the operational data pipeline, database and data engineering type work that, again, at these types of scales is frankly a nightmare. >> You know, one of the things that we've always observed on theCUBE when you see new things come out that are really cool, groundbreaking products like you guys are doing: it's always a challenge to manage the cost and complexity of bringing in the new. So Jeremy, take us through this tech stack here because, you know, sometimes it might be unwieldy just from a tech stack perspective, never mind the business logic or the business processes that have got to be either unwound or changed.
Can you take us through the IT stack that's critical to support your area? >> Yeah, absolutely. So with all the various different equipment, you know, to provide our public wifi and our DAS, carrier-agnostic LTE and 5G networks, you know, we need to be able to adhere to PCI compliance and ISO 27000, so that, you know, requires us to keep a tremendous amount of our data. And the challenge we were facing is how do we do that cost effectively, and not have to make any sort of compromises on how we do that? A lot of times you'll find you don't know the value of your data today until tomorrow. An example would be COVID. You know, when we were storing data two years ago we weren't planning for a pandemic, but now that we were able to retain that data and look back, we can see a tremendous amount of value in trying to forecast how our systems will recover when things get back to normal. And so when I met Thomas and we were sort of talking about how we were going to solve some of these data retention problems, he started explaining to me their compression and some of the performance metrics of their compression. And, you know, I said, oh, middle-out compression. And it's been a bit of a running joke between me and him and, I'm sure, others, but it's incredibly impressive the amount of data we're able to store at that kind of cost, right? >> What problem did he solve for you? Because I mean, these guys, honestly, you know, the startups have a lot and the cloud's enabling more value now, we're seeing this, but when you look at this, what was your core problem that you had? >> Yeah, so we want to be able to-- I mean, primarily this is for our syslog server. And syslog servers today aren't what they were 10, 15 years ago, where you just sort of had a machine and if something broke you went and looked, right? Now they're very complex; that data is feeding to various systems and third-party software.
So, you know, we're actively looking for changes in patterns and we have our, you know, security teams auditing these for penetration testing and such. And then there's getting that data to S3 so that we could have it for, you know, two, three years of storage. Well, the problem we were facing is, for all of these different systems we needed to feed and retain data, we couldn't do that on site. We wanted to use S3, but when we were doing some projections, it's like, we don't really have the budget for all of these places. Meeting Thomas and working with ChaosSearch, you know, using their compression brought those costs down drastically. And then as we've been working with them, the really exciting thing is they're bringing more and more features to that service offering. So, you know, first it was just storing that data away. And now we're starting to build solutions off of that data sitting in storage. So that's where it gets really exciting because, you know, it's nothing to start getting anomaly detection off those logs, which, you know, originally was just, we need to store them in case somebody needs them two, three years from now. >> So Thomas, if I get this right, then what I'm hearing is, obviously, putting aside the complexity and the governance side, the regulations, for a minute, just generally: data retention as a key value proposition, having data available when you need it, and doing it in a very cost-effective, simple way. It sounds like that's what you guys are offering. Is that right? >> Yeah, I mean, one key aspect of our solution is retention, right? Those are a lot of the challenges, but at the same time we provide real-time notification like a classic log analytics type platform: alerting, monitoring. The key thing is bringing both those worlds together and solving that problem.
And so this, you know, middle in, middle out, well, to be frank, we created a new technology, what we call Chaos Index, that is a database index that is wonderfully small, as we're indicating, but also provides all the features that make cloud object storage high performance. And so the idea is that you use this lake offering to store all your data in a cost-effective way, but our service allows you to analyze it both from a long-retention perspective as well as a real-time perspective, and bringing those two worlds together is so key because typically you have siloed solutions, and whether it's real-time at scale or retention at scale, the cost, complexity and time to build out those solutions, I know Jeremy knows as well. A lot of folks come to us to solve those problems because, you know, when you're dealing with terabytes and up, these things get complicated and, to be frank, fall over quite often. >> Yeah. Let me just ask you the question that's probably on everyone's mind who's watching, and you guys have probably both heard this many times, because a lot of people just throw the data lake solution around like it's, you know, a way to whitewash their kind of old legacy solutions with a data lake: store it in a data lake. It's been called a data swamp. So people are fearful: okay, I love this idea of a data lake, who doesn't like throwing data into a repository, having it available at will with notifications, all these secret magic beans that just magically create value. But I doubt that; I don't want it to turn into a data swamp. So Thomas and Jeremy, talk about that concern. How do you mitigate that? How do you talk to that? Because if done properly, there's huge value in having a control plane or some sort of data system that is going to be tied in with signals and just storage retention. So I see the value. How do you manage the concern that people might say, hey, I don't want a data swamp? >> Yeah, I'll jump into that.
So, you know, let's just be frank, Hadoop was a great tool for a very narrow scenario. I think that data swamp came out because people were using the tooling in an incorrect way. I've always had the belief that data lakes are the future. You just have to have the right service, the right philosophy to leverage it. So what we do here at ChaosSearch is we allow you to organize it, discover it, automatically index that data so that swamp doesn't get swampy. You know, when you stream data into your lake, how do you organize it such that it has a nice stream? How do you transform that data into value? So with our service we actually start where the storage begins, not as an end point, not as an archive. So we have tooling and services that keep your lake from being swampy, to be clear. But the key value is the benefits of the lake: the cost effectiveness, the reliability, security, the scale, those are all the benefits. The problem was that no one really made cloud object storage a first-class citizen, and we've done that. We've addressed the swamp nature but provided all the value of analysis. And those cost metrics, that scale: no one can touch cloud object storage, you just can't. But what we've done is crack the code of how you make it analytical. >> Jeremy, I want to get your thoughts on this too. On your side, I mean, as a practitioner and customer of these solutions, you know, the concern is, am I missing anything? And I've been a big proponent of data retention for many, many years. You know, Dave Vellante and our CUBE alumni all know that I bang on the table all the time: store your data, be a data hoarder, because it's going to come back and be valuable. Costs are going down, so I'm a big fan of data retention. But the fear might be, what am I missing? Because machine learning starts to come in down the road, you've got AI; the more data you have that's accessible in real time, the more machine learning is effective.
Do you worry about missing anything or do you just store everything? >> We store everything. Sometimes it's interesting where the value and insights come from in your data. Something that might seem trivial today, down the road offers tremendous, tremendous value. So one of the things we do, because we have wifi in the subway infrastructure, you know, is take that wifi data so we can start to understand the flow of people in and out of the subway network. And we can take that and provide insights to the rail operators, which get them from A to B quicker. You know, when we built the wifi it wasn't with the intention of getting Torontonians across the city faster. But that was one of the values that we were able to get from the data. In terms of, you know, Thomas's solution, I think one of the reasons we engaged him in the first place is because I didn't believe his compression. It sounded a little too good to be true. And so when it was time to try them out, you know, all we had to do was ship data to an S3 bucket. You know, there's tons of solutions to do that, and data shippers right out of the box. It took, you know, a few minutes, and then exploring the data was in Kibana, which is their dashboard, which is, you know, an interface that's easy to use. So we were, you know, within two days getting the value out of that data that we were looking for, which is, you know, phenomenal. We've been very happy. >> Thomas, sounds like you've got a great testimonial here, and it's not like an easy problem that he's living in there. I mean, I think, you know, I was mentioning this earlier and we're going to get into it now. There's regulations and there's certain compliance issues. First of all, everyone has this problem now, it's not just within that space.
But just the technical complexities of packets moving around: I got on my Wi-Fi at this stop here, I'm jumping over there, and there's a ton of data, it's all over the place, it's totally unstructured. So it's a tough test for you guys, ChaosSearch. So yeah, it's almost like the Mount Everest of customer testimonials. It's a big use case here. How does this translate to other clients? And talk about the governance and security controls, because I know this is highly regulated, and there's penalties involved on his side of the world. With telco, the providers that have these edge devices, there's actually penalties and whatnot. So it's not just commercial, where it's maybe, you know, risk management; here there's actually penalties. >> Absolutely. So, you know, centralizing your data has a real benefit of not getting in trouble, right? So you have one place, you store in one place, that's a good thing. But what we've done, and this was a key aspect of our offering, is that we, as ChaosSearch folks, don't own the customer's data. We don't own BAI's data. They own the data. They give us access rights in a very standard way, with cloud object storage role and policies from Amazon: read-only access rights to their data. And so not owning a customer's data is a big selling point, not only for them but for us, from a compliance and regulatory perspective. So, you know, unlike a lot of solutions where you move the data into them and now they are responsible, actually BAI owns everything. They provide access so that we can provide an analysis, and they can turn it off at any point in time. We're also SOC 2 Type 1 and Type 2 compliant. You've got to do it, you know, in this world. You know, when we were young we ran at this because of all of these compliance scenarios that we would be in. But, you know, the long and short of it is, we're a transient service.
The storage, cloud storage, is the source of truth where all data resides. And, you know, think about it: it's architecturally smart, it's cost effective, it's secure, it's reliable, it's durable. But from a security perspective, having the customer own their own data is a big differentiation in the market, a big differentiation. >> Jeremy, talk about, on your end, the security controls surrounding log management environments that span across countries with different regulations. Now you've got all kinds of policy dimensions and technical dimensions and topology dimensions. >> Yeah, absolutely. So how we approach it is we look at where we have offerings across the globe and we figure out what the sort of high-water mark level of adherence is that we need to hit. And then we standardize across that. And by shipping to S3, it allows us to enforce that governance really easily. And, right to Tom's point, you know, we manage the data, which is very important to us, and we don't have to be worried about a third party, or about wanting to change providers years down the road. Although I don't think anyone's coming out with 81% compression anytime soon (laughs). But yeah, so for us it's about meeting those high standards and having the technologies that enable us to do it. And ChaosSearch is a very big part of that right now. >> All right, let me ask you a question. For the folks watching that are really interested in this topic, what would you say to them when evaluating ChaosSearch? Obviously, your use case is complex, but so are others. As enterprises start to have an edge, obviously the security posture shifts, everything shifts. There's no more perimeter, and the data problem becomes acute to them. So the enterprises are going to start seeing what you've been living in your world. What's your advice to people watching? >> My advice would be to give them a try. You know, it has been really quite impressive.
The customer service has been hands-on, and, you know, they've been under-promising and over-delivering, which, when you have the kind of requirements to manage solutions in these very complex environments, cloud, local, you know, various data centers and such, you know, that kind of customer service is very important, right? It enables us to continue to deliver those high-quality solutions. >> So Thomas, give us the overview of the secret sauce. You've got a great testimonial here. You've got people watching. What's different now in the world that you're going after? What wave are you on? Talk to the people who are watching this and saying, okay, why ChaosSearch? Why are you relevant? Obviously there's some cool things you're doing. I love that. What's cool, what's relevant, and what's in it for them if they work with you? >> Yeah. So, you know, that whole Silicon Valley reference, I actually got that from my patent attorney when we were talking. But yeah, no, we, you know, focused on whether we could crack this code of making data, one, a face small, store small, move small, process small. But then make it multimodal access, make it virtual transformation. If we could do that, and we could transform cloud object storage into a high-performance analytical database, all these heavy, heavy problems, all that complexity, that scaffolding that you build to do these types of scales, would be solved. Now what we had to focus on, and this has been my, I guess you'd say, life passion, is working on a new data representation. And that's our secret sauce, which enables a new architecture, a new service, where the customer focuses on their tooling, their APIs, their visualizations that they know and love. What we focus on is taking that data lake and, again, transforming it into an analytical database, both for log analytics, think of an Elasticsearch replacement, as well as a BI replacement for your SQL warehousing database.
And coming out later this year, into 2022, ML support on one representation. You don't have to silo your information, you don't have to reindex your data. So Elasticsearch, SQL, and actually ML TensorFlow actions on the exact same representation. So think about the data retention, doing some post-analysis on all those logs of data, months, years, and then maybe setting up some triggers if you see some anomaly that's happening within your service. So you think about it: the hunt, with BI reporting, with predictive analysis, on one platform. Again, it sounds a little unicorn, I agree with Jeremy, maybe it didn't sound true, but it's been a life's work. So it didn't happen overnight. And you know, it's eight years, at least, in the making, but I guess a life journey in the end. >> Well, you know, the timing is great. You know, all the database geeks out there who have been following the data industry know that, you know, there's a good point for structured data, but when you start getting into mechanisms and they become a bottleneck or a blocker to innovation, you know, you're starting to see this idea of a data lake being: let the data kind of form, let it be. You know, I hate the term control plane, but more of a connective tissue between systems has become an interesting thing. So now you can store everything, so, you know, no worries there, no blind spots, and then let the magic of machine learning come around in the future. So Jeremy, with that, I've got to ask you, since you're the bad boy of data analytics at BAI Communications, head of data analytics: what do you look for in the future as you start to set this up? Because I can almost imagine, connecting the dots here in the interview, you've got the data lake, you're storing everything, which is good. Now you have to create more insights and get ahead of the curve and provide some prescriptive and automated ways to do things better. What's your vision?
>> First I would just like to say that, you know, when astrophysicists talk about, you know, dark energy, dark matter, I'm convinced that's where Thomas is hiding the ones and zeros to get that compression, right? I don't know that to be fact, but I know it to be true. And then in terms of machine learning and these sort of future technologies which are becoming available, you know, starting from scratch and trying to build out, you know, models that have value, that takes a fair amount of work. And that landscape keeps changing, right? Being able to push our data into an S3 bucket and then, you know, retain that data and then get anomaly detection on top of it, that's, I mean, that's something special. And that unlocks a lot of ability for, you know, our teams to very easily deliver anomaly detection and machine learning to our customers without having to take on a lot of work to understand the latest and greatest in machine learning. So, I mean, it's really empowering to our team, right? And a tool that we're going to. >> Yeah, I love the name, ChaosSearch, Thomas. I've got to say, you know, it brings up the inside baseball around Chaos Monkey, which everyone knows was a DevOps tool to kind of simulate day-two operations and disruptions in DevOps. But what you're really getting at is a whole new architecture that's beyond the DevOps movement; it's like a next-gen architecture. Talk about that to the people watching who have a lot of legacy and want to transform over to a more enabling platform that's going to give them some headroom for their data. What do you say to them? How do they get started? What's their mindset? What are some first principles you can share? >> Well, you know, I always start with first principles, but, you know, I like to say we're the next next gen. The key thing with the ChaosSearch offering is you can start today without even ChaosSearch.
Stream your data to S3. We're going to make data lakes hip and cool again. And actually, Google it now: data lakes are hip and cool. So start streaming now, start managing your data in a well-formed, centralized viewpoint with security, governance, and cost effectiveness. Then call the ChaosSearch shop, and we'll make access to it easy and simple, to ultimately solve your problems. The bug, whether it's a security issue; the bug, whether it's more a performance issue at scale, right? And so when workloads can be added instantaneously in your data lake, it's game changing, it's mind changing. So for the DevOps folks where, you know, you're up all night trying to say, how am I going to scale from, you know, one terabyte today to 50 terabytes? Don't. Stream it to S3. We'll take over, we'll worry about that scale pain. You worry about your job of security, performance, operations, integrity. >> That really highlights the cloud scale, the value proposition, as apps start to use data as an input, not just as a part of a repo. So great stuff. Thomas, thanks for sharing your life's work and your technology magic. Jeremy, thanks for coming on and sharing your use cases with us and how you are making it all work. Appreciate it. >> Thank you. >> My pleasure. >> Okay. This is theCUBE's coverage, presenting the AWS Startup Showcase: The Next Big Thing, here with ChaosSearch. I'm John Furrier, your host. Thanks for watching. (upbeat music)
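
The read-only arrangement Thomas describes in this interview (the customer keeps ownership of the bucket and grants a role that can only list and read it) can be sketched as an IAM-style policy document. This is an illustrative sketch only: the bucket name `example-log-lake` and the helpers `read_only_policy` and `allowed_actions` are hypothetical, and the interview does not give the actual policy ChaosSearch asks customers to attach. The action names `s3:ListBucket` and `s3:GetObject` are standard S3 permissions.

```python
import json


def read_only_policy(bucket: str) -> dict:
    """Build an IAM-style policy that allows listing and reading one bucket.

    The bucket name is supplied by the caller; nothing here grants
    write or delete permissions.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                # ListBucket applies to the bucket itself.
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # GetObject applies to the objects inside the bucket.
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


def allowed_actions(policy: dict) -> set:
    """Collect every action granted by Allow statements in the policy."""
    actions = set()
    for statement in policy["Statement"]:
        if statement["Effect"] == "Allow":
            actions.update(statement["Action"])
    return actions


if __name__ == "__main__":
    # Hypothetical bucket name, used only for illustration.
    print(json.dumps(read_only_policy("example-log-lake"), indent=2))
```

Because the grant lives on the customer's side, removing the policy revokes access, which matches the point made above that the customer can turn the analysis off at any time.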

Published Date : Jun 24 2021

SUMMARY :

great to have you on. it's a new kind of last mile if you will, specifically where, you know and you know, someone was driving and you can imagine just the amount and you seeing 5G being called that allow you to do that correlation. and so providing the analysis and complexity of bringing in the new. And the challenge we were and we have our, you know and having data available when you need it And so this, you know, of data system that is going to be tied in is we allow you to organize it, of these solutions, you So we were, you know, within and you got there's penalties of solutions where you the security controls surrounding the log and having the technologies and the data problem you know, they've been after, what wave are you on? that scaffolding that you in the interview, you got the data lake like to say that, you know I got to say, you know but you know, I like to say with us and how you the next big thing here with Chaos Search.

ENTITIES

Entity	Category	Confidence
Jeremy	PERSON	0.99+
Thomas	PERSON	0.99+
Dave Alondra	PERSON	0.99+
two	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
Jeremy Thomas	PERSON	0.99+
Thomas Hazel	PERSON	0.99+
Telco	ORGANIZATION	0.99+
Jeremy Foran	PERSON	0.99+
BIA	ORGANIZATION	0.99+
Tom	PERSON	0.99+
AWS	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
81%	QUANTITY	0.99+
Chaos Search	ORGANIZATION	0.99+
eight years	QUANTITY	0.99+
tomorrow	DATE	0.99+
Palo Alto, California	LOCATION	0.99+
2000	DATE	0.99+
both	QUANTITY	0.99+
50 terabytes	QUANTITY	0.99+
two days	QUANTITY	0.99+
one	QUANTITY	0.99+
today	DATE	0.99+
billions	QUANTITY	0.99+
Silicon Valley	LOCATION	0.99+
Toronto	LOCATION	0.99+
Google	ORGANIZATION	0.98+
First	QUANTITY	0.98+
S3	TITLE	0.98+
one platform	QUANTITY	0.98+
ChaosSearch	ORGANIZATION	0.98+
first principles	QUANTITY	0.98+
two worlds	QUANTITY	0.98+
first principles	QUANTITY	0.98+
2022	DATE	0.98+
one place	QUANTITY	0.98+
one system	QUANTITY	0.98+
three years	QUANTITY	0.98+
DevOps	TITLE	0.98+
two years ago	DATE	0.97+
Thomas Thomas	PERSON	0.96+
Chaos	ORGANIZATION	0.96+
SQL	TITLE	0.96+
BAI	ORGANIZATION	0.96+
trillion	QUANTITY	0.95+
BAI Communications	ORGANIZATION	0.95+
Mount Everest	LOCATION	0.95+
The Cube	ORGANIZATION	0.95+
this year	DATE	0.95+
first	QUANTITY	0.95+
Cloud App	TITLE	0.94+
Hadoop	TITLE	0.94+
pandemic	EVENT	0.94+
first place	QUANTITY	0.94+

Ed Walsh, ChaosSearch | CUBE Conversation May 2021

>> So-called big data promised to usher in a new era of innovation where companies competed on the basis of insights and agile decision making. There's little question that social media giants, search leaders, and e-commerce companies benefited. They had the engineering chops and the execution capabilities to take troves of data and turn them into piles of money. But many organizations were not as successful. They invested heavily in data architectures, tooling, and hyper-specialized experts to build out their data pipelines. Yet they still struggle today to truly realize their vision. Data in their lakes is plentiful, but actionable insights aren't so much. ChaosSearch is a cloud-based startup that wants to change this dynamic with a new approach designed to simplify and accelerate time to insights and dramatically lower cost. And with us to discuss his company and its vision for the future is CUBE alum Ed Walsh. Ed, great to see you. Thanks for coming back in theCUBE. >> I always love to be here. Thank you very much. It's always a warm welcome. Thank you. >> Alright, so give us the update. You guys have had some big funding rounds, you're making real progress on the tech, taking it to market. What's new with ChaosSearch? >> Sure. Actually, a lot of good, exciting things have happened. In fact, just this month we, you know, announced some pretty exciting things. So we unveiled what we consider the industry's first multi-model data lake platform: we allow you to take your data in S3. In fact, if you want to show the image you can, but basically we allow you to put your data in S3, and then what we do is we activate that data. What we do is a full index of the data, and we make it available through open APIs. And the key thing about that is it allows your end users to use the tools they're using today. So simply put your data in your cloud object storage, think Amazon S3 and Glacier, think of all the different data.
It naturally lands there. And then we do the hard work. And the key thing is you get one unified data lake, but with multi-model access. So we expose APIs like the Elasticsearch API, so you can do things like search, or use Kibana to do log analytics, but you can also do things like SQL, use Tableau or Looker, or bring relational concepts into Kibana, things like joins in the data back end. It also allows you to do machine learning, which is coming early next year. But what you get with that, because of the data lake philosophy, is that we're not making you do new transformations or all that data movement. People typically land data in S3, and we're on the shoulders of giants with S3. There's not a better, more cost-effective platform, nor a more resilient one. There's not a better queuing system out there, and it's on a cost curve that you can't beat. So people store a lot of data in S3. But basically what you have to do is ETL it out to other locations. What we do is allow you to literally keep it in place. We index in place. We write our hot index to rewrite index, allow you to go after that, and publish open APIs. What we avoid is the ETL process. So what our index does is look at the data and do full schema discovery and normalization, and we're able to give sample sets. And then the refinery allows you to do advanced transformations using code. Think about using SQL or using regex to change that data, pull the data apart, things like that, and use role-based access to give that to the end user. And it's in a format that their tools understand: Kibana will use the Elasticsearch API, or Elasticsearch calls, but you can also use SQL and go directly after the data. By doing that, you get a data lake, but you haven't had to take the three weeks to three months to transform your data that everyone else makes you. And you talked about the failure. The idea of data lakes was: put your data there, in a very scalable, resilient environment. Don't do transformation.
It was too hard to structure for databases and data warehouses, so put it there and we'll show you how to get value out. That was largely undelivered. But we're that last mile. We do exactly that. Just put it in S3 and we activate it, and we activate it with APIs that the tools of your analysts use today, or that they want to use in the future. That is what's so powerful. So basically we're on the shoulders of giants with S3: put it there and we light it up, and that's really the last mile. But it's this multi-model access, and it's also this lack of transformation. We can do all the transformation, and that's all done virtually and available immediately. You're not doing extended ETL projects with big teams moving around a lot of data in the enterprise. In fact, most of the time they land it in S3, and they move it somewhere, and they move it again. What we're saying is: now just leave it in place, we'll index it and make it available. >> So that's interesting. The reason they wanted to move it into S3 was that it was the original object storage cloud. It was a cheap bucket. Okay. But it's become much more than that when you talk to customers. It's like, hey, I have all this data in S3. I want to do something with it. I want to apply machine intelligence. I want to search it. I want to do all these things. But you're right, oftentimes I have to move it to do that. So that's a huge value. Now, are you available in the AWS Marketplace yet? >> You know, in fact, that was the other announcement to talk about. So our solution is 100% available on the AWS Marketplace, which is great for clients because they can burn down their credits with Amazon. >> Yeah, that's super great news there. Now let's talk a little bit more about data lakes. You know, the old tongue-in-cheek joke was that data lakes become data swamps. You know, no schema on write. Oh great, I can put everything into the lake, and then it's like, okay, now what?
So maybe double-click on that a little bit and provide a little bit more detail on your vision there and your philosophy. >> So if you could put things in a data lake and then get after it with your own tools, on Elasticsearch, of course you'd do that, if you don't have to go through all that. But everyone thinks it's the status quo. Everyone has to put it in some sort of schema in a database before they can get access to it, so what everyone does is move it someplace to do it. Now, they're using 1970s and maybe 1980s technology. And they're saying, I'm going to put it in this database, it works on the cloud, and you can go after it. But you have to do all the same pain of transformation, which is what takes time, cost, and complexity. It takes time to do a transformation for an end user. It takes a lot of time. But it also takes a team's time to do it, with DBAs and data scientists to do exactly that. And it's not one thing going on. So it takes three weeks to three months in an enterprise. It's cost and complexity. And with all these pipelines, for every data request you're trying to give them their own dataset. It ends up being data puddles all over. It might be in your data lake, but it's all separated. Hard to govern. Hard to manage. What we do is we stop that. What we do is we index in place. Your data is already in S3. Typically you're ETLing it out. You can continue doing that. We really are just one more use of the data. We do read-only access. We do not change that data, and you give us a place to write our index. It's a full rewrite index. Once we've done that, the refinery allows you to activate that data. It's immediately fully indexed and performant from Kibana. So you no longer have to take your data and move it and do a pipeline into Elasticsearch, which becomes kind of brittle at scale. You have the scale of S3, but you use the exact same tools you do today.
And what we find for log analytics, and it's a slightly different use case or value prop than BI or what we're doing with product companies, is that we're saving clients 50 to 80% on hard dollars. They're going from very limited datasets to unlimited datasets, whatever they want to keep in S3 and Glacier. But also they're getting away from the brittle data layer, which is the Lucene environment. Any of the data layers hold you back because it takes time to put data there. But more importantly, it becomes brittle at scale, whereas you don't have any of that scale issue when using S3 as your data lake. >> So what are the big use cases, Ed? You mentioned log analytics. Maybe you can talk about that. And are there any others that are sort of forming in the marketplace? Any patterns that you see? >> Because of the multi-model access we can do a lot of different use cases, but we always work with clients on high-ROI use cases. Why? The Big Bang theory of do a data lake and put everything in it has just proven not to work, right? So what we're focusing on for the first use case is log analytics. Why? As with everything, it had a tipping point, right? People were bimodal: save money here, invest here. It went quickly to, no, no, we're going cloud native, and we have to. And then on top of it, it was, how do we efficiently innovate? So the tipping point happens, everyone's going cloud native. Once you go cloud native, the amount of machine-generated data that comes from the environment just explodes. You're not managing hundreds or thousands or maybe 10,000 endpoints; you're dealing with millions or billions. And also you need this to get insight out. So logs become one of the things you can't keep up with.
I think I mentioned, we went to a group of end users, it was only 60 enterprise clients, but we asked them, what's your capture rate on logs? And they said, what do you want it to be? Actually, 78% said, listen, we want to capture 80 to 100% of our logs. That would be the ideal: not everything, but we need most of it. And then the same group, what are you doing? Well, 82% captured less than 50%. They just can't keep up with it. And with everything, including Elastic and Splunk, they work harder at the process to narrow it and keep less and less data. Why? Because they can't handle the scale. We just say, land it there, don't transform, and we'll make it all available to you. So for log analytics, especially with cloud native, you need this type of technology, and you need to stop. It's like, it feels so good when you stop hitting your head against the wall, right? This ETL process at this type of scale just doesn't work. So that's exactly what we're delivering. For the second use case, that's using the Elasticsearch API but also using SQL to go after the same data representation. And when we come out with machine learning, you can also do anomaly detection on the same data representation. So for log analytics use cases, for SREs, DevOps, SecOps, it's a huge value prop. Now, on the same platform, because it has SQL exposed, you can do what we call agile BI. Think about the tools people are using: Looker, Tableau, Power BI, Metabase. Think of all these toolsets that people want to give to and use in the business. They're coming back to the centralized team every single week asking for new datasets. And they have to be set up as a dataset. They have to do an ETL process to give access to that data. Whereas with us, because of the way it's just landed in the bucket, if you have access to it with role-based access, I can literally get you access to it with your toolset, let's say Tableau or Looker.
You know, these different datasets, literally in five minutes, and now you're off and running. And if you want a new dataset, you get another virtual view and you're off and running, but with full governance. It used to be that in BI you either had self-service or centralized. Self-service is kind of out of control, but you can move fast. With the centralized team it takes me months, but at least I'm in control. We allow you to do both: fully governed but self-service. Right. >> I've got to have lower, I've got to excel. All right. And it's like, that's the trade-off on each of the pieces of the triangle. Right. >> And they make it sound easy: we'll just put in a data source and you're done. But the problem is you have to ETL the data source. And that's what takes the three weeks to three months in an enterprise, and we do it virtually in five minutes. So now the third is actually, think about it, kind of a combination of the two. You love the beer and diapers story. So, you know, think about the early days of Teradata, where they looked at sales-out data for a business. They were able to look at all the sales-out data in a large relational environment, crunch all these numbers, and figure out by location of products and the store that they'd sell more of certain things, and they came up with an analogy which everyone talked about, beer and diapers: if you put them together, you sell more of both. Why? Because in the afternoon, for anyone that has kids, you picked up diapers and you might want to grab a beer on the way home to the kids. But that analogy was 30 years ago. Now, what's the shelf space for a product company? You know, it is the website. It's actually the data coming from there. It's actually the app logs, and you're not capturing them because you can't in these environments, or you're capturing the data, but everyone's telling you, you know, you've got to do an ETL process to keep less data.
You've got to select, you've got to be very specific, because it's going to kill your budget. You can't do that with Elastic or Splunk: you've got to keep less data, and you don't even know what questions you're going to ask. With us, bring all the app logs, just land them in S3 or Glacier. It's really shoulders of giants, right? There's not a better platform on cost effectiveness, security, resilience, or throughput. Think about what you can stream; it's the best queuing platform I've ever seen in the industry. Just land it there. And it's also very cost effective. We also compress the data. So by doing that, now you match that up with actually a relatively small amount of relational data, and now you have the vaccine being data. But instead it's like, this user's using that use case, and our top users always start with this one, then they use that feature and that feature. Hey, we just did new pricing, and it's affecting these clients and those clients. By doing this, we get that. But you need that data, and people aren't able to capture it with the current platforms. A data lake, as long as you can make it available hot, is a way to do it. And that's what we're doing. But we're unique in that. Other people are making you ETL it and put it in a 1970s and 1980s data format called a schema. And we avoid that because we basically make S3 hot and analytic. >> So okay, I want to land on that for a second, because I think sometimes people get confused. I know I do sometimes with ChaosSearch; it's like, sometimes I don't know where to put you. I'm like, okay, observability, that seems to be a hot space. You know, of course log analytics is part of that. BI, agile BI, you called it. But there's players like Elasticsearch, there's Starburst, there's Datadog, Databricks, Dremio, Snowflake. I mean, where do you fit? What's the category, and how do you differentiate from players like that? >> Yeah.
So we went about it fundamentally differently than everyone else. Six years ago, Thomas Hazel and his band of merry men and women came up with it and designed it from scratch. They basically purpose-built it to make S3 a hot analytic environment with open APIs. By doing that, they kind of changed the game, so we deliver upon the true promise: just put it there and I'll give you access to it. No one else does that. Everyone else makes you move the data and put it in a schema of some format to get to it. So if you look at Elasticsearch, why are we going after it? It just happens to be that logs are overwhelming. Once you go cloud native, you can't afford to put it in Lucene, the ELK stack. L is for Lucene; it's an inverted index. Start small, great. But once you grow, it's now not one server, it's five servers, 15 servers. You lose a server, you're down for three days because you have to rebuild the whole thing. It becomes brittle at scale, and expensive. So you make a trade-off: I'm going to keep less, either in retention or in data. So with Elastic, we have no Elastic under the covers, but we allow you to index the data in S3, and you can access it directly through a Kibana interface or an OpenSearch interface, API. >> It's just API. >> It's open APIs. And by doing that you've avoided a whole bunch of time, cost, and complexity: the time of your team to do it, but also the time to results, the delays of doing that. The cost, it's crazy. We're saving 50 to 80% in hard dollars while giving you unlimited retention where you were dramatically limited before us. And it's a managed service; before, you had to manage that. It's kind of clunky. Not when it starts small, when it starts small it's great, but once at scale, that's a terrible environment to manage. That's why you end up with not one Elasticsearch cluster but dozens. I just talked to someone yesterday who had 125 Elasticsearch clusters because of the scale.
So anyway, that's Elastic. If you're using Elastic at scale and you're having problems with the trade-off of cost, time, and scale, we become a natural fit, and you don't change what your end users do. >> So the thing is, you know, people hear this and go, wow, that sounds so simple. Why doesn't everybody do this? The reason is it's not easy. You said Tom and his merry band. This is really hardcore tech, and it's not trivial what you've built. Let's talk about your secret sauce. >> Yeah. So it is a patented technology. So if you look at our, you know, component architecture, a large part, 90% of the value add, is actually S3. I've got to give S3 full kudos. They built a platform, and we're on the shoulders of giants. But what we did is we purpose-built it to make object storage a hot analytic database. So we have an index, like a database. And then basically on the data you bring a refinery to be able to do all the advanced types of transformation, but all done virtually, because we're not changing the source of record, we're changing the virtual views. And then a fabric allows you to manage it and be fully elastic. So if we have big queries, because we have multiple clients with multiple use cases, each multiple petabytes, we're spinning up 1,800 different nodes to go after a particular environment. But even with all that, we're saving them 58%. But it's really the patented technology to do this. It took us six years, by the way; that's what it takes to come up with this. I came upon it, I knew the founder, I've known Thomas Hazel for a while, and, you know, his first thing was he figured out the math, and the math worked out. It's deep tech, it's hard tech. But the key thing about it is we've been in market now for two years, with multiple use cases in production at scale. Now what you do is roadmap: we're adding APIs. So now we have Elasticsearch as a natural proof point.
Now adding SQL allows you to open up new markets. But the idea is, we believe we deliver on the true promise of data lakes, and the promise of data lakes was: put it there, don't focus on transferring, it's just too hard; I'll get insights out. And that's exactly what we do. But we're the only ones that do that; everyone else makes you ETL it to other places. And that's the innovation of the index and the refinery, which allow you to index in place and give virtual views in place, at scale. And then the open APIs, to be honest, I think that's a given. Give me an open API, let me go after it. I don't know what tool I'm going to use next week. Every time we go into an account, they're not a Looker shop or a Tableau shop or a QuickSight shop; they're all of them, and they're just trying to keep up with the business. And then the ability to have role-based access, where you can actually say, hey, give them their own bucket, give them their own refinery. As long as they have access to the data, they can do their own manipulation. It ends up being >> Just, >> That's the true promise of data lakes. Once we come out with machine learning next year, you're going to rip through the same S3 data, and the way we structured the data is matrices. It's a natural fit for things like TensorFlow and PyTorch, but that's going to be next year, just because it's a different persona. But the underlying architecture has been built; what we're doing is taking it one use case at a time. So as we work with our clients, we say it's not a big bang. Let's nail a use case that works well, with great ROI and great business value for a particular business unit, and let's move to the next. And that's how I think it's going to be, really. If you think about what Gartner talks about, if you think about what really got successful in data warehouses in the past, that's exactly it: it wasn't the big bang, it was, let's go and nail it for particular users.
And that's what we're doing now. Because it's multi-model, there's a bunch of different use cases, but even then we're focusing on these core things that are really hard to do with other, relational-only environments. >> Yeah, I can see why you're so excited, because, well, you and I have talked about the API economy forever, and you've been in the storage world so long, you know what a nightmare it is to move data. We've got to jump, but I want to ask you, I want to be clear on this. So you're cloud native. I talked to Frank Slootman maybe a year ago and I asked him about on-prem, and he's like, no, we're never doing the halfway house. We are cloud all the way. I think you have a similar answer. What's your plan on hybrid? >> Okay. There's nothing about the technology that means we can't go on-prem, but we are 100% cloud native, only in the public cloud. We believe that's the trend line. Everyone agrees with us, and we're sticking there. That's where the opportunity is. And if you can run analytics, there's nothing better than getting to the public cloud, like Amazon. So we're 100% cloud native. We love S3, and what would be a better place to put this? Put it in S3 and we just let you light it up. And then, I guess, if I'm going to add the commercial: buy it through Amazon Marketplace. We love that business model with Amazon. It's great. >> Ed, thanks so much for coming back on theCUBE and participating in the startup showcase. Love having you, and best of luck. Really exciting. >> Hey, thanks again, appreciate it. >> All right, thank you for watching, everybody. This is Dave Vellante for theCUBE. Keep it right there.

Published Date : May 14 2021


ENTITIES

Entity	Category	Confidence
Dave Volonte	PERSON	0.99+
Ed Walsh	PERSON	0.99+
15 servers	QUANTITY	0.99+
80%	QUANTITY	0.99+
58%	QUANTITY	0.99+
three months	QUANTITY	0.99+
three weeks	QUANTITY	0.99+
May 2021	DATE	0.99+
two years	QUANTITY	0.99+
90%	QUANTITY	0.99+
Five servers	QUANTITY	0.99+
hundreds	QUANTITY	0.99+
1970s	DATE	0.99+
amazon	ORGANIZATION	0.99+
1980s	DATE	0.99+
yesterday	DATE	0.99+
five minutes	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
millions	QUANTITY	0.99+
S three	TITLE	0.99+
three days	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
six years	QUANTITY	0.99+
50	QUANTITY	0.99+
one server	QUANTITY	0.99+
Ed	PERSON	0.99+
Tom hazel	PERSON	0.99+
two	QUANTITY	0.99+
three weeks	QUANTITY	0.99+
78	QUANTITY	0.99+
S. three	LOCATION	0.99+
third	QUANTITY	0.99+
next year	DATE	0.99+
less than 50%	QUANTITY	0.99+
tom	PERSON	0.99+
billions	QUANTITY	0.99+
three	QUANTITY	0.99+
thousands	QUANTITY	0.99+
next week	DATE	0.99+
dozens	QUANTITY	0.99+
50-80	QUANTITY	0.98+
Six years ago	DATE	0.98+
125 elasticsearch clusters	QUANTITY	0.98+
both	QUANTITY	0.98+
a year ago	DATE	0.98+
early next year	DATE	0.97+
Tableau Sharp	ORGANIZATION	0.97+
Alex	PERSON	0.97+
today	DATE	0.97+
first	QUANTITY	0.97+
first thing	QUANTITY	0.96+
30 years ago	DATE	0.96+
each	QUANTITY	0.96+
one person	QUANTITY	0.96+
S. Tree	TITLE	0.96+
10,000 endpoints	QUANTITY	0.96+
second use	QUANTITY	0.95+
82	QUANTITY	0.95+
one thing	QUANTITY	0.94+
Tableau	TITLE	0.94+
60 enterprise clients	QUANTITY	0.93+
one	QUANTITY	0.93+
eight	QUANTITY	0.93+
1800 different nodes	QUANTITY	0.91+
excel	TITLE	0.9+
80 200 of our logs	QUANTITY	0.89+
this month	DATE	0.89+
S. Three	TITLE	0.88+
agile	TITLE	0.88+
ChaosSearch	ORGANIZATION	0.86+
S. Three	TITLE	0.86+
Dream EOS Snowflake	TITLE	0.85+
cabana	LOCATION	0.85+
100 cloud	QUANTITY	0.83+
a day	QUANTITY	0.81+

Ed Walsh | CUBE Conversation, August 2020

>> From theCUBE Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is theCUBE Conversation. >> Hey, everybody, this is Dave Vellante, and welcome to this CXO Series. As you know, I've been running this series discussing major trends and CXOs, how they've navigated through the pandemic. And we've got some good news and some bad news today. And Ed Walsh is here to talk about that. Ed, how you doing? Great to see you. >> Great seeing you, thank you for having me on. I really appreciate it. >> So the bad news is Ed Walsh is leaving IBM as the head of the storage division (indistinct). But the good news is, he's joining a new startup as CEO, and we're going to talk about that, but Ed, always a pleasure to have you. You've had quite a run at IBM. You really have done a great job there. So, let's start there if we can before we get into the other part of the news. So, give us the update. You're coming off another strong quarter for the storage business. >> I would say, listen, it's bittersweet, honestly, but to be honest, we're leaving them in a really good position where they have sustainable growth. So IBM storage is actually in a very good position. I think you're seeing it in the numbers as well. So, yeah, listen, I think the team... I'm very proud of what they were able to pull off. Four years ago, they kind of brought me in: hey, can we get IBM storage back to leadership? They were kind of on their heels, not growing but falling back in market share. You know, kind of a distant third-place finisher. And basically through real innovation that mattered to clients, which is a big deal, it's the right innovation that matters to the clients, we really were able to dramatically grow, grow all four segments of the portfolio. But also get things like profitability growing, but also NPS growing. It really allowed us to go into a sustainable model. And it's really about the team.
You've heard me talk about team all the time, which is you get a good team and they really nail great client experiences. And they take the right offerings and go-to-market and merge them. And I'll tell you, I'm very proud of what the IBM team put together. And I'm still the number one fan, inside or outside IBM. So it might be bittersweet, but I actually think they're ready for quite some growth. >> You know, Ed, when you came on theCUBE, right after you had joined IBM, a lot of people were saying, Ed Walsh joined the IBM storage division to sell the division. And I asked you on theCUBE, are you there to sell the division? And you said, no, absolutely not. So it always seemed to me, well, hey, it's a good business, good cash flow business, got a big customer base, so why would IBM sell it? Never really made sense to me. >> I think it's integral to what IBM does, I think it plays to their client base in a big way. And under my leadership, really, we got more aligned with what IBM is doing from the big IBM, right: what we're doing around Red Hat, hybrid multicloud, and what we're doing with AI. And those are big focuses of the storage portfolio. So listen, I think IBM as a company is in a position where they're really innovating and thriving, and really customer centric. And I think IBM storage is benefiting from that. And vice versa. I think it's a good match. >> So one of the things I want to bring up before we move on. You had said you were seeing it in the numbers. So I want to bring up a chart here. As you know, we've been using a lot of data and sharing data reporting from our partner ETR, Enterprise Technology Research. They do quarterly surveys. They have a very tight methodology, it's similar to NPS. But it's a net score methodology, we call it. And every quarter they go out, and what we're showing here is the results from the last three quarters, specific to IBM storage and IBM net score in storage.
And net score is essentially, we ask people, are you spending more, are you spending less, we subtract the less from the more and that's the net score. And you can see when you go back to the October '19 survey, you know, low single digits, and then it dipped in the April survey, which was the height of the pandemic. So this is forward looking. So at the height of the lockdown people were saying, maybe I'm going to hold off on budgets. But now look at the July survey. Huge, huge uptick. And I think this is testament to a couple of things. One is, as you mentioned, the team. But the other is, you guys have done a good job of taking R&D, building a product pipeline and getting it into the field. And I think that shows up in the numbers. That was really one of the hallmarks of your leadership. >> Yeah, I mean, the innovation: at IBM there's almost an embarrassment of riches inside. It's how do you get it in the pipeline? We went from typically about four, four-and-a-half-year cycles to a two-year product cycle. So we're able to innovate and bring it to market much quicker. And I think that's what clients are looking for. >> Yeah, so I mean, you brought a startup mentality to the division, and of course, 'cause you're a startup guy, let's face it, now you're going back to the startup world. So the other part of the news is Ed Walsh is joining ChaosSearch as the CEO. ChaosSearch is a local Boston company, they're focused on log analytics, but more, and we're going to talk about that. So first of all, congratulations. And tell us about your decision. Why ChaosSearch? And, you know, where are you at there? >> Yeah, listen, if you can tell from the way I describe IBM, I mean, it was a hard decision to leave IBM, but it was a very, very easy decision to go to Chaos, right. So I knew the founder, I knew what he was working on for the last seven years, right.
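The net score arithmetic Dave describes earlier in this exchange (share of respondents spending more, minus share spending less) can be written out as a small worked example. This is a sketch of the simplified description given in the interview, not ETR's published methodology, and the survey counts below are made up for illustration.

```python
def net_score(spending_more, spending_less, total_respondents):
    """Percent of respondents spending more minus percent spending less."""
    if total_respondents <= 0:
        raise ValueError("total_respondents must be positive")
    return 100.0 * (spending_more - spending_less) / total_respondents


# Hypothetical survey: 200 respondents, 70 spending more, 30 spending less,
# and the remaining 100 holding spending flat.
print(net_score(70, 30, 200))
```

Respondents who hold spending flat count toward the total but cancel out of the numerator, so a score near zero can mean either a split market or a mostly flat one.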
Last five years as a company, and I was just blown away at their fundamental innovation, and how they're really driving how to get insights at scale from your data lake in the cloud. But also, at the same time, slash cost dramatically. And they make it so simple. Simply put your data in your S3, or really cloud object storage. Right now it's Amazon, they'll go to the rest of the clouds, but just put your data in S3. And what we'll do is we'll index it, give you APIs so you can search it and query it. And it literally brings a way to do data analytics at scale, and also log analytics, on everything you just put into your S3 bucket. It makes it very simple. And because they're really fundamental, we can go through it: fundamental, hard technology in that data layer, but they kept all the APIs. So you're using your normal tools; we've done it for the Elasticsearch APIs. You want to use Grafana, you want to use Kibana, or you want to use SQL, or you want to use Looker, Tableau, all those work. Which is a part of it. It's really revolutionary what they're doing as far as the value prop, and we can explain it. But also they made it evolutionary, it's very easy for clients to go. Just run in parallel, and then they basically turn off what they currently have running. >> So data lakes, really the term became popular during the sort of early big data, Hadoop era. And Hadoop obviously brought a lot of innovation, you know, leave the data where it is, bring the compute to the data. It really launched the big data initiative, but it was very complicated. You had MapReduce and Elastic MapReduce in the cloud. And it really was a big batch job, where storage was really kind of a second-class citizen, if you will. There wasn't a lot of real-time stuff going on. And then Spark comes in. And still there's this very complicated situation. So it sounds like ChaosSearch is really attacking that problem.
And the first use case it's really going after is log analytics. Explain that a little bit more, please. >> Yeah, so listen, they finally went after it with this, it's called a data lake engine for scalable, we'll say, log analytics first. It was the first use case to go after. But basically, it allows for log analytics. Everyone does it, and everyone's kind of getting to scale with it, right. But if you asked your IT department, are you challenged with scale, or cost, or retention levels, but also the management overlay of what they're doing on log analytics or security log analytics, or all this machine data they're collecting? The answer will be, absolutely, it's a nightmare. It starts easy and becomes a big, very costly application for their environments. And what Chaos does is, because they deal with the real issue, which is the data layer, but keep the APIs on top, people easily use the data insights at scale. What they're able to do is very simply run in parallel, and we'll save 80% of your cost, but also get better data retention. 'Cause there's typically a trade-off. Clients basically have this trade-off: it gets really expensive as it gets to scale, so I should just retain less. We have clients that went from nine-day retention on security logs to literally four and five days. If they didn't catch it in that time, it was too late. Now what they're able to do is go to our solution. Not change what they're doing in their applications, because you're using the same APIs, but literally save 80%, and this is millions and tens of millions of dollars of savings, but also basically get 90-day retention. It's really limitless: whatever you put into your S3 bucket, we're going to give you access to. So that alone shows you that it's literally revolutionary. The CFO wins because they save money. The IT department wins because they don't have to wrestle with this data technology that wasn't really built.
It was really built 30 years ago; it wasn't built for this volume and velocity of data coming in. And then the data analytics guys: hey, I keep my tool set but I get all the retention I want. No one's limiting me anymore. So it's kind of an easy win-win. And it makes it really easy for clients to have this really big benefit for them. And dramatic cost savings. But also you get the scale, which really means a lot in security logging or anything else. >> So let's dig into that a little bit. So cloud object storage has kind of become the de facto bucket, if you will. Everybody wants it, because it's simple. It's a get/put kind of paradigm. And it's cheap, but it's also got performance issues. So people will throw cache at the problem, they'll have to move data around. So is that the problem that you're solving? Is it a performance problem, is it a cost problem, or both? Explain that a little bit. >> Yeah, so it's all of the above. So basically, if you were building a data lake, people would like to just put all their data in one very cost-effective, scalable, resilient environment. And that is cloud object storage, or S3, or every cloud has one, right? You can do it also on-prem. Everyone would love to do that, and then literally get their insights out of it. But they want to go after it with their tools. Is it search or is it SQL, they want to go after it with their own tools. That's the vision everyone wants. But what ChaosSearch provides, and this is the core special sauce, is we built it from the ground up: the indexing technology, the database technology, how to actually make your cloud object storage a database. We don't move it somewhere, we don't cache it. You put it inside the bucket, and we literally make the cloud object storage the database. And then around it, we basically built a Chaos Fabric that allows you to spin up compute nodes to go at the data in different ways.
We truly have separated the data from the compute. But also, if a worker node goes away, and this is the beauty of containerization technology, nothing happens. It's not like what you do on-prem, where all of a sudden you have to rebuild clusters. So by fundamentally solving that data layer, but really what was interesting is they just published APIs. You mentioned put and get. So the APIs you're using with cloud object storage are put and get. Imagine we just added to that API your search API from Elastic, or your SQL interface. All we're doing is extending. You put it in the bucket, we'll extend your ability to get after it. It really is an API company, but it's hard tech, putting that data layer together. So you have cost effectiveness and scale simultaneously. Take, for instance, log analytics. We don't cache, nothing's on the SSD, nothing's on local storage. And we're as fast as you're running Elasticsearch on SSDs. So we've solved the performance and scale issues simultaneously. And that's really the core fundamental technology. >> And you do that with math, with algorithms, with machine learning, what's the secret sauce? >> Yeah, I'll tell you, my founder just has the right, interesting way of looking at problems. And he really looked at this differently and went after how you make it do both, going after data. He really did it in a different way, and really a modern way. And the reason it differentiates itself is he built it from the ground up to do this on object storage, where basically everyone else is using 30-year-old technology, right? So even really new up-and-coming companies, whether they're using Tableau or Looker, or Snowflake could be another example, they're not changing how the data is stored; they always have to move it, ETL it somewhere, to go after it. We avoid all that. In fact, we're probably a pretty good ecosystem player for all those partners as we go forward.
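The idea Ed outlines above, keeping the object store's put and get and adding a search call alongside them, can be sketched with an in-memory toy. Everything here is an assumption made for illustration: the `Bucket` and `SearchableBucket` classes are hypothetical stand-ins, not the S3 API, the Elasticsearch API, or ChaosSearch's actual design.

```python
class Bucket:
    """In-memory stand-in for an object store that exposes only put and get."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body):
        self._objects[key] = body

    def get(self, key):
        return self._objects[key]


class SearchableBucket:
    """Wraps a bucket and adds search() next to the unchanged put() and get().

    The index stores only (object key, line number) pairs per token; the log
    text itself stays in the bucket and is read back at query time.
    Note: this toy does not clean up index entries when a key is overwritten.
    """

    def __init__(self, bucket):
        self.bucket = bucket
        self._index = {}  # token -> set of (key, line_no)

    def put(self, key, body):
        self.bucket.put(key, body)
        for line_no, line in enumerate(body.splitlines()):
            for token in line.lower().split():
                self._index.setdefault(token, set()).add((key, line_no))

    def get(self, key):
        return self.bucket.get(key)

    def search(self, token):
        """Return matching lines, fetched from the bucket, in key/line order."""
        hits = sorted(self._index.get(token.lower(), set()))
        return [self.bucket.get(key).splitlines()[n] for key, n in hits]


store = SearchableBucket(Bucket())
store.put("logs/a.log", "ERROR disk full\nINFO ok")
store.put("logs/b.log", "INFO started\nERROR timeout")
print(store.search("error"))
```

Callers that only know put and get keep working unchanged, which mirrors the "run in parallel" adoption path described in the interview.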
>> So you're talking about Tom Hazel, your founder and CTO, and he's brought in the team and they've been working on this for a while. What's his background? >> Launched Telkom, building out God boxes. So he's always been in the database space. In my first day on the job, I can't do justice to his deep technology. There's a really good white paper on our website that does that pretty well. But literally the patented technology is a Chaos Index, which is a database that makes your object storage the database. And then it's really the Chaos Fabric that you put around it, and the Chaos Refinery that gives you virtual views. But that's one solution. And if you look at log analytics, you come in, log in, and you get all the tools you're used to. But underneath the covers, we're just saving about 80% of overall cost, but also giving almost limitless retention. We see people who have literally reduced the number of logs they're keeping, because of cost, and complexity, and scale, down to literally a very small amount, going right back to nine days. You could do longer, but that's what we see most people go to when they go to our service. >> Let's talk about the market. I mean, as a startup person, you always look for large markets. Obviously, you've got to have good tech, a great team. And you want large markets. So the space that you're in, I mean, I would think it started, early days, as kind of decision support. It sort of morphed into the data warehouse, you mentioned ETL, that's kind of part of it. Business intelligence, it's sort of all in there. If you look at the EDW market, it's probably around 18 to 20 billion. A small slice of that is data lakes, maybe a billion or a billion plus. And then you've got this sort of BI layer on top, you mentioned a lot of those. You've got ETL, you probably get up into the 30, 35 billion, just sort of off the top of my head and from my historical experience looking at these markets.
But I have to say these markets have traditionally failed to live up to the expectations. Things like 360-degree views of the customer, real-time analytics, delivering insights and self-service to the business. Those are promises that these industries made. And they ended up being cumbersome, slow, maybe requiring real experts, requiring a lot of infrastructure. The cloud is changing that. Is that right? Is that the way to look at the market that you're going after? You're a player inside of that very large TAM. >> Yeah, I think we're a key fundamental component underneath that whole ecosystem. And yes, you're seeing us build a full-stack solution for log analytics, because it's a really good way to prove just how game-changing the technology is. But also how we're publishing APIs, and it's seamless for how you're using log analytics. The same thing can be applied as we go across the SQL and different BI and analytic types of platforms. So it's exactly how we're looking at the market. And it's those players that are all struggling with the same thing: how do they add more value to clients? It's a big cost game, right? So if I can literally change how you store your data underneath and make it 80% more cost effective, that's a big deal, or simultaneously save 80% and give you much longer retention. Those two things are typically, really, a trade-off you have to go through, and we don't have to do that. That's what really makes this kind of the underlying core technology. And really I look at log analytics as the first application set. But if you have any log analytics issues, if you talk to your teams and find out about scale, cost, management issues, we make it very easy. Just run in parallel, we'll do a POC, and you'll see how easy it is. You can just save 80%, and 80% plus better retention is really the value proposition you see at scale, right. >> So this is day zero for you. Give us the hundred-day plan, what do you want to accomplish?
Where are you going to focus your priorities? I mean, obviously, the company's been started, it's well funded, but where are you going to focus in the next 100 days? >> No, I think it's building out where we're taking it next. There's a lot of things we could do; the degrees of freedom as far as where we'd go with this technology are pretty wide. You're going to see us be the best log analytics company there. We're getting, really a (mumbling) we, you saw the announcement, best quarter ever last quarter. And you're seeing this nice as-a-service ramp. You're going to see us go to VPC. So you can do as-a-service with us, but now we can put this same thing in your own virtual private data center. You're going to see us go to Google, Azure, and also IBM Cloud. And really, clients are driving this. It's not us driving it, it's actually the clients. So we'll go into Google because we had a couple of financial institutions that are driving us to go do exactly that. So it's more really working with our client sets and making sure we've got the right roadmap to support what they're trying to do. And then the ecosystem is another play. You know, my core technology is not necessarily competitive with anyone else. No one else is doing this. They're just kind of, hey, move it here, I'll put it on this, you know, a foundational DV, or they'll put it on a Presto environment. They're not really worried about the bottom-line economics, which is really the value prop, and that's the hard tech and patented technology that we bring to this ecosystem. >> Well, people are definitely worried about their cloud bills. The CFO is saying, whoa, 'cause it's so easy to spin up instances in the cloud. And so, Ed, it really looks like you're going after a real problem. You've got some great tech behind you. And of course, we love the fact that it's another Boston-based company that you're joining, 'cause it's more Boston-based startups.
Better for us here at the East Coast Cube, so give us your final thoughts. What should we look for? I'm sure we're going to be in touch, and congratulations. >> No, hey, thank you for the time. I'm really excited about this. I really just think it's fundamental technology that allows us to get the most out of everything you're doing around analytics in the cloud. And if you look at a data lake model, I think that's our philosophy. And we're going to drive it pretty aggressively. And I think it's a good fundamental innovation for the space, and that's the type of tech that I like. And I think we can also do a lot of partnering across ecosystems to make it work for a lot of different people. So anyway, I guess, thank you very much for the time, appreciate it. >> Yeah, well, thanks for coming on theCUBE and best of luck. I'm sure we're going to be learning a lot more and hearing a lot more about ChaosSearch, Ed Walsh. This is Dave Vellante. Thank you for watching, everybody, and we'll see you next time on theCUBE. (upbeat music)

Published Date : Aug 7 2020


ENTITIES

Entity	Category	Confidence
IBM	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Tom Hazel	PERSON	0.99+
80%	QUANTITY	0.99+
October 19	DATE	0.99+
Ed	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Ed Walsh	PERSON	0.99+
90 day	QUANTITY	0.99+
Palo Alto	LOCATION	0.99+
ChaosSearches	ORGANIZATION	0.99+
April	DATE	0.99+
July	DATE	0.99+
ChaosSearch	ORGANIZATION	0.99+
nine day	QUANTITY	0.99+
millions	QUANTITY	0.99+
August 2020	DATE	0.99+
four	QUANTITY	0.99+
Boston	LOCATION	0.99+
Chaos	ORGANIZATION	0.99+
360 degree	QUANTITY	0.99+
30,35 billion	QUANTITY	0.99+
two things	QUANTITY	0.99+
nine days	QUANTITY	0.99+
five days	QUANTITY	0.99+
last quarter	DATE	0.99+
Snowflake	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
two year	QUANTITY	0.99+
Looker	ORGANIZATION	0.99+
S3	TITLE	0.99+
Telkom	ORGANIZATION	0.99+
SQL	TITLE	0.99+
Enterprise Technology Research	ORGANIZATION	0.98+
East Coast Cube	ORGANIZATION	0.98+
a billion	QUANTITY	0.98+
30 years ago	DATE	0.98+
Tableau	TITLE	0.98+
four and a half year	QUANTITY	0.98+
Four years ago	DATE	0.98+
one	QUANTITY	0.98+
both	QUANTITY	0.98+
Elastic Search	TITLE	0.97+
today	DATE	0.97+
Cabana	TITLE	0.97+
one solution	QUANTITY	0.97+
One	QUANTITY	0.97+
first day	QUANTITY	0.97+
ETR	ORGANIZATION	0.97+
first use case	QUANTITY	0.96+
theCUBE Studios	ORGANIZATION	0.96+
VPC	ORGANIZATION	0.96+
about 80%	QUANTITY	0.96+
30 year old	QUANTITY	0.95+
Looker	TITLE	0.95+
last three quarter	DATE	0.94+
third place	QUANTITY	0.93+

Trevor Koverko & Genevieve Roch-Decter | Polycon 2018

(upbeat music) >> Live from Nassau in the Bahamas, it's theCUBE, covering Polycon '18. Brought to you by Polymath. >> Okay, welcome back everyone. This is theCUBE's exclusive live coverage here in the Bahamas for Polycon '18, put on by Polymath and Grit Capital. I'm here with the CEOs of both of those companies, who have been gracious enough to let us come in and tap into the bandwidth, tap into the guests, and host us here for theCUBE's two days of exclusive coverage. We have great guests: Trevor Koverko, CEO of Polymath, really changing the game. Security tokens are really kind of driving great, fast, accelerated innovation. And we have Genevieve Roch-Decter, who's the CEO of Grit Capital, funding it, being part of it. You guys created a great community. Welcome to theCUBE! >> Great, thanks for having us. >> Thank you. >> So, live coverage, thank you very much. We really appreciate the collaboration with you guys, great guests. But there's something magical going on here. You've got a big event, couple hundred, 400 people. But it feels like the early days of, when I was in my 20s, the computer revolution, PC, and then the internet came. People are doing deals. This is a very intimate conference: you've got whales, billionaires, you've got entrepreneurs, you've got folks from investment banking companies coming into the sector, young guns, all dudes and gals. I mean, this is a melting pot! >> We have professional athletes, too. Yeah, no, we've really brought together a cluster of different zones, if you will. I come from the world of the Canadian equivalent of Wall Street, Bay Street, and so we've got institutional investors here who don't have wallets, don't have coins, and are learning about it from the top crypto minds in the world, so it's quite magical. I don't think Trevor and I have slept in 60 days.
We literally came up with this idea, it was supposed to be a very intimate setting of 20 or 30 people and it's ballooned into 600, mostly because Trevor has so many friends and is partnering up with a lot of them on his projects, so yeah, it's been a great time so far. >> And Trevor, by the way, you're not sleeping 'cause everyone's staying out till two in the morning. It's been a great intimate gathering, people are mingling. But they're players, they're not pretenders here. This is a really interesting group, people who are investing their time, it's mission-driven here. We talk about societal change, but there's money-making going on, too. You're powering that, I mean, you've got to be exhausted, how do you feel? >> I call it the eye of the hurricane. If you weren't here this week, in crypto, you're just not relevant; this is where you wanted to be. And it's all about the attendees. The caliber of the people that came just blew me away, I'm very humbled by the quality of people that we had here. It's no surprise: we have a beautiful venue here in the Bahamas, at Baha Mar, and amazing people. Good things are going to happen. >> Community is a very important formula for success in this world. We've seen this movie before, in open-source software. It started out as a tier-2 citizen, now it runs software with tier-1-class capabilities. Cloud computing has been amazing growth. Crypto, same model, you know: it's emerged as the money, the value store, technology enablement. What are you guys seeing as the pattern? 'Cause honestly, people recognize that, certainly in the industry. If you don't, you're going to miss the boat on this one. Most people who don't get it will probably miss the boat. But a lot of people are getting in. What is the pattern that's happening, why is this moving so fast? Is it the wealth creation, is it the money-making? Is it the technology enablement? What's you guys' reaction to the why? What's the why, here?
>> I think it's a convergence of a lot of mega-trends going on right now, both on the technology and on the regulatory side. If you look at, you know, the exciting sexiness of having these liquid tokens that kind of feel like stocks, but are also utilities in the sense that you can use them to do certain things with, that's a big component of it. But I think another reason is just, there's a lot of strangling going on in the capital markets, where you have a lot less companies going public, you have a lot more barriers to raise capital, in a lot of ways. And this is kind of like light peeking through the hole. Where you have new ways, re-imagined ways to raise capital. So we're seeing just a convergence of a lot of mega-trends, I think. >> And a lot of pros are coming in, and they're either young pros that are learning and growing with this trend, the young guns, I call them, and then you've got pros coming in from other industries, whether it's banking, and other sectors, this is interesting. So the question I have for you is the security token. This has been a big deal, a lot of companies have seen the ICOs on the utility side, certainly the SEC in the US has been really sending signals pretty radically, like hey, don't pump and dump, I don't want to see any, watch that advisor stuff, and oh, by the way, show me the utility, how we test, et cetera, et cetera. The startups who have to build the future, who are trying to rush a utility token out, now have a safe harbor in the security token, and existing companies that are tokenizing a real business can raise money with the security token, this is a pretty important point. Can you guys share some color commentary on that? Do you agree with it, and then, if you do, share some color around this whole trend. >> Yeah, I mean, right now if you look today, there's two major categories of tokens, as you alluded to, you have utilities on the one hand, and securities on the other hand. 
And the distribution right now is extremely one-sided. Tokens are dominated by utilities. Utilities like Bitcoin, Ether, Ripple, they make up 99% of the total market cap of alt coins, so, where does that leave us? Well, it depends, today it means all the action is in utilities, there's more upside, they're faster, they're simpler, I'm very bullish on utilities. But what's even more exciting to me is the mega-trend, the tsunami of real-world financial assets migrating to the blockchain. And that's what I see as the next sort of part two, second wave of crypto, is real-world, tangible assets tokenizing and migrating to the blockchain. >> And you know what I think, you know, the SEC kind of gets a bad rap in all this, but the rules are there for a certain reason: to protect investors, and I think that this industry is in the beginning, it's nascent, and you know, with Trevor's company Polymath introducing the securities token. Literally, I think you coined the word. It's growing up, it's an industry that has to, you know, it's going to have some red tape, too, right, and I think working with the regulators, and Trevor's company has done that, you know, befriend them, and be open-source about it, and communal. And, you know, there's certain aspects about the regulations that are not good, and we don't want communication and the communities that have formed, Telegram's a great example of this, so there's a lot of these chat rooms that I'm in and literally people are sharing information about companies and teaching each other, and learning, and that's great. But there is an asymmetry of information sharing that, at some point, you know, we have to rein in. But we don't want to lose the positive aspects. >> You could choke the innovation, if you put too much regulatory on it, the innovation won't grow, so you have to have a balance, I mean, that's what you're saying, right? You got to get through it, but redefine a new era. 
And the SEC in the US has not been too bad, I think they're just sending a signal, and I think they're not, and they can be hardcore. They could be harder core, I think, than they are. But thank God they're not, you want to let these startups figure out what to do. Alright, so I got to talk about liquidity and funding. So, Grit Capital, you guys are involved in investments also, you're enabling partnerships at Polymath. A lot of people you're connecting into your system, we had one on earlier. The funding environment, certainly a lot of investors are here, I talked to probably at least a dozen actively investing, different profile make-ups, some go hardcore protocol under the hood, some are more business, we're going to decentralize apps. Make-up, persona, trends, can you share? >> Yeah! >> You know that world. Eight months ago, so, I'm from Toronto, I'm from Canada. Eight months ago, there was literally no publicly-traded blockchain company in Canada. And now there's probably, I think, 70, you know, new one every day, name change. But yeah, there's been a lot of equity raised. There's two companies about to go public actually, in Canada: Hut 8 Mining, who's our sponsor here at the conference, and Galaxy Digital, Michael Novogratz's company, and I think between the two of them, they've raised almost half a billion dollars in capital. Or, like, market capitalization when they go public. Probably about 250 million in actual capital. But that's huge, those checks were written not just by high net worth people, but actual institutions. And those people that are here today, they're good with writing equity checks, ICO checks, and that is going to come. And I think the securities token aspect of it will give them a lot of comfort that they can write checks in those kinds of-- >> And how does Grit Capital, talk about Grit Capital. >> Yeah, so very simply, we introduce companies to capital holders, investors. 
So I was a portfolio manager for nine years, and I like to say I was in the no game for nine years, 'cause when you're portfolio managing-- >> Now you're in the yes game! >> Yeah, you're goal-tending, you're like trying not to let bad deals in, and that wasn't really conducive to my personality, and now I'm in the yes game, I'm, you know, I like this company, I'm going to invest in it, but I'm going to introduce them to these other capital holders. And it's a positive experience. >> How much is community involved in what you do? 'Cause we're seeing obviously the pattern of kind of paying it forward, which is great culture, but also people are, you know, help scratch my back, I'll scratch your back on deal flow, and also on participation, it seems to be a big part of the current rules of engagement, or implied protocol. Is that going on? >> Yeah, you know, look, I think this is a very collaborative ecosystem, and it has to be because, by definition, open-source communities are powered by the people that make them up, and it's all about volunteering, about helping, about giving back, and it's one of the reasons I'm so passionate about this space. >> I think you should probably talk about your fund that you just announced that you're launching. And it probably plays into, so Trevor's network is global, it's extensive, he has deal-flow coming at him all the time. >> Alright, so what's in the news? >> Yeah, what are you going to do with that deal flow? You holding news back? >> Yeah, I've got a bit of a brain freeze, I have so many announcements out there, uh, yeah, we're doing a lot of exciting initiatives right now, and part of what I'm excited about, and also slightly intimidated by, is that there's just so much opportunity, there's so many key components of this new infrastructure that need to get built, that aren't in existence yet, that it's easy to get, you know, carried away. 
But for me it's about prioritizing and finding out the real kind of high-leverage initiatives that are going to help us achieve our goals. >> And so you're putting a fund together to invest in the ecosystem, or is this for financial investment, is it a crypto fund, or what are you, what's going on? >> One of those initiatives is a securities token focused venture fund, this will be the first one that I know of that exists, and it would be to help our ecosystem get financed, and that's a big component of this marketplace is capital, is investors, is demand. And we just want to channel all of that to the best deals. So Polymath capital-- >> Ecosystem is important to you guys, Polymath your ecosystem is strategic, right? >> Yes. >> How do you see that playing out, what's your vision? What do you hope to unfold in your ecosystem? Obviously, people connect in the variety of things that you can help people with, and vice versa. How do you see your ecosystem rolling out? >> Well, part of it is I want an arms length organization that has its own kind of mandate, its own charter. And the way I look at it is, if you look at Ethereum, which I am very familiar with being from Toronto and knowing those guys kind of since day one. They opted not to do a venture fund, but if they had, it would have been literally the most, >> John: high performance fund ever in history? >> Of all time, yeah, just mathematically-speaking, so we don't want to lose out on an opportunity like that. And in the process of building another potentially profitable entity we want to also seed the ecosystem and help projects that we're excited about. Get the first check. >> Who are you looking for in your ecosystem? Is it developers, 'cause obviously Ethereum, we're Ethereum developed we're a ERC20 token, we love it. It's easy to work with, smart contracts are easy to work with, so it's clearly a developer market on that side, are you guys looking for the same? 
Is it a different kind of partner, what is some of the partner makeup that you hope to attract, in case they're watching now, why should they work with you, who are they? Describe the persona of your ideal ecosystem partners, or partner. >> For better or worse, we have a lot of verticals that we have to build communities within, so those are the business community, we want leaders, we want action-takers, we want people that can structure deals, we want legal professionals, that's a big component of the security token landscape, is the regulation, is the exemptions, and the offerings, and the memorandums, and all the legal stuff, so we need a legal community. And then finally, most importantly, we need a developer community, we need the best technical minds just like any other decentralized project, so that's what my full-time job is, when people ask me, is building communities within our broader community. >> Well, I can totally give you props, one, because I know you're super busy, and you're drinking from the fire hose at all levels, and certainly the event's been great. I think a breath of fresh air, a sigh of relief from the world when you see entrepreneurs, at least from the perspective of the entrepreneurs and the markets, is that security tokens, finally someone just made a decision, let's just use this security token as a way to get the funding and get set up, and not foreclose the option for, say, a utility token. Why rush and force a utility that needs to be built out? And a lot of these utilities have really missed out because they had to run so fast to write code funded by a utility, that has a test. So I think you guys are doing a great service, I want to give you props for that. 
>> Thank you, yeah, I would whole-heartedly agree, I think a lot of these so-called utility coins are actually securities masquerading as utilities, and you know, >> I think that's the game everyone kind of is realizing, like, okay great, now you have the platform, so what's the update on the platform, the company? Take a quick minute to explain to the folks about Polymath. >> We are inundated and overwhelmed with demand right now. And we have thousands, tens of thousands of sign-ups on both the investor and issuer side. And kind of my goal right now on a day-to-day basis is to scale our on-boarding process so we can take all these issuers and give them a secure and robust token that they can fundraise on top of. And we are in the process of unveiling our application layer that's going to make that kind of self-serve process exciting and scalable. >> Well, congratulations, and Grit Capital, Genevieve, thanks for connecting, great to connect with you. Shout out to Bill Tai, who made it happen. If it wasn't for Bill Tai and Genevieve, theCUBE would not be here, and of course Polymath supporting us as well. It's been great, so thank you very much! >> Thank you! >> Great event, and we'll keep on following you guys, and thanks for coming on, sharing success. Final question: The craziest thing that's happened here this week, one, two, three, things that might have won? Craziest thing that's happened, could be good, bad, or ugly. Did someone fall in the pool? Was someone found on the beach? Share a funny story or two. >> We found a mermaid. >> There was a mermaid, yeah. >> A real, live mermaid, we actually found a mermaid. And we put her in the pool for the cocktail event. >> And we almost put Trevor in the pool as a merman. Just to balance it out. >> Merman, we're a mermaid-neutral company, we have mermen as well, oh geez, what else? We had, uh, a friend of ours decided to get the jacuzzi suite at the top floor and, uh, I don't know if you've ever seen the movie Scarface? 
But there was a lot of uh, opulence going on, which was a little more than I bargained for. And then Genevieve being the celebrity that she is. Umm, what do you think? >> Umm, I mean there's been so much, like, we've had literally 13 side-events within the conference. So drinking from a fire hose is an understatement, I would say, there's still more to do, we're going to Cabana pool party now so maybe, I think there's going to be a bull there, a stampede security bull there? >> Trevor: Oh geez, is there? >> And maybe the SEC, no! (laughs) >> Well, hey congratulations, you guys are doing a great service in the industry and I love how you brought together the inner-circle major players, really the community really admires that so appreciate your help. Okay this is theCUBE, live coverage in the Bahamas. More interviews after this short break, stay with us. (upbeat music)

Published Date : Mar 3 2018
