Thomas Hazel, ChaosSearch & Jeremy Foran, BAI Communications | AWS Startup Showcase

(upbeat music)
>> Hey everyone, I'm John Furrier with The Cube. We're here in Palo Alto, California for a remote interview and session for The Cube Presents AWS Startup Showcase: The Next Big Thing in AI, Security, and Life Sciences. We're here with a great segment on cloud, the next big thing in cloud, with ChaosSearch: Thomas Hazel, Chief Technology and Science Officer of ChaosSearch, joined by Jeremy Foran, the head of data analytics, the bad boy of data analytics as they say, at BAI Communications. Jeremy, Thomas, great to have you on.
>> Great to be here.
>> Pleasure to be here.
>> So we're going to be talking about applying large-scale log analytics to building the future of the transit industry. Obviously telco's a big part of that: smart cities, self-driving trucks, cars, you name the use case, everything's now edge. The edge is super valuable. It's a new kind of last mile, if you will. It's moving fast, it's mobile. This is a huge deal. Let's get into it, Thomas. What's the big story around this session?
>> Well, we provide a unique ability to take all that edge data and drive it into a data lake offering where we provide data analytics, both in logs and BI, with ML coming out this year into next. So our unique play is transforming customers' cloud object storage into an analytical platform. And really, I think with BAI it's log analytics specifically, where, you know, there's a lot of data streams from all those devices going into a lake, and we transform their lake into analytics for driving, I guess, operational analysis.
>> You know, Jeremy, I remember back in the day, I'm old enough to remember when the edge was the remote switch or campus hub or something. And then even on the telco side, there was no wifi back in 2000, and, you know, if someone was driving in a car and got any signal, they were lucky. Now you've got no perimeter, you have unlimited connectivity everywhere.
This has opened up more of an omnichannel data problem. How do you see that world? Because you've still got more devices pushing out at this edge and it's getting super local, right? Even on the body, even on people in the car. So certainly a lot of change on the infrastructure side. What does that pose as a data challenge?
>> Yeah, I would say that, you know, users always want more: more bandwidth, more performance. And that requires us to create more systems with more complexity to deliver that user experience that we're very proud of. And that complexity means, you know, exponentially more data. So one of the wifi networks we offer is in the Toronto subway system, T-connect. We see 100,000 to 200,000 unique users a day, and you can imagine just the amount of infrastructure to support that so that everyone has a seamless experience and can get their news and emails and even stream media while they're waiting for the subway.
>> So you guys provide state-of-the-art infrastructure for cell, wifi, broadcast, radio, IP networks. Basically, I call it the smart city kind of go-to, anything involving that edge piece. This is a huge thing. So as smart cities are on the table, and you're seeing 5G being called more of an enterprise app, feeding large, dense areas of people, this is now a new, modern version of what I would call the smart city blueprint. What's changed in your mind on this whole modernization of the smart city infrastructure concept? What's new? What's cutting edge?
>> Yeah, I would say that, you know, there was an explosion of data, and a lot of our insights aren't coming from one system anymore. They're coming from collecting data from all of the different pieces of infrastructure, whether that's your fiber infrastructure or your wireless infrastructure. And then to solve problems you need to correlate data across those systems.
So we're seeing more and more technologies that allow you to do that correlation. And that's really where we're finding tons of value, right?
>> Thomas, take us through what you guys do as a product, the value proposition, the secret sauce, and why you're here with Jeremy. Why is this conversation important for the folks watching? What's the connection between ChaosSearch and BAI Communications?
>> Well, it's data, right? And lots of it. So our unique platform allows people like Jeremy to stream all this data, right? In today's world, terabytes go to petabytes really easily, billions go to trillions really easily, and so providing the analysis of that data for their operations is challenging, particularly on technologies and architectures that have been around for a long time. So what we do here at ChaosSearch is give BAI the ability to stream all these devices, all these services, into one centralized data lake on their cloud object storage, where we connect to that cloud object storage and transform it into an analytical database to do, in this case, log analytics, and do it seamlessly, easily, where a new workload, a new stream, just streams into that lake. And we, as a service, take over: we discover it, we index it, and we publish well-known open APIs and visualizations so that they can focus on their business, not all the operational data pipeline, database, and data engineering type work that, again, at these types of scales is frankly a nightmare.
>> You know, one of the things that we've always observed on The Cube, when you see new things come out that are really cool, groundbreaking products like you guys are doing, is that it's always a challenge to manage the cost and complexity of bringing in the new. So Jeremy, take us through this tech stack here, because sometimes it might be unwieldy just from a tech stack perspective, never mind the business logic or the business processes that have to be either unwound or changed.
Can you take us through the IT stack that's critical to support your area?
>> Yeah, absolutely. So with all the various different equipment, you know, to provide our public wifi and our DAS, carrier-agnostic LTE and 5G networks, we need to be able to adhere to PCI compliance and ISO 27000, and that requires us to keep a tremendous amount of our data. And the challenge we were facing is how do we do that cost effectively, and not have to make any sort of compromises on how we do that? A lot of times you'll find you don't know the value of your data today until tomorrow. An example would be COVID. When we were storing data two years ago we weren't planning for a pandemic, but because we were able to retain that data and look back, we can see a tremendous amount of value in trying to forecast how our systems will recover when things get back to normal. And so when I met Thomas and we were talking about how we were going to solve some of these data retention problems, he started explaining to me their compression and some of the performance metrics of their compression. And, you know, I said, "Oh, middle-out compression." And it's been a bit of a running joke between me and him, and I'm sure others, but it's incredibly impressive, the amount of data we're able to store at that kind of cost, right?
>> What problem did he solve for you? Because, I mean, honestly, the startups have a lot, and the cloud's enabling more value now, we're seeing this. But when you look at this, what was the core problem that you had?
>> Yeah, so primarily this is for our syslog server. And syslog servers today aren't what they were 10, 15 years ago, where you just sort of had a machine and if something broke you went and looked, right? Now they're very complex. That data is feeding into various systems and third-party software.
So, you know, we're actively looking for changes in patterns, and we have our security teams auditing these for penetration testing and such. And then there's getting that data to S3 so that we could have it for, you know, two, three years of storage. Well, the problem we were facing is that with all of these different systems we needed to feed and retain data for, we couldn't do that on site. We wanted to use S3, but when we were doing some projections, it was like, we don't really have the budget for all of these places. Meeting Thomas and working with ChaosSearch, you know, using their compression brought those costs down drastically. And then as we've been working with them, the really exciting thing is they're bringing more and more features to that service offering. So first it was just storing that data away. And now we're starting to build solutions off of that data sitting in storage. So that's where it gets really exciting, because it's nothing to start getting anomaly detection off those logs, when originally it was just, we need to store them in case somebody needs them two, three years from now.
>> So Thomas, if I get this right, what I'm hearing is, putting aside the complexity and the governance side, the regulations, for a minute, just generally: data retention as a key value proposition, having data available when you need it, and doing it in a very cost-effective, simple way. It sounds like that's what you guys are offering. Is that right?
>> Yeah, I mean, one key aspect of our solution is retention, right? Those are a lot of the challenges. But at the same time we provide real-time notification like a classic log analytics platform: alerting, monitoring. The key thing is bringing both those worlds together and solving that problem.
And so this, you know, middle-in, middle-out... well, to be frank, we created a new technology, what we call Chaos Index, that is a database index that is wonderfully small, as we're indicating, but also provides all the features that make cloud object storage high performance. And so the idea is that you use this lake offering to store all your data in a cost-effective way, but our service allows you to analyze it from both a long-retention perspective and a real-time perspective. Bringing those two worlds together is so key, because typically you have siloed solutions, and whether it's real time at scale or retention at scale, the cost, complexity, and time to build out those solutions, I know Jeremy knows all too well. A lot of folks come to us to solve those problems because when you're dealing with terabytes and up, these things get complicated and, to be frank, fall over quite often.
>> Yeah. Let me just ask you the question that's probably on everyone's mind who's watching, and you guys have probably both heard this many times, because a lot of people just throw the data lake solution around like it's, you know, a way to whitewash their old legacy solutions: store it in a data lake. It's been called a data swamp. So people are fearful. I love this idea of a data lake. Who doesn't like throwing data into a repository and having it available at will with notifications, all these secret magic beans that just magically create value? But I don't want it to turn into a data swamp. So Thomas and Jeremy, talk about that concern. How do you mitigate that? How do you talk to that? Because if done properly, there's huge value in having a control plane or some sort of data system that is going to be tied in with signals and just storage retention. So I see the value. How do you manage the concern when people say, "Hey, I don't want a data swamp"?
>> Yeah, I'll jump into that.
So, you know, let's just be frank, Hadoop was a great tool for a very narrow scenario. I think the data swamp came about because people were using the tooling in an incorrect way. I've always had the belief that data lakes are the future. You just have to have the right service, the right philosophy, to leverage it. So what we do here at ChaosSearch is allow you to organize it, discover it, automatically index that data so that the swamp doesn't get swampy. When you stream data into your lake, how do you organize it such that it has a nice stream? How do you transform that data into value? With our service we actually start where the storage begins, not at an endpoint, not an archive. So we have tooling and services that keep your lake from being swampy, to be clear. But the key value is the benefits of the lake: the cost effectiveness, the reliability, the security, the scale. Those are all the benefits. The problem was that no one really made cloud object storage a first-class citizen, and we've done that. We've addressed the swamp nature but provided all the value of analysis, at that cost, at that scale. No one can touch cloud object storage, you just can't. But what we've done is crack the code of how you make it analytical.
>> Jeremy, I want to get your thoughts on this too, on your side, as a practitioner and customer of these solutions. The concern is, am I missing anything? I've been a big proponent of data retention for many, many years. You know, Dave Vellante and everyone around The Cube know that I bang on the table all the time: store your data, be a data hoarder, because it's going to come back and be valuable. Costs are going down, so I'm a big fan of data retention. But the fear might be, what am I missing? Because machine learning starts to come in down the road, you've got AI, and the more data you have that's accessible in real time, the more effective machine learning is.
Do you worry about missing anything, or do you just store everything?
>> We store everything. Sometimes it's interesting where the value and insights come from in your data. Something that might seem trivial today offers tremendous value down the road. So one of the things we do, because we have wifi in the subway infrastructure, is take that wifi data and start to understand the flow of people in and out of the subway network. And we can take that and provide insights to the rail operators, which gets people from A to B quicker. When we built the wifi, it wasn't with the intention of getting Torontonians across the city faster, but that was one of the values we were able to get from the data. In terms of Thomas's solution, I think one of the reasons we engaged him in the first place is because I didn't believe his compression. It sounded a little too good to be true. And so when it was time to try them out, all we had to do was ship data to an S3 bucket. There are tons of solutions to do that, and data shippers right out of the box. It took a few minutes, and then exploring the data was in Kibana, which is their dashboard, an interface that's easy to use. So we were, you know, within two days getting the value out of that data that we were looking for, which is phenomenal. We've been very happy.
>> Thomas, sounds like you've got a great testimonial here, and it's not like an easy problem that he's living in there. I mean, I was mentioning this earlier and we're going to get into it now. There are regulations and there are certain compliance issues. First of all, everyone has this problem now, it's not just within that space.
But just the technical complexities of packets moving around: I'm on my wifi at this stop here, I'm jumping over there, and there's a ton of data, it's all over the place, it's totally unstructured. So it's a tough, tough test for you guys at ChaosSearch. It's almost like the Mount Everest of customer testimonials. It's a big use case here. How does this translate to other clients? And talk about the governance and security controls, because I know this is highly regulated, and there are penalties involved on his side of the world. In telco, for the providers that have these edge devices, there are actually penalties. So it's not just commercial risk management; here there are actual penalties.
>> Absolutely. So, you know, centralizing your data has a real benefit of not getting in trouble, right? You store in one place, that's a good thing. But what we've done, and this was a key aspect of our offering, is that we, as ChaosSearch folks, don't own the customer's data. We don't own BAI's data. They own the data. They give us access rights in a very standard way, with cloud object storage roles and policies from Amazon: read-only access rights to their data. And so not owning a customer's data is a big selling point, not only for them but for us, from a compliance and regulatory perspective. So unlike a lot of solutions where you move the data into them and now they are responsible, BAI actually owns everything. They provide access so that we can provide analysis, access that they can turn off at any point in time. We're also SOC 2 Type 1 and Type 2 compliant. You've got to do it in this world. When we were young we ran at this because of all of the compliance scenarios we would be in. But the long and short of it is, we're a transient service.
The cloud storage is the source of truth where all data resides. And think about it: it's architecturally smart, it's cost effective, it's secure, it's reliable, it's durable. But from a security perspective, having the customer own their own data is a big differentiation in the market.
>> Jeremy, talk about the security controls on your end surrounding log management environments that span countries with different regulations. Now you've got all kinds of policy dimensions and technical dimensions and topology dimensions.
>> Yeah, absolutely. So how we approach it is we look at where we have offerings across the globe and we figure out the highest watermark level of adherence we need to hit. And then we standardize across that. Shipping to S3 allows us to enforce that governance really easily. And to Tom's point, we manage the data, which is very important to us, and we don't have to be worried about a third party, or about wanting to change providers years down the road. Although I don't think anyone's coming out with 81% compression anytime soon (laughs). So for us, it's about meeting those high standards and having the technologies that enable us to do it. And ChaosSearch is a very big part of that right now.
>> All right, let me ask you a question for the folks watching who are really interested in this topic. What would you say to them when evaluating ChaosSearch? Obviously your use case is complex, but so are others. As enterprises start to have an edge, the security posture shifts, everything shifts. There's no more perimeter, and the data problem becomes acute for them. So the enterprises are going to start seeing what you've been living in your world. What's your advice to people watching?
>> My advice would be to give them a try. You know, it has been really quite impressive.
The customer service has been hands-on, and they've been under-promising and over-delivering, which, when you have the kind of requirements to manage solutions in these very complex environments, cloud, local, various data centers and such, that kind of customer service is very important, right? It enables us to continue to deliver those high-quality solutions.
>> So Thomas, give us the overview of the secret sauce. You've got a great testimonial here. You've got people watching. What's different now in the world that you're going after? What wave are you on? Talk to the people who are watching this and saying, okay, why ChaosSearch? Why are you relevant? Obviously there are some cool things you're doing, I love that. What's cool, what's relevant, and what's in it for them if they work with you?
>> Yeah. So, you know, that whole Silicon Valley reference, I actually got that from my patent attorney when we were talking. But yeah, we focused on whether we could crack this code of making data, one, small: store small, move small, process small. But then also give it multimodal access and virtual transformation. If we could do that, and we could transform cloud object storage into a high-performance analytical database, all these heavy, heavy problems, all that complexity, that scaffolding that you build to do these types of scales, would be solved. Now, what we had to focus on, and this has been, I guess you'd say, my life's passion, is a new data representation. And that's our secret sauce. It enables a new architecture, a new service, where the customer focuses on the tooling, the APIs, the visualizations that they know and love. What we focus on is taking that data lake and, again, transforming it into an analytical database, both for log analytics, think of it like an Elasticsearch replacement, as well as BI, a replacement for your SQL warehousing database.
And coming out later this year into 2022, ML support on one representation. You don't have to silo your information, you don't have to re-index your data. So Elasticsearch, SQL, and actually ML TensorFlow actions on the exact same representation. So think about the data retention, doing some post-analysis on all those logs, months, years of data, and then maybe setting up some triggers if you see some anomaly that's happening within your service. So think about it: the hunt, with BI reporting, with predictive analysis, on one platform. Again, it sounds a little unicorn. I agree with Jeremy, maybe it didn't sound true, but it's been a life's work. It didn't happen overnight. It's eight years, at least, in the making, but I guess it's the life journey in the end.
>> Well, you know, the timing is great. All the database geeks out there who have been following the data industry know that there's a good place for structured data, but when you start getting into mechanisms that become a bottleneck or a blocker to innovation, you start to see this idea of a data lake: let the data kind of form, let it be. I hate the word control plane, but more of a connective tissue between systems has become an interesting thing. So now you can store everything, no worries there, no blind spots, and then let the magic of machine learning come around in the future. So Jeremy, with that, I've got to ask you, since you're the bad boy of data analytics at BAI Communications, head of data analytics: what do you look for in the future as you start to set this up? Because I can almost imagine, connecting the dots here in the interview, you've got the data lake, you're storing everything, which is good. Now you have to create more insights and get ahead of the curve and provide some prescriptive and automated ways to do things better. What's your vision?
>> First I would just like to say that when astrophysicists talk about, you know, dark energy, dark matter, I'm convinced that's where Thomas is hiding the ones and zeros to get that compression, right? I don't know that to be fact, but I know it to be true. And then in terms of machine learning and these sorts of future technologies which are becoming available, starting from scratch and trying to build out models that have value takes a fair amount of work. And that landscape keeps changing, right? Being able to push our data into an S3 bucket, retain that data, and then get anomaly detection on top of it, I mean, that's something special. It unlocks a lot of ability for our teams to very easily deliver anomaly detection and machine learning to our customers without having to take on a lot of work to understand the latest and greatest in machine learning. So it's really empowering to our team, right? And a tool that we're going to...
>> Yeah, and I love the name, ChaosSearch, Thomas. I've got to say, it brings up the inside baseball around Chaos Monkey, which everyone knows was a DevOps tool to simulate day-two operations and disruptions in DevOps. But what you're really getting at is a whole new architecture that's beyond the DevOps movement. It's like a next-gen architecture. Talk about that to the people watching who have a lot of legacy and want to transform over to a more enabling platform that's going to give them some headroom for their data. What do you say to them? How do they get started? What's their mindset? What are some first principles you can share?
>> Well, you know, I always start with first principles, but I like to say we're the next next gen. The key thing with the ChaosSearch offering is you can start today without even ChaosSearch.
Stream your data to S3. We're going to make data lakes hip and cool again. Actually, Google it now: data lakes are hip and cool. So start streaming now, start managing your data in a well-formed, centralized viewpoint with security, governance, and cost effectiveness. Then call ChaosSearch up, and we'll make access to it easy and simple, to ultimately solve your problems: the bug, whether it's a security issue, or a performance issue at scale, right? And so when workloads can be added instantaneously to your data lake, it's game changing, it's mind changing. So for the DevOps folks where, you know, you're up all night trying to figure out how you're going to scale from one terabyte today to 50 terabytes: don't. Stream it to S3. We'll take over, we'll worry about that scale pain. You worry about your job of security, performance, operations, integrity.
>> That really highlights cloud scale and the value proposition as apps start to use data as an input, not just as part of a repo. So great stuff. Thomas, thanks for sharing your life's work and your technology magic. Jeremy, thanks for coming on and sharing your use cases with us and how you're making it all work. Appreciate it.
>> Thank you.
>> My pleasure.
>> Okay. This is The Cube's coverage, presenting the AWS Startup Showcase, the next big thing, here with ChaosSearch. I'm John Furrier, your host. Thanks for watching.
(upbeat music)
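
The "stream your data to S3" advice from the interview can be illustrated with a small sketch. It is not ChaosSearch's or BAI's actual pipeline: the key layout, the `logs` prefix, the `syslog` source name, and the bucket name in the comment are all hypothetical choices made for illustration. It only shows one plausible way to gzip a batch of log lines and give it a time-partitioned object key before upload.

```python
import gzip
from datetime import datetime, timezone
from typing import Iterable


def object_key(prefix: str, source: str, ts: datetime, seq: int) -> str:
    """Build a time-partitioned key such as
    logs/syslog/2021/06/24/13/batch-000001.log.gz (layout is an assumption)."""
    return f"{prefix}/{source}/{ts:%Y/%m/%d/%H}/batch-{seq:06d}.log.gz"


def pack_batch(lines: Iterable[str]) -> bytes:
    """Join log lines with newlines and gzip them into a single object body."""
    payload = "\n".join(lines).encode("utf-8") + b"\n"
    return gzip.compress(payload)


if __name__ == "__main__":
    ts = datetime(2021, 6, 24, 13, 5, tzinfo=timezone.utc)
    batch = [
        "<134>Jun 24 13:05:01 ap-01 hostapd: station associated",
        "<134>Jun 24 13:05:02 ap-01 hostapd: station disassociated",
    ]
    key = object_key("logs", "syslog", ts, 1)
    body = pack_batch(batch)
    print(key, len(body), "bytes")
    # Upload is intentionally left out so the sketch runs offline. With boto3
    # it would be roughly:
    #   boto3.client("s3").put_object(Bucket="example-log-lake", Key=key, Body=body)
    # where "example-log-lake" is a placeholder bucket name.
```

Partitioning keys by source and hour keeps each stream separable in the bucket, which fits the interview's point about organizing a lake so it does not turn into a swamp; any real deployment would pick a layout to match its own retention and access rules.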

Published Date : Jun 24 2021


