

Ed Bailey, Cribl | AWS Startup Showcase S2 E2



(upbeat music) >> Welcome everyone to theCUBE presentation of the AWS Startup Showcase. The theme here is Data as Code. This is season two, episode two of our ongoing series covering the exciting startups from the AWS ecosystem. We talk about the future of data, the future of analytics, the future of development, and all kinds of cool stuff in multicloud. I'm your host, John Furrier. Today we're joined by Ed Bailey, Senior Technical Evangelist at Cribl. Thanks for coming on theCUBE here. >> Thank you for the invitation, thrilled to be here. >> The theme of this session is the observability lake, which I love by the way, I'm getting into that in a second. A breach investigation's best friend, which is a great topic. Couple of things, one, I like the breach investigation angle, but I also like this observability lake positioning, because I think this is a teaser of what's coming, more and more data usage where it's actually being applied specifically for things, here it's the observability lake. So first, what is an observability lake? Why is it important? >> Why it's important is technology professionals, especially security professionals, need data to make decisions. They need data to drive better decisions. They need data to understand, just to achieve understanding. And that means they need everything. They don't need just what they can afford to store. They don't need just what a vendor is going to let them store. They need everything. And that's the point of the observability lake: you couple an observability pipeline with the lake to bring in your enterprise's data, to make it accessible for analytics, to be able to use it, to be able to get value from it. And I think that's one of the things that's missing right now in the enterprise. Admins are being forced to make decisions about, okay, we can't afford to keep this, we can afford to keep this, and they're missing things. They're missing parts of the picture. And by being able to bring it together, to have your cake and eat it too, where I can get what I need and I can do it affordably, I think that's the future, and it just drives value for everyone. >> And it just makes a lot of sense. The data lake, or the earlier concept, was throw everything into the lake and you can figure it out, you can query it, you can take action on it in real time, you can stream it. You can do all kinds of things with it. And observability is important because it's the most critical thing people are doing right now for all kinds of things, from QA, administration, security. So this is where the breach piece comes in. I like that part of the talk, because the breach investigation's best friend implies that you've got the secret sauce behind it, right? So, what is the state of the breach investigation today? What's going on with that? Because we know breaches, we see 'em out there, but why is this the best friend of a breach investigator? >> Well, and this is unfortunate, but typically there's an enormous delay between breach and detection. There's an IBM study, I think it's 287 days from the actual breach to detection and containment. It's an enormous amount of time. And the key is, when you do detect a breach, you're bringing in your incident response team, and typically without an observability lake, without Cribl's solutions around the observability pipeline, you're going to have an incomplete picture. The incident response team first has to understand the scope of the breach.
Is it one server? Is it three servers? Is it all the servers? You've got to understand what's been compromised, what the impact is. How did the breach occur in the first place? And they need all the data to stitch that together, and they need it quickly. The more time it takes to get that data, the more time it takes for them to finish their analysis and contain the breach. Hence the, I think, 87 to 90 days to contain a breach. And so by being able to remove the friction, by being able to make it easier to achieve these goals, what shouldn't be hard, by removing that friction, you speed up the containment and resolution time. Not to mention, many system administrators simply don't have the data, because they can't afford to store it in their SIEM. Or they have to go to their backup team to get a restore, which can take days. And so that's-- It's just so many obstacles to getting resolution right now. >> I mean, you're crawling through glass there, right? Because you think about it, just the timing aspect. Where is the data? Where is it stored, and is it relevant, and-- >> And do you have it at all? >> And do you have it at all? And then, you know, that person doesn't work there anymore, they changed jobs. I mean, who is keeping track of all this? You guys now have this capability where you can come in and do the instrumentation with the observability lake without a lot of change to the environment, which is not the way it used to be. It used to be: buy a tool, build a platform. Cribl has a solution that eases the struggles within the enterprise. What specifically is that pain point? And what do you guys do specifically? >> Well, I'll start out with kind of an example of what drew me to Cribl, back in 2018. I was running the Splunk team for a very large multinational, and we were dealing with the complexity of the data; the demands we were getting from security and operations were just an enormous issue to overcome. I had vendors come to me all the time saying we'll solve your problems, but that meant you had to move to their platform, where you have to get rid of Splunk or you have to do this, and I'm losing something. And what Cribl Stream brought in was that I could put it between my sources and my destinations and manage my data. And I would have flow control over the data. I don't have to lose anything. I could continue to use our existing analytics tools, and that sense of power and control, and I don't have to lose anything. I was like, there's something wrong here. This is too good to be true. And so what we're talking about now, in terms of breach investigation, is that with Cribl Stream I can create a clone of my data to an object store. This is almost any object store. So it can be AWS, it could be the other vendors' object stores. It could be on-prem object stores. And then I can house all my data at the cheapest possible price. So instead of eating up my most expensive storage, I put all my data in my object store, and I only put the data I need for the detections in my SIEM. So if, and hopefully never, but if you do have a breach, LogStream has a wonderful UI that makes it trivial to then pick my data out of my object store and restore it back into my SIEM, so that my IR team can develop a complete picture of how the breach happened. What's the scope? What was the lateral movement? And answer those questions. It just takes the friction away. Just like you said, no more crawling over glass. You're running to your solution.
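To make that pattern concrete, here is a minimal sketch in Python of the split Ed describes: every event is cloned to cheap object storage at full fidelity, while only the detection-relevant subset goes to the SIEM. This is not Cribl's actual configuration or API; the event fields, the "is_detection_relevant" rule, and the local gzip file standing in for an S3 bucket are all illustrative assumptions.

```python
import gzip
import json

# Illustrative events; in a real deployment these would stream in from
# agents or syslog sources, not a hard-coded list.
events = [
    {"ts": "2022-04-01T12:00:00Z", "src_ip": "10.0.0.5", "action": "login", "status": "failed"},
    {"ts": "2022-04-01T12:00:01Z", "src_ip": "10.0.0.9", "action": "heartbeat", "status": "ok"},
    {"ts": "2022-04-01T12:00:02Z", "src_ip": "10.0.0.5", "action": "login", "status": "failed"},
]

def is_detection_relevant(event):
    # Hypothetical rule: the SIEM only needs the events the detections use.
    return event["action"] == "login" and event["status"] == "failed"

siem = []  # stand-in for the (expensive) SIEM destination

# Clone *everything* to compressed NDJSON, standing in for an object store
# (S3, Azure Blob, on-prem); keep only the detection subset in the SIEM.
with gzip.open("observability_lake.ndjson.gz", "wt") as lake:
    for event in events:
        lake.write(json.dumps(event) + "\n")  # full-fidelity copy, cheap storage
        if is_detection_relevant(event):
            siem.append(event)                # reduced copy, expensive storage

print(f"lake: {len(events)} events, siem: {len(siem)} events")
```

The point of the split is exactly the retention math Ed walks through: the lake copy is complete and cheap, so nothing is lost when a breach investigation later needs the events that were never worth SIEM pricing.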
>> You mentioned object store, and you're streaming that in. You talk about the Cribl Stream tool; I'm assuming that's when you're streaming the pipeline stuff. But is there a schema involved? Are there database challenges? How do you guys look at that? I know you're vendor agnostic, I like that piece, you plug in and you leverage all the tools that are out there, Splunk, Datadog, whatever. But how about on the database side, what's the impact there? >> Well, I'm assuming you're talking about the object store itself, and we don't have to apply a schema. We can fit the data to whichever object store it is. We structure the data so it makes it easier to understand. For example, if I want to see communications from one IP to another IP, we structure it to make it easier to see that and query that. Yeah, it's completely vendor neutral, and this makes it so simple, so simple to enable, I think-- >> So no pre-defined schema needed. >> No, not at all. And it made it so much easier. I think it took us three hours to enable this for the enterprise, and we were able to then start cutting our retention costs dramatically. >> Yeah, it's great when you get that kind of value. Time to value is critical, and all the skeptics fall to the side pretty quickly. (chuckles) I got to ask you, well, go ahead. >> So as I say, previously I would have to go to our backup team. We'd have to open up a ticket, we'd have to have a bridge, then we'd have to go through the process of pulling tape, and it could take, you know, hours if not days to restore the amount of data we needed. And now we were able to run to our goals and solve business problems instead of focusing on the process steps of getting things done. >> Right, so take me through the architecture here and some customer examples, 'cause you have the Cribl Stream there, the observability pipeline. That's key, you mentioned that. >> Yes. >> And then they build out these observability lakes from that. So what is the impact of that? Can you share the customers that are using that solution? What are they seeing for benefits? What are some of the impacts? Can you give us some specifics? >> I mean, I can't share all the exact customer names, but I can definitely give you some examples. A referenceable customer would be TransUnion; I came from TransUnion, I was one of the first customers, and it solved an enormous number of problems for us. Autodesk is another great example. The idea is that we're able to automate our data practices. I mean, just for example, what we were talking about with backups: you have to put a lot of time into managing the backups in your analytics platforms. And then you're locked into custom database schemas, you're locked into vendors. And it's still expensive. So being able to spend a few hours, dramatically cut your costs, but still have the data available, that's the key. I didn't have to make compromises, 'cause before, I was having to say, okay, we're going to keep this, we're going to just drop this and hope for the best. And we just didn't have to do that anymore.
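As a rough illustration of the schema-on-read point above: a query like "communications from this IP in this window" can run straight over the raw NDJSON copy, with field names applied at read time rather than fixed when the data was written. The field names ("src_ip", "ts") and the lake file are carried over from the earlier sketch as assumptions, not anything Cribl prescribes.

```python
import gzip
import json
from datetime import datetime, timezone

def parse_ts(value):
    # ISO-8601 timestamps; the "Z" suffix is rewritten for older Pythons.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def query(path, src_ip, start, end):
    """Scan raw NDJSON and yield matching events. The 'schema' is applied
    at read time, so nothing was locked in when the data was written."""
    with gzip.open(path, "rt") as lake:
        for line in lake:
            event = json.loads(line)
            if event.get("src_ip") != src_ip:
                continue
            if start <= parse_ts(event["ts"]) <= end:
                yield event

start = datetime(2022, 4, 1, 12, 0, tzinfo=timezone.utc)
end = datetime(2022, 4, 1, 13, 0, tzinfo=timezone.utc)
for hit in query("observability_lake.ndjson.gz", "10.0.0.5", start, end):
    print(hit)
```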
I think it's the same thing for TransUnion and Autodesk: the idea that we're going to lower our costs, we're going to make it easier for our administrators to do their jobs, so they can spend more time on business value fundamentals, like responding to a breach. You're going to spend time working with your teams, getting value out of observability solutions, and stop spending time writing custom solutions using open source tools. 'Cause your engineering time is the most precious asset for any enterprise, and you've got to focus your engineering time where it's needed the most. >> Yeah, and you can't underestimate the hassle and cost of ownership of swapping out pre-existing stuff just for the sake of having a functionality. I mean, that's a big-- >> It's painful, and that's a big thing about LogStream: being vendor neutral is so important. If you want to use the Splunk universal forwarder, that's great. If you want to use Beats, that's awesome. If you want to use Fluentd, even better. If you want to use all three, you can do that too. It's the customer's choice, and we're saying to people, use what suits your needs. And if you want to write some of your data to Elastic, that's great. Some of your data to Splunk, that's even better. Some of it to, take your pick, or Exabeam. You have the choice to put your own solutions together and put your data where you need it to be. We're not asking you to work only in our ecosystem with only our partners. We're letting you pick and choose what suits your business. >> Yeah, you know, that's the direction the Amazon folks were just talking about around their serverless. You can use any tool; they have that core architecture for everything, the S3, and then you pick whatever you want to use, SageMaker and the other things. This is the new way. That's the way it has to be to be effective. How do you guys handle that? What's been the reaction from customers? Do they roll their eyes and doubt you guys, or can you do it? Are they skeptical? How fast can you convert 'em over? (chuckles) >> Right, and that's always the challenge. And the best part of my day is talking to customers. I love hearing the feedback, what they like, what they don't, and what they need. And of course I was skeptical. I didn't believe it when I first saw it, because I was used to being locked in. I was used to having to put in a lot of effort, a lot of custom code. Like, what do you mean, it's this easy? This was 2018, and I did our first demo, and like 30 minutes in, I cut about half a million dollars out of our license, in the first 30 minutes of our first demo. And I was stunned, because, I mean, this is easy. >> Yeah, I mean-- >> Yeah, exactly. This is the future. And then, for example, the security team wanted to bring in a UBA solution that wasn't part of the vendor ecosystem that we were in. And I was like, not a problem. We're going to use LogStream. We're going to clone a copy of our data to the UBA solution. We were able to get value from this UBA solution in weeks, for what's typically a six-month cycle to start getting value. It was just that easy. And the thing that struck me was, my engineers can now spend their time on delivering value instead of integrations and moving data around.
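The vendor-neutral fan-out Ed describes can be pictured as a route table: each destination gets a filter and a sender, and onboarding a new tool (the UBA case above) is one more entry rather than a re-instrumentation project. A toy sketch follows; the destination names and filter rules are hypothetical stand-ins, not Cribl's routing syntax.

```python
# A toy route table: (name, filter predicate, sink). Real destinations
# would be Splunk HEC, Elastic, an object store, a UBA tool, and so on.
splunk, elastic, uba = [], [], []

routes = [
    ("splunk",  lambda e: e.get("sourcetype") == "auth", splunk.append),
    ("elastic", lambda e: e.get("sourcetype") == "web",  elastic.append),
    # Adding the UBA tool later is one new route, not 10,000 new agents.
    ("uba",     lambda e: "user" in e,                   uba.append),
]

def route(event):
    # Clone to every destination whose filter matches; routes are
    # independent, so one event can legitimately land in several tools.
    for name, match, send in routes:
        if match(event):
            send(event)

for event in [
    {"sourcetype": "auth", "user": "alice", "result": "failure"},
    {"sourcetype": "web", "path": "/checkout", "status": 500},
]:
    route(event)

print(len(splunk), len(elastic), len(uba))  # 1 1 1
```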
>> Yeah, and also you can spend more time preventing breaches. But what's interesting and counterintuitive here is that as you add more flexibility and choice, you'd think it'd be harder to handle a breach, right? So, now let's go back to the scenario. Say an organization has a breach, and they have the observability pipeline, they've got the lake in place, your observability lake. Take me through the investigation. How easy is it? What happens? How do they start it, what goes on? >> So, once your SOC detects a breach, typically you're going to bring in your incident response team. And what we did, and this is one more way that we removed that friction, we cleaned up the glass, is we delegate to the incident response team the ability to restore. Cribl calls it Replay: we play data out of the object store back into your SIEM. There's a very nice UI that gives you the ability to say, "I want data from this time period to this time period, and I want it to be all the data." Or the ability to filter and say, "I want just this IP." For example, if I detected, okay, this IP has been breached, then I'm going to pull all the data that mentions this IP in this timeframe, hit a button, and it just starts. And it's going to restore as fast as the IOPS are for your solution. And then it's back in your tool. One of the things I also want to mention is we have an amazing enrichment capability. So one of the things that we would do is we would have pipelines, so as the data comes out of the object store, it hits the pipeline, and we enrich it. We use GeoIP information, reverse DNS. It gets processed through a threat intel feed. So the data's already enriched and ready for the incident response people to do their job. And it just removes the friction of getting to the point where I can start doing my job. >> You know, the theme of this episode for this showcase is Data as Code. And we've been saying this on theCUBE since it started around 13 years ago: developers are going to be dealing with data like they deal with software code, and you're starting to see it. You mentioned enrichment. Where do you see Data as Code going? How relevant is it now? Because we're really talking about, when you add machine learning in here, that has to be enriched and iterated on too. We're talking about taking things off a branch and putting it back into the core. This is a data discussion, this isn't software, but it sounds the same. >> Right, and the irony is, I remember the first time saying it to an auditor. I was constantly working with auditors, and that's what I described: I'm going to show you the code that manages the data. This is the data's code that's going to show you how we transform it, how we secure it, where the data goes, how it's enriched. So you can see the whole story, the data life cycle, in one place. And that's how we handled our audits. And I think that is enormously positive, because it's so easy to be confused, it's so easy for complexity to get in the way of progress. And being able to represent your Data as Code is a step forward, 'cause the amount of data and the complexity of data, it's not getting simpler, it's getting more complex. So we need to come up with better ways to handle it.
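Ed's "data's code" point, showing an auditor exactly how replayed data is transformed and enriched, can be sketched as an ordered list of named steps that lives in version control. In the sketch below, the GeoIP, reverse-DNS, and threat-intel lookups are stubbed with placeholder tables; a real pipeline would call a GeoIP database, a resolver, and a live threat feed, and nothing here is Cribl's actual pipeline format.

```python
# Stub lookup tables standing in for a GeoIP database, reverse DNS,
# and a threat intel feed: assumptions for illustration only.
GEOIP = {"203.0.113.7": "NL"}
RDNS = {"203.0.113.7": "host7.example.net"}
THREAT_INTEL = {"203.0.113.7"}

def add_geoip(event):
    event["src_country"] = GEOIP.get(event["src_ip"], "unknown")
    return event

def add_rdns(event):
    event["src_host"] = RDNS.get(event["src_ip"], "unresolved")
    return event

def add_threat_flag(event):
    event["known_bad"] = event["src_ip"] in THREAT_INTEL
    return event

# The pipeline itself is data-as-code: an auditor can read, diff, and
# review exactly how every replayed event gets enriched, and in what order.
PIPELINE = [add_geoip, add_rdns, add_threat_flag]

def enrich(event):
    for step in PIPELINE:
        event = step(event)
    return event

print(enrich({"src_ip": "203.0.113.7", "action": "login"}))
```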
>> Now you've been on both sides of the fence. You've been in the trenches as a customer, now you're a supplier with a great solution. What are people doing with these data engineering roles? Because there's not enough data engineering. I mean, if you say Data as Code, if you believe that to be true, and many people do, we do, and you look at the history of infrastructure as code that enabled DevOps, AIOps, MLOps, DataOps, it's happening, right? So data stack ops is coming. Obviously security is huge in this. How does that data engineering role evolve? Because it just seems more and more that there's going to be a big push towards an SRE version of data, right? >> I completely agree. I was working with a customer yesterday, and I spent a large part of our conversation talking about implementing development practices for administrators. It's a new role. It's a new way to think of things, 'cause traditionally your Splunk or Elastic administrator is thinking about operating systems and memory, and about how to use the proprietary tools from the vendor; that's just not quite the same. And so we started talking about, you need to start getting used to code reviews, the idea of getting used to making sure everything has a comment. One thing I told him was, you know, if you have a function, it has to have a comment, just by default, it just has to. The standards of how you write things, how you name things, all really start to matter. And you've also got to start considering your skillset. I mean, probably one of the best hires I ever made was a guy with a math degree, because I needed his help to understand how machine learning works, how to pick the best type of algorithm. And I think this is going to evolve: you're going to move away from the gray-bearded administrator to some other gray-bearded administrator with a math degree. >> It's interesting, it's a step function. You have a data engineer who's got those kinds of capabilities, like what the SRE did with infrastructure. The step function of enablement, the value creation from really good data engineering, puts the democratization play back on the table, and changes-- >> Thank you very much, John. >> And changes that entire landscape. What's your reaction to that? >> I completely agree, 'cause operational security data is the most volatile data in the enterprise. It changes on a whim. You have developers who change things and don't tell you what happened, the vendor doesn't tell you what happened, and so there's that idea, that life cycle of managing data. The same types of standards and disciplines that database administrators have followed for years have to filter down into the operational areas, and you need tooling that's going to give you the ability to manage that data, manage it in flight in real time, in order to drive detections, in order to drive response, all those business value things we've been talking about. >> So I've got to ask you about the larger role that you see with observability lakes. We were talking before we came on camera live here about how exciting this kind of concept is, and you were attracted to the company because of it. I love the observability lake concept because it puts all that data in one spot, you can manage it. But you've got machine learning and AI around the corner that also can help. How has all this changed the landscape of data security and things? Because it makes a lot of sense, and I can only see it getting better with machine learning.
>> Yeah, it definitely does. >> Totally. And so, when you talk about observability, most people have assumptions that observability is only an operational or application support process. It's also a security process. The idea is that you're looking for your unknown unknowns. This is what keeps security administrators up at night: I'm being attacked by something I don't know about. How do you find those unknowns? And that's where your machine learning comes in. And you have to understand there are so many different types of machine learning algorithms; the guy that I hired started educating me about the umpteen number of algorithms, how they apply to different data, how you get different value, and how you have to test your data constantly. There's no such thing as a magical black box of machine learning that gives you value. You have to implement, just like the developer practice of testing over and over again, like data scientists do, for example. >> The best friend of a machine learning algorithm is data, right? You've got to keep feeding that data, and when the data sets are baked and secure and vetted, even better. All cool, great stuff, great insight. Congratulations Cribl, great solution. Love the architecture, love the pipelining of the observability data and streaming that into a lake. Great stuff. Give a plug for the company: where you guys are at, where people can get information. I know you guys have a bunch of live feeds on YouTube, Twitch, here in theCUBE. Where else can people find you? Give the plug. >> Oh, please, please join our Slack community, go to cribl.io/community. We have an amazing community. This was another thing that drew me to the company: a large group of people who are genuinely excited about data, about managing data. If you want to try Cribl out, we have some great tools. We have a cloud platform with one terabyte of free data. So go to cribl.io/cloud or cribl.cloud and sign up; it never times out. It's not 30 days, it's forever, up to one terabyte. Try out our new products as well, like Cribl Edge. And then finally, come watch Nick Decker and me, every Thursday, 2:00 PM Eastern. We have live streams on Twitter, LinkedIn, and YouTube Live. My Twitter handle is EBA1367. Love to chat, love to have these conversations. And also, we are hiring. >> All right, good stuff. Great team, great concepts, right? Of course, we're theCUBE here. We've got our video lake coming soon. I love this idea of having these videos. Hey, video's data too, right? I mean, we've got to keep coming to you. >> I love it, I love videos, it's awesome. It's a great way to communicate, it's a great way to have a conversation. That's the best thing about us, having conversations. I appreciate your time. >> Thank you so much, Ed, for representing Cribl here on Data as Code. This is season two, episode two of the ongoing series covering the hottest, most exciting startups from the AWS ecosystem. Talking about the future of data, I'm John Furrier, your host. Thanks for watching. >> Ed: All right, thank you. (slow upbeat music)

Published Date: Apr 26, 2022


Clint Sharp, Cribl | Cube Conversation



(upbeat music) >> Hello, welcome to this CUBE Conversation. I'm John Furrier, your host, here in theCUBE in Palo Alto, California, featuring Cribl, a hot startup taking over the enterprise when it comes to data pipelining, and we have a CUBE alumni who's the co-founder and CEO, Clint Sharp. Clint, great to see you again. You've been on theCUBE, you were on in 2013. Great to see you, congratulations on the company that you co-founded and lead as the chief executive officer, over $200 million in funding, doing really strong in the enterprise. Congratulations, thanks for joining us. >> Hey, thanks John, it's really great to be back. >> You know, I remember our first conversation, the big data wave coming in, Hadoop World 2010. Now the cloud comes in, and cloud native really takes data to a whole nother level. You're seeing the old data architectures being replaced with cloud scale. So the data landscape is interesting. You know, Data as Code, you're hearing that term. Data engineering teams are out there, data is everywhere. It's now part of how developers and companies are getting value, whether it's real time or coming out of data lakes. Data is more pervasive than ever. Observability is a hot area, there's a zillion companies doing it. What are you guys doing? Where do you fit in the data landscape? >> Yeah, so what I'd say is that Cribl and our products solve, for our customers, the fundamental tension between data growth and budget. If you look at IDC's data, data's growing at a 25% CAGR; you're going to have two and a half times the amount of data in five years that you have today. And I talk to a lot of CIOs, I talk to a lot of CISOs, and the thing that I hear repeatedly is: my budget is not growing at a 25% CAGR, so fundamentally, how do I resolve this tension? We sell very specifically into the observability and security markets. We sell to technology professionals who are operating, you know, observability and security platforms like Splunk, or Elasticsearch, or Datadog, Exabeam, these types of platforms. They're moving protocols like syslog, they have lots of agents deployed on every endpoint, and they're trying to figure out how to get the right data to the right place and, fundamentally, control cost. And we do that through our product called Stream, which is what we call an observability pipeline. It allows you to take all this data, manipulate it in the stream, get it to the right place, and fundamentally be able to connect all those things that maybe weren't originally intended to be connected. >> So I want to get into that new architecture if you don't mind, but let me first ask you about the problem space that you're in. So cloud native, obviously instrumenting everything is a key thing. You mentioned data, you've got all these tools. Is the problem that there's been a sprawl of things being instrumented and they have to bring it together, or that it's too costly to run all these point solutions and get them to work? What's the problem space that you're in? >> So I think customers have always been forced to make trade-offs, John. So the, hey, I have volumes and volumes and volumes of data that's relevant to securing my enterprise, that's relevant to observing and understanding the behavior of my applications, but there's never been an approach that allows me to really onboard all of that data.
And so where we come in is giving them the tools to be able to, you know, filter out noise and waste, to be able to aggregate this high-fidelity telemetry data. There are a lot of growing changes; you talk about cloud native, but digital transformation, you know, the pandemic itself and remote work, all these are driving significantly greater data volumes. And vendors, unsurprisingly, haven't really been all that aligned to giving customers the tools to reshape that data, to filter out noise and waste, because, you know, many of them are incentivized to get as much data into their platform as possible, whether that's aligned to the customer's interests or not. And so we saw an opportunity to come out and, fundamentally, as a customer-first company, give them the tools that they need in order to take back control of their data. >> I remember those conversations even going back six years ago: the whole cloud scale, horizontally scalable applications. You're starting to see data now being stuck in the silos. To have good data you have to be observable, which means you've got to be addressable. So you now have to have a horizontal data plane, if you will. But then you get to the question of, okay, what data do I need at the right time? So is the Data as Code, data engineering discipline changing what new architectures are needed? What changes in the mind of the customer once they realize that they need this new way to pipe data and route data around, or make it available for certain applications? What are the key new changes? >> Yeah, so I think one of the things that we've been seeing, in addition to the advent of the observability pipeline that allows you to connect all the things, is also the advent of an observability lake, which is allowing people to store massively greater quantities of data, and also different types of data. So data that might not traditionally fit into a data warehouse, or might not traditionally fit into a data lake architecture: things like deployment artifacts, or things like packet captures. These are binary types of data that, you know, aren't designed to work in a database, but yet people want to be able to ask questions like, hey, during the Log4Shell vulnerability, which of all my deployment artifacts actually had Log4j in them in an affected version? These are hard questions to answer in today's enterprise. Or they might need to go back to full-fidelity packet capture data to try to understand, you know, a malicious actor's movement throughout the enterprise.
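Clint's Log4Shell question, which of my stored deployment artifacts bundle an affected Log4j, becomes a straightforward scan once the artifacts live in a lake. Here is a minimal sketch that walks a local directory of jars standing in for a synced object-store prefix; the "artifacts" path and the "fixed below 2.17.1" cutoff are illustrative assumptions (consult the actual CVE guidance for your situation).

```python
import zipfile
from pathlib import Path

# Path inside a jar where Maven records the bundled log4j-core version.
MARKER = "META-INF/maven/org.apache.logging.log4j/log4j-core/pom.properties"
FIXED = (2, 17, 1)  # assumed cutoff; check the real CVE advisories

def log4j_version(jar_path):
    """Return the bundled log4j-core version as a tuple, or None."""
    try:
        with zipfile.ZipFile(jar_path) as jar:
            with jar.open(MARKER) as props:
                for raw in props:
                    line = raw.decode("utf-8", "replace").strip()
                    if line.startswith("version="):
                        parts = line.split("=", 1)[1].split(".")
                        return tuple(int(p) for p in parts[:3] if p.isdigit())
    except (KeyError, zipfile.BadZipFile):
        return None  # no log4j-core marker, or not a readable jar
    return None

# "artifacts/" stands in for an object-store prefix of build outputs.
for jar in Path("artifacts").rglob("*.jar"):
    version = log4j_version(jar)
    if version is not None and version < FIXED:
        print(f"AFFECTED: {jar} bundles log4j-core {'.'.join(map(str, version))}")
```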
We're seeing vendors who have great log indexing engines, and great time series databases, but really what people are looking for is the ability to store massive quantities of data, five times, 10 times more data than they're storing today, and they're doing that in places like AWS S3, or in Azure Blob Storage. And we're just now starting to see the advent of technologies that can help them query that data, technologies that are more specifically focused on the type of persona that we sell to, which is a security professional, or an IT professional who's trying to understand the behaviors of their applications. And we also find that, you know, general-purpose data processing technologies are great for the enterprise, but they're not working for the people who are running the enterprise, and that's why you're starting to see concepts like observability pipelines and observability lakes emerge: because they're targeted at these people, who have a very unique set of problems that are not being solved by the general-purpose data processing engines. >> It's interesting, as you see the evolution of more data volume, more data gravity, then you have these specialty things that need to be engineered for the business. So it sounds like the observability lake and the pipelining of the data, or Stream, you call it, these are new things that they bolt into the architecture, right? Because they have business reasons to do it. What's driving that? Sounds like security is one of them. Are there others that are driving this behavior? >> Yeah, I mean, it's the need to be able to observe applications and observe end-user behavior at a fine-grained level of detail. I often use examples like bank teller applications, or perhaps, you know, I'm going to be flying in a couple of days; I'll be using the airline's app to understand whether my flight's on time. Am I getting a good experience in that particular application? Answering the question of "is Clint getting a good experience" requires massive quantities of data. I'm going to sit there and look at, you know, American Airlines, which I'm flying on Thursday, and I'm going to be judging them based off of my experience. I don't care what the average user's experience is, I care what my experience is. And especially for the enterprise, this is usually much more about in-house applications and things like that: people call up their IT department and say, hey, this application is not working well, I don't know what's going on with it, and IT can't answer the question of what that individual's experience was, because they're living with, you know, the data that they can afford to store today. And so I think that's why you're starting to see the advent of these new architectures: because digital is so absolutely critical to every company's customer experience that they need to be able to answer questions about an individual user's experience, which requires significantly greater volumes of data, and that in turn requires entirely new approaches to aggregating that data, bringing the data in, and storing that data. >> Talk to me about enabling customer choice when it comes to controlling their data. You mentioned before we came on camera that you guys are known for choice. How do you enable customer choice and control over their data?
>> So I think one of the biggest problems I've seen in the industry over the last couple of decades is that vendors come to customers with hugely valuable products that make their lives better, but that also require them to maintain a relationship with that vendor in order to be able to continue to ask questions of that data. And so customers don't get a lot of optionality in these relationships. They sign multi-year agreements, and when they want to go try out another vendor, when they want to add new technologies into their stack, they're often left with a choice of, well, do I roll out another agent, do I go touch 10,000 computers, or 100,000 computers, in order to onboard this data? And what we've been able to offer them is the ability to reuse their existing deployed footprint of agents and their existing data collection technologies, to be able to use multiple tools and use the right tool for the right job, and really give them that choice. And not only give them the choice once, but with the concepts of things like the observability lake and replay, they can go back in time and say, you know what? I want to rehydrate all this data into a new tool. I'm no longer locked in to the way one vendor stores this; I can store this data in open formats. And that's one of the coolest things about the observability lake concept: customers are no longer locked in to any particular vendor, the data is stored in open formats, and so that gives them the choice to go back later and choose any vendor. Because they may want to do some AI or ML on that type of data and do some model training. They may want to forward that data to a new cloud data warehouse, or try a different vendor for log search, or a different vendor for time series data. And we're really giving them the choice and the tools to do that, in a way which was simply not possible before. >> You know, you bring up a point that's a big part of the upcoming AWS Startup Showcase series, Data as Code: the data engineering role has become so important, and the word "engineering" is a key word in that, but there's not a lot of them, right? Like, how many data engineers are there on the planet? Hopefully more will come in from these great programs in computer science, but you've got to engineer something, and you're talking about developing on data, you're talking about doing replays and rehydrating. This is developing. So Data as Code is now a reality. How do you see Data as Code evolving, from your perspective? Because it implies DevOps. Infrastructure as Code enabled DevOps; if Data as Code, then you've got DataOps, and AIOps has been around for a while. What is Data as Code? And what does that mean to you, Clint? >> I think for our customers it means a number of, I think, sort of after-effects that maybe they have not yet been considering. One you mentioned, which is that it's hard to acquire that talent. I think it is also increasingly more critical that people who were working in jobs that used to be purely operational are now being forced to learn, you know, developer-centric tooling, things like Git, things like CI/CD pipelines. And that means that there's a lot of education that's going to have to happen, because the vast majority of the people who have been doing things the old way for the last 10 to 20 years, you know, they're going to have to get retrained and retooled.
And I think that's a huge opportunity for people who have that skillset, and I think that they will find that their compensation will be directly correlated to their ability to have those types of skills. But it also represents a massive opportunity for people who can catch this wave and find themselves in a place where they're going to have a significantly better career and more options available to them. >> Yeah, and I've been thinking about what you just said about your customer environments having all these different things, like Datadog and other agents. Those people that rolled those out can still work there; they don't have to rip and replace and then get new training on the new multi-year enterprise service agreement that some other vendor will sell them. You come in, and it sounds like you're saying, hey, stay as you are, use Cribl, we'll have some data engineering capabilities for you. Is that right? >> Yup, you got it. And I think one of the things that's a little bit different about our product and our market, John, from kind of general-purpose data processing, is that our users are often responsible for many tools, and data engineering is not their full-time job; it's actually something they just need to do now. And so we've really built a tool that's designed for your average security professional, your average IT professional. Yes, we can utilize the same kind of DataOps techniques that you've been talking about, CI/CD pipelines, GitOps, that sort of stuff, but you don't have to, and if you're really just already familiar with administering a Datadog or a Splunk, you can get started with our product really easily, and it is designed to be approachable to anybody with that type of skillset. >> It's interesting, when you're talking, you remind me of the big wave that was coming, and it's still here: shift left meant security from the beginning. What do you do with data? Shift up, right, down? What does that mean? Because what you're getting at here is that if you're a developer, you have to deal with data, but you don't have to be a data engineer, though you can be, right? So we're getting into this new world. Security had that same problem: you had to wait for that group to do things, creating tension on the CI/CD pipelining, so the developers who were building apps had to wait. Now you've got shift left. What is the data version of shift left? >> Yeah, so we're actually doing this right now. We just announced a new product a week ago called Cribl Edge. And this is enabling us to move processing of this data, rather than doing it centrally in the stream, to actually push that processing out to the edge, and to utilize a lot of unused capacity that you're already paying AWS or Azure for, or that may be in your own data center, and use that capacity to do the processing rather than having to centralize and aggregate all of this data. So I think we're going to see a really interesting shift, and "left" from our side means towards the origination point rather than anything else. That allows us to unlock a lot of unused capacity and continue to drive the cost down to make more data addressable. Back to the original thing we talked about, the tension between data growth and budget: if we want to offer more capacity to people, if we want to be able to answer more questions, we need to be able to cost-effectively query a lot more data.
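To picture what pushing work toward the origination point buys, here is a minimal sketch of edge pre-aggregation: instead of shipping every raw sample, the collector summarizes a time window locally and ships one compact record per bucket. The window size and field names are assumptions for illustration, not Cribl Edge's actual behavior.

```python
from collections import defaultdict

WINDOW = 60  # seconds; assumed aggregation window

def aggregate(samples):
    """Collapse raw per-request samples into one summary per
    (window, endpoint) bucket before anything leaves the host."""
    buckets = defaultdict(lambda: {"count": 0, "total_ms": 0.0, "max_ms": 0.0})
    for s in samples:
        key = (int(s["ts"]) // WINDOW * WINDOW, s["endpoint"])
        b = buckets[key]
        b["count"] += 1
        b["total_ms"] += s["latency_ms"]
        b["max_ms"] = max(b["max_ms"], s["latency_ms"])
    return [
        {"window_start": ts, "endpoint": ep, "count": b["count"],
         "avg_ms": b["total_ms"] / b["count"], "max_ms": b["max_ms"]}
        for (ts, ep), b in sorted(buckets.items())
    ]

raw = [
    {"ts": 1000, "endpoint": "/checkout", "latency_ms": 120.0},
    {"ts": 1010, "endpoint": "/checkout", "latency_ms": 340.0},
    {"ts": 1075, "endpoint": "/checkout", "latency_ms": 95.0},
]
summaries = aggregate(raw)
print(f"shipped {len(summaries)} records instead of {len(raw)}")
```

The trade-off, of course, is that full-fidelity detail stays at (or near) the edge unless it is also cloned to the lake, which is why the pipeline and lake concepts from earlier in the conversation complement rather than replace each other.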
>> You guys have had great success in the enterprise with what you've got going on. Obviously the funding is just the scoreboard for that. You've got good growth. What are the use cases, or what does the customer look like that's working for you, where you're winning? Or maybe, said differently, what pain points are out there that the customer might be feeling right now where Cribl could fit in and solve them? How would you describe that ideal persona, or environment, or problem, where the customer says, man, Cribl's a perfect fit? >> Yeah, this is a person who's working on tooling. So they administer a Splunk, or an Elastic, or a Datadog; they may be in a network operations center or a security operations center; they are struggling to get data into their tools, they're always at capacity, their tools are always at the redline, and they really wish they could do more for the business. They're kind of tired of being this department of no, where everybody comes to them and says, "hey, can I get this data in?" And they're like, "I wish, but you know, we're all out of capacity, and we wish we could help you, but we frankly can't right now." We help them by routing that data to multiple locations, we help them control costs by eliminating noise and waste, and we've been very successful at that with, you know, logos like a Shutterfly, and, I'm blanking on names, but we've been very successful in the enterprise, and we continue to be successful with major logos inside of government, inside of banking, telco, et cetera. >> So basically it used to be the old hyperscalers, the ones with the data-full problem; now everyone's full of data, and they've got to really expand capacity and have more agility and more engineering around contributions to the business. Sounds like that's what you guys are solving. >> Yup, and hopefully we help them do a little bit more with less. And I think that's a key problem for our enterprises: there's always a limit on the number of human resources that they have available at their disposal, which is why we try to make the software as easy to use as possible and make it widely applicable to those IT and security professionals who are, you know, kind of your run-of-the-mill tools administrators; our product is very approachable for them. >> Clint, great to see you on theCUBE here, thanks for coming on. Quick plug for the company: you guys are hiring, what's going on? Give a quick update, take 30 seconds to give a plug. >> Yeah, absolutely. We are absolutely hiring, cribl.io/jobs; we need people in every function, from sales, to marketing, to engineering, to back office, G&A, HR, et cetera. So please check out our job site. If you are interested in learning more, you can go to cribl.io. We've got some great online sandboxes there which will help you educate yourself on the product, our documentation is freely available, and you can sign up for up to a terabyte a day on our cloud: go to cribl.cloud and sign up free today. The product's easily accessible, and if you'd like to speak with us, we'd love to have you in our community, and you can join the community from cribl.io as well. >> All right, Clint Sharp, co-founder and CEO of Cribl, thanks for coming on theCUBE. Great to see you. I'm John Furrier, your host, thanks for watching. (upbeat music)

Published Date: Mar 31, 2022
