
Search Results for Syslog:

Kam Amir, Cribl | HPE Discover 2022


 

>> TheCUBE presents HPE Discover 2022, brought to you by HPE. >> Welcome back to theCUBE's coverage of HPE Discover 2022. We're here at the Venetian convention center in Las Vegas, Dave Vellante for John Furrier. Kam Amir is here, the director of technical alliances at Cribl. Kam, good to see you. >> Good to see you too. >> Cribl. Cool name. Tell us about it. >> So let's see. Cribl has been around now for about five years, selling products for the last two years. Fantastic company, lots of growth, I started there in 2020 and we're roughly 400 employees now. >> And what do you do? Tell us more. >> Yeah, sure. So I run the technical alliances team, and what we do is we basically look to build integrations into platforms such as HPE GreenLake and Ezmeral. And we also work with a lot of other companies to help get data from various sources into their destinations or, you know, other enrichments of data in that data pipeline. >> You know, you guys have been on theCUBE. Clint's been on many times, Ed Bailey was on our startup showcase. You guys are successful in this overfunded observability space. So, so you guys have a unique approach. Tell us about why you guys are successful in the product and some of the things you've been doing there. >> Yeah, absolutely. So our product is very complementary to a lot of the technologies that already exist. And I used to joke around that everyone has these like pretty dashboards and reports, but they completely glaze over the fact that it's not easy to get the data from those sources to their destinations. So for us, it's this capability with Cribl Stream to get that data easily and repeatably into these destinations. >> Yeah. You know, Kam, you and I were both at the Snowflake Summit, to John's point. There were like a dozen observability companies there. >> Oh yeah. >> And it's really beginning to be a crowded space. So explain what value you bring to that ecosystem. >> Yeah, sure. So the ecosystem that we see there is, there are a lot of people that are kind of sticking to, like, effectively getting data and showing you dashboards and reports about monitoring and things of that sort. For us, the value is how can we help customers kind of accelerate their adoption of these platforms, how to go from like your legacy SIEM or your legacy monitoring solution to like the next-gen observability platform or next-gen security platform. >> And what you do really well is the integration and bringing those other toolings to, to do that? >> Correct, correct. And we make it repeatable. >> How'd you end up here? >> HPE? So we actually had a customer that actually deployed our software on the HPE platform. And it was kind of a light bulb moment that, okay, this is actually a different approach than going to your traditional, you know, AWS, Google, et cetera. So we decided to kind of hunt this down and figure out how we could be a bigger player in this space. >> You saw the data fabric announcement? I'm not crazy about the term, data fabric is an old NetApp term, and then Gartner kind of twisted it. I like data mesh, but anyway, it doesn't matter. We kind of know what it is, but when you see an announcement like that, how do you look at it? You know, what does it mean to Cribl and your customers? >> Yeah.
So what we've seen is that, so we work with the data fabric team and we're able to kind of route our data to theirs, as a data lake, so we can actually route the data from, again, all these various sources into this data lake and then have it available for whatever customers want to do with it. So one of the big things that I know Clint talks about is we give customers this, we sell choice. So we give them the ability to choose where they want to send their data, whether that's, you know, HP's data lake and data fabric or some other object store or some other destination. They have that choice to do so. >> So you're saying that you can stream to any destination the customer wants? What are some examples? What are the popular destinations? >> Yeah, so a lot of the popular destinations are your typical object stores. So any of your cloud object stores, whether it be AWS S3, Google Cloud Storage, or Azure Blob Storage. >> Okay. And so, and you can pull data from any source? >> [Laughter] I'd be very careful, but absolutely. What we've seen is that a lot of people like to kind of look at traditional data sources like Syslog and they want to get it into a next-gen SIEM, but to do so it needs to be converted to like a webhook or some sort of API call. And so, or vice versa, they have this brand new Zscaler, for example, and they want to get that data into their SIEM, but there's no way to do it 'cause the SIEM only accepts it as a Syslog event. So what we can do is we actually transform the data and make it so that it lands into that SIEM in the format that it needs to be, and easily make that a repeatable process. >> So, okay. So wait, so not as a Syslog event but in whatever format the destination requires? >> Correct, correct. >> Okay. What are the limits on that? I mean, is this- >> Yeah. So what we've seen is that customers will be able to take, for example, they'll take this Syslog event, it's unstructured data, but they need to put it into, say, Common Information Model for Splunk, or Elastic Common Schema for Elasticsearch, or just JSON format for Elastic. And so what we can do is we can actually convert those events so that they land in that transformed state, but we can also route a copy of that event, in unharmed fashion, to like an S3 bucket or object store for that long-term compliance use. >> You can route it to any, basically any object store. Is that right? Is that always the sort of target? >> Correct, correct. >> So on the message here at HPE, first of all I'll get to the marketplace point in a second, but cloud to edge is kind of their theme. So data streaming sounds expensive. I mean, you know, so how do you guys deal with the streaming egress issue? What does that mean to customers? You guys claim that you can save money on that piece. It's a hotly contested discussion point. >> [Laughter] So one of the things that we actually just announced in our 3.5.0 release yesterday is the capability of getting data from Windows events, or from Windows hosts, I'm sorry. So a product that we also have is called Cribl Edge. So our capability of being able to collect data from the edge and then transit it out to, whether it be an on-prem or self-hosted deployment of Cribl, or maybe some sort of other destination object store. What we do is we actually take the data in transit and reduce the volume of events. So we can do things like remove white space or remove events that are not really needed, and compress or optimize that data so that the egress costs, to your point, are actually lowered. 
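
To make the shaping step Kam describes a bit more concrete, here is a rough sketch in plain Python of the kind of transformation an observability pipeline performs: reshape a classic Syslog line into the JSON a next-gen SIEM expects, while keeping an untouched, compressed copy for the object-store archive. This is illustrative only, it is not Cribl Stream configuration; the regex, field names, and sample event are made up for the example.

import gzip
import json
import re
from datetime import datetime, timezone

# Very loose parser for classic BSD-style syslog lines; real pipelines handle many variants.
SYSLOG_RE = re.compile(r"<(?P<pri>\d+)>(?P<ts>\w{3}\s+\d+\s[\d:]+)\s(?P<host>\S+)\s(?P<msg>.*)")

def shape_for_siem(raw_event: str) -> dict:
    """Turn a classic syslog line into the JSON shape a next-gen SIEM ingests."""
    m = SYSLOG_RE.match(raw_event)
    if not m:
        return {"message": raw_event.strip()}
    return {
        "received_at": datetime.now(timezone.utc).isoformat(),
        "priority": int(m.group("pri")),
        "timestamp": m.group("ts"),
        "host": m.group("host"),
        "message": m.group("msg").strip(),  # trims the padding that inflates volume
    }

def route(raw_event: str):
    """Produce a reshaped copy for the SIEM and an untouched, compressed copy for object storage."""
    siem_payload = json.dumps(shape_for_siem(raw_event)).encode()
    archive_payload = gzip.compress(raw_event.encode())  # full-fidelity copy for compliance/replay
    return siem_payload, archive_payload                 # hand these to your HTTP sender / S3 client

siem, archive = route("<34>Jun 29 11:22:33 fw01 Accepted connection from 10.0.0.5")
print(siem.decode())

In a real pipeline the two payloads would be handed to a webhook/API sender and an object-store writer respectively; the point is simply that one inbound event can fan out into differently shaped copies.
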
>> And your data reduction approach is, is compression? It's a compression algorithm? >> So it is a combination, yeah, so it's a combination. So for some people, what they'll do is they'll aggregate the events. So sometimes, for example, VPC flow logs are very chatty and you don't need to have all those events. So instead you convert those to metrics. So suddenly you've reduced those events from, you know, high-volume events to metrics that are so small, and you still get the same value 'cause you still see the trends and everything. And if later on down the road you need to reinvestigate those events, you can rehydrate that data with Cribl Replay. >> And you'll do the streaming in real time, is that right? >> Yeah. >> So Kafka, is that what you would use? Or other tooling? >> [Laughter] So we are complementary to a Kafka deployment. If the customer's already deployed and they've invested in Kafka, we can read off of Kafka and feed back into Kafka. >> If not, you can use your tooling? >> If not, we can be replacing that. >> Okay, talk about your observations in the multi-cloud hybrid world, because hybrid obviously everyone knows it's a steady state now. On public cloud, on premise, edge, all one thing, cloud operations, DevOps, data as code, all the things we talk about. What's the customer view? You guys have a unique position. What's going on in the customer base? How are they looking at hybrid and specifically multi-cloud, is it stitching together multiple hybrids? Or how do you guys work across those landscapes? >> So what we've seen is a lot of customers are in multiple clouds. That's, you know, that's going to happen. But what we've seen is that if they want to egress data from, say, one cloud to another, the way that we've architected our solution is that we have these worker nodes that reside within these hybrid, these other cloud environments, these other clouds, I should say, so that transmitting data, first, egress costs are lowered, but being able to have this kind of easy way to collect the data and also stitch it back together, join it back together, to a single place or single location is one option that we offer customers. Another solution that we've kind of announced recently is Search. So not having to move the data from all these disparate data sources and data lakes, and actually just search the data in place. That's another capability that we think is kind of popular in this hybrid approach. >> And talk about now your relationship with HPE, you guys obviously had customers that drove you to GreenLake, obviously what's your experience with them, and also talk about the marketplace presence. Is that new? How long has that been going on? Have you seen any results? >> Yeah, so we've actually just started our, our journey into this HPE world. So the first thing was obviously the customer bringing us into this ecosystem, and now our capabilities of, I guess, getting ready to be on the marketplace. So having a presence on the marketplace has been huge, giving us kind of access to just people that don't even know who we are, being that we're, you know, a five-year-old company. So it's really good to have that exposure. >> So you're going to get customers out of this? >> That's the idea. [Laughter] >> Bringing in new markets, that's the idea of their GreenLake, is that partners fill in. What's your impression so far of GreenLake? Because there seems to be great momentum around HP and opening up their channel, their sales force, their customer base. >> Yeah. 
So it's been very beneficial for us, again being a smaller company, and we are a channel-first company, so that obviously helps, you know, bring out the word with other channel partners. But HP has been very, you know, open-armed, kind of getting us into the system, into the ecosystem, and obviously talking, or giving the good word, about Cribl to their customers. >> So, so you'll be monetizing on GreenLake, right? That's the, the goal. >> That's the goal. >> What do you have to do to get into a position? Obviously, you got a relationship, you're in the marketplace. Do you have to, you know, write to their APIs, or do you just have to, is that a checkbox? Describe what you have to do to monetize. >> Sure. So we have to first get validated on the platform. So the validation process validates that we can work on the Ezmeral GreenLake platform. Once that's been completed, then the idea is to have our logo show up on the marketplace. So customers say, hey, look, I need to have a way to get or transit data, or do stuff with data, specifically around logs, metrics, and traces, into my logging solution or my SIEM. And then what we do with them on the back end is we'll see this transaction occur right to their API, to basically say who this customer is. 'Cause again, the idea is to have almost a zero-touch kind of involvement, but we will actually have that information given to us. And then we can actually monetize on top of it. >> And the visualization component will come from the observability vendor. Is that right? Or is that somewhat, do you guys do some of that? >> So the visualization, right now we're basically just the glue that gets the data to the visualization engine. As we kind of grow and progress our Search product, that's what will probably have more of a visualization component. >> Do you think your customers are going to predominantly use an observability platform for that visualization? I mean, obviously you're going to get there. Are they going to use Grafana? Or some other tool? >> Or yeah, I think a lot of customers, obviously, depending on what data and what they're trying to accomplish, they will have that choice now to choose, you know, Grafana for their metrics, logs, et cetera, or some sort of security product for their security events, but same data, two different kinds of use cases. And we can help enable that. >> Kam, I want to ask you a question. You mentioned you were at Splunk, and Clint, the CEO and co-founder, was at Splunk too. That brings up the question I want to get your perspective on: we're seeing a modern network here with HPE, with Aruba, obviously clouds kind of going next level, you got on premises, edge, all one thing, distributed computing basically, cyber security, a data problem that's solved a lot by you guys and people in this business, making sure data is available, machine learning is growing and powering AI like you read about. What's changed in this business? Because you know, Splunking logs is kind of old hat, you know, and now you got observability. Unification is a big topic. What's changed now? What's different about the market today around data and these platforms and, and tools? What's your perspective on that? >> I think one of the biggest things is people have seen the amount of volume of data that's coming in. When I was at Splunk, when we hit like a one terabyte deal that was a big deal. Now it's kind of standard. You're going to do a terabyte of data per day. 
So one of the big things I've seen is just the explosion of data growth, but getting value out of that data is very difficult. And that's kind of why we exist, because getting all that volume of data is one thing, but being able to actually extract value from it, that's- >> And that's the streaming core product? That's the whole? >> Correct. >> Get data to where it needs to be for whatever the application needs, whether it's cyber or something else. >> Correct, correct. >> What's the customer uptake? What's the customer base like for you guys now? How many, how many customers you guys have? What are they doing with the data? What are some of the common things you're seeing? >> Yeah. I mean, it's, it's the basic blocking and tackling. We've significantly grown our customer base and they all have the same problem. They come to us and say, look, I just need to get data from here to there. And literally the routing use case is our biggest use case, because it's simple. And you take someone that's an expensive engineer, an operations engineer, instead of having them go and do the plumbing of data, of just getting logs from one source to another, we come in and actually make that a repeatable process and make that easy. And so that's kind of just our very basic value add right from the get-go. >> You can automate that, automate that, make it repeatable. Say, what's in the name? Where'd the name come from? >> So Cribl, if you look it up, it's actually kind of an old sieve used to sift dirt from gold, right? So basically you just, that's kind of what we do. We filter out all the dirt and leave you the gold bits so you can get value. >> It's kind of what we do on theCUBE. >> It's kind of the gold nuggets. Get all these highlights hitting Twitter, the golden, the gold nuggets. Great to have you on. >> Kam, thanks for, for coming on, explaining that sort of, you guys are filling that gap between, hey, all the observability claims, which are all wonderful, but then you got to get there. They got to have a route to get there. That's what you've got to do. Cribl rhymes with tribble. Dave Vellante for John Furrier covering HPE Discover 2022. You're watching theCUBE. We'll be right back.
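
As a side note on the "convert chatty events to metrics" idea Kam raised earlier for VPC flow logs, the sketch below shows the shape of that reduction in plain Python: collapse raw records into per-key counters instead of forwarding every event. The field positions follow the default AWS VPC flow log layout; the sample records, key choice, and everything else are illustrative, not Cribl functionality.

from collections import defaultdict

def flow_logs_to_metrics(lines):
    """Collapse raw flow-log records into a small set of per-key metrics."""
    metrics = defaultdict(lambda: {"events": 0, "bytes": 0})
    for line in lines:
        f = line.split()
        if len(f) < 14:
            continue  # skip malformed or NODATA records
        src, dst, action, nbytes = f[3], f[4], f[12], int(f[9])
        metrics[(src, dst, action)]["events"] += 1
        metrics[(src, dst, action)]["bytes"] += nbytes
    return dict(metrics)

sample = [
    "2 123456789010 eni-0a1b 10.0.0.5 10.0.1.7 443 49152 6 25 18500 1623338400 1623338460 ACCEPT OK",
    "2 123456789010 eni-0a1b 10.0.0.5 10.0.1.7 443 49152 6 40 30200 1623338460 1623338520 ACCEPT OK",
]
print(flow_logs_to_metrics(sample))
# Two raw events collapse into one metric:
# {('10.0.0.5', '10.0.1.7', 'ACCEPT'): {'events': 2, 'bytes': 48700}}

The trend (event count, byte volume per source/destination/action) survives, while the per-record detail is left to the full-fidelity archive copy for later replay.
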

Published Date : Jun 29 2022


Clint Sharp, Cribl | Cube Conversation


 

(upbeat music) >> Hello, welcome to this CUBE conversation I'm John Furrier your host here in theCUBE in Palo Alto, California, featuring Cribl a hot startup taking over the enterprise when it comes to data pipelining, and we have a CUBE alumni who's the co-founder and CEO, Clint Sharp. Clint, great to see you again, you've been on theCUBE, you were on in 2013, great to see you, congratulations on the company that you co-founded, and leading as the chief executive officer over $200 million in funding, doing this really strong in the enterprise, congratulations thanks for joining us. >> Hey, thanks John it's really great to be back. >> You know, remember our first conversation the big data wave coming in, Hadoop World 2010, now the cloud comes in, and really the cloud native really takes data to a whole nother level. You've seeing the old data architectures being replaced with cloud scale. So the data landscape is interesting. You know, Data as Code you're hearing that term, data engineering teams are out there, data is everywhere, it's now part of how developers and companies are getting value whether it's real time, or coming out of data lakes, data is more pervasive than ever. Observability is a hot area, there's a zillion companies doing it, what are you guys doing? Where do you fit in the data landscape? >> Yeah, so what I say is that Cribl and our products and we solve the problem for our customers of the fundamental tension between data growth and budget. And so if you look at IDCs data data's growing at a 25%, CAGR, you're going to have two and a half times the amount of data in five years that you have today, and I talk to a lot of CIOs, I talk to a lot of CISOs, and the thing that I hear repeatedly is my budget is not growing at a 25% CAGR so fundamentally, how do I resolve this tension? We sell very specifically into the observability in security markets, we sell to technology professionals who are operating, you know, observability in security platforms like Splunk, or Elasticsearch, or Datadog, Exabeam, like these types of platforms they're moving, protocols like syslog, they're moving, they have lots of agents deployed on every endpoint and they're trying to figure out how to get the right data to the right place, and fundamentally you know, control cost. And we do that through our product called Stream which is what we call an observability pipeline. It allows you to take all this data, manipulate it in the stream and get it to the right place and fundamentally be able to connect all those things that maybe weren't originally intended to be connected. >> So I want to get into that new architecture if you don't mind, but let me first ask you on the problem space that you're in. So cloud native obviously instrumentating, instrumenting everything is a key thing. You mentioned data got all these tools, is the problem that there's been a sprawl of things being instrumented and they have to bring it together, or it's too costly to run all these point solutions and get it to work? What's the problem space that you're in? >> So I think customers have always been forced to make trade offs John. So the, hey I have volumes and volumes and volumes of data that's relevant to securing my enterprise, that's relevant to observing and understanding the behavior of my applications but there's never been an approach that allows me to really onboard all of that data. 
And so where we're coming at is giving them the tools to be able to, you know, filter out noise and waste, to be able to, you know, aggregate this high fidelity telemetry data. There's a lot of growing changes, you talk about cloud native, but digital transformation, you know, the pandemic itself and remote work all these are driving significantly greater data volumes, and vendors unsurprisingly haven't really been all that aligned to giving customers the tools in order to reshape that data, to filter out noise and waste because, you know, for many of them they're incentivized to get as much data into their platform as possible, whether that's aligned to the customer's interests or not. And so we saw an opportunity to come out and fundamentally as a customers-first company give them the tools that they need, in order to take back control of their data. >> I remember those conversations even going back six years ago the whole cloud scale, horizontally scalable applications, you're starting to see data now being stuck in the silos now to have high, good data you have to be observable, which means you got to be addressable. So you now have to have a horizontal data plane if you will. But then you get to the question of, okay, what data do I need at the right time? So is the Data as Code, data engineering discipline changing what new architectures are needed? What changes in the mind of the customer once they realize that they need this new way to pipe data and route data around, or make it available for certain applications? What are the key new changes? >> Yeah, so I think one of the things that we've been seeing in addition to the advent of the observability pipeline that allows you to connect all the things, is also the advent of an observability lake as well. Which is allowing people to store massively greater quantities of data, and also different types of data. So data that might not traditionally fit into a data warehouse, or might not traditionally fit into a data lake architecture, things like deployment artifacts, or things like packet captures. These are binary types of data that, you know, it's not designed to work in a database but yet they want to be able to ask questions like, hey, during the Log4Shell vulnerability, one of all my deployment artifacts actually had Log4j in it in an affected version. These are hard questions to answer in today's enterprise. Or they might need to go back to full fidelity packet capture data to try to understand that, you know, a malicious actor's movement throughout the enterprise. 
And we're not seeing, you know, we're seeing vendors who have great log indexing engines, and great time series databases, but really what people are looking for is the ability to store massive quantities of data, five times, 10 times more data than they're storing today, and they're doing that in places like AWSS3, or in Azure Blob Storage, and we're just now starting to see the advent of technologies we can help them query that data, and technologies that are generally more specifically focused at the type of persona that we sell to which is a security professional, or an IT professional who's trying to understand the behaviors of their applications, and we also find that, you know, general-purpose data processing technologies are great for the enterprise, but they're not working for the people who are running the enterprise, and that's why you're starting to see the concepts like observability pipelines and observability lakes emerge, because they're targeted at these people who have a very unique set of problems that are not being solved by the general-purpose data processing engines. >> It's interesting as you see the evolution of more data volume, more data gravity, then you have these specialty things that need to be engineered for the business. So sounds like observability lake and pipelining of the data, the data pipelining, or stream you call it, these are new things that they bolt into the architecture, right? Because they have business reasons to do it. What's driving that? Sounds like security is one of them. Are there others that are driving this behavior? >> Yeah, I mean it's the need to be able to observe applications and observe end-user behavior at a fine-grain detail. So, I mean I often use examples of like bank teller applications, or perhaps, you know, the app that you're using to, you know, I'm going to be flying in a couple of days. I'll be using their app to understand whether my flight's on time. Am I getting a good experience in that particular application? Answering the question of is Clint getting a good experience requires massive quantities of data, and your application and your service, you know, I'm going to sit there and look at, you know, American Airlines which I'm flying on Thursday, I'm going to be judging them based on off of my experience. I don't care what the average user's experience is I care what my experience is. And if I call them up and I say, hey, and especially for the enterprise usually this is much more for, you know, in-house applications and things like that. They call up their IT department and say, hey, this application is not working well, I don't know what's going on with it, and they can't answer the question of what was my individual experience, they're living with, you know, data that they can afford to store today. And so I think that's why you're starting to see the advent of these new architectures is because digital is so absolutely critical to every company's customer experience, that they're needing to be able to answer questions about an individual user's experience which requires significantly greater volumes of data, and because of significantly greater volumes of data, that requires entirely new approaches to aggregating that data, bringing the data in, and storing that data. >> Talk to me about enabling customer choice when it comes around controlling their data. You mentioned that before we came on camera that you guys are known for choice. How do you enable customer choice and control over their data? 
>> So I think one of the biggest problems I've seen in the industry over the last couple of decades is that vendors come to customers with hugely valuable products that make their lives better but it also requires them to maintain a relationship with that vendor in order to be able to continue to ask questions of that data. And so customers don't get a lot of optionality in these relationships. They sign multi-year agreements, they look to try to start another, they want to go try out another vendor, they want to add new technologies into their stack, and in order to do that they're often left with a choice of well, do I roll out like get another agent, do I go touch 10,000 computers, or a 100,000 computers in order to onboard this data? And what we have been able to offer them is the ability to reuse their existing deployed footprints of agents and their existing data collection technologies, to be able to use multiple tools and use the right tool for the right job, and really give them that choice, and not only give them the choice once, but with the concepts of things like the observability lake and replay, they can go back in time and say, you know what? I wanted to rehydrate all this data into a new tool, I'm no longer locked in to the way one vendor stores this, I can store this data in open formats and that's one of the coolest things about the observability late concept is that customers are no longer locked in to any particular vendor, the data is stored in open formats and so that gives them the choice to be able to go back later and choose any vendor, because they may want to do some AI or ML on that type of data and do some model training. They may want to be able to forward that data to a new cloud data warehouse, or try a different vendor for log search or a different vendor for time series data. And we're really giving them the choice and the tools to do that in a way in which was simply not possible before. >> You know you are bring up a point that's a big part of the upcoming AWS startup series Data as Code, the data engineering role has become so important and the word engineering is a key word in that, but there's not a lot of them, right? So like how many data engineers are there on the planet, and hopefully more will come in, come from these great programs in computer science but you got to engineer something but you're talking about developing on data, you're talking about doing replays and rehydrating, this is developing. So Data as Code is now a reality, how do you see Data as Code evolving from your perspective? Because it implies DevOps, Infrastructure as Code was DevOps, if Data as Code then you got DataOps, AIOps has been around for a while, what is Data as Code? And what does that mean to you Clint? >> I think for our customers, one, it means a number of I think sort of after-effects that maybe they have not yet been considering. One you mentioned which is it's hard to acquire that talent. I think it is also increasingly more critical that people who were working in jobs that used to be purely operational, are now being forced to learn, you know, developer centric tooling, things like GET, things like CI/CD pipelines. And that means that there's a lot of education that's going to have to happen because the vast majority of the people who have been doing things in the old way from the last 10 to 20 years, you know, they're going to have to get retrained and retooled. 
And I think that one is that's a huge opportunity for people who have that skillset, and I think that they will find that their compensation will be directly correlated to their ability to have those types of skills, but it also represents a massive opportunity for people who can catch this wave and find themselves in a place where they're going to have a significantly better career and more options available to them. >> Yeah and I've been thinking about what you just said about your customer environment having all these different things like Datadog and other agents. Those people that rolled those out can still work there, they don't have to rip and replace and then get new training on the new multiyear enterprise service agreement that some other vendor will sell them. You come in and it sounds like you're saying, hey, stay as you are, use Cribl, we'll have some data engineering capabilities for you, is that right? Is that? >> Yup, you got it. And I think one of the things that's a little bit different about our product and our market John, from kind of general-purpose data processing is for our users they often, they're often responsible for many tools and data engineering is not their full-time job, it's actually something they just need to do now, and so we've really built tool that's designed for your average security professional, your average IT professional, yes, we can utilize the same kind of DataOps techniques that you've been talking about, CI/CD pipelines, GITOps, that sort of stuff, but you don't have to, and if you're really just already familiar with administering a Datadog or a Splunk, you can get started with our product really easily, and it is designed to be able to be approachable to anybody with that type of skillset. >> It's interesting you, when you're talking you've remind me of the big wave that was coming, it's still here, shift left meant security from the beginning. What do you do with data shift up, right, down? Like what do you, what does that mean? Because what you're getting at here is that if you're a developer, you have to deal with data but you don't have to be a data engineer but you can be, right? So we're getting in this new world. Security had that same problem. Had to wait for that group to do things, creating tension on the CI/CD pipelining, so the developers who are building apps had to wait. Now you got shift left, what is data, what's the equivalent of the data version of shift left? >> Yeah so we're actually doing this right now. We just announced a new product a week ago called Cribl Edge. And this is enabling us to move processing of this data rather than doing it centrally in the stream to actually push this processing out to the edge, and to utilize a lot of unused capacity that you're already paying AWS, or paying Azure for, or maybe in your own data center, and utilize that capacity to do the processing rather than having to centralize and aggregate all of this data. So I think we're going to see a really interesting, and left from our side is towards the origination point rather than anything else, and that allows us to really unlock a lot of unused capacity and continue to drive the kind of cost down to make more data addressable back to the original thing we talked about the tension between data growth, if we want to offer more capacity to people, if we want to be able to answer more questions, we need to be able to cost-effectively query a lot more data. >> You guys had great success in the enterprise with what you got going on. 
Obviously the funding is just the scoreboard for that. You got good growth, what are the use cases, or what's the customer look like that's working for you where you're winning, or maybe said differently what pain points are out there the customer might be feeling right now that Cribl could fit in and solve? How would you describe that ideal persona, or environment, or problem, that the customer may have that they say, man, Cribl's a perfect fit? >> Yeah, this is a person who's working on tooling. So they administer a Splunk, or an Elastic, or a Datadog, they may be in a network operations center, a security operation center, they are struggling to get data into their tools, they're always at capacity, their tools always at the redline, they really wish they could do more for the business. They're kind of tired of being this department of no where everybody comes to them and says, "hey, can I get this data in?" And they're like, "I wish, but you know, we're all out of capacity, and you know, we have, we wish we could help you but we frankly can't right now." We help them by routing that data to multiple locations, we help them control costs by eliminating noise and waste, and we've been very successful at that in, you know, logos, like, you know, like a Shutterfly, or a, blanking on names, but we've been very successful in the enterprise, that's not great, and we continue to be successful with major logos inside of government, inside of banking, telco, et cetera. >> So basically it used to be the old hyperscalers, the ones with the data full problem, now everyone's got the, they're full of data and they got to really expand capacity and have more agility and more engineering around contributions of the business sounds like that's what you guys are solving. >> Yup and hopefully we help them do a little bit more with less. And I think that's a key problem for our enterprises, is that there's always a limit on the number of human resources that they have available at their disposal, which is why we try to make the software as easy to use as possible, and make it as widely applicable to those IT and security professionals who are, you know, kind of your run-of-the-mill tools administrator, our product is very approachable for them. >> Clint great to see you on theCUBE here, thanks for coming on. Quick plug for the company, you guys looking for hiring, what's going on? Give a quick update, take 30 seconds to give a plug. >> Yeah, absolutely. We are absolutely hiring cribl.io/jobs, we need people in every function from sales, to marketing, to engineering, to back office, GNA, HR, et cetera. So please check out our job site. If you are interested it in learning more you can go to cribl.io. We've got some great online sandboxes there which will help you educate yourself on the product, our documentation is freely available, you can sign up for up to a terabyte a day on our cloud, go to cribl.cloud and sign up free today. The product's easily accessible, and if you'd like to speak with us we'd love to have you in our community, and you can join the community from cribl.io as well. >> All right, Clint Sharp co-founder and CEO of Cribl, thanks for coming to theCUBE. Great to see you, I'm John Furrier your host thanks for watching. (upbeat music)
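
To ground the observability-lake and replay idea Clint describes, here is a hedged sketch of the pattern: land full-fidelity events in an open format (gzipped JSON lines) in object storage, then rehydrate them later into whatever tool you choose. boto3 is the standard AWS SDK for Python, but the bucket, prefix, and the forward_to_new_tool callback are placeholders invented for this example; this is the general pattern, not Cribl's implementation.

import gzip
import json
import boto3

s3 = boto3.client("s3")
BUCKET, PREFIX = "observability-lake", "firewall/2022/06/29/"

def land(events, object_name):
    """Write a batch of events as one gzipped JSON-lines object -- an open format
    any future tool can read, so you are not locked into today's vendor."""
    body = gzip.compress("\n".join(json.dumps(e) for e in events).encode())
    s3.put_object(Bucket=BUCKET, Key=PREFIX + object_name, Body=body)

def replay(forward_to_new_tool):
    """Rehydrate everything under the prefix and push it into a new destination."""
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET, Prefix=PREFIX):
        for obj in page.get("Contents", []):
            raw = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
            for line in gzip.decompress(raw).splitlines():
                forward_to_new_tool(json.loads(line))

land([{"host": "fw01", "action": "ACCEPT", "bytes": 18500}], "batch-0001.json.gz")
replay(print)  # e.g. re-send every archived event to a newly chosen analytics tool

Because the stored objects are plain compressed JSON rather than a proprietary index, the same data can later be rehydrated into a different log search tool, a time-series store, or an ML training job.
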

Published Date : Mar 31 2022


Keep Data Private - Prepare and Analyze Without Unencrypting With Voltage SecureData for Vertica


 

>> Paige: Hello everybody and thank you for joining us today for the Virtual Vertica BDC 2020. Today's breakout session is entitled Keep Data Private Prepare and Analyze Without Unencrypting With Voltage SecureData for Vertica. I'm Paige Roberts, Open Source Relations Manager at Vertica, and I'll be your host for this session. Joining me is Rich Gaston, Global Solutions Architect, Security, Risk, and Government at Voltage. And before we begin, I encourage you to submit your questions or comments during the virtual session, you don't have to wait till the end. Just type your question as it occurs to you, or comment, in the question box below the slide and then click Submit. There'll be a Q&A session at the end of the presentation where we'll try to answer as many of your questions as we're able to get to during the time. Any questions that we don't address we'll do our best to answer offline. Now, if you want, you can visit the Vertica Forum to post your questions there after the session. Now, that's going to take the place of the Developer Lounge, and our engineering team is planning to join the Forum, to keep the conversation going. So as a reminder, you can also maximize your screen by clicking the double arrow button, in the lower-right corner of the slides. That'll allow you to see the slides better. And before you ask, yes, this virtual session is being recorded and it will be available to view on-demand this week. We'll send you a notification as soon as it's ready. All right, let's get started. Over to you, Rich. >> Rich: Hey, thank you very much, Paige, and appreciate the opportunity to discuss this topic with the audience. My name is Rich Gaston and I'm a Global Solutions Architect, within the Micro Focus team, and I work on global Data privacy and protection efforts, for many different organizations, looking to take that journey toward breach defense and regulatory compliance, from platforms ranging from mobile to mainframe, everything in between, cloud, you name it, we're there in terms of our solution sets. Vertica is one of our major partners in this space, and I'm very excited to talk with you today about our solutions on the Vertica platform. First, let's talk a little bit about what you're not going to learn today, and that is, on screen you'll see, just part of the mathematics that goes into, the format-preserving encryption algorithm. We are the originators and authors and patent holders on that algorithm. Came out of research from Stanford University, back in the '90s, and we are very proud, to take that out into the market through the NIST standard process, and license that to others. So we are the originators and maintainers, of both standards and athureader in the industry. We try to make this easy and you don't have to learn any of this tough math. Behind this there are also many other layers of technology. They are part of the security, the platform, such as stateless key management. That's a really complex area, and we make it very simple for you. We have very mature and powerful products in that space, that really make your job quite easy, when you want to implement our technology within Vertica. So today, our goal is to make Data protection easy for you, to be able to understand the basics of Voltage Secure Data, you're going to be learning how the Vertica UDx, can help you get started quickly, and we're going to see some examples of how Vertica plus Voltage Secure Data, are going to be working together, in our customer cases out in the field. 
First, let's take you through a quick introduction to Voltage Secure Data. The business drivers and what's this all about. First of all, we started off with Breach Defense. We see that despite continued investments, in personal perimeter and platform security, Data breaches continue to occur. Voltage Secure Data plus Vertica, provides defense in depth for sensitive Data, and that's a key concept that we're going to be referring to. in the security field defense in depth, is a standard approach to be able to provide, more layers of protection around sensitive assets, such as your Data, and that's exactly what Secure Data is designed to do. Now that we've come through many of these breach examples, and big ticket items, getting the news around breaches and their impact, the business regulators have stepped up, and regulatory compliance, is now a hot topic in Data privacy. Regulations such as GDPR came online in 2018 for the EU. CCPA came online just this year, a couple months ago for California, and is the de-facto standard for the United States now, as organizations are trying to look at, the best practices for providing, regulatory compliance around Data privacy and protection. These gives massive new rights to consumers, but also obligations to organizations, to protect that personal Data. Secure Data Plus Vertica provides, fine grained authorization around sensitive Data, And we're going to show you exactly how that works, within the Vertica platform. At the bottom, you'll see some of the snippets there, of the news articles that just keep racking up, and our goal is to keep you off the news, to keep your company safe, so that you can have the assurance, that even if there is an unintentional, or intentional breach of Data out of the corporation, if it is protected by voltage Secure Data, it will be of no value to those hackers, and then you have no impact, in terms of risk to the organization. What do we mean by defense in depth? Let's take a look first at the encryption types, and the benefits that they provide, and we see our customers implementing, all kinds of different protection mechanisms, within the organization. You could be looking at disk level protection, file system protection, protection on the files themselves. You could protect the entire Database, you could protect our transmissions, as they go from the client to the server via TLS, or other protected tunnels. And then we look at Field-level Encryption, and that's what we're talking about today. That's all the above protections, at the perimeter level at the platform level. Plus, we're giving you granular access control, to your sensitive Data. Our main message is, keep the Data protected for at the earliest possible point, and only access it, when you have a valid business need to do so. That's a really critical aspect as we see Vertica customers, loading terabytes, petabytes of Data, into clusters of Vertica console, Vertica Database being able to give access to that Data, out to a wide variety of end users. We started off with organizations having, four people in an office doing Data science, or analytics, or Data warehousing, or whatever it's called within an organization, and that's now ballooned out, to a new customer coming in and telling us, we're going to have 1000 people accessing it, plus service accounts accessing Vertica, we need to be able to provide fine level access control, and be able to understand what are folks doing with that sensitive Data? And how can we Secure it, the best practices possible. 
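
A quick illustration of what "format-preserving" means in practice: the protected value keeps the length and character classes of the original, so schemas, validation rules, and partial-display logic keep working. The toy sketch below is not the NIST FF1 algorithm that Voltage actually implements, just a keyed per-character shift that demonstrates the format-preservation and round-trip properties; the key and sample value are made up.

import hmac, hashlib

DIGITS = "0123456789"
UPPER  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER  = "abcdefghijklmnopqrstuvwxyz"

def _keystream(key: bytes, tweak: bytes, n: int) -> bytes:
    """Derive n pseudorandom bytes from key+tweak via HMAC-SHA256 in counter mode."""
    out, counter = b"", 0
    while len(out) < n:
        out += hmac.new(key, tweak + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out

def toy_fpe(value: str, key: bytes, tweak: bytes = b"", decrypt: bool = False) -> str:
    """Shift each character within its own class (digit/upper/lower) by a keyed amount,
    leaving punctuation and spaces untouched, so the output keeps the input's format."""
    out = []
    for ch, k in zip(value, _keystream(key, tweak, len(value))):
        for alphabet in (DIGITS, UPPER, LOWER):
            if ch in alphabet:
                shift = k % len(alphabet)
                if decrypt:
                    shift = -shift
                out.append(alphabet[(alphabet.index(ch) + shift) % len(alphabet)])
                break
        else:
            out.append(ch)  # keep formatting characters (dashes, spaces) as-is
    return "".join(out)

key = b"demo-key-not-for-production"
protected = toy_fpe("4111-1111-1111-1111", key)
print(protected)                               # a digits-for-digits value that still looks like a card number
print(toy_fpe(protected, key, decrypt=True))   # round-trips back to the original

The real product adds the pieces this toy ignores: a vetted, standardized cipher (FF1), centralized stateless key management, per-format policies such as leaving the last four digits in the clear, and audit of every protect and access call.
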
In very simple state, voltage protect Data at rest and in motion. The encryption of Data facilitates compliance, and it reduces your risk of breach. So if you take a look at what we mean by feel level, we could take a name, that name might not just be in US ASCII. Here we have a sort of Latin one extended, example of Harold Potter, and we could take a look at the example protected Data. Notice that we're taking a character set approach, to protecting it, meaning, I've got an alphanumeric option here for the format, that I'm applying to that name. That gives me a mix of alpha and numeric, and plus, I've got some of that Latin one extended alphabet in there as well, and that's really controllable by the end customer. They can have this be just US ASCII, they can have it be numbers for numbers, you can have a wide variety, of different protection mechanisms, including ignoring some characters in the alphabet, in case you want to maintain formatting. We've got all the bells and whistles, that you would ever want, to put on top of format preserving encryption, and we continue to add more to that platform, as we go forward. Taking a look at tax ID, there's an example of numbers for numbers, pretty basic, but it gives us the sort of idea, that we can very quickly and easily keep the Data protected, while maintaining the format. No schema changes are going to be required, when you want to protect that Data. If you look at credit card number, really popular example, and the same concept can be applied to tax ID, often the last four digits will be used in a tax ID, to verify someone's identity. That could be on an automated telephone system, it could be a customer service representative, just trying to validate the security of the customer, and we can keep that Data in the clear for that purpose, while protecting the entire string from breach. Dates are another critical area of concern, for a lot of medical use cases. But we're seeing Date of Birth, being included in a lot of Data privacy conversations, and we can protect dates with dates, they're going to be a valid date, and we have some really nifty tools, to maintain offsets between dates. So again, we've got the real depth of capability, within our encryption, that's not just saying, here's a one size fits all approach, GPS location, customer ID, IP address, all of those kinds of Data strings, can be protected by voltage Secure Data within Vertica. Let's take a look at the UDx basics. So what are we doing, when we add Voltage to Vertica? Vertica stays as is in the center. In fact, if you get the Vertical distribution, you're getting the Secure Data UDx onboard, you just need to enable it, and have Secure Data virtual appliance, that's the box there on the middle right. That's what we come in and add to the mix, as we start to be able to add those capabilities to Vertica. On the left hand side, you'll see that your users, your service accounts, your analytics, are still typically doing Select, Update, Insert, Delete, type of functionality within Vertica. And they're going to come into Vertica's access control layer, they're going to also access those services via SQL, and we simply extend SQL for Vertica. So when you add the UDx, you get additional syntax that we can provide, and we're going to show you examples of that. You can also integrate that with concepts, like Views within Vertica. 
So that we can say, let's give a view of Data, that gives the Data in the clear, using the UDx to decrypt that Data, and let's give everybody else, access to the raw Data which is protected. Third parties could be brought in, folks like contractors or folks that aren't vetted, as closely as a security team might do, for internal sensitive Data access, could be given access to the Vertical cluster, without risk of them breaching and going into some area, they're not supposed to take a look at. Vertica has excellent control for access, down even to the column level, which is phenomenal, and really provides you with world class security, around the Vertical solution itself. Secure Data adds another layer of protection, like we're mentioning, so that we can have Data protected in use, Data protected at rest, and then we can have the ability, to share that protected Data throughout the organization. And that's really where Secure Data shines, is the ability to protect that Data on mainframe, on mobile, and open systems, in the cloud, everywhere you want to have that Data move to and from Vertica, then you can have Secure Data, integrated with those endpoints as well. That's an additional solution on top, the Secure Data Plus Vertica solution, that is bundled together today for a sales purpose. But we can also have that conversation with you, about those wider Secure Data use cases, we'd be happy to talk to you about that. Security to the virtual appliance, is a lightweight appliance, sits on something like eight cores, 16 gigs of RAM, 100 gig of disk or 200 gig of disk, really a lightweight appliance, you can have one or many. Most customers have four in production, just for redundancy, they don't need them for scale. But we have some customers with 16 or more in production, because they're running such high volumes of transaction load. They're running a lot of web service transactions, and they're running Vertica as well. So we're going to have those virtual appliances, as co-located around the globe, hooked up to all kinds of systems, like Syslog, LDAP, load balancers, we've got a lot of capability within the appliance, to fit into your enterprise IP landscape. So let me get you directly into the neat, of what does the UDx do. If you're technical and you know SQL, this is probably going to be pretty straightforward to you, you'll see the copy command, used widely in Vertica to get Data into Vertica. So let's try to protect that Data when we're ingesting it. Let's grab it from maybe a CSV file, and put it straight into Vertica, but protected on the way and that's what the UDx does. We have Voltage Secure protectors, an added syntax, like I mentioned, to the Vertica SQL. And that allows us to say, we're going to protect the customer first name, using the parameters of hyper alphanumeric. That's our internal lingo of a format, within Secure Data, this part of our API, the API is require very few inputs. The format is the one, that you as a developer will be supplying, and you'll have different ones for maybe SSN, you'll have different formats for street address, but you can reuse a lot of your formats, across a lot of your PII, PHI Data types. Protecting after ingest is also common. So I've got some Data, that's already been put into a staging area, perhaps I've got a landing zone, a sandbox of some sort, now I want to be able to move that, into a different zone in Vertica, different area of the schema, and I want to have that Data protected. 
We can do that with the update command, and simply again, you'll notice Voltage Secure protect, nothing too wild there, basically the same syntax. We're going to query unprotected Data. How do we search once I've encrypted all my Data? Well, actually, there's a pretty nifty trick to do so. If you want to be able to query unprotected Data, and we have the search string, like a phone number there in this example, simply call Voltage Secure protect on that, now you'll have the cipher text, and you'll be able to search the stored cipher text. Again, we're just format preserving encrypting the Data, and it's just a string, and we can always compare those strings, using standard syntax and SQL. Using views to decrypt Data, again a powerful concept, in terms of how to make this work, within the Vertica Landscape, when you have a lot of different groups of users. Views are very powerful, to be able to point a BI tool, for instance, business intelligence tools, Cognos, Tableau, etc, might be accessing Data from Vertica with simple queries. Well, let's point them to a view that does the hard work, and uses the Vertical nodes, and its horsepower of CPU and RAM, to actually run that Udx, and do the decryption of the Data in use, temporarily in memory, and then throw that away, so that it can't be breached. That's a nice way to keep your users active and working and going forward, with their Data access and Data analytics, while also keeping the Data Secure in the process. And then we might want to export some Data, and push it out to someone in a clear text manner. We've got a third party, needs to take the tax ID along with some Data, to do some processing, all we need to do is call Voltage Secure Access, again, very similar to the protect call, and you're writing the parameter again, and boom, we have decrypted the Data and used again, the Vertical resources of RAM and CPU and horsepower, to do the work. All we're doing with Voltage Secure Data Appliance, is a real simple little key fetch, across a protected tunnel, that's a tiny atomic transaction, gets done very quick, and you're good to go. This is it in terms of the UDx, you have a couple of calls, and one parameter to pass, everything else is config driven, and really, you're up and running very quickly. We can even do demos and samples of this Vertical Udx, using hosted appliances, that we put up for pre sales purposes. So folks want to get up and get a demo going. We could take that Udx, configure it to point to our, appliance sitting on the internet, and within a couple of minutes, we're up and running with some simple use cases. Of course, for on-prem deployment, or deployment in the cloud, you'll want your own appliance in your own crypto district, you have your own security, but it just shows, that we can easily connect to any appliance, and get this working in a matter of minutes. Let's take a look deeper at the voltage plus Vertica solution, and we'll describe some of the use cases and path to success. First of all your steps to, implementing Data-centric security and Vertica. Want to note there on the left hand side, identify sensitive Data. How do we do this? I have one customer, where they look at me and say, Rich, we know exactly what our sensitive Data is, we develop the schema, it's our own App, we have a customer table, we don't need any help in this. 
Let's take a deeper look at the Voltage plus Vertica solution, and we'll describe some of the use cases and the path to success. First of all, your steps to implementing data-centric security in Vertica. Note there on the left-hand side: identify sensitive data. How do we do this? I have one customer where they look at me and say, Rich, we know exactly what our sensitive data is, we developed the schema, it's our own app, we have a customer table, we don't need any help with this. We've got other customers that say, Rich, we have a very complex database environment, with multiple databases, multiple schemas, thousands of tables, hundreds of thousands of columns — it's really, really complex, help — and we don't know exactly what people have been doing with some of that data; we've got various teams that share this resource. There, we do have additional tools, and I want to give a shout out to another Micro Focus product called Structured Data Manager. It's a great tool that helps you identify sensitive data, with some really amazing technology under the hood. It can go into a Vertica repository, scan those tables, take a sample of rows or do a full table scan, and give you back some really good reports saying, we think this is sensitive — let's go confirm it and move forward with data protection. So if you need help on that, we've got the tools to do it. Once you identify that sensitive data, you're going to want to understand your data flows and your use cases. Take a look at what analytics you're doing today. What analytics do you want to do on sensitive data in the future? Let's start designing our analytics to work with sensitive data, and there are some tips and tricks we can provide to help you mitigate any concerns around performance or around rewriting your SQL. As you've noted, you can simply insert our SQL additions into your code and you're off and running. Then you want to install and configure the UDx and the Secure Data software appliance. Well, the UDx is pretty darn simple. The documentation on Vertica is publicly available; you can see how it works and what you need to configure it — one file here, and you're ready to go. So that's a pretty straightforward process. Then you grant some access to the UDx, and that's really up to the customer, because there are many different ways to handle access control in Vertica; we're going to be flexible to fit within your model of access control when adding the UDx to your mix. Each customer is a little different there, so you might want to talk with us a bit about the best practices for your use cases, but in general, that's going to be up and running in just a minute. The Secure Data software appliance is a hardened Linux appliance today that sits on-prem or in the cloud, and you can deploy that. I've seen it done in 15 minutes, but that assumes you already have the access to set the firewall rules and all the DNS entries — the basic blocking and tackling of standing up a software appliance. Corporations can take care of all that in just a couple of weeks, mostly because they're waiting on other teams, but the software appliances themselves are really fast to get stood up, and they're very simple to administer with our web-based GUI. Then finally, you're going to implement your UDx use cases. Once the software appliance is up and running, we can set authentication methods, we can set up the formats that you're going to use in Vertica, and then those two start talking together. It should be going in dev and test in about half a day, and then you're running toward production in just a matter of days, in most cases.
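As a hedged illustration of that grant-access step mentioned above — the role and user names below are placeholders, and your own access-control model (roles, LDAP groups, schemas) may look quite different — Vertica's ordinary GRANT machinery is one way to scope who may call the protect and access functions:

    -- Let one analytics role protect and decrypt through the UDx;
    -- everyone else only ever sees ciphertext.
    CREATE ROLE secure_analytics;
    GRANT EXECUTE ON FUNCTION VoltageSecureProtect(VARCHAR) TO secure_analytics;
    GRANT EXECUTE ON FUNCTION VoltageSecureAccess(VARCHAR) TO secure_analytics;
    GRANT secure_analytics TO analyst_user;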
We've got other customers that say, hey, this is going to be a bigger migration project for us; we might want to split this up into chunks. Let's do the really sensitive and scary data, like tax ID, first, as our sort of toe-in-the-water approach, and then we'll come back and protect other data elements. That's one way to slice and dice and implement your solution in a planned manner. Another way is schema based: let's take a look at this section of the schema and implement protection on these data elements; now let's take a look at a different schema and repeat the process, so you can iteratively move forward with your deployment. So what's the added value when you add Voltage on top of Vertica? I want to highlight this distinction because Vertica contains world-class security controls around its database. I'm an old-time DBA from a different product, competing against Vertica in the past, and I'm really aware of the granular access controls that are provided within various platforms. Vertica would rank at the very top of the list in terms of being able to give me very tight control, and a lot of different access methods, to protect the data in a lot of different use cases. So Vertica can handle a lot of your data protection needs right out of the box. Voltage Secure Data, as we keep mentioning, adds that defense in depth, and it's going to enable those enterprise-wide use cases as well. So first off, I mentioned this: the standard of FF1, that is format-preserving encryption — we're the authors of it, we continue to maintain it, and we want to emphasize that customers really ought to be very, very careful to choose a NIST standard when implementing any kind of encryption within the organization. AES was one of the first, and hallmark, benchmark encryption algorithms, and in 2016 we were added to that mix as FF1, which is built on AES. If you search NIST and Voltage Security, you'll see us right there as the author of the standard, and all the processes that went along with that approval. We have centralized policy for key management, authentication, audit, and compliance. We can now see that Vertica selected or fetched the key to protect some data at this date and time; we can track that and give you audit and compliance reporting against that data. You can move protected data into and out of Vertica. So we can ingest via Kafka, via NiFi, or via StreamSets — there are a variety of different ingestion and streaming methods that can get data into Vertica, and we can integrate Secure Data with all of those components. We're very well suited to integrate with any Hadoop technology or any big data technology, as we have APIs in a variety of languages, bitnesses, and platforms. So we've got all that out of the box, ready to go for you, if you need it. When you're moving data out of Vertica, you might move it into an open systems platform, you might move it to the cloud — we can also operate and do the decryption there, and you're going to get the same plaintext back; and if you protect data over in the cloud and move it into Vertica, you're going to be able to decrypt it in Vertica. That's our cross-platform promise. We've been delivering on that for many, many years, and we now have many, many endpoints that do that in production for the world's largest organizations. We're going to preserve your data format and referential integrity. So if I protect my Social Security number today, I can protect another batch of data tomorrow, and that same ciphertext will be generated when I put it into Vertica. I can have absolute referential integrity on that data, to allow analytics to occur without even decrypting the data in many cases.
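Because the format-preserving encryption produces the same ciphertext for the same input under a given format and key — which is exactly the referential-integrity point just made — protected columns can still serve as join keys. A hypothetical sketch, with invented table names:

    -- Both tables store ssn protected with the same format, so the join keys
    -- still match even though neither side is ever decrypted.
    SELECT c.cust_id,
           COUNT(t.txn_id) AS txn_count
    FROM customers c
    JOIN transactions t ON c.ssn = t.ssn   -- ciphertext compared to ciphertext
    GROUP BY c.cust_id;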
And we have decrypt access for authorized users only, with the ability to add LDAP authentication and authorization for UDx users. So you can really have a number of different approaches and flavors of how you implement Voltage within Vertica, but what you're getting is the additional ability to have confidence that the data is protected at rest — even if I have a DBA who's not vetted, or someone new, or a third party I don't know much about, being provided DBA-level privileges. They can select star all day long, and they're going to get ciphertext; they're going to have nothing of any value. And if they want to use the UDx to decrypt it, they're going to be tracked and traced as to their utilization of that. So it allows us to have that control, and an additional layer of security on your sensitive data. This may be required by regulatory agencies, and it seems that compliance audits are getting more and more strict every year. GDPR was kind of funny, because they said in 2016, hey, this is coming; they said in 2018, it's here; and now they're saying in 2020, hey, we're serious about this, and the fines are mounting. Let's give you some examples to help you understand that these regulations are real, the fines are real, and your reputational damage can be significant if you were to be in breach of a regulatory compliance requirement. We're finding so many different use cases now popping up around regional protection of data: I need to protect this data so that it cannot go offshore; I need to protect this data so that people from another region cannot see it. That's all capability that we have within Secure Data that we can add to Vertica. We have that broad platform support, and I mentioned NiFi and Kafka — those would be on the left-hand side, as we start to ingest data from applications into Vertica. We can have landing-zone approaches, where we provide some automated scripting at an OS level to protect ETL batch transactions coming in. We can protect within the Vertica UDx, as I mentioned, with the copy command, directly using Vertica. Everything inside that dot-dash line is the Vertica plus Voltage Secure Data combo that's sold together as a single package. Additionally, we'd love to talk with you about the stuff that's outside the dashed box, because we have dozens and dozens of endpoints that can protect and access data on many different platforms. And this is where you really start to leverage some of the extensive power of Secure Data — to go cross-platform to handle your web-based apps, to handle apps in the cloud, and to handle all of this at scale, with hundreds of thousands of transactions per second of format-preserving encryption. That may not sound like much, but when you take a look at the algorithm, what we're doing on the mathematics side, and everything that goes into that transaction, to me that's an amazing accomplishment, that we're reaching those kinds of levels of scale — and with Vertica, it scales horizontally. So the more nodes you add, the more power you get, and the more throughput you're going to get from Voltage Secure Data.
I want to highlight the next steps on how we can continue to move forward. Our Secure Data team is available to you to talk about the landscape, your use cases, your data. We really love the fact that we've got so many different organizations out there using Secure Data in so many different and unique ways. We have vehicle manufacturers who are protecting not just the VIN, not just their customer data, but in fact the sensor data from the vehicles, which is sent over the network down to the home base every 15 minutes for every vehicle that's on the road — and every vehicle of this customer of ours since 2017 has included that capability. So now we're talking about additional millions and millions of units coming online as those cars are sold, distributed, and used by customers. That sensor data is critical to the customer, and they cannot let it be exfiltrated in the clear. So they protect that data with Secure Data, and we have a great track record of being able to meet a variety of different unique requirements, whether it's IoT, web-based apps, e-commerce, healthcare — all kinds of different industries. We would love to help move the conversations forward, and we do find that it's really a three-party discussion: the customer, Secure Data experts in some cases, and the Vertica team. We have great enablement within the Vertica team to be able to explain and present our Secure Data solution to you. But we also have the ability to add other experts in, to keep that conversation going into a broader perspective of how can I protect my data across all my platforms, not just in Vertica. I want to give a shout out to our friends at Vertica Academy. They're building out great demo and training facilities to help you learn more about these UDxs and how they're implemented. The Academy is a terrific reference and resource for your teams to learn more about the solution in a self-guided way, and then we'd love to have your feedback on that. How can we help you more? What are the topics you'd like to learn more about? How can we look to the future in protecting unstructured data? How can we look to the future of being able to protect data at scale? What are the requirements that we need to be meeting? Help us through the learning processes, and through feedback to the team, get better, and then we'll help you deliver more solutions out to those endpoints and protect that data, so that we're not having data breaches and we're not having regulatory compliance concerns. And then lastly, learn more about the UDx. I mentioned that all of our content there is online and available to the public. At vertica.com/secureData you're going to be able to walk through the basics of the UDx. You're going to see how simple it is to set up, what the UDx syntax looks like, how to grant access to it, and then you'll start to be able to figure out, hey, how can I start to put this into a POC in my own environment? Like I mentioned before, we have a publicly available hosted appliance for demo purposes that we can make available to you if you want to POC this. Reach out to us. Let's get a conversation going, and we'll get you the address and some instructions, and we can have a quick enablement session.
We really want to make this accessible to you and help demystify the concept of encryption, because when you see it as a developer, and you start to get your hands on it and put it to use, you can very quickly see — huh, I could use this in a variety of different cases, and I could use this to protect my data without impacting my analytics. Those are some of the really big concerns that folks have, and once we get through that learning process, and play around with it in a POC way, we can start to really put it into practice, into production, and say with confidence, we're going to move forward toward data encryption and have a very good result at the end of the day. This is one of the things I find with customers that's really interesting. Their biggest stress is not around the timeframe or the resources; it's really around: this is my data, I have been working on collecting this data and making it available in a very high-quality way for many years, this is my job and I'm responsible for this data — and now you're telling me you're going to encrypt that data? It makes me nervous. And that's common; everybody feels that. So we want to have that conversation, and that sort of trial-and-error process, to say, hey, let's get your feet wet with it and see how you like it in a sandbox environment. Let's then take that into analytics and look at how we can make this go for a quick 1.0 release, and let's then take a look at future expansions to that, where we start adding Kafka on the ingest side, or we start sending data off into other machine learning and analytics platforms that we might want to utilize outside of Vertica for certain purposes in certain industries. Let's take a look at those use cases together, and through that journey we can really chart a path toward the future, where we can help you protect that data at rest and in use, and keep you safe from both the hackers and the regulators — and that, I think, at the end of the day, is really what it's all about in terms of protecting our data within Vertica. We're going to have a couple of minutes for Q&A, and we would encourage you to ask any questions here; we'd love to follow up with you more about any questions you might have about Vertica plus Voltage Secure Data. Thank you very much for your time today.

Published Date : Mar 30 2020


Chris Crocco, ViaSat | Splunk .conf18


 

>> Live from Orlando, Florida, it's theCUBE, covering .conf2018! Brought to you by Splunk. (techno music) >> Welcome back to Orlando, everybody. We're here with theCUBE covering Splunk.conf2018. I'm Dave Vellante with my co-host, Stu Miniman. Chris Crocco is here, he's the Lead Solutions Engineer at ViaSat. Great to see you, thanks for coming on theCUBE. >> Well, thanks for having me. I appreciate it. >> You're very welcome. Let's start with ViaSat. Tell us what you guys do and what your role is all about. >> So ViaSat is a global communications and technology company primarily focused on satellite-based technologies, anything from government services to commercial aviation and residential service. >> And what does a Lead Solutions Engineer do? >> My primary role is to help us kind of transition from a traditional operations state into more of a DevOps environment including monitoring, alerting, orchestration and remediation. >> Oh, we love this conversation, don't we? Okay. The basic question is, and I know it's hard, but it's subjective, it's kind of if you think about the majority of your organization in the context of DevOps, on a scale of one to five, five being nirvana, so let's assume you're not at five 'cause it never ends, right? You're constantly evolving. Where would you say you are? Are you just getting started? Are you more like a four, 4 1/2, what do you think? >> That's a good question. I would say we're probably three on our way to four. We've had a lot of growing pains, we've had a lot of learning opportunities. The processes of DevOps are getting pretty well-entrenched and right now, we're working on making sure that the culture sticks with the DevOps. >> That's critical, right? >> I mean, that's really where the rubber meets the road is that organizational and political. Without getting into the dirt of it, give us what it looked like before and where you are today. >> Sure. Prior to our shift to DevOps, which was mainly motivated by our latest spacecraft, ViaSat-2, we had a very traditional operational model where we had everything funneled through a Network Operations Center, we had a Technical Operations Team, and if they weren't able to triage and remediate issues, they kicked it over the fence to engineers and developers who would then throw something back. There wasn't a lot of communication between the two organizations, so when we did find recurring problems, recurring issues in our network and in our environment, it took a long time to get those resolved and we had to have a large volume of staff there just to kind of put out the fires. With the transition of DevOps, one of the things that we've been focusing on is making sure that our development teams, our engineering teams understand the customer experience and how it's impacted by what they do, and de-centralizing that operation structure so all of the triage work goes to the people who actually work on those services. So it's a pretty big paradigm shift but it's also helping us solve customer problems faster and get better education about what the customer experience is to the people who actually make it better. >> And roughly, what was the timeframe that it took to go from that really waterfall model to the structure that you have today? >> We've been going for about two or three years now in this transition. 
Like I said, the first year or so was kind of bumpy and we've really kind of ramped up over the past year in terms of the amount of teams that are practicing DevOps, the amount of teams that are in an agile and scrum model. So overall, two to three years to get to where we are today. >> So the problem with the traditional model is that time to deployment is slower, which means time to value is slower, and there's a lot of re-work. Here, you take it. No, you take it. Hey, it worked when I gave it to you. A lot of back and forth, and not a lot of communication creates frustration, not a lot of collaboration and teamwork, and you're working through that now. How large is the team? >> My team is five people. We have 4,500 people roughly at ViaSat as a whole. I believe roughly 2,000 of them are in an engineering or technical role. >> Okay, but in the previous model, you had developers and you had operations folks, is that right? And your five are sort of split over those or was it a much, much larger corpus of folks? >> It was a very large distribution of people. It was very engineering and developer-centric. We still had a Core Operations Team of 60 to 100 people based in our Denver office. We're keeping our headcount relatively the same with respect to our operations and we're growing a lot in terms of those DevOps teams. So as those teams continue to grow, we're adding more operational resources to them and kind of inserting a lot of that knowledge into other parts of the organization. >> You're doing a lot more with the same. Are you coming from the ops side or the dev side? >> I come from the ops side. I actually started my career with ViaSat in our NOC in Denver. From there, I transitioned into an ops analyst role and then we created the Solutions Engineering Team and I took the lead on that. >> Chris, can you tell us how Splunk plays into your DevOps? Did you start using it in the NOC and kind of go from there? >> We did, actually. Splunk started out as just a tool for us to see how many modems were offline in the NOC. It was up on the video wall and we would see spikes and know that there was a problem. And as we've made this transition to DevOps, a lot of teams that were using other solutions, other open-source and home-grown solutions, were kind of organically pivoting to Splunk because it was a lot easier for them to use for alerting, dashboards, deep data analysis, a lot of the things they needed to do their job effectively. So as we've grown as a company, as we've grown in this organizational model, Splunk has kind of grown along with that in terms of use case. >> That growth is predominantly in IT operations and security, correct? >> Well, it's actually pretty interesting. It's kind of all over the board in our organization. It started in IT operations and security, but we have people in our marketing department using it to make sales and campaign decisions. We have executive leadership looking at it to see the performance of our spacecraft, we have exploratory research being done with it in terms of what's effective and what's not for our new spacecraft that will be coming out, the ViaSat-3 Constellation. So it's really all over the board in our organization. >> That's interesting, Stu, you're not the first customer who's told us that no, it's not just confined to IT, it's actually seeping through the organization.
Despite the fact that we heard a bunch of announcements today, I don't know if you saw the keynotes, making it simpler for lines of business folks to actually utilize Splunk, so given that a lot of your teams in the business are actually using it already, what do you think these announcements will do for them? Maybe you haven't had time to evaluate it, but essentially, it's making it easier for business people, you know, simplifying it. >> Yeah, you know, all of the announcements in the keynotes over the past two days have been really, really exciting. Everything that I was hoping for got checked off the list. So I think one of the big things that it's going to allow us to do is get our customer-facing teams and our customer care organizations more involved with the tool. And getting them the information that they need to better serve customers that are calling in, and potentially even prevent the situations that customers have to call in for in the first place. So giving them a lot of account information quickly, giving them the ability to access information that is PCI and PII-compliant but still allowing them to get the data they need to service an individual customer, all of those things I think are really going to be impacted by the announcements in this conf. >> So you were at the keynote yesterday. >> I was! >> Were you shaking the phone? >> I was, yeah. >> Which group were you, were you orange? >> We were orange group, yeah. >> We were orange, too! But we were sitting in the media section and all the media guys were sitting on their hands but we had a lot of devs and ops guys shaking with us. It's like when you do the wave at Fenway Park when it gets behind home plate, everybody just kind of sits down, but we were plugging hard. Alright, Chris, what else has excited you about .conf2018? Cool stuff that you've seen, some innovations, things you've learned. >> Well, I'm really excited about the app for infrastructure. That's something that we've been trying to get for ITSI for a long time now in terms of NED-level monitoring and NED-level thresholding. I think that's going to complement our business really, really well. The advancements that they're doing with the metrics store, specifically with things like Syslog, are really, really exciting. I think that that's going to allow us to accelerate our data and make it more performant. The S3-compliant storage is absolutely fantastic, and it comes in black now and that's really, really fantastic. >> Oh right! The dark mode! >> Dark mode, yup. >> You mentioned the ITSI. Have you used the VictorOps pieces before or is that something you're looking to do? >> We haven't looked at VictorOps as of yet. We're an xMatters customer right now so we've been using their integration that they built out and it's on Splunkbase. But VictorOps, it'll be interesting to see how that organization changes now that it's part of Splunk. >> So dark mode actually, it's one of those things that it really got such a loud ovation. It was funny, I was actually talking to a couple Splunkers that are like, "We want that dark mode t-shirt." Which I think you have to be a user and you need to sign up for some research thing that they're doing, and they're giving out the black shirt that has like gray text on it. >> Awesome! >> Why does that resonate with you, the dark mode? >> Well, it was actually what they talked about in the keynote.
If you have it up on a video wall, which we have in various parts of our company, or if you're sitting in a dark office, something like that, looking at a really white screen for a long period of time, it's not easy on your eyes, it's hard to look at for a long period of time. And generally speaking, a lot of our presentation layers go towards that visual format. So I think this is going to allow us to make it much more appealing to the people who are putting this up on screens in front of people. >> Your responsibility extends out into the field, I presume. The data that's in the field, is that true? >> It does. >> Okay, so I'm interested in your reaction to the industrial IoT announcements, how you see or if you see your organization taking advantage of that. >> Well, we're a very vertically integrated company so we actually manufacture a lot of the devices that we use and that we provide to our customers. I think a lot of our manufacturing capabilities would really benefit from that. Anything from building antennas for ground segment that actually talked to the spacecraft. It's the modems that we put in people's houses, that entire fabrication process I think would benefit a lot. I really loved the AR presentation that they did where they were actually showing the overlay of metrics on a manufacturing line. I think that's something that would be fantastic for us, particularly for sending somebody to an antenna or a ground station to replace a piece of equipment. We can overlay those metrics, we can overlay all of that, we can use the industrial analytics piece of that to actually show which piece of hardware is most affected and how best to replace that. So a lot of opportunities there for our company. >> So I wonder if you could help us understand what's, from your perspective, on Splunk's to-do list. We're going to have Doug Merritt on a little later. If you had Doug right here and he said, Chris, what can we do to make your life better? What would you tell him? >> You know, I think a couple of the things that would make it better, and it looks like they're heading this direction, is streaming in and streaming out. You know, streaming in is of course important, that's where a lot of your data lives, but you also have to be able to send that out to Kafka, to Kinesis, to other places, so other people can consume the output of what Splunk is doing. So I think that would be a really, really important thing for us to socialize the benefit of Splunk. And then vertically integrating the incident management chain, it looks like something that's on their roadmap and I'd be interested to see what their roadmap looks like in terms of pulling in Phantom, pulling in VictorOps, pulling in some of these other technologies that are now in the Splunk umbrella to really make that end-to-end process of detecting, directing and remediating issues a lot more efficient. >> Okay, and do you see at some point that the machine will actually do, the machine intelligence will do a lot of that remediation? >> I think so. >> Do you see the human still heavily involved? >> Well, I think one of the important things is for a lot of these remediation things, we shouldn't have a human involved, right? Particularly things that are well-known issues. Human beings are expensive and human beings are important, and there are a lot more important things that they can be doing with their time than putting out fires. So if we can have machines doing that for them, it frees them up to do a lot more cool stuff. >> You're right. 
Alright, Chris, well listen, thanks very much for coming on theCUBE. It was great to have you. >> Yeah! Appreciate it very much. >> Thanks for your insights. Alright, keep it right there, everybody. Stu and I will be back with our next guest. You're watching theCUBE from Orlando Splunk.conf2018. Be right back. (techno music)

Published Date : Oct 3 2018


Omer Trajman, Rocana - #BigDataNYC 2016 - #theCUBE


 

>> Announcer: From New York, it's the Cube. Covering Big Data New York City 2016. Brought to you by Headline Sponsors, Cisco, IBM, NVIDIA, and our ecosystem sponsors. Now, here are your hosts, Dave Vellante and George Gilbert. >> Welcome back to New York City everybody, this is the Cube, the worldwide leader in live tech coverage, and we've been going wall to wall since Monday here at Strata plus Hadoop World, Big Data NYC is our show within the show. Omer Trajman is here, he's the CEO of Rocana, Cube alum, good to see you again. >> Yeah you too, it's good to be here again. >> What's the deal with the shirt, it says, 'your boss is useless', what are you talking about? >> So, if I wasn't mic'd up, I'd get up and show you, but you can see in the faint print that it's not talking about how your boss is useless, right, it's talking about how you make better use of data and what your boss' expectations are. The point we're trying to get across is that context matters. If you're looking at a small fraction of the information then you're not going to get the full picture, you're not going to understand what's actually going on. You have to look at everything, you have no choice today. >> So Rocana has some ambitious plans to enter this market, generally referred to as IT operations, if I can call it that, why does the world need another play on IT operations? >> In IT operations? If you look at the current state of IT operations in general, and specifically people think of this largely as monitoring, it's I've got a bunch of systems, I can't keep track of everything, so I'm going to pick and choose what I pay attention to. I'm going to look at data selectively, I'm only going to keep it for as long as I can afford to keep it, and I'm not going to pay attention to the stuff that's outside that hasn't caused problems, yet. The problem is, the yet, right? You all have seen the Delta outages, the Southwest issues, the Neiman Marcus website, right? There's plenty of examples of where someone just wasn't looking at information, no one was paying attention to it or collecting it and they got blindsided. And in today's pace of business where everything is digital, everyone's interacting with the machines directly, everything's got to be up all the time. Or at least you have to know that something's gone askew and fix it quickly. And so our take is, what we call total operational visibility. You got to pay attention to everything all the time and that's easier said than done. >> Well, because that requires you got to pay attention to all the data, although this reminds me of what Abhi Mehta said in 2010, "Sampling is dead", alright? Do you agree he's right? >> Trajman: I agree. And so it's much more than that, of course right, sampling is dead, you want to look at all the details all the time, you want to look at it from all sources. You want to keep enough history so if you're the CIO of a retailer, and your CEO says, "Are we ready for Cyber Monday, can you take a look at last year's lead up and this year's", the CIO can't look back at them and say, "I have seven days of data (chuckles), what are you talking about, last year?". You have to keep it for as long as you need to, to address business issues. But collecting the data, that's step one, right? I think that's where people struggle today, but they don't realize that you can't just collect it all and give someone a search box, or say, "go build your charts". Companies don't have data scientists to throw at these problems.
You actually have to have the analytics built in. Things that are purpose built for data center and IT operations, the machine learning models, the built-in cubes, the built-in views, visualizations that just work out of the box, and show you billions of events a day, the way you need to look at that information. That's prebuilt, that comes out of the box, that's also a key differentiator. >> Would it be fair to say that Hadoop has historically been this repository for all sorts of data, but it was a tool set, and that Splunk was the anti-Hadoop, sort of out of the box. It was an application that had some... It collected certain types of data and it had views out of the box for that data. Sounds like you're trying to take the best of each world where you have the full extensibility and visibility that you can collect with all your data in Hadoop but you've pre-built all the analytic infrastructure that you need to see your operations in context. >> I think when you look at Hadoop and Splunk, your concept of Rocana as the best of both worlds is very apt. It's a prepackaged application, it just installs. You don't have to go in under the covers and stitch everything together. It has the power of scalability that Hadoop has, it has the openness, right, 'cause you can still get at the data and do what you need with it, but you get an application that's creating value, day one. >> Okay, so maybe take us... Peel back the onion one layer, if you can go back to last year's Cyber Monday and you've got out of the box functionality, tell us how you make sense out of the data for each organization, so that the context is meaningful for them. >> Yeah, absolutely. What's interesting is that it's not a one time task, right? Every time you're trying to solve a slightly different problem, or move the business in a different direction, you want to look at data differently. So we think of this more as a toolkit that helps you navigate where to find the root cause or isolate where a particular problem is, or where you need to invest, or grow the business. In the Cyber Monday example, right, what you want to look at is, let me take a zoomed-out view, I just want to see trends over time, the months leading up or the weeks leading up to Cyber Monday. Let's look at it this year. Let's look at it last year. Let's stack on the graph everything from the edge caching, to the application, to my proxy servers to my host servers through to my network, gimme the broad view of everything, and just show me the trend lines and show me how those trend lines are deviating. Where are there unexpected patterns and behavior, and then I'm going to zoom in on those. And what's causing those, is there a new misconfiguration, did someone deploy new network infrastructure, what has caused some change? Or is it just... It's all good, people are making more money, more people are coming to the website, and it's actually a capacity issue, we just need to add more servers. So you get the step back, show me everything without a query, and then drag and drop, zoom in to isolate where there are particular issues that I need to pay attention to. >> Vellante: And this is infrastructure? >> Trajman: It's infrastructure all the way through application... >> Correct? It is? So you can do application performance management, as well? >> We don't natively do the instrumentation — there's a whole domain, which is bytecode instrumentation — we partner with companies that provide APM functionality, take that feed, and incorporate it.
Similarly, we partner with companies that do wire-level deep packet inspection. >> Vellante: I was going to say... >> Yeah, take that feed and incorporate it. Some stuff we do out of the box. NetFlow, things like IPFIX, STATSD, Syslog, log4j, right? There's a lot of stuff that everyone needs — standard interfaces that we do out of the box. And there's also pre-configured, content-oriented parsers and visualizations for an OpenStack or for Cloud Foundry or for a Blue Coat system. There's certain things that we see everywhere that we can just handle out of the box, and then there's things that are very specific to each customer. >> A lot of talk about machine learning, deep learning, AI, at this event, how do you leverage that? >> How do we fit in? It's interesting 'cause we talk about the power delivered in the product, but part of it is that it's transparent. Our users, who are actually on the console day to day or trying to use Rocana to solve problems, they're not data scientists. They don't understand the difference between analytic queries and full text search. They don't understand machine learning models. >> They're IT people, is that correct? >> They're IT folks, whose job it is to keep the lights on, right? And so, they expect the software to just do all of that. We employ the data scientists, we deliver the machine learning models. The software dynamically builds models continuously for everything it's looking at and then shows it in a manner that someone can just look at it and make sense of it. >> So it might be fair to say, maybe replay this, and if it's coming out right, most people, and even the focus of IBM's big roll out this week is, people have got their data lakes populated and they're just now beginning to experiment with the advanced analytics. You've got an application where it's already got the advanced analytics baked in to such an extent that the operator doesn't really care or need to know about it. >> So here's the caveat, people have their data lakes populated with the data they know they need to look at. And that's largely line of business driven, which is a great area to apply big data machine learning, analytics, that's where the data scientists are employed. That's why what IBM is saying makes sense. When you get to the underlying infrastructure that runs it day to day, the data lakes are not populated. >> Interviewer: Oh, okay. >> They're data puddles. They do not have the content of information, the wealth of information, and so, instead of saying, "hey, let's populate them, and then let's try to think about how to analyze them, and then let's try to think about how to get insights from them, and then let's try to think about, and then and then", how about we just have a product that does it all for you? That just shows you what to do. >> I don't want to pollute my data lake with that information, do I? >> What you want is, you want to take the business feeds that have been analyzed and you want to overlay them, so you want to send those over to probably a much larger lake, which is all the machine data underneath it. Because what you end up with, especially as people move towards more elastic environments, or the hybrid cloud environments — in those environments, if a disk fails or a machine fails it may not matter. Unless you can see an impact on top-line revenue, maybe it's fine to just leave the dead machine there and isolate it. How IT operates in those environments requires knowledge of the business in order to become more efficient.
>> You want to link the infrastructure to the value. >> Trajman: Exactly. >> You're taking feeds essentially, from the business data and that's informing prioritization. >> That's exactly right. So take as an example, Point of Sale systems. All the Point of Sale systems today, they're just PCs, they're computers, right? I have to monitor them and the infrastructure to make sure it's up and running. As a side effect, I also know the transactions. As an IT person, I not only know that a system is up, I know that it's generating the same amount of revenue, or a different amount of revenue than it did last week, or than another system is doing. So I can both isolate a problem as an IT person, right, as an operator, but I can also go to the business and say, "Hey, nothing's wrong with the system, we're not making as much money as we were, why is that", and let's have a conversation about that. So it brings IT into a conversation with the business that they've never been able to have before, using the data they've always had — data they've always had access to. >> Omer, we were talking a little before about how many more companies are starting to move big parts of their workloads into public cloud. But the notion of hybrid cloud, having a hybrid cloud strategy, is still a bit of a squishy term. >> Trajman: Yeah. (laughs) >> Help us fill in, for perhaps, those customers who are trying to figure out how to do it, where you add value and make that possible. >> Well, what's happening is the world's actually getting more complex with cloud, it's another place that I can use to cost effectively balance my workloads. We do see more people moving towards public cloud or setting up private cloud. We don't see anyone whole scale, saying "I'm shutting down everything", and "I'm going to send everything to Amazon" or "I'm going to send everything to Microsoft". Even in the public cloud, it's a multi cloud strategy. And so what you've done is, you've expanded the number of data centers. Maybe I had a half dozen data centers; now I've got a half dozen more in each of these cloud providers. It actually exacerbates the need for being able to do multi-tier monitoring. Let me monitor at full fidelity, full scale, everything that's happening in each piece of my infrastructure, aggregate the key parts of that, forward them onto something central so I can see everything that's going on in one place, but also be able to dive into the details. And that hybrid model keeps you from clogging up the pipes, it keeps you from information overload, but now you need it more than ever. >> To what extent does that actually allow you, not just to monitor, but to remediate? >> The sooner you notice that there's an issue, the sooner you can address that issue. The sooner you see how that issue impacts other systems, the more likely you are to identify the common root cause. An example is a customer that we worked with prior to Rocana, who had spent an entire weekend isolating an issue. It was a ticket that had gotten escalated, they found the root cause, it was a core system, and they looked at it and said, "Well, if that core system was actually the root cause, these other four systems should have also had issues". They went back into the ticketing system, and sure enough, there were tickets that just didn't get escalated. Had they seen all of those issues at the same time, had they been able to quickly spin up the cube view of everything, they would have found it significantly faster.
They would have drawn that commonality and seen the relationships much more quickly. It requires having all the data in the same place. >> Part of the actionable information is to help triage the tickets — in a sense, that's the connection to remediation. >> Trajman: Context is everything. >> Okay. >> So how's it going? Rocana's kind of a heavy lift. (Trajman laughs) You're going after some pretty entrenched businesses that have been used to doing things a certain way. How's business? How you guys doing? >> Business is, it's amazing, I mean, the need is so severe. We had a prospective customer we were talking to, who's just starting to think about this digital transformation initiative and what they needed from an operational visibility perspective. We connected them with an existing customer that had rolled out a system, and the new prospect looked at the existing customer, called us up and said, "That," (laughs) "that's what we want, right there". Centralized log analytics, total operational visibility — people are recognizing these are necessary to support where the business has to go, and businesses are now realizing they have to digitize everything. They have to have the same kind of experience that Amazon and Google and Facebook and everyone else has. Consumers have come to expect it. This is what is required from IT in order to support it, and so we're actually getting... You say it's a heavy lift, we're getting pulled by the market. I don't think we've had a conversation where someone hasn't said, "I need that" — that's what we're going through today, and that is my number one pain point. >> That's good. Heavy lifts are good if you've got the stomach for it. >> Trajman: That's what I do. >> If you've got a tailwind, that's fantastic. It sounds like things are going well. Omer, congratulations on the success; we really appreciate you sharing it with our Cube audience. >> Thank you very much, thanks for having me. >> You're welcome. Keep it right there everybody. We'll be back with our next guest, this is the Cube, we're live, day four from NYC. Be right back.

Published Date : Sep 30 2016
