
Search Results for Streamlio:

Lewis Kaneshiro & Karthik Ramasamy, Streamlio | Big Data SV 2018


 

(upbeat techno music) >> Narrator: Live, from San Jose, it's theCUBE! Presenting Big Data Silicon Valley, brought to you by SiliconANGLE Media and its ecosystem partners. >> Welcome back to Big Data SV, everybody. My name is Dave Vellante and this is theCUBE, the leader in live tech coverage. You know, this is our 10th big data event. When we first started covering big data, back in 2010, it was Hadoop, and everything was a batch job. About four or five years ago, everybody started talking about real time and the ability to affect outcomes before you lose the customer. Lewis Kaneshiro is here. He's the CEO of Streamlio and he's joined by Karthik Ramasamy, who's the chief product officer. They're both co-founders. Gentlemen, welcome to theCUBE. My first question is, why did you start this company? >> Sure, we came together around a vision that enterprises need to access the value around fast data. And so as you mentioned, enterprises are moving out of the slow data era and looking to bring fast data value to their data, to really deliver that back to their users or their use cases. And so, coming together around that idea of real time action, we realized that enterprises can't all access this data with projects right now that are not meant to work together, that are very difficult, perhaps, to stitch together. So what we did was create an intelligent platform for fast data that's really accessible to enterprises of all sizes. What we do is unify the core components needed to access fast data, which are messaging, compute and stream storage, using the best-of-breed open-source technology that came out of Twitter and Yahoo! >> It's a good thing, I was going to ask why the world needs another, you know, streaming platform, but Lewis kind of touched on it, 'cause it's too hard. It's too complicated, so you guys are trying to simplify all that. >> Yep, the reason mainly we wanted to simplify it is because, based on all our experiences at Twitter and Yahoo!, one of the key aspects was to simplify it so that it's consumable by regular enterprises, because companies in the position of Twitter and Yahoo! can afford the talent and the expertise in order to build these real time platforms. But when it comes to normal enterprises, they don't have access to the expertise, and there are costs that they might have to incur. So, because of that, we wanted to take these open-source projects that Twitter and Yahoo! provided, combine them, and make sure that you have a simple, easy, drag and drop kind of interface, so that it's easily consumable for any enterprise. Essentially, what we are trying to do is reduce the (mumbles) for enterprises for real time, for all enterprises. >> Dave: Yeah, enterprises will pay up... >> Yes. >> For a solution. The companies that you used to work for, they all gladly throw engineering at the problem. >> Yeah. >> Sure. >> To save time, but most organizations, they don't have the resources, and so. Okay, so how would it work prior to Streamlio? Maybe take us through sort of how a company would attack this problem, the complexities of what they have to deal with, and what life is like with you guys. >> So, the current state of the world is a fragmented solution, today. The state of the world is where you take multiple pieces of different projects and you assemble them together in formats so that you can do (mumbles) right?
So the reason why people end up doing that is each of these big data projects that people use was designed for a completely different purpose. Like messaging is one, and compute is another one, and the third one is storage. So, essentially what we have done as a company is to simplify this aspect by integrating these well-known, best-of-breed projects: for messaging we use something called Apache Pulsar, for compute we use something called Apache Heron, from Twitter, and similarly for storage, for real time storage, we use something called Apache BookKeeper, and we unify them so that, under the hood, it may be three systems, but, as a user, when you are using it, it serves or functions as a single system. So you install the system, ingest your data, express your computation, and get the results out, in one single system. >> So you've unified or converged these functions. If I understand it correctly, we were talking off camera a little bit, the team, Lewis, that you've assembled actually developed a lot of these, or hugely committed to these open-source projects, right? >> Absolutely, co-creators of each of the projects, and what that allows us to do is to really integrate, at a deep level, each project. For example, Pulsar is actually a pub/sub system that is built on BookKeeper, and BookKeeper, in our minds, is purely a best-of-breed stream storage solution. So, fast and durable storage. That storage is also used in Apache Heron to store state. So, as you can see, enterprises, rather than stitching together multiple different solutions for queuing, streaming, compute, and storage, now have one option that they can install in a very small cluster, and operationally it's very simple to scale up. We simply add nodes if you get data spikes. And what this allows is enterprises to access new and exciting use cases that really weren't possible before. For example, machine learning model deployment to real time. So I'm a data scientist, and what I found is, in data science, you spend a lot of time training models in batch mode. It's a legacy type of approach, but once the model is trained, you want to put that model into production in real time so that you can deliver that value back to a user in real time. Let's call it under a two second SLA. So, that has been a great use case for Streamlio, because we are a ready-made intelligent platform for fast data, for ML/AI deployment. >> And the use cases are typically stateful and you're persisting data, is that right? >> Yes, it can be used for stateless use cases also, but the key advantage that we bring to the table is stateful storage. And since we ship along with the storage, (mumbles) stateful storage becomes much easier, because of the fact that it can be used to store the intermediate state of the computation, or it can be used for staging (mumbles) data: when it spills over from memory, it's automatically stored to disk, or you can even keep the data for as long as you want, so that you can unlock the value later, after the data has been processed as fast data. You can access that older data later, in time. >> So give us the run-down on the company, funding, you know, VCs, head count. Give us the basics. >> Sure, we raised a Series A from Lightspeed Venture Partners, led by John Vrionis and Sudip Chakrabarti. We've raised seven and a half million and emerged from stealth back in August.
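To make the messaging piece concrete, here is a minimal sketch of the pub/sub pattern Karthik describes, using the Apache Pulsar Python client. The service URL, topic, and subscription names are hypothetical placeholders; this illustrates the open-source Pulsar API, not Streamlio's product code.

```python
# A minimal pub/sub sketch with the Apache Pulsar Python client
# (pip install pulsar-client). Service URL and topic names are hypothetical.
import pulsar

client = pulsar.Client('pulsar://localhost:6650')

# Producer side: publish a few events to a topic.
producer = client.create_producer('sensor-events')
for i in range(3):
    producer.send(('event-%d' % i).encode('utf-8'))

# Consumer side: a named subscription receives the stream.
consumer = client.subscribe('sensor-events', subscription_name='analytics-job')
for _ in range(3):
    msg = consumer.receive()
    print('Received:', msg.data().decode('utf-8'))
    consumer.acknowledge(msg)  # ack so Pulsar can track what this job consumed

client.close()
```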
That allowed us to ramp up our team to 17, now, mainly engineers, in order to really have a very solid product, but we launched post-revenue, pre-launch, and some of our customers are really looking at geo-replication across multiple data centers, and so active-active geo-replication is an open-source feature in Apache Pulsar, and that's been a huge draw compared to some other solutions that are out there. As you can see, this theme of simplifying architecture is where Streamlio sits, so unifying queuing and streaming allows us to replace a number of different legacy systems. So that's been one avenue to help growth. The other, obviously, is on the compute piece. As enterprises are finding new and exciting use cases to deliver back to their users, the compute piece needs to scale up and down. We also announced Pulsar Functions, which is stream-native compute that allows very simple function computation in native Python and Java, so you spin up the Apache Pulsar cluster or Streamlio platform, and you simply have compute functionality. That allows us to access edge use cases, so IoT is a huge, kind of exciting set of POCs for us right now, where we have connected car examples that don't need a heavyweight scheduler deployment at the edge. It's Pulsar plus Pulsar Functions. What that allows us to do are things like fraud detection, anomaly detection at the edge, model deployment at the edge, interpolation, observability, and alerts. >> And, so how do you charge for this? Is it usage based? >> Sure. What we found is enterprises are more comfortable on a per-node basis, simply because we have the ambition to really scale up and help enterprises really use Streamlio as their fast data platform across the entire enterprise. We found that having a per-data charge rate actually would limit that growth, and so per node and shared architecture. So, we took an early investment in optimizing around Kubernetes. And so, as enterprises are adopting Kubernetes, we are the most simple installation on Kubernetes, so on-prem, multicloud, at the edge. >> I love it, so I mean for years we've just been talking about the complexity headwinds in this big data space. We certainly saw that with Hadoop. You know, Spark was designed to certainly solve some of those problems, but. Sounds like you're doing some really good work to take that further. Lewis and Karthik, thank you so much for coming on theCUBE. I really appreciate it. >> Thanks for having us, Dave. >> All right, thank you for watching. We're here at Big Data SV, live from San Jose. We'll be right back. (techno music)
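Pulsar Functions, mentioned above as stream-native compute in native Python and Java, lets you register a plain class against topics. Below is a hedged sketch of what such a function can look like; the class name, threshold, and alert topic are hypothetical, while the Function interface with its process(input, context) method is the one the open-source Pulsar Python SDK exposes.

```python
# A sketch of a Pulsar Function in Python. The threshold and topic wiring are
# hypothetical; the Function/process(input, context) interface comes from the
# open-source Pulsar Python SDK.
from pulsar import Function

class AnomalyFlagger(Function):
    """Flags numeric readings that exceed a simple threshold."""

    def process(self, input, context):
        reading = float(input)
        if reading > 100.0:  # hypothetical threshold, for illustration only
            # Route suspicious readings to a side topic for alerting.
            context.publish('alerts', 'anomaly: %s' % input)
        # Returning a value emits it to the function's output topic.
        return input
```

Deployed through Pulsar's admin tooling, a function like this is wired to input and output topics and scales with the cluster, which is what makes the lightweight edge use cases described above plausible.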

Published Date : Mar 9 2018

SUMMARY :

Streamlio co-founders Lewis Kaneshiro (CEO) and Karthik Ramasamy (chief product officer) explain why they started the company: enterprises want value from fast data but struggle to stitch together fragmented open-source projects. Streamlio unifies messaging (Apache Pulsar), compute (Apache Heron), and stream storage (Apache BookKeeper) into a single platform, supports stateful use cases and real-time ML/AI model deployment under tight SLAs, charges per node rather than per data volume, and is optimized for Kubernetes on-prem, multicloud, and at the edge. The company raised a seven and a half million dollar Series A led by Lightspeed Venture Partners and emerged from stealth in August.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Dave Vellante | PERSON | 0.99+
Karthik Ramasamy | PERSON | 0.99+
Karthik | PERSON | 0.99+
Lewis Kaneshiro | PERSON | 0.99+
Dave | PERSON | 0.99+
San Jose | LOCATION | 0.99+
Lightspeed Venture Partners | ORGANIZATION | 0.99+
John Vrionis | PERSON | 0.99+
Lewis | PERSON | 0.99+
2010 | DATE | 0.99+
August | DATE | 0.99+
three systems | QUANTITY | 0.99+
Streamlio | ORGANIZATION | 0.99+
Yahoo! | ORGANIZATION | 0.99+
each | QUANTITY | 0.99+
Twitter | ORGANIZATION | 0.99+
SiliconANGLE Media | ORGANIZATION | 0.99+
Java | TITLE | 0.99+
first question | QUANTITY | 0.99+
Sudip Chakrabarti | PERSON | 0.99+
one option | QUANTITY | 0.99+
Python | TITLE | 0.99+
both | QUANTITY | 0.99+
seven and a half million | QUANTITY | 0.99+
17 | QUANTITY | 0.98+
each project | QUANTITY | 0.98+
third one | QUANTITY | 0.98+
Kubernetes | TITLE | 0.98+
single system | QUANTITY | 0.98+
first | QUANTITY | 0.96+
Pulsar | TITLE | 0.96+
Streamlio | TITLE | 0.96+
Spark | TITLE | 0.94+
BookKeeper | TITLE | 0.94+
one | QUANTITY | 0.93+
one single system | QUANTITY | 0.92+
theCUBE | ORGANIZATION | 0.91+
today | DATE | 0.91+
Big Data SV 2018 | EVENT | 0.9+
Apache | ORGANIZATION | 0.89+
Silicon Valley | LOCATION | 0.89+
SLA | TITLE | 0.89+
one avenue | QUANTITY | 0.89+
Series A | OTHER | 0.88+
five years ago | DATE | 0.86+
Big Data | EVENT | 0.85+
About four | DATE | 0.85+
Big Data SV | EVENT | 0.82+
IOT | TITLE | 0.81+
Pulsar | TITLE | 0.75+
Big Data SV | ORGANIZATION | 0.71+
10th big | QUANTITY | 0.67+
Apache Heron | TITLE | 0.65+
under two second | QUANTITY | 0.62+
data | EVENT | 0.61+
Streamlio | PERSON | 0.54+
event | QUANTITY | 0.48+
Hadoop | TITLE | 0.45+
Heron | TITLE | 0.32+

Karthik Ramasamy, Streamlio - Data Platforms 2017 - #DataPlatforms2017


 

>> Narrator: Hi from the Wigwam in Phoenix, Arizona, it is theCUBE, covering Data Platforms 2017. Brought to you by Qubole. >> Hey welcome back everybody. Jeff Frick with theCUBE. We are down at the historic Wigwam, 99 years young, just outside of Phoenix, Arizona, for Data Platforms 2017. It is really talking about a new approach to big data in the cloud, put on by Qubole, about 200 people, a very interesting conversation this morning, and we're really interested to have Karthik Ramasamy. He is the co-founder of Streamlio, which is still in stealth mode according to his LinkedIn profile, so we won't talk about that, but he's a long-time Twitter guy and really shared some great lessons this morning about things that you guys learned while growing Twitter. So welcome. >> Thank you, thanks for having me. >> Absolutely. One of the key parts of your whole talk was this concept of real time. I always joke with people, real time is in time to do something about it. You went through a bunch of examples of real time is really a variable depending on what the right application is, but at Twitter real time was super, super important. >> Yes it is indeed important because of the nature of the streaming data; the nature of the Twitter data is streaming data, because the tweets are coming at a high velocity. And Twitter positioned itself as more of a real time delivery company, because that way, whatever information we get within Twitter, we need to have a strong time budget before we can deliver it to people, so that when people consume the information, the information is live or real time. >> But real time too is becoming, obviously for Twitter, but for a lot of big enterprises, more and more important, and the great analogy I referred to before is, you used to sample data, sample historic data, to make decisions. Now you want to keep all the data in real time to make decisions, so it's a very different way you drive your decision-making process. >> Very different way of thinking. Especially considering the fact, as you said, that enterprises are getting into understanding what real time means for them, but if you look at some of the traditional enterprises like financial, they understand the value of real time. Similarly the upcoming new use cases like IoT, they understand the value of real time, like autonomous vehicles where they have to make quick decisions. Healthcare, you have to make quick decisions, because preventive and predictive maintenance is very important in those kinds of segments. So because of those segments it's getting really popular, and traditional enterprises like retail are also valuing real time, because it allows them to blend into the user behavior so that they can recommend products and other things in real time, so that people can react to that, so it's becoming more and more important. That's what I would say. >> So Hadoop started out as mostly batch infrastructure, and Twitter was a pioneer in the design pattern to accommodate both batch and real time. How has that big data infrastructure evolved so that, one, you don't have to split batch and real time, and what should we expect going forward to make that platform stronger in terms of real time analytics, and potentially so that it can inform decisions in systems of record? >> I think today, as of now, there are two different infrastructures. One is, in general, the Hadoop infrastructure. The other one is more of a real time infrastructure at this point.
And Hadoop is kind of considered as this monolithic, not monolithic, it's kind of a mega store, where all data, similar to all the rivers reaching the sea, kind of becomes a storage sea where all the data comes and stores there. But before the data comes and stores there, a lot of analytics and a lot of visibility about the data, from the point of its creation before it ends up there, is done on those rivers, whatever you call them, the data rivers, so you could get a lot of analytics done during the time before the data ends up there, so that it's more live than the other analytics. Hadoop has its own kind of limitations in terms of how much data it can handle, how real time the data can be. For example, you can kind of dump the data in real time into Hadoop, but until you close the file you cannot see the data at all. There is a time budget that gets into play there. And you could do smaller files, like small, small files writing, but the namenode will blow up, because within a day you write a million files, and the namenode is not going to sustain that. So those are the trade-offs. That's one of the reasons we had to end up building new real time infrastructure, like the distributed log that allows you, the moment the data comes in, the data is immediately visible within the three to five millisecond timeframe. >> The distributed log you're talking about would be Kafka. The output of that would be to train the model or just score a model, and then would that model essentially be carved off from this big data platform and be integrated within a system of record where it would inform decisions? >> There are multiple things you could do. First of all, with the distributed log, essentially you can think about it as a data staging environment, where the data kind of lands up, and once it lands up there, there's a lot of sharing of that same data going on in real time. When several jobs are using some popular data source, it provides a high fan-out, in the sense that 100 jobs can consume the same data, and they can be at different parts of the data itself. So that provides a nice sharing environment. Now once the data is around there, the data is being used for different kinds of analytics, and one of them could be model enhancement, because typically in the batch segment you build the model, because you're looking at a lot of data and other things; then once the model is built, that model is pre-loaded into the real time compute environment, like Heron, then you look up this model and serve data based on whatever that model tells you. For example, when you do ad serving, you look up that model for what is a relevant ad for you to click. Then the next aspect is model enhancement. Because user behavior is going to change over a period of time. Now can you capture and incrementally update the model, so that those things are also partly done on the real time side, rather than recomputing in batch again and again and again? >> Okay so it's sort of like, what's the delta? >> Karthik: Yes. >> Let's train on the delta and let's score on the delta. >> Yes, and once the delta gets updated, then when the new user behavior comes in, they can look at that new model that's being continuously enhanced, and once that enhancement is captured you know that user behavior is changing. And ads are served accordingly.
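The fan-out property Karthik describes, many independent jobs reading the same log from different positions, can be sketched with any distributed log client. Since Kafka is the log named in the conversation, here is a hedged example using the kafka-python library; the topic, group ids, and broker address are hypothetical.

```python
# Distributed-log fan-out with kafka-python (pip install kafka-python).
# Topic, group ids, and broker address are hypothetical. Each distinct
# group_id keeps its own cursor over the full stream, so 2 (or 100) jobs can
# consume the same data independently, each at its own position in the log.
from kafka import KafkaConsumer

def make_job_consumer(job_name):
    return KafkaConsumer(
        'clickstream',                      # shared source topic
        group_id=job_name,                  # distinct group => full copy of the stream
        bootstrap_servers='localhost:9092',
        auto_offset_reset='earliest',       # a new job may start from the oldest data
    )

analytics = make_job_consumer('analytics-job')
model_update = make_job_consumer('model-enhancement-job')

# Both consumers independently see every message on 'clickstream'; neither
# steals records from the other because they belong to different groups.
```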
>> Okay so now that our customers are getting serious about moving to the cloud with their big data platforms and the applications on them, have you seen a change in the patterns of the apps they're looking to build, or a change in the makeup of the platform that they want to use? >> So that depends. Typically, one disclosure is I've worked with Amazon and AWS, but within the companies that I worked for, everything is on-prem. Having said that, cloud is nice because it gives you machines on the fly whenever you need them, and it gives a bunch of tools around it where you can bootstrap it and all the various stuff. This works ideally for smaller companies and medium companies, but for the big companies, one of the things that we calculate is, cost-wise, how much is the cost that we have to pay versus doing it in-house, and there's still a huge gap, unless the cloud provider is going to provide a huge discount or whatever for the big companies to move in. So that is always a challenge that we get into, because think about it: I have 10 or 20,000 nodes of Hadoop; can I move all of them into Amazon AWS, and how much am I going to pay? Versus the cost of maintaining my own data centers and everything. I don't know the latest pricing and other things, but approximately it comes to three x in terms of cost. >> If you're using... >> Our own on-prem and the data center and all of the staffing and everything. There's a difference of, I would say, three x. >> For on-prem being higher. >> On-prem being lower. >> Lower? >> Yes. >> But that assumes then that you've got flat utilization. >> Flat utilization, but, I mean, cloud of course gives you the ability to expand out at scale and all the various things you can do; it gives an illusion of unlimited resources. But in our case, if you're provisioning so many machines, at least 50 or 60% of the machines are used for production, and the rest of them are used for staging, development, and all the various other environments, which means the total cost of those machines, even though they're only 50% utilized, you still end up saving so much that you operate at about one-third of the cost that it might be in the cloud. >> Alright Karthik, that opens up a whole can of interesting conversations that we just don't have time to jump into. So I'll give you the last word. When can we expect you to come out of stealth, or is that stealthy too? >> It is kind of, that is stealthy too. >> Okay fair enough, I don't want to put you on the spot, but thanks for stopping by and sharing your story. >> Karthik: Thanks, thanks for everything. >> Alright, he is Karthik, he is George, I'm Jeff. You're watching theCUBE. We are at the Wigwam resort just outside of Phoenix at Data Platforms 2017. We will be back after this short break. Thanks for watching.
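Karthik's three x figure is an estimate, but the utilization argument behind it is simple arithmetic. The sketch below uses purely illustrative prices and node counts to show the shape of the comparison; none of these numbers come from the interview beyond the rough three x ratio.

```python
# Back-of-the-envelope math for the cloud vs on-prem discussion above.
# All prices and node counts are illustrative assumptions, not quoted figures.
NODES = 1000
ONPREM_COST_PER_NODE_HOUR = 0.15   # hypothetical amortized hardware + staffing
CLOUD_COST_PER_NODE_HOUR = 0.45    # hypothetical on-demand price (~3x on-prem)
HOURS_PER_YEAR = 24 * 365

onprem_yearly = NODES * ONPREM_COST_PER_NODE_HOUR * HOURS_PER_YEAR
cloud_yearly = NODES * CLOUD_COST_PER_NODE_HOUR * HOURS_PER_YEAR

print(f"on-prem: ${onprem_yearly:,.0f}/yr, cloud: ${cloud_yearly:,.0f}/yr")
print(f"on-prem runs at {onprem_yearly / cloud_yearly:.2f}x of the cloud cost")

# The flat-utilization caveat: if only ~50-60% of nodes serve production and
# the rest are staging/dev, cloud elasticity could shrink the non-production
# fleet and narrow the gap, which is why the comparison assumes a fleet that
# stays provisioned either way.
```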

Published Date : May 26 2017

SUMMARY :

Karthik Ramasamy of Streamlio talks with theCUBE at Data Platforms 2017 about real time at Twitter: why streaming data demands a strict time budget, how enterprises are moving from sampled historic data to acting on all data in real time, and why batch (Hadoop) and real time infrastructures remain separate today. He describes the distributed log's three-to-five-millisecond visibility and high fan-out, incrementally updating models on deltas rather than recomputing in batch, and the cost trade-offs of cloud versus on-prem at large scale.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Karthik | PERSON | 0.99+
Karthik Ramasamy | PERSON | 0.99+
George | PERSON | 0.99+
Jeff Frick | PERSON | 0.99+
Jeff | PERSON | 0.99+
100 jobs | QUANTITY | 0.99+
Amazon | ORGANIZATION | 0.99+
50% | QUANTITY | 0.99+
Twitter | ORGANIZATION | 0.99+
AWS | ORGANIZATION | 0.99+
10 | QUANTITY | 0.99+
three | QUANTITY | 0.99+
Wigwam | LOCATION | 0.99+
LinkedIn | ORGANIZATION | 0.98+
20,000 nodes | QUANTITY | 0.98+
both | QUANTITY | 0.98+
60% | QUANTITY | 0.98+
Streamlio | ORGANIZATION | 0.98+
Phoenix, Arizona | LOCATION | 0.98+
one | QUANTITY | 0.98+
one-third | QUANTITY | 0.98+
99 years | QUANTITY | 0.98+
Hadoop | TITLE | 0.97+
about 200 people | QUANTITY | 0.97+
today | DATE | 0.96+
One | QUANTITY | 0.96+
two different infrastructures | QUANTITY | 0.96+
Qubole | PERSON | 0.96+
First | QUANTITY | 0.89+
five millisecond | QUANTITY | 0.86+
theCUBE | ORGANIZATION | 0.86+
Data Platforms | ORGANIZATION | 0.85+
this morning | DATE | 0.82+
Amazon AWS | ORGANIZATION | 0.8+
a day | QUANTITY | 0.79+
HERON | ORGANIZATION | 0.7+
least 50 | QUANTITY | 0.62+
three x | QUANTITY | 0.6+
llion | QUANTITY | 0.49+
Kafka | TITLE | 0.49+
2017 | DATE | 0.47+
phoenix | ORGANIZATION | 0.46+
2017 | EVENT | 0.42+
Data Platforms | TITLE | 0.37+
Platforms | ORGANIZATION | 0.25+

Doug Merritt, Splunk | Splunk .conf19


 

>> Announcer: Live from Las Vegas, it's theCUBE! Covering Splunk .conf19. Brought to you by Splunk. >> Okay, welcome back, everyone. This is day three of live CUBE coverage here in Las Vegas for Splunk's .conf. It's the 10th anniversary of their big customer event. I'm John Furrier, theCUBE. This is our seventh year covering, riding the wave with Splunk. From scrappy startup, to public company, massive growth, now a market leader continuing to innovate. We're here with the CEO, Doug Merritt of Splunk. Thanks for joining me, good to see you. >> Thank you for being here, thanks for having me. >> John: How ya feelin'? (laughs) >> Exhausted and energized simultaneously. (laughs) It was a fun week. >> You know, every year when we have the event we discuss Splunk's success and the loyalty of the customer base, the innovation; you guys are providing the value, you got a lot of happy customers, and you got a great ecosystem and partner network growing. You're now growing even further; every year it just gets better. This year has been a lot of big highlights: new branding, so you got that next level thing goin' on, a new platform, tweaks bringing this cohesive thing together. What's your highlights this year? I mean, there's so much goin' on, what's your highlights? >> So where you started is always my highlight of the show: being able to spend time with customers. I have never been at a company where I feel so fortunate to have the passion and the dedication and the enthusiasm and the gratitude of customers as we have here. And so that, I tell everyone at Splunk, this is similar to a holiday function for a kid for me, where the energy keeps me going all year long, so that always is number one, and then around the customers, what we've been doing with the technology architecture, the platform, and the depth and breadth of what we've been working on honestly for four plus years. It really, I think, has come together in a unique way at this show. >> Last year you had a lot of announcements that were intentional announcements; it's coming. They're coming now, they're here, they're shipping. >> They're here, they're here. >> What is some of the feedback you're hearing? Because a lot of it has a theme where, you know, we kind of pointed this out a couple of years ago, it's like a security show now, but it's not a security show, but there's a lot of security in there. What are some of the key things that have come out of the oven that people should know about that are being delivered here? >> So the core of what we're trying to communicate with Data-to-Everything is that you need a very multifaceted data platform to be able to handle the huge variety of data that we're all dealing with, and Splunk has been known and been very successful at being able to index data, messy, non-structured data, and make sense of it even though it's not structured in the index, and that's been, and still is, incredibly valuable. But we started almost four years ago on a journey of adding in stream processing before the data gets anywhere, to our index or anywhere else; it's moving all around the world, so how do you actually find that data and then begin to take advantage of it in flight? And we announced the beta of Data Stream Processor last year, but it went production this year: four years of development, a ton of patents, a 40 plus person, 50 plus person development team behind that, a lot of hard engineering, and a really elegant interface to get that there.
And then on the other end, to complement the index, data is landing all over the place, not just in our index, and we're very aware that different structures exist for different needs. A data warehouse has different properties than a relational database, which has different properties than a NoSQL column store or in-memory database, and data is only going to continue to be more dispersed. So again, four plus years ago we started on what now is Data Fabric Search, which we pre-announced in beta format last year. That went production at this show: the ability to address a distributed Splunk landscape, but more importantly, we demoed the integration with HDFS and S3 landscapes as the proof point that we've built a connector framework, so that this really is not just an incredibly high-speed, high-cardinality search processing engine, but it really is a federated search engine as well. So now we can operate on data in the stream when it's in motion. We obviously still have all the great properties of the Splunk index, and I was really excited about Splunk 8.0 and all the features in that, and we can go get data wherever it lives across a distributed Splunk environment, but increasingly across the more and more distributed data environment. >> So this is a data platform. This is absolutely a data platform, so that's very clear. So the success of platforms, in the enterprise at least, not just small and medium-sized businesses... You can have a tool and kind of look like a platform; there's some apps out there that I would point to and say, "Hey, that looks like a tool, it's really not a platform." You guys are a platform. But the success of a platform comes down to two things, ecosystem and apps, because if you're a platform that's enabling value, you got to have those. Talk about how you see the ecosystem success and the app success. Is that happening in your view? >> It is happening. We have over 2,000 apps on our Splunkbase framework, which is where any of our customers can go and download the application to help draw value out of a Palo Alto firewall, or ensure integration with a ServiceNow trouble ticketing system, and thousands of other examples that exist. And that has grown from less than 300 apps, when I first got here six years ago, to over 2,000 today. But that is still the earliest pitch in the earliest inning of the journey. Why aren't there 20,000, 200,000, two million apps out there? A piece of it is we have had to up the game on how you interface with the platform, and for us that means through a stable set of services, well-managed, well-articulated, consistently maintained services, and that's been a huge push with the core Splunk index, but it's also a big amount of work that we've been doing on everything from the separation between Phantom runbooks and playbooks and the underlying orchestration automation; it's a key component of our Stream Processor, you know, what transformations are you doing, what enrichments are you doing? That has to live separate from the underlying technology, the Kafka transport mechanism, or Kinesis, or whatever happens in the future.
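As a conceptual illustration of the stream-versus-index split described here, the sketch below filters and enriches events in flight before anything lands in a store. This is a generic illustration of the stream-processing idea, not Splunk Data Stream Processor's actual API; all field names and the lookup table are hypothetical.

```python
# Conceptual sketch: act on events while they are in motion, before they land
# in an index. Generic illustration only, not Splunk DSP's real API; all
# field names and the enrichment table are hypothetical.
import json

GEO_TABLE = {'10.0.0.1': 'us-west', '10.0.0.2': 'eu-central'}  # hypothetical lookup

def process_in_flight(raw_events):
    """Filter noise and enrich events while they are still moving."""
    for raw in raw_events:
        event = json.loads(raw)
        if event.get('severity') == 'debug':
            continue                          # drop noise before it costs storage
        event['region'] = GEO_TABLE.get(event.get('src_ip'), 'unknown')
        yield event                           # forward downstream: index, S3, alerts

stream = [
    '{"severity": "error", "src_ip": "10.0.0.1", "msg": "login failed"}',
    '{"severity": "debug", "src_ip": "10.0.0.2", "msg": "heartbeat"}',
]
for enriched in process_in_flight(stream):
    print(enriched)
```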
So that investment to make sure we've got an effective and stable set of services has been key, but then you complement that with the amazing set of partners that are out here, and making sure they're educated and enabled on how to take advantage of the platform, and then feather in things like the Splunk Ventures announcement, the Innovation Fund and Social Impact Fund, to further double down on: hey, we are here to help in every way. We're going to help with enablement, we're going to help with sell-through and marketing, and we'll help with investment. >> Yeah, I think this is smart, and I think one of the things I'll point out is the feedback we heard from customers in conversations we had here on theCUBE and in the hallway: there's a lot of great feedback on the automation, the machine learning toolkit, which is a good tell sign of the engagement level of how they're dealing with data, and this kind of speaks to data as a value... The value creation from data seems to be the theme. It's not just data for data's sake; I mean, managing data is all hard stuff, but value from the data. You mentioned the Ventures, you got a lot of tech for good stuff goin' on. You're investing in companies where they're standing up data-driven companies to solve world problems, you got other things, so you guys are adjusting. In the middle innings of the data game: platform update, business model changes. Talk about some of the consumption changes. Now you got Splunk Cloud, what's goin' on with (laughs) how you charge? How are customers consuming? What moves did you guys make there, and what's the result? >> Yeah, it's a great intro: data is awesome, but we all have data to get to decisions first and actions second. Without an action there is no point in gathering data, and so many companies have been working their tails off to digitize their landscapes. Why? Well, you want a more flexible landscape, but why the flexibility? Because there's so much data being generated that if you can get to effective decisions and then actions, that landscape can adapt very, very rapidly, which goes back to machine learning and eventual AI-type opportunities. So that is absolutely, squarely where we've been focused: translating that data into value and into actual outcomes, which is why our orchestration automation piece was so important. One of the gating factors that we felt has existed is, for the Splunk index, and it's only for the Splunk index, the pricing mechanism has been data volume, and that's a little bit contrary to the promise, which is you don't know where the value is going to be within data, and whether it's a gigabyte or whether it's a petabyte, why shouldn't you be able to put whatever data you want in to experiment? And so we came out with some updates in pricing a month and change ago that we were reiterating at the show, and we will continue to drive a, hopefully, very aggressive and clear marketing and communications framework: for people that have adjusted to the data volume metric, we're trying to make that much simpler. There's now a limited set of bands, or tiers, from 100 gigs to unlimited, so that you really get visibility on, all right, I think that I want to play with five terabytes, I know what that band looks like, and it's very liberal.
So that if you wind up with six and a half terabytes you won't be penalized, and then there's a complementary metric which I think is ultimately going to be the more long-lived metric for our infrastructurally-bound products, which is virtual CPU or virtual core. And when I think about our index, stream processing, federated search, the execution of automation, all those are basically a function of how much infrastructure you're going to throw at the problem, whether it's CPU or whether it's storage or network. So I can see a day when Splunk Enterprise and the index, and everything else at that lower level, or at that infrastructure layer, are all just a series of virtual CPUs or virtual cores. But I think both, we're offering choice, we really are customer-centric, and whether you want a more liberal data volume or whether you want to switch to an infrastructure metric, we're there, and our job is to help you understand the value translation on both of those, because all that matters is turning it into action and into doing. >> It's interesting, in the news yesterday quantum supremacy was announced. Google claims it, IBM's debating it, but quantum computing just points to the trend that more compute's coming. So this is going to be a good thing for data. You mentioned the pricing thing; this brings up a topic we've been hearing all week on theCUBE: diverse data's actually great for machine learning, great for AI. So bringing in diverse data gives you more aperture into data, and that actually helps. With the diversity comes confusion, and this is where the pricing seems to hit. You're trying to create, if I get this right, pricing that matches the needs of the diverse use of data. Is that kind of how you guys are thinkin' about it? >> Meets the needs of diverse data, and also provides a lot of clarity for people: when you get to a certain threshold, we stop charging you altogether, right? Once you get above 10s of terabytes, to 100 terabytes, just put as much data in as you want. The foundation of Splunk, going back to the first days, is we're the only technology that still exists on the index side that takes raw, non-formatted data, doesn't force you to cleanse or scrub it in any way, and then takes all that raw data and actually provides value through the way that we interact with the data with our query language. And that design architecture, I've said it for five, six years now, is completely unique in the industry. Everybody else thinks that you've got to get to the data you want to operate on, and then put it somewhere, and the way that life works is much more organic and emergent. You've got chaos happening, and then how do you find patterns and value out of that chaos? Well, that chaos winds up being pretty voluminous. So how do we help more organizations? Some of the leading organizations are at five to 10 petabytes of data per day going through the index. How do we help everybody get there? 'Cause you don't know which nugget across that petabyte or 10 petabyte set is going to be the key to solving a critical issue, so let's make it easy for you to put that data in to find those nuggets, but then once you know what the pattern is, now you're in a different world, now you're in the structured data world of metrics, or KPIs, or events, or multidimensional data that is much more curated, and by nature that's going to be more fine-grained. There's not as much volume there as there is in the raw data. >> Doug, I notice also at the event here there's a focus on verticals.
Can you comment on the strategy there, is that by design? Is there a vertical focus? >> It's definitely by design. >> Share some insight into that. >> So we launched with an IT operations focus, we wound up progressing over the years to a security operations focus, and then we're doubling down with Omnition, SignalFx, VictorOps, and now Streamlio as a new acquisition, on the DevOps and next gen app dev buying centers. As a company, in how we go to market and what we are doing with our own solutions, we stay incredibly focused on those three very technical buying centers, but we've also seen that data is data. So the data you're bringing in to solve a security problem can be used to solve a manufacturing problem, or a logistics and supply chain problem, or a customer sentiment analysis problem, and so how do you make use of that data across those different buying centers? We've set up a verticals group to seed, continue to seed, the opportunity within those different verticals. >> And that's compatible with the horizontally scalable Splunk platform. That's kind of why that exists, right? >> The overall platform that was in every keynote, starting with mine, is completely agnostic and horizontal. The solutions on top, the security operations, ITOps, and DevOps, are very specific to those users, but they're using the horizontal platform, and then you wind up walking into the Accenture booth and seeing how they've taken similar data that the SecOps teams gathered to actually provide insight on effective rail transport for DB Cargo, or effective cell tower triangulation and capacity for a major Australian cell company, or effective manufacturing and logistics supply chain optimization for a manufacturer and all their different retail distribution centers. >> Awesome, you know, I know you've talked with Jeff Frick in the past, and Stu Miniman and Dave Vellante, about user experience; I know that's something that's near and dear to your heart. You guys, it has been rumored, there's going to be some user experience work done on the onboarding for your Splunk Cloud, making it easier to get in to this new Splunk platform. What can we expect on the user experience side? (laughs) >> So, for any of you out there that want to try, we've got Splunk Investigate; that's one of the first applications on top of the fully decomposed, services-layered, stateless Splunk Cloud. Mission Control actually is a complementary other; those are the first two apps on top of that new framework. And the UI and experience that is in Splunk Investigate I think is a good example of both the ease of coming to and using the product. There's a very liberal amount of data you get for free just to experiment with Splunk Investigate, but then the onboarding experience of data is, I think, very elegant. The UI, I love the UI, it's a Jupyter-style workbook-type interface, but if you think about what investigators need: investigators need both some bread crumbs on where to start and how to end, but then they also need the ability to bring in anybody that's necessary so that you can actually swarm and attack a problem very efficiently. And so when you go back and look at why we bought VictorOps, well, it wasn't because we think that the IT alerting space is a massive space we're going to own; it's because collaboration is incredibly important to swarm incidents of any type, whether they're security incidents or manufacturing incidents.
So the facilities that VictorOps gives, allowing distributed teams and virtual teams to very quickly get to resolution, you're going to find those baked into all products like Mission Control, 'cause it's one of the key facilities that Tim talked about in his keynote: indulgent design, mobility, high collaboration, 'cause luckily people still matter, and while ML is helping all of us be more productive it isn't taking away the need for us, but how do you get us to cooperate effectively? And so our cloud-based apps, I encourage any of you out there, go try Splunk Investigate; it's a beautiful product and I think you'll be blown away by it. >> Great success on the product side, and then great success on the customer side; you got great, loyal customers. But I got to ask you about the next level Splunk. As you look at this event, what jumps out at me is the cohesiveness of the story around the platform and the apps; ecosystem's great, but the new branding, Data-to-Everything, is not product-specific, 'cause you have product leadership. This is a whole next level Splunk. What is the next level Splunk vision? >> And I love the pink and orange, and bold colors. So when I've thought about what are the issues that are some of the blockers to Splunk eventually fulfilling the destiny that we could have, the number one is awareness. Who the heck is Splunk? People have very high variance in their understanding of Splunk: log aggregation, security tool, IT tool, and what we've seen over and over is it is much more this data platform, and certainly with the announcements, it's becoming more of this data fabric or platform that can be used for anything. So how do we bring awareness to Splunk? Well, let's help create a category, and it's not up to us to create the category, it's up to all of you to create the category, but Data-to-Everything in our minds represents the power of data, and while we will continue internally to focus on those technical buying centers, everything is solvable with data. So we're trying to really reinforce the importance of data and the capabilities that something like Splunk brings. Cloud becomes a really important message, and execution, to that, 'cause it makes it so much easier for people to immediately try something and get value, but on-prem will always be important as well, 'cause data has gravity, data has risk, data has cost to move. And there are so many use cases where you would just never push data to the cloud, and it's not because we don't love cloud. If you have a factory that's producing 100 terabytes an hour in an area where you've got poor bandwidth, there's no option for a cloud connect there of high scale, so you better be able to process, make sense of, and act on that data locally. >> And you guys are great in the cloud too, and on-premise, but final word, I want to get your thoughts to end this segment. I know you got to run, thanks for your time, and congratulations on all your success. Data for good. There's a lot of tech-for-bad kind of narratives goin' on, but there's a real resurgence of tech for good. A lot of people, entrepreneurs, for-profit, for-nonprofit, are doing ventures for good. Data is a real theme. Data for good is something that you have; that's part of the Data-to-Everything. Talk about the data for good real quick. >> Yeah, we were really excited about what we've done with Splunk4Good as our nonprofit-focused entity.
The Splunk Pledge, which is a classic 1-1-1 approach, to make sure that we're able to help organizations that need the help do something meaningful within their world, and then the Splunk Social Impact Fund, which is trying to put our money where our mouth is to ensure that if funding and scarcity of funds is an issue in getting to effective outcomes, we can be there to support. At this show we've featured three awesome charities, Conservation International, NetHope, and the Global Emancipation Network, that are all trying to tackle really thorny problems, different problems in different ways, but data winds up being at the heart of one of the ways to unlock what they're trying to get done. We're really excited and proud that we're able to actually make meaningful donations to all three of those, but it is a constant theme within Splunk, and I think something that all of us, from the tech community and non-tech community, are going to have to help evangelize: with every invention and every thing that occurs in the world there is the power to take it and make a less noble execution of it, you know, there's always potential harmful activities, and then there's the power to actually drive good, and data is one of those. >> Awesome. >> Data can be used as a weapon, it can be used negatively, but it also needs to be liberated so that it can be used positively. While we're all kind of concerned about our own privacy and really, really personal data, we're not going to get to the type of healthcare and genetic, massive shifts in changes and benefits without having a way to begin to share some of this data. So putting controls around data is going to be important, putting people in the middle of the process to decide what happens to their data, and some consequences around misuse of data are going to be important. But continuing to keep a mindset of all good happens as we become more liberal; globalization is good, free flow of good-- >> The value is in the data. >> Free flow of people, free flow of data ultimately is very good. >> Doug, thank you so much for spending the time to come on theCUBE, and again congratulations on great culture. Also worth noting, just to give you a plug here, because it's, I think, very valuable: one of the best places to work for women in tech. You guys recently got some recognition on that. That is a huge accomplishment, congratulations. >> Thank you, thank you, we had a great diversity track here, which is really important as well. But we love partnering with you guys, thank you for spending an entire week with us and for helping to continue to evangelize and help people understand what the power of technology and data can do for them. >> Hey, video is data, and we're bringin' that data to you here on theCUBE, and of course, CUBE cloud coming soon. I'm John Furrier here live at Splunk .conf with Doug Merritt, the CEO. We'll be back with more coverage after this short break. (futuristic music)

Published Date : Oct 24 2019

SUMMARY :

Splunk CEO Doug Merritt joins theCUBE at .conf19 to discuss the Data-to-Everything platform: Data Stream Processor and Data Fabric Search going production, the Splunkbase app ecosystem and stable service interfaces, new pricing bands and infrastructure-based metrics, vertical solutions built on a horizontal platform, user experience in Splunk Investigate and Mission Control, and Splunk's data-for-good programs including Splunk4Good, the Splunk Pledge, and the Social Impact Fund.

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
Doug | PERSON | 0.99+
Doug Merritt | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
NetHope | ORGANIZATION | 0.99+
Jeff Frick | PERSON | 0.99+
five | QUANTITY | 0.99+
John Furrier | PERSON | 0.99+
Tim | PERSON | 0.99+
100 gigs | QUANTITY | 0.99+
IBM | ORGANIZATION | 0.99+
last year | DATE | 0.99+
John | PERSON | 0.99+
Stu Miniman | PERSON | 0.99+
Last year | DATE | 0.99+
Conservation International | ORGANIZATION | 0.99+
Las Vegas | LOCATION | 0.99+
less than 300 apps | QUANTITY | 0.99+
thousands | QUANTITY | 0.99+
four years | QUANTITY | 0.99+
100 terabytes | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
Global Emancipation Network | ORGANIZATION | 0.99+
Splunk | ORGANIZATION | 0.99+
both | QUANTITY | 0.99+
yesterday | DATE | 0.99+
this year | DATE | 0.99+
six years | QUANTITY | 0.99+
Streamlio | ORGANIZATION | 0.99+
Omnition | ORGANIZATION | 0.99+
six and a half terabytes | QUANTITY | 0.99+
Splunk4Good | ORGANIZATION | 0.99+
SignalFx | ORGANIZATION | 0.99+
five terabytes | QUANTITY | 0.99+
10 years | QUANTITY | 0.99+
four plus years | QUANTITY | 0.99+
over 2,000 apps | QUANTITY | 0.99+
VictorOps | ORGANIZATION | 0.99+
four plus years ago | DATE | 0.99+
One | QUANTITY | 0.98+
first data | QUANTITY | 0.98+
10 petabytes | QUANTITY | 0.98+
seventh year | QUANTITY | 0.98+
six years ago | DATE | 0.98+
10 petabyte | QUANTITY | 0.98+
Splunk Ventures | ORGANIZATION | 0.98+
50 plus person | QUANTITY | 0.98+
first two apps | QUANTITY | 0.98+
20,000, 200,000, two million apps | QUANTITY | 0.98+
over 2,000 | QUANTITY | 0.97+
a ton of patents | QUANTITY | 0.97+
three | QUANTITY | 0.97+
one | QUANTITY | 0.97+
two things | QUANTITY | 0.97+
40 plus person | QUANTITY | 0.96+
today | DATE | 0.96+
Splunk 8.0 | TITLE | 0.96+
first | QUANTITY | 0.95+
four years ago | DATE | 0.95+
Splunk Investigate | TITLE | 0.95+
couple of years ago | DATE | 0.95+
first applications | QUANTITY | 0.94+
This year | DATE | 0.94+
above 10s of terabytes | QUANTITY | 0.93+
Splunk | TITLE | 0.93+
Ventures | ORGANIZATION | 0.91+
Palo Alto | LOCATION | 0.88+
Splunk Cloud | TITLE | 0.87+
three very technical buying centers | QUANTITY | 0.87+
NoSQL | TITLE | 0.87+
an hour | QUANTITY | 0.87+
second | QUANTITY | 0.85+