Zachary Musgrave & Chris Gordon, Yelp | Splunk .conf 2017
>> Narrator: Live from Washington D.C., it's theCUBE. Covering .conf2017. Brought to you by Splunk. >> Well, welcome back here on theCUBE. We continue our coverage of .conf2017, we're in Washington D.C. Along with Dave Vellante, I'm John Walls. And Dave, you know what time it is, by the way? Just about? >> I don't know, this is the penultimate interview. >> It's almost five o'clock. >> Okay. >> And that means it's almost happy hour time. So I was thinking where might we go tonight, so-- >> There's an app for that. >> There was, and so I looked. It turns out that the Penny Whiskey Cafe is just two tenths of a mile from here. And you know how I knew that? >> How's the ratings on that? >> We got four. >> Four and a half with 52. >> 52 reviews? >> Yeah, I feel good about that. >> Yeah, that's pretty good. That's a substantive base. >> I feel very solid with that one. We'll make it 53 in about a half hour. Of course I found it on Yelp. We have a couple of gentlemen from Yelp with us tonight. I don't have to tell you what Yelp does, it does everything for everybody, right. Zach Musgrave, technical lead, and Chris Gordon, software engineer at Yelp. Gentlemen, thanks for being here. And you can join us, by the way, later on, at the Penny Whiskey if you'd like to. First off, what are you doing here, right, at Splunk? What's Yelp and Splunk, what's that intersection all about? Zach, if you would. >> Sure, well Yelp uses Splunk for all sorts of purposes. Operational intelligence, business metrics, pretty much any sort of analytics from event-driven data that you can really think of, Yelp has found a way, and our engineers have found a way, to get that into Splunk and derive business value from it. So Chris and I are actually here, we just gave a breakout session at .conf, talking about how we find strong business value and how we quantify that value and mutate our Splunk cluster to really drive that. >> Okay. >> So, so how do you find value then, I mean, what was? >> It's hard. 
Chris was one of the people who really, really drove this for us. And when we looked at this, you know I once had an engineer who came up to our team, we maintain Splunk amongst other things, and the engineer said can I ingest 10 terabytes of data a day into Splunk and then keep it forever? And I said, um, please don't. And then we talked a bit more about what that engineer was actually trying to do and why they needed this massive amount of data, and we found a better way that was much more efficient, one where we didn't need to keep all the data forever. So, by being able to have those conversations and to quantify with the data you're already ingesting into Splunk, being able to quantify that and actually show how many people are searching this, how it's being used, what the depth of the search looks like, how far back they're looking in time, you can really optimize your Splunk cluster to get a lot more business value than just naively setting it up and turning it on. >> So you weren't taking a brute force approach, you were smarter about that, but you weren't deduping, you were identifying the data that was not necessary to keep, did I get that right? >> Correct. Yeah, we essentially kind of identified what our highest cost-per-search logs are: we basically just totaled up how many times each log was searched, and then tried to quantify how much each log was costing us. And then this ended up being a really good metric for figuring out what we'd want to remove or something that was a candidate for dislodging the data somehow. >> So, you guys gave a talk today. We were talking off camera about pricing, that's not something you guys get involved in, but I would categorize this as sort of how do you get the most out of that asset, called Splunk, right. Is that sort of the >> Exactly. >> theme of your talk, right? >> Yeah. We talk a lot about expected value amongst our team, and in the talk we just gave. 
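The cost-per-search metric described above (total up how often each log is searched, then divide what the log costs by that count) can be sketched roughly as follows. This is a minimal illustration, not Yelp's actual implementation: the function name `cost_per_search`, the per-log cost figures, and the choice to score never-searched logs as infinitely expensive are all assumptions made for the example.

```python
from collections import Counter


def cost_per_search(cost_by_log, searched_logs):
    """Rank logs by cost per search, most expensive first.

    cost_by_log:   {log_name: cost} in any consistent unit (hypothetical input).
    searched_logs: iterable of log names, one entry per search that touched the log.
    Logs that were never searched score float("inf"), so they sort to the top
    as the strongest candidates for removal (an assumption of this sketch).
    """
    search_counts = Counter(searched_logs)
    scores = {}
    for log_name, cost in cost_by_log.items():
        searches = search_counts.get(log_name, 0)
        scores[log_name] = cost / searches if searches else float("inf")
    # Highest cost per search first.
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


# Made-up figures purely for illustration.
example_costs = {"access": 100.0, "debug": 50.0, "audit": 10.0}
example_searches = ["access"] * 50 + ["debug"]
ranking = cost_per_search(example_costs, example_searches)
```

In this toy input the cheap but never-searched `audit` log ranks above the expensive but heavily used `access` log, which is the kind of conversation starter the speakers describe.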
And we don't ever think about this as, oh do this so that you can spend less money on Splunk or on your infrastructure that's backing Splunk. Think about it more as we have this right now and we can utilize it more effectively. We can get more value out of what we already have. >> Okay, so, I wonder if we could just talk a little bit about your environment. We know you run on AWS. How does that cloud fit in with Splunk, paint a picture for us, if you would. What does it all look like? >> Yeah, so we have two clusters actually. One is the high value, high quality of service cluster, it's the larger generic, we call it generic prod, and then we have another one, where we kind of have our more verbose, maybe slightly less valuable per log cluster. And this runs on D2, which is just instance storage. And then the higher performance cluster runs all on GP2. So it's basically just SSDs. And we also do, we also have four copies of each log and we have two searchable copies of each log, so it's pretty well replicated. >> Dave: Okay, so that's how you protect the data. >> Yeah. >> Is to make copies, in what, in different zones, or? >> Yeah, we have two copies of each log in each availability zone, and then one searchable copy of each log in each availability zone. >> And you guys are cloud natives, all cloud, just out of school and graduate school. So you talked about infrastructure as code. You don't do any of that on-prem stuff, you're not like installing gear. And so it's not part of your lexicon, right? >> No. >> Okay. So I want to do a little editorial thing. Kristen Nicole, our managing editor, sent the note around today saying 101s get the best traffic on the website. So I want to do a little DevOps 101, okay. Even though it's second nature to you, and a lot of people in our audience know what it is. How do you describe DevOps? Give us the 101 on DevOps. 
>> Okay so, DevOps is a complicated thing, and occasionally you see it as like a role on like a job board or something. And that always strikes me as odd, because it's not really a role. Like it's a philosophy more so. The way that I always see it, is it used to be like pre DevOps, was the software developers make a thing, and then they throw it over the fence, and operations just picks it up. And they're like well what do we do with this, and deploy it, okay, good luck. And so this would result in a sort of an us against them mentality, where the developers aren't incentivized to really make it resilient, or really document it well, and operations and the sys admins are not incentivized to really be flexible and to be really hard charging and move quickly, because they're the ones who are going to be on call for whatever the developers made. DevOps is a we, instead of an us versus them. So for example, product teams have an on-call rotation. Operations and sys admins write code. There are still definitely specializations, but it all comes together in a much more holistic manner. >> Okay, and the ops guys will write code, as opposed to hacking code, messing up your code, throwing it back over the fence, and saying hey your code doesn't work. >> Exactly. >> And then you say well it worked when I gave it to you. And then like you said that sort of finger pointing. >> We are totally done with works on my machine, it's over. No more. >> Okay, and the benefits obviously are higher quality, faster time to market, less food fighting. >> Yup, exactly. In the old model you'd have a new deployment of like a website like maybe once a week or maybe even once a month. Yelp deploys multiple times every day over and over again. And each one of those is going to include changes from a dozen different engineers. So we need to be agile in that manner, just like with our Splunk cluster. >> I mean you guys are relatively new, four years and two years, respectively. 
But these days it's a long time. How would you describe your Splunk journey? Where did it start and where do you want to take it? >> I would say it started, you actually had Kris Wehner on here last year, and he talked a lot about it. He was the VP of engineering at SeatMe. And he kind of got Yelp onto the whole Splunk train. And at that point it was used mostly by SeatMe and everyone at Yelp was like oh this is fantastic, we want to use this. And we started basically migrating it to our VPC. And generally, we're starting to now get everything going, get all the kinks worked out, and really now we're trying to see where we can provide the most value and make things as easy as possible for our developers to add logs and add searches and get what they need out of it. >> So what kind of use cases are you envisioning, and where are you getting value out of it? >> So our operations teams get a lot of value out of it when there's some outage happening. And it's really useful for them to be able to just look at the access logs and see what's going on. And Splunk makes that very easy. And we also get a lot of value out of Yelp's application logs. Splunk has been great for figuring out when something's not right. And allowing us to dig in further. >> So yeah, at the end of the day, as consumers, what does this mean to us, ultimately? Like our searches are faster, searches are more refined, searches are more accurate? What does it mean to me at the end of the day, that you're enabling what activity through this technology? >> Dave: Yeah, it'll be more secure? >> Yeah, what does it mean? >> As an end user of Yelp? >> Yes. >> So, I'll give you one example that always sticks out in my mind. So I don't know if you all know this, but you can actually do things like order food via Yelp, you can make appointments via Yelp, even with like a dentist. You can book beauty appointments, all sorts of personal services. 
>> Hair salon came up today actually, when I was looking for a bar. >> Absolutely. That's not supposed to happen. >> Dave: Well that was the Penny Whiskey Cafe. >> You never know, but whatever's next door I don't know. >> Can you get a haircut while you drink? >> Hair salons in the District are pretty impressive. >> I wasn't planning on it, no. But anyway, I'm sorry. >> Anyway, so we work with a lot of external partners to enable all these different integrations, right. So you press start order, and then eventually you see the menu, and then you add some stuff to your cart, and then you have to pay. And so if you haven't given us your credit card information yet, then you have to enter that, and that has to go to a payment processor, the order of course has to go out to the partner who's going to fulfill your order, and so on. So there's this pipeline of many different microservices plus the main Yelp application, plus this partner who's actually fulfilling your order, plus the payment processor, and so on, and so on. And it ends up with this really complicated state machine. So the way that actually works under the hood, to be very simplistic, is there's a unique order identifier that is assigned to you when you start the order. And then that's passed through the whole process. So at every step in this process a bunch of events are emitted out of the various parts of the pipeline and into Splunk, where they're then matched to show that your order is progressing. And the order didn't get stuck. Because you know what's really sad is when you order food and it doesn't show up. So we really have to guard against that. >> Yeah, we hate that. >> Yeah, everybody does. So it's really important that we're able to unify this data, from all these different places, Splunk's really great for that, and to be able to then alert on that and page somebody and say hey, something's not quite right here, we have hungry folks. 
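The order-tracking idea described above, where every stage emits an event carrying the same unique order identifier and the events are matched to spot orders that stopped progressing, can be sketched as follows. This is only an illustration under stated assumptions: the stage names in `STAGES`, the `find_stuck_orders` helper, the `(order_id, stage, timestamp)` event shape, and the five-minute timeout are all hypothetical. In the interview the matching and alerting happen in Splunk, not in application code like this.

```python
# Hypothetical pipeline stages, in the order an order is expected to pass through them.
STAGES = ["order_started", "payment_processed", "sent_to_partner", "partner_confirmed"]


def find_stuck_orders(events, now, timeout_seconds=300):
    """Return sorted order IDs that have not reached the final stage in time.

    events: iterable of (order_id, stage, timestamp_seconds) tuples, in any order.
    now:    current time in the same units as the event timestamps.
    An order counts as stuck when its furthest stage is not the final one and
    that stage was reached more than timeout_seconds ago.
    """
    furthest = {}  # order_id -> (stage_index, timestamp)
    for order_id, stage, timestamp in events:
        stage_index = STAGES.index(stage)
        previous = furthest.get(order_id)
        if previous is None or stage_index > previous[0]:
            furthest[order_id] = (stage_index, timestamp)

    final_index = len(STAGES) - 1
    return sorted(
        order_id
        for order_id, (stage_index, timestamp) in furthest.items()
        if stage_index < final_index and now - timestamp > timeout_seconds
    )
```

Keying everything on the shared order identifier is what lets events from unrelated services be joined at all; the list returned here is the kind of result an alert could page someone on.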
>> So while I have the smartest guys that we've interviewed all week here, you mentioned, >> Please. >> You mentioned, aw shucks, I know. You mentioned state machine. Are you playing around with functional programming, so called serverless, probably don't like that word either, but what are you doing there? Are you finding sort of new applications and use cases for so called serverless? >> I would say not so much. I don't know, is anyone at Yelp doing that? >> Yeah, there's some Lambda stuff going on. Like core back end is doing that work right now. A lot of our infrastructure was actually built up before the AWS Lambdas were a thing. So we found other ways to do that, and we have this really cool internal platform as a service, it's Docker, and some scheduling stuff on top of that. So a lot of things, like it's really easy to just launch a batch job in there. And it takes away some of the need for the true serverless. >> Well the reason I ask is because people are saying a lot of the stateless IoT apps are going to use that sort of Lambda or homegrown stuff. And I'm not sure what the play is for Yelp in Internet of Things. I would imagine there's actually a play there for you guys though, and I'm curious as to the data angle, and maybe where Splunk might fit in. >> I'm certain that we're going to be using Splunk to read data from all of those different components as they're being launched. I know that there's been a couple early forays into the Lambda space that I've seen go by in code reviews and everything. But of course, with Splunk itself we can get data out of those. So as that happens, like we already have all our pipelining set up. And it'll be pretty easy for them to analyze themselves with Splunk. >> What gets you young folks excited these days? What keeps you enthralled and passionate? What do you look for? >> I don't know I think just in general anything that empowers you to get a lot done without having to fight it constantly. 
And general DevOps tools have been getting really good at that recently. And yeah, I would say anything that empowers you, gives you the feeling that you can do anything really. >> Yeah, all of the infrastructure as code stuff that's going on right now. So one of the pipelines that we use gets data out of Amazon S3, but it passes notifications through S3 event notifications to Amazon SNS, to Amazon SQS, to our Splunk forwarders. And so that's a very complicated pipeline. And you have to set it all up, it works really well, but here's the cool part. That's all defined in code. And so this means that if you set up a new integration there's a code review. And we have some verification and validation that it's correct. And furthermore, if anything goes wrong with it, we can just hit a button and it recreates itself. That's what gets me happy. When tools get in my way that's not so good. >> Well and it just leaves more time for higher value activities and that's exciting. The transformation in infrastructure over the last five years has just been mind boggling. So, thanks you guys. >> It does. It does give me a lot of pleasure when something can go catastrophically wrong, and then just like, oh wait, it's self healing, all it can take is give three plays fine. And we're all dandy. >> Well to Dave's point, while I was off camera I did a search on the two smartest guys in the room. And it said one is six feet away the other one is seven feet away, so Yelp works, I mean it really does. But thanks for the time. It's been interesting. Next generation, right? So far over us. >> Yeah, I know. It's kind of depressing, but I love it. (laughing) >> Very good, thanks guys. >> Thank you so much. >> Back with more, here on theCUBE at .conf2017. We are live, Washington D.C. >> Dave: I've kind of had it with millennial. (upbeat music)
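The "defined in code, with verification and validation" point made about the S3 to SNS to SQS to Splunk forwarder pipeline can be illustrated with a small sketch. Everything here is an assumption for the sake of the example: the `Resource` type, the `ALLOWED_HOPS` table, the `validate_pipeline` function, and the resource names are hypothetical, and the interview does not say which tooling Yelp actually uses. The sketch only shows why a pipeline described as data can be checked automatically before it reaches code review.

```python
from dataclasses import dataclass

# Hops permitted in this sketch, following the chain described in the interview:
# S3 event notification -> SNS topic -> SQS queue -> Splunk forwarder.
ALLOWED_HOPS = {
    ("s3_bucket", "sns_topic"),
    ("sns_topic", "sqs_queue"),
    ("sqs_queue", "splunk_forwarder"),
}


@dataclass(frozen=True)
class Resource:
    kind: str  # e.g. "s3_bucket"
    name: str  # hypothetical identifier, used only for readability


def validate_pipeline(resources):
    """Return a list of problems; an empty list means the definition passes."""
    problems = []
    if not resources or resources[0].kind != "s3_bucket":
        problems.append("pipeline must start at an s3_bucket")
    if not resources or resources[-1].kind != "splunk_forwarder":
        problems.append("pipeline must end at a splunk_forwarder")
    # Check every adjacent pair of resources against the allowed hops.
    for source, target in zip(resources, resources[1:]):
        if (source.kind, target.kind) not in ALLOWED_HOPS:
            problems.append(f"unsupported hop: {source.kind} -> {target.kind}")
    return problems


# A hypothetical, well-formed integration.
example_pipeline = [
    Resource("s3_bucket", "example-logs"),
    Resource("sns_topic", "example-logs-created"),
    Resource("sqs_queue", "example-logs-ingest"),
    Resource("splunk_forwarder", "forwarder-pool"),
]
```

Because the definition is plain data, the same description that is validated could also be used to recreate the resources, which is the "hit a button and it recreates itself" property the speaker highlights.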