John Hart, Scalyr | Scalyr Innovation Day 2019
(upbeat music)
>> From San Mateo, it's theCUBE, covering Scalyr Innovation Day, brought to you by Scalyr.
>> Hello and welcome to the special Cube Innovation Day here in Silicon Valley in San Mateo, California, at Scalyr's headquarters. I'm John Furrier, host of theCUBE. John Hart's the Tech Lead, Back End Engineering here at Scalyr. Thanks for having us.
>> Thanks for having me, John.
>> So what's the secret sauce at Scalyr? You guys have a unique differentiator, as we have covered with some of your peers, and the founders are all talking about it. But you guys have a unique secret sauce. Take a minute to explain that.
>> I think, yeah, it's a few different things. First of all, you've got just the design level, which is we don't use keyword indexes. So that's a big one right there off the top. On top of that, you've got a couple of different implementation paths. We've got our own custom-written data store. So we're able to really control, all the way down to the bytes on disk, how we lay things out, optimize for speed. We have a novel kind of scatter-gather approach for fanning out a query, to make sure we can get all of our nodes involved as quickly as possible. Then, finally, and this is just kind of being smart, we have a time series database for repetitive queries, and that's on demand. You don't have to do anything, but we're going to speed up your queries in the background if we know it's a good idea.
>> Talk about the time series. I think that's interesting because that comes into play. We hear about real time a lot. We talk a lot about how, in cybersecurity, time series has been beneficial. Where does time series fit for you guys in here?
>> That's a good question. I think one of the big differences with Scalyr versus other uses of a time series database is that with Scalyr you're outputting your logs, and there's all kinds of information in your logs.
Some of that might be a good thing to put in a time series database, but I think with a lot of other products, you would have to decide that ahead of time. Like, hey, let's get this metric into the database. With Scalyr, the moment you have anything in your logs that you might want to put into a time series, you just start querying it. You put it in a dashboard (snaps), you've got a time series. So we're going to back-propagate that for everything you've already given us. So all of those queries are fast from there on out.
>> So it's built in from the beginning.
>> Exactly, and you don't have to do anything. It's just on demand.
>> So keywords have been what other people have used for years. That's been standard for these log management software packages, and indexes. Indexes can slow things down. We've got a tutorial on that. Why is it that those two areas haven't been innovated in a while? Is it that people just haven't figured it out, and you guys have first? What's the differentiation for you guys? Why'd you guys get there?
>> I think the main reason is that log data is just fundamentally different from most other things that you might use a database for. There's a couple of different reasons for that. So with log data, you're not in control of it. You can't design it. You know, an index is great if you're making a relational database. You've got control of your columns. You know what you're going to join on. You know what you want to index. Nobody designs their logs like they design their database tables. It's just a bunch of stuff. It's from systems you don't control. It's changing all the time. So just the number of distinct fields that you would have to index is really, really high. So if your system depends on indexing for good performance, you're going to have to make a lot of indexes. And indexes, of course, they're write-amplifying. If you've got one gigabyte of raw data, then you've got to put five or six hundred indexes on top of it.
You're going to have five or ten gigabytes of raw plus index data. That means you've got to do a lot more IO, and at the end of the day, how much you have to read from disk determines how fast your query's going to be.
>> So, in essence, indexes create a lot of overhead that you shouldn't even need, because of the nature of log files.
>> Because of the nature of log data, it's overhead that doesn't serve log data very well, yeah.
>> And what about the log data that's changing? 'Cause one of the things we're seeing, Internet of Things, more connected devices, imagine the Teslas that are going to be connecting in, with all their data.
>> Right.
>> All this stuff, cameras. You've got a huge amount of new kinds of data. Up, down, status. This is going to be a tsunami of new types of log data.
>> Yeah, and none of it are you going to have a ton of control over. Right, it's going to be changing a ton. Maybe you've got 20 different versions of devices out there that are all sending you different versions of logs. You've got to be able to handle all of it. So you want a system that is adaptive to your needs as they come up, as opposed to something you have to plan out with indexes ahead of time.
>> So if someone asks you, say, you guys say you're faster. Why? Is that true? Is the statement true that you're faster than others, and if so, why?
>> It is true. (laughs) And that really comes down to the secret sauce. The brute force, the key to brute force, and I think we've talked about this a little bit today, is you've got to bring a lot of force, as quickly as you possibly can. And we do that. We've got a lot of custom code. We're not using off-the-shelf components. We're trying to get that time as quick as we can. So I think our median performance is still better than 100 milliseconds. That might be for a query that's talking to two or three hundred machines, or maybe even more. All of which, to get, maybe it's going to scan a terabyte of data.
All of that is going to come back within 100 milliseconds. It's extremely fast.
>> Talk about why log data is different from other data types, for folks that are in these cloud native environments. Their time is precious. They are looking at a lot of different data. How is log data different?
>> I think the fact that it's dynamic in terms of what's coming out is something new. It changes so rapidly. The other really big thing too is the way you query it changes from day to day. Most of the time you're going to your logs, you're trying to troubleshoot a problem. Today's problems are different than yesterday's problems. So every time you go in, you're using it in a different way. So it has to be very fast. It has to be exploratory. And that's one of the big things about Scalyr's speed: it enables this really exploratory approach. You can kind of move through the data quickly, as opposed to making a query, getting a cup of coffee, waiting for the query, and then deciding what you're going to do next. I'm kind of dating myself here, but it's like the first time you ever used Google. You're like, "Whoa, how did that happen?" That's what it's like the first time you use Scalyr.
>> And you guys have a unique architecture, we talked about that. You guys have certain speeds. But it's not just the query speed. It's the time it takes to do the query. So you factor in a much bigger perspective than if someone has to build a query and then it takes 15 minutes.
>> Right.
>> Game's over.
>> Yeah, and instead you're just clicking on things. We're trying to make it very easy for you to move from, oh, here's an alert. Well, here are the log files that caused that alert. Oh, what's the thread stack for that particular lock? Oh, I can go and look at everything else that happened in that thread. That's five or 10 seconds of Scalyr, tops.
>> You guys have a unique engineering culture that targets engineers, products built by engineers, for engineers.
>> Yep.
>> Great story.
And it's real, and you guys are building it every day. What is the engineer's threshold of pain when it comes to log data? Have you seen any anecdotal, I mean, 'cause engineers that are in this space, they need access to it. There's SLAs now tied to it. People are sharing data. There's all kinds of new ways, reasons why you need to have the Scalyr solution. But what's the pain point for most people to tolerate an inferior solution?
>> Well, for me, I actually have an answer for this. Right, because before I was a Scalyr employee, I was a Scalyr customer, and before I was a Scalyr customer, I was a Splunk customer. I used Splunk for about five years before I think Scalyr even necessarily existed, and I was really happy with it because I needed it. Right? I had my own company. We were generating tons of logs. My support guys needed to use those logs. And, prior to using something like a Splunk, I was SSHing into servers to check the log files, which is, of course, not scalable. So I was really happy that the product, as an idea, existed, but it just kept gnawing at us. You know, every time we would query, sometimes it would be fast, sometimes it would be really slow. Sometimes the results would be down because an indexing server was down. It was just...
>> You mean the Splunk solution?
>> Yeah, the Splunk solution. Yeah, it was just extremely painful. So I read, actually, one of the blog posts written by Steve Newman and thought, that's a great idea. That is how you should attack this problem. No indexes. Brute force. All the flexibility you get from that. I loved it, and then I forgot about it for like six months. (laughs) Because I was busy, right. But then six months later I was really frustrated again with Splunk again being really, really slow, and I thought, what was the name of that company again? I looked them up. I installed it. And within, certainly within a day, I was blown away by the performance.
Within a week, I had uninstalled Scalyr, excuse me, Splunk, from every single one of my servers and switched to Scalyr instead.
>> And you're happy with that? Does it work for you? Came to join the company?
>> Yeah, exactly. In kind of conversations with the support team here, I was one of their early customers to use Windows, so I had a lot of questions, they had questions for me, how did I get it working, it wasn't a supported platform. And all of my emails were responded to by two guys named Steve. So I figured that was probably the support team. Pretty funny, they've got a support team of two people, both named Steve. And then at one point, in one email, Steve Newman said to me, "You may have realized there's only two of us here." And that's when I kind of went, "Oh wait, so there's two people total." And two guys, I assumed, in a basement. They weren't in a basement, but I assumed they were in a basement. They had software that was way better for my needs than Splunk, which at the time was worth probably eight, ten billion dollars. It's a public company. Thousands of engineers. So that's when I thought, "Huh. When I get a chance, maybe I should go work with these guys."
>> You know, it's interesting. Maybe create a new category, brute force as a service.
>> Yeah.
>> This is what they're doing. They're bringing in the right tool at the right time.
>> Yep.
>> For the right problem, for speed, and to solve the problem, no?
>> Yeah.
>> They care how it gets done.
>> Get as much data as you can and get that answer back as quickly as you can.
>> So this is the big challenge. Final question for you is, obviously, you know, a lot of people we talk to in the DevOps world, they're really fickle. On one hand, they'll try anything. If they like it, they'll stay with it. But if they don't, you'll know about it. Where's the value point for people to start thinking about Scalyr? Is it ingest to value, ingesting is one part, that's kind of a trial.
Where does the value immediately come in? Where do you see, what's the first sign of value, once the ingestion happens?
>> So part of it is this: the period of time from the ingestion to the time you're querying on it is very, very short. So you've got a real-time view of what's happening on your servers, not a five-minutes-ago view. That by itself can pay for it right there. If you're a DevOps person and you've got some alarm pinging, if that alarm is from 10 minutes ago, that means your customers are already annoyed. If you're going to have to wait another 10 minutes just to even see what's happening, you've got a really big problem, right. So being able to have the alarm, and you know that's triggering on something that happened a second or two ago, and then immediately being able to dive in with no interruption to your workflow, no reason not to dive in, that's a pretty big one right there.
>> So pretty immediate impact.
>> Yeah.
>> So okay, for people that don't know Scalyr, what should they know about Scalyr as a company, from a value proposition, as a former customer, now key employee in the back end and engineering? What are the key things they should know about?
>> So speed, we keep talking about it, right? We have a really, really good cost basis. Because we're not making those indexes, we don't have to store as much data. It's just generally cheaper for it to run. Right, so we actually have a really good cost point. And we get you from the alerts. You don't have to decide stuff ahead of time. You can do it all on the fly, ad hoc. We get you from the alerts to your answers as quickly as we possibly can. That's pretty good.
>> Every culture has its own unique kind of feature. What's Scalyr's culture here? I mean, Intel was Moore's law, Cadence was Moore's law. What's the culture here at Scalyr like?
>> That's a good question. I guess I would say I'm just tremendously proud to be working with these engineers. Right?
We're all here because we want to get better and we want to work on really, really hard problems, writing our own code, not just running and kind of patching together open source systems that already exist. We want to be doing something cutting edge. So that's, I would say, the biggest one.
>> And big problems behind that, you've got AI right around the corner. Applying AI is going to be a natural extension.
>> Yeah, 'cause we've got the data. And can deal with the data.
>> Ciao, thanks for the insight. Appreciate it.
>> Thank you. Good talking to you.
>> John Furrier here. Innovation Day with theCUBE here in Silicon Valley in San Mateo, at Scalyr's headquarters. I'm John Furrier. Thanks for watching. (upbeat music)
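The approach Hart describes in the interview, a brute-force scan with no keyword indexes, fanned out to many nodes with a scatter-gather step, can be illustrated with a minimal sketch. This is not Scalyr's code: the shard layout, the names `scan_shard` and `scatter_gather`, and the use of a thread pool in place of real machines are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_shard(lines, needle):
    """Brute-force scan: test every line for the substring, no index lookup."""
    return [line for line in lines if needle in line]


def scatter_gather(shards, needle, max_workers=8):
    """Scatter the query to every shard in parallel, then gather the hits.

    Each shard stands in for one node's slice of the log data (hypothetical).
    Results are merged in shard order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = pool.map(scan_shard, shards, [needle] * len(shards))
        return [hit for partial in partials for hit in partial]


# Toy data: three "nodes", each holding a few raw log lines.
SHARDS = [
    ["GET /home 200", "GET /cart 500 timeout"],
    ["POST /login 200", "GET /cart 200"],
    ["GET /cart 500 timeout", "GET /home 200"],
]

matches = scatter_gather(SHARDS, "500")
print(matches)
```

Because nothing is indexed ahead of time, any substring can be queried without advance planning; the cost is that every query reads all the data, which is why the fan-out across many workers matters.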