Erez Berkner, Lumigo & Kevin O'Neill, Flex | AWS Startup Showcase
(upbeat music) >> Welcome to theCUBE and our Q3 AWS Startup Showcase. I'm Lisa Martin. I've got two guests here with me. Erez Berkner is back, the Co-Founder and CEO of Lumigo. Hey, Erez, good to see you. >> Hey, Lisa, great to be here again. >> And Kevin O'Neill, the CTO at Flex, is here as well. Kevin, welcome. >> Hi, Lisa, nice to meet you. >> Likewise. We're going to give the audience an overview of Lumigo and Flex. Let's go ahead, Erez, and start with you. Talk to us about Lumigo, and I think you have a slide to pull up to walk us through? >> Yeah, I have a couple, so, great to be here again. And just as an overview, Lumigo is a serverless monitoring and debugging platform. Basically allowing the user, the developer, to get an end-to-end view of every transaction in his cloud. It's basically distributed tracing that allows you, on the one hand, to monitor, to see a visual representation of your transaction, but also allows you to drill down and debug the failure to get to the root cause. So essentially, once you have the visualization, and if we'll move to the next slide, you can actually click and drill down and see all the relevant debug information like environment variables, stack traces, inputs, outputs, and so on and so forth. And by that, understanding the root cause. And sometimes those root causes of the problems are not just errors, they are latencies, they are hiccups. And for that, we can see on the next slide, where Lumigo allows you to see where do you spend your time? Where are the hiccups in your system? What's running in parallel to what in the same transaction, where you can optimize. And that's the essence of what Lumigo provides in a distributed environment, focusing on serverless. >> Got it, focusing on serverless, we'll dig into that in a second. Kevin, give us an overview of Flex. You're a customer of Lumigo? >> We are indeed. So Flex is a bill smoothing platform. 
We help people pay their rent and other bills. In these times of uncertainty and cashflow, the first of the month for your rent, it's a big bill. Being able to split that up into multiple payments is a lot easier. And when we entered the market, you were looking at a place where people were using things like payday loans, which are just ridiculous and really hurt people in the long term. So we wanted to come in with something that is a little more equitable, a little fairer, and help people who can well afford their rent. They just can't afford it on the first, right? And so we started with rent, and now we cover all the bills like utilities and things like that. >> What a great use case, and I can't even imagine, Kevin, in the last year and a half, how helpful that's been as the world has been so dynamic. So talk to me a little bit about what you were doing before Lumigo, and then we'll get into why you went the serverless route. >> Right, so I came to Flex to help them out with some problems that we were having as our servers were scaling up. Obviously, when the business hit, it went from zero to 100 miles an hour so quickly. And so I came in to help sort out some of the growing issues. And so when I started looking at that, we were three developers and didn't want to spend time on ops, didn't want to spend time on all of the things that you have to do just to be in business, right? And it's really expensive in the technical space. If you get into something like Kubernetes or things like that, you spend a lot of time building that infrastructure, making sure, and that's really minimal value to your business. It's there for reliability, but it doesn't really focus in on the thing that is important to you. So we wanted to build something that minimized that. We talk about DevOps, we wanted ops zero, right? So DevOps is a really nice practice, but having people in that role, it seems like you're still doing ops, right? 
You've still got people who are doing those things, and we wanted to kind of eliminate that. So I had some experience with serverless before joining Flex. I thought we'll run up a few things and spike up a few things. When you come out of environments like Kubernetes or your more traditional EC2-type infrastructure, you lose some things. And one of the big things you lose is platforms of visibility. So things like OpenTrace and Datadog, and things like that, that do these jobs of telling you what's going on in your infrastructure. You've got fairly complex infrastructure going on, lots of things happening. And so, we initially started with what was available on the platforms, right? So we started with your CloudWatch logs and New Relics, right? Which got us somewhere. But as soon as we started to get into more complex scenarios where we're talking across multiple hops, so through SQS and then through EventBridge and Dynamo, it was very difficult to be able to retrace a piece of information. And that's when we started looking around for solutions. We looked at the big traditional players, the Datadogs, the New Relics and people like that. And then the serverless-specific players, and we ended up landing on Lumigo, and I couldn't have been happier with the results. From day one, I was getting results. >> That's great, I want to talk about that too, especially as you say, we wanted to be able to focus on our core competencies and not spend time and resources that we didn't have in areas where we could actually outsource. So I want to go back to Erez. Talk to me about some of the challenges that Kevin articulated. Are those common across the board, across industries that Lumigo sees? 
>> Yeah, I think the main things when we met Kevin were about visibility and about the ability to zoom out, see the bigger picture, and when something actually fails or is about to fail in production, being able to drill down to understand what happened, what is the root cause, and go ahead and fix it instead of going through different CloudWatch logs and log groups and connecting the dots manually. And that's one of the most common challenges when enterprises, when software engineers are heading toward serverless, toward managed services. So, definitely we hear that with many of our customers. >> So Kevin, talk about the infrastructure that you've set up with serverless and go through some of the main benefits that Flex is getting. >> Right, so look, the day one thing, of course, is the number of people we need doing operations as we've grown is next to nothing, right? We are able to create in that, we all want this independence of execution, right? So as you scale, I think there's two ways really to scale a system, right? You can build a monolith and shard it, and that works really, really well, right? You can just build something that just holds a ton of data and everything seems connected when you release it all in one place. Or you build something that's a little more distributed and relies on asynchronous interactions effectively everywhere but the edges. Both of those things scale. The middle ground doesn't scale, right? That middle ground of synchronous systems talking to synchronous systems, at some point, your latency is the sum of all the things you're talking to, right? So doing anything in a quick way is not possible. So when we started to look at things like, I'm sorry, so the other challenge is things like logging and understanding what's happening in your system. Logging is one of those things where you never have the thing logged that you're interested in, right? 
You put in whatever logging you like, but the thing you need will always be missing, which is why we've always taken a tracing approach, right? When you use something like Lumigo or an OpenTrace, you don't sit there and say, "Hey, log this specifically," you log the information that's moving through the system. At that point, you can then look at what's happening specifically. So again, the biggest challenge for us is that we run 1,500 Lambdas, right? We run 600 queues. There's a lot of information. We use EventBridge, we use Dynamo, we use RDS, we've got information spread out. We move stuff out to third-party vendors, we're talking out to, say, guys like Stripe and co, and we're making calls out to those. And we want to understand, when we've made those calls, what's the latency on those calls. And for a given interaction, it might touch 20 or 30 of those components. And so for us, the ability to say, "Hey, I want to know why this failed right down here." We need to actually look through everywhere, explain, and understand how it's complex, right? Where did this piece of data that was wrong come from? And so, yeah, which is difficult in a distributed environment where your infrastructure is so much a part of somebody else's systems. You don't have direct access to the systems. You've only got the side effects of the system. >> Right, so talk to me in that distributed environment, Kevin, how does Lumigo help to improve that? Especially as we're talking about payments and billing and sensitive financial information. >> Right, so in a couple of ways. The nice part about Lumigo is I really don't have to do much in order for it to just do its thing, right? This comes back to that philosophy of zero ops, right? Zero effort. I don't want to be concentrating on how I build my tracing infrastructure, right? I just want it to work. I want it to work out of the box. When something happens, I want it to have happened. 
So Lumigo, when I looked at it, when I was looking at the platforms, the integration's so straightforward. Of course, integration being straightforward is kind of useless if it doesn't actually give you the information you want. And we had a challenge initially, which was, we use a lot of EventBridge, and of course, nothing traced through EventBridge until, I mentioned this to Erez and co, and said, "Hey guys, we really need to trace through EventBridge," and a little while later, we were tracing through EventBridge, which was fantastic. And because I would say 70% of our transactions involve something that goes through EventBridge, the other thing there. We're also, from an architectural standpoint, what's known as an event-sourced system. So we derive the state of the information from the things that have occurred rather than a current snapshot of what something looks like, right? So rather than you being Lisa with a particular phone number and particular email address stored in a database as a record, you are "Lisa changed her phone number, Lisa changed her email address." And then we take that sequence of things and create a current view of Lisa. So that also helps us with ordering, right? And at those lower levels, we can do a lot of our security. We can do a lot of our encryption. We can say that this particular piece of information, for example, a social security number, is encrypted and is never available as plain text. And you need the keys to be able to unlock that particular piece of information. So we can do a lot of that at a lower level of infrastructure, but that does generate a lot of movement of information. >> Right. >> And if you can't trace that movement of information, you're in a hurting place. 
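[Editor's sketch, not part of the interview: the event-sourcing idea Kevin describes, deriving a current view by replaying what happened rather than storing a snapshot, might look roughly like the following. This is a minimal illustration under stated assumptions; the event names, the `Event` and `CustomerView` types, and the `replay` function are all hypothetical and say nothing about how Flex actually implements it.]

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Event:
    """One thing that happened. Field and kind names are hypothetical."""
    kind: str   # e.g. "phone_changed" or "email_changed"
    value: str


@dataclass
class CustomerView:
    """The current view, derived from events rather than stored as a record."""
    phone: Optional[str] = None
    email: Optional[str] = None


def replay(events: Iterable[Event]) -> CustomerView:
    """Fold an ordered sequence of events into the current view."""
    view = CustomerView()
    for event in events:
        if event.kind == "phone_changed":
            view.phone = event.value
        elif event.kind == "email_changed":
            view.email = event.value
        # Unknown event kinds are ignored in this sketch.
    return view


# "Lisa changed her phone number, Lisa changed her email address."
events = [
    Event("phone_changed", "555-0100"),
    Event("email_changed", "lisa@example.com"),
    Event("phone_changed", "555-0199"),
]
current = replay(events)
print(current)
```

[Because later events overwrite earlier ones, the order of the sequence determines the view, which is the ordering point Kevin raises. It also shows why such a system moves a lot of information around: every change is its own event.]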
>> So Erez, we just got a great testimonial from Kevin on how Lumigo's really fundamental to their environment and what they're able to deliver to customers, and also Kevin talked about, it sounds like, some of the collaboration that went on to help get that EventBridge support. Talk to me, Erez, about the collaborative partnership that you have with Flex. >> Yeah, so I think that it's more of a, I would say, a philosophy of customers, the users come first. So this is what we're really trying to be about. We always try to make sure there's an open communication with all of our customers, and for us the customer is key and the user is key, not even just a customer. And this is why we try to accommodate the different requests. Specifically on this event, this was actually a while after AWS released the service, and due to the partnership that we have with AWS, we were able to get this supported relatively fast and be first to market supporting EventBridge, and connecting the dots around it. So that's one of the things that we really, really focused on. >> Kevin, back to you, how do you quantify the ROI of what Lumigo is delivering to Flex? >> That's a really good question. And Erez and I have talked about this a few times, because the simple fact is if I add up the numbers, it costs me more to trace than it does to execute. But if I look at the slightly bigger picture, I also don't have ops staff, right? And I also have an ability to look at things very quickly. The service cost is nothing compared to what I would need if I was running my own tracing through OpenTrace with my own database, monitoring, the staff to support those things. But the management of those things, the configuration of those things, the multiple touchpoints I'd need for those things, they're not the simple thing. 
So, if you look at a raw cost, you go, oh man, that part is actually more than my execution costs, at least certainly in the early days, but when I look at the entire cost of what it takes to watch, manage, and trace a system, it's a really easy sum, right? And a lot of these things don't pay off until something goes wrong. Now we're heavy users of EventBridge. EventBridge has had two incidents in USA in the last six months, right? And we were able to see through our traffic that was going through EventBridge that the slowdown was occurring in EventBridge. In fact, we were seeing that before it was alerted in the AWS dashboards, to say, "Hey, EventBridge is having problems." Like, we watch all their alerts, but we were seeing it an hour before, leading into that, and saying, "Hey, there's something going wrong here." Right? Because we were seeing delays in the system. So things like that give you an opportunity to adjust, right? You can't do it. You're not going to be able to get everything off of EventBridge for that period. But at least I can talk to the business and say, "Hey, we're having an impact here, and this is what's going on. We don't think it's our systems, we think it's actually something external. We can see the traces, we see it going in, we see it coming out, it's a 20 minute delay." >> There's a huge amount of value in that, sorry, Kevin, in that visibility alone, as you said, and maybe even some cost avoidance is there. If you're seeing something going wrong, you maybe can pivot and adjust as needed. But without that visibility, you don't have that. There's a lot of potential loss. >> Yeah, and it's one of those things that doesn't pay for itself until it pays for itself, right? It's like insurance, you don't need insurance until you need insurance. These sort of things, people look at these things and go, "Ah, what am I getting from it day to day?" And day to day, I'll use Lumigo, right? 
When I'm developing now, Lumigo is part of my development process, in that I use it to make sure the information is flowing in the way I expect it to, right? Which wasn't what I expected to be able to do with it, right? It wasn't even a plan or anything I intended to use it for, but day to day now, when I buy something off, one of the checks I go through when I'm debugging or when I'm looking at a problem, especially a distributed problem, is what went through Lumigo. What happened here, here and here, and why did that happen in response to this? So, these things are, again, it's that insurance thing. You don't need it until you need it, and when you need it, you're so glad you've got it. >> Right, exactly. >> Actually it's already said, I have a question because, yeah, I think that it's clear on that part. And how did this, did it change the developers' work at Flex? Do you feel different on that part? >> I think it's down to individual developers, how they use the different tools, just like individual developers use different tools. I tend to, and a couple of people that I work closely with tend to use these tools in this way. Probably we're the more advanced users of serverless in general inside the organization. So we were more aware of these weird little things that occur and just the double-checks you want to do. But I feel like when I don't have something like Lumigo in place, it's very hard for me to understand, did everything happen? I can write my acceptance tests, but I want to make sure that, testing is a really fun art, right? And it's picking my cabinets nice and easy, and you can run all these formulas to do things, it's just not right, and there's just too many, especially in the distributed space, too many cases where things look odd, things look strange, you've got weird edge cases. We get new timeouts in Dynamo. We hit the 100,000 limit threshold on Dynamo, right? 
In production, that was really interesting because it meant we needed to do some additional things. >> Lisa: Kevin, oh, go ahead. >> Go ahead, no, go ahead, Lisa. >> I was just going to ask you, I'd love to get your perspective. It sounds like you looked at other technologies, and there were some clear benefits and differentiators that you saw, which is why you chose Lumigo, but it also sounds like there were some things that surprised you. So in your opinion, what are some of the key differentiators of Lumigo versus its competitors? >> So I guess I've been a partner with Lumigo for like eight months now, right? Which is a long time in the history of Flex, right? 'Cause we're just about two and a half years old. So, when I did the initial evaluation, I was looking for the things. I'm lazy, so I wanted something that I could just drop in and it would just work, right? And get the information I wanted to ask. I wanted something that was giving me information consistently. So I try to figure these things out and hit them with some load. I wanted it to have coverage of the systems that we use. We use Dynamo a lot. We use Lambdas a lot, and I wanted not just cursory coverage, like it's just another one of the 20,000 things that they do. I wanted something that was dedicated to it. That gave me information that was useful for me. And really the specialist serverless providers were the obvious choice there. When you looked at the more general providers, the Datadogs and New Relics, I think if you're in an environment that has a lot of other different types of systems running on it, then maybe the specificity that you lose is worthwhile, right? There's a trade-off you can make, but we're in a highly serverless environment, so we wanted the specificity. When I looked at the vendors, Lumigo was the one that worked best straight out of the box for me. It gave me the information I wanted. 
It gave me the experience I wanted, and to be frank, they reached out really quickly and had a chat about what my specific problems were, what I was thinking. And all of those things add up: a proactive vendor, just doing the things you wanted it to do, and what became and has become a lasting partnership, and I don't say partnership lightly, 'cause we've worked with a number of other vendors, right? For different things. But Lumigo, I have turned to these guys, 'cause these guys know serverless, right? So I've turned to these guys when I've gone, "Look, I am not sure what the best approach here is." You have trusted me about it, this is vendor, right? >> Right, but it sounds like it's a very synergistic, collaborative, trusted relationship. And to your point, not using the term partner lightly, I think, Erez, you probably couldn't have had a better testimonial for Lumigo, its capabilities, and what you guys are able to do. So I'll give you, Erez, the last word. Just give the audience a little bit of an overview of the AWS partnership. >> Sure, so AWS has been a very strategic partner for Lumigo, and that means that, I would say, the most critical part is the product, is the technology. And we are design partners with the serverless team. And that means that we work with AWS to make sure that before new services are released, they get our feedback on whether we can integrate easily or not, and making sure that on the launch date, we are able to be a launch partner for a lot of their services. And this strong partnership with the R&D team is what's allowing Lumigo to support new services out of the box, like Kevin mentioned. >> Excellent. Gentlemen, thank you so much for joining me today, talking not just about Lumigo, but getting this great perspective of it through the CTO lens with Kevin. We appreciate your insights, your time, and what a great testimonial. >> Thank you very much, thank you, Kevin. >> Thanks, Lisa, thanks Erez. >> You're most welcome. 
For Erez Berkner and Kevin O'Neill, I'm Lisa Martin, you're watching the AWS Startup Showcase for Q3. (gentle music)
ENTITIES
Entity | Category | Confidence |
---|---|---|
Kevin | PERSON | 0.99+ |
Kevin O'Neill | PERSON | 0.99+ |
Erez | PERSON | 0.99+ |
Lisa Martin | PERSON | 0.99+ |
AWS | ORGANIZATION | 0.99+ |
USA | LOCATION | 0.99+ |
Lisa | PERSON | 0.99+ |
1500 landlords | QUANTITY | 0.99+ |
eight months | QUANTITY | 0.99+ |
Lumigo | ORGANIZATION | 0.99+ |
Erez Berkner | PERSON | 0.99+ |
zero | QUANTITY | 0.99+ |
20 | QUANTITY | 0.99+ |
20 minute | QUANTITY | 0.99+ |
70% | QUANTITY | 0.99+ |
two incidents | QUANTITY | 0.99+ |
two guys | QUANTITY | 0.99+ |
three developers | QUANTITY | 0.99+ |
30 | QUANTITY | 0.99+ |
EventBridge | ORGANIZATION | 0.99+ |
Dynamo | ORGANIZATION | 0.99+ |
two ways | QUANTITY | 0.99+ |
DevOps | TITLE | 0.99+ |
two guests | QUANTITY | 0.99+ |
600 queues | QUANTITY | 0.99+ |
Erez and Co | ORGANIZATION | 0.99+ |
first | QUANTITY | 0.98+ |
both | QUANTITY | 0.98+ |
one | QUANTITY | 0.98+ |
Flex | ORGANIZATION | 0.98+ |
Datadogs | ORGANIZATION | 0.98+ |
one place | QUANTITY | 0.98+ |
20,000 things | QUANTITY | 0.97+ |
OpenTrace | TITLE | 0.97+ |
100 miles an hour | QUANTITY | 0.96+ |