

Pat Conte, Opsani | AWS Startup Showcase


 

(upbeat music) >> Hello and welcome to this CUBE conversation, presenting the "AWS Startup Showcase: New Breakthroughs in DevOps, Data Analytics and Cloud Management Tools," featuring Opsani for the cloud management and migration track. I'm your host, John Furrier. Today we're joined by Patrick Conte, Chief Commercial Officer at Opsani. Thanks for coming on. Appreciate you coming on. The future of AI operations. >> Thanks, John. Great to be here. Appreciate being with you. >> So congratulations on all your success being showcased here as part of the Startup Showcase, the future of AI operations. You've got cloud scale happening, a lot of new transitions in this quote digital transformation as cloud scale goes next generation, and the DevOps revolution, as Emily Freeman pointed out in her keynote. What's the problem statement that you guys are focused on? Obviously, AI involves a lot of automation; I can imagine there's a data problem in there somewhere. What's the core problem that you guys are focused on? >> Yeah, it's interesting, because there are a lot of companies that focus on trying to help other companies optimize what they're doing in the cloud, whether it's cost or performance or something else. We felt very strongly that AI was the way to do that. I've got a slide prepared, and maybe we can take a quick look at it; it talks about the three elements, or dimensions, of the problem. So think about cloud services and the challenge of delivering them. You've really got three things that customers are trying to solve for. They're trying to solve for performance, the best performance, and, ultimately, scalability; applications are growing really quickly, especially in this current timeframe, with cloud services and whatnot. They're trying to keep costs under control, because costs can certainly get way out of control in the cloud, since you don't own the infrastructure. And more important than anything else, which is why it's at the bottom, at the foundation of all this, they want their applications to be a really good experience for their customers. So our customer's customer is actually who we're trying to solve this problem for. What we've done is build a platform that uses AI and machine learning to optimize, meaning tune, all of the key parameters of a cloud application: things like CPU usage, memory usage, the number of replicas in a Kubernetes or container environment, those kinds of things. It seems like it would be simple just to grab some values and plug 'em in, but it's not. The combination of them has to be right; otherwise, you get delays or faults or other problems with the application. >> Andrew, if you can bring that slide back up for a second, I want to ask one quick question on the problem statement. You've got expenditures, performance, and customer experience there on the sides. Do you see this tip a certain way depending on the use case? I mean, is there one thing that jumps out at you, Patrick, from your customer's customer's standpoint? Obviously, customer experience is the outcome; that's the app, whatever we've got going on there. >> Sure. >> But are there patterns? 'Cause you can have good performance but then budget overruns, or all of them could be failing. Talk about the dynamic of this triangle. >> Well, without AI, without machine learning, you can solve for one of these, only one, right?
So if you want to solve for performance, like you said, your costs may overrun, and you're probably not going to have control of the customer experience. If you want to solve for one of the others, you're going to have to sacrifice the other two. With machine learning, though, we can actually balance that. It isn't a perfect balance, and the question you asked is really a great one. Sometimes you want to over-correct on something; sometimes scalability is more important than cost. But because of our machine learning capability, we're always going to make sure that you're never spending more than you should spend, so you always have the best cost for whatever performance and reliability factors you want to have. >> Yeah, I can imagine. Some people leave services on. Happened to us one time: an intern left one of the services on, and, like, where did that bill come from? So we kind of looked back and had to fix that. There's a ton of action, but I've got to ask you, what are customers looking for with you guys? As they look at Opsani and what you're offering, what's different from what other people might be proposing with optimization solutions? >> Sure. Well, why don't we bring up the second slide; this will illustrate some of the differences, and we can talk through some of this as well. So really, the area we play in is called AIOps, and that's sort of a new area over the last few years. What it really means is applying intelligence to your cloud operations, and those cloud operations could be development operations or production operations. What this slide represents, in the upper half, is the way customers experience their DevOps model today. Somebody says we need an application or we need a feature, the developers pull something down from Git, they hack an early version of it, they run through some tests, they size it whatever way they know won't fail, and then they throw it over to the SREs to try to tune it before they shove it out into production. But nobody really sizes it properly; it's not optimized, and it's not tuned either. When it goes into production, it's just the first combination of settings that worked. So what happens is, undoubtedly, there's some type of problem, a fault or a delay, or you push new code, or there's a change in traffic. Something happens, and then you've got to figure out what the heck. So what happens then? You use your tools. The first thing you do is over-provision everything; that's what everybody does, they over-provision and try to soak up the problem. But that doesn't solve it, because now your costs are going crazy. You've got to go back and try as best you can to get root cause. You go back to the tests, trying to find something in the test phase that might be an indicator. Eventually your developers have to hack a hot fix, and the conveyor belt keeps on going. We've tested this model on every single customer we've spoken to, and they've all said this is what they experience on a day-to-day basis. Now, if we can go back to the slide, let's talk about the second part, which is what we do and what makes us different. On the bottom of the slide, you'll see it's really a shift-left model. We plug in in the production phase, and as I mentioned earlier, what we're doing is tuning all those cloud parameters.
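Before getting into those parameters, it is worth making the three-way balance concrete. Below is a minimal Python sketch of the kind of guarded objective an optimizer could score configurations against; it is purely illustrative, and the weights, the SLO value, and the function itself are assumptions for the sketch, not Opsani's actual model.

```python
# Toy objective for balancing cost, performance, and reliability.
# Lower scores are better; guardrail violations are rejected outright.
def score(latency_ms: float, error_rate: float, hourly_cost: float,
          latency_slo_ms: float = 250.0) -> float:
    if latency_ms > latency_slo_ms or error_rate > 0.0:
        return float("inf")  # never trade reliability away
    return hourly_cost + 0.1 * latency_ms  # mostly cost, nudged by latency

baseline = score(latency_ms=240, error_rate=0.0, hourly_cost=42.0)
candidate = score(latency_ms=180, error_rate=0.0, hourly_cost=29.5)
assert candidate < baseline  # the cheaper, faster configuration wins
```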
We're tuning the CPU, the memory, the replicas, all those kinds of things, and we're tuning them all in concert, at machine speed. That's how the customer gets the best performance and the best reliability at the best cost; we're able to achieve that because we're iterating at machine speed. But there's one other place where we plug in and help the whole concept of AIOps and DevOps, and that is the test phase. If you think about it, the DevOps guy doesn't actually have to over-provision before he throws it over to the SREs; he can optimize and find the right size of the application before he sends it through. What this does is collapse the timeframe, because it means the SREs don't have to hunt for a working set of parameters: they get one from the DevOps guys when it's sent over. This is how the future of AIOps is being affected by optimization, and by what we call autonomous optimization, which means it happens without humans having to press a button. >> John: Andrew, bring that slide back up; I want to ask another question. This tuning-in-concert thing is very interesting to me. How does that work? Are you telegraphing information to the developer from the autonomous workload tuning engine piece? I mean, how does the developer know the right knobs, or where does it get that provisioning information? I see the performance lag, and I see where you're solving that problem. >> Sure. >> How does that work? >> Yeah, so actually, if we go to the next slide, I'll show you exactly how it works. Okay, this slide represents the architecture of a typical application environment that we would find ourselves in, and inside the dotted line is the customer's application namespace. That's where the app is, and it's got a bunch of pods, and it's got something for replication, probably an HPA, a horizontal pod autoscaler. What we do is install two small instances inside that namespace. One is a tuning pod, which some people call a canary; that tuning pod joins the rest of the pods, but it's not part of the application. It's actually separate, but it gets the same traffic. We also install something we call Servo, which is basically an action engine. Servo takes the metrics from whatever metric system is collecting all those different settings and whatnot from the working application. That could be something like Prometheus, it could be an Envoy sidecar, or, more likely, it's something like AppDynamics; we can even collect metrics off of Nginx, which sits at the front of the service. We can plug in anywhere those metrics are and pull them forward. Once we see the metrics, we send them to our backend: the Opsani SaaS service is our machine learning backend. That's where all the magic happens. That service sees the settings, sends a recommendation to Servo, Servo sends it to the tuning pod, and we tune until we find optimal. That iteration typically takes about 20 steps, depending on how big the application is and how fast those steps take.
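Stripped to its essentials, the loop just described (measure the canary, ask the backend for settings, apply them, repeat) could be sketched as below. All the function bodies are stand-in stubs; the names and behavior are assumptions for illustration, not Opsani's real API.

```python
import random
import time

def query_metrics(pod: str) -> dict:
    # Stand-in for a Prometheus/AppDynamics/Nginx metrics scrape.
    return {"latency_ms": random.uniform(150, 300), "error_rate": 0.0}

def recommend(metrics: dict) -> dict:
    # Stand-in for the ML backend's recommendation.
    cpu_m = 400 if metrics["latency_ms"] < 200 else 600
    return {"cpu": f"{cpu_m}m", "memory": "768Mi", "replicas": 3}

def apply_to_tuning_pod(settings: dict) -> None:
    # Stand-in for a Servo-style action engine patching only the canary.
    print("patching tuning pod with", settings)

def tune(max_steps: int = 20) -> dict:
    # "That iteration typically takes about 20 steps."
    settings = {}
    for _ in range(max_steps):
        settings = recommend(query_metrics("tuning-pod"))
        apply_to_tuning_pod(settings)  # the canary, never production
        time.sleep(1)  # real steps take seconds to many minutes
    return settings  # promoted to production only when approved

print(tune())
```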
A step could take anywhere from seconds to minutes to 10 or 20 minutes, but typically within about 20 steps we can find optimal. Then we'll come back and say, "Here's optimal; do you want to promote this to production?" And the customer says, "Yes, I want to promote it to production, because I'm saving a lot of money, or because I've gotten better performance or better reliability." Then all he has to do is press a button, all those settings get sent right to the production pods and put into production, and now he's actually saving the money. So that's basically how it works. >> It's kind of like when I want to go to the beach: I look at weather.com, I check the forecast, and I decide whether I want to go or not. You're getting the data, so you're getting a good look at the information and then putting that into a policy standpoint. I get that; makes total sense. Can I ask you, if you don't mind, to expand on the performance, reliability, and cost advantages? You mentioned cost. How is that impacting? Give us an example of some performance, reliability, and cost impacts. >> Well, let's talk about what those things mean, because a lot of people might have different ideas about what they mean. From a cost standpoint, we're talking about cloud spend ultimately, but it's represented by the settings themselves, so I'm not talking about what deal you cut with AWS or Azure or Google. I'm talking about whatever deal you cut, we're going to save you 30, 50, 70% off of that. It doesn't really matter what cost you negotiated; what we're talking about is right-sizing the settings for CPU, memory, and replicas. It could be Java: garbage collection, time ratios, or heap sizes, things like that. Those are all the kinds of things we can tune. The thing is, most of those settings have an unlimited number of values, and this is why machine learning is important: even if you only had eight settings with eight values per setting, you're already into the millions of combinations, and real applications quickly get into the billions. To find optimal, you've got to have machine speed, and you have to iterate very, very quickly to make it happen. That's really one of the things that makes us different from anybody else. And if you put that last slide back up, the architecture slide, for just a second, there are a couple of key words at the bottom of it that I want to focus on. The first is continuous. Continuous means we're on all the time; it's not plug us in once, make a change, and walk away. We're always measuring and adjusting. The reason this is important is that in the modern DevOps world, your traffic level is going to change, you're going to push new code, and things are going to happen that change the basic nature of the software, so you have to be able to tune for those changes. The second is autonomous. This is designed to take pressure off the SREs, not to replace them: to take the pressure off of them having to check the pager all the time and run in and make adjustments, or try to divine an adjustment that might be very, very difficult to find.
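The combinatorics are easy to check. The counts below are illustrative arithmetic, not measurements of any particular application:

```python
# Even coarse discretization of the tuning space explodes quickly.
values_per_setting = 8
print(values_per_setting ** 8)   # 8 settings  -> 16,777,216 combinations
print(values_per_setting ** 10)  # 10 settings -> 1,073,741,824 (billions)
# Real settings like heap size are near-continuous, so in practice the
# space is far larger still, which is why exhaustive search is hopeless.
```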
So we're doing it for them. And scale means we can solve this for, let's say, one big monolithic application, or for literally hundreds of applications and thousands of microservices that make up those applications, and tune them all at the same time. The same platform can be used for all of those. You originally asked about the parameters and the settings; did I answer the question there? >> You totally did. I mean, the tuning in concert. You mentioned early on a key point: you're basically tuning the engine. It's not so much negotiating a purchase SaaS discount; it's essentially cost overruns by the engine, either over-burning or over-heating or whatever you want to call it, basically inefficiency. You're tuning the core engine. >> Exactly so. The cost savings, as I mentioned, are due to right-sizing the settings and the number of replicas. Performance is typically measured via latency, and reliability is typically measured via error rates. There are some other measures as well; we have a whole list of them in the application itself, but those are the kinds of things we look for as results. When we do our tuning, we look for reducing error rates, or for holding error rates at zero, for example, even while we improve the performance or the cost. We're looking for the best combined result, and then a customer can decide if they want to over-correct on something. We have the whole concept of guardrails, so if performance is the most important thing, or, for some customers, if cost is the most important thing, they can actually say, "Give us the best performance and the best reliability, but at this cost," and we can then use that as a service-level objective and tune around it. >> Yeah, it reminds me of back in the old days when you had filtering, whitelists and blacklists of addresses that could go through, say, a firewall or a device. You have billions of combinations now, with machine learning; it's essentially scaling the same concept to unbelievable levels. These guardrails are now in place, and that's super cool and, I think, a really relevant call-out, Patrick. At this kind of scale, you need machine learning, you need the AI, to quickly identify the patterns or combinations that are actually happening, so a human doesn't have to waste time on work that can be filled by basically a bot at that point. >> So John, there's just one other thing I want to mention around this, and that is one of the things that makes us different from other companies that do optimization. Basically, every other company in the optimization space creates a static recommendation with their recommendation engines, and what you get out of that is, let's say, a manifest of changes; you hand that to the SREs, and they put it into effect. Well, the fact of the matter is that the traffic could have changed by then. It could have spiked up, or it could have dropped below normal. You could have introduced a new feature or some other code change, and at the point in time you institute those changes, they may be completely out of date. That's why the continuous nature of what we do is important and different. >> It's funny, even the language that we're using here: network, garbage collection. I mean, you're talking about tuning an engine, an operating system.
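A guardrail of the kind Conte describes might be expressed as a declarative objective. The structure below is a hypothetical illustration of the idea, not Opsani's actual configuration format:

```python
# Hypothetical SLO spec: optimize cost, but only within hard bounds.
slo = {
    "optimize_for": "cost",
    "guardrails": {
        "p95_latency_ms": {"max": 250.0},   # performance floor
        "error_rate": {"max": 0.0},         # hold error rates at zero
        "hourly_cost_usd": {"max": 40.0},   # "...but at this cost"
    },
}

def violates(guardrails: dict, observed: dict) -> bool:
    return any(observed[key] > bound["max"]
               for key, bound in guardrails.items())

observed = {"p95_latency_ms": 210.0, "error_rate": 0.0,
            "hourly_cost_usd": 31.2}
print(violates(slo["guardrails"], observed))  # False: within guardrails
```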
You're talking about stuff that's moving up the stack to the application layer, hence this elimination of the siloed waterfall you pointed out in your second slide, into one integrated operating environment. So when you think about the data coming in, you have to think about the automation: self-correcting, error-correcting, tuning, garbage collection. These are words we've been kicking around, but at the end of the day, it's an operating system. >> Well, in the old days of automobiles, which I remember because I'm an old guy, if you wanted to tune your engine, you would probably rebuild your carburetor and turn some dials to get the air-fuel mix right, you'd re-gap your spark plugs, and you'd probably make sure your points were right. There'd be four or five key things that you would do, and you couldn't do them at the same time unless you had a magic wand. So we're the magic wand: in the modern world, we're that thing you plug in that tunes everything at once within an engine that is now all electronically controlled. That's the big difference as you think about what we used to do manually and what can now be done with automation; it can be done much, much faster, without humans having to get their fingernails greasy, let's say. >> And I think the dynamic-versus-static point is an interesting one. I want to bring up the SRE, which has become a very prominent role in this DevOps-plus world that's happening; you're seeing this new revolution. The role of the SRE is not just to be there to hold things down and do the manual configuration. They have to scale; they're developers, too. So I think this notion of offloading the SRE from manual tasks is another big, important point. Can you react to that and share more about why the SRE role is so important, and why automating that away with what you guys have matters? >> The SRE role is becoming more and more important, just as you said, and the reason is that somebody has to get the application ready for production. The DevOps guys don't do it; that's not their job. Their job is to get the code finished and send it through, and the SREs then have to make sure that the code will work, so they have to find a set of settings that will actually work in production. Once they find the first set of settings that works, they'll push it through. It's not optimized at that point, because they don't have time to find optimal. And if you think about the difference between a machine learning backend and an army of SREs working 24-by-seven, we're talking about doing the work of many, many SREs that never get tired, that never need to go play video games to unstress or whatever. We're working all the time, always measuring and adjusting. A lot of the companies we've talked to do a once-a-month adjustment on their software. They put an application out, and then they send in their SREs once a month to try to tune it, maybe using some of these other tools, or maybe just their smarts, but they'll do that once a month. Well, gosh, they've probably pushed code four times during the month, and they've probably had a bunch of spikes and drops in traffic and other things happen. So we just want to help them spend their time on making sure that the application is ready for production.
They want to make sure that all the other parts of the application are where they should be, and they can let us worry about tuning CPU, memory, replicas, job instances, and things like that, so they can work on making sure the application gets out and that it can scale, which is really important: for their companies to make money, the apps have to scale. >> Well, that's a great insight, Patrick. You mentioned you have a lot of great customers, and certainly your customer base is early adopters, pioneers, and big companies that grew because they have DevOps; they know the difference between a DevOps engineer and an SRE. Some of the other enterprises that are transforming think the DevOps engineer is the SRE person, 'cause they're having to get transformed. So you guys are at the high end and now getting the new enterprises as they come on board to cloud scale. You have a huge uptake in Kubernetes, and you're starting to see the standardization of microservices. People are getting it. So I've got to ask you, can you give us some examples of your customers, how they're organized, some case studies? Who uses you guys, and why do they love you? >> Sure. Well, let's bring up the next slide. We've got some customer examples here, and your viewers, our viewers, can probably figure out who these guys are. I can't tell them, but if they go on our website, they can sort of put two and two together. The first one is a major financial application SaaS provider, and in this particular case, they were having problems they couldn't diagnose within the stack. Ultimately, they had to apply automation to it, and what we were able to do for them was give them a huge jump in reliability, which was actually the biggest problem they were having. We gave them 5,000 hours back a month on the application; they were having PagerDuty alerts going off all the time. We also gave them better performance, a 10% performance boost, and we dropped their cloud spend for that application by 72%. So, in fact, it was an 80-plus percent price-performance, or cost-performance, improvement that we gave them, and essentially we helped them tune the entire stack. This was a hybrid environment, so it included VMs as well as more modern architecture. Today, I would say the overwhelming majority of our customers have moved off of VMs and are in a containerized environment, and even more to the point, Kubernetes, which we find a very, very high percentage of our customers have moved to. So most of the work we're doing today with new customers is around that, and the second and third examples here are examples of that. The second example is a company that develops websites, one of the big ones out in the marketplace; let's say, if you were starting a new business and you wanted a website, they would develop it for you. Their internal infrastructure is all brand-new stuff, all Kubernetes, and they were actually getting decent performance. What we were able to do for them is hold their performance at their SLO, achieve a 100% error-free scenario for them at runtime, and drop their cost by 80%. For them, they needed us to hold serve, if you will, on performance and reliability and get their costs under control, because that's a cloud-native company; everything there is cloud cost. The interesting thing is it took us nine steps, nine of our iterations, to actually get to optimal.
So it was very, very quick, and there was no integration required. In the first case, we actually had to do a custom integration for an underlying platform that was used for CI/CD, but with the- >> John: Because of the hybrid, right? >> Patrick: Sorry? >> John: Because it was hybrid, right? >> Patrick: Yes, because it was hybrid, exactly. But with the second one, we just plugged right in, and we were able to tune the Kubernetes environment just as I showed in that architecture slide. And then the third one is one of the leading application performance monitoring companies on the market. They have a bunch of their own internal applications, and those use a lot of cloud spend. They're actually running Kubernetes on top of VMs, but we don't have to worry about the VM layer; we just worry about the Kubernetes layer for them. What we did for them was give them a 48% performance improvement in terms of latency and throughput, drop their error rates by 90%, which is pretty substantial to say the least, and give them a 50% cost delta from where they had been. So this is the perfect example of actually being able to deliver on all three things, which you can't always do; all applications are not created equal. This was one where we were able to deliver on all three of the key objectives. We were able to set them up in about 25 minutes from the time we got started, with no extra integration, and, needless to say, it was a big, happy moment for the developers to be able to go back to their bosses and say, "Hey, we have better performance, better reliability. Oh, by the way, we saved you half." >> So, depending on the stack situation, you've got VMs and Kubernetes on one side, and cloud-native, all-Kubernetes, which is the dream scenario, obviously; not many people are like that. All the new stuff's going cloud-native, so that's ideal, and then there are the mixed ones: Kubernetes, but on VMs, right? >> Yeah, exactly. So Kubernetes with no VMs, no problem. Kubernetes on top of VMs, no problem, but we don't manage the VMs; we don't manage the underlay at all, in fact. And the other thing is, we don't have to go back to the slide, but I think everybody will remember the slide that had the architecture: on one side was our cloud instance, and the only data going between the application and our cloud instance is the settings. So there's never any customer data: nothing for PCI, nothing for HIPAA, nothing for GDPR or any of those things. No personal data, no health data; nothing is passing back and forth, just the settings of the containers. >> Patrick, while I've got you here, 'cause you're such a great, insightful guest, thank you for coming on and showcasing your company. Kubernetes, real quick: how prevalent is this mainstream trend? Because you're seeing such great examples of performance improvements, SLAs being met, SLOs being met. How real is Kubernetes for the mainstream enterprise as they start to use containers to lift their legacy apps and get into cloud-native, and certainly hybrid and soon-to-be multi-cloud, environments? >> Yeah, I would not say it's dominant yet. Of container environments, I would say it's dominant now, but for all environments, it's not.
I think the larger legacy companies are still going through that digital transformation, so we catch them at that transformation point, and we can help them as they develop, because, as we remember from the AIOps slide, we can plug in at the test level and help them sort of pre-optimize as they come through. So we can actually help them be more efficient as they're transforming. The other side of it is the cloud-native companies. You've got the legacy companies, brick-and-mortar, who are desperately trying to move to digitization, and then you've got the ones that were born in the cloud. Most of those aren't on VMs at all; most of them are on containers right from the get-go. But you do have some in the middle who have started to make a transition, and what they've done is taken their native VM environment and put Kubernetes on top of it, so they don't have to scuttle everything underneath it. >> Great. >> So I would say it's mixed at this point. >> Great business model: helping customers today and being a bridge to the future. Real quick, what licensing models, what ways to buy, what promotions do you have for Amazon Web Services customers? How do people get involved? How do you guys charge? >> The product is licensed as a service, and the typical license is annual. We license it by application, so let's just say you have an application and it has 10 microservices; that would be a standard application, and we'd have an annual cost for optimizing that application over the course of the year. We have a large application pack, if you will, for, let's say, applications of 20 services, something like that. And then we also have the Opsani platform, which is for environments where the customer might have hundreds of applications and/or thousands of services. There we can plug into their deployment platform, something like Harness or Spinnaker or Jenkins, or into their cloud Kubernetes orchestrator, and then we can actually discover the apps and optimize them. So we've got environments for both single apps and for many, many apps, with the same platform. And yes, thanks for reminding me: we do have a promotion for our AWS viewers. If you reference this presentation and go to the URL there, which is opsani.com/awsstartupshowcase (can't forget that), you will, number one, get a free trial of our software, and if you optimize one of your own applications, we're going to give you a set of Oculus goggles, the virtual reality goggles. And we have one other promotion for your viewers and for our joint customers here: if you buy an annual license, you're going to get 15 months. So that's what we're putting on the table. It's actually a pretty good deal. The Oculus isn't unconditional, though; it's a promotion contingent on you actually optimizing one of your own services, not a synthetic app. It's got to be one of your own apps. But that's what we've got on the table here, and I think it's a pretty good deal; I hope your viewers take us up on it. >> All right, great. Get an Oculus Rift for optimizing one of your apps, and 15 months for the price of 12. Patrick, thank you for coming on and sharing the future of AIOps with us. Great product, a bridge to the future, solving a lot of problems, a lot of use cases there. Congratulations on your success. Thanks for coming on. >> Thank you so much. This has been excellent, and I really appreciate it. >> Hey, thanks for sharing.
I'm John Furrier, your host with theCUBE. Thanks for watching. (upbeat music)

Published Date: Sep 22, 2021


Amir Sharif, Opsani | CUBE Conversation


 

>> Welcome to this special CUBE conversation here in Palo Alto. I'm John Furrier, host of theCUBE. We're here talking about Kubernetes, cloud native, and all things cloud enterprise. Amir Sharif, VP of Product at Opsani, is with me, and it's great to have you on theCUBE. Thanks for coming on; I appreciate you taking the time. >> I appreciate it, John. Good to be here. >> You know, cloud native is obviously super hot right now. As the edge comes around the corner, you're seeing people looking at 5G, looking at Amazon's Wavelength and Outposts; you've got a lot of cloud companies really pushing distributed computing, and I think one of the things people are really getting into is, okay, how do I take the cloud and refactor my business? That's the business side, and then there's the technical side: okay, how do I do it? It's not that easy, right? It sounds really easy to just move to the cloud, but this has been a big problem. So I know you guys are in the center of all this, and you've got microservices and Kubernetes at the core of it. Take a minute to introduce the company and what you guys do, and then I want to get into some specific questions. >> Of course. Well, Opsani is a Silicon Valley startup, and what we do is automate system configuration, work that an engineer typically does, which is lengthy and, if done incorrectly, leads to a lot of errors, cost overruns, and user experience problems. We completely automate that using an AI and ML backend, so that engineering can focus on writing code and not worry about having to tune all the little pieces to work together. >> You know, I was talking to a really prominent VC on our last cloud startup showcase, and he was talking about down-stack and up-stack benefits. He said if you're going to be a down-stack provider, you've got to solve a problem, and it has to be a big problem that people don't want to deal with. You start getting into systems configuration when you have automation at the center of this as a table-stakes item, and problems are cropping up as new use cases emerge. Can you talk about some of the problems that you guys see, the ones you solve for developers and companies? >> Of course. The problem expresses itself in a number of domains. The first one is that he who pays the bills is separate from he who consumes the resources. It's the engineers that consume the resources, and their incentives are to deliver code rapidly, code that works well; they don't really care about paying the bills. Then the CFO's office sees the bills, and there's a disparity between the two. The reason that creates a business problem is that the developers will over-provision stuff to make sure everything works; they don't want to get caught in the middle of the night. You know, the bill comes due at the end of the month or the end of the quarter, and then the CFO has smoke coming out of his ears because there have been cost overruns. Then the reaction happens: all right, let's cut costs. An edict comes down that says reduce everything by 30%, so people go across and give a haircut to everything. So what happens next? Systems are out of balance. There's resource misallocation, and systems start suffering. So the customers become unhappy.
And ironically, or not ironically, but maybe understandably, if you're not provisioned correctly, customers start suffering, and that leads to a revenue problem down the line if you have too many unhappy customers. So you have to be very careful about how you cut costs and how you apportion resources, so that both the revenue side is happy and the cost side is happy, because it all comes down to product experience and what the customers consume. >> You know, that's something everyone who's done cloud development knows. Whose fault is it? But now you can actually see the services. You leave a switch open, and, you know, I'm oversimplifying it, but you experiment with services, and the bills can just have massive overruns, and then you've got to call the cloud company and call the engineers and say, why did you do this? You've got to get a refund, or one bad apple could ruin it for everyone, as you highlighted with the bigger companies. So I have to ask you, everyone lives this: how do companies get cost overruns? Are there patterns you see, the ones you wrote the software for? One, automate the obvious ones; but are there certain things that always happen, areas that give some indication? First of all, why do companies have cloud cost overruns? >> That's a great question, and let's start with a bit of history. We came from a pre-cloud world where you built your own data centers, which means that you had an upfront capex cost, you spent the money, and you were forced to live within what your data center provided. You really couldn't spend any more. That provided kind of a predictable expenditure model: it came in big chunks, but you knew what your budget was going to be three or four years from now, and you built for that. With cloud computing, your consumption is on an on-demand basis, and it's API-enabled, so the developer can just ask for more resources. Without any kind of tools that tell the developer, here is X amount of CPU or X amount of memory that you need for this particular service to deliver the right performance for the customer, the developer is incentivized to basically give it a lot more than the application needs. Why? Because the developer doesn't want to pick up service tickets. He's incentivized to deliver functionality quickly and move on to the next project, not to optimize costs. So that creates kind of an agency problem: the person who actually controls how resources are consumed is not incentivized to control the consumption of those resources. We see that across the board. In every company, the engineering organization is a separate organization from the financial organization, so the point of control is different from the point of consumption, and it breaks down. The pattern is over-provisioning. What we want to do is give engineers the tools to consume precisely the right amount of resources for the service-level objectives they have: given that you want a transaction rate of X and a latency of Y, here's how you configure your cloud infrastructure so the application delivers according to the SLOs with the least possible resources consumed.
>> So on this tool you guys have, the software you guys have, how do you go to market with that? Do you target the business buyer or the developers themselves? And how do you handle the developer who says, I don't want anyone looking over my shoulder, I'm going to have a blank check to do whatever it takes? How do you roll it out? Because the business benefits are significant, controlling the budget, I get that. How are you rolling this out? How do people engage with you? What's your strategy? >> Right. Our target is the application owner, the person who owns the P&L for the application. It tends to be a VP-level or senior-director person who owns a SaaS platform, and he or she is responsible for delivering good products to the market and delivering good financial results to the CFO. So in that person, everything is rolled up, but that person will always favor the revenue side, which means consuming more resources than you need in order to maximize customer happiness and therefore faster growth, and they do that while sacrificing the cost side. By giving the product owner the autonomous optimization tools that Opsani has, we allow him or her to deliver the right experience to the customer with the right, sufficient resources, and address both the performance and the cost sides of the equation simultaneously. >> Awesome. Can you talk about the impact CI/CD is having in cloud-native computing on the optimization cycle? Obviously, you know, shifting left for security, we hear a lot of that; you're hearing about a lot more microservices being spun up and spun down automatically; Kubernetes clusters are going mainstream; you start to see a lot more dynamic activity in these new workflows. What is the impact of this new CI/CD, cloud-native computing on the optimization cycle? >> CI/CD is there to enable fast delivery of software features, basically. So, you know, we have a combination of GitOps, where you can just pull down repositories, libraries, and open source projects from left and right, and using glue code, developers can deliver functionality really quickly. In fact, microservices are there in service of that capability: deliver functionality quickly by building functional blocks, and then through APIs you put everything together. So CI/CD just accelerates software delivery. Between the time the boss says, give me an application, until the application team plus the DevOps team plus the SRE team puts it out in production, we can now do this really quickly. The problem is, though, nobody optimizes in the process. So when we deliver 1.0 in six months or less, we've done zero in terms of optimization, and 1.0 becomes the way we go through QA in many cases, unfortunately, and it also becomes the way we go through optimization. The customer screams that the UI is laggy, the throughput is really slow, and we tinker and tinker and tinker, and by the time it typically goes through a 12-month cycle of maturation, we get the system stability and the right performance. With the AI and machine learning that Opsani has enabled, we can shrink that time considerably. In fact, what we're going to announce at KubeCon is something we call Kite Storm: the ability to install our product in a Kubernetes environment in roughly 20 minutes, and within two days you get the results.
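As a sketch of where an optimization step could sit in such a pipeline, consider the stubs below. The stage names and the optimizer call are invented for illustration; they are not Kite Storm's actual interface.

```python
# Hypothetical CI/CD flow: tune the service in the test phase so the
# SREs receive right-sized settings instead of over-provisioned guesses.
def build_and_test(src: str) -> str:
    return f"registry.example.com/app:{abs(hash(src)) & 0xffff:04x}"  # stub

def optimize_under_load(image: str) -> dict:
    # Stand-in for an autonomous-optimization run against a test rig.
    return {"cpu": "500m", "memory": "768Mi", "replicas": 3}

def deploy_to_staging(image: str, settings: dict) -> None:
    print(f"deploying {image} with {settings}")  # stub

def pipeline(src: str) -> None:
    image = build_and_test(src)
    settings = optimize_under_load(image)  # tuned, not guessed
    deploy_to_staging(image, settings)

pipeline("app-source-tree")
```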
So where before you had this optimization cycle that went on for a very long time, now that it's shrunk down, and because of CI/CD, you don't have the luxury of waiting, and the system itself, the AI/ML service, can become part and parcel of the CI/CD pipeline that optimizes the code, gives you the right configuration, and you're good to go. >> So you guys are really getting down in there and injecting some instrumentation for metadata around key areas, is that right? Is that kind of how it's working? Are you getting in there with code that's going to watch? How does it work under the hood? Can you give me a quick example of how this would play out and what people might expect? >> Of course. The way we optimize application performance is that we have to have a metric against which we measure performance. That metric is an SLO, a service-level objective. In a Kubernetes environment, we typically tap into Prometheus, which is the metrics-gathering database for Kubernetes workloads, and we really focus on RED metrics: the rate of transactions, the error rate, and the duration, or latency. So we focus on those three metrics, and what we do is inject a small container, an open source container, into the application workspace. We call that container Servo. Servo interacts with Prometheus to get the metrics, and then it talks to our backend to tell the ML engine what's happening. The ML engine does its analysis and comes back with a new configuration, which Servo then implements in a canary instance. The canary instance is where we run our experiments, and we compare it against the mainline, which is what the application is doing. After roughly 20 generations or so, the ML engine learns what part of the problem space to focus on in order to deliver optimal results, and then it very quickly comes to the right set of solutions to try, and it tries those inside the canary instance. When it finds the optimal solution, it gives the recommendation back to the application team, or alternatively, when you have enough trust in the tuning, you can auto-promote it into the mainline.
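The RED metrics he mentions (rate, errors, duration) are typically pulled from Prometheus's standard HTTP query API. The endpoint and metric names below follow common instrumentation conventions and are assumptions; substitute your own.

```python
# Illustrative RED-metric queries against the Prometheus HTTP API.
import json
import urllib.parse
import urllib.request

PROM = "http://prometheus.example.internal:9090"  # assumed endpoint

QUERIES = {
    "rate": 'sum(rate(http_requests_total{app="shop"}[5m]))',
    "errors": 'sum(rate(http_requests_total{app="shop",code=~"5.."}[5m]))',
    "duration_p95": (
        'histogram_quantile(0.95, sum(rate('
        'http_request_duration_seconds_bucket{app="shop"}[5m])) by (le))'
    ),
}

def red_metrics() -> dict:
    results = {}
    for name, query in QUERIES.items():
        url = f"{PROM}/api/v1/query?query={urllib.parse.quote(query)}"
        with urllib.request.urlopen(url) as resp:
            results[name] = json.load(resp)["data"]["result"]
    return results
```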
Kubernetes shifted cloud native to container its workload because it allows for rapid, more, rapid deployment and even enables or it takes advantage of a more rapid development cycle as we look forward. Cloud Native is more likely to be a surplus environment where you write functions and the backend systems of the cloud service provider, just give you that capability and you don't have to worry about maintaining and managing a fleet of any sort, whether it's VMS or containers, that's where it's gonna go. Currently we are to contain our space >>so as you start getting into the service molly good land, which we've been playing with, loves that as you get into that, that's going to accelerate more data. So I gotta ask you as you get into more of this this month, I will say monitoring or observe ability, how we want to look at it. You gotta get at the data. This becomes a critical part of solving a lot of problems and also making sure the machine learning is learning the right thing. How do you view that you guys over there? Because I think everyone is like getting that cloud native and it's not hard sell to say that's all good, but we can go back, you know, the expression ships created ships and then you have shipwrecks, you know, there's always a double edged sword here. So what's the downside? If you don't get the data right? >>Uh well, so the for us, the problem is not too much data, it's lack of data. So if you don't get data right is you don't have enough data. And the places where optimization cannot be automated is where the transaction rates are slow, where you don't have enough fruit. But coming into the application and it really becomes difficult to optimize that application with any kind of speed. You have to be able to profile the application long enough to know what moves its needle and in order for you to hit the S. L. O. Targets. So it's not too much data, it's not enough data. That seems to be the problem. And there are a lot of applications that are expensive to run but have a low throughput. And I would uh in all cases actually in every customer environment that have been in, where that's been the case if the application is just over provision, if you have a low throughput environment and it's costing too much, don't use ml to solve it. That's a wrong application of the technology. Just take a sledgehammer and back your resources by 50%, see what happens. And if that thing breaks back it again, until you find the baggage point. >>Exactly for you over prison, you bang it back down again. It's like the old school now with the cloud. Take me through some examples when you guys had some success, obviously you guys are in the right area right now, you're seeing a lot of people looking at this area to do that in some cases like changing the whole data center and respect of their business. But as you get it with customers with the app side, what some successes can you share some of the use cases, what you guys are being successful, your customers can get some examples. >>Yeah. So well known financial software for midsize businesses that that does accounting. It's uh there are customer during a large fleet and this product has been around for a while. It's not a container ice product. This product runs on VMS. Angela is a large component of that. So the problem for this particular vendor has been that they run on heterogeneous fleet that the application has been a along around for a very long time. And as new instance types on AWS have come in, developers have used those. 
So the fleet itself is quite heterogeneous and depending on the time of the day and what kind of reports are being run by organisations, they, the mix of resources that the applications need are different. So uh when we started analyzing the stack, we started we started looking at three different tiers, we looked at the database level, we looked at the job of mid tier and we looked at the web front end. And uh one of the things that became counterproductive is that m L. Discovered that using for the mid tier using larger instances but fear of a lot for better performance and lower cost and uh typically your gut feel is to go with smaller instances and more of a larger fleet if you would. But in this case, what the ML produced was completely counter intuitive And the net result for the customer was 78% cost reduction while agency went down by 10%. So think about it that you're, the response time is less, uh 10% less but your costs are down almost 80% 78% in this case. And the other are the fact that happened in the job of mitt here is that we improve garbage collection significantly and because whenever garbage collection happens on a JV M it takes a pause and that from a customer perspective it reflects as downtime because the machines are not responding so by tuning garbage collection Andrzej VMS across this very large fleet we were able to recover over 5000 minutes and month across the entire fleet. So uh, these are some substantial savings and this is what the right application of machine learning on a large fleet can do for assess business. >>And so talk about this fleet dynamic, You mentioned several lists. How do you see the future evolving for you guys? Where are you skating to where the puck is? As the expression goes? Um obviously with server list is going to have essentially unlimited fleets potentially That's gonna put a lot of power in the hands of developers. Okay. And people building experiences, What's the next five years look like for you guys? >>So I'm looking at the product from a product perspective, the service market depends on the mercy of the cloud service provider and typically the algorithms that they use. Uh basically they keep very few instances warm for you until you're the rate of api calls goes up and they start they start uh start turning on VMS are containers for you and then the system becomes more responsive over time. One place that we can optimize the service environment is give predictability of what the cyclicality of load is. So we can pre provision those instances and warm up the engine before the loads come into the system always stays responsive. You may have noticed that some of your apps on your phone that when you start them up, they may have a start up like a minute or two. Especially if it's a it's a terror gap. What's happening in those cases that you're starting an api calls goes in containers being started up for you to start up that instance, not enough of our warm to give you that rapid response. And that can lead to customer churn. So by by analyzing what the load on the overall load of the system is and pre provision the system. We can prevent the downtime uh prevent the lag to start up black on the downside. Which when you know when the usage goes down, it doesn't make sense to keep that many instances up. So we can talk to the back in infrastructure and the commission of those VMS in order to make to prevent cost creeps basically. So that's one place that we're thinking about extending our technology. 
>>So it's like, it's like the classic example where people say, oh during black monday everyone searches to do e commerce. You guys are thinking about it on A level that's a user centric kind of use case where you look at the application and be smart about what the expectation is on any given situation and then flex the resources on that. Is that right? That by getting right? So if it's your example, the app is a good one. If I wanted to load fast, that's the expectation. It better load fast. >>Yes, that's exactly but more romantic. So I use valentine's day and flowers my example. But you know, it doesn't have to be annual cycles. It can be daily cycles or hourly cycles. And all those patterns are learning about by an Ml back in. >>Alright, so I gotta ask you love the, this, this this new concept because most people think auto scaling right? Because that's a server concept. Can auto scale or database. Okay. On a scale up, you're getting down to the point where, okay, we'll keep the engines warm, getting more detailed. How do you explain this versus a concept like auto scaling. Is it the same as a cousins? >>They're they're basically the way they're expressed, it's the same technology but their way there expressed is different. So uh in a cooper native environment, the H. B A is your auto scaler basically in response to the need, response more instances and you get more containers going on. What happens as services? Less environment is you're unaware of the underpinnings that do that scale up for you. But there is an auto Scaler in place that does that scale up for you. So the question becomes that we're in a stack from a customer's perspective, are you talking about if you imagine your instances we're dealing with the H. B. A. If you're managing at the functional level we have to have api calls on the service provider's infrastructure to pre warm up the engine before the load comes. >>I love I love this under the hood is kind of love new dynamics kind of the same wine, new bottle but still computer science, still coding, still cool and relevant to make these experiences great. Thanks for coming on this cube conversation. I really appreciate it. Take a minute to put a plug in for the company. What are you guys doing in terms of status funding scale employees, what are you looking for? And if someone's watching this and there should be a customer of you guys, what what's, what's, what's going on in their world? What tells them that they need to be calling you? >>Yeah, so we're serious. Dave we've had the privilege of uh, our we've been privileged by having a very good success with large enterprises. Uh, if you go to our website, you'll see the logos of who we have, we will be at Q khan and there were going to be actively targeting the mid market or smaller kubernetes instances, as I mentioned, it's gonna take about 20 minutes to get started and we'll show the results in two hours. And our goal is for our customers to deliver the best user experience in terms of performance, reliability. Uh, so that they, they delight their customers in return and they do so without breaking the bank. So deliver excellent products, do it at the most efficient way possible, deliver a good financial results for your stakeholders. This is what we do. So we encourage anybody who is running a SAS company to come and take a look at us because we think we can help them and we can accelerate there. 
The growth at the lower cost >>and the last thing people need is have someone coming breathing down their necks saying, hey, we're getting overcharged. Why are you guys screwing up when they're not? They're trying to make a great experience. And I think this is kind of where people really want to do push the envelope and not have to go back and revisit the cost overruns, which if it's actually a good sign if you get some cost overruns here and there because you're experimenting. But again, you don't want to get out of control. >>You don't want to be a visual like the U. S. Debt. >>Exactly. I'm here. Thank you for coming on. Great. We'll see a coupe con. The key will be there in person is a hybrid event. So uh, coupon is gonna be awesome and thanks for coming on the key. Appreciate it. >>John is a pleasure. Thank you for having me on. >>Okay. I'm john fryer with acute here in Palo alto California remote interview with upsetting hot startup series. I'm sure they're gonna do well in the right spot in the market. Really well poisoned cloud Native. Thanks for watching. Yeah.

Published Date: Sep 13, 2021
