Cloud native at scale: A Supercloud conversation with Madhura Maskasky, Platform9

(upbeat music)
>> Hello, and welcome to theCUBE here in Palo Alto, California, for a special program on Cloud Native at Scale, Enabling Next Generation Cloud or Supercloud for Modern Application Cloud Native Developers. I'm John Furrier, host of theCUBE. My pleasure to have here with me Madhura Maskasky, Co-founder and VP of Product at Platform9. Thanks for coming in today for this cloud native at scale conversation.
>> Thank you for having me.
>> So cloud native at scale, something that we're talking about because we're seeing the next level of mainstream success of containers, Kubernetes and cloud native development, basically DevOps in the CI/CD pipeline. It's changing the landscape of infrastructure as code. It's accelerating the value proposition. And the Supercloud, as we call it, has been getting a lot of traction because this next generation cloud is looking a lot different, but kind of the same as the first generation. What's your view on Supercloud as it fits to cloud native as it scales up?
>> Yeah, you know, I think what's interesting, and I think the reason why Supercloud is a really good and a really fitting term for this, and I know my CEO was chatting with you as well, and he was mentioning this as well, is that there needs to be a different term than just multicloud or cloud. And the reason is because as cloud native and cloud deployments have scaled, I think we've reached a point now where instead of having the traditional data center style model, where you have a few large distributions of infrastructure and workload at a few locations, I think the model's kind of flipped around, right? Where you have a large number of micro-sites. These micro-sites could be your public cloud deployment, your private OnPrem infrastructure deployment, or it could be your Edge environment, right? And every single enterprise, every single industry is moving in that direction. And so you've got to refer to that with terminology that indicates the scale and complexity of it.
And so I think Supercloud is an appropriate term for that.
>> So you brought up a couple of things I want to dig into. You mentioned Edge nodes. We're seeing not only Edge nodes being the next kind of area of innovation, mainly because it's just popping up everywhere. And that's just the beginning, we don't even know what's around the corner. You got buildings, you got IoT, OT and IT kind of coming together, but you also got this idea of regions. Global infrastructure is a big part of it. I just saw some news around CloudFlare shutting down a site here. There's policies being made at scale, these new challenges there. Can you share, because you got to have Edge. So hybrid cloud is a winning formula. Everybody knows that, it's a steady state. But across multiple clouds brings in this new area that hasn't been engineered yet. It hasn't been done yet, spanning clouds. People say they're doing it, but you start to see the toe in the water. It's happening, it's going to happen. It's only going to get accelerated with the Edge and beyond globally. So I have to ask you, what are the technical challenges in doing this? Because there are business consequences as well, but there are technical challenges. Can you share your view on what the technical challenges are for the Supercloud across multiple edges and regions?
>> Yeah, absolutely. So I think, you know, in the context of this term of Supercloud, I think it's sometimes easier to visualize things in terms of two axes, right? I think on one end you can think of the scale in terms of just pure number of nodes that you have deployed, a number of clusters in the Kubernetes space. And then on the other axis, you would have your distribution factor, right? Which is, do you have these tens of thousands of nodes in one site, or do you have them distributed across tens of thousands of sites, with one node at each site, right? And if you have just one flavor of this, there is enough complexity, but potentially manageable.
But when you are expanding on both these axes, you really get to a point where that scale really needs some well thought out, well structured solutions to address it, right? A combination of homegrown tooling, along with your, you know, favorite distribution of Kubernetes is not a strategy that can help you in this environment. It may help you when you have one of these, or when your scale is not at that level.
>> Can you scope the complexity? Because, I mean, I hear a lot of moving parts going on there. The technology is also getting better. We're seeing cloud native become successful. There's a lot to configure. There's a lot to install. Can you scope the scale of the problem, because we're talking about at-scale challenges here.
>> Yeah absolutely, and I think I like to call it, you know, the problem that the scale creates, there's various problems. But I think one way to think about it is the "it works on my cluster" problem, right? So, you know, I come from an engineering background and there's a famous saying between engineers and QA, and the support folks, right. Which is, it works on my laptop, which is I tested this change, everything was fantastic. It worked flawlessly on my machine. On production, it's not working. The exact same problem now happens in these distributed environments, but at massive scale, right. Which is that, you know, developers test their applications, et cetera within the sanctity of their sandbox environments. But once you expose that change in the wild world of your production deployment, right. And the production deployment could be going to the radio cell tower at the Edge location where a cluster is running there. Or it could be sending, you know, these applications and having them run at my customer site, where they might not have configured that cluster exactly the same way as I configured it. Or they configured the cluster right.
But maybe they didn't deploy the security policies, or they didn't deploy the other infrastructure plugins that my app relies on. All of these various factors add their own layer of complexity. And there really isn't a simple way to solve that today. And that is just, you know, one example of an issue that happens. I think another, you know, whole new ballgame of issues comes in the context of security, right? Because when you are deploying applications at scale, in a distributed manner, you got to make sure someone's job is on the line to ensure that the right security policies are enforced regardless of that scale factor. So I think that's another example of problems that occur.
>> Okay, so I have to ask about scale, because there are multiple steps involved when you see the success of cloud native, you know, you see some experimentation, they set up a cluster, say it's containers and Kubernetes. And then you say, okay, we got this. We configure it. And then they do it again, and again, they call it day two. Some people call it day one, day two operation, whatever you call it. Once you get past the first initial thing, then you got to scale it. Then you're seeing security breaches. You're seeing configuration errors. This seems to be where the hotspot is, when companies transition from, I got this, to oh no, it's harder than I thought at scale. Can you share your reaction to that and how you see this playing out?
>> Yeah, so, you know, I think it's interesting. There's multiple problems that occur when the two factors of scale, as we talked about, start expanding. I think one of them is what I like to call the "it works fine on my cluster" problem, which is back when I was a developer, we used to call this the "it works on my laptop" problem. Which is, you know, you have your perfectly written code that is operating just fine on your machine, your sandbox environment.
But the moment it runs in production, it comes back with P0s and P1s from support teams, et cetera. And those issues can be really difficult to triage, right. And so in the Kubernetes environment, this problem kind of multiplies. It, you know, escalates to a higher degree because you have your sandbox developer environments, they have their clusters, and things work perfectly fine in those clusters, because these clusters are typically handcrafted or a combination of some scripting and handcrafting. And so as you give that change to then run at your production Edge location, like say your radio cell tower site, or you hand it over to a customer to run it on their cluster, they might not have configured that cluster exactly how you did, or they might not have configured some of the infrastructure plugins. And so things don't work. And when things don't work, triaging them becomes nightmarishly hard, right? It's just one of the examples of the problem. Another whole bucket of issues is security, which is, as you have these distributed clusters at scale, you got to ensure someone's job is on the line to make sure that the security policies are configured properly.
>> So this is a huge problem. I love that comment. That's not happening on my system. It's the classic, you know, debugging mentality. But at scale, it's hard to do that, and it's error prone. I can see that being a problem. And you guys have a solution you're launching, can you share what Arlon is? This new product? What is it all about? Talk about this new introduction.
>> Yeah absolutely, I'm very, very excited. You know, it's one of the projects that we've been working on for some time now. Because we are very passionate about this problem and just solving problems at scale in OnPrem or in the cloud or at Edge environments.
And what Arlon is, it's an open source project, and it is a tool, a Kubernetes native tool for complete end-to-end management of not just your clusters, but your clusters, all of the infrastructure that goes within and alongside those clusters, security policies, your middleware plugins, and finally your applications. So what Arlon lets you do in a nutshell is, in a declarative way, it lets you handle the configuration and management of all of these components at scale.
>> So what's the elevator pitch, simply put, for what this solves in terms of the chaos you guys are reining in, what's the bumper sticker? What does it do?
>> There's a perfect analogy that I love to reference in this context, which is, think of your assembly line, you know, in a traditional, let's say an auto manufacturing factory, et cetera, and the level of efficiency at scale that that assembly line brings, right. Arlon, and if you look at the logo we've designed, it's this funny little robot. And it's because when we think of Arlon, we think of these enterprise large scale environments, you know, sprawling at scale, creating chaos, because there isn't necessarily a well thought through, well-structured solution that's similar to an assembly line, which is taking each component, you know, addressing them, manufacturing, processing them in a standardized way, then handing to the next stage where again, it gets processed in a standardized way. And that's what Arlon really does. That's like the elevator pitch. If you have problems of scale, of managing your infrastructure, you know, that is distributed, Arlon brings the assembly line level of efficiency and consistency for those problems.
>> So keeping it smooth, the assembly line, things are flowing, CI/CD pipelining. So you're trying to simplify that OPS piece for the developer. I mean, it's not really OPS, it's their OPS, it's coding.
>> Yeah, not just the developer, the OPS, the operations folks as well, right.
Because developers, you know, developers are responsible for one piece of that layer, which is my apps. And then maybe that middleware of applications that they interface with. But then they hand it over to someone else who's then responsible to ensure that these apps are secured properly, that they are logging, logs are being collected properly. Monitoring and observability is integrated. And so it solves problems for both those teams.
>> Yeah, it's DevOps. So the DevOps is the cloud native developer. The OPS team have to kind of set policies. Is that where the declarative piece comes in? Is that why that's important?
>> Absolutely, yeah. And you know, Kubernetes really introduced or elevated this declarative management, right. Because, you know, your specifications of components that go in Kubernetes are defined in a declarative way. And Kubernetes always keeps that state consistent with your defined state. But when you go outside of that world of a single cluster, and when you actually talk about defining the clusters or defining everything that's around it, there really isn't a solution that does that today. And so Arlon addresses that problem at the heart of it. And it does that using existing open source, well known solutions.
>> And I do want to get into the benefits, what's in it for me as the customer, developer, but I want to finish this out real quick and get your thoughts. You mentioned open source. Why open source? What's the current state of the product? You run the product group over there at Platform9. Is it open source, and you guys have a product that's commercial? Can you explain the open source dynamic? And first of all, why open source? And what is the consumption? I mean open source is great. People want open source, they can download and look up the code, but maybe want to buy the commercial. So I'm assuming you have that thought through. Can you share the open source and commercial relationship?
>> Yeah, I think, you know, starting with why open source? I think, you know, as a company, one of the things that's absolutely critical to us is that we take mainstream open source technologies, components, and then we make them available to our customers at scale through either a SaaS model or OnPrem model, right. But so as we are a company or startup, or a company that benefits, you know, in a massive way by this open source economy, it's only right I think in my mind that we do our part of the duty, right. And contribute back to the community that feeds us. And so, you know, we have always held that strongly as one of our principles. And we have, you know, created and built independent products, starting all the way with Fission, which was a serverless product that we had built, to various other examples that I can give. But that's one of the main reasons why open source. And also open source because we want the community to really first-hand engage with us on this problem, which is very difficult to achieve if your product is behind a wall, you know, behind a black box.
>> Well, and that's what the developers want too. What we're seeing in reporting with Supercloud is the new model of consumption is I want to look at the code and see what's in there.
>> That's right.
>> And then also if I want to use it, I'll do it, great. That's open source, that's the value. But then at the end of the day, if I want to move fast, that's when people buy in. So it's a new kind of freemium, I guess, business model. I guess that's the way it is, but that's the benefit of open source. This is why standards and open source is growing so fast. You have that confluence of, you know, a way for developers to try before they buy, but also actually kind of date the application, if you will. We, you know, Adrian Cockcroft uses the dating metaphor, you know, hey, you know, I want to check it out first before I get married. And that's what open source is.
So this is the new, this is how people are selling. This is not just open source. This is how companies are selling.
>> Absolutely, yeah, yeah. You know, I think two things, I think one is just, you know, this cloud native space is so vast that if you're building a cluster solution, sometimes there's also a risk that it may not apply to every single enterprise's use cases. And so having it open source gives them an opportunity to extend it, expand it, to make it fit their use case, if they choose to do so, right. But at the same time, what's also critical to us is we are able to provide a supported version of it, with an SLA that's backed by us, a SaaS-hosted version of it as well for those customers who choose to go that route. You know, once they have used the open source version and loved it and want to take it to scale and into production and need a partner to collaborate with who can support them for that production environment.
>> I have to ask you. Now let's get into what's in it for the customer? I'm a customer. Why should I be enthused about Arlon? What's in it for me? You know, 'cause if I'm not enthused about it, I'm not going to be confident, and it's going to be hard for me to get behind this. Can you share your enthusiastic view of, you know, why I should be enthused about Arlon, if I'm a customer.
>> Yeah, absolutely. And so, and there's multiple, you know, enterprises that we talk to, many of them are customers, where this is a very kind of typical story that you will hear, which is we have a Kubernetes distribution. It could be On-Premise. It could be public cloud native Kubernetes. And then we have our CI/CD pipelines that are automating the deployment of applications, et cetera. And then there's this gray zone. And the gray zone is, well, before your CI/CD pipelines can deploy the apps, somebody needs to do all of the groundwork of, you know, defining those clusters, and yeah, properly configuring them.
And these things start out being done by hand. And then as you scale, what typically enterprises would do today is they will have their homegrown DIY solutions for this. I mean, the number of folks that I talk to that have built Terraform automation, and then, you know, some of those key developers leave. So it's a typical open source, or typical, you know, DIY challenge. And the reason that they're writing it themselves is not because they want to. I mean, of course technology is always interesting to everybody, but it's because they can't find a solution that's out there that perfectly fits their problem. And so that's that pitch. I think OPS people would be delighted. The folks that we've talked, you know, spoken with have been absolutely excited and have shared that this is a major challenge we have today, because we have a few hundred clusters on Amazon EKS, and we want to scale them to a few thousand, but we don't think we are ready to do that. And this will give us the ability to do that.
>> Yeah, I think people are scared. I won't say scared, that's a bad word. Maybe I should say that they feel nervous because you know, at scale, small mistakes can become large mistakes. This is something that is concerning to enterprises. And I think this is going to come up at KubeCon this year where enterprises are going to say, okay, I need to see SLAs. I want to see track record. I want to see other companies that have used it. How would you answer that question, or challenge, you know, hey I love this, but are there any guarantees? What are the SLAs? I'm an enterprise, I got tight. You know, I love the open source, try it for free, fast and loose, but I need hardened code.
>> Yeah, absolutely. So two parts to that, right? One is Arlon leverages existing open source components, products that are extremely popular.
Two specifically, one is Arlon uses Argo CD, which is probably one of the highest rated and used CD open source tools that's out there, right. Created by folks that are part of the Intuit team now, you know, really brilliant team, and it's used at scale across enterprises. That's one. Second is Arlon also makes use of Cluster API, CAPI, which is a Kubernetes sub-component, right, for lifecycle management of clusters. So there is enough of, you know, community users, et cetera, around these two products or open source projects that will find Arlon to be right up their alley, because they're already comfortable, familiar with Argo CD. Now Arlon just extends the scope of what Argo CD can do. And so that's one. And then the second part is going back to your point of the comfort. And that's where, you know, Platform9 has a role to play, which is when you are ready to deploy Arlon at scale, because you've been, you know, playing with it in your DEV test environments, you're happy with what you get with it. Then Platform9 will stand behind it and provide that SLA.
>> And what's been the reaction from customers you've talked to, Platform9 customers that are familiar with Argo, and then Arlon? What's been some of the feedback?
>> Yeah, I think the feedback's been fantastic. I mean, I can give you examples of customers where you know, initially, when you're telling them about your entire portfolio of solutions, it might not strike a chord right away. But then we start talking about Arlon, and we talk about the fact that it uses Argo CD. They start opening up, they say, we have standardized on Argo, and we have built these components homegrown. We would be very interested. Can we co-develop? Does it support these use cases? So we've had that kind of validation. We've had validation all the way at the beginning of Arlon, before we even wrote a single line of code, saying this is something we plan on doing. And the customer said, if you had it today, I would've purchased it.
So it's been really great validation.
>> All right, so next question is what is the solution to the customer? If I asked you, look, I'm so busy. My team's overworked, I got a skills gap. I don't need another project. I'm so tied up right now, and I'm just chasing my tail. How does Platform9 help me?
>> Yeah, absolutely. So I think, you know, one of the core tenets of Platform9 has always been that we try to bring that public cloud like simplicity by hosting, you know, this and a lot of such similar tools in a SaaS hosted manner for our customers, right. So our goal behind doing that is taking away, or trying to take away all of that complexity from the customer's hands and offloading it to our hands, right. And giving them that full white glove treatment as we call it. And so from a customer's perspective, one, something like Arlon will integrate with what they have, so they don't have to rip and replace anything. In fact, in the next versions, it may even discover your clusters that you have today, and give you an inventory.
>> So customers have clusters that are growing. That's a sign, call you guys.
>> Absolutely, either they have massive, large clusters, right, that they want to split into smaller clusters, but they're not comfortable doing that today. Or they've done that already on say public cloud or otherwise. And now they have management challenges.
>> So, especially operationalizing the clusters, whether they want to kind of reset everything and move things around, and reconfigure, and or scale out.
>> That's right, exactly.
>> And you provide that layer of policy.
>> Absolutely, yes.
>> That's the key value here.
>> That's right.
>> So policy based configuration for cluster scale up.
>> Profile and policy based declarative configuration and life cycle management for clusters.
>> If I asked you how this enables Supercloud, what would you say to that?
>> I think this is one of the key ingredients to Supercloud, right?
If you think about a Supercloud environment, there are at least a few key ingredients that come to my mind that are really critical. Like they are, you know, life saving ingredients at that scale. One is having a really good strategy for managing that scale, you know, going back to the assembly line, in a very consistent, predictable way. So that, Arlon solves. Then you need to complement that with the right kind of observability and monitoring tools at scale, right? Because ultimately issues are going to happen, and you're going to have to figure out, you know, how to solve them fast. And Arlon, by the way, also helps in that direction. But you also need observability tools. And then especially if you're running it on the public cloud, you need some cost management tools. In my mind, these three things are like the most necessary ingredients to make Supercloud successful. And you know, Arlon is one of them.
>> Okay so now the next level is, okay, that makes sense, that's under the covers, so to speak, under the hood. How does that impact the app developers of the cloud native modern application workflows? Because the impact to me seems, the apps are going to be impacted. Are they going to be faster, stronger? I mean, what's the impact if you do all those things, as you mentioned, what's the impact on the apps?
>> Yeah, the impact is that your apps are more likely to operate in production the way you expect them to, because the right checks and balances have gone through. And any discrepancies have been identified prior to those apps, prior to your customer running into them, right? Because developers run into this challenge today where there's a split responsibility, right. I'm responsible for my code. I'm responsible for some of these other plugins, but I don't own this stack end to end. I have to rely on my OPS counterpart to do their part, right. And so this really gives them the right tooling for that.
>> This is actually a great kind of relevant point.
You know, as cloud becomes more scalable, you're starting to see this fragmentation, gone are the days of the full stack developer, to the more specialized role. But this is a key point. And I have to ask you, because if this Arlon solution takes place, as you say, and the apps are going to do what they're designed to do, the question is what does the current pain look like? Are the apps breaking? What are the signals to the customer that they should be calling you guys up and implementing Arlon, Argo, and all the other goodness to automate, what are some of the signals? Is it downtime? Is it failed apps? Is it latency? What are some of the things that would be indications of things are effed up a little bit.
>> Yeah, more frequent down times, down times that take longer to triage. And so your, you know, your mean time to resolution, et cetera, is escalating or growing larger, right? Like we have environments of customers where they have a number of folks in the field that have to take these apps, and run them at customer sites. And that's one of our partners. And they're extremely interested in this, because of the rate of failures they're encountering in, you know, the field when they're running these apps on site, because the field is automating the clusters that are running on sites using their own scripts. So these are the kinds of challenges. So those are the pain points, which is, you know, if you're looking to reduce your mean time to resolution. If you're looking to reduce the number of failures that occur on your production site, that's one. And second, if you're looking to manage these at scale environments with a relatively small focused nimble OPS team, which has an immediate impact on your budget. So those are the signals.
>> This is the cloud native at scale situation. The innovation going on.
Final thought is your reaction to the idea that if the world goes digital, which it is, and the confluence of physical and digital coming together, and cloud continues to do its thing, the company becomes the application. Not where IT used to be supporting the business, you know, the back office, and the immediate terminals and some PCs and handhelds. Now, if technology's running the business, is the business, company's the application. So it can't be down. So there's a lot of pressure on CSOs and CIOs now, and boards are saying, how is technology driving the top line revenue? That's the number one conversation. Do you see the same thing?
>> Yeah, it's interesting. I think there's multiple pressures at the CSO, CIO level, right? One is that there needs to be that visibility and clarity and guarantee almost that, you know, the technology that's going to drive your top line is going to drive that in a consistent, reliable, predictable manner. And then second, there is the constant pressure to do that while always lowering your costs of doing it, right. Especially when you're talking about, let's say retailers, or those kinds of large scale vendors, they many times make money by lowering the amount that they spend providing those goods to their end customers. So I think both those factors kind of come into play and the solution to all of them is usually in a very structured strategy around automation.
>> Final question. What does cloud native at scale look like to you? If all the things happen the way we want 'em to happen, the magic wand, the magic dust, what does it look like?
>> What that looks like to me is a CIO sipping coffee at his desk. Production is running absolutely smooth. And he's running that at a nimble, nimble team size of, at the most, a handful of folks that are just looking after things, but things are just taking care of themselves.
>> And the CIO doesn't exist. There's no CISO, they're at the beach.
>> (laughing) Yeah.
>> Madhura, thank you for coming on, sharing the cloud native at scale here on theCUBE. Thank you for your time.
>> Fantastic, thanks for having me.
>> Okay, I'm John Furrier here for special program presentation, special programming Cloud Native at Scale, Enabling Supercloud Modern Applications with Platform9. Thanks for watching. (upbeat music)
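
The "it works on my cluster" problem described in the conversation (a cluster at a customer or Edge site that lacks the plugins or security policies an app relies on) can be illustrated with a minimal sketch. It assumes a hypothetical `Profile` of required plugins and policies and a hypothetical `Cluster` record. None of these names come from Arlon, Argo CD, or Cluster API, and this is not a description of how Arlon is implemented; it only shows the general idea of comparing declared state against actual state and closing the gap.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Profile:
    """Hypothetical declared baseline that every cluster should satisfy."""
    plugins: frozenset
    policies: frozenset


@dataclass
class Cluster:
    """Hypothetical record of what is actually installed on one cluster."""
    name: str
    plugins: set = field(default_factory=set)
    policies: set = field(default_factory=set)


def drift(profile: Profile, cluster: Cluster) -> dict:
    """Return what the cluster lacks relative to the declared profile."""
    return {
        "missing_plugins": sorted(profile.plugins - cluster.plugins),
        "missing_policies": sorted(profile.policies - cluster.policies),
    }


def reconcile(profile: Profile, cluster: Cluster) -> dict:
    """Bring the cluster up to the profile and return the drift that was fixed."""
    fixed = drift(profile, cluster)
    cluster.plugins |= profile.plugins
    cluster.policies |= profile.policies
    return fixed
```

Running `drift` over every cluster before a rollout would surface the discrepancies ahead of time, rather than after a customer hits them; `reconcile` stands in for whatever tooling actually applies the declared configuration.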

Published Date : Sep 20 2022



COMMUNICATIONS V1 | CLOUDERA

>> Hi. Today I'm going to talk about network analytics and what that means for telecommunications as we go forward: thinking about 5G, the impact that's likely to have on network analytics, and the data requirement, not just to run the network and understand the network a little bit better, but also to inform the rest of the operation of the telecommunications business. So as we think about where we are in terms of network analytics and what that is: over the last 20 years, the telecommunications industry has evolved its management infrastructure to abstract away from some of the specific technologies in the network. So what do we mean by that? Well, when the initial telecommunications networks were designed, there were management systems that were built in. Eventually fault management systems, assurance systems, provisioning systems, and so on were abstracted away. >> So it didn't matter what network technology you had, whether it was a Nokia technology or Ericsson technology or Huawei technology or whatever it happened to be. You could just look at your fault management system and understand where faults happened. As we got into the last 10, 15 years or so, telecommunication service providers became more sophisticated in their approach to data analytics, and specifically network analytics, and started asking questions about why and what if in relation to their network performance and network behavior. And so network analytics as a bit of an independent function was born, and over time more and more data began to get loaded into the network analytics function. So today just about every carrier in the world has a network analytics function that deals with vast quantities of data in big data environments that are now being migrated to the cloud, >> as all telecommunications carriers are migrating as many IT workloads as possible to the cloud.
So what are the things that are happening as we migrate to the cloud that drive enhancements in use cases and in scale in telecommunications network analytics? Well, 5G is the big thing, right? 5G is not just another G. I mean, in some senses it is: 5G means greater bandwidth, lower latency and all those good things. So, you know, we can watch YouTube videos with less interference and less sluggish bandwidth and so on and so forth. But 5G is really about the enterprise and enterprise services transformation. 5G is a more secure kind of network, but 5G is also a more pervasive network. 5G has a fundamentally different network topology than previous generations. So there's going to be more masts, and that means that you can have more pervasive connectivity. >> So things like IoT and edge applications, autonomous cars, smart cities, these kinds of things are all much better served because you've got more masts. That of course means that you're going to have a lot more data as well, and we'll get to that. The second piece is immersive digital services. So with more masts, with more connectivity, with lower latency, with higher bandwidth, the potential is immense for services innovation. And we don't know what those services are going to be. We know that technologies like augmented reality and virtual reality have great potential, but we have yet to see where those commercial applications are going to be. But the innovation potential for 5G is phenomenal. It certainly means that we're going to have a lot more edge devices, and that again is going to lead to an increase in the amount of data that we have available.
>> And then the idea of pervasive connectivity when it comes to smart cities, autonomous cars, integrated traffic management systems, all of this kind of stuff: those kinds of smart environments thrive where you've got this kind of pervasive connectivity, this persistent connection to the network. Again, that's going to drive more innovation. And again, because you've got these new connected devices, you're going to get even more data. So this exponential rise in data is really what's driving the change in network analytics. And there are four major vectors that are driving this increase in data, in terms of both volume and speed. So the first is more physical elements. We said already that 5G networks are going to have a different topology. 5G networks will have more devices and more masts. >> And so with more physical elements in the network, you're going to get more physical data coming off those physical networks. And so that needs to be aggregated and collected and managed and stored and analyzed and understood, so that we can have a better understanding as to why things happen the way they do, why the network behaves in the ways that it does, and why devices that are connected to the network, and ultimately of course consumers, whether they be enterprises or retail customers, behave in the way they do in relation to their interaction with the network. Edge nodes and devices: we're going to have an explosion in terms of the number of devices. We've already seen IoT devices, with different kinds of trackers and sensors that are hanging off the edge of the network, whether it's to make buildings smarter, cars smarter, or people smarter, in terms of having the measurements and the connectivity and all that sort of stuff.
>> So the numbers of devices at the edge and beyond the edge are going to be phenomenal. One of the things that we've been trying to grapple with as an industry over the last few years is: where does the telco network end, and where does the enterprise, or even the consumer, network begin? It used to be very clear that, you know, the telco network ended at the router. But now it's not that clear anymore, because in the enterprise space, particularly with virtualized networking, which we're going to talk about in a second, you start to see end-to-end network services being deployed. And so those services in some instances are being managed by the service provider themselves, and in some cases by the enterprise client. Again, the line between where the telco network ends and where the enterprise or the consumer network begins is not clear. >> So the proliferation of devices at the edge, in terms of what those devices are, what the data yield is, and what the policies are that need to govern those devices in terms of security and privacy and things like that, that's all going to be really, really important. Virtualized services: we just touched on that briefly. One of the big trends that's happening right now is not just the shift of IT operations onto the cloud, but the shift of the network onto the cloud, the virtualization of network infrastructure. And that has two major impacts. First of all, it means that you've got the agility and all of the scale benefits that you get from migrating workloads to the cloud, the elasticity and the growth and all that sort of stuff. But arguably more importantly for the telco, it means that with a virtualized network infrastructure, you can offer entire networks to enterprise clients.
>> So if you're selling to a government department, for example, that is looking to stand up a system for, you know, export certification, something like that, you can not just sell them the connectivity, but you can sell them the networking and the infrastructure in order to serve that entire end-to-end application. You could, in a sense, offer them in theory an entire end-to-end communications network, and with 5G network slicing they can even have their own little piece of the 5G bandwidth that's been allocated to the carrier, and have a complete end-to-end environment. So the kinds of services that can be offered by telcos, given virtualized network infrastructure, are many and varied, and it's an outstanding opportunity. But what it also means is that the number of network elements, virtualized in this case, is also exploding. >> That means the amount of data that we're getting, informing us as to how those network elements are behaving and how they're performing, is going to go up as well. And then finally, AI complexity. So on the demand side, while historically network analytics and big data have been driven by returns in terms of data monetization, whether that's through cost avoidance or service assurance, or even revenue generation through data monetization and things like that, AI is transforming telecommunications and every other industry. The potential for autonomous operations is extremely attractive. And so understanding how the end-to-end telecommunications service delivery infrastructure works is essential as a training ground for AI models that can help to automate a huge amount of telecommunications operating processes. So the AI demand for data is just going through the roof. >> And so all of these things combine to mean big data is getting explosive.
It is absolutely going through the roof. So that's a huge thing that's happening. So as telecommunications companies around the world are looking at their network analytics infrastructure, which was initially designed primarily for service assurance, and at how they migrate that to the cloud, these things are impacting on those decisions, because you're not just looking at migrating a workload that used to work in the data center to operate in the cloud. Now you're looking at migrating a workload but also expanding the use cases in that workload. And bear in mind, many of those are going to need to remain on-prem, so they'll need to be within a private cloud, or at best a hybrid cloud environment, in order to satisfy regulatory and jurisdictional requirements. So let's talk about an example. >> So LG U+ is a fantastic service provider in Korea, with huge growth in that business over the last 10, 15 years or so. And obviously most people will be familiar with LG, the electronics brand, maybe less so with LG U+, but they've been doing phenomenal work, and were the first business in the world to launch commercial 5G, in 2019. And so that was a huge milestone that they achieved. And at the same time they deployed the Network Real-time Analytics Platform, or NRAP, from a combination of Cloudera and our partner Comarch. Now, there were a number of things that were driving the requirement for the analytics platform at the time. Clearly the 5G launch was the big thing that they had in mind, but there were other things too. So within the 5G launch, they were looking for visibility of services, and service assurance and service quality. >> So, you know, what services have been launched? How are they being taken up? What are the issues that are arising? Where are the faults happening? Where are the problems?
Because clearly when you launch a new service, you want to understand and be on top of the issues as they arise. So that was really, really important. The second piece was, and, you know, this is not a new story to any telco in the world, right, that there are silos in operation. And so eliminating redundancies through the process of digital transformation was really important. And so in particular, the two silos, the wired and the wireless sides of the business, had to come together so that there would be an integrated network management system for LG U+ as they rolled out 5G. So eliminating redundancy and driving cost savings through the integration of the silos is really, really important. >> And that's a process and a people thing every bit as much as it is a systems and a data thing. So, another big driver, and the fourth one: you know, we've talked a little bit about some of these things, right? 5G brings huge opportunity for enterprise services innovation. So Industry 4.0, digital experience, these kinds of use cases are very important in the South Korean market and in the business of LG U+. And so, looking at AI, and how can you apply AI to network management? Again, there's a number of really exciting use cases that have gone live now in LG U+ since we did this initial deployment, and they're making fantastic strides there. Big data analytics for users across LG U+, right? So it's not just for the immediate application of 5G or the support of the 5G network, >> but also for other data analysts and data scientists across the LG U+ business. Network analytics, while its primary use case is around network management, has applications across the entire business, right?
So, you know, for customer churn or next best offer, for understanding customer experience and customer behavior, for digital advertising, for product innovation: all sorts of different use cases and departments within the business needed access to this information. So collaboration and sharing across the real-time network analytics platform was very important. And then finally, as I mentioned, LG Group is much bigger than just LG U+; it's got the electronics and other pieces, and they had launched a major group-wide digital transformation program in 2019, so being a part of that as well. Some of the problems that they were looking to address: >> first of all, the integration of wired and wireless data sources. Getting your assurance data sources, your network data sources and so on integrated is really, really important. Scale was massive for them. You know, they're talking about billions of transactions in under a minute being processed, and hundreds of terabytes per day. So, you know, phenomenal scale that needed to be available out of the box, as it were. Real-time indicators and alarms: there were lots of KPIs and thresholds set to meet certain criteria, certain standards. Customer-specific real-time analysis of 5G, particularly for the launch. Root cause analysis and AI-based prediction of service anomalies and service issues was a core use case. As I talked about already, the provision of data services across the organization. And then support for 5G service and business service impact was extremely important.
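The "KPIs and thresholds" requirement above can be pictured with a minimal sketch. This is not the NRAP implementation; the KPI names, limits, and the `check_kpis` helper are all hypothetical, chosen only to show what a threshold-based real-time alarm check might look like.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    kpi: str                      # hypothetical KPI name, not from the talk
    limit: float                  # value beyond which an alarm is raised
    higher_is_worse: bool = True  # False for KPIs where low values are the problem


def check_kpis(sample: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one alarm message per KPI in `sample` that breaches its threshold."""
    alarms = []
    for t in thresholds:
        value = sample.get(t.kpi)
        if value is None:
            continue  # this KPI was not reported in the sample
        breached = value > t.limit if t.higher_is_worse else value < t.limit
        if breached:
            alarms.append(f"ALARM {t.kpi}: {value} breaches limit {t.limit}")
    return alarms


# Illustrative thresholds and one measurement sample (made-up numbers).
THRESHOLDS = [
    Threshold("dropped_call_rate_pct", 2.0),
    Threshold("avg_throughput_mbps", 50.0, higher_is_worse=False),
]
sample = {"dropped_call_rate_pct": 3.5, "avg_throughput_mbps": 120.0}
alarms = check_kpis(sample, THRESHOLDS)
print(alarms)
```

In a real deployment a check like this would run continuously over streaming measurements rather than over a single dictionary; the sketch only shows the comparison logic.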
>> So it's not just understanding, you know, that you have an outage in a particular network element, but what is the impact on the business of LG U+, and also what is the impact on the business of the customer, from an outage or an anomaly or a problem on the network. So being able to answer those kinds of questions is really, really important too. And as I said, between Cloudera and Comarch, and LG U+ really themselves an intrinsic part of the solution, this is what we ended up building. So, a big, complicated architecture. I really don't want to go into too much detail here. You can see these things for yourself, but let me skip through it really quickly. So, first of all, the key data sources: you have all of your wireless network information, other data sources. >> This is really important, because sometimes you kind of skip over this. There are other systems that are in place, like the enterprise data warehouse, that needed to be integrated as well. Southbound and northbound interfaces: so we get our data from the network and so on, and network management applications, through file interfaces. Kafka and NiFi are important technologies. And also the RDBMS systems, you know, like the enterprise data warehouse: we're able to feed that into the system. And then northbound, you know, we spoke already about making network analytics services available across the enterprise. So, you know, having both the file and the API interface available for other systems and other consumers across the enterprise is very important. Lots of stuff going on then in the platform itself: two petabytes of persistent storage; Cloudera HDFS, 300 nodes, for the raw data storage; and then Kudu for real-time storage, for real-time indicator analysis, alarm generation and other real-time processes.
>> So that was the core of the solution. Spark processes for ETL, key quality indicators and alarming, and also a bunch of work done around data preparation and data generation for transfer to third-party systems through the northbound interfaces; Impala API queries for real-time systems, there on the right-hand side; and then a whole bunch of clustering, classification and prediction jobs through the ML processes, the machine learning processes. Again, another key use case, and we've done a bunch of work on that. And I encourage you to have a look at the Cloudera website for more detail on some of the work that we did here. So this is some pretty cool stuff. And then finally, just the upstream services. There's lots more than simply these ones, but service assurance is really, really important. So SQM, CEM and (indistinct): the service quality management, customer experience and autonomous controllers are really, really important consumers of the real-time analytics platform. And your conventional service assurance functions, like fault and performance management: these things are as much consumers of the information in the network analytics platform as they are providers of data to the network analytics >> platform. >> So, some of the specific use cases that have been stood up and that are delivering value to this day. There are lots more besides, but these are just three that we pulled out. So first of all, service-specific monitoring and customer quality analysis, care and response.
So again, growing from the initial 5G launch and then broadening into broader services: understanding where there are issues, so that when people are complaining, when people have an issue, we can answer the concerns of the client in a substantive way. AI functions around root cause analysis: understanding why things went wrong when they went wrong, and also making recommendations as to how to avoid those occurrences in the future, so we know what preventative measures can be taken. And then finally, the collaboration function across LG U+, extremely important, and it continues to be important to this day, where data is shared throughout the enterprise through the API layer, through file interfaces and other things, and through interface integrations with upstream systems. >> So that's kind of the real quick run-through of LG U+. The numbers are just staggering. You know, we've seen upwards of a billion transactions in under 40 seconds being tested, and we've gone beyond those thresholds now already. And this isn't just a theoretical sort of benchmarking test or something like that. We're seeing these kinds of volumes of data not too far down the track. So those things that I mentioned earlier, the proliferation of network infrastructure in the 5G context, with virtualized elements, with all of these other bits and pieces, are driving massive volumes of data towards the network analytics platform. So, phenomenal scale. This is just one example. We work with service providers all over the world: over 80% of the top 100 telecommunication service providers run on Cloudera. >> They use Cloudera in the network, and we're seeing those customers all migrating legacy cloud platforms now onto CDP, onto the Cloudera Data Platform.
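To put the scale figures quoted in the talk into per-second terms, here is a small back-of-the-envelope calculation. It treats "a billion transactions in under 40 seconds" and "hundreds of terabytes per day" as lower bounds (exactly 1e9 transactions in 40 s and exactly 100 TB per day, decimal units); those simplifications are assumptions made here, not figures from the speaker.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def rate_per_second(total: float, window_seconds: float) -> float:
    """Average rate needed to process `total` units within `window_seconds`."""
    return total / window_seconds


# Assumed lower bounds taken from the round numbers quoted in the talk.
tx_per_second = rate_per_second(1_000_000_000, 40)
bytes_per_second = rate_per_second(100e12, SECONDS_PER_DAY)

print(f"{tx_per_second:,.0f} transactions/second")       # 25,000,000
print(f"{bytes_per_second / 1e9:.2f} GB/second ingest")  # 1.16
```

Under those assumptions the platform would need to sustain on the order of tens of millions of transactions per second and roughly a gigabyte per second of ingest, averaged over the day.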
They're increasing the jobs that they do. So it's not just warehousing, not just ingestion and ETL, but moving into things like machine learning, and also looking at new data sources from places like NWDAF, the network data analytics function in 5G, or the management and orchestration layer in software-defined networks and network function virtualization. So, you know, new use cases coming in all the time, new data sources coming in all the time, growth in the application scope, as we say, from edge to AI. And so it's really exciting to see how the footprint is growing and how the applications in telecommunications are really making a difference in facilitating network transformation. And that's me covered for today. I hope you found that helpful. By all means, please reach out; there's a couple of links here. You can follow me on Twitter. You can connect to the telecommunications page, or reach out to me directly at Cloudera. I'd love to answer your questions and talk to you about how big data is transforming networks, and how network transformation is accelerating telcos throughout. >> Jamie Sharath with LigaData. I'm primarily on the delivery side of the house, but I also support our new business teams. I'd like to spend a minute really just kind of telling you about LigaData. We're basically a Silicon Valley startup, started in 2014, and our leadership, our executive team, were basically the data officers at Yahoo before this. We provide managed data services, and we provide products that are focused on telcos. So we have some experience in non-telco industries, but our focus for the last seven years or so is specifically on telco. So again, something over 200 employees. We have a global presence in North America, the Middle East, Africa, Asia, and Europe.
And we have folks in all of those places. I'd like to call your attention to the middle, really, of the screen there. So here is where we have done some partnership with Cloudera. >> So if you look at that, you can see we're in Holland and Jamaica, and then a lot throughout Africa as well. Now, the data fabric is the product that we're talking about. And the data fabric is basically a big data type of data warehouse with a lot of additional functionality involved. The data fabric is comprised of something called Flare, which we'll talk about in a minute, below there, and then the Cloudera Data Platform underneath. So this is how we're partnering together. We have this tool, and it's functioning and delivering in something over 10 opcos. So, Flare. Now, Flare is the piece that is LigaData IP. The rest is there. And what Flare does is basically pull in data and integrate it; it's an event streaming platform. It is the engine behind the data fabric. >> It's also a decisioning platform. So in real time, we're able to pull in data, we're able to run analytics on it, and we're able to alert or do whatever is needed on a real-time basis. Of course, a lot of clients at this point are still sending data in batch, so it handles that as well, but we call that a CA picture Sanchez. Now, Sacho is a very interesting app. It's an AI analytics app for executives. What it is, is it runs on your mobile phone. It ties into your data. Now, this could be the data fabric, but it could be a standalone product. And basically it allows you to ask, you know, human-type questions, to say: how are my gross adds last week? How are they comparing against the same time the week before that? And even the same time 60 days ago? So as an executive or as an analyst, I can pull it up and I can look at it instantly, in a meeting or anywhere else, without having to think about queries or anything like that.
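The kind of question quoted above ("how are my gross adds last week, against the week before, against 60 days ago?") boils down to comparing the same metric over shifted time windows. The following is a minimal sketch of that comparison, not the app's actual logic; the function names and the daily figures are invented for illustration.

```python
from datetime import date, timedelta


def window_total(daily, end, days=7):
    """Sum a daily metric over the `days` days ending on `end` (inclusive)."""
    start = end - timedelta(days=days - 1)
    return sum(value for day, value in daily.items() if start <= day <= end)


def compare(daily, end, offset_days):
    """Percent change of the latest 7-day window versus the 7-day window
    that ended `offset_days` earlier. Returns None if the earlier total is 0."""
    current = window_total(daily, end)
    previous = window_total(daily, end - timedelta(days=offset_days))
    if previous == 0:
        return None
    return (current - previous) / previous * 100


# Made-up daily gross adds: 100/day this week, 80/day the week before, 50/day earlier.
today = date(2021, 8, 1)
gross_adds = {
    today - timedelta(days=i): 100 if i < 7 else 80 if i < 14 else 50
    for i in range(70)
}

print(compare(gross_adds, today, 7))   # vs the week before: 25.0
print(compare(gross_adds, today, 60))  # vs the same window 60 days ago: 100.0
```

A natural-language front end would map the spoken question onto calls like these; that mapping is outside the scope of this sketch.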
>> So that's pretty much it for us at LigaData. Now, really to set the context of where we are: this is a traditional telco environment. So you see the systems of record, you see the cloud, you see OSS and BSS data. So one of the things that the next step above, which we call the system of intelligence, the data fabric, does is it merges that BSS and OSS data. So we no longer have any silos or anything that's separated; it's all coming into one area to allow business to go in, or allow data scientists to go in, and do that. So if you look at the bottom line, excuse me, of the system of intelligence, you can see that Flare is the tool that pulls in the data. So it provides event streaming capabilities. It preserves entity states, so that you can go back and look at its state at any time. >> It does stream analytics; that is, as the data is coming in, it can perform analytics on it. And it also allows real-time decisioning. So that's something that business users can go in and do: create a system of if-thens. It looks very much like a graph database, where you can create a product that will allow the user to be notified if a certain condition happens. So for instance, a bundle: a user is soon to run out of his bundle, and an offer can be sent to him right on the fly. And that's set up by the business user as opposed to programmers. Data infrastructure: the fabric has really three areas where data is persisted. Obviously there's the data lake. So the data lake stores that level of granularity that is very deep: years and years of history. Data scientists like that, and, you know, for historical record keeping and requirements from the government, that data would be stored there. >> Then there's also something we call the business semantics layer, and the business semantics layer contains something over 650 specific telco KPIs.
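The real-time decisioning example above (a subscriber about to run out of a bundle gets an offer on the fly) can be sketched as an if-then rule evaluated against preserved per-subscriber state on each incoming event. This is an illustration only and does not reflect Flare's API; `SubscriberState`, `Rule`, `on_usage_event`, the 100 MB cutoff, and the offer text are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SubscriberState:
    """Per-subscriber entity state kept between events (hypothetical shape)."""
    subscriber_id: str
    bundle_mb_left: float
    offers_sent: list = field(default_factory=list)


@dataclass
class Rule:
    """An if-then rule: when `condition` holds for a subscriber, send `offer`."""
    name: str
    condition: Callable[[SubscriberState], bool]
    offer: str


def on_usage_event(state, used_mb, rules):
    """Apply a usage event to the entity state, then fire any matching rules.
    Each offer is sent at most once per subscriber."""
    state.bundle_mb_left = max(0.0, state.bundle_mb_left - used_mb)
    fired = []
    for rule in rules:
        if rule.condition(state) and rule.offer not in state.offers_sent:
            state.offers_sent.append(rule.offer)
            fired.append(rule.offer)
    return fired


# A business-user style rule: under 100 MB left triggers a top-up offer.
RULES = [Rule("bundle_nearly_exhausted", lambda s: s.bundle_mb_left < 100, "1 GB top-up offer")]

state = SubscriberState("sub-001", bundle_mb_left=500)
first = on_usage_event(state, 300, RULES)   # 200 MB left, no rule fires
second = on_usage_event(state, 150, RULES)  # 50 MB left, the offer fires
third = on_usage_event(state, 10, RULES)    # offer already sent, nothing fires
print(first, second, third)
```

Keeping `offers_sent` in the state is one simple way to avoid repeating an offer; a production system would likely also expire or reset that history.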
These are initially from TM Forum, but they also include ones from various mobile operators that we've delivered at, and we've grown that. So that's there for business; the data lake is there for data scientists. Analytical stores can be used for many different reasons. A lot of times RDBMSs are still there. So this Cloudera platform can tie into analytical data stores as well, via Flare. Access and reporting: graphic visualizations, and APIs are a very key part of it. Third-party query tools, any kind of query tools, can be used. And those are, of course, the ones that are highly optimized and allow, you know, search of billions of records. >> And then if you look at the top, it's the systems of engagement, and you might call these use cases. So telco reporting: hundreds of KPIs that are generated for users. Segmentation: basically micro to macro segmentation. Segmentation will play a key role in a use case we'll talk about in a minute. Monetization: so this helps telco providers monetize their specific data, but monetize it in the sense of, okay, how do they make money off of it, but also how might you leverage this data to engage with another client? So for instance, in some places where it's allowed, DPI is used, and the fabric tracks exactly where each person, we call it a subscriber, goes within his internet browsing on 4G or 5G. And all that data is stored. So you can tell a lot of things: the segment, the profile that's being used, and, you know, what is their propensity to buy? Do they spend a lot of time on the Coca-Cola page? There are buyers out there that find that information very valuable. And then there's Sanchez, which we spoke briefly about before; that sits on top of the fabric or it stands alone.
>> So the story really that we want to tell is this one; this is one case out of it. This is a CVM type of case. So there was a mobile operator out there that was really offering, you know, packages, whether it's a bundle or whether it's a particular tool, to subscribers. They were offering with kind of a broad approach that was not very focused. It was not depending on the segments that were created around the profiling. Earlier, the subscriber usage was somewhat dated, and this was causing a lot of those offers to be just basically not taken up. There were limited segmentation capabilities really before the fabric came in. Now, one of the key things about the fabric is when you start building segments, you can build on that history. >> So all of that data stored in the data lake can be used in terms of segmentation. So what did we do about that? The MNO had this challenge; we basically put the data fabric in, and the data fabric was running on Cloudera Data Platform, and that's how we team up. We facilitated the ability to personalize campaigns. So what that means is, with the segments that were built, when a user fell within a segment, we knew exactly what his behavior most likely was. So those recommendations, those offers, could be created then, and we enabled this in real time: so, a real-time ability to even go out to the CRM system and gather further information about that. All of these tools, again, were running on top of the Cloudera Data Platform. What was the outcome? Well, the outcome was that there was a much more precise offer given to the client, that was accepted: an increase in cross-sell and upsell, and in subscriber retention. >> Our clients came back to us and pointed out that it was a 183% year-on-year revenue increase. So this is probably one of the key use cases.
Now, one thing to really mention is there are hundreds and hundreds of use cases running on the fabric, and I would even say thousands. A lot of those have been migrated. So when the fabric is deployed, when they bring the Cloudera and LigaData solution in, there's generally a legacy system that has many use cases. So many of those were migrated, virtually all of them, and put on the cloud. Another thing is that new use cases are enabled. So when you get this level of granularity, and when you have campaigns that can now base their offers on years of history as opposed to 30 days of history, the campaign management and response systems are enabled quite a bit to be precise in their offers. Okay. >> So this is a technical slide. One of the things that we normally do when we're out there talking to folks is we talk and give an overview, and that lasts a little while, and then we give a deep technical dive on all aspects of it. So sometimes that deep dive can go a couple of hours. I'm going to do this slide in a couple of minutes. So if you look at it, you can see over on the left, this is the sources of the data. And they go through this tool called Flare that runs on the Cloudera Data Platform. That can either be via queues, real-time queues, or it can be via a landing zone, or it can be a data extraction. You can take a look at the data quality that's there. So those are built in. One of the things that Flare does is it has an out-of-the-box ability to ingest data sources and to apply data quality and validation for telco-type sources. >> But one of the reasons this is fast to market is because throughout those 10 or 12 opcos that we've done with Cloudera, we have already built models: so, models for CCN, for AIR, for most mediation systems. So there's not going to be a type of input that we haven't already seen, or very rarely.
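The idea of applying data quality and validation at ingest, mentioned above, can be illustrated with a small sketch that splits incoming records into accepted and rejected sets. The record schema (`subscriber_id`, `event_time`, `bytes_used`) and the two checks are assumptions made here for illustration; the talk does not describe Flare's actual validation rules or source models.

```python
from __future__ import annotations

# Hypothetical minimal schema for a usage record; not taken from the talk.
REQUIRED_FIELDS = ("subscriber_id", "event_time", "bytes_used")


def validate(record: dict) -> list[str]:
    """Return a list of data-quality problems found in one record (empty if clean)."""
    problems = []
    for name in REQUIRED_FIELDS:
        if record.get(name) in (None, ""):
            problems.append(f"missing {name}")
    bytes_used = record.get("bytes_used")
    if isinstance(bytes_used, (int, float)) and bytes_used < 0:
        problems.append("negative bytes_used")
    return problems


def split_valid(records: list[dict]):
    """Partition records into (valid, rejected); rejected keeps the reasons."""
    valid, rejected = [], []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            valid.append(record)
    return valid, rejected


records = [
    {"subscriber_id": "sub-001", "event_time": "2021-08-01T10:00:00", "bytes_used": 1024},
    {"subscriber_id": "", "event_time": "2021-08-01T10:00:05", "bytes_used": 2048},
    {"subscriber_id": "sub-003", "event_time": "2021-08-01T10:00:09", "bytes_used": -5},
]
valid, rejected = split_valid(records)
print(len(valid), [reasons for _, reasons in rejected])
```

Keeping the rejection reasons alongside each rejected record makes it possible to report on source quality rather than silently dropping data.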
So that actually speeds up deployment considerably. Then Flare does the transformations, the metrics, continuous learning, which we call continuous decisioning, and API access. For faster response, we use a distributed cache. I'm not going to go too deeply in there, but that layer and the business semantics layer, again, are sitting on top of the Cloudera Data Platform. You see the Kafka queue on the right as well. >> And all of that, we're calling the fabric. So the fabric is Cloudera Data Platform and the cloud and Flare, and all of this runs together. And by the way, there have been many, many hundreds of hours testing Flare with Cloudera and the whole process. What are the results? Well, there are four I'm going to talk about. We saw the one for, it was called my pocket pocket, but it's a CVM-type use case. The subscribers of that mobile operator were 14 million plus. There was a use case for 24 million plus where the year-on-year revenue increase was 130%, and 32 million plus for 38%. These are different CVM-type use cases, as well as network use cases. And then there was 44% for a telco with 76 million subscribers. So I think that there are a lot more use cases that we could talk about, but in this case, these are the ones we're looking at. Again, 183%. This is something that we find consistently, and these figures come from our actual end clients. How do we unlock the full potential of this? Well, I think the way to start is to arrange a meeting, and it would be great for you to reach out to me or to Anthony. We're working in conjunction on this, and we can set up a meeting and go through this initial meeting. I think that's the very beginning. Again, you can get additional information from the Cloudera website and from the Ligadata website. Anthony, that's the story. Thank you. 
>> No, that's great. Jamie, thank you so much. It's wonderful to go deep. I know that there are hundreds of use cases being deployed in MTN, but it's great to go deep on one. And like you said, once you get that sort of architecture in place, you can do so many different things. The power of data is tremendous, but it's great to be able to see how you can track it end to end, from collecting the data, processing it, and understanding it, to then applying it in a commercial context and bringing actual revenue back into the business. So there is your ROI straight away. Now you've got a platform that you can transform your business on. It's a tremendous story, Jamie, and thank you for your part. Sure. That's our story for today. Like Jamie says, please do feel free to reach out to us. The website addresses are there, along with our contact details, and we'd be delighted to talk to you a little bit more about some of the other use cases, perhaps, and maybe about your own business and how we might be able to make it perform a little better. So thank you.

Published Date : Aug 4 2021

SUMMARY :

Um, thinking about, uh, So it didn't matter what network technology had, whether it was a Nokia technology or Erickson technology the cloud that drive, uh, uh, enhancements in use cases uh, and that again is going to lead to an increase in the amount of data that we have available. So the first is more physical elements. And so that needs to be aggregated and collected and managed and stored So the numbers of devices on the agent beyond the age, um, are going to be phenomenal. the agility and all of the scale, um, uh, benefits that you get from migrating So the kinds of services So on the demand side, um, So they'll need to be within a private cloud or at best a hybrid cloud environment in order to satisfy huge growth in that business over the last, uh, over the last 10, 15 years or so. And so particular, the two silos between And so, uh, um, the real-time network analytics platform, um, it was very important. Um, so first of all, the integration of wired and wireless data service data sources, So, first of all, the key data sources, um, you have all of your wireless network information, And also the RDBMS systems that, uh, you know, like the enterprise data warehouse that we're able to feed of the information and the network analytics platform as they are providers of data to the network, Um, so some of the specific use cases, uh, Um, you know, we've seen, Um, and also looking at new data sources from places like NWTF the network data analytics So here is where we have done some partnership with So if you look at that and you can see we're in Holland and Jamaica, and then a lot to throughout And even the same time So the longer we have any silos data, scientists like that, uh, and, uh, you know, for a historical record keeping and requirements of course, the, uh, the ones that are highly optimized and allow, the segment, the profile that's being used and, you know, what are they propensity to buy? 
Now, one of the key things about the fabric is when you start building segments, So all of that data stored in the data lake can be used in terms of segmentation. So when you get this level of granularity and when you have campaigns that can now base their offers So if you look at it, you can see over on the left, this is the, uh, the sources of the data. So there's not going to be a type of, uh, input that we haven't already seen are very rarely. So the fabric is Cloudera data platform and the cloud uh, and how we might be able to make it, make it perform a little better.

ENTITIES

Entity	Category	Confidence
Jamie	PERSON	0.99+
Jeremy	PERSON	0.99+
Holland	LOCATION	0.99+
Jamie Sharath	PERSON	0.99+
Anthony	PERSON	0.99+
Korea	LOCATION	0.99+
38%	QUANTITY	0.99+
Cloudera	ORGANIZATION	0.99+
2014	DATE	0.99+
2019	DATE	0.99+
183%	QUANTITY	0.99+
Europe	LOCATION	0.99+
24 million	QUANTITY	0.99+
14 million	QUANTITY	0.99+
LG	ORGANIZATION	0.99+
second piece	QUANTITY	0.99+
30 days	QUANTITY	0.99+
Jamaica	LOCATION	0.99+
Nokia	ORGANIZATION	0.99+
Huawei	ORGANIZATION	0.99+
today	DATE	0.99+
Yahoo	ORGANIZATION	0.99+
130%	QUANTITY	0.99+
32 million	QUANTITY	0.99+
Asia	LOCATION	0.99+
last week	DATE	0.99+
Erickson	ORGANIZATION	0.99+
Finastra	ORGANIZATION	0.99+
three	QUANTITY	0.99+
thousands	QUANTITY	0.99+
Africa	LOCATION	0.99+
north America	LOCATION	0.99+
telco	ORGANIZATION	0.99+
Silicon valley	LOCATION	0.99+
first	QUANTITY	0.99+
each person	QUANTITY	0.99+
Willie	PERSON	0.99+
10	QUANTITY	0.99+
44%	QUANTITY	0.99+
over 80%	QUANTITY	0.99+
one	QUANTITY	0.98+
76 million subscribers	QUANTITY	0.98+
60 days ago	DATE	0.98+
over 200 employees	QUANTITY	0.98+
LGU plus	ORGANIZATION	0.98+
Cloudera	TITLE	0.98+
Sacho	TITLE	0.98+
middle east Africa	LOCATION	0.97+
First	QUANTITY	0.97+
Liga data	ORGANIZATION	0.97+
four major vectors	QUANTITY	0.97+
under 40 seconds	QUANTITY	0.97+
YouTube	ORGANIZATION	0.97+
one example	QUANTITY	0.97+
One	QUANTITY	0.97+
two silos	QUANTITY	0.97+
each	QUANTITY	0.96+
Karen	PERSON	0.96+
one case	QUANTITY	0.96+
billions of records	QUANTITY	0.96+
three areas	QUANTITY	0.96+
under a minute	QUANTITY	0.95+
CAFCA	ORGANIZATION	0.95+
one thing	QUANTITY	0.95+
both	QUANTITY	0.94+
12	QUANTITY	0.94+
LG plus	ORGANIZATION	0.94+
Twitter	ORGANIZATION	0.94+
one area	QUANTITY	0.93+
fourth one	QUANTITY	0.93+
hundreds and	QUANTITY	0.92+
a year	QUANTITY	0.92+

Ed Walsh, ChaosSearch | AWS re:Invent 2020 Partner Network Day

>> Narrator: From around the globe, it's theCUBE, with digital coverage of AWS re:Invent 2020. Special coverage sponsored by AWS Global Partner Network. >> Hello and welcome to theCUBE Virtual and our coverage of AWS re:Invent 2020, with special coverage of the APN partner experience. We are theCUBE Virtual and I'm your host, Justin Warren. And today I'm joined by Ed Walsh, CEO of ChaosSearch. Ed, welcome to theCUBE. >> Well, thank you for having me, I really appreciate it. >> Now, this is not your first time here on theCUBE. You're a regular here and I love having you back. >> I love the platform, you guys are great. >> So let's start off by just reminding people about what ChaosSearch is and what you do there. >> Sure. The best way to say it is that ChaosSearch helps our clients know better. We don't do that with a special wizard or a widget that you give to your, you know, SecOps teams. What we do is the hard work to give you a data platform to get insights at scale. And we do that by achieving the promise of data lakes. So what we have is the Chaos data platform, which connects and indexes data in a customer's S3 or Glacier accounts. So it's inside your data lake, not our data lake, but it renders that data fully searchable and available for analysis using your existing tools today, because what we do is index it and publish open APIs, like the Elasticsearch API, and soon SQL. So to give you an example: based upon those capabilities, we're an ideal replacement for commonly deployed Elasticsearch or ELK Stack deployments, if you're hitting scale issues. So we talk about scalable log analytics, and more and more people are hitting these scale issues. So let's say you're using Elasticsearch, ELK, or Amazon Elasticsearch, and you're hitting scale issues. What I mean by that is, you can't keep enough retention. 
You want longer retention, or it's getting very expensive to keep that retention, or because of the scale you hit availability problems, where the cluster is hard to keep up and running or is crashing. That's what we mean by the issues at scale. And what we do is simple: because we're publishing the open API of Elasticsearch, we allow you to use all your tools, but we save you about 80% off your monthly bill. We also give you, and it's an "and" statement, unlimited retention, as much as you want to keep on S3 or in Glacier. But we also take care of all the hassles and management and the time to manage these clusters, which end up being on a database server called Lucene. We take care of that as a managed service. And probably the biggest thing is that all of this comes without changing anything your end users are using. So we include Kibana, but imagine it's an Elastic API. So if you're using the API or Kibana, it's just easy to use the exact same tools you use today, but you get the benefits of a true data lake. In fact, we're now running Elasticsearch on top of S3 natively, if that makes sense. >> Right, and natively is pretty cool. And look, 80% savings is a dramatic number, particularly this year. I think there's a lot of people who are looking to save a few quid. So it'd be very nice to be able to save up to 80%. I am curious as to how you're able to achieve that kind of saving, though. >> Yeah, you won't be the first person to ask me that. So listen, Elastic came around. You know, we had Splunk, and we also have a lot of Splunk clients, but Elastic was a more cost-effective, open source solution to go after it. But here's what happens, especially at scale. If it's small, it's actually very cost-effective. But underneath the Elasticsearch ELK Stack is a Lucene database; it's a database technology. And that sits on servers that are heavy on memory count, CPU count, and SSDs. 
So you can do it on-prem or in the cloud. If you do it on Amazon, basically you're spinning up a server and it stays up; it doesn't spin up and spin down. And those clusters are not one server; each is a cluster of those servers. And typically, if you have any scale, you actually have multiple clusters, because you don't dare put it all on one, for different use cases. So our savings come from the fact that you no longer need those servers to spin up, and you don't need to pay for that Lucene underneath. You can still use Kibana or the API, but literally it's 80% off the bill that you're paying for your service now, and it's hard dollars. We typically see clients save between 70 and 80%. It's up to 80, but it's literally right within a 10% margin. So you're saving a lot of money, and saving money is a great thing. But more importantly, you now have one unified data lake. You can go across some of the data or all of the data through the role-based access that you can give different people. We've seen people who say, hey, give that person 40 days of this data, but the SecOps team gets to see across all the different logs, you know, all the machine-generated data they have. And we can give you a couple of examples of that and walk you through how people deploy, if you want. >> I'm always keen to hear specific examples of how customers are doing things. And it's nice that you've drawn that comparison there around what cloud is good for and what it isn't. I often like to say that AWS is cheap to fail in, but expensive to succeed in. So when people are actually succeeding with this and using this broad amount of data, what you're saying there is that, with those savings, I've actually got access to a lot more data that I can do things with. 
So yeah, if you could walk through a couple of examples of what people are doing with this increased amount of data that they have access to in ChaosSearch, what are some of the things that people are now able to unlock with that data? >> Well, it's always good to use customer examples, and we can go through them however you might want: Kleiner, Blackboard, Alert Logic, Armor Security, HubSpot. Maybe I'll start with HubSpot, one of our good clients. They had some Cloudflare data, which was one of the clusters they were searching a lot, and they were using it to look at denial of service. Like everyone we find at scale, they got limited. So they were down to five days of retention. Why? Well, it's not that they meant to be, but basically they couldn't cost-effectively handle that at scale. They were also having scale issues with the environment, with how they set up the cluster and sharding. And when they had a denial-of-service attack, that's when the influx of data would hit. One thing about scale is how fast the data comes at you; another is how much data you have. So as the data was coming at them in a denial of service, that's when the cluster would actually go down, believe it or not, right when you need your log analysis tools. So because they were just using Kibana, it was an easy swap. They ran in parallel, because we publish the open API, and we took them from five days to nine days. They could keep as much as they want, but nine days for denial of service is what they wanted. And we did save them over $4 million a year in hard dollars on what they were paying in their environment, which really is the savings on the server farm and a little bit on the Elasticsearch stack. But more importantly, they've had no outages since. Now here's the thing, since you're talking about use cases. They also had other clusters, and you find everyone does it. 
They don't dare put it on one cluster, even though these are not one server, they're multiple servers. So Cloudflare was one use case. The next was a 10-terabyte-a-day influx that they kept for 90 days, so it's about a petabyte. They brought another use case on, which was NetMon, network monitoring. And again, they were having the same scale issues around retention. And what they're able to do is easily roll that on. So that's one data platform. Now they're adding the next one. They have about four different use cases, and it's just different clusters that they're able to bring together. As for what they're getting out of it: they're getting more cost effectiveness, more stability, and freedom. We say it saves you a lot of time, cost, and complexity: the time to manage that and get the data in, and the complexities around it. The cost is easy to quantify. But more importantly, particular teams now only need access to their own data, while the SecOps team wants to see across all the data. And it's very easy for them to see across all the data, where before it was impossible to do. So now they have multiple large use cases streaming at them. And what I love about that particular case is that at one point they were just trying to test our scale. So they started tossing more things at it, right, to see if they could kind of break us. So they spiked us up to 30 terabytes a day, whereas for Elastic, even 10 terabytes a day makes things fall over. Now, if you think about what they just did, what we're doing is literally three steps. Put your data in S3 as fast as you can; don't modify it, just put it there. Once it's there, connect to us: you give us read-only access to those buckets and a place to write the index. All of that stuff is in your S3; it never comes out. And then basically you set up what you want: do you want to do live, real-time analysis, or do you want to go after old data? 
We do the rest: we ingest, we normalize the schema, and basically we give you RBAC and the Refinery to give the right people access. So what they did is they basically threw a whole bunch of stuff at it. They were trying to outrun S3. So, you know, we're on the shoulders of giants. If you think about our platform, for clients, what's a better data lake than S3? You're not going to get a better cost curve, right? You're not going to get better parallelism. Or security: it's in your, you know, virtual environment. And you can also keep data in the right location. Blackboard's a good example. They need to keep data in all the different regions, because it's personal data and, you know, with GDPR they've got to keep data in that location. It's easy; we just put compute in each one of the different areas they are in. But the net-net is, that architecture is on the shoulders of giants. If you think you can outrun it by just sheer volume, or you have a more cost-effective place to keep data long-term, or you think you can out-store it, that you have so much data that S3 and Glacier can't possibly do it, then you've got me, you're at a bigger scale than me, but that's the scale we're talking about. So when they spiked our throughput, what they really did is try to outrun S3, and we didn't hiccup. The next thing is they tossed a bunch of users at us, which just meant spinning up, in our data fabric, different ways to do the indexing to keep up with it. And with new use cases, everyone gets their own worker nodes, which are all expected to fail in place. So again, they did some of that, but really they're like, you guys handled all the influx. And if you think about it, it's the shoulders of giants, being on top of an Amazon platform, which is amazing. You're not going to get a more cost-effective data lake in the world, and it's continuing to fall in price. 
And it's a cost curve like no other, but also all that resiliency, all that security, and the parallelism you can get out of S3 and Glacier. It's just, bar none, the most scalable environment you can build on. And what we do is a thin layer. It's a data platform that allows you to have your data fully searchable and queryable using your tools. >> Right, and you mentioned there that you're running in AWS, which has broad experience in doing these sorts of things at scale, but on the operational management side of things, as you mentioned, you actually take that off the hands of customers so that you run it on their behalf. What are some of the errors that you see people making in trying to do this themselves, when you've gone into customers and brought them onto the ChaosSearch platform? >> Yeah, so either people are just trying their best to build out clusters of Elasticsearch, or they're going to services like Logz.io, Sumo Logic, or Amazon Elasticsearch Service. And those are all basically on the same ELK Stack, so they have the exact same limits; it's the same bits. Then we see people saying, well, I really want to go to a data lake. I want to get away from these database servers, which have their limits. I want to use a data lake. And then we see a lot of people putting data into environments where, instead of using Elasticsearch, they want to use SQL-type tools. And what they do is they put it into a Parquet or Presto form. It's a Presto dialect, but they put it into Parquet and structure it. And they go a long way toward saying, hey, it's in the data lake, but they end up building these little islands inside their data lake. And it takes a lot of time to transform the data, to get it into a format that you can go after with your tools. And we don't make you do that. Just literally put the data there, and then what we do is index it and publish the API. 
So right now it's Elasticsearch; in a very short time we'll publish Presto, or the SQL dialect. You can use the same tools. So we do see people either brute-forcing it and trying their best with a bunch of physical servers. We do see another group that says, you know, I want to go use Athena for these use cases, and there's a whole bunch of different startups saying, I do data lakes or data lakehouses. But what they really do is force you to put things in a structure before you get insight. True data lake economics is literally just put it there and use your tools natively to go after it. And that's where we're unique compared to what we see from our competition. >> Hmm. So with people who have moved onto ChaosSearch, let's say pick one, if you can: what's the most interesting example of what people have started to do with their data? What's new? >> That's good. Well, I'll give you another one. Armor Security is a good one. Armor Security is a security service company. You know, thousands of clients, doing great. I mean, a beautiful platform, beautiful business. And they won Rackspace as a partner. So now imagine a thousand clients, but now, you know, massive scale to keep up with. So that's another example where we were able to come in. They were facing a major upgrade of their environment just to keep up, and what they actually expose to their customers is how their customers do log analytics. What we were able to do, simply because they didn't go below the API and they use the exact same tools on top, is replace that use case in 30 days and save them a tremendous amount of dollars. But now they're able to go back and have unlimited retention. They used to restrict their clients to 14 days. Now they have an opportunity to do a bunch of different things, with possible revenue opportunities and more. It allows them to look at their business differently and frees up their team to do other things. 
And now they're putting billing and other things into the same environment with us because, one, it's easy to scale, but it also freed up their team. No one has enough team to do things. And then the biggest thing is that what people do that's interesting with our product is actually in their own tools. So, you know, we talk about Kibana; when we do SQL, again, we talk about Looker and Tableau and Power BI. You know, that's the really interesting thing, and we think we did the hard work on the data layer, where I can talk about all the ways we consolidate and the performance. Now, what becomes really interesting is what they're doing at the visibility level, whether in Kibana, the API, Tableau, or Looker. And the key thing for us is we just say, use the tools you're used to. Now that might be a boring statement, but to me, a great value proposition is not changing what your end users have to use. And they're doing amazing things. They're doing the exact same things they did before; they're just doing it with more data at bigger scale. And they're also able to see across their different machine learning data, compared to being limited to going at one thing at a time. Getting that correlation from a unified data lake is really what we get very excited about. What's most exciting to our clients is that they don't have to tell the users they have to use a different tool, and, you know, we'll decide if that's really interesting in this conversation. But again, I always say we didn't build a new algorithm that you're going to give the SecOps team, or a cool new pipeline widget that's going to help the machine learning team, which is another API we'll publish. Basically, what we do is the hard work of making the data platform scalable and, more importantly, giving you the APIs that you're used to. So it's a platform where you don't have to change what your end users are doing. So we're kind of invisible behind the scenes. 
>> Well, that's certainly a pretty strong proposition there, and I'm sure that there's plenty of scope for customers to come and talk to you, because no one's creating any less data. So Ed, thanks for coming on theCUBE. It's always great to see you here. >> No, thank you. >> You've been watching theCUBE Virtual and our coverage of AWS re:Invent 2020 with special coverage of the APN partner experience. Make sure you check out all our coverage online, either on your desktop or mobile, on your phone, wherever you are. I've been your host, Justin Warren, and I look forward to seeing you again soon. (soft music)
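
The retention figure quoted in the conversation (a 10-terabyte-a-day influx kept for 90 days being "about a petabyte") can be checked with a short calculation. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the decimal conversion (1 PB = 1,000 TB) is an assumption, since the speakers do not say which units they mean.

```python
# Rough retention sizing for the figures quoted in the interview.
# Assumption: decimal units (1 PB = 1,000 TB); the interview does not specify.
TB_PER_PB = 1_000


def retained_tb(daily_ingest_tb: float, retention_days: int) -> float:
    """Total data held if every day's ingest is kept for the full retention window."""
    return daily_ingest_tb * retention_days


def tb_to_pb(terabytes: float) -> float:
    """Convert terabytes to petabytes using decimal units."""
    return terabytes / TB_PER_PB


if __name__ == "__main__":
    # 10 TB/day kept for 90 days, as described for the second use case.
    total_tb = retained_tb(daily_ingest_tb=10, retention_days=90)
    print(f"{total_tb} TB retained, roughly {tb_to_pb(total_tb)} PB")
```

Under these assumptions the steady-state volume comes to 900 TB, or 0.9 PB, which is consistent with the "about a petabyte" description. The interview does not say how long data from the 30-terabyte-a-day spike was kept, so that case is left out here.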

Published Date : Dec 3 2020

SUMMARY :

the globe it's theCUBE, and our coverage of AWS re:Invent 2020 Well thank you for having me, loved it to have you back. and the time to manage these clusters, be able to save up to 80%. And we can give you a So yeah, if you could walk and the parallelism you can get, that you see people making it's in the data lake, but they end up what's, let's say pick one, if you can, I can about all the ways you It's always great to see you here. And I look forward to

ENTITIES

Entity	Category	Confidence
Justin Warren	PERSON	0.99+
Ed Walsh	PERSON	0.99+
$80	QUANTITY	0.99+
40 days	QUANTITY	0.99+
five days	QUANTITY	0.99+
Ed Walsh	PERSON	0.99+
90 days	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
AWS Global Partner Network	ORGANIZATION	0.99+
nine days	QUANTITY	0.99+
80%	QUANTITY	0.99+
10 terabytes	QUANTITY	0.99+
thousands	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
HubSpot	ORGANIZATION	0.99+
Ed	PERSON	0.99+
10%	QUANTITY	0.99+
Elasticsearch	TITLE	0.99+
30 days	QUANTITY	0.99+
Armor Security	ORGANIZATION	0.99+
14 days	QUANTITY	0.99+
thousand clients	QUANTITY	0.99+
Blackboard	ORGANIZATION	0.99+
Kleiner	ORGANIZATION	0.99+
S3	TITLE	0.99+
One	QUANTITY	0.99+
Alert Logic	ORGANIZATION	0.99+
three steps	QUANTITY	0.98+
one	QUANTITY	0.98+
GDPR	TITLE	0.98+
one thing	QUANTITY	0.98+
one data	QUANTITY	0.98+
one server	QUANTITY	0.98+
Elastic	TITLE	0.98+
70	QUANTITY	0.98+
SQL	TITLE	0.98+
about 80%	QUANTITY	0.97+
Kibana	TITLE	0.97+
first time	QUANTITY	0.97+
over $4 million a year	QUANTITY	0.97+
one cluster	QUANTITY	0.97+
first person	QUANTITY	0.97+
CloudFlare	TITLE	0.97+
ChaosSearch	ORGANIZATION	0.97+
this year	DATE	0.97+
Glacier	TITLE	0.97+
up to 80%	QUANTITY	0.97+
Parquet	TITLE	0.96+
each one	QUANTITY	0.95+
Splunk	ORGANIZATION	0.95+
Sumo Logic	ORGANIZATION	0.94+
up to 80	QUANTITY	0.94+
Power BI	TITLE	0.93+
today	DATE	0.93+
Rackspace	ORGANIZATION	0.92+
up to 30 terabytes a day	QUANTITY	0.92+
one point	QUANTITY	0.91+
S3 Glacier	COMMERCIAL_ITEM	0.91+
Elastic API	TITLE	0.89+

Search Results for one flare: