Ricardo Rocha, CERN | KubeCon + CloudNativeCon Europe 2021 - Virtual
>>From around the globe, it's theCUBE, with coverage of KubeCon and CloudNativeCon Europe 2021 Virtual, brought to you by Red Hat, the Cloud Native Computing Foundation, and ecosystem partners.
>>Hello, welcome back to theCUBE's coverage of KubeCon and CloudNativeCon 2021, part of the CNCF's continuing CUBE partnership. We're virtual here because we're not in person. Soon we'll be out of the pandemic and, hopefully, in person for the next event. I'm John Furrier, your host of theCUBE. We're here with Ricardo Rocha, computing engineer at CERN and CUBE alumni. Great to see you, Ricardo. Thanks for remoting in all the way across the world. Thanks for coming in.
>>Hello. Pleasure. Happy to be here.
>>I saw your talk with Priyanka on LinkedIn and all around the web. Great stuff, as always. You guys do great work over there at CERN. Talk about what's going on with you and the two speaking sessions you have at KubeCon. Pretty exciting news and exciting sessions happening here. So take us through the sessions.
>>Yeah. So actually the two sessions show the two types of things we do with Kubernetes. We have a lot of services moving to Kubernetes, but the first one is more on the services we have in house. CERN is known for having a lot of data and requiring a lot of computing capacity to analyze all this data. But we also have a very large community, a lot of users and people interested in the stuff we do. So the first session will show how we've been migrating our Drupal infrastructure into Kubernetes, in this case actually OpenShift. The challenge there is to run a very large number of Drupal websites on Kubernetes. We run more than 1,000 websites, and there will be a demonstration of how we manage the website lifecycle, including upgrading and deploying new websites, and of an operator that was developed for this purpose.
Then, on the other side, I will give, with a colleague, a talk about machine learning. Machine learning has been a big topic for us. A lot of our workloads are migrating to accelerators and can benefit a lot from machine learning. So we're giving a talk about a new service that we've deployed on top of Kubernetes, where we try to manage the lifecycle of machine learning workloads, from data preparation all the way to serving the models, also exploring the Kubernetes features and integrating accelerators, a lot of accelerators.
>>So one session is a large-scale deployment, with Kubernetes key there, and the other is machine learning, essentially as a service for other people to use, right? Take me through the first one, the large-scale deployment. What's the key innovation there, in your opinion?
>>Yeah, I think, compared to the infrastructure we had before, it is this notion that we can develop an operator that will manage a resource, in this case a website. This is something that is not always obvious when people start with Kubernetes: it's not just an orchestrator, it's really the API and the capability of managing a huge number of resources, including custom resources. So it's the possibility to develop this operator and then manage the lifecycle of something that was defined in house and that fits our needs. There are challenges there, because we have a large number of websites and they can be pretty active. We also have some scaling issues on the storage that serves these websites, and we'll give some details during the talk as well.
>>So Kubernetes, storage, this is all kind of under the covers, making this easier. And machine learning plays nicely in that. Take us through the machine learning use case. What's going on there? What was the discovery? How did you guys put that together? What are the key elements there?
>>Right, so the main challenge there has been that machine learning is quite popular, but it's quite spread out as well. We have multiple groups focusing on this, but there's no obvious way to centralize not only the resource usage, to make it more efficient, but also the knowledge of how these procedures can be done. So what we are trying to do is offer a service to all our users where we help them with the infrastructure, so that they don't have to focus on that and can focus just on their workloads. We do everything from exposing the data systems that we have in house, so that they can access the data and do data preparation, to iterating with notebooks, then doing distributed training with a potentially large number of GPUs, and then storing and serving the models. All of this is managed with a Kubernetes cluster underneath. We had a lot of knowledge of how to handle Kubernetes, and all the features that everyone likes are there: the scalability, the reliability. Autoscaling is very important for this type of workload. This is key.
>>Yeah, it's interesting to see how Kubernetes is maturing. Congratulations on the projects. They're probably going to continue to scale. This reminds me of when I was coming into the business in the late eighties, early nineties, with TCP/IP and the OSI model. You saw the standards evolve and get settled in, and then, boom, innovation everywhere. And that took about a year to digest and scale up. It's happening much faster now with Kubernetes. I have to ask you: what's your experience with the question that people are looking to get answered, which is, as Kubernetes goes to the next generation, the next step, people want to integrate. So how is Kubernetes exposing APIs, say, integration points for tools and other things? Can you share your experience and where this is going, what's happening now and where it goes?
Because we know there's no debate. People like the Kubernetes aspect of it, but now integration is the conversation. Can you share your thoughts on that?
>>I can try. I would say it's a moving target, but the fact that there's such a rich ecosystem around Kubernetes, with all the cloud native projects, is real proof of the popularity of the API. This is also something that, after we had the first step of deploying and understanding Kubernetes, we started seeing: the potential is not reaching only the infrastructure itself, it's reaching all the layers, the whole stack that we support in house, on premises. And it's also opening up doors to easily scale into external resources as well. So what we've been trying to tell our users is to rely on these integrations as much as possible. This means the application lifecycle being managed with things like Helm and GitOps, but also the monitoring being managed with Prometheus. And once you're happy with your deployment in house, we have ways to scale out to external resources, including public clouds. This is really, I don't know, a proof that all these APIs are not only popular but incredibly useful, because there's such a rich ecosystem around them.
>>So talk about the role of data in this. Obviously the machine learning piece is something that everyone is interested in, as you get infrastructure as code and DevOps and DevSecOps, as everything's shifting left. I love that narrative. Day-two operations, all of this is proving maturation. Data is critical, right? So now you get real-time information, real-time data. The expectation for the apps is to integrate the data. What's your view on how this is progressing from your standpoint? Because there's machine learning, and you mentioned acceleration, being part of another system. Caching has always done that with, say, databases, right?
So now, as databases get slower, caches are getting faster, so it's all changing. So what are your thoughts on this next-level data equation in Kubernetes? Because, you know, stateless is cool, but now you've got state issues.
>>Yeah, so we've always had huge needs for data. I think we have over half an exabyte of data available on premises, but we have our own storage systems, which are external, and that's for the physics data, the raw data. One particularity that we had with our workloads until recently is that they are what we call embarrassingly parallel, in the sense that they don't really need very tight connectivity between the different workloads. So if people run, say, tens of thousands of jobs to do some analysis, the jobs are actually quite independent. They will produce a lot more data, but we can store it independently. Machine learning is posing a challenge in the sense that the training tends to be a lot more interconnected, so it can benefit from systems that we are not so familiar with. So for us it's maybe not so much the caching layers themselves; it's really understanding how our infrastructure needs to evolve on premises to support this kind of workload. We had some smallish, more high-performance computing clusters, with things like InfiniBand for low latency, but this is not the bulk of our workloads. This is not what we are experts on these days. This is the transition we are making towards supporting these machine learning workloads.
>>Just as a reference for the folks watching: you mentioned embarrassingly parallel, and that's a quote that I read on your CERN tech blog.
So if you go to techblog.web.cern.ch, or just search "CERN tech blog", you'll see the post there, and there's good stuff there. In there you lay out a bunch of other things too, where you start to see deployments, services, and custom resource definitions being part of this. Is it going to get to the point where automation is a bigger part of cluster management, setting stuff up quicker? As you look at some of the innovations you're doing with machine learning and Kubernetes, databases, and thousands of other things that you're working on there, I mean, I know you've got a lot going on, it's in the post, but we don't want to have the problem of it being so hard to stand up and manage. This is what people want to make simpler. How do you answer that when people say, "We want to make it easier"?
>>Yeah. So for us it's really "automate everything". Up to now it has been automating the deployments in the Kubernetes clusters; right now we are looking at automating the Kubernetes clusters themselves. There are some really interesting projects here. People are used to using things like Terraform to manage the deployment of clusters, but there are projects like Crossplane, for example, that allow us to have the clusters themselves be resources within Kubernetes, and this is something we are exploring quite a bit. This allows us to abstract the Kubernetes clusters themselves as Kubernetes resources. So it's this idea of having a central cluster that will manage a much larger infrastructure. The GitOps part is really key for us too. It's something that eases the transition for people that are already used to managing large-scale systems but are not necessarily experts on Kubernetes. They see that there's an easier path there if they can be introduced slowly through the centralized configuration.
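Editor's illustration: the idea described above, of declaring clusters as resources and letting a central controller converge the real infrastructure toward that declaration, can be sketched as a small reconcile loop. This is a minimal sketch under stated assumptions, not CERN's implementation and not the Crossplane API. The names `ClusterSpec`, `FakeProvider`, and `reconcile` are hypothetical, and the provider is an in-memory stand-in for whatever system actually creates clusters.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ClusterSpec:
    """Desired state of one managed cluster (hypothetical fields)."""
    name: str
    provider: str  # illustrative label only, e.g. "on-prem" or "public-cloud"
    nodes: int


@dataclass
class FakeProvider:
    """In-memory stand-in for the system that really creates clusters."""
    clusters: Dict[str, ClusterSpec] = field(default_factory=dict)

    def create_or_update(self, spec: ClusterSpec) -> None:
        self.clusters[spec.name] = spec

    def delete(self, name: str) -> None:
        self.clusters.pop(name, None)


def reconcile(desired: List[ClusterSpec], provider: FakeProvider) -> List[str]:
    """Drive the actual state toward the desired state; return actions taken."""
    actions: List[str] = []
    wanted = {spec.name: spec for spec in desired}

    # Create clusters that are missing and update ones whose spec drifted.
    for name, spec in wanted.items():
        if provider.clusters.get(name) != spec:
            provider.create_or_update(spec)
            actions.append(f"apply {name}")

    # Remove clusters that are no longer declared.
    for name in sorted(set(provider.clusters) - set(wanted)):
        provider.delete(name)
        actions.append(f"delete {name}")

    return actions


if __name__ == "__main__":
    provider = FakeProvider()
    desired = [
        ClusterSpec("batch-01", "on-prem", 50),
        ClusterSpec("ml-burst", "public-cloud", 8),
    ]
    print(reconcile(desired, provider))      # first pass applies both clusters
    print(reconcile(desired, provider))      # second pass finds nothing to do
    print(reconcile(desired[:1], provider))  # dropping a spec deletes its cluster
```

Because the loop is idempotent, running it repeatedly against the same declaration is safe. That property is what lets a centrally stored configuration, for example one kept in Git, act as the single source of truth.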
>>You know, you mentioned Crossplane. I had Bassam on earlier. He's an awesome dude, great guy. And I was smiling because I still have flashbacks and trigger episodes from the Hadoop world, when that technology was so promising but was just so hard to stand up and manage. You had to be really an expert to do that. And you mentioned Crossplane; this comes up with the whole operator notion of operating the clusters, right? This comes back down to provisioning and managing the infrastructure, which we all know is key. But when you start getting into multi-cloud and multiple environments, that's where it becomes challenging. I like what they're doing. Is that something that's on your mind too, around hybrid and multi-cloud? Can you share your thoughts on that whole trajectory?
>>Absolutely. I actually gave an internal seminar just last week describing what we've been playing with in this area, and I showed a demo of using Crossplane to manage clusters on premises but also clusters running on public clouds: AWS, Google Cloud, and Azure. There are many reasons we want to explore external resources. We are kind of used to this, because we have a lot of sites around the world that collaborate with us, but specifically for public clouds there are some motivations. The first one is this idea that we have periodic load spikes. When we have international conferences, the number of analysis and job requests goes up quite a bit, so we need to be able to scale on demand for short periods instead of overprovisioning in house. The second one is, again coming back to machine learning, this idea of accelerators. We have a lot of CPUs, but we have far fewer GPUs, so it would be nice to go and fish for those in the public clouds.
And then there are also other accelerators that are quite interesting, like TPUs and IPUs, that will definitely play a role and that we probably, or maybe, will never have on premises; we will only be able to use them externally. In that respect, coming back to your previous question, this idea of storage then becomes quite important. So what we've been playing with is not only managing these external clusters centrally, but also managing the whole infrastructure from a central place. This means making all the clusters, whatever they are, look very much the same, including the monitoring and the aggregation of the monitoring centrally. And then, as we talked about storage, there is this idea of having local storage that will allow us to do really quick software distribution but also access to the data.
>>What you guys are doing are, as we say, cool and relevant projects. You've got the large-scale deployments and the machine learning to really accelerate things, which will drive a lot of adoption in terms of automation. And as that kicks in, you've got to get the foundational work done. I see that as clearly the right trajectory. It reminds me, Ricardo, again, not to do a little history lesson here, but back when network protocols were moving from proprietary, SNA for IBM, DECnet for Digital, back in the old days, the OSI, Open Systems Interconnection, standard stack was evolving, and when TCP/IP came around, that really opened up this interoperability, right? Bassam and I were talking about these kinds of cross-cloud connections, or inter-clouding, as Lew Tucker and I talked about at OpenStack in 2013: inter-networking, or interconnections. It's about integration and interoperability. This is the next-gen conversation that Kubernetes is having.
So as you scale up, which is happening very fast, and as you get machine learning, which can handle data and enable modern applications, really it's connecting networks and connecting systems together. This is a huge architectural innovation direction. Could you share your reaction to that?
>>Yeah. So actually we are starting the easy way, I would say. We are starting with the workloads that are loosely coupled, where we don't necessarily have to have this tight interconnectivity between the different deployments. I would say this is already giving us a lot, because the bulk of our workloads are this kind of batch, embarrassingly parallel work. We are also doing co-location: when we have large workloads that need this kind of close interconnectivity, we co-locate them in the same deployment, the same cloud and region. I think what you describe, having cross-cloud interconnectivity, will be a huge topic. It is already, I would say. So we started investigating a lot of service mesh options to try to learn what we can gain from them. There is clearly a benefit for managing services, but there will definitely also be potential to allow us to more easily scale out across regions. We've seen this by using the public cloud. Something that we found is, for example, this idea of infinite capacity, which sometimes feels kind of true, even at the scale we have, for CPUs. But when you start using accelerators, you start negotiating, maybe using multiple regions because there's not enough capacity in a single region, and you start having to talk to the cloud providers to negotiate this. And this makes the deployments more complicated, of course. So this interconnectivity between regions and clouds will be a big thing.
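Editor's illustration: the point above about accelerator capacity running out in a single region can be made concrete with a small placement sketch. It greedily spreads a GPU request over regions, largest free capacity first, so that as few regions as possible are involved. This is only a sketch under stated assumptions: the function name `spread_request`, the region names, and the capacity figures are all hypothetical and not taken from the interview, and real placement would also have to weigh cost, data locality, and network links.

```python
from typing import Dict, List, Tuple


def spread_request(gpus_needed: int, capacity: Dict[str, int]) -> List[Tuple[str, int]]:
    """Greedily place a GPU request across regions, largest capacity first.

    Raises ValueError if all regions together cannot satisfy the request.
    """
    if gpus_needed > sum(capacity.values()):
        raise ValueError("not enough accelerator capacity across all regions")

    placement: List[Tuple[str, int]] = []
    remaining = gpus_needed

    # Sort by free capacity (descending), then by name for a stable order.
    for region, free in sorted(capacity.items(), key=lambda kv: (-kv[1], kv[0])):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            placement.append((region, take))
            remaining -= take

    return placement


if __name__ == "__main__":
    # Hypothetical free GPU counts per region.
    free_gpus = {"region-a": 40, "region-b": 100, "region-c": 25}
    print(spread_request(30, free_gpus))   # fits in a single region
    print(spread_request(120, free_gpus))  # has to span two regions
```

A request that fits in one region stays there, which matches the co-location approach described for tightly coupled workloads. Only requests larger than any single region are split, and those are the cases where cross-region interconnectivity starts to matter.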
>>And again, the low-hanging fruit is the existing market, but I was throwing the vision out there mainly to talk about what we're seeing, which is that the world is a distributed computer. And if you have the standards, good things happen. Open systems, innovating in the open, really could make a big difference. It's going to be the difference between real value for global society, or are we going to get into the siloed world? So I think the choice is the industry's, and I think CERN, the CNCF, the Linux Foundation, and all the companies that are investing in open, this really is a key inflection point for us right now. So congratulations. Thanks for coming on theCUBE. Appreciate it. Thank you. Okay, Ricardo Rocha, computing engineer at CERN, here in theCUBE's coverage of CNCF KubeCon and CloudNativeCon Europe. I'm John Furrier, your host of theCUBE. Thanks for watching.