Ricardo Rocha, CERN | KubeCon + CloudNativeCon NA 2020
Announcer: From around the globe, it's theCUBE, with coverage of KubeCon and CloudNativeCon North America 2020 Virtual, brought to you by Red Hat, the Cloud Native Computing Foundation, and ecosystem partners.

Jeff Frick: Hey, welcome back everybody, Jeff Frick here with theCUBE, coming to you from our Palo Alto studios for the continuing coverage of KubeCon + CloudNativeCon North America 2020. There was the European version earlier in the summer; it's all virtual, so the good news is we don't have to get on planes and we can get guests from all over the world. And we're excited to welcome back, for his return to theCUBE, Ricardo Rocha. He is a staff member and computing engineer at CERN. Ricardo, great to see you.

Ricardo Rocha: Hello, thanks for having me.

Jeff Frick: Absolutely. And you're coming in from Geneva, so you've already had a good Thursday, I bet.

Ricardo Rocha: Yeah, we're just finishing right now.

Jeff Frick: Right. So in getting ready for this interview I was looking at the one you did, I think it was two KubeCons ago, in May of 2019, and it strikes me that a lot of people know the name CERN, but a lot of people don't know what CERN actually does. So I wonder if you can give the 101 of what CERN's mission is and some of the work you do there.

Ricardo Rocha: Yeah, sure. CERN is the European Organization for Nuclear Research. We are the largest particle physics laboratory in the world, and our main mission is fundamental research. We try to answer big questions: why don't we see antimatter, what is dark matter or dark energy, and other questions about the origin of the universe. To answer these questions we build very large machines, particle accelerators, where we try to recreate some of the moments just after the universe was created, the Big Bang, to better understand the state of matter at that time. The result of all of this is very often a lot of data that has to be analyzed, and that's why we have traditionally had huge requirements for computing resources. Since the start of CERN we have always had these large requirements.

Jeff Frick: Right. And so you have these large particle accelerators, as you said, large machines. The one you've got now, the latest one, how long has that been operational?

Ricardo Rocha: It started maybe around 10 years ago; the first launch was a bit before that. It's the largest one ever built: 27 kilometers in circumference. We inject protons in opposite directions and then make them collide at points where we have built these huge detectors that can see what's happening in the collisions. That's the main particle accelerator, but we do have other experiments; we have an Antimatter Factory that is just down from my office, and other types of experiments as well.

Jeff Frick: Right, 27 kilometers, that's a big number. And then, just so people get some sense of scale, you speed up the particles, smash them together, see what happens, and collect all the data. What types of data sets are generated off just one event? I don't even know if that's a valid measure; how do you measure quantities of data around an event, just for orders of magnitude?

Ricardo Rocha: Right. The way it works is, as you said, we accelerate the particles to very close to the speed of light, we increase the energy by keeping the beams well controlled, and then at specific points we make them collide. We have these gigantic detectors underground (all of this is 100 meters under the ground), and each detector is pretty much a very large camera that takes something like 40 million pictures a second. The result is a huge amount of data: each of these detectors can generate up to one petabyte per second. That is not something we can record, so we use hardware filters that bring it down to something we can manage, which is on the order of a few tens of gigabytes per second.
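To put those filtering numbers in perspective, here is a rough back-of-the-envelope sketch using only the figures quoted above (roughly one petabyte per second off a detector, reduced to a few tens of gigabytes per second by the hardware triggers). The exact rates vary by detector and by run, so the values are illustrative, not official.

```python
# Rough, illustrative arithmetic based on the rates quoted in the interview.
# Real trigger rates differ per detector and per data-taking period.

raw_rate_bytes_per_s = 1e15       # ~1 PB/s produced by a detector front-end
recorded_rate_bytes_per_s = 30e9  # ~a few tens of GB/s kept after hardware triggers

reduction_factor = raw_rate_bytes_per_s / recorded_rate_bytes_per_s
print(f"Hardware trigger reduction factor: ~{reduction_factor:,.0f}x")  # ~33,000x

# Further software selection and limited running time bring the yearly total
# down to the tens of petabytes mentioned later in the conversation.
```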
Jeff Frick: Wow. So you've got a very serious computing challenge ahead of you, because you're the one on the hook for grabbing the data, recording the data, and making it available for people to use in their experiments. We're here at KubeCon + CloudNativeCon: where did containers come into the story, and Kubernetes specifically? What was the real challenge you were trying to overcome?

Ricardo Rocha: This is a long story of using distributed computing at CERN, among other kinds of computing. As I mentioned, we generate a lot of data, something like 70 petabytes every year, and we have accumulated more than half an exabyte of data by now. Traditionally we had to build this software ourselves, because there were not many people around with this kind of need. But the revolution with containers and clouds appearing allowed us to join other communities and benefit from their work, instead of having to do everything ourselves. That was the main driver for us to start doing this. The other point is containerization itself. We have a great need to share information, but also to share resources, between physicists and engineers. So this idea of containerizing the work, including all the code and all the data, and then sharing it with our colleagues is very appealing. The fact that we can also take this unit of work and just deploy it on any infrastructure that has a standardized API, like Kubernetes, and scale it and monitor it the same way, is also very appealing. All of these things connect with our natural way of working, I would say.
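To make the "unit of work on a standardized API" idea concrete, here is a minimal sketch of submitting a containerized analysis step as a Kubernetes Job with the official Python client. The image name, namespace, command, and resource figures are placeholders invented for illustration, not anything CERN actually runs; the point is only that the same few API calls work against any conformant cluster, on-premises or in a public cloud.

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside a cluster

# Hypothetical container image and command; any self-contained analysis step works.
job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="analysis-step-1"),
    spec=client.V1JobSpec(
        backoff_limit=2,
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="analysis",
                        image="registry.example.org/physics/analysis:v1",  # placeholder
                        command=["python", "run_analysis.py", "--events", "10000"],
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "2", "memory": "4Gi"},
                        ),
                    )
                ],
            )
        ),
    ),
)

# The same call targets any cluster the kubeconfig points at.
client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```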
Jeff Frick: Right. So you've talked about the upgrade that's coming to the particle accelerator in four or five years, whatever that timeline is, relatively soon. As you've said before, this is a huge step function in the data that's going to come off these experiments. How are you keeping up on the compute side with that fundamental shift on the physics side and the data it's going to generate? I think you said in a prior interview that you don't want to be the bottleneck: there's all this great work being done, but if the data isn't captured and made available for people to work with, it's not much of an experiment. So how are you keeping up, and what's the relative scale of what you have to do on the compute side to keep up with the physics side?

Ricardo Rocha: What we will have to deal with is an increase of 10 times more data than we have today. We already have a lot, and very soon we will have a lot more. But this is not the first time this kind of step change has happened in our computing; we have always found a new technology or a new way of doing things that improved the situation. In this case, we do what we always do: we look for all sorts of new technologies and new resources that we can make use of. A lot of it involves improving our own software, replacing what we currently do with hardware triggers with software-based triggers using accelerators, GPUs and other types of accelerators. That will play a big role, as will making our software more efficient. The second thing we are doing is making our infrastructure more agile, and this is where cloud native and Kubernetes play a huge role, so that we can benefit from external resources. We can always think about expanding our on-premises resources, but it's also very good to be able to go and fish around for what's available externally, and Kubernetes plays a very big role in that respect as well.
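As a purely illustrative aside, the software-trigger direction Rocha describes amounts to running the event selection on programmable accelerators instead of fixed-function hardware. The toy sketch below keeps only simulated events above an energy threshold on a GPU, assuming CuPy is available; the threshold, distributions, and selection rule are invented for illustration and bear no relation to a real trigger menu.

```python
# Toy GPU event selection: keep only "events" above an energy threshold.
# Purely illustrative; real trigger algorithms are far more involved.
import cupy as cp

n_events = 10_000_000
# Fake per-event energy sums (GeV); in reality these come from detector readout.
energies = cp.random.exponential(scale=20.0, size=n_events)

threshold_gev = 100.0  # invented cut, not a real trigger threshold
selected = energies[energies > threshold_gev]

kept_fraction = selected.size / n_events
print(f"Kept {selected.size} of {n_events} events ({kept_fraction:.4%})")
```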
Jeff Frick: I'd love to dig into that a little deeper, because the Cloud Native Computing Foundation is a super active foundation, and obviously there's a ton of activity around Kubernetes. What does that mean to you as an infrastructure provider, on the hook for your own organization, to now have an open source community supporting you indirectly through ongoing development and ongoing projects, and to have that broader pool of brain power to draw from to help move your own infrastructure along?

Ricardo Rocha: I think this is great, and we've had really good experiences in the past. We have been heavy users of Linux for a very long time, we have used OpenStack for our private cloud and been heavily involved in that community, and we are doing the same with Kubernetes. We not only contribute as end users, we also offer some manpower for development and for helping the community. And we end up getting a lot more out than we put in: we are quite involved, but the community is so large, with such big players that have needs very similar to ours, that we get far more back than we contribute. We try to help as much as possible, but we have limited resources as well.

Jeff Frick: Open source is just an amazing innovation machine, and it has proved its value across a lot of things, from Linux to Kubernetes being one of the most recent. I want to shift gears a little bit and ask for your take on public cloud. One of its huge benefits is the flexibility to add and shrink capacity as you need it, and you've talked before about spikes in demand, whether from a high frequency of experiments (I don't know how often you run those) or ahead of a conference, where you said people want access to the data to run their analyses beforehand. Where does public cloud play in your thinking? Maybe you're there today, maybe you're not. How do you think about public cloud generically, but more specifically about that ability to add a little more flex to your compute horsepower? Or are you just going up and to the right and not really flexing down very much?

Ricardo Rocha: This is something we've been working on for a few years now. I would say it's ongoing work, and the situation will not be fully settled for the next few years. But again, what we try to do is explore as much as possible all kinds of resources that can help us. What we did at KubeCon last year was a demonstration that we can actually scale out and burst, for the spiky workloads we have, to the public cloud quite easily using the kind of cloud native technologies we have today. This is extremely important because it changes our mindset: instead of thinking only about investing on-premises, we can cover the majority of use cases on-premises and then burst to the public cloud for the rest. This has to be easy in terms of infrastructure, and we are at that point right now with Kubernetes. We also have the kind of workload that makes this easier than in traditional IT, where services are very interconnected. In our case we are mostly thinking of batch workloads, where we can just submit jobs and then fetch the data back. That also has a few challenges, but I would say it's easier than traditional IT service deployments. The other aspect where the public cloud is very interesting is for resources we don't have in large quantities. We have a very large farm of CPUs and we have some GPUs, but it's very good to be able to explore new accelerator technologies and maybe expand our available pool of accelerators by going to the public cloud, both to use them and to validate which ones are best for our use cases. It's not only general capacity; it's dedicated hardware we might never have ourselves, things like TPUs or IPUs. It's very interesting that we can just go and use them at scale in the public cloud.
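One way that "go and use accelerators in the cloud" pattern tends to show up in practice is simply as a resource request on a pod scheduled onto nodes that carry the right hardware. The sketch below assumes an invented node label for burst capacity and the commonly used nvidia.com/gpu extended resource name exposed by the NVIDIA device plugin; it is a sketch of the general pattern, not of CERN's actual setup.

```python
from kubernetes import client, config

config.load_kube_config()

# Hypothetical benchmark pod: one GPU, pinned to (imaginary) cloud burst nodes.
pod = client.V1Pod(
    api_version="v1",
    kind="Pod",
    metadata=client.V1ObjectMeta(name="gpu-benchmark"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        node_selector={"example.org/burst": "true"},  # invented label for cloud nodes
        containers=[
            client.V1Container(
                name="bench",
                image="registry.example.org/ml/benchmark:latest",  # placeholder image
                command=["python", "benchmark.py"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"},  # one GPU via the device plugin
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```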
Jeff Frick: That's a really interesting point, because the cloud providers are big enough now that they're building all kinds of specialized hardware: specialized servers, specialized CPUs, specialized GPUs; DPUs, data processing units, is a new one I've heard; and as you said there are FPGAs and all kinds of accelerators. So it really is a rich environment in which to run your experiments and find the optimal solution for a particular workload. But Ricardo, I want to shift gears a little bit as we come to the end of 2020, thankfully for a whole bunch of reasons. As you look forward to 2021, clearly anticipating and planning for your upgrade is a priority. I'm curious what your other priorities are, and how the compute infrastructure ranks as an investment within CERN compared with the investment in the physical things you're building, the big machines, because without the compute those other things really don't provide much data. We always talk about how expensive the particle accelerators are, and it's an interesting, big number, but you are a big piece of that as well. So what are your priorities looking forward to 2021?

Ricardo Rocha: From the compute side, I think we are keeping priorities similar to what we've been doing over the last few years: making sure we improve all our automation, improving efficiency, and preparing for the upgrades we have coming. But there is also a lot of activity in the new area of machine learning. We have a ton of services appearing where people want to start doing machine learning, in many, many use cases. In some cases they want to do the filtering in the detectors; in other cases they want to generate simulation data a lot faster using machine learning as well. So I think this will be a huge topic for next year, and even for the next couple of years: how we can offer our users and physicists the best service, so that they don't have to care about the infrastructure or know the details of how to scale their model training and the serving of their models. This will be a very big topic; it's becoming a really big part of the overall computing for high energy physics, and for CERN as well.

Jeff Frick: That's great. We see that a lot: applying machine learning to very specific problems. You talked about how you still can't even record all the information coming off those detectors, you have to apply filtering and other techniques, so there are real opportunities. We've barely scratched the surface of machine learning and AI, but I'm sure you're going to be using it a ton. Well, Ricardo, I'll give you the last word. We're at CNCF's KubeCon + CloudNativeCon: what do you get out of these types of shows, and why are they such an important piece of the way you get your job done?

Ricardo Rocha: Honestly, with the current situation I really miss this kind of conference in person. It's a huge opportunity to connect with other end users, but also with the community, and to talk with the developers and discuss things over a coffee or a beer. Having that kind of meeting every year is really useful. What I always try to say is that this whole infrastructure is truly making a big impact on the way we do things, so we can only thank the community. It allows us to shift our focus to a higher level, to focus more on our use cases instead of having to focus so much on the infrastructure. We can take it as a given that the infrastructure scales, and just use it and concentrate on optimizing our own software. That is a huge contribution, and we can only thank the CNCF projects and everyone involved.

Jeff Frick: Great. Well, thank you for that terrific summary. Ricardo, thank you so much for all your hard work helping to answer really big questions, and for joining us today and sharing your insight.

Ricardo Rocha: Thank you very much.

Jeff Frick: All right, he's Ricardo, I'm Jeff. You're watching theCUBE, from our Palo Alto studios, with continuing coverage of KubeCon + CloudNativeCon 2020. Thanks for watching, see you next time.