
Krishna Gade, Fiddler.ai | Amazon re:MARS 2022


 

(upbeat music) >> Welcome back. Day two of theCUBE's coverage of re:MARS in Las Vegas. Amazon re:MARS, it's part of what they call the Re Series at Amazon. re:Invent is their big show, re:Inforce is a security show, re:MARS is the new emerging show for machine learning, automation, robotics, and space. The confluence of machine learning powering a new industrial age, an inflection point. I'm John Furrier, host of theCUBE. We're here to break it down for another day of wall-to-wall coverage. We've got a great guest here, a CUBE alumni from our AWS startup showcase, Krishna Gade, founder and CEO of fiddler.ai. Welcome back to theCUBE. Good to see you. >> Great to see you, John. >> In person. We did the remote one before. >> Absolutely, great to be here, and I always love to be part of these interviews and love to talk more about what we're doing. >> Well, you guys have a lot of good street cred, a lot of good word of mouth around the quality of your product and the work you're doing. I know a lot of folks that I admire and trust in the AI and machine learning area say great things about you. A lot going on, you guys are a growing company. So you're kind of like a startup on a rocket ship, getting ready to go, pun intended here at the space event. What's going on with you guys? You're here. Machine learning is the centerpiece of it. Swami gave the keynote here at day two and it really is an inflection point. Machine learning is now ready, it's scaling, and some of the examples that they were showing with the workloads and the data sets that they're tapping into, you know, you've got CodeWhisperer, which they announced, you've got trust and bias now being addressed, we're hitting a new level in ML: ML operations, ML modeling, ML workloads for developers. >> Yep, yep, absolutely. You know, I think machine learning has now become operational software, right?
You know, a lot of companies are investing millions and billions of dollars and creating teams to operationalize machine learning-based products. And that's the exciting part. The thing that is very exciting for us is that we are helping those teams observe how those machine learning applications are working so that they can build trust into them. Because I believe, as Swami was alluding to today, without actually building trust into AI, it's really hard to have your business users use it in their business workflows. And that's where we are excited about bringing that trust and visibility factor into machine learning. >> You know, a lot of us know what you guys are doing here in the ecosystem of AWS. And now extending here, take a minute to explain what Fiddler is doing for the folks that are in the space, that are in discovery mode, trying to understand who's got what, because like Swami said on stage, it's a full-time job to keep up on all the machine learning activities and tool sets and platforms. Take a minute to explain what Fiddler's doing, then we can get into some good questions.
Is it working accurately or not? You know, why is it actually rejecting someone's loan application? We provide both fine-grained as well as coarse-grained insights, so our customers can actually deploy machine learning in a safe and trustworthy manner. >> Who is your customer? Who are you targeting? What persona is it: the data engineer, is it data science, is it the CSO, is it all of the above? >> Yeah, our customer is the data scientist and the machine learning engineer, right? And we usually talk to teams that have a few models running in production, that's basically our sweet spot, where they're looking for a single pane of glass to see what models are running in production, how they're performing, and how they're affecting their business metrics. So we typically engage with a head of data science or head of machine learning that has a few machine learning engineers and data scientists.
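The fine-grained insight Krishna mentions, "why is it actually rejecting someone's loan application," is typically answered with per-feature attributions. The sketch below is a hedged toy illustration, not Fiddler's actual method: for a linear scoring model with invented weights, each feature's contribution relative to a baseline applicant is simply weight times (value minus baseline). Real explainability tools generally use more robust techniques such as Shapley values.

```python
# Toy feature-attribution sketch for a hypothetical linear credit model.
# Weights, baseline, and feature names are all invented for illustration.

WEIGHTS = {"income_k": 0.004, "debt_ratio": -2.0, "late_payments": -0.8}
BIAS = 1.0
BASELINE = {"income_k": 500.0, "debt_ratio": 0.3, "late_payments": 1.0}

def score(applicant):
    """Linear model score: higher means more likely to approve."""
    return BIAS + sum(WEIGHTS[f] * applicant[f] for f in WEIGHTS)

def attributions(applicant):
    """Per-feature contribution to the score versus the baseline applicant."""
    return {f: WEIGHTS[f] * (applicant[f] - BASELINE[f]) for f in WEIGHTS}

rejected = {"income_k": 400.0, "debt_ratio": 0.6, "late_payments": 4.0}
# Sort so the strongest negative drivers of the rejection come first.
for feature, contrib in sorted(attributions(rejected).items(), key=lambda kv: kv[1]):
    print(f"{feature:>14}: {contrib:+.2f}")
```

Sorting the attributions surfaces the biggest drivers of a rejection (here, the late-payment count), which is the kind of per-decision report a reviewer or model risk team would ask for.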
And even for non-regulated sectors, if you do not have the right tools and processes in place, operationalizing machine learning models can take a long time. You know, with tools like Fiddler, what we are enabling is we are basically compressing that life cycle. We are helping them automate model monitoring and explainability so that they can actually ship models faster. You get velocity in terms of shipping models. For example, one of the growing fintech companies that started with us last year started with six models in production; now they're running about 36 models in production. So within a year, they were able to grow like 10x. That is basically what we are trying to do. >> Other thing is, we're at re:MARS, so first of all, you've got a great product and a lot of markets to grow into, but here you've got space. I mean, anyone who's coming out of a college or university PhD program, if they're into aero, they're going to be here, right? This is where they are. Now you have a new core competency with machine learning, not just the engineering that you see in the space or aerospace area; you have a new engineering. Now I go back to the old days of my parents: there was Fortran, and Fortran was the lingua franca to manage the equipment. Little throwback to the old school. But now machine learning is a companion, a first-class citizen, to the hardware. And in fact, some will say more important. >> Yep, I mean, the machine learning model is the new software artifact. It is going into production in a big way. And I think it has two differences compared to traditional software. Number one, unlike traditional software, it's a black box. You cannot read a machine learning model's code and see why it's making those predictions. Number two, it's a stochastic entity. What that means is its predictive power can wane over time. So it needs to be constantly monitored and then constantly refreshed so that it's actually working intact.
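The "stochastic entity" point, that a model's predictive power wanes as live data drifts away from its training data, is exactly what automated model monitoring checks for. Below is a minimal, hedged sketch of one common drift metric, the Population Stability Index (PSI), over a single score distribution; production monitors track many features and metrics, and the 0.2 alert threshold mentioned in the comment is only a widely cited rule of thumb, not a universal constant.

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between training scores and live traffic.

    Bins both samples on the training sample's range, then sums
    (p_actual - p_expected) * ln(p_actual / p_expected) over the bins.
    """
    lo, hi = min(expected), max(expected)

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range live values into the first/last bin.
            i = min(max(int((v - lo) / (hi - lo) * bins), 0), bins - 1)
            counts[i] += 1
        # eps avoids log(0) for empty bins.
        return [c / len(values) + eps for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

training = [i / 100 for i in range(100)]            # scores seen at training time
live_ok = [i / 100 for i in range(100)]             # same distribution: PSI ~ 0
live_drifted = [0.8 + i / 500 for i in range(100)]  # scores piled up high: large PSI

print(f"stable:  {psi(training, live_ok):.3f}")
print(f"drifted: {psi(training, live_drifted):.3f}")  # above ~0.2 would trigger a refresh
```

A monitor like this runs on a schedule; when the drift metric crosses the alert threshold, that is the signal to retrain or refresh the model that Krishna describes.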
So those are the two main things you need to take care of. And if you can do that, then machine learning can give you a huge amount of ROI. >> There is some practitioner craft to it. >> Correct. >> As you said, you've got to know when to refresh, what data sets to bring in, which to stay away from, certainly when you get to the bias, but I'll get to that in a second. My next question is really along the lines of software. So if you believe that open source will dominate the software business, which I do, I mean, most people won't argue. I think you would agree with that, right? Open source is driving everything. If everything's open source, where's the differentiation coming from? So if I'm a startup entrepreneur or I'm a project manager working on the next Artemis mission, I've got to open source. Okay, there are definitely security issues here. I don't want to talk about shift left right now, but like, okay, open source is everything. Where's the differentiation, where do I have the proprietary edge? >> It's a great question, right? So I used to work in tech companies before Fiddler. You know, when I used to work at Facebook, we would build everything in house. We would not even use a lot of open source software. So there are companies like that that build everything in house. And then I also worked at companies like Twitter and Pinterest, which actually used a lot of open source, right? So now, the thing is, it depends on the maturity of the organization. If you're a Facebook or a Google, you can build a lot of things in house. If you're a modern tech company, you would probably leverage open source. But there are lots of other companies in the world that still don't have the talent pool to take things from open source and productionize them.
And that's where the opportunity for startups comes in, so that we can commercialize these things, create a great enterprise experience, and actually operationalize things for them so that they don't have to do it in house. And that's the advantage of working with startups. >> I don't want to get all operating-systems theory with you here on the stage, but I do have to ask you the next question, and I totally agree with you, by the way, that's the way to go. There's not a lot of people out there that are peaked. And that's just statistical and it'll get better. Data engineering is really narrow. That is like the SRE of data. That's a new role emerging. Okay, all the things are happening. So if open source is there, integration is a huge deal. And you start to see the rise of a lot of MSPs, managed service providers. I run Kubernetes clusters, I do this, that, and the other thing. So what's your reaction to the growth of the integration side of the business and this role of new services coming from third parties? >> Yeah, absolutely. I think one of the big challenges for a chief data officer or someone like a CTO is how do they devise this infrastructure architecture with components, either homegrown components or open source components or some vendor components, and how do they integrate them? You know, when I used to run data engineering at Pinterest, we had to devise a data architecture combining all of these things and create something that actually flows very nicely, right? >> If you didn't do it right, it would break. >> Absolutely. And this is why it's important for us, at Fiddler, to really make sure that Fiddler can integrate with all varieties of ML platforms. Today, a lot of our customers build machine learning models on SageMaker. So Fiddler integrates nicely with SageMaker so that they get a seamless experience to monitor their models.
>> Yeah, I mean, this might not be the right word for it, but I think data engineering as a service is really what I see you guys doing, as well as other things, you're providing all that. >> And ML engineering as a service. >> ML engineering as a- Well, it's hard. I mean, it's like the hard stuff. >> Yeah, yeah. >> Hear, hear. But that has to enable. So you as a business entrepreneur, you have to create a multiple of value proposition for your customers. What's your vision on that? What is that value? It has to be a multiple, at least 5 to 10. >> I mean, the value is simple, right? You know, if you have to operationalize machine learning, you need visibility into how these things work. You know, if your CTO or chief data officer is asking how is my model working and how is it affecting my business, you need to be able to show them a dashboard of how it's working, right? And a data scientist today struggles to do this. They have to manually generate a report, manually do this analysis. What Fiddler is doing for them is basically reducing their work so that they can automate these things and still focus on the core aspects of model building and data preparation, and the boring aspects of monitoring the model and creating reports around the models are automated for them. >> Yeah, you guys got a great business. I think there's a lot of great future there and it's only going to get bigger. Again, the TAM's going to expand as the rising tide comes in. I want to ask you, while we're on that topic of rising tides: Dave Vellante and I, since re:Invent last year, have been kicking around this term that we made up called supercloud. And supercloud was a word that came out of these clouds that were not Amazon hyperscalers. So Snowflake, Goldman Sachs, Capital One, you name it, they're building massive proprietary value on top of the CapEx of Amazon. Jerry Chen at Greylock calls it castles in the cloud. You can create these moats. >> Yeah, right.
>> So this is a phenomenon, right? And you land on one, and then you go to the others. So the strategy is, everyone goes to Amazon first, and then hits Azure and GCP. That then creates this kind of multicloud, so, okay, supercloud's kind of happening, it's a thing. Charles Fitzgerald will disagree, he's a platformer, he says he's against the term. I get why, but he's off base a little. We can't wait to debate him on that. So superclouds are happening, but now what do I do about multicloud? Because now I understand multicloud, I have this on that cloud, and integrating across clouds is a very difficult thing. >> Krishna: Right, right, right. >> If I'm Snowflake or whatever, hey, I'll go to Azure, more TAM expansion, more market. But are people actually working together? Are we there yet? Where it's like, okay, I'm going to re-operationalize this code base over here. >> I mean, the reality of it is, enterprise wants optionality, right? I think they don't want to be locked into one particular cloud vendor or one particular software. And therefore you actually have a situation, a multicloud scenario, where they want to have some workloads in Amazon, some workloads in Azure. And this is an opportunity for startups like us because we are cloud agnostic. We can monitor models wherever you have them. So a lot of our customers have some of their models running in their data centers and some of their models running in Amazon. And so we can provide a universal single pane of glass, right? We can basically connect all of that data and actually showcase it. I think this is an opportunity for startups to combine the data streams coming from various different clouds and give them a single pane of experience. That way, where is your data, where are my models running, which cloud are they in, is all abstracted away from the customer. Because at the end of the day, enterprises will want optionality. And we are in this multicloud world.
>> Yeah, I mean, this reminds me of the interoperability days back when I was growing up in the business. Everything was interoperability and OSI and the standards came out, but what's your opinion on openness, okay? There's a kneejerk reaction right now in the market to go silo on your data for governance or whatever reasons, but yet machine learning gurus and experts will say, "Hey, if you want horizontal scalability and the best machine learning models, you've got to have access to data, and fast, in real time or near real time." And the antithesis is siloing. >> Krishna: Right, right, right. >> So what's the solution? Customers control the data plane and have a control plane that's... What do customers do? It's a big challenge. >> Yeah, absolutely. I think there are multiple different architectures of ML, right, you know? We've seen vendors like us deploy completely on-prem, right? And they still do it, we still do it for some customers. And then you had this managed cloud experience where you just abstract out the entire operations from the customer. And now you have this hybrid experience where you split the control plane and data plane. So you preserve the privacy of the customer from the data perspective, but you still control the infrastructure, right? I don't think there's a right answer. It depends on the problem that you're trying to solve. You know, Databricks is able to solve this control plane, data plane split really well. I've seen some other tools that have not done this really well. So I think it all depends upon- >> What about Snowflake? I think they a- >> Sorry, correct. They have a managed cloud service, right? So predominantly that's their business. So I think it all depends on what is your go-to-market, you know, which customers you're talking to, and what does your product architecture look like?
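The control-plane/data-plane split Krishna describes can be sketched concretely: the data plane runs inside the customer's environment and computes aggregates over raw records, and only those aggregates cross the boundary to the vendor-hosted control plane. The class names below are hypothetical; this is a toy illustration of the privacy argument, not any vendor's actual architecture.

```python
# Toy sketch of a control-plane/data-plane split for model monitoring.
# The key property: only aggregates (counts, means) cross the boundary,
# never raw customer records.

class DataPlane:
    """Runs inside the customer's environment; raw records stay here."""

    def __init__(self, records):
        self._records = records  # e.g. model scores; never exported

    def summarize(self):
        n = len(self._records)
        return {"count": n, "mean_score": sum(self._records) / n}  # aggregates only

class ControlPlane:
    """Vendor-hosted; sees only the aggregates pushed to it."""

    def __init__(self):
        self.metrics = []

    def ingest(self, summary):
        # Accept only the agreed aggregate schema, nothing raw.
        assert set(summary) == {"count", "mean_score"}
        self.metrics.append(summary)

customer = DataPlane(records=[0.25, 0.75, 0.5, 0.5])
vendor = ControlPlane()
vendor.ingest(customer.summarize())
print(vendor.metrics)  # [{'count': 4, 'mean_score': 0.5}]
```

The design choice here mirrors the trade-off in the conversation: the customer keeps data privacy because raw records never leave their environment, while the vendor still controls enough infrastructure to render dashboards from the aggregates.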
You know, from Fiddler's perspective today, we've actually chosen to either go completely on-prem or basically provide a managed cloud service, and that's actually simpler for us instead of splitting- >> John: So it's customer choice. >> Exactly. >> That's your position. >> Exactly. >> However you want to use Fiddler: go on-prem, no problem, or cloud. >> Correct, or cloud, yeah. >> You'll deploy and you'll work across whatever observability space you want to. >> That's right, that's right. >> Okay, yeah. So that's the big challenge, all right. What's the big observation from your standpoint? You've been on the hyperscaler side, your journey, Facebook, Pinterest, so back then you built everything, because no one else had software for you, but now everybody wants to be a hyperscaler, but there's a huge CapEx advantage. What should someone do? If I'm a big enterprise, obviously I could be a big insurance company, I could be financial services, oil and gas, whatever vertical, and I want a supercloud, what do I do? >> I think the biggest advantage enterprises today have is they have a plethora of tools. You know, when I used to work on machine learning way back at Microsoft on Bing Search, we had to build everything: training platforms, deployment platforms, experimentation platforms. You know, how do we monitor those models? Everything had to be homegrown, right? A lot of open source also did not exist at the time. Today, the enterprise has this advantage; they're sitting on this gold mine of tools. Obviously there's probably a little bit of tool fatigue as well. You know, which tools to select?
And that is the advantage that a lot of enterprises see. >> If you were going to be the CTO or CEO of a big transformation, knowing what you know, 'cause you just brought up the killer point about why it's such a great time right now, you've got platform as a service and the tooling has essentially reset everything. So if you're going to throw everything out and start fresh, you're basically redoing the system architecture. It's a complete reset. That's doable. How fast do you think you could do that for, say, a large enterprise? >> See, I think if you set aside the organizational processes and whatever friction kind of comes in, from a technology perspective, it's pretty fast, right? You can devise a data architecture today with tools like Kafka, Snowflake, and Redshift, and you can actually devise a data architecture very clearly right from day one and implement it at scale. And then once you have accumulated enough data and can extract more value from it, you can go and implement your MLOps workflow as well on top of it. And I think this is where tools like Fiddler can help as well. So I would start with looking at data: do we have centralization of data? Do we have governance around data? Do we have analytics around data? And then kind of get into machine learning operations. >> Krishna, always great to have you on theCUBE. You're a great masterclass guest. Obviously great success in your company. Been there, done that, and doing it again. I've got to ask you, since you just brought that up about the whole reset, what is the superhero persona right now? Because it used to be the full stack developer, you know? And then I called them, it didn't go over very well on theCUBE, the half stack developer, because nobody wants to be a half stack anything; a half sounds bad, worse than full. But cloud is essentially half a stack. I mean, you've got infrastructure, you've got tools.
Now you're talking about a persona that's going to reset, look at tools, make selections, build an architecture, build an operating environment for distributed computing. Who is that person? What does that persona look like? >> I mean, I think the superhero persona today is the ML engineer. I'm usually surprised how much an ML engineer is expected to do these days. You know, when I entered the industry as a software engineer, I had three or four things in my job to do: I write code, I test it, I deploy it, I'm done. Today as an ML engineer, I need to worry about my data. How do I collect it? I need to clean the data, I need to train my models, I need to experiment with them, I need to deploy them, and I need to make sure that they're working once they're deployed. >> Now you've got to do all the DevOps behind it. >> And all the DevOps behind it. And so I'm working halftime as a data scientist, halftime as a software engineer, halftime as like a DevOps cloud- >> Cloud architect. >> It's like a heroic job. And I think this is why these jobs are now really hard jobs and people want to be more and more machine learning >> And they get paid. >> engineering. >> Commensurate with the- >> And they're paid commensurately as well. And this is where I think an opportunity for tools like Fiddler exists as well, because we can help those ML engineers do their jobs better. >> Thanks for coming on theCUBE. Great to see you. We're here at re:MARS. And great to see you again. And congratulations on being on the AWS startup showcase; we're in year two, episode four, coming up. We'll have to have you back on. Krishna, great to see you. Thanks for coming on. Okay, this is theCUBE's coverage here at re:MARS. I'm John Furrier, bringing all the signal from all the noise here.
Not a lot of noise at this event, it's very small, very intimate, a little bit different, but all on point with space, machine learning, robotics, the future of industrial. We'll be back with more coverage after this short break. >> Man: Thank you, John. (upbeat music)

Published Date: Jun 23, 2022



Kickoff with Taylor Dolezal | Kubecon + Cloudnativecon Europe 2022


 

>> Announcer: "theCUBE" presents "Kubecon and Cloudnativecon Europe, 2022," brought to you by Red Hat, the Cloud Native Computing Foundation, and its ecosystem partners. >> Welcome to Valencia, Spain, and "Kubecon + Cloudnativecon Europe, 2022." I'm Keith Townsend, and we're continuing the conversations with amazing people doing amazing things. I think we've moved beyond a certain phase of the hype cycle when it comes to Kubernetes, and we're going to go into a little bit of detail on that today across all the sessions. I have with me today Taylor Dolezal, new head of CNCF Ecosystem. So, first off, what does that mean, new head of? You're the head of CNCF Ecosystem? What is the CNCF Ecosystem? >> Yeah. Yeah. It's really the end user ecosystem. So, the CNCF is comprised of really three pillars. There's the governing board; they oversee the budget and funding, make sure everything's signed and proper. Then there's the Technical Oversight Committee, the TOC. They really help decide the technical direction of the organization through deliberation, talking about which projects get invited and accepted. Projects get donated, and the TOC votes on who's going to make it in, based on all these criteria. And then, lastly, there's the end user ecosystem, which encompasses a whole bunch of different working groups and special interest groups. And that's been really interesting to get a deeper sense of as of late. So, there are groups like the developer experience group and the user research group. And those have very specific focuses that go across all industries. But what we've seen lately is that there are really deep wants to create things like a financial services user group, because end users are having trouble with going to all of the different meetings.
If you're a company, a vendor member company that's selling authentication software or something in networking, it makes sense to have SIG Network, SIG Auth, and those kinds of things. But when it comes down to, like, Boeing, which just joined, does it make sense for them to jump into all those meetings? Or does it make sense to have some other kind of thing that is representative of them, so that they can attend that one thing, specific to their industry? They can get that download and come up to speed, or find the best practices as quickly as possible in a nice, synthesized way. >> So, you're 10 weeks into this role. You're coming from a customer environment. So, talk to me a little bit about the customer side of it. When you're looking at something, it's odd to call the CNCF massive. But it is: 7.1 million members, the number of contributing projects, et cetera. Talk to me about the view from the outside versus the view now that you're inside. >> Yeah, so honestly, it's been fun. For me, it's really mirrored the open-source journey. I've gone to Kubecon before, gotten to enjoy all of the booths, and tried to understand what's going on, and then worked for HashiCorp before coming to the CNCF. And so, I got that vendor member kind of experience working the booth itself. So, I've been getting deeper and deeper into the stack of the conference itself. And I keep saying vendor member and end user member; the difference between those is end users are not organizations that sell cloud native services. Those are the groups that are more on the consuming side, the Airbnbs, the Boeings, the Mercedes, these people that use these technologies and want to give that feedback back to these projects. But yeah, it's incredibly massive and just sprawling when it comes to working in all those contexts. >> So, I have so many questions around the differences between being an end user and interoperating with vendors and the CNCF itself.
So, let's start from the end user lens. When you're an end user and you're out discovering open-source and cloud native products, what's that journey like? How do you go from saying, okay, I'm primarily focused on vendor solutions, to let me look at this cloud native stack? >> Yeah, so really with that, I think a lot of people have started to work with me and ask, "Can we have recommended architectures? Can we have blueprints for how to do these things?" But the CNCF doesn't want to take that position; we don't want to be the kingmaker and say this is the only way forward. We want to be inclusive, we want to pull in these projects, and give everyone the same bootstrap and jump... I'm missing the word for it, just the ability to kind of springboard off of that. Create a nice base for everybody to get started with, and then see what works out and learn from one another. I think that when it comes to Kubernetes, and Prometheus, and some other projects, it's about being able to share best practices between those groups of what works best as well. So, within all of the separations of the CNCF, I think that's something I've found really fun, seeing how the projects relate to those verticals and those groups as well. How you run a project might actually have a really good play inside of an organization, like, "I like that idea. Let's try that out with our team." >> So, like this idea of springboarding. You know, it's when an entrepreneur says, "You know what? I'm going to quit my job and springboard off into doing something new." There's a lot of uncertainty, but for an enterprise, that can be really scary. Like, we're used to our big vendors, HashiCorp, VMware, Cisco, kind of guiding us and telling us, like, what's next? What is that experience like, springboarding off into something as massive as cloud native? >> So, I think it's really, it's a great question.
So, I think that's why the CNCF works so well, is the fact that it's a safe place for all these companies to come together, even companies with competing products. You know, having that common vision of, we want to make production boring again, we don't want to have so much sprawl and have to take in so much knowledge at once. Can we kind of work together to create all these things to get rid of our administrivia or maintenance tasks? I think that when it comes to open-source in general, there's a fantastic book called "Working in Public," by Stripe Press. I recommend it all over the place. It's orange, so you'll recognize it. Yeah, it's easy to see. But it's really good 'cause it talks about the maintainer journey, and what things make it difficult. And so, I think that that's what the CNCF is really working hard to try to get rid of, is all these monotonous things: filing issues, best practices. How do you adopt open-source within your organization? We have tips and tricks, and kind of playbooks in ways that you could accomplish that. So, that's what I find really useful for those kinds of situations. Then it becomes easier to adopt that within your organization. >> So, I asked Priyanka, CNCF executive director last night, a pretty tough question. And this is kind of in the meat of what you do. What happens when... let's pick on service mesh, 'cause everyone likes to pick on service mesh. >> XXXX: Yeah. >> What happens when there's differences at that vendor level on the direction of a SIG or a project, or the ecosystem around service mesh? >> Yeah, so that's the fun part. Honestly, is 'cause people get to hash it out. And so, I think that's been the biggest thing for me finding out, was that there's more than one way to do things. And so, I think it always comes down to use case. What are you trying to do? And then you get to solve after that. So it really is, I know, "it depends," which is the worst answer.
But I really do think that's the case, because if you have people that are using something within the automotive space, or in the financial services space, they're going to have completely different needs, wants, you know, some might need to run COBOL or Fortran, others might not have to. So, even at that level, just down to what your tech stack looks like, audits, and those kinds of things, that can just really differ. So, I think it does come down to something more like that. >> So, the CNCF loosely has become kind of a standards body. And it's centered around the core project Kubernetes? >> Mm-hmm. >> So, what does it mean, when we're looking at larger segments such as service mesh or observability, et cetera, to be Kubernetes compliant? Where's the point, if any, that the CNCF steps in versus just letting everyone hash it out? Is it that things just need to be Kubernetes compliant and everything else is a free-for-all? >> Honestly, in many cases, it's up to the communities themselves to decide that. So, the groups that are running OCI, the Open Container Initiative, Open Storage Interface, all of those things that we've agreed on as ways to implement those technologies, I think that's where the CNCF, that's the line. That's where the CNCF gets up to. And then, it's like we help foster those communities and those conversations and asking, does this work for you? If not, let's talk about it, let's figure out why it might not. And then, really working closely with the community to kind of help bring those things forward and create action items. >> So, it's all about putting the right people in the rooms and not necessarily playing referee, but to get people in the right room to have and facilitate the conversation? >> Absolutely. Absolutely. Like all of the booths behind us could have their own conferences, but we want to bring everybody together to have those conversations.
And again, sprawling can be really wild at certain times, but it's good to have those cross understandings, or to hear from somebody that you're like, "Oh, my goodness, I didn't even think about that kind of context or use case." So, really inclusive conversation. >> So, organizations like Boeing, Adobe, Microsoft, from an end user perspective, it's sometimes difficult to get those organizations into these types of communities. How do you encourage them to participate in the conversation 'cause their voice is extremely important? >> Yeah, there I'd also say it really is the community. I really liked the Kubernetes documentary that was put out, working with some of the CNCF folks and core, early Kubernetes contributors and maintainers. And it just kind of blew me away when they had said, you know, what we thought was success was seeing Kubernetes in an Amazon data center. That's when we knew that this was going to take root. And you'd rarely hear that, like, "Somebody that we typically compete with, success is seeing them use it." And so, I thought that was really cool. >> You know, I like to use this analogy from my community of skipping rope. You see the girls and boys jumping double Dutch rope. And you think, "I can do that. Like it's just jumping." But there's this hesitation to actually, how do you start? How do you get inside of it? The question is how do you become a member of the community? We've talked a lot about what happens when you're in the community. But how do you join the community? >> So, really, there's a whole bunch of ways that you can. Actually, the shirt that I'm wearing, I got from the 1.14 release. So, this is just a fun example of that community. And just kind of how welcoming and inviting that they are. Really, I do think it's kind of like a jawbreaker. Almost, you start at the outside, you start using these technologies, even more generally like, what is DevOps? What is production?
How do I get to infrastructure, architecture, or software engineering? Once you start there, you start working your way in, you develop a stack, and then you start to see these tools, technologies, workflows. And then, after you've kind of gotten a good amount of time spent with it, you might really enjoy it, and then want to help contribute, like, "I like this, but it would be great to have a function that did this. Or I want a feature that does that." At that point in time, you can either take a look at the source code on GitHub, or wherever it's hosted, and then start to kind of come up with some ideas to contribute back to that. And then, beyond that, you can actually say, "No, I kind of want to have these conversations with people." Join in those special interest groups, and those meetings to kind of talk about things. And then, after a while, you can kind of find yourself in a contributor role, and then a maintainer role after that, if you really like the project, and want to kind of work with the community on that front. So, I think you had asked before, like Microsoft, Adobe and these others. Really it's about steering the projects. It's that these communities want these things, and then, these companies say, "Okay, this is great. Let's join in the conversation with the community." And together again, inclusivity, and bringing everybody to the table to have that discussion and push things forward. >> So, Taylor, closing message. What would you want people watching this show to get when they think about ecosystem and CNCF? >> So, ecosystem, it's a big place, come on in. Yeah, (laughs) the water's just fine. I really want people to take away the fact that... I think really when it comes down to it, it really is the community, it's you. We are the end user ecosystem. We're the people that build the tools, and we need help. No matter how big or small, when you come in and join the community, you don't have to rewrite the Kubernetes scheduler.
You can help make documentation that much easier to understand, and in doing so, help thousands of people. If I'm going through the instructions, or reading a paragraph that doesn't make sense, that has such a profound impact. And I think a lot of people miss that. It's like, even just changing punctuation can make such a giant difference. >> Yeah, I think people sometimes forget that community, especially community-run projects, they need product managers. They need people that will help with communications, people that will help with messaging, website updates. Just reachability: anywhere from developing code to developing documentation, there are ways to jump in and help the community. From Valencia, Spain, I'm Keith Townsend, and you're watching "theCUBE," the leader in high tech coverage. (bright upbeat music)

Published Date : May 20 2022


Marcel Hild, Red Hat & Kenneth Hoste, Ghent University | Kubecon + Cloudnativecon Europe 2022


 

(upbeat music) >> Announcer: theCUBE presents KubeCon and CloudNativeCon Europe 2022, brought to you by Red Hat, the Cloud Native Computing Foundation, and its ecosystem partners. >> Welcome to Valencia, Spain, at KubeCon + CloudNativeCon Europe 2022. I'm your host, Keith Townsend, along with Paul Gillon. And we're going to talk to some amazing folks. But first Paul, do you remember your college days? >> Vaguely. (Keith laughing) A lot of them are lost. >> I think a lot of mine are lost as well. Well, not really, I got my degree as an adult, so they're not that far past. I can remember 'cause I have the student debt to prove it. (both laughing) Along with us today is Kenneth Hoste, systems administrator at Ghent University, and Marcel Hild, senior manager software engineering at Red Hat. You're working in the office of the CTO? >> That's absolutely correct, yes. >> So first off, I'm going to start off with you, Kenneth. Tell us a little bit about the research that the university does. Like what's the end result? >> Oh, wow, that's a good question. So the research we do at the university, again, is very broad. We have bioinformaticians, physicists, people looking at financial data, all kinds of stuff. And the end result can be very varied as well. Very often it's research papers, or spinoffs from the university. Yeah, it depends a lot on the domain, I would say. >> So that sounds like the perfect environment for cloud native. Like the infrastructure that's completely flexible, that researchers can come and have a standard way of interacting, each team just uses its resources as it would, the nirvana for cloud native. >> Yeah. >> But somehow, I'm going to guess HPC isn't quite there yet. >> Yeah, not really, no. So, HPC is a bit, let's say, slow in adopting new technologies. And we're definitely seeing some impact from cloud, especially things like containers and Kubernetes, and we're starting to hear these things in the HPC community as well.
But I haven't seen a lot of HPC clusters that are really fully cloud native. Not yet at least. Maybe this is coming. And if I'm walking around here at KubeCon, I can definitely... I'm being convinced that it's coming. So whether we like it or not, we're probably going to have to start worrying about stuff like this. But still, let's say, the most prominent technologies are things like MPI, which has been there for 20, 30 years. The Fortran programming language is still the main language; if you're looking at compute time being spent on supercomputers, over half of the time spent is in Fortran code, essentially. >> Keith: Wow. >> So either the application itself where the simulations are being done is implemented in Fortran, or the libraries that we are talking to from Python for example, for doing heavy duty computations, that backend library is implemented in Fortran. So if you take all of that into account, easily over half of the time is spent in Fortran code. >> So is this because the libraries don't migrate easily to a distributed environment? >> Well, it's multiple things. So first of all, Fortran is very well suited for implementing these types of things. >> Paul: Right. >> We haven't really seen a better alternative maybe. And also it'll be a huge effort to re-implement that same functionality in a newer language. So, the use case has to be very convincing, there has to be a very good reason why you would move away from Fortran. And at least the HPC community hasn't seen that reason yet. >> So in theory, and right now we're talking about the theory and then what it takes to get to the future. In theory, I can take that Fortran code, put it through a compiler, and run it in a container? >> Yeah, of course, yeah. >> Why isn't it that simple? >> I guess because traditionally HPC is very slow at adopting new stuff. So, I'm not saying there isn't a reason that we should start looking at these things. Flexibility is a very important one.
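Kenneth's point about Python front ends sitting on Fortran back ends is easy to see in practice. A minimal sketch, assuming NumPy is installed: the Python below only orchestrates, while the actual solve is dispatched to LAPACK's `dgesv` routine, the kind of compiled numerical back end that is historically Fortran.

```python
import numpy as np

# A tiny linear system; HPC workloads scale this idea to millions of unknowns.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# np.linalg.solve is thin Python glue: the factorization and solve happen
# in compiled LAPACK code, not in the Python interpreter.
x = np.linalg.solve(A, b)
print(x)  # [2. 3.]
```

This is why "over half of the time is spent in Fortran code" holds even for researchers who only ever write Python: the heavy lifting lands in routines like this without them ever seeing the Fortran underneath.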
For a lot of researchers, their compute needs are very spiky. So they're doing research, they have an idea, they want to run lots of simulations, get the results, but then they're silent for a long time writing the paper, or thinking about what they can learn from the results. So there's lots of peaks, and that's a very good fit for a cloud environment. I guess at the scale of a university you have enough diverse end users that all those peaks never fall at the same time. So if you have your own big infrastructure you can still fill it up quite easily and keep your users happy. But this bursty thing, I guess we're seeing that more and more also. >> So Marcel, talk to us about Red Hat needing to service these types of end users. It can be on both ends, I'd imagine: you have some people still writing in Fortran, and you have some people asking you for object-based storage. Where's Fortran, I'm sorry, not Fortran, but where is Red Hat in providing the underlay and the capabilities for the HPC and AI community? >> Yeah. So, I think if you look at the user base that we're looking at, it's on this spectrum from development to production. So putting AI workloads into production, it's an interesting challenge but it's easier to solve, and it has been solved to some extent, than the development cycle. So what we're looking at in Kenneth's domain, it's more like the end user, the data scientist, developing code, and doing these experiments. Putting them into production, that's where containers live and thrive. You can containerize your model, you containerize your workload, you deploy it into your OpenShift Kubernetes cluster, done, you monitor it, done. So the software development and the SRE, the ops part, done, but how do I get the data scientist into this cloud native age where he's not developing on his laptop or on a machine that he SSHes into and then does some stuff on?
And then some system admin comes and needs to tweak it because it's running out of memory or whatnot. But how do we take him and, well, provide him an environment that is good enough to work in, in the browser, with an IDE, where the workload of doing the computation and the experimentation is repeatable, so that the environment is always the same, it's reliable, so it's always up and running. It doesn't consume resources, although it's up and running. Where the supply chain and the configuration of... and, well, the modules that are brought into the system are also reliable. So all these problems that we solved in the traditional software development world now have to transition into the data science and HPC world, where the problems are similar, but yeah, it's different sets. It's more or less also a huge educational problem, and transitioning the tools over into that is something... >> Well, is this mostly a technical issue or is this a cultural issue? I mean, are HPC workloads that different from more conventional OLTP workloads that they would not adapt well to a distributed containerized environment? >> I think it's both. So, on one hand it's the cultural issue, because you have two different communities, everybody is reinventing the wheel, everybody is somewhat siloed. So they think, okay, what we've done for 30 years now, there's no need to change it. And that's what thrives here at KubeCon, where you have different communities coming together: okay, this is how you solved the problem, maybe this applies also to our problem. But it's also the, well, the tooling, which is bound to a machine, which is bound to an HPC computer, which is architecturally different than a distributed environment where you would treat your containers as cattle, as something that you can replace, right? And the HPC community usually builds up huge machines, and these are like the Cray machines.
So it's also a technical bit of moving it to this age. >> So the massively parallel nature of HPC workloads, you're saying Kubernetes has not yet been adapted to that? >> Well, I think that parallelism works great. It's just a matter of moving that out from an HPC computer into the scale-out factor of a Kubernetes cloud that elastically scales out. Whereas the traditional HPC computer, I think, and Kenneth can correct me here, is more like, I have this massive computer with 1 million cores or whatnot, and now use it. And I can use my time slice, and book my time slice there. Whereas in the Kubernetes example, the concept is more like, I have 1000 cores and I declare something into it and scale it up and down based on the needs. >> So, Kenneth, this is where you talked about the culture part of the changes that need to be happening. And quite frankly, the computer is a tool, it's a tool to get to the answer. And if that tool is working, if I have 1000 cores on a single HPC thing, and you're telling me, well, I can't get to a system with 2000 cores. And if you containerize your process and move it over, then maybe I'll get to the answer 50% faster, maybe I'm not that... Someone has to make that decision. How important is it to get people involved in these types of communities from a researcher? 'Cause research is a very tight-knit community to have these conversations in and help that sea change happen. >> I think it's very important that those communities, let's say the cloud community and the HPC research community, should be talking a lot more; there should be way more cross-pollination than there is today. I'm actually, I'm happy that I've seen HPC mentioned at booths and talks quite often here at KubeCon, I wasn't really expecting that. And I'm not sure, it's my first KubeCon, so I don't know, but I think that's kind of new, it's pretty recent.
If you go to the HPC community conferences, containers have been there for a couple of years now; something like Kubernetes is still a bit new. But just this morning there was a keynote by a guy from CERN, who was explaining that they're basically slowly moving towards Kubernetes, even for their HPC clusters as well. And he's seeing that as the future, because of all the flexibility it gives you, and you can basically hide all that from the end user, from the researcher. They don't really have to know that they're running on top of Kubernetes. They shouldn't care. Like you said, to them it's just a tool, and they care about if the tool works, they can get their answers, and that's what they want to do. How that's actually being done in the background, they don't really care. >> So talk to me about the AI side of the equation, because when I talk to people doing AI, they're on the other end of the spectrum. What are some of the benefits they're seeing from containerization? >> I think it's the reproducibility of experiments. So, data scientists are, well, data scientists, and they do research. So they care about their experiment. And maybe they also care about putting the model into production. But I think from a geeky perspective they are more interested in finding the next model, finding the next solution. So they do an experiment, and they're done with it, and then maybe it's going to production. So how do I repeat that experiment in a year from now, so that I can build on top of it? And a container I think is the best solution to wrap something with its dependencies, like freeze it, maybe even with the data, store it away, and then come back to it later and redo the experiment, or share the experiment with some of my fellow researchers, so that they don't have to go through the process of setting up an equivalent environment on their machines, be it their laptop or their cloud environment.
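The "freeze it with its dependencies" idea can be sketched independently of any container runtime. A hedged illustration of the bookkeeping involved; the package names and versions below are made up, and a real container image would pin far more (OS, compilers, data):

```python
import hashlib
import json

def freeze_experiment(code_version: str, packages: dict, data_checksum: str) -> dict:
    """Capture everything needed to repeat an experiment as one pinned record,
    plus a stable fingerprint derived from it."""
    spec = {
        "code": code_version,
        "packages": dict(sorted(packages.items())),  # order-independent
        "data": data_checksum,
    }
    blob = json.dumps(spec, sort_keys=True).encode("utf-8")
    spec["fingerprint"] = hashlib.sha256(blob).hexdigest()
    return spec

# Two researchers pinning the same versions get the same fingerprint, so
# "redo the experiment" provably starts from an identical setup.
a = freeze_experiment("v1.2", {"numpy": "1.24.0", "scipy": "1.10.1"}, "sha256:abc123")
b = freeze_experiment("v1.2", {"scipy": "1.10.1", "numpy": "1.24.0"}, "sha256:abc123")
print(a["fingerprint"] == b["fingerprint"])  # True: key order does not matter
```

A container does this for real by freezing the bits themselves; the sketch only models why a frozen, shareable record beats "set up an equivalent environment by hand."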
You go to the internet, download something, it doesn't work; a container works. >> Well, you said something that really intrigues me. You know, in concept, I can have, let's say, a one terabyte data set, have an experiment associated with that. Take a snapshot of that somehow, I don't know how, take a snapshot of that and then share it with the rest of the community and then continue my work. >> Marcel: Yeah. >> And then we can come back and compare notes. Where are we at on a maturity scale? Like, what are some of the pitfalls or challenges customers should be looking out for? >> I think you actually said it right there, how do I snapshot a terabyte of data? It's, that's... >> It's a terabyte of data. (both conversing) >> It's a bit of a challenge. And if you snapshot it, you have two terabytes of data, or you just snapshot the, like... and get to, okay, this is currently where we're at. So that's why the technology is evolving. How do we do source control management for data? How do we license data? How do we make sure that the data is unbiased, et cetera? So that's going more into the AI side of things. But dealing with data in a declarative way, in a containerized way, I think that's where currently a lot of innovation is happening. >> What do you mean by dealing with data in a declarative way? >> If I'm saying I run this experiment based on this data set and I'm running this other experiment based on this other data set, I as the researcher don't care where the data is stored, I care that the data is accessible. And so I might declare, this is the process that I put on my data, like a data processing pipeline. These are the steps that it's going through. And eventually it will have gone through this process and I can work with my data. Pretty much like applying the concept of pipelines to data. Like you have these data pipelines, and now you have Kubeflow Pipelines as one solution to apply the pipeline concept to, well, managing your data.
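The "declare the process, not the location" idea Marcel describes can be sketched as a toy pipeline runner: each step names what it consumes and what it produces, and a tiny driver wires them together. Kubeflow Pipelines applies the same pattern at cluster scale; the step names and data here are purely illustrative.

```python
from typing import Callable, Dict, List, Tuple

# Each step declares: a name, the artifacts it consumes, the artifact it
# produces, and the function that does the work.
Step = Tuple[str, List[str], str, Callable]

def run_pipeline(steps: List[Step], artifacts: Dict[str, object]) -> Dict[str, object]:
    for name, inputs, output, fn in steps:
        artifacts[output] = fn(*(artifacts[i] for i in inputs))
    return artifacts

steps: List[Step] = [
    ("clean",     ["raw"],     "cleaned", lambda xs: [x for x in xs if x is not None]),
    ("normalize", ["cleaned"], "scaled",  lambda xs: [x / max(xs) for x in xs]),
    ("summarize", ["scaled"],  "report",  lambda xs: sum(xs) / len(xs)),
]

out = run_pipeline(steps, {"raw": [4.0, None, 2.0, 8.0]})
print(round(out["report"], 4))  # 0.5833
```

The researcher only declares the steps; where "raw" physically lives, and which machine runs each step, is the runner's problem, which is exactly the separation the declarative approach buys.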
>> Given the stateless nature of containers, is that an impediment to HPC adoption because of the very large data sets that are typically involved? >> I think it is if you have terabytes of data. Just, you have to get it to the place where the computation will happen, right? And just uploading that into the cloud is already a challenge. If you have the data sitting there on a supercomputer and maybe it was sitting there for two years, you probably don't care. And typically a lot of universities the researchers don't necessarily pay for the compute time they use. Like, this is also... At least in Ghent that's the case, it's centrally funded, which means, the researchers don't have to worry about the cost, they just get access to the supercomputer. If they need two terabytes of data, they get that space and they can park it on the system for years, no problem. If they need 200 terabytes of data, that's absolutely fine. >> But the university cares about the cost? >> The university cares about the cost, but they want to enable the researchers to do the research that they want to do. >> Right. >> And we always tell researchers don't feel constrained about things like compute power, storage space. If you're doing smaller research, because you're feeling constrained, you have to tell us, and we will just expand our storage system and buy a new cluster. >> Paul: Wonderful. >> So you, to enable your research. >> It's a nice environment to be in. I think this might be a Jevons paradox problem, you give researchers this capability you might, you're going to see some amazing things. Well, now the people are snapshoting, one, two, three, four, five, different versions of a one terabytes of data. It's a good problem to have, and I hope to have you back on theCUBE, talking about how Red Hat and Ghent have solved those problems. Thank you so much for joining theCUBE. From Valencia, Spain, I'm Keith Townsend along with Paul Gillon. 
And you're watching theCUBE, the leader in high tech coverage. (upbeat music)

Published Date : May 19 2022


Melanie Frank, Capital One | AWS re:Invent 2020


 

>> Announcer: From around the globe. It's theCUBE. With digital coverage of AWS re:Invent 2020, sponsored by Intel, AWS, and our community partners. >> Hi, welcome to theCUBE virtual and our coverage of AWS re:Invent 2020. We are theCUBE virtual and I'm your host, Keith Townsend. Today I'm joined by Melanie Frank, who is managing VP of technology at Capital One. Welcome to theCUBE, Melanie. >> Thanks for having me, glad to be here. >> So first time on theCUBE, but you guys have done something big at Capital One. So we're not going to take it easy on you. This has been hitting the news cycles. You guys have closed down your last data center. What spurred the initiative for Capital One to exit the private data center? >> Oh, there's so much cool technology, I think, that we'll talk about, but you know, if you want to talk about the why, that is not tech for the sake of tech in this case; this is about working back from what our customer needs are. So I think of how the digital world has transformed our expectations as consumers, right. I actually use a digital assistant a lot in the kitchen. And so the other day I was cooking and, you know, I update a shopping list, set a timer. I'm just used to doing those things. And the other day I actually asked my digital assistant to set my stove to 350 degrees, even though I do not have a smart stove. It's not integrated with my digital assistant, nothing in my home has that capability right now, but it really struck me as a wow: this interaction has changed my expectations now for my entire kitchen. And I think that those types of experiences now are what we've come to expect as consumers. And that really was the center of it for us. Our shift to cloud and exit of the data centers was all about our ability to provide our customers with experiences that are real-time and intelligent, which you just can't do if you're running on outdated technology in a data center.
>> So I absolutely understand the benefits once you're there. The other day, I was bringing up another circuit in our virtual data center and I'm thinking, wow, man, this is so easy. I can just issue a command, software defined, and now the data center has redundant connectivity. But getting there is a process. Can you talk to us about the process and how long it took for Capital One to actually reach that goal? >> Yeah, you know, you're so right. Like it is so nice once you're there, but the transformational aspect of this is not to be underestimated. This was part of a massive eight-year technology transformation, really, that was about modernizing the way in which we worked as well as the tech infrastructure itself. So our goal was to get to this destination where we were faster and more nimble, like you described, with those new capabilities for those customers. But we're talking about this eight-year transformation where we transformed our talent. We had to hire product managers, data scientists, designers. We shifted our developer skill sets. We're now sitting at an 11,000-person tech organization with 85% of that being, you know, engineers. We shifted the way we work to agile, which is just much more conducive to that rapid delivery of value. And then of course, on the tech side, really since about 2014, we've closed eight data centers and rebuilt thousands of applications to take full advantage of the cloud technology and capabilities. >> Cool. So it goes without saying, you guys are one of the world's biggest financial organizations, and you've highlighted the non-technical part of the journey. Can you provide a little bit of insight for us on, kind of, your partners within the bank, not just technologists in the technology organization? How was this, not necessarily disruptive, but changing for, you know, groups like audit? In financial services, you're constantly worried about audit.
How did audit embrace this change? >> Well, I think there was a huge learning for the entire organization to think about what parts of what we do need to be done differently. In some ways there were a lot of benefits. If I think about our business partners in the finance team: where, you know, you had a data center running with massive amounts of technology infrastructure, perhaps you had to size that technology infrastructure for your peak loads, for us during, you know, holiday shopping periods and things like that. And now we're at a position where we can much more nimbly control the tech. We can auto-scale up, we can auto-scale down, and that is much more cost-effective for us, for our business. And then from a financial perspective, you just take that use case for the finance department: it's, you know, adjusting so that we can directly show that cost to the line of business and allow them to make the changes that they need, which makes sense for their business and their customers. >> So let's talk about more of that process and the journey from a technology perspective. As we look at something as mature as a bank's infrastructure, not all of those applications can be migrated and re-platformed easily. How did you guys deal with those tough last-to-move applications, you know, that last 20%? >> Yeah. I think, you know, it started early with that, for us, with a declaration that handled, I'll say, the easier part: anything new gets built in the cloud. And as you point out, that is way easier than tackling some of those things that you've probably been dealing with for a very, very long time from an application standpoint. We knew early on that it would be better if we could modernize the applications themselves as we moved them to the cloud to really unlock the advantages, because there's one part that is the advantages of the cloud infrastructure itself.
There is a second part: what it forced was an application modernization and a tech modernization for us. And so those two things together were super powerful. We had a few stubborn ones where we said, okay, can we containerize this? Can we lift and shift this over? To me, it was akin to, you know, moving from one house to another: you're kind of cleaning out your attic and you're trying to figure out, ideally, you don't take anything to the new house that you don't need, and you do all this cleanup. And at some point you say, well, this I'm going to put in this box and I will deal with it at the new house. And ideally you do that before you put it right back in the attic. So we had a few of those, but in large part, you know, 85%, I think that's part of why it took us so long: let's do this right, and let's get this so that these applications can run effectively and take full advantage of the cloud. >> So let's talk about some of the potential benefits. Over the past eight to nine months, my relationship with my commercial banker, with my private banker, has really changed due to the pandemic. Talk to me about some of the advantages or capabilities the bank has gained as a result of moving almost all, well, now totally, to the public cloud. >> Yeah. Yeah. It's a good question. I think I'll start with one that is a little bit more technically oriented, then talk about the capabilities. So, you mentioned the pandemic, and we had, you know, massive amounts of planning as we were kind of taking in the full impact of what was about to happen for our associates and our customers, and trying to think through how customer behavior would change during a pandemic. We didn't have a whole lot of indicators as to what that might look like.
You know, how is their activity going to change in transactions, for example? Or, you know, are they going to change the frequency with which they're logging in online or paying? The cloud gave us the flexibility, as you mentioned earlier, to scale rapidly in case the projections that we were making about consumer behavior were wrong. And so we could keep the platforms up and running and recover them much quicker, with more resilient infrastructure to make sure everything stayed up, because we really were in unprecedented times, trying to think through how behaviors and needs of customers would change. Secondly, you know, from a capability standpoint, we talked about the need for those real-time, intelligent experiences that only cloud can give you, that only modern applications can give you. Things like Eno, our digital assistant, which is built on a streaming architecture: it can identify unusual charges, a 40% tip, and alert me in real time. You know, these days a 40% tip, I'm trying to help local businesses, that's exactly true. But the fact that Eno is out there kind of looking out and watching my back and saying, this is unusual, is this you? You're transacting a lot online, who knows what the fraudsters are looking for. It's those types of experiences that you can't build if you are posting transactions to a mainframe that, you know, runs a batch process overnight. It doesn't help me if you tell me the next day. >> So let's talk about this talent transformation a little bit too, because one of the most difficult things I've witnessed with any type of massive transformation like this is recruitment and retention of talent, and the industry hasn't quite built up the talent pool to support such massive transformations. How is this impacting your talent and recruitment processes? >> Yeah. So, massively.
And the good news is, you know, given the amount of time we've been on this transformation, we had some time to allow that to adjust. When we started, knowing that we were focused largely on AWS, we said it'd be great if we could go find, you know, thousands of engineers who are deep experts in AWS, and they didn't necessarily exist at the time, given that there aren't a whole lot of companies doing what we were doing at the time besides perhaps, you know, AWS themselves and Netflix, and they're good partners to us, so we didn't need to steal everybody's talent. And so we started early on with training and re-skilling our engineers. Generally speaking, I find that engineers love using new things, and they in particular love learning newer technologies. I haven't found one yet who's kind of resistant and, you know, wanting to go back and learn COBOL or Fortran like I had to when I was in college. So, you know, the fact that they could learn some of this modern tech helped evolve and develop them, really kind of push on some of the capabilities and partner with partners like AWS to enhance the capabilities that are out there. And that's an engineer's dream. The fact is, we had plenty of time. We made the declaration that we were going to go all in on cloud well before we targeted getting it done, probably before we even knew how in the world we were going to get there, and it gave folks plenty of time to think through the training. We provided massive amounts of access to learning and training certifications for everyone to develop the skills that they needed. And I think it's just been great and super fun from a talent perspective. >> I love that you mentioned Netflix, they've been another company that's been extremely public with their journey to the cloud.
We kind of think of Netflix as a born-in-the-cloud company, and they weren't; they had a journey to the cloud. You've shared your journey to the cloud. What are some of the pieces of advice, the best practices, you can give other companies looking to take that similar journey? >> I will say to, you know, not underestimate the transformative part. For us, I think I've said it before, and I will say it again: this was not just a tech transformation. This started with our customer needs. It started with a business strategy. It was transformative to our culture, to how we think about building and delivering capabilities, as well as the software that then underlies and supports them. And so I think, you know, start from a place of: where are you trying to go, and why? And give yourself really the fortitude and commitment to achieve it, because, you mentioned it, it is not necessarily easy. If you run on a yearly budget cycle, unless you are, you know, running on very few applications at this point, you will not get this done in kind of a one-year budget cycle. This is a multi-year journey. And ideally you're changing the technology itself, but also how software gets delivered, and therefore, as we just talked about, the talent required to do so. >> So Melanie, we've been watching your journey for the past couple of years, and we so appreciate you sharing your journey with theCUBE. More importantly, you're now a CUBE alum, and you get all the benefits of the CUBE alum, which includes a great headshot that the team has shared. We really appreciate you sharing a builder's journey at this builder's show. Stay tuned for more coverage of AWS re:Invent 2020 virtual. Thanks for joining us. (gentle upbeat music)
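The real-time capability Melanie describes with Eno, evaluating each card transaction as it arrives rather than in an overnight mainframe batch, can be sketched as a toy stream check. This is an invented illustration, not Capital One's actual architecture: the event shape, the tip threshold, and the alert handling are all assumptions made for the example.

```python
# Toy stream processor: flag unusually large tips as each transaction arrives.
# Invented example only; the real Eno pipeline is a streaming architecture,
# not a Python loop, and its rules and thresholds are not public.

TIP_ALERT_RATIO = 0.25  # assumed threshold: flag tips above 25% of the bill

def check_transaction(txn, alert):
    """Evaluate one event immediately; a nightly batch would see it a day late."""
    bill, tip = txn["bill"], txn["tip"]
    if bill > 0 and tip / bill >= TIP_ALERT_RATIO:
        alert(f"Unusual {tip / bill:.0%} tip of ${tip:.2f} on a ${bill:.2f} bill")

alerts = []
stream = [
    {"bill": 52.00, "tip": 9.00},   # ~17% tip: normal, no alert
    {"bill": 40.00, "tip": 16.00},  # 40% tip: flagged as it arrives
]
for txn in stream:
    check_transaction(txn, alerts.append)

print(alerts)  # one alert, for the 40% tip
```

The point of the sketch is the shape of the system, not the rule itself: each event is scored the moment it arrives, which is what makes a same-second alert possible at all.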

Published Date : Dec 1 2020


Tom Clancy, UiPath & Kurt Carlson, William & Mary | UiPath FORWARD III 2019


 

(upbeat music) >> Announcer: Live from Las Vegas, it's theCUBE! Covering UiPath FORWARD Americas 2019. Brought to you by UiPath. >> Welcome back, everyone, to theCUBE's live coverage of UiPath FORWARD, here in Sin City, Las Vegas, Nevada. I'm your host, Rebecca Knight, co-hosting alongside Dave Vellante. We have two guests for this segment. We have Kurt Carlson, Associate Dean for faculty and academic affairs of the Mason School of Business at the College of William and Mary. Thanks for coming on the show. >> Thank you for having me. >> Rebecca: And we have Tom Clancy, the SVP of learning at UiPath, thank you so much. >> Great to be here. >> You're a CUBE alum, so thank you for coming back. >> I've been here a few times. >> A CUBE veteran, I should say. >> I think 10 years or so. >> So we're talking today about a robot for every student. This was just announced in August: William and Mary is the first university in the US to provide automation software to every undergraduate student, thanks to a four-million-dollar investment from UiPath. Tell us a little bit about this program, Kurt, how it works and what you're trying to do here. >> Yeah, so first of all, thanks to Tom and the people at UiPath for making this happen. This is a bold and incredible initiative, one that, frankly, when we had it initially, we thought that maybe we could get a robot for every student; we weren't sure that other people would be willing to go along with that, but UiPath was, they see the vision, and so it was really a meeting of the minds on a common purpose. The idea was pretty simple: this technology is transforming the world in a way that, we think, is going to transform the way that students actually are students. But it's certainly transforming the world that our students are going into. And so, we want to give them exposure to it.
We wanted to try and be the first business school on the planet that actually prepares students not just for the way RPA's being used today, but the way that it's going to be used when AI starts to take hold, when it becomes the gateway to AI three, four, five years down the road. So, we talked to UiPath, they thought it was a really good idea, we went all in on it. Yeah, all of our starting juniors in the business school have robots right now, they've all been trained through the Academy live sessions, we're putting together a course, it's very exciting. >> So, Tom, you've always been an innovator when it comes to learning, here's my question. How come we didn't learn this cool stuff when we were in college? We learned Fortran. >> I don't know, I only learned BASIC, so I can't speak to that. >> So you know, last year we talked about how you're scaling learning, some of the open sort of philosophy that you have. So, give us the update on how you're pushing learning FORWARD, and why the College of William and Mary. >> Okay, so if you buy into a bot for every worker, or a bot for every desktop, that's a lot of bots, that's a lot of desktops, right? There are studies out there from the research companies that say that there's somewhere between a hundred and 200 million people that need to be educated on RPA, RPA/AI. So if you buy into that, which we do, then traditional learning isn't going to do it. We're going to miss the boat. So we have a multi-pronged approach. The first thing is to democratize RPA learning. Two and a half years ago we created the UiPath Academy, and it's 100% free. After two and a half years, we've had 451,000 people go through the academy courses, that's huge. But we think there's a lot more. Over the next three years we think we'll train at least two million people. But the challenge still is, if we train five million people, there's still a hundred million that need to know about it.
So, the second biggest thing we're doing is, we went out, last year at this event, and announced our academic alliance program. We had one university; now we're approaching 400 universities. But what we're doing with William and Mary is a lot more than just providing a course, and I'll let Kurt talk to that, but there is so much more that we could be doing to educate our students, our youth, upskilling and reskilling the existing workforce. When you break down that hundred million people, they come from a lot of different backgrounds, and we're trying to touch as many people as we can. >> You guys are really out ahead of the curve. Oftentimes, I mean, you saw this a little bit with data science, saw some colleges leaning in. So what led you guys to the decision to actually invest in and prioritize RPA? >> Yeah, I think what we're trying to accomplish requires incredibly smart students. It requires students that can sit at the interface between what we would think of today as sort of an RPA developer and a decision maker who would be stroking the check or signing the contract. There's got to be somebody that sits in that space that understands enough about how you would actually execute this implementation. What's the right buildout of that, how we're going to build a portfolio of bots, how we're going to prioritize the different processes that we might automate, how we're going to balance some processes that might have a nice ROI but be harder to absorb for the individual whose process is being automated, against processes that the individual would love to have automated but might not have as great of an ROI. How do you balance that whole set of things? So what we've done is worked with UiPath to bring together the ideas of automation with the ideas of being a strategic thinker in process automation, and we're designing a course in collaboration to help train our students to hit the ground running. >> Rebecca, it's really visionary, isn't it?
I mean, it's not just about using the tooling, it's about how to apply the tooling to create competitive advantage or change lives. >> I used to cover business education for the Financial Times, so I completely agree that this really is a game changer for the students, to have this kind of access to technology and the ability to explore this leading edge of software robotics, and to graduate from college, this isn't even graduate school, they're graduating from college, already having these skills. So tell me, Kurt, what are they doing? What is the course, what does it look like, how are they using this in the classroom? >> The course is what's called a one-credit. It's 14 hours, but it actually turns into about 42 when you add the stuff that's going on outside of class. They're learning about these large conceptual issues around how you prioritize which processes, what's the process you should go through to make sure that you measure in advance of implementation so that you can do an audit on the back end to have proof points on the effectiveness, so you've got to measure in advance, creating a portfolio of prospective processes and then scoring them, how do you do that. So they're learning all that sort of conceptual, straight business-slash-strategy implementation stuff; that's the first half. And to keep them engaged with this software, we're giving them small skills, we're calling them skill-lets, small skills in every one of those sessions that add up to having a fully automated and programmed robot. Then they're going to go into a series of days where on every one of those days they're going to learn a big skill. And the big skills are ones that are going to be useful for the students in their lives as people, useful in their lives as students, and useful in their lives as entrepreneurs using RPA to create new ventures, or in the organizations they go to. We've worked with UiPath and with our alums who've implemented this, folks at EY, Booz.
In fact, we went up to DC, we had a three-hour meeting with these folks. So what are the skills students need to learn? They told us, and so we built these three big classes, each around one of those skills, so that our students are going to come out with the ability to be business translators, not necessarily the hardcore programmers. We're not going to prevent them from doing that, but to be these business translators that sit between the programming and the decision makers. >> That's huge because, you know, like, my son's a senior in college. He and his friends, they all either want to work for Amazon, Google, an investment bank, or one of the big SIs, right? So this is a perfect role for a consultant to go in and advise. Tom, I wanted to ask you, and you and I have known each other for a long time, but one of the reasons I think you were successful at your previous company is because you weren't just focused on a narrow vendor, how to make metrics work, for instance. I presume you're taking the same philosophy here. It transcends UiPath and is really more about, you know, the category if you will, the potential. Can you talk about that? >> So we listen to our customers, and now we listen to the universities too, and they're going to help guide us to where we need to go. Most companies in tech, you work with marketing, and you work with engineering, and you build product courses. And you also try to sell those courses, because it's a really good P&L when you sell training. We don't think that's right for the industry, for UiPath, or for our customers, or our partners. So when we democratize learning, everything else falls into place. So, as we go forward, we have a bunch of ideas. You know, as we get more into AI, you'll see more AI-type courses. We're teamed with 400 universities now; by the end of next year, we'll probably have a thousand universities signed up.
And so, there's a lot of subject matter expertise, and they come to us with ideas. You mentioned a 14-hour course; we have a four-hour course, and we also have a 60-hour course. So we want to be as flexible as possible, because different universities want to apply it in different ways. We also heard about Lean Six Sigma, I mean, sorry, Lean RPA, so we might build a course on Lean RPA, because that's really important. Solution architect is one of the biggest gaps in the industry right now, so we look to where these gaps are, we listen to everybody, and then we just execute. >> Well, it's interesting you said Six Sigma. We have Jean Younger coming on, she's a Six Sigma expert. I don't know if she's a black belt, but I'm pretty sure she is. She talks about how to apply RPA to bring business processes to Six Sigma, where you would never have spent the time and money before, I mean, for an airplane engine, for sure, but now, so that's kind of transformative. Kurt, I'm curious as to how you, as a college, market this. You know, it's a very competitive industry, if you will. So how do you see this attracting students and separating you guys from the pack? >> Well, it's two separate things: how do we actively try to take advantage of this, and what effects is it having already? Enrollments to the business school, well, students at William and Mary get admitted to William and Mary, and they're fantastic, amazingly good undergraduate students. The best students at William and Mary come to the Raymond A. Mason School of Business. If you take the undergraduate GPA of students in the business school, they're top five in the country. So what we've seen since we've announced this is that our applications to the business school are up. I don't know that it's a one-to-one correlation. >> Tom: I think it is. >> I believe it's a strong predictor, right? In part because it's such an easy sell.
And so, when we talked to those alums and friends in DC and said, tell us why this is, why our students should do this, they said, well, if for no other reason, we are hiring students that have these skills into data science lines in the mid-90s. When I said that to my students, they fell out of their chairs. So there's incredible opportunity here for them; that's the easy way to market it internally. It aligns with things that are happening at William and Mary, trying to be innovative, nimble, and entrepreneurial. We've been talking about being innovative, nimble, and entrepreneurial for longer than we've been doing it; we believe we're getting there, and we believe this is the type of activity that fits with that. As far as promoting it, we're telling everybody that will listen that this is interesting, and people are listening. You know, the standard sort of marketing strategy that goes around, and we are coordinating with UiPath on that. But internally, this actually sells pretty easily. This is something people are looking for. We're going to make it ready for the world the way that it's going to be now and in the future. >> Well, I imagine the big consultants are hovering as well. You know, you mentioned DC: Booz Allen Hamilton in DC, and Accenture, EY, Deloitte, PWC, IBM itself. I mean, they all want the best and the brightest, and now you're going to have this skill set that is a sweet spot for their businesses. >> Kurt: That's the plan. >> I'm just thinking back to remembering who these people are; these are 19- and 20-year-olds. They've never experienced the dreariness of work and the drudge tasks that we all know well. So, in terms of this whole business translator idea, they're going to be the people that sit in the middle and can sort of speak both languages. What kind of skills are you trying to impart to them? Because it is a whole different skill set.
>> Our vision is that in two or three years, the nodes and the processes that currently make implementing RPA complex and require significant programmer skills, these places where, right now, there's a human making a relatively mundane decision, but it's still a model, there's a decision node there, we think AI is going to take over that. AI's going to simply put models into those decision nodes. We also think a lot of the programming that takes place, you're seeing it now with StudioX, a lot of the programming is going to go away. And what that's going to do is elevate the business process from the mundane to the more human-intelligent, what would currently be considered a human intelligence process. When we get into that space, people skills are going to be really important, prioritizing is going to be really important, identifying organizations that are ripe for this at this moment in time, which processes to automate. Those are the kinds of skills we're trying to get students to develop, and what we're selling it as, partly, is: this is going to make you ready for the world the way we think it's going to be, a bit of a guess. But we're also saying, if you don't want to automate mundane processes, then come with us on a different magic carpet ride. And that magic carpet ride is: imagine all the processes that don't exist right now because nobody would ever conceive of them, because they couldn't possibly be sustained, or they would be too mundane. Now think about those processes through a business lens, so take a business student and think about all the potential when you look at it that way. So this course that we're building has that, everything in the course is wrapped in that, and so, at the end of the course, they're going to be doing a project, and the project is to bring a new process to the world that doesn't currently exist.
Don't program it, don't worry about whether or not you have a team that could actually execute it. Just conceive of a process that doesn't currently exist and let's imagine, with the potential of RPA, how we would make it happen. We think we're going to be able to bring a lot of students along through that innovative lens, even though they are 19 and 20, because 19- and 20-year-olds love innovation, while they've never submitted a procurement report. >> Exactly! >> An innovation presentation. >> We'll need to do a CUBE follow-up with that. >> What Kurt just said is the reason why, Tom, I think this market is being way undercounted. I think it's hard for the IDCs and the Forresters, because they look back and say, how big was it last year, how fast are these companies growing, but, to your point, there are so many unknown processes that could be attacked. The TAM on this could be enormous. >> We agree. >> Yeah, I know you do, but I think that it's a point worth mentioning, because it touches so many different parts of every organization that I think people perhaps don't realize the impact that it could have. >> You know, listening to you, Kurt, when you look at these young kids, at least compared to me, all the coding and setting up a robot, that's the easy part, they'll pick that up right away. It's really the thought process that goes into identifying new opportunities, and that's, I think, what you're challenging them to do. But learning how to do robots, I think, is going to be pretty easy for this new digital generation. >> Piece of cake. Tom and Kurt, thank you so much for coming on theCUBE, a really fascinating conversation. >> Thank you. >> Thanks, you guys. >> I'm Rebecca Knight, for Dave Vellante. Stay tuned for more of theCUBE's live coverage of UiPath FORWARD. (upbeat music)
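The prioritization exercise Kurt describes, measuring in advance, building a portfolio of prospective processes, and scoring them to balance ROI against how easily the process owner can absorb the automation, can be sketched as a simple weighted ranking. Every process name, score, and weight below is hypothetical, invented purely to illustrate the trade-off; a real portfolio would be scored from measured data.

```python
# Toy portfolio scoring for RPA candidates: rank processes by a weighted blend
# of estimated ROI and ease of adoption. All names, scores, and weights are
# made up for illustration.

candidates = [
    # (process, estimated ROI 0-10, ease of adoption 0-10)
    ("invoice matching",           9, 3),  # great ROI, but owner resists change
    ("new-hire account setup",     6, 8),  # decent ROI, easy to absorb
    ("quarterly report collation", 4, 9),
]

W_ROI, W_EASE = 0.6, 0.4  # assumed weights; an organization would tune these

def score(roi, ease):
    """Blend the two criteria into one number to rank by."""
    return W_ROI * roi + W_EASE * ease

portfolio = sorted(candidates, key=lambda c: score(c[1], c[2]), reverse=True)
for name, roi, ease in portfolio:
    print(f"{score(roi, ease):.1f}  {name}")
```

Under these assumed weights, the easier-to-absorb process edges out the higher-ROI one, which is exactly the balancing act Kurt raises between ROI and how hard the change is for the person whose process is automated.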

Published Date : Oct 15 2019


Lukas Heinrich & Ricardo Rocha, CERN | KubeCon + CloudNativeCon EU 2019


 

>> Live from Barcelona, Spain, it's theCUBE, covering KubeCon + CloudNativeCon Europe 2019. Brought to you by Red Hat, the Cloud Native Computing Foundation, and Ecosystem Partners. >> Welcome back to theCUBE, here at KubeCon + CloudNativeCon 2019 in Barcelona, Spain. I'm Stu Miniman. My co-host is Corey Quinn, and we're thrilled to welcome to the program two gentlemen from CERN. Of course, CERN needs no introduction. We're going to talk some science, going to talk some tech. To my right here is Ricardo Rocha, who is a computer engineer, and Lukas Heinrich, who's a physicist. So Lukas, let's start with you. You know, if you were a traditional enterprise, we'd talk about your business, but talk about your projects, your applications. What piece of, you know, fantastic science is your team working on? >> All right, so I work on an experiment that is situated at the Large Hadron Collider, a particle accelerator experiment where we accelerate protons, which are hydrogen nuclei, to a very high energy, so that they travel at almost the speed of light. And so, we have a large tunnel underground, 100 meters underground in Geneva, straddling the border of France and Switzerland. And there, we're accelerating two beams. One is going clockwise. The other one is going counterclockwise, and there, we collide them. And so, I work on an experiment that kind of looks at these collisions and then analyzes this data. >> Lukas, if I can, you know, when you talk to most companies, you talk about scale, you talk about latency, you talk about performance. Those have real-world implications for your world. Do you have anything you could share there?
>> Yeah, so, one of the main things that we need to do, so we collide these protons 40 million times a second, and we need to analyze them in real time, because we cannot write out all the collision data to disk because we don't have enough disk space, and so we essentially run a 10,000-core real-time application to analyze this data and see which collisions are actually most interesting, and then only those get written out to disk. So this is a system that I work on called the trigger, and yeah, that's pretty dependent on latency. >> All right, Ricardo, luckily, you know, your job's easy. We say to most people, you need to respond to what the business needs and, you know, don't worry, you can't go against the laws of physics. Well, you're working on physics here, and boy, those are some hefty requirements. Talk a little bit about that dynamic and how your team has to deal with some pretty tough challenges. >> Right, so, as Lukas was saying, we have this large amount of data. The machines can generate something on the order of a petabyte a second, and then, thanks to the hardware- and software-level triggers, they reduce this to something like 10 gigabytes a second, and that's what my side has to handle. So, it's still a lot of data. We are collecting something like 70 petabytes a year, and we keep adding, so right now the amount of storage available is on the order of 400 petabytes. We're starting to get to a pretty large scale. And then we have to analyze all of this. So we have one big data center at CERN, which is around 300,000 cores, but that's not enough, so what we've done over the last 15, 20 years, we've created this large distributed computing environment around the world. We link many different institutes and research labs together, and this doubles our capacity.
So that's our challenge: to make sure that, with all the effort the physicists put into building this large machine, in the end it's not the computing that is breaking the whole system. We have to keep up, yup. >> One thing that I always find fascinating is people who are dealing with real problems that push our conception of what scale starts to look like, and when you're talking about things like a petabyte a second, that's beyond the comprehension of what most of us can wind up talking about. One problem that I've seen historically with a number of different infrastructure approaches is that it requires a fair level of complexity to go from this problem to this problem to this problem, and you have to wind up working through a bunch of layers of abstraction, and the end result is, at the end of all of this, we can run our blog that gets eight visits a day, and that just doesn't seem to make sense. Whereas what you're talking about, that level of complexity is more than justified. So my question for you is, as you start seeing these things evolve and looking at other best practices and guidance from folks who are doing far less data-intensive applications, are you seeing that a lot of the best practices start to fall down as you're pushing theoretical boundaries of scale? >> Right, that's actually a good point. Like, the physicists are very good at getting things done, and they don't worry that much about the process, as long as in the end it works. But there's always this kind of split between the physicists and the computing engineers, where we want to establish practices, but at the end of the day, we have a large machine that has to work, so sometimes we skip a couple of steps. But there's still quite a lot of control on things like data quality and software validation and all of this. But yeah, it's a non-traditional environment in terms of IT, I would say. It's much more fast-paced than most traditional companies.
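The trigger Lukas described earlier, 40 million collisions a second filtered down so that only the most interesting events reach disk, can be caricatured in a few lines. This is a toy sketch, not the real ATLAS trigger: the event structure, the single energy threshold, and the simulated energies are all invented for illustration.

```python
import random

def software_trigger(events, threshold=100.0):
    """Toy software trigger: keep only events whose energy exceeds a
    threshold (arbitrary units).

    The real LHC triggers apply far more sophisticated hardware and
    software selections in stages; this is purely illustrative.
    """
    return [e for e in events if e["energy"] > threshold]

# Simulate a batch of collisions with random energies (invented numbers).
random.seed(0)
batch = [{"id": i, "energy": random.expovariate(1 / 30.0)}
         for i in range(100_000)]

kept = software_trigger(batch)
print(f"kept {len(kept)}/{len(batch)} events "
      f"({1 - len(kept) / len(batch):.1%} rejected)")
```

With an exponential energy spectrum and a high threshold, the vast majority of events are rejected, which mirrors the scale of the reduction the speakers describe, even though the real selection is far richer than a single cut.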
>> You mentioned you had how many cores working on these problems on site? >> So in-house, we have 300,000. >> If you were to do a full migration to the public cloud, you'd almost have to repurpose that many cores just to calculate the bill at that point. Just because of all the different dimensions, everything you wind up working on at that scale becomes almost completely non-trivial. I don't often say that I'm not sure the public cloud can scale to the level that someone would need, but in your case, that becomes a very real concern. >> Yeah, so that's one debate we are having now, and it has a lot of advantages to have the computing in-house, also because we pretty much use it 24/7. It's a very different type of workload. We need a lot of resources 24/7, so even the pricing is calculated differently. But the issue we have now is that the accelerator will go through a major upgrade in five years' time, where we will increase the amount of data by 100 times. Now we are talking about 70 petabytes a year, and we're very soon talking about exabytes. So the amount of computing we'll need there is just going to explode, so we need all the options. We're looking into GPUs and machine learning to change how we do computing, and we are looking at any kind of additional resources we might get, and there the public cloud will probably play a role. >> Could you speak to the dynamic of how an upgrade like that works, you know, how do you work together? I can't imagine that you just say, "Well, we built whatever we needed and everything, and, you know, throw it over the wall and make sure it works." >> Right, I mean, so I work a lot on this boundary between computing and physics, and so internally, I think we also go through the same processes as a lot of companies, in that we're trying to educate people on the physics side on how to follow the best practices, because it's also important.
So one thing I stressed also in the keynote is this idea that reproducibility and reusability of scientific software is pretty important, so we teach people to containerize their applications and then make them reusable and stuff like that, yup. >> Anything about that relationship you can expound on? >> Yeah, so this keynote we had yesterday is a perfect example of how this is improving a lot at CERN. We were actually using data from CMS, which was one of the experiments. Lukas is a physicist in ATLAS, which is like a competing experiment, kind of. I'm in IT, and all this containerized infrastructure is kind of getting us all together, because computing is getting much easier in terms of how to share pieces of software and even infrastructure, and this helps us a lot internally also. >> So what in particular about Kubernetes helps your environment? You've talked about the 15 years that you've been on this distributed systems build-out, so it sounds like you were the hipsters when it came to some of these solutions we're working on today. >> That has been like a major change. Lukas mentioned the container part for the software reproducibility, but I have been working on the infrastructure for, I joined CERN as a student and I've been working on the distributed infrastructure for many years, and we basically had to write our own tools, like storage systems, all the batch systems, over the years. And suddenly, with this public cloud explosion and open source usage, we can just go and join communities that sometimes have requirements higher than ours, and we can focus really on the application development. If we start writing software using Kubernetes, then not only do we get this flexibility of choosing different public clouds or different infrastructures, but we also don't have to care so much about the core infrastructure: all the monitoring, log collection, restarting. Kubernetes is very important for us in this respect.
We kind of remove a lot of the software we were depending on for many years. >> So these days, as you look at this build-out, not just what you're doing today but what you're looking to build in the upcoming years, are you viewing containers as the fundamental primitive of what empowers this? Are you looking at virtual machines as that primitive? Are you looking at functions? Where exactly do you draw the abstraction layer, as you start building this architecture? >> So, yeah, traditionally we've been using virtual machines for maybe the last 10 years, or eight years at least, and we see containerization happening very quickly, and maybe Lukas can say a bit more about how this is important on the physics side? >> Yeah, so currently I think we are looking at containers for the main abstraction, because we also go through things like functions as a service. What's kind of special about scientific applications is that we don't usually just have our entire code base on one software stack, right? It's not like we would deploy a Node.js application or a Python stack and that's it. Sometimes you have a complete mix between C++, Python, Fortran, and all that stuff. So this idea that we can build the entire software stack as we want it is pretty important. So even for functions as a service, where traditionally you had just a limited choice of runtimes, this becomes important. >> Like, from our side, the virtual machines still had a very complex setup to be able to support all this diversity of software. With containerization, all that people have to give us is, like, run this building block, and it's kind of a standard interface, so we only have to build the infrastructure to be able to handle these pieces. >> Well, I don't think anyone can dispute that you folks are experts in taking larger things and breaking them down into constituent components thereof.
I mean, you are, quite obviously, the leading world experts on that. But was there any challenge to you as you went through that process of, I don't necessarily even want to say modernizing, but in changing your viewpoint of those primitives as you've evolved, have you seen that there were challenges in gaining buy-in throughout the organization? Was there pushback? Was it culturally painful to wind up moving away from the virtual machine approach into a containerized world? >> Right, so yeah, a bit, of course. But traditionally, physicists really focus on their end goal. We often say that we don't count how many cores or whatever; we care about events per second, how many events we can process per second. So, it's a kind of more open-minded community maybe than traditional IT, so we don't care so much about which technology we use at some point, as long as the job gets done. So, yeah, there's a bit of friction sometimes, but there's also a push when you can demonstrate a clear benefit; then it's kind of easier to push it. >> What's a little bit special maybe also for particle physics is that it's not only CERN doing the research. We are an international collaboration of many, many institutes all around the world that work on the same project, which is just hosted at CERN, and so it's a very flat hierarchy, and people do have the freedom to try out things, so it's not like we have a top-down mandate on what technology we use. Somebody tries something out, and if it works and people see value in it, then you get adoption. >> The collaboration with the data volumes you're talking about as well has got to be intense. I think you're a little bit beyond the, okay, we ran the experiment, we put the data in Dropbox, go ahead and download it, you'll get that in only 18 short years. It seems like there's absolutely a challenge in that.
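The numbers quoted in this conversation, a petabyte a second off the detector, 10 gigabytes a second to disk, 70 petabytes archived a year, and Corey's "18 short years" download quip, hold up to some back-of-envelope arithmetic. The figures below are rounded from the transcript, not official CERN statistics.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

detector_rate = 1e15        # ~a petabyte per second off the detector
post_trigger_rate = 10e9    # ~10 gigabytes per second written to disk
yearly_archive = 70e15      # ~70 petabytes recorded per year

# Overall reduction factor achieved by the hardware and software triggers.
reduction = detector_rate / post_trigger_rate
print(f"trigger reduction: {reduction:,.0f}x")

# Time to download one year of data over a 1 Gbit/s link (125 MB/s),
# which lands close to the "18 short years" joke above.
years = yearly_archive / 125e6 / SECONDS_PER_YEAR
print(f"download at 1 Gbit/s: ~{years:.0f} years")
```

The trigger reduction works out to 100,000x, and pulling one year's archive over a fast home connection really would take on the order of 18 years, which is presumably where the quip comes from.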
>> That was one of the key points actually in the keynote, that a lot of the experiments at CERN have an open data policy where we release our data, and that's great because we think it's important for open science. But it was always a bit of an issue: who can actually, practically, analyze this data if they don't have a data center? And so one part of the keynote was that we could demonstrate that, using Kubernetes and public cloud infrastructure, it actually becomes possible for people who don't work at CERN to analyze these large-scale scientific data sets. >> Yeah, I mean maybe just for our audience, the punchline is rediscovering the Higgs boson in the public cloud. Maybe just give our audience a little bit of a taste of that. >> Right, yeah, so basically what we did is, the Higgs boson was discovered in 2012 by both ATLAS and CMS. We used open data from CMS, part of which has now been released publicly, and basically this was a 70-terabyte data set which we, thanks to our Google Cloud partners, could put onto public cloud infrastructure, and then we analyzed it on a large-scale Kubernetes cluster, and-- >> The main challenge there was that, like, we publish it and we say you probably need a month to process it, but we had like 20 minutes on the keynote, so we needed a bit larger infrastructure than usual to run it down to five minutes or less. In the end, it all worked out, but that was a bit of a challenge. >> How are you approaching, I guess, making this more accessible to more people? By which I mean, not just other research institutions scattered around the world, but students, individual students, sometimes in emerging economies, where they don't have access to the kinds of resources that many of us take for granted, particularly those of us who work for prestigious research institutions?
What are you doing to make this more accessible to high school kids, for example, folks who are just dipping their toes into a world they find fascinating? >> We have entire outreach programs that go to high schools. I did this when I was a student in Germany. We would go to high schools and we would host workshops, and people would analyze a lot of this data themselves on their computers. So we would come with USB sticks that have data on them, and they could analyze it. And part of the open data strategy from ATLAS is also to use that open data for educational purposes. And then there are also programs in emerging countries. >> Lukas and Ricardo, really appreciate you sharing the open data, open science mission that you have with our audience. Thank you so much for joining us. >> Thank you. >> Thank you. >> All right, for Corey Quinn, I'm Stu Miniman. We're in day two of two days of live coverage here at KubeCon + CloudNativeCon 2019. Thank you for watching theCUBE. (upbeat music)
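As a footnote to the "rediscovering the Higgs boson in the public cloud" discussion: the core computation in an analysis like that is an invariant mass reconstructed from particle four-momenta, computed over millions of events. The sketch below uses invented toy numbers, not the CMS open data, purely to illustrate the formula.

```python
import math

def invariant_mass(particles):
    """Invariant mass of a set of particles given four-momenta
    (E, px, py, pz) in consistent units (e.g. GeV):

        m^2 = (sum E)^2 - |sum p|^2
    """
    E = sum(p[0] for p in particles)
    px = sum(p[1] for p in particles)
    py = sum(p[2] for p in particles)
    pz = sum(p[3] for p in particles)
    return math.sqrt(max(E**2 - (px**2 + py**2 + pz**2), 0.0))

# Two back-to-back 62.5 GeV photons reconstruct to 125 GeV, roughly the
# Higgs boson mass (toy numbers chosen to make the arithmetic exact).
diphoton = [(62.5, 62.5, 0.0, 0.0), (62.5, -62.5, 0.0, 0.0)]
print(f"m = {invariant_mass(diphoton):.1f} GeV")  # m = 125.0 GeV
```

A real analysis histograms this quantity over huge numbers of candidate events and looks for a bump near 125 GeV, which is what makes the 70-terabyte, Kubernetes-scale processing described above necessary.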

Published Date : May 22 2019
