George Mihaiescu, OICR | OpenStack Summit 2018


 

>> Narrator: Live from Vancouver, Canada, it's theCUBE, covering OpenStack Summit North America 2018, brought to you by Red Hat, the OpenStack Foundation, and its ecosystem partners.
>> The sun has come out, but we're still talking a lot about the cloud here at the OpenStack Summit 2018 in Vancouver. I'm Stu Miniman with my co-host John Troyer. Happy to welcome to the program the 2018 Super User Award winner, George Mihaiescu, who's the senior cloud architect with the Ontario Institute for Cancer Research, or OICR. First of all, congratulations.
>> Thank you very much for having me.
>> And thank you so much for joining us. So, cancer research: obviously one of the things we talk about is how technology can really help people on a global scale. So tell us a little about the organization first, before we get into the tech of it.
>> So OICR is the largest cancer research institution in Canada, and it's funded by the government of Ontario. Located in Toronto, we support about 1,700 researchers, trainees and clinician staff. It's focused entirely on cancer research, and it's located in a hub of cancer research in downtown Toronto, with Princess Margaret Hospital, Sick Kids Hospital, Mount Sinai, very, very powerful research centers. OICR basically interconnects all these research centers and tries to bring them together and to advance cancer research in the province, in Canada and globally.
>> That's fantastic, George. So with that, sketch out for us a little bit your role, kind of the purview that you have, the scope of what you cover.
>> So I was hired four years ago by OICR to design and build a cloud environment, based on a research grant that was awarded to a number of principal investigators in Canada to build a cloud computing infrastructure that can be used by cancer researchers to do large-scale analysis.
What happens with cancer is that, because of the variety of mutations happening in cancer patients, researchers found that they cannot just analyze a few samples and draw a conclusion, because the conclusion wouldn't actually be valid. So they needed to do large-scale research, and the ICGC, which is the International Cancer Genome Consortium, an organization made up of 17 countries that are donating, collecting and analyzing data from cancer patients, decided to put all this data together, align it uniformly using the same algorithm, and then analyze it using the same workflows, in order to draw conclusions that are valid across multiple data sets. They are focusing on the 50 most common types of cancer that affect most people in this world, and for each type of cancer, at least two countries collect and provide data. So for brain cancer, let's say, we have data sets from two countries, and the same for melanoma, for skin cancer, and this basically gives you better confidence that the conclusion you draw is valid. The more pieces of the puzzle you throw on the table, the easier it is to see the big picture that is this cancer.
>> You know, George, I'm a former academic, and the more data you get, right, the more infrastructure you're going to have to have. I'm just reading off the announcement: 2,600 cores, 18 terabytes of RAM, 7.3 petabytes of storage. That's a lot of data, and it's accessed by a lot of different researchers. When you came in, was the decision to use OpenStack already made, or did you make that decision, and how was the cloud architected in that way?
>> The decision was basically made to use open source. We wanted to spend the money on capacity, on hardware, on research, and not on licensing and support.
>> John: Good use of everybody's tax dollars.
>> Exactly. You cannot do that if you have to spend money on licensing; then you probably have only half of the capacity that you could.
So that means fewer large analyses, and they take longer and cost more. So Ceph for storing the data sets and OpenStack for the infrastructure-as-a-service offering was a no-brainer. My specialty was in OpenStack and Ceph, I started with OpenStack seven years ago, so I was hired to design and build it, and I had a chance to actually do alignment and mutation calling for some of the data sets, so I was able to monitor the kind of stress that these workflows put on the system. So when I designed it, I knew what was important and what to focus on. So it's a cloud environment that's customized for cancer research. We have a very good ratio of RAM per CPU, and we have very large local disks for the virtual machines, so they can download very large data sets. We built it so that if one compute node fails, you only impact the few workflows running there; you don't have single points of failure. And there is other tuning that we applied to the system too.
>> George, can you walk us through a little bit of the stack? What do you use? Do you build your own OpenStack, or do you get it from someone?
>> So basically, we use commodity hardware, just high-density chassis, currently from Supermicro, Ubuntu for the operating system, no licensing there, OpenStack from the VM packages. We focus more on stability, scalability and support costs, internal support costs, because it's just myself and a colleague, Gerard Baker, who's a cloud engineer, and we have to support this whole environment. So we try to focus on the features that are most useful to our users and that put less strain on our time and support resources.
>> Let's talk about the scalability, right? You said the team is you and a colleague.
>> George: Yes.
>> But mostly, right. And you know, in the olden days, you would be taking care of maybe a handful of machines, and maybe some disk arrays in the lab. Now you're basically servicing an entire infrastructure for all of Canada, right? At how many universities?
>> Well, basically it's global, so we have 40 research projects from four continents. We have them from Australia, from Israel, from China, from Europe, the US, Canada. Approved cancer researchers who can access the data open an account with us, they get a quota, they start their virtual machines, they download the data sets from the S3 API of Ceph to their VMs, they do their analysis, and we charge them for the time used. And because everything we use is open source, we don't pay any licensing fees, and we don't run for profit, we charge them just what it costs us to be able to replace the hardware when it fails.
>> Nice, nice. And these are actually very large machines, right? Because you have huge, thick data sets, you've got big data sets you have to compare all at once.
>> Yeah. Take an average BAM file, the file that has the normal DNA of the patient, and they also need the tumor DNA from the biopsy: an average whole genome sequence is about 150 gigabytes. So they need at least 300 gigabytes, and depending on the analysis, if they find mutations, then the output is usually five, 10 gigabytes, so much smaller. For other workflows, you have to actually align the data, so you input 150 gigabytes and the output is 150 or a bit more with metadata. So nevertheless, you need very large storage for the virtual machines, and these are virtual machines that run very hard: you cannot do CPU oversubscription, you cannot do memory oversubscription, when you have a workflow that runs for four days at a hundred percent CPU. So it's different from web-scale environments, where you have a website running at 10%, so you can do 10-to-one oversubscription and then go much cheaper, or use different solutions. Here you can only provide what you physically have.
>> John: That's great.
>> George, you've said you've participated in the OpenStack community for about seven years now.
>> George: Yes.
>> Do you actually contribute code? What pieces of the community are you active in?
>> Yeah, so I'm not a developer. My background is in networking, system administration and security, but I've been involved in OpenStack since the beginning, before it was a foundation. I went to the first OpenStack public conference in Boston seven years ago, at the International Intercontinental Hotel, and over time I've been involved in discussions on the IRC channel, mailing list support, reporting bugs. Even recently we had a very interesting bug that affected us as well. The cloud package that is supposed to resize the disk of the VM as it boots was not using more than two terabytes because of a bug. So we reported this, and Scott Moser, who's the maintainer of the cloud-utils package, worked on the bug, and two days later we had a fix. They built a package, it's in the latest Ubuntu cloud image, and with that, everybody else is going to use the same Ubuntu package, so somebody who now has VMs larger than two terabytes will be able to resize and use the entire disk when they boot. And that's just an example of how with open source we can achieve things that would take much longer in a commercial distribution, where even if you pay, it doesn't necessarily mean that the response...
>> Sure. Also, George, any lessons learned? You've been at this a long time, right, and with Ceph as well. One thing we noticed today in the keynote is that a lot of the storage, networking and compute wasn't really talked about; those projects were maybe de-emphasized a bit, as they talked about all the connectivity to everything else. So, any lessons? My point is, the infrastructure of OpenStack is stable, but any lessons learned along the journey?
>> I think the lessons are that you can definitely build very affordable, useful and scalable infrastructure, but you have to get your expectations right.
We only use the OpenStack projects that we consider stable enough that we can support them confidently. If a project adds 5% value to your offering but eats 80% of your time debugging and trying to get it working, and it doesn't have packages, and it's missing documentation and so on, that's maybe not a good fit for your environment if you don't have the manpower and if it's not absolutely needed. Another very important lesson is that you really have to stay up to date: go to the conferences, read the emails from the mailing list, be active in the community. At the OpenStack meetups in Toronto in 2018, we present there, we talk to other members. In these seven years I've read tens of thousands of emails, so I learn from other users' experiences, and I try to help where I can. You have to be involved with the developers; I know the Ceph core developers, Sage and other people. So you can't do this just by staying on the side and looking; you have to be involved.
>> Good. George, what are you looking for next from this community? You talked about the stability; are there pieces that you're hoping reach that maturity threshold for you, or new functionality that you're looking for down the road?
>> I think it's about what we want to provide to our researchers, 'cause they don't run web-scale applications, so their needs are a little bit different. We want to add Magnum to our environment, to allow them to deploy Kubernetes clusters easily. We want to add Octavia to expose services; they don't run many web services, but you have to find a way to expose them when they do. Maybe Trove, database as a service; we'll see if we can deploy it safely and if it's stable enough. Anything that OpenStack comes up with, we basically look at it: is it useful, is it stable, can we do it? And we try it.
>> George, last thing. Your group is the Super User of the Year.
Can you just walk us through that journey? What led to the nomination, and what does it mean to your team to win?
>> I think we are a bit surprised, because we are a very small team, and our scale is not as big as T-Mobile or the other members, but I think it shows that, again, for a big company to be able to deploy OpenStack at scale and make it work is maybe not very surprising, 'cause yes, they have the resources, they have a lot of manpower and a lot of... But for a small institution or organization or small company to be able to do it without involving a vendor, without involving extra costs, I think that's the thing that was appreciated by the community and by the OpenStack Foundation. And yeah, we are pretty excited to have won it.
>> All right, George, let me give you the final word, as somebody that's been involved with the community for a while. What would you say to people if they're, you know, still maybe looking from the outside, or have played with it a little bit? What tips would you give?
>> I think we are living proof that it can be done, and if you wait until things are perfect, they never will be. Even Google has services in beta, Amazon has services in beta. You have to install OpenStack. It's much more performant and stable than when I started with OpenStack, when there were just a few projects, and you will definitely get help from the community, and the documentation's much better. Just go and do it, you won't regret it.
>> George, as we know, software will eventually work, hardware will eventually fail.
>> Absolutely.
>> So, George Mihaiescu, congratulations to OICR on the Super User of the Year award. For John Troyer, I'm Stu Miniman. We're getting towards the end of day one of three days of wall-to-wall coverage here at OpenStack Summit 2018 in Vancouver. Thanks so much for watching theCUBE.
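
Editor's sketch, not part of the interview: the sizing figures George quotes (about 150 GB per whole genome, a normal plus a tumor file per patient, 5 to 10 GB of mutation-calling output, and no CPU or memory oversubscription) can be turned into a rough capacity calculation. The function names, the host and VM sizes, and the decision to ignore temporary files are illustrative assumptions, not OICR's actual tooling or flavors.

```python
# Figures quoted in the interview (approximate, in gigabytes).
GENOME_GB = 150        # average whole-genome sequence file
CALL_OUTPUT_GB = 10    # upper end of the "five, 10 gigabytes" output


def scratch_needed_gb(workflow: str) -> int:
    """Rough local-disk need for one patient, ignoring temporary files."""
    if workflow == "mutation_calling":
        # Needs both the normal and the tumor genome, plus a small output.
        return 2 * GENOME_GB + CALL_OUTPUT_GB
    if workflow == "alignment":
        # One input file, and an output of roughly the same size.
        return GENOME_GB + GENOME_GB
    raise ValueError(f"unknown workflow: {workflow}")


def vms_per_host(host_cores: int, host_ram_gb: int,
                 vm_cores: int, vm_ram_gb: int,
                 cpu_ratio: float = 1.0, ram_ratio: float = 1.0) -> int:
    """How many identical VMs fit on one host at the given allocation ratios.

    A ratio of 1.0 means no oversubscription, which is what the interview
    describes for workflows that hold the CPU at 100% for days.
    """
    by_cpu = int(host_cores * cpu_ratio) // vm_cores
    by_ram = int(host_ram_gb * ram_ratio) // vm_ram_gb
    return min(by_cpu, by_ram)


if __name__ == "__main__":
    # Hypothetical host: 40 cores, 512 GB RAM. Hypothetical VM: 8 cores, 48 GB.
    print(scratch_needed_gb("mutation_calling"))
    print(vms_per_host(40, 512, 8, 48))
    print(vms_per_host(40, 512, 8, 48, cpu_ratio=10))
```

With the hypothetical host and VM sizes above, a 1:1 ratio is limited by cores, while the 10-to-one CPU ratio mentioned for web workloads would be limited by RAM instead.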

Published Date : May 22 2018


ENTITIES

Entity | Category | Confidence
George | PERSON | 0.99+
George Mihaiescu | PERSON | 0.99+
OICR | ORGANIZATION | 0.99+
Canada | LOCATION | 0.99+
80% | QUANTITY | 0.99+
John Troyer | PERSON | 0.99+
Gerard Baker | PERSON | 0.99+
Ontario Institute for Cancer Research | ORGANIZATION | 0.99+
John | PERSON | 0.99+
Boston | LOCATION | 0.99+
Red Hat | ORGANIZATION | 0.99+
Toronto | LOCATION | 0.99+
hundred percent | QUANTITY | 0.99+
US | LOCATION | 0.99+
Europe | LOCATION | 0.99+
150 | QUANTITY | 0.99+
Scott Moffat | PERSON | 0.99+
18 terabytes | QUANTITY | 0.99+
2,600 cores | QUANTITY | 0.99+
10 | QUANTITY | 0.99+
40 research projects | QUANTITY | 0.99+
7.3 petabytes | QUANTITY | 0.99+
Amazon | ORGANIZATION | 0.99+
ICGC | ORGANIZATION | 0.99+
150 gigabytes | QUANTITY | 0.99+
two countries | QUANTITY | 0.99+
International Cancer Genome Consortium | ORGANIZATION | 0.99+
5% | QUANTITY | 0.99+
OpenStack Foundation | ORGANIZATION | 0.99+
Stu Miniman | PERSON | 0.99+
five | QUANTITY | 0.99+
Google | ORGANIZATION | 0.99+
Vancouver | LOCATION | 0.99+
10% | QUANTITY | 0.99+
Sick Kids Hospital | ORGANIZATION | 0.99+
four days | QUANTITY | 0.99+
Australia | LOCATION | 0.99+
Ceph | ORGANIZATION | 0.99+
Princess Margaret Hospital | ORGANIZATION | 0.99+
T-Mobile | ORGANIZATION | 0.99+
Vancouver, Canada | LOCATION | 0.99+
Israel | LOCATION | 0.99+
today | DATE | 0.99+
seven years ago | DATE | 0.99+
four years ago | DATE | 0.99+
17 countries | QUANTITY | 0.99+
two days later | DATE | 0.98+
OpenStack | TITLE | 0.98+
each type | QUANTITY | 0.98+
China | LOCATION | 0.98+
2018 | DATE | 0.98+
about 1,700 researchers | QUANTITY | 0.98+
Ubuntu | TITLE | 0.98+
three days | QUANTITY | 0.98+
10 gigabytes | QUANTITY | 0.97+
OpenStack Summit North America 2018 | EVENT | 0.97+
seven years | QUANTITY | 0.97+
four continents | QUANTITY | 0.97+
One | QUANTITY | 0.97+
International Intercontinental Hotel | LOCATION | 0.96+
Super Micro | ORGANIZATION | 0.96+
OpenStack Summit 2018 | EVENT | 0.96+
more than two terabytes | QUANTITY | 0.96+
first | QUANTITY | 0.95+
50 most common types of cancer | QUANTITY | 0.95+
one subscription | QUANTITY | 0.95+
one | QUANTITY | 0.95+