Mark Penny, University of Leicester | Commvault GO 2019
>> Live from Denver, Colorado, it's theCUBE, covering Commvault GO 2019. Brought to you by Commvault.

>> Hey, welcome to theCUBE. Lisa Martin in Colorado for Commvault GO 19. Stu Miniman is with me this week, and we are pleased to welcome one of Commvault's longtime customers from the University of Leicester: Mark Penny, systems specialist in infrastructure. Mark, welcome to theCUBE.

>> Hi. It's good to be here.

>> So you have been a Commvault customer at the uni for nearly 10 years now. Just to give folks an idea, the university has 51 different academic departments and about five research institutes, with cool research going on, by the way, and between staff and students about 20,000 people, all, I'm sure, bringing multiple devices onto the campus. So talk to us about when you came on board in 2010, it's hard to believe that was almost 10 years ago, and said, all right, guys, we really need a strategy around backup. Talk to us about way back then: what you were doing, what you saw as an opportunity, and what you're doing with Commvault today.

>> At the time there was a wide range of backup products in use, and no real assurance that we were actually getting backups. We had a bit of Commvault 7 backing up the Windows infrastructure, Tivoli Storage Manager backing up a lot of the Linux, Amanda, an open-source tool, and then all sorts of scripts and things. So, for instance, VMware backups were done by creating an array snapshot with a script, mounting that snapshot into another server, and backing that server up with Commvault, and the restore process was an absolute nightmare. It was very, very difficult, long-winded, and required a lot of time and checks; it really was quite difficult to run and used a lot of staff time. As far as the corporate side was concerned, it was exclusively on tape; Tivoli Storage Manager was using disk; Amanda was again tape in a completely isolated system. Coupled with this, there had been a lack of investment in the data centers themselves, so the network hadn't really got a lot of throughput. That meant we were running dedicated private backup networks to keep backup data off the production networks, because there were real challenges with bandwidth contention, backups overrunning and so on; if you've got a backup running into the working day, it affects students. So we started with a blank sheet of paper, in many respects, and went out to see what was available. There were the usual ones: NetBackup, obviously Commvault again, Arcserve. But what was really interesting was that deduplication was starting to come in, and at the time Commvault 9 had just been released with an absolutely killer feature for us: client-side deduplication. That meant we could get rid of most of this private backup network that was creating a lot of complexity, and it did both backup to disk and backup to tape. So at that point we went in with six media agents and a few hundred terabytes of disk storage. The strategy was to keep 28 days on disk and the long-term retention on tape in a tape library. We kept that through to about 2013, then took the decision that disk was working, so let's go disk-only and save a whole load of effort; even with a tape library, you've got to refresh the tapes and things. So we went all-in on disk with deduplication, and we're basically getting a 1-to-1 ratio.
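To make the client-side deduplication idea concrete, here is a minimal sketch in Python, assuming fixed-size chunks and SHA-256 fingerprints; it is illustrative only, not Commvault's actual implementation, which uses its own chunking and signature scheme. The point is that the client hashes data locally and ships only chunks the server has not already stored, which is what let the university retire most of its private backup network:

```python
import hashlib

CHUNK_SIZE = 128 * 1024  # 128 KiB fixed chunks; real products often use variable-size chunking


def backup_file(path, known_hashes, send_chunk):
    """Ship only chunks the backup target has not already stored.

    known_hashes: set of hex digests already held server-side.
    send_chunk:   callable that transmits (digest, data) to the media agent.
    Returns the file's "recipe": the ordered list of chunk digests,
    which is all a restore needs to reassemble the file.
    """
    recipe = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in known_hashes:
                send_chunk(digest, chunk)  # only previously unseen data crosses the network
                known_hashes.add(digest)
            recipe.append(digest)
    return recipe
```

Because unchanged chunks are never re-sent, a synthetic full can be assembled server-side purely from existing chunks and recipes, which is why the back-end footprint can stay close to 1-to-1 with the front-side data despite long retention.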
So if I take my current figures, we've got about 1.5 petabytes of front-side protected data and about 1.5 petabytes in the backup system, which, because of all the synthetic fulls and everything, covers both the 12-month retention and the 28-day retention. It works really, really well, and that almost 1-to-1 relationship between what's in the backup system, with all the retention, and the client-side data has been fairly consistent since we went all-disk.

>> Mark, I wonder if you'd actually step back a second and talk about the role and importance of data in your organization, because we went through a lot of the bits and bytes there, but as a research organization, I expect that data is quite a strategic component.

>> The data forms your intellectual property. It's what has come out of your research; it's the output of your investigations. So, where we're doing Earth observation science, we get data from satellites, and that is brought down raw as lots of little files. Those then make up a data set, which will consist of multiple packages of these files, maybe even measurements from different satellites, which are then combined and can be used to model scenarios: climate change, temperature, pollution, all these types of things. It's how you then take that raw data and work with it. In our case we use a lot of HPC, high-performance computing, to manipulate that data, and a lot of it is down to how smart researchers are with their code in getting the maximum out of that data. The output of that becomes a paper or a project and a finalized set of data, which is the results that go with the paper. We've also done a lot of genetics and things like that, because DNA fingerprinting was developed here by Alec Jeffreys, and what was very interesting with that one is that it was those techniques which then identified the bones that were dug up under the car park in Leicester as Richard III.

>> Right, the documentary.

>> Yeah, that really was quite exciting. It's quite fitting, really, that techniques the university discovered were then instrumental in identifying him.

>> One of the interesting things I've found in this part of the market is that it used to be all about just protecting my data. A lot of times now it's about how do I leverage my data even more: how do I share my data, how do I extract more value out of it? In the 10 years you've been working with Commvault, are you seeing your organization going down that journey?

>> There are actually two conflicting things here, because researchers love to share their data, but some of the data sets are so big that that can be quite challenging. With some of the data sets we take other people's data, bring it in and combine it with our own to do our own modeling, and then that output goes out to be source material for somebody else. There are also issues about where data can exist: there are very strict controls about NHS data, health data, so NHS England data can't simply go out to Scotland, and so on. Sometimes regulatory compliance almost gets sidelined by the excitement about research, and we have quite a dichotomy: making sure that we know about the data, that the appropriate controls are there and we understand them, and that people don't just go and put it somewhere it shouldn't be. Some of the data sets for medical research start as data which has got personally identifiable information in it, which then has to be stripped out so that you've got an anonymized data set the researchers can work with. It's ensuring that the right data is used and the right information is removed, so that you don't inadvertently expose something.
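As a rough illustration of the kind of de-identification step Penny describes, here is a minimal Python sketch over a simple tabular extract. The column names are hypothetical, and a real NHS de-identification pipeline is far more involved (dates, free text, re-identification risk checks); this only shows the basic shape of dropping direct identifiers:

```python
import csv

# Hypothetical direct identifiers to strip; a real pipeline would work from a
# governed data dictionary, not a hard-coded list.
DIRECT_IDENTIFIERS = {"nhs_number", "name", "date_of_birth", "postcode"}


def anonymise(in_path, out_path):
    """Copy a CSV extract, dropping columns that directly identify a person."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        kept = [c for c in reader.fieldnames if c not in DIRECT_IDENTIFIERS]
        writer = csv.DictWriter(dst, fieldnames=kept)
        writer.writeheader()
        for row in reader:
            writer.writerow({c: row[c] for c in kept})


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    anonymise("cohort_raw.csv", "cohort_anonymised.csv")
```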
So it's not just pure research, with this data going in this silo and that data in that silo; it's actually ensuring that you've got the right bits in the right place and that they're being handled correctly.

>> Talk to us about, as you pointed out, this massive growth in data volumes, from a university perspective, a health data perspective, a research perspective; the files are getting bigger and bigger. In the time since you started this foundation with Commvault, in the last nine or ten years, there have been tremendous changes, not just in data; talking about compliance, you've now got GDPR to deal with. Give us a perspective and snapshot of your Commvault implementation and how you've evolved it as the data changes, compliance changes, and Commvault's technology has evolved.

>> If you take where we started off, we had a few hundred terabytes of disk; just before we migrated to on-premise S3 cloud libraries, I think we'd got to 2.1 petabytes of backup storage. The volume of data is growing exponentially as the resolution of the instruments increases, so you can easily see a four-fold growth in data. Some of those are quite interesting stories. When I first joined, there was great excitement about a project which had just started, BepiColombo, which is the European Space Agency mission to Mercury, and they wanted 50 terabytes. At that time that was actually quite a big number, and we were thinking, well, we need to be careful; okay, 50 terabytes over the life of the project, whereas now that's probably just to get us going. Not much actually happened with it; then the storage system changed, and they still had their 50 terabytes with almost nothing in it. We then understood that once the spacecraft had been launched, which has now happened, it was going to take a couple of years before the first data came back, because it has to go to Venus: it has to swing around Venus against its direction of travel to slow itself down, and then it goes on to Mercury, and the real bulk of the data starts coming back then. You'd have thought going to Mercury was dead easy, you just go boom, straight in, but if you did that, because of the gravity of the Sun, you'd never stop; you'd just go straight into the Sun and lose your spacecraft.

>> Nobody wants that.

>> Another one that's really interesting: have you heard of the Gaia satellite? This is the one which is mapping a billion stars in the Milky Way. It's now gone past its primary mission, and it's got most of that data, huge data sets. That data is already being worked on, but the university's task is packaging it and cleansing it, and we're going to get a set of that data to host. We're currently hosting a national HPC facility for space research, and that's being replaced with an even bigger, more powerful one which will probably fill one of our data centers completely, about 40 racks' worth, and that's just to process that data, because there's so much information that's come from it.
It's the resolution, the speed with which it can be computed, and holding so much in memory. I mean, across our two current HPC systems we've got 100 terabytes of memory, and those numbers were just unthinkable even 10 years ago: a terabyte of memory.

>> So, Mark, Lisa and I would love to keep you here all day to talk about space, one of our favorite topics, but before we get towards the end: a lot of changes at Commvault. There's a whole new executive team, they bought Hedvig, they launched Metallic.io, they've got new things. As a longtime customer, what's your viewpoint on Commvault today?

>> It's quite interesting to see how Commvault has evolved. There's the change which happened between versions 10 and 11, when they took the decision on the next-generation platform, backed by quite an aggressive pace of service packs which come out on a schedule. And to be fair, that schedule has been stuck to, so we can plan ahead; we know what's happening. It's interesting that they contain both patches and new features, and it's really great to have that timeline to work to. The platform now natively supports so much, and this was actually one of the factors in our decision to run our own on-premise S3-compatible cloud library. We had been using Azure to put a tier of data off-site, and that was all working great, so the question became: can we do S3 on-prem? It's supported by Commvault as just another cloud library, although when we first started, that didn't exist; we ran a proof of concept and so on, and it all worked. HyperScale then came along as well, and it's interesting to see how Commvault has gone down into the appliance world too, because some people just want a box they can unpack and plug in: if you haven't got a technical team, or strong skills in those areas, why worry about putting your own system together? HyperScale gives you backup in a box. And then there are the partnerships: we're an HPE customer, so we're using Apollo servers for the storage, and the Apollo is actually the platform; if we had bought HyperScale, it would have gone on an HPE Apollo as well, because of the agreements we've got. It's quite interesting how they've gone from software to hardware coming in, and now it's evolving into this platform with Hedvig. I mean, there was a Commvault object store buried in the product, but it was very low-key; no one really knew about it, you'd occasionally see a term for it appear, but it wasn't something they publicized. Yet with the increasing data volumes, object store is the only way to hold these volumes of data in a resilient and durable way, so buying Hedvig and integrating it provides a really interesting way forward. From my perspective, I'm using S3, so if we had gone down the Hedvig route, what I would like to see is: I have a storage policy, I click to point it at S3, and it goes out and provisions the bucket, doing the whole lot in a couple of clicks, and that's it, job done. I don't need to go out, create the user, create the bucket, and then get every little permission right. It's that tight integration which is where I see the benefit coming in.
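For contrast, here is roughly what that manual provisioning looks like today against a generic S3-compatible store, using boto3. This is a sketch with a hypothetical endpoint, credentials, bucket name, and service-account ARN, not Commvault's integration; user creation itself is omitted because it varies by vendor. These are the steps Penny would like the storage policy to drive automatically:

```python
import json

import boto3

# Hypothetical endpoint and admin credentials for an on-premise S3-compatible store.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example.ac.uk",
    aws_access_key_id="ADMIN_KEY",
    aws_secret_access_key="ADMIN_SECRET",
)

bucket = "commvault-cloud-library-01"
s3.create_bucket(Bucket=bucket)

# Scope a policy to just this bucket, so the backup service account
# (hypothetical ARN below) can't touch anything else on the store.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": ["arn:aws:iam:::user/backup-svc"]},
        "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:ListBucket"],
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
    }],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```

Folding all of this behind a single click in the storage policy is exactly the tight integration he describes.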
It gives value to the platform and gives the customer the assurance that it's configured correctly, because the process is automated and Commvault has ensured that at every step of the way the right decisions are being made. And with Metallic, everything is about tried-and-tested products with a very, very smart workflow process put around them, to ensure that whatever decisions you make, you don't need to be a Commvault expert to get the outcome and get the backups.

>> Excellent. Well, Mark, thank you for joining Stu and me on theCUBE, talking about the evolution the University of Leicester has gone through and your thoughts on Commvault's evolution in parallel. We appreciate your time. For Stu Miniman, I'm Lisa Martin. You're watching theCUBE from Commvault GO 19.