Brian Biles, Datrium | VMworld 2015

Announcer: It's theCUBE, covering VMworld 2015. Brought to you by VMware and its ecosystem sponsors. And now, your host, Dave Vellante.

Dave Vellante: Welcome back to Moscone Center, everybody. This is theCUBE, SiliconANGLE's continuous production of VMworld 2015. Brian Biles is here — he's the CEO and co-founder of Datrium. Brian, of course, of Data Domain fame. David Floyer and I are really excited to see you. Thanks for coming on theCUBE.

Brian Biles: It's great to see you guys again.

Dave Vellante: So it's been a while — you're just coming out of stealth, right? You've been busy. You did the Data Domain work, were at EMC for a while, kind of disappeared, got really busy again, and here you are.

Brian Biles: Yeah — new hats, new books.

Dave Vellante: So tell us about Datrium. We're big on ties on the East Coast — and he's even more East Coast than I am, even though he's out in California. But tell us about Datrium: fundamentally different?

Brian Biles: Fundamentally different from other kinds of storage, and a different kind of founding team. I was a founder of Data Domain, and Hugo Patterson, the CTO there and an EMC Fellow, became CTO for us. When we left EMC we weren't sure what we were going to do. We ended up running into two VMware principal engineers who had been there 10 or 12 years working on all kinds of stuff, and they believed there was a market gap in scalable storage for VMs. So we got together — we knew something about storage, they knew something about VMs — and three years later Datrium is at its first trade show.

Dave Vellante: Talk more about that. That happens all the time, right? Alpha geeks — no offense to that term, it's a term of endearment; sorry, I'm a marketing guy. They get together, they identify these problems, and they're able to sniff them out at the root level. Can you describe that problem in detail?

Brian Biles: Sure. Broadly, there are two kinds of storage: there are arrays, and, emerging, there's hyperconverged, and they approach things in very different ways. In arrays there tends to be a bottleneck in the controller — the electronics that do the data services: the RAID, the snapshotting and cloning, the compression and dedupe, whatever — and increasingly that takes more and more compute. Intel is helping every year, but it's still a bottleneck, and when you run out it's a cliff: you have to do a pretty expensive upgrade or migrate the data to a different place, and that's sticky and takes a long time. So in reaction, hyperconverged has emerged as an alternative. It has the benefit of killing the array completely, but it may have overcorrected, so it has some trade-offs that a lot of people don't like. For example, if a host goes down, that host has assumed all the data management problems an array used to have, so you have to migrate the data or rebuild it to service the host. And it can't fit very cleanly into, say, a blade server, which has one or two drive bays, when in a hyperconverged model the average number of capacity drives across the floor is four or five, not to mention the cache drives. For a blade server it's just not a fit, so there are a lot of parts of the industry where that model is just not the right model. And if everybody is writing to everybody, there's a lot of neighbor noise, and it gets kind of weird to troubleshoot and tune. Arrays are better in some respects; hyperconverged changes things and is a little different. We're trying to create a third path.
In our model, there's a box that we sell — a 2U rackmount with a bunch of drives for capacity — but the capacity is just for at-rest data. It's where all the writes go, it's where persistence goes. We move all the data service processing — the CPU for RAID, for compression, for dedupe, whatever — to host cycles. We upload software to an ESX host, it runs on anybody's x86 server, and you bring your own flash for caching. Gartner did a study at the end of the year looking at discounted street price for flash: the difference between what you pay for flash on a server — just a commodity SSD — and what you pay in an array was something like 8x. And since we don't put RAID on the host — all the RAID is in the back end — that frees up another twenty percent or so. You end up with an order of magnitude difference in pricing. So with flash on a host, you don't aim at caching ten percent of your active data; it gets close to a hundred dollars a terabyte after you do dedupe and compression on server flash. It's just cheap and plentiful. You put all your data up there, everything runs out of flash locally, and it never takes a network hit for a read.

We do read caching locally. Unlike hyperconverged, we don't spread data in a pool across the hosts — we're not interrupting every host for somebody else's reads and writes; everything is local. When you do a write, it goes to our box at the end of the wire, 10-gig attached, but all of the compute operations are local, so you're not interrupting everybody, and any resourcing you'd do for an I/O problem is local, either cores or flash. It's a different model, and it's really well suited to blade servers; no one else was doing that in such a good way. And unlike a cache-only product, it's organically designed for manageability: you don't have a separate tier to manage on the host, separate from an array, where you're probably duplicating provisioning and worrying about how to do an array snapshot when you have to flush the cache on the host. It's all designed from the ground up. It also means the storage we store to is minimal cost: we don't have the compute overhead you have with a controller, and you don't have the really expensive flash there — that's just cycles on the host. Everything is done with the most efficient path for both data and hardware.

David Floyer: If you look at designs in general, flash is either a cache, or it's 100% flash, or it's a tier of storage. Do I understand correctly that there isn't any tiering, because you've got a hundred percent of the data in flash?

Brian Biles: Yeah — we use flash on the host as a cache, but I only use that word guardedly. In the initial, degenerate case, it's all of the data. It's a cache in the spirit that if the host dies you haven't lost any data — the data is always safe somewhere else — but it's all the data.

David Floyer: So for what's sitting on disk at the back end, I presume you're writing sequentially all the time, with logs, and using the disk in the most effective way?

Brian Biles: That's right, on both sides: on the flash it's log-structured, and on the disk it's log-structured. And we had the advantage of Data Domain — the most popular log-structured file system ever — so we learned all the tricks about dedupe and garbage collection a long time ago. That CTO team is uniquely qualified to get this right.
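To make the log-structured, dedupe-on-write design Biles describes concrete, here is a minimal Python sketch — an illustration under stated assumptions, not Datrium's implementation. Blocks are fingerprinted; duplicates become references instead of rewrites; unique data is compressed and appended to a sequential log segment. The names and parameters (`LogStructuredStore`, the SHA-256 fingerprint, the 4 KB block size, the segment capacity) are assumptions for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096        # illustrative block size, not Datrium's actual unit
SEGMENT_CAPACITY = 64    # blocks per log segment before it is sealed

class LogStructuredStore:
    """Toy sketch of a log-structured, dedupe-on-write store.

    Writes never happen in place: unique blocks are compressed and
    appended to the open segment, and a fingerprint index lets duplicate
    blocks be recorded as references instead of being rewritten.
    """

    def __init__(self):
        self.fingerprints = {}    # fingerprint -> (segment_id, offset)
        self.segments = []        # sealed, immutable segments
        self.open_segment = []    # blocks accumulating in the current segment

    def write_block(self, data: bytes) -> str:
        fp = hashlib.sha256(data).hexdigest()
        if fp in self.fingerprints:
            return fp                     # duplicate: just reference it
        self.open_segment.append(zlib.compress(data))
        self.fingerprints[fp] = (len(self.segments), len(self.open_segment) - 1)
        if len(self.open_segment) >= SEGMENT_CAPACITY:
            self._seal_segment()
        return fp

    def _seal_segment(self):
        # A sealed segment would be streamed sequentially to persistent
        # media (disk or flash) -- the log never overwrites in place.
        self.segments.append(tuple(self.open_segment))
        self.open_segment = []
```

Garbage collection in a design like this works at segment granularity — live blocks are copied forward into new segments and whole dead segments are reclaimed — which is what keeps both the disk and the flash writing sequentially.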
David Floyer: So what about when it does go down? Are you clustering it? What happens when it goes down and you have to recover from those disk drives — that could take a bit of time.

Brian Biles: There are two sides to that. If a host fails, you use VM HA to restart the VMs somewhere else and life goes on. If the back end fails, it fails the way a traditional mid-range array might fail: we have dual controllers, so it can fail over; all the disks are dual-attached; there are dual networks on each controller, so services can fail over. It's RAID 6, so there's a rebuild if a disk fails, but you could lose two of those and keep going.

David Floyer: But the point I was getting at is that if you fail on the host, you've lost all your active data —

Brian Biles: To be precise, you've lost the cached copy in that local flash, but you haven't lost any data — you've only lost it from the standpoint of speed. At that point, if the host is down, you have to restart the VM somewhere else. That's not instant — it takes a number of minutes — and that gives us some time to upload data to that host. The data is laid out in our system not for interactive use on the disk drives but for very fast upload to a cache: it's all sequentially laid out, per VM, for blasting up.

David Floyer: So what do you see as the key application types this is particularly suited for?

Brian Biles: Our back-end system has about 30 terabytes usable after all the RAID and everything, and with dedupe and compression — figure 2x, 4x, 6x data reduction — call it 100 terabytes-ish, depending on mileage. A 100-terabyte box is kind of a mid-range-class array, so it will sell mostly to those markets. Our software supports only VM storage — virtual disks — so as long as it meets those criteria it's pretty flexible. Each host can have up to eight terabytes of raw flash; post-dedupe and compression, that could be 50 terabytes of effective flash capacity per host. And reads never leave the host, so you get no network overhead for reads — and reads are usually two-thirds of most people's I/O. So it's enormously price- and cost-effective, and very performant as well.
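The capacity arithmetic above is easy to sanity-check. A small sketch using only the figures quoted in the interview (about 30 TB usable in the back end, 2x/4x/6x data reduction, up to 8 TB raw flash per host); the reduction factor implied by the 50 TB host-flash figure is an assumption:

```python
def effective_capacity(raw_tb: float, reduction_factor: float) -> float:
    """Effective capacity after dedupe and compression."""
    return raw_tb * reduction_factor

# Back-end box: ~30 TB usable after RAID, at 2x / 4x / 6x data reduction.
for factor in (2, 4, 6):
    print(f"back end at {factor}x: {effective_capacity(30, factor):.0f} TB")
# -> 60, 120, 180 TB; Biles calls it "100 terabytes-ish, depending on mileage".

# Host flash: up to 8 TB raw per host. The quoted ~50 TB effective implies
# a bit over 6x reduction (an assumption, "depending on mileage").
print(f"host flash: {effective_capacity(8, 6.25):.0f} TB effective")
```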
Dave Vellante: Low-latency stuff. And your IP is the way you lay out the data on the media — is that part of it?

Brian Biles: Well, listen — it's two custom file systems from scratch, one of them in the host, not to mention all the management to make it look like there's one thing. There's a lot going on; it's a much more complex project than Data Domain was.

Dave Vellante: You mentioned you learned from your log-structured, garbage-collection days at Data Domain, but the problem you're solving here is much closer to the host — much more active data. That was obviously a challenge. Was new invention required, or did it carry over directly?

Brian Biles: It's at all levels; we had to make it fit. We're very VM-centric. The software looks to ESX as though it's an NFS share, but NFS terminates in each host, and then we use our own protocol to get across 10 gig to the back end. That gives us some special effects we'll be able to talk about over time. It's sort of a pre-vVols design in some ways — it's NFS, so you get to see every VM's storage discretely. Before vVols there was NFS, and we support 5.5, so it was a logical choice. Everything is VM-centric. The management just looks like one big pool of storage, and everything else is per-VM, from diagnostics to capacity planning to whatever; clones are per-VM. You don't have to spend a lot of analytics backing out what the block LUNs look like with respect to the VMs and trying to look it all up — that's just all there is.

David Floyer: I've been talking to a lot of flash people, and this is almost flash-only, in the sense that all of the I/O is going to that flash — once flash is sufficiently cheap and abundant.

Brian Biles: Yes — and we write to NVRAM, which is the same as an all-flash array.

David Floyer: One of the things we've noticed is that they find they have to organize things completely differently, particularly as they try to share things. For example, instead of having the production system, then a separate copy for each application developer, and another separate copy for the data warehouse, they're trying to combine those and share the data with snapshots of one sort or another, to amortize their very high costs — just because it's much faster and quicker. The customers are doing this; I think the vendors often don't even know what's going on. But because they can share it, they don't have to move the data, and it lets developers have a more current copy of the data, so they can work on near-production data. I was wondering whether that's an area you're looking at — again, applying a different way of doing storage. Take the test/dev use case, say.

Brian Biles: Well — testing, or data warehousing, or whatever — we're certainly sensitive to the overhead of having a lot of copies; that's why you do inline dedupe and so on, the way we do. It's very efficient. For example, if you're doing a clone, it's a dedupe clone: it gives you a new namespace entry and keeps the writes separate, but it lets the common data — the data with commonality across other versions — stay shared and consistent.
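A dedupe clone of the kind Biles describes is cheap because it copies only the namespace, not the data: the clone starts with the parent's block map, writes diverge copy-on-write, and common blocks stay shared. A minimal sketch — the `Namespace` class and its methods are hypothetical, continuing the fingerprint idea from the earlier write-path sketch:

```python
import hashlib

class Namespace:
    """Toy per-VM namespace: a map from logical block number to fingerprint."""

    def __init__(self, block_map=None):
        self.block_map = dict(block_map or {})   # lbn -> content fingerprint

    def clone(self) -> "Namespace":
        # Cloning copies only the (small) block map -- a new namespace
        # entry. No data blocks are copied; common data stays shared.
        return Namespace(self.block_map)

    def write(self, lbn: int, data: bytes, store: dict):
        # Copy-on-write: the new block gets its own fingerprint, so the
        # clone's writes stay separate from the parent's blocks.
        fp = hashlib.sha256(data).hexdigest()
        store.setdefault(fp, data)               # dedupe across namespaces
        self.block_map[lbn] = fp

# Usage: a dev/test clone shares all blocks with production until it writes.
store = {}
prod = Namespace()
prod.write(0, b"production data", store)
dev = prod.clone()
dev.write(0, b"scratch data", store)
assert prod.block_map[0] != dev.block_map[0]     # writes diverged
```

This is why a developer's clone can track near-production data without a second full copy: only the blocks the clone actually writes consume new space.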
Dave Vellante: We've got to wrap, but in the time we have remaining, just a quick update on the company — headcount, funding, investors. Give us the rundown.

Brian Biles: Sure. We've raised a Series A and a Series B — about 55 million so far — from NEA and Lightspeed, plus some angels: Frank Slootman, Kai Li, Diane Greene, the original founder of VMware, and Ed Bugnion, who was the original CTO. We're a little over 70 people, and this is our first trade show.

Dave Vellante: Awesome. Well, congratulations, Brian. It's really great to see you back — not just in action, but now in visible action.

Brian Biles: It's great to be here. Thanks very much.

Dave Vellante: Thanks for coming on theCUBE. All right, everybody, we'll be back right after this. This is theCUBE, live from VMworld 2015.

Published Date: Sep 1, 2015
