Does Hardware Matter?
[Music]

Dave Vellante: Does hardware still matter? The attractiveness of software-defined models and services that are running in the cloud really makes you wonder, doesn't it? But the reality is that software has to run on something, and that something is hardware. History in the IT business shows that the hardware you purchase today is going to be up against the price performance of new systems in short order, and these new systems will be far superior from a price-performance standpoint within a couple of years. So when it's time to purchase a new system, whether it's a laptop, a mainframe, or a server, configuring a leading-edge product is going to give you the longest useful life of that new system.

Now, when I say a system, what makes up a system? Well, there are a lot of underlying technology components. Of course you have the processor, you've got memories, you've got storage devices, there's networking like network interface cards, there are interconnects and the bus architecture, like PCIe Gen 4 or whatever. These components are constantly in a leapfrog mode: clock speeds, more cores, faster memories, SSDs versus spinning disks, faster network cards, the whole gamut. So you see a constant advancement of the system components. It's a perpetual and sometimes chaotic improvement of the piece parts.

Now, I say chaotic because balancing these different components, such that you're not wasting resources and you're ensuring consistent application performance, is a critical aspect of architecting systems. So it becomes a game of whack-a-mole, meaning you're going to find the bottlenecks and you've got to stamp them out. It's a constant chase for locating the constraints, designing systems that address these constraints without breaking the bank, and optimizing all these components in a harmonious way.

Hello everyone, this is Dave Vellante of theCUBE, and while these issues may not capture all the headlines, except for maybe Tom's Hardware blog, they're part of an important topic that we want to explore more deeply. To do so, we're going to go inside some new benchmarking tests with our good friend Kim Leyenaar, who's Principal Performance Architect at Broadcom. Kim, always great to see you. Thanks so much for coming back on theCUBE.

Kim Leyenaar: Hi there, Dave. Good to see you too. Thanks for having me on.

Dave Vellante: You bet. So last time we met, we talked about the importance of designing these balanced systems. I talked about that in my open, and how solid state threw everything out of whack because the system was designed around spinning disk, and we talked about NVMe. We're here today with some new data. An independent performance lab, Prowess Consulting, conducted some initial tests. I've seen their white papers on this stuff. It compared the current generation of Dell servers with previous models to quantify the performance impact of these new technologies. So before we get into that, Kim, tell us a little about your background and your performance chops.

Kim Leyenaar: Sure. I started my career about 22 years ago, back when Ultra160 SCSI was out and could only do about 20 megabytes a second. But I built my experience really studying that relationship between the file systems and the application, the OS and storage layers, as well as the hardware interaction. I was absolutely amazed at how touching one really affects the other, and you have to understand that in order to be a good performance architect. I've authored dozens of performance white papers, and I've worked with thousands of customers over the years designing, optimizing, and debugging storage, and trying to build mathematical models to project where that next-generation product really needs to land. But honestly, I've just been blessed to work with some of the most brilliant and talented minds in the industry.

Dave Vellante: Yeah, well, that's why I love having you on. You can go really deep. So like I said, we've got these new white papers, new test results on these Dell servers. People might be wondering, what's the role Broadcom plays inside these systems?

Kim Leyenaar: Well, we've been working alongside Dell for decades trying to design some of the industry's best storage, and it's been a team effort. In fact, I've been working with some of these people for multiple decades. I know their birthdays and their travel plans and where they vacation. So it's been a really great relationship between Broadcom and Dell over the years. We've been with them through the SATA to SAS to SSD kind of revolution, working all the way from that series 5 to their latest series 11 products that support NVMe. But it's not just about gluing together the latest host or the latest disk interface. We work with them to try to understand and characterize their customers' and our customers' applications, the way they're deployed, security features, management, optimizing the I/O path, and making sure that when a failure happens we can get those RAID volumes back to optimal. So it's been a really great relationship between Broadcom and Dell.

Dave Vellante: Got it. Okay, let's get into the test framework. Let's keep it at a high level, and then we're going to get into some of the data. What did Prowess test? What was the workload? What can you tell us about what they were trying to measure?

Kim Leyenaar: Well, the first thing is you have to have an objective. What we did was have them benchmark one of the previous Dell PowerEdge R740xd servers, and then we had them compare that to the R750. And not just one R750: there were two different configurations of the R750. So we get to see what Gen 3 to Gen 4 looks like, and upgrading the processor, so we went from something like a Gold system to maybe a Platinum system. We added more controllers, we added more drives. And then we said, let's go ahead and do some SQL transactional benchmarking on it.

I'd like to go into why we chose that. Microsoft SQL Server is one of the most popular database management platforms in the world, and there are two kinds of workloads. One is OLTP, which processes records and business transactions, and then there's OLAP, which does analytical processing and a lot of complex queries. Together these two things drive the business operations and help improve productivity. It's a really critical part for the decision makers in all of our companies.

Dave Vellante: So before we share the actual test results, what specifically did Prowess measure? What were some of the metrics that we're going to see here?

Kim Leyenaar: We focused on the transactional workloads, so we did something called a TPC-C-like benchmark. Let me be really clear: we did not execute a TPC-C benchmark, but it was a TPC-C-like benchmark. TPC-C is one of the most mature standardized industry database benchmarks in the world, and what it does is simulate a sales model of a wholesale supplier. We can all agree that handling payments and orders and status and deliveries and things like that are really critical parts of running a business. Ultimately what this results in is something called a new order. Somebody might log on and say, "Hey, is this available? Let me pay you," and once that transaction is done, it's called a new order. So they come up with something called tpmC, which is the new-order transactions per minute.

Now, the neat thing is it's not a one-size-fits-all kind of benchmark. You scale the size and the capacity of the database by adding more warehouses. In our case we chose 1,400 warehouses, which is a pretty standardized size. You can also test the concurrency, so you could start from one thread, which kind of simulates a user, all the way up to however many threads you want. We decided to settle on 100 threads.

This is very different from generic benchmarking. We're actually doing real work. We're not just doing random reads and random writes. Those are great, they're critical, they tell us how well we're performing, but this is more like a paced workload. It really executes SQL I/O transactions, and those in-order operations are very different: you do a read and then a write and then another read, and those have to be executed in order. It's very different from just setting up a queue depth and a number of workers. It also provides very realistic and objective measurements that exercise not just the storage but the entire server.

Dave Vellante: All right, let's get into some of the results. The first graphic we're going to show you is what you were just talking about, new orders per minute. How should we interpret this graphic, Kim?

Kim Leyenaar: Well, I mean, it looks like we won the whack-a-mole game, didn't we? We started out with the baseline here, the R740xd, and we measured the new-order transactions per minute on that. We then set up the R750. In the very first R750 configuration (all the details are laid out in the paper that you just referenced) we started out with a single RAID controller with eight drives, and we measured that we got a 7x increase. Then in the second test we added another RAID controller and another eight drives, and we upgraded the processor a little bit, and we were able to even double that over the initial one.

How do we get there? That's really the more important thing. The critical part of this is understanding and characterizing the workload, so we know what kind of components to balance and where the bottlenecks are. Typically OLTP, online transaction processing, is a mix of transactions that are generally two reads to every one write, and they're very random. The way this benchmark works is it randomly accesses different warehouses and executes these queries. When it executes a read query, it pulls that data into memory. Once the data is in memory, any transactions are acted on in memory, so the actual database engine does in-memory transactions. Then you have something called a transaction log that has to record all those modifications down to non-volatile media. That's to make sure you have all the data in case somebody pulls the plug or something catastrophic happens. Then every once in a while, all those in-memory changes are written down to the disk in something called a checkpoint, and then we can go ahead and clear that transaction log.

So there's a sequence of different kinds of I/O that happen during the course of an OLTP transaction. Your bottlenecks can be found in the processor, the amount of memory, the latency of your disks, really the whole gamut. Everything could be a bottleneck. So the trick is to figure out where your bottlenecks are and release them so you can get the best performance you possibly can.

Dave Vellante: Yeah, the sequence of events that has to take place to do a write, we often take it for granted. Okay, the next set of data we're going to look at: like you said, you're doing reads, you're doing writes. We're going to bring up the data around log writes and log reads. Explain what we're looking at here.

Kim Leyenaar: As I mentioned earlier, even though the transactions happen in memory, those recorded transactions eventually get committed down to disk. What we do first is something called a log write, a transaction log write, and that allows the transaction to go ahead and process. So the trick here is to have the lowest-latency, fastest disks for that log. It's very critical for your consistency, and also for rollbacks and something called ACID transactions and operations. The log reads are really important also for recovery efforts. So we try to manage our log performance: we want low latency and very high IOPS for both reads and writes.

But it's not just the logs. There are also the database disks. What we see initially during these benchmarks is a bunch of reads going to the database data, and then after some period of time we see a checkpoint, and a big flurry of writes comes down. You have to be able to handle that flurry of writes as they're committed down to the disk. So some of our important design considerations here are: can our processor handle this workload, and do we have enough memory? And then we have three storage considerations: a database disk, a log disk, and of course tempdb as well.

Because we have the industry-leading RAID 5 performance, we were able to use RAID 5 for the database. Just years ago that was like, "Whoa, don't ever use RAID 5 on your database." That is no longer true. Our RAID 5 is fast enough and has low enough latency to handle a database, and it also helps save money. We used RAID 10 for the log, which is pretty standard. So with the faster processor, more cores, and double the disks, we get more performance. We just figured out where the bottlenecks were, we cleared them out, and we were able to double that.

Dave Vellante: That's interesting. Go back in history a little bit, when RAID 5 was all the rage. EMC at the time, now of course Dell, when they announced Symmetrix, they announced it with RAID 1, which was mirroring. They did that because it was heavily into mainframe and transaction processing, and while there was additional overhead, in that you need two disk drives to do that, the performance really outweighed it. And now we're seeing, with the advent of new technologies, that you're solving that problem. I guess the other thing, of course, is rebuild times, and we've kind of rethought that. So the next set of data that we're going to look at is how long it takes to rebuild the RAID. We'll bring that up now, and you can give us the insights here.

Kim Leyenaar: Well, you can see that we've been able to reduce the rebuild times. How do we do that? I can tell you, my fellow architects and I have been spending probably the last two years focusing on trying to improve the rebuild. It's not just rebuilding faster, it's also how to elegantly handle all the host operations. You can't just tell the host, "Sorry, I'm busy doing rebuilds." You've got to be able to handle that, because business continuity is a very critical component. We do that through mirroring and parity data layouts. The key is a really good balance: making sure you are supplying sufficient host I/O, while very quickly in the background, as soon as we have a moment, we start performing those rebuilds during the lull periods. Doing aggressive rebuilds while allowing business operations to continue has always been a really critical part, and we've been working on that a lot over the last couple of generations. That said, we always tell our customers: always have a backup. That's a critical part of business continuity plans.

Dave Vellante: Great. I wonder if we can come back to the components inside the system. How does what Broadcom is supplying to Dell in these servers contribute to these performance results specifically, Kim?

Kim Leyenaar: Specifically, we provide the PERC storage controller. The Dell R740xd has the series 10 H740P controller, whereas the R750 has the generation 11 PERC 11 H755N. We own those in terms of making sure that they are integrated properly into the system and provide the highest possible performance. But it's not just the storage controller. I want to make sure everybody knows that we also have our Broadcom NetXtreme E-Series. These are Gen 4 PCIe, 25-gig, dual-ported Ethernet controllers. In a true deployment, that is a really important part of the e-commerce business solution. So we own the storage for these as well as the networking.

Dave Vellante: Excellent. Okay, so we went deep inside the system, but let's up-level. Why does this matter to an organization? What's the business impact of all this tech coming to fruition?

Kim Leyenaar: As everybody always references, there's massive growth of data, and data is required for success. It doesn't matter if you're a Fortune 500 company or a small to medium business: that critical data needs to be protected, and protected without the complexity, overhead, or cost of hyper-converged infrastructure or SAN deployments. We're able to do this on bare metal, and it really helps with the TCO. The other thing is that NVMe is right now the fastest-growing storage, and NVMe is so fast from a performance perspective as well. That Dell R750 with the two PERC 11 controllers in it had over 51 terabytes of storage in a single server, and that's pretty impressive.

But there are so many different performance advantages that the R750 provides for SQL Server as well. They've got the 3rd Gen Intel Xeon Scalable processors, DDR4-3200 memory (the faster memory is very critical for those in-memory transactions), and Gen 4 PCIe. It really does justify an upgrade. And I can tell you, Dave, a little over a year ago I had one of these Dell R750 servers sitting in my own house and I was testing it, and I was just amazed at the performance. I was doing different TPC-C and TPC-H and TPC-E tests on it, and I was telling Dell, "Wow, this is amazing. This server is doing so well." I was so excited and could not wait to see it in print. So thank you to the Prowess team for actually showing the world what these servers can do combined with the Broadcom storage.

Dave Vellante: Now, speaking of the Prowess team, when you read the white papers, it really is focused on the small and medium-sized business market. So people might be wondering, "Well, wait a minute, why wouldn't folks just spin up this compute in the cloud? Why am I buying servers?"

Kim Leyenaar: Well, that's a really good question. Studies have shown that the majority of workloads are still on-prem. There's also a challenge with skill sets: there's a lack of cloud developers and cloud architects. So keeping these on-prem, where you actually own the hardware, really does help keep costs down. And the management of these R750s is fantastic, and so is the support that Dell provides.

Dave Vellante: Great. Kim, I love having you on, and we'd like to have you back. We're going to leave it there for now, but thanks so much. I really appreciate your time.

Kim Leyenaar: Thanks, Dave.

Dave Vellante: So look, this is really helpful in understanding that at the end of the day you still need microprocessors and memories and storage devices, controllers and interconnects. We just saw Pat Gelsinger at the State of the Union address, nudging the federal government to support semiconductor manufacturing, and Intel is going to potentially match TSMC's 100-billion-dollar capex commitment. That's going to be a tailwind for the surrounding components, including semiconductor component and core infrastructure designers like Broadcom. Now, this is a topic that we care about, and like I said, Kim, we're going to have you back, and we plan to continue our coverage under the hood in the future. So thank you for watching this CUBE Conversation. This is Dave Vellante, and we'll see you next time.

[Music]
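
The I/O sequence Kim describes in the interview (a log write first, the change applied in memory, a periodic checkpoint that flushes the changes to the database disk and clears the log, and log reads for recovery) can be illustrated with a toy model. This is a minimal Python sketch under stated assumptions, not how SQL Server is implemented: the `ToyEngine` class and its method names are hypothetical, every write is treated as already committed, and the whole log is cleared at each checkpoint.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Toy model of the log-write / checkpoint sequence; hypothetical, not a real engine."""

    data_disk: dict = field(default_factory=dict)    # database volume (RAID 5 in the test)
    log_disk: list = field(default_factory=list)     # transaction log volume (RAID 10 in the test)
    buffer_pool: dict = field(default_factory=dict)  # pages currently held in memory
    dirty: set = field(default_factory=set)          # pages changed since the last checkpoint

    def read(self, key):
        # A read query pulls the page into memory on first access.
        if key not in self.buffer_pool:
            self.buffer_pool[key] = self.data_disk.get(key)
        return self.buffer_pool[key]

    def write(self, key, value):
        # 1. Log write: record the change on non-volatile media first.
        self.log_disk.append((key, value))
        # 2. Apply the change in memory only; the database disk is untouched.
        self.buffer_pool[key] = value
        self.dirty.add(key)

    def checkpoint(self):
        # Flush every dirty page to the database disk in one burst of writes,
        # then clear the log (simplification: the whole log is dropped).
        flushed = len(self.dirty)
        for key in self.dirty:
            self.data_disk[key] = self.buffer_pool[key]
        self.dirty.clear()
        self.log_disk.clear()
        return flushed

    def recover(self):
        # After a crash the memory contents are gone: replay the log
        # (log reads) on top of whatever is on the database disk.
        self.buffer_pool = {}
        self.dirty = set()
        for key, value in self.log_disk:
            self.buffer_pool[key] = value
            self.dirty.add(key)


if __name__ == "__main__":
    engine = ToyEngine(data_disk={"stock:1": 10})
    engine.read("stock:1")
    engine.write("stock:1", 9)
    engine.write("order:1", "new")
    print("on disk before checkpoint:", engine.data_disk)
    engine.recover()  # simulate pulling the plug, then replaying the log
    print("pages flushed at checkpoint:", engine.checkpoint())
    print("on disk after checkpoint:", engine.data_disk)
```

The sketch shows why the two volumes see different I/O patterns: every `write` costs one small sequential append to the log, while the database disk sees nothing until `checkpoint` issues a burst of writes. That is consistent with the interview's point that the log needs low latency and the database disks need to absorb a flurry of writes.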