Power Panel: Does Hardware Still Matter


 

(upbeat music)
>> The ascendancy of cloud and SaaS has shone new light on how organizations think about, pay for, and value hardware. Once sought-after skills for practitioners with expertise in hardware troubleshooting, configuring ports, tuning storage arrays, and maximizing server utilization have been superseded by demand for cloud architects, DevOps pros, and developers with expertise in microservices, containers, application development, and the like. Even a company like Dell, the largest hardware company in enterprise tech, touts that it has more software engineers than those working in hardware. It begs the question, is hardware going the way of COBOL? Well, not likely. Software has to run on something, but the labor needed to deploy, troubleshoot, and manage hardware infrastructure is shifting. At the same time, we've seen the value flow also shifting in hardware. Once a world dominated by x86 processors, value is flowing to alternatives like Nvidia and Arm-based designs. Moreover, other componentry like NICs, accelerators, and storage controllers are becoming more advanced, integrated, and increasingly important. The question is, does it matter? And if so, why does it matter and to whom? What does it mean to customers, workloads, OEMs, and the broader society? Hello and welcome to this week's Wikibon theCUBE Insights powered by ETR. In this breaking analysis, we've organized a special power panel of industry analysts and experts to address the question, does hardware still matter? Allow me to introduce the panel. Bob O'Donnell is president and chief analyst at TECHnalysis Research. Zeus Kerravala is the founder and principal analyst at ZK Research. David Nicholson is a CTO and tech expert. Keith Townsend is CEO and founder of CTO Advisor. And Marc Staimer is the chief dragon slayer at Dragon Slayer Consulting and oftentimes a Wikibon contributor. Guys, welcome to theCUBE. Thanks so much for spending some time here.
>> Good to be here.
>> Thanks.
>> Thanks for having us.
>> Okay, before we get into it, I just want to bring up some data from ETR. This is a survey that ETR does every quarter. It's a survey of about 1,200 to 1,500 CIOs and IT buyers, and I'm showing a subset of the taxonomy here. This is an XY chart, and the vertical axis is something called net score. That's a measure of spending momentum. It's essentially the percentage of customers that are spending more on a particular area minus the percentage spending less. You subtract the lesses from the mores and you get a net score. The horizontal axis is pervasion in the data set. Sometimes they call it market share. It's not like IDC market share. It's just the percentage of activity in the data set as a percentage of the total. That red 40% line, anything over that is considered highly elevated. And for the past, I don't know, eight to 12 quarters, the big four have been AI and machine learning, containers, RPA, and cloud, and cloud of course is very impressive because not only is it elevated on the vertical axis, but you know it's very highly pervasive on the horizontal. So what I've done is highlighted in red that historical hardware sector. The server, the storage, the networking, and even PCs, despite the work from home, are depressed in relative terms. And of course, data center colocation services. Okay, so you're seeing obviously hardware is not... People don't have the spending momentum today that they used to. They've got other priorities, et cetera, but I want to start and go kind of around the horn with each of you. What is the number one trend that each of you sees in hardware and why does it matter? Bob O'Donnell, can you please start us off?
>> Sure, Dave. So look, I mean, hardware is incredibly important, and one comment first I'll make on that slide is let's not forget that hardware, even though it may not be growing, the amount of money spent on hardware continues to be very, very high. It's just a little bit more stable.
It's not as subject to big jumps as we see certainly in other software areas. But look, the important thing that's happening in hardware is the diversification of the types of chip architectures we're seeing and how and where they're being deployed, right? You referred to this in your opening. We've moved from a world of x86 CPUs from Intel and AMD to things like, obviously, GPUs and DPUs. We've got VPUs for, you know, computer vision processing. We've got AI-dedicated accelerators, we've got all kinds of other network acceleration tools and AI-powered tools. There's an incredible diversification of these chip architectures, and that's been happening for a while, but now we're seeing them more widely deployed, and it's being done that way because workloads are evolving. The kinds of workloads that we're seeing in some of these software areas require different types of compute engines than we've traditionally had. The other thing is (coughs), excuse me, the power requirements based on where geographically that compute happens are also evolving. This whole notion of the edge, which I'm sure we'll get into in a little bit more detail later, is driven by the fact that where the compute actually sits, closer to, in theory, the edge and where edge devices are, depending on your definition, changes the power requirements. It changes the kind of connectivity that connects the applications to those edge devices. So all of those things are being impacted by this growing diversity in chip architectures. And that's a very long-term trend that I think we're going to continue to see play out through this decade and well into the 2030s as well.
>> Excellent, great, great points. Thank you, Bob. Zeus up next, please.
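The net score that Dave described when introducing the ETR chart (the share of customers spending more minus the share spending less, with 40% treated as highly elevated) can be sketched in a few lines. This is only an illustration of that simplified description; ETR's exact methodology is not given in the discussion, and the respondent counts below are hypothetical, not ETR data.

```python
ELEVATED_THRESHOLD = 40.0  # the "red 40% line" mentioned in the discussion


def net_score(spending_more: int, spending_less: int, total_respondents: int) -> float:
    """Return net score in percentage points.

    Simplified form taken from the spoken description: the percentage of
    respondents spending more minus the percentage spending less.
    """
    if total_respondents <= 0:
        raise ValueError("total_respondents must be positive")
    return 100.0 * (spending_more - spending_less) / total_respondents


# Hypothetical counts for one technology sector (illustrative only).
score = net_score(spending_more=660, spending_less=120, total_respondents=1200)
is_elevated = score > ELEVATED_THRESHOLD
print(f"net score: {score:.1f}, highly elevated: {is_elevated}")
```

Respondents whose spending is flat still count in the denominator, so a sector can be widely present in the data set and still show a low net score, which is the pattern Dave points to for the hardware categories.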
>> Yeah, and I think the other thing to remember when you look at this chart is, you know, through the pandemic and the work from home period, a lot of companies did put their office modernization projects on hold, and you heard that echoed, you know, from really all the network manufacturers anyway. They always had projects underway to upgrade networks. They put 'em on hold. Now that people are starting to come back to the office, they're looking at that now. So we might see some change there, but Bob's right. The sizes of those markets are quite a bit different. I think the other big trend here is that the hardware companies, at least in the areas that I look at, networking, are understanding now that it's a combination of hardware and software and silicon working together that creates that optimum type of performance and experience, right? So some things are best done in silicon, like data forwarding and things like that. Historically, when you look at the way network devices were built, you did everything in hardware. You configured in hardware, they did all the data forwarding, and did all the management. And that's been decoupled now. So more and more of the control element has been placed in software. A lot of the high-performance things, encryption and, as I mentioned, data forwarding, packet analysis, stuff like that, is still done in hardware, but not everything is done in hardware. And so it's a combination of the two. I think, for the people that work with the equipment as well, there's been more of a shift to understanding how to work with software. And this is a mistake I think the industry made for a while: we had everybody convinced they had to become a programmer. It's really more about being a software power user. Can you pull things out of software? Can you do it through API calls and things like that? But I think the big theme here is, David, it's a combination of hardware and software working together that really makes a difference.
And, you know, how much you invest in hardware versus software kind of depends on the performance requirements you have. And I'll talk about that later, but that's really the big shift that's happened here. It's the vendors that figured out how to optimize performance by leveraging the best of all of those.
>> Excellent. You guys both brought up some really good themes that we can tap into. Dave Nicholson, please.
>> Yeah, so just kind of picking up where Bob started off. Not only are we seeing the rise of a variety of CPU designs, but I think increasingly the connectivity that's involved from a hardware perspective, from a kind of a server or service design perspective, has become increasingly important. I think we'll get a chance to look at this in more depth a little bit later, but when you look at what happens on the motherboard, you know, we're not in so much a CPU-centric world anymore. Various application environments have various demands, and you can meet them by using a variety of components. And it's extremely significant when you start looking down at the component level. It's really important that you optimize around those components. So I guess my summary would be, I think we are moving out of the CPU-centric hardware model into more of a connectivity-centric model. We can talk more about that later.
>> Yeah, great. And thank you, David. And Keith Townsend, I'm really interested in your perspectives on this. I mean, for years you worked in a data center surrounded by hardware. Now that we have the software-defined data center, please chime in here.
>> Well, you know, I'm going to dig deeper into that software-defined data center nature of what's happening with hardware. Hardware is meeting software. Infrastructure as code is a thing. What does that code look like?
We're still trying to figure that out, but it's about surfacing these capabilities that the previous analysts have brought up. How do I ensure that I can get the level of services needed for the applications that I need? Whether they're legacy, traditional data center workloads, AI/ML workloads, or workloads at the edge. How do I codify that and consume that as a service? And hardware vendors are figuring this out. HPE, with the big push into GreenLake as a service. Dell now with APEX, taking what we need, these bare-bones components, moving it forward with DDR5, DDR6, CXL, et cetera, and surfacing that as code or as services. This is a very tough problem as we transition from consuming a hardware-based configuration to this infrastructure-as-code paradigm shift.
>> Yeah, programmable infrastructure, really attacking that sort of labor discussion that we were having earlier, okay. Last but not least, Marc Staimer, please.
>> Thanks, Dave. My peers raised really good points. I agree with most of them, but I'm going to disagree with the title of this session, which is, does hardware matter? It absolutely matters. You can't run software on the air. You can't run it in an ephemeral cloud, although there's the technical cloud and that's a different issue. The cloud has kind of changed everything. And from a market perspective, in the 40-plus years I've been in this business, I've seen this perception that hardware has to go down in price every year. And part of that was driven by Moore's law. And we're coming to, let's say, a lag or an end of Moore's law, depending on who you talk to. So we're not doubling our transistors every 18 to 24 months in a chip, and as a result of that, there's been a higher emphasis on software. From a market perception, there's no penalty. They don't put the same pressure on software from the market to reduce the cost every year that they do on hardware, which is kind of bass-ackwards when you think about it. Hardware costs are fixed. Software costs tend to be very low.
It's kind of a weird thing that we do in the market. And what's changing is we're now starting to treat hardware like software from an OpEx versus CapEx perspective. So yes, hardware matters. And we'll talk about that more at length.
>> You know, I want to follow up on that. And I wonder if you guys have a thought on this. Bob O'Donnell, you and I have talked about this a little bit. Marc, you just pointed out that Moore's law could be waning. Pat Gelsinger recently, at Intel's investor meeting, promised that Moore's law is alive and well. And the point I made in breaking analysis was, okay, great. You know, Pat said, doubling transistors every 18 to 24 months. Let's say that Intel can do that, even though we know it's waning somewhat. Look at the M1 Ultra from Apple (chuckles). In about 15 months, they increased transistor density on their package by 6X. So to your earlier point, Bob, we have these alternative processors that are really changing things. And to Dave Nicholson's point, there's a whole lot of supporting components as well. Do you have a comment on that, Bob?
>> Yeah, I mean, it's a great point, Dave. And one thing to bear in mind as well: not only are we seeing a diversity of these different chip architectures and different types of components, as a number of us have raised, but the other big point, and I think it was Keith that mentioned it, is that CXL and interconnect on the chip itself is dramatically changing it. And a lot of the more interesting advances that are going to continue to drive Moore's law forward, in terms of the way we think about performance, if perhaps not number of transistors per se, are the interconnects that become available. You're seeing the development of chiplets or tiles, people use different names, but the idea is you can have different components being put together eventually in sort of a Lego block style.
And what that's also going to allow, not only is that going to give interesting performance possibilities because of the faster interconnect, so you can have shared memory between things, which for big workloads like AI, with huge data sets, can make a huge difference compared with how you talk to memory over a network connection, for example, but not only that, you're going to see more diversity in the types of solutions that can be built. So we're going to see even more choices in hardware from a silicon perspective because you'll be able to piece together different elements. And oh, by the way, the other benefit of that is we've reached a point in chip architectures where not everything benefits from being smaller. We've been so focused and so obsessed, when it comes to Moore's law, with the size of each individual transistor, and yes, for certain architecture types, CPUs and GPUs in particular, that's absolutely true, but we've already hit the point where things like RF for 5G and Wi-Fi and other wireless technologies and a whole bunch of other things actually don't get any better with a smaller transistor size. They actually get worse. So the beauty of these chiplet architectures is you could actually combine different chip manufacturing sizes. You know, you hear about four nanometer and five nanometer, along with 14 nanometer, on a single chip, each one optimized for its specific application, yet together they can give you the best of all worlds. And so we're just at the very beginning of that era, which I think is going to drive a ton of innovation. Again, it gets back to my comment about different types of devices located in geographically different places: at the edge, in the data center, you know, in a private cloud versus a public cloud. All of those things are going to be impacted, and there'll be a lot more options because of this silicon diversity and this interconnect diversity that we're just starting to see.
>> Yeah, David. David Nicholson's got a graphic on that.
He's going to show it later. Before we do that, I want to introduce some data. I actually want to ask Keith to comment on this before we, you know, go on. This next slide is some data from ETR that shows the percent of customers that cited difficulty procuring hardware. And you can see the red is they had significant issues, and it's most pronounced in laptops and networking hardware on the far right-hand side, but virtually all categories, firewalls, peripherals, servers, storage, are having moderately difficult procurement issues. That's the sort of pinkish, or significant challenges. So Keith, I mean, what are you seeing with your customers in the hardware supply chains and bottlenecks? And you know, we're seeing it with automobiles and appliances, so it goes beyond IT. The semiconductor, you know, challenges. What's been the impact on the buyer community and society, and do you have any sense as to when it will subside?
>> You know, I was just asked this question yesterday and I'm feeling the pain. As kind of a side project within the CTO Advisor, we built a hybrid infrastructure, a traditional IT data center, so that we're walking with the traditional customer in modernizing that data center. So it was, you know, kind of a snapshot of time in 2016, 2017: 10 gigabit Arista switches, some older Dell R730xd servers, you know, speeds and feeds. And we said we would modernize that with the latest Intel stack and connect it to the public cloud, and then the pandemic hit, and we are experiencing a lot of the same challenges. I thought we'd easily migrate from 10 gig networking to 25 gig networking, the path that customers are going on. The 10 gig network switches that I bought used are now double the price, because you can't get legacy 10 gig network switches, because all of the manufacturers are focusing their capacity on the more profitable 25 gig. Even the 25 gig switches, and we're focused on networking right now, are hard to procure.
We're talking about nine to 12 months or more lead time. So we're seeing customers adjust by adopting cloud. But if you remember, early on in the pandemic, Microsoft Azure kind of gated customers that didn't have a capacity agreement. So customers are keeping an eye on that. There's a desire to abstract away from the underlying vendor, to be able to control or provision your IT services in the way that we do with VMware or some other virtualization technology, where it doesn't matter who can get me the hardware, as long as they can get me the hardware, because it's critically impacting projects and timelines.
>> So that's a great setup for you, Zeus. Keith mentioned earlier the software-defined data center, with software-defined networking and cloud. Do you see a day where networking hardware is commoditized and it's all about the software, or are we there already?
>> No, we're not there already. And I don't see that really happening any time in the near future. I do think it's changed though. And just to be clear, I mean, when you look at that data, this is saying customers have had problems procuring the equipment, right? And there's not a network vendor out there that isn't affected. I've talked to Norman Rice at Extreme, and I've talked to the folks at Cisco and Arista about this. They all said they could have had blowout quarters had they had the inventory to ship. So it's not like customers aren't buying this anymore. Right? I do think, though, when it comes to networking, the network has certainly changed some, because there's a lot more control, as I mentioned before, that you can do in software. And I think the customers need to start thinking about the types of hardware they buy and, you know, where they're going to use it and, you know, what its purpose is. Because I've talked to customers that have tried to run software on commodity hardware where the performance requirements are very high, and it's bogged down, right? It just doesn't have the horsepower to run it.
And, you know, even when you do that, you have to start thinking of the components you use. The NICs you buy. And I've talked to customers that have simply gone through the process of replacing a NIC in a commodity box and had some performance problems and, you know, things like that. So if agility is more important than performance, then by all means try running software on commodity hardware. I think that works in some cases. If performance, though, is more important, that's when you need that kind of turnkey hardware system. And I've actually seen more and more customers reverting back to that model. In fact, when you talk to even some startups today about when they come to market, they're delivering things more on appliances, because that's what customers want. And so there's this kind of pivot, this pendulum, between agility and performance. And if performance absolutely matters, that's when you do need to buy these kinds of turnkey, prebuilt hardware systems. If agility matters more, that's when you can go more to software, but the underlying hardware still does matter. So I think, you know, will we ever have a day where you can just run it on whatever hardware? Maybe, but I'll be long retired by that point. So I don't care.
>> Well, you bring up a good point, Zeus. And I remember the early days of cloud, the narrative was, oh, the cloud vendors, they don't use EMC storage, they just run on commodity storage. And then of course, lo and behold, you know, they trotted out James Hamilton to talk about all the custom hardware that they were building. And you saw Google and Microsoft follow suit.
>> Well, (indistinct) been falling for this forever. Right? And I mean, all the way back to the turn of the century, we were calling for the commoditization of hardware. And it's never really happened because you can still drive.
As long as you can drive innovation into it, customers will always lean towards the innovation cycles, 'cause they get more features faster and things. And so the vendors have done a good job of keeping that cycle up, but it'll be a long time before.
>> Yeah, and that's why you see companies like Pure Storage, a storage company with 69% gross margins. All right. I want to jump ahead. We're going to bring up slide four. I want to go back to something that Bob O'Donnell was talking about, the sort of supporting act. The diversity of silicon, and we've marched to the cadence of Moore's law for decades. You know, we asked, you know, is Moore's law dead? We say it's moderating. Dave Nicholson, you want to talk about those supporting components. And you shared with us a slide that shows what you call a shift from a processor-centric world to a connect-centric world. What do you mean by that? And let's bring up slide four and you can talk to that.
>> Yeah, yeah. So first, I want to echo this sentiment that for the question, does hardware matter, the answer is of course it matters. Maybe the real question should be, should you care about it? And the answer to that is it depends who you are. If you're an end user using an application on your mobile device, maybe you don't care how the architecture is put together. You just care that the service is delivered. But as you back away from that and you get closer and closer to the source, someone needs to care about the hardware, and it should matter. Why? Because essentially what hardware is doing is consuming electricity and dollars, and the more efficiently you can configure hardware, the more bang you're going to get for your buck. So it's not only a quantitative question in terms of how much can you deliver. It also ends up being a qualitative change as capabilities allow for things we couldn't do before, because we just didn't have the aggregate horsepower to do it.
So this chart actually comes out of some performance tests that were done. It happens to be Dell servers with Broadcom components. And the point here was to peel back, you know, peel off the top of the server and look at what's in that server, starting with, you know, the PCI interconnect. So PCIe gen three, gen four, moving forward. What are the effects of the interconnect on application performance, translating into new orders per minute processed per dollar, et cetera, et cetera? If you look at the advances in CPU architecture mapped against the advances in interconnect and storage subsystem performance, you can see that CPU architecture is sort of lagging behind in a way. And Bob mentioned this idea of tiling and all of the different ways to get around that. When we do performance testing, we can actually peg CPUs just running the performance tests, without any actual database environments working. So right now we're at this sort of imbalance point where you have to make sure you design things properly to get the most bang per kilowatt hour of power per dollar input. So the key thing that this is highlighting, just as a very specific example: you take a card that's designed as a gen three PCIe device, and you plug it into a gen four slot. Now the card is the bottleneck. You plug a gen four card into a gen four slot. Now the gen four slot is the bottleneck. So we're constantly chasing these bottlenecks. Someone has to be focused on that from an architectural perspective. It's critically important. So there's no question that it matters. But of course, various people in this food chain won't care where it comes from. I guess a good analogy might be, where does our food come from? If I get a steak, it's a pink thing wrapped in plastic, right? Well, there are a lot of inputs that a lot of people have to care about to get that to me. Do I care about all of those things? No. Are they important? They're critically important.
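The card-versus-slot example above can be made concrete with a small sketch. It assumes the standard PCIe per-lane signaling rates (8, 16, and 32 GT/s for gen three, four, and five, all with 128b/130b encoding) and the fact that a link runs at the lower generation and narrower width of the two ends. The function names are hypothetical, and the numbers are theoretical link ceilings that ignore protocol overhead and every other component in the system; they are not results from the tests Dave mentions.

```python
# Raw per-lane signaling rate in GT/s for each PCIe generation.
PCIE_GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}
# Gen 3 and later use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gb_per_s(card_gen: int, slot_gen: int,
                            card_lanes: int, slot_lanes: int) -> float:
    """Theoretical one-direction link bandwidth in GB/s.

    The link negotiates down to the lower generation and the
    narrower lane count of the card and the slot.
    """
    gen = min(card_gen, slot_gen)
    lanes = min(card_lanes, slot_lanes)
    gigabits_per_s = PCIE_GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes
    return gigabits_per_s / 8  # bits to bytes


def limiting_side(card_gen: int, slot_gen: int) -> str:
    """Name which end caps the link generation, if either does."""
    if card_gen < slot_gen:
        return "card"
    if slot_gen < card_gen:
        return "slot"
    return "matched"


# A gen three x8 card in a gen four x8 slot runs at gen three speed.
mismatched = link_bandwidth_gb_per_s(3, 4, 8, 8)
matched = link_bandwidth_gb_per_s(4, 4, 8, 8)
print(f"gen3 card in gen4 slot: {mismatched:.2f} GB/s ({limiting_side(3, 4)} limits)")
print(f"gen4 card in gen4 slot: {matched:.2f} GB/s ({limiting_side(4, 4)})")
```

Once the card and slot are matched, the link ceiling doubles, and whatever is slowest next (CPU, storage controller, network) becomes the thing to chase, which is the point being made about constantly moving bottlenecks.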
>> So, okay. So what does this all mean to customers? What I'm hearing from you is that balancing a system is becoming, you know, more complicated. And I've kind of been waiting for this day for a long time, because as we all know, the bottleneck was always the spinning disk, the last mechanical device. So people who wrote software knew that when they were doing a write, the disk had to go and do stuff. And so they were doing other things in the software. And now with all these new interconnects and flash and things like that, you could do atomic writes. And so that opens up new software possibilities, and combine that with alternative processors. But what's the so-what on this for the customer and the application impact? Can anybody address that?
>> Yeah, let me address that for a moment. I want to leverage some of the things that Bob said, Keith said, Zeus said, and David said. So I'm a bit of a contrarian in some of this. For example, on the chip side. As the chips get smaller, 14 nanometer, 10 nanometer, five nanometer, soon three nanometer, we talk about more cores, but the biggest problem on the chip is the interconnect on the chip, 'cause the wires get smaller. People don't realize that in 2004 the latency on those wires in the chips was 80 picoseconds. Today it's 1300 picoseconds. That's on the chip. This is why they're not getting faster. So we may be seeing a bit of a slowdown in Moore's law. But even as we kind of conquer that, you still have the interconnect problem, and the interconnect problem goes beyond the chip. It goes within the system, composable architectures. It goes to the point Keith made: ultimately you need a hybrid, because what we're seeing, what I'm seeing when I'm talking to customers, is that the biggest issue they have is moving data. Whether it be in a chip, in a system, in a data center, or between data centers, moving data is now the biggest gating item in performance.
So if you want to move it from, let's say, your transactional database to your machine learning, the bottleneck is moving the data. And so when you look at it from a distributed environment, now you've got to move the compute to the data. The only way to get around these bottlenecks today is to spend less time trying to move the data and more time taking the compute, the software running on hardware, closer to the data. Go ahead.
>> So is this what Nicholson meant when he was talking about a shift from a processor-centric world to a connectivity-centric world? You're talking about moving the bits across all the different components, and you're saying it's not the processor that's essentially becoming the bottleneck, or the memory, I guess.
>> Well, that's one of them, and there's a lot of different bottlenecks, but it's the data movement itself. It's moving away from, wait, why do we need to move the data? Can we move the compute, the processing, closer to the data? Because if we keep them separate, and this has been a trend now, where people are moving processing away from it. It's like the edge. I think it was Zeus or David. You were talking about the edge earlier. As you look at the edge, who defines the edge, right? Is the edge a closet or is it a sensor? If it's a sensor, how do you do AI at the edge when you don't have enough power, you don't have enough compute? People are inventing chips to do that. To do all that at the edge, to do AI within the sensor, instead of moving the data to a data center or a cloud to do the processing. Because the lag in latency is always limited by the speed of light. How fast can you move the electrons? And there's all this interconnecting, all the processing, and all the improvement we're seeing in the PCIe bus from three, to four, to five, to CXL, to higher bandwidth on the network. And that's all great, but none of that deals with the speed of light latency. And that's an-- Go ahead.
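Marc's point that no amount of bandwidth removes speed-of-light latency can be illustrated with a rough calculation. The sketch below assumes signals in optical fiber travel at about two thirds of the vacuum speed of light, a common rule of thumb; the distances and the `velocity_factor` default are illustrative assumptions, not figures from the discussion, and real links add switching and queuing delay on top.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # vacuum speed of light


def one_way_delay_ms(distance_km: float, velocity_factor: float = 0.67) -> float:
    """Minimum one-way propagation delay in milliseconds.

    velocity_factor is the fraction of the vacuum speed of light at which
    the signal travels (roughly 0.67 for optical fiber, an assumed value).
    """
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity_factor must be in (0, 1]")
    meters = distance_km * 1_000
    seconds = meters / (SPEED_OF_LIGHT_M_PER_S * velocity_factor)
    return seconds * 1_000


def round_trip_ms(distance_km: float, velocity_factor: float = 0.67) -> float:
    """Minimum round-trip propagation delay in milliseconds."""
    return 2 * one_way_delay_ms(distance_km, velocity_factor)


# Illustrative distances: a sensor-to-closet hop versus a hop to a distant cloud region.
for label, km in [("edge closet, 0.1 km", 0.1), ("regional cloud, 1000 km", 1000.0)]:
    print(f"{label}: round trip at least {round_trip_ms(km):.4f} ms")
```

The floor scales linearly with distance and does not change with link bandwidth, which is why running inference in or near the sensor avoids a cost that faster buses and networks cannot.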
>> You know, Marc, I just want to jump in, because what you're referring to could be looked at at a macro level, which I think is what you're describing. You can also look at it at a more micro level from a systems design perspective, right? I'm going to be the resident knuckle-dragging hardware guy on the panel today. But it's exactly right. Moving compute closer to data includes concepts like peripheral cards that have built-in intelligence, right? So again, in some of this testing that I'm referring to, we saw dramatic improvements when you basically offloaded the work instead of using the CPU horsepower for things like IO. Now you have essentially offload engines in the form of storage controllers, RAID controllers, and of course, for Ethernet NICs, SmartNICs. And so you can have these sort of offload engines, and we've gone through these waves over time. People think, well, wait a minute, a RAID controller and NVMe? You know, flash storage devices. Does that make sense? It turns out it does. Why? Because you're actually, at a micro level, doing exactly what you're referring to. You're bringing compute closer to the data. Now, closer to the data meaning closer to the data storage subsystem. It doesn't solve the macro issue that you're referring to, but it is important. Again, going back to this idea of system design optimization, always chasing the bottleneck, plugging the holes. Someone needs to do that in this value chain in order to get the best value for every kilowatt hour of power and every dollar.
>> Yeah.
>> Well, this whole drive for performance has created some really interesting architectural designs, right? Like, Nicholson, the rise of the DPU, right? It brings more processing power into systems that already had a lot of processing power. There's also been some really interesting, you know, kind of innovation in the area of systems architecture too.
If you look at the way Nvidia goes to market, their Drive kit is a prebuilt piece of hardware, you know, optimized for self-driving cars, right? They partnered with Pure Storage and Arista to build that AI-ready infrastructure. I remember when I talked to Charlie Giancarlo, the CEO of Pure, about when the three companies rolled that out. He said, "Look, if you're going to do AI, you need good storage. You need fast storage, fast processors, and a fast network." And so for customers to be able to put that together themselves was very, very difficult. There's a lot of software that needs tuning as well. So the three companies partnered together to create a fully integrated turnkey hardware system with a bunch of optimized software that runs on it. And so in that case, in some ways the hardware was leading the software innovation. And so the variety of different architectures we have today around hardware has really exploded. And I think it's part of what Bob brought up at the beginning about the different chip designs.
>> Yeah, Bob talked about that earlier. Bob, I mean, most AI today is modeling, you know, and a lot of that's done in the cloud, and it looks, from my standpoint anyway, like the future is going to be a lot of AI inferencing at the edge. And that's a radically different architecture, Bob, isn't it?
>> It is, it's a completely different architecture. And just to follow up on a couple of points, excellent conversation, guys. Dave talked about system architecture, and really that's what this boils down to, right? But it's looking at architecture at every level. I was talking about the individual different components and the new interconnect methods. There's this new thing called UCIe, universal connection, I forget what it stands for, but it's a mechanism for doing chiplet architectures. But then again, you have to take it up to the system level, 'cause it's all fine and good.
If you have this SoC that's tuned and optimized, but it has to talk to the rest of the system. And that's where you see other issues. And you've seen things like CXL and other interconnect standards, you know, and nobody likes to talk about interconnect 'cause it's really wonky and really technical and not that sexy, but at the end of the day it's incredibly important, exactly to the other points that were being raised, like Marc raised, for example, about getting that compute closer to where the data is. And that's where, again, a diversity of chip architectures helps. And exactly to your last comment there Dave, putting that ability in an edge device is really at the cutting edge of what we're seeing in semiconductor design and the ability to, for example, maybe it's an FPGA, maybe it's a dedicated AI chip. It's another kind of chip architecture that's being created to do that inferencing on the edge. Because again, the cost and the challenges of moving lots of data, whether it be from say a smartphone to a cloud-based application or whether it be from a private network to a cloud or any other kinds of permutations we can think of, really matter. And the other thing is we're tackling bigger problems. So architecturally, not even just architecturally within a system, but when we think about DPUs and the sort of east-west data center movement conversation that we hear Nvidia and others talk about, it's about combining multiple sets of these systems to function together more efficiently, again with even bigger sets of data. So it really is about tackling where the processing is needed, having the interconnect and the ability to get the data you need to the right place at the right time. And because those needs are diversifying, we're just going to continue to see an explosion of different choices and options, which is going to make hardware even more essential, I would argue, than it is today. 
And so I think what we're going to see is not only does hardware matter, it's going to matter even more in the future than it does now. >> Great, yeah. Great discussion, guys. I want to bring Keith back into the conversation here. Keith, if your main expertise in tech is provisioning LUNs, you probably want to look for another job. So maybe clearly hardware matters, but with software defined everything, do people with hardware expertise matter outside of for instance, component manufacturers or cloud companies? I mean, VMware certainly changed the dynamic in servers. Dell just spun off its most profitable asset in VMware. So it obviously thinks hardware can stand alone. How does an enterprise architect view the shift to software defined hyperscale cloud and how do you see the shifting demand for skills in enterprise IT? >> So I love the question and I'll take a different view of it. If you're a data analyst and your primary value add is that you do ETL transformation... I talked to a CDO, a chief data officer of a midsize bank, a little bit ago. He said 80% of his data scientists' time is spent on ETL. Super not value add. He wants his data scientists to do data science work. Chances are if your only value is that you do LUN provisioning, then you probably don't have a job now. The technologies have gotten much more intelligent. As infrastructure pros, we want to give infrastructure pros the opportunities to shine and I think the software defined nature and the automation that we're seeing vendors undertake, whether it's Dell, HP, Lenovo, take your pick, Pure Storage, NetApp, that are doing the automation and the ML needed so that these practitioners don't spend 80% of their time doing LUN provisioning and can focus on their true expertise, which is ensuring that data is stored. Data is retrievable, data's protected, et cetera. 
I think the shift is to focus on that part of the job that you're ensuring no matter where the data's at, because as my data is spread across the enterprise hybrid different types, you know, Dave, you talk about the super cloud a lot. If my data is in the super cloud, protecting that data and securing that data becomes much more complicated than when it was me just procuring or provisioning LUNs. So when you say, where should the shift be, or look be, you know, focusing on the real value, which is making sure that customers can access data, can recover data, can get data at performance levels that they need within the price point they need, to get at those datasets and where they need it. We talked a lot about where they need it. One last point about this interconnecting. I have this vision and I think we all do of composable infrastructure. This idea that scale out does not solve every problem. The cloud can give me infinite scale out. Sometimes I just need a single OS with 64 terabytes of RAM and 204 GPUs or GPU instances. That single OS does not exist today. And the opportunity is to create composable infrastructure so that we solve a lot of these problems that just simply don't scale out. >> You know, wow. So many interesting points there. I had just interviewed Zhamak Dehghani, who's the founder of Data Mesh, last week. And she made a really interesting point. She said, "Think about, we have separate stacks. We have an application stack and we have a data pipeline stack and the transaction systems, the transaction database, we extract data from that," to your point, "We ETL it in, you know, it takes forever. And then we have this separate sort of data stack." If we're going to inject more intelligence and data and AI into applications, those two stacks, her contention is they have to come together. And when you think about, you know, super cloud bringing compute to data, that was what Hadoop was supposed to be. 
It ended up all sort of going into a central location, but it's almost a rhetorical question. I mean, it seems that that necessitates new thinking around hardware architectures as kind of everything's the edge. And the other point is to your point, Keith, it's really hard to secure that. So when you can think about offloads, right, you've heard the stats, you know, Nvidia talks about it. Broadcom talks about it that, you know, that 30%, 25 to 30% of the CPU cycles are wasted on doing things like storage offloads, or networking or security. It seems like maybe Zeus you have a comment on this. It seems like new architectures need to come together to support, you know, all of that stuff that Keith and I just discussed. >> Yeah, and by the way, I do want to go back to the question you just asked Keith. It's the point I made at the beginning too, about engineers: they do need to be more software-centric, right? They do need to have better software skills. In fact, I remember talking to Cisco about this last year when they surveyed their engineer base, only about a third of 'em had ever made an API call, which, you know, kind of shows this big skillset change, you know, that has to come. But on the point of architectures, I think the big change here is edge because it brings in distributed compute models. Historically, when you think about compute, even with multi-cloud, we never really had multi-cloud. We'd use multiple centralized clouds, but compute was always centralized, right? It was in a branch office, in a data center, in a cloud. With edge, what we create is the rise of distributed computing where we'll have an application that actually accesses different resources at different edge locations. And I think Marc, you were talking about this, like the edge could be in your IoT device. It could be your campus edge. It could be cellular edge, it could be your car, right? 
And so we need to start thinkin' about how our applications interact with all those different parts of that edge ecosystem, you know, to create a single experience. The consumer apps, a lot of consumer apps largely work that way. If you think of an app like Uber, right? It pulls in information from all kinds of different edge applications, edge services. And, you know, it creates a pretty cool experience. We're just starting to get to that point in the business world now. There's a lot of security implications and things like that, but I do think it drives more architectural decisions to be made about how I deploy what data where and where I do my processing, where I do my AI and things like that. It actually makes the world more complicated. In some ways we can do so much more with it, but I think it does drive us more towards turnkey systems, at least initially in order to, you know, ensure performance and security. >> Right. Marc, I wanted to go to you. You had indicated to me that you wanted to chat about this a little bit. You've written quite a bit about the integration of hardware and software. You know, we've watched Oracle's move from, you know, buying Sun and then basically using that in a highly differentiated approach. Engineered systems. What's your take on all that? I know you also have some thoughts on the shift from CapEx to OPEX, chime in on that. >> Sure. When you look at it, there are advantages to having one vendor who has the software and hardware. They can synergistically make them work together in a way that you can't do on a commodity basis, if you own the software and somebody else has the hardware. An example would be Oracle. As you talked about with their Exadata platform, they literally are leveraging microcode in the Intel chips. And now in AMD chips and all the way down to Optane, they make basically AMD database servers work with Optane memory PMM in their storage systems, not NVMe SSD, PMM. I'm talking about the cards themselves. 
So there are advantages you can take advantage of if you own the stack, as you were pointing out earlier, Dave, of both the software and the hardware. Okay, that's great. But on the other side of that, that tends to give you better performance, but it tends to cost a little more. On the commodity side it costs less but you get less performance. What Zeus had said earlier, it depends where you're running your application. How much performance do you need? What kind of performance do you need? One of the things about moving to the edge and I'll get to the OPEX CapEx in a second. One of the issues about moving to the edge is what kind of processing do you need? If you're running in a CCTV camera on top of a traffic light, how much power do you have? How much cooling do you have that you can run this? And more importantly, do you have to take the data you're getting and move it somewhere else to get processed and have the information sent back? I mean, there are companies out there like BrainChip that have developed AI chips that can run on the sensor without a CPU. Without any additional memory. So, I mean, there's innovation going on to deal with this question of data movement. There's companies out there like Tachyon that are combining GPUs, CPUs, and DPUs in a single chip. Think of it as super composable architecture. They're looking at being able to do more in less. On the OPEX and CapEx issue. >> Hold that thought, hold that thought on the OPEX CapEx, 'cause we're running out of time and maybe you can wrap on that. I just wanted to pick up on something you said about the integrated hardware software. I mean, other than the fact that, you know, Michael Dell unlocked whatever $40 billion for himself and Silver Lake, I was always a fan of a spin-in with VMware, basically becoming the Oracle of hardware. 
Now I know it would've been a nightmare for the ecosystem and culturally, they probably would've had a VMware brain drain, but does anybody have any thoughts on that as a sort of a thought exercise? I was always a fan of that on paper. >> I got to eat a little crow. I did not like the Dell VMware acquisition for the industry in general. And I think it hurt the industry in general, HPE, Cisco walked away a little bit from that VMware relationship. But when I talked to customers, they loved it. You know, I got to be honest. They absolutely loved the integration. The VxRail, VxRack solution exploded. Nutanix became kind of an afterthought when it came to competing. So that spin-in, when we talk about the ability to innovate and the ability to create solutions that you just simply can't create because you don't have the full stack. Dell was well positioned to do that with a potential spin-in of VMware. >> Yeah, we're going to be-- Go ahead please. >> Yeah, in fact, I think you're right, Keith, it was terrible for the industry. Great for Dell. And I remember talking to Chad Sakac when he was running, you know, VCE, which became Rack and Rail, their ability to stay in lockstep with what VMware was doing. What was the number one workload running on hyperconverged forever? It was VMware. So their ability to remain in lockstep with VMware gave them a huge competitive advantage. And Dell came out of nowhere in, you know, the hyper-converged market and just started taking share because of that relationship. So, you know, this sort I guess it's, you know, from a Dell perspective I thought it gave them a pretty big advantage that they didn't really exploit across their other properties, right? Networking and servers and things like that, which they could have, given the dominance that VMware had. From an industry perspective though, I do think it's better to have them be decoupled. So. >> I agree. I mean, they could. 
I think they could have dominated in super cloud and maybe they would become the next Oracle where everybody hates 'em, but they kick ass. But guys. We got to wrap up here. And so what I'm going to ask you is I'm going to go and reverse the order this time, you know, big takeaways from this conversation today, which guys by the way, I can't thank you enough, phenomenal insights, but big takeaways, any final thoughts, any research that you're working on that you want to highlight or, you know, what you look for in the future? Try to keep it brief. We'll go in reverse order. Maybe Marc, you could start us off please. >> Sure, on the research front, I'm working on a total cost of ownership of an integrated database analytics machine learning versus separate services. The other aspect that I wanted to chat about real quickly, OPEX versus CapEx, the cloud changed the market perception of hardware in the sense that you can use hardware or buy hardware like you do software. As you use it, pay for what you use in arrears. The good thing about that is you're only paying for what you use, period. You're not paying for what you don't use. I mean, it's compute time, everything else. The bad side about that is you have no predictability in your bill. It's elastic, but every user I've talked to says every month it's different. And from a budgeting perspective, it's very hard to set up your budget year to year and it's causing a lot of nightmares. So it's just something to be aware of. From a CapEx perspective, you have no more CapEx if you're using that kind of base system but you lose a certain amount of control as well. So ultimately that's some of the issues. But my biggest point, my biggest takeaway from this is the biggest issue right now that everybody I talk to in some shape or form it comes down to data movement whether it be ETLs that you talked about Keith or other aspects moving it between hybrid locations, moving it within a system, moving it within a chip. 
All those are key issues. >> Great, thank you. Okay, CTO Advisor, give us your final thoughts. >> All right. Really, really great commentary. Again, I'm going to point back to us taking the walk that our customers are taking, which is trying to do this conversion of all primary data center to a hybrid, of which I have this hard-earned philosophy that enterprise IT is additive. When we add a service, we rarely subtract a service. So the landscape and service area of what we support has to grow. So our research focuses on taking that walk. We are taking a monolithic application, decomposing that to containers, and putting that in a public cloud, and connecting that back to the private data center and telling that story and walking that walk with our customers. This has been a super enlightening panel. >> Yeah, thank you. Real, real different world coming. David Nicholson, please. >> You know, it really hearkens back to the beginning of the conversation. You talked about momentum in the direction of cloud. I'm sort of spending my time under the hood, getting grease under my fingernails, focusing on where still the lion's share of spend will be in coming years, which is on-prem. And then of course, obviously data center infrastructure for cloud but really diving under the covers and helping folks understand the ramifications of movement between generations of CPU architecture. I know we all know Sapphire Rapids pushed into the future. When's the next Intel release coming? Who knows? We think, you know, in 2023. There have been a lot of people standing by from a practitioner's standpoint asking, well, what do I do between now and then? Does it make sense to upgrade bits and pieces of hardware or go from a last generation to a current generation when we know the next generation is coming? And so I've been very, very focused on looking at how these connectivity components like RAID controllers and NICs. 
I know it's not as sexy as talking about cloud but just how these components completely change the game and actually can justify movement from say a 14th-generation architecture to a 15th-generation architecture today, even though gen 16 is coming, let's say 12 months from now. So that's where I am. Keep my phone number in the Rolodex. I literally reference Rolodex intentionally because like I said, I'm in there under the hood and it's not as sexy. But yeah, so that's what I'm focused on Dave. >> Well, you know, to paraphrase it, maybe derivative paraphrase of, you know, Larry Ellison's rant on what is cloud? It's operating systems and databases, et cetera. RAID controllers and NICs live inside of clouds. All right. You know, one of the reasons I love working with you guys is 'cause you have such a wide observation space and Zeus Kerravala you, of all people, you know you have your fingers in a lot of pies. So give us your final thoughts. >> Yeah, I'm not as propeller-heady as my chip counterparts here. (all laugh) So, you know, I look at the world a little differently and a lot of my research I'm doing now is the impact that distributed computing has on customer and employee experiences, right? You talk to every business, and the experiences they deliver to their customers are really differentiating how they go to market. And so they're looking at these different ways of feeding up data and analytics and things like that in different places. And I think this is going to have a really profound impact on enterprise IT architecture. We're putting more data, more compute in more places all the way down to like little micro edges and retailers and things like that. And so we need the variety. Historically, if you think back to when I was in IT you know, pre-Y2K, we didn't have a lot of choice in things, right? We had a server that was rack mount or standup, right? And there wasn't a whole lot of, you know, differences in choice. 
But today we can deploy, you know, these really high-performance compute systems on little blades inside servers or inside, you know, autonomous vehicles and things. I think the world from here gets... You know, just the choice of what we have and the way hardware and software works together is really going to, I think, change the world the way we do things. We're already seeing that, like I said, in the consumer world, right? There's so many things you can do from, you know, smart home perspective, you know, natural language processing, stuff like that. And it's starting to hit businesses now. So just wait and watch the next five years. >> Yeah, totally. The computing power at the edge is just going to be mind blowing. >> It's unbelievable what you can do at the edge. >> Yeah, yeah. Hey Z, I just want to say that we know you're not a propeller head and I for one would like to thank you for having your master's thesis hanging on the wall behind you 'cause we know that you studied basket weaving. >> I was actually a physics math major, so. >> Good man. Another math major. All right, Bob O'Donnell, you're going to bring us home. I mean, we've seen the importance of semiconductors and silicon in our everyday lives, but your last thoughts please. >> Sure and just to clarify, by the way I was a great books major and this was actually for my final paper. And so I was like philosophy and all that kind of stuff and literature but I still somehow got into tech. Look, it's been a great conversation and I want to pick up a little bit on a comment Zeus made, which is this it's the combination of the hardware and the software and coming together and the manner with which that needs to happen, I think is critically important. And the other thing is because of the diversity of the chip architectures and all those different pieces and elements, it's going to be how software tools evolve to adapt to that new world. So I look at things like what Intel's trying to do with oneAPI. 
You know, what Nvidia has done with CUDA. What other platform companies are trying to do is create tools that allow them to leverage the hardware, but also embrace the variety of hardware that is there. And so as those software development environments and software development tools evolve to take advantage of these new capabilities, that's going to open up a lot of interesting opportunities that can leverage all these new chip architectures. That can leverage all these new interconnects. That can leverage all these new system architectures and figure out ways to make that all happen, I think is going to be critically important. And then finally, I'll mention the research I'm actually currently working on is on private 5G and how companies are thinking about deploying private 5G and the potential for edge applications for that. So I'm doing a survey of several hundred US companies as we speak and really looking forward to getting that done in the next couple of weeks. >> Yeah, look forward to that. Guys, again, thank you so much. Outstanding conversation. Anybody going to be at Dell Tech World in a couple of weeks? Bob's going to be there. Dave Nicholson. Well drinks on me and guys I really can't thank you enough for the insights and your participation today. Really appreciate it. Okay, and thank you for watching this special power panel episode of theCUBE Insights powered by ETR. Remember we publish each week on Siliconangle.com and wikibon.com. All these episodes they're available as podcasts. DM me or any of these guys. I'm at DVellante. You can email me at David.Vellante@siliconangle.com. Check out etr.ai for all the data. This is Dave Vellante. We'll see you next time. (upbeat music)

Published Date : Apr 25 2022

Ion Stoica, Databricks - Spark Summit East 2017 - #sparksummit - #theCUBE


 

>> [Announcer] Live from Boston Massachusetts. This is theCUBE. Covering Spark Summit East 2017. Brought to you by Databricks. Now here are your hosts, Dave Vellante and George Gilbert. >> [Dave] Welcome back to Boston everybody, this is Spark Summit East #SparkSummit And this is theCUBE. Ion Stoica is here. He's Executive Chairman of Databricks and Professor of Computer Science at UC Berkeley. The smarts is rubbing off on me. I always feel smart when I co-host with George. And now having you on is just a pleasure, so thanks very much for taking the time. >> [Ion] Thank you for having me. >> So loved the talk this morning, we learned about RISELab, we're going to talk about that. Which is the son of AMP. You may be the father of those two, so. Again welcome. Give us the update, great keynote this morning. How's the vibe, how are you feeling? >> [Ion] I think it's great, you know, thank you and thank everyone for attending the summit. It's a lot of energy, a lot of interesting discussions, and a lot of ideas around. So I'm very happy about how things are going. >> [Dave] So let's start with RISELab. Maybe take us back, for those who don't understand, so the birth of AMP and what you were trying to achieve there and what's next. >> Yeah, so the AMP was a six-year project at Berkeley, and it involved around eight faculty and, over the duration of the lab, around 60 students and postdocs. And the mission of the AMPLab was to make sense of big data. AMPLab started in 2009, at the end of 2009, and the premise is that in order to make sense of this big data, we need a holistic approach, which involves algorithms, in particular machine-learning algorithms, machines, means systems, large-scale systems, and people, crowd sourcing. And more precisely the goal was to build a stack, a data analytic stack for interactive analytics, to be used across industry and academia. And, of course, being at Berkeley, it has to be open source. 
(laugh) So that's basically what AMPLab was and it was a birthplace for Apache Spark, that's why you are all here today. And a few other open-source systems like Mesos, Apache Mesos, and Alluxio which was previously called Tachyon. And so AMPLab ended in December last year and in January, this January, we started a new lab which is called RISE. RISE stands for Real-time Intelligent Secure Execution. And the premise of the new lab is that actually the real value in the data is the decision you can make on the data. And you can see this more and more at almost every organization. They want to use their data to make some decision to improve their business processes, applications, services, or come up with new applications and services. But then if you think about that, what does it mean that the emphasis is on the decision? Then it means that you want the decision to be fast, because fast decisions are better than slower decisions. You want decisions to be on fresh data, on live data, because decisions on the data I have right now are better than decisions on the data from yesterday, or last week. And then you also want to make targeted, personalized decisions. Because the decisions on personal information are better than on aggregate information. So that's the fundamental premise. So therefore you want to build platforms, tools and algorithms to enable intelligent real-time decisions on live data with strong security. And the security is a big emphasis of the lab because it means to provide privacy, confidentiality and integrity, and you hear about data breaches or things like that every day. So for an organization, it is extremely important to provide privacy and confidentiality to their users and it's not only because the users want that, but it also indirectly can help them to improve their service. Because if I guarantee your data is confidential with me, you are probably much more willing to share some of your data with me. 
And if you share some of the data with me, I can build and provide better services. So that's basically in a nutshell what the lab is and what the focus is. >> [Dave] Okay, so you said three things: fast, live and targeted. So fast means you can affect the outcome. >> Yes. Live data means it's better quality. And then targeted means it's relevant. >> Yes. >> Okay, and then my question on security, I felt like when cloud and Big Data came to the fore, security became a do-over. (laughter) Is that a fair assessment? Are you doing it over? >> [George] Or as Bill Clinton would call it, a Mulligan. >> Yeah, if you get a Mulligan on security. >> I think security is, it's always a difficult topic because it means so many things for so many people. >> Hmm-mmm. >> So there are instances where actually cloud is quite secure. Actually, cloud can be more secure than some on-prem deployments. In fact, if you hear about these data leaks or security breaches, you don't hear them happening in the cloud. And there is some reason for that, right? It is because they have trained people, you know, they are paranoid about this, they do a specification maybe much more often and things like that. But still, you know, the state of security is not that great. Right? For instance, if I compromise your operating system, whether it's in the cloud or not in the cloud, I can do anything. Right? Or your VM, right? On all this cloud you run on a VM. And now you are going to run on some containers. Right? So it's a lot of attacks, or there are attacks, sophisticated attacks, where even if your data is encrypted, if I can look at the access patterns, how much data you transferred, or how much data you access from memory, then I can infer something about what you are doing, about your queries, right? If it's more data, maybe it's a query on New York. If it's less data it's probably maybe something smaller, like maybe something at Berkeley. 
So you can infer from multiple queries just looking at the access. So it's a difficult problem. But fortunately again, there are some new technologies which are developed and some new algorithms which give us some hope. One of the most interesting technologies which is happening today is hardware enclaves. So with hardware enclaves you can execute the code within this enclave which is hardware protected. And even if your operating system or VM is compromised, you cannot access the code which runs in this enclave. And Intel has Intel SGX and we are working and collaborating with them actively. ARM has TrustZone and AMD also announced they are going to have a similar technology in their chips. So that's kind of a very interesting and very promising development. I think the other aspect, it's a focus of the lab, is that even if you have the enclaves, it doesn't automatically solve the problem. Because the code itself may have a vulnerability. Yes, I can run the code in a hardware enclave, but the code can send out >> Right. >> data outside. >> Right, the enclave is a more granular perimeter. Right? >> Yeah. So yeah, so we are looking, and the security experts in our lab are looking at this, maybe how to split the application so you run only a small part in the enclave, which is a critical part, and you can make sure that also the code is secure, and the rest of the code you run outside. But the rest of the code, it's only going to work on data which is encrypted. Right? So there is a lot of interesting research but that's good. >> And does Blockchain fit in there as well? >> Yeah, I think Blockchain it's a very interesting technology. And again it's real-time and in that area there are also very interesting directions. >> Yeah, right. >> Absolutely. >> So you guys, I want George, you've shared with me sort of what you were calling a new workload. So you had batch and you have interactive and now you've got continuous- >> Continuous, yes. 
>> And I know that's a topic that you want to discuss and I'd love to hear more about that. But George, tee it up. >> Well, okay. So we were talking earlier and the objective of RISE is fast and continuous-type decisions. And this is different from the traditional, you either do it batch or you do it interactive. So maybe tell us about some applications where that is one workload among the other traditional workloads. And then let's unpack that a little more. >> Yeah, so I'll give you a few applications. So it's more than continuously interacting with the environment; you also learn continuously. I'll give you some examples. So for instance in one example, think about you want to detect a network security attack, and respond and diagnose and defend in real time. So what this means is that you need to continuously get logs from the network, and the more endpoints you can get them from, the better. Right? Because more data will help you to detect things faster. But then you need to detect the new pattern and you need to learn the new patterns. Because new security attacks, which are the ones that are effective, are slightly different from the past ones because you hope that you already have the defense in place for the past ones. So now you are going to learn that and then you are going to react. You may push patches in real time. You may push filters, installing new filters to firewalls. So that's kind of one application that's going in real time. Another application can be about self driving. Now self driving has made tremendous strides. And a lot of algorithms, you know, very smart algorithms, now they are implemented on the cars. Right? All the system is on the cars. But imagine now that you want to continuously get the information from these cars, aggregate and learn and then send back the information you learned to the cars. 
Like for instance if it's an accident or a roadblock, an object which is dropped on the highway, so you can learn from the other cars what they've done in that situation. It may mean in some cases the driver took an evasive action, right? Maybe you can monitor also the cars which are not self-driving, but driven by the humans. And then you learn that in real time and then the other cars which follow through the same, confronted with the same situation, they now know what to do. Right? So this is again, I want to emphasize this. Not only continuously sensing the environment and making the decisions, but a very important component is learning. >> Let me take you back to the security example as I sort of process the auto one. >> Yeah, yeah. >> So in the security example, it doesn't sound like, I mean if you have a vast network, you know, end points, software, infrastructure, you're not going to have one God model looking out at everything. >> Yes. >> So I assume that means there are models distributed everywhere and they don't necessarily know what an entirely new attack pattern looks like. So in other words, for that isolated model, it doesn't know what it doesn't know. I don't know if that's what Rumsfeld called it. >> Yes (laughs). >> How does it know what to pass back for retraining? >> Yes. Yes. Yes. So there are many aspects and there are many things you can look at. And it's again, it's a research problem, so I cannot give you the solution now, I can hypothesize and I give you some examples. But for instance, you can look at, and correlate by observing, the effect. Some of the effects of the attack are visible. In some cases, denial of service attack. That's pretty clear. Even the ... and so forth, they may cause computers to crash, right? So once you see some of this kind of anomaly, right, anomalies on the end devices, end hosts and things like that. Maybe reported by humans, right? 
Then you can try to correlate with what kind of traffic you've got. Right? And from there, from that correlation, probably you can, and hopefully, you can develop some models to identify what kind of traffic. Where it comes from. What is the content, and so forth, which causes behavior, anomalous behavior. >> And where is that correlation happening? >> I think it will happen everywhere, right? Because- >> At the edge and at the center. >> Absolutely. >> And then I assume that it sounds like the models both at the edge and at the center are ensemble models. >> Yes. >> Because you're tracking different behavior. >> Yes. You are going to track different behavior and you are going to, I think that's a good hypothesis. And then you are going to assemble them, assemble to come up with the best decision. >> Okay, so now let's wind forward to the car example. >> Yeah. >> So it sounds like there's a mesh network, at least, Peter Levine's sort of talk was there's near-local compute resources and you can use bitcoin to pay for it or Blockchain or however it works. But that sort of topology, we haven't really encountered before in computing, have we? And how imminent is that sort of ... >> I think that some of the stuff you can do today in the cloud. I think if you want super-low latency probably you need to have more computation towards the edges, but if I'm thinking that I want kind of reactions in tens, hundreds of milliseconds, in theory you can do it today with the cloud infrastructure we have. And if you think about it, in many cases, if you can do it within a few hundreds of milliseconds, it's still super useful. Right? To avoid this object which has dropped on the highway. You know, if I have a few hundred milliseconds, many cars will effectively avoid that, having that information. >> Let's have that conversation about the edge a little further. The one we were having off camera. 
So there's a debate in our community about how much data will stay at the edge, how much will go into the cloud, David Floyer said 90% of it will stay at the edge. Your comment was, it depends on the value. What do you mean by that? >> I think that that depends on who I am and how I perceive the value of the data. And, you know, what can be the value of the data? This is what I was saying. I think that value of the data is fundamentally what kind of decisions, what kind of actions it will enable me to take. Right? So here I'm not just talking about, you know, credit card information or things like that, even there, there is an action somebody's going to take on that. So if I do believe that the data can provide me with ability to take better actions or make better decisions I think that I want to keep it. And why I want to keep it is because it's not only the decisions it enables me to make now, but everyone is going to continuously improve their algorithms. Develop new algorithms. And when you do that, how do you test them? You test on the old data. Right? So I think that for all these reasons, a lot of data, valuable data in this sense, is going to go to the cloud. Now, is there a lot of data that should remain on the edges? And I think that's fair. But it's, again, if a cloud provider, or someone who provides a service in the cloud, believes that the data is valuable. I do believe that eventually it is going to get to the cloud. >> So if it's valuable, it will be persisted and will eventually get to the cloud? And we talked about latency, but latency, the example of evasive action. You can't send the data back to the cloud and make the decision, you have to make it real time. But eventually that data, if it's important, will go back to the cloud. The other question of all this data that we are now processing on a continuous basis, how much actually will get persisted, most of it, much of it probably does not get persisted. Right? Is that a fair assumption? 
>> Yeah, I think so. And probably all the data is not equal. All right? It's like you want to maybe, even if you take a continuous video, all right? On the cars, they continuously have videos from multiple cameras and radar and lidar, all of this stuff. This is continuous. And if you think about this one, I would assume that you don't want to send all the data to the cloud. But the data around the interesting events, you may want to send, right? So before and after the car has a near-accident, or took an evasive action, or the human had to intervene. So in all these cases, probably I want to send the data to the cloud. But for the most cases, probably not. >> That's good. We have to leave it there, but I'll give you the last word on things that are exciting you, things you're working on, interesting projects. >> Yeah, so I think what really excites me is how we are going to have this continuous application, you are going to continuously interact with the environment. You are going to continuously learn and improve. And here there are many challenges. And I just want to say a few more, which we haven't discussed. One, in general it's about explainability. Right? If these systems augment the human decision process, if these systems are going to make decisions which impact you as a human, you want to know why. Right? Like I gave this example, assuming you have machine-learning algorithms making a diagnosis on your MRI, or x-ray. You want to know why. What in this x-ray causes that decision? If you go to the doctor, they are going to point and show you. Okay, this is why you have this condition. So I think this is very important. Because as a human you want to understand. And you want to understand not only why the decision happens, but you want also to understand what you need to do to do better in the future, right? Like if your mortgage application is turned down, I want to know why is that? 
Because next time when I apply for the mortgage, I want to have a higher chance to get it through. So I think that's a very important aspect. And the last thing I will say, and this is super important, is about having algorithms which can say I don't know. Right? It's like, okay I never have seen this situation in the past. So I don't know what to do. This is much better than giving you just the wrong decision. Right? >> Right, or a low probability that you don't know what to do with. (laughs) >> Yeah. >> Excellent. Ion, thanks again for coming on theCUBE. It was really a pleasure having you. >> Thanks for having me. >> You're welcome. All right, keep it right there everybody. George and I will be back to do our wrap right after this short break. This is theCUBE. We're live from Spark Summit East. Right back. (techno music)
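
Ion's closing point about algorithms that can say "I don't know" can be illustrated with a small sketch. This is only a minimal example under stated assumptions, not the lab's method: it assumes a model that already outputs one probability per class, and the function name predict_or_abstain and the 0.8 threshold are hypothetical choices made for illustration. A plain confidence threshold is also only a rough stand-in, since a model can be confidently wrong on a situation it has never seen.

```python
from typing import Optional, Sequence


def predict_or_abstain(
    probs: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.8,  # hypothetical cutoff, chosen only for illustration
) -> Optional[str]:
    """Return the most likely label, or None to signal "I don't know".

    probs are assumed to be class probabilities from some upstream model.
    """
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and the same length")

    # Index of the highest-probability class.
    best = max(range(len(probs)), key=lambda i: probs[i])

    # Abstain instead of returning a low-confidence guess.
    if probs[best] < threshold:
        return None
    return labels[best]


if __name__ == "__main__":
    classes = ["benign", "attack", "unknown-pattern"]
    print(predict_or_abstain([0.05, 0.90, 0.05], classes))
    print(predict_or_abstain([0.40, 0.35, 0.25], classes))
```

Returning None rather than the top label keeps the low-probability answer that Dave mentions from reaching the caller, who can then fall back to a human or to a safer default.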

Published Date : Feb 8 2017

SENTIMENT ANALYSIS :

ENTITIES

Entity | Category | Confidence
David Flores | PERSON | 0.99+
George | PERSON | 0.99+
George Gilbert | PERSON | 0.99+
Dave Vellante | PERSON | 0.99+
2009 | DATE | 0.99+
Peter Levine | PERSON | 0.99+
Bill Clinton | PERSON | 0.99+
New York | LOCATION | 0.99+
90% | QUANTITY | 0.99+
January | DATE | 0.99+
AMB | ORGANIZATION | 0.99+
last week | DATE | 0.99+
Dave | PERSON | 0.99+
yesterday | DATE | 0.99+
Ion | PERSON | 0.99+
ARM | ORGANIZATION | 0.99+
Boston | LOCATION | 0.99+
six-year | QUANTITY | 0.99+
December last year | DATE | 0.99+
Databricks | ORGANIZATION | 0.99+
three things | QUANTITY | 0.99+
Boston Massachusetts | LOCATION | 0.99+
one example | QUANTITY | 0.99+
two | QUANTITY | 0.98+
UCal Berkeley | ORGANIZATION | 0.98+
Berkeley | LOCATION | 0.98+
AMPLab | ORGANIZATION | 0.98+
Ion Stoica | PERSON | 0.98+
tens, hundreds of milliseconds | QUANTITY | 0.98+
today | DATE | 0.97+
end of 2009 | DATE | 0.96+
Rumsfeld | PERSON | 0.96+
Intel | ORGANIZATION | 0.96+
Intell | ORGANIZATION | 0.95+
both | QUANTITY | 0.95+
One | QUANTITY | 0.95+
AMP | ORGANIZATION | 0.94+
TrustZone | ORGANIZATION | 0.94+
Spark Summit East 2017 | EVENT | 0.93+
around 60 students | QUANTITY | 0.93+
RISE | ORGANIZATION | 0.93+
Sparks Summit East 2017 | EVENT | 0.92+
one | QUANTITY | 0.89+
one workload | QUANTITY | 0.88+
Spark Summit East | EVENT | 0.87+
Apache Spark | ORGANIZATION | 0.87+
around eight faculties | QUANTITY | 0.86+
this January | DATE | 0.86+
this morning | DATE | 0.84+
Mulligan | ORGANIZATION | 0.78+
few hundredths of milliseconds | QUANTITY | 0.77+
Professor | PERSON | 0.74+
God | PERSON | 0.72+
theCUBE | ORGANIZATION | 0.7+
few hundred milliseconds | QUANTITY | 0.67+
SGX | COMMERCIAL_ITEM | 0.64+
Mesos | ORGANIZATION | 0.63+
one application | QUANTITY | 0.63+
Apache Mesos | ORGANIZATION | 0.62+
Alluxio | ORGANIZATION | 0.62+
AMPLab | EVENT | 0.59+
Tachyon | ORGANIZATION | 0.59+
#SparkSummit | EVENT | 0.57+