
Action Item | Big Data SV Preview Show - Feb 2018


 

>> Hi, I'm Peter Burris and once again, welcome to a Wikibon Action Item. (lively electronic music) We are again broadcasting from the beautiful theCUBE Studios here in Palo Alto, California, and we're joined today by a relatively larger group. So, let me take everybody through who's here in the studio with us. David Floyer, George Gilbert, and once again we've been joined by John Furrier, who's one of the key CUBE hosts, and on the remote system is Jim Kobielus, Neil Raden, and another CUBE host, Dave Vellante. Hey guys.

>> Hi there.

>> Good to be here.

>> Hey.

>> So, one of the reasons why we have a little bit larger group here is because we're going to be talking about a community gathering that's taking place in the big data universe in a couple of weeks. Large numbers of big data professionals are going to be descending upon Strata for the purposes of better understanding what's going on within the big data universe. Now, we run a CUBE show next to that event, in which we get the best thought leaders that are possible at Strata, bring them onto theCUBE, and really help separate the signal from the noise that Strata has historically represented. We want to use this show to preview what we think that signal's going to be, so that we can help the community better understand what to look for, where to go, and what kinds of things to be talking about with each other, so that it can get more out of that important event. Now, George, with that in mind, what's kind of the top-level thing? If there was one thing we'd identify as something that was different two years ago or a year ago, and that's going to be different at this show, what would we say it would be?

>> Well, I think the big realization here is that we're starting with the end in mind. We know the modern operational analytic applications that we want to build: ones that anticipate or influence a user interaction, or inform or automate a business transaction. And for several years we were experimenting with big data infrastructure, but it wasn't solution-centric, it was technology-centric. And we kind of realized that the do-it-yourself, assemble-your-own-kit, open-source big data infrastructure created too big a burden on admins. Now we're at the point where we're beginning to see a more converged set of offerings take place. And by converged, I mean an end-to-end analytic pipeline that is uniform for developers, uniform for admins, and, because it's pre-integrated, is lower latency. It helps you put more data through one single analytic latency budget. That's what we think people should look for. Right now, though, the hottest new tech-centric activity is around Machine Learning, and I think the big thing we have to do is recognize that we're sort of at the same maturity level as we were with big data several years ago. And people should, if they're going to work with it, start with the knowledge, for the most part, that they're going to be experimenting, 'cause the tooling isn't quite mature enough, and we don't have enough data scientists for people to be building all these pipelines bespoke. And the third-party applications, we don't have a high volume of them where this is embedded yet.
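A quick aside on George's phrase "one single analytic latency budget," which rewards a concrete illustration. The sketch below is hypothetical: the stage names, timings, and hand-off costs are invented for illustration, not measurements of any product. The point it makes is the one George is arguing: a pre-integrated pipeline avoids per-boundary hand-off costs between separately assembled components, so more analytic work fits inside the same end-to-end budget.

    # Sketch: apportioning one end-to-end analytic latency budget across
    # pipeline stages. All stage names and timings are illustrative.
    BUDGET_MS = 100.0  # e.g., influence a user interaction within 100 ms

    # A do-it-yourself stack pays a serialization/hand-off cost at each boundary
    # between independently assembled components; a pre-integrated pipeline
    # largely avoids it.
    stages = {"ingest": 10.0, "transform": 15.0, "feature_lookup": 20.0, "score": 25.0}

    def spent(stages, handoff_ms):
        return sum(stages.values()) + handoff_ms * (len(stages) - 1)

    for label, handoff in [("do-it-yourself", 8.0), ("pre-integrated", 1.0)]:
        used = spent(stages, handoff)
        print(f"{label:15s}: {used:5.1f} ms used, {BUDGET_MS - used:5.1f} ms of the "
              f"{BUDGET_MS:.0f} ms budget left for deeper analytics")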
>> So if I can kind of summarize what you're saying, we're seeing a bifurcation occur within the ecosystem associated with big data. It's driving toward simplification on the infrastructure side, which increasingly is being associated with the term big data, and toward new technologies that can apply that infrastructure and that data to new applications, including things like AI, ML, and DL, where we think about modeling and services, and a new way of building value. Now, that suggests that one or the other is more or less hot, but Neil Raden, I think the practical reality is that here in Silicon Valley, we've got to be careful about getting too far out in front of our skis. At the end of the day, there's still a lot of work to be done inside big enterprises on how you simply do things like move data from one place to the other. Would you agree with that?

>> Oh, absolutely. I've been talking to a lot of clients this week and, you know, we don't talk about the fact that they're still running their business on what we would call legacy systems, and they don't know how to, you know, get out of them or transform from them. So they're still starting to plan for this. But the problem is, you know, it's like talking about the 27 rocket engines on that rocket that launched a Tesla into space. You can talk about the engineering of those engines and that's great, but what about all the other things you're going to have to do to get that (laughs) car into space? And it's the same thing. A year ago, we were talking about Hadoop and big data and, to a certain extent, Machine Learning, maybe more data science. But now people are really starting to say: how do we actually do this, how do we secure it, how do we govern it, how do we get some sort of metadata or semantics on the data we're working with so people know what they're using? I think that's where we are in a lot of companies.

>> Great, so that's great feedback, Neil. So as we look forward, Jim Kobielus, given the challenges associated with what it means to improve the facilities of your infrastructure, but also to use that as a basis for increasing your capability with some of the new application services, what should folks be looking for as they explore the show in the next couple of weeks on the ML side? What new technologies, what new approaches? Going back to what George said, we're in experimentation mode. What are going to be the experiments that generate the greatest results over the course of the next year?

>> Yeah, for the data scientists who flock to Strata and similar conferences, automation of the Machine Learning pipeline is super hot in terms of investments by the solution providers. Everybody from Google to IBM to AWS, and others, is investing very heavily in automating not just the data engineering, a problem that was tackled a long time ago, but more of the feature engineering and the training. These very manual, often labor-intensive jobs have to be sped up and automated to a great degree to enable the magic of productivity by the data scientists and the new generation of app developers. So look for automation of Machine Learning to be a super hot focus. Related to that, look for a new generation of development suites that focus on DevOps, speeding Machine Learning, DL, and AI from modeling through training, evaluation, deployment, and iteration.
We've seen a fair upswing in the number of such toolkits on the market from a variety of startup vendors, like the DataRobots of the world, but also coming from, say, AWS with SageMaker, for example; that's hot. Also, look for development toolkits that automate more of the code generation. You know, low-code tools, but the new generation of low-code tools, as highlighted in a recent Wikibon study, use ML to drive more of the actual production of fairly decent, good-enough code as a first rough prototype for a broad range of applications. And finally, we're seeing a fair amount of ML-driven code generation inside of things like robotic process automation, RPA, which I believe will probably be a super hot theme at Strata and other shows this year going forward.

>> So you mentioned the idea of better tooling for DevOps, and the relationship between big data and ML, and what not, and DevOps. One of the key things that we've been seeing over the course of the last few years, and it's consistent with the trends that we're talking about, is increasing specialization in a lot of the perspectives associated with changes within this marketplace. So we've seen other shows emerge that have been very, very important, that we, for example, are participating in. Take Splunk, for example, which is the vanguard, in many respects, of a lot of these trends in big data and how big data can be applied to business problems. Dave Vellante, I know you've been participating in a number of these shows. How does this notion of specialization inform what's going to happen in San Jose, and what kind of advice and counsel should we give people to continue to explore beyond just what's going to happen in San Jose in a couple weeks?

>> Well, you mentioned Splunk as an example, a very sort of narrow and specialized company that solves a particular problem and has a very enthusiastic ecosystem and customer base around that problem: log files to solve security problems, for example. I would say Tableau is another example, you know, heavily focused on viz. So what you're seeing is these specialized skillsets that go deep within a particular domain. The thing to think about, especially when we're in San Jose next week, as we talk about digital disruption, is: what are the skillsets required beyond just the domain expertise? You're sort of seeing bifurcated skillsets really coming into vogue, where somebody understands, for example, traditional marketing, but they also need to understand digital marketing in great depth, and the skills that go around it. So there's sort of a two-tool player; we talk about five-tool players in baseball; at least a multidimensional skillset in digital.

>> And that's likely to occur not just in a place like marketing, but across the board. David Floyer, as folks go to the show and start to look more specifically at this notion of convergence, are there particular things that they should think about? To come back to the notion that, well, you know, hardware is going to make things more or less difficult for what the software can do, and software is going to be created that will fill up the capabilities of hardware. What are some of the underlying hardware realities that folks going to the show need to keep in mind as they evaluate, especially on the infrastructure side, these different infrastructure technologies that are getting more specialized?
>> Well, if we look historically at the big data area, the solution has been to put in very low-cost equipment as nodes, lots of different nodes, and move the data to those nodes so that you get a parallelization of the data handling. That is not the only way of doing it. There are good ways now where you can, in fact, have a single version of that data in one place, on very high-speed storage, on flash storage, for example, and where you can allow very fast communication from all of the nodes directly to that data. And that makes things a lot simpler from an operational point of view. So taking current batch automation techniques that are in existence, and looking at those from a new perspective, which is: how do I apply these to big data, how do I automate these things? That can make a huge difference in just the practicality and the elapsed time for some of these large training jobs, for example.

>> Yeah, I was going to say that in many respects, what you're talking about is bringing things like training under a more traditional--

>> David: Operational, yeah.

>> --approach and operational set of disciplines.

>> David: Yes, that's right.

>> Very, very important. So John Furrier, I want to come to you and say that there are some other technologies that, while they're the bright shiny objects and people think that they're going to be the new kind of Harry Potter technologies of magic everywhere, Blockchain is certainly going to become folded into this big data concept, because Blockchain describes how contracts, ownership, and authority ultimately get distributed. What should folks look for as Blockchain starts to become part of these conversations?

>> That's a good point, Peter. My summary of the preview for BigData SV Silicon Valley, which includes the Strata show, is two things: Blockchain points to the future, and GDPR points to the present. GDPR is probably one of the most fundamental impacts to the big data market in a long time. People have been working on it for a year. It is a nightmare. The technical underpinning of what companies have to do to comply with GDPR is a moving train, and it's complete BS. There are no real solutions out there, so if I was going to tell everyone what to think about and what to look for: what is happening with GDPR, what's the impact on the databases, what's the impact on the architectures? Everyone is faking it 'til they make it. No one really has anything, in my opinion from what I can see, so it's a technical nightmare. Where was that database? So it's going to impact how you store the data, and the sovereignty issue is another issue. The Blockchain then points to the sovereignty issue of the data, in terms of the company, the country, and the user. These things are going to impact software development, application development, and, ultimately, cloud choice and the IoT. So to me, GDPR is not just a one-and-done thing, and Blockchain is kind of a future thing to look at. I would look out of those two lenses and ask: do you have a direction or a narrative that supports me today with what GDPR will impact throughout the organization? And then, what's going on with this new decentralized infrastructure and the role of data, and the sovereignty of that data, with respect to company, country, and user? So to me, that's the big issue.
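To make John's "where was that database?" point concrete, here is a minimal, hypothetical sketch of the kind of check GDPR forces into data pipelines: every record carries residency and consent metadata, and processing in the wrong region, or for an unconsented purpose, is refused. The field names and rules are illustrative assumptions, not a compliance implementation.

    from dataclasses import dataclass

    @dataclass
    class Record:
        subject_id: str
        home_region: str           # where the subject's data must stay, e.g. "EU"
        consented_purposes: set    # purposes the subject agreed to

    def may_process(record: Record, processing_region: str, purpose: str) -> bool:
        # Sovereignty: EU-resident data stays in the EU (deliberately simplified).
        if record.home_region == "EU" and processing_region != "EU":
            return False
        # Purpose limitation: only purposes the subject consented to.
        return purpose in record.consented_purposes

    r = Record("u123", "EU", {"billing", "support"})
    print(may_process(r, "US", "billing"))    # False: wrong region
    print(may_process(r, "EU", "marketing"))  # False: no consent
    print(may_process(r, "EU", "billing"))    # True

The design point the sketch implies is John's point: residency and consent have to travel with the data, which touches storage layout, replication, and application logic all at once.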
>> So George Gilbert, if we think about this question of these fundamental technologies that are going to become increasingly important here: database managers are not dead as a technology. We've seen a relative explosion over the last few years, at least in invention, even if it hasn't been followed, as Neil talked about, with very practical ways of bringing new types of disciplines into a lot of enterprises. What's going to happen with the database world, and what should people be looking for in a couple of weeks to better understand how some of these data management technologies are going to converge and/or evolve?

>> It's a topic that will be of intense interest and relevance to IT professionals, because it's become the common foundation of all modern apps. But I think we can see, for instance, a leading indicator of what's going to happen with the legacy vendors, where we have in-memory technologies for both transaction processing and analytics, and we have more advanced analytics embedded in the database engine, including Machine Learning, the model training as well as the model serving. But what happened in the big data community is that we disassembled the DBMS. There's the data manipulation language, which is an analytic language: could be Spark, could be Flink, even Hive. There's the catalog, which I think Jim has talked about or will be talking about, where it's not just a dictionary of what's in one DBMS, but a whole way of tracking and governing data across many stores. And then there's the storage manager: could be the file system, an object store, or something like Kudu, which is an MPP way of performing, in parallel, a bunch of operations on stored data. The reason I bring all this up is, following on David's comment about the evolution of hardware, databases are fundamentally meant to expose capabilities in the hardware and to mediate access to data using those hardware capabilities. And now we have what's emerging as this UniGrid, with memory-intensive architectures and super low latency to get from any point or node on that cluster to any other node, with only a five-microsecond lag relative to previous architectures. We can now build databases that scale out with the same knowledge base that we used to build databases that scale up. In other words, it democratizes the ability to build databases of enormous scale, and that means we can have the analytics and the transactions working together at very low latency.

>> Without binding them. Alright, so I think it's time for the action items. We got a lot to do, so guys, keep it really tight, really simple. David Floyer, let me start with you. Action item.

>> So the action item on big data should be: focus on technologies that are going to reduce the elapsed time of solutions in the data center, and there are many of them. It's becoming a production problem; treat it as a production problem, and put in the fundamental procedures and technologies to succeed.

>> And look for vendors--

>> Who can do that, yes.

>> --that do that. George Gilbert, action item.

>> So I talked about convergence before.
The converged platform's center of gravity is now shifting to continuous processing, where the data lake is a reference data repository that helps inform the creation of models, but then you run the models against the streaming continuous data for the freshest insights--

>> Okay, Jim Kobielus, action item.

>> Yeah, focus on developer productivity in this new era of big data analytics. Specifically, focus on the next generation of developers, who are data scientists, and focus on automating most of what they do, so they can focus on solving problems and sifting through data. Put all the grunt work of training, and all that stuff, onto the infrastructure, the tooling.

>> Peter: Neil Raden, action item.

>> Well, one thing I learned this week is that everything we're talking about is about the analytical problem, which is how do you make better decisions and take action? But companies still run on transactions, and it seems like we're running on two different tracks and no one's talking about the transactions anymore. We're like the tail wagging the dog.

>> Okay, John Furrier, action item.

>> Action item is: dig into GDPR. It is a really big issue. If you're not proactive, it could be a nightmare. It's going to have implications that are going to be far-reaching in the technical infrastructure; like Sarbanes-Oxley and what that did for public companies, this is going to be a nightmare. And evaluate the impact of Blockchains. Two things.

>> David Vellante, action item.

>> So we often say that digital is data, and just because your industry hasn't been upended by digital transformations, don't think it's not coming. So it's maybe comfortable to sit back and say, well, we're going to wait and see. Don't sit back and wait and see. All industries are susceptible to digital transformation.

>> Alright, so I'll give the action item for the team. We've talked a lot about what to look for in the community gathering that's taking place next week in Silicon Valley around Strata. Our observations for the community, as it descends upon us, and what to look for: number one, we're seeing a bifurcation in the marketplace, in the thought leadership, and in the tooling. One group is going more after the infrastructure, where the focus is on simplification and convergence; another group is going more after the developer, AI, and ML, where the focus is on how to create models, train those models, and build applications with the services associated with those models. Look for that. And be careful about vendors who say that they do it all, and about vendors who say that they don't have to participate in a converged approach to doing this. The second thing to look for, very importantly, is that the role of data is evolving, and data is becoming an asset. The tooling for driving velocity of data through systems and applications is going to become increasingly important, and so is the discipline necessary to ensure that the business can do that with a high degree of predictability as it brings new production systems online. A third area to take a look at: ultimately, the impact of this notion of data as an asset is really going to come home to roost in 2018 through things like GDPR. As you scan the show, ask a simple question: who here is going to help me get to compliance and sustain compliance, as the understanding of privacy, ownership, etc. of data in a big data context starts to evolve? Because there's going to be a lot of specialization over the next few years. And there's a final one that we might add: when you go to the show, do not just focus on your favorite brands. There's a lot of new technology out there, including things like Blockchain, and it's going to have an enormous impact, ultimately, on how this marketplace unfolds. The kind of miasma that's occurred in big data is starting to specialize, it's starting to break down, and that's creating new niches and new opportunities for new sources of technology, while at the same time reducing the focus that we currently have on things like Hadoop as a centerpiece. A lot of convergence is going to create a lot of new niches, and that's going to require new partnerships, new practices, new business models. Once again, guys, I want to thank you very much for joining me on Action Item today. This is Peter Burris from our beautiful Palo Alto theCUBE Studio. This has been Action Item. (lively electronic music)
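George's action item above, models informed by the data lake but run against the continuous stream, can be sketched in a few lines. This is a hedged illustration: the toy data, the model choice, and the plain loop standing in for a real Kafka or Kinesis consumer are all assumptions for illustration.

    # Fit the model on the data lake's historical (reference) data, then
    # score live events with it. Data and model are illustrative.
    from sklearn.linear_model import LogisticRegression

    X_hist = [[0.1, 3.0], [0.4, 1.0], [0.9, 0.5], [0.7, 2.5]]  # "data lake" rows
    y_hist = [0, 0, 1, 1]                                      # historical outcomes
    model = LogisticRegression().fit(X_hist, y_hist)

    def on_event(features):
        # Called once per streaming event: the freshest data meets the trained model.
        return model.predict_proba([features])[0][1]

    for event in ([0.8, 0.6], [0.2, 2.8]):  # stand-in for a streaming consumer loop
        print(event, "->", round(on_event(event), 3))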

Published Date : Feb 24 2018


Action Item | Converged & Hyper Converged Infrastructure


 

Hi, I'm Peter Burris, and welcome to Wikibon's Action Item. (electronic music) Every week, we bring together the Wikibon research team and we present the action items that we believe are most crucial for users to focus on against very important topics. This week, I'm joined by George Gilbert and David Floyer here in the Cube studios in Palo Alto, and on the phone we have Ralph Phinos, Dave Vellante, and Jim Kobielus. Thank you guys, thank you team for being part of today's conversation. What we're going to talk about today in Action Item is the notion of what we're calling enterprise hyperscale. Now, we're going to take a route to get there that touches upon many important issues, but fundamentally the question is: at what point should enterprises choose to deploy their own hardware at scale to support applications that will have a consequential business impact on their shareholder, customer, and employee value? Now to kick us off here, 'cause this is a very complex topic and it involves a lot of different elements, David Floyer, first question to you. What is the core challenge that enterprises face today as they think about build, buy, or rent across this increasingly blurred hardware continuum, or system continuum?

>> So the biggest challenge with the traditional way that enterprises have put together systems is that the cost and the time to manage those systems are going up and up, as we go from just systems of record, with analytic systems mainly in batch mode, towards systems of intelligence, where real-time analytics combine with the systems of record. The systems and the software layers are getting more and more complicated, and it takes more and more time and effort and elapsed time to keep things current.

>> Why is it that not everybody can do this, David? Is there a fundamental economic reason at play here?

>> Well, if you take systems and build them yourself and put them together yourself, you'll always end up with the cheapest system. The issue is that the cost of maintaining those systems, and even more, the elapsed-time cost of maintaining those systems, the time to value of putting in new releases, etc., has been extending. And there comes a time when the cost of delaying new systems overwhelms the cost that you can save in the hardware itself.

>> So there are some scale efficiencies in thinking about integration from a time standpoint. Dave Vellante, we've been looking at this for quite some time, as we think about true private cloud, for example. But if you would, kind of give us that core dynamic, in simple terms, between what is valuable to the business, what isn't valuable to the business, and the different options between renting and buying. What is that kind of core dynamic at play?

>> OK, so as we've talked about a lot in our true private cloud research, hyper-converged systems are an attempt to substantially mimic public cloud environments on-prem. And this creates a bifurcated buying dynamic that I think is worth exploring a little bit. The big cloud players, as everybody talks about, have lots of engineers running around; they have skill, and they have time. So they'll spend time to build proprietary technologies and use their roll-your-own components to automate processes. In other words, they'll spend time to save money. This is essentially hyperscale as a form of their R&D, and they have an n-year lead, whatever it is, four, five, six years, on the enterprise.
And that's not likely to change, that dynamic. The enterprise buyers, on the other hand, don't have the resources; they're stretched thin, so they'll spend money to save time. Enterprises want to cut labor costs and shift low-value IT labor to so-called vendor R&D. To wit, our forecasts show that about $150 billion is going to come out of low-value IT operations over the next ten years and will shift to integrated products.

>> So ultimately we end up seeing the vendors effectively capturing a lot of that spend that otherwise had been internal. Now, this raises a new dynamic, when we think about this, David Floyer, in that there are still vendors that have to return something to their shareholders. There's this increased recognition that businesses or enterprises want this cloud experience, but not everybody is able to offer it, and we end up with some really loosely defined definitions. What's the continuum of where systems are today, from traditional all the way out to cloud? What does that look like?

>> So a useful way of looking at it is to see what has happened over time and where we think it's going. We started with completely separate systems. Converged systems then came in, where the vendor put them together and reduced the time to value a little bit. But really, the maintenance was still a responsibility of--

>> [Peter] But what was brought together?

>> [David F] It was the traditional arrays, it was the servers--

>> Racks, power supplies--

>> All of that stuff put together, and delivered as a package. The next level up was so-called hyper-converged, where certainly some of the hyperconverged vendors went and put in software for each layer: software for the storage layer, software for the networking layer, and more management. But a lot of vendors really took hyperconverged as being the old stuff with a few extra flavors.

>> So they literally virtualized those underlying hardware resources, and got some new efficiencies and economies.

>> That's right, so they software-virtualized each of those components. When you look at the cloud vendors, just skipping one there, they have gone hyperscale, and they have put in, as Dave spoke about earlier, all of their software to make that hyperscale work. What we think is in the middle of that is enterprise hyperscale, which is coming in, where you have what we call the service end: the storage capability, the networking capability, and the CPU capabilities, all separated, able to be scaled in whatever direction is required, with any processor able to get at any data through that network with very, very little overhead. It's software for the storage, software and firmware for the networking, and the processor is relieved of all that processing. We think that architecture is going to mimic what the hyperscalers have. But the vendors now have an opportunity to put in the software to emulate that cloud experience, and to take away from the people who want on-site equipment all of the work that's necessary to keep that software stack up to date. The vendors are going to maintain that software stack as high up as they can go.

>> So David, is this theory, or are there practical examples of this happening today?

>> Oh, absolutely, there are practical examples of this happening. There are practical examples at the lower levels, with people like Micron with SolidScale.
That's at a technology level. When we're talking about hyperscale--

>> Well, if you're looking at it from a practical point of view, Oracle has put it into the marketplace: Oracle cloud on-premises, Oracle converged systems, where they are taking the responsibility for maintaining all of the software, all the way up to the database stack, and in the future probably beyond that, towards the Oracle applications as well. So they're taking that approach, putting it in, and arguing, persuasively, that the customer should focus on time to value as opposed to the cost of just the hardware.

>> Well, we can also look at SaaS vendors, right, many of whom have come off of infrastructure as a service, deployed their own enterprise hyperscale, and are increasingly starting to utilize some of this hyperscale componentry as a basis for building things out. Now, one of the key reasons why we want to do this, and George, I'll turn it to you, is because, as David mentioned earlier, the idea is that we want to bring analytics and operations more closely together to improve automation, augmentation, and other types of workloads. What is it about that effort that's encouraging this kind of adoption of these new approaches?

>> [George] Well, databases typically make great leaps forward when we have changes in the underlying trade-offs or relative price-performance of compute, storage, and networking. What we're talking about with hyperscale, either on-prem or the cloud version, is that we can build scale-out that databases can support without having to be rewritten, so that they work just the way they did on tightly coupled, shared-memory symmetric multiprocessors. And so now they can go from a few nodes, or half a dozen nodes, or even say a dozen nodes, to thousands. And as David's research has pointed out, they have latency to get to memory in any node from any node of five microseconds. So building up from that, the point is we can now build databases that really do have the horsepower to handle the analytics to inform the transactions in the same database. Or, if you do separate them, because you don't want to touch a current system of record, you have a very powerful analytic system that can apply more data and do richer analytics to inform a decision in the form of a transaction than you could with traditional architectures.

>> So it's the data that's driving the need for a data-rich system that's architected in the context of data needs, that's driving a lot of this change. Now, David Floyer, we've talked about data tiering. We've talked about the notion of primary, secondary, and tertiary data. Without revisiting that entirely, what is it about this notion of enterprise hyperscale that's going to make it easier to naturally place data where it belongs in the infrastructure?

>> Well, underlying this is that moving data is extremely expensive, so you want to, where possible, move the processing to the data itself. The origin of that data may be at the edge, for example, in IoT. It may be in a large central headquarters. It may be in the cloud; it may be operational data or end-user data, for people using their phones, which is available from the cloud. So there are multiple sources, and you want to place the processing as close to the data as possible, so that you have the least cost of moving it and the lowest latency. And that's particularly important when you've got systems of intelligence, where you want to combine the two.

>> So Jim Kobielus, it seems as though there's a compelling case to be made here to focus on time: time to value and time to deploy on the one hand, as well as another aspect of time, the time associated with latency, with reducing path length, and with optimizing for path length, which again has a scale impact. What are developers thinking? Are developers actually going to move the market to these kinds of solutions, or are they going to try to do something different?

>> I think what developers will do is begin to move the market towards hyperconverged systems. Much of the development that's going on now is for artificial intelligence, deep learning, and so forth, where you're building applications that have an increasing degree of autonomy, able to make decisions based on system-of-record data, system-of-engagement data, and system-of-insight data in real time. What that increasingly requires, Peter, is a development platform that combines those different types of databases, or data stores, and also combines the processing for deep learning, machine learning, and so forth, on devices that are increasingly tiny and embedded in mobile devices and whatnot. So what I'm talking about here is an architecture for development where developers are going to say: I want to be able to develop it in the cloud, and I'm going to need to, 'cause we have huge teams of specialists who are building, training, deploying, and iterating these models in a cloud environment, a centralized modeling context, but then deploying the results of their work down to the smallest systems, where these models will need to run, if not autonomously, then in some loosely coupled fashion with tier-two and tier-three systems, which will also be hyperconverged. And each of those systems, in each of those tiers, will need a self-similar data fabric and an AI processing fabric. So what developers are saying is: I want to be able to take it and model it, and deploy it to these increasingly nanoscopic devices at the edge, and I need each of those components at every tier to have the same capabilities, in hyperconverged form factors, essentially.
So I think buyers got to be careful, and they got to make sure that their service provider's motivations align with, you know, their desired outcomes, and they're not doing the roll-your-own bespoke approach for the wrong reasons. >> Yeah, and we've seen that a fair amount as we've talked to senior IT folks, that there's a clear misalignment, often, between what's being pushed from a technology standpoint and what the application actually requires, and that's one of the reasons why this question is so rich and so important. But Ralph Phinos, kind of sum up, when you think about some of these issues as they pertain to where to make investments, how to make investments. From our perspective, is there a relatively simple approach to thinking this through, and understanding how best to put your money to get the most value out of the technologies that you choose? (static hissing) Alright, I think we've lost Ralph there, so I'll try to answer the question myself. (chuckles) (David laughs) So here's how we would look at it, and David Floyer, help me out and see if you disagree with me. But at the end of the day, what we're looking for is we're suggesting to customers that have a cost orientation should worry a little bit less about risk, a little bit less about flexibility, and they can manage how that cost happens. And the goal is to try to reduce the cost as fast as possible, and not worry so much about the future options that they'll face in terms of how to reduce future types of cost out. And so that might push them more towards this public hyperscale approach. But for companies that are thinking in terms of revenue, that have to ensure that their systems are able to respond to competitive pressures, customer needs, that are increasingly worried about buying future options with today's technology choices. That there's a scale, but that's the group that's going to start looking more at the enterprise hyperscale. Clearly that's where SAS players are. Yeah. And then the question is and what requires further research is, where's that break point going to be? So if I'm looking at this from an automation, from a revenue standpoint, then I need a little bit greater visibility in where that break point's going to be between controlling my own destiny, with the technology that's crucial to my business, versus not having to deal with the near-term costs associated with doing the integration myself. But this time to value, I want to return to this time to value. >> [David] It's time to value that is the crucial thing here, isn't it? >> [Peter] Time to value now, and time to future value. >> And time to future value, yes. What is the consequence of doing everything yourself is that the time to put in new releases, the time to put in patches, the time to make your system secure, is increasingly high. And the more that you integrate systems into systems of intelligence, with the analytics and the systems of record, the more you start to integrate, the more complex the total environment, the more difficult it's going to be for people to manage that themselves. So in that environment, you would be pushing towards getting systems where the vendor is doing as much of that integration as they can-- And that's where they get the economies from. The vendors get the economies of scale because they can feed back into the system faster than anybody else. 
Rather than taking a snowflake approach, they're taking a volume approach, and they can feed back for example artificial intelligence in operational efficiency, in security. There's many, many opportunities for vendors to push down into the marketplace those findings. And those vendors can be cloud vendors as well. If you look at Microsoft, they can push down into their Azure Stack what they're finding in terms of artificial intelligence and in terms of capabilities. They can push those down into the enterprises themselves. So the more that they can go up the stack into the database layers, maybe even into the application layers, the higher they can go, the lower the cost, the lower the time to value will be for them to deploy applications using that. >> Alright, so we've very quickly got some great observations on this important dynamic. It's time for action items. So Jim Kobielus, let me start with you. What's the action item for this whole notion of hyperscale? Action items, Jim Kobielus. >> Yeah, the action item for hyperscale is to consider the degree of convergence you require at the lowest level of the system, the edge device. How much of that needs to be converged down to a commoditized component that can be flexible enough that you can develop a wide range of applications on top of that-- >> Excellent, hold on, OK. George Gilbert, action item. >> Really quickly you have to determine, are you going to keep your legacy system of record database, and add like an analytic database on a hyperscale infrastructure, so that you're not doing a heart and lung transplant on an existing system. If you can do that and you can manage the latency between the existing database and culling to the analytic database, that's great. Then there's little disruption. Otherwise you have to consider integrating the analytics into a hyperscale-ready legacy database. >> David Vellante, action item. >> Tasks like LUN management, and server provisioning, and just generally infrastructure management, and non-strategic. So as fast as possible, shift your "IT labor resources" up the stack toward more strategic initiatives, whether they're digital initiatives, data orientation, and other value-producing activities. >> David Floyer, action item. >> Well I was just about to say what Dave Vellante just said. So let me focus a little bit more on a step in order to get to that position. >> So Dave Floyer, action item. (David laughs) >> So the action item that I would choose would be that you have to know what your costs are, and you have to be able to, as senior management, look at those objectively and say, "What is my return on spending all of "this money and making the system operate?" The more that you can reduce the complexity, buy in, converge systems, hyperconverge systems, hyperscale systems, that are going to put that responsibility onto the vendors themselves, the better position you're going to be to really add value to the bottom line of applications that really can start to use all of this capability, advanced analytics that's coming into the marketplace. >> So I'm going to add an action item before I do a quick summary. And I'm just going to insert it. My action item, the relationship that you have with your vendors is going to change. It used to be focused on procurement and reducing the cost of acquisition. Increasingly, for those high-value, high-performing, revenue-producing, differentiating applications, it's going to be strategic vendor management. That implies a whole different range of activities. 
And companies that are going to build their business with technology and digital are going to have to move to a new relationship management framework. Alright, so let's summarize today's action item meeting. First of I want to thank very much George Gilbert, David Floyer, here in the studio with me. David Vellante, Ralph Phinos, Jim Kobielus on the phone. Today we talked about enterprise hyperscale. This is part of a continuum that we see happening, because the economics of technology are continuing to assert themselves in the marketplace, and it's having a significant range of impacts on all venues. When we think about scale economies, we typically think about how many chips we're going to stamp out, or how many copies of an operating system is going to be produced, and that still obtains, and it's very important. But increasingly users have to focus their attention to how we're going to generate economies out of the IT labor that's necessary to keep the digital businesses running. If we can shift some of those labor costs to other players, then we want to support those technology sets that embed those labor costs directly in the form of technology. So over the next few years, we're going to see the emergence of what we're calling enterprise hyperscale that embeds labor costs directly into hyperscale packaging, so that companies can focus more on generating revenue out of technology, and spend less time on the integration of work. The implications of that is that the traditional buying process of trying to economize on the time to purchase, the time to get access to the piece parts, is going to give way to a broader perspective on time to ultimate value of the application or of the outcome that we seek. And that's going to have a number of implications that CIOs have to worry about. From an external standpoint, it's going to mean valuing technology differently, valuing packaging differently. It means less of a focus on the underlying hardware, more of a focus on this common set of capabilities that allow us to converge applications. So whereas converge technology talked about converging hardware, enterprise hyperscale increasingly is about converging applications against common data, so that we can run more complex and interesting workloads and revenue-producing workloads, without scaling the labor and management costs of those workloads. A second key issue is, we have to step back and acknowledge that sometimes the way products go to market, and our outcomes or our desires, do not align. That there is the residual reality in the marketplace that large numbers of channel partners and vendors have an incentive to try to push more complex technologies that require more integration, because it creates a greater need for them and creates margin opportunities. So ensure that as you try to achieve this notion of converged applications and not converged infrastructure necessarily, that you are working with a partner who follows that basic program. And the last thing is I noted a second ago, that that is going to require a new approach to thinking about strategic vendor management. For the last 30 years, we've done a phenomenal job of taking cost out of technology, by focusing on procurement and trying to drive every single dime out of a purchase that we possibly could. Even if we didn't know what that was going to mean from an ongoing maintenance and integration and risk-cost standpoint, what we need to think about now is what will be the cost to the outcome. 
And not only this outcome, but because we're worried about digital business, future outcomes, that are predicated on today's decisions. So the whole concept here is, from a relationship management standpoint, the idea of what relationship is going to provide us the best time to value today, and streams of time to value in the future. And we have to build our relationships around that. So once again I want to thank the team. This is Peter Burris. Thanks again for participating or listening to the Action Item. From the Cube studios in Palo Alto, California, see you next week. (electronic music)

Published Date : Nov 10 2017


Wikibon Research Meeting | Systems at the Edge


 

>> Hi, I'm Peter Burris, and welcome once again to Wikibon's weekly research meeting on theCUBE. (funky electronic music) This week we're going to discuss something that we believe is extremely important, and that, if you listen to the recent press announcements this week from Dell EMC, the industry increasingly is starting to believe is important. And that is: how are we going to build systems that are dependent upon what happens at the edge? The past ten years have been dominated by the cloud. How are we going to build things in the cloud? How are we going to get data to the cloud? How are we going to integrate things in the cloud? While all those questions remain very relevant, increasingly the technology is becoming available, the systems and the design elements are becoming available, and the expertise can now more easily be brought together, so that we can start attacking some extremely complex problems at the edge. A great example of that is the popular notion of what's happening with automated driving. That is a clear example of huge design requirements at the edge. Now, to understand these issues, we have to be able to generalize certain attributes of the differences in the resources, whether they be hardware or software, but increasingly, and especially from a digital business transformation standpoint, the differences in the characteristics of the data. And that's what we're going to talk about this week: how are different types of data, data that's generated at the edge and data that's generated elsewhere, going to inform decisions about the classes of infrastructure that we're going to have to build and support as we move forward with this transformation that's taking place in the industry? So to kick it off, Neil Raden, I want to turn to you. What are some of those key data differences, and what, taxonomically, do we regard as what we call primary, secondary, and tertiary data? Neil.

>> Well, primary data comes in from sensors. It's a little bit different than anything we've ever seen in terms of doing analytics. Now, I know that operational systems do pick up primary data, credit card transactions, something like that. But sensor data is really designed for analysis. It's not designed for record keeping. And because it's designed for analysis, we have to have a different way of treating it than we do other things. If you think about a data lake, everything that falls into that data lake has come from somewhere else; it's been used for something else. But this data is fresh, and that requires that we really have to treat it carefully. Now, the retention and stewardship of that requires a lot of thought, and I don't think industry has really thought that through a great deal. But look, sensor data is not new; it's been around for a long time. What's different now is the volume and the lack of latency in it. But any organization that wants to get involved in it really needs to be thinking about the business purpose of it. If you're just going into IoT, as we call it generically, to save a few bucks, you might as well not bother. It really is something that will change your organization. Now, what we do with this data is a real problem, because for the most part these sensors are going to be remote, and that means they're going to generate a lot of data. So what do we do with it? Do we reduce it at the site? That's been one suggestion.
There's an issue that any model for reduction could conceivably lose data that may be important somewhere down the line. Can the data be reconstituted through metadata or some sort of reverse algorithms? You know, perhaps. Those are the things we really need to think about. My humble opinion is that the software and the devices need to be a single unit, and for the most part, they need to be designed by vendors, not by individual IT shops.

>> So David Floyer, let's pick up on that. Software and devices as a single unit, designed more by vendors who have specific domain expertise, turned into solutions, and presented to the business. What do you think?

>> Absolutely, I completely concur with that. The initial attempts at using the sensors and connecting to the sensors were very simple things, like, for example, the Nest thermostats. And that's worked very well. But if you look at it over time, the processing for that has gone into the home, into your Apple TV device or your Alexa or whatever it is. So that's coming down, and now it's getting even closer to the edge. In the future, our proposition is that it will get even closer still, and those sensors will be put together as solutions, all types of solutions that are appropriate to the edge, taking not just one sensor but multiple sensors and collecting that data together, just like in the autonomous car, for example, where you take the lidars and the radars and the cameras, etcetera. We'll be taking that data, we'll be analyzing it, and we'll be making decisions based on that data at the edge. And vendors are going to play a crucial role in providing these solutions to IT, to the OT, and to many other parts of the business, and a large part of the value will be in the expertise that they develop in this area.

>> So as a rule of thumb, when I was growing up and learned to drive, I was told to always keep five car lengths between me and whatever's in front of me, at whatever speed I was traveling. What you just described, David, is that there will be sensors, and there will be processing, in that automated car that isn't using that type of rule of thumb, but knows something about tire temperature, and therefore the coefficient of friction on the tires, knows something about the brakes, and knows what the stopping power needs to be at that speed, and therefore what buffer needs to be between it and whatever else is around it.

>> Absolutely.

>> This is no longer a rule of thumb; this is physics and a deep understanding of what it's going to require to stop that car.

>> And on top of that, what you'll also want to know, from outside your car, is: what type of car is in front of you? Is that an autonomous car, or is that somebody being driven by Peter? In which case, you keep ten lengths behind it.

>> But that's not going to be primary data. Is that what we mean by secondary data?

>> No, that's still primary, because you're going to set up a connection between you and that other car. That car is going to tell you, I'm primary to you; that's primary data.

>> Here's what I mean. Correct, it's primary data, but the car in that case is emitting a signal, right? So even though to your car it's primary data, what's interesting from a design standpoint is that that car is now transmitting a digital signal about its state that's relevant to you, so that you can combine that--

>> Correct.

>> --inside, effectively, a gateway inside your car.

>> Yes.
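Peter's contrast between the five-car-lengths rule of thumb and actual physics is easy to make concrete. The braking part of stopping distance follows directly from speed and the tire-road friction coefficient, which is exactly the state an instrumented car knows about itself. A minimal sketch, with illustrative numbers:

    # Stopping distance = reaction distance + braking distance, where the
    # braking distance is v^2 / (2 * mu * g). Numbers are illustrative.
    G = 9.81  # m/s^2

    def stopping_distance_m(speed_ms, mu, reaction_s):
        return speed_ms * reaction_s + speed_ms**2 / (2 * mu * G)

    v = 30.0  # ~108 km/h
    print(stopping_distance_m(v, mu=0.9, reaction_s=0.1))  # warm tires: ~54 m
    print(stopping_distance_m(v, mu=0.4, reaction_s=0.1))  # cold or wet: ~118 m

The same speed needs more than twice the buffer when the measured friction coefficient drops, which is why the sensed value, not a fixed rule, has to set the gap.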
>> So there's external information that is in fact digital coming in, combining with the sensors about what's happening in your car. Have I got that right? >> Absolutely. That to me is the sort of secondary one, and then you've got the tertiary data, which is the big picture about the traffic conditions >> Routes. and the weather and the routes and that sort of thing, which is at that much higher cloud level, yes. >> So David Vellante, we always have to make sure, as we have these conversations, that we tie this back to the business. We've talked a bit about this data, we've talked a little bit about the classes of work that's going to be performed at the different levels. How do we ensure that we keep the business problem front and center in this conversation? >> So, I mean, I think Wikibon's done some really good work on describing what this sort of data model looks like, from edge devices where you have primary data, to the gateways where you're doing aggregated data, to the cloud where maybe the serious modeling occurs. And my assertion would be that the technology to support that elongating and increasingly distributed data model has been maturing for a decade, and the real customer challenge is not just technical; it's really understanding a number of factors, and I'll name some. Where in the distributed data value chain are you going to differentiate? And how does the data that you're capturing in that data pipeline contribute to monetization? What are the data sources, who has access to that data, how do you trust that data, and interpret it, and act on it with confidence? There are significant IP ownership and data protection issues. Who owns the data? Is it the device manufacturer, is it the factory, etcetera. What's the business model that's going to allow you to succeed? What skill sets are required to win? And really importantly, what's the shape of the ecosystem that needs to form to go to market and succeed? These are the things that I think the customers I talk to are really struggling with. >> Now, the one thing I'd add to that, and I want to come back to it, is the idea of who is ultimately bonding the solution, because this is going to end up in a court of law. But let's come to this IP issue, George. Let's talk about how local data is going to enter into the flow of analytics, and that question of who owns data, because that's important, and then have the question about some of the ramifications and liabilities associated with this. >> Okay, well, just on the IP protection and the idea that a vendor has to take sort of whole-product responsibility for the solution. That vendor is probably going to be dealing with multiple competitors when they're enabling, say, a self-driving car or other, you know, edge or smaller devices. The key thing is that a vendor will say, you know, the customer keeps their data and the customer gets the insights from that data. But that data is informing, in the middle, a black box, an analytic black box. It's flowing through it; that's where the insights come out, on the other side. But the data changes that black box as it flows through it. So, that is something where, you know, when the vendor provides a whole solution to Mercedes, that solution will be better when they come around to BMW. And the customers should make sure that what BMW gets the benefit of goes back to Mercedes. That's on the IP thing. I want to add one more thing on the tertiary side, which is, when you're close to the edge, it's much more data intensive.
When we've talked about the reduction in data and the real-time analytics, at the tertiary level it's going to be more compute intensive, where time is a bigger factor and you're essentially running a simulation. And so you're doing optimizations of the model, and those flow back as context to inform both the gateway and the edge. >> David Floyer, I want to turn to you. So we've talked a little bit about the characteristics of the data, a great list from Dave Vellante about some of the business considerations, and we will get very quickly, in a second, to some of the liability issues, 'cause that's going to be important. But building on what George just said about the tertiary elements: now we've got all the data laid out, how is that going to map to the classes of devices? And we'll then talk a bit about some of the impacts on the industry. What's it going to look like? >> So if we take the primary edge first, and you take that as a unit, you'll have a number of sensors within that. >> So just to be clear, this is data about the real world that's coming into the system to be processed? >> Yes. So it'll have, for example, cameras. If we take a simple example of making sure that bad people don't get into your site, you'll have a camera there which will do facial recognition. They'll have a badge of some sort, so you'll read that badge; you may want to take their weight; you may want to have an infrared sensor on them so that you can tell their exact distance. So, a whole set of sensors that the vendor will put together for the job of ensuring you don't get bad guys in there. And what you're ensuring is that bad guys don't get in there, that's obviously one, very important, and also, that you don't go and- >> Stop good guys from going in. stop good guys from going in there. So those are the two characteristics >> The false-positive problem. the false positives. Those are the two things you're trying to design for- >> At the primary edge. at the primary edge. And there's a massive amount of data going into that, which is only going to be reduced to very, very little data coming up to the next level: this guy came here, these were his characteristics, he didn't look well today, maybe you should see a nurse, or whatever other information you can gather from that will go up to that secondary level. And then that'll also be a record that goes to HR, maybe, about who has arrived or what time they arrived, and to the manufacturing systems about who is there and who has the skills to do a particular job. There are multiple uses of that data, which can then be used for differentiation or whatever else, from that secondary layer into local systems, and then equally they can be pushed up to the higher level: how much power should we be generating today, what are the higher-level needs. >> We now have 4,000 people in the building, air conditioning therefore is going to look like this, or it could be combined with other types of data, like over time we're going to need new capacity, or payroll, or whatever else it might be. >> And each level will have its own type of AI. So you've got AI at the edge, which is to produce a specific result, and then there's AI to optimize at the secondary level, and then AI to optimize bigger things at the tertiary level. >> So we're going to talk more about some of the AI next week, but for right now we're talking about classes of devices that are high performance, high bandwidth, cheap, constrained, proximate to the event. >> Yep.
>> Gateways that are capable of taking that information and starting to synthesize it for the business, for other business types of things, and then tertiary systems, true private cloud for example, although we may have very sizable things at the gateway as well, >> There will be true private clouds. that are capable of integrating data in a more broad way. What's the impact on the industry? Are we going to see IT firms roll in and control this sweeping change, (man chuckles) as Neil said, trillions of new devices? Is this all going to be Intel? Is it all going to be, you know, looking like clients and PCs? >> My strong advice is that the devices themselves will be done by extreme specialists in those areas; they will need very deep technology understanding of the devices themselves, the sensors themselves, the AI software relevant to that. Those are the people that are going to make money in that area. And you're much better off partnering with those people and letting them solve the problems, and you solve, as Dave said earlier, the ones that can differentiate you within your processes, within your business. So yes, leave that to other people is my strong advice. And from an IT point of view, just don't do it yourself. >> Well, the gateway, it sounds like you're suggesting, is where that boundary's going to be. >> Yes. That's where the boundary is. >> And the IT technologies may increasingly go down to the edge, but it's not clear that the IT vendor expertise goes down to the edge >> Correct. to the same degree. >> Correct. >> So, Neil, let's come back to you. When we think about this arrangement of data, you know, how the use cases are going to play out, and where the vendors are, we still have to address this fundamental challenge that Dave Vellante brought up. Who's going to end up being responsible for this? Now, you've worked in insurance; what does that mean from an overall business standpoint? What kinds of failure rates are we going to accommodate? How is this going to play out? What do you think? >> Well, I'd like to point out that I worked in insurance 30 years ago. (men chuckling) >> Male Voice: I didn't want to date ya, Neil. (men chuckling) >> Yeah, the old Reliable Life Insurance Company. Anyway, one of the things David was just discussing sounded a lot to me like complex event processing. And I'm wondering where the logical location for that event processing needs to be, because you need some prior data to do CEP; you have to have something to compare against. But if you're pushing it all back to the tertiary level, there's going to be a lot of latency. And the whole idea of CEP was, you know, right now. So, that I'm a little curious about. But I'm sorry, what was your question? >> Well, no, let's address that. So CEP, David, I agree. But I don't want to turn this into a general discussion of CEP. It's got its own set of issues. >> It's clear there have got to be complex models created. And those are going to be created in a large environment, almost certainly in a tertiary-type environment. And those are going to be created by the vendors of those particular problem solvers at the primary edge. To a large extent, they're going to provide solutions in that area. And they're going to have to update those. And so, they are going to have to have lots and lots of test data for themselves, and maybe some companies will provide test data, for a fee or whatever it is, to those vendors.
But the primary model itself is going to be at the tertiary level, and that's going to be pushed down to the primary level itself. >> I'm going to make an assertion here. The way I think about this, Neil, is that the data coming off at the primary level is going to be the sensor data: the sensor said it was good. Then that is recorded as an event, we let somebody in the building. And that's going to be a key feature of what happens at the secondary level. I think a lot of complex processing is likely to end up at that secondary level. >> Absolutely. >> Then the data gets pushed up to the tertiary level and it becomes part of an overall social understanding of the business; it's behavior data. So increasingly, what did we do as a consequence of letting this person in the building? Oh, we tried to stop him. That's going to be more of the behavioral data that ends up at the tertiary level, and we'll still do complex event processing there. It's going to be interesting to see whether or not we end up with CEP directly in the sensor tower. We might under certain circumstances; that's a cost question, though. So let me now, in the last few minutes here, turn it back to you, Neil. At the end of the day, we've seen for years the question of how much security is enough security. And businesses said, "Oh, I want to be 100% secure." And sometimes the CISO said, "We got that. You gave me the money, we've now made you 100% secure." But we know it's not true. The same thing is going to exist here. How much fidelity is enough fidelity down at the edge? How do we ensure that business decisions can be translated into design decisions that lead to an appropriate and optimized overall approach to the way the system operates? From a business standpoint, what types of conversations are going to take place in the boardroom that the rest of the organization's going to have to translate into design decisions? >> You know, boy, bad actors are going to be bad actors. I don't think you can do anything to eliminate it. The best you can do is use the best processes and the best techniques to keep it from happening, and hope for the best. I'm sorry, that's all I can really say about it. >> There's quite a lot of work going on at the moment from Arm, in particular; they've been building security capability into the devices themselves. So, there's a lot of work going on in that very space. What's obviously interesting from an IT perspective is how you link the different security systems, both from an Arm point of view and then from an x86 point of view as you go further up the chain. How are they going to be controlled, and how's that going to be managed? That's going to be a big IT issue. >> Yeah, I think the transmission is the weak point. >> Male Voice: What do you mean by that, Neil? >> Well, the data has to flow across networks; that would be the easiest place for someone to intercept it and, you know, do something nefarious. >> Right, yeah, so that's purely a security thing; I was trying to use that as an analogy. So, at the end of the day, the business is going to have to decide how much data we have to capture off the edge to ensure that we have the kinds of models we want, so that we can realize the specificity of actions and behaviors that we want in our business. That's partly a technology question, partly a cost question. Different sensors are able to operate at different speeds, for example. But ultimately, we have to be able to bring that list of decisions, or business issues, that Dave Vellante raised down to some of the design questions.
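Peter's assertion a moment ago, that complex event processing is likely to land at the secondary level, and Neil's point that CEP needs prior data to compare against, can be caricatured in a few lines. The windowed matcher below is a hypothetical sketch of what a gateway might run, not any particular CEP product: it keeps the "prior data" as a short window of recent primary events and fires when a pattern over that window matches. The rule, window size, and event shape are all invented for the example.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

class WindowedEventMatcher:
    """A toy complex-event-processing matcher for the secondary tier.

    CEP needs prior data to compare against, so the gateway keeps a short
    window of recent events and fires whenever a pattern predicate over
    that window matches, locally and with low latency.
    """

    def __init__(self, window_size: int,
                 pattern: Callable[[List[Dict]], bool],
                 on_match: Callable[[List[Dict]], None]) -> None:
        self.window: Deque[Dict] = deque(maxlen=window_size)
        self.pattern = pattern
        self.on_match = on_match

    def ingest(self, event: Dict) -> None:
        """Add one primary-tier event and evaluate the pattern locally."""
        self.window.append(event)
        snapshot = list(self.window)
        if self.pattern(snapshot):
            self.on_match(snapshot)

# Hypothetical rule: three denied entries inside one five-event window.
matcher = WindowedEventMatcher(
    window_size=5,
    pattern=lambda w: sum(1 for e in w if not e["admitted"]) >= 3,
    on_match=lambda w: print("alert: repeated denied entries", w),
)
for admitted in [True, False, False, True, False]:
    matcher.ingest({"admitted": admitted})
```

Running the pattern at the gateway rather than the tertiary tier is one answer to Neil's latency concern; the cost is that the gateway has to be provisioned to hold and evaluate that window.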
But it's not going to be a matter of throwing a $300 microprocessor at everything. There are going to be very, very concrete decisions that have to take place. So, George, do you agree with that? >> Yes, two issues though. One, there are the existing devices that can't get re-instrumented; they already have their software and hardware stack. >> There's a legacy in place? >> Yes. But there's another thing, which is, some of the most advanced research that's been going on, that produced much of today's distributed computing and big data infrastructure, like the Berkeley analytics lab and, say, their contributions to Spark and related technologies. They're saying we have to throw everything out and start over for secure real-time systems. That you have to build from the hardware all the way up. In other words, you're starting from the sand to rethink something that's secure and real-time; you can't layer it on. >> So very quickly, David; that's a great point, George. Building on what George has said, very quickly: the primary responsibility for bonding the behavior or the attributes of these devices is going to be with the vendor. >> Of creating the solution? >> Correct. >> That's going to be the primary responsibility. But obviously, from an IT point of view, you need to make sure that that device is doing the job that's important for your business, not too much, not too little, is doing that job, and that you are able to collect the necessary data from it that is going to be of value to you. So that's a question of qualification of the devices themselves. >> Alright so, David Vellante, Neil Raden, David Floyer, George Gilbert, action item round. I want one action item from you guys from this conversation. Keep it quick, keep it short, keep it to the point. David Floyer, what's your action item? >> So my action item is: don't go into areas that you don't need to. You do not need to become experts; IT in general does not need to become expert at the edge itself. Rely on partners, rely on vendors to do that, unless of course you're one of those vendors. In which case, you'll need very, very deep knowledge. >> Or you choose that that's where your value stream, your differentiation, is going to be, which means you just became one of those vendors. >> Yes, exactly. >> George Gilbert. >> I would build on that, and I would say that if you look at the skills required to build these full-stack solutions, there's data science, there's application development, there's the analytics. Very few of those solutions are going to have all the skills in one company. So the go-to-market model for building these is going to be something that, at least at this point in time, we're going to have to look to combinations for, like IBM working with, sort of, supply chain masters. >> Good. Neil Raden, action item. >> The question is not necessarily one of technology, because that's going to evolve. But I think as an organization, you need to look at it from this end, which is: would employing this create a new business opportunity for us, something we're not already doing? Or, number two, change our operations in some significant way? Or, number three, you know, the old Red Queen thing: we have to do it to keep up with the competition. >> Male Voice: David Vellante, action item. >> Okay, well look, at the risk of sounding trite, you've got to start the planning process from the customer on in, and so often people don't.
You've got to understand where you're going to add value for customers, and construct an external and internal ecosystem that can really juice that value creation. >> Alright, fantastic, guys. So let me quickly summarize. This week on the Wikibon Friday research meeting on theCUBE, we discussed a new way of thinking about data characteristics that will inform system design and the business value that's created. We observe that data is not all the same when we think about these very complex, highly distributed, and decentralized systems that we're going to build. There's a difference between primary data, secondary data, and tertiary data. Primary data is data that is generated from real-world events or measurements and then turned into signals that can be acted upon very proximate to that real-world set of conditions. A lot of sensors will be there, a lot of processing will be moved down there, and a lot of actuators and actions will take place without referencing other locations within the cloud. However, we will see circumstances where the events that are taken, or the decisions that are taken on those events, will be captured in some sort of secondary tier that will record something about the characteristics of the actions and events that were taken, summarized, and then pushed up to a tertiary tier, where that data can then be further integrated with other attributes and elements of the business. The technology to do this is broadly available, but not universally successfully applied. We expect to see a lot of new combinations of edge-related devices to work with primary data. That is going to be a combination of currently successful firms in the OT, or operational technology, world, most likely in partnership with a lot of other vendors that have demonstrated significant expertise in understanding the problems, especially the business problems, associated with the fidelity of what happens at the edge. The IT industry is going to press very aggressively and get very close to this at that secondary level, through gateways and other types of technologies. And even though we'll see IT technology continue to move down to the primary level, it's not clear exactly how vendors will be able to follow that. More likely, we'll see the adoption of IT approaches to doing things at the primary level by vendors that have the domain expertise in how that level works. We will, however, see significantly interesting true private cloud and public cloud data end up at the tertiary level, along with whole new sets of systems that are going to be very important from an administration and management standpoint, because they have to work within the context of the fidelity of this overall system. The final point we want to make is that these are not technology problems by themselves. Significant technology problems are on the horizon: how we think about handling this distribution of data, how we manage it appropriately, and, ultimately, our ability to present the appropriate authority at different levels within that distributed fabric to ensure proper operation in a way that we can nonetheless recreate if we need to. But these are, at bottom, fundamentally business problems. They're business problems related to who owns the intellectual property that's being created, they're business problems related to what level in that stack I want to show my differentiation to my customers at, and they're business problems from a liability and legal standpoint as well.
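One hedged reading of the pipeline summarized above, in code: primary events land at a secondary gateway, the gateway keeps an operational record, and periodic roll-ups are what the tertiary tier actually sees. The record shapes, batch size, and class name below are assumptions for illustration only, not a description of any product discussed in the meeting.

```python
from collections import Counter
from typing import Dict, List, Optional

class SecondaryGateway:
    """Records primary-tier events and rolls them up for the tertiary tier."""

    def __init__(self, rollup_every: int = 100) -> None:
        self.rollup_every = rollup_every
        self.events: List[Dict] = []

    def record(self, event: Dict) -> Optional[Dict]:
        """Store one primary event; emit a tertiary roll-up every N events."""
        self.events.append(event)
        if len(self.events) < self.rollup_every:
            return None
        batch, self.events = self.events, []
        # The tertiary tier sees behavior in aggregate, not raw sensor data.
        return {
            "event_count": len(batch),
            "admitted": sum(1 for e in batch if e.get("admitted")),
            "by_type": dict(Counter(e.get("type", "unknown") for e in batch)),
        }

gw = SecondaryGateway(rollup_every=3)
for e in [{"type": "entry", "admitted": True},
          {"type": "entry", "admitted": False},
          {"type": "hvac"}]:
    rollup = gw.record(e)
    if rollup:
        print(rollup)  # the compact summary pushed up to the tertiary tier
```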
The action item is: all firms will, in one form or another, be impacted by the emergence of the edge as a dominant design consideration for their infrastructure, but also for their business. A taxonomy that looks at three classes of data, primary, secondary, and tertiary, will help businesses sort out who's responsible, what partnerships I need to put in place, what technologies am I going to employ, and, very importantly, what overall business exposure I'm going to accommodate as I think ultimately about the nature of the processing and the business promises that I'm making to my marketplace. Once again, this has been the Wikibon Friday research meeting here on theCUBE. I want to thank all the analysts who were here today, but especially to thank you for paying attention and working with us. And by all means, let's hear those comments back about how we're doing and what you think about this important question of different classes of data driven by different needs of the edge. (funky electronic music)

Published Date : Oct 13 2017
