
(jazzy techno music) >> From our studios, in the heart of Silicon Valley, Palo Alto, California, this is a Cube Conversation. >> Hi, I'm Peter Burris, and welcome to another Cube Conversation. Everybody talks about AI, and how it's going to dramatically alter both the nature and the productivity of different classes of business outcomes, and it's clear that we're on a variety of different vectors and roadmaps to achieve that. One of the most important conversations is the role that AI's gonna play within the IT organization, within digital operations, to improve the productivity of the resources that we have put in place to make these broader, more complex business outcomes possible and operationally efficient. One of the key places where this is gonna be important is in storage itself. How will AI improve the productivity of storage, both from a cost standpoint, but even more importantly, from the standpoint of the amount of work that storage resources can do? Now, to have that conversation we've got Jim Kobielus, my colleague from Wikibon, Jim, you're our key AI guy, to talk about his vision of how AI technologies will be embedded in storage services, data services, and the new classes of products that are gonna make possible these new types of data-driven, AI-driven outcomes. Jim, welcome back to The Cube. >> Thanks, Peter. >> All right, so let's start, Jim. As you think about it, what is it about AI that makes it relevant to improving storage productivity? >> Well, AI is a broad term, but let me net it out to the core of what AI's all about. Core AI is what is called machine learning, and machine learning is being able to find patterns in the data using algorithmic methods, in a way that can be automated, and also in a way that mortal humans usually can't. In other words, you have complex datasets.
Machine learning is very good at doing such things as looking for anomalies, looking for trends, looking for broader statistical patterns among (mumbles) elements within a broader dataset. So, when you talk about storage resources, and you talk about storage resources in (mumbles) environment, you have many tables, and you have records, and you have indices, keys, and so forth. >> Logs. >> Yeah, yeah. You have, yeah. So, you have a lot of entities in various, and quite often complex, relationships. Storage exists, if you will, for a number of things: to persist data, you know, as a historical artifact, but also to facilitate queries and access to the data. To answer questions, that's what analytics is all about. If you can shorten the path that a query takes to assemble the relevant tables, records, and so forth, and deliver a result back to whoever posed the query, then storage becomes ever more efficient in serving the end user. The more complex your storage resources get, they can be across different servers, different clusters, different clouds; they can be highly distributed across the internet of things, and what not. The more complex and distributed your storage architecture becomes, the more critically you need machine learning to be able to detect the high-level patterns, to be able to identify, you know, at any point in time, what is the path that a given query might take to be able to respond in real time to some kind of requirement from a business user who's sitting there at a dashboard trying to call up some complex metrics. So, machine learning's able to not only identify the current patterns within distributed datasets, but also to build predictive models. That's what AI often does: it predicts, under various scenarios, if the data were placed in different storage volumes, or devices, or cached here, now distributed and (mumbles) in a particular way.
How you might be able to speed up queries. So, machine learning is increasingly used in storage architectures to identify, A, the current patterns, B, the query paths, and C, to predict, recommend, and automatically move data around, so that the performance, whether it be queries, or reporting, or data transfers, or all that, so that the performance of that data transaction or analytic is as good, or as fast, as it can possibly be. >> More predictable, right? >> Right, there you go. Automated, predictably, so that humans don't have to muck around with query plans, and so forth. The architecture, the infrastructure, takes care of that problem. That's why these capabilities, autonomous operations, are built into things like the Oracle database. That's just the way database computing has to be done. There's less of a need for human data engineers to do that. I think human data engineers everywhere are saying hallelujah; that is way too complex for us, especially in the era of distributed edge computing. We can't do that in a finite amount of time. Let the infrastructure automate that function. >> So, if we look back, storage used to be machine-attached. Then we went to networked classes of storage. Now, we're increasingly distributing data. I think one of the big misnomers in the industry is that cloud was a tactic for centralizing resources. In fact, it's turning out that cloud is a tactic for broader distribution of compute, data, and related resources. All of those patterns in this increasingly distributed, cloud-service-oriented world have to be accommodated, have to be understood, and, as you said, to improve predictability and confidence in the system, we have to have some visibility into what it's gonna take to perform, and AI can help us do that. >> Exactly. >> One thing you didn't mention, Jim. I want to pick up on something though.
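As a toy illustration of the data-movement idea Jim describes, here is a minimal sketch that ranks data blocks by observed access frequency and promotes the hottest ones to a fast tier. This is a hypothetical example, not any vendor's actual implementation; real systems of the kind discussed here would use learned predictive models rather than raw counts, and the block IDs and cache capacity below are invented for illustration.

```python
from collections import Counter

def plan_placement(access_log, cache_capacity):
    """Rank data blocks by observed access frequency and pick the
    hottest ones for promotion to a fast tier (e.g., cache or SSD)."""
    counts = Counter(access_log)           # block_id -> access count
    ranked = [block for block, _ in counts.most_common()]
    promote = ranked[:cache_capacity]      # hottest blocks fit the fast tier
    demote = ranked[cache_capacity:]       # the rest stay on slower tiers
    return promote, demote

# Example: block "b1" is queried most often, so it is promoted.
log = ["b1", "b2", "b1", "b3", "b1", "b2"]
hot, cold = plan_placement(log, cache_capacity=2)
```

A predictive system would replace the `Counter` with a model that forecasts future access frequency from historical patterns, but the promote/demote decision structure is the same.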
Is the idea, as we move to Kubernetes, as we move to container-based, transient, even serverless types of application forms, where the data is, where all the state is really baked into the data and not residing in the application, this notion of data assurance is important. Assuring that the data that's required by an instance of a Kubernetes cluster is available, can be made available, or will be available when that cluster spins it up. Talk about how that becomes a use case for more AI in the storage subsystems, to ensure that storage can assure that the data that's required is available in the form it needs to be, when it needs to be, and with the policies that are required to secure it and ensure its integrity. >> Oh yeah, that requirement for that level of data protection requires an end-to-end data replication architecture. Infrastructure that's able to assure that all the critical data, or data that's tagged by its criticality, is always available, with backup copies that are always available and close enough to the applications and the users at any point in time. Continuously, so that nobody ever need worry that the data that they need will not be available because a given server or storage device is down, or a given network is down. End-to-end data replication architecture is automated to a degree that it's always assured, and it will (mumbles) AI, as a (mumbles). First of all, making sure that the end-to-end infrastructure always has a high-level, and a very fine-(mumbles), depiction of what is where at every point in time, and also of the paths between all applications and the critical data sources that they require. Those paths always include backups that are hot backups. They're just available, without having to worry about it. The infrastructure predictively takes care of caching, and replicating, and storing the data wherever it needs to be, to assure that degree of end-to-end data protection and assurance.
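One piece of the end-to-end replication assurance described above is continuously checking that every object still has enough live copies. A minimal sketch, assuming a simple object-to-nodes map (the object names and replica count below are hypothetical), of the kind of replica-gap check such an architecture automates:

```python
def replication_gaps(placement, required_copies=3):
    """Given a map of object -> list of nodes currently holding a copy,
    report objects that have fallen below the required replica count
    so the infrastructure can re-replicate them proactively."""
    gaps = {}
    for obj, nodes in placement.items():
        live = set(nodes)                  # deduplicate co-located copies
        if len(live) < required_copies:
            gaps[obj] = required_copies - len(live)
    return gaps

placement = {
    "orders.db": ["node-a", "node-b", "node-c"],  # fully replicated
    "logs.idx":  ["node-a", "node-a", "node-b"],  # two copies share node-a
}
missing = replication_gaps(placement)  # flags "logs.idx" as one copy short
```

A real system would feed the gap report to a scheduler that picks target nodes by proximity to the applications that need the data, which is where the predictive placement Jim mentions comes in.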
Once again, that's an automated, and it needs to be an automated, capability, especially in the era of edge computing, where the storage resources are everywhere. In hyperconverged infrastructure, really, storage is everywhere; it's just baked into everything. It's the very nature of HCI. >> Right. >> So, you know, yeah. >> So, Jim, you use the term anomalous behavior, and in the storage world, the storage administrator world, we've regarded that, or associated that, with anticipating or predicting the possibility of a failure somewhere within the subsystem. But as we move to a more broadly distributed use of storage, feeding a richer and more complex set of applications, supporting a more varied and unknown set of user and business activities, the role of anomalous behavior, even within these data patterns, and security start to come together. Talk a little about how AI, security, and storage are likely to conflate over the course of the next few years.
Okay. AI, security, and storage. Well, when you look at data security now, where everything is being pushed to the edge, you need each device in the internet of things, whether it be an actual edge device, or a gateway, and so forth, to be able to protect the local data resources in an autonomous or semi-autonomous fashion, without necessarily having to round-trip back to the cloud center, if there is a cloud center, because the critical data is being stored at the edges. So what's happening more and more is that we see something that's called, I forgot the name, zero perimeter, or perimeterless-- >> Oh, the zero-trust perimeterless security. >> Yeah, there you go, thank you. Where the policies follow the data all the way to wherever it's stored, in a zero-trust environment, the permissions, the crypto keys, and so forth, and this is just automatic, so that no matter where the data happens to move, the entire security context follows it. So, what's happening now is that we're seeing that more autonomous operation become part of the architecture of end-to-end data management in this new world. In terms of protecting that data from any number of thefts, or, you know, denial-of-service attacks, and so forth, AI becomes critically important, machine learning in particular, to be able to detect intrusions autonomously at those edge devices, using embedded machine learning models that are persistent within the edge nodes themselves, to be able to look for patterns that might be indicative of security threats. Fixed security rules are becoming less and less relevant in an era where the access patterns become more or less 360-degree, where every data resource is being bombarded from all sides by all possible threats. So machine learning is the tool for looking at which access requests, or attempts, on a given data resource are anomalous: they've not been seen before, they're unusual, they fall outside the confidence intervals that would normally be expected for access requests. So, those edge nodes need then to be able to take action autonomously based on those patterns, according to the (mumbles). So we're seeing more of that pattern-based security. The edge nodes have zero trust. They're not trusting any access attempt. Any access attempt, even from local applications on the same device, is treated as if it were coming from a remote party, and it has to come through a gateway that's erected through machine learning, machine learning that learns in real time to adapt to the threat patterns seen at that node. >> All right Jim, let's wrap it up there. Once again, Jim Kobielus and I have been talking about the role that AI's going to play inside the storage capacity, the storage and data services, that enterprises are gonna use to improve their business outcomes.
Jim, thank you very much for being on The Cube. >> Thank you very much, Peter. >> Once again, I'm Peter Burris. 'Till next time. (techno music)

Published Date: May 1, 2019

