Partha Seetala, Robin Systems | DataWorks Summit 2018
>> Live from San Jose, in the heart of Silicon Valley, it's theCUBE. Covering DataWorks Summit 2018. Brought to you by Hortonworks. >> Welcome back everyone, you are watching day two of theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight. I'm coming at you with my cohost James Kobielus. We're joined by Partha Seetala, he is the Chief Technology Officer at Robin Systems, thanks so much for coming on theCUBE. >> Pleasure to be here. >> You're a first timer, so we promise we don't bite. >> Actually I'm not, I was on theCUBE- >> Oh! >> At DockerCon in 2016. >> Oh well excellent, okay, so now you're a veteran, right. >> Yes, ma'am. >> So Robin Systems, as before the cameras were rolling, we were talking about it, it's about four years old, based here in San Jose, venture backed company. Tell us a little bit more about the company and what you do. >> Absolutely. First of all, thanks for hosting me here. Like you said, Robin is a Silicon Valley based company. Our focus is in allowing applications, such as big data, databases, NoSQL and AI/ML, to run within the Kubernetes platform. What we have built is a product that converges storage, complex storage, networking, application workflow management, along with Kubernetes to create a one click experience where users can get a managed services kind of feel when they're deploying these applications. They can also do one click life cycle management on these apps. Our thesis has initially been to, instead of looking at this problem from the infrastructure up into the application, actually look at it from the applications down and then say, "Let the applications drive the underlying infrastructure to meet the user's requirements." >> Is that your differentiating factor, would you say? >> Yeah, I think it is, because most of the folks out there today are looking at it as if it's a component based play, it's like they want to bring storage to Kubernetes or networking to Kubernetes, but the challenges are not really around storage and networking. If you talk to the operations folks they say that, "You know what? Those are underlying problems, but my challenge is more along the lines of, okay, my CIO says the initiative is to make my applications mobile. They want to go across to different Clouds. That's my challenge." The line of business user says, "I want to get a managed services experience." Yes, storage is the thing that you want to manage underneath, but I want to go and click and create my, let's say, Oracle database or distributed log. >> In terms of the developer experience here, from the application down, give us a sense for how Robin Systems' tooling, your product, enables that degree of specification of the application logic that will then get containerized within? >> Absolutely, like I said, we want applications to drive the infrastructure. What it means is that Robin is a software platform. We layer ourselves on top of the machines that we sit on, whether it is bare metal machines on premises, or VMs, or even in Azure, Google Cloud, as well as AWS. Then we make the underlying compute, storage, network resources almost invisible. We treat it as a pool of resources. Now once you have this pool of resources, they can be attached to the applications that are being deployed inside containers. I mean, it's a software play, installed on machines. Once it's installed, the experience now moves away from infrastructure into applications.
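To make the "applications drive the infrastructure" idea concrete, here is a minimal sketch, using the official Kubernetes Python client, of what submitting an application-level bundle as a single object might look like. The robin.io group, the AppBundle kind, and the spec fields are hypothetical stand-ins for illustration, not Robin Systems' actual API; the point is only that one declarative object, handled by a controller, can stand in for the storage, networking, and placement work underneath.

```python
# Hypothetical sketch: submit an application-level bundle to Kubernetes and let
# a controller map it onto pooled compute, storage, and network.
# The group/kind ("robin.io" / "AppBundle") and the spec fields are assumptions
# made for illustration; they are not Robin Systems' real API.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running in-cluster
api = client.CustomObjectsApi()

app_bundle = {
    "apiVersion": "robin.io/v1alpha1",      # hypothetical CRD group/version
    "kind": "AppBundle",                     # hypothetical kind
    "metadata": {"name": "oracle-db-demo"},
    "spec": {
        "template": "oracle-database",       # a shipped reference template, by name
        "replicas": 1,
        "resources": {"cpu": "4", "memory": "32Gi", "storage": "500Gi"},
        "placement": {"cloud": "any"},       # let the platform pick on-prem or public cloud
    },
}

# One API call stands in for the "one click": a controller watching this
# custom resource would provision the storage, networking, and containers.
api.create_namespaced_custom_object(
    group="robin.io", version="v1alpha1",
    namespace="default", plural="appbundles", body=app_bundle,
)
```

A controller watching that custom resource would then carve storage out of the pool, wire up networking, and launch the containers, which is roughly the one click, managed-services experience being described.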
You log in, you can see a portal, you have a lot of applications in that portal. We ship support for about 25 applications or some such. >> So these are templates? >> Yes. >> That the developer can then customize to their specific requirements? Or no? >> Absolutely, we ship reference templates for pretty much a wide variety of the most popular big data, NoSQL, database, AI/ML applications today. But again, as I said, it's a reference implementation. Typically customers take the reference recommendation and they enhance it, or they use that to onboard their custom apps, for example, or the apps that we don't ship out of the box. So it's a very open, extensible platform, but the goal being that whatever the application might be, in fact we keep saying that, if it runs somewhere else, it runs on Robin, right? So the idea here is that you can bring anything, and with just the flip of a switch, you can make it a one click deploy, one click manage, one click mobile across Clouds. >> You keep mentioning this one click and this idea of it being so easy, so convenient, so seamless, is that what you'd say is the biggest concern of your customers? Is this ease and speed? Or what are some other things that are on their minds that you want to deliver? >> Right, so one click of course is the user experience part, but what is the real challenge? The real challenge is, there are a wide variety of tools being used by enterprises today. Even in the data analytics pipeline, there's a lot across the data store and processing pipeline. Users don't want to deal with setting it up and keeping it up and running. They don't want that, they want to get the job done, right? Now when you want to get the job done, you really want to hide the underlying details of those platforms, and the best way to convey that, the best way to give that experience, is to make it a single click experience from the UI. So I keep calling it all one click because that is the experience that you get to hide the underlying complexity for these apps. >> Does your environment actually compile executable code based on that one click experience? Or where does the compilation and containerization actually happen in your distributed architecture? >> Alright, so, I think the simplest- >> You're an on-prem based offering, right? You're not in the Cloud yourself? >> No, we are. We work on all the three big public clouds. >> Oh, okay. >> Whether it is Azure, AWS or Google. >> So your entire application is containerized itself for deployment into these Clouds? >> Yes, it is. >> Okay. >> So the idea here is let's simplify it significantly, right? You have Kubernetes today, it can run anywhere, on premises, in the public Cloud and so on. Kubernetes is a great platform for orchestrating containers but it is largely inaccessible to a certain class of data centric applications. >> Yeah. >> We make that possible. But our take is, just onboarding those applications on Kubernetes does not solve your CXO's or your line of business user's problems. You ought to make the management, from an application point of view, not from a container management point of view, from an application point of view, a lot easier, and that is where we kind of create this experience that I'm talking about, the one click experience. >> Give us a sense for how, we're here at DataWorks and it's the Hortonworks show.
Discuss with us your partnership with Hortonworks and you know, we've heard the announcement of HDP 3.0 and containerization support, just give us a rough sense for how you align or partner with Hortonworks in this area. >> Absolutely. It's kind of interesting because Hortonworks is a data management platform, if you think about it from that point of view, and when we engaged with them first- So some of our customers have been using the product, Hortonworks, on top of Robin, so orchestrating Hortonworks, making it a lot easier to use. >> Right. >> One of the requirements was, "Are you certified with Hortonworks?" And the challenge that Hortonworks also had is they had never certified a container based deployment of Hortonworks before. They actually were very skeptical, you know, "You guys are saying all these things. Can you actually containerize and run Hortonworks?" So we worked with Hortonworks and we are, I mean if you go to the Hortonworks website, you'll see that we are the first in the entire industry who have been certified as a container based play that can actually deploy and manage Hortonworks. They have certified us by running a wide variety of tests, which they call the QATS test suite, and when we got certified the only other players in the market that got that stamp of approval were Microsoft with Azure and EMC with Isilon. >> So you're in good company? >> I think we are in great company. >> You're certified to work with HDP 3.0 or the prior version or both? >> When we got certified we were still on the 2.X version of Hortonworks; HDP 3.0 is a relatively newer version. But our plan is that we want to continue working with Hortonworks to get certified as they release the program, and also help them, because HDP 3.0 also has some container based orchestration and deployment, so we want to help them provide the underlying infrastructure so that it becomes easier for YARN to spin up more containers. >> The higher level security and governance and all these things you're describing, they have to be over the Kubernetes layer. Hortonworks supports it in their data plane services portfolio. Does Robin Systems' solutions portfolio tap into any of that, or do you provide your own layer of sort of security and metadata management and so forth? >> Yeah, so we don't want- >> In the context of what you offer? >> Right, so we don't want to take away the security model that the application itself provides, because they might have set it up so that they are doing governance, it's not just logging in and access control and things like this. Some governance is built in. We don't want to change that. We want to keep the same experience and the same workflow that customers have, so we just integrate with whatever security the application has. We, of course, provide security in terms of isolating these different apps that are running on the Robin platform, where the security or the access into the application itself is left to the apps themselves. When I say apps, I'm talking about Hortonworks. >> Yeah, sure. >> Or any other databases. >> Moving forward, as you think about ways you're going to augment and enhance and alter the Robin platform, what are some of the biggest trends that are driving your decision making around that in the sense of, as we know that companies are living with this deluge of data, how are you helping them manage it better? >> Sure. I think there are a few trends that we are closely watching. One is around Cloud mobility.
CIOs want their applications along with their data to be available where their end users are. It's almost like a follow-the-sun model, where you might have generated the data in one Cloud and at a different time, different time zone, you'll basically want to keep the app as well as the data moving. So we are following that very closely. How we can enable the mobility of data and apps a lot easier in that world. The other one is around the general AI/ML workflow. One of the challenges there, of course, you have great apps like TensorFlow or Theano or Caffe, these are very good AI/ML toolkits, but one of the challenges that people face is they are buying this very expensive, let's say NVIDIA DGX box, these boxes cost about $150,000 each, how do you keep these boxes busy so that you're getting a good return on investment? It will require you to better manage the resources offered by these boxes. We are also monitoring that space and we're seeing how we can take the Robin platform and enable the better utilization of GPUs, or the sharing of GPUs, for running your AI/ML kind of workloads. >> Great. >> Those are, I think, two key trends that we are closely watching. >> We'll be discussing those at the next DataWorks Summit, I'm sure, at some other time in the future. >> Absolutely. >> Thank you so much for coming on theCUBE, Partha. >> Thank you. >> Thank you, my pleasure. Thanks. >> I'm Rebecca Knight for James Kobielus. We will have more from DataWorks coming up in just a little bit. (techno beat music)
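On the GPU utilization point raised above, a rough way to keep expensive GPU boxes busy is simply to pack containerized jobs that each request GPU resources onto those nodes and let the scheduler bin-pack them. The sketch below uses the Kubernetes Python client and assumes the standard NVIDIA device plugin exposing the nvidia.com/gpu resource; finer-grained sharing (MIG partitions or time-slicing) would be configured on the node side. The image, command, and namespace are placeholders.

```python
# Sketch: run a containerized training job on a shared GPU node via the
# Kubernetes Python client. Assumes the NVIDIA device plugin is installed and
# exposes the "nvidia.com/gpu" resource; MIG or time-slicing for finer sharing
# would be configured on the node itself. Image, command, and namespace are
# placeholders.
from kubernetes import client, config

config.load_kube_config()

container = client.V1Container(
    name="trainer",
    image="tensorflow/tensorflow:latest-gpu",            # placeholder image
    command=["python", "train.py"],                       # placeholder entrypoint
    resources=client.V1ResourceRequirements(
        limits={"nvidia.com/gpu": "1", "memory": "16Gi"}  # GPUs are requested in whole units
    ),
)

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(generate_name="gpu-train-"),
    spec=client.V1PodSpec(restart_policy="Never", containers=[container]),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```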
AI and Hybrid Cloud Storage | Wikibon Action Item | May 2019
Hi, I'm Peter Burris, and this is Wikibon's Action Item. We're joined here in the studio by David Floyer. Hi David. >> Hi there. >> And remote, we've got Jim Kobielus. Hi, Jim. >> Hi everybody. >> Now, Jim, you probably can't see this, but for those who are watching, when we do show the broad set, notice that David Floyer's got his Game of Thrones coffee cup with us. Now that has nothing to do with the topic. David, and Jim, we're going to be talking about this challenge that businesses have, that enterprises have, as they think about making practical use of AI. The presumption for many years was that we were going to move all the data up into the Cloud in a central location, and all workloads were going to be run there. As we've gained experience, it's very clear that we're actually going to see a greater distribution function, partly in response to a greater distribution of data. But what does that tell us about the relationship between AI, AI workloads, storage, and hybrid Cloud? David, why don't you give us a little clue as to where we're going to go from here. >> Well I think the first thing we have to do is separate out the two types of workload. There's the development of the AI solution, the inference code, et cetera, dealing with all of the data required for that. And then there is the execution of that code, which is the inference code itself. And the two are very different in characteristics. For the development, you've got a lot of data. It's very likely to be data-bound. And storage is a very important component of that, as well as compute and the GPUs. For the inference, that's much more compute-bound. Again, compute, neural networks, GPUs are very, very relevant to that portion. Storage is much more ephemeral in the sense that the data will come in and you will need to execute on it. But that data will be part of the, the compute will be part of that sensor, and you will want the storage to be actually in the DIMM itself, or non-volatile DIMM, right up as part of the processing. And you'll want to share that data only locally in real time, through some sort of mesh computing. So, very different compute requirements, storage requirements, and architectural requirements. >> Yeah, let's go back to that notion of the different storage types in a second, but Jim, David described how the workloads are going to play out. Give us a sense of what the pipelines are going to look like, because that's what people are building right now, is the pipelines for actually executing these workloads. How will they differ? How do they differ in the different locations? >> Yeah, so the entire DataOps pipeline for data science, data analytics, AI in other words. And so what you're looking at here is all the processes from discovering and ingesting the data to transforming and preparing and correcting it, cleansing it, to modeling and training the AI models, to serving them out for inferencing along the lines of what David's describing. So, there's different types of AI models and one builds from different data to do different types of inferencing. And each of these different pipelines might be, and often is, highly specific to a particular use case. You know, AI for robotics, that's a very different use case from AI for natural language processing, embedded for example in an e-commerce portal environment. So, what you're looking at here is different pipelines that all share a common sort of flow of activities and phases.
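For readers who want that common flow of activities and phases spelled out, here is a minimal, generic sketch of such a pipeline in Python with scikit-learn: ingest, cleanse, prepare, train, evaluate, and hand off a model for serving. The file name, column names, and model choice are placeholders; each real use case (robotics, NLP, e-commerce) would substitute its own stages.

```python
# Generic sketch of the shared pipeline flow: ingest -> cleanse -> prepare ->
# train -> evaluate -> hand off for serving. File name, columns, and the model
# are placeholders for whatever a specific use case needs.
import pandas as pd
import joblib
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

raw = pd.read_csv("events.csv")                  # ingest (placeholder source)
clean = raw.dropna().drop_duplicates()           # correct / cleanse

X, y = clean.drop(columns=["label"]), clean["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = Pipeline([
    ("prepare", StandardScaler()),                # transform / prepare
    ("train", LogisticRegression(max_iter=1000)), # model / train
])
model.fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))  # evaluate

joblib.dump(model, "model.joblib")               # hand off to the serving side
```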
And you need a data scientist to build and test, train and evaluate and serve out the various models to the consuming end devices or application. >> So, David, we've got 50 or so years of computing. Where the primary role of storage was to persist a transaction and the data associated with that transaction that has occurred. And that's, you know, disk, and then you have all the way out to tape if we're talking about archive. Flash changes that equation. >> Absolutely changes it. >> AI absolutely demands a different way of thinking. Here we're not talking about persisting our data, we're talking about delivering data, really fast. As you said, sometimes very ephemeral. And so, it requires a different set of technologies. What are some of the limitations that historically storage has been putting on some of these workloads? And how are we breaching those limitations, to make them possible? >> Well if we take only 10 years ago, the start of the big data was Hadoop. And that was spreading the data over very cheap disks and hard disks. With the compute there, and you spread that data and you did it all in parallel on very cheap nodes. So, that was the initial approach, but that is a very expensive way of doing it now because you're tying the data to that set of nodes. They're all connected together so, a more modern way of doing it is to use Flash, to use multiple copies of that data but logical copies or snapshots of that Flash. And to be able to apply as many processes, nodes as is appropriate for that particular workload. And that is a far more efficient and faster way of processing that or getting through that sort of workload. And it really does make a difference of tenfold in terms of elapsed time and ability to get through that. And the overall cost is very similar. >> So that's true in the inferencing or, I'm sorry, in the modeling. What about in the inferencing side of things? >> Well, the inferencing side is again, very different. Because you are dealing with the data coming in from the sensors or coming in from other sensors or smart sensors. So, what you want to do there is process that data with the inference code as quickly as you can, in real time. Most of the time in real time. So, when you're doing that, you're holding the current data actually in memory. Or maybe in what's called non-volatile DIMM, or NVDIMM. Which gives you a larger amount. But, you almost certainly don't have the time to go and store that data and you certainly don't want to store it if you can avoid it because it is a large amount of data and if I open my... >> Has limited derivative use. >> Exactly. >> Yeah. >> So you want to get all or quickly get all the value out of that data. Compact it right down using whatever techniques you can, and then take just the results of that inference up to other ones. Now at the beginning of the cycle, you may need more but at the end of the cycle, you'll need very little. >> So Jim, the AI world has built algorithms over many, many, many years. Many of which still persist today, but they were building these algorithms with the idea that they were going to use kind of slower technologies. How is the AI world rethinking algorithms, architectures, pipelines, use cases? As a consequence of these new storage capabilities that David's describing? >> Well yeah, well, AI has become widely distributed in terms of its architecture, and increasingly it's running over containerized, Kubernetes-orchestrated fabrics.
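Before Jim continues, a small sketch of the two workload profiles David just contrasted may help, with the caveat that the function and object names are illustrative placeholders: the development side scans a large shared dataset (which could be a read-only snapshot that many workers attach to), while the inference side keeps only a rolling window in memory and ships a compact result upstream rather than the raw data.

```python
# Illustrative contrast between the two workload profiles described above.
# `model`, `send_upstream`, and the data sources are placeholders.
from collections import deque

def development_pass(dataset_path, model):
    # Data-bound: scan a large shared dataset (e.g. a read-only snapshot that
    # many workers can attach to in parallel) and update the model in bulk.
    with open(dataset_path) as f:
        for line in f:
            model.update(line.rstrip("\n").split(","))
    return model

def inference_loop(sensor_stream, model, send_upstream, window=256):
    # Compute-bound: keep only the current window in memory (think DIMM/NVDIMM),
    # score it in real time, and forward a compact result instead of raw data.
    recent = deque(maxlen=window)
    for reading in sensor_stream:
        recent.append(reading)
        send_upstream({"score": model.predict(list(recent))})
```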
And a lot of this is going on in the area of training, of models and distributing pieces of those models out to various nodes within an edge architecture. It may not be edge in the internet of things sense, but widely distributed, highly parallel environments. As a way of speeding up the training and speeding up the modeling and really speeding up the evaluation of many models running in parallel in an approach called ensemble modeling. To be able to converge on a predictive solution, more rapidly. So, that's very much what David's describing: that's leveraging the fact that memory is far faster than any storage technology we have out there. And so, being able to distribute pieces of the overall modeling and training and even data prep workloads is able to speed up the deployment of highly optimized and highly sophisticated AI models for the cutting edge, you know, challenges we face like the Event Horizon Telescope for example. That we're all aware of when they were able to essentially make a visualization of a black hole. That relied on a form of highly distributed AI called grid computing. For example, I mean the challenges like that demand a highly distributed, memory-centric, orchestrated approach to tackle them. >> So, you're essentially moving the code to the data as opposed to moving all of the data all the way out to the one central point. >> Well so if we think about that notion of moving code to the data. And I started off by suggesting that. In many respects, the Cloud is an architectural approach to how you distribute your workloads as opposed to an approach to centralizing everything in some public Cloud. I think increasingly, application architects and IT organizations and service providers are all seeing things in that way. This is a way of more broadly distributing workloads. Now as we think about, we talked briefly about the relationship between storage and AI workloads but we don't want to leave anyone with the impression that we're at a device level. We're really talking about a network of data that has to be associated with a network of storage. >> Yes. >> Now that suggests a different way of thinking about data and data administration and storage. We're not thinking about devices, we're really trying to move that conversation up into data services. What kind of data services are especially crucial to supporting some of these distributed AI workloads? >> Yes. So there are the standard ones that you need for all data, which is the backup and safety and encryption, security, control. >> Primary storage allocation. >> All of that, you need that in place. But on top of that, you need other things as well. Because you need to understand the mesh, the distributed hybrid Cloud that you have, and you need to know what the capabilities are of each of those nodes, you need to know the latencies between each of those nodes - >> Let me stop you here for a second. When you say "you need to know," do you mean "I as an individual need to know" or "the system needs to know"? >> It needs to be known, and it's too complex, far too complex for an individual ever to solve problems like this so it needs, in fact, its own little AI environment to be able to optimize and check the SLAs so that particular inference coding can be achieved in the way that it's set up. >> It's a mesh type of computing.
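The "its own little AI environment" point can be pictured with a toy placement function: given the nodes in the mesh and a workload's SLA, the platform, not the operator, decides where the inference code should run. The node attributes, SLA fields, and scoring weights below are assumptions for illustration; a real system would learn them from observed telemetry rather than hard-code them.

```python
# Toy placement "AI": pick where to run an inference workload by scoring nodes
# in the mesh against its SLA. Node fields, SLA fields, and weights are
# illustrative assumptions, not a real scheduler.
def pick_node(nodes, sla):
    def feasible(n):
        return (n["latency_ms"] <= sla["max_latency_ms"]
                and n["free_mem_gb"] >= sla["min_mem_gb"])

    def score(n):
        # Prefer low latency first, then memory headroom; weights are arbitrary.
        return 1000.0 / (1 + n["latency_ms"]) + n["free_mem_gb"]

    candidates = [n for n in nodes if feasible(n)]
    return max(candidates, key=score) if candidates else None

nodes = [
    {"name": "edge-a", "latency_ms": 4,  "free_mem_gb": 8},
    {"name": "edge-b", "latency_ms": 2,  "free_mem_gb": 2},
    {"name": "core-1", "latency_ms": 40, "free_mem_gb": 256},
]
print(pick_node(nodes, {"max_latency_ms": 10, "min_mem_gb": 4}))  # -> edge-a
```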
>> Yeah, so it sounds like one of the first use cases for AI, practical, commercial use cases, will be AI within the data plane itself, because the AI workloads are going to drive such a complex model and utilization of data that, if you don't have that, the whole thing will probably just fold in on itself. Jim, how would you characterize this relationship between AI inside the system, and how should people think about that, and is that really going to be a practical, near-term commercial application that folks should be paying attention to? >> Well looking at the Cloud native world, what we need and what we're increasingly seeing out there are solutions, tools, really data planes, that are able to associate a distributed storage infrastructure of a very hybridized nature in terms of disk and flash and so forth with a highly distributed containerized application environment. So for example just last week at Jeredhad I met with the folks from Robin Systems and they're one of the solution providers providing those capabilities to associate, like I said, the storage Cloud with the containerized, essentially application, or Cloud applications that are out there. You know, what we need there, like you've indicated, is the ability to use AI to continue to look for patterns of performance issues, bottlenecks, and so forth and to drive the ongoing placement of data across storage nodes and servers within clusters and so forth, as a way of making sure that storage resources are always used efficiently, that SLAs, as David indicated, are always observed in an automated fashion as the data placement and workload placement decisions are being made, and so ultimately that the AI itself, whatever it's doing, like recognizing faces or recognizing human language, is able to do it as efficiently and really as cheaply as possible. >> Right, so let me summarize what we've got so far. We've got that there is a relationship between storage and AI, that the workload suggests that we're going to have centralized modeling, large volumes of data, we're going to have distributed inferencing, smaller on data, more complex computing. Flash is crucial, mesh is crucial, and increasingly because of the distributed nature of these applications, there's going to have to be very specific and specialized AI in the infrastructure, in that mesh itself, to administer a lot of these data resources. >> Absolutely. >> So, but we want to be careful here, right David? We don't want to suggest that we have, just as the notion of everything goes into a centralized Cloud under a central administrative effort, we also don't want to suggest this notion that there's this broad, heterogeneous, common, democratized, every service available everywhere. Let's bring hybrid Cloud into this. >> Right. >> How will hybrid Cloud ultimately evolve to ensure that we get common services where we need them? And know where we don't have common services so that we can factor in those constraints? >> So it's useful to think about the hybrid Cloud from the point of view of the development, which will be fairly normal types of computing and be in really large centers, and the edges themselves, which will be what we call autonomous Clouds. Those are the ones at the edge which need to be self-sufficient. So if you have an autonomous car, you can't guarantee that you will have communication to it. And most - a lot of IoT devices in distant places, again, on ships or in distant places, where you can't guarantee communication. So they have to be able to run much more by themselves.
So that's one important characteristic: that autonomous Cloud needs to be self-sufficient itself and have within it all the capabilities of running that particular code. And then passing up data when it can. >> Now you gave examples where it's physically required to do that, but there are also OT examples. >> Exactly. >> Operational technologies where you need to have that air gap to ensure that bad guys can't get into your data. >> Yes, absolutely, I mean if you think about a boat, a ship, it has multiple very clear air gaps, and a nuclear power station has a total air gap around it. You must have those sorts of air gaps. So it's a different architecture for different uses for different areas. But of course data is going to come up from those autonomous Clouds, upwards, but it will be a very small amount of the data that's actually being processed. And there'll be requests down to those autonomous Clouds for additional processing of one sort or another. So there still will be a discussion, communication, between them, to ensure that the final outcome, the business outcome, is met. >> All right, so I'm going to ask each of you guys to give me a quick prediction. David, I'm going to ask you about storage and then, Jim, I'm going to ask you about AI in light of David's prediction about storage. So David, as we think about where these AI workloads seem to be going, how is storage technology going to evolve to make AI applications easier to deal with, easier to run, cheaper to run, more secure? >> Well, the fundamental move is towards larger amounts of Flash. And the new thing is larger amounts of non-volatile DIMM, the memory in the computer itself, those are going to get much, much bigger, those are going to help with the execution of these real-time applications, and there's going to be high-speed communication over short distances between the different nodes in this mesh architecture. So that's on the inference side, there's a big change happening in that space. On the development side the storage will move towards sharing data. So having a copy of the data which is available to everybody, and that data will be distributed. So sharing that data, having that data distributed, will then enable the sorts of ways of using that data which will retain context, which is incredibly important, and avoid the cost and the loss of value because of the time taken moving that data from A to B. >> All right, so to summarize, we've got a new level in the storage hierarchy that sits between Flash and memory to really accelerate things, and then secondly we've got this notion that increasingly we have to provide a way of handling time and context so that we sustain fidelity especially in more real-time applications. Jim, given that this is where storage is going to go, what does that say about AI? >> What it says about AI is that first of all, we're talking about, like David said, meshes of meshes, every edge node is increasingly becoming a mesh in its own right with disparate CPUs and GPUs and whatever, doing different inferencing on each device, but every one of these, like a smart car, will have plenty of embedded storage to process a lot of data locally that may need to be kept locally for lots of very good reasons, like a black box in case of an accident, but also in terms of e-discovery of the data and the models that might have led up to an accident that might have caused fatalities and whatnot.
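One way to picture the autonomous-Cloud behavior described above is a node that keeps inferencing locally whether or not the uplink is available, retains a bounded local buffer, and forwards only compact results upstream when it can. The infer, link_is_up, and upload callables in this sketch are placeholders; it illustrates the pattern, not any particular product's implementation.

```python
# Sketch of an autonomous edge node: keep inferencing locally, retain a bounded
# buffer of compact results, and flush upstream only when a link is available.
# infer(), link_is_up(), and upload() are placeholder callables.
from collections import deque

def autonomous_node(sensor_stream, infer, link_is_up, upload, max_buffer=10_000):
    buffered = deque(maxlen=max_buffer)   # bounded local retention
    for reading in sensor_stream:
        buffered.append(infer(reading))   # act locally, in real time
        if link_is_up():                  # opportunistic sync upward
            upload(list(buffered))        # results only, not the raw readings
            buffered.clear()
```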
So when we look at where AI is going, AI is going into the mesh of mesh, meshes of meshes, where there's AI running in each of the nodes within the meshes, and the meshes themselves will operate as autonomous decisioning nodes within a broader environment. Now in terms of the context, the context that increasingly surrounds all of the AI within these distributed architectures will be in the form of graphs, and graphs are something distinct from the statistical algorithms that we built AI out of. We're talking about knowledge graphs, we're talking about social graphs, we're talking about behavioral graphs, so graph technology is just getting going. For example, Microsoft recently built, they made a big continued push into threading graph - contextual graph technology - into everything they do. So that's where I see AI going: up from statistical models to graph models as the broader metadata framework for binding everything together. >> Excellent. All right guys, so Jim, I think another topic for another time might be the mesh mess. (laughs) But we won't do that now. All right, let's summarize really quickly. We've talked about how the relationship between AI, storage and hybrid Clouds is going to evolve. Number one, AI workloads are differentiated by where we handle modeling: large amounts of data still need a lot of compute, but we're really focused on large amounts of data and moving that data around very, very quickly, and therefore staying proximate to where the workload resides. Great, great application for Clouds, large, public as well as private. On the other side, where the inferencing work is done, that's going to be very compute-bound, smaller data volumes, but very, very fast data. Lots of flash everywhere. The second thing we observed is that these new AI applications are going to be used and applied in a lot of different domains, both within human interaction as well as real-time domains within IoT, et cetera, but that as we evolve, we're going to see a greater relationship between the nature of the workload and the class of the storage, and that is going to be a crucial feature for storage administrators and storage vendors over the next few years: to ensure that that specialization is reflected in what's known and what's needed. Now the last point that we'll make very quickly is that as we look forward, the whole concept of hybrid Cloud, where we can have greater predictability into the nature of data-oriented services that are available for different workloads, is going to be really, really important. We're not going to have all data services common in all places. But we do want to make sure that, whether it's a container-based application or some other structure, we can ensure that the data that is required will be there in the context, form, and metadata structures that are required. Ultimately, as we look forward, we see new classes of storage evolving that bring data even closer to the compute side, and we see new data models emerging, such as graph models, that are a better overall reflection of how this distributed data is going to evolve within hybrid Cloud environments. David Floyer, Jim Kobielus, Wikibon analysts, I'm Peter Burris, once again, this has been Action Item.