Robert Nishihara, Anyscale | AWS Startup Showcase S3 E1
(upbeat music) >> Hello everyone. Welcome to theCube's presentation of the "AWS Startup Showcase." The topic this episode is AI and machine learning, top startups building foundational model infrastructure. This is season three, episode one of the ongoing series covering exciting startups from the AWS ecosystem. And this time we're talking about AI and machine learning. I'm your host, John Furrier. I'm excited to be joined today by Robert Nishihara, who's the co-founder and CEO of a hot startup called Anyscale. He's here to talk about Ray, the open source project, and Anyscale's infrastructure for foundation models as well. Robert, thank you for joining us today. >> Yeah, thanks so much for having me. >> I've been following your company since the founding, pre-pandemic, and you guys really had a great vision, scaled up, and are in a perfect position for this big wave that we all see with ChatGPT and OpenAI that's gone mainstream. Finally, AI has broken out through the ropes and now gone mainstream, so I think you guys are really well positioned. I'm looking forward to talking with you today. But before we get into it, introduce the core mission for Anyscale. Why do you guys exist? What is the North Star for Anyscale? >> Yeah, like you mentioned, there's a tremendous amount of excitement about AI right now. You know, I think a lot of us believe that AI can transform just about every industry. So one of the things that was clear to us when we started this company was that the amount of compute needed to do AI was just exploding. To actually succeed with AI, companies like OpenAI or Google, you know, these companies getting a lot of value from AI, were not just running these machine learning models on their laptops or on a single machine. They were scaling these applications across hundreds or thousands or more machines and GPUs and other resources in the Cloud. And this has been one of the biggest trends in computing, maybe the biggest trend in computing in recent history, the amount of compute has been exploding. And so to actually succeed with AI, to actually build these scalable applications and scale the AI applications, there's a tremendous software engineering lift to build the infrastructure to actually run them. And that's very hard to do. So one of the reasons many AI projects and initiatives fail, or don't make it to production, is the need for this scale, the infrastructure lift, to actually make it happen. So our goal here with Anyscale and Ray is to make that easy, to make scalable computing easy. So that as a developer or as a business, if you want to do AI, if you want to get value out of AI, all you need to know is how to program on your laptop. Like, all you need to know is how to program in Python. And if you can do that, then you're good to go. Then you can do what companies like OpenAI or Google do and get value out of machine learning. >> That programming example of how easy it is with Python reminds me of the early days of Cloud, when infrastructure as code was first talked about, the idea was to make the infrastructure programmable through code. That's super important. That's what AI people want, programmable AI. That's the new trend. And I want to understand, if you don't mind explaining, the relationship that Anyscale has to these foundational models and in particular the large language models, also called LLMs, as seen with OpenAI and ChatGPT.
Before you get into the relationship that you have with them, can you explain why the hype around foundational models? Why are people going crazy over foundational models? What is it and why is it so important? >> Yeah, so foundation models are incredibly important because they enable businesses and developers to get value out of machine learning, to use machine learning off the shelf with these large models that have been trained on tons of data and that are useful out of the box. And then, of course, you know, as a business or as a developer, you can take those foundational models and repurpose them or fine tune them or adapt them to your specific use case and what you want to achieve. But it's much easier to do that than to train them from scratch. And I think, for people to actually use foundation models, there are three main types of workloads or problems that need to be solved. One is training these foundation models in the first place, like actually creating them. The second is fine tuning them and adapting them to your use case. And the third is serving them and actually deploying them. Okay, so Ray and Anyscale are used for all three of these different workloads. Companies like OpenAI or Cohere that train large language models, or open source versions like GPT-J, do that training on top of Ray. There are many startups and other businesses that fine tune, that, you know, don't want to train the large underlying foundation models, but that do want to fine tune them, do want to adapt them to their purposes, and build products around them and serve them, and those are also using Ray and Anyscale for that fine tuning and that serving. And so the reason that Ray and Anyscale are important here is that, you know, building and using foundation models requires huge scale. It requires a lot of data. It requires a lot of compute, GPUs, TPUs, other resources. And to actually take advantage of that and actually build these scalable applications, there's a lot of infrastructure that needs to happen under the hood. And so you can either use Ray and Anyscale to take care of that and manage the infrastructure and solve those infrastructure problems, or you can build the infrastructure and manage the infrastructure yourself, which you can do, but it's going to slow your team down. Many of the businesses we work with simply don't want to be in the business of managing infrastructure and building infrastructure. They want to focus on product development and move faster. >> I know you got a keynote presentation we're going to go to in a second, but I think you hit on something I think is the real tipping point, doing it yourself, hard to do. These are things where opportunities are, and the Cloud did that with data centers. Turned a data center into an API. The heavy lifting went away and went to the Cloud so people could be more creative and build their product. In this case, build their creativity. Is that kind of what's the big deal? Is that kind of a big deal happening, that you guys are taking the learnings and making that available so people don't have to do that? >> That's exactly right. So today, if you want to succeed with AI, if you want to use AI in your business, infrastructure work is on the critical path for doing that. To do AI, you have to build infrastructure. You have to figure out how to scale your applications. That's going to change.
We're going to get to the point, and you know, with Ray and Anyscale, we're going to remove the infrastructure from the critical path so that as a developer or as a business, all you need to focus on is your application logic, what you want the program to do, what you want your application to do, how you want the AI to actually interface with the rest of your product. Now the way that will happen is that the infrastructure work will still happen. It'll just be under the hood and taken care of by Ray and Anyscale. And so I think something like this is really necessary for AI to reach its potential, for AI to have the impact and the reach that we think it will. You have to make it easier to do. >> And just for clarification, to point out, if you don't mind explaining the relationship of Ray and Anyscale real quick just before we get into the presentation. >> So Ray is an open source project. We created it. We were at Berkeley doing machine learning. We started Ray in order to provide a simple open source tool for building and running scalable applications. And Anyscale is the managed version of Ray. Basically, we will run Ray for you in the Cloud, provide a lot of tools around the developer experience, and manage the infrastructure and provide more performance and superior infrastructure. >> Awesome. I know you got a presentation on Ray and Anyscale, and you guys are positioning as the infrastructure for foundational models. So I'll let you take it away, and then when you're done presenting, we'll come back, I'll probably grill you with a few questions, and then we'll close it out, so take it away. >> Robert: Sounds great. So I'll say a little bit about how companies are using Ray and Anyscale for foundation models. The first thing I want to mention is just why we're doing this in the first place. And the underlying observation, the underlying trend here, and this is a plot from OpenAI, is that the amount of compute needed to do machine learning has been exploding. It's been growing at something like 35 times every 18 months. This is absolutely enormous. And other people have written papers measuring this trend and you get different numbers. But the point is, no matter how you slice and dice it, it's an astronomical rate. Now if you compare that to something we're all familiar with, like Moore's Law, which says that, you know, processor performance doubles roughly every 18 months, you can see that there's just a tremendous gap between the compute needs of machine learning applications and what you can do with a single chip, right. So even if Moore's Law were continuing strong and, you know, doing what it used to be doing, even if that were the case, there would still be a tremendous gap between what you can do with the chip and what you need in order to do machine learning. And so given this graph, what we've seen, and what has been clear to us since we started this company, is that doing AI requires scaling. There's no way around it. It's not a nice to have, it's really a requirement. And so that led us to start Ray, which is the open source project that we started to make it easy to build these scalable Python applications and scalable machine learning applications. And since we started the project, it's been adopted by a tremendous number of companies.
Companies like OpenAI, which use Ray to train their large models like ChatGPT, companies like Uber, which run all of their deep learning and classical machine learning on top of Ray, companies like Shopify or Spotify or Instacart or Lyft or Netflix, ByteDance, which use Ray for their machine learning infrastructure. Companies like Ant Group, which makes Alipay, you know, they use Ray across the board for fraud detection, for online learning, for detecting money laundering, you know, for graph processing, stream processing. Companies like Amazon, you know, run Ray at a tremendous scale, processing petabytes of data every single day. And so the project has seen just enormous adoption over the past few years. And one of the most exciting use cases is really providing the infrastructure for building, training, fine tuning, and serving foundation models. So I'll say a little bit about, you know, here are some examples of companies using Ray for foundation models. Cohere trains large language models. OpenAI also trains large language models. The workloads required there are things like supervised pre-training, and also reinforcement learning from human feedback. So this is not only the regular supervised learning, but actually more complex reinforcement learning workloads that take human input about which response to a particular question is better than another, and incorporate that into the learning. There are open source versions as well, like GPT-J, also built on top of Ray, as well as projects like Alpa coming out of UC Berkeley. So these are some examples of exciting projects and organizations training and creating these large language models and serving them using Ray. Okay, so what actually is Ray? Well, there are two layers to Ray. At the lowest level, there's the core Ray system. This is essentially low level primitives for building scalable Python applications. Things like taking a Python function or a Python class and executing them in the cluster setting. So Ray core is extremely flexible and you can build arbitrary scalable applications on top of Ray. So on top of Ray, on top of the core system, what really gives Ray a lot of its power is this ecosystem of scalable libraries. So on top of the core system you have scalable libraries for ingesting and pre-processing data, for training your models, for fine tuning those models, for hyperparameter tuning, for doing batch processing and batch inference, for doing model serving and deployment, right. And a lot of the Ray users, the reason they like Ray is that they want to run multiple workloads. They want to train and serve their models, right. They want to load their data and feed that into training. And Ray provides common infrastructure for all of these different workloads. So this is a little overview of the different components of Ray. So why do people choose to go with Ray? I think there are three main reasons. The first is the unified nature. The fact that it is common infrastructure for scaling arbitrary workloads, from data ingest to pre-processing to training to inference and serving, right. This also includes the fact that it's future proof. AI is incredibly fast moving. And so many people, many companies that have built their own machine learning infrastructure and standardized on particular workflows for doing machine learning have found that their workflows are too rigid to enable new capabilities.
If they want to do reinforcement learning, if they want to use graph neural networks, they don't have a way of doing that with their standard tooling. And so Ray, being future proof and being flexible and general, gives them that ability. Another reason people choose Ray and Anyscale is the scalability. This is really our bread and butter. This is the reason, the whole point of Ray, you know, making it easy to go from your laptop to running on thousands of GPUs, making it easy to scale your development workloads and run them in production, making it easy to scale, you know, training, to scale data ingest, pre-processing and so on. So scalability and performance, you know, are critical for doing machine learning, and that is something that Ray provides out of the box. And lastly, Ray is an open ecosystem. You can run it anywhere. You can run it on any Cloud provider. Google, you know, Google Cloud, AWS, Azure. You can run it on your Kubernetes cluster. You can run it on your laptop. It's extremely portable. And not only that, it's framework agnostic. You can use Ray to scale arbitrary Python workloads. You can use it to scale, and it integrates with, libraries like TensorFlow or PyTorch or JAX or XGBoost or Hugging Face or PyTorch Lightning, right, or Scikit-learn, or just your own arbitrary Python code. It's open source. And in addition to integrating with the rest of the machine learning ecosystem and these machine learning frameworks, you can use Ray along with all of the other tooling in the machine learning ecosystem. That's things like Weights & Biases or MLflow, right. Or, you know, different data platforms like Databricks, you know, Delta Lake or Snowflake, or tools for model monitoring, or feature stores, all of these integrate with Ray. And that's, you know, Ray provides that kind of flexibility so that you can integrate it into the rest of your workflow. And then Anyscale is the scalable compute platform that's built on top, you know, that provides Ray. So Anyscale is a managed Ray service that runs in the Cloud. And what Anyscale does is it offers the best way to run Ray. And if you think about what you get with Anyscale, there are fundamentally two things. One is about moving faster, accelerating the time to market. And you get that by having the managed service, so that as a developer you don't have to worry about managing infrastructure, you don't have to worry about configuring infrastructure. It also provides, you know, optimized developer workflows. Things like easily moving from development to production, things like having the observability tooling, the debuggability to actually easily diagnose what's going wrong in a distributed application. So things like the dashboards and the other kinds of tooling for collaboration, for monitoring and so on. And then on top of that, so that's the first bucket, developer productivity, moving faster, faster experimentation and iteration. The second reason that people choose Anyscale is superior infrastructure. So this is things like, you know, cost efficiency, being able to easily take advantage of spot instances, being able to get higher GPU utilization, things like faster cluster startup times and auto scaling. Things like just overall better performance and faster scheduling. And so these are the kinds of things that Anyscale provides on top of Ray. It's the managed infrastructure. It's fast, it's like the developer productivity and velocity as well as performance. So this is what I wanted to share about Ray and Anyscale.
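To make the core primitives Robert describes concrete, here is a minimal sketch of Ray's task and actor model in Python. It assumes a recent Ray 2.x release; the `preprocess` function and `Counter` class are invented for illustration, not taken from the episode:

```python
import ray

ray.init()  # connects to an existing cluster, or starts one locally

@ray.remote
def preprocess(batch):
    # An ordinary Python function, now schedulable anywhere in the cluster.
    return [x * 2 for x in batch]

@ray.remote
class Counter:
    # An ordinary Python class, now a stateful actor living in the cluster.
    def __init__(self):
        self.n = 0

    def increment(self):
        self.n += 1
        return self.n

# Launch four tasks in parallel; ray.get gathers the results.
futures = [preprocess.remote(list(range(10))) for _ in range(4)]
results = ray.get(futures)

counter = Counter.remote()
print(ray.get(counter.increment.remote()))  # -> 1
```

The same code runs unchanged on a laptop or on a large cluster, which is the portability point Robert makes above.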
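And since serving is one of the three foundation model workloads called out earlier, here is a similarly hedged sketch of what a deployment might look like with Ray Serve, Ray's serving library, again assuming the Ray 2.x API. The `LLMServer` class and its echo model are purely illustrative stand-ins for a real fine tuned model:

```python
from ray import serve
from starlette.requests import Request

@serve.deployment(num_replicas=2)
class LLMServer:
    def __init__(self):
        # Stand-in for loading a fine tuned foundation model; in practice
        # this is where the real model loading code would go.
        self.model = lambda prompt: f"echo: {prompt}"

    async def __call__(self, request: Request) -> str:
        prompt = (await request.json())["prompt"]
        return self.model(prompt)

# Deploys the replicas onto the Ray cluster and exposes them over HTTP.
serve.run(LLMServer.bind())
```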
>> John: Awesome. >> Provide that context. But John, I'm curious what you think. >> I love it. I love the, so first of all, it's a platform, because that's the platform architecture right there. So just to clarify, this is an Anyscale platform, not- >> That's right. >> Tools. So you got tools in the platform. Okay, that's key. Love that managed service. Just curious, you mentioned Python multiple times. Is that because of PyTorch and TensorFlow, or is Python the most friendly with machine learning, or is it because it's very common amongst all developers? >> That's a great question. Python is the language that people are using to do machine learning. So it's the natural starting point. Now, of course, Ray is actually designed in a language agnostic way, and there are companies out there that use Ray to build scalable Java applications. But for the most part right now we're focused on Python and being the best way to build these scalable Python and machine learning applications. But, of course, down the road there always is that potential. >> So if you're slinging Python code out there and you're watching that, you're watching this video, get on the Anyscale bus quickly. Also, I just, while you were giving the presentation, I couldn't help, since you mentioned OpenAI, which by the way, congratulations 'cause they've had great scale, I've noticed their rapid growth 'cause they got to that number of users faster than anyone in the history of the computer industry, so major success for OpenAI and ChatGPT, huge fan. I'm not a skeptic at all. I think it's just the beginning, so congratulations. But I actually typed into ChatGPT, what are the top three benefits of Anyscale, and it came up with scalability, flexibility, and ease of use. Obviously, scalability is what you guys are called. >> That's pretty good. >> So that's what they came up with. So they nailed it. Did you have some inside prompt training, put that in there? Only kidding. (Robert laughs) >> Yeah, we hard coded that one. >> But that's the kind of thing that came up really, really quickly. If I asked it to write a sales document, it probably will, but this is the future interface. This is why people are getting excited about the foundational models and the large language models, because it's allowing the interface with the user, the consumer, to be more human, more natural. And this clearly will be in every application in the future. >> Absolutely. This is how people are going to interface with software, how they're going to interface with products in the future. It's not just something, you know, not just a chat bot that you talk to. This is going to be how you get things done, right. How you use your web browser or how you use, you know, how you use Photoshop or how you use other products. Like you're not going to spend hours learning all the APIs and how to use them. You're going to talk to it and tell it what you want it to do. And of course, you know, if it doesn't understand it, it's going to ask clarifying questions. You're going to have a conversation and then it'll figure it out. >> This is going to be one of those things, we're going to look back at this time, Robert, and say, "Yeah, from that company, that was the beginning of that wave." And just like AWS and Cloud Computing, the folks who got in early really were in position when, say, the pandemic came.
So getting in early is a good thing, and that's what everyone's talking about, is getting in early and playing around, maybe replatforming or even picking one or a few apps to refactor with some SaaS and managed services. So people are definitely jumping in. So I have to ask you the ROI cost question. You mentioned some of those, Moore's Law versus what's going on in the industry. When you look at that kind of scale, the first thing that jumps out at people is, "Okay, I love it. Let's go play around." But what's it going to cost me? Am I going to be tied to certain GPUs? What's the landscape look like from an operational standpoint, from the customer? Are they locked in, and the benefit was flexibility, are you flexible to handle any Cloud? What are customers looking at? Basically, that's my question. What's the customer looking at? >> Cost is super important here, and many of the companies, I mean, companies are spending a huge amount on their Cloud computing, on AWS, and on doing AI, right. And I think a lot of the advantage of Anyscale, what we can provide here, is not only better performance, but cost efficiency. Because if we can run something faster and more efficiently, it can also use less resources and you can lower your Cloud spending, right. We've seen companies go from, you know, 20% GPU utilization with their current setup and the current tools they're using, to running on Anyscale and getting more like 95, you know, 100% GPU utilization. That's something like a 5x improvement right there. So depending on the kind of application you're running, you know, it's a significant cost savings. We've seen companies that are, you know, processing petabytes of data every single day with Ray get order of magnitude cost savings by switching from what they were previously doing to running their application on Ray. And when you have applications that are spending, you know, potentially $100 million a year, getting a 10x cost savings is just absolutely enormous. So these are some of the kinds of- >> Data infrastructure is super important. Again, if the customer, if you're a prospect for this and thinking about going in here, just like the Cloud, you got infrastructure, you got the platform, you got SaaS. Same kind of thing's going to go on in AI. So I want to get into that, you know, ROI discussion and some of the impact with your customers that are leveraging the platform. But first I hear you got a demo. >> Robert: Yeah, so let me show you, let me give you a quick run through here. So what I have open here is the Anyscale UI. I've started a little Anyscale Workspace. So Workspaces are the Anyscale concept for interactive development, right. So here, imagine, you want to have a familiar experience like you're developing on your laptop. And here I have a terminal. It's not on my laptop. It's actually in the cloud, running on Anyscale. And I'm just going to kick this off. This is going to train a large language model, so OPT. And it's doing this on 32 GPUs. We've got a cluster here with a bunch of CPU cores, a bunch of memory. And as that's running, and by the way, if I wanted to run this on 64 or 128 GPUs instead of 32, this is just a one line change when I launch the Workspace. And what I can do is I can pull up VS Code, right. Remember this is the interactive development experience. I can look at the actual code. Here it's using Ray Train to train the PyTorch model.
We've got the training loop and we're saying that each worker gets access to one GPU and four CPU cores. And, of course, as I make the model larger, this is using DeepSpeed, as I make the model larger, I could increase the number of GPUs that each worker gets access to, right, and how that is distributed across the cluster. And if I wanted to run on CPUs instead of GPUs, or a different, you know, accelerator type, again, this is just a one line change. And here we're using Ray Train to train the models, just taking my vanilla PyTorch model using Hugging Face and then scaling that across a bunch of GPUs. And, of course, if I want to look at the dashboard, I can go to the Ray dashboard. There are a bunch of different visualizations I can look at. I can look at the GPU utilization. I can look at, you know, the CPU utilization here, where I think we're currently loading the model and running that actual application to start the training. And some of the things that are really convenient here about Anyscale, both I can get that interactive development experience with VS Code. You know, I can look at the dashboards. I can monitor what's going on. I have a terminal, it feels like my laptop, but it's actually running on a large cluster. And I can do that with however many GPUs or other resources I want. And so it's really trying to combine the best of having the familiar experience of programming on your laptop, but with the benefits, you know, of being able to take advantage of all the resources in the Cloud to scale. And it's like when, you know, you're talking about cost efficiency. One of the biggest reasons that people waste money, one of the silly reasons for wasting money, is just forgetting to turn off your GPUs. And what you can do here is, of course, things will auto terminate if they're idle. But imagine you go to sleep, I have this big cluster. You can turn it off, shut off the cluster, come back tomorrow, restart the Workspace, and you know, your big cluster is back up and all of your code changes are still there. All of your local file edits. It's like you just closed your laptop and came back and opened it up again. And so this is the kind of experience we want to provide for our users. So that's what I wanted to share with you. >> Well, I think that whole, couple of things, lines of code change, single line of code change, that's game changing. And then the cost thing, I mean human error is a big deal. People pass out at their computer. They've been coding all night or they just forget about it. I mean, and then it's just like leaving the lights on or your water running in your house. It's just, at the scale that it is, the numbers will add up. That's a huge deal. So I think, you know, compute back in the old days, if there's no work, it's just compute sitting there idle. But, you know, data cranking through the models, that's a big point. >> Another thing I want to add there about cost efficiency is that we make it really easy, if you're running on Anyscale, to use spot instances, these preemptible instances that can just be significantly cheaper than the on-demand instances. And so when we see our customers go from what they're doing before to using Anyscale, they go from not using these spot instances, 'cause they don't have the infrastructure around it, the fault tolerance to handle the preemption and things like that, to being able to just check a box and use spot instances and save a bunch of money.
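For readers following along, here is a rough sketch of the shape of the Ray Train code shown in the demo, assuming a recent Ray 2.x release. The tiny linear model is a hypothetical stand-in for the real Hugging Face OPT setup, and the ScalingConfig is where the "one line change" Robert mentions lives:

```python
import torch
import ray.train.torch
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer

def train_loop_per_worker(config):
    # Ordinary PyTorch code; Ray wraps the model (and data loaders) for
    # distributed execution. DeepSpeed or FSDP can be layered in here.
    model = torch.nn.Linear(10, 1)  # stand-in for the real OPT model setup
    model = ray.train.torch.prepare_model(model)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    for _ in range(config.get("epochs", 1)):
        x, y = torch.randn(32, 10), torch.randn(32, 1)
        loss = torch.nn.functional.mse_loss(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"epochs": 1},
    # The "one line change": each worker gets 1 GPU and 4 CPU cores, and
    # scaling to 64 or 128 GPUs means editing num_workers right here.
    scaling_config=ScalingConfig(
        num_workers=32,
        use_gpu=True,
        resources_per_worker={"GPU": 1, "CPU": 4},
    ),
)
result = trainer.fit()
```

On a laptop you would dial num_workers down to what the machine can host; on a cluster the same script fans out across however many GPUs the ScalingConfig asks for.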
>> You know, this was my whole, my feature article at re:Invent last year when I met with Adam Selipsky. This next gen Cloud is here. I mean, it's not auto scale, it's infrastructure scale. It's agility. It's flexibility. I think this is where the world needs to go. Almost what DevOps did for Cloud, and what you were showing me, that demo had this whole SRE vibe. And remember Google had site reliability engineers to manage all those servers. This is kind of like an SRE vibe for data at scale. I mean, a similar kind of order of magnitude. I mean, I might be a little bit off base there, but how would you explain it? >> It's a nice analogy. I mean, what we are trying to do here is get to the point where developers don't think about infrastructure. Where developers only think about their application logic. And where businesses can do AI, can succeed with AI, and build these scalable applications, but they don't have to build, you know, an infrastructure team. They don't have to develop that expertise. They don't have to invest years in building their internal machine learning infrastructure. They can just focus on the Python code, on their application logic, and run the stuff out of the box. >> Awesome. Well, I appreciate the time. Before we wrap up here, give a plug for the company. I know you got a couple websites. Again, Ray's got its own website. You got Anyscale. You got an event coming up. Give a plug for the company, you're looking to hire. Put a plug in for the company. >> Yeah, absolutely. Thank you. So first of all, you know, we think AI is really going to transform every industry and the opportunity is there, right. We can be the infrastructure that enables all of that to happen, that makes it easy for companies to succeed with AI, and get value out of AI. Now, if you're interested in learning more about Ray, Ray has been emerging as the standard way to build scalable applications. Our adoption has been exploding. I mentioned companies like OpenAI using Ray to train their models. But really across the board, companies like Netflix and Cruise and Instacart and Lyft and Uber, you know, just among tech companies. It's across every industry. You know, gaming companies, agriculture, you know, farming, robotics, drug discovery, you know, FinTech, we see it across the board. And all of these companies can get value out of AI, can really use AI to improve their businesses. So if you're interested in learning more about Ray and Anyscale, we have our Ray Summit coming up in September. This is going to highlight a lot of the most impressive use cases and stories across the industry. And if your business, if you want to use LLMs, you want to train these LLMs, these large language models, you want to fine tune them with your data, you want to deploy them, serve them, and build applications and products around them, give us a call, talk to us. You know, we can really take the infrastructure piece, you know, off the critical path and make that easy for you. So that's what I would say. And, you know, like you mentioned, we're hiring across the board, you know, engineering, product, go-to-market, and it's an exciting time. >> Robert Nishihara, co-founder and CEO of Anyscale, congratulations on a great company you've built and continuing to iterate on, and you got growth ahead of you, you got a tailwind. I mean, the AI wave is here.
I think OpenAI and ChatGPT, a customer of yours, have really opened up the mainstream visibility into this new generation of applications, user interface, role of data, large scale, how to make that programmable, so we're going to need that infrastructure. So thanks for coming on this season three, episode one of the ongoing series covering the hot startups. In this case, this episode is the top startups building foundational model infrastructure for AI and ML. I'm John Furrier, your host. Thanks for watching. (upbeat music)
Wikibon | Action Item, Feb 2018
>> Hi, I'm Peter Burris. Welcome to Action Item. (electronic music) There's an enormous net new array of software technologies available to businesses and enterprises to tend to some new classes of problems, and that means that there's an explosion in the number of problems that people perceive could be addressed, or could be solved, with software approaches. The whole world of how we're going to automate things differently with artificial intelligence and any number of other software technologies is being brought to bear on problems in ways that we never envisioned or never thought possible. That leads ultimately to a comparable explosion in the number of approaches to how we're going to solve some of these problems. That means new tooling, new models, and any number of other structures, conventions, and artifacts that are going to have to be factored in by IT organizations and professionals in the technology industry as they conceive and put forward plans and approaches to solving some of these problems. Now, George, that leads to a question. Are we going to see an ongoing ever-expanding array of approaches, or are we going to see some new kind of steady-state that starts to simplify what happens, or how enterprises conceive of the role of software in solving problems? >> Well, we've had probably four decades of packaged applications being installed and defining really the systems of record, which first handled the order-to-cash process and then layered around that. Once we had more CRM capabilities, we had the sort of opportunity-to-lead capability added in there. But systems of record fundamentally are backward looking. They're tracking the performance of the business. The opportunity-- >> Peter: Recording what has happened? >> Yes, recording what has happened. The opportunity we have now is to combine what the big Internet companies pioneered with systems of engagement, where you had machine learning anticipating and influencing interactions. You can now combine those sorts of analytics with systems of record to inform and automate decisions in the form of transactions. And the question is now, how are we going to do this? Is there some way to simplify, or not completely standardize, but can we make it so that we have at least some conventions and design patterns for how to do that? >> And David, we've been working on this problem for quite some time, but the notion of convergence has been extant in the hardware and the services, or in the systems business, for quite some time. Take us through what convergence means and how it is going to set up new ways of thinking about software. >> So there's a hardware convergence, and it's useful to define a few terms. There are converged systems; those are systems which have some management software that has been brought into them, and then on top of that they have traditional SANs and networks. There are hyper-converged systems, which started off in the cloud systems and now have come to the enterprise as well. And those bring software networking, software storage, software-- >> Software defined, so it's a virtualizing of those converged systems. >> David: Absolutely, and in the future it's going to bring also automated operational stuff, AI on the operational side. And then there's full stack convergence, where we start to put in the software, the application software, beginning with the database side of things and then the application itself on top of the database.
And finally there's, what you are talking about, the systems of intelligence, where we can combine both the systems of record, the systems of engagement, and the real-time analytics as a complete stack. >> Peter: Let's talk about this for a second, because ultimately what I think you're saying is that we've got hardware convergence in the form of converged infrastructure, hyper-converged in the form of virtualization of that, new ways of thinking about how the stack comes together, and new ways of thinking about application components. But what seems to be the common thread through all of this is data. >> David: Yes. >> So basically what we're seeing is a convergence or a rethinking of how software elements revolve around the data. Is that kind of the centerpiece of this? >> David: That's the centerpiece of it, and we had very serious constraints on accessing data. Those will improve with flash, but there's still a lot of room for improvement. And the architecture that we are seeing come forward, which really helps this a lot, is the UniGrid architecture, where we offload the networking and the storage from the processor. This is already happening in the hyper scale clouds; they're putting a lot of effort into doing this. But we're at the same time allowing any processor to access any data in a much more fluid way, and we can grow that to thousands of processors. Now that type of architecture gives us the ability to converge the traditional systems of record, and there are a lot of them obviously, and the systems of engagement and the real-time analytics for the first time. >> But the focal point of that convergence is not the licensing of the software, the focal point is convergence around the data. >> The data. >> But that has some pretty significant implications when we think about how software has always been sold, how organizations that run software have been structured, the way that funding is set up within businesses. So George, what does it mean to talk about converging software around data from a practical standpoint over the next few years? >> Okay, so let me take that and interpret it as converging the software around data in the context of adding intelligence to our existing application portfolio and then the new applications that follow on. And basically, when we want to inject intelligence, enough to anticipate and inform interactions or inform or automate transactions, we have a bunch of steps that need to get done, where we're ingesting essentially contextual or ambient information. Often this is information about a user or the business process. And this data, it's got to go through a pipeline where there's both a Design Time and a Run Time. In addition to ingesting it, you have to sort of enrich it and make it ready for analysis. Then the analysis is essentially picking out of all that data and calculating the features that you plug into a machine learning model. And then that produces essentially an inference based on all that data, that says, well, this is the probable value. And it sounds like it's in the weeds, but the point is it's actually a standardized set of steps. Then the question is, do you put that all together in one product across that whole pipeline? Can one piece of infrastructure software manage that? Or do you have a bunch of pieces, each handing off to the next? And-- >> Peter: But let me stop you there, because I want to make sure that we kind of follow this thread.
So we've argued that hardware convergence and the ability to scale the role the data plays, or how data is used, is happening, and that opens up new opportunities to think about data. Now what we've got is we are centering a lot of the software convergence around the use of data, through copies and other types of mechanisms for handling snapshots and whatnot, and things like UniGrid. What you're, let's start with this. It sounds like what you're saying is we need to think of new classes of investments in technologies that are specifically set up to handle the processing of data in a more distributed application way, right? If I got that right, that's kind of what we mean by pipelines? >> George: Yes. >> Okay, so once we do that, once we establish those conventions, once we establish organizationally, institutionally, how that's going to work. Now we take the next step of saying, are we going to default to a single set of products, or are we going to do best of breed, and what kind of convergence are we going to see there? >> And there's no-- >> First of all, have I got that right? >> Yes, but there's no right answer. And I think there's a bunch of variables that we have to play with that depend on who the customer is. For instance, the very largest and most sophisticated tech companies are more comfortable taking multiple pieces, each very specialized, and putting them together in a pipeline. >> Facebook, Yahoo, Google-- >> George: LinkedIn. >> Got it. >> George: Those guys. And the knobs that they're playing with, that everyone's playing with, are three, basically, on the software side. There's your latency budget, which is how much time do you have to produce an answer. So that drives the transaction or the interaction. And that itself is not just a single answer, because the goal isn't to get it as short as possible. The goal is to get as much information into the analysis within the budgeted latency. >> Peter: So it's packing the latency budget with data? >> George: Yes, because the more data that goes into making the inference, the better the inference. >> Got it. >> The example that someone used actually on Fareed Zakaria GPS, one show about it was, if he had 300 attributes describing a person, he could know more about that person than that person did (laughs) in terms of inferring other attributes. So the point is, once you've got your latency budget, the other two knobs that you can play with are development complexity and admin complexity. And the idea is, on development complexity, there's a bunch of abstractions that you have to deal with. If it's all one product, you're going to have one data model, one address and namespace convention, one programming model, one way of persisting data, a whole bunch of things. That's simplicity. And that makes it more accessible to mainstream organizations. Similarly, let me just add that there's probably two or three times as many constructs that admins would have to deal with. So again, if you're dealing with one product, it's a huge burden off the admin, and we know they struggled with Hadoop. >> So convergence, decisions about how to enact convergence, is going to be partly or strongly influenced by those three issues. Latency budget, development complexity or simplicity, and administrative, David-- >> I'd like to add one more to that, and that is location of data. Because you want to be able to look at the data that is most relevant to solving that particular problem.
Now, today a lot of the data is inside the enterprise. There's a lot of data outside that, but still, you will want to, in the best possible way, combine that data one way or another. >> But isn't that a variable on the latency budget? >> David: Well, I would think it's very useful to split the latency budget, which is to do with inference mainly, from development with the machine learning. So there is a development cycle with machine learning that is much longer. That is days, could be weeks, could be months. >> It would still be done in batch. >> It is or will be done, wait a second. It will be done in batch, it is done in batch. You need to test it and then deliver it as an inference engine to the applications that you're talking about. Now that's going to be very close together, that inference, then the rest of it has to be all physically very close together. But the data itself is spread out, and you want to have mechanisms that can combine those data sets, move applications to those data sets, bring those together in the best possible way. That is still a batch process. That can run where the data is, in the cloud, locally, wherever it is. >> George: And I think you brought up a great point, which I would tend to include in latency budget, because no matter what kind of answers you're looking for, some of the attributes are going to be precomputed and those could be-- >> David: Absolutely. >> External data. >> David: Yes. >> And you're not going to calculate everything in real time, there's just-- >> You can't. >> Yes, you can't. >> But is the practical reality that the convergence of, so again, the argument. We've got all these new problems, all kinds of new people that are claiming that they know how to solve the problems, each of them choosing different classes of tools to solve the problem, an explosion across the board in the approaches, which can lead to enormous downstream integration and complexity costs. You've used the example of Cloudera. Some of the distro companies claim that 50 plus percent of their development budget is dedicated to just integrating these pieces. That's a non-starter for a lot of enterprises. Are we fundamentally saying that the degree of complexity, or the degree of simplicity and convergence, that's possible in software is tied to the degree of convergence in the data? >> You're honing in on something really important, give me-- >> Peter: Thank you! (laughs) >> George: Give an example of the convergence of data that you're talking about. >> Peter: I'll let David do it because I think he's going to jump on it. >> David: Yes, so let me take examples, for example. If you have a small business, there's no way that you want to invest yourself in any of the normal levels of machine learning and applications like that. You want to outsource that. So big software companies are going to do that for you, and they're going to do it especially for the specific business processes which are unique to them, which give them digital differentiation of some sort or another. So for all of those types of things, software will come in from vendors, from SAP or son of SAP, which will help you solve those problems. And having data brokers which are collecting the data, putting them together, helping you with that. That seems to me the way things are going. In the same way that there's a lot of inference engines which will be out at the IoT level. Those will have very rapid analytics given to them.
Again, not by yourself but by companies that specialize in facial recognition or specialize in making warehouse-- >> Wait a minute, are you saying that my customers aren't special, that require special facial recognition? (laughs) So I agree with David, but I want to come back to this notion because-- >> David: The point I was getting at is, there's going to be lots and lots of room for software to be developed, to help in specific cases. >> Peter: And large markets to sell that software into. >> Very large markets. >> Whether it's software, but increasingly also services. But I want to come back to this notion of convergence, because we talked about hardware convergence and we're starting to talk about the practical limits on software convergence. But somewhere in between, I would argue, and I think you guys would agree, that really the catalyst for, or the thing that's going to determine the rate of change and the degree of convergence, is going to be how we deal with data. Now you've done a lot of research on this. I'm going to put something out there and you tell me if I'm wrong. But at the end of the day, when we start thinking about UniGrid, when we start thinking about some of these new technologies, and the ability to have single copies or single sources of data, multiple copies, in many respects what we're talking about is the virtualization of data without loss. >> David: Yes. >> Not loss of the characteristics, the fidelity of the data, or the state of the data. I got that right? >> Knowing the state of the data. >> Peter: Or knowing the state of the data. >> If you take a snapshot, that's a point in time. You know what that point of time is, and you can do a lot of analytics, for example, on it, and you want to do them on a certain time of day or whatever-- >> Peter: So is it wrong to say that we're seeing, we've moved through the virtualization of hardware and we're now in hyper scale or hyper-converged, which is very powerful stuff. We're seeing this explosion in the amount of software that's being built, you know, in the way we approach problems and whatnot. But a forcing function, something that's going to both constrain how converged that can be, but also force or catalyze some convergence, is the idea that we're moving into an era where we can start to think about virtualized data through some of these distributed file systems-- >> David: That's right, and the metadata that goes with it. The most important thing about the data, and it's increasing much more rapidly than the data itself, is the metadata around it. But I want to just make one point on this: all data isn't useful. There's a huge amount of data that we capture that we're just going to have to throw away. The idea that we can look at every piece of data for every decision is patently false. There's a lovely example of this in... fluid mechanics. >> Peter: Fluid dynamics. >> David: Fluid dynamics. If you're trying to have simulation at a very, very low level, the amount of-- >> Peter: High fidelity. >> High fidelity, you run out of capacity very, very, very quickly indeed. So you have to make trade-offs about everything, and all of that data that you're doing in that simulation, you're not going to keep. All the data from IoT, you can't keep that. >> Peter: And that's not just a statement about the performance or the power or the capabilities of the hardware, there's some physical realities-- >> David: Absolutely, yes. >> That are going to limit what you can do with the simulation.
But, and we've talked. We've talked about this in other Action Items. There is this notion of options on data value, where the value of today's data is maybe-- >> David: Is much higher. >> Peter: Well, it's higher from a time standpoint for the problems that we understand and are trying to solve now, but there may be future problems where we still want to ensure that we have some degree of data where we can be better at attending to those future problems. But I want to come back to this point, because in all honesty, I haven't heard anybody else talking about this, and maybe it's because I'm not listening. But this notion, again, from your research, of virtualized data inside these new architectures being a catalyst for a simplification of a lot of the sharing subsystem. >> David: It's essentially sharing of data. So instead of having the traditional way of doing it within a data center, which is I have my systems of record, I make a copy, it gets delivered to the data warehouse, for example. That's the way that's being done. That is too slow; moving data is incredibly slow. So another way of doing it is to share that data, make a virtual copy of it, and technologies are allowing you to do that because the access density has gone up by thousands of times-- >> Peter: Because? >> Because. (laughs) Because of flash, because of new technologies at that level. >> Peter: High performance interfaces, high performance networks. >> David: All of that stuff is now allowing things which just couldn't even be conceived. However, there is still a constraint there. It may be a thousand times bigger, but there is still an absolute constraint to the amount of data that you can actually process. >> And that constraint is provided by latency. >> Latency. >> Peter: Speed of light. >> Speed of light and speed of the processors themselves. >> George: Let me add something that may help explain the sort of virtualization of data and how it ties into the convergence or non-convergence of the software around it. Which is, when we're building these analytic pipelines, essentially we've disassembled what used to be a DBMS. And so out of that we've got a storage engine, we've got query optimizers, we've got data manipulation languages which have grown into full-blown analytic languages, data definition languages. Now the system catalog used to be just a way to virtualize all the tables in the database and tell you where all the stuff was, and the indexes and things like that. Now, what we're seeing is, since data is now spread out over so many places and products, we're seeing an emergence of a new kind of catalog. Whether that's from Alation or Dremio, or on AWS it's the Glue catalog, and I think there's something equivalent coming on Azure. But the point is, we're beginning, those are beginning to get useful enough to be the entry point for analytic products, and maybe eventually even for transactional products to update, or at least to analyze the data in these pipelines that we're putting together out of these components of what was a disassembled database. Now, we could be-- >> I would make a difference there between the development of analytics and, again, the real-time use of those analytics within systems of intelligence. >> George: Yeah, but when you're using them-- >> David: There are different problems they have to solve. >> George: But there's a Design Time and a Run Time; there's actually four pipelines for the sort of analytic pipeline itself.
There's Design Time and Run Time, and then for the inference engine and the modeling that goes behind it, there's also a Design Time and Run Time. But I guess, I'm not disagreeing that you could have one converged product to manage the Run Time analytic pipeline. I'm just saying that the pieces that you assemble could come from one vendor. >> Yeah, but I think David's point, I think it's accurate, and this has been since the beginning of time. (laughs) Certainly predated UNIVAC. That at the end of the day, read/write ratios and the characteristics of the data are going to have an enormous impact on the choices that you make. And high write-to-read ratios almost dictate the degree of convergence, and we used to call that SMP, or, you know, scale-up database managers. And for those types of applications, with those types of workloads, it's not necessarily obvious that that's going to change. Now we can still find ways to relax that, but you're talking about, George, the new characteristics >> Injecting the analytics. >> Injecting the analytics, where we're doing more reading as opposed to writing. We may still be writing into an application that has these characteristics-- >> That's a small amount of data. >> But a significant portion of the new function is associated with these new pipelines. >> Right. And it's actually... what data you create is generally derived data. So you're not stepping on something that's already there. >> All right, so let me get some action items here. David, I want to start with you. What's the action item? >> David: So for me, about convergence, there's two levels of convergence. First of all, converge as much as possible and give the work to the vendor, would be my action item. The more that you can go full stack, the more that you can get the software services from a single point, single throat to choke, single hand to shake, the more you can outsource your problems to them. >> Peter: And that has a speed implication, time to value. >> Time to value, and you don't have to do undifferentiated work. So that's the first level of convergence, and then the second level of convergence is to look hard at how you can bring additional value to your existing systems of record by putting in automation or real-time analytics. Which leads to automation; that is the second one, for me, where the money is. Automation, reduction in the number of things that people have to do. >> Peter: George, action item. >> So my action item is that you have to evaluate, you the customer have to evaluate, sort of, your skills as much as your existing application portfolio. And if more of your greenfield apps can start in the cloud, and you're not religious about open source but you're more religious about the admin burden and development burden and your latency budget, then start focusing on the services that the cloud vendors originally created as standalone, but that they are increasingly integrating, because the customers are leading them there. And then for those customers who, you know, have decades and decades of infrastructure and applications on-prem and need a pathway to the cloud, some of the vendors formerly known as Hadoop vendors, but for that matter any on-prem software vendor, is providing customers a way to run workloads in a hybrid environment or to migrate data across platforms. >> All right, so let me give this a final action item here. Thank you David Floyer, George Gilbert. Neil Raden and Jim Kobielus and the rest of the Wikibon team are with customers today.
We talked today about convergence at the software level. What we've observed over the course of the last few years is an expanding array of software technologies, specifically AI, big data, machine learning, etc., that are allowing enterprises to think differently about the types of problems that they can solve with technology. That's leading to an explosion in the number of problems that folks are looking at, the number of individuals participating in making those decisions and thinking those issues through, and, very importantly, an explosion in the number of vendors with piecemeal solutions about what they regard as their best approach to doing things. However, that is going to impose a significant burden that could have enormous implications for years, and so the question is, will we see a degree of convergence in the approach to doing software, in the form of pipelines and applications and whatnot, driven by a combination of: what the hardware is capable of doing, what the skills make possible, and, very importantly, the natural attributes of the data. And we think that there will be. There will always be tension in the model if you try to invent new software, but one of the factors that's going to bring it all back to a degree of simplicity will be a combination of what the hardware can do, what people can do, and what the data can do. And so we believe, pretty strongly, that ultimately the issues surrounding data, whether it be latency or location, as well as the development complexity and administrative complexity, are going to be a range of factors that are going to dictate ultimately how some of these solutions start to converge and simplify within enterprises. As we look forward, our expectation is that we're going to see an enormous net new investment over the next few years in pipelines, because pipelines are a first-level set of investments in how we're going to handle data within the enterprise. And they'll look like, in certain respects, how a DBMS used to look, but in a disaggregated way; conceptually and administratively, and then from a product selection and service selection standpoint, the expectation is that they themselves have to come together so the developers can have a consistent view of the data that's going to run inside the enterprise. Want to thank David Floyer, want to thank George Gilbert. Once again, this has been Wikibon Action Item, and we look forward to seeing you on our next Action Item. (electronic music)
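To make the catalog idea from the discussion above concrete, here is a minimal, hypothetical sketch of using a shared catalog, in this case the AWS Glue catalog George mentions, as the single entry point for discovering data across a disassembled pipeline. It is a sketch under stated assumptions, not anyone's production setup: the region is illustrative, and it assumes boto3 with working AWS credentials.

```python
# Hypothetical sketch: a shared catalog (AWS Glue here) as the entry point
# for finding data spread across many places and products.
import boto3

glue = boto3.client("glue", region_name="us-east-1")  # region is illustrative

# Walk every database registered in the catalog.
for db in glue.get_databases()["DatabaseList"]:
    # For each database, list its tables and where the underlying data lives
    # (an S3 path, a JDBC location, etc.), regardless of which engine wrote it.
    for table in glue.get_tables(DatabaseName=db["Name"])["TableList"]:
        location = table.get("StorageDescriptor", {}).get("Location", "unknown")
        print(f'{db["Name"]}.{table["Name"]} -> {location}')
```

An analytic product that can read this catalog gets the consistent view of enterprise data that the segment closes on, without owning the storage engines underneath.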
Manuvir Das, Dell EMC - Dell EMC World 2017
>> Announcer: Live From Las Vegas, it's The Cube, covering Dell EMC World 2017. Brought to you by Dell EMC. >> Hey, welcome back everyone. We're here live in Las Vegas for Dell EMC World 2017. This is The Cube, I'm John Furrier with my co-host, Paul Gillin. And our next guest, Manuvir Das, Senior Vice President of Product Management, Dell EMC, formerly of Microsoft Azure, a historic role at Microsoft, been at EMC for a few years. Welcome to The Cube, good to see you. >> Thank you, it's nice to be here. >> So last year we had a conversation. We were talking about some of the technology and the kind of direction it was going, so the first question is, from last year to this year, what's changed and what's the news? >> We've brought together two pretty well-known platforms that we had, Isilon for scale-out file and ECS for scale-out object, into one team called the Unstructured Data Storage Team. And we've done this because, from the point of view of the customer, what we see is this confluence between file and object really in the space of unstructured storage, and we think we have some ideas of how to put that together in just the right solution for the customer. So that's why we brought these teams together, and we've got a lot of great stuff to talk about this year. >> How are you positioning file versus object right now? It seems like object is the rage, but file is still going to be around for a long time. How do you position that? >> Yes, I think it will be. I think basically, if I may, it's not just two, but we see three pillars of unstructured storage. The first is file, which is really more towards compatibility with traditional workloads. A lot of the application ecosystem is comfortable programming against NFS or SMB, and that ecosystem is going to remain for a long time. For instance, in a space like video surveillance. So that's where we see file. It's optimized more for performance rather than scale, although you do get scale. The next level was really object, which is more for your modern workloads, for your web and mobile sort of workloads. Optimized more for scale rather than performance. And then, the third pillar that we see, that we've been working on now, is really real-time data, or what you call streaming data, from things like IOT, where you're getting a firehose of information coming out and you've got to store it very, very quickly. So we see these as three different pillars of unstructured storage. And really, what we've been working on in our Unstructured Data Storage Team is how to bring all of these three together in the right solution for the customer. >> So tell us about the group that you're in, because this is kind of a new, not new industry, we've been talking about unstructured data for many years, going on eight years, but it's becoming super important now as you have this horizontal data fabric development. We talked a little bit about it last year, but you can see a clear line of sight now with apps using data very dynamically. So you need under-the-hood storage, but now you need addressability of data. And so, there's a challenge of getting the right data from the right database to the right place on the app in less than a hundred milliseconds. I mean, that's like the nirvana. >> So I think there's a couple of things happening.
Firstly, the advances in hardware have changed the game a fair bit, because you can take a software stack that was not optimized for latency to begin with, you can put it on all-flash hardware and you can reduce the roundtrip a lot, that's one thing. The other thing I see is that especially with the advancement of object-- >> At this stage of life in IT, you have a research background, a PhD in Computer Science, I mean, it's a pretty awesome time to be in computer science right now. There's a ton of opportunity that applies from that. Machine learning, all this goodness there. What's your vision of how the next 30 years are going to play out? Because Michael Dell said, "Hey, it's been 33 years," since he started the company, the next 33 are going to be amazing, and I believe that to be true as well given the science opportunities. How do you look at this, from a personal level and also from a Dell EMC? >> I think what's really going to change is, up 'til now, a lot of things that have been done with computing have started with the thought of, "How much data can I really have?" And then, once I've decided how much data I can really have, what am I going to do with it? And I think sort of the innovation that's happened in storage that I'm a part of, what has really changed is it said, "You don't have to stop and think "about how much data you're going to have." You can just keep all of it. And so, what that means is I don't have to figure out upfront what I'm going to do with this data. I'm just going to bring it all in, it's going to be really cheap, the systems that can hold it are really scalable, and everything is sort of tagged in such a way that after the fact, five years from now, I can go do something with this data that I hadn't envisioned when I brought it in. And I think that just opens up a range of things that were hard to imagine. The other thing I think is, >> Programmatically meaning, from a software standpoint. Discoverability, >> That's right, I think as you said, machine learning is a big part of it. Because I think machine learning unlocks opportunities to mine the data that people hadn't really thought of before. And it comes back to the same thing, that when I bring data in, whether it's from sensors or aircraft engines or what have you, I have no idea what I'm going to do with the data, so I have no idea which part of the data is important and which part of the data is less important. But when I can apply things like machine learning after the fact, I don't actually have to worry about that. I just bring it all in, and the algorithms themselves will figure out which part of the data is the useful part of the data. >> Your ScaleUp product line and ScaleOut product line, how are you positioning those two, application-wise, to your customers? >> So I think there is a distinction between tier one storage and tier two storage. I think when you think about tier one storage, it's not just about the numbers, like latency and IOPS, but it's about the whole experience of tier one storage. Do I have, for my disaster recovery, do I have RPO-0, which means I can recover to the exact point in time I was at when I fail over a data center? How does my replication work, what data services do I have? So I think our ScaleUp technologies are very well oriented towards the tier one kind of capabilities. And then our ScaleOut technologies are very well oriented towards sort of the ubiquitous tier two storage, which is much more deployable at scale.
It's pretty good performance, too, actually, but not with that complete set of capabilities you think about with tier one in terms of RPO-0, synchronous replication, those kinds of things. So I think there's a very natural sort of fit between the two. And really, I think from a storage vision, what we see is the tier two storage is so scalable and so cheap that all of your pools of tier one storage on the top tier down automatically into the tier two storage. And what that means is, for our customers, if you think about how much tier one storage they have to provision today, they should be able to provision less of that, because they should be able to tier more of that down to the tier two storage, which is now capable enough to hold the rest of the data. >> And be available. >> And be available, >> Okay so, customers want to do this, a no-brainer. So we hear Amazon talk about this all the time, Jeff Bezos was just talking about it the other day, a new chassis, they've got the recognition software so you see facial recognition, a lot of great stuff happening all over the Cloud world with this kind of modeling, with the power of compute that's available. What do the customers do now? Because now they get it, it's a no-brainer obviously. Now they've got to change how they did IT for 30 years to be agile for tomorrow. What's the playbook? >> So what we're seeing is, the step one that we're seeing more and more today, and have seen really for the last couple of years with Isilon and with ECS, is what I would call consolidation of the tier two. So where we had 12 different clustered silos of storage for the different use cases, let's buy into this model that I can just build one large storage cluster, and it can handle the 12 different use cases at the same time. And that's what we've been proving out for the last few years. I think customers have really, enterprise customers are really getting there. And now, what we're beginning to see this year is the next phase, whether it's the industrial internet with the automotives, et cetera, the more IOT style use cases. In fact, on Wednesday, we'll be talking about a new thing we've got called Project Nautilus, which is the third leg of our stool, with the streaming storage that is built on top of Isilon and ECS. And we're now at the point where our first customers are beginning to work with that, where they're saying, "From my sensors, "in the automobiles, on the cameras, "I'm going to bring in this firehose of data, "I'm going to store it all with you, "but later on, I'm going to do analytics on it. "As it's coming in, I'm going to do "some real-time analytics on it, "and then after the fact, I'm going to do "the more batch style." >> I know Paul wants to jump in, but I want you to just back up because I missed the three pillars. >> The three pillars were file, for which we have Isilon; object, for your modern applications and web workloads, for which we have ECS; and then streaming storage for IOT. >> Which is Nautilus? >> Which is Project Nautilus, >> Okay, got it. >> The way I put it to people is traditional storage systems, ScaleUp or ScaleOut, file or object, they need resilience. So when you write the data, you have to write and think at the same time, because you have to record all kinds of information about it, you have to take locks, et cetera. For IOT, you need a storage system that writes now and thinks later, so that you can just suck it all in. >> It sounds like an operating system.
You've got storage that's turning into, like, LUNs, provisioning, hardware. It's essentially intelligence software that has to compile, runtime, assembly, all this stuff's going on. >> And there's all these fancy names like Lambda Architecture and all that kind of stuff. And what that's all saying is, "I bring the data in "and as it's coming in, "there's some things I already want to do with it, "I do that analytics in real-time. "There's other things when I go tag it, "who was in the photo, where was it, "and then the rest of it, I'm going to do later." And who knows what and when, and that's a beautiful thing. >> You're way along the thinking curve on this obviously, but where are your customers? I mean, you're talking about a pretty radically different approach to processing and storing data, even in real time. Machine learning, meta tagging, there's a lot for them to absorb. >> And I think that part, it's a vertical-driven, use-case-driven thing. So there's some industries where we see a lot of uptake on that. Automotive is a great example. >> Financial services, >> Financial services, fraud detection, those kinds of things. And there's other verticals where it's not time for that yet. Like I said, healthcare is a great example. So in those verticals, we see more of just the storage consolidation, let me build one pool of tier two storage, if you will, and consolidate my 12 use cases, sort of what we refer to as the Data Lake in our words, but I think it's specific verticals. And that's fine; if you look at even the traditional unstructured storage, I think it really started with certain verticals like media and entertainment, life sciences, and that's sort of where it kicked up from. And I think for the streaming storage, it's these verticals that are more oriented towards IOT, your automotive, your fraud detection, those kinds of things where it's really kicking off, and then it'll sort of broaden from there. >> How is this playing into the Dell server strategy? >> It's really a fantastic thing, I don't want to say so much for us as for our customer, because I've talked to a number of people in these verticals where the customer wants a complete solution for IOT. And what that means is, number one: on the edge, do I have the right equipment with the right horsepower and the right software on the edge to bring in all the data from the edge and do the part of the processing that needs to be done right there on the edge in real time, and then it has to be backed by stuff in the backing environment that can process massive amounts of data. And with Dell, we have the opportunity for the first time, that we didn't have with EMC alone, to do the complete solution on both ends of it, both the equipment on the edge as well as the backing IT, so I think it's a great opportunity. >> You bring up so many awesome conversations, because storage used to be boring; now storage is not boring anymore, because it's fundamental to the heartbeat of a company. >> Exactly. >> So here's a question for you, kind of like thinking out loud and riffing with you. So some debate, like, "Listen, I want to find "the needle in the haystacks, "but the haystacks are getting bigger," so there's a problem with that. I've got to do more design and more digging, if you will. And the second point is customers are saying, at least to us on The Cube and privately, "I got a data lake that's turning "into a data swamp, "so help me not have swamps of data, "and I want more needles, "but the haystack's getting bigger."
What's your advice to those CXOs? Could be a CDO, a chief data officer, a CISO, these are the fundamental questions. >> I would say this: whatever technology you're evaluating, whether it's an on-premise technology or a hosted technology from a vendor like us, or it's a service out there in the Public Cloud, if you will, ask yourself two questions. One is, "If I size out what I need right now, "and I multiply it by 10 or 100, "what is it going to cost? "And is it really going to work the same way, "is it going to scale the same way?" Look at the algorithmics inside the product, not the PowerPoint, and say, "The way "they've designed this thing, "when I put 100 times the data "on 100 times the number of servers "on this storage system, "are things actually going to work the same way or not?" >> So it's a scale question, kind of the order-of-magnitude thinking you need to kind of go out and size it up a bit. >> Because I see right now, the landscape is full of new technologies for storage, and a lot of them sound great and look great on the PowerPoint, and you go do a POC with four nodes or eight nodes, and you put Flash in there and it works really well. But the thing is, when you have 200 nodes of that, when you've got a 30 petabyte cluster and you've got to fail it over because your data center went down, how does that go? >> Well, it's also who's going to run it, too. You want fewer obstacles, not more, and you don't want them to be huge, expensive developers. >> Fair point, that's the other thing. We really don't talk to our customers in terms of storage acquisition costs anymore, we talk in terms of TCO, total cost of ownership. You look at power, you look at cooling. >> That killed Hadoop, basically, it was so hard to run, and the total cost of ownership. Michael Dell was just on, I was interviewing Michael and I asked him like, "Where's the Cloud strategy?" I was just busting his chops a little bit, 'cause I know he's messaging, trying to get him off his messaging. But he made an interesting comment and metaphor. He goes, "Well John, I remember the days "during the internet days, where's your internet strategy?" Look what happened, the bubble popped. But ultimately, everything played out according to plan. There's pet food online, now we've got food delivery, DoorDash, all this stuff's happening. So he kind of was using it to compare to the Cloud today. There's a lot of hope and promise, where's your Cloud strategy? But yet, his point was it's going to be everywhere. >> Yeah, and I would say this, I think people sometimes confuse Cloud with Public Cloud. And I think what happened is, having that issue myself, I would say that Public Cloud exposed a certain model that had some benefits for the customer base that were new. That is, I can use it as a service, I don't worry about operationalizing things, I can pay as I go, so I get that, it's elastic. But it also came with a lot of drawbacks. I don't have the kind of control that I would like to have. A normal thing that any person who takes a dependency on infrastructure has is, "Today's my Superbowl Sunday. "Don't touch my environment today." Now you go to a Public Cloud and you use a service that is used by thousands of other customers, so which day is Superbowl Sunday? Every day is Superbowl Sunday for somebody. >> It was a metaphor, Public Cloud was a metaphor for virtualization that would affect the entire environment.
>> And so, I think the journey we're all in, all the vendors, the Public Cloud suppliers, everybody, is, "What are the right set of models "that are going to cover the space for all our customers?" There's not going to be one. There's several. I think the dedicated private Cloud models are certainly very appealing in a number of ways if you do the economics right. And I think that's the journey we're all on sort of together. >> I tweeted a little bit of the jewels out there this morning. True private cloud is going to be a $265 billion market, and they were the first ones to actually size that; let's say true private cloud means essentially hybrid, but on-prem with a data center. That's huge numbers, it's not like rounding errors. >> We believe that, too. And that's why one of the neatest things we've announced this year with ECS object storage is something called ECS Dedicated Cloud, which is basically saying, "You can take the object storage "from us, but it's going to run in our data centers." We operate it, it's actually the developers who wrote the code from my team who are actually operating it, and you can do a variety of hybrid things. You can keep some of it on-prem, some of it off-prem, you can keep all of it off-prem. But regardless, it's your stuff. You can hug it, it's dedicated to you. You're not sharing the cluster with anybody else. You get to decide when you update your version, when you take a maintenance window or what have you. So, we're all searching for that sweet spot, if you will. >> I want to ask you about something, some of the different containers. The hottest thing right now in infrastructure: lack of persistent storage has been a real problem for containers. Is that a problem that's yours to solve or is it Docker's to solve? >> No, I think it is ours to solve with them. So, I'll say a couple of things. Firstly, our modern products, ECS and object storage as well as ScaleIO, our block ScaleOut storage, these are built with containers. So for instance, if you take ECS today, every ECS appliance that we ship, if you look inside every server, it's running Linux with Docker. And all the ECS code is running on Docker containers. That's just how it works. So A: we believe in containers, and two: I think we have been doing the work to provide that persistence ecosystem for containers using our storage. So we have a great team at Dell EMC called EMC Code. And these are people, they do a lot of this integration stuff, they work very closely with Docker and a number of the other frameworks to really plug our storage in. And I think it's a very open ecosystem. There are APIs there now, so you can plug anybody's storage in. And I think that's really, if you compare VM-based infrastructures with container-based infrastructures, that's really the gap, because when you operationalize the stuff, you need things like that. You need persistent storage, you need snapshots, you need DR storage, you need those kinds of things, but I think that'll all come. >> Well, we're looking forward to continuing the conversation, I know time's tight. We'd like to follow up with you after the show, maybe bring you into our studio via Skype. You're in a hot area, you've got the storage, you've got the software, you've got some Cloud action going on. Thank you very much for coming on The Cube, appreciate it. >> My pleasure for being here, thank you for having me. >> This is TheCube, live coverage here at Dell EMC World 2017. And I'm John Furrier with Paul Gillin, we'll be right back. Stay with us.
(bright tech tones)
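As a companion to the interview above, here is a minimal, hypothetical sketch of the "write now, think later" pattern Manuvir describes, written against an S3-compatible object store such as the S3 interface ECS exposes. The endpoint, bucket, credentials, and payload are illustrative placeholders, not details from the interview; it assumes boto3.

```python
# Hypothetical sketch: ingest first, decide what the data is for later.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example.internal:9021",  # placeholder endpoint
    aws_access_key_id="EXAMPLE_KEY",                       # placeholder creds
    aws_secret_access_key="EXAMPLE_SECRET",
)

# Write the firehose as it arrives, tagging each object with metadata that
# later analytics (batch jobs or machine learning) can use to find it.
s3.put_object(
    Bucket="sensor-archive",
    Key="2017/05/09/engine-42.json",
    Body=b'{"temp_c": 81.4, "rpm": 2300}',
    Metadata={"source": "edge-gateway-7"},
)

# After the fact, pull the object back for whatever analysis was not
# envisioned at ingest time.
obj = s3.get_object(Bucket="sensor-archive", Key="2017/05/09/engine-42.json")
print(obj["Body"].read())
```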
Brian Lillie, Equinix | NAB Show 2017
[Announcer] Live from Las Vegas. It's theCUBE. Covering NAB 2017. Brought to you by HGST. >> Welcome back everybody, Jeff Frick here with theCUBE. We're at NAB 2017 with a hundred thousand of our closest friends, but we actually do have one of my friends here, who I can't believe we haven't had on theCUBE since ServiceNow Knowledge 2013. >> That's right. That's right. >> Just down the road at the Cosmopolitan. Brian Lillie, he is now the Chief Customer Officer and EVP of Technology Services at Equinix. >> Brian, it's always great to see you. >> Jeff, it's always a good thing to be on theCUBE. And I love NAB. Love it! >> What do you think, you've been coming here for awhile. What's kind of your takeaway, what's the vibe? >> Well, so the vibe, it feels as innovative and as exciting as ever. And I really think that people are starting to hit a tipping point where they're seeing what's possible. What's possible with the cloud, possible with increased collaboration. When I first started coming here a few years ago, I saw very few of these kinds of projects. Now, we're seeing tons of innovative approaches to using the cloud. Using our facilities, using really some of our network providers that are really innovating around this vertical. >> Yeah, it's pretty interesting Brian, because this is our first time for theCUBE being here. And what's surprising me is how many of the macro trends that we see time and time again at all the other shows, about increasing capacity, flexibility, democratization of data, democratization of assets. All these kinds of typical IT themes that are being executed here within the media entertainment industry, both on the creative side as well as the production side. >> That's right. That's very well said. I think this industry, really more than many, is very, very collaborative. You know, everything from acquisition to pre-production, production, post-production, delivery. It feels like a community that wants to share, wants to learn, sees that they don't necessarily own all the best ideas. And we're seeing some young innovative startups from all over the world, everywhere from Europe to Asia, coming up with ideas that the big houses, big players are starting to see as viable. And I do think, when you talk about these being maybe IT trends, I think some of them are secular trends. The fact that consumers want their content anytime, anywhere, on any device. >> Jeff: Right, right. >> Really, if you work from the customers backwards, everybody else has to adjust to that. And we're parents. >> Jeff: Right, right. >> We see what our kids want. And it's really driving, I think, the whole industry. >> And good stuff for you. You guys at Equinix made a big bet on cloud a long time ago. And the fact of the matter is, we're surrounded by all this crazy hardware, both on the production side and the data center side. No one is buying this. You don't just take this stuff home anymore and plug it in. It's just too big and too expensive. As you said, I think what's interesting about the media business is everybody comes together around a project. When the project's over, they go away. How many people has Quentin Tarantino employed directly? Probably not that many. But the guy kicks out a lot of big budget movies. >> That's right. I think when you think about the creation of a production, like a QT movie, wherever that set is, it's ephemeral. You go, you set up, and it's got big data needs, it's high bandwidth, low latency, you've got to get the data.
In some cases centrally, but in some cases you're processing at the edge. But it's very cloud-like. We're seeing a lot of this unfold. We're seeing these players not only in the centers where it makes sense to consolidate, but we're actually seeing some of this kit show up in our data centers in a distributed mode, where they say some information, some equipment, we want to keep behind our firewalls on our premise, which could be an Equinix cage or their own. But then I want to absolutely connect to multiple clouds. I want to use the tools in Azure, the tools in Amazon, the tools in Google and others to further enhance our abilities. And so it's truly this hybrid, best of breed, I've got a lot of tools in my tool kit, some cloud, some on premise. And there has never been a better time to be in this industry. >> Right. >> You see a lot of industries, you've got a lot of customers, how do you see it kind of compare, are financial services, the entertainment, et cetera, are they all kind of progressing pretty much down the same path, at the same rate, or do you see some significant laggards or significant people ahead of the curve? >> Well, I would say that financial services is way ahead, to be frank. Financial services has been doing this for a long time. Like when we built Equinix, it was really starting with the networks at the core. And the first vertical to take advantage of that was the financial services, where they said, hey, I want low latency routes between New York and London. Low latency routes between Chicago and New York. And so they've been doing that and then building communities of interest where they could reach all the folks in their digital supply chain. On the financial services side, guys like Bloomberg and Reuters, they said, I can reach all my customers in one place. And I can direct connect to them. So they built early. The content guys did see it right after that. Guys like Yahoo, and if you remember, Myspace. >> Jeff: Right, right. >> So it's wonderful to see Facebook video here. I mean, here's now Facebook, real-time video, live at NAB. And with a big presence. So I think content digital media has been a little bit slower to move. But it's one of these ramps. >> Jeff: Right, right. >> And over the last two years, I think they have been the fastest accelerating vertical, using the cloud and interconnection to build their brand, to build their business. >> Right. It's interesting, because some of our other guests were talking about the theme, I guess, last year here was a lot of VR. >> Brian: Yes. >> It's all about the VR theme. But now, we're hearing about machine learning, and metadata, and a lot more kind of traditional themes, it's not necessarily just about the VR and the 360. >> Brian: Yup, yup. >> To add more value to these assets, to be able to distribute them better, to have the metadata, to create an experience for that individual person, >> Yup. >> even within the context of a bigger asset, have these small ones, it's a pretty interesting trend. >> Yeah, it's spot on. I think VR, virtual reality and augmented reality, >> Jeff: Yeah, I think so. >> is the future. I mean it's the future. I think what maybe people are realizing is, it's really early days. But data we have, and this whole notion of data science and analytics that you can put around the customer experience in real-time, in situ. >> Right. >> They're like, we can do that now. >> Where virtual reality, the massive bandwidth, the storage, the compute.
Because it's no longer that you're watching the movie in the third person, you are the movie. You are the experience, you're in it. And that's just going to require massive compute, that in my opinion, only the cloud can do. [Jeff] Right, right. >> So I think it's a little bit further off, but I think VR and AR is the wave, it's the future. >> And certainly the AR, I think, is really cool because there's so much potential there. So from a data center perspective, you guys are sitting right at the heart of this thing. And you're taking advantage of these tremendous Moore's law impacts on not only compute and storage but networking; it's got to be phenomenal to see the increased demand. I always think of the old Microsoft-Intel thing, you know, back in the day, you get a better microprocessor, well, Microsoft's OS eats up another 80% of that, and back and forth. But now we're really hitting huge, huge efficiencies in these core components that are enabling ridiculous scale that you could never even imagine before. >> I think the Intel-Microsoft example or analogy is a really, really interesting one, because in fact, when you look at companies like Mesosphere and Google's Kubernetes and these others, they're calling themselves the data center operating system, which is operating containers with the move to microservices, all this technology that's coming, that's making compute more ubiquitous, where you can run workloads anywhere. The fact that we sit, we feel privileged 'cause we sit in the middle of not only all the networks, but of the clouds, the multi-clouds. >> Right, right. >> And whether you're a producer or you're in production, you're in delivery, you're an over-the-top guy, where you want to be is where you can connect very directly, with little latency and high security and high reliability, to the clouds you need, to the networks you need, to the partners you need. I think that's just a powerful thing. Now the operating system is how do we make that easy, how do we create the easy button. >> Right, right. >> For these folks to access these resources. And what's the value we provide as that neutral, in-the-middle provider that brings people together. You know, I was at an event last night, and DPP, Mark from DPP was there. We were talking about the question of who owns this new business model. He said he saw a panel on Sunday, because it's transforming in front of us. [Jeff] Right, right. >> And it's an excellent question. I don't know who owns it, but I know we see it. And we're seeing people talk about it. I think the community owns it. They own what this new business model looks like, and we're just listening to our customers and letting them lead us. >> Jeff: Right. >> To the place we need to go. >> Interesting. So we're running a little low on time. Just want to get kind of what are your priorities for 2017. >> Well, priorities in this area are really to make cloud ubiquitous globally. It's to push that out to the edge, make that available in as many markets, to as many customers as we can. With our big partners, with Google and Amazon and Microsoft and Oracle and all the rest. That's a big priority. Second is this notion of the easy button. How can we add value, how can we take friction out of the system to make collaboration and communication in this industry that much easier, that much faster. Those are our two big ones in particular here. And I'm delighted to see this vertical just taking off with the cloud. >> Yeah.
Pretty exciting times. >> Brian: It's a great time. >> Alright, I've got to embarrass you before I let you go, Brian. Never have I met an executive that takes such pride in losing good employees to better jobs. I just want to compliment you on that. (Brian laughs) I know you take pride in CIOs all over the industry that were once your charges. So I want to give you a shout-out for that. >> Okay. >> Alright, he's Brian Lillie, keep working for him. Don't take the other CIO jobs just yet, but if you do, he'll be happy to mentor you. >> Brian: I will help you get there. >> Alright, thanks for stopping by. He's Brian Lillie, I'm Jeff Frick. You're watching theCUBE from NAB 2017. We'll be right back after this short break. >> Brian: Thanks Jeff. >> Good to see you buddy. (techno music)
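For the "data center operating system" idea that comes up in this conversation, here is a minimal, hypothetical sketch using the official Kubernetes Python client: the same few calls enumerate workloads whether the cluster sits in a colo cage, on-prem, or in a public cloud. It assumes the `kubernetes` package is installed and a kubeconfig points at some cluster; nothing here is specific to any vendor mentioned above.

```python
# Hypothetical sketch: one orchestration API, wherever the cluster lives.
from kubernetes import client, config

config.load_kube_config()  # reads ~/.kube/config; the cluster's location is irrelevant
v1 = client.CoreV1Api()

# List running workloads the same way in any environment.
for pod in v1.list_pod_for_all_namespaces(watch=False).items:
    print(f"{pod.metadata.namespace}/{pod.metadata.name} on {pod.spec.node_name}")
```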
Marc Farley, Vulcancast - Google Next 2017 - #GoogleNext17 - #theCUBE
>> Narrator: Live from the Silicon Valley, it's theCUBE. (bright music) Covering Google Cloud Next 17. >> Hi, and welcome to the second day of live coverage here of theCUBE covering Google Next 2017. We're at the heart of Silicon Valley here at our 4,500 square foot new studio in Palo Alto. We've got a team of reporters and analysts up in San Francisco checking out everything that's happening at Google. I was up there for the day two keynote, and happy to have with me the first guest of the day, friend of theCUBE, Marc Farley, Vulcancast, a guy that knows clouds, worked for one of the big three in the past and going to help me break down some of what's going on in the marketplace. Marc, it's great to see you. >> Oh, it's really nice to be here, Stu, thanks for asking me on. >> Always happy to have you-- >> And what a lot of fun stuff to get into. >> Oh my god, yeah, this is what we love. We talked about, I wonder, Amazon re:Invent is like the Superbowl of the industry there. What's Google there if, you know-- >> Well, Google pulls a lot of resources for this. And they can put on a very impressive show. So if this is, if re:Invent is the Superbowl, then maybe this, maybe Next is the college championship game. I hate to call it college, but it's got that kind of draw, it's a big deal. >> Is that, I don't want to say, arena football, it's the up and coming-- >> Oh, it's a lot better than that. Google really does some spectacular things at events. >> They're Google, come on, we all use Google, we all know Google, 10,000 people showed up, there's a lot of excitement. So what's your take of the show so far and Google's positioning in cloud? >> It's nothing like the introduction of Glass. And of course, Google Glass is a thing of the past, but I don't know if you remember when they introduced that, when they had the sky diver. Sky divers diving out of an airplane and then climbing up the outside of the building and all that, it was really spectacular. Nobody can ever reach that mark again, probably not even the Academy Awards. But you asked the second part of the question, what's Google's position with cloud, I think that's going to be the big question moving forward. They are obviously committed to doing it, and they're bringing unique capabilities into cloud that you don't see from either Amazon or Microsoft. >> Yeah. I mean, coming into it, there's certain things that we've been hearing forever about Google, and especially when you talk about Google in the enterprise. Are they serious, is this just beta, are they going to put the money in? I thought Eric Schmidt did a real good job yesterday in the day one closing keynote, he's like, "Look, I've been telling Google to push hard "in the enterprise for 17 years. "Look, I signed a check for 30 billion dollars." >> 30 billion! >> Yeah, and I talked to some people, they're a little skeptical, and they're like, "Oh, you know, that's not like it all went to build "the cloud, some of it's for their infrastructure, "there's acquisitions, there's all these other things." But I think it was infrastructure related. Look, there shouldn't be a question that they're serious.
And Diane Greene said, in a Q&A she had with the press, that thing about, we're going to tinker with something and then kill it, I want to smash that perception, because there's certain things you can do on the consumer side that you cannot get away with on the enterprise side, and she knows that; they're putting a lot of effort into transforming their support, transforming the pricing, digging in with partners and channels. And some of it is, you know, they've gotten the strategy together, they've gotten the pieces together, we're moving things from beta to GA, and they're making good progress. I think they have addressed some of the misperceptions. That being said, everybody usually, it's like, "I've been hearing this for five years, "it's probably going to take me a couple of years "to really believe it." >> Yeah, but you know, the thing is, for people that know Diane Greene and have watched VMware over the years, her being at Google is a real commitment. And she's talking about commitment when she talks about that business. It's full pedal to the metal, this is very serious. The thing that's interesting about it, it's a lot more than infrastructure as a service. >> Yeah. >> The kinds of APIs and apps and everything that they're bringing, this is a lot more than just infrastructure, this is Google developed, Google, if you will, proprietary technology now that they're turning to the external world to use. And there's some really sophisticated stuff in there. >> Yes, so before we get into some of the competitive landscape, some of the things you were pretty impressed with, I think everybody was, the keynote this morning definitely went much better; day one keynote, a little rocky. The biggest applauses were around some of the International Women's Day, which is great that they do that, but it's nice when they're like, "Oh, here's some cool new tech," or they're like, oh, wow, this demo that they're doing, some really cool things and products that people want to get their hands on. So what jumped out at you at the keynote this morning? >> I'm trying to remember what it's called. The stuff around personally identifiable information. >> Yeah, so that's what they call DLP, or the Data Loss Prevention API. Thank goodness for my Evernote here, which I believe runs on Google cloud, keeping up to date, so I'm-- >> Data loss prevention shouldn't be so hard to remember. >> And by the way, you said proprietary stuff. One thing about Google is, that Data Loss Prevention, it's an API, they want to make it easy to get in, a lot of what they do is open source. They feel that that's one of their differentiations, is to be, we always used to say on the infrastructure side, it's like everybody's pumping their chest. Who's more open than everybody else? Google. Lots of cool stuff, everything from the TensorFlow and Kubernetes that's coming out, where some of us are like, "Okay, how will they actually make money on some of this, "will it be services?" But yeah, the Data Loss Prevention API, which was a really cool demo. It's like, okay, here's a credit card, the video kind of takes it and it redacts the number. It can redact social security numbers, it's got that kind of machine learning AI with the video and all those things built in to try to help security encrypt and protect what you're doing. >> It's mind boggling. You think about, they do the facial recognition, but they're doing content recognition also.
And you could have a string of numbers there that might not be a phone number, it might not be a social security number, and the question is, what DLP flags that as, who knows, it doesn't really matter. What matters is that they can actually do this. And as a storage person, you're getting involved in compliance and risk and mitigation, all these kinds of things over the years. And it's hard for software to go in and scan a lot of data to just look for text. Not images of numbers on a photograph, but just text in a document, whether it's a Word file or something. And you say, "Oh, it's not so hard," but when you try to do that at scale, it's really hard at scale. And that's the thing that I really wonder about DLP, are they going to be able to do this at large scale? And you have to think that that is part of the consideration for them, because they are large scale. And if they can do that, Stu, that is going to be wildly impressive. >> Marc, everything that Google does tends to be built for scale, so you would think they could do that. And I think about all the breaches, it was usually, "Oh, oops, we didn't realize we had this information, "didn't know where it was," or things like that. So if Google can help address that, they're looking at some of those core security issues they talked about, they've got second-factor authentication with a little USB key that can go into your computer, end-to-end encryption if you've got Android and Chrome devices, so a lot of good sounding things on encryption and security. >> One of the other things they announced, I don't know if this was part of the same thinking, but they talk about 64 core servers, and they talk about, or VMs, I should say, 64 core VMs, and they're talking about getting the latest and greatest from Intel. What is it, Skylink, Sky-- >> Stu: Skylake. >> Skylake, yeah, thanks. >> They had Raejeanne actually up on stage, Raejeanne Skillern, Cube alum, know her well, was happy to see her up on stage showing off what they're doing. Not only just the chipset, but Intel's digging in, doing development on Kubernetes, doing development on TensorFlow to really help with performance. And we've seen Intel do this, they did this with virtualization with the extensions that they did, they're doing it with containers. Intel gets involved in these software pieces and makes sure that the chipset's going to be optimized, and great to see them working with Google on it. >> My guess is they're going to be using a lot of cycles for these security things also. Security is really hard, it's front and center in our lives these days, in just everything. I think Google's making a really interesting play, they take their own internal technology, this security technology that they've been using, and they know it's compute heavy. The whole thing about DLP, it's extremely compute heavy to do this stuff. Okay, let's get the biggest, fastest technology we can to make it work, and then maybe it can all seem seamless. I'm really impressed with how they've figured out how to take the assets that they have in different places, like from YouTube.
Very smart, I give them a lot of credit for looking broadly throughout their organization which, in a lot of respects, traditionally has been a consumer oriented experience, and they're taking some of these technologies now and making it available to enterprise. It's really, really hard. >> Absolutely. They did a bunch of enhancements on the G Suite product line. It felt at times a little bit, it's like, okay, wait, I've got the cloud and I've got the applications. There are places that they come together, places that data and security flow between them, but it still feels like a couple of different parts, and how they put together the portfolio, but building a whole solution for the enterprise. We see similar things from Microsoft, not as much from Amazon. I'm curious what your take is as to how Google stacks up against Microsoft who, disclaimer, you did work for one time on the infrastructure side. >> Yeah, that's a whole interesting thing. Google really wants to try to figure out how to get enterprises that run on Microsoft technology moving to Google cloud, and I think it's going to be very tough for them. Satya Nadella and Microsoft are very serious about making a seamless experience for end users and administrators and everybody along managing the systems and using their systems. Okay, can Google replicate that? Maybe on the user side they can, but certainly not on the administration side. And there are hooks between the land-based technology and the cloud-based technology that Microsoft's been working on for years. Question is, can Google come close to replicating those kinds of things, and on Microsoft's side, do customers get enough value, is there enough magic there to make that automation of a hybrid IT experience valuable to their customers. I just have to think though that there's no way Google's going to be able to beat Microsoft at hybrid IT for Microsoft apps. I just don't believe it. >> Yeah, it's interesting. I think one of the not so secret weapons that Google has there is what they're doing with Kubernetes. They've gotten Kubernetes in all the public clouds, it's getting into a lot of on premises environment. Everything from we were at the KubeCon conference in Seattle a couple of months ago. I hear DockerCon and OpenStacks Summit are going to have strong Kubernetes discussions there, and it's growing, it's got a lot of buzz, and that kind of portability and mobility of workload has been something that, especially as guys that have storage background, we have a little bit of skepticism because physics and the size of data and that whole data gravity thing. But that being said, if I can write applications and have ways to be able to do similar things across multiple environments, that gives Google a way to spread their wings beyond what they can do in their Google cloud. So I'm curious what you think about containers, Kubernetes, serverless type activity that they're doing. >> I think within the Google cloud, they'll be able to leverage that technology pretty effectively. I don't think it's going to be very effective, though, in enterprise data centers. I think the OpenStack stuff's been a really hard road, and it's a long time coming, I don't know if they'll ever get there. So then you've got a company like Microsoft that is working really hard on the same thing. It's not clear to me what Microsoft's orchestrate is going to be, but they're going to have one. >> Are you bullish on Asure Stack that's coming out later this year? >> No, not really. >> Okay. 
>> I think Azure Stack is a step in the right direction, and Microsoft absolutely has to have it, not so much because of Google, but to compete with AWS. I think it's a good idea, but it's such a constrained system at this point. It's going to take a while to see what it is. You're going to have HPE and Lenovo and Cisco and Dell all offering the same basic thing. So you ask yourself, what is the motivation for any of these companies to really knock it out of the park when Microsoft is nailing everybody's feet to the floor on what the options are? And I understand Microsoft wanting to play it safe and saying, "We want to be able to support this thing, make sure that when customers install it, they don't have problems with it." And Microsoft always wants to foist the support burden onto somebody else anyway; we've all been working for Microsoft our whole lives.

>> It's the old Dilbert cartoon: as soon as you open that software, you're all of a sudden Microsoft's pool boy.

>> (laughs) I love that, yeah. Azure Stack is going to be pretty constrained, and they keep pushing it further out. So what's the reality of this? Azure Pack right now is a zombie; everybody's waiting for Azure Stack, but Azure Stack keeps moving out, and it's going to be small and constrained. This stuff is hard. There's a reason it's taking everybody a long time to get it out, there's a reason OpenStack hasn't had the adoption people first expected, and there's going to be a reason, I think, that Azure Stack doesn't have the adoption Microsoft hoped for either. It's going to be interesting to watch how this plays out over the next five or six years.

>> Yeah, and for myself, I've seen this story play out a few times on the infrastructure side. I remember the original precursor, the Vblock, with Acadia and the go-to-market. When VMware did the VSAN stuff, generation one of EVO really went nowhere. A lot of times it takes 18 to 24 months to sort out the basic pricing, packaging, partnering, and positioning things. And even though Azure Stack has been coming for a while, TP3 is here now, we're talking about it, and it's going to GA this summer, it's only once we get it into customer environments and people start selling it that we're going to find out what it is and what it isn't.

>> It's interesting. You know how important that technology is to Microsoft. It's, in many respects, Satya's baby. And it's so important to them, and at the same time, it's not there, it's not coming, it's going to be constrained.

>> So Marc, unfortunately, you and I could talk all day about stuff like this, and we've done that many times at conferences. I want to give you the final word. Wrap up the intro for today on what's happening at Google Next and what's interesting you in the industry.

>> Well, I think the big thing here is that Google is showing they've put their foot down and they're not letting up. They're serious about this business; they made this commitment. We sort of give lip service, a little bit, to the big three: we've got Azure, we've got Amazon, and then there's Google. But every year Google does more, and they're proving themselves a more capable cloud service provider. The integration they're showing with HANA is really interesting, SAP, I should say, not HANA but SAP. They're going after big applications, and they've got big customers.
Every year they do this, it's more of an arrival. And I think in two years' time, that idea of the big three is actually going to be a big three, not two plus one. And that is going to accelerate the movement into cloud faster than ever, because the options Google is offering are different from the others; these are all different clouds with different strengths. Of the three of them, Google, I have to say, has the most, if you will, computer science behind it. It's not that Microsoft doesn't have it, but Google is going to have a lot more capability in machine learning than I think you're ever going to see out of Amazon. They are just going to take off and run with that, and Microsoft is going to have to figure out how to catch up, or how to parlay what they have in machine learning. It's not that Microsoft hasn't invested in it, but not the way Google has. Google has been investing in it for years to support its consumer applications, and now, like I said before, that stuff is coming into the enterprise. I think there's a shift now. We sort of wonder, is machine learning going to happen, and when? It's going to happen, and it's going to come from Google.

>> All right, well, great way to end the opening segment here. Thank you so much, Marc Farley, for joining us. We've got a full day of coverage here from our 4,500 square foot studio in the heart of Silicon Valley. You're watching theCUBE. (bright music)