Jay Marshall, Neural Magic | AWS Startup Showcase S3E1

(upbeat music) >> Hello, everyone, and welcome to theCUBE's presentation of the "AWS Startup Showcase." This is season three, episode one. The focus of this episode is AI/ML: Top Startups Building Foundational Models, Infrastructure, and AI. It's great topics, super-relevant, and it's part of our ongoing coverage of startups in the AWS ecosystem. I'm your host, John Furrier, with theCUBE. Today, we're excited to be joined by Jay Marshall, VP of Business Development at Neural Magic. Jay, thanks for coming on theCUBE. >> Hey, John, thanks so much. Thanks for having us. >> We had a great CUBE conversation with you guys. This is very much about the company focuses. It's a feature presentation for the "Startup Showcase," and the machine learning at scale is the topic, but in general, it's more, (laughs) and we should call it "Machine Learning and AI: How to Get Started," because everybody is retooling their business. Companies that aren't retooling their business right now with AI first will be out of business, in my opinion. You're seeing massive shift. This is really truly the beginning of the next-gen machine learning AI trend. It's really seeing ChatGPT. Everyone sees that. That went mainstream. But this is just the beginning. This is scratching the surface of this next-generation AI with machine learning powering it, and with all the goodness of cloud, cloud scale, and how horizontally scalable it is. The resources are there. You got the Edge. Everything's perfect for AI 'cause data infrastructure's exploding in value. AI is just the applications. This is a super topic, so what do you guys see in this general area of opportunities right now in the headlines? And I'm sure you guys' phone must be ringing off the hook, metaphorically speaking, or emails and meetings and Zooms. What's going on over there at Neural Magic? >> No, absolutely, and you pretty much nailed most of it. I think that, you know, my background, we've seen for the last 20-plus years. Even just getting enterprise applications kind of built and delivered at scale, obviously, amazing things with AWS and the cloud to help accelerate that. And we just kind of figured out in the last five or so years how to do that productively and efficiently, kind of from an operations perspective. Got development and operations teams. We even came up with DevOps, right? But now, we kind of have this new kind of persona and new workload that developers have to talk to, and then it has to be deployed on those ITOps solutions. And so you pretty much nailed it. Folks are saying, "Well, how do I do this?" These big, generational models or foundational models, as we're calling them, they're great, but enterprises want to do that with their data, on their infrastructure, at scale, at the edge. So for us, yeah, we're helping enterprises accelerate that through optimizing models and then delivering them at scale in a more cost-effective fashion. >> Yeah, and I think one of the things, the benefits of OpenAI we saw, was not only is it open source, then you got also other models that are more proprietary, is that it shows the world that this is really happening, right? It's a whole nother level, and there's also new landscape kind of maps coming out. You got the generative AI, and you got the foundational models, large LLMs. Where do you guys fit into the landscape? Because you guys are in the middle of this. How do you talk to customers when they say, "I'm going down this road. I need help. I'm going to stand this up." This new AI infrastructure and applications, where do you guys fit in the landscape? >> Right, and really, the answer is both. I think today, when it comes to a lot of what for some folks would still be considered kind of cutting edge around computer vision and natural language processing, a lot of our optimization tools and our runtime are based around most of the common computer vision and natural language processing models. So your YOLOs, your BERTs, you know, your DistilBERTs and what have you, so we work to help optimize those, again, who've gotten great performance and great value for customers trying to get those into production. But when you get into the LLMs, and you mentioned some of the open source components there, our research teams have kind of been right in the trenches with those. So kind of the GPT open source equivalent being OPT, being able to actually take, you know, a multi-$100 billion parameter model and sparsify that or optimize that down, shaving away a ton of parameters, and being able to run it on smaller infrastructure. So I think the evolution here, you know, all this stuff came out in the last six months in terms of being turned loose into the wild, but we're staying in the trenches with folks so that we can help optimize those as well and not require, again, the heavy compute, the heavy cost, the heavy power consumption as those models evolve as well. So we're staying right in with everybody while they're being built, but trying to get folks into production today with things that help with business value today. >> Jay, I really appreciate you coming on theCUBE, and before we came on camera, you said you just were on a customer call. I know you got a lot of activity. What specific things are you helping enterprises solve? What kind of problems? Take us through the spectrum from the beginning, people jumping in the deep end of the pool, some people kind of coming in, starting out slow. What are the scale? Can you scope the kind of use cases and problems that are emerging that people are calling you for? >> Absolutely, so I think if I break it down to kind of, like, your startup, or I maybe call 'em AI native to kind of steal from cloud native years ago, that group, it's pretty much, you know, part and parcel for how that group already runs. So if you have a data science team and an ML engineering team, you're building models, you're training models, you're deploying models. You're seeing firsthand the expense of starting to try to do that at scale. So it's really just a pure operational efficiency play. They kind of speak natively to our tools, which we're doing in the open source. So it's really helping, again, with the optimization of the models they've built, and then, again, giving them an alternative to expensive proprietary hardware accelerators to have to run them. Now, on the enterprise side, it varies, right? You have some kind of AI native folks there that already have these teams, but you also have kind of, like, AI curious, right? Like, they want to do it, but they don't really know where to start, and so for there, we actually have an open source toolkit that can help you get into this optimization, and then again, that runtime, that inferencing runtime, purpose-built for CPUs. It allows you to not have to worry, again, about do I have a hardware accelerator available? How do I integrate that into my application stack? If I don't already know how to build this into my infrastructure, does my ITOps teams, do they know how to do this, and what does that runway look like? How do I cost for this? How do I plan for this? When it's just x86 compute, we've been doing that for a while, right? So it obviously still requires more, but at least it's a little bit more predictable. >> It's funny you mentioned AI native. You know, born in the cloud was a phrase that was out there. Now, you have startups that are born in AI companies. So I think you have this kind of cloud kind of vibe going on. You have lift and shift was a big discussion. Then you had cloud native, kind of in the cloud, kind of making it all work. Is there a existing set of things? People will throw on this hat, and then what's the difference between AI native and kind of providing it to existing stuff? 'Cause we're a lot of people take some of these tools and apply it to either existing stuff almost, and it's not really a lift and shift, but it's kind of like bolting on AI to something else, and then starting with AI first or native AI. >> Absolutely. It's a- >> How would you- >> It's a great question. I think that probably, where I'd probably pull back to kind of allow kind of retail-type scenarios where, you know, for five, seven, nine years or more even, a lot of these folks already have data science teams, you know? I mean, they've been doing this for quite some time. The difference is the introduction of these neural networks and deep learning, right? Those kinds of models are just a little bit of a paradigm shift. So, you know, I obviously was trying to be fun with the term AI native, but I think it's more folks that kind of came up in that neural network world, so it's a little bit more second nature, whereas I think for maybe some traditional data scientists starting to get into neural networks, you have the complexity there and the training overhead, and a lot of the aspects of getting a model finely tuned and hyperparameterization and all of these aspects of it. It just adds a layer of complexity that they're just not as used to dealing with. And so our goal is to help make that easy, and then of course, make it easier to run anywhere that you have just kind of standard infrastructure. >> Well, the other point I'd bring out, and I'd love to get your reaction to, is not only is that a neural network team, people who have been focused on that, but also, if you look at some of the DataOps lately, AIOps markets, a lot of data engineering, a lot of scale, folks who have been kind of, like, in that data tsunami cloud world are seeing, they kind of been in this, right? They're, like, been experiencing that. >> No doubt. I think it's funny the data lake concept, right? And you got data oceans now. Like, the metaphors just keep growing on us, but where it is valuable in terms of trying to shift the mindset, I've always kind of been a fan of some of the naming shift. I know with AWS, they always talk about purpose-built databases. And I always liked that because, you know, you don't have one database that can do everything. Even ones that say they can, like, you still have to do implementation detail differences. So sitting back and saying, "What is my use case, and then which database will I use it for?" I think it's kind of similar here. And when you're building those data teams, if you don't have folks that are doing data engineering, kind of that data harvesting, free processing, you got to do all that before a model's even going to care about it. So yeah, it's definitely a central piece of this as well, and again, whether or not you're going to be AI negative as you're making your way to kind of, you know, on that journey, you know, data's definitely a huge component of it. >> Yeah, you would have loved our Supercloud event we had. Talk about naming and, you know, around data meshes was talked about a lot. You're starting to see the control plane layers of data. I think that was the beginning of what I saw as that data infrastructure shift, to be horizontally scalable. So I have to ask you, with Neural Magic, when your customers and the people that are prospects for you guys, they're probably asking a lot of questions because I think the general thing that we see is, "How do I get started? Which GPU do I use?" I mean, there's a lot of things that are kind of, I won't say technical or targeted towards people who are living in that world, but, like, as the mainstream enterprises come in, they're going to need a playbook. What do you guys see, what do you guys offer your clients when they come in, and what do you recommend? >> Absolutely, and I think where we hook in specifically tends to be on the training side. So again, I've built a model. Now, I want to really optimize that model. And then on the runtime side when you want to deploy it, you know, we run that optimized model. And so that's where we're able to provide. We even have a labs offering in terms of being able to pair up our engineering teams with a customer's engineering teams, and we can actually help with most of that pipeline. So even if it is something where you have a dataset and you want some help in picking a model, you want some help training it, you want some help deploying that, we can actually help there as well. You know, there's also a great partner ecosystem out there, like a lot of folks even in the "Startup Showcase" here, that extend beyond into kind of your earlier comment around data engineering or downstream ITOps or the all-up MLOps umbrella. So we can absolutely engage with our labs, and then, of course, you know, again, partners, which are always kind of key to this. So you are spot on. I think what's happened with the kind of this, they talk about a hockey stick. This is almost like a flat wall now with the rate of innovation right now in this space. And so we do have a lot of folks wanting to go straight from curious to native. And so that's definitely where the partner ecosystem comes in so hard 'cause there just isn't anybody or any teams out there that, I literally do from, "Here's my blank database, and I want an API that does all the stuff," right? Like, that's a big chunk, but we can definitely help with the model to delivery piece. >> Well, you guys are obviously a featured company in this space. Talk about the expertise. A lot of companies are like, I won't say faking it till they make it. You can't really fake security. You can't really fake AI, right? So there's going to be a learning curve. They'll be a few startups who'll come out of the gate early. You guys are one of 'em. Talk about what you guys have as expertise as a company, why you're successful, and what problems do you solve for customers? >> No, appreciate that. Yeah, we actually, we love to tell the story of our founder, Nir Shavit. So he's a 20-year professor at MIT. Actually, he was doing a lot of work on kind of multicore processing before there were even physical multicores, and actually even did a stint in computational neurobiology in the 2010s, and the impetus for this whole technology, has a great talk on YouTube about it, where he talks about the fact that his work there, he kind of realized that the way neural networks encode and how they're executed by kind of ramming data layer by layer through these kind of HPC-style platforms, actually was not analogous to how the human brain actually works. So we're on one side, we're building neural networks, and we're trying to emulate neurons. We're not really executing them that way. So our team, which one of the co-founders, also an ex-MIT, that was kind of the birth of why can't we leverage this super-performance CPU platform, which has those really fat, fast caches attached to each core, and actually start to find a way to break that model down in a way that I can execute things in parallel, not having to do them sequentially? So it is a lot of amazing, like, talks and stuff that show kind of the magic, if you will, a part of the pun of Neural Magic, but that's kind of the foundational layer of all the engineering that we do here. And in terms of how we're able to bring it to reality for customers, I'll give one customer quote where it's a large retailer, and it's a people-counting application. So a very common application. And that customer's actually been able to show literally double the amount of cameras being run with the same amount of compute. So for a one-to-one perspective, two-to-one, business leaders usually like that math, right? So we're able to show pure cost savings, but even performance-wise, you know, we have some of the common models like your ResNets and your YOLOs, where we can actually even perform better than hardware-accelerated solutions. So we're trying to do, I need to just dumb it down to better, faster, cheaper, but from a commodity perspective, that's where we're accelerating. >> That's not a bad business model. Make things easier to use, faster, and reduce the steps it takes to do stuff. So, you know, that's always going to be a good market. Now, you guys have DeepSparse, which we've talked about on our CUBE conversation prior to this interview, delivers ML models through the software so the hardware allows for a decoupling, right? >> Yep. >> Which is going to drive probably a cost advantage. Also, it's also probably from a deployment standpoint it must be easier. Can you share the benefits? Is it a cost side? Is it more of a deployment? What are the benefits of the DeepSparse when you guys decouple the software from the hardware on the ML models? >> No you actually, you hit 'em both 'cause that really is primarily the value. Because ultimately, again, we're so early. And I came from this world in a prior life where I'm doing Java development, WebSphere, WebLogic, Tomcat open source, right? When we were trying to do innovation, we had innovation buckets, 'cause everybody wanted to be on the web and have their app and a browser, right? We got all the money we needed to build something and show, hey, look at the thing on the web, right? But when you had to get in production, that was the challenge. So to what you're speaking to here, in this situation, we're able to show we're just a Python package. So whether you just install it on the operating system itself, or we also have a containerized version you can drop on any container orchestration platform, so ECS or EKS on AWS. And so you get all the auto-scaling features. So when you think about that kind of a world where you have everything from real-time inferencing to kind of after hours batch processing inferencing, the fact that you can auto scale that hardware up and down and it's CPU based, so you're paying by the minute instead of maybe paying by the hour at a lower cost shelf, it does everything from pure cost to, again, I can have my standard IT team say, "Hey, here's the Kubernetes in the container," and it just runs on the infrastructure we're already managing. So yeah, operational, cost and again, and many times even performance. (audio warbles) CPUs if I want to. >> Yeah, so that's easier on the deployment too. And you don't have this kind of, you know, blank check kind of situation where you don't know what's on the backend on the cost side. >> Exactly. >> And you control the actual hardware and you can manage that supply chain. >> And keep in mind, exactly. Because the other thing that sometimes gets lost in the conversation, depending on where a customer is, some of these workloads, like, you know, you and I remember a world where even like the roundtrip to the cloud and back was a problem for folks, right? We're used to extremely low latency. And some of these workloads absolutely also adhere to that. But there's some workloads where the latency isn't as important. And we actually even provide the tuning. Now, if we're giving you five milliseconds of latency and you don't need that, you can tune that back. So less CPU, lower cost. Now, throughput and other things come into play. But that's the kind of configurability and flexibility we give for operations. >> All right, so why should I call you if I'm a customer or prospect Neural Magic, what problem do I have or when do I know I need you guys? When do I call you in and what does my environment look like? When do I know? What are some of the signals that would tell me that I need Neural Magic? >> No, absolutely. So I think in general, any neural network, you know, the process I mentioned before called sparcification, it's, you know, an optimization process that we specialize in. Any neural network, you know, can be sparcified. So I think if it's a deep-learning neural network type model. If you're trying to get AI into production, you have cost concerns even performance-wise. I certainly hate to be too generic and say, "Hey, we'll talk to everybody." But really in this world right now, if it's a neural network, it's something where you're trying to get into production, you know, we are definitely offering, you know, kind of an at-scale performant deployable solution for deep learning models. >> So neural network you would define as what? Just devices that are connected that need to know about each other? What's the state-of-the-art current definition of neural network for customers that may think they have a neural network or might not know they have a neural network architecture? What is that definition for neural network? >> That's a great question. So basically, machine learning models that fall under this kind of category, you hear about transformers a lot, or I mentioned about YOLO, the YOLO family of computer vision models, or natural language processing models like BERT. If you have a data science team or even developers, some even regular, I used to call myself a nine to five developer 'cause I worked in the enterprise, right? So like, hey, we found a new open source framework, you know, I used to use Spring back in the day and I had to go figure it out. There's developers that are pulling these models down and they're figuring out how to get 'em into production, okay? So I think all of those kinds of situations, you know, if it's a machine learning model of the deep learning variety that's, you know, really specifically where we shine. >> Okay, so let me pretend I'm a customer for a minute. I have all these videos, like all these transcripts, I have all these people that we've interviewed, CUBE alumnis, and I say to my team, "Let's AI-ify, sparcify theCUBE." >> Yep. >> What do I do? I mean, do I just like, my developers got to get involved and they're going to be like, "Well, how do I upload it to the cloud? Do I use a GPU?" So there's a thought process. And I think a lot of companies are going through that example of let's get on this AI, how can it help our business? >> Absolutely. >> What does that progression look like? Take me through that example. I mean, I made up theCUBE example up, but we do have a lot of data. We have large data models and we have people and connect to the internet and so we kind of seem like there's a neural network. I think every company might have a neural network in place. >> Well, and I was going to say, I think in general, you all probably do represent even the standard enterprise more than most. 'Cause even the enterprise is going to have a ton of video content, a ton of text content. So I think it's a great example. So I think that that kind of sea or I'll even go ahead and use that term data lake again, of data that you have, you're probably going to want to be setting up kind of machine learning pipelines that are going to be doing all of the pre-processing from kind of the raw data to kind of prepare it into the format that say a YOLO would actually use or let's say BERT for natural language processing. So you have all these transcripts, right? So we would do a pre-processing path where we would create that into the file format that BERT, the machine learning model would know how to train off of. So that's kind of all the pre-processing steps. And then for training itself, we actually enable what's called sparse transfer learning. So that's transfer learning is a very popular method of doing training with existing models. So we would be able to retrain that BERT model with your transcript data that we have now done the pre-processing with to get it into the proper format. And now we have a BERT natural language processing model that's been trained on your data. And now we can deploy that onto DeepSparse runtime so that now you can ask that model whatever questions, or I should say pass, you're not going to ask it those kinds of questions ChatGPT, although we can do that too. But you're going to pass text through the BERT model and it's going to give you answers back. It could be things like sentiment analysis or text classification. You just call the model, and now when you pass text through it, you get the answers better, faster or cheaper. I'll use that reference again. >> Okay, we can create a CUBE bot to give us questions on the fly from the the AI bot, you know, from our previous guests. >> Well, and I will tell you using that as an example. So I had mentioned OPT before, kind of the open source version of ChatGPT. So, you know, typically that requires multiple GPUs to run. So our research team, I may have mentioned earlier, we've been able to sparcify that over 50% already and run it on only a single GPU. And so in that situation, you could train OPT with that corpus of data and do exactly what you say. Actually we could use Alexa, we could use Alexa to actually respond back with voice. How about that? We'll do an API call and we'll actually have an interactive Alexa-enabled bot. >> Okay, we're going to be a customer, let's put it on the list. But this is a great example of what you guys call software delivered AI, a topic we chatted about on theCUBE conversation. This really means this is a developer opportunity. This really is the convergence of the data growth, the restructuring, how data is going to be horizontally scalable, meets developers. So this is an AI developer model going on right now, which is kind of unique. >> It is, John, I will tell you what's interesting. And again, folks don't always think of it this way, you know, the AI magical goodness is now getting pushed in the middle where the developers and IT are operating. And so it again, that paradigm, although for some folks seem obvious, again, if you've been around for 20 years, that whole all that plumbing is a thing, right? And so what we basically help with is when you deploy the DeepSparse runtime, we have a very rich API footprint. And so the developers can call the API, ITOps can run it, or to your point, it's developer friendly enough that you could actually deploy our off-the-shelf models. We have something called the SparseZoo where we actually publish pre-optimized or pre-sparcified models. And so developers could literally grab those right off the shelf with the training they've already had and just put 'em right into their applications and deploy them as containers. So yeah, we enable that for sure as well. >> It's interesting, DevOps was infrastructure as code and we had a last season, a series on data as code, which we kind of coined. This is data as code. This is a whole nother level of opportunity where developers just want to have programmable data and apps with AI. This is a whole new- >> Absolutely. >> Well, absolutely great, great stuff. Our news team at SiliconANGLE and theCUBE said you guys had a little bit of a launch announcement you wanted to make here on the "AWS Startup Showcase." So Jay, you have something that you want to launch here? >> Yes, and thank you John for teeing me up. So I'm going to try to put this in like, you know, the vein of like an AWS, like main stage keynote launch, okay? So we're going to try this out. So, you know, a lot of our product has obviously been built on top of x86. I've been sharing that the past 15 minutes or so. And with that, you know, we're seeing a lot of acceleration for folks wanting to run on commodity infrastructure. But we've had customers and prospects and partners tell us that, you know, ARM and all of its kind of variance are very compelling, both cost performance-wise and also obviously with Edge. And wanted to know if there was anything we could do from a runtime perspective with ARM. And so we got the work and, you know, it's a hard problem to solve 'cause the instructions set for ARM is very different than the instruction set for x86, and our deep tensor column technology has to be able to work with that lower level instruction spec. But working really hard, the engineering team's been at it and we are happy to announce here at the "AWS Startup Showcase," that DeepSparse inference now has, or inference runtime now has support for AWS Graviton instances. So it's no longer just x86, it is also ARM and that obviously also opens up the door to Edge and further out the stack so that optimize once run anywhere, we're not going to open up. So it is an early access. So if you go to neuralmagic.com/graviton, you can sign up for early access, but we're excited to now get into the ARM side of the fence as well on top of Graviton. >> That's awesome. Our news team is going to jump on that news. We'll get it right up. We get a little scoop here on the "Startup Showcase." Jay Marshall, great job. That really highlights the flexibility that you guys have when you decouple the software from the hardware. And again, we're seeing open source driving a lot more in AI ops now with with machine learning and AI. So to me, that makes a lot of sense. And congratulations on that announcement. Final minute or so we have left, give a summary of what you guys are all about. Put a plug in for the company, what you guys are looking to do. I'm sure you're probably hiring like crazy. Take the last few minutes to give a plug for the company and give a summary. >> No, I appreciate that so much. So yeah, joining us out neuralmagic.com, you know, part of what we didn't spend a lot of time here, our optimization tools, we are doing all of that in the open source. It's called SparseML and I mentioned SparseZoo briefly. So we really want the data scientists community and ML engineering community to join us out there. And again, the DeepSparse runtime, it's actually free to use for trial purposes and for personal use. So you can actually run all this on your own laptop or on an AWS instance of your choice. We are now live in the AWS marketplace. So push button, deploy, come try us out and reach out to us on neuralmagic.com. And again, sign up for the Graviton early access. >> All right, Jay Marshall, Vice President of Business Development Neural Magic here, talking about performant, cost effective machine learning at scale. This is season three, episode one, focusing on foundational models as far as building data infrastructure and AI, AI native. I'm John Furrier with theCUBE. Thanks for watching. (bright upbeat music)

Published Date : Mar 9 2023

SUMMARY :

of the "AWS Startup Showcase." Thanks for having us. and the machine learning and the cloud to help accelerate that. and you got the foundational So kind of the GPT open deep end of the pool, that group, it's pretty much, you know, So I think you have this kind It's a- and a lot of the aspects of and I'd love to get your reaction to, And I always liked that because, you know, that are prospects for you guys, and you want some help in picking a model, Talk about what you guys have that show kind of the magic, if you will, and reduce the steps it takes to do stuff. when you guys decouple the the fact that you can auto And you don't have this kind of, you know, the actual hardware and you and you don't need that, neural network, you know, of situations, you know, CUBE alumnis, and I say to my team, and they're going to be like, and connect to the internet and it's going to give you answers back. you know, from our previous guests. and do exactly what you say. of what you guys call enough that you could actually and we had a last season, that you want to launch here? And so we got the work and, you know, flexibility that you guys have So you can actually run Vice President of Business

ENTITIES

Entity	Category	Confidence
Jay	PERSON	0.99+
Jay Marshall	PERSON	0.99+
John Furrier	PERSON	0.99+
John	PERSON	0.99+
AWS	ORGANIZATION	0.99+
five	QUANTITY	0.99+
Nir Shavit	PERSON	0.99+
20-year	QUANTITY	0.99+
Alexa	TITLE	0.99+
2010s	DATE	0.99+
seven	QUANTITY	0.99+
Python	TITLE	0.99+
MIT	ORGANIZATION	0.99+
each core	QUANTITY	0.99+
Neural Magic	ORGANIZATION	0.99+
Java	TITLE	0.99+
YouTube	ORGANIZATION	0.99+
Today	DATE	0.99+
nine years	QUANTITY	0.98+
both	QUANTITY	0.98+
BERT	TITLE	0.98+
theCUBE	ORGANIZATION	0.98+
ChatGPT	TITLE	0.98+
20 years	QUANTITY	0.98+
over 50%	QUANTITY	0.97+
second nature	QUANTITY	0.96+
today	DATE	0.96+
ARM	ORGANIZATION	0.96+
one	QUANTITY	0.95+
DeepSparse	TITLE	0.94+
neuralmagic.com/graviton	OTHER	0.94+
SiliconANGLE	ORGANIZATION	0.94+
WebSphere	TITLE	0.94+
nine	QUANTITY	0.94+
first	QUANTITY	0.93+
Startup Showcase	EVENT	0.93+
five milliseconds	QUANTITY	0.92+
AWS Startup Showcase	EVENT	0.91+
two	QUANTITY	0.9+
YOLO	ORGANIZATION	0.89+
CUBE	ORGANIZATION	0.88+
OPT	TITLE	0.88+
last six months	DATE	0.88+
season three	QUANTITY	0.86+
double	QUANTITY	0.86+
one customer	QUANTITY	0.86+
Supercloud	EVENT	0.86+
one side	QUANTITY	0.85+
Vice	PERSON	0.85+
x86	OTHER	0.83+
AI/ML: Top Startups Building Foundational Models	TITLE	0.82+
ECS	TITLE	0.81+
$100 billion	QUANTITY	0.81+
DevOps	TITLE	0.81+
WebLogic	TITLE	0.8+
EKS	TITLE	0.8+
a minute	QUANTITY	0.8+
neuralmagic.com	OTHER	0.79+

Robert Nishihara, Anyscale | AWS Startup Showcase S3 E1

(upbeat music) >> Hello everyone. Welcome to theCube's presentation of the "AWS Startup Showcase." The topic this episode is AI and machine learning, top startups building foundational model infrastructure. This is season three, episode one of the ongoing series covering exciting startups from the AWS ecosystem. And this time we're talking about AI and machine learning. I'm your host, John Furrier. I'm excited I'm joined today by Robert Nishihara, who's the co-founder and CEO of a hot startup called Anyscale. He's here to talk about Ray, the open source project, Anyscale's infrastructure for foundation as well. Robert, thank you for joining us today. >> Yeah, thanks so much as well. >> I've been following your company since the founding pre pandemic and you guys really had a great vision scaled up and in a perfect position for this big wave that we all see with ChatGPT and OpenAI that's gone mainstream. Finally, AI has broken out through the ropes and now gone mainstream, so I think you guys are really well positioned. I'm looking forward to to talking with you today. But before we get into it, introduce the core mission for Anyscale. Why do you guys exist? What is the North Star for Anyscale? >> Yeah, like you mentioned, there's a tremendous amount of excitement about AI right now. You know, I think a lot of us believe that AI can transform just every different industry. So one of the things that was clear to us when we started this company was that the amount of compute needed to do AI was just exploding. Like to actually succeed with AI, companies like OpenAI or Google or you know, these companies getting a lot of value from AI, were not just running these machine learning models on their laptops or on a single machine. They were scaling these applications across hundreds or thousands or more machines and GPUs and other resources in the Cloud. And so to actually succeed with AI, and this has been one of the biggest trends in computing, maybe the biggest trend in computing in, you know, in recent history, the amount of compute has been exploding. And so to actually succeed with that AI, to actually build these scalable applications and scale the AI applications, there's a tremendous software engineering lift to build the infrastructure to actually run these scalable applications. And that's very hard to do. So one of the reasons many AI projects and initiatives fail is that, or don't make it to production, is the need for this scale, the infrastructure lift, to actually make it happen. So our goal here with Anyscale and Ray, is to make that easy, is to make scalable computing easy. So that as a developer or as a business, if you want to do AI, if you want to get value out of AI, all you need to know is how to program on your laptop. Like, all you need to know is how to program in Python. And if you can do that, then you're good to go. Then you can do what companies like OpenAI or Google do and get value out of machine learning. >> That programming example of how easy it is with Python reminds me of the early days of Cloud, when infrastructure as code was talked about was, it was just code the infrastructure programmable. That's super important. That's what AI people wanted, first program AI. That's the new trend. And I want to understand, if you don't mind explaining, the relationship that Anyscale has to these foundational models and particular the large language models, also called LLMs, was seen with like OpenAI and ChatGPT. Before you get into the relationship that you have with them, can you explain why the hype around foundational models? Why are people going crazy over foundational models? What is it and why is it so important? >> Yeah, so foundational models and foundation models are incredibly important because they enable businesses and developers to get value out of machine learning, to use machine learning off the shelf with these large models that have been trained on tons of data and that are useful out of the box. And then, of course, you know, as a business or as a developer, you can take those foundational models and repurpose them or fine tune them or adapt them to your specific use case and what you want to achieve. But it's much easier to do that than to train them from scratch. And I think there are three, for people to actually use foundation models, there are three main types of workloads or problems that need to be solved. One is training these foundation models in the first place, like actually creating them. The second is fine tuning them and adapting them to your use case. And the third is serving them and actually deploying them. Okay, so Ray and Anyscale are used for all of these three different workloads. Companies like OpenAI or Cohere that train large language models. Or open source versions like GPTJ are done on top of Ray. There are many startups and other businesses that fine tune, that, you know, don't want to train the large underlying foundation models, but that do want to fine tune them, do want to adapt them to their purposes, and build products around them and serve them, those are also using Ray and Anyscale for that fine tuning and that serving. And so the reason that Ray and Anyscale are important here is that, you know, building and using foundation models requires a huge scale. It requires a lot of data. It requires a lot of compute, GPUs, TPUs, other resources. And to actually take advantage of that and actually build these scalable applications, there's a lot of infrastructure that needs to happen under the hood. And so you can either use Ray and Anyscale to take care of that and manage the infrastructure and solve those infrastructure problems. Or you can build the infrastructure and manage the infrastructure yourself, which you can do, but it's going to slow your team down. It's going to, you know, many of the businesses we work with simply don't want to be in the business of managing infrastructure and building infrastructure. They want to focus on product development and move faster. >> I know you got a keynote presentation we're going to go to in a second, but I think you hit on something I think is the real tipping point, doing it yourself, hard to do. These are things where opportunities are and the Cloud did that with data centers. Turned a data center and made it an API. The heavy lifting went away and went to the Cloud so people could be more creative and build their product. In this case, build their creativity. Is that kind of what's the big deal? Is that kind of a big deal happening that you guys are taking the learnings and making that available so people don't have to do that? >> That's exactly right. So today, if you want to succeed with AI, if you want to use AI in your business, infrastructure work is on the critical path for doing that. To do AI, you have to build infrastructure. You have to figure out how to scale your applications. That's going to change. We're going to get to the point, and you know, with Ray and Anyscale, we're going to remove the infrastructure from the critical path so that as a developer or as a business, all you need to focus on is your application logic, what you want the the program to do, what you want your application to do, how you want the AI to actually interface with the rest of your product. Now the way that will happen is that Ray and Anyscale will still, the infrastructure work will still happen. It'll just be under the hood and taken care of by Ray in Anyscale. And so I think something like this is really necessary for AI to reach its potential, for AI to have the impact and the reach that we think it will, you have to make it easier to do. >> And just for clarification to point out, if you don't mind explaining the relationship of Ray and Anyscale real quick just before we get into the presentation. >> So Ray is an open source project. We created it. We were at Berkeley doing machine learning. We started Ray so that, in order to provide an easy, a simple open source tool for building and running scalable applications. And Anyscale is the managed version of Ray, basically we will run Ray for you in the Cloud, provide a lot of tools around the developer experience and managing the infrastructure and providing more performance and superior infrastructure. >> Awesome. I know you got a presentation on Ray and Anyscale and you guys are positioning as the infrastructure for foundational models. So I'll let you take it away and then when you're done presenting, we'll come back, I'll probably grill you with a few questions and then we'll close it out so take it away. >> Robert: Sounds great. So I'll say a little bit about how companies are using Ray and Anyscale for foundation models. The first thing I want to mention is just why we're doing this in the first place. And the underlying observation, the underlying trend here, and this is a plot from OpenAI, is that the amount of compute needed to do machine learning has been exploding. It's been growing at something like 35 times every 18 months. This is absolutely enormous. And other people have written papers measuring this trend and you get different numbers. But the point is, no matter how you slice and dice it, it' a astronomical rate. Now if you compare that to something we're all familiar with, like Moore's Law, which says that, you know, the processor performance doubles every roughly 18 months, you can see that there's just a tremendous gap between the needs, the compute needs of machine learning applications, and what you can do with a single chip, right. So even if Moore's Law were continuing strong and you know, doing what it used to be doing, even if that were the case, there would still be a tremendous gap between what you can do with the chip and what you need in order to do machine learning. And so given this graph, what we've seen, and what has been clear to us since we started this company, is that doing AI requires scaling. There's no way around it. It's not a nice to have, it's really a requirement. And so that led us to start Ray, which is the open source project that we started to make it easy to build these scalable Python applications and scalable machine learning applications. And since we started the project, it's been adopted by a tremendous number of companies. Companies like OpenAI, which use Ray to train their large models like ChatGPT, companies like Uber, which run all of their deep learning and classical machine learning on top of Ray, companies like Shopify or Spotify or Instacart or Lyft or Netflix, ByteDance, which use Ray for their machine learning infrastructure. Companies like Ant Group, which makes Alipay, you know, they use Ray across the board for fraud detection, for online learning, for detecting money laundering, you know, for graph processing, stream processing. Companies like Amazon, you know, run Ray at a tremendous scale and just petabytes of data every single day. And so the project has seen just enormous adoption since, over the past few years. And one of the most exciting use cases is really providing the infrastructure for building training, fine tuning, and serving foundation models. So I'll say a little bit about, you know, here are some examples of companies using Ray for foundation models. Cohere trains large language models. OpenAI also trains large language models. You can think about the workloads required there are things like supervised pre-training, also reinforcement learning from human feedback. So this is not only the regular supervised learning, but actually more complex reinforcement learning workloads that take human input about what response to a particular question, you know is better than a certain other response. And incorporating that into the learning. There's open source versions as well, like GPTJ also built on top of Ray as well as projects like Alpa coming out of UC Berkeley. So these are some of the examples of exciting projects in organizations, training and creating these large language models and serving them using Ray. Okay, so what actually is Ray? Well, there are two layers to Ray. At the lowest level, there's the core Ray system. This is essentially low level primitives for building scalable Python applications. Things like taking a Python function or a Python class and executing them in the cluster setting. So Ray core is extremely flexible and you can build arbitrary scalable applications on top of Ray. So on top of Ray, on top of the core system, what really gives Ray a lot of its power is this ecosystem of scalable libraries. So on top of the core system you have libraries, scalable libraries for ingesting and pre-processing data, for training your models, for fine tuning those models, for hyper parameter tuning, for doing batch processing and batch inference, for doing model serving and deployment, right. And a lot of the Ray users, the reason they like Ray is that they want to run multiple workloads. They want to train and serve their models, right. They want to load their data and feed that into training. And Ray provides common infrastructure for all of these different workloads. So this is a little overview of what Ray, the different components of Ray. So why do people choose to go with Ray? I think there are three main reasons. The first is the unified nature. The fact that it is common infrastructure for scaling arbitrary workloads, from data ingest to pre-processing to training to inference and serving, right. This also includes the fact that it's future proof. AI is incredibly fast moving. And so many people, many companies that have built their own machine learning infrastructure and standardized on particular workflows for doing machine learning have found that their workflows are too rigid to enable new capabilities. If they want to do reinforcement learning, if they want to use graph neural networks, they don't have a way of doing that with their standard tooling. And so Ray, being future proof and being flexible and general gives them that ability. Another reason people choose Ray in Anyscale is the scalability. This is really our bread and butter. This is the reason, the whole point of Ray, you know, making it easy to go from your laptop to running on thousands of GPUs, making it easy to scale your development workloads and run them in production, making it easy to scale, you know, training to scale data ingest, pre-processing and so on. So scalability and performance, you know, are critical for doing machine learning and that is something that Ray provides out of the box. And lastly, Ray is an open ecosystem. You can run it anywhere. You can run it on any Cloud provider. Google, you know, Google Cloud, AWS, Asure. You can run it on your Kubernetes cluster. You can run it on your laptop. It's extremely portable. And not only that, it's framework agnostic. You can use Ray to scale arbitrary Python workloads. You can use it to scale and it integrates with libraries like TensorFlow or PyTorch or JAX or XG Boost or Hugging Face or PyTorch Lightning, right, or Scikit-learn or just your own arbitrary Python code. It's open source. And in addition to integrating with the rest of the machine learning ecosystem and these machine learning frameworks, you can use Ray along with all of the other tooling in the machine learning ecosystem. That's things like weights and biases or ML flow, right. Or you know, different data platforms like Databricks, you know, Delta Lake or Snowflake or tools for model monitoring for feature stores, all of these integrate with Ray. And that's, you know, Ray provides that kind of flexibility so that you can integrate it into the rest of your workflow. And then Anyscale is the scalable compute platform that's built on top, you know, that provides Ray. So Anyscale is a managed Ray service that runs in the Cloud. And what Anyscale does is it offers the best way to run Ray. And if you think about what you get with Anyscale, there are fundamentally two things. One is about moving faster, accelerating the time to market. And you get that by having the managed service so that as a developer you don't have to worry about managing infrastructure, you don't have to worry about configuring infrastructure. You also, it provides, you know, optimized developer workflows. Things like easily moving from development to production, things like having the observability tooling, the debug ability to actually easily diagnose what's going wrong in a distributed application. So things like the dashboards and the other other kinds of tooling for collaboration, for monitoring and so on. And then on top of that, so that's the first bucket, developer productivity, moving faster, faster experimentation and iteration. The second reason that people choose Anyscale is superior infrastructure. So this is things like, you know, cost deficiency, being able to easily take advantage of spot instances, being able to get higher GPU utilization, things like faster cluster startup times and auto scaling. Things like just overall better performance and faster scheduling. And so these are the kinds of things that Anyscale provides on top of Ray. It's the managed infrastructure. It's fast, it's like the developer productivity and velocity as well as performance. So this is what I wanted to share about Ray in Anyscale. >> John: Awesome. >> Provide that context. But John, I'm curious what you think. >> I love it. I love the, so first of all, it's a platform because that's the platform architecture right there. So just to clarify, this is an Anyscale platform, not- >> That's right. >> Tools. So you got tools in the platform. Okay, that's key. Love that managed service. Just curious, you mentioned Python multiple times, is that because of PyTorch and TensorFlow or Python's the most friendly with machine learning or it's because it's very common amongst all developers? >> That's a great question. Python is the language that people are using to do machine learning. So it's the natural starting point. Now, of course, Ray is actually designed in a language agnostic way and there are companies out there that use Ray to build scalable Java applications. But for the most part right now we're focused on Python and being the best way to build these scalable Python and machine learning applications. But, of course, down the road there always is that potential. >> So if you're slinging Python code out there and you're watching that, you're watching this video, get on Anyscale bus quickly. Also, I just, while you were giving the presentation, I couldn't help, since you mentioned OpenAI, which by the way, congratulations 'cause they've had great scale, I've noticed in their rapid growth 'cause they were the fastest company to the number of users than anyone in the history of the computer industry, so major successor, OpenAI and ChatGPT, huge fan. I'm not a skeptic at all. I think it's just the beginning, so congratulations. But I actually typed into ChatGPT, what are the top three benefits of Anyscale and came up with scalability, flexibility, and ease of use. Obviously, scalability is what you guys are called. >> That's pretty good. >> So that's what they came up with. So they nailed it. Did you have an inside prompt training, buy it there? Only kidding. (Robert laughs) >> Yeah, we hard coded that one. >> But that's the kind of thing that came up really, really quickly if I asked it to write a sales document, it probably will, but this is the future interface. This is why people are getting excited about the foundational models and the large language models because it's allowing the interface with the user, the consumer, to be more human, more natural. And this is clearly will be in every application in the future. >> Absolutely. This is how people are going to interface with software, how they're going to interface with products in the future. It's not just something, you know, not just a chat bot that you talk to. This is going to be how you get things done, right. How you use your web browser or how you use, you know, how you use Photoshop or how you use other products. Like you're not going to spend hours learning all the APIs and how to use them. You're going to talk to it and tell it what you want it to do. And of course, you know, if it doesn't understand it, it's going to ask clarifying questions. You're going to have a conversation and then it'll figure it out. >> This is going to be one of those things, we're going to look back at this time Robert and saying, "Yeah, from that company, that was the beginning of that wave." And just like AWS and Cloud Computing, the folks who got in early really were in position when say the pandemic came. So getting in early is a good thing and that's what everyone's talking about is getting in early and playing around, maybe replatforming or even picking one or few apps to refactor with some staff and managed services. So people are definitely jumping in. So I have to ask you the ROI cost question. You mentioned some of those, Moore's Law versus what's going on in the industry. When you look at that kind of scale, the first thing that jumps out at people is, "Okay, I love it. Let's go play around." But what's it going to cost me? Am I going to be tied to certain GPUs? What's the landscape look like from an operational standpoint, from the customer? Are they locked in and the benefit was flexibility, are you flexible to handle any Cloud? What is the customers, what are they looking at? Basically, that's my question. What's the customer looking at? >> Cost is super important here and many of the companies, I mean, companies are spending a huge amount on their Cloud computing, on AWS, and on doing AI, right. And I think a lot of the advantage of Anyscale, what we can provide here is not only better performance, but cost efficiency. Because if we can run something faster and more efficiently, it can also use less resources and you can lower your Cloud spending, right. We've seen companies go from, you know, 20% GPU utilization with their current setup and the current tools they're using to running on Anyscale and getting more like 95, you know, 100% GPU utilization. That's something like a five x improvement right there. So depending on the kind of application you're running, you know, it's a significant cost savings. We've seen companies that have, you know, processing petabytes of data every single day with Ray going from, you know, getting order of magnitude cost savings by switching from what they were previously doing to running their application on Ray. And when you have applications that are spending, you know, potentially $100 million a year and getting a 10 X cost savings is just absolutely enormous. So these are some of the kinds of- >> Data infrastructure is super important. Again, if the customer, if you're a prospect to this and thinking about going in here, just like the Cloud, you got infrastructure, you got the platform, you got SaaS, same kind of thing's going to go on in AI. So I want to get into that, you know, ROI discussion and some of the impact with your customers that are leveraging the platform. But first I hear you got a demo. >> Robert: Yeah, so let me show you, let me give you a quick run through here. So what I have open here is the Anyscale UI. I've started a little Anyscale Workspace. So Workspaces are the Anyscale concept for interactive developments, right. So here, imagine I'm just, you want to have a familiar experience like you're developing on your laptop. And here I have a terminal. It's not on my laptop. It's actually in the cloud running on Anyscale. And I'm just going to kick this off. This is going to train a large language model, so OPT. And it's doing this on 32 GPUs. We've got a cluster here with a bunch of CPU cores, bunch of memory. And as that's running, and by the way, if I wanted to run this on instead of 32 GPUs, 64, 128, this is just a one line change when I launch the Workspace. And what I can do is I can pull up VS code, right. Remember this is the interactive development experience. I can look at the actual code. Here it's using Ray train to train the torch model. We've got the training loop and we're saying that each worker gets access to one GPU and four CPU cores. And, of course, as I make the model larger, this is using deep speed, as I make the model larger, I could increase the number of GPUs that each worker gets access to, right. And how that is distributed across the cluster. And if I wanted to run on CPUs instead of GPUs or a different, you know, accelerator type, again, this is just a one line change. And here we're using Ray train to train the models, just taking my vanilla PyTorch model using Hugging Face and then scaling that across a bunch of GPUs. And, of course, if I want to look at the dashboard, I can go to the Ray dashboard. There are a bunch of different visualizations I can look at. I can look at the GPU utilization. I can look at, you know, the CPU utilization here where I think we're currently loading the model and running that actual application to start the training. And some of the things that are really convenient here about Anyscale, both I can get that interactive development experience with VS code. You know, I can look at the dashboards. I can monitor what's going on. It feels, I have a terminal, it feels like my laptop, but it's actually running on a large cluster. And I can, with however many GPUs or other resources that I want. And so it's really trying to combine the best of having the familiar experience of programming on your laptop, but with the benefits, you know, being able to take advantage of all the resources in the Cloud to scale. And it's like when, you know, you're talking about cost efficiency. One of the biggest reasons that people waste money, one of the silly reasons for wasting money is just forgetting to turn off your GPUs. And what you can do here is, of course, things will auto terminate if they're idle. But imagine you go to sleep, I have this big cluster. You can turn it off, shut off the cluster, come back tomorrow, restart the Workspace, and you know, your big cluster is back up and all of your code changes are still there. All of your local file edits. It's like you just closed your laptop and came back and opened it up again. And so this is the kind of experience we want to provide for our users. So that's what I wanted to share with you. >> Well, I think that whole, couple of things, lines of code change, single line of code change, that's game changing. And then the cost thing, I mean human error is a big deal. People pass out at their computer. They've been coding all night or they just forget about it. I mean, and then it's just like leaving the lights on or your water running in your house. It's just, at the scale that it is, the numbers will add up. That's a huge deal. So I think, you know, compute back in the old days, there's no compute. Okay, it's just compute sitting there idle. But you know, data cranking the models is doing, that's a big point. >> Another thing I want to add there about cost efficiency is that we make it really easy to use, if you're running on Anyscale, to use spot instances and these preemptable instances that can just be significantly cheaper than the on-demand instances. And so when we see our customers go from what they're doing before to using Anyscale and they go from not using these spot instances 'cause they don't have the infrastructure around it, the fault tolerance to handle the preemption and things like that, to being able to just check a box and use spot instances and save a bunch of money. >> You know, this was my whole, my feature article at Reinvent last year when I met with Adam Selipsky, this next gen Cloud is here. I mean, it's not auto scale, it's infrastructure scale. It's agility. It's flexibility. I think this is where the world needs to go. Almost what DevOps did for Cloud and what you were showing me that demo had this whole SRE vibe. And remember Google had site reliability engines to manage all those servers. This is kind of like an SRE vibe for data at scale. I mean, a similar kind of order of magnitude. I mean, I might be a little bit off base there, but how would you explain it? >> It's a nice analogy. I mean, what we are trying to do here is get to the point where developers don't think about infrastructure. Where developers only think about their application logic. And where businesses can do AI, can succeed with AI, and build these scalable applications, but they don't have to build, you know, an infrastructure team. They don't have to develop that expertise. They don't have to invest years in building their internal machine learning infrastructure. They can just focus on the Python code, on their application logic, and run the stuff out of the box. >> Awesome. Well, I appreciate the time. Before we wrap up here, give a plug for the company. I know you got a couple websites. Again, go, Ray's got its own website. You got Anyscale. You got an event coming up. Give a plug for the company looking to hire. Put a plug in for the company. >> Yeah, absolutely. Thank you. So first of all, you know, we think AI is really going to transform every industry and the opportunity is there, right. We can be the infrastructure that enables all of that to happen, that makes it easy for companies to succeed with AI, and get value out of AI. Now we have, if you're interested in learning more about Ray, Ray has been emerging as the standard way to build scalable applications. Our adoption has been exploding. I mentioned companies like OpenAI using Ray to train their models. But really across the board companies like Netflix and Cruise and Instacart and Lyft and Uber, you know, just among tech companies. It's across every industry. You know, gaming companies, agriculture, you know, farming, robotics, drug discovery, you know, FinTech, we see it across the board. And all of these companies can get value out of AI, can really use AI to improve their businesses. So if you're interested in learning more about Ray and Anyscale, we have our Ray Summit coming up in September. This is going to highlight a lot of the most impressive use cases and stories across the industry. And if your business, if you want to use LLMs, you want to train these LLMs, these large language models, you want to fine tune them with your data, you want to deploy them, serve them, and build applications and products around them, give us a call, talk to us. You know, we can really take the infrastructure piece, you know, off the critical path and make that easy for you. So that's what I would say. And, you know, like you mentioned, we're hiring across the board, you know, engineering, product, go-to-market, and it's an exciting time. >> Robert Nishihara, co-founder and CEO of Anyscale, congratulations on a great company you've built and continuing to iterate on and you got growth ahead of you, you got a tailwind. I mean, the AI wave is here. I think OpenAI and ChatGPT, a customer of yours, have really opened up the mainstream visibility into this new generation of applications, user interface, roll of data, large scale, how to make that programmable so we're going to need that infrastructure. So thanks for coming on this season three, episode one of the ongoing series of the hot startups. In this case, this episode is the top startups building foundational model infrastructure for AI and ML. I'm John Furrier, your host. Thanks for watching. (upbeat music)

Published Date : Mar 9 2023

SUMMARY :

episode one of the ongoing and you guys really had and other resources in the Cloud. and particular the large language and what you want to achieve. and the Cloud did that with data centers. the point, and you know, if you don't mind explaining and managing the infrastructure and you guys are positioning is that the amount of compute needed to do But John, I'm curious what you think. because that's the platform So you got tools in the platform. and being the best way to of the computer industry, Did you have an inside prompt and the large language models and tell it what you want it to do. So I have to ask you and you can lower your So I want to get into that, you know, and you know, your big cluster is back up So I think, you know, the on-demand instances. and what you were showing me that demo and run the stuff out of the box. I know you got a couple websites. and the opportunity is there, right. and you got growth ahead

ENTITIES

Entity	Category	Confidence
Robert Nishihara	PERSON	0.99+
John	PERSON	0.99+
Robert	PERSON	0.99+
John Furrier	PERSON	0.99+
Netflix	ORGANIZATION	0.99+
35 times	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
$100 million	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
100%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Ant Group	ORGANIZATION	0.99+
first	QUANTITY	0.99+
Python	TITLE	0.99+
20%	QUANTITY	0.99+
32 GPUs	QUANTITY	0.99+
Lyft	ORGANIZATION	0.99+
hundreds	QUANTITY	0.99+
tomorrow	DATE	0.99+
Anyscale	ORGANIZATION	0.99+
three	QUANTITY	0.99+
128	QUANTITY	0.99+
September	DATE	0.99+
today	DATE	0.99+
Moore's Law	TITLE	0.99+
Adam Selipsky	PERSON	0.99+
PyTorch	TITLE	0.99+
Ray	ORGANIZATION	0.99+
second reason	QUANTITY	0.99+
64	QUANTITY	0.99+
each worker	QUANTITY	0.99+
each worker	QUANTITY	0.99+
Photoshop	TITLE	0.99+
UC Berkeley	ORGANIZATION	0.99+
Java	TITLE	0.99+
Shopify	ORGANIZATION	0.99+
OpenAI	ORGANIZATION	0.99+
Anyscale	PERSON	0.99+
third	QUANTITY	0.99+
two things	QUANTITY	0.99+
ByteDance	ORGANIZATION	0.99+
Spotify	ORGANIZATION	0.99+
One	QUANTITY	0.99+
95	QUANTITY	0.99+
Asure	ORGANIZATION	0.98+
one line	QUANTITY	0.98+
one GPU	QUANTITY	0.98+
ChatGPT	TITLE	0.98+
TensorFlow	TITLE	0.98+
last year	DATE	0.98+
first bucket	QUANTITY	0.98+
both	QUANTITY	0.98+
two layers	QUANTITY	0.98+
Cohere	ORGANIZATION	0.98+
Alipay	ORGANIZATION	0.98+
Ray	PERSON	0.97+
one	QUANTITY	0.97+
Instacart	ORGANIZATION	0.97+

Recommend Videos

Sentiment Analysis

AWS Comprehend

Search Results for OPT: