Closing Panel | Generative AI: Riding the Wave | AWS Startup Showcase S3 E1

(mellow music) >> Hello everyone, welcome to theCUBE's coverage of AWS Startup Showcase. This is the closing panel session on AI machine learning, the top startups building generative AI on AWS. It's a great panel. This is going to be the experts talking about riding the wave in generative AI. We got Ankur Mehrotra, who's the director and general manager of AI and machine learning at AWS, and Clem Delangue, co-founder and CEO of Hugging Face, and Ori Goshen, who's the co-founder and CEO of AI21 Labs. Ori from Tel Aviv dialing in, and the rest coming in here on theCUBE. Appreciate you coming on for this closing session for the Startup Showcase. >> Thanks for having us. >> Thank you for having us. >> Thank you. >> I'm super excited to have you all on. Hugging Face was recently in the news with the AWS relationship, so congratulations. Open source, open science, really driving the machine learning. And we got the AI21 Labs access to the LLMs, generating huge scale live applications, commercial applications, coming to the market, all powered by AWS. So everyone, congratulations on all your success, and thank you for headlining this panel. Let's get right into it. AWS is powering this wave here. We're seeing a lot of push here from applications. Ankur, set the table for us on the AI machine learning. It's not new, it's been goin' on for a while. Past three years have been significant advancements, but there's been a lot of work done in AI machine learning. Now it's released to the public. Everybody's super excited and now says, "Oh, the future's here!" It's kind of been going on for a while and baking. Now it's kind of coming out. What's your view here? Let's get it started. >> Yes, thank you. So, yeah, as you may be aware, Amazon has been investing in machine learning research and development for quite some time now. And we've used machine learning to innovate and improve user experiences across different Amazon products, whether it's Alexa or Amazon.com. 
But we've also brought in our expertise to extend what we are doing in the space and add more generative AI technology to our AWS products and services, starting with CodeWhisperer, which is an AWS service that we announced a few months ago, which is, you can think of it as a coding companion as a service, which uses generative AI models underneath. And so this is a service that customers who have no machine learning expertise can just use. And we also are talking to customers, and we see a lot of excitement about generative AI, and customers who want to build these models themselves, who have the talent and the expertise and resources. For them, AWS has a number of different options and capabilities they can leverage, such as our custom silicon, such as Trainium and Inferentia, as well as distributed machine learning capabilities that we offer as part of SageMaker, which is an end-to-end machine learning development service. At the same time, many of our customers tell us that they're interested in not training and building these generative AI models from scratch, given they can be expensive and can require specialized talent and skills to build. And so for those customers, we are also making it super easy to bring in existing generative AI models into their machine learning development environment within SageMaker for them to use. So we recently announced our partnership with Hugging Face, where we are making it super easy for customers to bring in those models into their SageMaker development environment for fine tuning and deployment. And then we are also partnering with other proprietary model providers such as AI21 and others, where we are making these generative AI models available within SageMaker for our customers to use. So our approach here is to really provide customers options and choices and help them accelerate their generative AI journey. >> Ankur, thank you for setting the table there. 
Clem and Ori, I want to get your take, because riding the wave is the theme of this session, and to me, being in California, I imagine the big surf, the big waves, the big talent out there. This is like alpha geeks, alpha coders, developers are really leaning into this. You're seeing massive uptake from the smartest people. Whether they're young or around, they're coming in with their kind of surfboards, (chuckles) if you will. These early adopters, they've been on this for a while; now the waves are hitting. This is a big wave, everyone sees it. What are some of those early adopter devs doing? What are some of the use cases you're seeing right out of the gate? And what does this mean for the folks that are going to come in and get on this wave? Can you guys share your perspective on this? Because you're seeing the best talent now leaning into this. >> Yeah, absolutely. I mean, from Hugging Face's vantage point, it's not even a wave, it's a tidal wave, or maybe even the tide itself. Because actually what we are seeing is that AI and machine learning is not something that you add to your products. It's very much a new paradigm to do all technology. It's this idea that we had in the past 15, 20 years, one way to build software and to build technology, which was writing a million lines of code, very rule-based, and then you get your product. Now what we are seeing is that every single product, every single feature, every single company is starting to adopt AI to build the next generation of technology. And that works both to make the existing use cases better, if you think of search, if you think of social network, if you think of SaaS, but also it's creating completely new capabilities that weren't possible with the previous paradigm. Now AI can generate text, it can generate image, it can describe your image, it can do so many new things that weren't possible before. >> It's going to really make the developers really productive, right? 
I mean, you're seeing the developer uptake strong, right? >> Yes, we have over 15,000 companies using Hugging Face now, and it keeps accelerating. I really think that maybe in like three, five years, there's not going to be any company not using AI. It's going to be really kind of the default to build all technology. >> Ori, weigh in on this. APIs, the cloud. Now I'm a developer, I want to have live applications, I want the commercial applications on this. What's your take? Weigh in here. >> Yeah, first, I absolutely agree. I mean, we're in the midst of a technology shift here. I think not a lot of people realize how big this is going to be. Just the number of possibilities is endless, and I think hard to imagine. And I don't think it's just the use cases. I think we can think of it as two separate categories. We'll see companies and products enhancing their offerings with these new AI capabilities, but we'll also see new companies that are AI first, that kind of reimagine certain experiences. They build something that wasn't possible before. And that's why I think it's actually extremely exciting times. And maybe more philosophically, I think now these large language models and large transformer based models are helping us people to express our thoughts and kind of making the bridge from our thinking to a creative digital asset in a speed we've never imagined before. I can write something down and get a piece of text, or an image, or a code. So I'll start by saying it's hard to imagine all the possibilities right now, but it's certainly big. And if I had to bet, I would say it's probably at least as big as the mobile revolution we've seen in the last 20 years. >> Yeah, this is the biggest. I mean, it's been compared to the Enlightenment Age. I saw the Wall Street Journal had a recent story on this. We've been saying that this is probably going to be bigger than all inflection points combined in the tech industry, given what transformation is coming. 
I guess I want to ask you guys, on the early adopters, we've been hearing on these interviews and throughout the industry that there's already a set of big companies, a set of companies out there that have a lot of data and they're already there, they're kind of tinkering. Kind of reminds me of the old hyperscaler days where they were building their own scale, and they're eatin' glass, spittin' nails out, you know, they're hardcore. Then you got everybody else kind of saying board level, "Hey team, how do I leverage this?" How do you see those two things coming together? You got the fast followers coming in behind the early adopters. What's it like for the second wave coming in? What are those conversations for those developers like? >> I mean, I think for me, the important switch for companies is to change their mindset from being kind of like a traditional software company to being an AI or machine learning company. And that means investing, hiring machine learning engineers, machine learning scientists, infrastructure team members who are working on how to put these models in production, team members who are able to optimize models, specialized models, customized models for the company's specific use cases. So it's really changing this mindset of how you build technology and optimize your company building around that. Things are moving so fast that I think now it's kind of like too late for low hanging fruits or small, small adjustments. I think it's important to realize that if you want to be good at that, and if you really want to surf this wave, you need massive investments. If there are like some surfers listening with this analogy of the wave, right, when there are waves, it's not enough just to stand and make a little bit of adjustments. You need to position yourself aggressively, paddle like crazy, and that's how you get into the waves. So that's what companies, in my opinion, need to do right now. 
>> Ori, what's your take on the generative models out there? We hear a lot about foundation models. What's your experience running end-to-end applications for large foundation models? Any insights you can share with the app developers out there who are looking to get in? >> Yeah, I think first of all, it's starting to create an economy, where it probably doesn't make sense for every company to create their own foundation models. You can basically start by using an existing foundation model, either open source or a proprietary one, and start deploying it for your needs. And then comes the second round when you are starting the optimization process. You bootstrap, whether it's a demo, or a small feature, or introducing new capability within your product, and then start collecting data. That data, and particularly the human feedback data, helps you to constantly improve the model, so you create this data flywheel. And I think we're now entering an era where customers have a lot of different choices of how they want to start their generative AI endeavor. And it's a good thing that there's a variety of choices. And the really amazing thing here is that every industry, any company you speak with, it could be something very traditional like industrial or financial, medical, really any company. I think people now start to imagine what are the possibilities, and seriously think what's their strategy for adopting this generative AI technology. And I think in that sense, the foundation model actually enabled this to become scalable. So the barrier to entry became lower; now the adoption could actually accelerate. >> There's a lot of integration aspects here in this new wave that's a little bit different. Before it was like very monolithic, hardcore, very brittle. A lot more integration, you see a lot more data coming together. 
I have to ask you guys, as developers come in and grow, I mean, when I went to college and you were a software engineer, I mean, I got a degree in computer science, and software engineering, that's all you did was code, (chuckles) you coded. Now, isn't it like everyone's a machine learning engineer at this point? Because that will be ultimately the science. So, (chuckles) you got open source, you got open software, you got the communities. Swami called you guys the GitHub of machine learning, Hugging Face is the GitHub of machine learning, mainly because that's where people are going to code. So this is essentially, machine learning is computer science. What's your reaction to that? >> Yes, my co-founder Julien at Hugging Face has been having this thing for quite a while now, for over three years, which was saying that actually software engineering as we know it today is a subset of machine learning, instead of the other way around. People would call us crazy a few years ago when we were saying that. But now we are realizing that you can actually code with machine learning. So machine learning is generating code. And we are starting to see that every software engineer can leverage machine learning through open models, through APIs, through different technology stack. So yeah, it's not crazy anymore to think that maybe in a few years, there's going to be more people doing AI and machine learning. However you call it, right? Maybe you'll still call them software engineers, maybe you'll call them machine learning engineers. But there might be more of these people in a couple of years than there are software engineers today. >> I bring this up as more tongue in cheek as well, because Ankur, infrastructure as code is what made cloud great, right? That's kind of the DevOps movement. But here the shift is so massive, there will be a game-changing philosophy around coding. 
Machine learning as code, you're starting to see CodeWhisperer, you guys have had coding companions for a while on AWS. So this is a paradigm shift. How is the cloud playing into this for you guys? Because to me, I've been riffing on some interviews where it's like, okay, you got the cloud going next level. This is an example of that, where there is a DevOps-like moment happening with machine learning, whether you call it coding or whatever. It's writing code on its own. Can you guys comment on what this means on top of the cloud? What comes out of the scale? What comes out of the benefit here? >> Absolutely, so- >> Well first- >> Oh, go ahead. >> Yeah, so I think as far as scale is concerned, I think customers are really relying on cloud to make sure that the applications that they build can scale along with the needs of their business. But there's another aspect to it, which is that until a few years ago, John, what we saw was that machine learning was a data scientist heavy activity. There were data scientists who were taking the data and training models. And then as machine learning found its way more and more into production and actual usage, we saw MLOps become a thing, and MLOps engineers become more involved in the process. And then we now are seeing, as machine learning is being used to solve more business critical problems, we're seeing even legal and compliance teams get involved. We are seeing business stakeholders more engaged. So, more and more machine learning is becoming an activity that's not just performed by data scientists, but is performed by a team and a group of people with different skills. And for them, we as AWS are focused on providing the best tools and services for these different personas to be able to do their job and really complete that end-to-end machine learning story. So that's where, whether it's tools related to MLOps or even for folks who cannot code or don't know any machine learning. 
For example, we launched SageMaker Canvas as a tool last year, which is a UI-based tool which data analysts and business analysts can use to build machine learning models. So overall, the spectrum in terms of persona and who can get involved in the machine learning process is expanding, and the cloud is playing a big role in that process. >> Ori, Clem, can you guys weigh in too? 'Cause this is just another abstraction layer of scale. What's it mean for you guys as you look forward to your customers and the use cases that you're enabling? >> Yes, I think what's important is that the AI companies and providers and the cloud kind of work together. That's how you make a seamless experience and you actually reduce the barrier to entry for this technology. So that's what we've been super happy to do with AWS for the past few years. We actually announced not too long ago that we are doubling down on our partnership with AWS. We're excited to have many, many customers on our shared product, the Hugging Face deep learning container on SageMaker. And we are working really closely with the Inferentia team and the Trainium team to release some more exciting stuff in the coming weeks and coming months. So I think when you have an ecosystem and a system where the AWS and the AI providers, AI startups can work hand in hand, it's to the benefit of the customers and the companies, because it makes it orders of magnitude easier for them to adopt this new paradigm to build technology AI. >> Ori, this is a scale on reasoning too. The data's out there and making sense out of it, making it reason, getting comprehension, having it make decisions is next, isn't it? And you need scale for that. >> Yes. Just a comment about the infrastructure side. So I think really the purpose is to streamline and make these technologies much more accessible. And I think we'll see, I predict that we'll see in the next few years more and more tooling that make this technology much more simple to consume. 
And I think it plays a very important role. There's so many aspects, like the monitoring the models and their kind of outputs they produce, and kind of containing and running them in a production environment. There's so much there to build on, the infrastructure side will play a very significant role. >> All right, that's awesome stuff. I'd love to change gears a little bit and get a little philosophy here around AI and how it's going to transform, if you guys don't mind. There's been a lot of conversations around, on theCUBE here as well as in some industry areas, where it's like, okay, all the heavy lifting is automated away with machine learning and AI, the complexity, there's some efficiencies, it's horizontal and scalable across all industries. Ankur, good point there. Everyone's going to use it for something. And a lot of stuff gets brought to the table with large language models and other things. But the key ingredient will be proprietary data or human input, or some sort of AI whisperer kind of role, or prompt engineering, people are saying. So with that being said, some are saying it's automating intelligence. And that creativity will be unleashed from this. If the heavy lifting goes away and AI can fill the void, that shifts the value to the intellect or the input. And so that means data's got to come together, interact, fuse, and understand each other. This is kind of new. I mean, old school AI was, okay, got a big model, I provisioned it long time, very expensive. Now it's all free flowing. Can you guys comment on where you see this going with this freeform, data flowing everywhere, heavy lifting, and then specialization? >> Yeah, I think- >> Go ahead. >> Yeah, I think, so what we are seeing with these large language models or generative models is that they're really good at creating stuff. But I think it's also important to recognize their limitations. They're not as good at reasoning and logic. 
And I think now we're seeing great enthusiasm, I think, which is justified. And the next phase would be how to make these systems more reliable. How to inject more reasoning capabilities into these models, or augment with other mechanisms that actually perform more reasoning so we can achieve more reliable results. And we can count on these models to perform for critical tasks, whether it's medical tasks, legal tasks. We really want to kind of offload a lot of the intelligence to these systems. And then we'll have to get back, we'll have to make sure these are reliable, we'll have to make sure we get some sort of explainability that we can understand the process behind the generated results that we received. So I think this is kind of the next phase of systems that are based on these generative models. >> Clem, what's your view on this? Obviously you're at open community, open source has been around, it's been a great track record, proven model. I'm assuming creativity's going to come out of the woodwork, and if we can automate open source contribution, and relationships, and onboarding more developers, there's going to be unleashing of creativity. >> Yes, it's been so exciting on the open source front. We all know BERT, BLOOM, GPT-J, T5, Stable Diffusion, that work up. The previous or the current generation of open source models that are on Hugging Face. It has been accelerating in the past few months. So I'm super excited about ControlNet right now that is really having a lot of impact, which is kind of like a way to control the generation of images. Super excited about Flan-UL2, which is like a new model that has been recently released and is open source. So yeah, it's really fun to see the ecosystem coming together. Open source has been the basis for traditional software, with like open source programming languages, of course, but also all the great open source that we've gotten over the years. 
So we're happy to see that the same thing is happening for machine learning and AI, and hopefully can help a lot of companies reduce a little bit the barrier to entry. So yeah, it's going to be exciting to see how it evolves in the next few years in that respect. >> I think the developer productivity angle that's been talked about a lot in the industry will be accelerated significantly. I think security will be enhanced by this. I think in general, applications are going to transform at a radical rate, accelerated, incredible rate. So I think it's not a big wave, it's the water, right? I mean, (chuckles) it's the new thing. My final question for you guys, if you don't mind, I'd love to get each of you to answer the question I'm going to ask you, which is, a lot of conversations around data. Data infrastructure's obviously involved in this. And the common thread that I'm hearing is that every company that looks at this is asking themselves, if we don't rebuild our company, start thinking about rebuilding our business model around AI, we might be dinosaurs, we might be extinct. And it reminds me that scene in Moneyball when, at the end, it's like, if we're not building the model around your model, every company will be out of business. What's your advice to companies out there that are having those kind of moments where it's like, okay, this is real, this is next gen, this is happening. I better start thinking and putting into motion plans to refactor my business, 'cause it's happening, business transformation is happening on the cloud. This kind of puts an exclamation point on, with the AI, as a next step function. Big increase in value. So it's an opportunity for leaders. Ankur, we'll start with you. What's your advice for folks out there thinking about this? Do they put their toe in the water? Do they jump right into the deep end? What's your advice? 
>> Yeah, John, so we talk to a lot of customers, and customers are excited about what's happening in the space, but they often ask us like, "Hey, where do we start?" So we always advise our customers to do a lot of proof of concepts, understand where they can drive the biggest ROI. And then also leverage existing tools and services to move fast and scale, and try and not reinvent the wheel where it doesn't need to be. That's basically our advice to customers. >> Get it. Ori, what's your advice to folks who are scratching their head going, "I better jump in here. How do I get started?" What's your advice? >> So I actually think that we need to think about it really economically. Both on the opportunity side and the challenges. So there's a lot of opportunities for many companies to actually gain revenue upside by building these new generative features and capabilities. On the other hand, of course, this would probably affect the COGS, and incorporating these capabilities could probably affect the COGS. So I think we really need to think carefully about both of these sides, and also understand clearly if this is a project or an effort towards cost reduction, then the ROI is pretty clear, or revenue amplifier, where there's, again, a lot of different opportunities. So I think once you think about this in a structured way, I think, and map the different initiatives, then it's probably a good way to start and a good way to start thinking about these endeavors. >> Awesome. Clem, what's your take on this? What's your advice to folks out there? >> Yes, all of these are very good advice already. Something that you said before, John, that I disagree with a little bit, a lot of people are talking about the data moat and proprietary data. Actually, when you look at some of the organizations that have been building the best models, they don't have specialized or unique access to data. So I'm not sure that's so important today. 
I think what's important for companies, and it's been the same for the previous generation of technology, is their ability to build better technology faster than others. And in this new paradigm, that means being able to build machine learning faster than others, and better. So that's how, in my opinion, you should approach this. And kind of like how can you evolve your company, your teams, your products, so that you are able in the long run to build machine learning better and faster than your competitors. And if you manage to put yourself in that situation, then that's when you'll be able to differentiate yourself to really kind of be impactful and get results. That's really hard to do. It's something really different, because machine learning and AI is a different paradigm than traditional software. So this is going to be challenging, but I think if you manage to nail that, then the future is going to be very interesting for your company. >> That's a great point. Thanks for calling that out. I think this all reminds me of the cloud days early on. If you went to the cloud early, you took advantage of it when the pandemic hit. If you weren't native in the cloud, you got hamstrung by that, you were flatfooted. So just get in there. (laughs) Get in the cloud, get into AI, you're going to be good. Thanks for calling that. Final parting comments, what's your most exciting thing going on right now for you guys? Ori, Clem, what's the most exciting thing on your plate right now that you'd like to share with folks? >> I mean, for me it's just the diversity of use cases and really creative ways of companies leveraging this technology. Every day I speak with about two, three customers, and I'm continuously being surprised by the creative ideas. And the future is really exciting of what can be achieved here. And also I'm amazed by the pace that things move in this industry. It's just, there's not a dull moment. So, definitely exciting times. 
>> Clem, what are you most excited about right now? >> For me, it's all the new open source models that have been released in the past few weeks, and that will keep being released in the next few weeks. I'm also super excited about more and more companies getting into this capability of chaining different models and different APIs. I think that's a very, very interesting development, because it creates new capabilities, new possibilities, new functionalities that weren't possible before. You can plug an API with an open source embedding model, with like a no-geo transcription model. So that's also very exciting. This capability of having more interoperable machine learning will also, I think, open a lot of interesting things in the future. >> Clem, congratulations on your success at Hugging Face. Please pass that on to your team. Ori, congratulations on your success, and continue to, just day one. I mean, it's just the beginning. It's not even scratching the surface. Ankur, I'll give you the last word. What are you excited for at AWS? More cloud goodness coming here with AI. Give you the final word. >> Yeah, so as both Clem and Ori said, I think the research in the space is moving really, really fast, so we are excited about that. But we are also excited to see the speed at which enterprises and other AWS customers are applying machine learning to solve real business problems, and the kind of results they're seeing. So when they come back to us and tell us the kind of improvement in their business metrics and overall customer experience that they're driving and they're seeing real business results, that's what keeps us going and inspires us to continue inventing on their behalf. >> Gentlemen, thank you so much for this awesome high impact panel. Ankur, Clem, Ori, congratulations on all your success. We'll see you around. Thanks for coming on. Generative AI, riding the wave, it's a tidal wave, it's the water, it's all happening. All great stuff. 
This is season three, episode one of AWS Startup Showcase closing panel. This is the AI ML episode, the top startups building generative AI on AWS. I'm John Furrier, your host. Thanks for watching. (mellow music)

Published Date : Mar 9 2023

Adam Wenchel & John Dickerson, Arthur | AWS Startup Showcase S3 E1

(upbeat music) >> Welcome everyone to theCUBE's presentation of the AWS Startup Showcase AI Machine Learning Top Startups Building Generative AI on AWS. This is season 3, episode 1 of the ongoing series covering the exciting startups from the AWS ecosystem to talk about AI and machine learning. I'm your host, John Furrier. I'm joined by two great guests here, Adam Wenchel, who's the CEO of Arthur, and Chief Scientist of Arthur, John Dickerson. Talk about how they help people build better LLM AI systems to get them into the market faster. Gentlemen, thank you for coming on. >> Yeah, thanks for having us, John. >> Well, I got to say I got to temper my enthusiasm because the last few months explosion of interest in LLMs with ChatGPT, has opened the eyes to everybody around the reality that this is going next gen, this is it, this is the moment, this is the point we're going to look back and say, this is the time where AI really hit the scene for real applications. So, a lot of Large Language Models, also known as LLMs, foundational models, and generative AI is all booming. This is where all the alpha developers are going. This is where everyone's focusing their business model transformations on. This is where developers are seeing action. So it's all happening, the wave is here. So I got to ask you guys, what are you guys seeing right now? You're in the middle of it, it's hitting you guys right on. You're in the front end of this massive wave. >> Yeah, John, I don't think you have to temper your enthusiasm at all. I mean, what we're seeing every single day is, everything from existing enterprise customers coming in with new ways that they're rethinking, like business things that they've been doing for many years that they can now do an entirely different way, as well as all manner of new companies popping up, applying LLMs to everything from generating code and SQL statements to generating health transcripts and just legal briefs. Everything you can imagine. 
And when you actually sit down and look at these systems and the demos we get of them, the hype is definitely justified. It's pretty amazing what they're going to do. And even just internally, we built, about a month ago in January, we built an Arthur chatbot so customers could ask questions, technical questions, rather than read our product documentation, they could just ask this LLM a particular question and get an answer. And at the time it was like state of the art, but then just last week we decided to rebuild it because the tooling has changed so much that we, last week, we've completely rebuilt it. It's now way better, built on an entirely different stack. And the tooling has undergone a full generation worth of change in six weeks, which is crazy. So it just tells you how much energy is going into this and how fast it's evolving right now. >> John, weigh in as a chief scientist. I mean, you must be blown away. Talk about kid in the candy store. I mean, you must be looking like this saying, I mean, you must be super busy to begin with, but the change, the acceleration, can you scope the kind of change you're seeing and be specific around the areas you're seeing movement and highly accelerated change? >> Yeah, definitely. And it is very, very exciting actually, thinking back to when ChatGPT was announced, that was a night our company was throwing an event at NeurIPS, which is maybe the biggest machine learning conference out there. And the hype when that happened was palpable and it was just shocking to see how well that performed. And then obviously over the last few months since then, as LLMs have continued to enter the market, we've seen use cases for them, like Adam mentioned, all over the place. And so, some things I'm excited about in this space are the use of LLMs and more generally, foundation models to redesign traditional operations research style problems, logistics problems, like auctions, decisioning problems. 
So moving beyond the already amazing use cases, like creating marketing content, into more core integration and a lot of the bread and butter companies and tasks that drive the American ecosystem. And I think we're just starting to see some of that. And in the next 12 months, I think we're going to see a lot more. If I had to make other predictions, I think we're going to continue seeing a lot of work being done on managing like inference time costs via shrinking models or distillation. And I don't know how to make this prediction, but at some point we're going to be seeing lots of these very large scale models operating on the edge as well. So the time scales are extremely compressed, like Adam mentioned, 12 months from now, hard to say. >> We were talking on theCUBE prior to this session here. We had theCUBE conversation here and then the Wall Street Journal just picked up on the same theme, which is the printing press moment created the enlightenment stage of the history. Here we're in the whole nother automating intellect efficiency, doing heavy lifting, the creative class coming back, a whole nother level of reality around the corner that's being hyped up. The question is, is this justified? Is there really a breakthrough here or is this just another result of continued progress with AI? Can you guys weigh in, because there's two schools of thought. There's the, "Oh my God, we're entering a new enlightenment tech phase, of the equivalent of the printing press in all areas." Then there's, "Ah, it's just AI (indistinct) inch by inch." What's your guys' opinion? >> Yeah, I think on the one hand when you're down in the weeds of building AI systems all day, every day, like we are, it's easy to look at this as an incremental progress. Like we have customers who've been building on foundation models since we started the company four years ago, particularly in computer vision for classification tasks, starting with pre-trained models, things like that. 
So that part of it doesn't feel real new, but what does feel new is just when you apply these things to language with all the breakthroughs and computational efficiency, algorithmic improvements, things like that, when you actually sit down and interact with ChatGPT or one of the other systems that's out there that's building on top of LLMs, it really is breathtaking, like, the level of understanding that they have and how quickly you can accelerate your development efforts and get an actual working system in place that solves a really important real world problem and makes people way faster, way more efficient. So I do think there's definitely something there. It's more than just incremental improvement. This feels like a real trajectory inflection point for the adoption of AI. >> John, what's your take on this? As people come into the field, I'm seeing a lot of people move from, hey, I've been coding in Python, I've been doing some development, I've been a software engineer, I'm a computer science student. I'm coding in C++ old school, OG systems person. Where do they come in? Where's the focus, where's the action? Where are the breakthroughs? Where are people jumping in and rolling up their sleeves and getting dirty with this stuff? >> Yeah, all over the place. And it's funny you mentioned students. In a different life, I wore a university professor hat and so I'm very, very familiar with the teaching aspects of this. And I will say toward Adam's point, this really is a leap forward in that tools like Copilot, for example, everybody's using them right now and they really do accelerate the way that we develop. When I think about the areas where people are really, really focusing right now, tooling is certainly one of them. Like you and I were chatting about LangChain right before this interview started, two or three people can sit down and create an amazing set of pipes that connect different aspects of the LLM ecosystem. 
Two, I would say is in engineering. So like distributed training might be one, or just understanding better ways to even be able to train large models, understanding better ways to then distill them or run them. So like this heavy interaction now between engineering and what I might call traditional machine learning from 10 years ago where you had to know a lot of math, you had to know calculus very well, things like that. Now you also need to be, again, a very strong engineer, which is exciting. >> I interviewed Swami when he talked about the news. He's the head of Amazon's machine learning and AI, when they announced the Hugging Face announcement. And I reminded him how Amazon was easy to get into if you were developing a startup back in 2007, 2008, and that the language models had that similar problem. It's step up a lot of content and a lot of expense to get provisioned up, now it's easy. So this is the next wave of innovation. So how do you guys see that from where we are right now? Are we at that point where it's that moment where it's that cloud-like experience for LLMs and large language models? >> Yeah, go ahead John. >> I think the answer is yes. We see a number of large companies that are training these and serving these, some of which are being co-interviewed in this episode. I think we're at that. Like, you can hit one of these with a simple, single line of Python, hitting an API, you can boot this up in seconds if you want. It's easy. >> Got it. >> So I (audio cuts out). >> Well let's take a step back and talk about the company. You guys being featured here on the Showcase. Arthur, what drove you to start the company? How'd this all come together? What's the origination story? Obviously you've got big customers, how'd you get started? What are you guys doing? How do you make money? Give a quick overview. >> Yeah, I think John and I come at it from slightly different angles, but for myself, I have been a part of a number of technology companies. 
I joined Capital One, they acquired my last company and shortly after I joined, they asked me to start their AI team. And so even though I've been doing AI for a long time, I started my career back in DARPA. It was the first time I was really working at scale in AI at an organization where there were hundreds of millions of dollars in revenue at stake with the operation of these models and that they were impacting millions of people's financial livelihoods. And so it just got me hyper-focused on these issues around making sure that your AI worked well and it worked well for your company and it worked well for the people who were being affected by it. At the time when I was doing this, 2016, 2017, 2018, there just wasn't any tooling out there to support this production management, model monitoring phase of the life cycle. And so we basically left to start the company that I wanted. And John has his own story. I'll let you share that one, John. >> Go ahead John, you're up. >> Yeah, so I'm coming at this from a different world. So I'm on leave now from a tenured role in academia where I was leading a large lab focusing on the intersection of machine learning and economics. And so questions like fairness or the response to the dynamism on the underlying environment have been around for quite a long time in that space. And so I've been thinking very deeply about some of those more like R and D style questions as well as having deployed some automation code across a couple of different industries, some in online advertising, some in the healthcare space and so on, where concerns of, again, fairness come to bear. And so Adam and I connected to understand the space of what that might look like in the 2018, 2019 realm from a quantitative and from a human-centered point of view. And so booted things up from there. >> Yeah, bring that applied engineering R and D into the Capital One DNA that he had at scale. I could see that fit. 
I got to ask you now, next step, as you guys move out and think about LLMs and the recent AI news around the generative models and the foundational models like ChatGPT, how should we be looking at that news and everyone watching might be thinking the same thing. I know at the board level companies are like, we should refactor our business, this is the future. It's that kind of moment, and the tech team's like, okay, boss, how do we do this again? Or are they prepared? How should we be thinking? How should people watching be thinking about LLMs? >> Yeah, I think they really are transformative. And so, I mean, we're seeing companies all over the place. Everything from large tech companies to a lot of our large enterprise customers are launching significant projects at core parts of their business. And so, yeah, I would be surprised, if you're serious about becoming an AI native company, which most leading companies are, then this is a trend that you need to be taking seriously. And we're seeing the adoption rate. It's funny, I would say the AI adoption in the broader business world really started, let's call it four or five years ago, and it was a relatively slow adoption rate, but I think all that kind of investment in and scaling the maturity curve has paid off because the rate at which people are adopting and deploying systems based on this is tremendous. I mean, this has all just happened in the last few months and we're already seeing people get systems into production. So, now there's a lot of things you have to guarantee in order to put these in production in a way that basically is additive to your business and doesn't cause more headaches than it solves. And so that's where we help customers: how do you put these out there in a way that they're going to represent your company well, they're going to perform well, they're going to do their job and do it properly. >> So in the use case, as a customer, as I think about this, there's workflows. 
They might have had an ML AI ops team that's around IT. Their inference engines are out there. They probably don't have a visibility on say how much it costs, they're kicking the tires. When you look at the deployment, there's a cost piece, there's a workflow piece, there's fairness you mentioned John, what should be, I should be thinking about if I'm going to be deploying stuff into production, I got to think about those things. What's your opinion? >> Yeah, I'm happy to dive in on that one. So monitoring in general is extremely important once you have one of these LLMs in production, and there have been some changes versus traditional monitoring that we can dive deeper into that LLMs are really accelerated. But a lot of that bread and butter style of things you should be looking out for remain just as important as they are for what you might call traditional machine learning models. So the underlying environment of data streams, the way users interact with these models, these are all changing over time. And so any performance metrics that you care about, traditional ones like an accuracy, if you can define that for an LLM, ones around, for example, fairness or bias. If that is a concern for your particular use case and so on. Those need to be tracked. Now there are some interesting changes that LLMs are bringing along as well. So most ML models in production that we see are relatively static in the sense that they're not getting flipped in more than maybe once a day or once a week or they're just set once and then not changed ever again. With LLMs, there's this ongoing value alignment or collection of preferences from users that is often constantly updating the model. And so that opens up all sorts of vectors for, I won't say attack, but for problems to arise in production. Like users might learn to use your system in a different way and thus change the way those preferences are getting collected and thus change your system in ways that you never intended. 
So maybe that went through governance already internally at the company and now it's totally, totally changed and it's through no fault of your own, but you need to be watching over that for sure. >> Talk about the reinforcement learning from human feedback. How's that factoring in to the LLMs? Is that part of it? Should people be thinking about that? Is that a component that's important? >> It certainly is, yeah. So this is one of the big tweaks that happened with InstructGPT, which is the base model behind ChatGPT and has since gone on to be used all over the place. So value alignment through RLHF, like you mentioned, I think is a very interesting space to get into and it's one that you need to watch over. Like, you're asking humans for feedback over outputs from a model and then you're updating the model with respect to that human feedback. And now you've thrown humans into the loop here in a way that is just going to complicate things. And it certainly helps in many ways. You can ask humans to, let's say that you're deploying an internal chatbot at an enterprise, you could ask humans to align that LLM behind the chatbot to, say, company values. And so you're eliciting feedback about these company values and that's going to scoot that chatbot that you're running internally more toward the kind of language that you'd like to use internally on like a Slack channel or something like that. Watching over that model I think in that specific case, that's a compliance and HR issue as well. So while it is part of the greater LLM stack, you can also view that as an independent bit to watch over. >> Got it, and these are important factors. When people see the Bing news, they freak out how it's doing great. Then it goes off the rails, it goes big, fails big. (laughing) So these models people see that, is that human interaction or is that feedback, is that not accepting it or how do people understand how to take that input in and how to build the right apps around LLMs? 
This is a tough question. >> Yeah, for sure. So some of the examples that you'll see online where these chatbots go off the rails are obviously humans trying to break the system, but some of them clearly aren't. And that's because these are large statistical models and we don't know what's going to pop out of them all the time. And even if you're doing as much in-house testing at the big companies like the Cohere's and the OpenAI's of the world, to try to prevent things like toxicity or racism or other sorts of bad content that might lead to bad PR, you're never going to catch all of these possible holes in the model itself. And so, again, it's very, very important to keep watching over that while it's in production. >> On the business model side, how are you guys doing? What's the approach? How do you guys engage with customers? Take a minute to explain the customer engagement. What do they need? What do you need? How's that work? >> Yeah, I can talk a little bit about that. So it's really easy to get started. It's literally a matter of like just handing out an API key and people can get started. And so we also offer versions that can be installed on-prem for models that, we find a lot of our customers have models that deal with very sensitive data. So you can run it in your cloud account or use our cloud version. And so yeah, it's pretty easy to get started with this stuff. We find people start using it a lot of times during the validation phase 'cause that way they can start baselining performance models, they can do champion challenger, they can really kind of baseline the performance of, maybe they're considering different foundation models. And so it's a really helpful tool for understanding differences in the way these models perform. 
And then from there they can just flow that into their production inferencing, so that as these systems are out there, you have really kind of real time monitoring for anomalies and for all sorts of weird behaviors as well as that continuous feedback loop that helps you make your product get better and observability and you can run all sorts of aggregated reports to really understand what's going on with these models when they're out there deciding. I should also add that we just today have another way to adopt Arthur and that is we are in the AWS marketplace, and so we are available there just to make it that much easier to use your cloud credits, skip the procurement process, and get up and running really quickly. >> And that's great 'cause Amazon's got SageMaker, which handles a lot of privacy stuff, all kinds of cool things, or you can get down and dirty. So I got to ask on the next one, production is a big deal, getting stuff into production. What have you guys learned that you could share to folks watching? Is there a cost issue? I got to monitor, obviously you brought that up, we talked about the even reinforcement issues, all these things are happening. What are the big learnings that you could share for people that are going to put these into production to watch out for, to plan for, or be prepared for, hope for the best, plan for the worst? What's your advice? >> I can give a couple opinions there and I'm sure Adam has. Well, yeah, the big one from my side is, again, I had mentioned this earlier, it's just the input data streams because humans are also exploring how they can use these systems to begin with. It's really, really hard to predict the type of inputs you're going to be seeing in production. 
Especially, we always talk about chatbots, but then any generative text tasks like this, let's say you're taking in news articles and summarizing them or something like that, it's very hard to get a good sampling even of the set of news articles in such a way that you can really predict what's going to pop out of that model. So to me, it's, adversarial maybe isn't the word that I would use, but it's an unnatural shifting input distribution of like prompts that you might see for these models. That's certainly one. And then the second one that I would talk about is, it can be hard to understand the costs, the inference time costs behind these LLMs. So the pricing on these is always changing as the models change size, it might go up, it might go down based on model size, based on energy cost and so on, but you're pricing per token or per a thousand tokens and that I think can be difficult for some clients to wrap their head around. Again, you don't know how these systems are going to be used after all so it can be tough. And so again that's another metric that really should be tracked. >> Yeah, and there's a lot of trade off choices in there with like, how many tokens do you want at each step and in the sequence and based on, you have (indistinct) and you reject these tokens and so based on how your system's operating, that can make the cost highly variable. And that's if you're using like an API version that you're paying per token. A lot of people also choose to run these internally and as John mentioned, the inference time on these is significantly higher than a traditional classifi, even NLP classification model or tabular data model, like orders of magnitude higher. And so you really need to understand how that, as you're constantly iterating on these models and putting out new versions and new features in these models, how that's affecting the overall scale of that inference cost because you can use a lot of computing power very quickly with these prompts. 
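(Editor's illustration, not part of the interview.) The point above about per-token pricing being hard to reason about can be made concrete with a small Python sketch. Everything in it is an assumption for illustration: the `TokenPricing` class, the helper names, the price of $0.002 per 1,000 tokens, and the two traffic profiles are hypothetical and do not reflect any vendor's rates or Arthur's product.

```python
from dataclasses import dataclass


@dataclass
class TokenPricing:
    # Hypothetical prices in dollars per 1,000 tokens (not real vendor rates).
    prompt_per_1k: float
    completion_per_1k: float


def request_cost(pricing, prompt_tokens, completion_tokens):
    """Dollar cost of a single request under per-1,000-token pricing."""
    prompt_cost = (prompt_tokens / 1000) * pricing.prompt_per_1k
    completion_cost = (completion_tokens / 1000) * pricing.completion_per_1k
    return prompt_cost + completion_cost


def daily_cost(pricing, requests):
    """Total cost of a day's traffic, given (prompt_tokens, completion_tokens) pairs."""
    return sum(request_cost(pricing, p, c) for p, c in requests)


# Same request volume, two assumed usage patterns.
pricing = TokenPricing(prompt_per_1k=0.002, completion_per_1k=0.002)
short_prompts = [(50, 100)] * 10_000    # brief questions, brief answers
long_prompts = [(1500, 500)] * 10_000   # long documents pasted into the prompt

print(f"short prompts: ${daily_cost(pricing, short_prompts):.2f} per day")
print(f"long prompts:  ${daily_cost(pricing, long_prompts):.2f} per day")
```

With the request count held fixed, the daily bill depends entirely on how many tokens users send and receive, which is why the speakers suggest tracking inference cost as a metric alongside accuracy and fairness.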
>> Yeah, scale, performance, price all come together. I got to ask while we're here on the secret sauce of the company, if you had to describe to people out there watching, what's the secret sauce of the company? What's the key to your success? >> Yeah, so John leads our research team and they've had a number of really cool, I think AI as much as it's been hyped for a while, it's still commercial AI at least is really in its infancy. And so the way we're able to pioneer new ways to think about performance for computer vision, NLP, LLMs is probably the thing that I'm proudest about. John and his team publish papers all the time at NeurIPS and other places. But I think it's really being able to define what performance means for basically any kind of model type and give people really powerful tools to understand that on an ongoing basis. >> John, secret sauce, how would you describe it? You got all the action happening all around you. >> Yeah, well I'm going to appreciate Adam talking me up like that. No, I. (all laughing) >> Furrier: Props to you. >> I would also say a couple of other things here. So we have a very strong engineering team and so I think some early hires there really set the standard at a very high bar that we've maintained as we've grown. And I think that's really paid dividends as scalability's become even more of a challenge in these spaces, right? And so that's not just scalability when it comes to LLMs, that's scalability when it comes to millions of inferences per day, that kind of thing as well in traditional ML models. And I think that's compared to potential competitors, that's really... Well, it's made us able to just operate more efficiently and pass that along to the client. >> Yeah, and I think the infancy comment is really important because it's the beginning. There really is a long journey ahead. A lot of change coming, like I said, it's a huge wave. 
So I'm sure you guys got a lot of plannings at the foundation even for your own company, so I appreciate the candid response there. Final question for you guys is, what should the top things be for a company in 2023? If I'm going to set the agenda and I'm a customer moving forward, putting the pedal to the metal, so to speak, what are the top things I should be prioritizing or I need to do to be successful with AI in 2023? >> Yeah, I think, so number one, as we talked about, we've been talking about this entire episode, the things are changing so quickly and the opportunities for business transformation and really disrupting different applications, different use cases, is almost, I don't think we've even fully comprehended how big it is. And so really digging in to your business and understanding where I can apply these new sets of foundation models is, that's a top priority. The interesting thing is I think there's another force at play, which is the macroeconomic conditions and a lot of places are, they're having to work harder to justify budgets. So in the past, couple years ago maybe, they had a blank check to spend on AI and AI development at a lot of large enterprises that was limited primarily by the amount of talent they could scoop up. Nowadays these expenditures are getting scrutinized more. And so one of the things that we really help our customers with is like really calculating the ROI on these things. And so if you have models out there performing and you have a new version that you can put out that lifts the performance by 3%, how many tens of millions of dollars does that mean in business benefit? Or if I want to go to get approval from the CFO to spend a few million dollars on this new project, how can I bake in from the beginning the tools to really show the ROI along the way? Because I think in these systems when done well for a software project, the ROI can be like pretty spectacular. 
Like we see over a hundred percent ROI in the first year on some of these projects. And so, I think in 2023, you just need to be able to show what you're getting for that spend. >> It's a needle moving moment. You see it all the time with some of these aha moments or like, whoa, blown away. John, I want to get your thoughts on this because one of the things that comes up a lot for companies that I talk to, that are on my second wave, I would say coming in, maybe not, maybe the front wave of adopters, is talent and team building. You mentioned some of the hires you got were game changing for you guys and set the bar high. As you move the needle, new developers are going to need to come in. What's your advice given that you've been a professor, you've seen students, I know a lot of computer science people want to shift, they might not be yet skilled in AI, but they're proficient in programming, that's going to be another opportunity with open source when things are happening. How do you talk to that next level of talent that wants to come in to this market to supplement teams and be on teams, lead teams? Any advice you have for people who want to build their teams and people who are out there and want to be a coder in AI? >> Yeah, I have advice, and this actually works for what it would take to be a successful AI company in 2023 as well, which is, just don't be afraid to iterate really quickly with these tools. The space is still being explored on what they can be used for. A lot of the tasks that they're used for now, right, like creating marketing content using machine learning, is not a new thing to do. It just works really well now. And so I'm excited to see what the next year brings in terms of folks from outside of core computer science who are, other engineers or physicists or chemists or whatever who are learning how to use these increasingly easy to use tools to leverage LLMs for tasks that I think none of us have really thought about before. 
So that's really, really exciting. And so toward that I would say iterate quickly. Build things on your own, build demos, show them to friends, host them online and you'll learn along the way and you'll have something to show for it. And also you'll help us explore that space. >> Guys, congratulations with Arthur. Great company, great picks and shovels opportunities out there for everybody. Iterate fast, get in quickly and don't be afraid to iterate. Great advice and thank you for coming on and being part of the AWS showcase, thanks. >> Yeah, thanks for having us on John. Always a pleasure. >> Yeah, great stuff. Adam Wenchel, John Dickerson with Arthur. Thanks for coming on theCUBE. I'm John Furrier, your host. Generative AI and AWS. Keep it right there for more action with theCUBE. Thanks for watching. (upbeat music)

Published Date : Mar 9 2023

Luis Ceze & Anna Connolly, OctoML | AWS Startup Showcase S3 E1

(soft music) >> Hello, everyone. Welcome to theCUBE's presentation of the AWS Startup Showcase. AI and Machine Learning: Top Startups Building Foundational Model Infrastructure. This is season 3, episode 1 of the ongoing series covering the exciting stuff from the AWS ecosystem, talking about machine learning and AI. I'm your host, John Furrier and today we are excited to be joined by Luis Ceze who's the CEO of OctoML and Anna Connolly, VP of customer success and experience OctoML. Great to have you on again, Luis. Anna, thanks for coming on. Appreciate it. >> Thank you, John. It's great to be here. >> Thanks for having us. >> I love the company. We had a CUBE conversation about this. You guys are really addressing how to run foundational models faster for less. And this is like the key theme. But before we get into it, this is a hot trend, but let's explain what you guys do. Can you set the narrative of what the company's about, why it was founded, what's your North Star and your mission? >> Yeah, so John, our mission is to make AI sustainable and accessible for everyone. And what we offer customers is, you know, a way of taking their models into production in the most efficient way possible by automating the process of getting a model and optimizing it for a variety of hardware and making cost-effective. So better, faster, cheaper model deployment. >> You know, the big trend here is AI. Everyone's seeing the ChatGPT, kind of the shot heard around the world. The BingAI and this fiasco and the ongoing experimentation. People are into it, and I think the business impact is clear. I haven't seen this in all of my career in the technology industry of this kind of inflection point. And every senior leader I talk to is rethinking about how to rebuild their business with AI because now the large language models have come in, these foundational models are here, they can see value in their data. This is a 10 year journey in the big data world. 
Now it's impacting that, and everyone's rebuilding their company around this idea of being AI first 'cause they see ways to eliminate things and make things more efficient. And so now they're telling 'em to go do it. And they're like, what do we do? So what do you guys think? Can you explain what is this wave of AI and why is it happening, why now, and what should people pay attention to? What does it mean to them? >> Yeah, I mean, it's pretty clear by now that AI can do amazing things that capture people's imaginations. And also now can show things that are really impactful in businesses, right? So what people have the opportunity to do today is to either train their own model that adds value to their business or find open models out there that can do very valuable things to them. So the next step really is how do you take that model and put it into production in a cost-effective way so that the business can actually get value out of it, right? >> Anna, what's your take? Because customers are there, you're there to make 'em successful, you got the new secret weapon for their business. >> Yeah, I think we just see a lot of companies struggle to get from a trained model into a model that is deployed in a cost-effective way that actually makes sense for the application they're building. I think that's a huge challenge we see today, kind of across the board across all of our customers. >> Well, I see this, everyone asking the same question. I have data, I want to get value out of it. I got to get these big models, I got to train it. What's it going to cost? So I think there's a reality of, okay, I got to do it. Then no one has any visibility on what it costs. When they get into it, this is going to break the bank. So I have to ask you guys, the cost of training these models is on everyone's mind. OctoML, your company's focus on the cost side of it as well as the efficiency side of running these models in production. 
Why are the production costs such a concern and where specifically are people looking at it and why did it get here? >> Yeah, so training costs get a lot of attention because it's normally a large number, but we shouldn't forget that it's a large, typically one time upfront cost that customers pay. But, you know, when the model is put into production, the cost grows directly with model usage and you actually want your model to be used because it's adding value, right? So, you know, the question that a customer faces is, you know, they have a model, they have a trained model and now what? So how much would it cost to run in production, right? And now with the big wave in generative AI, which rightfully is getting a lot of attention because of the amazing things that it can do, it's important for us to keep in mind that generative AI models like ChatGPT are huge, expensive energy hogs. They cost a lot to run, right? And given that model cost grows directly with usage, what you want to do is make sure that once you put a model into production, you have the best cost structure possible so that you're not surprised when it gets popular, right? So let me give you an example. So if you have a model that costs, say 1 to $2 million to train, but then it costs about one to two cents per session to use it, right? So if you have a million active users, even if they use just once a day, it's $10,000 to $20,000 a day to operate that model in production. And that very, very quickly, you know, gets beyond what you paid to train it. >> Anna, these aren't small numbers, and it's cost to train and cost to operate, it kind of reminds me of when the cloud came around and the data center versus cloud options. Like, wait a minute, one, it costs a ton of cash to deploy, and then running it. This is kind of a similar dynamic. What are you seeing? >> Yeah, absolutely. 
I think we are going to see increasingly the cost in production outpacing the cost in training by a lot. I mean, people talk about training costs now because that's what they're confronting now because people are so focused on getting models performant enough to even use in an application. And now that we have them and they're that capable, we're really going to start to see production costs go up a lot. >> Yeah, Luis, if you don't mind, I know this might be a little bit of a tangent, but, you know, training's super important. I get that. That's what people are doing now, but then there's the deployment side of production. Where do people get caught up and miss the boat or misconfigure? What's the gotcha? Where's the trip wire, so to speak? Where do people mess up on the cost side? What do they do? Is it they don't think about it, they tie it to proprietary hardware? What's the issue? >> Yeah, several things, right? So without getting really technical, which, you know, I might get into, you know, you have to understand the relationship between performance, you know, both in terms of latency and throughput, and cost, right? So reducing latency is important because you improve responsiveness of the model. But it's really important to keep in mind that it often leads to diminishing returns. Below a certain latency, making it faster won't make a measurable difference in experience, but it's going to cost a lot more. So understanding that is important. Now, if you care more about throughput, which is, you know, units per period of time, you care about time to solution, we should think about this as throughput per dollar. And understand what you want is the highest throughput per dollar, which may come at the cost of higher latency, which you're not going to care about, right? So, and the reality here, John, is that, you know, humans and especially folks in this space want to have the latest and greatest hardware. 
And often they commit a lot of money to get access to them and have to commit upfront before they understand the needs that their models have, right? So common mistake here, one is not spending time to understand what you really need, and then two, over-committing and using more hardware than you actually need. And not giving yourself enough freedom to get your workload to move around to the more cost-effective choice, right? So this is just a metaphoric choice. And then another thing that's important here too is making a model run faster on the hardware directly translates to lower cost, right? So, but it takes a lot of engineers, you need to think of ways of producing very efficient versions of your model for the target hardware that you're going to use. >> Anna, what's the customer angle here? Because price performance has been around for a long time, people get that, but now latency and throughput, that's key because we're starting to see this in apps. I mean, there's an end user piece. I'm even seeing it on the infrastructure side where they're taking a heavy lifting away from operational costs. So you got, you know, application specific to the user and/or top of the stack, and then you got actually being used in operations where they want both. >> Yeah, absolutely. Maybe I can illustrate this with a quick story with the customer that we had recently been working with. So this customer is planning to run kind of a transformer based model for text generation at super high scale on Nvidia T4 GPU, so kind of a commodity GPU. And the scale was so high that they would've been paying hundreds of thousands of dollars in cloud costs per year just to serve this model alone. You know, one of many models in their application stack. So we worked with this team to optimize their model and then benchmark across several possible targets. So that matching the hardware that Luis was just talking about, including the newer kind of Nvidia A10 GPUs. 
And what they found during this process was pretty interesting. First, the team was able to shave a quarter of their spend just by using better optimization techniques on the T4, the older hardware. But actually moving to a newer GPU would allow them to serve this model in a sub two milliseconds latency, so super fast, which was able to unlock an entirely new kind of user experience. So they were able to kind of change the value they're delivering in their application just because they were able to move to this new hardware easily. So they ultimately decided to plan their deployment on the more expensive A10 because of this, but because of the hardware specific optimizations that we helped them with, they managed to even, you know, bring costs down from what they had originally planned. And so if you extend this kind of example to everything that's happening with generative AI, I think the story we just talked about was super relevant, but the scale can be even higher, you know, it can be tenfold that. We were recently conducting kind of this internal study using GPT-J as a proxy to illustrate the experience of just a company trying to use one of these large language models with an example scenario of creating a chatbot to help job seekers prepare for interviews. So if you imagine kind of a conservative usage scenario where the model generates just 3000 words per user per day, which is, you know, pretty conservative for how people are interacting with these models. It costs 5 cents a session and if you're a company and your app goes viral, so from, you know, beginning of the year there's nobody, at the end of the year there's a million daily active users in that year alone, going from zero to a million. You'll be spending about $6 million a year, which is pretty unmanageable. That's crazy, right? >> Yeah. >> For a company or a product that's just launching. 
So I think, you know, for us we see the real way to make these kind of advancements accessible and sustainable, as we said, is to bring down cost to serve using these techniques. >> That's a great story and I think that illustrates this idea that deployment cost can vary from situation to situation, from model to model and that the efficiency is so strong with this new wave, it eliminates heavy lifting, creates more efficiency, automates intellect. I mean, this is the trend, this is radical, this is going to increase. So the cost could go from nominal to millions, literally, potentially. So, this is what customers are doing. Yeah, that's a great story. What makes sense on a financial, is there a cost of ownership? Is there a pattern for best practice for training? What do you guys advise 'cause this is a lot of time and money involved in all potential, you know, good scenarios of upside. But you can get over your skis as they say, and be successful and be out of business if you don't manage it. I mean, that's what people are talking about, right? >> Yeah, absolutely. I think, you know, we see kind of three main vectors to reduce cost. I think one is make your deployment process easier overall, so that your engineering effort to even get your app running goes down. Two, would be get more from the compute you're already paying for, you're already paying, you know, for your instances in the cloud, but can you do more with that? And then three would be shop around for lower cost hardware to match your use case. So on the first one, I think making the deployment easier overall, there's a lot of manual work that goes into benchmarking, optimizing and packaging models for deployment. And because the performance of machine learning models can be really hardware dependent, you have to go through this process for each target you want to consider running your model on. And this is hard, you know, we see that every day. 
But for teams who want to incorporate some of these large language models into their applications, it might be desirable because licensing a model from a large vendor like OpenAI can leave you, you know, over-provisioned, kind of paying for capabilities you don't need in your application or can lock you into them and you lose flexibility. So we have a customer whose team actually prepares models for deployment in a SaaS application that many of us use every day. And they told us recently that without kind of an automated benchmarking and experimentation platform, they were spending several days each to benchmark a single model on a single hardware type. So this is really, you know, manually intensive and then getting more from the compute you're already paying for. We do see customers who leave money on the table by running models that haven't been optimized specifically for the hardware target they're using, like Luis was mentioning. And for some teams they just don't have the time to go through an optimization process and for others they might lack kind of specialized expertise and this is something we can bring. And then on shopping around for different hardware types, we really see a huge variation in model performance across hardware, not just CPU vs. GPU, which is, you know, what people normally think of. But across CPU vendors themselves, high memory instances and across cloud providers even. So the best strategy here is for teams to really be able to, we say, look before you leap by running real world benchmarking and not just simulations or predictions to find the best software, hardware combination for their workload. >> Yeah. You guys sound like you have a very impressive customer base deploying large language models. Where would you categorize your current customer base? And as you look out, as you guys are growing, you have new customers coming in, take me through the progression. 
Take me through the profile of some of your customers you have now, size, are they hyperscalers, are they big app folks, are they kicking the tires? And then as people are out there scratching heads, I got to get in this game, what's their psychology like? Are they coming in with specific problems or do they have specific orientation point of view about what they want to do? Can you share some data around what you're seeing? >> Yeah, I think, you know, we have customers that kind of range across the spectrum of sophistication from teams that basically don't have MLOps expertise in their company at all. And so they're really looking for us to kind of give a full service, how should I do everything from, you know, optimization, find the hardware, prepare for deployment. And then we have teams that, you know, maybe already have their serving and hosting infrastructure up and ready and they already have models in production and they're really just looking to, you know, take the extra juice out of the hardware and just do really specific on that optimization piece. I think one place where we're doing a lot more work now is kind of in the developer tooling, you know, model selection space. And that's kind of an area that we're creating more tools for, particularly within the PyTorch ecosystem to bring kind of this power earlier in the development cycle so that as people are grabbing a model off the shelf, they can, you know, see how it might perform and use that to inform their development process. >> Luis, what's the big, I like this idea of picking the models because isn't that like going to the market and picking the best model for your data? It's like, you know, it's like, isn't there a certain approaches? What's your view on this? 'Cause this is where everyone, I think it's going to be a land rush for this and I want to get your thoughts. >> For sure, yeah. 
So, you know, I guess I'll start with saying the one main takeaway that we got from the GPT-J study is that, you know, having a different understanding of what your model's compute and memory requirements are, very quickly, early on helps with the much smarter AI model deployments, right? So, and in fact, you know, Anna just touched on this, but I want to, you know, make sure that it's clear that OctoML is putting that power into users' hands right now. So in partnership with AWS, we are launching this new PyTorch native profiler that allows you with a single, you know, one line, you know, code decorator allows you to see how your code runs on a variety of different hardware after accelerations. So it gives you very clear, you know, data on how you should think about your model deployments. And this ties back to choices of models. So like, if you have a set of choices that are equally good of models in terms of functionality and you want to understand after acceleration how are you going to deploy, how much they're going to cost or what are the options, using an automated process of making a decision is really, really useful. And in fact, so I think these events can get early access to this by signing up for the Octopods, you know, this is exclusive group for insiders here, so you can go to OctoML.ai/pods to sign up. >> So that Octopod, is that a program? What is that, is that access to code? Is that a beta, what is that? Explain, take a minute and explain Octopod. >> I think the Octopod would be a group of people who are interested in experiencing this functionality. So it is the friends and users of OctoML that would be the Octopod. And then yes, after you sign up, we would provide you essentially the tool in code form for you to try out in your own. 
I mean, part of the benefit of this is that it happens in your own local environment and you're in control of everything kind of within the workflow that developers are already using to create and begin putting these models into their applications. So it would all be within your control. >> Got it. I think the big question I have for you is, when does one of your customers know they need to call you? What's their environment look like? What are they struggling with? What are the conversations they might be having on their side of the fence? If anyone's watching this, they're like, "Hey, you know what, I've got my team, we have a lot of data. Do we have our own language model or do I use someone else's?" There's a lot of this, I will say discovery going on around what to do, what path to take, what does that customer look like, if someone's listening, when do they know to call you guys, OctoML? >> Well, I mean the most obvious one is that you have a significant spend on AI/ML, come and talk to us, you know, putting AI/ML into production. So that's the clear one. In fact, just this morning I was talking to someone who is in life sciences space and is having, you know, 15 to $20 million a year cloud related to AI/ML deployment is a clear, it's a pretty clear match right there, right? So that's on the cost side. But I also want to emphasize something that Anna said earlier that, you know, the hardware and software complexity involved in putting a model into production is really high. So we've been able to abstract that away, offering a clean automation flow that enables one, to experiment early on, you know, how models would run and get them to production. And then two, once they are into production, gives you an automated flow to continuously update your model and take advantage of all this acceleration and ability to run the model on the right hardware. 
So anyways, let's say one then is cost, you know, you have significant cost and then two, you have an automation needs. And Anna please complement that. >> Yeah, Anna you can please- >> Yeah, I think that's exactly right. Maybe the other time is when you are expecting a big scale up in serving your application, right? You're launching a new feature, you expect to get a lot of usage or, and you want to kind of anticipate maybe your CTO, your CIO, whoever pays your cloud bills is going to come after you, right? And so they want to know, you know, what's the return on putting this model essentially into my application stack? Am I going to, is the usage going to match what I'm paying for it? And then you can understand that. >> So you guys have a lot of the early adopters, they got big data teams, they're pushed in the production, they want to get a little QA, test the waters, understand, use your technology to figure it out. Is there any cases where people have gone into production, they have to pull it out? It's like the old lemon laws with your car, you buy a car and oh my god, it's not the way I wanted it. I mean, I can imagine the early people through the wall, so to speak, in the wave here are going to be bloody in the sense that they've gone in and tried stuff and get stuck with huge bills. Are you seeing that? Are people pulling stuff out of production and redeploying? Or I can imagine that if I had a bad deployment, I'd want to refactor that or actually replatform that. Do you see that too? >> Definitely after a sticker shock, yes, your customers will come and make sure that, you know, the sticker shock won't happen again. >> Yeah. >> But then there's another more thorough aspect here that I think we likely touched on, be worth elaborating a bit more is just how are you going to scale in a way that's feasible depending on the allocation that you get, right? 
So as we mentioned several times here, you know, model deployment is so hardware dependent and so complex that you tend to get a model for a hardware choice and then you want to scale that specific type of instance. But what if, when you want to scale because it suddenly, luckily, got popular and, you know, you want to scale it up and then you don't have that instance anymore. So how do you live with whatever you have at that moment is something that we see customers needing as well. You know, so in fact, ideally what we want is customers to not think about what kind of specific instances they want. What they want is to know what their models need. Say, they know the SLA and then find a set of hardware targets and instances that hit the SLA whenever they're also scaling, they're going to scale with more freedom, right? Instead of having to wait for AWS to give them more specific allocation for a specific instance. What if you could live with other types of hardware and scale up in a more free way, right? So that's another thing that we see customers, you know, like they need more freedom to be able to scale with whatever is available. >> Anna, you touched on this with the business model impact to that 6 million cost, if that goes out of control, there's a business model aspect and there's a technical operation aspect to the cost side too. You want to be mindful of riding the wave in a good way, but not getting over your skis. So that brings up the point around, you know, confidence, right? And teamwork. Because if you're in production, there's probably a team behind it. Talk about the team aspect of your customers. I mean, they're dedicated, they go put stuff into production, they're developers, they're data. What's in it for them? Are they getting better, are they on the beach, you know, reading the book. Are they, you know, is it easy street for them? What's the customer benefit to the teams? >> Yeah, absolutely. 
With just a few clicks of a button, you're in production, right? That's the dream. So yeah, I mean I think that, you know, we illustrated it before a little bit. I think the automated kind of benchmarking and optimization process, like when you think about the effort it takes to get that data by hand, which is what people are doing today, they just don't do it. So they're making decisions without the best information because it's, you know, there just isn't the bandwidth to get the information that they need to make the best decision and then know exactly how to deploy it. So I think it's actually bringing kind of a new insight and capability to these teams that they didn't have before. And then maybe another aspect on the team side is that it's making the hand-off of the models from the data science teams to the model deployment teams more seamless. So we have, you know, we have seen in the past that this kind of transition point is the place where there are a lot of hiccups, right? The data science team will give a model to the production team and it'll be too slow for the application or it'll be too expensive to run and it has to go back and be changed and kind of this loop. And so, you know, with the PyTorch profiler that Luis was talking about, and then also, you know, the other ways we do optimization that kind of prevents that hand-off problem from happening. >> Luis and Anna, you guys have a great company. Final couple minutes left. Talk about the company, the people there, what's the culture like, you know, if Intel has Moore's law, which is, you know, doubling the performance in few years, what's the culture like there? Is it, you know, more throughput, better pricing? Explain what's going on with the company and put a plug in. Luis, we'll start with you. >> Yeah, absolutely. I'm extremely proud of the team that we built here. 
You know, we have a people first culture, you know, very, very collaborative and folks, we all have a shared mission here of making AI more accessible and sustainable. We have a very diverse team in terms of backgrounds and life stories, you know, to do what we do here, we need a team that has expertise in software engineering, in machine learning, in computer architecture. Even though we don't build chips, we need to understand how they work, right? So, and then, you know, the fact that we have this, this very really, really varied set of backgrounds makes the environment, you know, I'd say very exciting to learn more about, you know, systems end-to-end. But also makes it for a very interesting, you know, work environment, right? So people have different backgrounds, different stories. Some of them went to grad school, others, you know, were in intelligence agencies and now are working here, you know. So we have a really interesting set of people and, you know, life is too short not to work with interesting humans. You know, that's something that I like to think about, you know. >> I'm sure your off-site meetings are a lot of fun, people talking about computer architectures, silicon advances, the next GPU, the big data models coming in. Anna, what's your take? What's the culture like? What's the company vibe and what are you guys looking to do? What's the customer success pattern? What's up? >> Yeah, absolutely. I mean, I, you know, second all of the great things that Luis just said about the team. I think one that I, an additional one that I'd really like to underscore is kind of this customer obsession, to use a term you all know well. And focus on the end users and really making the experiences that we're bringing to our user who are developers really, you know, useful and valuable for them. 
And so I think, you know, all of these tools that we're trying to put in the hands of users, the industry and the market is changing so rapidly that our products across the board, you know, all of the companies that, you know, are part of the showcase today, we're all evolving them so quickly and we can only do that kind of really hand in glove with our users. So that would be another thing I'd emphasize. >> I think the change dynamic, the power dynamics of this industry is just the beginning. I'm very bullish that this is going to be probably one of the biggest inflection points in history of the computer industry because of all the dynamics of the confluence of all the forces, which you mentioned some of them, I mean PC, you know, interoperability within internetworking and you got, you know, the web and then mobile. Now we have this, I mean, I wouldn't even put social media even in the close to this. Like, this is like, changes user experience, changes infrastructure. There's going to be massive accelerations in performance on the hardware side from AWS's of the world and cloud and you got the edge and more data. This is really what big data was going to look like. This is the beginning. Final question, what do you guys see going forward in the future? >> Well, it's undeniable that machine learning and AI models are becoming an integral part of an interesting application today, right? So, and the clear trends here are, you know, more and more computational needs for these models because they're only getting more and more powerful. And then two, you know, seeing the complexity of the infrastructure where they run, you know, just considering the cloud, there's like a wide variety of choices there, right? So being able to live with that and making the most out of it in a way that does not require, you know, an impossible to find team is something that's pretty clear. So the need for automation, abstracting away the complexity is definitely here. 
And we are seeing this, you know, trends are that you also see models starting to move to the edge as well. So it's clear that we're seeing, we are going to live in a world where there are large models living in the cloud. And then, you know, edge models that talk to these models in the cloud to form, you know, an end-to-end truly intelligent application. >> Anna? >> Yeah, I think, you know, our, Luis said it at the beginning. Our vision is to make AI sustainable and accessible. And I think as this technology just expands in every company and every team, that's going to happen kind of on its own. And we're here to help support that. And I think you can't do that without tools like those from OctoML. >> I think it's going to be an era of massive invention, creativity, a lot of the format heavy lifting is going to allow the talented people to automate their intellect. I mean, this is really kind of what we see going on. And Luis, thank you so much. Anna, thanks for coming on this segment. Thanks for coming on theCUBE and being part of the AWS Startup Showcase. I'm John Furrier, your host. Thanks for watching. (upbeat music)
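
The cost example Luis gives in the interview (a model that costs $1 to $2 million to train and one to two cents per session to serve, with a million users each running one session a day) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and the break-even calculation are assumptions added here, not something stated by OctoML, and usage is assumed flat from day one. Amounts are kept in integer cents to avoid floating-point rounding.

```python
def daily_serving_cost_cents(daily_users: int, sessions_per_user: int,
                             cents_per_session: int) -> int:
    """Serving cost for one day, in cents (hypothetical helper)."""
    return daily_users * sessions_per_user * cents_per_session


def days_until_serving_matches_training(training_cost_dollars: int,
                                        daily_cost_cents: int) -> int:
    """First day on which cumulative serving cost reaches the training cost,
    assuming flat daily usage (an assumption, not from the interview)."""
    training_cents = training_cost_dollars * 100
    return -(-training_cents // daily_cost_cents)  # ceiling division


# Figures quoted in the interview: one million users, one session each per day.
low = daily_serving_cost_cents(1_000_000, 1, 1)   # 1 cent per session
high = daily_serving_cost_cents(1_000_000, 1, 2)  # 2 cents per session

print(low // 100, high // 100)  # dollars per day: 10000 20000
print(days_until_serving_matches_training(1_000_000, low),
      days_until_serving_matches_training(2_000_000, high))  # 100 100
```

Under these assumptions, serving cost catches up with the one-time training cost in roughly 100 days at either end of the quoted range, which is consistent with the point that production cost quickly exceeds what was paid for training.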

Published Date : Mar 9 2023


ENTITIES

Entity	Category	Confidence
Anna	PERSON	0.99+
Anna Connolly	PERSON	0.99+
John Furrier	PERSON	0.99+
Luis	PERSON	0.99+
Luis Ceze	PERSON	0.99+
John	PERSON	0.99+
1	QUANTITY	0.99+
10	QUANTITY	0.99+
15	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
10 year	QUANTITY	0.99+
6 million	QUANTITY	0.99+
zero	QUANTITY	0.99+
Intel	ORGANIZATION	0.99+
three	QUANTITY	0.99+
Nvidia	ORGANIZATION	0.99+
First	QUANTITY	0.99+
OctoML	ORGANIZATION	0.99+
two	QUANTITY	0.99+
millions	QUANTITY	0.99+
today	DATE	0.99+
Two	QUANTITY	0.99+
$2 million	QUANTITY	0.98+
3000 words	QUANTITY	0.98+
one line	QUANTITY	0.98+
A10	COMMERCIAL_ITEM	0.98+
OctoML	TITLE	0.98+
one	QUANTITY	0.98+
three main vectors	QUANTITY	0.97+
hundreds of thousands of dollars	QUANTITY	0.97+
both	QUANTITY	0.97+
CUBE	ORGANIZATION	0.97+
T4	COMMERCIAL_ITEM	0.97+
one time	QUANTITY	0.97+
first one	QUANTITY	0.96+
two cents	QUANTITY	0.96+
GPT-J	ORGANIZATION	0.96+
single model	QUANTITY	0.95+
a minute	QUANTITY	0.95+
about $6 million a year	QUANTITY	0.95+
once a day	QUANTITY	0.95+
$20,000 a day	QUANTITY	0.95+
a million	QUANTITY	0.94+
theCUBE	ORGANIZATION	0.93+
Octopod	TITLE	0.93+
this morning	DATE	0.93+
first culture	QUANTITY	0.92+
$20 million a year	QUANTITY	0.92+
AWS Startup Showcase	EVENT	0.9+
North Star	ORGANIZATION	0.9+

Robert Nishihara, Anyscale | AWS Startup Showcase S3 E1

(upbeat music) >> Hello everyone. Welcome to theCUBE's presentation of the "AWS Startup Showcase." The topic this episode is AI and machine learning, top startups building foundational model infrastructure. This is season three, episode one of the ongoing series covering exciting startups from the AWS ecosystem. And this time we're talking about AI and machine learning. I'm your host, John Furrier. I'm excited I'm joined today by Robert Nishihara, who's the co-founder and CEO of a hot startup called Anyscale. He's here to talk about Ray, the open source project, Anyscale's infrastructure for foundation models as well. Robert, thank you for joining us today. >> Yeah, thanks so much as well. >> I've been following your company since the founding pre-pandemic and you guys really had a great vision, scaled up, and are in a perfect position for this big wave that we all see with ChatGPT and OpenAI that's gone mainstream. Finally, AI has broken out through the ropes and now gone mainstream, so I think you guys are really well positioned. I'm looking forward to talking with you today. But before we get into it, introduce the core mission for Anyscale. Why do you guys exist? What is the North Star for Anyscale? >> Yeah, like you mentioned, there's a tremendous amount of excitement about AI right now. You know, I think a lot of us believe that AI can transform just every different industry. So one of the things that was clear to us when we started this company was that the amount of compute needed to do AI was just exploding. Like to actually succeed with AI, companies like OpenAI or Google or you know, these companies getting a lot of value from AI, were not just running these machine learning models on their laptops or on a single machine. They were scaling these applications across hundreds or thousands or more machines and GPUs and other resources in the Cloud. 
And so to actually succeed with AI, and this has been one of the biggest trends in computing, maybe the biggest trend in computing in, you know, in recent history, the amount of compute has been exploding. And so to actually succeed with that AI, to actually build these scalable applications and scale the AI applications, there's a tremendous software engineering lift to build the infrastructure to actually run these scalable applications. And that's very hard to do. So one of the reasons many AI projects and initiatives fail is that, or don't make it to production, is the need for this scale, the infrastructure lift, to actually make it happen. So our goal here with Anyscale and Ray, is to make that easy, is to make scalable computing easy. So that as a developer or as a business, if you want to do AI, if you want to get value out of AI, all you need to know is how to program on your laptop. Like, all you need to know is how to program in Python. And if you can do that, then you're good to go. Then you can do what companies like OpenAI or Google do and get value out of machine learning. >> That programming example of how easy it is with Python reminds me of the early days of Cloud, when infrastructure as code was talked about was, it was just code the infrastructure programmable. That's super important. That's what AI people wanted, first program AI. That's the new trend. And I want to understand, if you don't mind explaining, the relationship that Anyscale has to these foundational models and particular the large language models, also called LLMs, was seen with like OpenAI and ChatGPT. Before you get into the relationship that you have with them, can you explain why the hype around foundational models? Why are people going crazy over foundational models? What is it and why is it so important? 
>> Yeah, so foundational models and foundation models are incredibly important because they enable businesses and developers to get value out of machine learning, to use machine learning off the shelf with these large models that have been trained on tons of data and that are useful out of the box. And then, of course, you know, as a business or as a developer, you can take those foundational models and repurpose them or fine tune them or adapt them to your specific use case and what you want to achieve. But it's much easier to do that than to train them from scratch. And I think there are three, for people to actually use foundation models, there are three main types of workloads or problems that need to be solved. One is training these foundation models in the first place, like actually creating them. The second is fine tuning them and adapting them to your use case. And the third is serving them and actually deploying them. Okay, so Ray and Anyscale are used for all of these three different workloads. Companies like OpenAI or Cohere that train large language models, or open source versions like GPT-J, are done on top of Ray. There are many startups and other businesses that fine tune, that, you know, don't want to train the large underlying foundation models, but that do want to fine tune them, do want to adapt them to their purposes, and build products around them and serve them, those are also using Ray and Anyscale for that fine tuning and that serving. And so the reason that Ray and Anyscale are important here is that, you know, building and using foundation models requires a huge scale. It requires a lot of data. It requires a lot of compute, GPUs, TPUs, other resources. And to actually take advantage of that and actually build these scalable applications, there's a lot of infrastructure that needs to happen under the hood. And so you can either use Ray and Anyscale to take care of that and manage the infrastructure and solve those infrastructure problems. 
Or you can build the infrastructure and manage the infrastructure yourself, which you can do, but it's going to slow your team down. It's going to, you know, many of the businesses we work with simply don't want to be in the business of managing infrastructure and building infrastructure. They want to focus on product development and move faster. >> I know you got a keynote presentation we're going to go to in a second, but I think you hit on something I think is the real tipping point, doing it yourself, hard to do. These are things where opportunities are and the Cloud did that with data centers. Turned a data center and made it an API. The heavy lifting went away and went to the Cloud so people could be more creative and build their product. In this case, build their creativity. Is that kind of what's the big deal? Is that kind of a big deal happening that you guys are taking the learnings and making that available so people don't have to do that? >> That's exactly right. So today, if you want to succeed with AI, if you want to use AI in your business, infrastructure work is on the critical path for doing that. To do AI, you have to build infrastructure. You have to figure out how to scale your applications. That's going to change. We're going to get to the point, and you know, with Ray and Anyscale, we're going to remove the infrastructure from the critical path so that as a developer or as a business, all you need to focus on is your application logic, what you want the program to do, what you want your application to do, how you want the AI to actually interface with the rest of your product. Now the way that will happen is that Ray and Anyscale will still, the infrastructure work will still happen. It'll just be under the hood and taken care of by Ray and Anyscale. And so I think something like this is really necessary for AI to reach its potential, for AI to have the impact and the reach that we think it will, you have to make it easier to do. 
>> And just for clarification to point out, if you don't mind explaining the relationship of Ray and Anyscale real quick just before we get into the presentation. >> So Ray is an open source project. We created it. We were at Berkeley doing machine learning. We started Ray so that, in order to provide an easy, a simple open source tool for building and running scalable applications. And Anyscale is the managed version of Ray, basically we will run Ray for you in the Cloud, provide a lot of tools around the developer experience and managing the infrastructure and providing more performance and superior infrastructure. >> Awesome. I know you got a presentation on Ray and Anyscale and you guys are positioning as the infrastructure for foundational models. So I'll let you take it away and then when you're done presenting, we'll come back, I'll probably grill you with a few questions and then we'll close it out so take it away. >> Robert: Sounds great. So I'll say a little bit about how companies are using Ray and Anyscale for foundation models. The first thing I want to mention is just why we're doing this in the first place. And the underlying observation, the underlying trend here, and this is a plot from OpenAI, is that the amount of compute needed to do machine learning has been exploding. It's been growing at something like 35 times every 18 months. This is absolutely enormous. And other people have written papers measuring this trend and you get different numbers. But the point is, no matter how you slice and dice it, it's an astronomical rate. Now if you compare that to something we're all familiar with, like Moore's Law, which says that, you know, the processor performance doubles every roughly 18 months, you can see that there's just a tremendous gap between the needs, the compute needs of machine learning applications, and what you can do with a single chip, right. 
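To put the two growth rates quoted above side by side, here is a small arithmetic sketch. It takes the figures exactly as stated in the talk (roughly 35 times every 18 months for machine learning compute, a doubling every 18 months for a single chip) and compounds them; the function names are illustrative and not from any library.

```python
# Illustrative arithmetic only: both rates are the ones quoted in the talk.
PERIOD_MONTHS = 18
ML_COMPUTE_GROWTH = 35.0  # "something like 35 times every 18 months"
CHIP_GROWTH = 2.0         # Moore's Law as stated: doubling roughly every 18 months


def growth_factor(rate_per_period: float, months: float) -> float:
    """Compound growth over `months`, given a multiplier per 18-month period."""
    return rate_per_period ** (months / PERIOD_MONTHS)


def compute_gap(months: float) -> float:
    """Ratio of compute demanded to what single-chip gains alone would supply."""
    return growth_factor(ML_COMPUTE_GROWTH, months) / growth_factor(CHIP_GROWTH, months)


if __name__ == "__main__":
    for months in (18, 36, 54):
        print(
            f"{months} months: demand x{growth_factor(ML_COMPUTE_GROWTH, months):,.0f}, "
            f"chip x{growth_factor(CHIP_GROWTH, months):,.0f}, "
            f"gap x{compute_gap(months):,.1f}"
        )
```

Under these assumed rates the gap itself multiplies by 17.5 every 18 months, which is the shortfall that has to be made up by scaling across many chips.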
So even if Moore's Law were continuing strong and you know, doing what it used to be doing, even if that were the case, there would still be a tremendous gap between what you can do with the chip and what you need in order to do machine learning. And so given this graph, what we've seen, and what has been clear to us since we started this company, is that doing AI requires scaling. There's no way around it. It's not a nice to have, it's really a requirement. And so that led us to start Ray, which is the open source project that we started to make it easy to build these scalable Python applications and scalable machine learning applications. And since we started the project, it's been adopted by a tremendous number of companies. Companies like OpenAI, which use Ray to train their large models like ChatGPT, companies like Uber, which run all of their deep learning and classical machine learning on top of Ray, companies like Shopify or Spotify or Instacart or Lyft or Netflix, ByteDance, which use Ray for their machine learning infrastructure. Companies like Ant Group, which makes Alipay, you know, they use Ray across the board for fraud detection, for online learning, for detecting money laundering, you know, for graph processing, stream processing. Companies like Amazon, you know, run Ray at a tremendous scale and just petabytes of data every single day. And so the project has seen just enormous adoption since, over the past few years. And one of the most exciting use cases is really providing the infrastructure for building training, fine tuning, and serving foundation models. So I'll say a little bit about, you know, here are some examples of companies using Ray for foundation models. Cohere trains large language models. OpenAI also trains large language models. You can think about the workloads required there are things like supervised pre-training, also reinforcement learning from human feedback. 
So this is not only the regular supervised learning, but actually more complex reinforcement learning workloads that take human input about what response to a particular question, you know is better than a certain other response. And incorporating that into the learning. There's open source versions as well, like GPT-J, also built on top of Ray, as well as projects like Alpa coming out of UC Berkeley. So these are some of the examples of exciting projects and organizations, training and creating these large language models and serving them using Ray. Okay, so what actually is Ray? Well, there are two layers to Ray. At the lowest level, there's the core Ray system. This is essentially low level primitives for building scalable Python applications. Things like taking a Python function or a Python class and executing them in the cluster setting. So Ray core is extremely flexible and you can build arbitrary scalable applications on top of Ray. So on top of Ray, on top of the core system, what really gives Ray a lot of its power is this ecosystem of scalable libraries. So on top of the core system you have libraries, scalable libraries for ingesting and pre-processing data, for training your models, for fine tuning those models, for hyper parameter tuning, for doing batch processing and batch inference, for doing model serving and deployment, right. And a lot of the Ray users, the reason they like Ray is that they want to run multiple workloads. They want to train and serve their models, right. They want to load their data and feed that into training. And Ray provides common infrastructure for all of these different workloads. So this is a little overview of what Ray, the different components of Ray. So why do people choose to go with Ray? I think there are three main reasons. The first is the unified nature. The fact that it is common infrastructure for scaling arbitrary workloads, from data ingest to pre-processing to training to inference and serving, right. 
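The core idea described above, taking an ordinary Python function and fanning it out across workers, can be sketched with the standard library alone. This is a local thread-pool stand-in and not Ray's API: Ray's documented equivalent marks functions with the `@ray.remote` decorator and collects results with `ray.get`, and it schedules the work across a cluster rather than inside one process. The `preprocess` and `fan_out` names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess(record: str) -> str:
    """Stand-in for one unit of work, e.g. cleaning a single data record."""
    return record.strip().lower()


def fan_out(func, items, num_workers: int = 4):
    """Run `func` over `items` on a worker pool and gather results in input order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(func, item) for item in items]  # schedule all the work
        return [future.result() for future in futures]         # block until each finishes


if __name__ == "__main__":
    print(fan_out(preprocess, ["  Ray ", "ANYSCALE", " Python"]))
```

The point of the pattern is that `preprocess` stays a plain function; only the call site decides how many workers run it.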
This also includes the fact that it's future proof. AI is incredibly fast moving. And so many people, many companies that have built their own machine learning infrastructure and standardized on particular workflows for doing machine learning have found that their workflows are too rigid to enable new capabilities. If they want to do reinforcement learning, if they want to use graph neural networks, they don't have a way of doing that with their standard tooling. And so Ray, being future proof and being flexible and general gives them that ability. Another reason people choose Ray and Anyscale is the scalability. This is really our bread and butter. This is the reason, the whole point of Ray, you know, making it easy to go from your laptop to running on thousands of GPUs, making it easy to scale your development workloads and run them in production, making it easy to scale, you know, training to scale data ingest, pre-processing and so on. So scalability and performance, you know, are critical for doing machine learning and that is something that Ray provides out of the box. And lastly, Ray is an open ecosystem. You can run it anywhere. You can run it on any Cloud provider. Google, you know, Google Cloud, AWS, Azure. You can run it on your Kubernetes cluster. You can run it on your laptop. It's extremely portable. And not only that, it's framework agnostic. You can use Ray to scale arbitrary Python workloads. You can use it to scale and it integrates with libraries like TensorFlow or PyTorch or JAX or XGBoost or Hugging Face or PyTorch Lightning, right, or Scikit-learn or just your own arbitrary Python code. It's open source. And in addition to integrating with the rest of the machine learning ecosystem and these machine learning frameworks, you can use Ray along with all of the other tooling in the machine learning ecosystem. That's things like Weights & Biases or MLflow, right. 
Or you know, different data platforms like Databricks, you know, Delta Lake or Snowflake or tools for model monitoring for feature stores, all of these integrate with Ray. And that's, you know, Ray provides that kind of flexibility so that you can integrate it into the rest of your workflow. And then Anyscale is the scalable compute platform that's built on top, you know, that provides Ray. So Anyscale is a managed Ray service that runs in the Cloud. And what Anyscale does is it offers the best way to run Ray. And if you think about what you get with Anyscale, there are fundamentally two things. One is about moving faster, accelerating the time to market. And you get that by having the managed service so that as a developer you don't have to worry about managing infrastructure, you don't have to worry about configuring infrastructure. You also, it provides, you know, optimized developer workflows. Things like easily moving from development to production, things like having the observability tooling, the debuggability to actually easily diagnose what's going wrong in a distributed application. So things like the dashboards and the other kinds of tooling for collaboration, for monitoring and so on. And then on top of that, so that's the first bucket, developer productivity, moving faster, faster experimentation and iteration. The second reason that people choose Anyscale is superior infrastructure. So this is things like, you know, cost efficiency, being able to easily take advantage of spot instances, being able to get higher GPU utilization, things like faster cluster startup times and auto scaling. Things like just overall better performance and faster scheduling. And so these are the kinds of things that Anyscale provides on top of Ray. It's the managed infrastructure. It's fast, it's like the developer productivity and velocity as well as performance. So this is what I wanted to share about Ray and Anyscale. >> John: Awesome. >> Provide that context. 
But John, I'm curious what you think. >> I love it. I love the, so first of all, it's a platform because that's the platform architecture right there. So just to clarify, this is an Anyscale platform, not- >> That's right. >> Tools. So you got tools in the platform. Okay, that's key. Love that managed service. Just curious, you mentioned Python multiple times, is that because of PyTorch and TensorFlow or Python's the most friendly with machine learning or it's because it's very common amongst all developers? >> That's a great question. Python is the language that people are using to do machine learning. So it's the natural starting point. Now, of course, Ray is actually designed in a language agnostic way and there are companies out there that use Ray to build scalable Java applications. But for the most part right now we're focused on Python and being the best way to build these scalable Python and machine learning applications. But, of course, down the road there always is that potential. >> So if you're slinging Python code out there and you're watching that, you're watching this video, get on Anyscale bus quickly. Also, I just, while you were giving the presentation, I couldn't help, since you mentioned OpenAI, which by the way, congratulations 'cause they've had great scale, I've noticed in their rapid growth 'cause they were the fastest company to the number of users than anyone in the history of the computer industry, so major success for OpenAI and ChatGPT, huge fan. I'm not a skeptic at all. I think it's just the beginning, so congratulations. But I actually typed into ChatGPT, what are the top three benefits of Anyscale and came up with scalability, flexibility, and ease of use. Obviously, scalability is what you guys are called. >> That's pretty good. >> So that's what they came up with. So they nailed it. Did you have an inside prompt training, buy it there? Only kidding. (Robert laughs) >> Yeah, we hard coded that one. 
>> But that's the kind of thing that came up really, really quickly if I asked it to write a sales document, it probably will, but this is the future interface. This is why people are getting excited about the foundational models and the large language models because it's allowing the interface with the user, the consumer, to be more human, more natural. And this is clearly will be in every application in the future. >> Absolutely. This is how people are going to interface with software, how they're going to interface with products in the future. It's not just something, you know, not just a chat bot that you talk to. This is going to be how you get things done, right. How you use your web browser or how you use, you know, how you use Photoshop or how you use other products. Like you're not going to spend hours learning all the APIs and how to use them. You're going to talk to it and tell it what you want it to do. And of course, you know, if it doesn't understand it, it's going to ask clarifying questions. You're going to have a conversation and then it'll figure it out. >> This is going to be one of those things, we're going to look back at this time Robert and saying, "Yeah, from that company, that was the beginning of that wave." And just like AWS and Cloud Computing, the folks who got in early really were in position when say the pandemic came. So getting in early is a good thing and that's what everyone's talking about is getting in early and playing around, maybe replatforming or even picking one or few apps to refactor with some staff and managed services. So people are definitely jumping in. So I have to ask you the ROI cost question. You mentioned some of those, Moore's Law versus what's going on in the industry. When you look at that kind of scale, the first thing that jumps out at people is, "Okay, I love it. Let's go play around." But what's it going to cost me? Am I going to be tied to certain GPUs? 
What's the landscape look like from an operational standpoint, from the customer? Are they locked in and the benefit was flexibility, are you flexible to handle any Cloud? What is the customers, what are they looking at? Basically, that's my question. What's the customer looking at? >> Cost is super important here and many of the companies, I mean, companies are spending a huge amount on their Cloud computing, on AWS, and on doing AI, right. And I think a lot of the advantage of Anyscale, what we can provide here is not only better performance, but cost efficiency. Because if we can run something faster and more efficiently, it can also use less resources and you can lower your Cloud spending, right. We've seen companies go from, you know, 20% GPU utilization with their current setup and the current tools they're using to running on Anyscale and getting more like 95, you know, 100% GPU utilization. That's something like a five x improvement right there. So depending on the kind of application you're running, you know, it's a significant cost savings. We've seen companies that have, you know, processing petabytes of data every single day with Ray going from, you know, getting order of magnitude cost savings by switching from what they were previously doing to running their application on Ray. And when you have applications that are spending, you know, potentially $100 million a year and getting a 10 X cost savings is just absolutely enormous. So these are some of the kinds of- >> Data infrastructure is super important. Again, if the customer, if you're a prospect to this and thinking about going in here, just like the Cloud, you got infrastructure, you got the platform, you got SaaS, same kind of thing's going to go on in AI. So I want to get into that, you know, ROI discussion and some of the impact with your customers that are leveraging the platform. But first I hear you got a demo. >> Robert: Yeah, so let me show you, let me give you a quick run through here. 
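The utilization figures quoted above (about 20% before, 95 to 100% after) translate into cost per unit of useful work as follows. This is back-of-the-envelope arithmetic only; the hourly price is a made-up placeholder and the function names are hypothetical.

```python
def cost_per_useful_gpu_hour(hourly_price: float, utilization: float) -> float:
    """Price paid per hour of GPU time that is actually doing work."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_price / utilization


def utilization_improvement(before: float, after: float) -> float:
    """How many times more useful work the same GPU spend buys."""
    return after / before


if __name__ == "__main__":
    price = 3.00  # hypothetical dollars per GPU-hour, for illustration only
    for utilization in (0.20, 0.95, 1.00):
        cost = cost_per_useful_gpu_hour(price, utilization)
        print(f"{utilization:.0%} utilization: ${cost:.2f} per useful GPU-hour")
    print(f"20% to 95%: x{utilization_improvement(0.20, 0.95):.2f}")
    print(f"20% to 100%: x{utilization_improvement(0.20, 1.00):.2f}")
```

Going from 20% to 95% works out to 4.75 times and to 100% exactly 5 times, consistent with the "something like a five x improvement" stated in the talk.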
So what I have open here is the Anyscale UI. I've started a little Anyscale Workspace. So Workspaces are the Anyscale concept for interactive development, right. So here, imagine I'm just, you want to have a familiar experience like you're developing on your laptop. And here I have a terminal. It's not on my laptop. It's actually in the cloud running on Anyscale. And I'm just going to kick this off. This is going to train a large language model, so OPT. And it's doing this on 32 GPUs. We've got a cluster here with a bunch of CPU cores, bunch of memory. And as that's running, and by the way, if I wanted to run this on instead of 32 GPUs, 64, 128, this is just a one line change when I launch the Workspace. And what I can do is I can pull up VS Code, right. Remember this is the interactive development experience. I can look at the actual code. Here it's using Ray Train to train the torch model. We've got the training loop and we're saying that each worker gets access to one GPU and four CPU cores. And, of course, as I make the model larger, this is using DeepSpeed, as I make the model larger, I could increase the number of GPUs that each worker gets access to, right. And how that is distributed across the cluster. And if I wanted to run on CPUs instead of GPUs or a different, you know, accelerator type, again, this is just a one line change. And here we're using Ray Train to train the models, just taking my vanilla PyTorch model using Hugging Face and then scaling that across a bunch of GPUs. And, of course, if I want to look at the dashboard, I can go to the Ray dashboard. There are a bunch of different visualizations I can look at. I can look at the GPU utilization. I can look at, you know, the CPU utilization here where I think we're currently loading the model and running that actual application to start the training. And some of the things that are really convenient here about Anyscale, both I can get that interactive development experience with VS Code. 
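The per-worker resource settings described in the demo (one GPU and four CPU cores per worker, 32 GPUs in total, with 64 or 128 being a one-line change) can be modeled as a tiny configuration object. This is a hypothetical stand-in written for illustration; it is not the code shown on screen and not Ray Train's own configuration class, and the assumption that the demo used 32 workers is inferred from the one-GPU-per-worker statement.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WorkerPlan:
    """Hypothetical description of how many workers run and what each one gets."""
    num_workers: int
    gpus_per_worker: int = 1  # demo: each worker gets access to one GPU
    cpus_per_worker: int = 4  # demo: and four CPU cores

    @property
    def total_gpus(self) -> int:
        return self.num_workers * self.gpus_per_worker

    @property
    def total_cpus(self) -> int:
        return self.num_workers * self.cpus_per_worker


demo = WorkerPlan(num_workers=32)            # 32 GPUs, as in the demo
bigger = replace(demo, num_workers=64)       # the "one line change" to 64 GPUs
cpu_only = replace(demo, gpus_per_worker=0)  # the same job on CPUs instead of GPUs

if __name__ == "__main__":
    for name, plan in (("demo", demo), ("bigger", bigger), ("cpu_only", cpu_only)):
        print(f"{name}: {plan.total_gpus} GPUs, {plan.total_cpus} CPU cores")
```

Keeping the resource request separate from the training loop is what makes the scale-up a single edited value rather than a code rewrite.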
You know, I can look at the dashboards. I can monitor what's going on. It feels, I have a terminal, it feels like my laptop, but it's actually running on a large cluster. And I can, with however many GPUs or other resources that I want. And so it's really trying to combine the best of having the familiar experience of programming on your laptop, but with the benefits, you know, being able to take advantage of all the resources in the Cloud to scale. And it's like when, you know, you're talking about cost efficiency. One of the biggest reasons that people waste money, one of the silly reasons for wasting money is just forgetting to turn off your GPUs. And what you can do here is, of course, things will auto terminate if they're idle. But imagine you go to sleep, I have this big cluster. You can turn it off, shut off the cluster, come back tomorrow, restart the Workspace, and you know, your big cluster is back up and all of your code changes are still there. All of your local file edits. It's like you just closed your laptop and came back and opened it up again. And so this is the kind of experience we want to provide for our users. So that's what I wanted to share with you. >> Well, I think that whole, couple of things, lines of code change, single line of code change, that's game changing. And then the cost thing, I mean human error is a big deal. People pass out at their computer. They've been coding all night or they just forget about it. I mean, and then it's just like leaving the lights on or your water running in your house. It's just, at the scale that it is, the numbers will add up. That's a huge deal. So I think, you know, compute back in the old days, there's no compute. Okay, it's just compute sitting there idle. But you know, data cranking the models is doing, that's a big point. 
>> Another thing I want to add there about cost efficiency is that we make it really easy to use, if you're running on Anyscale, to use spot instances and these preemptible instances that can just be significantly cheaper than the on-demand instances. And so when we see our customers go from what they're doing before to using Anyscale and they go from not using these spot instances 'cause they don't have the infrastructure around it, the fault tolerance to handle the preemption and things like that, to being able to just check a box and use spot instances and save a bunch of money. >> You know, this was my whole, my feature article at re:Invent last year when I met with Adam Selipsky, this next gen Cloud is here. I mean, it's not auto scale, it's infrastructure scale. It's agility. It's flexibility. I think this is where the world needs to go. Almost what DevOps did for Cloud and what you were showing me that demo had this whole SRE vibe. And remember Google had site reliability engineers to manage all those servers. This is kind of like an SRE vibe for data at scale. I mean, a similar kind of order of magnitude. I mean, I might be a little bit off base there, but how would you explain it? >> It's a nice analogy. I mean, what we are trying to do here is get to the point where developers don't think about infrastructure. Where developers only think about their application logic. And where businesses can do AI, can succeed with AI, and build these scalable applications, but they don't have to build, you know, an infrastructure team. They don't have to develop that expertise. They don't have to invest years in building their internal machine learning infrastructure. They can just focus on the Python code, on their application logic, and run the stuff out of the box. >> Awesome. Well, I appreciate the time. Before we wrap up here, give a plug for the company. I know you got a couple websites. Again, go, Ray's got its own website. You got Anyscale. 
You got an event coming up. Give a plug for the company looking to hire. Put a plug in for the company. >> Yeah, absolutely. Thank you. So first of all, you know, we think AI is really going to transform every industry and the opportunity is there, right. We can be the infrastructure that enables all of that to happen, that makes it easy for companies to succeed with AI, and get value out of AI. Now we have, if you're interested in learning more about Ray, Ray has been emerging as the standard way to build scalable applications. Our adoption has been exploding. I mentioned companies like OpenAI using Ray to train their models. But really across the board companies like Netflix and Cruise and Instacart and Lyft and Uber, you know, just among tech companies. It's across every industry. You know, gaming companies, agriculture, you know, farming, robotics, drug discovery, you know, FinTech, we see it across the board. And all of these companies can get value out of AI, can really use AI to improve their businesses. So if you're interested in learning more about Ray and Anyscale, we have our Ray Summit coming up in September. This is going to highlight a lot of the most impressive use cases and stories across the industry. And if your business, if you want to use LLMs, you want to train these LLMs, these large language models, you want to fine tune them with your data, you want to deploy them, serve them, and build applications and products around them, give us a call, talk to us. You know, we can really take the infrastructure piece, you know, off the critical path and make that easy for you. So that's what I would say. And, you know, like you mentioned, we're hiring across the board, you know, engineering, product, go-to-market, and it's an exciting time. >> Robert Nishihara, co-founder and CEO of Anyscale, congratulations on a great company you've built and continuing to iterate on and you got growth ahead of you, you got a tailwind. I mean, the AI wave is here. 
I think OpenAI and ChatGPT, a customer of yours, have really opened up the mainstream visibility into this new generation of applications, user interface, role of data, large scale, how to make that programmable so we're going to need that infrastructure. So thanks for coming on this season three, episode one of the ongoing series of the hot startups. In this case, this episode is the top startups building foundational model infrastructure for AI and ML. I'm John Furrier, your host. Thanks for watching. (upbeat music)

Published Date : Mar 9 2023

ENTITIES

Entity	Category	Confidence
Robert Nishihara	PERSON	0.99+
John	PERSON	0.99+
Robert	PERSON	0.99+
John Furrier	PERSON	0.99+
Netflix	ORGANIZATION	0.99+
35 times	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
$100 million	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
100%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Ant Group	ORGANIZATION	0.99+
first	QUANTITY	0.99+
Python	TITLE	0.99+
20%	QUANTITY	0.99+
32 GPUs	QUANTITY	0.99+
Lyft	ORGANIZATION	0.99+
hundreds	QUANTITY	0.99+
tomorrow	DATE	0.99+
Anyscale	ORGANIZATION	0.99+
three	QUANTITY	0.99+
128	QUANTITY	0.99+
September	DATE	0.99+
today	DATE	0.99+
Moore's Law	TITLE	0.99+
Adam Selipsky	PERSON	0.99+
PyTorch	TITLE	0.99+
Ray	ORGANIZATION	0.99+
second reason	QUANTITY	0.99+
64	QUANTITY	0.99+
each worker	QUANTITY	0.99+
Photoshop	TITLE	0.99+
UC Berkeley	ORGANIZATION	0.99+
Java	TITLE	0.99+
Shopify	ORGANIZATION	0.99+
OpenAI	ORGANIZATION	0.99+
Anyscale	PERSON	0.99+
third	QUANTITY	0.99+
two things	QUANTITY	0.99+
ByteDance	ORGANIZATION	0.99+
Spotify	ORGANIZATION	0.99+
One	QUANTITY	0.99+
95	QUANTITY	0.99+
Azure	ORGANIZATION	0.98+
one line	QUANTITY	0.98+
one GPU	QUANTITY	0.98+
ChatGPT	TITLE	0.98+
TensorFlow	TITLE	0.98+
last year	DATE	0.98+
first bucket	QUANTITY	0.98+
both	QUANTITY	0.98+
two layers	QUANTITY	0.98+
Cohere	ORGANIZATION	0.98+
Alipay	ORGANIZATION	0.98+
Ray	PERSON	0.97+
one	QUANTITY	0.97+
Instacart	ORGANIZATION	0.97+

Opening Panel | Generative AI: Hype or Reality | AWS Startup Showcase S3 E1

(light airy music) >> Hello, everyone, welcome to theCUBE's presentation of the AWS Startup Showcase, AI and machine learning. "Top Startups Building Generative AI on AWS." This is season three, episode one of the ongoing series covering the exciting startups from the AWS ecosystem, talking about AI machine learning. We have three great guests Bratin Saha, VP, Vice President of Machine Learning and AI Services at Amazon Web Services. Tom Mason, the CTO of Stability AI, and Aidan Gomez, CEO and co-founder of Cohere. Two practitioners doing startups and AWS. Gentlemen, thank you for opening up this session, this episode. Thanks for coming on. >> Thank you. >> Thank you. >> Thank you. >> So the topic is hype versus reality. So I think we're all on the reality is great, hype is great, but the reality's here. I want to get into it. Generative AI's got all the momentum, it's going mainstream, it's kind of come out of the behind the ropes, it's now mainstream. We saw the success of ChatGPT, opens up everyone's eyes, but there's so much more going on. Let's jump in and get your early perspectives on what should people be talking about right now? What are you guys working on? We'll start with AWS. What's the big focus right now for you guys as you come into this market that's highly active, highly hyped up, but people see value right out of the gate? >> You know, we have been working on generative AI for some time. In fact, last year we released Code Whisperer, which is about using generative AI for software development and a number of customers are using it and getting real value out of it. So generative AI is now something that's mainstream that can be used by enterprise users. And we have also been partnering with a number of other companies. So, you know, stability.ai, we've been partnering with them a lot. We want to be partnering with other companies as well. 
In seeing how we do three things, you know, first is providing the most efficient infrastructure for generative AI. And that is where, you know, things like Trainium, things like Inferentia, things like SageMaker come in. And then next is the set of models and then the third is the kind of applications like Code Whisperer and so on. So, you know, it's early days yet, but clearly there's a lot of amazing capabilities that will come out and something that, you know, our customers are starting to pay a lot of attention to. >> Tom, talk about your company and what your focus is and why the Amazon Web Services relationship's important for you? >> So yeah, we're primarily committed to making incredible open source foundation models and obviously Stable Diffusion's been our kind of first big model there, which we trained all on AWS. We've been working with them over the last year and a half to develop, obviously a big cluster, and bring all that compute to training these models at scale, which has been a really successful partnership. And we're excited to take it further this year as we develop commercial strategy of the business and build out, you know, the ability for enterprise customers to come and get all the value from these models that we think they can get. So we're really excited about the future. We got hugely exciting pipeline for this year with new modalities and video models and wonderful things and trying to solve images once and for all and get the kind of general value and value proposition correct for customers. So it's a really exciting time and very honored to be part of it. >> It's great to see some of your customers doing so well out there. Congratulations to your team. Appreciate that. Aidan, let's get into what you guys do. What does Cohere do? What are you excited about right now? >> Yeah, so Cohere builds large language models, which are the backbone of applications like ChatGPT and GPT-3. 
We're extremely focused on solving the issues with adoption for enterprise. So it's great that you can make a super flashy demo for consumers, but it takes a lot to actually get it into billion user products and large global enterprises. So about six months ago, we released our Command models, which are some of the best that exist for large language models. And in December, we released our multilingual text understanding models and that's on over a hundred different languages and it's trained on, you know, authentic data directly from native speakers. And so we're super excited to continue pushing this into enterprise and solving those barriers for adoption, making this transformation a reality. >> Just real quick, while I got you there on the new products coming out. Where are we in the progress? People see some of the new stuff out there right now. There's so much more headroom. Can you just scope out in your mind what that looks like? Like from a headroom standpoint? Okay, we see ChatGPT. "Oh yeah, it writes my papers for me, does some homework for me." I mean okay, yawn, maybe people say that, (Aidan chuckles) people excited or people are blown away. I mean, it's helped theCUBE out, it helps me, you know, speed up a little bit from my write-ups but it's not always perfect. >> Yeah, at the moment it's like a writing assistant, right? And it's still super early in the technology's trajectory. I think it's fascinating and it's interesting but its impact is still really limited. I think in the next year, like within the next eight months, we're going to see some major changes. You've already seen the very first hints of that with stuff like Bing Chat, where you augment these dialogue models with an external knowledge base. So now the models can be kept up to date to the millisecond, right? Because they can search the web and they can see events that happened a millisecond ago. 
But that's still limited in the sense that when you ask the question, what can these models actually do? Well they can just write text back at you. That's the extent of what they can do. And so the real project, the real effort, that I think we're all working towards is actually taking action. So what happens when you give these models the ability to use tools, to use APIs? What can they do when they can actually effect change out in the real world, beyond just streaming text back at the user? I think that's the really exciting piece. >> Okay, so I wanted to tee that up early in the segment 'cause I want to get into the customer applications. We're seeing early adopters come in, using the technology because they have a lot of data, they have a lot of large language model opportunities and then there's a big fast follower wave coming behind it. I call that the people who are going to jump in the pool early and get into it. They might not be advanced. Can you guys share what customer applications are being used with large language and vision models today and how they're using it to transform on the early adopter side, and how is that a tell sign of what's to come? >> You know, one of the things we have been seeing both with the text models that Aidan talked about as well as the vision models that stability.ai does, Tom, is customers are really using it to change the way you interact with information. You know, one example of a customer that we have, is someone who's kind of using that to query customer conversations and ask questions like, you know, "What was the customer issue? How did we solve it?" And trying to get those kinds of insights that was previously much harder to do. And then of course software is a big area. You know, generating software, making that, you know, just deploying it in production. Those have been really big areas that we have seen customers start to do. 
You know, looking at documentation, like instead of you know, searching for stuff and so on, you know, you just have an interactive way, in which you can just look at the documentation for a product. You know, all of this goes to where we need to take the technology. One of which is, you know, the models have to be there but they have to work reliably in a production setting at scale, with privacy, with security, and you know, making sure all of this is happening, is going to be really key. That is what, you know, we at AWS are looking to do, which is work with partners like Stability and others and in the open source and really take all of these and make them available at scale to customers, where they work reliably. >> Tom, Aidan, what's your thoughts on this? Where are customers landing on this first use cases or set of low-hanging fruit use cases or applications? >> Yeah, so I think like the first group of adopters that really found product market fit were the copywriting companies. So one great example of that is HyperWrite. Another one is Jasper. And so for Cohere, that's the tip of the iceberg, like there's a very long tail of usage from a bunch of different applications. HyperWrite is one of our customers, they help beat writer's block by drafting blog posts, emails, and marketing copy. We also have a global audio streaming platform, which is using us to power a search engine that can comb through podcast transcripts, in a bunch of different languages. Then a global apparel brand, which is using us to transform how they interact with their customers through a virtual assistant, two dozen global news outlets who are using us for news summarization. So really like, these large language models, they can be deployed all over the place into every single industry sector, language is everywhere. It's hard to think of any company on Earth that doesn't use language. So it's, very, very- >> We're doing it right now. We got the language coming in. >> Exactly. 
>> We'll transcribe this puppy. All right. Tom, on your side, what do you see the- >> Yeah, we're seeing some amazing applications of it and you know, I guess that's partly been, because of the growth in the open source community and some of these applications have come from there that are then triggering this secondary wave of innovation, which is coming a lot from, you know, controllability and explainability of the model. But we've got companies like, you know, Jasper, which Aidan mentioned, who are using Stable Diffusion for image generation in blog creation, content creation. We've got Lensa, you know, which exploded, and is built on top of Stable Diffusion for fine tuning so people can bring themselves and their pets and you know, everything into the models. So we've now got fine tuned Stable Diffusion at scale, which has democratized, you know, that process, which is really fun to see. Lensa, you know, exploded. You know, I think it was the largest growing app in the App Store at one point. And lots of other examples like NightCafe and Lexica and Playground. So seeing lots of cool applications. >> So many applications, we'll probably be a customer for all you guys. We'll definitely talk after. But the challenges are there for people adopting, they want to get into what you guys see as the challenges that turn into opportunities. How do you see the customers adopting generative AI applications? For example, we have massive amounts of transcripts, timed up to all the videos. I don't even know what to do. Do I just, do I code my API there. So, everyone has this problem, every vertical has these use cases. What are the challenges for people getting into this and adopting these applications? Is it figuring out what to do first? Or is it a technical setup? Do they stand up stuff, they just go to Amazon? What do you guys see as the challenges? 
>> I think, you know, the first thing is coming up with where you think you're going to reimagine your customer experience by using generative AI. We talked about, Aidan and Tom talked about a number of these ones and you know, you pick up one or two of these, to get that robust. And then once you have them, you know, we have models and we'll have more models on AWS, these large language models that Aidan was talking about. Then you go in and start using these models and testing them out and seeing whether they fit in use case or not. In many situations, like you said, John, our customers want to say, "You know, I know you've trained these models on a lot of publicly available data, but I want to be able to customize it for my use cases. Because, you know, there's some knowledge that I have created and I want to be able to use that." And then in many cases, and I think Aidan mentioned this. You know, you need these models to be up to date. Like you can't have it stale. And in those cases, you augment it with a knowledge base, you know you have to make sure that these models are not hallucinating. And so you need to be able to do the right kind of responsible AI checks. So, you know, you start with a particular use case, and there are a lot of them. Then, you know, you can come to AWS, and then look at one of the many models we have and you know, we are going to have more models for other modalities as well. And then, you know, play around with the models. We have a playground kind of thing where you can test these models on some data and then you can probably, you will probably want to bring your own data, customize it to your own needs, do some of the testing to make sure that the model is giving the right output and then just deploy it. And you know, we have a lot of tools. >> Yeah. >> To make this easy for our customers. >> How should people think about large language models? 
Because do they think about it as something that they tap into with their IP or their data? Or is it a large language model that they apply into their system? Is the interface that way? What's the interaction look like? >> In many situations, you can use these models out of the box. But in typical, in most of the other situations, you will want to customize it with your own data or with your own expectations. So the typical use case would be, you know, these models are exposed through APIs. So the typical use case would be, you know you're using these APIs a little bit for testing and getting familiar and then there will be an API that will allow you to train this model further on your data. So you use that API, you know, make sure you augment the knowledge base. So then you use those APIs to customize the model and then just deploy it in an application. You know, like Tom was mentioning, a number of companies that are using these models. So once you have it, then you know, you again, use an endpoint API and use it in an application. >> All right, I love the example. I want to ask Tom and Aidan, because like most my experience with Amazon Web Services in 2007, I would stand up an EC2, put my code on there, play around, if it didn't work out, I'd shut it down. Is that a similar dynamic we're going to see with the machine learning where developers just kind of log in and stand up infrastructure and play around and then have a cloud-like experience? >> So I can go first. So I mean, we're obviously, with AWS, working really closely with the SageMaker team, who do a fantastic platform there for ML training and inference. And you know, going back to your point earlier, you know, where the data is, is hugely important for companies. Many companies bringing their models to their data in AWS on-premise for them is hugely important. Having the models to be, you know, open source, makes them explainable and transparent to the adopters of those models. 
So, you know, we are really excited to work with the SageMaker team over the coming year to bring companies to that platform and make the most of our models. >> Aidan, what's your take on developers? Do they just need to have a team in place, if we want to interface with you guys? Let's say, can they start learning? What do they got to do to set up? >> Yeah, so I think for Cohere, our product makes it much, much easier to people, for people to get started and start building, it solves a lot of the productionization problems. But of course with SageMaker, like Tom was saying, I think that lowers a barrier even further because it solves problems like data privacy. So I want to underline what Bratin was saying earlier around when you're fine tuning or when you're using these models, you don't want your data being incorporated into someone else's model. You don't want it being used for training elsewhere. And so the ability to solve for enterprises, that data privacy and that security guarantee has been hugely important for Cohere, and that's very easy to do through SageMaker. >> Yeah. >> But the barriers for using this technology are coming down super quickly. And so for developers, it's just becoming completely intuitive. I love this, there's this quote from Andrej Karpathy. He was saying like, "It really wasn't on my 2022 list of things to happen that English would become, you know, the most popular programming language." And so the barrier is coming down- >> Yeah. >> Super quickly and it's exciting to see. >> It's going to be awesome for all the companies here, and then we'll do more, we're probably going to see explosion of startups, already seeing that, the maps, ecosystem maps, the landscape maps are happening. So this is happening and I'm convinced it's not yesterday's chat bot, it's not yesterday's AI Ops. It's a whole another ballgame. So I have to ask you guys for the final question before we kick off the company's showcasing here. 
How do you guys gauge success of generative AI applications? Is there a lens to look through and say, okay, how do I see success? It could be just getting a win or is it a bigger picture? Bratin we'll start with you. How do you gauge success for generative AI? >> You know, ultimately it's about bringing business value to our customers. And making sure that those customers are able to reimagine their experiences by using generative AI. Now the way to get there is, of course, to deploy those models in a safe, effective manner, and ensuring that all of the robustness and the security guarantees and the privacy guarantees are all there. And we want to make sure that this transitions from something that's great demos to actual at scale products, which means making them work reliably all of the time not just some of the time. >> Tom, what's your gauge for success? >> Look, I think this, we're seeing a completely new form of ways to interact with data, to make data intelligent, and directly to bring in new revenue streams into business. So if businesses can use our models to leverage that and generate completely new revenue streams and ultimately bring incredible new value to their customers, then that's fantastic. And we hope we can power that revolution. >> Aidan, what's your take? >> Yeah, reiterating Bratin and Tom's point, I think that value in the enterprise and value in market is like a huge, you know, it's the goal that we're striving towards. I also think that, you know, the value to consumers and actual users and the transformation of the surface area of technology to create experiences like ChatGPT that are magical and it's the first time in human history we've been able to talk to something compelling that's not a human. I think that in itself is just extraordinary and so exciting to see. >> It really brings up a whole another category of markets. B2B, B2C, it's B2D, business to developer. 
Because I think this is kind of the big trend the consumers have to win. The developers coding the apps, it's a whole another sea change. Reminds me everyone use the "Moneyball" movie as example during the big data wave. Then you know, the value of data. There's a scene in "Moneyball" at the end, where Billy Beane's getting the offer from the Red Sox, then the owner says to the Red Sox, "If every team's not rebuilding their teams based upon your model, they'll be dinosaurs." I think that's the same with AI here. Every company will have to need to think about their business model and how they operate with AI. So it'll be a great run. >> Completely agree. >> It'll be a great run. >> Yeah. >> Aidan, Tom, thank you so much for sharing about your experiences at your companies and congratulations on your success and it's just the beginning. And Bratin, thanks for coming on representing AWS. And thank you, appreciate for what you do. Thank you. >> Thank you, John. Thank you, Aidan. >> Thank you John. >> Thanks so much. >> Okay, let's kick off season three, episode one. I'm John Furrier, your host. Thanks for watching. (light airy music)

Published Date : Mar 9 2023

ENTITIES

Entity	Category	Confidence
John	PERSON	0.99+
Tom	PERSON	0.99+
Tom Mason	PERSON	0.99+
Aidan	PERSON	0.99+
Red Sox	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Andrej Karpathy	PERSON	0.99+
Bratin Saha	PERSON	0.99+
December	DATE	0.99+
2007	DATE	0.99+
John Furrier	PERSON	0.99+
Aidan Gomez	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Amazon Web Services	ORGANIZATION	0.99+
Billy Beane	PERSON	0.99+
Bratin	PERSON	0.99+
Moneyball	TITLE	0.99+
one	QUANTITY	0.99+
Ada	PERSON	0.99+
last year	DATE	0.99+
two	QUANTITY	0.99+
Earth	LOCATION	0.99+
yesterday	DATE	0.99+
Two practitioners	QUANTITY	0.99+
ChatGPT	TITLE	0.99+
next year	DATE	0.99+
Code Whisperer	TITLE	0.99+
third	QUANTITY	0.99+
this year	DATE	0.99+
App Store	TITLE	0.99+
first time	QUANTITY	0.98+
first	QUANTITY	0.98+
Inferentia	TITLE	0.98+
EC2	TITLE	0.98+
GPT-3	TITLE	0.98+
both	QUANTITY	0.98+
Lensa	TITLE	0.98+
SageMaker	ORGANIZATION	0.98+
three things	QUANTITY	0.97+
Cohere	ORGANIZATION	0.96+
over a hundred different languages	QUANTITY	0.96+
English	OTHER	0.96+
one example	QUANTITY	0.96+
about six months ago	DATE	0.96+
One	QUANTITY	0.96+
first use	QUANTITY	0.96+
SageMaker	TITLE	0.96+
Bing Chat	TITLE	0.95+
one point	QUANTITY	0.95+
Trainium	TITLE	0.95+
Lexica	TITLE	0.94+
Playground	TITLE	0.94+
three great guests	QUANTITY	0.93+
HyperWrite	TITLE	0.92+

Kevin Zawodzinski, Commvault & Paul Meighan, Amazon S3 & Glacier | AWS re:Invent 2022

(upbeat music) >> Welcome back friends. It's theCUBE LIVE in Las Vegas at the Venetian Expo, covering the first full day of AWS re:Invent 2022. I'm Lisa Martin, and I have the privilege of working much of this week with Dave Vellante. >> Hey. Yeah, it's good to be with you Lisa. >> It's always good to be with you. Dave, this show is, I can't say enough about the energy. It just keeps multiplying as I've been out on the show floor for a few minutes here and there. We've been having great conversations about cloud migration, digital transformation, business transformation. You name it, we're talking about it. >> Yeah, and I got to say the soccer Christians are really happy. (Lisa laughing) >> Right? Because the USA made it through. So that's a lot of additional excitement. >> That's true. >> People were crowded around the TVs at lunchtime. >> They were, they were. >> So yeah, but back to data. >> Back to data. We have a couple of guests here. We're going to be talking a lot with customer challenges, how they're helping to overcome them. Please welcome Kevin Zawodzinski, VP of Sales Engineering at COMMVAULT. >> Thank you. >> And Paul Meighan, Director of Product Management at AWS. Guys, it's great to have you on the program. Thank you for joining us. >> Thanks for having us. >> Thanks for having us. >> Isn't it great to be back in person? >> Paul: It really is. >> Kevin: Hell, yeah. >> You cannot replicate this on virtual, you just can't. It's nice to see how excited people are to be back. There's been a ton of buzz on our program today about Adam's keynote this morning. Amazing. A lot of synergies with the direction, Paul, that AWS is going in and where we're seeing its ecosystem as well. Paul, first question for you. Talk about, you know, in the customer environment, we know AWS is very customer obsessed. 
Some of the main challenges customers are facing today as they really continue this business transformation, this digital transformation, and they move to cloud native apps. What are some of those challenges and how do you help them eradicate those? >> Well, I can tell you that the biggest contribution that we make is really by focusing on the fundamentals when it comes to running storage at scale, right? So Amazon S3's unique, distributed architecture, you know, it really does deliver on those fundamentals of durability, availability, performance, security and it does it at virtually unlimited scale, right? I mean, you guys have talked to a lot of storage folks in the industry and anyone who's run an estate at scale knows that doing that and executing on those fundamentals day after day is just super hard, right? And so we come to work every day, we focus on the fundamentals, and that focus allows customers to spend their time thinking about innovation instead of on how to keep their data durably stored. >> Well, and you guys both came out of the storage world. >> Right. >> Yeah, yeah. >> It was a box world, (Kevin laughs) and it ain't no more. >> Kevin: That's right, absolutely. >> It's a service and a service of scale. >> Kevin: Yeah. So architecture matters, right? >> Yeah. >> Yeah. >> Paul, talk a little bit about, speaking of innovation, talk about the evolution of S3. It's been around for a while now. Everyone knows it, loves it, but how has AWS architected it to really help meet customers where they are? >> Paul: Right. >> Because we know, again, there's that customer first focus. You write the press release down the road, you then follow that. How is it evolving? >> Well, I can tell you that architecture matters a lot and the architecture of Amazon S3 is pretty unique, right? I think, you know, the most important thing to understand about the architecture of S3 is that it is truly a regional service. 
So we're laid out across a minimum of 3 Availability Zones, or AZs, which are physically separated and isolated and have a distance of miles between them to protect against local events like floods and fires and power interruption, stuff like that. And so when you give us an object, we distribute that data across that minimum of 3 Availability Zones and then within multiple devices within each AZ, right? And so what that means is that when you store data with us, your data is on storage that's able to tolerate the failure of multiple devices with no impact to the integrity of your data, which is super powerful. And then again, super hard to do when you're trying to roll your own. So that's sort of a, like an overview of the architecture. In terms of how we think about our roadmap, you know, 90% of our roadmap comes directly from what customers tell us matters, and that's a tenet of how we think about customer obsession at AWS and it really is how we drive a roadmap. >> Right, so speaking of customers Kevin, what are customers asking you guys- >> Yeah. >> for, how does it relate to what you're doing with S3? >> Yeah, it's a wonderful question and one that is actually really appropriate for us being at re:Invent, right? So we got, last three years we've had customers here with us on stage talking about it. First of all, 3 years ago we did a virtual session, unfortunately, but glad to be back as you mentioned, with Coca-Cola and theirs was about scale and scope and really about how can we protect hundreds of thousands of objects, petabytes of data, in a simple and secure way, right. Then last year we actually met with ACT, Inc. as well and co-presented with them and really talked about how we could protect modern workloads and their modern workloads around whether it was Aurora or as well as EKS and how they continue to evolve as well. 
And, last but not least it's going to be, this year we're talking with Illinois State University as well about how they're going to continue to grow, adapt and really leverage AWS and ourselves to further their support of their teachers and their staff. So that is really helping us quite a bit to continue to move forward. And the things we're doing, again, with our customer base it's really around, focused on what's important to them, right? Customer obsession, how are we working with that? How are we making sure that we're listening to them? Again, working with AWS to understand how can we evolve together and really ultimately their journeys. As you heard, even with those 3 examples they're all very different, right? And that's the point, is that everybody's at a different point in the journey. They're at a different place from a modernization perspective. So we're helping them evolve, as they're helping us evolve as well, and transform with AWS. >> So very mature COMMVAULT stack, the S3 bucket and all the other capabilities. Paul, you just talked about coming together- >> Right. >> Dave: for your customers. >> Yeah, yeah, absolutely. And just, you know, we were talking the other day, Paul and I were talking the other day, it's been, you know, we've worked with AWS, with integration since 2009, right? So a long time, right? I mean, for some that may not seem like a long time ago, but it is, right? It's, you know, over a decade of time and we've really advanced that integration considerably as well. >> What are some of the things that, I don't know if you had a chance to see the keynote this morning? >> Yeah, a little bit. >> What are some of the things that there was, and in fact this is funny, funny data point for you on data. One of my previous guests told me that Adam Selipsky spent exactly 52 minutes talking about data this morning. 52 minutes. >> Okay. >> That there's a data point. 
But talk about some of the things that he talked about, the direction AWS is going in, obviously new era in the last year. Talk about what you heard and how you think that will evolve the COMMVAULT-AWS relationship. >> Yeah, I think part of that is about flexibility, as Paul mentioned too, architecture matters, right? So as we evolve and some of the things that we pride ourselves on is that we developed our systems and our software and everything else to not worry about what do I have to build to today but how do I continue to evolve with my customer base? And that's what AWS does, right? And continues to do. So that's really how we would see the data environment. It's really about that integration. As they grow, as they add more features we're going to add more features as well. And we're right there with them, right? So there's a lot of things that we also talk about, Paul and I talk about, around, you know, how do we, like Graviton3 was brought up today around some of the innovations around that. We're supporting that with Auto Scale right now, right? So we're right there releasing, right when AWS releasing, co-developing things when necessary as well. >> So let's talk about security a little bit. First of all, what is COMMVAULT, right? You're not a security company but you're an adjacency to security. It's sort of, we're rethinking security. >> Kevin: Yep. >> including data protection, not a bolt-on anymore. You guys both have a background in that world and I'm sure that resonates. >> Yeah. >> So what is the security play here? What role does COMMVAULT play? I think we know pretty well what role AWS plays, but love to hear, Paul, your thoughts as well on security. >> Yeah, I'll start I guess. >> Go on Paul. >> Okay. Yeah, so on the security side of things, there's a quite a few things. So again, on the development side of things, we do things like file anomaly detection, so seeing patterns in data. 
We talked a lot about analytics as well in the keynote this morning. We look at what is happening in the customer environment, if there's something odd or out of place that's happening, we can detect that and we'll notify people. And we've seen that, we have case studies about that. Other things we do are simple, simple but elegant. Is with our security dashboard. So we'll use our security dashboard to show best practices. Are they using Multi-Factor Authentication? Are you viewing password complexity? You know, things like that. And allows people to understand from a security landscape perspective, how do we layer in protection with their other systems around security. We don't profess to be the security company, or a security company, but we help, you know, obviously add in those additional layers. >> And obviously you're securing, you know, the S3 piece of it. >> Mmmhmm. >> You know, from your standpoint because building it in. >> That's right. And we can tell you that for us, security is job zero. And anyone at AWS will tell you that, and not only that but it will always be our top priority. Right from the infrastructure on down. We're very focused on our shared responsibility model where we handle security from the hypervisor, or host operating system level, down to the physical security of the facilities in which our services run and then it's our customer's responsibility to build secure applications, right. >> Yeah. And you talk about Graviton earlier, Nitro comes into play and how you're, sort of, fencing off, you know, the various components of the system from the operating system, the VMs, and then that is designed in and that's a new evolution that it comes as part of the package. >> Yeah, absolutely. >> Absolutely. >> Paul, talk a little bit about, you know, security, talking about that we had so many conversations this year alone about the threat landscape and how it's dramatically changing, it's top of mind for everybody. 
Huge rise in ransomware attacks. Ransomware is now, when are we going to get hit? How often? What's the damage going to be? Rather than, are we going to get hit? It's, unfortunately it's progressed in that direction. How does ensuring data security impact how you're planning the roadmap at AWS and how are partners involved in shaping that? >> Right, so like I said, you know, 90% of our roadmap comes from what customers tell us matters, right? And clearly this is an issue that matters very much to customers right now, right? And so, you know, we're certainly hearing that from customers, and COMMVAULT, and partners like COMMVAULT have a big role to play in helping customers to secure and protect their applications, right? And that's why it's so critical that we come together here at re:Invent and we have a bunch of time here at the show with the COMMVAULT technical folks to talk through what they're hearing from customers and what we're hearing. And we have a number of regular touch points throughout the year as well, right? And so what COMMVAULT gets from the relationship is, sort of, early access and feedback into our features and roadmap. And what we get out of it really is that feedback from that large number of customers who interface with Amazon S3 through COMMVAULT. Who are using S3 as a backup target behind COMMVAULT, right? And so, you know, that partnership really allows us to get close to those customers and understand what really matters to them. >> Are you doing joint engineering, or is it more just, hey here you go COMMVAULT, here's the tools available, go, go build. Can you address that? >> Yeah, no, absolutely. There's definitely joint engineering like even things around, you know, data migration and movement of data, we integrate really well and we talk a lot about, hey, what are you, like as Paul mentioned, what are you seeing out there? 
We actually, I just left a conversation about an hour ago where we're talking about, you know, where are we seeing placement of data and how does that matter to, do you put it on, you know, instant access, or do you put it on Glacier, you know, what should be the best practices? And we tell them, again, some of the telemetry data that we have around what do we see customers doing, what's the patterns of data? And then we feed that back in and we use that to create joint solutions as well. >> You know, I wonder if we could talk about cloud, you know, optimization of cloud costs for a minute. That's obviously a big discussion point in the hallways with customers. And on your earnings call you guys talked about specifically some customers and they specifically mentioned, for example, pushing storage to lower cost tiers. So you brought up Glacier just then. What are you seeing in the field in that regard? How are customers taking advantage of that? And where does COMMVAULT play in, sort of, helping make that decision? >> You want to take part one or you want me to take it? >> I can take part one. I can tell you that, you know, we're very focused on helping customers optimize costs, however necessary, right? And, you know, we introduced Intelligent-Tiering here at the show in 2019 and since launch it's helped customers to reduce costs by over $750 million, right? So that's a real commitment to optimizing costs on behalf of customers. We also launched, you know, later in 2020, Glacier Deep Archive, which is the lowest cost storage in the cloud. So it's an important piece of the puzzle, is to provide those storage options that can allow customers to match the workloads that are, that need to be on colder storage to the appropriate store. >> Yeah, and so, you know, S3 is not this, you know, backup and recovery system, not an archiving system and, you know, in terms of, but you have that intelligence in your platform. 
'Cause when I heard that from the earnings call I was like, okay, how do customers then go about deciding what they can, you know, when it's all good times, like yeah, who cares? You know, just go, go, go. But when you got to tighten the belt, how do you guys? >> Yeah, and that goes back to understanding the data pattern. So some of that is we have intelligence and artificial intelligence and everything else and machine learning within our, so we can detect those patterns, right? We understand the patterns, we learn from that and we help customers right size, right. So ultimately we do see a blend, right? As Paul mentioned, we see, you know, hey I'm not going to put everything on Glacier necessarily upfront. Maybe they are, it all depends on their workloads and patterns. So we use the data that we collect from the different customers that we have to share those best practices out and create, you know, the right templates, so to speak, in ways for people to apply it. >> Guys, great joint, you talked about the joint engineering, joint go to market, obviously a very strong synergistic partnership between the two. A lot of excitement. This is only day one, I can only imagine what's going to be coming the next couple of days. But I have one final question for you, but I have same question for both of you. You had the chance to create your own bumper sticker, so you get a shiny new car and for some reason you want to put a bumper sticker on it. About COMMVAULT, what would it say? >> Yeah, so for me I would say comprehensive, yet simple, right? So ultimately about giving you all the bells and whistles but if you want to be very simple we can help you in every shape and form. >> Paul, what's your bumper sticker say about AWS? >> I would say that AWS starts with the customer and works backwards from there. >> Great one. >> Excellent. Guys- >> Kevin: Well done. >> it's been a pleasure to have you on the program. Thank you- >> Kevin: Thank you. 
>> for sharing what's going on, the updates on the AWS-COMMVAULT partnership and what's in it for customers. We appreciate it. >> Dave: Thanks you guys. >> Thanks a lot. >> Thank you. >> All right. For our guests and Dave Vellante, I'm Lisa Martin. You're watching theCUBE, the leader in live enterprise and emerging tech coverage. (upbeat music)
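
The tiering discussion in this conversation (instant access versus Glacier, and matching workloads to colder storage based on observed access patterns) can be illustrated with a small sketch. This is only an illustration of the idea, not Commvault's or AWS's actual logic: the function name `suggest_storage_class` and the 30-day and 180-day thresholds are assumptions invented for this example. The returned strings are S3 storage class identifiers.

```python
def suggest_storage_class(days_since_last_access: int,
                          needs_instant_retrieval: bool) -> str:
    """Suggest an S3 storage class from a very simple access pattern.

    The thresholds below (30 and 180 days) are illustrative assumptions,
    not recommendations from AWS or Commvault.
    """
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    # Recently used data stays on the default, frequently accessed tier.
    if days_since_last_access < 30:
        return "STANDARD"
    # Cold data that must still come back immediately.
    if needs_instant_retrieval:
        return "GLACIER_IR"
    # Cold data that can tolerate a retrieval delay.
    if days_since_last_access < 180:
        return "GLACIER"
    # Very cold data goes to the lowest-cost archive tier.
    return "DEEP_ARCHIVE"
```

In practice a backup product would derive the inputs from telemetry rather than take them as arguments; the point here is only that the placement decision is a function of the access pattern.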

Published Date : Nov 30 2022


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Lisa Martin	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Kevin Zawodzinski	PERSON	0.99+
Paul	PERSON	0.99+
Paul Meighan	PERSON	0.99+
Adam Selipsky	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Adam	PERSON	0.99+
Kevin	PERSON	0.99+
Dave	PERSON	0.99+
90%	QUANTITY	0.99+
2019	DATE	0.99+
Lisa	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
3 Availability Zones	QUANTITY	0.99+
last year	DATE	0.99+
2009	DATE	0.99+
Las Vegas	LOCATION	0.99+
ACT, Inc.	ORGANIZATION	0.99+
3 examples	QUANTITY	0.99+
Glacier	ORGANIZATION	0.99+
52 minutes	QUANTITY	0.99+
both	QUANTITY	0.99+
Illinois State University	ORGANIZATION	0.99+
One	QUANTITY	0.99+
two	QUANTITY	0.99+
first question	QUANTITY	0.99+
over $750 million	QUANTITY	0.99+
3 years ago	DATE	0.99+
S3	TITLE	0.99+
this year	DATE	0.98+
COMMVAULT	ORGANIZATION	0.98+
each	QUANTITY	0.98+
Commvault	PERSON	0.98+
first	QUANTITY	0.97+
one final question	QUANTITY	0.97+
hundreds of thousands of objects	QUANTITY	0.97+

Poojan Kumar, Clumio & Paul Meighan, Amazon S3 | AWS re:Invent 2022

>>Good afternoon and welcome back to the classiest show in technology. This is theCUBE, and we are at AWS re:Invent 2022 in fabulous Sin City. That's why I've got my sequins on. We love a little Vegas, don't we? I'm joined by John Furrier, another Vegas fan. >>I don't have my sequins, I left them in my room. >>We're gonna have to figure out how to get us 20 as soon as possible. What's been the biggest shock for you at the show so far? >>Well, I think the data story and security is so awesome. I love how that's front and center. If you look at the minutes of the keynote of Adam Selipsky, the CEO, on day one, it's all bulked into data and security. All worked hand in hand. That's on top of already the innovation of their infrastructure. So I think you're gonna see a lot of interplay going on in this next segment. It's gonna tell a lot of that innovation story that's coming next. It's pretty awesome. >>It is pretty awesome, and I'm super excited. It's not only what we do here on theCUBE, it's also in my show notes. We are gonna be geeking out for the next segment. Please welcome Paul and Poojan. Wonderful to have you both here. Paul from Amazon S3 and Glacier, and Poojan, CEO of Clumio. I wanna turn to you, Poojan, to start us off. Just in case the audience isn't familiar, give us the Clumio pitch. >>Yeah, so basically Clumio is a backup-as-a-service offering, right? Built on AWS, for AWS, right? And effectively going after, you know, any service that a customer uses on top of AWS, right? And so a lot of the data is sitting on S3, right? So that's been like our big use case, going and basically building backup and air gap protection for S3. But we basically go to every other service, EC2, EBS, Dynamo, you know, you name it, right? So we basically do the whole thing. >>And the relationship with AWS. Can you guys share, I mean, you got you here together. You guys are a great partnership. Born in the cloud, operation in the cloud. Absolutely.
I think talk about the partnership with AWS. >>Absolutely. I think the last five years of building on AWS has been phenomenal, right? And I love the platform. It's a very pure platform for us. You know, the APIs, and the access you get to the service teams like Paul sitting here and the other teams we have gotten access to, I think has been phenomenal. But we also have, I would say, pushed the envelope in terms of how innovative we have been and how aggressive we have been in utilizing all the innovation that AWS has built in over the last few years. But it would not have happened without the fantastic partnership with the service teams. >>Paul, talk about the S3 part of this. What's the story there? >>Well, it's been great working with the Clumio team over the course of the last few years. We were just upstairs diving deep into the features that they're taking advantage of. They really push us hard on behalf of customers, and it's just been a great relationship over the last years. >>That's awesome. And the ecosystem at such a, we're gonna hear tomorrow, the keynote from Ruba, who's gonna tend over the ecosystem. You guys are working together. There's a lot of strategic partnerships, so much collaboration between you guys that makes it very, this is the next gen cloud environment we're seeing. And you heard, the economy's around the corner. It's still gonna be challenging, but still there's more growth in the cloud. This is not stopping. This impacts the customers. What are the customers saying to you guys when you work backwards from their needs? They want it faster, easier, cheaper. They want it more integrated. What are some of the things you guys are hearing from customers? >>So for us, you know, if you think about it, like, you know, as people are moving to the cloud, especially like take a use case like S3, right? So much of critical data is sitting on top of S3 today.
And so what folks have realized is that as they're, you know, putting all of those, you know, what, over 250 trillion objects, you know, sitting on S3, a lot of them need backup and data protection because there could be accidental deletions, there could be software bugs, there could be a ransomware type event due to which you need a second copy of the data that is outside of your security domain, right? But again, that needs to be done at the right price point, right? And that's where a technology like Clumio comes in, because since we've been built on the cloud, we've optimized it correctly. So especially for folks who are very cost conscious, given the macroeconomic conditions we are heading into, a technology that's built correctly so that, you know, you get the right architecture and the right solution at the right price point and the scale, right? Talking about trillions of objects, billions of objects within a single customer, within a single bucket sometimes. And that's where Clumio comes in. 'Cause we basically do that at scale without, again, impacting the customer's wallet more than it needs to. >>The porridge has to be the right temperature and the right size bowl. With the right spoon. You've got a lot of complexity when it comes to solving those customer challenges. You have a couple customer story examples you're allowed to share with us. Correct? Paul, do you want to kick one off? Go ahead. Oh, Poojan. All right. >>No, absolutely. I think there's a ton of them. I'll talk about, you know, want to begin with like Cox Automotive, right? A phenomenal customer that all of us have worked together with. And again, looking for a solution to back up S3, to essentially go air gap protection outside of their account, right? They looked at doing it themselves, right? They thought they'll go and basically do it themselves.
And then they fortunately bumped into Clumio, they looked at our architecture, looked at what it would really take to go and build it. And guess what, sitting in 2022, getting to 23 right now, nobody wants to go and build this themselves. They actually want a turnkey solution that just does it, right? And so, again, a phenomenal joint customer of ours doing this at a pretty massive scale, right? And there are many more like that. There's Warner Brothers that are essentially going into the cloud from on premises, right? And they're going really fast, accelerating the usage on AWS, again, looking at, you know, backup and data protection and using Clumio because of the extreme simplicity that we provide. >>Yeah, I think you've got a lot of different people solving different problems that you're working with all the time. Millions of customers. Well, how do you prioritize? >>Well, for us, it really all comes down to fundamentals, right? So Amazon S3's unique distributed architecture delivers industry leading durability, availability, performance and security at virtually unlimited scale, right? And it's really been delivering on the fundamentals that has earned the trust of so many customers of all sizes and industries over the course of over 16 years. Now, in terms of how we prioritize on behalf of those customers, we always say that 90% of our roadmap comes directly from what customers are telling us is important. And a large number of our customers now are using S3 through Clumio, which is why the relationship is so important. We're here talking about customer use cases here at the show, and we do that regularly throughout the year as well. And that's how we land on a roadmap. >>And what are the top stories from customers? What are they telling you? What's the number one, top three things you're hearing? >>I tell you, like, again, it just comes down to the fundamentals, right?
Of security, availability, durability and performance at virtually unlimited scale. Like that is the first discussion that we have with customers talking about durable storage, for sure. >>What I find interesting in, you mentioned scale, right? That comes up a lot, scale with data. Yeah. That we heard data. The big theme here, security, what's in my S3 bucket? Can you find out what's in there? Is it backed up properly? How do I get it back? Where's the ransomware? Why not just target the ransomware? So how do you navigate the security challenges, the need to store all that scale data? What's the secret sauce? >>Yeah, so I think the big thing is we'll start with, you know, how we have architected the product, right? If you think about it, you're dealing with a lot of scale, right? You get to a hundred million, a billion and billions very fast on S3, especially on a cloud native application. So it starts with the visibility, right? It's basically about, like we have things where you create a subset of your buckets called protection groups that you can essentially, you know, do based on prefixes. So now you can essentially figure out what prefix you want to back up and what you don't want to back up. Maybe there's log data that you don't care about, so you don't back that up, right? And it all starts with that visibility that you give, and the prefix-level data protection. Then comes the scale, which is what I was telling you about, right? We have basically built an orchestration engine, right? It's like we call the ES for Lambdas, right? So we have an internal orchestration engine and essentially what we have done is we have our own language internally that spawns off these Lambdas, right? And they go after these S3 partitions, do the right things, and then you basically reel them back. So things like that that we do are not possible if you're not built on the cloud.
>>Well also, I mean, just mind blowing, and go back 10 years. Yeah. I mean you got Lambda. What you're talking about here is the gift of the cloud innovation. Yeah. So the benefit of S3 is now accelerated. This is the story this year. Yeah. I mean they're highlighting it at scale, not just in the data, but like what we knew when Lambda came out and what S3 could do. But now mainstream solutions are coming in. Does that change your backup plans? Because we're gonna see a lot more end to end, a lot more solutions. We heard that on the keynote. Some are saying it's more complexity. Of course it might, but you can abstract it away with the cloud, that's the best part of the cloud. So these abstraction layers. So what's your view on that? But I wanna get your thoughts because you guys are perfectly positioned for this scale, but there's more coming. Yes. Yes. Exactly. How are you looking at that? >>So again, I think, you know, obviously the S3 teams and every team in AWS is basically pushing the envelope in terms of innovation. But the key for a partner like us is to go and take that innovation. A lot of complex architectures behind the scenes. But what you deliver to the customer is simple. I'll give you one more example. One of the things we launched that, you know, Paul and others are very excited about, is this ability to do instant access on the backup, right? So you could have billions of objects that you backed up. Maybe you need just 10,000 of them for a DR test. And we can basically create like an instant virtual bucket on top of that backup that you can instantly restore. >>Spinning up a sandbox of temporary data to go check it out. >>Exactly. Offer an inte application. >>Think we're geeking out right now. >>Yeah, I know. Brought that part of the segment, John. Don't worry, we're safely there. >>But that's the thing, right?
All that is possible because of all the scale and innovation and all the APIs and everything that, you know, Paul and the team give us that we go and build on top of. >>Paul, geek out with us on this. >>We are super excited for instant restore. >>For restore. I mean, automation, programmability. >>It is, I mean it's the logical next step for backup in the cloud. Exactly. Yeah. But it's a super hard engineering problem to go solve for customers. I mean, the RTO benefits alone are super compelling, but then there's a cost element as well of not having to bring back all that stuff for a test restore, for example. And so it's been really great to work with the team on that. We have some ideas on how we may help solve it from our side, and we're looking forward to collaborating on it. >>This is a great illustration of what I was writing about this week around the classic cloud, which is great. And as Adam said, and he likes to use the word "and", you got this new functionality we're seeing emerge from the growth. Yes. From the companies that are built on Amazon Web Services that are growing. You're a partner, they have a lot of other partners and people are taking over restaurant here off action. I mean, there's real growth and new functionality on top of AWS. You guys are no different. Are you prepared for that? Are you ready to go? >>Yeah, no, absolutely. And I think if you think about it, right, I think it's also about doing this without impacting the primary application. Like if the customer is running a primary application at scale on S3, a backup application like ours can't come in and really mess with that. So I think being able to do things where, and this is where you solve really hard computer science problems, right? Where you're throttling yourself. If you are essentially seeing any kind of, you know, interfering with the primary, you're going to cut yourself down. You're gonna go after a different partition.
So there are a lot of things you need to do behind the scenes, which is again, all the complexity, all of that, but deliver to the customer a very, very simple thing. >>You know, Paul, I wanna get your thoughts and I want you to chime in. Yeah. In 2014, I interviewed Stephen Schmidt, my first interview with him. He was the CISO then, and now he's the CSO. Back at that time, the word was the cloud's not secure. Now we're talking about security. Just in the complexity of how you're partitioning and managing your sub portions, how you explained it, it's harder for the attackers. The cloud in its architecture has become a more secure environment. Yeah. Well, and getting more secure as you have been laying out. This is a new dynamic. This is good. Can you explain the, >>I mean, I can just tell you that at AWS security is job zero and that it will always be our number one priority, right? We have an infrastructure under AWS that is vetted and approved to run even top secret workloads, which benefits all customers in all regions. >>And your security posture is embedded on top of that. And you got your own stuff. >>Yeah. And if you think of it as a shared responsibility model, so security of the cloud is the responsibility of the cloud provider, but then security of the data on top of it. Like you go and delete stuff, your software goes and does something, that resiliency, the integrity of the data is your responsibility as a customer. And that's where, you know, we come in. >>Shared responsibility has been such a hot topic all week. Yeah. >>I gotta ask him one more question. 'Cause this is fascinating. And we are talking about it on theCUBE all day today after we saw the announcement and Adam Selipsky's comment on the keynote. I mean, he said, if you're gonna tighten your belt, meaning economic cost recovery, rightsizing. If you want to tighten your belt, come to the cloud.
So I have to ask you guys, Poojan, if you can comment, that'd be great. There's a lot of other competitors out there that aren't born on AWS. What is the customer gonna do when they tighten the belt? What does that mean? They're gonna go to the individual contracts. They're gonna work in the marketplace. I mean, there's a new dynamic in town. It's called AWS 2022. They weren't really around much in the recession of 2008. They were just starting to grow. Now they're an economic force. People like yourselves have embedded in there. There's a lot of competition. What's gonna happen? >>I think people are gonna just go to a place like, you know, AWS Marketplace. You're going to essentially look for solutions, and the right solutions built in are going to be self-service like AWS. It's a very self-service thing. A hundred percent. So you go and do self-service, you figure out what's working, what's not working. Also, the model has to be consumption oriented. No longer can you expect the customer to go and pay a bunch of money for shelfware, right? It's like how we charge, how AWS charges, which is you pay for what you consume. That all has to be front and center. >>Right? I think that's a really, I think that's a really important >>Point. It's time. >>And I think it's time. So we have a new challenge on theCUBE. We give you 30 seconds roughly to give us your extraordinarily hot take, your shining thought leadership moment, and highlight what you think is the most important takeaway from the show. The biggest soundbite, the juiciest announcement. Paul, I'll >>Start with an Instagram Reel, basically. Yeah. Okay. >>Yeah. Hi. Go. I would just say from an S3 perspective, over the course of the last several years, we've really seen workloads shift from just backup and recovery and static images on websites to data lake analytics applications. And you continue to see that here.
And I can tell you that some of these scaled applications are running at enormous, mind-blowing scale, right? And so every year we come here, we talk to customers, and every year it sort of blows me away. And I've been in the storage industry for a long time and it just blows me away. Just the scale that customers are running at. >>Mind-blowing scale. And when it comes to backup, let me just say that it's easy to back up and recover a single object, but doing an easy thing a billion or 10 billion times over, that's actually quite hard. >>And just to bold that a little bit, just pull out my highlighter. S3 now has over 280 trillion objects. That's a lot. >>That's a lot of objects. >>Yeah. You are not kidding. When you talk about scale, I mean, this is the most scalable. >>That's not solution's not there. Yeah. That's right. And we have a culture of durability and we wake up every single day to raise the bar on the fundamentals and make sure that every single one of those objects is protected and safe. >>Okay. >>I can't imagine worrying about 280 trillion different things. >>Let's go. Your Instagram Reel. >>For me again, you know, between S3 and us, we are two players out there that are really, you know, processing the data at the end of the day, right? And so I'm very excited about, you know, what we are going to do more and more with the instant restore capability, where we can integrate third party services on top of it that can do more things with the data that is not passively sitting, but now becomes active data that you can analyze and do things with. So that's something where we take this to the next level, and it's something that I'm super excited about. >>There's a lot to be excited about, and we're excited to have you. We're excited to hear what happens next. Excited to see more collaboration like this. Paul, Poojan, thank you so much for joining us here on the show.
Thank you all for tuning in to our continuous, wall-to-wall, super-thrilling live coverage of AWS re:Invent here in fabulous Las Vegas, Nevada, with John Furrier. I'm Savannah Peterson. We're theCUBE, the leading source for high tech coverage.

Published Date : Nov 29 2022

Breaking Analysis: Databricks faces critical strategic decisions…here’s why

>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. >> Spark became a top level Apache project in 2014, and then shortly thereafter, burst onto the big data scene. Spark, along with the cloud, transformed and in many ways, disrupted the big data market. Databricks optimized its tech stack for Spark and took advantage of the cloud to really cleverly deliver a managed service that has become a leading AI and data platform among data scientists and data engineers. However, emerging customer data requirements are shifting into a direction that will cause modern data platform players generally and Databricks, specifically, we think, to make some key directional decisions and perhaps even reinvent themselves. Hello and welcome to this week's wikibon theCUBE Insights, powered by ETR. In this Breaking Analysis, we're going to do a deep dive into Databricks. We'll explore its current impressive market momentum. We're going to use some ETR survey data to show that, and then we'll lay out how customer data requirements are changing and what the ideal data platform will look like in the midterm future. We'll then evaluate core elements of the Databricks portfolio against that vision, and then we'll close with some strategic decisions that we think the company faces. And to do so, we welcome in our good friend, George Gilbert, former equities analyst, market analyst, and current Principal at TechAlpha Partners. George, good to see you. Thanks for coming on. >> Good to see you, Dave. >> All right, let me set this up. We're going to start by taking a look at where Databricks sits in the market in terms of how customers perceive the company and what it's momentum looks like. And this chart that we're showing here is data from ETS, the emerging technology survey of private companies. The N is 1,421. 
What we did is we cut the data on three sectors, analytics, database-data warehouse, and AI/ML. The vertical axis is a measure of customer sentiment, which evaluates an IT decision maker's awareness of the firm and the likelihood of engaging and/or purchase intent. The horizontal axis shows mindshare in the dataset, and we've highlighted Databricks, which has been a consistent high performer in this survey over the last several quarters. And as we, by the way, just as aside as we previously reported, OpenAI, which burst onto the scene this past quarter, leads all names, but Databricks is still prominent. You can see that the ETR shows some open source tools for reference, but as far as firms go, Databricks is very impressively positioned. Now, let's see how they stack up to some mainstream cohorts in the data space, against some bigger companies and sometimes public companies. This chart shows net score on the vertical axis, which is a measure of spending momentum and pervasiveness in the data set is on the horizontal axis. You can see that chart insert in the upper right, that informs how the dots are plotted, and net score against shared N. And that red dotted line at 40% indicates a highly elevated net score, anything above that we think is really, really impressive. And here we're just comparing Databricks with Snowflake, Cloudera, and Oracle. And that squiggly line leading to Databricks shows their path since 2021 by quarter. And you can see it's performing extremely well, maintaining an elevated net score and net range. Now it's comparable in the vertical axis to Snowflake, and it consistently is moving to the right and gaining share. Now, why did we choose to show Cloudera and Oracle? The reason is that Cloudera got the whole big data era started and was disrupted by Spark. And of course the cloud, Spark and Databricks and Oracle in many ways, was the target of early big data players like Cloudera. Take a listen to Cloudera CEO at the time, Mike Olson. 
This is back in 2010, first year of theCUBE, play the clip. >> Look, back in the day, if you had a data problem, if you needed to run business analytics, you wrote the biggest check you could to Sun Microsystems, and you bought a great big, single box, central server, and any money that was left over, you handed to Oracle for a database licenses and you installed that database on that box, and that was where you went for data. That was your temple of information. >> Okay? So Mike Olson implied that monolithic model was too expensive and inflexible, and Cloudera set out to fix that. But the best laid plans, as they say, George, what do you make of the data that we just shared? >> So where Databricks has really come up out of sort of Cloudera's tailpipe was they took big data processing, made it coherent, made it a managed service so it could run in the cloud. So it relieved customers of the operational burden. Where they're really strong and where their traditional meat and potatoes or bread and butter is the predictive and prescriptive analytics that building and training and serving machine learning models. They've tried to move into traditional business intelligence, the more traditional descriptive and diagnostic analytics, but they're less mature there. So what that means is, the reason you see Databricks and Snowflake kind of side by side is there are many, many accounts that have both Snowflake for business intelligence, Databricks for AI machine learning, where Snowflake, I'm sorry, where Databricks also did really well was in core data engineering, refining the data, the old ETL process, which kind of turned into ELT, where you loaded into the analytic repository in raw form and refine it. And so people have really used both, and each is trying to get into the other. >> Yeah, absolutely. We've reported on this quite a bit. Snowflake, kind of moving into the domain of Databricks and vice versa. 
And the last bit of ETR evidence that we want to share in terms of the company's momentum comes from ETR's Round Tables. They're run by Erik Bradley and now by former Gartner analyst and, George, your colleague back at Gartner, Daren Brabham. And what we're going to show here is some direct quotes from IT pros in those Round Tables. There's a data science head and a CIO as well. Just to make a few callouts here, we won't spend too much time on it, but starting at the top, like all of us, we can't talk about Databricks without mentioning Snowflake. Those two get us excited. The second comment zeros in on the flexibility and the robustness of Databricks from a data warehouse perspective. And then the last point is, despite competition from cloud players, Databricks has reinvented itself a couple of times over the years. And George, we're going to lay out today a scenario that perhaps calls for Databricks to do that once again. >> Their big opportunity and their big challenge, as for every tech company, is managing a technology transition. The transition that we're talking about is something that's been bubbling up, but it's really epochal. For the first time in 60 years, we're moving from an application-centric view of the world to a data-centric view, because decisions are becoming more important than automating processes. So let me let you sort of develop. >> Yeah, so let's talk about that here. We're going to put up some bullets on precisely that point and the changing sort of customer environment. So IT stacks are shifting, as George just said, from application-centric silos to data-centric stacks, where the priority is shifting from automating processes to automating decisions. You know, look at RPA, and there's still a lot of automation going on, but the focus on that application centricity and the data locked into those apps, that's changing.
Data has historically been on the outskirts in silos, but organizations, you think of Amazon, think Uber, Airbnb, they're putting data at the core, and logic is increasingly being embedded in the data instead of the reverse. In other words, today, the data's locked inside the app, which is why you need to extract that data and stick it into a data warehouse. The point, George, is we're putting forth this new vision for how data is going to be used. And you've used this Uber example to underscore the future state. Please explain. >> Okay, so this is hopefully an example everyone can relate to. The idea is first, you're automating things that are happening in the real world and decisions that make those things happen autonomously without humans in the loop all the time. So to use the Uber example on your phone, you call a car, you call a driver. Automatically, the Uber app then looks at what drivers are in the vicinity, what drivers are free, matches one, calculates an ETA to you, calculates a price, calculates an ETA to your destination, and then directs the driver once they're there. The point of this is that that cannot happen in an application-centric world very easily because all these little apps, the drivers, the riders, the routes, the fares, those call on data locked up in many different apps, but they have to sit on a layer that makes it all coherent. >> But George, so if Uber's doing this, doesn't this tech already exist? Isn't there a tech platform that does this already? >> Yes, and the mission of the entire tech industry is to build services that make it possible to compose and operate similar platforms and tools, but with the skills of mainstream developers in mainstream corporations, not the rocket scientists at Uber and Amazon. >> Okay, so we're talking about horizontally scaling across the industry, and actually giving a lot more organizations access to this technology.
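To make the Uber example concrete, here is a minimal sketch in Python of the decision chain George describes: find free drivers in the vicinity, match one, calculate an ETA to the rider, calculate a price, and calculate an ETA to the destination. Everything in it is an assumption made for illustration, including the `Driver` record, the straight-line distance, and the speed and fare constants; it is not how Uber's platform works. The point it shows is that the decision only becomes this simple once drivers, riders, routes, and fares can be read from one coherent layer.

```python
import math
from dataclasses import dataclass


@dataclass
class Driver:
    # Hypothetical record; in practice this data would come from several apps.
    name: str
    x_km: float
    y_km: float
    free: bool


# Hypothetical constants, chosen only for illustration.
AVG_SPEED_KMH = 30.0
BASE_FARE = 2.50
FARE_PER_KM = 1.20


def distance_km(x1, y1, x2, y2):
    """Straight-line distance; a real system would use road routes."""
    return math.hypot(x2 - x1, y2 - y1)


def eta_minutes(km):
    """Travel time in minutes at the assumed average speed."""
    return km / AVG_SPEED_KMH * 60.0


def dispatch(drivers, rider_xy, dest_xy, radius_km=5.0):
    """Match the nearest free driver in range, then compute ETAs and price."""
    rx, ry = rider_xy
    candidates = [
        d for d in drivers
        if d.free and distance_km(d.x_km, d.y_km, rx, ry) <= radius_km
    ]
    if not candidates:
        return None  # nobody free in the vicinity
    driver = min(candidates, key=lambda d: distance_km(d.x_km, d.y_km, rx, ry))
    pickup_km = distance_km(driver.x_km, driver.y_km, rx, ry)
    trip_km = distance_km(rx, ry, dest_xy[0], dest_xy[1])
    return {
        "driver": driver.name,
        "pickup_eta_min": round(eta_minutes(pickup_km), 1),
        "trip_eta_min": round(eta_minutes(trip_km), 1),
        "price": round(BASE_FARE + FARE_PER_KM * trip_km, 2),
    }


fleet = [
    Driver("ana", 0.0, 0.0, True),
    Driver("bo", 1.0, 0.0, False),  # closest, but not free
    Driver("cy", 3.0, 4.0, True),
]
result = dispatch(fleet, rider_xy=(1.0, 0.0), dest_xy=(4.0, 4.0))
```

In an application-centric stack, each input to `dispatch` would live in a different app with its own definitions, which is the coherence problem the discussion goes on to describe.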
So by way of review, let's summarize the trend that's going on today in terms of the modern data stack that is propelling the likes of Databricks and Snowflake, which we just showed you in the ETR data and is really a tailwind for them. So the trend is toward this common repository for analytic data. That could be multiple virtual data warehouses inside of Snowflake, but you're in that Snowflake environment, or Lakehouses from Databricks, or multiple data lakes. And we've talked about what JPMorgan Chase is doing with the data mesh and gluing data lakes together, you've got various public clouds playing in this game, and then the data is annotated to have a common meaning. In other words, there's a semantic layer that enables applications to talk to the data elements and know that they have common and coherent meaning. So George, the good news is this approach is more effective than the legacy monolithic models that Mike Olson was talking about, so what's the problem with this in your view? >> So today's data platforms added immense value 'cause they connected the data that was previously locked up in these monolithic apps or on all these different microservices, and that supported traditional BI and AI/ML use cases. But now, if we want to build apps like Uber or Amazon.com, where they've got essentially an autonomously running supply chain and e-commerce app, where humans only provide care and feeding and the app itself is figuring out what to buy, when to buy, where to deploy it, when to ship it, we need a semantic layer on top of the data, so that, as you were saying, the data that's coming from all those different apps is integrated, not just connected, and it means the same thing. And the issue is whenever you add a new layer to a stack to support new applications, there are implications for the already existing layers, like can they support the new layer and its use cases?
So for instance, if you add a semantic layer that embeds app logic with the data rather than vice versa, which we've been talking about and that's been the case for 60 years, then the new data layer faces challenges in that the way you manage that data, the way you analyze that data, is not supported by today's tools. >> Okay, so actually Alex, bring me up that last slide if you would. I mean, you're basically saying at the bottom here, today's repositories don't really do joins at scale. The future is you're talking about hundreds or thousands or millions of data connections, and today's systems, we're talking about, I don't know, 6, 8, 10 joins, and that is the fundamental problem. You're saying a new data era is coming and existing systems won't be able to handle it? >> Yeah, one way of thinking about it is that even though we call them relational databases, when we actually want to do lots of joins or when we want to analyze data from lots of different tables, we created a whole new industry for analytic databases where you sort of munge the data together into fewer tables. So you didn't have to do as many joins because the joins are difficult and slow. And when you're going to arbitrarily join thousands, hundreds of thousands or across millions of elements, you need a new type of database. We have them, they're called graph databases, but to query them, you go back to the prerelational era in terms of their usability. >> Okay, so we're going to come back to that and talk about how you get around that problem. But let's first lay out what we think the ideal data platform of the future looks like. And again, we're going to come back to use this Uber example. In this graphic that George put together, awesome, we've got three layers. The application layer is where the data products reside. The example here is drivers, rides, maps, routes, ETA, et cetera. The digital version of what we were talking about in the previous slide, people, places and things.
The next layer is the data layer, that breaks down the silos and connects the data elements through semantics and everything is coherent. And then the bottom layers, the legacy operational systems feed that data layer. George, explain what's different here, the graph database element, you talk about the relational query capabilities, and why can't I just throw memory at solving this problem? >> Some of the graph databases do throw memory at the problem and maybe without naming names, some of them live entirely in memory. And what you're dealing with is a prerelational in-memory database system where you navigate between elements, and the issue with that is we've had SQL for 50 years, so we don't have to navigate, we can say what we want without how to get it. That's the core of the problem. >> Okay. So if I may, I just want to drill into this a little bit. So you're talking about the expressiveness of a graph. Alex, if you'd bring that back out, the fourth bullet, expressiveness of a graph database with the relational ease of query. Can you explain what you mean by that? >> Yeah, so graphs are great because when you can describe anything with a graph, that's why they're becoming so popular. Expressive means you can represent anything easily. They're conducive to, you might say, in a world where we now want like the metaverse, like with a 3D world, and I don't mean the Facebook metaverse, I mean like the business metaverse when we want to capture data about everything, but we want it in context, we want to build a set of digital twins that represent everything going on in the world. And Uber is a tiny example of that. Uber built a graph to represent all the drivers and riders and maps and routes. But what you need out of a database isn't just a way to store stuff and update stuff. You need to be able to ask questions of it, you need to be able to query it. And if you go back to prerelational days, you had to know how to find your way to the data. 
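The contrast George draws here, finding your own way to the data versus saying what you want, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the `rides` and `drivers` data are made up, the dictionary walk merely stands in for navigational access, and SQLite stands in for a relational engine. It is not a model of any particular graph database.

```python
import sqlite3

# Hypothetical toy data: (ride id, rider, driver) and (driver, vehicle).
rides = [("r1", "alice", "dan"), ("r2", "bob", "dan"), ("r3", "alice", "eve")]
drivers = [("dan", "sedan"), ("eve", "van")]

# Navigational style: the caller must know the structure and walk it by hand.
ride_index = {rid: {"rider": rider, "driver": drv} for rid, rider, drv in rides}
driver_index = {name: {"vehicle": vehicle} for name, vehicle in drivers}


def vehicles_for_rider_navigational(rider):
    """Visit every ride, then hop from each matching ride to its driver."""
    found = set()
    for ride in ride_index.values():
        if ride["rider"] == rider:
            found.add(driver_index[ride["driver"]]["vehicle"])
    return sorted(found)


# Declarative style: state the result wanted; the engine plans the join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rides (id TEXT, rider TEXT, driver TEXT)")
conn.execute("CREATE TABLE drivers (name TEXT, vehicle TEXT)")
conn.executemany("INSERT INTO rides VALUES (?, ?, ?)", rides)
conn.executemany("INSERT INTO drivers VALUES (?, ?)", drivers)


def vehicles_for_rider_declarative(rider):
    """Same question, expressed as what we want rather than how to get it."""
    rows = conn.execute(
        "SELECT DISTINCT d.vehicle "
        "FROM rides r JOIN drivers d ON d.name = r.driver "
        "WHERE r.rider = ? ORDER BY d.vehicle",
        (rider,),
    ).fetchall()
    return [vehicle for (vehicle,) in rows]
```

Both functions answer the same question. The first encodes the access path in application code, so it must change whenever the data layout changes; the second leaves the path to the engine.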
It's sort of like when you give directions to someone and they didn't have a GPS system and a mapping system, you had to give them turn by turn directions. Whereas when you have a GPS and a mapping system, which is like the relational thing, you just say where you want to go, and it spits out the turn by turn directions, which let's say, the car might follow or whoever you're directing would follow. But the point is, it's much easier in a relational database to say, "I just want to get these results. You figure out how to get it." The graph database, they have not taken over the world because in some ways, it's taking a 50 year leap backwards. >> Alright, got it. Okay. Let's take a look at how the current Databricks offerings map to that ideal state that we just laid out. So to do that, we put together this chart that looks at the key elements of the Databricks portfolio, the core capability, the weakness, and the threat that may loom. Start with the Delta Lake, that's the storage layer, which is great for files and tables. It's got true separation of compute and storage, I want you to double click on that George, as independent elements, but it's weaker for the type of low latency ingest that we see coming in the future. And some of the threats highlighted here. AWS could add transactional tables to S3, Iceberg adoption is picking up and could accelerate, that could disrupt Databricks. George, add some color here please? >> Okay, so this is the sort of a classic competitive forces where you want to look at, so what are customers demanding? What's competitive pressure? What are substitutes? Even what your suppliers might be pushing. Here, Delta Lake is at its core, a set of transactional tables that sit on an object store. So think of it in a database system, this is the storage engine. So since S3 has been getting stronger for 15 years, you could see a scenario where they add transactional tables. 
We have an open source alternative in Iceberg, which Snowflake and others support. But at the same time, Databricks has built an ecosystem out of tools, their own and others, that read and write to Delta tables. That's what makes the Delta Lake an ecosystem. So they have a catalog, and the whole machine learning tool chain talks directly to the data here. That was their great advantage because in the past with Snowflake, you had to pull all the data out of the database before the machine learning tools could work with it. That was a major shortcoming. They fixed that. But the point here is that even before we get to the semantic layer, the core foundation is under threat. >> Yep. Got it. Okay. We got a lot of ground to cover. So we're going to take a look at the Spark Execution Engine next. Think of that as the refinery that runs really efficient batch processing. That's kind of what disrupted Hadoop in a large way, but it's not Python friendly and that's an issue because the data science and the data engineering crowd are moving in that direction, and/or they're using dbt. George, we had Tristan Handy on at Supercloud, really interesting discussion that you and I did. Explain why this is an issue for Databricks? >> So once the data lake was in place, what people did was they refined their data in batch, and Spark has always had streaming support and it's gotten better. The underlying storage, as we've talked about, is an issue. But basically they took raw data, then they refined it into tables that were like customers and products and partners. And then they refined that again into what was like gold artifacts, which might be business intelligence metrics or dashboards, which were collections of metrics. But they were running it on the Spark Execution Engine, which is a Java-based engine, or rather it's running on a Java-based virtual machine, which means all the data scientists and the data engineers who want to work with Python are really working in sort of oil and water.
Like if you get an error in Python, you can't tell whether the problem's in Python or whether it's in Spark. There's just an impedance mismatch between the two. And then at the same time, the whole world is now gravitating towards dbt because it's a very nice and simple way to compose these data processing pipelines, and people are using either SQL in dbt or Python in dbt, and that kind of is a substitute for doing it all in Spark. So it's under threat even before we get to that semantic layer. It so happens that dbt itself is becoming the authoring environment for the semantic layer with business intelligence metrics. But again, this is the second element that's under direct substitution and competitive threat. >> Okay, let's now move down to the third element, which is Photon. Photon is Databricks' BI Lakehouse, which has integration with the Databricks tooling, which is very rich. It's newer. And it's also not well suited for high concurrency and low latency use cases, which we think are going to increasingly become the norm over time. George, the callout threat here is customers want to connect everything to a semantic layer. Explain your thinking here and why this is a potential threat to Databricks? >> Okay, so two issues here. What you were touching on, which is the high concurrency, low latency, when people are running like thousands of dashboards and data is streaming in, that's a problem because a SQL data warehouse, the query engine, something like that matures over five to 10 years. It's one of these things, the joke that Andy Jassy makes just in general, he's really talking about Azure, but there's no compression algorithm for experience. The Snowflake guys started more than five years earlier, and for a bunch of reasons, that lead is not something that Databricks can shrink. They'll always be behind. So that's why Snowflake has transactional tables now and we can get into that in another show.
But the key point is, so near term, it's struggling to keep up with the use cases that are core to business intelligence, which is highly concurrent, lots of users doing interactive query. But then when you get to a semantic layer, that's when you need to be able to query data that might have thousands or tens of thousands or hundreds of thousands of joins. And that's a SQL query engine, traditional SQL query engine is just not built for that. That's the core problem of traditional relational databases. >> Now this is a quick aside. We always talk about Snowflake and Databricks in sort of the same context. We're not necessarily saying that Snowflake is in a position to tackle all these problems. We'll deal with that separately. So we don't mean to imply that, but we're just sort of laying out some of the things that Snowflake or rather Databricks customers we think, need to be thinking about and having conversations with Databricks about and we hope to have them as well. We'll come back to that in terms of sort of strategic options. But finally, when come back to the table, we have Databricks' AI/ML Tool Chain, which has been an awesome capability for the data science crowd. It's comprehensive, it's a one-stop shop solution, but the kicker here is that it's optimized for supervised model building. And the concern is that foundational models like GPT could cannibalize the current Databricks tooling, but George, can't Databricks, like other software companies, integrate foundation model capabilities into its platform? >> Okay, so the sound bite answer to that is sure, IBM 3270 terminals could call out to a graphical user interface when they're running on the XT terminal, but they're not exactly good citizens in that world. The core issue is Databricks has this wonderful end-to-end tool chain for training, deploying, monitoring, running inference on supervised models. 
But the paradigm there is the customer builds and trains and deploys each model for each feature or application. In a world of foundation models which are pre-trained and unsupervised, the entire tool chain is different. So it's not like Databricks can junk everything they've done and start over with all their engineers. They have to keep maintaining what they've done in the old world, but they have to build something new that's optimized for the new world. It's a classic technology transition and their mentality appears to be, "Oh, we'll support the new stuff from our old stuff." Which is suboptimal, and as we'll talk about, their biggest patron and the company that put them on the map, Microsoft, really stopped working on their old stuff three years ago so that they could build a new tool chain optimized for this new world. >> Yeah, and so let's sort of close with what we think the options are and decisions that Databricks has for its future architecture. They're smart people. I mean we've had Ali Ghodsi on many times, super impressive. I think they've got to be keenly aware of the limitations, what's going on with foundation models. But at any rate, here in this chart, we lay out sort of three scenarios. One is re-architect the platform by incrementally adopting new technologies. And example might be to layer a graph query engine on top of its stack. They could license key technologies like graph database, they could get aggressive on M&A and buy-in, relational knowledge graphs, semantic technologies, vector database technologies. George, as David Floyer always says, "A lot of ways to skin a cat." We've seen companies like, even think about EMC maintained its relevance through M&A for many, many years. George, give us your thought on each of these strategic options? >> Okay, I find this question the most challenging 'cause remember, I used to be an equity research analyst. 
I worked for Frank Quattrone, we were one of the top tech shops in the banking industry, although this is 20 years ago. But the M&A team was the top team in the industry and everyone wanted them on their side. And I remember going to meetings with these CEOs, where Frank and the bankers would say, "You want us for your M&A work because we can do better." And they really could do better. But in software, it's not like with EMC in hardware because with hardware, it's easier to connect different boxes. With software, the whole point of a software company is to integrate and architect the components so they fit together and reinforce each other, and that makes M&A harder. You can do it, but it takes a long time to fit the pieces together. Let me give you examples. If they put a graph query engine, let's say something like TinkerPop, on top of, I don't even know if it's possible, but let's say they put it on top of Delta Lake, then you have this graph query engine talking to their storage layer, Delta Lake. But if you want to do analysis, you got to put the data in Photon, which is not really ideal for highly connected data. If you license a graph database, then most of your data is in the Delta Lake and how do you sync it with the graph database? If you do sync it, you've got data in two places, which kind of defeats the purpose of having a unified repository. I find this semantic layer option in number three actually more promising, because that's something that you can layer on top of the storage layer that you have already. You just have to figure out then how to have your query engines talk to that. What I'm trying to highlight is, it's easy as an analyst to say, "You can buy this company or license that technology." But the really hard work is making it all work together and that is where the challenge is. >> Yeah, and well look, I thank you for laying that out. We've seen it, certainly Microsoft and Oracle. 
I guess you might argue that, well, Microsoft had a monopoly in its desktop software and was able to throw off cash for a decade plus while its stock was going sideways. Oracle had won the database wars and had amazing margins and cash flow to be able to do that. Databricks hasn't even gone public yet, but I want to close with some of the players to watch. Alex, if you'd bring that back up, number four here. AWS, we talked about some of their options with S3 and it's not just AWS, it's blob storage, object storage. Microsoft, as you sort of alluded to, was an early go-to-market channel for Databricks. We didn't address that really. So maybe in the closing comments we can. Google obviously, Snowflake of course, we're going to dissect their options in future Breaking Analysis. dbt Labs, where do they fit? Bob Muglia's company, Relational.ai. Why are these players to watch, George, in your opinion? >> So everyone is trying to assemble and integrate the pieces that would make building data applications, data products easy. And the critical part isn't just assembling a bunch of pieces, which is traditionally what AWS did. It's a Unix ethos, which is we give you the tools, you put 'em together, 'cause you then have the maximum choice and maximum power. So what the hyperscalers are doing is they're taking their key value stores, in the case of AWS it's DynamoDB, in the case of Azure it's Cosmos DB, and each is putting a graph query engine on top of those. So they have a unified storage and graph database engine, like all the data would be collected in the key value store. Then you have a graph database, and that's how they're going to be presenting a foundation for building these data apps. dbt Labs is putting a semantic layer on top of data lakes and data warehouses and, as we'll talk about, I'm sure, in the future, that makes it easier to swap out the underlying data platform or swap in new ones for specialized use cases.
Snowflake, what they're doing, they're so strong in data management and with their transactional tables, what they're trying to do is take in the operational data that used to be in the province of many state stores like MongoDB and say, "If you manage that data with us, it'll be connected to your analytic data without having to send it through a pipeline." And that's hugely valuable. Relational.ai is the wildcard, 'cause what they're trying to do, it's almost like a holy grail where you're trying to take the expressiveness of connecting all your data in a graph but making it as easy to query as you've always had it in a SQL database or I should say, in a relational database. And if they do that, it's sort of like, it'll be as easy to program these data apps as a spreadsheet was compared to procedural languages, like BASIC or Pascal. That's the implications of Relational.ai. >> Yeah, and again, we talked before, why can't you just throw this all in memory? We're talking in that example of really getting down to differences in how you lay the data out on disk in really, new database architecture, correct? >> Yes. And that's why it's not clear that you could take a data lake or even a Snowflake and why you can't put a relational knowledge graph on those. You could potentially put a graph database, but it'll be compromised because to really do what Relational.ai has done, which is the ease of Relational on top of the power of graph, you actually need to change how you're storing your data on disk or even in memory. So you can't, in other words, it's not like, oh we can add graph support to Snowflake, 'cause if you did that, you'd have to change, or in your data lake, you'd have to change how the data is physically laid out. And then that would break all the tools that talk to that currently. >> What in your estimation, is the timeframe where this becomes critical for a Databricks and potentially Snowflake and others? 
I mentioned earlier midterm, are we talking three to five years here? Are we talking end of decade? What's your radar say? >> I think something surprising is going on that's going to sort of come up the tailpipe and take everyone by storm. All the hype around business intelligence metrics, which is what we used to put in our dashboards where bookings, billings, revenue, customer, those things, those were the key artifacts that used to live in definitions in your BI tools, and dbt has basically created a standard for defining those so they live in your data pipeline or they're defined in their data pipeline and executed in the data warehouse or data lake in a shared way, so that all tools can use them. This sounds like a digression, it's not. All this stuff about data mesh, data fabric, all that's going on is we need a semantic layer and the business intelligence metrics are defining common semantics for your data. And I think we're going to find by the end of this year, that metrics are how we annotate all our analytic data to start adding common semantics to it. And we're going to find this semantic layer, it's not three to five years off, it's going to be staring us in the face by the end of this year. >> Interesting. And of course SVB today was shut down. We're seeing serious tech headwinds, and oftentimes in these sort of downturns or flat turns, which feels like this could be going on for a while, we emerge with a lot of new players and a lot of new technology. George, we got to leave it there. Thank you to George Gilbert for excellent insights and input for today's episode. I want to thank Alex Myerson who's on production and manages the podcast, of course Ken Schiffman as well. Kristin Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our EIC over at Siliconangle.com, he does some great editing. Remember all these episodes, they're available as podcasts. 
Wherever you listen, all you got to do is search Breaking Analysis Podcast, we publish each week on wikibon.com and siliconangle.com, or you can email me at David.Vellante@siliconangle.com, or DM me @DVellante. Comment on our LinkedIn post, and please do check out ETR.ai, great survey data, enterprise tech focus, phenomenal. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching, and we'll see you next time on Breaking Analysis.
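
Editor's sketch: the semantic-layer idea George Gilbert describes in this episode, where metrics such as bookings and billings are defined once in a shared place so that every tool computes them the same way, can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions. The `Metric` class, the `METRICS` registry, the `orders` table, and its columns are all hypothetical names invented for illustration; this is not dbt's actual metric syntax or API, and SQLite stands in for the warehouse or data lake.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metric:
    """A business metric defined once, outside any single BI tool (hypothetical)."""
    name: str
    table: str
    expression: str      # SQL aggregate, for example "SUM(amount)"
    where: str = "1=1"   # optional row filter; the default keeps every row


# Hypothetical shared definitions. Table and column names are illustrative only.
METRICS = {
    "bookings": Metric("bookings", "orders", "SUM(amount)"),
    "billings": Metric("billings", "orders", "SUM(amount)", "invoiced = 1"),
}


def compile_metric(name: str, group_by: Optional[str] = None) -> str:
    """Turn a shared metric definition into SQL that any tool could execute."""
    m = METRICS[name]
    select = f"{m.expression} AS {m.name}"
    if group_by:
        return (f"SELECT {group_by}, {select} FROM {m.table} "
                f"WHERE {m.where} GROUP BY {group_by} ORDER BY {group_by}")
    return f"SELECT {select} FROM {m.table} WHERE {m.where}"


def run_metric(conn: sqlite3.Connection, name: str,
               group_by: Optional[str] = None) -> list:
    """Execute the compiled metric where the data lives (SQLite in this sketch)."""
    return conn.execute(compile_metric(name, group_by)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory 'warehouse' with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL, invoiced INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [("east", 100.0, 1), ("east", 50.0, 0), ("west", 200.0, 1)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(run_metric(conn, "bookings"))            # total across all orders
    print(run_metric(conn, "billings", "region"))  # invoiced amount per region
```

The point of the design is that a dashboard, a notebook, and an application would all call `compile_metric` rather than each carrying its own copy of the definition, which is the "common semantics" argument made in the conversation.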

Published Date : Mar 10 2023


ENTITIES

Entity	Category	Confidence
Alex Myerson	PERSON	0.99+
David Floyer	PERSON	0.99+
Mike Olson	PERSON	0.99+
2014	DATE	0.99+
George Gilbert	PERSON	0.99+
Dave Vellante	PERSON	0.99+
George	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
Ken Schiffman	PERSON	0.99+
Andy Jassy	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Erik Bradley	PERSON	0.99+
Dave	PERSON	0.99+
Uber	ORGANIZATION	0.99+
thousands	QUANTITY	0.99+
Sun Microsystems	ORGANIZATION	0.99+
50 years	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Bob Muglia	PERSON	0.99+
Gartner	ORGANIZATION	0.99+
Airbnb	ORGANIZATION	0.99+
60 years	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
Ali Ghodsi	PERSON	0.99+
2010	DATE	0.99+
Databricks	ORGANIZATION	0.99+
Kristin Martin	PERSON	0.99+
Rob Hof	PERSON	0.99+
three	QUANTITY	0.99+
15 years	QUANTITY	0.99+
Databricks'	ORGANIZATION	0.99+
two places	QUANTITY	0.99+
Boston	LOCATION	0.99+
Tristan Handy	PERSON	0.99+
M&A	ORGANIZATION	0.99+
Frank Quattrone	PERSON	0.99+
second element	QUANTITY	0.99+
Daren Brabham	PERSON	0.99+
TechAlpha Partners	ORGANIZATION	0.99+
third element	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+
50 year	QUANTITY	0.99+
40%	QUANTITY	0.99+
Cloudera	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
five years	QUANTITY	0.99+

Wayne Duso, AWS & Iyad Tarazi, Federated Wireless | MWC Barcelona 2023

(light music) >> Announcer: TheCUBE's live coverage is made possible by funding from Dell Technologies. Creating technologies that drive human progress. (upbeat music) >> Welcome back to the Fira in Barcelona. Dave Vellante with Dave Nicholson. Lisa Martin's been here all week. John Furrier is in our Palo Alto studio, banging out all the news. Don't forget to check out siliconangle.com, thecube.net. This is day four, our last segment, winding down. MWC23, super excited to be here. Wayne Duso, friend of theCUBE, VP of engineering from products at AWS is here with Iyad Tarazi, who's the CEO of Federated Wireless. Gents, welcome. >> Good to be here. >> Nice to see you. >> I'm so stoked, Wayne, that we connected before the show. We texted, I'm like, "You're going to be there. I'm going to be there. You got to come on theCUBE." So thank you so much for making time, and thank you for bringing a customer partner, Federated Wireless. Everybody knows AWS. Iyad, tell us about Federated Wireless. >> We're a software and services company out of Arlington, Virginia, right outside of Washington, DC, and we're really focused on this new technology called Shared Spectrum and private wireless for 5G. Think of it as enterprises consuming 5G, the way they used to consume WiFi. >> Is that unrestricted spectrum, or? >> It is managed, organized, interference free, all through cloud platforms. That's how we got to know AWS. We went and got maybe about 300 products from AWS to make it work. Quite sophisticated, highly available, and pristine spectrum worth billions of dollars, but available for people like you and I, that want to build enterprises, that want to make things work. Also carriers, cable companies, everybody else that needs it. It's really a new revolution for everyone. >> And that's how you, it got introduced to AWS. Was that through public sector, or just the coincidence that you're in DC? >> No, I, well, yes. 
The center of gravity in the world for spectrum is literally Arlington. You have the DOD spectrum people, you have spectrum people from National Science Foundation, DARPA, and then you have commercial sector, and you have the FCC just an Uber ride away. So we went and found the scientists that are doing all this work, four or five of them, Virginia Tech has an office there too, for spectrum research for the Navy. Come together, let's have a party and make a new model. >> So I asked this, I'm super excited to have you on theCUBE. I sat through the keynotes on Monday. I saw Satya Nadella was in there, Thomas Kurian. There was no AWS. I'm like, where's AWS? AWS is everywhere. I mean, you guys are all over the show. I'm like, "Hey, where's the number one cloud?" So you guys have made a bunch of announcements at the show. Everybody's talking about the cloud. What's going on for you guys? >> So we are everywhere, and you know, we've been coming to this show for years. But this is really a year that we can demonstrate that what we've been doing for the IT enterprise, IT people for 17 years, we're now bringing for telcos, you know? For years, we've been, 17 years to be exact, we've been bringing the cloud value proposition, whether it's, you know, cost efficiencies or innovation or scale, reliability, security and so on, to these enterprise IT folks. Now we're doing the same thing for telcos. And so whether they want to build in region, in a Local Zone, metro area, on-prem with an Outpost, at the edge with Snow Family, or with our IoT devices. And no matter where they want to start, if they start in the cloud and they want to move to the edge, or they start in the edge and they want to bring the cloud value proposition, like, we're demonstrating all of that is happening this week. And, and very much so, we're also demonstrating that we're bringing the same type of ecosystem that we've built for enterprise IT. 
We're bringing that type of ecosystem to the telco companies, with CSPs, with the ISV vendors. We've seen plenty of announcements this week. You know, so on and so forth. >> So what's different, is it, the names are different? Is it really that simple, that you're just basically taking the cloud model into telco, and saying, "Hey, why do all this undifferentiated heavy lifting when we can do it for you? Don't worry about all the plumbing." Is it really that simple? I mean, that straightforward. >> Well, simple is probably not what I'd say, but we can make it straightforward. >> Conceptually. >> Conceptually, yes. Conceptually it is the same. Because if you think about, firstly, we'll just take 5G for a moment, right? The 5G folks, if you look at the architecture for 5G, it was designed to run on a cloud architecture. It was designed to be a set of services that you could partition, and run in different places, whether it's in the region or at the edge. So in many ways it is sort of that simple. And let me give you an example. Two things, the first one is we announced Integrated Private Wireless on AWS, which allows enterprise customers to come to a portal and look at the industry solutions. They're not worried about their network, they're worried about solving a problem, right? And they can come to that portal, they can find a solution, they can find a service provider that will help them with that solution. And what they end up with is a fully validated offering that AWS telco SAs have actually put through its paces to make sure this is a real thing. And whether they get it from a telco, and, and quite frankly in that space, it's SIs such as Federated that actually help our customers deploy those in private environments. So that's an example. 
And then added to that, we had a second announcement, which was AWS Telco Network Builder, which allows telcos to plan, deploy, and operate at scale telco network capabilities on the cloud, think about it this way- >> As a managed service? >> As a managed service. So think about it this way. In the same way that enterprise IT has been deploying, you know, infrastructure as code for years, Telco Network Builder allows the telco folks to deploy telco networks and their capabilities as code. So it's not simple, but it is pretty straightforward. We're making it more straightforward as we go. >> Jump in, Dave, by the way. He can geek out if you want. >> Yeah, no, no, no, that's good, that's good, that's good. But actually, I'm going to ask an AWS question, but I'm going to ask Iyad the AWS question. So when we, when I hear the word cloud from Wayne, cloud, AWS, typically in people's minds that denotes off-premises. Out there, AWS data center. In the telecom space, yes, of course, in the private 5G space, we're talking about a little bit of a different dynamic than in the public 5G space, in terms of the physical infrastructure. But regardless at the edge, there are things that need to be physically at the edge. Do you feel that AWS is sufficiently, have they removed the H word, hybrid, from the list of bad words you're not allowed to say? 'Cause there was a point in time- >> Yeah, of course. >> Where AWS felt that their growth- >> They'll even say multicloud today, (indistinct). >> No, no, no, no, no. But there was a period of time where, rightfully so, AWS felt that the growth trajectory would be supported solely by net new things off premises. Now though, in this space, it seems like that hybrid model is critical. Do you see AWS being open to the hybrid nature of things? >> Yeah, they're, absolutely. I mean, just to explain from- we're a services company and a solutions company. So we put together solutions at the edge, a smart campus, smart agriculture, a deployment. 
One of our biggest deployments is a million-square-foot warehouse automation project with the Marine Corps. >> That's bigger than the Fira. >> Oh yeah, it's bigger, definitely bigger than, you know, a small section of here. It's actually three massive warehouses. So yes, that is the edge. What the cloud is about is that massive amount of efficiency has happened by concentrating applications in data centers. And that is programmability, that is APIs, that is solutions, that is applications that can run on it, where people know how to do it. And so all that efficiency now is being ported in a box called the edge. What AWS is doing for us is bringing all the business and technical solutions they had into the edge. Some of the data may be sent back and forth, but that's actually a smaller piece of the value for us. By being able to bring an AWS package at the edge, we're bringing IoT applications, we're bringing high speed cameras, we're able to integrate with the 5G public network. We're able to bring in identity and devices, we're able to bring in solutions for students, embedded laptops. All of these things that you can do much, much faster and cheaper if you are able to tap into the 4,000, 5,000 partners and all the applications and all the development and all the models that the AWS team did. By being able to bring that efficiency to the edge, why reinvent that? And then along with that, there are partners that you, that help do integration. There is development done to make it hardened, to make the data more secure, more isolated. All of these things will contribute to an edge that truly is a carbon copy of the data center. >> So Wayne, it's AWS, regardless of where the compute, networking and storage physically live, it's AWS. Do you think that the term cloud will sort of drift away from usage? Because if, look, it's all IT, in this case it's AWS and federated IT working together. How, what's your, it's sort of an obscure question about cloud, because cloud is so integrated. 
>> You got this thing about cloud, it's just IT. >> I got a thing about cloud too, because- >> You and Larry Ellison. >> Because it's no, no, no, I'm, yeah, well actually there's- >> There's a lot of IT that's not cloud, just say that, okay. >> Now, a lot of IT that isn't cloud, but I would say- >> But I'll (indistinct) cloud is an IT tool, and you see AWS obviously with the Snow fill-in-the-blank line of products and Outpost-type stuff. Fair to say that you're, doesn't matter where it is, it could be AWS if it's on the edge, right? >> Well, you know, everybody wants to define the cloud as what it may have been when it started. But if you look at what it was when it started and what it is today, it is different. But the ability to bring the experience, the AWS experience, the services, the operational experience and all the things that Iyad had been talking about from the region all the way to, you know, the IoT device, if you would, that entire continuum. And it doesn't matter where you start. Like if you start in region and you need to bring your value to other places because your customers are asking you to do so, we're enabling that experience where you need to bring it. If you started at the edge, and- but you want to build cloud value, you know, whether it's again, cost efficiency, scalability, AI, ML or analytics into those capabilities, you can start at the edge with the same APIs, with the same service, the same capabilities, and you can build that value in right from the get go. You don't build this bifurcation or many separations and try to figure out how do I glue them together? There is no gluing together. So if you think of cloud as being elastic, scalable, flexible, where you can drive innovation, it's the same exact model on the continuum. And you can start at either end, it's up to you as a customer. >> And I think if, the key to me is the ecosystem. 
I mean, if you can do for this industry what you've done for the technology- enterprise technology business from an ecosystem standpoint, you know everybody talks about flywheel, but that gives you like the massive flywheel. I don't know what the ratio is, but it used to be for every dollar spent on a VMware license, $15 is spent in the ecosystem. I've never heard similar ratios in the AWS ecosystem, but it's, I go to re:Invent and I'm like, there's some dollars being- >> That's a massive ecosystem. >> (indistinct). >> And then, and another thing I'll add is Jose Maria Alvarez, who's the chairman of Telefonica, said there's three pillars of the future-ready telco, low latency, programmable networks, and he said cloud and edge. So they're recognizing cloud and edge, you know, low latency means you got to put the compute and the data, the programmable infrastructure was invented by Amazon. So what's the strategy around the telco edge? >> So, you know, at the end, so those are all great points. And in fact, the programmability of the network was a big theme in the show. It was a huge theme. And if you think about the cloud, what is the cloud? It's a set of APIs against a set of resources that you use in whatever way is appropriate for what you're trying to accomplish. The network, the telco network becomes a resource. And it could be described as a resource. We, I talked about, you know, network as code, right? It's the same as infrastructure as code, it's telco infrastructure as code. And that code, that infrastructure, is programmable. So this is really, really important. And how you build the ecosystem around that is no different than how we built the ecosystem around traditional IT abstractions. In fact, we feel that really the ecosystem is the killer app for 5G. You know, the killer app for 4G, data of sorts, right? We started using data beyond simple SMS messages. So what's the killer app for 5G? 
It's building this ecosystem, which includes the CSPs, the ISVs, all of the partners that we bring to the table that can drive greater value. It's not just about cost efficiency. You know, you can't save your way to success, right? At some point you need to generate greater value for your customers, which gives you better business outcomes, 'cause you can monetize them, right? The ecosystem is going to allow everybody to monetize 5G. >> 5G is like the dot connector of all that. And then developers come in on top and create new capabilities >> And how different is that than, you know, the original smartphones? >> Yeah, you're right. So what do you guys think of ChatGPT? (indistinct) to Amazon? Amazon turned the data center into an API. It's like we're visioning this world, and I want to ask that technologist, like, where it's turning resources into human language interfaces. You know, when you see that, you play with ChatGPT at all, or I know you guys got your own. >> So I won't speak directly to ChatGPT. >> No, don't speak from- >> But if you think about- >> Generative AI. >> Yeah generative AI is important. And, and we are, and we have been for years, in this space. Now you've been talking to AWS for a long time, and we often don't talk about things we don't have yet. We don't talk about things that we haven't brought to market yet. And so, you know, you'll often hear us talk about something, you know, a year from now where others may have been talking about it three years earlier, right? We will be talking about this space when we feel it's appropriate for our customers and our partners. >> You have talked about it a little bit, Adam Selipsky went on an interview with myself and John Furrier in October said you watch, you know, large language models are going to be enormous and I know you guys have some stuff that you're working on there. >> It's, I'll say it's exciting. >> Yeah, I mean- >> Well proof point is, Siri is an idiot compared to Alexa. 
(group laughs) So I trust one entity to come up with something smart. >> I have conversations with Alexa and Siri, and I won't judge either one. >> You don't need, you could be objective on that one. I definitely have a preference. >> The problems you guys are solving in this space, you know, what's unique about 'em? What are they, can we, sort of, take some examples here (indistinct). >> Sure, the main theme is that the enterprise is taking control. They want to have their own networks. They want to focus on specific applications, and they want to build them with a skeleton crew. The one IT person in a warehouse wants to be able to do it all. So what's unique about them is that there is now a lot of automation and robotics, especially in warehousing environments, agriculture. There simply aren't enough people in these industries, and that requires precision. And so you need all that integration to make it work. People also want to build these networks as they want to control it. They want to figure out how do we actually pick this team and migrate it. Maybe just do the front of the house first. Maybe it's a security team that monitors the building, maybe later on upgrade things that are used to open doors and close doors and collect maintenance data. So that ability to pick what you want to do from a new processors is really important. And then you're also seeing a lot of public-private network interconnection. That's probably the undercurrent of this show that hasn't been talked about. When people say private networks, they're also talking about something called neutral host, which means I'm going to build my own network, but I want it to work, my Verizon (indistinct) need to work. There's been so much progress, it's not done yet. So much progress about this bring-my-own-network concept, and then make sure that I'm now interoperating with the public network, but it's my domain. I can create air gaps, I can create whatever security and policy around it. 
That is probably the power of 5G. Now take all of these tiny networks, big networks, put them all in one ecosystem. Call it the Amazon marketplace, call it the Amazon ecosystem, that's 5G. It's going to be a tremendous future. >> What does the future look like? We're going to, we just determined we're going to be orchestrating the network through human language, okay? (group laughs) But seriously, what's your vision for the future here? >> You know, both connectivity and cloud are on a continuum. It's, they've been on a continuum forever. They're going to continue to be on a continuum. That being said, those continuums are coming together, right? They're coming together to bring greater value to a greater set of customers, and frankly all of us. So, you know, the future is now like, you know, this conference is the future, and if you look at what's going on, it's about the acceleration of the future, right? What we announced this week is really the acceleration of listening to customers for the last handful of years. And, we're going to continue to do that. We're going to continue to bring greater value in the form of solutions. And that's what I want to pick up on from the prior question. It's not about the network, it's not about the cloud, it's about the solutions that we can provide the customers where they are, right? And if they're on their mobile phone or they're on their factory floor, you know, they're looking to accelerate their business. They're looking to accelerate their value. They're looking to create greater safety for their employees. That's what we can do with these technologies. So in fact, when we came out with, you know, our announcement for Integrated Private Wireless, right? It really was about industry solutions. It really isn't about, you know, the cloud or the network. It's about how you can leverage those technologies, that continuum, to deliver you value. 
>> You know, it's interesting you say that, 'cause again, when we were interviewing Adam Selipsky, everybody, you know, all the journalists and analysts want to know, how's Adam Selipsky going to be different from Andy Jassy, what's the, what's he going to do to Amazon to change? And he said, listen, the real answer is Amazon has changed. If Andy Jassy were here, we'd be doing all, you know, pretty much the same things. Your point about 17 years ago, the cloud was S3, right, and EC2. Now it's got to evolve to be solutions. 'Cause if that's all you're selling, is the bespoke services, then you know, the future is not as bright as the past has been. And so I think it's key to look for what are those outcomes or solutions that customers require and how you're going to meet 'em. And there's a lot of challenges. >> You continue to build value on the value that you've brought, and you don't lose sight of why that value is important. You carry that value proposition up the stack, but the- what you're delivering, as you said, becomes maybe bigger or different. >> And you are getting more solution oriented. I mean, you're not hardcore solutions yet, but we're seeing more and more of that. And that seems to be a trend. We've even seen in the database world, making things easier, connecting things. Not really an abstraction layer, which is sort of antithetical to your philosophy, but it creates a similar outcome in terms of simplicity. Yeah, you're smiling 'cause you guys always have a different angle, you know? >> Yeah, we've had this conversation. >> It's right, it's, Jassy used to say it's okay to be misunderstood. >> That's right. For a long time. >> Yeah, right, guys, thanks so much for coming to theCUBE. I'm so glad we could make this happen. >> It's always good. Thank you. >> Thank you so much. >> All right, Dave Nicholson, for Lisa Martin, Dave Vellante, John Furrier in the Palo Alto studio. We're here at the Fira, wrapping up MWC23. 
Keep it right there, thanks for watching. (upbeat music)
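
Editor's sketch: Wayne Duso's comparison in this segment, that telco networks can be deployed "as code" the same way enterprise IT deploys infrastructure as code, rests on a general pattern: describe the desired state declaratively, diff it against the current state, and apply the difference. The Python below is a minimal sketch of that generic pattern only. It is not the AWS Telco Network Builder API, and every name in it (`plan`, `apply`, the `upf-edge` and `amf-region` entries, the `replicas` field) is a hypothetical placeholder chosen for illustration.

```python
from typing import Dict, List, Tuple

# A network function's configuration, kept as a plain dict in this sketch.
Spec = Dict[str, dict]


def plan(current: Spec, desired: Spec) -> List[Tuple[str, str]]:
    """Diff the desired state against the current state.

    Returns an ordered list of (action, name) pairs: creates and updates
    first (sorted by name), then deletes (sorted by name).
    """
    actions: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(current: Spec, desired: Spec) -> Spec:
    """Return the new state after carrying out the plan.

    A real system would call provisioning APIs here; this sketch only
    updates an in-memory copy so the logic can be run and inspected.
    """
    new_state = dict(current)
    for action, name in plan(current, desired):
        if action == "delete":
            del new_state[name]
        else:  # "create" and "update" both converge on the desired config
            new_state[name] = desired[name]
    return new_state


if __name__ == "__main__":
    # Hypothetical 5G network functions; names and fields are made up.
    running = {"upf-edge": {"replicas": 1}, "old-probe": {"replicas": 1}}
    wanted = {"upf-edge": {"replicas": 2}, "amf-region": {"replicas": 1}}
    for step in plan(running, wanted):
        print(step)
    print(apply(running, wanted) == wanted)
```

Because the desired state is just data, it can be versioned, reviewed, and re-applied; running the plan against an already converged state yields no actions, which is what makes the approach repeatable.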

Published Date : Mar 2 2023


ENTITIES

Entity	Category	Confidence
Dave Nicholson	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Marine Corps	ORGANIZATION	0.99+
Adam Selipsky	PERSON	0.99+
Lisa Martin	PERSON	0.99+
AWS	ORGANIZATION	0.99+
National Science Foundation	ORGANIZATION	0.99+
Wayne	PERSON	0.99+
Iyad Tarazi	PERSON	0.99+
Dave Nicholson	PERSON	0.99+
Jose Maria Alvarez	PERSON	0.99+
Thomas Kurian	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Verizon	ORGANIZATION	0.99+
Andy Jassy	PERSON	0.99+
Federated Wireless	ORGANIZATION	0.99+
Wayne Duso	PERSON	0.99+
$15	QUANTITY	0.99+
October	DATE	0.99+
Satya Nadella	PERSON	0.99+
John Furrier	PERSON	0.99+
17 years	QUANTITY	0.99+
Monday	DATE	0.99+
Telefonica	ORGANIZATION	0.99+
DARPA	ORGANIZATION	0.99+
Arlington	LOCATION	0.99+
Larry Ellison	PERSON	0.99+
Virginia Tech	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Siri	TITLE	0.99+
five	QUANTITY	0.99+
Palo Alto	LOCATION	0.99+
four	QUANTITY	0.99+
Washington, DC	LOCATION	0.99+
siliconangle.com	OTHER	0.99+
FCC	ORGANIZATION	0.99+
Barcelona	LOCATION	0.99+
Dell Technologies	ORGANIZATION	0.99+
Jassy	PERSON	0.99+
DC	LOCATION	0.99+
One	QUANTITY	0.99+
telco	ORGANIZATION	0.98+
thecube.net	OTHER	0.98+
this week	DATE	0.98+
second announcement	QUANTITY	0.98+
three years earlier	DATE	0.98+

Ed Walsh & Thomas Hazel | A New Database Architecture for Supercloud

(bright music) >> Hi, everybody, this is Dave Vellante, welcome back to Supercloud 2. Last August, at the first Supercloud event, we invited the broader community to help further define Supercloud, we assessed its viability, and identified the critical elements and deployment models of the concept. The objectives here at Supercloud 2 are, first of all, to continue to tighten and test the concept, the second is, we want to get real world input from practitioners on the problems that they're facing and the viability of Supercloud in terms of applying it to their business. So on the program, we got companies like Walmart, Saks, Western Union, Ionis Pharmaceuticals, NASDAQ, and others. And the third thing that we want to do is we want to drill into the intersection of cloud and data to project what the future looks like in the context of Supercloud. So in this segment, we want to explore the concept of data architectures and what's going to be required for Supercloud. And I'm pleased to welcome one of our Supercloud sponsors, ChaosSearch, Ed Walsh is the CEO of the company, with Thomas Hazel, who's the Founder, CTO, and Chief Scientist. Guys, good to see you again, thanks for coming into our Marlborough studio. >> Always great. >> Great to be here. >> Okay, so there's a little debate, I'm going to put you right on the spot. (Ed chuckling) A little debate going on in the community started by Bob Muglia, a former CEO of Snowflake, and he was at Microsoft for a long time, and he looked at the Supercloud definition, said, "I think you need to tighten it up a little bit." So, here's what he came up with. He said, "A Supercloud is a platform that provides a programmatically consistent set of services hosted on heterogeneous cloud providers." So he's calling it a platform, not an architecture, which was kind of interesting. And so presumably the platform owner is going to be responsible for the architecture, but Dr. 
Nelu Mihai, who's a computer scientist behind the Cloud of Clouds Project, he chimed in and responded with the following. He said, "Cloud is a programming paradigm supporting the entire lifecycle of applications with data and logic natively distributed. Supercloud is an open architecture that integrates heterogeneous clouds in an agnostic manner." So, Ed, words matter. Is this an architecture or is it a platform? >> Put us on the spot. So, I'm sure you have concepts, I would say it's an architectural or design principle. Listen, I look at Supercloud as a mega trend, just like cloud, just like data analytics. And some companies are using the principle, design principles, to literally get dramatically ahead of everyone else. I mean, things you couldn't possibly do if you didn't use cloud principles, right? So I think it's a Supercloud effect, you're able to do things you're not able to. So I think it's more a design principle, but if you do it right, you get dramatic effect as far as customer value. >> So the conversation that we were having with Muglia, and Tristan Handy of dbt Labs, was, I'll set it up as the following, and, Thomas, would love to get your thoughts, if you have a CRM, think about applications today, it's all about forms and codifying business processes, you type a bunch of stuff into Salesforce, and all the salespeople do it, and this machine generates a forecast. What if you have this new type of data app that pulls data from the transaction system, the e-commerce, the supply chain, the partner ecosystem, et cetera, and then, without humans, actually comes up with a plan. That's their vision. And Muglia was saying, in order to do that, you need to rethink data architectures and database architectures specifically, you need to get down to the level of how the data is stored on the disk. What are your thoughts on that? >> Well, first of all, I'm going to cop out, I think it's actually both. 
I do think it's a design principle, I think it's not open technology, but open APIs, open access, and you can build a platform on that design principle architecture. Now, I'm a database person, I love solving the database problems. >> I was waiting for you to launch into this. >> Yeah, so I mean, you know, Snowflake is a database, right? It's a distributed database. And we wanted to crack those codes, because, multi-region, multi-cloud, customers wanted access to their data, and their data is in a variety of forms, all these services that you talked about. And so what I saw as a core principle was cloud object storage, everyone streams their data to cloud object storage. From there we said, well, how about we rethink database architecture, rethink file format, so that we can take each one of these services and bring them together, whether distributively or centrally, such that customers can access and get answers, whether it's operational data, whether it's business data, AKA search, or SQL, complex distributed joins. But we had to rethink the architecture. I like to say we're not a first generation, or a second, we're a third generation distributed database on pure, pure cloud storage, no caching, no SSDs. Why? Because all that availability, the cost of time, is a struggle, and cloud object storage, we think, is the answer. >> So when you're saying no caching, so when I think about how companies are solving some, you know, pretty hairy problems, take MySQL HeatWave, everybody thought Oracle was going to just forget about MySQL, well, they come out with HeatWave. And the way they solve problems, and you see their benchmarks against Amazon, "Oh, we crush everybody," is they put it all in memory. So you said no caching? You're not getting performance through caching? How is that true, and how are you getting performance? >> Well, so five, six years ago, right? 
When you realize that cloud object storage is going to be everywhere, and it's going to be a core foundational, if you will, fabric, what would you do? Well, a lot of times the second generation say, "We'll take it out of cloud storage, put in SSDs or something, and put into cache." And that adds a lot of time, adds a lot of costs. But I said, what if, what if we could actually make the first read hot, the first read distributed joins and searching? And so what we went out to do was say, we can't cache, because that adds time, that adds cost. We have to make cloud object storage high performance, like it feels like a caching SSD. That's where our patents are, that's where our technology is, and we've spent many years working towards this. So, to me, if you can crack that code, a lot of these issues we're talking about, multi-region, multicloud, different services, everybody wants to send their data to the data lake, but then they move it out, we said, "Keep it right there." >> You nailed it, the data gravity. So, Bob's right, the data's coming in, and you need to get the data from everywhere, but you need an environment that you can deal with all that different schema, all the different type of technology, but also at scale. Bob's right, you cannot use memory or SSDs to cache that, that doesn't scale, it doesn't scale cost effectively. But if you could, and what you did, is you made object storage, S3 first, but object storage, the only persistence by doing that. And then we get performance, we should talk about it, it's literally, you know, hundreds of terabytes of queries, and it's done in seconds, it's done without memory caching. We have concepts of caching, but the only caching, the only persistence, is actually when we're doing caching, we're just keeping another side-eye track of things on the S3 itself. 
So we're using, actually, the object storage to be a database, which is kind of where Bob was saying, we agree, but that's what you started at, people thought you were crazy. >> And maybe make it live. Don't think of it as archival or temporary space, make it live, real time streaming, operational data. What we do is make it smart, we see the data coming in, we uniquely index it such that you can get your use cases, that are search, observability, security, or backend operational. But we don't have to have this, I dunno, static, fixed, siloed type of architecture technologies that were traditionally built prior to Supercloud thinking. >> And you don't have to move everything, essentially, you can do it wherever the data lands, whatever cloud across the globe, you're able to bring it together, you get the cost effectiveness, because the only persistence is the cheapest storage persistent layer you can buy. But the key thing is you cracked the code. >> We had to crack the code, right? That was the key thing. >> That's where the patents are. >> And then once you do that, then everything else gets easier to scale, your architecture, across regions, across cloud. >> Now, it's a general purpose database, as Bob was saying, but we use that database to solve a particular issue, which is around operational data, right? So, we agree with Bob's. >> Interesting. So this brings me to this concept of data mesh, Zhamak Dehghani is one of our speakers, you know, we talk about data fabric, which is a NetApp, originally NetApp concept, Gartner's kind of co-opted it. But so, the basic concept is, data lives everywhere, whether it's an S3 bucket, or a SQL database, or a data lake, it's just a node on the data mesh. So in your view, how does this fit in with Supercloud? Ed, you've said that you've built, essentially, an enabler for that, for the data mesh, I think you're an enabler for the Supercloud-like principles. This is a big, chewy opportunity, and it requires, you know, a team approach. 
There's got to be an ecosystem, there's not going to be one Supercloud to rule them all, so where does the ecosystem fit into the discussion, and where do you fit into the ecosystem? >> Right, so we agree completely, there's not one Supercloud in effect, but we use Supercloud principles to build our platform, and then, you know, the ecosystem's going to be built on leveraging what everyone else's secret powers are, right? So our power, our superpower, based upon what we built is, we deal with, if you're having any scale, or cost effective scale issues, with data, machine generated data, like business observability or security data, we are your force multiplier, we will take that in singularly, just let it, simply put it in your object storage wherever it sits, and we give you uniform access to that using open API access, SQL, or you know, Elasticsearch API. So, that's what we do, that's our superpower. So I'll play it into data mesh, that's a perfect, we are a node on a data mesh, but I'll play it into the Supercloud, about how, the ecosystem, we see it kind of playing, and we talked about it just in the last couple days, how we see this kind of possibly. Short term, our superpowers, we deal with this data that's coming at these environments, people, customers, building out observability or security environments, or vendors that are selling their own Supercloud, I do observability, the Datadogs of the world, dot dot dot, the Splunks of the world, dot dot dot, and security. So what we do is we fit in naturally. What we do is a cost effective scale, just land it anywhere in the world, we deal with ingest, and it's a cost effective, an order of magnitude, or two or three orders of magnitude more cost effective. Allows them, their customers are asking them to do the impossible, "Give me fast monitoring alerting. I want it snappy, but I want it to keep two years of data, (laughs) and I want it cost effective." It doesn't work. 
They're good at the fast monitoring alerting, we're good at the long-term retention. And yet there's some gray area between those two, but one to one is actually cheaper, so we would partner. So the first ecosystem plays, who wants to have the ability to, really, all the data's in those same environments, the security observability players, they can literally, just through API, drag our data into their point to grab. We can make it seamless for customers. Right now, we make it helpful to customers. You're Datadog, we make a button, easy go from Datadog to us for logs, save you money. Same thing with Grafana. But you can also look at ecosystem, those same vendors, it used to be a year ago it was, you know, it's all about how can you grow, like it's growth at all costs, now it's about COGS. So literally we can go into an environment, you supply what your customer wants, but we can help with COGS. And one-on-one in a partnership is better than you trying to build on your own. >> Thomas, you were saying you make the first read fast, so you think about Snowflake. Everybody wants to talk about Snowflake and Databricks. So, Snowflake, great, but you got to get the data in there. All right, so that's, can you help with that problem? >> I mean we want simple in, right? And if you have to have structure in, you're not simple. So the idea that you have a simple in, data lake, schema-on-read type philosophy, but schema-on-write type performance. And so what I wanted to do, what we have done, is have that simple lake, and stream that data real time, and those access points of Search or SQL, to go after whatever business case you need, security observability, warehouse integration. But the key thing is, how do I make that click, click, click answer, and do it quickly? And so what we want to do is, that first read has to be fast. Why? 'Cause then you're going to do all this siloing, layers, complexity. If your first read's not fast, you're at a disadvantage, particularly in cost. 
And nobody says I want less data, but everyone has to, whether they say we're going to shorten the window, we're going to use AI to choose, but in a security moment, when you don't have that answer, you're in trouble. And that's why we are this service, this Supercloud service, if you will, providing access, well-known search, well-known SQL type access, that if you just have one access point, you're at a disadvantage. >> We actually talked about Snowflake and BigQuery, and a different platform, Databricks. That's kind of where we see the phase two of ecosystem. One is easy, the low-hanging fruit is observability and security firms. But the next one is, what we do, our superpower is dealing with this messy data that schema is changing like night and day. Pipelines are tough, and it's changing all the time, but you want these things fast, and it's big data around the world. That's the next point, just use us alongside, or inside, one of their platforms, and now we get the best of both worlds. Our superpower is keeping this messy data as a streaming, okay, not a batch thing, allow you to do that. So, that's the second one. And then to be honest, the third one, which plays you to Supercloud, it also plays perfectly in the data mesh, is if you really go to the ultimate thing, what we have done is made object storage, S3, GCS, and blob storage, we made it a database. Put, get, complex query with big joins. You know, so back to your original thing, and Muglia teed it up perfectly, we've done that. Now imagine if that's an ecosystem, who would want that? If it's, again, it's uniform available across all the regions, across all the clouds, and it's right next to where you are building a service, or a client's trying, that's where the ecosystem, I think people are going to use Superclouds for their superpowers. We're really good at this, allows that short term. I think the Snowflakes and the Databricks are the medium term, you know? 
And then I think eventually gets to, hey, listen if you can make object storage fast, you can just go after it with simple SQL queries, or elastic. Who would want that? I think that's where people are going to leverage it. It's not going to be one Supercloud, and we leverage the super clouds. >> Our viewpoint is smart object storage can be programmable, and so we agree with Bob, but we're not saying do it here, do it here. This core, fundamental layer across regions, across clouds, that everyone has? Simple in. Right now, it's hard to get data in for access for analysis. So we said, simply, we'll automate the entire process, give you API access across regions, across clouds. And again, how do you do a distributed join that's fast? How do you do a distributed join that doesn't cost you an arm or a leg? And how do you do it at scale? And that's where we've been focused. >> So prior, the cloud object store was a niche. >> Yeah. >> S3 obviously changed that. How standard is, essentially, object store across the different cloud platforms? Is that a problem for you? Is that an easy thing to solve? >> Well, let's talk about it. I mean we've fundamentally, yeah we've extracted it, but fundamentally, cloud object storage, put, get, and list. That's why it's so scalable, 'cause it doesn't have all these other components. That complexity is where we have moved up, and provide direct analytical API access. So because of its simplicity, and costs, and security, and reliability, it can scale naturally. I mean, really, distributed object storage is easy, it's put-get anywhere, now what we've done is we put a layer of intelligence, you know, call it smart object storage, where access is simple. So whether it's multi-region, do a query across, or multicloud, do a query across, or hunting, searching. >> We've had clients doing Amazon and Google, we have some Azure, but we see Amazon and Google more, and it's a consistent service across all of them. 
Just literally put your data in the bucket of choice, or folder of choice, click a couple buttons, literally click that to say "that's hot," and after that, it's hot, you can see it. But we're not moving data, the data gravity issue, that's the other. That it's already natively flowing to these pools of object storage across different regions and clouds. We don't move it, we index it right there, we're spinning up stateless compute, back to the Supercloud concept. But now that allows us to do all these other things, right? >> And it's no longer just cheap and deep object storage. Right? >> Yeah, we make it the same, like you have an analytic platform regardless of where you're at, you don't have to worry about that. Yeah, we deal with that, we deal with a stateless compute coming up -- >> And make it programmable. Be able to say, "I want this bucket to provide these answers." Right, that's really the hope, the vision. And the complexity to build the entire stack, and then connect them together, we said, the fabric is cloud storage, we just provide the intelligence on top. >> Let's bring it back to the customers, and one of the things we're exploring in Supercloud 2 is, you know, is Supercloud a solution looking for a problem? Is multicloud really a problem? I mean, you hear, you know, a lot of the vendor marketing says, "Oh, it's a disaster, because it's all different across the clouds." And I talked to a lot of customers even as part of Supercloud 2, they're like, "Well, I solved that problem by just going mono cloud." Well, but then you're not able to take advantage of a lot of the capabilities and the primitives that, you know, like Google's data, or you like Microsoft's simplicity, their RPA, whatever it is. So what are customers telling you, what are their near term problems that they're trying to solve today, and how are they thinking about the future? >> Listen, it's a real problem. I think it started, I think this is a mega trend, just like cloud. 
Just, cloud data, and I always add, analytics, are the mega trends. If you're looking at those, if you're not considering using the Supercloud principles, in other words, leveraging what I have, abstracting it out, and getting the most out of that, and then build value on top, I think you're not going to be able to keep up. In fact, no way you're going to keep up with this data volume. It's a geometric challenge, and you're trying to do linear things. So clients aren't necessarily asking, hey, for Supercloud, but they're really saying, I need to have a better mechanism to simplify this and get value across it, and how do you abstract that out to do that? And that's where they're obviously, our conversations are more amazed what we're able to do, and what they're able to do with our platform, because if you think of what we've done, the S3, or GCS, or object storage, is they can't imagine the ingest, they can't imagine how easy, time to glass, one minute, no matter where it lands in the world, querying this in seconds for hundreds of terabytes squared. People are amazed, but that's kind of, so they're not asking for that, but they are amazed. And then when you start talking on it, if you're an enterprise person, you're building a big cloud data platform, or doing data or analytics, if you're not trying to leverage the public clouds, and somehow leverage all of them, and then build on top, then I think you're missing it. So they might not be asking for it, but they're doing it. >> And they're looking for a lens, you mentioned all these different services, how do I bring those together quickly? You know, our viewpoint, our service, is I have all these streams of data, create a lens where they want to go after it via search, go after via SQL, bring them together instantly, no ETL-ing out, no define this table, put into this database. We said, let's have a service that creates a lens across all these streams, and then make those connections. 
I want to take my CRM with my Google AdWords, and maybe my Salesforce, how do I do analysis? Maybe I want to hunt first, maybe I want to join, maybe I want to add another stream to it. And so our viewpoint is, it's so natural to get into these lake platforms and then provide lenses to get that access. >> And they don't want it separate, they don't want something different here, and different there. They want it basically -- >> So this is our industry, right? If something new comes out, remember virtualization came out, "Oh my God, this is so great, it's going to solve all these problems." And all of a sudden it just got to be this big, more complex thing. Same thing with cloud, you know? It started out with S3, and then EC2, and now hundreds and hundreds of different services. So, it's a complex matter for a lot of people, and this creates problems for customers, especially when you got divisions that are using different clouds, and you're saying that the solution, or a solution for the part of the problem, is to really allow the data to stay in place on S3, use that standard, super simple, but then give it what, Ed, you've called superpower a couple of times, to make it fast, make it inexpensive, and allow you to do that across clouds. >> Yeah, yeah. >> I'll give you guys the last word on that. >> No, listen, I think, we think Supercloud allows you to do a lot more. And for us, data, everyone says more data, more problems, more budget issue, everyone knows more data is better, and we show you how to do it cost effectively at scale. And we couldn't have done it without the design principles of we're leveraging the Supercloud to get capabilities, and because we use super, just the object storage, we're able to get these capabilities of ingest, scale, cost effectiveness, and then we built on top of this. 
In the end, a database is a data platform that allows you to go after everything distributed, and to get one platform for analytics, no matter where it lands, that's where we think the Supercloud concepts are perfect, that's where our clients are seeing it, and we're kind of excited about it. >> Yeah a third generation database, Supercloud database, however we want to phrase it, and make it simple, but provide the value, and make it instant. >> Guys, thanks so much for coming into the studio today, I really thank you for your support of theCUBE, and theCUBE community, it allows us to provide events like this and free content. I really appreciate it. >> Oh, thank you. >> Thank you. >> All right, this is Dave Vellante for John Furrier in theCUBE community, thanks for being with us today. You're watching Supercloud 2, keep it right there for more thought provoking discussions around the future of cloud and data. (bright music)
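
Thomas's remark that cloud object storage comes down to put, get, and list can be sketched in a few lines. What follows is only an illustration under stated assumptions: `ObjectStore`, `InMemoryStore`, and `search` are hypothetical names that do not come from the discussion, the in-memory class stands in for a real bucket, and the brute-force scan is a placeholder for whatever indexing ChaosSearch actually uses, which the speakers do not describe.

```python
from typing import Dict, Iterator, List, Protocol


class ObjectStore(Protocol):
    """The three primitives named in the discussion: put, get, and list."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def list(self, prefix: str = "") -> Iterator[str]: ...


class InMemoryStore:
    """Hypothetical stand-in for an S3, GCS, or Azure Blob bucket."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list(self, prefix: str = "") -> Iterator[str]:
        # Keys are returned in sorted order so results are deterministic.
        return (k for k in sorted(self._objects) if k.startswith(prefix))


def search(stores: List[ObjectStore], prefix: str, term: str) -> List[str]:
    """Scan each store in place and collect matching lines; no data is copied out."""
    hits: List[str] = []
    for store in stores:
        for key in store.list(prefix):
            for line in store.get(key).decode("utf-8").splitlines():
                if term in line:
                    hits.append(line)
    return hits


# Two "clouds", one query across both.
aws = InMemoryStore()
gcp = InMemoryStore()
aws.put("logs/a.log", b"ok login\nerror disk full")
gcp.put("logs/b.log", b"error timeout")
print(search([aws, gcp], "logs/", "error"))
```

Because the query layer depends only on the three primitives, the same `search` call works against any bucket that exposes them, which is the portability argument made in the conversation.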

Published Date : Feb 17 2023



Nir Zuk, Palo Alto Networks | An Architecture for Securing the Supercloud

(bright upbeat music) >> Welcome back, everybody, to Supercloud 2. My name is Dave Vellante. And I'm pleased to welcome Nir Zuk. He's the founder and CTO of Palo Alto Networks. Nir, good to see you again. Welcome. >> Same here. Good to see you. >> So let's start with the right security architecture in the context of today's fragmented market. You've got a lot of different tools, you've got different locations, on-prem, you've got hardware and software. Tell us about the right security architecture from your standpoint. What's that look like? >> You know, the funny thing is using the words security and architecture together rarely works. (Dave chuckles) If you ask a typical information security person to step up to a whiteboard and draw their security architecture, they will look at you as if you fell from the moon. I mean, haven't you been here in the last 25 years? There's no security architecture. The architecture today is just buying a bunch of products and dropping them into the infrastructure in some relatively random way without really any guiding architecture. And that's a huge challenge in cybersecurity. It's always been, we've always tried to find ways to put an architecture into writing, blueprints, whatever you want to call it, and it's always been difficult. Luckily, two things. First, there's something called zero trust, which we can talk a little bit about more, if you want, and zero trust among other things is really a way to create a security architecture, and second, because in the cloud, in the supercloud, we're starting from scratch, we can do things differently. We don't have to follow the way we've always done cybersecurity, again, buying random products, okay, maybe not random, maybe there is some thinking going into it, buying products, one after the other, dropping them in, and doing it over 20 years and ending up with a mess. In the cloud, we have an opportunity to do it differently and really have an architecture. 
>> You know, I love talking to founders and particularly technical founders from StartupNation. I think I saw an article, I think it was Uri Levine, one of the founders or co-founders of Waze, and he had a t-shirt on, it said, "Fall in love with the problem, not the solution." Is that how you approached architecture? You talk about zero trust, it's a relatively new term, but was that in your head when you thought about forming the company? >> Yeah, so when I started Palo Alto Networks, exactly, by the way, 17 years ago, we got funded January 2006, January 18th, 2006. The idea behind Palo Alto Networks was to create a security platform and over time take more and more cybersecurity functions and deliver them on top of that platform, by the way, as a service, SaaS. Everybody thought we were crazy trying to combine many functions into one platform, best of breed and defense in depth and putting all your eggs in the same basket and a bunch of other slogans were flying around, and also everybody thought we were crazy asking customers to send information to the cloud in order to secure themselves. Of course, step forward 17 years, everything is now different. We changed the market. Almost all of cybersecurity today is delivered as SaaS and platforms are ruling more and more the world. And so again, the idea behind the platform was to over time take more and more cybersecurity functions and deliver them together, one brain, one decision being made for each and every packet or system call or file or whatever it is that you're making the decision about and it works really, really well. As a side effect, when you combine that with zero trust and you end up with, let's not call it an architecture yet. 
You end up with something where any user, any location, both geographically as well as any location in terms of branch office, headquarters, home, coffee shop, hotel, whatever, so any user, any geographical location, any location, any connectivity method, whether it is SD-WAN or IPsec or client VPN or clientless VPN or proxy or browser isolation or whatever and any application deployed anywhere, public cloud, private cloud, traditional data center, SaaS, you secure the same way. That's really zero trust, right? You secure everything, no matter who the user is, no matter where they are, no matter where they go, you secure them exactly the same way. You don't make any assumptions about the user or the application or the location or whatever, just because you trust nothing. And as a side effect, when you do that, you end up with a security architecture, the security architecture I just described. The same thing is true for securing applications. If you try to really think and not just act instinctively the way we usually do in cybersecurity and you say, I'm going to secure my traditional data center applications or private cloud applications and public cloud applications and my SaaS applications the same way, I'm not going to trust something just because it's deployed in the private data center. I'm not going to trust two components of an application or two applications talking to each other just because they're deployed in the same place versus if one component is deployed in one public cloud and the other component is deployed in another public cloud or private cloud or whatever. I'm going to secure all of them the same way without making any trust assumptions. You end up with an architecture for securing your applications, which is applicable for the supercloud. >> It was very interesting. There's a debate I want to pick up on what you said because you said don't call it an architecture yet. 
So Bob Muglia, I dunno if you know Bob, but he sort of started the debate, said, "Supercloud, think of it as a platform, not an architecture." And there are others that are saying, "No, no, if we do that, then we're going to have a bunch of more stove pipes. So there needs to be standard, almost a purist view. There needs to be a supercloud architecture." So how do you think about it? And it's a bit academic, I know, but do you think of this idea of a supercloud, this layer of value on top of the hyperscalers, do you think of that as a platform approach that each of the individual vendors are responsible for the architecture? Or is there some kind of overriding architecture of standards that needs to emerge to enable the supercloud? >> So we can talk academically or we can talk practically. >> Yeah, let's talk practically. That's who you are. (Dave laughs) >> Practically, this world is ruled by financial interests and none of the public cloud providers, especially the bigger they are has any interest of making it easy for anyone to go multi-cloud, okay? Also, on top of that, if we want to be even more practical, each of those large cloud providers, cloud scale providers have engineers and all these engineers think they're the best in the world, which they are and they all like to do things differently. So you can't expect things in AWS and in Azure and GCP and in the other clouds like Oracle and Ali and so on to be the same. They're not going to be the same. And some things can be abstracted. Maybe cloud storage or bucket storage can be abstracted with the layer that makes them look the same no matter where you're running. And some things cannot be abstracted and unfortunately will not be abstracted because the economical interest and the way engineers work won't let it happen. We as a third party provider, cybersecurity provider, and I'm sure other providers in other areas as well are trying or we're doing our best. 
We're not trying, we are doing our best, and it's pretty close to being the way you describe the top of your supercloud. We're building something that abstracts the underlying cloud such that securing each of these clouds, and by the way, I would add private cloud to it as well, looks exactly the same. So we use, almost always, whenever possible, the same terminology, no matter which cloud we're securing and the same policy and the same alerts and the same information and so on. And that's also very important because when you look at the people that actually end up using the product, security engineers and more importantly, SOC, security operations center analysts, they're not going to study the details of each and every cloud. It's just going to be too much. So we need to abstract it for them. >> Yeah, we agree by the way that the supercloud definition is inclusive of on-prem, you know, what you call private cloud. And I want to pick up on something else you said. I think you're right that abstracting and making consistent across clouds something like object storage, GET, PUT, you know, whether it's an S3 bucket or an Azure Blob, is, relatively speaking, trivial. When you now bring that supercloud concept to something more complex like security, first of all, is it technically feasible, and I'm inferring the answer there is yes, and if so, what do you see as the main technical challenges of doing so? >> So it is feasible to the extent that the different clouds provide the same functionality. Then you step into a territory where different cloud providers have different PaaS services and different cloud providers do things a little bit differently and they have different sets of permissions and different logging that sometimes provides all the information and sometimes it doesn't. So you end up with some differences. And then the question is, do you abstract the lowest common denominator and that's all you support? Or do you find a way to be smarter than that? 
And yeah, whatever can be abstracted is abstracted and whatever cannot be abstracted, you find an easy way to represent that to your users, security engineers, security analysts, and so on, which is what I believe we do. >> And you do that by what? Inventing or developing technology that presents that experience to users? Could you be more specific there? >> Yeah, so different cloud providers call their storage by different names and you use different ways to configure them and the logs come out the same. So we normalize it. I mean, the keyword is probably normalization. Normalize it. And we try to, you know, then you have to pick a winner here and to use someone's terminology or you need to invent new terminology. So we try to use the terminology of the largest cloud provider so that we have a better chance of doing that but we can't always do that because they don't support everything that other cloud providers provide, but the important thing is, with or thanks to that normalization, our customers both on the engineering side and on the user side, operations side end up having to learn one terminology in order to set policies and understand attacks and investigate incidents. >> I wonder if I could pick your brain on what you see as the ideal deployment model to achieve this supercloud experience. For example, do you think instantiating your stack in multiple regions and multiple clouds is the right way to do it? Or is building a single global instance on top of the clouds a more preferable way? Are there maybe other models we should consider? What do you see as the trade-off of these different deployment models and which one is ideal in your view? >> Yeah, so first, when you deploy cloud security, you have to decide whether you're going to use agents or not. By agents, I mean something working, something running inside the workload. 
Inside a virtual machine, on the container host, attached to a function, a serverless function, and so on. And I, of course, recommend using agents because that enables prevention, it enables functionality you cannot get without agents, but you have to choose that. Now, of course, if you choose agents, you need to deploy AWS agents in AWS and GCP agents in GCP and Azure agents in Azure and so on. Of course, you don't do it manually. You do it through the CI/CD pipeline. And then the second thing that you need to do is you need to connect with the consoles. Of course, that can be done over the internet no matter where your security instance is running. You can run it on premise, you can run it in one of the other different clouds. Of course, we don't run it on premise. We prefer not to run it on premise because if you're securing the cloud, you might as well run in the cloud. And then the question is, for example, do you run a separate instance for AWS, for GCP, or for Azure, or do you want to run one instance for all of them in one of these clouds? And there are advantages and disadvantages. I think that from a security perspective, it's always better to run in one place because then when you collect the information, you get information from all the clouds and you can start looking for cross-cloud issues, incidents, attacks, and so on. The downside of that is that you need to send all the information to one of the clouds and you probably know that sending data out of the cloud costs a lot of money versus keeping it in the cloud. So theoretically, you can build an architecture where you keep the data for AWS in AWS, Azure in Azure, GCP in GCP, and then you try to run distributed queries. When you do that, you find out you'd end up paying more for the compute to do that than you would've paid for sending all the data to a central location. 
So we prefer the approach of running in one place, bringing all the data there, and running all the security, the machine learning or whatever, the rules or whatever it is that you're running in one place versus trying to create a distributed deployment in order to try to save some money on the data, the network data transfers. >> Yeah, thank you for that. That makes a lot of sense. And so basically, should we think about the next layer as building a security data lake, if you will, and then running machine learning on top of that, if I can use that term of a data lake or a lake house? Is that sort of where you're headed? >> Yeah, look, the world is headed in that direction, not just the cybersecurity world. The world is headed from being rule-based to being data-based. So cybersecurity is not different and what we used to do with rules in the past, we're now doing with machine learning. So in the past, you would define rules saying, if you see this, this, and this, it's an attack. Now you just throw the data at the machine, I mean, I'm simplifying it, but you throw data at a machine. You'll tell the machine, find the attack in the data. It's not that simple. You need to build the right machine learning models. It needs to be done by people that are both cybersecurity experts and machine learning experts. We do it mostly with ex-military offensive people that take their offensive knowledge and translate it into machine learning models. But look, the world is moving in that direction and cybersecurity is moving in that direction as well. You need to collect a lot of data. Like I said, I prefer to see all the data in one place so that the machine learning can be much more efficient, pay for transferring the data, save money on the compute. >> I think the drop-the-mic quote at Ignite that you had was within five years, your security operation is going to be AI-powered. And so you could probably apply that to virtually any job over the next five years. 
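A minimal sketch of the shift Nir describes above, from hand-written rules to letting the data define what is abnormal. It is an illustration only: the z-score baseline is a simple statistical stand-in, not the machine learning models he refers to, and the function names, the failed-login signal, and every threshold are assumptions made for the example.

```python
from statistics import mean, stdev


def rule_based_alert(failed_logins: int, threshold: int = 100) -> bool:
    """Classic rule: alert only when a fixed, hand-picked threshold is crossed."""
    return failed_logins > threshold


def fit_baseline(history: list[int]) -> tuple[float, float]:
    """Learn what 'normal' looks like from past observations (mean, std dev)."""
    return mean(history), stdev(history)


def data_based_alert(value: int, baseline: tuple[float, float],
                     z_cutoff: float = 3.0) -> bool:
    """Alert when a value is far from the learned baseline, whatever its size."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any deviation at all is unusual.
        return value != mu
    return abs(value - mu) / sigma > z_cutoff


# Hypothetical hourly failed-login counts for one account.
QUIET_HISTORY = [4, 5, 6, 5, 4, 6, 5]
```

With this quiet history, a burst of 40 failed logins stays under the fixed rule's threshold of 100 but is far outside the learned baseline, which is the kind of case a purely rule-based setup can miss.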
>> I don't know if any job. Certainly writing essays for school is automated already as we've seen with ChatGPT and potentially other things. By the way, we need to talk at some point about ChatGPT security. I don't want to think what happens when someone spends a lot of money on creating a lot of fake content and teaches ChatGPT the wrong answer to a question. We start seeing ChatGPT as the oracle of everything. We need to figure out what to do with the security of that. But yeah, things have to be automated in cybersecurity. They have to be automated. There's just too much data to deal with and it's just not even close to being good enough to wait for an incident to happen and then go investigate the incident based on the data that we have. It's better to look at all the data all the time, millions of events per second, and find those incidents before they happen. There's no way to do that without machine learning. >> I'd love to have you back and talk about ChatGPT. I know they're trying to put in some guardrails but there are a lot of unintended consequences, aren't there? >> Look, if they're not going to have a person filtering the data, then with enough money, you can create thousands or tens of thousands of pieces of articles or whatever that look real and teach the machine something that is totally wrong. >> We were talking about the hyperscalers before and I agree with you. It's very unlikely they're going to get together, band together, and create these standards. But it's not a static market. It's a moving train, if you will. So assuming you're building this cross-cloud experience, which you are, what do you want from the hyperscalers? What do you want them to bring to the table? What does a technology supplier like Palo Alto Networks bring? In other words, where do you see ongoing as your unique value add and that moat that you're building and how will that evolve over time vis-a-vis the hyperscaler evolution? >> Yeah, look, we need APIs. 
The more data we have, the more access we have to more data, the less restricted the access is and the cheaper the access is to the data because someone has to pay today for some reason for accessing that data, the more secure their customers are going to be. So we need help, and they are helping, by the way, a lot, all of them, in finding easy ways for customers to deploy things in the cloud, access data, and again, a lot of data, very diversified data and do it in a cost-effective way. >> And when we talk about the edge, I presume you look at the edge as just another data center or maybe it's the reverse. Maybe the data center is just another edge location, but you're seeing specific edge security solutions come out. I'm guessing that you would say, that's not what we want. Edge should be part of that architecture that we talked about earlier. Do you agree? >> Correct, it should be part of the architecture. I would also say that the edge provides an opportunity specifically for network security, whereas traditional network security would be deployed on premise. I'm talking about internet security but half network security market, and not just network security but also the other network intelligent functions like routing and QoS. We're seeing a trend of pushing those to the edge of the cloud. So what you deploy on premise is technology for bringing packets to the edge of the cloud and then you run your security at the edge, whatever that edge is, whether it's a private edge or public edge, you run it in the edge. It's called SASE, Secure Access Service Edge, pronounced "sassy." >> Nir, I got to thank you so much. You're such a clear thinker. I really appreciate you participating in Supercloud 2. >> Thank you. >> All right, keep it right there for more content covering the future of cloud and data. This is Dave Vellante for John Furrier. I'll be right back. (bright upbeat music)
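A minimal sketch of the normalization Nir describes earlier in the conversation: events from different clouds are mapped onto one terminology, and whatever cannot be abstracted is still surfaced rather than dropped. Everything here is an assumption made for illustration. The raw field names are heavily simplified stand-ins for real provider logs, the choice of "bucket" as the common term simply follows his remark about borrowing the largest provider's vocabulary, and none of it reflects Palo Alto Networks' actual schema.

```python
from dataclasses import dataclass

# One shared term for object storage, whatever the provider calls it.
COMMON_STORAGE_TERM = "bucket"

# Hypothetical, simplified field names for each cloud's raw event.
FIELD_MAP = {
    "aws": {"name": "bucketName", "action": "eventName"},
    "azure": {"name": "containerName", "action": "operationName"},
    "gcp": {"name": "bucket", "action": "methodName"},
}

# Provider-specific action names mapped onto one shared vocabulary.
ACTION_MAP = {
    "GetObject": "read",
    "PutObject": "write",
    "GetBlob": "read",
    "PutBlob": "write",
    "storage.objects.get": "read",
    "storage.objects.create": "write",
}


@dataclass(frozen=True)
class NormalizedEvent:
    cloud: str
    resource_type: str  # always the shared term
    resource_name: str
    action: str         # shared vocabulary, or "provider-specific"
    raw_action: str     # original name, kept for investigation


def normalize(cloud: str, raw: dict) -> NormalizedEvent:
    """Translate one raw storage event into the shared terminology."""
    fields = FIELD_MAP[cloud]
    raw_action = raw[fields["action"]]
    return NormalizedEvent(
        cloud=cloud,
        resource_type=COMMON_STORAGE_TERM,
        resource_name=raw[fields["name"]],
        # Anything outside the shared vocabulary is flagged, not discarded,
        # so the model is not limited to the lowest common denominator.
        action=ACTION_MAP.get(raw_action, "provider-specific"),
        raw_action=raw_action,
    )
```

A policy or alert written against `NormalizedEvent` then reads the same for every cloud, while `raw_action` keeps the provider's own name available to an analyst who needs it.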

Published Date : Feb 17 2023



Discussion about Walmart's Approach | Supercloud2

(upbeat electronic music) >> Okay, welcome back to Supercloud 2, live here in Palo Alto. I'm John Furrier, with Dave Vellante. Again, all day wall-to-wall coverage, just had a great interview with Walmart, we've got a next interview coming up, you're going to hear from Bob Muglia and Tristan Handy, two experts, both experienced entrepreneurs, executives in technology. We're here to break down what just happened with Walmart, and what's coming up with George Gilbert, former colleague, Wikibon analyst, Gartner analyst, and now independent investor and expert. George, great to see you, I know you're following this space. Like you read about it, remember the first days when Dataverse came out, we were talking about them coming out of Berkeley? >> Dave: Snowflake. >> John: Snowflake. >> Dave: Snowflake in the early days. >> We, collectively, have been chronicling the data movement since 2010, you were part of our team, now you've got your nose to the grindstone, you're seeing the next wave. What's this all about? Walmart building their own super cloud, we got Bob Muglia talking about how these next wave of apps are coming. What are the super apps? What's the super cloud to you? >> Well, this keys off Dave's really interesting questions to Walmart, which was like, how are they building their supercloud? 'Cause it makes a concrete example. But what was most interesting about his description of the Walmart WCNP, I forgot what it stood for. >> Dave: Walmart Cloud Native Platform. >> Walmart, okay. He was describing where the logic could run in these stateless containers, and maybe eventually serverless functions. But that's just it, and that's the paradigm of microservices, where the logic is in this stateless thing, where you can shoot it, or it fails, and you can spin up another one, and you've lost nothing. >> That was their triplet model. >> Yeah, in fact, and that was what they were trying to move to, where these things move fluidly between data centers. 
>> But there's a but, right? Which is they're all stateless apps in the cloud. >> George: Yeah. >> And all their stateful apps are on-prem and VMs. >> Or the stateful part of the apps are in VMs. >> Okay. >> And so if they really want to lift their super cloud layer off of this different provider's infrastructure, they're going to need a much more advanced software platform that manages data. And that goes to the -- >> Muglia and Handy, that you and I did, that's coming up next. So the big takeaway there, George, was, I'll set it up and you can chime in, a new breed of data apps is emerging, and this highly decentralized infrastructure. And Tristan Handy of DBT Labs has a sort of a solution to begin the journey today, Muglia is working on something that's way out there, describe what you learned from it. >> Okay. So to talk about what the new data apps are, and then the platform to run them, I go back to the using what will probably be seen as one of the first data app examples, was Uber, where you're describing entities in the real world, riders, drivers, routes, city, like a city plan, these are all defined by data. And the data is described in a structure called a knowledge graph, for lack of a, no one's come up with a better term. But that means the tough, the stuff that Jack built, which was all stateless and sits above cloud vendors' infrastructure, it needs an entirely different type of software that's much, much harder to build. And the way Bob described it is, you're going to need an entirely new data management infrastructure to handle this. 
But where, you know, we had this really colorful interview where it was like Rock 'Em Sock 'Em, but they weren't really that much in opposition to each other, because Tristan is going to define this layer, starting with like business intelligence metrics, where you're defining things like bookings, billings, and revenue, in business terms, not in SQL terms -- >> Well, business terms, if I can interrupt, he said the one thing we haven't figured out how to APIify is KPIs that sit inside of a data warehouse, and that's essentially what he's doing. >> George: That's what he's doing, yes. >> Right. And so then you can now expose those APIs, those KPIs, that sit inside of a data warehouse, or a data lake, a data store, whatever, through APIs. >> George: And the difference -- >> So what does that do for you? >> Okay, so all of a sudden, instead of working at technical data terms, where you're dealing with tables and columns and rows, you're dealing instead with business entities, using the Uber example of drivers, riders, routes, you know, ETA prices. But you can define, DBT will be able to define those progressively in richer terms, today they're just doing things like bookings, billings, and revenue. But Bob's point was, today, the data warehouse that actually runs that stuff, whereas DBT defines it, the data warehouse that runs it, you can't do it with relational technology. >> Dave: Relational technology, caching architecture. >> SQL, you can't -- >> SQL caching architectures in memory, you can't do it, you've got to rethink down to the way the data lake is laid out on the disk or cache. Which by the way, Thomas Hazel, who's speaking later, he's the chief scientist and founder at Chaos Search, he says, "I've actually done this," basically leave it in an S3 bucket, and I'm going to query it, you know, with no caching. 
>> All right, so what I hear you saying then, tell me if I got this right, there are some things that are inadequate in today's world, that are not compatible with the Supercloud wave. >> Yeah. >> Specifically how you're using storage, and data, and stateful. >> Yes. >> And then the software that makes it run, is that what you're saying? >> George: Yeah. >> There's one other thing you mentioned to me, it's like, when you're using a CRM system, a human is inputting data. >> George: Nothing happens till the human does something. >> Right, nothing happens until that data entry occurs. What you're talking about is a world that self-forms, pulling data from the transaction system, or the ERP system, and then builds a plan without human intervention. >> Yeah. Something in the real world happens, where the user says, "I want a ride." And then the software goes out and says, "Okay, we got to match a driver to the rider, we got to calculate how long it takes to get there, how long to deliver 'em." That's not driven by a form, other than the first person hitting a button and saying, "I want a ride." All the other stuff happens autonomously, driven by data and analytics. >> But my question was different, Dave, so I want to get specific, because this is where the startups are going to come in, this is the disruption. Snowflake is a data warehouse that's in the cloud, they call it a data cloud, they refactored it, they did it differently, the success, we all know what it looks like. These areas where it's inadequate for the future are areas that'll probably be either disrupted, or refactored. What is that? >> That's what Muglia's contention is, that the DBT can start adding that layer where you define these business entities, they're like mini digital twins, you can define them, but the data warehouse isn't strong enough to actually manage and run them. 
And Muglia is behind a company that is rethinking the database, really in a fundamental way that hasn't been done in 40 or 50 years. It's the first, in his contention, the first real rethink of database technology in a fundamental way since the rise of the relational database 50 years ago. >> And I think you admit it's a real Hail Mary, I mean it's quite a long shot, right? >> George: Yes. >> Huge potential. >> But they're pretty far along. >> Well, we've been talking on theCUBE for 12 years, and what, 10 years going to AWS re:Invent, Dave, that no one database will rule the world, Amazon kind of showed that with them. What's different, is it databases are changing, or you can have multiple databases, or? >> It's a good question. And the reason we've had multiple different types of databases, each one specialized for a different type of workload, but actually what Muglia is behind is a new engine that would essentially, you'll never get rid of the data warehouse, or the equivalent engine in like a Databricks lakehouse, but it's a new engine that manages the thing that describes all the data and holds it together, and that's the new application platform. >> George, we have one minute left, I want to get real quick thought, you're an investor, and we know your history, and the folks watching, George's got a deep pedigree in investment data, and we can testify to that. If you're going to invest in a company right now, if you're a customer, I got to make a bet, what does success look like for me, what do I want walking through my door, and what do I want to send out? What companies do I want to look at? What kind of vendor do I want to evaluate? Which ones do I want to send home? 
>> Well, the first thing a customer really has to do when they're thinking about next gen applications, all the people have told you guys, "we got to get our data in order," getting that data in order means building an integrated view of all your data landscape, which is data coming out of all your applications. It starts with the data model, so, today, you basically extract data from all your operational systems, put it in this one giant, central place, like a warehouse or lake house, but eventually you want this, whether you call it a fabric or a mesh, it's all the data that describes how everything hangs together as in one big knowledge graph. There's different ways to implement that. And that's the most critical thing, 'cause that describes your Uber landscape, your Uber platform. >> That's going to power the digital transformation, which will power the business transformation, which powers the business model, which allows the builders to build -- >> Yes. >> Coders to code. That's Supercloud application. >> Yeah. >> George, great stuff. Next interview you're going to see right here is Bob Muglia and Tristan Handy, they're going to unpack this new wave. Great segment, really worth unpacking and reading between the lines with George, and Dave Vellante, and those two great guests. And then we'll come back here for the studio for more of the live coverage of Supercloud 2. Thanks for watching. (upbeat electronic music)
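A minimal sketch of the idea George returns to throughout this segment: describing a business in terms of entities and their relationships (riders, drivers, rides, routes) rather than tables, columns, and rows. It is a toy under stated assumptions, a handful of subject-predicate-object facts held in memory, with a hypothetical `KnowledgeGraph` class and made-up entity names. It is not how Uber, dbt Labs, or the database company Muglia is involved with actually model data.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Stores facts as (subject, predicate, object) triples."""

    def __init__(self) -> None:
        # Maps (subject, predicate) to the set of related objects.
        self._objects: dict[tuple[str, str], set[str]] = defaultdict(set)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._objects[(subject, predicate)].add(obj)

    def objects(self, subject: str, predicate: str) -> list[str]:
        """Return everything related to `subject` via `predicate`, sorted."""
        return sorted(self._objects.get((subject, predicate), set()))


def drivers_for_rider(graph: KnowledgeGraph, rider: str) -> list[str]:
    """Follow rider -> requested ride -> assigned driver."""
    drivers: list[str] = []
    for ride in graph.objects(rider, "requested"):
        drivers.extend(graph.objects(ride, "assigned_to"))
    return drivers


def build_example() -> KnowledgeGraph:
    """A tiny, invented slice of a ride-hailing business."""
    graph = KnowledgeGraph()
    graph.add("rider:alice", "requested", "ride:1")
    graph.add("ride:1", "assigned_to", "driver:bob")
    graph.add("ride:1", "follows", "route:downtown")
    return graph
```

The question "who is driving this rider?" is asked in business terms and answered by walking relationships, which is the contrast with SQL-level thinking that the panel draws.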

Published Date : Feb 17 2023



AWS Startup Showcase S3E1

(upbeat electronic music) >> Hello everyone, welcome to this CUBE conversation here from the studios of theCUBE in Palo Alto, California. I'm John Furrier, your host. We're featuring a startup, Astronomer. Astronomer.io is the URL, check it out. And we're going to have a great conversation around one of the most important topics hitting the industry, and that is the future of machine learning and AI, and the data that powers it underneath it. There's a lot of things that need to get done, and we're excited to have some of the co-founders of Astronomer here. Viraj Parekh, who is co-founder of Astronomer, and Paola Peraza Calderon, another co-founder, both with Astronomer. Thanks for coming on. First of all, how many co-founders do you guys have? >> You know, I think the answer's around six or seven. I forget the exact, but there's really been a lot of people around the table who've worked very hard to get this company to the point that it's at. We have long ways to go, right? But there's been a lot of people involved that have been absolutely necessary for the path we've been on so far. >> Thanks for that, Viraj, appreciate that. The first question I want to get out on the table, and then we'll get into some of the details, is take a minute to explain what you guys are doing. How did you guys get here? Obviously, multiple co-founders, sounds like a great project. The timing couldn't have been better. ChatGPT has essentially done so much public relations for the AI industry to kind of highlight this shift that's happening. It's real, we've been chronicling it, take a minute to explain what you guys do. >> Yeah, sure, we can get started. So, yeah, when Viraj and I joined Astronomer in 2017, we really wanted to build a business around data, and we were using an open source project called Apache Airflow that we were just using sort of as customers ourselves. 
And over time, we realized that there was actually a market for companies who use Apache Airflow, which is a data pipeline management tool, which we'll get into, and that running Airflow is actually quite challenging, and that there's a big opportunity for us to create a set of commercial products and an opportunity to grow that open source community and actually build a company around that. So the crux of what we do is help companies run data pipelines with Apache Airflow. And certainly we've grown in our ambitions beyond that, but that's sort of the crux of what we do for folks. >> You know, data orchestration, data management has always been a big item in the old classic data infrastructure. But with AI, you're seeing a lot more emphasis on scale, tuning, training. Data orchestration is the center of the value proposition, when you're looking at coordinating resources, it's one of the most important things. Can you guys explain what data orchestration entails? What does it mean? Take us through the definition of what data orchestration entails. >> Yeah, for sure. I can take this one, and Viraj, feel free to jump in. So if you google data orchestration, here's what you're going to get. You're going to get something that says, "Data orchestration is the automated process for organizing siloed data from numerous data storage points, standardizing it, and making it accessible and prepared for data analysis." And you say, "Okay, but what does that actually mean," right, and so let's give sort of an example. So let's say you're a business and you have sort of the following basic asks of your data team, right? Okay, give me a dashboard in Sigma, for example, for the number of customers or monthly active users, and then make sure that that gets updated on an hourly basis. And then number two, a consistent list of active customers that I have in HubSpot so that I can send them a monthly product newsletter, right? 
Two very basic asks for all sorts of companies and organizations. And when that data team, which has data engineers, data scientists, ML engineers, data analysts get that request, they're looking at an ecosystem of data sources that can help them get there, right? And that includes application databases, for example, that actually have in product user behavior and third party APIs from tools that the company uses that also has different attributes and qualities of those customers or users. And that data team needs to use tools like Fivetran to ingest data, a data warehouse, like Snowflake or Databricks to actually store that data and do analysis on top of it, a tool like DBT to do transformations and make sure that data is standardized in the way that it needs to be, a tool like Hightouch for reverse ETL. I mean, we could go on and on. There's so many partners of ours in this industry that are doing really, really exciting and critical things for those data movements. And the whole point here is that data teams have this plethora of tooling that they use to both ingest the right data and come up with the right interfaces to transform and interact with that data. And data orchestration, in our view, is really the heartbeat of all of those processes, right? And tangibly the unit of data orchestration is a data pipeline, a set of tasks or jobs that each do something with data over time and eventually run that on a schedule to make sure that those things are happening continuously as time moves on and the company advances. And so, for us, we're building a business around Apache Airflow, which is a workflow management tool that allows you to author, run, and monitor data pipelines. And so when we talk about data orchestration, we talk about sort of two things. One is that crux of data pipelines that, like I said, connect that large ecosystem of data tooling in your company. But number two, it's not just that data pipeline that needs to run every day, right? 
And Viraj will probably touch on this as we talk more about Astronomer and our value prop on top of Airflow. But then it's all the things that you need to actually run data in production and make sure that it's trustworthy, right? So it's actually not just that you're running things on a schedule, but it's also things like CI/CD tooling, secure secrets management, user permissions, monitoring, data lineage, documentation, things that enable other personas in your data team to actually use those tools. So long-winded way of saying that it's the heartbeat, we think, of the data ecosystem, and certainly goes beyond scheduling, but again, data pipelines are really at the center of it. >> One of the things that jumped out, Viraj, if you can get into this, I'd like to hear more about how you guys look at all those little tools that are out there. You mentioned a variety of things. You look at the data infrastructure, it's not just one stack. You've got an analytic stack, you've got a realtime stack, you've got a data lake stack, you got an AI stack potentially. I mean you have these stacks now emerging in the data world that are fundamental, that were once served by either a full package, old school software, and then a bunch of point solutions. You mentioned Fivetran there, I would say in the analytics stack. Then you got S3 there on the data lake stack. So all these things are kind of munged together. >> Yeah. >> How do you guys fit into that world? You make it easier, or like, what's the deal? >> Great question, right? And you know, I think that one of the biggest things we've found in working with customers over the last however many years is that if a data team is using a bunch of tools to get what they need done, and the number of tools they're using is growing exponentially and they're kind of roping things together here and there, that's actually a sign of a productive team, not a bad thing, right? It's because that team is moving fast. 
They have needs that are very specific to them, and they're trying to make something that's exactly tailored to their business. So a lot of times what we find is that customers have some sort of base layer, right? That's kind of like, it might be they're running most of the things in AWS, right? And then on top of that, they'll be using some of the things AWS offers, things like SageMaker, Redshift, whatever, but they also might need things that their cloud can't provide. Something like Fivetran or Hightouch or any of those other tools. And where data orchestration really shines, and something that we've had the pleasure of helping our customers build, is how do you take all those requirements, all those different tools and whip them together into something that fulfills a business need? So that somebody can read a dashboard and trust the number that it says, or somebody can make sure that the right emails go out to their customers. And Airflow serves as this amazing kind of glue between that data stack, right? It's to make it so that for any use case, be it ELT pipelines, or machine learning, or whatever, you need different things to do them, and Airflow helps tie them together in a way that's really specific for an individual business's needs. >> Take a step back and share the journey of what you guys went through as a company startup. So you mentioned Apache, open source. I was just having an interview with a VC, we were talking about foundational models. You got a lot of proprietary and open source development going on. It's almost the iPhone/Android moment in this whole generative space and foundational side. This is kind of important, the open source piece of it. Can you share how you guys started? And I can imagine your customers probably have their hair on fire and are probably building stuff on their own. Are you guys helping them? Take us through, 'cause you guys are on the front end of a big, big wave, and that is to make sense of the chaos, rein it in. 
Take us through your journey and why this is important. >> Yeah, Paola, I can take a crack at this, then I'll kind of hand it over to you to fill in whatever I miss in details. But you know, like Paola is saying, the heart of our company is open source, because we started using Airflow as an end user and started to say like, "Hey, wait a second, more and more people need this." Airflow, for background, started at Airbnb, and they were actually using that as a foundation for their whole data stack. Kind of how they made it so that they could give you recommendations, and predictions, and all of the processes that needed to be orchestrated. Airbnb created Airflow, gave it away to the public, and then fast forward a couple years and we're building a company around it, and we're really excited about that. >> That's a beautiful thing. That's exactly why open source is so great. >> Yeah, yeah. And for us, it's really been about watching the community and our customers take these problems, find a solution to those problems, standardize those solutions, and then build on top of that, right? So we're reaching a point where a lot of our earlier customers who started just using Airflow to get the base of their BI stack down and their reporting and their ELT infrastructure, they've solved that problem and now they're moving on to things like doing machine learning with their data, because now that they've built that foundation, all the connective tissue for their data arriving on time and being orchestrated correctly is happening, they can build a layer on top of that. And it's just been really, really exciting kind of watching what customers do once they're empowered to pick all the tools that they need, tie them together in the way they need to, and really deliver real value to their business. >> Can you share some of the use cases of these customers? Because I think that's where you're starting to see the innovation. 
What are some of the companies that you're working with, what are they doing? >> Viraj, I'll let you take that one too. (group laughs) >> So you know, a lot of it is... It goes across the gamut, right? Because it doesn't matter what you are, what you're doing with data, it needs to be orchestrated. So there's a lot of customers using us for their ETL and ELT reporting, right? Just getting data from other disparate sources into one place and then building on top of that. Be it building dashboards, answering questions for the business, building other data products and so on and so forth. From there, these use cases evolve a lot. You do see folks doing things like fraud detection, because Airflow's orchestrating how transactions go, how transactions get analyzed. They do things like analyzing marketing spend to see where your highest ROI is. And then you kind of can't not talk about all of the machine learning that goes on, right? Where customers are taking data about their own customers, kind of analyzing and aggregating that at scale, and trying to automate decision making processes. So it goes from your most basic, what we call data plumbing, right? Just to make sure data's moving as needed, all the way to your more exciting expansive use cases around automated decision making and machine learning. >> And I'd say, I mean, I'd say that's one of the things that I think gets me most excited about our future, is how critical Airflow is to all of those processes, and I think you know a tool is valuable when something goes wrong and one of those critical processes doesn't work. And we know that our system is so mission critical to answering basic questions about your business and the growth of your company for so many organizations that we work with. 
So it's, I think, one of the things that gets Viraj and me and the rest of our company up every single morning, is knowing how important the work that we do is for all of those use cases across industries, across company sizes, and it's really quite energizing. >> It was such a big focus this year at AWS re:Invent, the role of data. And I think one of the things that's exciting about OpenAI and all the movement towards large language models is that you can integrate data into these models from outside. So you're starting to see the integration easier to deal with. Still a lot of plumbing issues. So a lot of things happening. So I have to ask you guys, what is the state of the data orchestration area? Is it ready for disruption? Has it already been disrupted? Would you categorize it as a new first inning kind of opportunity, or what's the state of the data orchestration area right now? Both technically and from a business model standpoint. How would you guys describe that state of the market? >> Yeah, I mean, I think in a lot of ways, in some ways I think we're category creating. Schedulers have been around for a long time. I released a data presentation sort of on the evolution of going from something like cron, which I think was built in like the 1970s out of Carnegie Mellon. And that's a long time ago, that's 50 years ago. So sort of like the basic need to schedule and do something with your data on a schedule is not a new concept. But to our point earlier, I think everything that you need around your ecosystem, first of all, the number of data tools and developer tooling that has come out in the industry has 5X'd over the last 10 years. And so obviously as that ecosystem grows, and grows, and grows, and grows, the need for orchestration only increases. And I think, as Astronomer, I think we... And we work with so many different types of companies, companies that have been around for 50 years, and companies that got started not even 12 months ago. 
And so I think for us it's trying to, in a way, category create and adjust sort of what we sell and the value that we can provide for companies all across that journey. There are folks who are just getting started with orchestration, and then there's folks who have such advanced use cases, 'cause they're hitting sort of a ceiling and only want to go up from there. And so I think we, as a company, care about both ends of that spectrum, and certainly want to build and continue building products for companies of all sorts, regardless of where they are on the maturity curve of data orchestration. >> That's a really good point, Paola. And I think the other thing to really take into account is it's the companies themselves, but also individuals who have to do their jobs. If you rewind the clock like 5 or 10 years ago, data engineers would be the ones responsible for orchestrating data through their org. But when we look at our customers today, it's not just data engineers anymore. There are data analysts who sit a lot closer to the business, and the data scientists who want to automate things around their models. So this idea that orchestration is this new category is right on the money. And what we're finding is the need for it is spreading to all parts of the data team, naturally where Airflow's emerged as an open source standard and we're hoping to take things to the next level. >> That's awesome. We've been saying that the data market's kind of like the SRE with servers, right? You're going to need one person to deal with a lot of data, and that's data engineering, and then you've got to have the practitioners, the democratization. Clearly that's coming in what you're seeing. So I have to ask, how do you guys fit in from a value proposition standpoint? What's the pitch that you have to customers, or is it more inbound coming into you guys? Are you guys doing a lot of outreach, customer engagements? I'm sure they're getting a lot of great requirements from customers. 
What's the current value proposition? How do you guys engage? >> Yeah, I mean, there's so many... Sorry, Viraj, you can jump in. So there's so many companies using Airflow, right? So the baseline is that the open source project that is Airflow that came out of Airbnb, over five years ago at this point, has grown exponentially in users and continues to grow. And so the folks that we sell to primarily are folks who are already committed to using Apache Airflow, need data orchestration in their organization, and just want to do it better, want to do it more efficiently, want to do it without managing that infrastructure. And so our baseline proposition is for those organizations. Now to Viraj's point, obviously I think our ambitions go beyond that, both in terms of the personas that we address and going beyond that data engineer, but really it's to start at the baseline, as we continue to grow our company, it's really making sure that we're adding value to folks using Airflow and helping them do so in a better way, in a larger way, in a more efficient way, and that's really the crux of who we sell to. And so to answer your question on, we get a lot of inbound because they're... >> You have a built-in audience. (laughs) >> The world that use it. Those are the folks who we talk to and come to our website and chat with us and get value from our content. I mean, the power of the open source community is really just so, so big, and I think that's also one of the things that makes this job fun. >> And you guys are in a great position. Viraj, you can comment a little, get your reaction. There's been a big successful business model to starting a company around these big projects for a lot of reasons. One is open source is continuing to be great, but there's also supply chain challenges in there. There's also we want to continue more innovation and more code and keeping it free and flowing. And then there's the commercialization of productizing it, operationalizing it. 
This is a huge new dynamic, I mean, in the past 5 or so years, 10 years, it's been happening all on CNCF from other areas like Apache, Linux Foundation, they're all implementing this. This is a huge opportunity for entrepreneurs to do this. >> Yeah, yeah. Open source is always going to be core to what we do, because we wouldn't exist without the open source community around us. They are huge in numbers. Oftentimes they're nameless people who are working on making something better in a way that everybody benefits from it. But open source is really hard, especially if you're a company whose core competency is running a business, right? Maybe you're running an e-commerce business, or maybe you're running, I don't know, some sort of like, any sort of business, especially if you're a company running a business, you don't really want to spend your time figuring out how to run open source software. You just want to use it, you want to use the best of it, you want to use the community around it, you want to be able to google something and get answers for it, you want the benefits of open source. You don't have the time or the resources to invest in becoming an expert in open source, right? And I think that dynamic is really what's given companies like us an ability to kind of form businesses around that in the sense that we'll make it so people get the best of both worlds. You'll get this vast open ecosystem that you can build on top of, that you can benefit from, that you can learn from. But you won't have to spend your time doing undifferentiated heavy lifting. You can do things that are just specific to your business. >> It's always been great to see that business model evolve. We used to debate 10 years ago, can there be another Red Hat? And we said, not really the same, but there'll be a lot of little ones that'll grow up to be big soon. Great stuff. Final question, can you guys share the history of the company? 
The milestones of Astronomer's journey in data orchestration? >> Yeah, we could. So yeah, I mean, I think, so Viraj and I have obviously been at Astronomer along with our other founding team and leadership folks for over five years now. And it's been such an incredible journey of learning, of hiring really amazing people, solving, again, mission critical problems for so many types of organizations. We've had some funding that has allowed us to invest in the team that we have and in the software that we have, and that's been really phenomenal. And so that investment, I think, keeps us confident, even despite these sort of macroeconomic conditions that we're finding ourselves in. And so honestly, the milestones for us are focusing on our product, focusing on our customers over the next year, focusing on that market for us that we know can get value out of what we do, and making developers' lives better, and growing the open source community and making sure that everything that we're doing makes it easier for folks to get started, to contribute to the project and to feel a part of the community that we're cultivating here. >> You guys raised a little bit of money. How much have you guys raised? >> Don't know what the total is, but it's in the ballpark of over $200 million. It feels good to... >> A little bit of capital. Got a little bit of cap to work with there. Great success. I know as a Series C Financing, you guys have been down. So you're up and running, what's next? What are you guys looking to do? What's the big horizon look like for you from a vision standpoint, more hiring, more product, what are some of the key things you're looking at doing? >> Yeah, it's really a little of all of the above, right? Kind of one of the best and worst things about working at earlier stage startups is there's always so much to do and you often have to just kind of figure out a way to get everything done. 
But really investing in our product over the next, at least over the course of our company lifetime. And there's a lot of ways we want to make it more accessible to users, easier to get started with, easier to use, kind of on all areas there. And really, we really want to do more for the community, right, like I was saying, we wouldn't be anything without the large open source community around us. And we want to figure out ways to give back more in more creative ways, in more code driven ways, in more kind of events and everything else that we can keep those folks galvanized and just keep them happy using Airflow. >> Paola, any final words as we close out? >> No, I mean, I'm super excited. I think we'll keep growing the team this year. We've got a couple of offices in the US, which we're excited about, and a fully global team that will only continue to grow. So Viraj and I are both here in New York, and we're excited to be engaging with our coworkers in person finally, after years of not doing so. We've got a bustling office in San Francisco as well. So growing those teams and continuing to hire all over the world, and really focusing on our product and the open source community is where our heads are at this year. So, excited. >> Congratulations. 200 million in funding, plus. Good runway, put that money in the bank, squirrel it away. It's a good time to kind of get some good interest on it, but still grow. Congratulations on all the work you guys do. We appreciate you and the open source community does, and good luck with the venture, continue to be successful, and we'll see you at the Startup Showcase. >> Thank you. >> Yeah, thanks so much, John. Appreciate it. >> Okay, that's the CUBE Conversation featuring astronomer.io, that's the website. Astronomer is doing well. Multiple rounds of funding, over 200 million in funding. Open source continues to lead the way in innovation. 
Great business model, good solution for the next gen cloud scale data operations, data stacks that are emerging. I'm John Furrier, your host, thanks for watching. (soft upbeat music)

Published Date : Feb 14 2023

ENTITIES

Entity	Category	Confidence
Viraj Parekh	PERSON	0.99+
Paola	PERSON	0.99+
Viraj	PERSON	0.99+
John	PERSON	0.99+
John Furrier	PERSON	0.99+
Airbnb	ORGANIZATION	0.99+
2017	DATE	0.99+
San Francisco	LOCATION	0.99+
New York	LOCATION	0.99+
Apache	ORGANIZATION	0.99+
US	LOCATION	0.99+
Two	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Paola Peraza Calderon	PERSON	0.99+
1970s	DATE	0.99+
first question	QUANTITY	0.99+
Palo Alto, California	LOCATION	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
Airflow	TITLE	0.99+
both	QUANTITY	0.99+
Linux Foundation	ORGANIZATION	0.99+
200 million	QUANTITY	0.99+
Astronomer	ORGANIZATION	0.99+
One	QUANTITY	0.99+
over 200 million	QUANTITY	0.99+
over $200 million	QUANTITY	0.99+
this year	DATE	0.99+
10 years ago	DATE	0.99+
HubSpot	ORGANIZATION	0.98+
Fivetran	ORGANIZATION	0.98+
50 years ago	DATE	0.98+
over five years	QUANTITY	0.98+
one stack	QUANTITY	0.98+
12 months ago	DATE	0.98+
10 years	QUANTITY	0.97+
Both	QUANTITY	0.97+
Apache Airflow	TITLE	0.97+
both worlds	QUANTITY	0.97+
CNCF	ORGANIZATION	0.97+
one	QUANTITY	0.97+
ChatGPT	ORGANIZATION	0.97+
5	DATE	0.97+
next year	DATE	0.96+
Astromer	ORGANIZATION	0.96+
today	DATE	0.95+
5X	QUANTITY	0.95+
over five years ago	DATE	0.95+
CUBE	ORGANIZATION	0.94+
two things	QUANTITY	0.94+
each	QUANTITY	0.93+
one person	QUANTITY	0.93+
First	QUANTITY	0.92+
S3	TITLE	0.91+
Carnegie Mellon	ORGANIZATION	0.91+
Startup Showcase	EVENT	0.91+

AWS Startup Showcase S3E1

(soft music) >> Hello everyone, welcome to this Cube conversation here from the studios of theCube in Palo Alto, California. John Furrier, your host. We're featuring a startup, Astronomer, astronomer.io is the URL. Check it out. And we're going to have a great conversation around one of the most important topics hitting the industry, and that is the future of machine learning and AI and the data that powers it underneath it. There's a lot of things that need to get done, and we're excited to have some of the co-founders of Astronomer here. Viraj Parekh, who is co-founder and Paola Peraza Calderon, another co-founder, both with Astronomer. Thanks for coming on. First of all, how many co-founders do you guys have? >> You know, I think the answer's around six or seven. I forget the exact, but there's really been a lot of people around the table, who've worked very hard to get this company to the point that it's at. And we have a long way to go, right? But there's been a lot of people involved that have been absolutely necessary for the path we've been on so far. >> Thanks for that, Viraj, appreciate that. The first question I want to get out on the table, and then we'll get into some of the details, is take a minute to explain what you guys are doing. How did you guys get here? Obviously, multiple co-founders sounds like a great project. The timing couldn't have been better. ChatGPT has essentially done so much public relations for the AI industry to kind of highlight this shift that's happening. It's real. We've been chronicling it. Take a minute to explain what you guys do. >> Yeah, sure. We can get started. So yeah, when Astronomer, when Viraj and I joined Astronomer in 2017, we really wanted to build a business around data and we were using an open source project called Apache Airflow, that we were just using sort of as customers ourselves. 
And over time, we realized that there was actually a market for companies who use Apache Airflow, which is a data pipeline management tool, which we'll get into. And that running Airflow is actually quite challenging and that there's a lot of, a big opportunity for us to create a set of commercial products and an opportunity to grow that open source community and actually build a company around that. So the crux of what we do is help companies run data pipelines with Apache Airflow. And certainly we've grown in our ambitions beyond that, but that's sort of the crux of what we do for folks. >> You know, data orchestration, data management has always been a big item, you know, in the old classic data infrastructure. But with AI you're seeing a lot more emphasis on scale, tuning, training. You know, data orchestration is the center of the value proposition. When you're looking at coordinating resources, it's one of the most important things. Could you guys explain what data orchestration entails? What does it mean? Take us through the definition of what data orchestration entails. >> Yeah, for sure. I can take this one, and Viraj, feel free to jump in. So if you google data orchestration, you know, here's what you're going to get. You're going to get something that says, data orchestration is the automated process for organizing siloed data from numerous data storage points, standardizing it and making it accessible and prepared for data analysis. And you say, okay, but what does that actually mean, right? And so let's give sort of an example. So let's say you're a business and you have sort of the following basic asks of your data team, right? Hey, give me a dashboard in Sigma, for example, for the number of customers or monthly active users and then make sure that that gets updated on an hourly basis. And then number two, a consistent list of active customers that I have in HubSpot so that I can send them a monthly product newsletter, right? 
Two very basic asks for all sorts of companies and organizations. And when that data team, which has data engineers, data scientists, ML engineers, data analysts, gets that request, they're looking at an ecosystem of data sources that can help them get there, right? And that includes application databases, for example, that actually have in-product user behavior and third party APIs from tools that the company uses that also have different attributes and qualities of those customers or users. And that data team needs to use tools like Fivetran to ingest data, a data warehouse like Snowflake or Databricks to actually store that data and do analysis on top of it, a tool like dbt to do transformations and make sure that that data is standardized in the way that it needs to be, a tool like Hightouch for reverse ETL. I mean, we could go on and on. There's so many partners of ours in this industry that are doing really, really exciting and critical things for those data movements. And the whole point here is that, you know, data teams have this plethora of tooling that they use to both ingest the right data and come up with the right interfaces to transform and interact with that data. And data orchestration in our view is really the heartbeat of all of those processes, right? And tangibly the unit of data orchestration, you know, is a data pipeline, a set of tasks or jobs that each do something with data over time and eventually run that on a schedule to make sure that those things are happening continuously as time moves on and, you know, the company advances. And so, you know, for us, we're building a business around Apache Airflow, which is a workflow management tool that allows you to author, run and monitor data pipelines. And so when we talk about data orchestration, we talk about sort of two things. One is that crux of data pipelines that, like I said, connect that large ecosystem of data tooling in your company. 
But number two, it's not just that data pipeline that needs to run every day, right? And Viraj will probably touch on this as we talk more about Astronomer and our value prop on top of Airflow. But then it's all the things that you need to actually run data in production and make sure that it's trustworthy, right? So it's actually not just that you're running things on a schedule, but it's also things like CI/CD tooling, right? Secure secrets management, user permissions, monitoring, data lineage, documentation, things that enable other personas in your data team to actually use those tools. So long-winded way of saying that it's the heartbeat, we think, of the data ecosystem and certainly goes beyond scheduling, but again, data pipelines are really at the center of it. >> You know, one of the things that jumped out, Viraj, if you can get into this, I'd like to hear more about how you guys look at all those little tools that are out there. You mentioned a variety of things. You know, if you look at the data infrastructure, it's not just one stack. You've got an analytic stack, you've got a realtime stack, you've got a data lake stack, you got an AI stack potentially. I mean you have these stacks now emerging in the data world that are >> Yeah. >> fundamental, that were once served by either a full package, old school software, and then a bunch of point solutions. You mentioned Fivetran there, I would say in the analytics stack. Then you got, you know, S3 there on the data lake stack. So all these things are kind of munged together. >> Yeah. >> How do you guys fit into that world? You make it easier or like, what's the deal? >> Great question, right? 
And you know, I think that one of the biggest things we've found in working with customers over, you know, the last however many years, is that like if a data team is using a bunch of tools to get what they need done and the number of tools they're using is growing exponentially and they're kind of roping things together here and there, that's actually a sign of a productive team, not a bad thing, right? It's because that team is moving fast. They have needs that are very specific to them and they're trying to make something that's exactly tailored to their business. So a lot of times what we find is that customers have like some sort of base layer, right? That's kind of like, you know, it might be they're running most of the things in AWS, right? And then on top of that, they'll be using some of the things AWS offers, you know, things like SageMaker, Redshift, whatever. But they also might need things that their cloud can't provide, you know, something like Fivetran or Hightouch or any of those other tools. And where data orchestration really shines, right, and something that we've had the pleasure of helping our customers build, is how do you take all those requirements, all those different tools and whip them together into something that fulfills a business need, right? Something that makes it so that somebody can read a dashboard and trust the number that it says or somebody can make sure that the right emails go out to their customers. And Airflow serves as this amazing kind of glue between that data stack, right? It's to make it so that for any use case, be it ELT pipelines or machine learning or whatever, you need different things to do them and Airflow helps tie them together in a way that's really specific for an individual business's needs. >> Take a step back and share the journey of what you guys went through as a company startup. 
So you mentioned Apache, open source, you know, we were just, I was just having an interview with a VC, we were talking about foundational models. You got a lot of proprietary and open source development going on. It's almost the iPhone/Android moment in this whole generative space and foundational side. This is kind of important, the open source piece of it. Can you share how you guys started? And I can imagine your customers probably have their hair on fire and are probably building stuff on their own. How do you guys, are you guys helping them? Take us through, 'cause you guys are on the front end of a big, big wave and that is to make sense of the chaos, reining it in. Take us through your journey and why this is important. >> Yeah Paola, I can take a crack at this and then I'll kind of hand it over to you to fill in whatever I miss in details. But you know, like Paola is saying, the heart of our company is open source because we started using Airflow as an end user and started to say like, "Hey, wait a second, more and more people need this." Airflow, for background, started at Airbnb and they were actually using that as the foundation for their whole data stack. Kind of how they made it so that they could give you recommendations and predictions and all of the processes that need to be or needed to be orchestrated. Airbnb created Airflow, gave it away to the public and then, you know, fast forward a couple years and you know, we're building a company around it and we're really excited about that. >> That's a beautiful thing. That's exactly why open source is so great. >> Yeah, yeah. And for us it's really been about like watching the community and our customers take these problems, find solutions to those problems, build standardized solutions, and then build on top of that, right? 
So we're reaching a point where a lot of our earlier customers who started just using Airflow to get the base of their BI stack down and their reporting and their ELT infrastructure, you know, they've solved that problem and now they're moving onto things like doing machine learning with their data, right? Because now that they've built that foundation, all the connective tissue for their data arriving on time and being orchestrated correctly is happening, they can build the layer on top of that. And it's just been really, really exciting kind of watching what customers do once they're empowered to pick all the tools that they need, tie them together in the way they need to, and really deliver real value to their business. >> Can you share some of the use cases of these customers? Because I think that's where you're starting to see the innovation. What are some of the companies that you're working with, what are they doing? >> Raj, I'll let you take that one too. (all laughing) >> Yeah. (all laughing) So you know, a lot of it is, it goes across the gamut, right? Because it doesn't matter who you are or what you're doing with data, it needs to be orchestrated. So there's a lot of customers using us for their ETL and ELT reporting, right? Just getting data from all the disparate sources into one place and then building on top of that, be it building dashboards, answering questions for the business, building other data products and so on and so forth. From there, these use cases evolve a lot. You do see folks doing things like fraud detection because Airflow's orchestrating how transactions go. Transactions get analyzed, they do things like analyzing marketing spend to see where your highest ROI is. And then, you know, you kind of can't not talk about all of the machine learning that goes on, right? Where customers are taking data about their own customers, kind of analyzing and aggregating that at scale and trying to automate decision making processes. 
So it goes from your most basic, what we call like data plumbing, right? Just to make sure data's moving as needed. All the way to your more exciting and sexy use cases around like automated decision making and machine learning. >> And I'd say, I mean, I'd say that's one of the things that I think gets me most excited about our future is how critical Airflow is to all of those processes, you know? And I think when, you know, you know a tool is valuable is when something goes wrong and one of those critical processes doesn't work. And we know that our system is so mission critical to answering basic, you know, questions about your business and the growth of your company for so many organizations that we work with. So it's, I think one of the things that gets Viraj and I, and the rest of our company up every single morning, is knowing how important the work that we do is for all of those use cases across industries, across company sizes. And it's really quite energizing. >> It was such a big focus this year at AWS re:Invent, the role of data. And I think one of the things that's exciting about OpenAI and all the movement towards large language models, is that you can integrate data into these models, right? From outside, right? So you're starting to see the integration easier to deal with, still a lot of plumbing issues. So a lot of things happening. So I have to ask you guys, what is the state of the data orchestration area? Is it ready for disruption? Has it already been disrupted? Would you categorize it as a new first inning kind of opportunity or what's the state of the data orchestration area right now? Both, you know, technically and from a business model standpoint, how would you guys describe that state of the market? >> Yeah, I mean I think, I think in a lot of ways we're, in some ways I think we're category creating, you know, schedulers have been around for a long time. 
I recently did a presentation sort of on the evolution of going from, you know, something like cron, which I think was built in like the 1970s out of Carnegie Mellon. And you know, that's a long time ago. That's 50 years ago. So it's sort of like the basic need to schedule and do something with your data on a schedule is not a new concept. But to our point earlier, I think everything that you need around your ecosystem, first of all, the number of data tools and developer tooling that has come out of the industry has, you know, grown some 5X over the last 10 years. And so obviously as that ecosystem grows and grows and grows and grows, the need for orchestration only increases. And I think, you know, as Astronomer, I think we, and there's, we work with so many different types of companies, companies that have been around for 50 years and companies that got started, you know, not even 12 months ago. And so I think for us, it's trying to always category create and adjust sort of what we sell and the value that we can provide for companies all across that journey. There are folks who are just getting started with orchestration and then there's folks who have such advanced use cases 'cuz they're hitting sort of a ceiling and only want to go up from there. And so I think we as a company, care about both ends of that spectrum and certainly want to build and continue building products for companies of all sorts, regardless of where they are on the maturity curve of data orchestration. >> That's a really good point Paola. And I think the other thing to really take into account is it's the companies themselves, but also individuals who have to do their jobs. You know, if you rewind the clock like five or 10 years ago, data engineers would be the ones responsible for orchestrating data through their org. But when we look at our customers today, it's not just data engineers anymore. 
There's data analysts who sit a lot closer to the business and the data scientists who want to automate things around their models. So this idea that orchestration is this new category is spot on, is right on the money. And what we're finding is it's spreading, the need for it, is spreading to all parts of the data team naturally where Airflow has emerged as an open source standard and we're hoping to take things to the next level. >> That's awesome. You know, we've been saying that the data market's kind of like the SRE with servers, right? You're going to need one person to deal with a lot of data and that's data engineering and then you're going to have the practitioners, the democratization. Clearly that's coming in what you're seeing. So I got to ask, how do you guys fit in from a value proposition standpoint? What's the pitch that you have to customers or is it more inbound coming into you guys? Are you guys doing a lot of outreach, customer engagements? I'm sure they're getting a lot of great requirements from customers. What's the current value proposition? How do you guys engage? >> Yeah, I mean we've, there's so many, there's so many. Sorry Raj, you can jump in. >> It's okay. So there's so many companies using Airflow, right? So our, the baseline is that the open source project that is Airflow that was, that came out of Airbnb, you know, over five years ago at this point, has grown exponentially in users and continues to grow. And so the folks that we sell to primarily are folks who are already committed to using Apache Airflow, need data orchestration in the organization and just want to do it better, want to do it more efficiently, want to do it without managing that infrastructure. And so our baseline proposition is for those organizations. Now to Raj's point, obviously I think our ambitions go beyond that, both in terms of the personas that we address and going beyond that data engineer, but really it's for, to start at the baseline. 
You know, as we continue to grow our company, it's really making sure that we're adding value to folks using Airflow and help them do so in a better way, in a larger way and a more efficient way. And that's really the crux of who we sell to. And so to answer your question on, we actually, we get a lot of inbound because there are so many >> A built-in audience. >> In the world that use it, that those are the folks who we talk to and come to our website and chat with us and get value from our content. I mean the power of the open source community is really just so, so big. And I think that's also one of the things that makes this job fun, so. >> And you guys are in a great position, Viraj, you can comment, to get your reaction. There's been a big successful business model to starting a company around these big projects for a lot of reasons. One is open source is continuing to be great, but there's also supply chain challenges in there. There's also, you know, we want to continue more innovation and more code and keeping it free and flowing. And then there's the commercialization of productizing it, operationalizing it. This is a huge new dynamic. I mean, in the past, you know, five or so years, 10 years, it's been happening all on CNCF from other areas like Apache, Linux Foundation, they're all implementing this. This is a huge opportunity for entrepreneurs to do this. >> Yeah, yeah. Open source is always going to be core to what we do because, you know, we wouldn't exist without the open source community around us. They are huge in numbers. Oftentimes they're nameless people who are working on making something better in a way that everybody benefits from it. But open source is really hard, especially if you're a company whose core competency is running a business, right? 
Maybe you're running an e-commerce business or maybe you're running, I don't know, some sort of like any sort of business, especially if you're a company running a business, you don't really want to spend your time figuring out how to run open source software. You just want to use it, you want to use the best of it, you want to use the community around it. You want to take, you want to be able to google something and get answers for it. You want the benefits of open source. You don't want to have, you don't have the time or the resources to invest in becoming an expert in open source, right? And I think that dynamic is really what's given companies like us an ability to kind of form businesses around that, in the sense that we'll make it so people get the best of both worlds. You'll get this vast open ecosystem that you can build on top of, you can benefit from, that you can learn from, but you won't have to spend your time doing undifferentiated heavy lifting. You can do things that are just specific to your business. >> It's always been great to see that business model evolve. We used to debate 10 years ago, can there be another Red Hat? And we said, not really the same, but there'll be a lot of little ones that'll grow up to be big soon. Great stuff. Final question, can you guys share the history of the company, the milestones of Astronomer's journey in data orchestration? >> Yeah, we could. So yeah, I mean, I think, so Raj and I have obviously been at Astronomer along with our other founding team and leadership folks, for over five years now. And it's been such an incredible journey of learning, of hiring really amazing people. Solving again, mission critical problems for so many types of organizations. You know, we've had some funding that has allowed us to invest in the team that we have and in the software that we have. And that's been really phenomenal. 
And so that investment, I think, keeps us confident even despite these sort of macroeconomic conditions that we're finding ourselves in. And so honestly, the milestones for us are focusing on our product, focusing on our customers over the next year, focusing on that market for us, that we know can get value out of what we do. And making developers' lives better and growing the open source community, you know, and making sure that everything that we're doing makes it easier for folks to get started to contribute to the project and to feel a part of the community that we're cultivating here. >> You guys raised a little bit of money. How much have you guys raised? >> I forget what the total is, but it's in the ballpark of 200, over $200 million. So it feels good >> A little bit of capital. Got a little bit of cash to work with there. Great success. I know it's a Series C financing, you guys been down, so you're up and running. What's next? What are you guys looking to do? What's the big horizon look like for you? And from a vision standpoint, more hiring, more product, what is some of the key things you're looking at doing? >> Yeah, it's really a little of all of the above, right? Like, kind of one of the best and worst things about working at earlier stage startups is there's always so much to do and you often have to just kind of figure out a way to get everything done, but really invest in our product over the next, at least the next, over the course of our company lifetime. And there's a lot of ways we're wanting to just make it more accessible to users, easier to get started with, easier to use all kind of on all areas there. And really, we really want to do more for the community, right? Like I was saying, we wouldn't be anything without the large open source community around us. 
And we want to figure out ways to give back more in more creative ways, in more code driven ways and more kind of events and everything else that we can do to keep those folks galvanized and just keeping them happy using Airflow. >> Paola, any final words as we close out? >> No, I mean, I'm super excited. You know, I think we'll keep growing the team this year. We've got a couple of offices in the US which we're excited about, and a fully global team that will only continue to grow. So Viraj and I are both here in New York and we're excited to be engaging with our coworkers in person. Finally, after years of not doing so, we've got a bustling office in San Francisco as well. So growing those teams and continuing to hire all over the world and really focusing on our product and the open source community is where our heads are at this year, so. >> Congratulations. >> Excited. >> 200 million in funding plus good runway. Put that money in the bank, squirrel it away. You know, it's good to kind of get some good interest on it, but still grow. Congratulations on all the work you guys do. We appreciate you and the open source community does and good luck with the venture. Continue to be successful and we'll see you at the Startup Showcase. >> Thank you. >> Yeah, thanks so much, John. Appreciate it. >> It's theCube conversation, featuring astronomer.io, that's the website. Astronomer is doing well. Multiple rounds of funding, over 200 million in funding. Open source continues to lead the way in innovation. Great business model. Good solution for the next gen, Cloud, scale, data operations, data stacks that are emerging. I'm John Furrier, your host. Thanks for watching. (soft music)

Published Date : Feb 8 2023


ENTITIES

Entity	Category	Confidence
Viraj Parekh	PERSON	0.99+
Paola	PERSON	0.99+
Viraj	PERSON	0.99+
John Furrier	PERSON	0.99+
John	PERSON	0.99+
Raj	PERSON	0.99+
Airbnb	ORGANIZATION	0.99+
US	LOCATION	0.99+
2017	DATE	0.99+
New York	LOCATION	0.99+
Paola Peraza Calderon	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Apache	ORGANIZATION	0.99+
San Francisco	LOCATION	0.99+
Palo Alto, California	LOCATION	0.99+
1970s	DATE	0.99+
10 years	QUANTITY	0.99+
five	QUANTITY	0.99+
Two	QUANTITY	0.99+
first question	QUANTITY	0.99+
over 200 million	QUANTITY	0.99+
both	QUANTITY	0.99+
Both	QUANTITY	0.99+
over $200 million	QUANTITY	0.99+
Linux Foundation	ORGANIZATION	0.99+
50 years ago	DATE	0.99+
one	QUANTITY	0.99+
five	DATE	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
this year	DATE	0.98+
One	QUANTITY	0.98+
Airflow	TITLE	0.98+
10 years ago	DATE	0.98+
Carnegie Mellon	ORGANIZATION	0.98+
over five years	QUANTITY	0.98+
200	QUANTITY	0.98+
12 months ago	DATE	0.98+
both worlds	QUANTITY	0.98+
5X	QUANTITY	0.98+
ChatGPT	ORGANIZATION	0.98+
first	QUANTITY	0.98+
one stack	QUANTITY	0.97+
one person	QUANTITY	0.97+
two things	QUANTITY	0.97+
Fivetran	ORGANIZATION	0.96+
seven	QUANTITY	0.96+
next year	DATE	0.96+
today	DATE	0.95+
50 years	QUANTITY	0.95+
each	QUANTITY	0.95+
theCube	ORGANIZATION	0.94+
HubSpot	ORGANIZATION	0.93+
Sigma	ORGANIZATION	0.92+
Series C	OTHER	0.92+
Astronomer	ORGANIZATION	0.91+
astronomer.io	OTHER	0.91+
Hightouch	TITLE	0.9+
one place	QUANTITY	0.9+
Android	TITLE	0.88+
Startup Showcase	EVENT	0.88+
Apache Airflow	TITLE	0.86+
CNCF	ORGANIZATION	0.86+

theCUBE's New Analyst Talks Cloud & DevOps

(light music) >> Hi everybody. Welcome to this Cube Conversation. I'm really pleased to announce a collaboration with Rob Strechay. He's a guest cube analyst, and we'll be working together to extract the signal from the noise. Rob is a long-time product pro, working at a number of firms including AWS, HP, HPE, NetApp, Snowplow. He did a stint as an analyst at Enterprise Strategy Group. Rob, good to see you. Thanks for coming into our Marlboro Studios. >> Well, thank you for having me. It's always great to be here. >> I'm really excited about working with you. We've known each other for a long time. You've been in the Cube a bunch. You know, you're in between gigs, and I think we can have a lot of fun together. Covering events, covering trends. So, let's get into it. What's happening out there? We've sort of exited the isolation economy. Things were booming. Now, everybody's tapping the brakes. From your standpoint, what are you seeing out there? >> Yeah. I'm seeing that people are really looking how to get more out of their data. How they're bringing things together, how they're looking at the costs of Cloud, and understanding how are they building out their SaaS applications. And understanding that when they go in and actually start to use Cloud, it's not only just using the base services anymore. They're looking at, how do I use these platforms as a service? Some are easier than others, and they're trying to understand, how do I get more value out of that relationship with the Cloud? They're also consolidating the number of Clouds that they have, I would say to try to better optimize their spend, and getting better pricing for that matter. >> Are you seeing people unhook Clouds, or just reduce maybe certain Cloud activities and going maybe instead of 60/40 going 90/10? >> Correct. It's more like the 90/10 type of rule where they're starting to say, Hey I'm not going to get rid of Azure or AWS or Google. 
I'm going to move a portion of this over that I was using on this one service. Maybe I got a great two-year contract to start with on this platform as a service or a database as a service. I'm going to unhook from that and maybe go with an independent. Maybe with something like a Snowflake or a Databricks on top of another Cloud, so that I can consolidate down. But it also gives them more flexibility as well. >> In our last breaking analysis, Rob, we identified six factors that were reducing Cloud consumption. There were factors and customer tactics. And I want to get your take on this. So, some of the factors really, you got fewer mortgage originations. FinTech, obviously big Cloud user. Crypto, not as much activity there. Lower ad spending means less Cloud. And then one of 'em, which you kind of disagreed with was less, less analytics, you know, fewer... Less frequency of calculations. I'll come back to that. But then optimizing compute using Graviton or AMD instances moving to cheaper storage tiers. That of course makes sense. And then optimize pricing plans. Maybe going from On Demand, you know, to, you know, instead of pay by the drink, buy in volume. Okay. So, first of all, do those make sense to you with the exception? We'll come back and talk about the analytics piece. Is that what you're seeing from customers? >> Yeah, I think so. I think that was pretty much dead on with what I'm seeing from customers and the ones that I go out and talk to. A lot of times they're trying to really monetize their, you know, understand how their business utilizes these Clouds. And, where their spend is going in those Clouds. Can they use, you know, lower tiers of storage? Do they really need the best processors? Do they need to be using Intel or can they get away with AMD or Graviton 2 or 3? Or do they need to move in? And, I think when you look at all of these Clouds, they always have pricing curves that are arcs from the newest to the oldest stuff. 
And you can play games with that. And understanding how you can actually lower your costs by looking at maybe some of the older generation. Maybe your application was written 10 years ago. You don't necessarily have to be on the best, newest processor for that application per se. >> So last, I want to come back to this whole analytics piece. Last June, I think it was June, Dev Ittycheria, who's the-- I call him Dev. Spelled Dev, pronounced Dave. (chuckles softly) Same pronunciation, different spelling. Dev Ittycheria, CEO of Mongo, on the earnings call. He was getting, you know, hit. Things were starting to get a little less visible in terms of, you know, the outlook. And people were pushing him like... Because you're in the Cloud, is it easier to dial down? And he said, because we're the document database, we support transaction applications. We're less discretionary than say, analytics. Well on the Snowflake earnings call, that same month or the month after, they were all over Slootman and Scarpelli. Oh, the Mongo CEO said that they're less discretionary than analytics. And Snowflake was an interesting comment. They basically said, look, we're the Cloud. You can dial it up, you can dial it down, but the area under the curve over a period of time is going to be the same, because they get their customers to commit. What do you say? You disagreed with the notion that people are running their calculations less frequently. Is that because they're trying to do a better job of targeting customers in near real time? What are you seeing out there? >> Yeah, I think they're moving away from using people and more expensive marketing. Or, they're trying to figure out what's my Google ad spend, what's my Meta ad spend? And what they're trying to do is optimize that spend. So, what is the return on advertising, or the ROAS as they would say. 
And what they're looking to do is understand, okay, I have to collect these analytics that better understand where are these people coming from? How do they get to my site, to my store, to my whatever? And when they're using it, how do they better move through that? What you're also seeing is that analytics is not only just for kind of the retail or financial services or things like that, but then they're also, you know, using that to make offers in those categories. When you move back to more, you know, take other companies that are building products and SaaS delivered products. They may actually go and use this analytics for making the product better. And one of the big reasons for that is maybe they're dialing back how many product managers they have. And they're looking to be more data driven about how they actually go and build the product out or enhance the product. So maybe they're, you know, an online video service and they want to understand why people are either using or not using the whiteboard inside the product. And they're collecting a lot of that product analytics in a big way so that they can go through that. And they're doing it in a constant manner. This first party type tracking within applications is growing rapidly by customers. >> So, let's talk about who wins in that. So, obviously the Cloud guys, AWS, Google and Azure. I want to come back and unpack that a little bit. Databricks and Snowflake, we reported on our last breaking analysis, are kind of on a collision course. You know, a couple years ago we were thinking, okay, AWS, Snowflake and Databricks, like perfect sandwich. And then of course they started to become more competitive. My sense is they still, you know, complement each other in the field, right? But, you know, publicly, they've got bigger aspirations, they've got big TAMs that they're going after. But it's interesting, the data shows that-- So, Snowflake was off the charts in terms of spending momentum in our ETR surveys. 
Our partner down in New York, they kind of came into line. They're both growing in terms of market presence. Databricks couldn't get to IPO. So, we don't have as much, you know, visibility on their financials. You know, Snowflake obviously highly transparent cause they're a public company. And then you got AWS, Google and Azure. And it seems like AWS appears to be more partner friendly. Microsoft, you know, depends on what market you're in. And Google wants to sell BigQuery. >> Yeah. >> So, what are you seeing in the public Cloud from a data platform perspective? >> Yeah. I think that was pretty astute in what you were talking about there, because I think of the three, Google is definitely I think a little bit behind in how they go to market with their partners. Azure's done a fantastic job of partnering with these companies to understand and even though they may have Synapse as their go-to and where they want people to go to do AI and ML. What they're looking at is, Hey, we're going to also be friendly with Snowflake. We're also going to be friendly with a Databricks. And I think that, Amazon has always been there because that's where the market has been for these developers. So many, like the Databricks and the Snowflakes, have gone there first because, you know, in Databricks' case, they built out on top of S3 first. And going and using somebody's object layer other than AWS, was not as simple as you would think it would be. Moving between those. >> So, one of the financial meetups I said meetup, but the... It was either the CEO or the CFO. It was either Slootman or Scarpelli talking at, I don't know, Merrill Lynch or one of the other financial conferences said, I think it was probably their Q3 call. Snowflake said 80% of our business goes through Amazon. And he said to this audience, the next day we got a call from Microsoft. Hey, we got to do more. 
And, we know just from reading the financial statements that Snowflake is getting concessions from Amazon, they're buying in volume, they're renegotiating their contracts. Amazon gets it. You know, lower the price, people buy more. Long term, we're all going to make more money. Microsoft obviously wants to get into that game with Snowflake. They understand the momentum. They said Google, not so much. And I've had customers tell me that they wanted to use Google's AI with Snowflake, but they can't, they got to go to BigQuery. So, honestly, I haven't like vetted that so. But, I think it's true. But nonetheless, it seems like Google's a little less friendly with the data platform providers. What do you think? >> Yeah, I would say so. I think this is a place that Google looks and wants to own. Now, are they doing the right things long term? I mean again, you know, you look at Google Analytics being you know, basically outlawed in five countries in the EU because of GDPR concerns, and compliance and governance of data. And I think people are looking at Google and BigQuery in general and saying, is it the best place for me to go? Is it going to be in the right places where I need it? Still, it's still one of the largest used databases out there just because it underpins a number of the Google services. So you almost get, like you were saying, forced into BigQuery sometimes, if you want to use the tech on top. >> You do strategy. >> Yeah. >> Right? You do strategy, you do messaging. Is it the right call by Google? I mean, it's not a-- I criticize Google sometimes. But, I'm not sure it's the wrong call to say, Hey, this is our ace in the hole. >> Yeah. >> We got to get people into BigQuery. Cause, first of all, BigQuery is a solid product. I mean it's Cloud native and it's, you know, by all, it gets high marks. So, why give the competition an advantage? Let's try to force people essentially into what is, we think, a great product and it is a great product. 
The flip side of that is, they're giving up some potential partner TAM and not treating the ecosystem as well as one of their major competitors. What do you do if you're in that position? >> Yeah, I think that that's a fantastic question. And the question I pose back to the companies I've worked with and worked for is, are you really looking to have vendor lock-in as your key differentiator to your service? And I think when you start to look at these companies that are moving away from BigQuery, moving to even, Databricks on top of GCS in Google, they're looking to say, okay, I can go there if I have to evacuate from GCP and go to another Cloud, I can stay on Databricks as a platform, for instance. So I think it's, people are looking at what platform as a service, database as a service they go and use. Because from a strategic perspective, they don't want that vendor lock-in. >> That's where Supercloud becomes interesting, right? Because, if I can run on Snowflake or Databricks, you know, across Clouds. Even Oracle, you know, they're getting into business with Microsoft. Let's talk about some of the Cloud players. So, the big three have reported. >> Right. >> We saw AWS's Cloud growth decelerated down to 20%, which is I think the lowest growth rate since they started to disclose public numbers. And they said they exited, sorry, they said January they grew at 15%. >> Yeah. >> Year on year. Now, they had some pretty tough compares. But nonetheless, 15%, wow. Azure, kind of mid thirties, and then Google, we had kind of low thirties. But, well behind in terms of size. And Google's losing probably almost $3 billion annually. But, that's not necessarily a bad thing by advocating and investing. What's happening with the Cloud? Is AWS just running into the law of large numbers? Do you think we can actually see a re-acceleration like we have in the past with AWS Cloud? Azure, we predicted is going to be 75% of AWS IaaS revenues. You know, we try to estimate IaaS. >> Yeah. 
>> Even though they don't share that with us. That's a huge milestone. You'd think-- There's some people who have, I think, Bob Evans predicted a while ago that Microsoft would surpass AWS in terms of size. You know, what do you think? >> Yeah, I think that Azure's going to keep to-- Keep growing at a pretty good clip. I think that for Azure, they still have really great account control, even though people like to hate Microsoft. The Microsoft sellers that are out there making those companies successful day after day have really done a good job of being in those accounts and helping people. I was recently over in the UK. And the UK market between AWS and Azure is pretty amazing, how much Azure there is. And it's growing within Europe in general. In the states, it's, you know, I think it's growing well. I think it's still growing, probably not as fast as it is outside the U.S. But, you go down to someplace like Australia, it's also Azure. You hear about Azure all the time. >> Why? Is that just because of the Microsoft's software estate? It's just so convenient. >> I think it has to do with, you know, and you can go with the reasoning they don't break out, you know, Office 365 and all of that out of their numbers is because they have-- They're in all of these accounts because the office suite is so pervasive in there. So, they always have reasons to go back in and, oh by the way, you're on these old SQL licenses. Let us move you up here and we'll be able to-- We'll support you on the old version, you know, with security and all of these things. And be able to move you forward. So, they have a lot of, I guess you could say, levers to stay in those accounts and be interesting. At least as part of the Cloud estate. I think Amazon, you know, is hitting, you know, the large number. Laws of large numbers. 
But I think that they're also going through, and I think this was seen in the layoffs that they were making, that they're looking to understand and have profitability in more of those services that they have. You know, over 350 odd services that they have. And you know, as somebody who went there and helped to start yet a new one, while I was there. And finally, it went to beta back in September, you start to look at the fact that, that number of services, people, their own sellers don't even know all of their services. It's impossible to comprehend and sell that many things. So, I think what they're going through is really looking to rationalize a lot of what they're doing from a services perspective going forward. They're looking to focus on more profitable services and bringing those in. Because right now it's built like a layer cake where you have, you know, S3, EBS, and EC2 on the bottom of the layer cake. And then maybe you have, you're using IAM, the authorization and authentication in there and you have all these different services. And then they call it EMR on top. And so, EMR has to pay for that entire layer cake just to go and compete against somebody like Mongo or something like that. So, you start to unwind the costs of that. Whereas Azure went and built basically ground-up services for the most part. And Google kind of falls somewhere in between in how they build their-- They're a sort of layer cake type effect, but not as many layers I guess you could say. >> I feel like, you know, Amazon's trying to be a platform for the ecosystem. Yes, they have their own products and they're going to sell. And that's going to drive their profitability 'cause they don't have to split the pie. But, they're taking a piece of-- They're spinning the meter, as Zeus Kerravala likes to say, every time Snowflake or Databricks or Mongo or Atlas is, you know, running on their system. They take a piece of the action. Now, Microsoft does that as well. 
But, you look at Microsoft and security, head-to-head competitors, for example, with a CrowdStrike or an Okta in identity. Whereas, it seems like at least for now, AWS is a more friendly place for the ecosystem. At the same time, you do a lot of business in Microsoft. >> Yeah. And I think that a lot of companies have always feared that Amazon would just throw, you know, bodies at it. And I think that people have come to the realization that a two-pizza team, as Amazon would call it, is eight people. I think that's, you know, two slices per person. I'm a little bit fat, so I don't know if that's enough. But, you start to look at it and go, okay, if they're going to start out with eight engineers, if I'm a startup and they're part of my ecosystem, do I really fear them or should I really embrace them and try to partner closer with them? And I think the smart people and the smart companies are partnering with them because they're realizing, Amazon, unless they can see it to, you know, a $100 million, $500 million market, they're not going to throw eight to 16 people at a problem. I think when, you know, you could say, you could look at Elastic with OpenSearch and what they did there. And the licensing terms and the battle they went through. But they knew that Elastic had a huge market. Also, you had a number of ecosystem companies building on top of now OpenSearch, that are now domain on top of Amazon as well. So, I think Amazon's being pretty strategic in how they're doing it. I think some of the-- It'll be interesting. I think this year is a payout year for the cuts that they're making to some of the services internally to kind of, you know, how do we take the fat off some of those services that-- You know, you look at Alexa. I don't know how much revenue Alexa really generates for them. But it's a means to an end for a number of different other services and partners. >> What do you make of this ChatGPT? I mean, Microsoft obviously is playing that card. 
You want to, you want ChatGPT in the Cloud, come to Azure. Seems like AWS has to respond. And we know Google is, you know, sharpening its knives to come up with its response. >> Yeah, I mean Google just went and talked about Bard for the first time this week and they're in private preview or I guess they call it beta, but. Right at the moment to select, select AI users, which I have no idea what that means. But that's a very interesting way that they're marketing it out there. But, I think that Amazon will have to respond. I think they'll be more measured than say, what Google's doing with Bard and just throwing it out there to, hey, we're going into beta now. I think they'll look at it and see where do we go and how do we actually integrate this in? Because they do have a lot of components of AI and ML underneath the hood that other services use. And I think that, you know, they've learned from that. And I think that they've already done a good job. Especially for media and entertainment when you start to look at some of the ways that they use it for helping do graphics and helping to do drones. I think part of their buy of iRobot was the fact that iRobot was a big user of RoboMaker, which is using different models to train those robots to go around objects and things like that, so. >> Quick touch on Kubernetes, the whole DevOps world we just covered. CloudNativeSecurityCon, CNCF, the security conference up in Seattle last week. First time they spun that out, kind of like re:Inforce, you know, AWS spins out re:Inforce from re:Invent. Amsterdam's coming up soon, the KubeCon. What should we expect? What's hot in Kubeland? >> Yeah, I think, you know, Kubes, you're going to be looking at how OpenShift keeps growing and I think to that respect you get to see the momentum with people like Red Hat. You see others coming up and realizing how OpenShift has gone to market as being, like you were saying, partnering with those Clouds and really making it simple. 
I think the simplicity and the manageability of Kubernetes is going to be at the forefront. I think a lot of the investment is still going into, how do I bring observability and DevOps and AIOps and MLOps all together. And I think that's going to be a big place where people are going to be looking to see what comes out of KubeCon in Amsterdam. I think it's that manageability, ease of use. >> Well Rob, I look forward to working with you on behalf of the whole Cube team. We're going to do more of these and go out to some shows, extract the signal from the noise. Really appreciate you coming into our studio. >> Well, thank you for having me on. Really appreciate it. >> You're really welcome. All right, keep it right there, or thanks for watching. This is Dave Vellante for the Cube. And we'll see you next time. (light music)

Published Date : Feb 7 2023


ENTITIES

Entity	Category	Confidence
Amazon	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Bob Evans	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
HP	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Rob	PERSON	0.99+
Google	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Rob Strechay	PERSON	0.99+
New York	LOCATION	0.99+
September	DATE	0.99+
Seattle	LOCATION	0.99+
January	DATE	0.99+
Dev Ittycheria	PERSON	0.99+
HPE	ORGANIZATION	0.99+
NetApp	ORGANIZATION	0.99+
Amsterdam	LOCATION	0.99+
75%	QUANTITY	0.99+
UK	LOCATION	0.99+
AWSs	ORGANIZATION	0.99+
June	DATE	0.99+
Snowplow	ORGANIZATION	0.99+
eight	QUANTITY	0.99+
80%	QUANTITY	0.99+
Scarpelli	PERSON	0.99+
15%	QUANTITY	0.99+
Australia	LOCATION	0.99+
Mongo	ORGANIZATION	0.99+
Slootman	PERSON	0.99+
two-year	QUANTITY	0.99+
AMD	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
Databricks	ORGANIZATION	0.99+
six factors	QUANTITY	0.99+
three	QUANTITY	0.99+
Merrill Lynch	ORGANIZATION	0.99+
Last June	DATE	0.99+
five countries	QUANTITY	0.99+
eight people	QUANTITY	0.99+
U.S.	LOCATION	0.99+
last week	DATE	0.99+
16 people	QUANTITY	0.99+
Databricks'	ORGANIZATION	0.99+

Breaking Analysis: Cloud players sound a cautious tone for 2023

>> From the Cube Studios in Palo Alto and Boston, bringing you data-driven insights from the Cube and ETR. This is Breaking Analysis with Dave Vellante. >> The unraveling of market enthusiasm continued in Q4 of 2022 with the earnings reports from the US hyperscalers, the big three now all in. As we said earlier this year, even the cloud isn't immune from the macro headwinds, and the cracks in the armor that we saw from the data that we shared last summer, they're playing out into 2023. For the most part actuals are disappointing beyond expectations, including our own. It turns out that our estimates for the big three hyperscalers' revenue missed by 1.2 billion, or 2.7% lower than we had forecast from even our most recent November estimates. And we expect continued decelerating growth rates for the hyperscalers through the summer of 2023 and we don't think that's going to abate until comparisons get easier. Hello and welcome to this week's Wikibon Cube Insights powered by ETR. In this Breaking Analysis, we share our view of what's happening in cloud markets, not just for the hyperscalers but other firms that have hitched a ride on the cloud. And we'll share new ETR data that shows why these trends are playing out, tactics that customers are employing to deal with their cost challenges, and how long the pain is likely to last. You know, riding the cloud wave, it's a two-edged sword. Let's look at the players that have gone all in on or are exposed to both the positive and negative trends of cloud. Look, the cloud has been a huge tailwind for so many companies like Snowflake and Databricks, Workday, Salesforce, Mongo's move with Atlas, Red Hat's cloud strategy with OpenShift and so forth. And you know, the flip side is because cloud is elastic, what comes up can also go down very easily. Here's an XY graphic from ETR that shows spending momentum or net score on the vertical axis and market presence in the dataset on the horizontal axis, pervasion or what's called overlap. 
This is data from the January 2023 survey and the red dotted lines show the positions of several companies that we've highlighted going back to January 2021. So let's unpack this for a bit starting with the big three hyperscalers. The first point is AWS and Azure continue to solidify their moat relative to Google Cloud Platform. And we're going to get into this in a moment, but Azure and AWS revenues are five to six times that of GCP for IaaS. And at those deltas, Google should be gaining ground much faster than the big two. The second point on Google is notice the red line on GCP relative to its starting point. While it appears to be gaining ground on the horizontal axis, its net score is now below that of AWS and Azure in the survey. So despite its significantly smaller size it's just not keeping pace with the leaders in terms of market momentum. Now looking at AWS and Microsoft, what we see is basically AWS is holding serve. As we know both Google and Microsoft benefit from including SaaS in their cloud numbers. So the fact that AWS hasn't seen a huge downward momentum relative to a January 2021 position is one positive in the data. And both companies are well above that magic 40% line on the Y-axis, anything above 40% we consider to be highly elevated. But the fact remains that they're down as are most of the names on this chart. So let's take a closer look. I want to start with Snowflake and Databricks. Snowflake, as we reported from several quarters back, came down to Earth; it was up in the 80% range on the Y-axis here. And it's still highly elevated in the 60% range and it continues to move to the right, which is positive, but as we'll address in a moment its customers can dial down consumption just as in any cloud. Now, Databricks is really interesting. It's not a public company, it never made it to IPO during the sort of tech bubble. So we don't have the same level of transparency that we do with other companies that did make it through. 
But look at how much more prominent it is on the X-axis relative to January 2021. And its net score has basically held up over that period of time. So that's a real positive for Databricks. Next, look at Workday and Salesforce. They've held up relatively well, both inching to the right and generally holding their net scores. Same for Mongo, which is the brown dot above the name that says Elastic. It gets a little crowded, and Elastic's actually the blue dot above it. But generally, SaaS is harder to dial down, Workday, Salesforce, Oracle's SaaS and others. So it's harder to dial down because commitments have been made in advance, they're kind of locked in. Now, one of the discussions from last summer was, is Mongo less discretionary than analytics, i.e. Snowflake? And it's an interesting debate but maybe Snowflake customers, you know, they're also generally committed to a dollar amount. So over time the spending is going to be there. But in the short term, yeah maybe Snowflake customers can dial down. Now that highlighted dotted red line, that bolded one is Datadog and you can see it's made major strides on the X-axis but its net score has decelerated quite dramatically. OpenShift's momentum in the survey has dropped although IBM just announced that OpenShift has a billion-dollar ARR and I suspect what's happening there is IBM consulting is bundling OpenShift into its modernization projects. It's got that sort of captive base if you will. And as such it's probably not as top of mind to the respondents but I'll bet you the developers are certainly aware of it. Now the other really notable call out here is Cloudflare. We've reported on them earlier. Cloudflare's net score has held up really well since January of 2021. It really hasn't seen the downdraft of some of these others, but it's making major, major moves to the right, gaining market presence. We really like how Cloudflare is performing. 
And the last comment is on Oracle which as you can see, despite its much, much lower net score continues to gain ground in the market and thrive from a profitability standpoint. But the data pretty clearly shows that there's a downdraft in the market. Okay, so what's happening here? Let's dig deeper into this data. Here's a graphic from the most recent ETR drill down asking customers that said they were going to cut spending what technique they're using to do so. Now, as we've previously reported, consolidating redundant vendors is by far the most cited approach but there's two key points we want to make here. One is reducing excess cloud resources. As you can see in the bars, it's the second most cited technique and it's up from the previous polling period. The second we're not showing, you know, directly but we've got some red call outs there. Reducing cloud costs jumps to 29% and 28% respectively in financial services and tech/telco. And it's much closer to second. It's basically neck and neck with consolidating redundant vendors in those two industries. So they're being really aggressive about optimizing cloud cost. Okay, so as we said, cloud is great 'cause you can dial it up but it's just as easy to dial down. We've identified six factors that customers tell us are affecting their cloud consumption and there are probably more, if you got more we'd love to hear them, but these are the ones that are fairly prominent that have hit our radar. First, rising mortgage rates mean banks are processing fewer loans, which means less cloud. The crypto crash means less trading activity and that means less cloud resources. Third, lower ad spend has led companies to reduce not only, you know, their ad buying but also their frequency of running their analytics and their calculations. And they're also often using less data, maybe compressing the timeframe of the corpus down to a shorter time period. Also very prominent is, down to the bottom left, using lower cost compute instances. 
For example, Graviton from AWS or AMD chips, and tiering storage to cheaper S3 or deep archive tiers. And finally, optimizing based on better pricing plans. So customers are moving from, you know, smaller companies in particular moving maybe from on demand or other larger companies that are experimenting using on demand, or they're moving to spot pricing or reserved instances or optimized savings plans. That all lowers cost and that means less cloud resource consumption and less cloud revenue. Now in the days when everything was on prem, CFOs, what would they do? They would freeze CapEx and IT pros would have to try to do more with less and often that meant a lot of manual tasks. With the cloud it's much easier to move things around. It still takes some thinking and some effort but it's dramatically simpler to do so. So you can get those savings a lot faster. Now of course the other huge factor is you can cut or you can freeze. And this graphic shows data from a recent ETR survey with 159 respondents and you can see the meaningful uptick in hiring freezes, freezing new IT deployments and layoffs. And as we've been reporting, this has been trending up since earlier last year. And note the call out, this is especially prominent in retail sectors, all three of these techniques jump up in retail and that's a bit of a concern because oftentimes consumer spending helps the economy make a softer landing out of a pullback. But this is a potential canary in the coal mine. If retail firms are pulling back it's because consumers aren't spending as much. And so we're keeping a close eye on that. So let's boil this down to the market data and what this all means. So in this graphic we show our estimates for Q4 IaaS revenues compared to the "actual" IaaS revenues. And we say quote because AWS is the only one that reports, you know, clean revenue in IaaS. Azure and GCP don't report actuals. Why would they? Because it would make them look even, you know, smaller relative to AWS. 
Rather, they bury the figures in overall cloud which includes their, you know, G-Suite for Google and all the Microsoft SaaS. And then they give us little tidbits about, in Microsoft's case, Azure, they give growth rates. Google gives kind of relative growth of GCP. So, and we use survey data and you know, other data to try to really pinpoint and we've been covering this for, I don't know, five or six years ever since the cloud really became a thing. But looking at the data, we had AWS growing at 25% this quarter and it came in at 20%. So a significant decline relative to our expectations. AWS announced that it exited December, actually, sorry, its January data showed about a 15% mid-teens growth rate. So that's, you know, something we're watching. Azure was two points off our forecast coming in at 38% growth. It said it exited December in the 35% growth range and it said that it's expecting five points of deceleration off of that. So think 30% for Azure. GCP came in three points off our expectation coming in at 35% and Alibaba has yet to report but we've shaved a bit off that forecast based on some survey data and you know what, maybe 9% is even still not enough. Now for the year, the big four hyperscalers generated almost 160 billion of revenue, but that was 7 billion lower than what we expected coming into 2022. For 2023, we're expecting 21% growth for a total of 193.3 billion. And while it's, you know, lower, you know, significantly lower than historical expectations it's still four to five times the overall spending forecast that we just shared with you in our predictions post of between 4 and 5% for the overall market. We think AWS is going to come in at around 93 billion this year with Azure closing in at over 71 billion. This is, again, we're talking IaaS here. Now, despite Amazon focusing investors on the fact that AWS's absolute dollar growth is still larger than its competitors. 
By our estimates Azure will come in at more than 75% of AWS's forecasted revenue. That's a significant milestone. AWS's operating margins by the way declined significantly this past quarter, dropping from 30% of revenue to 24%, 30% the year earlier to 24%. Now that's still extremely healthy and we've seen wild fluctuations like this before so I don't get too freaked out about that. But I'll say this, Microsoft has a marginal cost advantage relative to AWS because one, it has a captive cloud on which to run its massive software estate. So it can just throw software at its own cloud, and two, software marginal costs. Marginal economics: despite AWS's awesomeness and high degrees of automation, software is just a better business. Now the upshot for AWS is the ecosystem. AWS is essentially in our view positioning very smartly as a platform for data partners like Snowflake and Databricks, security partners like CrowdStrike and Okta and Palo Alto and many others and SaaS companies. You know, Microsoft is more competitive even though AWS does have competitive products. Now of course Amazon's competitive to retail companies so that's another factor but generally speaking for tech players, Amazon is a really thriving ecosystem that is a secret weapon in our view. AWS is happy to spin the meter with its partners even though it sells competitive products, you know, more so in our view than other cloud players. Microsoft, of course, don't forget, is hyping now, we're hearing a lot, OpenAI and ChatGPT. We reported last week in our predictions post how OpenAI has shot up in terms of market sentiment in ETR's emerging technology company surveys, and people are moving to Azure to get OpenAI and get ChatGPT. That is an interesting lever. Amazon in our view has to have a response. They have lots of AI and they're going to have to make some moves there. Meanwhile, Google is emphasizing itself as an AI first company. 
In fact, Google spent at least five minutes of continuous dialogue, nonstop, on its AI chops during its latest earnings call. So that's an area that we're watching very closely as the buzz around large language models continues. All right, let's wrap up with some assumptions for 2023. We think SaaS players are going to continue to be sticky. They're going to be somewhat insulated from all these downdrafts because they're so tied in and customers, you know, they make the commitment up front, you've got the lock in. Now having said that, we do expect some backlash over time on the onerous and generally customer unfriendly pricing models of most large SaaS companies. But that's going to play out over a longer period of time. Now for cloud generally and the hyperscalers specifically we do expect decelerating growth rates into Q3 but the amplitude of the demand swings from this rubber band economy, we expect to continue to compress and become more predictable throughout the year. Estimates are coming down, CEOs we think are going to be more cautious when the market snaps back, more cautious about hiring and spending, and as such perhaps we expect a more orderly return to growth which we think will slightly accelerate in Q4 as comps get easier. Now of course the big risk to these scenarios is of course the economy, the Fed, consumer spending, inflation, supply chain, energy prices, wars, geopolitics, China relations, you know, all the usual stuff. But as always with our partners at ETR and the Cube community, we're here for you. We have the data and we'll be the first to report when we see a change at the margin. Okay, that's a wrap for today. I want to thank Alex Morrison who's on production and manages the podcast, Ken Schiffman as well out of our Boston studio getting this up on LinkedIn Live. Thank you for that. Kristen Martin also and Cheryl Knight help get the word out on social media and in our newsletters. 
And Rob Hof is our Editor-in-Chief over at siliconangle.com. He does some great editing for us. Thank you all. Remember all these episodes are available as podcasts. Wherever you listen, just search Breaking Analysis podcast. I publish each week on wikibon.com and siliconangle.com where you can see all the data. And if you want to get in touch, all you have to do is email me david.vellante@siliconangle.com or DM me @dvellante. If you got something interesting, I'll respond. If you don't, it's either 'cause I'm swamped or it's just not tickling me. You can comment on our LinkedIn post as well. And please check out ETR.ai for the best survey data in the enterprise tech business. This is Dave Vellante for the Cube Insights powered by ETR. Thanks for watching and we'll see you next time on Breaking Analysis. (gentle upbeat music)
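
The forecast figures quoted in this episode can be sanity-checked with a little arithmetic. The following is a minimal Python sketch under stated assumptions: the function and constant names are illustrative only (they are not from ETR or theCUBE), and the inputs are simply the numbers as spoken in the transcript, in billions of US dollars.

```python
# Back-of-the-envelope check of figures quoted in the episode.
# All names here are hypothetical; inputs are the numbers as spoken.

FORECAST_2023_TOTAL = 193.3   # big four hyperscalers, 2023 forecast ($B)
GROWTH_2023 = 0.21            # forecast growth rate for 2023
AWS_2023 = 93.0               # "around 93 billion" ($B)
AZURE_2023 = 71.0             # "over 71 billion" ($B), used as a floor
Q4_MISS = 1.2                 # Q4 shortfall versus forecast ($B)
Q4_MISS_PCT = 0.027           # the same shortfall as a share of the forecast


def implied_base(total: float, growth: float) -> float:
    """Prior-year revenue implied by a forecast total and a growth rate."""
    return total / (1.0 + growth)


def share_of(part: float, whole: float) -> float:
    """Ratio of one revenue figure to another."""
    return part / whole


def implied_forecast(miss: float, miss_pct: float) -> float:
    """Forecast implied by an absolute miss and its percentage of the forecast."""
    return miss / miss_pct


if __name__ == "__main__":
    base_2022 = implied_base(FORECAST_2023_TOTAL, GROWTH_2023)
    print(f"Implied 2022 big-four revenue: {base_2022:.1f}B")
    print(f"Azure as share of AWS: {share_of(AZURE_2023, AWS_2023):.1%}")
    print(f"Implied Q4 big-three forecast: {implied_forecast(Q4_MISS, Q4_MISS_PCT):.1f}B")
    # Cloud growth relative to the 4-5% overall spending forecast.
    print(f"Growth multiple vs. market: {GROWTH_2023 / 0.05:.1f}x to {GROWTH_2023 / 0.04:.1f}x")
```

Under these inputs the implied 2022 base lands just under 160 billion, which matches the "almost 160 billion" quoted for the year, and the Azure-to-AWS ratio comes out a little above the 75% milestone mentioned.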

Published Date : Feb 4 2023


ENTITIES

Entity	Category	Confidence
Alex Morrison	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Alibaba	ORGANIZATION	0.99+
Cheryl Knight	PERSON	0.99+
Kristen Martin	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Ken Schiffman	PERSON	0.99+
January 2021	DATE	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Rob Hof	PERSON	0.99+
2.7%	QUANTITY	0.99+
January	DATE	0.99+
Amazon	ORGANIZATION	0.99+
December	DATE	0.99+
January of 2021	DATE	0.99+
five	QUANTITY	0.99+
January 2023	DATE	0.99+
Snowflake	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
1.2 billion	QUANTITY	0.99+
20%	QUANTITY	0.99+
IBM	ORGANIZATION	0.99+
Databricks	ORGANIZATION	0.99+
29%	QUANTITY	0.99+
30%	QUANTITY	0.99+
six factors	QUANTITY	0.99+
second point	QUANTITY	0.99+
24%	QUANTITY	0.99+
2022	DATE	0.99+
david.vellante@siliconangle.com	OTHER	0.99+
X-axis	ORGANIZATION	0.99+
2023	DATE	0.99+
28%	QUANTITY	0.99+
193.3 billion	QUANTITY	0.99+
ETR	ORGANIZATION	0.99+
38%	QUANTITY	0.99+
7 billion	QUANTITY	0.99+
21%	QUANTITY	0.99+
Earth	LOCATION	0.99+
25%	QUANTITY	0.99+
Mongo	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Atlas	ORGANIZATION	0.99+
two industries	QUANTITY	0.99+
last week	DATE	0.99+
six years	QUANTITY	0.99+
first point	QUANTITY	0.99+
Red Hats	ORGANIZATION	0.99+
35%	QUANTITY	0.99+
four	QUANTITY	0.99+
159 respondents	QUANTITY	0.99+
Okta	ORGANIZATION	0.99+

Is Data Mesh the Next Killer App for Supercloud?

(upbeat music) >> Welcome back to our Supercloud 2 event live coverage here of stage performance in Palo Alto syndicating around the world. I'm John Furrier with Dave Vellante. We got exclusive news and a scoop here for SiliconANGLE in theCUBE. Zhamak Dehghani, creator of data mesh has formed a new company called Nextdata.com, Nextdata. She's a cube alumni and contributor to our supercloud initiative, as well as our coverage and Breaking Analysis with Dave Vellante on data, the killer app for supercloud. Zhamak, great to see you. Thank you for coming into the studio and congratulations on your newly formed venture and continued success on the data mesh. >> Thank you so much. It's great to be here. Great to see you in person. >> Dave: Yeah, finally. >> Wonderful. Your contributions to the data conversation have been well documented certainly by us and others in the industry. Data mesh taking the world by storm. Some people are debating it, throwing cold water on it. Some are thinking it's the next big thing. Tell us about the data mesh, super data apps that are emerging out of cloud. >> I mean, data mesh, as you said, the pain points that it surfaced were universal. Everybody said, "Oh, why didn't I think of that?" It was just an obvious next step and people are approaching it, implementing it. I guess the last few years I've been involved in many of those implementations and I guess supercloud is somewhat a prerequisite for it because data mesh, and building applications using data mesh, is about sharing data responsibly across boundaries. And those boundaries include organizational boundaries, cloud technology boundaries, and trust boundaries. >> I want to bring that up because your venture, Nextdata, which is new, just formed. Tell us about that. What wave is that riding? What specifically are you targeting? What's the pain point? >> Absolutely. 
Yes, so Nextdata is the result of, I suppose the pains that I suffered from implementing data mesh for many of the organizations. Basically a lot of organizations that I've worked with they want decentralized data. So they really embrace this idea of decentralized ownership of the data, but yet they want interconnectivity through standard APIs, yet they want discoverability and governance. So they want to have policies implemented, they want to govern that data, they want to be able to discover that data, and yet they want to decentralize it. And we do that with a developer experience that is easy and native to a generalist developer. So we try to find the, I guess the common denominator that solves those problems and enables that developer experience for data sharing. >> Since you just announced the news, what's been the reaction? >> I just announced the news right now, so what's the reaction? >> But people in the industry know you did a lot of work in the area. What have been some of the feedback on the new venture in terms of the approach, the customers, problem? >> Yeah, so we've been in stealth mode so we haven't publicly talked about it, but folks that have been close to us, in fact, have reached out, and we already have implementations of our pilot platform with early customers, which is super exciting. And we're going to have multiple of those. Of course, we're a tiny, tiny company. We can't have many of those, but we are going to have multiple pilot implementations of our platform in the real world, with real global large scale organizations that have real world problems. So we're not going to build our platform in a vacuum. And that's what's happening right now. >> Zhamak, when I think about your role at ThoughtWorks, you had a very wide observation space with a number of clients, helping them implement data mesh and other things as well prior to your data mesh initiative. But when I look at data mesh, at least the ones that I've seen, they're very narrow. 
I think of JPMC, I think of HelloFresh. They're generally, obviously not surprising, they don't include the big vision of inclusivity across clouds, across different data storage. But it seems like people are having to go through some gymnastics to get to the organizational reality of decentralizing data and at least pushing data ownership to the line of business. How are you approaching, or are you approaching solving that problem? Are you taking a narrow slice? What can you tell us about Nextdata? >> Yeah, absolutely. Gymnastics, the cute word to describe what the organizations have to go through. And one of those problems is that the data as you know resides on different platforms, it's owned by different people, is processed by pipelines that who knows who owns them. So there's this very disparate and disconnected set of technologies that were very useful for when we thought about data and processing as a centralized problem. But when you think about data as a decentralized problem the cost of integration of these technologies in a cohesive developer experience is what's missing. And we want to focus on that cohesive end-to-end developer experience to share data responsibly in these autonomous units. We call them data products, I guess in data mesh. That constitutes computation. That governs that data, policies, discoverability. So I guess, I heard this expression in the last talks that you can have your cake and eat it too. So we want people to have their cake, which is data in different places, decentralization, and eat it too, which is interconnected access to it. So we start with standardizing and codifying this idea of a data product container that encapsulates data, computation, APIs to get to it in a technology agnostic way, in an open way. 
And then sit on top and use existing tech, Snowflake, Databricks, whatever exists, the millions of dollars of investments that companies have made, sit on top of those but create this cohesive, integrated experience where data product is a first class primitive. And that's really key here. The language and the modeling that we use is really native to data mesh, which is that I'm building a data product, I'm sharing a data product, and that encapsulates: I'm providing metadata about this. I'm providing computation that's constantly changing the data. I'm providing the API for that. So we're trying to kind of codify and create a new developer experience based on that. And developer, both from provider side and user side, connected to peer-to-peer data sharing with data product as a primitive first class concept. >> So the idea would be developers would build applications leveraging those data products, which are discoverable and governed. Now today you see some companies, take a Snowflake for example, attempting to do that within their own little walled garden. They even at one point used the term mesh. I don't know if they pulled back on that. And then they became aware of some of your work. But a lot of the things that they're doing within their little insulated environment support that governance, they're building out an ecosystem. What's different in your vision? >> Exactly. So we realized that, and this is a reality, like you go to organizations, they have a Snowflake and half of the organization happily operates on Snowflake. And on the other half, "oh, we are on bare infrastructure on AWS or we are on Databricks." This is the reality. This supercloud that's written up here, it's about working across boundaries of technology. So we try to embrace that. And even for our own technology with the way we're building it, we say, "Okay, nobody's going to use Nextdata, data mesh operating system. People will have different platforms."
So you have to build with openness in mind and in the case of Snowflake, I think, they have very, I'm sure very happy customers as long as customers can be on Snowflake. But once you cross that boundary of platforms then that becomes a problem. And we try to keep that in mind in our solution. >> So it's worth reviewing that basically the concept of data mesh is that whether you're a data lake or a data warehouse, an S3 bucket, an Oracle database as well, they should be inclusive inside of the data. >> We did a session with AWS on the startup showcase, data as code. And remember I wrote a blog post in 2007 called "Data as the New Developer Kit." Back then we used to call them developer kits if you remember. And we said at that time, whoever can code data will have a competitive advantage. >> Aren't the machines going to be doing that? Didn't we just hear that? >> Well, we have. Hey, Siri. Hey, Cube, find me that best video for data mesh. There it is. But this is the point, like what's happening is that now data has to be addressable for machines and for coding, because you need to call the data. So the question is how do you manage the complexity of big things as promiscuous as possible, making it available, as well as then governing it? Because it's a trade off. The more you make open, the better the machine learning. But yet the governance issue, so this is the, you need an OS to handle this maybe. >> Yes. So yes, well we call, our mental model for our platform is an OS, operating system. Operating systems have shown us how you can abstract what's complex and take care of a lot of complexities, but yet provide an open and dynamic enough interface. So we think about it that way. Just, we try to solve the problem of policies live with the data, and enforcement of the policies happens at the most granular level, which is in this concept of the data product. And that would happen whether you read, write or access a data product.
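To make the idea of policies that live with the data and are enforced inside the data product a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `DataProduct`, `mask_email`, and `hide_deleted` are hypothetical and are not taken from Nextdata's platform, and the policies are simple callables standing in for whatever policy definitions an organization would write.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical policy shape: given the caller and one row, return the row to
# expose (possibly altered) or None to withhold the row entirely.
Policy = Callable[[str, dict], Optional[dict]]


@dataclass
class DataProduct:
    """Illustrative container: the data and its policies travel together."""
    name: str
    owner: str
    rows: List[dict] = field(default_factory=list)
    policies: List[Policy] = field(default_factory=list)

    def read(self, caller: str) -> List[dict]:
        """Every read goes through every policy, row by row."""
        visible = []
        for row in self.rows:
            current: Optional[dict] = dict(row)  # never hand out the stored row
            for policy in self.policies:
                current = policy(caller, current)
                if current is None:
                    break
            if current is not None:
                visible.append(current)
        return visible


def mask_email(caller: str, row: dict) -> Optional[dict]:
    """Assumed example policy: only the owning domain sees raw emails."""
    if caller != "e-commerce":
        return {**row, "email": "***"}
    return row


def hide_deleted(caller: str, row: dict) -> Optional[dict]:
    """Assumed example policy: rows flagged as deleted are never exposed."""
    return None if row.get("deleted") else row


orders = DataProduct(
    name="orders",
    owner="e-commerce",
    rows=[
        {"id": 1, "email": "a@example.com", "deleted": False},
        {"id": 2, "email": "b@example.com", "deleted": True},
    ],
    policies=[hide_deleted, mask_email],
)
```

Calling `orders.read("marketing")` in this sketch returns only the non-deleted row with the email masked, while `orders.read("e-commerce")` returns it unmasked. The point of the design is that a consumer cannot reach the rows without passing through the policies.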
But we can never imagine what these policies could be. So our thinking is we should have a policy, open policy framework that can allow organizations to write their own policy drivers and policy definitions and encode it and encapsulate it in this data product container. But I'm not going to fool myself to say that, that's going to solve the problem that you just described. I think we are in this, I don't know, if I look into my crystal ball, what I think might happen is that right now the primitives that we work with to train machine learning models are still bits and bytes and data. They're fields, rows, columns and that creates quite a large surface area and attack area for privacy of the data. So perhaps one of the trends that we might see is this evolution of data APIs to become more and more computational aware to bring the compute to the data to reduce that surface area. So you can really leave the control of the data to the sovereign owners of that data. So that data product. So I think that evolution of our data APIs perhaps will become more and more computational. So you describe what you want and the data owner decides how to manage. >> That's interesting, Dave, 'cause it's almost like we just talked about ChatGPT in the last segment we had with you. Machine learning has been around the industry. It's almost as if you're starting to see reason come into, the data reasoning is like starting to see not just metadata. Using the data to reason so that you don't have to expose the raw data. So almost like a, I won't say curation layer, but an intelligence layer. >> Zhamak: Exactly. >> Can you share your vision on that? 'Cause that seems to be where the dots are connecting. >> Yes, perhaps further into the future because just from where we stand, we have to create still that bridge of familiarity between that future and present. So we are still in that bridge making mode.
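The idea of bringing the compute to the data, where the caller describes what it wants and the owner decides how to run it, can be sketched roughly as follows. This is an assumption-laden illustration, not a description of any real API: `ComputeRequest`, `OrdersProduct`, and the small whitelist of operations are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical whitelist: the data owner decides which computations it is
# willing to run on a caller's behalf.
ALLOWED_OPERATIONS: Dict[str, Callable[[List[float]], float]] = {
    "count": lambda values: float(len(values)),
    "mean": mean,
    "max": max,
}


@dataclass(frozen=True)
class ComputeRequest:
    """What the caller sends: who it is and what it wants computed."""
    caller: str
    column: str
    operation: str


class OrdersProduct:
    """Illustrative data product that returns results, never raw rows."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows  # raw data stays with its owner

    def compute(self, request: ComputeRequest) -> float:
        if request.operation not in ALLOWED_OPERATIONS:
            raise ValueError(f"operation not offered: {request.operation}")
        values = [float(row[request.column]) for row in self._rows]
        return ALLOWED_OPERATIONS[request.operation](values)


product = OrdersProduct([{"amount": 10.0}, {"amount": 30.0}])
average = product.compute(ComputeRequest("analyst", "amount", "mean"))
```

Here the caller only ever receives an aggregate, which is one way to shrink the exposed surface area compared with handing over fields, rows, and columns.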
However, by just the basic notion of saying, "I'm going to put an API in front of my data." And that API today might be as primitive as a level of indirection, as in you tell me what you want, tell me who you are, let me go process that, all the policies and lineage and insert all of this intelligence that need to happen. And then today, I will still give you a file. But by just defining that API and standardizing it now we have this amazing extension point that we can say, "Well, the next revision of this API, you not just tell me who you are, but you actually tell me what intelligence you're after. What's a logic that I need to go and now compute on your API?" And you can evolve that. Now you have a point of evolution to this very futuristic, I guess, future where you just described the question that you're asking from the ChatGPT. >> Well, this is the supercloud, go ahead, Dave. >> I have a question from a fan, I got to get it in. It's George Gilbert. And so his question is, you're blowing away the way we synchronize data from operational systems to the data stack to applications. So the concern that he has and he wants your feedback on this, is the data product app devs get exposed to more complexity with respect to moving data between data products or maybe it's attributes between data products? How do you respond to that? How do you see? Is that a problem? Is that something that is overstated or do you have an answer for that? >> Absolutely. So I think there's a sweet spot in getting data developers, data product developers closer to the app, but yet not overburdening them with the complexity of the application and application logic and yet reducing their cognitive load by localizing what they need to know about, which is that domain where they're operating within. Because what's happening right now? 
What's happening right now is that data engineers with, a ton of empathy for them for their high threshold of pain that they can deal with, they have been centralized, they've been put into the data team, and they have been given this unbelievable task of make meaning out of data, put semantic over it, curate it, cleanse it, and so on. So what we are saying is that get those folks embedded into the domain closer to the application developers. These are still separately moving units. Your app and your data products are independent, but yet tightly coupled with each other based on the context of the domain. So reduce cognitive load by localizing what they need to know about to the domain, get them closer to the application, but yet have them separate from app because app provides a very different service. Transactional data for my e-commerce transaction. Data product provides a very different service. Longitudinal data for the variety of this intelligent analysis that I can do on the data. But yet it's all within the domain of e-commerce or sales or whatnot. >> It's a lot of decoupling and coupling to create that cohesive architecture. So I have to ask you, this is an interesting question 'cause it came up on theCUBE all last year. Back on the old server data center days and cloud, SRE, Google coined the term, site reliability engineer, for someone to look over the hundreds of thousands of servers. We asked the question to data engineering community who have been suffering, by the way, I agree. Is there an SRE like role for data? Because in a way data engineering, that platform engineer, they are like the SRE for data. In other words managing the large scale to enable automation and self-service. What are your thoughts and reaction to that? >> Yes, exactly. So maybe we go through that history of how SRE came to be. So we had the first DevOps movement, which was remove the wall between dev and ops and bring them together.
So you have one cross-functional unit of the organization that's responsible for you build it, you run it. So then there is no, I'm going to just shoot my application over the wall for somebody else to manage it. So we did that and then we said, okay, there is a ton, as we decentralized and had these many microservices running around, we had to create a layer that abstracted a lot of the complexity around monitoring, observing, and running a lot of them while giving autonomy to this cross-functional team. And that's where the SRE, a new generation of engineers came to exist. So I think if I just look at. >> Hence, Kubernetes. >> Hence, hence, exactly. Hence, chaos engineering. Hence, embracing the complexity and messiness. And putting engineering discipline to embrace that and yet give a cohesive and high integrity experience of those systems. So I think if we look at that evolution, perhaps something like that is happening by bringing data and apps closer and making them these domain-oriented data product teams or domain-oriented cross-functional teams full stop and still have a very advanced maybe at the platform level, infrastructure level operational team that they're not busy doing two jobs, which is taking care of domains and the infrastructure, but they're building infrastructure that is embracing that complexity, interconnectivity of this data process. >> So you see similarities? >> I see, absolutely. But I feel like we're probably in the earlier days of that movement. >> So it's a data DevOps kind of thing happening where scale's happening. It's good things are happening, yet a little bit fast and loose with some complexities to clean up. >> Yes. This is a different restructure. As you said, the job of this industry as a whole, an architect, is decompose recompose, decompose recompose in a new way and now we're like decomposing centralized team, recomposing them as domains. >> So is data mesh the killer app for supercloud?
>> You had to do this to me. >> Sorry, I couldn't resist. >> I know. Of course you want me to say this. >> Yes. >> Yes, of course. I mean, supercloud, I think it's really, the terminology supercloud, open cloud, but I think in spirits of it this embracing of diversity and giving autonomy for people to make decisions for what's right for them and not yet lock them in. I think just embracing that is baked into how data mesh assume the world would work. >> Well, thank you so much for coming on Supercloud 2. We really appreciate it. Data has driven this conversation. Your success of data mesh has really opened up the conversation and exposed the slow moving data industry. >> Dave: Been a great catalyst. >> That's now going well. We can move faster. So thanks for coming on. >> Thank you for hosting me. It was wonderful. >> Supercloud 2 live here in Palo Alto, our stage performance. I'm John Furrier with Dave Vellante. We'll back with more after this short break. Stay with us all day for Supercloud 2. (upbeat music)

Published Date : Jan 25 2023

Is Supercloud an Architecture or a Platform | Supercloud2

(electronic music) >> Hi everybody, welcome back to Supercloud 2. I'm Dave Vellante with my co-host John Furrier. We're here at our tricked out Palo Alto studio. We're going live wall to wall all day. We're inserting a number of pre-recorded interviews, folks like Walmart. We just heard from Nir Zuk of Palo Alto Networks, and I'm really pleased to welcome in David Flynn. David Flynn, you may know as one of the people behind Fusion-io, completely changed the way in which people think about storing data, accessing data. David Flynn now the founder and CEO of a company called Hammerspace. David, good to see you, thanks for coming on. >> David: Good to see you too. >> And Dr. Nelu Mihai is the CEO and founder of Cloud of Clouds. He's actually built a Supercloud. We're going to get into that. Nelu, thanks for coming on. >> Thank you, Happy New Year. >> Yeah, Happy New Year. So I'm going to start right off with a little debate that's going on in the community if you guys would bring out this slide. So Bob Muglia early today, he gave a definition of Supercloud. He felt like we had to tighten ours up a little bit. He said a Supercloud is a platform, underscoring platform, that provides programmatically consistent services hosted on heterogeneous cloud providers. Now, Nelu, we have this shared doc, and you've been in there. You responded, you said, well, hold on. Supercloud really needs to be an architecture, or else we're going to have this stove pipe of stove pipes, really. And then you went on with more detail, what's the information model? What's the execution model? How are users going to interact with Supercloud? So I start with you, why architecture? The inference is that a platform, the platform provider's responsible for the architecture? Why does that not work in your view? >> No, the, it's a very interesting question. So whenever I think about platform, what's the connotation, you think about monolithic system? 
Yeah, I mean, I don't know whether it's true or not, but there is this connotation of monolithic. On the other hand, if you look at what's a problem right now with HyperClouds, from the customer perspective, they're very complex. There is a heterogeneous world where actually every single one of these HyperClouds has its own architecture. You need rocket scientists to build cloud applications. Always there is this contradiction between cost and performance. They fight each other. And I'm quoting here a former friend of mine from Bell Labs who worked at AWS who used to say "Cloud is cheap as long as you don't use it too much." (group chuckles) So clearly we need something that kind of plays from the principle point of view the role of an operating system, that sits on top of these heterogeneous HyperClouds, and there's nothing wrong by having these proprietary HyperClouds, think about processors, think about operating system and so on, so forth. But in order to build a system that is simple enough, I think we need to go deeper and understand. >> So the argument, the counterargument to that, David, is you'll never get there. You need a proprietary system to get to market sooner, to solve today's problem. Now I don't know where you stand on this platform versus architecture. I haven't asked you, but. >> I think there are aspects of both for sure. I mean it needs to be an architecture in the sense that it's broad based and open and so forth. But you know, platform, you could say as long as people can instantiate it themselves, on their own infrastructure, as long as it's something that can be deployed as, you know, software defined, you don't want the concept of platform being the monolith, you know, combined hardware and software. So it really depends on what you're focused on when you're saying platform, you know, I'd say as long as it's a software-defined thing, to where it can literally run anywhere.
I mean, because I really think what we're talking about here is the original concept of cloud computing. The ability to run anything anywhere, without having to care about the physical infrastructure. And what we have today is not that, the cloud today is a big mainframe in the sky, that just happens to be large enough that once you select which region, generally you have enough resources. But, you know, nowadays you don't even necessarily have enough resources in one region, and then you're kind of stuck. So we haven't really gotten to that utility model of computing. And you're also asked to rewrite your application, you know, to abandon the conveniences of high performance file access. You got to rewrite it to use object storage stuff. We have to get away from that. >> Okay, I want to just drill on that, 'cause I think I like that point about, there's not enough availability, but on the developer cloud, the original AWS premise was targeting developers, 'cause at that time, you have to provision a Sun box, get a Cisco DSU/CSU, now you get on the cloud. But I think you're giving up the scale question, 'cause I think right now, scale is huge, enterprise grade versus cloud for developers. >> That's right. >> Because I mean look at, Amazon, Azure, they got compute, they got storage, they got queuing, and some stuff. If you're doing a startup, you throw your app up there, localhost to cloud, no big deal. It's the scale thing that gets me- >> And you can tell by the fact that, in regions that are under high demand, right, like in London or LA, at least with the clients we work with in the media and entertainment space, it costs twice as much for the exact same cloud instances that do the exact same amount of work, as somewhere out in rural Canada. So why is it you have such a cost differential, it has to do with that supply and demand, and the fact that the clouds aren't really the ability to run anything anywhere.
Even within the same cloud vendor, you're stuck in a specific region. >> And that was never the original promise, right? I mean it was, we turned it into that. But the original promise was get rid of the heavy lifting of IT. >> Not have to run your own, yeah, exactly. >> And then it became, wow, okay I can run anywhere. And then you know, it's like web 2.0. You know people say why Supercloud, you and I talked about this, why do you need a name for Supercloud? It's like web 2.0. >> It's what Cloud was supposed to be. >> It's what cloud was supposed to be, (group laughing and talking) exactly, right. >> Cloud was supposed to be run anything anywhere, or at least that's what we took it as. But you're right, originally it was just, oh don't have to run your own infrastructure, and you can choose somebody else's infrastructure. >> And you did that >> But you're still bound to that. >> Dave: And people said I want more, right? >> But how do we go from here? >> That's, that's actually, that's a very good point, because indeed when the first HyperClouds were designed, they were designed really focused on customers. I think Supercloud is an opportunity to design in the right way. Also having in mind the computer science rigor. And we should take advantage of that, because in fact actually, if cloud would've been designed properly from the beginning, probably wouldn't have needed Supercloud. >> David: You wouldn't have to have been asked to rewrite your application. >> That's correct. (group laughs) >> To use REST interfaces to your storage. >> Revisionist history is always a good one. But look, cloud is great. I mean your point is cloud is a good thing. Don't hold it back. >> It is a very good thing. >> Let it continue. >> Let it go as it is. >> Yeah, let that thing continue to grow. Don't impose restrictions on the cloud. Just refactor what you need to for scale or enterprise grade or availability. >> And you would agree with that, is that true or is it a problem you're solving?
>> Well yeah, I mean it, what the cloud is doing is absolutely necessary. What the public cloud vendors are doing is absolutely necessary. But what's been missing is how to provide a consistent interface, especially to persistent data. And have it be available across different regions, and across different clouds. 'cause data is a highly localized thing in current architecture. It only exists as rendered by the storage system that you put it in. Whether that's a legacy thing like a NetApp or an Isilon or even a cloud data service. It's localized to a specific region of the cloud in which you put that. We have to delocalize data, and provide a consistent interface to it across all sites. That's high performance, local access, but to global data. >> And so Walmart earlier today described their, what we call Supercloud, they call it the Walmart cloud native platform. And they use this triplet model. They have AWS and Azure, no, oh sorry, no AWS. They have Azure and GCP and then on-prem, where all the VMs live. When you, you know, probe, it turns out that it's only stateless in the cloud. (John laughs) So, the state stuff- >> Well let's just admit it, there is no such thing as stateless, because even the application binaries and libraries are state. >> Well I'm happy that I'm hearing that. >> Yeah, okay. >> Because actually I have a lot of debate (indistinct). If you think about no software running on a (indistinct) machine is stateless. >> David: Exactly. >> This is something that was- >> David: And that's data that needs to be distributed and provided consistently >> (indistinct) >> Across all the clouds, >> And actually, it's a nonsense, but- >> Dave: So it's an illusion, okay. (group talks over each other) >> (indistinct) you guys talk about stateless. >> Well, see, people make the confusion between state and persistent state, okay. Persistent state it's a different thing. State is a different thing. 
So, but anyway, I want to go back to your point, because there's a lot of debate here. People are talking about data, some people are talking about logic, some people are talking about networking. In my opinion is this triplet, which is data logic and connectivity, that has equal importance. And actually depending on the application, can have the center of gravity moving towards data, moving towards what I call execution units or workloads. And connectivity is actually the most important part of it. >> David: (indistinct). >> Some people are saying move the logic towards the data, some other people, and you are saying actually, that no, you have to build a distributed data mesh. What I'm saying is actually, you have to consider all these three variables, all these vector in order to decide, based on application, what's the most important. Because sometimes- >> John: So the application chooses >> That's correct. >> Well it it's what operating systems were in the past, was principally the thing that runs and manages the jobs, the job scheduler, and the thing that provides your persistent data (indistinct). >> Okay. So we finally got operating system into the equation, thank you. (group laughs) >> Nelu: I actually have a PhD in operating system. >> Cause what we're talking about is an operating system. So forget platform or architecture, it's an operating environment. Let's use it as a general term. >> All right. I think that's about it for me. >> All right, let's take (indistinct). Nelu, I want ask you quick, 'cause I want to give a, 'cause I believe it's an operating system. I think it's going to be a reset, refactored. You wrote to me, "The model of Supercloud has to be open theoretical, has to satisfy the rigors of computer science, and customer requirements." So unique to today, if the OS is going to be refactored, it's not going to be, may or may not be Red Hat or somebody else. 
This new OS, obviously requirements are for customers too but is what's the computer science that is needed? Where are we, what's the missing? Where's the science in this shift? It's not your standard OS it's not like an- (group talks over each other) >> I would beg to differ. >> (indistinct) truly an operation environment. But the, if you think about, and make analogies, what you need when you design a distributed system, well you need an information model, yeah. You need to figure out how the data is located and distributed. You need a model for the execution units, and you need a way to describe the interactions between all these objects. And it is my opinion that we need to go deeper and formalize these operations in order to make a step forward. And when we design Supercloud, and design something that is better than the current HyperClouds. And actually that is when we design something better, you make a system more efficient and it's going to be better from the cost point of view, from the performance point of view. But we need to add some math into all this customer focus centering and I really admire AWS and their executive team focusing on the customer. But now it's time to go back and see, if we apply some computer science, if you try to formalize to build a theoretical model of cloud, can we build a system that is better than existing ones? >> So David, how do you- >> This is what I'm saying. >> That's a good question >> How do you see the operating system of a, or operating environment of a decentralized cloud? >> Well I think it's layered. I mean we have operating systems that can run systems quite efficiently. Linux has sort of won in the data center, but we're talking about a layer on top of that. And I think we're seeing the emergence of that. For example, on the job scheduling side of things, Kubernetes makes a really good example.
You know, you break the workload into the most granular units of compute, the containerized microservice, and then you use a declarative model to state what is needed and give the system the degrees of freedom that it can choose how to instantiate it. Because the thing about these distributed systems, is that the complexity explodes, right? Running a piece of hardware, running a single server is not a problem, even with all the many cores and everything like that. It's when you start adding in the networking, and making it so that you have many of them. And then when it's going across whole different data centers, you know, so, at that level the way you solve this is not manually (group laughs) and not procedurally. You have to change the language so it's intent based, it's a declarative model, and what you're stating is what is intended, and you're leaving it to more advanced techniques, like machine learning to decide how to instantiate that service across the cluster, which is what Kubernetes does, or how to instantiate the data across the diverse storage infrastructure. And that's what we do. >> So that's a very good point because actually what has been neglected with HyperClouds is really optimization and automation. But in order to be able to do both of these things, you need, I'm going back and I'm stubborn, you need to have a mathematical model, a theoretical model because what does automation mean? It means that we have to put machines to do the work instead of us, and machines work with what? Formula, with algorithms, they don't work with services. So I think Supercloud is an opportunity to underscore the importance of optimization and automation- >> Totally agree. >> In HyperCloud, and actually by doing that, we can also have an interesting connotation. We are also contributing to save our planet, because if you think right now. 
we're consuming a lot of energy on these HyperClouds and also all these AI applications, and I think we can do better and build the same kind of application using less energy. >> So yeah, great point, love that call out, the- you know, Dave and I always joke about the old, 'cause we're old, we talk about, you know, (Nelu Laughs) old history, OS/2 versus DOS, okay, OS's, OS/2 is silly better, first threaded OS, DOS never went away. So how does legacy play into this conversation? Because I buy the theoretical, I love the conversation. Okay, I think it's an OS, totally see it that way myself. What's the blocker? Is there a legacy that drags it back? Is the anchor dragging from legacy? Is there a DOS OS/2 moment? Is there an opportunity to flip the script? This is- >> I think that's a perfect example of why we need to support the existing interfaces. Operating systems, real operating systems like Linux, understand how to present data, it's called a file system, block devices, things that plumb in there. And by, you know, going to a REST interface and S3 and telling people they have to rewrite their applications, you can't even consume your application binaries that way, the OS doesn't know how to pull that sort of thing. So we, to get to cloud, to get to the ability to host massive numbers of tenants within a centralized infrastructure, you know, we abandoned these lower level interfaces to the OS and we have to go back to that. It's the reason why DOS ultimately won, is it had the momentum of the install base. We're seeing the same thing here. Whatever it is, it has to be a real file system and not a come down file system >> Nelu, what's your reaction, 'cause you're in the theoretical bandwagon. Let's get your reaction.
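The declarative, intent-based model David Flynn describes above, where you state what is needed and leave the system free to choose how to instantiate it, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `Site`, `Intent`, and `place` are hypothetical names, the prices and slot counts are invented, and a simple cheapest-first greedy choice stands in for the more advanced techniques (such as machine learning) mentioned in the conversation. It is not how Kubernetes or Hammerspace actually schedule work.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Site:
    """A place where work could run; all numbers are invented."""
    name: str
    cost_per_hour: float
    free_slots: int


@dataclass(frozen=True)
class Intent:
    """What the user declares: the outcome, not the steps."""
    service: str
    replicas: int
    max_cost_per_hour: float


def place(intent: Intent, sites: List[Site]) -> Dict[str, int]:
    """Resolve an intent into a placement, cheapest eligible sites first."""
    placement: Dict[str, int] = {}
    remaining = intent.replicas
    for site in sorted(sites, key=lambda s: s.cost_per_hour):
        if remaining == 0:
            break
        if site.cost_per_hour > intent.max_cost_per_hour:
            continue  # violates the declared cost ceiling
        take = min(site.free_slots, remaining)
        if take > 0:
            placement[site.name] = take
            remaining -= take
    if remaining > 0:
        raise RuntimeError(f"cannot satisfy intent for {intent.service}")
    return placement


sites = [
    Site("london", cost_per_hour=2.0, free_slots=5),
    Site("rural-canada", cost_per_hour=1.0, free_slots=2),
    Site("la", cost_per_hour=2.0, free_slots=5),
]
plan = place(Intent("render", replicas=3, max_cost_per_hour=2.0), sites)
```

The caller never names a region. If capacity or prices change, the same intent can be resolved again to a different placement, which is the degree of freedom the declarative approach is meant to give the system.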
>> No, I think it's a good, I'll give, you made a good analogy between OS/2 and DOS, but I'll go even farther saying, if you think about the evolution, the operating system didn't stop the evolution of underlying microprocessors, hardware, and so on and so forth. On the contrary, it was a catalyst for that. So because everybody could develop their own hardware, without worrying that the applications on top of operating system are going to modify. The same thing is going to happen with Supercloud. You're going to have the AWSs, you're going to have the Azure and the GCP continue to evolve in their own way, proprietary. But if we create on top of it the right interface- >> The open, this is why open is important. >> That's correct, because actually you're going to see sometime ago, everybody was saying, remember venture capitalists were saying, "AWS killed the world, nobody's going to come." Now you see what Oracle is doing, and then you're going to see other players. >> It's funny, Amazon's trying to be more like Microsoft. Microsoft's trying to be more like Amazon and Google- Oracle's just trying to say they have cloud. >> That's, that's correct, (group laughs) so, my point is, you're going to see a multiplication of these HyperClouds and cloud technology. So, the system has to be open in order to accommodate what it is and what is going to come. Okay, so it's open. >> So the legacy- so legacy is an opportunity, not a blocker in your mind. And you see- >> That's correct, I think we should allow them to continue to be their own, actually. But maybe you're going to find a way to connect with it. >> Amazon's the processor, and they're on the 8080, right? >> That's correct. >> You're saying you love people trying to get put to work. >> That's a good analogy. >> But, performance levels you say good luck, right? >> Well yeah, we have to be able to take traditional applications, high performance applications, those that consume file system and persistent data. 
Those things have to be able to run anywhere. You need to be able to put, put them onto, you know, more elastic infrastructure. So, we have to actually get cloud to where it lives up to its billing. >> And that's what you're solving for, with Hammerspace, >> That's what we're solving for, making it possible- >> Give me the bumper sticker. >> Solving for how do you have massive quantities of unstructured file data? At the end of the day, all data ultimately is unstructured data. Have that persistent data available, across any data center, within any cloud, within any region on-prem, at the edge. And have not just the same APIs, but have the exact same data sets, and not sucked over a straw remote, but at extreme high performance, local access. So how do you have local access to globally shared distributed data? And that's what we're doing. We are orchestrating data globally across all different forms of storage infrastructure, so you have a consistent access at the highest performance levels, at the lowest level innate built into the OS, how to consume it as (indistinct) >> So are you going into the- all the clouds and natively building in there, or are you off cloud? >> So This is software that can run on cloud instances and provide high performance file within the cloud. It can take file data that's on-prem. Again, it's software, it can run in virtual or on physical servers. And it abstracts the data from the existing storage infrastructure, and makes the data visible and consumable and orchestratable across any of it. >> And what's the elevator pitch for Cloud of Cloud, give that too. >> Well, Cloud of Clouds creates a theoretical model of cloud, and it describes every single object in the cloud. Where is data, execution units, and connectivity, with one single class of very simple object. And I can, I can give you (indistinct) >> And the problem that solves is what? 
>> The problem that solves is, it creates this mathematical model that is necessary in order to do other interesting things, such as optimization, using sata engines, using automation, applying ML for instance. Or deep learning to automate all this clouds, if you think about in the industrial field, we know how to manage and automate huge plants. Why wouldn't it do the same thing in cloud? It's the same thing you- >> That's what you mean by theoretical model. >> Nelu: That's correct. >> Lay out the architecture, almost the bones of skeleton or something, or, and then- >> That's correct, and then on top of it you can actually build a platform, You can create your services, >> when you say math, you mean you put numbers to it, you kind of index it. >> You quantify this thing and you apply mathematical- It's really about, I can disclose this thing. It's really about describing the cloud as a knowledge graph for every single object in the graph for node, an edge is a vector. And then once you have this model, then you can apply the field theory, and linear algebra to do operation with these vectors. And it's, this creates a very interesting opportunity to let the math do this thing for us. >> Okay, so what happens with hyperscale, or it's like AWS in your model. >> So in, in my model actually, >> Are they happy with this, or they >> I'm very happy with that. >> Will they be happy with you? >> We create an interface to every single HyperCloud. We actually, we don't need to interface with the thousands of APIs, but you know, if we have the 80 20 rule, and we map these APIs into this graph, and then every single operation that is done in this graph is done from the beginning, in an optimized manner and also automation ready. >> That's going to be great. David, I want us to go back to you before we close real quick. You've had a lot of experience, multiple ventures on the front end. You talked to a lot of customers who've been innovating. 
Where are the classic (indistinct)? Cause you, you used to sell and invent product around the old school enterprises with storage, you know that that trajectory storage is still critical to store the data. Where's the classic enterprise grade mindset right now? Those customers that were buying, that are buying storage, they're in the cloud, they're lifting and shifting. They not yet put the throttle on DevOps. When they look at this Supercloud thing, Are they like a deer in the headlights, or are they like getting it? What's the, what's the classic enterprise look like? >> You're seeing people at different stages of adoption. Some folks are trying to get to the cloud, some folks are trying to repatriate from the cloud, because they've realized it's better to own than to rent when you use a lot of it. And so people are at very different stages of the journey. But the one thing that's constant is that there's always change. And the change here has to do with being able to change the location where you're doing your computing. So being able to support traditional workloads in the cloud, being able to run things at the edge, and being able to rationalize where the data ought to exist, and with a declarative model, intent-based, business objective-based, be able to swipe a mouse and have the data get redistributed and positioned across different vendors, across different clouds, that, we're seeing that as really top of mind right now, because everybody's at some point on this journey, trying to go somewhere, and it involves taking their data with them. (John laughs) >> Guys, great conversation. Thanks so much for coming on, for John, Dave. Stay tuned, we got a great analyst power panel coming right up. More from Palo Alto, Supercloud 2. Be right back. (bouncy music)

Published Date : Jan 18 2023


ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Walmart	ORGANIZATION	0.99+
John	PERSON	0.99+
Nelu	PERSON	0.99+
David Flynn	PERSON	0.99+
Dave	PERSON	0.99+
Google	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
London	LOCATION	0.99+
John Furrier	PERSON	0.99+
LA	LOCATION	0.99+
Bob Muglia	PERSON	0.99+
OS/2	TITLE	0.99+
Nir Zuk	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Hammerspace	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Bell Labs	ORGANIZATION	0.99+
Nelu Mihai	PERSON	0.99+
DOS	TITLE	0.99+
AWSs	ORGANIZATION	0.99+
Palo Alto Networks	ORGANIZATION	0.99+
twice	QUANTITY	0.99+
Cisco	ORGANIZATION	0.99+
today	DATE	0.99+
Canada	LOCATION	0.99+
both	QUANTITY	0.99+
Palo Alto	LOCATION	0.99+
Supercloud	ORGANIZATION	0.99+
Nelu Laughs	PERSON	0.98+
thousands	QUANTITY	0.98+
first	QUANTITY	0.97+
Linux	TITLE	0.97+
HyperCloud	TITLE	0.97+
Cloud of Cloud	TITLE	0.97+
one	QUANTITY	0.96+
Cloud of Clouds	ORGANIZATION	0.95+
GCP	TITLE	0.95+
Azure	TITLE	0.94+
three variables	QUANTITY	0.94+
one single class	QUANTITY	0.94+
single server	QUANTITY	0.94+
triplet	QUANTITY	0.94+
one region	QUANTITY	0.92+
NetApp	TITLE	0.92+
DOS OS/2	TITLE	0.92+
Azure	ORGANIZATION	0.92+
earlier today	DATE	0.92+
Cloud of Clouds	TITLE	0.91+

Closing Remarks | Supercloud2

>> Welcome back everyone to the closing remarks here before we kick off our ecosystem portion of the program. We're live in Palo Alto for theCUBE special presentation of Supercloud 2. It's the second edition, the first one was in August. I'm John Furrier with Dave Vellante. Here to wrap up with our special guest analyst George Gilbert, investor and industry legend, former colleague of ours, analyst at Wikibon. George, great to see you. Dave, you know, wrapping up this day, what a phenomenal program. We had a contribution from industry vendors, industry experts, practitioners and customers building and redefining their company's business model. Rolling out technology for Supercloud and multicloud and ultimately changing how they do data. And data was the theme today. So very, very great program. Before we jump into our favorite parts let's give a shout out to the folks who make this possible. Free content's our mission. We'll always stay true to that mission. We want to thank VMware, Alkira, ChaosSearch, Prosimo for being sponsors of this great program. We will have Supercloud 3 coming up in a month or so, or two months. We'll see. Or sooner, we don't know. But it'll be more about security, but a lot more momentum. Okay, so that's... >> And don't forget too that this program's not going to end now. We've got a whole ecosystem speaks track so stay tuned for that. >> John: Yeah, we got another 20 interviews. Feels like it. >> Well, you're going to hear from Saks, Veronika Durgin. You're going to hear from Western Union, Harveer Singh. You're going to hear from Ionis Pharmaceuticals, Nick Taylor. Brian Gracely chimes in on Supercloud. So he's the man behind the Cloudcast. >> Yeah, and you know, the practitioners again, pay attention also to the cloud networking interviews. Lot of change going on there that's going to be disruptive and actually change the landscape as well. Again, as Supercloud progresses to be the next big thing. 
If you're not on this next wave, you'll be driftwood, as Pat Gelsinger says. >> Yep. >> To kick off the closing segments, George, Dave, this is a wave that's been identified. Again, people debate the word all you want, Supercloud. It is a gateway to multicloud. Eventually it is the standard for new applications, new ways to do data. There's new computer science being generated and customer requirements being addressed. So it's the confluence of, you know, tectonic plates shifting in the industry, new computer science seeing things like AI and machine learning and data at the center of it and new infrastructure all kind of coming together. So, to me, that's my takeaway so far. That is the big story and it's going to change society and ultimately the business models of these companies. >> Well, we've had 10, you know, you think about it, we came out of the financial crisis. We've had 10, 12 years, despite the Covid, of tech success, right? And just now CIOs are starting to hit the brakes. And so my point is you've had all this innovation building up for a decade and you've got this massive ecosystem that is running on the cloud and the ecosystem is saying, hey, we can have even more value by tapping best of breed across clouds. And you've got customers saying, hey, we need help. We want to do more and we want to point our business and our intellectual property, our software tooling at our customers and monetize our data. So you have all these forces coming together and it's sort of entering a new era. >> George, I want to go to you for a second because you are a big contributor to this event. Your interview with Bob Muglia with Dave was, I thought, a watershed moment for me to hear that the data apps, how databases are being rethought because we've been seeing a diversity of databases with Amazon Web Services, you know, promoting no one database rules the world. Now it's not one database kind of architecture that's pulling these new apps. 
What's your takeaway from this event? >> So if you keep your eye on this North Star where instead of building apps that are based on code you're building apps that are defined by data coming off of things that are linked to the real world like people, places, things and activities. Then the idea is, and the example we use is, you know, Uber but it could be, you know, amazon.com is defined by stuff coming off data in the Amazon ecosystem or marketplace. And then the question is, and everyone was talking at different angles on this, which was, where's the data live? How much do you hide from the developer? You know, and when can you offer that? You know, and you started with Walmart which was describing apps, traditional apps that are just code. And frankly that's easier to make that cross cloud and you know, essentially location independent. As soon as you have data you need data management technology that a customer does not have the sophistication to build. And then the argument was like, so how much can you hide from the developer who's building data apps? Tristan's version was you take the modern data stack and you start adding these APIs that define business concepts like bookings, billings and revenue, you know, or in the Uber example like drivers and riders, you know, and ETA's and prices. But those things execute still on the data warehouse or data lakehouse. Then Bob Muglia was saying you're not really hiding enough from the developer because you still got to say how to do all that. And his vision is not only do you hide where the data is but you hide how to sort of get at all that code by just saying what you want. You define how a car and how a driver and how a rider works. And then those things automatically figure out underneath the cover. >> So huge challenges, right? There's governance, there's security, they could be big blockers to, you know, the Supercloud but the industry's going to be attacking that problem. >> Well, what's your take? 
What's your favorite segment? Zhamak Dehghani came on, she's starting a new company, exclusive news. That was a big notable moment for theCUBE. She launched her company. She pioneered the data mesh concept. And I think what George is saying and what data mesh points to is something that we've been saying for a long time. That data is now going to flip the script on how apps behave. And the Uber example I think is illustrative 'cause people can relate to Uber. But imagine that for every business, whether it's a manufacturing business or retail or oil and gas or FinTech, they can look at their business like a game, almost gamify it with data, riders, cars, you know, moving data around, the value of data. This is something that Adam Selipsky teased out at AWS, Dave. So what's your takeaway from this Supercloud? Where are we in your mind? >> Well, the big thing is data products and decentralizing your data architecture, but putting data in the hands of domain experts who can actually monetize the data. And I think that's, to me that's really exciting. Because look, data products, the financial industry has always been building data products. Mortgage backed securities is a data product. But why should the financial industry have all the fun? I mean virtually every organization can tap its ecosystem, build data products, take its internal IP and processes and software and point it to the world and actually begin to make money out of it. >> Okay, so let's go around the horn. I'll start, I'll give you guys some time to think. Next question, what did you learn today? I learned that I think it's an infrastructure game and talking to Kit Colbert at VMware, I think it's all about infrastructure refactoring and I think the data's going to be an ingredient that's going to be operating system like. I think you're going to see the infrastructure influencing operations that will enable Superclouds to be real. And developers won't even know what a Supercloud is because they'll be using it. 
It's the operations focus is going to be very critical. Just like DevOps movements started Cloud native I think you're going to see a data native movement and I think infrastructure is critical as people go to the next level. That's my big takeaway today. And I'll say the data conversation is at the center. I think security, data are going to be always active horizontally scalable concepts, but every company's going to reset their infrastructure, how it looks and if it's not set up for data and or things that there need to be agile on, it's going to be a non-starter. So I think that's the cloud NextGen, distributed computing. >> I mean, what came into focus for me was I think the hyperscaler is going to continue to do their thing, you know, and be very, very successful and they're each coming at it from different approaches. We talk about this all the time in theCUBE. Amazon the best infrastructure, you know, Google's got its you know, data and AI thing and it's playing catch up and Microsoft's got this massive estate. Okay, cool. Check. The next wave of innovation which is coming from data, I've always said follow the data. That's where the where the money's going to be is going to come from other places. People want to be able to, organizations want to be able to share data across clouds across their organization, outside of their ecosystem and make money with that data sharing. They don't want to FTP it anymore. I got it. You take it. They want to work with live data in real time and I think the edge, we didn't talk much about the edge today is going to even take that to a new level real time inferencing at the edge, AI and and being able to do new things with data that we haven't even seen. But playing around with ChatGPT, it's blowing our mind. And I think you're right, it's like when we first saw the browser, holy crap, this is going to change the world. >> Yeah. 
And the ChatGPT by the way is going to create a wave of machine learning and data refactoring for sure. But also Howie Liu had an interesting comment, he was asked by a VC how much to replicate that and he said it's in the hundreds of millions, not billions. Now if you asked that same question, how much does it cost to replicate AWS? The CapEx alone is unstoppable, they're already done. So, you know, the hyperscalers are going to continue to boom. I think they're going to drive the infrastructure. I think Amazon's going to be really strong at silicon and physics and squeeze every ounce atom out of every physical thing and then get latency as your bottleneck and the rest is all going to be... >> That number blew me away, a hundred million to create kind of an OpenAI, you know, competitor. Look at companies like Lacework. >> John: Some people have that much cash on the balance sheet. >> These are security companies that have raised a billion dollars, right? To compete. You know, so... >> If you're not shifting left what do you do with data, shift up? >> But, you know. >> What did you learn, George? >> I'm listening to you and I think you're helping me crystallize something which is the software infrastructure to enable the data apps is wide open. The way Zhamak described it is like if you want a data product like a sales and operation plan, that is built on other data products, like a sales plan which has a forecast in it, it has a production plan, it has a procurement plan and then a sales and operation plan is actually a composition of all those and they call each other. Now in her current platform, you need to expose to the developer a certain amount of mechanics on how to move all that data, when to move it. Like what happens if something fails. Now Muglia is saying I can hide that completely. So all you have to say is what you want and the underlying machinery takes care of everything. The problem is Muglia's stuff is still a few years off. 
And Tristan is saying, I can give you much of that today but it's got to run in the data warehouse. So there's trade-offs all different ways. But again, I agree with you that the Cloud platform vendors or the ecosystem participants who can run across Cloud platforms and private infrastructure will be the next platform. And then the cloud platform is sort of where you run the big honking centralized stuff where someone else manages the operations. >> Sounds like middleware to me, Dave. >> And key is, I'll just end with this. The key is being able to get to the data, whether it's in a data warehouse or a data lake or an S3 bucket or an object store, Oracle database, whatever. It's got to be inclusive, that is critical to execute on the vision that you just talked about 'cause that data's in different systems and you're not going to put it all into some new system. >> So creating middleware in the cloud, that's what it sounds like to me. >> It's like, you discovered PaaS. >> It's a super PaaS. >> But it's platform services 'cause PaaS connotes like a tightly integrated platform. >> Well this is the real thing that's going on. We're going to see how this evolves. George, great to have you on, Dave. Thanks for the summary. I enjoyed this segment a lot today. This ends our stage performance live here in Palo Alto. As you know, we're live stage performance and syndicate out virtually. Our afternoon program's going to kick in now. You're going to hear some great interviews. We got ChaosSearch. Defining the network Supercloud from Prosimo. Future of Cloud Network, Alkira. We got Saks, a retail company here, Veronika Durgin. We got Dave with Western Union. So a lot of customers, a pharmaceutical company, Warner Brothers Discovery, media company. And then you know, what is really needed for Supercloud, good panels. So stay with us for the afternoon program. That's part two of Supercloud 2. This is a wrap up for our stage live performance. 
I'm John Furrier with Dave Vellante and George Gilbert here wrapping up. Thanks for watching and enjoy the program. (bright music)

Published Date : Jan 17 2023


ENTITIES

Entity	Category	Confidence
Tristan	PERSON	0.99+
Dave Vellante	PERSON	0.99+
George Gilbert	PERSON	0.99+
Dave	PERSON	0.99+
Adam Selipsky	PERSON	0.99+
Pat Gelsinger	PERSON	0.99+
Bob Moglia	PERSON	0.99+
Veronika Durgin	PERSON	0.99+
John	PERSON	0.99+
Bob Muglia	PERSON	0.99+
George	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Western Union	ORGANIZATION	0.99+
Nick Taylor	PERSON	0.99+
Palo Alto	LOCATION	0.99+
10	QUANTITY	0.99+
John Furrier	PERSON	0.99+
Uber	ORGANIZATION	0.99+
Brian Gracely	PERSON	0.99+
Howie Liu	PERSON	0.99+
Zhamak Dehghani	PERSON	0.99+
hundreds of millions	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Ionis Pharmaceuticals	ORGANIZATION	0.99+
August	DATE	0.99+
Warner Brothers	ORGANIZATION	0.99+
Kit Colbert	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Walmart	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
billions	QUANTITY	0.99+
Zhamak	PERSON	0.99+
Muglia	PERSON	0.99+
20 interviews	QUANTITY	0.99+
Discovery	ORGANIZATION	0.99+
second edition	QUANTITY	0.99+
ChaosSearch	ORGANIZATION	0.99+
today	DATE	0.99+
two months	QUANTITY	0.99+
Supercloud 2	TITLE	0.98+
VMware	ORGANIZATION	0.98+
Saks	ORGANIZATION	0.98+
PaaS	TITLE	0.98+
amazon.com	ORGANIZATION	0.98+
first one	QUANTITY	0.98+
Lacework	ORGANIZATION	0.98+
Harveer Singh	PERSON	0.98+
Oracle	ORGANIZATION	0.97+
alkira	PERSON	0.96+
first	QUANTITY	0.96+
Supercloud	ORGANIZATION	0.95+
Supercloud2	TITLE	0.94+
Wikibon	ORGANIZATION	0.94+
Supecloud	ORGANIZATION	0.94+
each	QUANTITY	0.93+
hundred million	QUANTITY	0.92+
multicloud	ORGANIZATION	0.92+
every ounce atom	QUANTITY	0.91+
Amazon Web	ORGANIZATION	0.88+
Supercloud 3	TITLE	0.87+

Juan Loaiza, Oracle | Building the Mission Critical Supercloud

(upbeat music) >> Welcome back to Supercloud 2 where we're gathering a number of industry luminaries to discuss the future of cloud services. And we'll be focusing on various real world practitioners today, their challenges, their opportunities with an emphasis on data, self-service infrastructure and how organizations are evolving their data and cloud strategies to prepare for that next era of digital innovation. And we really believe that support for multiple cloud estates is a first step of any Supercloud. And in that regard Oracle surprised some folks with its Azure collaboration, the Oracle Database and Exadata database services. And to discuss the challenges of developing a mission critical Supercloud we welcome Juan Loaiza, who's the executive vice president of Mission Critical Database Technologies at Oracle. Juan, you're a many-time CUBE alum so welcome back to the show. Great to see you. >> Great to see you, and happy to be here with you. >> Yeah, thank you. So a lot of people felt that Oracle was resistant to multicloud strategies and preferred to really have everything run just on the Oracle cloud infrastructure, OCI, and maybe that was a misperception, maybe you guys were misunderstood, or maybe you had a change of heart. Take us through the decision to support multiple cloud platforms. >> Now we've supported multiple cloud platforms for many years, so I think that was probably a misperception. Oracle database, we partnered up with Amazon very early on in their cloud when they had kind of the first cloud out there. And we had Oracle database running on their cloud. We have backup, we have a lot of stuff running. So, yeah, part of the philosophy of Oracle has always been we partner with every platform. We're very open. We started with SQL and APIs. As we develop new technologies we push them into the SQL standard. So that's always been part of the ecosystem at Oracle. That's how we think we get an advantage, by being more open. 
I think if we try to create this isolated little world it actually hurts us and hurts customers. So for us it's a win-win to be open across the clouds. >> So Supercloud is this concept that we put forth to describe a platform, or some people think it's an architecture, if you have an opinion, I'd love to hear it, but it provides a programmatically consistent set of services that are hosted on heterogeneous cloud providers. And so we look at the Oracle database service for Azure as fitting within this definition. In your view, is this accurate? >> Yeah, I would broaden it. I'd see a little bit more than that. We just think that services should be available from everywhere, right? So, I mean, it's a little bit like if you go back to the pre-internet world, there was things like AOL and CompuServe and those were kind of islands. And if you were on AOL, you really didn't have access to anything on CompuServe and vice versa. And the cloud world has evolved a little bit like that. And we just think that's the wrong model. They shouldn't- these clouds are part of the world and they need to be interconnected like all the rest of the world. It's been a long time with telephones, internet, everything, everything's interconnected. Everything should work seamlessly together. So that's how we believe, if you're running in one cloud and you're running, let's say, an application in one cloud and you want to use a service from another cloud, it should be completely simple to do that. It shouldn't be, I can only use what's in AOL or CompuServe or whatever else. It should not be isolated. >> Well, we got a long way to go before that Nirvana exists but one example is the Oracle database service with Azure. So what exactly does that service provide? I'm interested in how consistent the service experience is across clouds. Did you create a purpose-built PaaS layer to achieve this common experience? Or is it off the shelf Terraform? Is there unique value in the PaaS layer? 
Let's dig into some of those questions. I know I just threw six at you. >> Yeah, I mean, so what this is, is what we're trying to do is very simple. Which is, for example, starting with the Oracle database we want to make that seamless to use from anywhere you're running. Whether it's on-prem, on some other cloud, anywhere else you should be able to seamlessly use the Oracle database and it should look like the internet. There's no friction. There's not a lot of hoops you got to jump just because you're trying to use a database that isn't local to you. So it's pretty straightforward. And in terms of things like Azure, it's not easy to do because all these clouds have a lot of kind of very unique technologies. So what we've done is at Oracle is we've said, "Okay we're going to make Oracle database look exactly like if it was running on Azure." That means we'll use the Azure security systems, the identity management systems, the networking, there's things like monitoring and management. So we'll push all these technologies. For example, when we have monitoring event or we have alerts we'll push those into the Azure console. So as a user, it looks to you exactly as if that Oracle database was running inside Azure. Also, the networking is a big challenge across these clouds. So we've basically made that whole thing seamless. So we create the super high bandwidth network between Azure and Oracle. We make sure that's extremely low latency, under two milliseconds round trip. It's all within the local metro region. So it's very fast, very high bandwidth, very low latency. And we take care establishing the links and making sure that it's secure and all that kind of stuff. So at a high level, it looks to you like the database is--even the look and feel of the screens. 
It's the Azure colors, it's the Azure buttons, it's the Azure layout of the screens, so it looks like you're running there, and we take care of all the technical details underlying that, which there's a lot, which has taken a lot of work to make it work seamlessly. >> And the magic of that abstraction, Juan, does it happen at the PaaS layer? Could you take us inside that a little bit? Is there intelligence in there that helps you deal with latency or are there any kind of purpose-built functions for this service? >> You could think of it as... I mean it happens at a lot of different layers. It happens at the identity management layer, it happens at the networking layer, it happens at the database layer, it happens at the monitoring layer, at the management layer. So all those things have been integrated. So it's not one thing that you just go and do. You have to integrate all these different services together. You can access files in Azure from the Oracle database. Again, that's completely seamless. It's just like if it was local to our cloud, you get your Azure files in your kind of S3 equivalent. So yeah, it's not one thing. There's a whole lot of pieces to the ecosystem. And what we've done is we've worked on each piece separately to make sure that it's completely seamless and transparent so you don't have to think about it, it just works. >> So you kind of answered my next question, which is, what are the technical hurdles? It sounds like the technical hurdles are that integration across the entire stack. That's the sort of architecture that you've built. What was the catalyst for this service? >> Yeah, the catalyst is just fulfilling our vision of an open cloud world. It's really like I said, Oracle, from the very beginning, has believed in open standards. Customers should be able to have choice, customers should be able to use whatever they want from wherever they want. 
And we saw that, you know, in the new world of cloud that had broken down. Everybody had their own authentication system, management system, monitoring system, networking system, configuration system. And it became very difficult. There was a lot of friction to using services across cloud. So we said, "Well, okay we can fix that." It's work, it's a significant amount of work, but we know how to do it and let's just go do it and make it easy for customers. >> So given Oracle's main focus is really on mission critical workloads, you talked about this low latency network, I mean, but you still have physical distances, so how are you managing that latency? What's the experience been for customers across Azure and OCI? >> Yeah, so it's a good point. I mean, latency can be an issue. So the good thing about clouds is we have a lot of cloud data centers. We have dozens and dozens of cloud data centers around the world. And Azure has dozens and dozens of cloud data centers. And in most cases, they're in the same metro region because there's kind of natural metro regions within each country that you want to put your cloud data centers in. So most of our data centers are actually very close to the Azure data centers. There's the kind of northern Virginia, there's London, there's Tokyo, I mean, there's natural places where everybody puts their data centers, Seoul et cetera. And so that's the real key. So that allows us to put in a very high bandwidth and low latency network. The real problems with latency come when you're trying to go a long physical distance. If you're trying to connect, you know, across the Pacific or, you know, across the country or something like that, then you can get in trouble with latency. Within the same metro region, it's extremely fast. It tends to be around one, you know, at the highest two milliseconds, and that's round trip through all the routers and connections and gateways and everything else. 
With everything taken into consideration, what we guarantee is it's always less than two milliseconds, which is a very low latency time. So that tends to not be a problem because it's extremely low latency. >> I was going to ask you about less than two milliseconds. So, earlier in the program we had Jack Greenfield who runs architecture for Walmart, and he was explaining what we call their Supercloud, and it runs across Azure, GCP, and their on-prem. They have this thing called the triplet model. So my question to you is, are you in situations where you're guaranteeing that less than two milliseconds, do you have situations where you're bringing, you know, Exadata Cloud@Customer on-prem to achieve that? Or is this just across clouds? >> Yeah, in this case, we're talking public cloud data center to public cloud data center. >> Oh okay. >> So Azure public cloud data center to Oracle public cloud data center. They're in the same metro region. We set up the connections, we do all the technology to make it seamless. And from a customer point of view they don't really see the network. Also, remember that SQL is actually designed to have very low bandwidth and latency requirements. So it is a language. So you don't go to the database and say do this one little thing for me. You send it a SQL statement that can actually access lots of data while in the database. So the real latency requirement of a SQL database is within the database. So I need to access all that data fast. So I need very fast access to storage, very fast access across nodes. That's what Exadata gives you. But you send one request and that request can do a huge amount of work and then return one answer. And that's kind of the design point of SQL. So SQL is inherently low bandwidth requirements, it was used back in the eighties when we used to have 10 megabit networks and the biggest companies in the world ran back then. So right now we're talking over a hundred, hundreds of gigabits. 
So it's really not much of a challenge. When you're designed to run on 10 megabit, to say, okay I'm going to give you 10,000 times what you were designed for, it's really, it's a pretty low hurdle to jump. >> What about the deployment models? How do you handle this? Is it a single global instance across clouds or do you sort of instantiate in each, you got Exadata in Azure and Exadata in OCI? What's the deployment model look like? >> It's pretty straightforward. So the customer decides where they want to run their application and database. So there's natural places where people go. If you're in Tokyo, you're going to choose the local Tokyo data centers for both, you know, Microsoft and Oracle. If you're in London, you're going to do that. If you're in California you're going to choose maybe San Jose, something like that. So a customer just chooses. We both have data centers in that metro region. So they create their service on Azure and then they go to our console, which looks just like an Azure console, and say all right, create me a database. And then we choose the closest Oracle data center, which is generally a few miles away, and then it all gets created. So from a customer point of view, it's very straightforward. >> I'm always in awe about how simple you make things sound. All right, what about security? You talked a little bit before about identity access, how you're sort of abstracting the Azure capabilities away so that you've simplified it for your customers, but are there any other specific security things that you need to do? How much did you have to abstract the underlying primitives of Azure or OCI to present that common experience to customers? >> Yeah, so there's really two big things. One is the identity management. Like my name is X on Azure and I have this set of privileges. Oracle has its own identity management system, right? So what we didn't want is that you have to kind of like bridge these things yourself. It's a giant pain to do that. 
So we actually do what we call federate across these identity managements. So you put your credentials into Azure and then they automatically get to use the exact same credentials and identity in the Oracle cloud. So again, you don't have to think about it, it just works. And then the second part is the whole bridging of the network. So within a cloud you generally have a virtual network that's private to your company. And so at Oracle, we bridge the private network that you created in, for example, Azure to the private network that we create for you in Oracle. So it is still a private network without you having to do a whole bunch of work. So it's just like if you were in your own data center, other people can't get into your network. So it's secured at the network level, it's secured at the identity management and encryption level. And again we did a lot of work to make that seamless for customers and they don't have to worry about it because we did the work. That's really as simple as it gets. >> That's what Supercloud's supposed to be all about. Alright, we were talking earlier about sort of the misperception around multicloud, your view of open, I think, which is you run the Oracle database wherever the customer wants to run it. So you got this database service across OCI and Azure. Customers today, they run Oracle database in AWS. You got HeatWave, MySQL HeatWave, that you announced on AWS. Google touts a bare metal offering where you can run Oracle on GCP. Do you see a day when you extend an OCI-Azure-like situation across multiple clouds? Would that bring benefits to customers or will the world of database generally remain largely fenced with maybe a few exceptions like what you're doing with OCI and Azure? I'm particularly interested in your thoughts on egress fees as maybe one of the reasons that there is a barrier to this happening and why maybe these stove pipes exist today and in the future. What are your thoughts on that? 
>> Yeah, we're very open to working with everyone else out there. Like I said, we've always been big believers that customers should have choice and you should be able to run wherever you want. So that's been kind of a founding principle of Oracle. We have the Azure, we did a partnership with them, we're open to doing other partnerships and you're going to see other things coming down the pipe. On the topic of egress, yeah, the large egress fees, it's pretty obvious what goes on with that. Various vendors like to have large egress fees because they want to keep things kind of locked into their cloud. So it's not a very customer friendly thing to do. And I think everybody recognizes that it's really trying to kind of coerce or put a lot of friction on moving data out of a particular cloud. And that's not what we do. We have very, very low egress fees. So we don't really do that and we don't think anybody else should do that. But I think customers at the end of the day, will win that battle. They're going to have to go back to their vendor and say, well I have choice in clouds and if you're going to impose these limits on me, maybe I'll make a different choice. So that's ultimately how these things get resolved. >> So do you think other cloud providers are going to take a page out of what you're doing with Azure and provide similar solutions? >> Yeah, well I think customers want, I mean, I've talked to a lot of customers, this is what they want, right? I mean, there's really no doubt, no customer wants to be locked into a single ecosystem. There's nobody out there that wants that. And as for the competition, when customers start seeing an open ecosystem evolving they're going to be like, okay, I'd rather go there than the closed ecosystem, and that's going to put pressure on the closed ecosystems. So that's the nature of competition. That's what ultimately will tip the balance on these things. 
>> So Juan, even though you have this capability of distributing a workload across multiple clouds as in our Supercloud premise, it's still something that's relatively new. It's a big decision that maybe many people might consider somewhat of a risk. So I'm curious who's driving the decisions for your initial customers? What do they want to get out of it? What's the decision point there? >> Yeah, I mean, this is generally driven by customers that want a specific technology in a cloud. I think the risk, I haven't seen a lot of people worry too much about the risk. Everybody involved in this is a very well known, very reputable firm. I mean, Oracle's been around for 40 years. We run most of the world's largest companies. I think customers understand we're not going to build a solution that's going to put their technology and their business at risk. And the same thing with Azure and others. So I don't see customers too worried that this is a risky move, because it's really not. And you know, everybody understands networking, at the end of the day networking works. I mean, how does the internet work? It's a known quantity. It's not like it's some brand new invention. What we're really doing is breaking down the barriers to interconnecting things. Automating 'em, making 'em easy. So there's not a whole lot of risk here for customers. And like I said, every single customer in the world loves an open ecosystem. It's just not a question. If you go to a customer and ask, would you rather put your technology or your business to run on a closed ecosystem or an open ecosystem? It's kind of not even worth asking the question. It's a no-brainer. >> All right, so we got to go. My last question. What do you think of the term "Supercloud"? You think it'll stick? >> We'll see. There's a lot of terms out there and it's always fun to see which terms stick. It's a cool term. I like it, but the decision makers are actually the public, what sticks and what doesn't. It's very hard to predict. 
>> Yeah well, it's been a lot of fun having you on, Juan. Really appreciate your time and always good to see you. >> All right, Dave, thanks a lot. It's always fun to talk to you. >> You bet. All right, keep it right there. More Supercloud 2 content from theCUBE community. Dave Vellante for John Furrier. We'll be right back. (upbeat music)
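
Editor's sketch of the latency and bandwidth arithmetic discussed in the interview above. The 2 ms round trip, the 10 megabit network, and the 10,000x factor are the figures quoted by the guest; the row count, the function names, and the idea of comparing a row-by-row client with a single SQL statement are illustrative assumptions, not a description of how Oracle or Azure measure anything.

```python
# Back-of-the-envelope arithmetic only. Quoted figures come from the interview;
# everything else (names, row count) is a hypothetical illustration.

RTT_MICROSECONDS = 2_000  # "less than two milliseconds" round trip, taken as an upper bound


def network_wait_us(round_trips: int, rtt_us: int = RTT_MICROSECONDS) -> int:
    """Total time spent waiting on the network for a number of request/response pairs."""
    return round_trips * rtt_us


def scaled_bandwidth_bps(base_bps: int, factor: int) -> int:
    """Bandwidth after multiplying a baseline link speed by a factor."""
    return base_bps * factor


if __name__ == "__main__":
    rows = 100_000  # hypothetical result size

    # A chatty client that fetches one row per round trip pays the RTT for every row.
    chatty_us = network_wait_us(rows)
    # A single SQL statement pays the RTT once; the heavy work happens inside the database.
    single_statement_us = network_wait_us(1)

    print(f"row-by-row: {chatty_us / 1_000_000:.1f} s of network wait")
    print(f"one statement: {single_statement_us / 1_000:.1f} ms of network wait")

    # 10 megabit per second times 10,000 is 100 gigabit per second.
    modern_bps = scaled_bandwidth_bps(10_000_000, 10_000)
    print(f"10 Mbit/s x 10,000 = {modern_bps // 1_000_000_000} Gbit/s")
```

Under these assumptions the network cost of a set-based request is a single round trip regardless of how much data the statement touches inside the database, which is the design point the guest attributes to SQL.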

Published Date : Jan 12 2023


ENTITIES

Entity	Category	Confidence
Microsoft	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Juan Loaiza	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
San Jose	LOCATION	0.99+
California	LOCATION	0.99+
Dave Vellante	PERSON	0.99+
Tokyo	LOCATION	0.99+
Juan	PERSON	0.99+
London	LOCATION	0.99+
six	QUANTITY	0.99+
10,000 times	QUANTITY	0.99+
Jack Greenfield	PERSON	0.99+
Google	ORGANIZATION	0.99+
second part	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
less than two millisecond	QUANTITY	0.99+
less than two milliseconds	QUANTITY	0.99+
One	QUANTITY	0.99+
SQL	TITLE	0.99+
10 megabit	QUANTITY	0.99+
both	QUANTITY	0.99+
AOL	ORGANIZATION	0.98+
each piece	QUANTITY	0.98+
MySQL	TITLE	0.98+
first cloud	QUANTITY	0.98+
single	QUANTITY	0.98+
each country	QUANTITY	0.98+
John Furrier	PERSON	0.98+
two big things	QUANTITY	0.98+
under two milliseconds	QUANTITY	0.98+
one	QUANTITY	0.98+
northern Virginia	LOCATION	0.98+
CompuServe	ORGANIZATION	0.97+
first step	QUANTITY	0.97+
Mission Critical Database Technologies	ORGANIZATION	0.97+
one request	QUANTITY	0.97+
Seoul	LOCATION	0.97+
Azure	TITLE	0.97+
each	QUANTITY	0.97+
two millisecond	QUANTITY	0.97+
Azure	ORGANIZATION	0.96+
one cloud	QUANTITY	0.95+
one thing	QUANTITY	0.95+
cloud data centers	QUANTITY	0.95+
one answer	QUANTITY	0.95+
Supercloud	ORGANIZATION	0.94+

Brian Gracely, The Cloudcast | Does the World Really Need Supercloud?

(upbeat music) >> Welcome back to Supercloud 2, this is Dave Vellante. We're here exploring the intersection of data and analytics and the future of cloud. And in this segment, we're going to look at the evolution of cloud, and try to test some of the Supercloud concepts and assumptions with Brian Gracely, who is the founder and co-host, along with Aaron Delp, of the popular Cloudcast program. Amazing series, if you're not already familiar with it. The Cloudcast is one of the best ways to keep up with so many things going on in our industry. Enterprise tech, platform engineering, business models, obviously, cloud developer trends, crypto, Web 3.0. Sorry Brian, I know that's a sore spot, but Brian, thanks for coming >> That's okay. >> on the program, really appreciate it. >> Yeah, great to be with you, Dave. Happy New Year, and great to be back with everybody with SiliconANGLE again this year. >> Yeah, we love having you on. We miss working with you day-to-day, but I want to start with Gracely's theorem, which basically says, I'm going to paraphrase. For the most part, nothing new gets introduced in the enterprise tech business, patterns repeat themselves, maybe get applied in new ways. And you know this industry well, when something comes out that's new, if you take virtualization, for example, been around forever with mainframes, but then VMware applied it, solved a real problem in the client-server system. And then it's like, "Okay, this is awesome." We get really excited and then after a while we push the architecture, we break things, introduce new things to fix the things that are broken and start adding new features. And oftentimes you do that through acquisitions. So, you know, has the cloud become that sort of thing? And is Supercloud sort of same wine, new bottle, following Gracely's theorem? >> Yeah, I think there's some of both of it. 
I hate to be the sort of, it depends sort of answer but, I think to a certain extent, you know, obviously Cloud in and of itself was, kind of revolutionary in that, you know, it wasn't that you couldn't rent things in the past, it was just being able to do it at scale, being able to do it with such amazing self-service. And then, you know, kind of proliferation of like, look at how many services I can get from, from one cloud, whether it was Amazon or Azure or Google. And then, you know, we, we slip back into the things that we know, we go, "Oh, well, okay, now I can get computing on demand, but, now it's just computing." Or I can get database on demand and it's, you know, it's got some of the same limitations of, of say, of database, right? It's still, you know, I have to think about IOPS and I have to think about caching, and other stuff. So, I think we do go through that and then we, you know, we have these sort of next paradigms that come along. So, you know, serverless was another one of those where it was like, okay, it seems sort of new. I don't have to, again, it was another level of like, I don't have to think about anything. And I was able to do that because, you know, there was either greater bandwidth available to me, or compute got cheaper. And what's been interesting is not the sort of, that specific thing, serverless in and of itself is just another way of doing compute, but the fact that it now gets applied as, sort of a no-ops model to, you know, again, like how do I provision a database? How do I think about, you know, do I have to think about the location of a service? Does that just get taken care of for me? So I think the Supercloud concept, and I did a thing and, and you and I have talked about it, you know, behind the scenes that maybe the, maybe a better name is Super app for something like Snowflake or other, but I think we're, seeing these these sort of evolutions over and over again of what were the big bottlenecks? 
How do we, how do we solve those bottlenecks? And I think the big thing here is, it's never, it's very rarely that you can take the old paradigm of what the thing was, the concept was, and apply it to the new model. So, I'll just give you an example. So, you know, something like VMware, which we all know, wildly popular, wildly used, but when we apply like a Supercloud concept of VMware, the concept of VMware has always been around a cluster, right? It's some finite number of servers, you sort of manage it as a cluster. And when you apply that to the cloud and you say, okay, there's, you know, for example, VMware in the cloud, it's still the same concept of a cluster of VMware. But yet when you look at some of these other services that would fit more into the, you know, Supercloud kind of paradigm, whether it's a Snowflake or a MongoDB Atlas or maybe what CloudFlare is doing at the edge, those things get rid of some of those old paradigms. And I think that's where stuff, you start to go, "Oh, okay, this is very different than before." Yes, it's still computing or storage, or data access, but there's a whole nother level of something that we didn't carry forward from the previous days. And that really kind of breaks the paradigm. And so that's the way I think I've started to think about, are these things really brand new? Yes and no, but I think it's when you can see that big, that thing that you didn't leave behind isn't there anymore, you start to get some really interesting new innovation come out of it. >> Yeah. And that's why, you know, lift and shift is okay, when you talk to practitioners, they'll say, "You know, I really didn't change my operating model. And so I just kind of moved it into the cloud. there were some benefits, but it was maybe one zero not three zeros that I was looking for." >> Right. 
>> You know, we always talk about what's great about cloud, the agility, and all the other wonderful stuff that we know, what's not working in cloud, you know, tie it into multi-cloud, you know, in terms of, you hear people talk about multi-cloud by accident, okay, that's true. >> Yep. >> What's not great about cloud. And then I want to get into, you know, is multi-cloud really a problem or is it just sort of vendor hype? But, but what's not working in cloud? I mean, you mentioned serverless and serverless is kind of narrow, right, for a lot of stateless apps, right? But, what's not great about cloud? >> Well, I think there's a few things that if you ask most people they don't love about cloud. I think, we can argue whether or not sort of this consolidation around a few cloud providers has been a good thing or a bad thing. I think, regardless of that, you know, we are seeing, we are hearing more and more people that say, look, you know, the experience I used to have with cloud when I went to, for example, an Amazon and there was, you know, a dozen services, it was easy to figure out what was going on. It was easy to figure out what my billing looked like. You know, now they've become so widespread, the number of services they have, you know, the number of stories you just hear of people who went, "Oh, I started a service over in US West and I can't find it anymore 'cause it's on a different screen. And I, you know, I just got billed for it." Like, so I think the sprawl of some of the clouds has gotten, has created a user experience that a lot of people are frustrated with. I think that's one thing. And we, you know, we see people like Digital Ocean and we see others who are saying, "Hey, we're going to be that simplified version." So, there's always that yin and yang. I think people are super frustrated at network costs, right? 
So, you know, and that's kind of at a lot of, at the center of maybe why we do or don't see more of these Supercloud services is just, you know, in the data center as an application owner, I didn't have to think about, well where, where does this go to? Where are my users? Yes, somebody took care of it, but when those things become front and center, that's super frustrating. That's the one area that we've seen absolutely no cost savings, cost reduction. So I think that frustrates people a lot. And then I think the third piece is just, you know, we're, we went from super centralized IT organizations, which, you know, for decades was how it worked. It was part of the reason why the cloud expanded and became a thing, right? Sort of shadow IT and I can't get things done. And then, now what we've seen is sort of this proliferation of little pockets of groups that are your IT, for lack of a better thing, whether they're called platform engineering or SRE or DevOps. But we have this, expansion, explosion if you will, of groups that, if I'm an app dev team, I go, "Hey, you helped me make this stuff run, but then the team next to you has another group and they have another group." And so you see this explosion of, you know, we don't have any standards in the company anymore. And, so sort of self-service has created its own nightmare to a certain extent for a lot of larger companies. >> Yeah. Thank you for that. So, you know, I want, I want to explore this multi-cloud, you know, by accident thing and is a real problem. You hear that a lot from vendors and we've been talking about Supercloud as this unifying layer across cloud. You know, but when you talk to customers, a lot of them are saying, "Yes, we have multiple clouds in our organization, but my group, we have mono cloud, we know the security, edicts, we know how to, you know, deal with the primitives, whether it's, you know, S3 or Azure Blob or whatever it is. And we're very comfortable with this." 
It's, that's how we're simplifying. So, do you think this is really a problem? Does it have merit that we need that unifying layer across clouds, or is it just too early for that? >> I think, yeah, I think what you, what you've laid out is basically how the world has played out. People have picked a cloud for a specific application or a series of applications. Yeah, and I think if you talk to most companies, they would tell you, you know, holistically, yes, we're multi-cloud, not, maybe not necessarily on, I don't necessarily love the phrase where people say like, well it happened by accident. I think it happened on purpose, but we got to multi-cloud, not in the way that maybe that vendors, you know, perceived, you know, kind of laid out a map for. So it was, it was, well you will lay out this sort of Supercloud framework. We didn't call it that back then, we just called it sort of multi-cloud. Maybe it was Kubernetes or maybe it was whatever. And different groups, because central IT kind of got disbanded or got fragmented. It turned into, go pick the best cloud for your application, for what you need to do for the business. And then, you know, multiple years later it was like, "Oh, hold on, I've got 20% in Google and 50% in AWS and I've got 30% in Azure. And, you know, it's, yeah, it's been evolution. I don't know that it's, I don't know if it's a mistake. I think it's now groups trying to figure out like, should I make sense of it? You know, should I try and standardize and I backwards standardize some stuff? I think that's going to be a hard thing for, for companies to do. 'cause I think they feel okay with where the applications are. They just happen to be in multiple clouds. >> I want to run something by you, and you guys, you and Aaron have talked about this. You know, still depending on who, which keynote you listen to, small percentage of the workloads are actually in cloud. 
And when you were with us at Wikibon, I think we called it true private cloud, and we looked at things like Nutanix and there were a lot of other examples of companies that were trying to replicate the hyperscale experience on-prem. >> Yeah. >> And, we would evaluate that, you know, beyond virtualization, and so we sort of defined that and, but I think what's, maybe what's more interesting than Supercloud across clouds is if you include that, that on-prem estate, because that's where most of the work is being done, that's where a lot of the proprietary tools have been built, a lot of data, a lot of software. So maybe there's this concept of extending that true private cloud to true hybrid cloud. So I actually think hybrid cloud in some cases is the more interesting use case for so-called Supercloud. What are your thoughts on that? >> Yeah, I think there's a couple aspects too. I think, you know, if we were to go back five or six years even, maybe even a little further and look at like what a data center looked like, even if it was just, "Hey we're a data center that runs primarily on VMware. We use some of their automation". Versus what you can, even what you can do in your data center today. The, you know, the gains that people have seen through new types of automation through Kubernetes, through GitOps, and a number of these things, like they've gotten significantly further along in terms of I can provision stuff really well, I can do multi-tenancy, I can do self-service. Is it, you know, is it still hard? Yeah. Because those things are hard to do, but there's been significant progress there. I don't, you know, I still look for kind of that, that killer application, that sort of, you know, lighthouse use case of, hybrid applications, you know, between data center and between cloud. I think, you know, we see some stuff where, you know, backup is a part of it. 
So you use the cloud for storage, maybe you use the cloud for certain kinds of resiliency, especially on maybe front end load balancing and stuff. But I think, you know, I think what we get into is, this being hung up on hybrid cloud or multi-cloud as a term and go like, "Look, what are you trying to measure? Are you trying to measure, you know, efficiency of of of IT usage? Are you trying to measure how quickly can I give these business, you know, these application teams that are part of a line of business resources that they need?" I think if we start measuring that way, we would look at, you know, you'd go, "Wow, it used to be weeks and months. Now we got rid of these boards that have to review everything every time I want to do a change management type of thing." We've seen a lot more self-service. I think those are the things we want to measure on. And then to your point of, you know, where does, where do these Supercloud applications fit in? I think there are a bunch of instances where you go, "Look, I have a, you know, global application, I have a thing that has to span multiple regions." That's where the Supercloud concept really comes into play. We used to do it in the data center, right? We'd had all sorts of technologies to help with that, I think you can now start to do it in the cloud. >> You know, one of the other things, trying to understand, your thoughts on this, do you think that you, you again have talked about this, like I'm with you. It's like, how is it that Google's losing, you know, 3 billion dollars a year, whatever. I mean, because when you go back and look at Amazon, when they were at that level of revenue where Google is today, they were making money, you know, and they were actually growing faster, by the way. So it's kind of interesting what's happened with Google. 
But, the reason I bring that up is, trying to understand if you think the hyperscalers will ever be motivated to create standards across clouds, and that may be a play for Google. I mean, obviously with Kubernetes it was like a Hail Mary and kind of made them relevant. Where would Google be without Kubernetes? But then did it achieve the objectives? We could have that conversation some other time, but do you think the hyperscalers will actually say, "Okay, we're going to lean in and create these standards across clouds." Because customers would love that, I would think, but it would sub-optimize their competitive advantage. What are your thoughts? >> I think, you know, on the surface, I would say they, they probably aren't. I think if you asked 'em the question, they would say, "Well, you know, first and foremost, you know, we do deliver standards, so we deliver a, you know, standard SQL interface or a SQL you know, or a standard Kubernetes API or whatever. So, in that, from that perspective, you know, we're not locking you into, you know, an Amazon specific database, or a Google specific database." You, you can argue about that, but I think to a certain extent, like they've been very good about, "Hey, we're going to adopt the standards that people want." A lot of times the open source standards. I think the problem is, let's say they did come up with a standard for it. I think you still have the problem of the costs of migration and you know, the longer you've, I think their bet is basically the longer you've been in some cloud. And again, the more data you sort of compile there, the data gravity concept, there's just going to be a natural thing that says, okay, the hurdle to get over to say, "Look, we want to move this to another cloud", becomes so cost prohibitive that they don't really have to worry about, you know, oh, I'm going to get into a war of standards. And so far I think they sort of realize like that's the flywheel that the cloud creates. 
And you know, unless they want to get into a world where they just cut bandwidth costs, it just kind of won't happen. You know, I think we've even seen, and the one example I'll use, and I forget the name of it off the top of my head, but there's a Google service. I think it's like BigQuery external or something along those lines, that allows you to say, "Look, you can use BigQuery against like S3 buckets and against other stuff." And so I think the cloud providers have kind of figured out, I'm never going to get the application out of that other guy's cloud or, you know, the other cloud. But maybe I'm going to have to figure out some interesting ways to sort of work with it. And, you know, it's a little janky, but that might be, you know, a moderate step that sort of gets customers where they want to be. >> Yeah. Or you know, it'd be interesting if you ever see AWS, for example, running its database in other clouds. You know, even Oracle is doing that with Azure, which is a form of Supercloud. My last question for you is, I want to get you thinking about sort of how the future plays out. You know, think about some of the companies that we've put forth as Supercloud, and by the way, this has been a criticism of the concept. Charles Fitzgerald: "Everything is Supercloud!" Which if true would defeat the purpose of course. >> Right. >> And so with the community effort, we really tried to put some guardrails down on the essential characteristics, the deployment models, you know, so for example, running across multiple clouds with a purpose-built PaaS, creating a common experience, metadata intelligence that solves a specific problem. I mean, the example I often use is Snowflake's governed data sharing. But yeah, Snowflake, Databricks, Cloudflare, Cohesity, you know, I just mentioned Oracle and Azure, these and others, they certainly claim to have that common experience across clouds.
But my question is, again, I come back to, do customers need this capability? You know, is Mono Cloud the way to solve that problem? What are your thoughts on how this plays out in the future of, I guess, PaaS, apps and cloud? >> Yeah, I think a couple of things. So, from a technology perspective, I think, you know, the companies you named, the services you've named, have sort of proven that the concept is viable and it's viable at a reasonable size, right? These aren't completely niche businesses, right? They're multi-billion dollar businesses. So, I think there's a subset of applications, you know, maybe a bigger than a niche set of applications, that are going to use these types of things. A lot of what you talked about is very data centric, and that's fine. That layer is figuring that out. I think we'll see messaging types of services, so like Derek Collison's company, Synadia, runs sort of a Supercloud for messaging applications. So I think there'll be places where it makes a ton of sense. I think the thing that I'm not sure about, and because again, we've been now 10 plus years of sort of super low, you know, interest rates in terms of being able to do things, is a lot of these things come out of research that has been done previously. Then they get turned into maybe somewhat of an open source project, and then they can become something. You know, will we see as much investment into the next Snowflake if, you know, the interest rates are three or four times what they used to be? Do we see VCs doing it? So that's the part that worries me a little bit, is I think we've seen what's possible. I think, you know, we've seen that companies like what those services are. I think I read yesterday Snowflake was saying like, their biggest customers are growing at 30, like 50 or 60%. Like the value they get out of it is becoming exponential. And it's just a matter of like, will the economics allow the next big thing to happen?
Because some of these things are pretty costly, you know, expensive to get started. So I'm bullish on the idea. I think it's okay that it's still sort of, you know, niche plus, plus in terms of the size of it. Because, you know, if we think about all of IT, even microservices is a small part of bigger things. But I'm still really bullish on the idea. I like that it's been proven. I'm a little wary, like a lot of people, of the economics of, you know, what might slow things down a little bit. But yeah, I think the future is going to involve Supercloud somewhere, whatever people end up calling it. And you and I discussed that. (laughs) But I don't think it goes away. I don't think it's a fad. I think it is something that people see tremendous value in, and it's just got to be, you know, for what you're trying to do, your application specific thing. >> You're making a great point on the funding of innovation, and we're entering a new era of public policy as well. The R&D tax credit is now shifting. >> Yeah. >> You know, you're going to have to capitalize that over five years now. And that's something that goes back to the 1950s, and many people would argue that's at least in part what has helped the United States be so, you know, competitive in tech. But Brian, always great to talk to you. Thanks so much for participating in the program. Great to see you. >> Thanks Dave, appreciate it. Good luck with the rest of the show. >> Thank you. All right, this is Dave Vellante for John Furrier and the entire CUBE community. Stay tuned for more content from Supercloud2.

Published Date : Jan 4 2023


ENTITIES

Entity	Category	Confidence
Aaron Delp	PERSON	0.99+
Dave	PERSON	0.99+
Brian	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Charles Fitzer	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Brian Gracely	PERSON	0.99+
Google	ORGANIZATION	0.99+
Caya Company	ORGANIZATION	0.99+
30%	QUANTITY	0.99+
50%	QUANTITY	0.99+
Aaron	PERSON	0.99+
60%	QUANTITY	0.99+
John Furrier	PERSON	0.99+
20%	QUANTITY	0.99+
three	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
50	QUANTITY	0.99+
third piece	QUANTITY	0.99+
BigQuery	TITLE	0.99+
1950s	DATE	0.99+
10 plus years	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
Snowflake	TITLE	0.99+
Databricks	ORGANIZATION	0.99+
Cohesity	ORGANIZATION	0.99+
SiliconANGLE	ORGANIZATION	0.99+
Nutanix	ORGANIZATION	0.99+
Wikibon	ORGANIZATION	0.98+
Digital Ocean	ORGANIZATION	0.98+
Snowflake	ORGANIZATION	0.98+
five	QUANTITY	0.98+
Snowflake	EVENT	0.98+
30	QUANTITY	0.98+
six years	QUANTITY	0.98+
this year	DATE	0.98+
four times	QUANTITY	0.98+
yesterday	DATE	0.98+
US West	LOCATION	0.97+
today	DATE	0.97+
one thing	QUANTITY	0.97+
over five years	QUANTITY	0.97+
S3	TITLE	0.96+
CloudFlare	ORGANIZATION	0.95+
Super app	TITLE	0.94+
Supercloud	ORGANIZATION	0.94+
one	QUANTITY	0.93+
Supercloud2	ORGANIZATION	0.93+
Azure	ORGANIZATION	0.92+
CloudFlare	TITLE	0.91+
one area	QUANTITY	0.91+
both	QUANTITY	0.9+
a dozen services	QUANTITY	0.9+
New Year	EVENT	0.9+
MongoDB Atlas	TITLE	0.89+
Kubernetes	TITLE	0.89+
VMware	TITLE	0.88+
SQL	TITLE	0.88+
Prem	ORGANIZATION	0.88+
first	QUANTITY	0.88+
multiple years later	DATE	0.88+
3 billion dollars a year	QUANTITY	0.86+
Mary	TITLE	0.84+
Azure	TITLE	0.84+
Cube	ORGANIZATION	0.83+
The Cloudcast	ORGANIZATION	0.8+
one cloud	QUANTITY	0.78+

Nir Zuk, Palo Alto Networks | Palo Alto Networks Ignite22

>> Presenter: theCUBE presents Ignite '22, brought to you by Palo Alto Networks. >> Hey guys and girls. Welcome back to theCUBE's live coverage at Palo Alto Ignite '22. We're live at the MGM Grand Hotel in beautiful Las Vegas. Lisa Martin here with Dave Vellante. This is day one of our coverage. We've been talking with execs from Palo Alto and partners, but one of our most exciting things is talking with founders. We get to do that next. >> The thing is, it's like I wrote this weekend in my Breaking Analysis. Understanding the problem in cybersecurity is really easy, but figuring out how to fix it ain't so much. >> It definitely isn't. >> So I'm excited to have Nir here. >> Very excited. Nir Zuk joins us, the founder and CTO of Palo Alto Networks. Welcome, Nir. Great to have you on the program. >> Thank you. >> So Palo Alto Networks, you founded it back in 2005. It's hard to believe that's been 18 years, almost. You did something different, which I want to get into. But tell us, what was it back then? Why did you found this company? >> I thought the world needed another cybersecurity company. I thought so because there were so many cybersecurity vendors in the world, and it just didn't make any sense. This industry has evolved in a very weird way, where every time there was a new challenge, rather than existing vendors dealing with the challenge, you had new vendors dealing with it, and I thought I could put a stop to it, and I think I did. >> You did something differently back in 2005. Looking at where you are now, the leader, what was different in your mind back then? >> Yeah. When you found a new company, you have really two good options. There's also a bad option, but we'll skip that. You can either disrupt an existing market, or you can create a new market. So first, I decided to disrupt an existing market, go into an existing market, first network security, then cyber security, and change it. Change the way it works.
And like I said, the challenge is that every problem had a new vendor, and nobody just stepped back and said, "I think I can solve it with a platform." Meaning, I think I can spend some time not solving a specific problem, but building a platform that then can be used to solve many different problems. And that's what I've done, and that's what Palo Alto Networks has done, and that's where we are today. >> So you look back, you call it now, I think you call it a next gen firewall, but something from 2005, can it be next gen? Do you know the Silicon Valley show? Do you know the show Silicon Valley? >> Oh! Yeah. >> Yeah, of course. >> You got to have a box. But it was a different kind of box- >> Actually. >> Explain that. >> Actually, it's exactly the same thing. You got to have a box. So I actually wanted to call it a necessary evil. Marketing wouldn't go for that. >> No. >> And the reason I wanted to call it a necessary evil is because one of the things that we've done in order to platform cyber security, again, first network security, now also cloud security and security operations, is to turn it into a SaaS delivered industry. Today every cyber security professional knows that when they buy cyber security, they usually buy a SaaS delivered service. Back then, people thought I was crazy to think that customers are going to send their data to their vendor in order to process it, and they wanted everything on premise and so on, but I said, "No, customers are going to send information to us for processing, because we have much more processing power than they have." And we needed something in the infrastructure to send us the information. So that's why I wanted to call it the necessary evil. We ended up calling it next generation firewall, which was probably a better term. >> Well, even Veritas. Remember Veritas? They had the no hardware agenda. Even they have a box. So it is like you say, you got to have it. >> It's necessary. >> Okay.
You did this, you started this on your own cloud, kind of like Salesforce, ServiceNow. >> Correct. >> Similar now- >> Build your own data centers. >> Build your own data center. Okay, I call it a cloud, but no. >> No, it's the same. There's no cloud, it's just someone else's computer. >> According to Larry Ellison, and he was actually probably right about that. But over time, you've had this closer partnership with the public clouds. >> Correct. >> What does that bring you and your customers, and how hard was that to navigate? >> It wasn't that hard for us, because we didn't have that many services. Usually it's harder. Of course, we didn't do a lift and shift, which is the wrong thing to do with the cloud. We rebuilt things for the cloud, and the benefits, of course, are time to market, scale, agility, and in some cases also, cost. >> Yeah, some cases. >> In some cases. >> So you have a sort of a hybrid model today. You still run your own data centers, do you not? >> Very few. >> Really? >> There are very, very few things that we have to do on hardware, like simulating malware and things that cannot be done in a virtual machine, which is pretty much the only option you have in the cloud. They provide bare metal, but it doesn't serve our needs. I think that we don't view cloud, and your viewers should not be viewing cloud, as a place where they're going to save money. It's a place where they're going to make money. >> I like that. >> You make much more money, because you're more agile. >> And that's why this conversation about how your cost of goods sold is going to be so high that you're going to have to come back to your own data centers, that's not on your mind right now. What's on your mind is advancing the unit, right? >> Look, my own data center would limit me in scale, would limit my agility. If you want to build something new, you don't have all the PaaS services, the platform as a service services like database, and AI, and so on.
I have to build them myself. It takes time. So yeah, it's going to be cheaper, but I'm not going to be delivering the same thing. So my revenues will be much lower. >> Less top line. What can humans do better than machines? You were talking about your keynote... I'm just going to chat a little bit. You were talking about your keynote. Basically, if you guys didn't see the keynote, AI is going to run every SOC within five years. That was a great prediction that you made. >> Correct. >> And they're going to do things that you can't do today, and then in the future, they're going to do things that you can't... Better than you can do. >> And you just have to be comfortable with that. >> So what do you think humans can do today and in the future better than machines? >> Look, humans can always do better than machines. The human mind can do things that machines cannot do. We are conscious, I don't think machines will be conscious. And you can do things... My point was not that machines can do things that humans cannot do. They can just do it better. The things that humans do today, machines can do better, and once machines do that, humans will be free to do things that they don't do today, that machines cannot do. >> Like what? >> Like finding the most difficult, most covert attacks, dealing with the most difficult incidents, things that machines just can't do. It's just that today, humans are consumed by finding attacks that machines can find, by dealing with incidents that machines can deal with. It's a waste of time. We leave it to the machines and go and focus on the most difficult problems, and then have the machines learn from you, so that next time, or a hundred or a thousand times from now, they can do it themselves, and you focus on the even more difficult. >> Yeah, just like after 9/11, they said that we lack the creativity. That's what humans have, that machines don't, at least today. >> Machines don't.
Yeah, look, every airplane has two pilots, even though airplanes have been flying themselves for 30 years now. Why do you have two pilots? To do the things that machines cannot do. Like land on the Hudson, right? You always need humans to do the things that machines cannot do. But leave the things that machines can do to the machines, they'll do it better. >> And autonomous vehicles need brakes. (indistinct) >> In your customer conversations, are customers really grappling with that, or are they going, "Yeah, you're right"? >> It depends. It's hard for customers to let go of old habits. First, the habit of buying a hundred different solutions from a hundred different vendors, and you know what? Why would I trust one vendor to do everything, put all my eggs in the same basket? They have all kinds of slogans as to why not to do that, even though it's been proven again and again that doing everything in one system with one brain, versus a hundred systems with a hundred brains, works much better. So that's one thing. The second thing is, we always have the same issue that we've had, I think, since the industrial revolution, of, will machines take away my job? No, they're just going to make your job better. So I think that some of our customers are also grappling with that, like, "What do I do if the machines take over?" And of course, like we've said, the machines aren't taking over. They're going to do the benign work, you're going to do the interesting work. You should embrace it. >> When I think about your history as a technology pro, from Check Point, a couple of startups, one of the things that always frustrated you is when a larger company bought you out, you ended up getting sucked into the bureaucratic vortex. How do you avoid that at Palo Alto Networks? >> So first, you mean when we acquire a company? >> Yes. >> The first thing is that, when we acquire companies, we always acquire for integration.
Meaning, we don't just buy something and then leave it on the side, and try to sell it here and there. We integrate it into the core of our products. So that's very important, so that the technology lives, thrives and continues to grow as part of our bigger platform. And I think that the second thing that is very important, from what we've learned from past experience, is to put the people that we acquire in key positions. Meaning, you don't buy a company and then put the leader of that company five levels below the CEO. You always put them in very senior positions. Almost always, we have the leaders of the companies that we acquire be two levels below the CEO, so very senior in the company, so they can influence and make changes. >> So two questions related to that. One is, as you grow your team, can you be both integrated and best of breed? Second part of the question is, do you even have to be? >> So I'll answer it in the third way, which is, I don't think you can be best of breed without being integrated in cybersecurity. And the reason is, again, this split brain that I've mentioned twice. When you have different products each do a part of cybersecurity and they don't talk to each other, and they don't share a single brain, you always compromise. You start looking for things the wrong way. I can be a little bit technical here, but please. Take the example of, traditionally you would buy an IDS/IPS separately from your URL filtering, separately from DNS security. One of the most important things we do in network security is to find command and control connections. Command and control connections, where the adversary is controlling something behind your firewall and is now going around your network, are usually the Achilles' heel of the attack. That's why attacks like ransomware, that don't have a command and control connection, are so difficult to deal with, by the way.
So command and control connections are an Achilles' heel of the attacks, and there are three different technologies that deal with it. URL filtering for URL-based command and control, DNS security for DNS-based command and control, and IDS/IPS for general command and control. If those are three different products, they'll be doing the wrong things. The URL filter will try to find things that it's not really good at, that the IPS really needs to find, and the DN... It doesn't work. It works much better when it's one product doing everything. So I think the choice is not between best of breed and integrated. I think the only choice is integrated, because that's the only way to be best of breed. >> And behind that technology is some kind of realtime data store, I'll call it data lake, database. >> Yeah. >> Whatever. >> It's all driven by the same data. All the URLs, all the domain graph. Everything goes to one big data lake. We collect about... I think we collect about a few petabytes per day. I don't know the exact number. It's all going to the same data lake, and all the intelligence is driven by that. >> So you mentioned in a cheeky comment about why you founded the company, there weren't enough cybersecurity companies. >> Yeah. >> Clearly the TAM expansion strategy that Palo Alto Networks has pursued has been very successful. You've been, as you talked about, very focused on integration, not just from the technology perspective, but from the people perspective as well. >> Correct. >> So why are there still so many cybersecurity companies, and what are you thinking Palo Alto Networks can do to change that? >> So first, I think that there are a lot of cybersecurity companies out there because there's a lot of money going into cybersecurity. If you look at the number of companies that have been really successful, it's a very small percentage of those cybersecurity companies.
And also look, we're not going to be responsible for all the innovation in cybersecurity. We need other people to innovate. It's also... Look, always the question is, "Do you buy something or do you build it yourself?" Now we think we're the smartest people in the world. Of course, we can build everything, but it's not always true that we can build everything. No, that we're the smartest people in the world, for sure. You see, when you are a startup, you live and die by the thing that you build. Meaning if it's good, it works. If it's not good, you die. You run out of money, you shut down, and you just lost four years of your life to this, at least. >> At least. >> When you're a large company, yeah, I can go and find a hundred engineers and hire them. And especially nowadays, as it became easier, give them money and have them go and build the same thing that the startup is building, but they're part of a bigger company, and they'll have more coffee breaks, and they'll have less incentive to go and do that, because the company will survive with or without them. So that's why startups can sometimes do things much better than larger companies. We can do things better than startups when it comes to being data driven, because we have the data, and nobody can compete against the amount of data that we have. So we have a good combination of finding the right startups that have already built something, already proven that it works with some customers, and of course, building a lot of things internally that cannot be done outside. >> I heard you say in one of the, I dunno, dozens of videos of yours I've listened to: the industry doesn't need or doesn't want another IoT stovepipe. Okay, I agree. So you got on-prem, AWS, Azure, Google, maybe Alibaba, IoT is going to be all over the place.
So can you build, I call it the security super cloud, in other words, a consistent experience with the same policies and edicts across all my estates, irrespective of physical location? Is that technically feasible? Is it what you are trying to do? >> Certainly, it's what we're trying to do with Prisma Cloud, our cloud security product. It works across all the clouds that you mentioned, and Oracle as well. It's almost entirely possible. >> Almost. >> Almost. Well, the things that... What you do is you normalize the language that the different cloud scale providers use into one language. AWS calls it S3, and (indistinct) calls it GCS, and so on. So you normalize their terminology, and then build policy using a common terminology that your customers have to get used to. Of course, there are things that are different between the different cloud providers that cannot be normalized, and there, it has to be cloud specific. >> In that instance. So is that, in part, your strategy, to actually build that? >> Of course. >> And does that necessitate running on all the major clouds? >> Of course. It's not just part of our strategy, it's a major part of our strategy. >> Compulsory. >> Look, as a standalone vendor that is not a cloud provider, we have two advantages. The first one is we're security focused. So we can do much better than them when it comes to security. If you are AWS, GCP, Azure, and so on, you're not going to put your best people on security, you're going to put them on the core business that you have. So we can do much better. Hey, that's interesting. >> Well, that's not how they talk. >> I don't care how they talk. >> Now that's interesting. >> When something is 4% of your business, you're not going to put it... You're not going to put your best people there. It's just, why would you? You put your best people on the 96%. >> That's not driving their revenue. >> Look, it's simple.
It's not what we- >> With all due respect. With all due respect. >> So I think we do security much better than them, and they become the good enough, and we become the premium. But certainly, the second thing that gives us an advantage and the right to be a standalone security provider is that we're multicloud: private cloud and all the major cloud providers. >> But they also have a different role. I mean, your role is not securing the Nitro card or the Graviton chip, or is it? >> They are responsible for securing up to the operating system. We secure everything. >> They do a pretty good job of that. >> No, they do, certainly they have to. If they get breached at that level, it's not just that one customer is going to suffer, it's the entire customer base. They have to spend a lot of time and money on it, and frankly, that's where they put their best security people. Securing the infrastructure, not building some cloud security feature. >> Absolutely. >> So Palo Alto Networks is, as we wrap here, on track to nearly double its revenues to nearly seven billion in FY '23, just compared to 2020. You were quoted in the press as saying, "We will be the first $100 billion cyber company." What is next for Palo Alto to achieve that? >> Yeah, so it was Nikesh, our CEO and chairman, that was quoted saying that, "We will double to a hundred billion." I don't think he gave it a timeframe, but what it takes is to double the sales, right? We're at 50 billion market cap right now, so we need to double sales. But in reality, you mentioned that we're growing the TAM by doing more and more cybersecurity functions, and taking away pieces. Still, even though we're the largest cybersecurity vendor in the world, we have a very low market share, and that shows you how fragmented the market is. I would also like to point out something that is less known.
Part of what we do with AI is really take the part of the cybersecurity industry which is service oriented, and services are about 50% of the cybersecurity industry, and turn it into products. I mean, not all of it. But a good portion of what's provided today by people, and tens of billions of dollars are spent on that, can be done with products. And being one of the very, very few vendors that do that, I think we have a huge opportunity in turning those tens of billions of dollars in human services to AI. >> It's always been a good business, taking human labor and translating it into R&D, vendor R&D. >> Especially- >> It never fails if you do it well. >> Especially in difficult times, difficult economic times like we are probably experiencing right now around the world. We, not we, but we the world. >> Right, right. Well, congratulations. Coming up on the 18th anniversary. Tremendous amount of success. >> Thank you. >> Great vision, clear vision, TAM expansion strategy, really well underway. We are definitely going to continue to keep our eyes on it. >> Big company, a hundred billion, that's market cap, so that's a big company. You said you didn't want to work for a big company unless you founded it, is that... >> Unless it acts like a small company. >> There's the caveat. We'll keep our eye on that. >> Thank you very much. >> It's such a pleasure having you on. >> Thank you. >> Same here, thank you. >> All right, for our guests and for Dave Vellante, I'm Lisa Martin. You're watching theCUBE, the leader in live emerging and enterprise tech coverage. (upbeat music)

Published Date : Dec 14 2022


ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Lisa Martin	PERSON	0.99+
2005	DATE	0.99+
AWS	ORGANIZATION	0.99+
Larry Ellison	PERSON	0.99+
Palo Alto Networks	ORGANIZATION	0.99+
two questions	QUANTITY	0.99+
50 billion	QUANTITY	0.99+
Alibaba	ORGANIZATION	0.99+
Nir	PERSON	0.99+
4%	QUANTITY	0.99+
96%	QUANTITY	0.99+
30 years	QUANTITY	0.99+
two pilots	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
five levels	QUANTITY	0.99+
second thing	QUANTITY	0.99+
2020	DATE	0.99+
Google	ORGANIZATION	0.99+
Veritas	ORGANIZATION	0.99+
Nir Zuk	PERSON	0.99+
18 years	QUANTITY	0.99+
four years	QUANTITY	0.99+
One	QUANTITY	0.99+
twice	QUANTITY	0.99+
two levels	QUANTITY	0.99+
one brain	QUANTITY	0.99+
First	QUANTITY	0.99+
Today	DATE	0.99+
second part	QUANTITY	0.99+
first	QUANTITY	0.99+
one product	QUANTITY	0.99+
both	QUANTITY	0.99+
FY '23	DATE	0.99+
one language	QUANTITY	0.99+
Ignite '22	EVENT	0.98+
Palo Alto	ORGANIZATION	0.98+
Las Vegas	LOCATION	0.98+
third way	QUANTITY	0.98+
one vendor	QUANTITY	0.98+
one system	QUANTITY	0.98+
one thing	QUANTITY	0.98+
tens of billions of dollars	QUANTITY	0.98+
dozens	QUANTITY	0.98+
today	DATE	0.98+
first $100 billion	QUANTITY	0.98+
two good options	QUANTITY	0.98+
Second part	QUANTITY	0.98+
two advantages	QUANTITY	0.98+
S3	TITLE	0.98+
Nikesh	ORGANIZATION	0.98+
one	QUANTITY	0.97+
about 50%	QUANTITY	0.97+
three different products	QUANTITY	0.97+
18th anniversary	QUANTITY	0.97+
first one	QUANTITY	0.96+
three different technologies	QUANTITY	0.95+
five years	QUANTITY	0.95+
single brain	QUANTITY	0.95+
MGM Grand Hotel	LOCATION	0.95+
one customer	QUANTITY	0.94+
Hudson	LOCATION	0.92+

Kevin Miller and Ed Walsh | AWS re:Invent 2022 - Global Startup Program

>> Hi everybody, welcome back to re:Invent 2022. This is theCUBE's exclusive coverage. We're here at the satellite set, up on the fifth floor of the Venetian Conference Center, and this is part of the Global Startup Program, the AWS Startup Showcase series that we've been running all through last year and into this year with AWS, featuring some of its global partners. Ed Walsh is here, the CEO of ChaosSearch and a many-time CUBE alum, and Kevin Miller, also a CUBE alum, vice president and GM of S3 at AWS. Guys, good to see you again. >> Yeah, great to see you, Dave. >> Hi, Kevin. We call this our Super Bowl, so this must be like your, I don't know, World Cup. It's a pretty big event. >> Yeah, it's the World Cup for sure. >> So, a lot of S3 talk. I mean, that's what got us all started in 2006. >> Absolutely. >> What's new in S3? >> Yeah, it's been a great show. We've had a number of really interesting launches over the last few weeks, and a few at the show as well. We've been really focused on helping customers that are running massive-scale data lakes, whether it's structured or unstructured data. We announced just an hour ago, I think it was, a new capability to give customers cross-account access points for sharing data securely with other parts of the organization. That's something we'd heard from customers: as they grow and have more data sets, and look to get more out of their data, they are increasingly looking to enable multiple teams across their businesses to access those data sets securely, and that's what we provide with cross-account access points. We also launched yesterday our Multi-Region Access Point failover capabilities. Again, this is where customers have data sets and they're using multiple regions for certain critical workloads; they're now able to use that to control the failover between different regions in AWS. And then one other launch I would just highlight is some
improvements we made to Storage Lens, which is really a very novel and unique capability to help customers understand what storage they have and where, who's accessing it, and when it's being accessed. We added a bunch of new metrics. Storage Lens has been pretty exciting for a lot of customers. In fact, we looked at the data and saw that customers who have adopted Storage Lens typically saved, within six months, more than six times what they had invested in turning Storage Lens on. And certainly in this environment right now it's pretty top of mind for a lot of customers: they're looking for ways to optimize their costs in the cloud, take some of those savings, and reinvest them in new innovation. So, pretty exciting with the Storage Lens launch. >> I think what's interesting about S3 is that pre-cloud, object store was kind of a niche, right? And then of course you guys announced S3 in 2006, as I said, and okay, great: cheap and deep storage, simple get and put. Now the conversation's about how to enable value from data. >> Absolutely. >> Analytics, and it's just a whole new world. And Ed, you've talked many times, I love the term, "we built ChaosSearch on the shoulders of giants," right? Underlying that is S3, but the value that you can build on top of that has been key. >> I don't think we've talked about shoulders of giants, but we've talked about how we have a big vision, right? It's hard to solve the challenge of analytics at scale. We really focus on the big data coming into the environment and getting analytics. So we talk about being on the shoulders of giants, obviously Isaac Newton's metaphor: I learned from everything before, and we layer on top. So really, when you talk about all the things that come from S3, I just smile, because we picked it up naturally. We went all in on S3, and this is where I think you're going, Dave, but everyone is. So let's just
cut to the chase: with any of the data platforms you're using, S3 is what you're building on. But we did it a little bit differently. At first, people were using it as cold storage, like you said, and then they would ETL it up into different platforms for analytics of different sorts. Now people are using it closer: they're doing caching layers and caching out, but that's where the attributes of scale and reliability are. What we did is actually make S3 a database. Literally, we have no persistence outside S3. So it's working really well with clients, because we pick up all these attributes of scale and reliability and it shows up in the clients' environments. When you launch all these new scalable things, we just see it. Our clients constantly comment. One of our biggest customers, a fintech in Europe, goes into Black Friday, and again, Black Friday's not one day, and they scale from, what is it, 58 terabytes a day up to 187 terabytes a day, and we don't flinch. They say, "How do you do that?" Well, we built our platform on S3, so it works as long as you can stream it to S3. So they're saying, "I can't overrun S3," and it's a natural play. It's really nice that we take on those attributes. Same thing: that's why we're able to help clients. Equifax is a good example: they're able to consolidate 12 of their divisions on one platform. We couldn't have done that without the scale and the performance of what you can get from S3. They also saved 90%, and they're able to do that, but that's really because the only persistence is S3 and what you guys are delivering. And then, on the shoulders of giants, we really focus on innovating on top of your platforms and bringing that out. So we have a unique data representation that makes it easy to ingest this data, because it's coming at you with the four V's of big data. We allow you to do that,
make it performant on S3, so now you're doing hot analytics on S3 as if it's a native in-memory database, but there's no memory or SSD caching. And then multi-model: once you get it there, don't move it, leverage it in place. So, Elasticsearch access, Kibana or Grafana access, or SQL access with your tools. We're seeing that constantly. We always talk about being on the shoulders of giants, but even this week I get comments from our customers like, "How did you do that?" And most of it is because we built on top of what you guys provided. So it's really working out pretty well. >> We talk a lot about digital transformation. Of course, we had the pleasure of sitting down with Adam Selipsky prior; John Furrier flew to Seattle and sat down for his annual one-on-one with the AWS CEO, which is kind of cool. >> Yeah. >> It's good, it's like studying for the test. But one of the interesting things he said was, "One of our challenges going forward is how do we go beyond digital transformation into business transformation?" Okay, well, that's interesting. I was talking to a customer today, an AWS customer, and obviously others. They're a 100-year-old company, and they call them like the Uber for servicing appliances. When your appliance breaks, you've got to get a person to service it. If it's out of warranty, these guys do that. So they've got to basically have a network of technicians, and they've got to deal with the customers, no phone, right? So that was a business transformation, right? Everybody says they're becoming a software company, but they're building it, of course, right on the cloud. So I wonder if you guys could each talk about what you're seeing in terms of changing not only in IT and the digital transformation, but also the business transformation. >> Yeah, I 100% agree. I think business
transformation is probably one of the top themes I'm hearing from customers of all sizes right now. Even in this environment, I think customers are looking for what they can do to drive top line, improve bottom line, or just improve their customer experience, and really have that effect where they're helping customers get more done. It is very tricky, because the customers that are doing that successfully, I think, are really getting into the lines of business and figuring out that it's probably a different skill set, possibly a different culture, different norms and practices and process. So it's a lot more, like you said, than just the technology involved. But when we sort of distill it down into the data, that's where we absolutely see it as a critical function for lines of business to become more comfortable: first off, knowing what data sets they have, what data they could access but possibly aren't accessing today, and then starting to tap into those data sources. Then, as that progresses, figuring out how to share and collaborate with data sets across a company, to correlate across those data sets and drive more insights. And as all that's being done, of course, it's important to measure the results and really see what effect this is having, and to prove that effect. I've certainly seen plenty of customers be able to show that this is a percentage increase in top or bottom line. So that pattern is playing out a lot. And actually, a lot of how we think about where we're going with S3 is related to how we make it easier for customers to do everything that I just described: to understand what data they have and to make it accessible. And it's great to have such a great ecosystem of partners that are then building on top of that and innovating to help customers connect really directly with the
businesses that they're running and driving those insights. >> Well, and customers are hours today. One of the things I loved that Adam said: "Amazon is strategically very, very patient, but tactically we're really impatient." And the customers out there are like, "How are you going to help me increase revenue? How are you going to help me cut costs?" We were talking off camera about how software can actually help do that. >> Yeah, it's deflationary. I love the quote, right? Software's deflationary: as costs come up, how do you drive it and also free up the team? And you nailed it. Everyone wants to save money, but they're not putting off these projects. In fact, the digital transformation, or the business transformation, is actually moving forward, and the projects are getting a little bit bigger. But everyone's looking for creative ways to look at their architecture as it becomes larger and larger. We talked about a couple of those examples, but take observability: they want to give this tool set, this data, to all the developers and all their SREs, and the same data to the whole security team. To do that, they need to find a way to architect it to do that at scale and save money simultaneously. So we constantly see people who are pairing us up with some of these larger firms: keep your Datadog, keep your Splunk, and use us to reduce the cost. That one plus one is actually cheaper than what you have. Then they use it either to save money, we're saving 50 to 80% in hard dollars, or, more importantly, to free up the team from the toil. And then they turn around and make that budget neutral, and they're allowed to get the same tools to more people across the org, because they're sometimes constrained in getting access to everyone. >> Explain that a little bit more. Let's say I've got Splunk or Datadog and I'm sifting through logs. How exactly do you help? >> So it's pretty simple. I'll use the Datadog example. Let's say you're using Datadog for observability, so it's just your
developers and your SREs managing environments. All these platforms are really good at being a monitoring and alerting type of tool. What they're not necessarily great at is keeping the data for longer periods, like the log data, the bigger data. That's where we're strong. With Datadog, let's say you're using it to keep 30 days of logs, which is not enough. Say you're running an environment and chasing a performance issue: you want to look at last quarter, last month, or maybe last Black Friday, so 30 days is not enough. But it will charge you $2.80 a gigabyte. Don't focus on the details, just $2.80. Then, if you just turn the knob and keep seven days there, but keep two years of data on us, which is on S3, it goes down to 22 cents; plus our list price of 80 cents, that comes to $1.02, compared to $2.80. So here's the thing: what they're able to do is just turn a knob and get more data. We do an integration, so you can go right from Datadog or Grafana directly into our platform. The user doesn't see it, but they save money. A lot of times they don't just save the money; they use that to fund getting Datadog to a lot more people. >> Makes sense. >> So it's creativity: they're looking at it, and they're looking at tools. We see the same thing with Grafana. If you look at the whole Grafana play, it's, hey, you can't put it all in one place, so use Prometheus for metrics or traces; we fit well with logs. But they're using that to bring down their costs, because a lot of this data really bogs down these applications. The alerting and monitoring tools are good at small data; they're not good at the big data, which is what we're really good at. And then the one plus one is actually less than you paid for the one. So it works pretty well. >> So things are really unpredictable right now in the economy. During the pandemic we sort of locked down, and then the stock market went crazy. We're like, okay, it's going to end, it's going to end, and then
it looked like it was going to end, and then, you know... But last year, re:Invent was just in that sweet spot before Omicron, so we tucked it in, which was awesome, right? It was a great event. We really missed one physical re:Invent, which was very rare, so that's cool. But I've called it the slingshot economy. It feels like you're driving down the highway and you've got to hit the brakes, and then all of a sudden you're going, okay, we're through it. Oh no, you've got to hit the brakes again. >> Yeah. >> So it's very, very hard to predict. I was listening to Jassy this morning, and he was saying that consumers are still spending, but what they're doing is shopping for more features; they might be buying a TV that's less expensive, more value for the money. So, okay, hopefully consumer spending will get us out of this, but you don't really know. We don't seem to have the algorithms; we've never been through something like this before. So what are you guys seeing in terms of customer behavior, given that uncertainty? >> Well, one thing I would highlight, particularly going back to what we were just talking about as far as business and digital transformation: I think some customers are still appreciating the fact that where, yesterday, you may have had to put out some capital and commit to a large upfront expenditure, today there's the value of being able to experiment and scale up, and then, most importantly, scale down dynamically, based on whether the experiment is working out and whether you're seeing real value from it. Doing that on a time scale of a day or a week or a few months is so important right now, because again, it gets to: I am looking for ways to innovate and to drive top-line growth, but I can't commit to a multi-year set of costs to do that. And I think plenty of customers are finding that even a few
months of experimentation gives them some really valuable insight into whether this is going to be successful or not. And of course with S3 and storage, from day one we've been elastic: pay for what you use, and if you're not using the storage, you don't get charged for it. I think that, particularly right now, having the applications and the rest of the ecosystem around the storage and the data be able to scale up and scale down is just ever more important. >> And when people see that, typically they're looking to do more with it. You usually find these little department projects, but once they see a way to actually move faster and save money, and I think it is a mix of those two, they're looking to expand it, which can be a nightmare for sales cycles, because they take longer. But people are asking, "Why don't you leverage this and go across divisions?" So we do see people trying to leverage it. I don't think digital transformation is slowing down, but to be honest, there are a lot more approvals at this point for everything. >> It is. Adam had another great quote in his keynote. He said, "If you want to save money, the cloud's a place to do it." >> Absolutely. >> And I read an article recently that said this is the first time AWS has ever seen a downturn, because the cloud was too early back then. I'm like, you weren't paying attention in 2008, because that was the first major inflection point for cloud adoption, where CFOs said, "Okay, stop the capex, we're going to opex," and you saw the cloud take off. And then 2010 started this amazing cycle that we really haven't seen anything like, where they were doubling down on investments, and it was real, hardcore investment. It wasn't like 1998-99, when it was all just going out the door for no clear reason. >> Yeah. >> So that foundation is now in place, and I think it makes a lot of sense, and it could be here for a while, where people are saying, hey, I want to
optimize, and I'm going to do that on the cloud. >> Yeah, I certainly agree with Adam's quote. I think that's really been in AWS's DNA from day one: that ability to scale costs with actual consumption and pay for what you use. And certainly moments like now are ones that can really motivate change in an organization, in a way that might not have been as palatable when it didn't feel as necessary. >> Yeah. All right, we've got to go. We'll give you the last word. >> I think it's been a great event. I love all your announcements. I think this is wonderful. It's been a great show. In fact, how many people are here at re:Invent? >> North of 50,000. >> Yeah, I feel like it's as big, if not bigger, than 2019. People have said 2019 was a record when you count out all the professors. I don't know, it feels as big, if not bigger, so there's great energy. >> Yeah, it's quite amazing, and we're thrilled to be part of it. >> Guys, thanks for coming on theCUBE again. Really appreciate it, face to face. All right, thank you for watching. This is Dave Vellante for theCUBE, your leader in enterprise and emerging tech coverage. We'll be right back.
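
The log-retention arithmetic Ed Walsh walks through in the conversation above can be checked with a short sketch. It uses only the per-gigabyte figures quoted on air ($2.80 for 30 days in the monitoring tool, 22 cents after cutting retention to seven days, plus an 80-cent list price for long-term retention on S3). These are not verified price lists, and the constant and function names are made up for illustration.

```python
# Prices in cents per gigabyte, as quoted in the conversation (assumed, not verified).
MONITORING_30_DAYS = 280   # monitoring tool keeping 30 days of logs
MONITORING_7_DAYS = 22     # same tool after cutting retention to 7 days
LONG_TERM_ON_S3 = 80       # quoted list price for two years of logs kept on S3


def combined_cents_per_gb(short_term: int, long_term: int) -> int:
    """Cost per GB when short-term and long-term retention are paid separately."""
    return short_term + long_term


def savings_fraction(before: int, after: int) -> float:
    """Fraction of the original per-GB cost that is saved."""
    return (before - after) / before


combined = combined_cents_per_gb(MONITORING_7_DAYS, LONG_TERM_ON_S3)
saved = savings_fraction(MONITORING_30_DAYS, combined)

print(f"combined: ${combined / 100:.2f} per GB")                      # combined: $1.02 per GB
print(f"saved: {saved:.0%} versus ${MONITORING_30_DAYS / 100:.2f}")   # saved: 64% versus $2.80
```

On these quoted numbers the saving works out to roughly 64%, which sits inside the 50 to 80 range mentioned earlier in the interview.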

Published Date : Dec 7 2022

SUMMARY :

across a company to you know to

ENTITIES

Entity	Category	Confidence
Ed Walsh	PERSON	0.99+
Kevin Miller	PERSON	0.99+
two years	QUANTITY	0.99+
2006	DATE	0.99+
2008	DATE	0.99+
seven days	QUANTITY	0.99+
Adam	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
30 days	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
50	QUANTITY	0.99+
Adam Selipsky	PERSON	0.99+
Dave Vellante	PERSON	0.99+
two	QUANTITY	0.99+
eighty cents	QUANTITY	0.99+
Europe	LOCATION	0.99+
22 cents	QUANTITY	0.99+
Kevin	PERSON	0.99+
80 cents	QUANTITY	0.99+
Seattle	LOCATION	0.99+
12	QUANTITY	0.99+
2010	DATE	0.99+
Isaac Newton	PERSON	0.99+
Dave	PERSON	0.99+
Super Bowl	EVENT	0.99+
a day	QUANTITY	0.99+
Venetian Conference Center	LOCATION	0.99+
fifth floor	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
World Cup	EVENT	0.99+
last year	DATE	0.99+
last quarter	DATE	0.99+
yesterday	DATE	0.99+
S3	TITLE	0.99+
last month	DATE	0.99+
more than six times	QUANTITY	0.99+
2019	DATE	0.98+
Prometheus	TITLE	0.98+
six months	QUANTITY	0.98+
280	QUANTITY	0.98+
pandemic	EVENT	0.98+
Black Friday	EVENT	0.97+
an hour ago	DATE	0.97+
today	DATE	0.97+
58 terabytes a day	QUANTITY	0.97+
100 year old	QUANTITY	0.97+
this morning	DATE	0.97+
a week	QUANTITY	0.97+
Ed wallson	PERSON	0.97+
three	QUANTITY	0.96+
Equifax	ORGANIZATION	0.96+
Jassy	PERSON	0.96+
one platform	QUANTITY	0.96+
this year	DATE	0.96+
grafana	TITLE	0.96+
one days	QUANTITY	0.95+
first time	QUANTITY	0.95+
one	QUANTITY	0.95+
black Friday	EVENT	0.93+
this week	DATE	0.92+
first major inflection	QUANTITY	0.91+
one place	QUANTITY	0.91+
SQL	TITLE	0.9+
last	DATE	0.89+
Store	TITLE	0.89+

Tomer Shiran, Dremio | AWS re:Invent 2022

>>Hey everyone. Welcome back to Las Vegas. It's theCUBE, live at AWS re:Invent 2022. This is our fourth day of coverage. Lisa Martin here with Paul Gillin. Paul, we started Monday night; we filmed and streamed for about three hours. We have had jam-packed days, Tuesday, Wednesday, Thursday. What's your takeaway? >>We're rounding the final turn as we head into the home stretch. This is as it has been since the beginning, a show with a lot of energy. I'm amazed, for the fourth day of a conference, how many people are still here. >>I am too. >>And how active they are, and how full the sessions are. Huge crowd for the keynote this morning. You don't see that at most day-four conferences. Everyone's on their way home. So people come here to learn, and they're still >>Learning. They are still learning. And we're gonna help continue that learning path. We have an alumnus back with us: Tomer Shiran joins us, the CPO and co-founder of Dremio. Tomer, it's great to have you back on the program. >>Yeah, thanks for having me here. And thanks for keeping the best session for the fourth day. >>Yeah, you're right. I like that. That's good mojo to come into this interview with, Tomer. So the last time I saw you was a year ago here in Vegas at re:Invent '21. We talked about the growth of data lakes and data lakehouses. We talked about the need for open data architectures as opposed to data warehouses. And the headline of the SiliconANGLE article on the interview we did with you was, "Dremio Predicts 2022 Will Be the Year Open Data Architectures Replace the Data Warehouse." We're almost done with 2022. Has that prediction come true? >>Yeah, I think we're seeing almost every company out there, certainly in the enterprise, adopting data lake and data lakehouse technology, embracing open source file and table formats. So I think that's definitely happening. Of course, nothing goes away.
So data warehouses don't go away in a year, and actually don't go away ever. We still have mainframes around. But certainly the trends are all pointing in that direction. >>Describe the data lakehouse for anybody who may not be really familiar with it, and what it really means for organizations. >>Yeah. I think you could think of the data lakehouse as the evolution of the data lake, right? For the last decade we've had these two options, data lakes and data warehouses. Warehouses had good SQL support and good performance, but you had to spend a lot of time and effort getting data into the warehouse. You got locked into them, and they're very, very expensive. That's a big problem now. Data lakes were more open and more scalable, but had all sorts of limitations. What we've done now as an industry with the lakehouse, and especially with technologies like Apache Iceberg, is unlock all the capabilities of the warehouse directly on object storage like S3. So you can insert, update, and delete individual records. You can do transactions. You can do all the things you could do with a database, directly in open formats, without getting locked in, and at a much lower cost. >>But you're still dealing with semi-structured data as opposed to structured data, and there's work that has to be done to get that into a usable form. That's where Dremio excels. What has been happening in that area? I mean, is it formats like JSON that are enabling this to happen? How are we advancing the cause of making semi-structured data usable? >>Yeah, well, first of all, I think that's all changed. That was maybe true for the original data lakes, but now with the lakehouse, our bread and butter is actually structured data. It's all tables with a schema.
You can create tables and insert records. Really, everything you can do with a data warehouse you can now do in the lakehouse. Now, that's not to say that there aren't very advanced capabilities when it comes to JSON, nested data, and sparse data. We excel in that as well. But we're really seeing the lakehouse take over the bread-and-butter data warehouse use cases. >>You mentioned open a minute ago. Talk about why open is important and the value that it can deliver for customers. >>Yeah, well, if you look back in time at all the challenges that companies have had with traditional data architectures, a lot of that comes from the problems with data warehouses: the fact that they're very expensive, and that you have to ingest the data into the data warehouse in order to query it. And then it's almost impossible to get off of these systems, right? It takes an enormous effort and tremendous cost to get off of them. So you're kind of locked in, and that's a big problem, right? You're also dependent on that one data warehouse vendor, right? You can only do things with that data that the warehouse vendor supports. Contrast that with data lakehouses and open architectures, where the data is stored in entirely open formats. >>So with things like Parquet files and Apache Iceberg tables, you can use any engine on that data. You can use a SQL query engine, you can use Spark, you can use Flink. There are a dozen different engines that you can use on it at the same time. And in the future, if you ever want to try something new that comes out, some new open source innovation, some new startup, you just take it and point it at the same data. So the data is now at the core, at the center of the architecture, as opposed to some vendor's logo. Yeah.
>>Amazon seems to have bought into the lakehouse concept. It had big announcements on day two about eliminating the ETL stage between RDS and Redshift. Do you see the cloud vendors as pushing this concept forward? >>Yeah, a hundred percent. Amazon's a great partner of ours. We work with probably 10 different teams there, everything from the S3 team to the Glue team, the QuickSight team, and everything in between. Their embrace of the lakehouse architecture, and the fact that they adopted Iceberg as their primary table format, I think that's exciting. As an industry, we're all coming together around standard ways to represent data, so that at the end of the day companies have the benefit of being able to have their own data, in their own S3 account, in open formats, and use all these different engines without losing any of the functionality that they need, right? All these interactions with data for which, in the past, you would have had to move the data into a database or warehouse, you just don't have to do that anymore. >>Speaking of functionality, talk about what's new this year with Dremio since we saw you last. >>Yeah, there are a lot of new things with Dremio. We now have full Apache Iceberg support with DML commands: you can do inserts, updates, deletes, COPY INTO, and all of that is now a fully supported, native part of the platform. We now offer two flavors of Dremio. We have Dremio Cloud, which is our SaaS version, fully hosted. You sign up with your Google or Azure account, and you're up and running in a minute. And then Dremio software, which you can self-host, usually in the cloud, but even outside of the cloud. And then we're also very excited about this new idea of data as code.
And so we've introduced a new product that's now in preview called Dremio Arctic. The idea there is to bring the concepts of Git or GitHub to the world of data: things like being able to create a branch and work in isolation. If you're a data scientist, you want to experiment on your own without impacting other people. Or if you're a data engineer and you're ingesting data, you want to transform it and test it before you expose it to others. You can do that in a branch. So all these ideas that we take for granted now in the world of source code and software development, we're bringing to the world of data with Dremio Arctic. And when you think about data mesh, a lot of people are talking about data mesh now and wanting to take advantage of those concepts and ideas, thinking of data as a product. Well, when you think about data as a product, we think you have to manage it like code, right? That's why we call it data as code. All the reasons that we use things like Git to build products apply: if we want to think of data as a product, we need all those capabilities with data too. Also the ability to go back in time, the ability to undo mistakes, to see who changed my data and when they changed that table. All of those are part of this new catalog that we've created. >>You talk about data as a product; that's sort of intrinsic to the data mesh concept. What's your opinion of data mesh? Is the world ready for that radically different approach to data ownership? >>You know, we now have dozens of customers that are using Dremio to implement enterprise-wide data mesh solutions. And at the end of the day, I think it's just what most people would consider common sense, right?
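
The "data as code" ideas described above (branching, working in isolation, undoing mistakes, seeing who changed a table) can be illustrated with a toy catalog. This is only a conceptual sketch: `ToyCatalog`, `Commit`, and the snapshot ids are hypothetical names invented here, it is not the API of the product discussed in the interview, and it supports fast-forward merges only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Commit:
    tables: Dict[str, str]            # table name -> snapshot id (hypothetical ids)
    author: str
    message: str
    parent: Optional["Commit"] = None


class ToyCatalog:
    """Hypothetical Git-style catalog of table snapshots, for illustration only."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit({}, "system", "init")}

    def create_branch(self, name: str, source: str = "main") -> None:
        # A new branch is just another pointer to the same commit.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, table: str, snapshot: str,
               author: str, message: str) -> None:
        head = self.branches[branch]
        tables = dict(head.tables)    # copy, so other branches are unaffected
        tables[table] = snapshot
        self.branches[branch] = Commit(tables, author, message, parent=head)

    def tables(self, branch: str) -> Dict[str, str]:
        return dict(self.branches[branch].tables)

    def merge(self, source: str, target: str = "main") -> None:
        # Fast-forward only: the target head must be an ancestor of the source head.
        node: Optional[Commit] = self.branches[source]
        while node is not None and node is not self.branches[target]:
            node = node.parent
        if node is None:
            raise ValueError("branches have diverged; this toy only fast-forwards")
        self.branches[target] = self.branches[source]

    def rollback(self, branch: str) -> None:
        # Undo the latest change by moving the branch back one commit.
        parent = self.branches[branch].parent
        if parent is None:
            raise ValueError("nothing to roll back")
        self.branches[branch] = parent

    def log(self, branch: str) -> List[Tuple[str, str]]:
        # Who changed what, newest first.
        history = []
        node: Optional[Commit] = self.branches[branch]
        while node is not None:
            history.append((node.author, node.message))
            node = node.parent
        return history


catalog = ToyCatalog()
catalog.commit("main", "orders", "snap-1", "alice", "initial load")
catalog.create_branch("etl")                      # work in isolation
catalog.commit("etl", "orders", "snap-2", "bob", "nightly ingest")
print(catalog.tables("main"))                     # main still sees snap-1
catalog.merge("etl")                              # expose the tested change
print(catalog.log("main")[0])                     # latest author and message
```

The design choice worth noting is that a branch is only a pointer to an immutable commit, which is why creating one is cheap and why rolling back is just moving the pointer.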
In a large organization, it is very hard for a single centralized team to understand every piece of data, to manage all the data themselves, to make sure the quality is correct, and to make it accessible. So what data mesh is first and foremost about is being able to federate, or distribute, the ownership of data. The governance of the data still has to happen, right? That, I think, is at the heart of the data mesh. But allowing different teams, different domains, to own their own data and really manage it like a product, with all the best practices that we have for that, is super important. >>So we're doing a lot with data mesh: the way that Dremio Cloud has multiple projects, and the way that Dremio Arctic allows you to have multiple catalogs, so different groups can interact and share data among each other. The fact that we can connect to all these different data sources, even outside your data lake, with Redshift, Oracle, SQL Server, all the different databases that are out there, and join across different databases in addition to your data lake, that's all stuff that companies want with their data mesh. >>What are some of your favorite customer stories where you've really helped them accelerate that data mesh and drive business value from it, so that more people in the organization have access to data and can really make those data-driven decisions that everybody wants to make? >>I mean, there are so many of them. One of the largest tech companies in the world is creating a data mesh where you have all the different departments in the company. They were a big data warehouse user, and it kind of hit the wall, right?
The costs were so high, and the ability for people to use it for just experimentation, to try new things out, to collaborate, they couldn't do it because it was so prohibitively expensive and difficult to use. And so they said, well, we need a platform where different people can collaborate, they can experiment with the data, they can share data with others. And so at a big organization like that, it's their ability to have a centralized platform but allow different groups to manage their own data. You know, several of the largest banks in the world are also doing data meshes with Dremio. You know, one of them has over a dozen different business units that are using Dremio, and that ability to have thousands of people on a platform and to be able to collaborate and share among each other, that's super important to these guys. >>Can you contrast your approach to the market with Snowflake's? 'Cause they have some of those same concepts. >>Snowflake's a very closed system at the end of the day, right? Closed and very expensive, right? I think, if I remember, I saw, you know, a quarter ago in one of their earnings reports that the average customer spends 70% more every year, right? Well, that's not sustainable. If you think about that over a decade, your cost is gonna increase 200x. Most companies are not gonna be able to swallow that, right? So companies need, first of all, more cost-efficient solutions that are, you know, just more approachable, right? And the second thing is, you know, we talked about the open data architecture. I think most companies now realize that if you want to build a platform for the future, you need to have the data in open formats and not be locked into one vendor, right? 
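The 200x figure quoted above is what compounding a 70% annual increase over ten years works out to. A minimal Python check of that arithmetic follows; the function name is just an illustrative choice, and the 70% input is the speaker's recollection rather than a verified number.

```python
def compound_multiplier(annual_growth, years):
    """Total cost multiplier after compounding a fixed annual growth rate."""
    return (1 + annual_growth) ** years


# 70% more every year, compounded over a decade.
ten_year_multiplier = compound_multiplier(0.70, 10)
print(f"{ten_year_multiplier:.1f}x")  # prints "201.6x"
```

In other words, 1.7 raised to the tenth power is roughly 201.6, so "200x in a decade" is a fair rounding of the stated growth rate.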
And so that's kind of another important aspect. Beyond that, there's the ability to connect to all your data, even outside the lake, to your different databases, NoSQL databases, relational databases, and Dremio's semantic layer, where we can accelerate queries. And so typically what happens with data warehouses and other data lake query engines is that, because you can't get the performance that you want, you end up creating lots and lots of copies of data. For every use case, you're creating, you know, a pre-joined copy of that data, a pre-aggregated version of that data. And you know, then you have to redirect all your data. >>You've got a... >>Governance problem, individual things. It's expensive, it's hard to secure that 'cause permissions don't travel with the data. So you have all sorts of problems with that, right? And so what we've done, because of our semantic layer that makes it easy to expose data in a logical way, and then our query acceleration technology, which we call reflections, which transparently accelerates queries and gives you subsecond response times without data copies and also without extracts into the BI tools. 'Cause if you start doing BI extracts or imports, again, you have lots of copies of data in the organization, all sorts of refresh problems, security problems, it's a nightmare, right? And just collapsing all those copies and having a simple solution where data's stored in open formats and we can give you fast access to any of that data, that's very different from what you get with, like, a Snowflake or any of these other companies. >>Right. That's a great explanation. I wanna ask you, early this year you announced that your Dremio Cloud service would be free forever, the basic Dremio Cloud service. How has that offer gone over? What's been the uptake on that offer? >>Yeah, I mean, it is, and thousands of people have signed up, and I think it's a great service. 
It's, you know, very, very simple. People can go on the website, try it out. We now have a test drive as well. If you want to get started with just some public sample data sets and, like, a tutorial, we've made that increasingly easy as well. But yeah, we continue to, you know, take that approach of making it easy, democratizing these kinds of cloud data platforms and kinda lowering the barriers to adoption. >>How effective has it been in driving sales of the enterprise version? >>Yeah, a lot of the business that we do, when it comes to selling, is, you know, folks that have educated themselves, right? They've started off, they've followed some tutorials. I think generally developers prefer the first interaction to be with a product, not with a salesperson. And so that's basically the reason we did that. >>Before we ask you the last question, can you give us a sneak peek into the product roadmap as we enter 2023? What can you share with us that we should be paying attention to where Dremio is concerned? >>Yeah. You know, actually, a couple of days ago here at the conference, we had a press release with all sorts of new capabilities that we just released. And there's a lot more for the coming year. You know, we will shortly be releasing a variety of different performance enhancements. So in the next quarter or two, we'll be, you know, probably twice as fast just in terms of raw query speed, you know, that's in addition to our reflections and our query acceleration. You know, support for all the major clouds is coming. You know, just a lot of capabilities in Dremio that make it easier and easier to use the platform. >>Awesome. Tomer, thank you so much for joining us. 
My last question to you is, if you had a billboard in your desired location and it was going to really just be like a mic drop about why customers should be looking at Dremio, what would that billboard say? >>Well, Dremio is the easy and open data lakehouse and, you know, open architectures. It's just a lot better, a lot more future-proof, a lot easier, and just a much safer choice for the future for companies. And so, hard to argue with those. People should take a look. Exactly. That wasn't the best, you know, billboard. >>Okay. I think it's a great billboard. Awesome. And thank you so much for joining Paul and me on the program, sharing with us what's new, what some of the exciting things are that are coming down the pipe quite soon. We're gonna be keeping our eye on Dremio. >>Awesome. Always happy to be here. >>Thank you. All right. For our guest and for Paul Gillin, I'm Lisa Martin. You're watching theCUBE, the leader in live and emerging tech coverage.

Published Date : Dec 1 2022



Glen Kurisingal & Nicholas Criss, T-Mobile | AWS re:Invent 2022

>>Good morning friends. Live from Las Vegas, it's theCUBE. Day four of our coverage of AWS re:Invent continues. Lisa Martin here with Dave Vellante. >>You can tell it's day four. Yeah. >>You can tell. >>You get punchy. >>Did you? Yes. Did you know that the Vegas rodeo is coming into town? I'm kind of bummed I'm leaving tonight. >>Really? You a rodeo fan? >>This weekend? No, but to see a bunch of cowboys in Vegas. >>I'd like to see the Raiders. I'd like to see the Raiders, get tickets. >>Yeah. And the hockey team. Yeah. We have had an amazing event, Dave. theCUBE's 10th year covering re:Invent. 11th re:Invent. >>Our 10th year here. Yeah. Yes. Yeah. I mean, we covered remotely during Covid, but... >>Yes, yes, yes. Awesome content. Anything jump out at you? We love talking to AWS, the ecosystem. We got a customer next. Anything jump out at you that's really kind of a key takeaway? >>Big story. The maturity of AWS, you know, I mean, people ask me what's different under Adam than under Andy. And I'm like, really, it's the maturity of AWS that's different, you know, ecosystem, connecting the dots, moving towards solutions, you know, that's the big thing. And it's, you know, in a way it's kind of boring relative to other re:Invents, which are like, oh wow, oh my god, they announced Outposts. So you don't see anything like that. It's more taking the platform to the next level, which is a good thing. >>The next level, it is a good thing. Speaking of next level, we have a couple of next-level guests from T-Mobile joining us. We're gonna be talking through their customer story, their business transformation with AWS. Glen Kurisingal joins us, director, product and technology, and Nick Criss, senior manager, product and technology. Guys, welcome. Great to have you. On brand, you're on T-Mobile brand. I love it. >>Yeah. >>I mean, we are always T-Mobile. >>I love it. So, everyone knows T-Mobile. Glen, you guys are in the digital commerce domain. 
Talk to us about what that is, what functions that delivers for T-Mobile. >>Yeah, so the digital commerce domain operates and runs a platform called the digital commerce platform. What this essentially is, it's a set of APIs that are headless that power the shopping experiences. When you talk about shopping experiences at T-Mobile, a customer comes to either a T-Mobile website or goes to a store. And what they do is they start with the discovery process of a phone. They take it through the process, they decide to purchase the phone, they add the phone to cart, and then eventually they decide to, you know, basically pull the trigger and buy the phone, at which point they submit the order. So that whole experience, essentially from start to finish, is powered by the digital commerce platform. Just this year we have processed well over three and a half million orders, amounting to a billion and a half dollars' worth of business for T-Mobile. >>Wow. Big outcomes. Nick, talk about the before stage. Obviously the customer experience is absolutely critical because if it goes awry, people churn. We know that, and, you know, brand reputation is at stake. Yep. Talk about some of the challenges that you guys faced before, and how did you work with AWS and its partner ecosystem to address those challenges? >>Sure. Yeah. So actually before I started working with Glen on the commerce domain, I was part of T-Mobile's cloud team. So we were the team that kind of brought in AWS, and the commerce platform was really the first tier-one system to go a hundred percent cloud native. And so for us it was very much a learning experience and a journey to learn how to operate on the cloud, which was fundamentally different from how we were doing things in the old on-prem days. 
>>When you talk about headless APIs, I dunno if you saw Werner Vogels' keynote this morning, but you're talking about a loosely coupled system that you can evolve without ripping out the whole system or without bringing the whole system down. Can you explain that in a little bit more detail? >>Absolutely. So the concept of headless APIs opens up exactly that possibility. What it allows us to do is to build and operate a platform that runs sort of loosely coupled from the user experiences. So when you think about this from a simplistic standpoint, you have a set of APIs that are headless, and you've got the website that connects to it, the retail store applications that connect to it, as well as the customer care applications that connect to it. And essentially what that does is it allows us to basically operate all these platforms without being sort of tightly coupled to each other. >>Yeah, he was talking about this this morning. When AWS announced S3, you know, there was just a handful of services, maybe just two or three. I think now there's 200, and, you know, it's never gone down, it's never been, you know, replaced essentially. And so, you know, the whole thing was it's an asynchronous system that's loosely coupled, and then you create that illusion of synchronicity for the customer. >>Exactly. >>Which was, I thought, you know, really well described. But maybe you guys could talk about what the genesis was for this system. Take us kind of from the before to the after, you know, the classic as-was and as-is. Can you talk about that? >>Yeah, I can start and then hand it off to Nick for some more details. So we started this journey back in 2016, and at that point T-Mobile had seven or eight different commerce platforms. Obviously you can think about the complexity involved in running and operating those platforms. We've all talked about T-Mobile being the Un-carrier. 
It's a brand that we have basically popularized in the telco industry. We would come out with these massive Un-carrier moves, and every time that announcement was made, teams had to scramble because you've got seven systems, seven teams, every single system needs to be updated, right? So that's where we started when we kicked off this transformational journey. Over time, essentially, we have brought it down to one platform that supports all these experiences, and what that allows us to do is, not only does time to market get reduced immensely, but it also allows us to basically reduce our operational cost, 'cause we don't have to have teams running seven, eight systems. It's just one system with one team that can focus on making it a world-class, you know, platform. >>Yeah, I think one of the strategies that definitely paid off for us, 'cause going all the way back to the beginning, our little platform was powering just a tiny little corner of the web space, right? But even in those days we approached it from, we're gonna build functions in a way that is sort of agnostic to what the experience is gonna be. So over time, as we would build a capability that one particular channel needed primarily, we were still thinking about all the other channels that needed it. So now, over a few years, that investment pays off and you have basically the same capabilities working in the same way across all the channels. >>When did the journey start? >>2016. >>2016, yeah. It's been six years. >>What are some of the game changers in this business transformation, where you would say these are some of the things that really ignited our transformation? >>Yeah, there's particularly one thing that we feel pretty proud about, which is the fact that we now operate what we call active-active stacks. 
And what that means is you've got a single stack of the e-commerce platform, start to finish, that can run in an independent manner, but we can also start adding additional stacks that are basically loosely coupled from each other but can run to support the business. What that basically enables is it allows us to run in active-active mode, which itself is a big deal from a system uptime perspective. It really changes the game. It allows us to push releases without worrying about any kind of downtime. We've done canary releases. We are in the middle of retail season and we can introduce changes without worrying about it. And more importantly, I think what it has also allowed us to do is essentially practice disaster recovery while doing a release. 'Cause that's exactly what we do: every time we do a release we are switching between these separate stacks and essentially are practicing our DR strategy. >>So you do this, you separate across regions, I presume? Yes. Is that right? Yes. This is a really interesting conversation because, as you well know, in the on-prem world you never tested disaster recovery, it was too risky, because you're afraid you're gonna take your whole business down, and you're essentially saying that the testing is fundamental to the implementation. >>Absolutely. >>It is the thing that you do for every release. So you know, at least every week or so you are doing this. And, you know, in the old world, the active-passive world, on paper you had a bunch of capabilities, and in incidents that are even less than, say, a full disaster recovery scenario, you would end up making the choice not to use that capability because there was too much complexity or risk or problem. When we put this in place, now I tell people everything we do got easier after that. >>Is it a challenge for you, or how do you deal with the challenge, correct me if it's not a challenge, that sometimes Amazon services are not available in both regions? 
I think, for instance, the observability thing that they just announced this week, it's not cross-region, or maybe I'm getting that wrong, but there are services where, you know, you might not be able to do data sharing across regions. How do you manage that? Or maybe there's different, you know, levels of certifications. How do you manage that discontinuity, or is that not an issue for you? >>Yeah, I mean, it is certainly a concern, and so the stacks, like Glen said, they are largely decoupled, and what that means is practically every component, and there's a lot of components in there, has redundancy from an availability zone point of view. But then where the real magic happens is when you come in as a user to the stack, we're gonna initially kind of lock you on one stack. And then the key thing that we do is we understand the difference between what we would call the critical data, so think of like your shopping carts, and then contextual data that we can relatively easily reload if we need to. And so that critical data is constantly, in an async fashion, so it's not interrupting your performance, being broadcast out to a place where we can recover it if we need to, if we need to send you to another stack, and we call that dehydration. And if you end up getting bumped to a new stack, we rehydrate you on that stack and reload that contextual data. So to make that whole thing happen, we rely on something we call the global cart store, and that's basically powered by Dynamo. So Dynamo is highly, highly reliable and multi-region. >>So, and I presume you're doing some form of serverless for the stateless stuff, and maybe taking control of the runtime for the stateful things? Are you leaning into serverless and Lambda, or not yet 'cause you want control over the EC2 and the memory configs? I mean, I know we're going inside the plumbing a little bit, but it's kind of fun. >>That's always fun. 
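The failover flow described in this answer (pin a user to one stack, continuously copy the critical cart data to a shared store, and rebuild the session on another stack if the user gets bumped) can be sketched roughly as follows. This is a simplified model under stated assumptions: `Stack`, `Router`, and `_load_context` are hypothetical names, a plain dictionary stands in for the global cart store, and the copy to that store is synchronous here even though the speaker describes it as asynchronous.

```python
class Stack:
    """One independent copy of the commerce platform (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.sessions = {}  # user_id -> {"cart": [...], "context": {...}}


class Router:
    def __init__(self, stacks):
        self.stacks = stacks
        self.global_cart_store = {}  # stand-in for a replicated key-value store
        self.assignment = {}         # user_id -> stack currently serving them

    def _load_context(self, user_id):
        # Contextual data is cheap to rebuild, so it is reloaded, not replicated.
        return {"user": user_id, "offers": ["placeholder-offer"]}

    def _stack_for(self, user_id):
        stack = self.assignment.get(user_id)
        if stack is None or not stack.healthy:
            stack = next(s for s in self.stacks if s.healthy)
            # Rehydrate: restore critical data, then reload contextual data.
            stack.sessions[user_id] = {
                "cart": list(self.global_cart_store.get(user_id, [])),
                "context": self._load_context(user_id),
            }
            self.assignment[user_id] = stack
        return stack

    def add_to_cart(self, user_id, item):
        stack = self._stack_for(user_id)
        stack.sessions[user_id]["cart"].append(item)
        # A real system would write this copy asynchronously; here it is synchronous.
        self.global_cart_store[user_id] = list(stack.sessions[user_id]["cart"])
        return stack.name


stack_a, stack_b = Stack("stack-a"), Stack("stack-b")
router = Router([stack_a, stack_b])
first = router.add_to_cart("user-1", "phone")
stack_a.healthy = False  # simulate a release or failure on the first stack
second = router.add_to_cart("user-1", "case")
cart_after_failover = stack_b.sessions["user-1"]["cart"]
```

The point of the split is that only the small, hard-to-recreate data (the cart) needs to travel between stacks, which keeps the stacks loosely coupled while still letting a user move between them without losing their order.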
You went... >>Yeah, and it has been a journey. Back in 2016 when we started, we were all on EC2s, and, you know, over the last three or four years we have kind of gone through that journey where we went from EC2 to containers, and at some point we'll get to where we will be serverless. We've got a few functions running. But you know, in that journey, I think when you look at the full spectrum, we are somewhere in the process of going from, you know, containers to serverless. >>Yeah. So today your team is setting up the containers, they're fencing them off, fencing off the app and doing all that sort of semi-heavy lifting. Yeah. How do you deal with the, you know, this is one of the things, Lisa, you and I were talking about, the skill sets. We always talk about this. What's that, what's your team look like, and what are the skill sets that you've got that you're deploying? >>Yeah, I mean, as you can imagine, it's a challenge, and it's a highly specialized skill set that you need. And you talk about cloud, you know, I tell developers when we bring new folks in, in the old days you could just be really good at Java and study that and be good at that for decades. But in the cloud world, you have to be wide in your breadth. And so you have to understand those 200 services, right? And so one of the things that really has helped us is we've had a partner. So UST Global is a digital services company, and they've really kind of been on the journey on the same timeline that we were. And I had worked with them on the cloud team, you know, before I came to commerce. And when I came to the commerce team, we were really struggling, especially from that operational perspective. The team was just not adapting to that new cloud reality. 
They were used to the on-prem world, but we brought these folks in because not only were they really able to understand the stuff, but they had built a lot of the platforms that we were gonna be leveraging for commerce with us on the cloud team. So for example, T-Mobile operates our own customized Kubernetes platform. We've done some stuff for serverless development, CI/CD, cloud security. And so not only did these folks have the right skill sets, but they knew how we were approaching it from a T-Mobile cloud perspective. And so it's kind of fun to see, you know, when they came on board with this journey with us, both companies were relatively new and learning. Now I look and, you know, I think that they're like a platinum sponsor of AWS here these days, and so it's kind of cool to see how we've all grown together. >>A lot of evolution, a lot of maturation. Glen, I wanna know from you, and we're almost out of time here, but the digital commerce domain, you kind of talked about this in the beginning, but I wanna know what's the value in it for me as a customer? All of this under-the-hood plumbing, the maturation, the transformation, how does it benefit me? >>Great question. So as a customer, all they care about is going to the website or walking into a store and, without spending too much time, completing that transaction and walking out. They don't care about what's under the hood, right? So this transformational journey from, you know, like I talked about, we started with EC2s back in the day, it was what we call the wild west, to a cloud-native platform, to where we have reached today. You know, the journey we have collectively traversed with UST has allowed us to basically build a system that allows a customer to walk into a store and not spend a whole hour dealing with a sales rep that's trying to sell them things. 
They can walk in and out quickly, they go to the website, literally within a couple minutes they can complete the transaction and leave. That's what customers want. It is. And that has really sort of helped us. When you think about T-Mobile and the fact that we are now poised to be a leader in the US in telco, this whole concept of systems that really empower the customers to quickly complete their transaction has been one of the key components of allowing us to kind of make that growth, right? >>Right. And a big driver of revenue. >>Exactly. >>I have one final question for each of you. We're making an Instagram reel, so think about if you had 30 seconds to describe T-Mobile as a technology company that sells phones or a technology company that delights people. What would you say? If you had a billboard, what would it say about that? Glen, what do you think? >>So T-Mobile, from a technology company perspective, the whole purpose of setting up T-Mobile's, you know, shopping experience is about bringing customers in, surprising and delighting them with frictionless shopping experiences that basically allow them to come in and complete the transaction and move on with their lives. It's not about keeping them in the store for too long when they don't want to be. And essentially the idea is to just basically surprise and delight our customers. >>Perfect. Nick, what would you say, what's your billboard about T-Mobile as a technology company that's delivering great services to its customers? >>Yeah, I think, you know, Glen really covered it well. What I would just add to that is I think the way that we are approaching it these days, really starting from that 2016 period, is we like to say we don't think of ourselves as a telco company anymore. We think of ourselves as a technology company that happens to do telco among other things, right? 
And so we've approached this from a point of view of, we're here to provide the best possible experience we can to our customers, and we take it personally when we don't reach that high bar. And so what we've done in the last few years as a transformation has really given us the toolbox that we need to be able to meet that promise. >>Awesome. Guys, it's been a pleasure having you on the program, talking about the transformation of T-Mobile. Great to hear what you're doing with AWS, the maturation, and we look forward to having you back on to see what's next. Thank you. >>Awesome. Thank you so much. >>All right, for our guests and Dave Vellante, I'm Lisa Martin. You're watching theCUBE, the leader in live enterprise and emerging tech coverage.

Published Date : Dec 1 2022




Search Results for S3: