Breaking Analysis: Databricks faces critical strategic decisions…here’s why

>> From theCUBE Studios in Palo Alto and Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. >> Spark became a top level Apache project in 2014, and then shortly thereafter, burst onto the big data scene. Spark, along with the cloud, transformed and in many ways, disrupted the big data market. Databricks optimized its tech stack for Spark and took advantage of the cloud to really cleverly deliver a managed service that has become a leading AI and data platform among data scientists and data engineers. However, emerging customer data requirements are shifting into a direction that will cause modern data platform players generally and Databricks, specifically, we think, to make some key directional decisions and perhaps even reinvent themselves. Hello and welcome to this week's wikibon theCUBE Insights, powered by ETR. In this Breaking Analysis, we're going to do a deep dive into Databricks. We'll explore its current impressive market momentum. We're going to use some ETR survey data to show that, and then we'll lay out how customer data requirements are changing and what the ideal data platform will look like in the midterm future. We'll then evaluate core elements of the Databricks portfolio against that vision, and then we'll close with some strategic decisions that we think the company faces. And to do so, we welcome in our good friend, George Gilbert, former equities analyst, market analyst, and current Principal at TechAlpha Partners. George, good to see you. Thanks for coming on. >> Good to see you, Dave. >> All right, let me set this up. We're going to start by taking a look at where Databricks sits in the market in terms of how customers perceive the company and what it's momentum looks like. And this chart that we're showing here is data from ETS, the emerging technology survey of private companies. The N is 1,421. What we did is we cut the data on three sectors, analytics, database-data warehouse, and AI/ML. The vertical axis is a measure of customer sentiment, which evaluates an IT decision maker's awareness of the firm and the likelihood of engaging and/or purchase intent. The horizontal axis shows mindshare in the dataset, and we've highlighted Databricks, which has been a consistent high performer in this survey over the last several quarters. And as we, by the way, just as aside as we previously reported, OpenAI, which burst onto the scene this past quarter, leads all names, but Databricks is still prominent. You can see that the ETR shows some open source tools for reference, but as far as firms go, Databricks is very impressively positioned. Now, let's see how they stack up to some mainstream cohorts in the data space, against some bigger companies and sometimes public companies. This chart shows net score on the vertical axis, which is a measure of spending momentum and pervasiveness in the data set is on the horizontal axis. You can see that chart insert in the upper right, that informs how the dots are plotted, and net score against shared N. And that red dotted line at 40% indicates a highly elevated net score, anything above that we think is really, really impressive. And here we're just comparing Databricks with Snowflake, Cloudera, and Oracle. And that squiggly line leading to Databricks shows their path since 2021 by quarter. And you can see it's performing extremely well, maintaining an elevated net score and net range. Now it's comparable in the vertical axis to Snowflake, and it consistently is moving to the right and gaining share. Now, why did we choose to show Cloudera and Oracle? The reason is that Cloudera got the whole big data era started and was disrupted by Spark. And of course the cloud, Spark and Databricks and Oracle in many ways, was the target of early big data players like Cloudera. Take a listen to Cloudera CEO at the time, Mike Olson. This is back in 2010, first year of theCUBE, play the clip. >> Look, back in the day, if you had a data problem, if you needed to run business analytics, you wrote the biggest check you could to Sun Microsystems, and you bought a great big, single box, central server, and any money that was left over, you handed to Oracle for a database licenses and you installed that database on that box, and that was where you went for data. That was your temple of information. >> Okay? So Mike Olson implied that monolithic model was too expensive and inflexible, and Cloudera set out to fix that. But the best laid plans, as they say, George, what do you make of the data that we just shared? >> So where Databricks has really come up out of sort of Cloudera's tailpipe was they took big data processing, made it coherent, made it a managed service so it could run in the cloud. So it relieved customers of the operational burden. Where they're really strong and where their traditional meat and potatoes or bread and butter is the predictive and prescriptive analytics that building and training and serving machine learning models. They've tried to move into traditional business intelligence, the more traditional descriptive and diagnostic analytics, but they're less mature there. So what that means is, the reason you see Databricks and Snowflake kind of side by side is there are many, many accounts that have both Snowflake for business intelligence, Databricks for AI machine learning, where Snowflake, I'm sorry, where Databricks also did really well was in core data engineering, refining the data, the old ETL process, which kind of turned into ELT, where you loaded into the analytic repository in raw form and refine it. And so people have really used both, and each is trying to get into the other. >> Yeah, absolutely. We've reported on this quite a bit. Snowflake, kind of moving into the domain of Databricks and vice versa. And the last bit of ETR evidence that we want to share in terms of the company's momentum comes from ETR's Round Tables. They're run by Erik Bradley, and now former Gartner analyst and George, your colleague back at Gartner, Daren Brabham. And what we're going to show here is some direct quotes of IT pros in those Round Tables. There's a data science head and a CIO as well. Just make a few call outs here, we won't spend too much time on it, but starting at the top, like all of us, we can't talk about Databricks without mentioning Snowflake. Those two get us excited. Second comment zeros in on the flexibility and the robustness of Databricks from a data warehouse perspective. And then the last point is, despite competition from cloud players, Databricks has reinvented itself a couple of times over the year. And George, we're going to lay out today a scenario that perhaps calls for Databricks to do that once again. >> Their big opportunity and their big challenge for every tech company, it's managing a technology transition. The transition that we're talking about is something that's been bubbling up, but it's really epical. First time in 60 years, we're moving from an application-centric view of the world to a data-centric view, because decisions are becoming more important than automating processes. So let me let you sort of develop. >> Yeah, so let's talk about that here. We going to put up some bullets on precisely that point and the changing sort of customer environment. So you got IT stacks are shifting is George just said, from application centric silos to data centric stacks where the priority is shifting from automating processes to automating decision. You know how look at RPA and there's still a lot of automation going on, but from the focus of that application centricity and the data locked into those apps, that's changing. Data has historically been on the outskirts in silos, but organizations, you think of Amazon, think Uber, Airbnb, they're putting data at the core, and logic is increasingly being embedded in the data instead of the reverse. In other words, today, the data's locked inside the app, which is why you need to extract that data is sticking it to a data warehouse. The point, George, is we're putting forth this new vision for how data is going to be used. And you've used this Uber example to underscore the future state. Please explain? >> Okay, so this is hopefully an example everyone can relate to. The idea is first, you're automating things that are happening in the real world and decisions that make those things happen autonomously without humans in the loop all the time. So to use the Uber example on your phone, you call a car, you call a driver. Automatically, the Uber app then looks at what drivers are in the vicinity, what drivers are free, matches one, calculates an ETA to you, calculates a price, calculates an ETA to your destination, and then directs the driver once they're there. The point of this is that that cannot happen in an application-centric world very easily because all these little apps, the drivers, the riders, the routes, the fares, those call on data locked up in many different apps, but they have to sit on a layer that makes it all coherent. >> But George, so if Uber's doing this, doesn't this tech already exist? Isn't there a tech platform that does this already? >> Yes, and the mission of the entire tech industry is to build services that make it possible to compose and operate similar platforms and tools, but with the skills of mainstream developers in mainstream corporations, not the rocket scientists at Uber and Amazon. >> Okay, so we're talking about horizontally scaling across the industry, and actually giving a lot more organizations access to this technology. So by way of review, let's summarize the trend that's going on today in terms of the modern data stack that is propelling the likes of Databricks and Snowflake, which we just showed you in the ETR data and is really is a tailwind form. So the trend is toward this common repository for analytic data, that could be multiple virtual data warehouses inside of Snowflake, but you're in that Snowflake environment or Lakehouses from Databricks or multiple data lakes. And we've talked about what JP Morgan Chase is doing with the data mesh and gluing data lakes together, you've got various public clouds playing in this game, and then the data is annotated to have a common meaning. In other words, there's a semantic layer that enables applications to talk to the data elements and know that they have common and coherent meaning. So George, the good news is this approach is more effective than the legacy monolithic models that Mike Olson was talking about, so what's the problem with this in your view? >> So today's data platforms added immense value 'cause they connected the data that was previously locked up in these monolithic apps or on all these different microservices, and that supported traditional BI and AI/ML use cases. But now if we want to build apps like Uber or Amazon.com, where they've got essentially an autonomously running supply chain and e-commerce app where humans only care and feed it. But the thing is figuring out what to buy, when to buy, where to deploy it, when to ship it. We needed a semantic layer on top of the data. So that, as you were saying, the data that's coming from all those apps, the different apps that's integrated, not just connected, but it means the same. And the issue is whenever you add a new layer to a stack to support new applications, there are implications for the already existing layers, like can they support the new layer and its use cases? So for instance, if you add a semantic layer that embeds app logic with the data rather than vice versa, which we been talking about and that's been the case for 60 years, then the new data layer faces challenges that the way you manage that data, the way you analyze that data, is not supported by today's tools. >> Okay, so actually Alex, bring me up that last slide if you would, I mean, you're basically saying at the bottom here, today's repositories don't really do joins at scale. The future is you're talking about hundreds or thousands or millions of data connections, and today's systems, we're talking about, I don't know, 6, 8, 10 joins and that is the fundamental problem you're saying, is a new data error coming and existing systems won't be able to handle it? >> Yeah, one way of thinking about it is that even though we call them relational databases, when we actually want to do lots of joins or when we want to analyze data from lots of different tables, we created a whole new industry for analytic databases where you sort of mung the data together into fewer tables. So you didn't have to do as many joins because the joins are difficult and slow. And when you're going to arbitrarily join thousands, hundreds of thousands or across millions of elements, you need a new type of database. We have them, they're called graph databases, but to query them, you go back to the prerelational era in terms of their usability. >> Okay, so we're going to come back to that and talk about how you get around that problem. But let's first lay out what the ideal data platform of the future we think looks like. And again, we're going to come back to use this Uber example. In this graphic that George put together, awesome. We got three layers. The application layer is where the data products reside. The example here is drivers, rides, maps, routes, ETA, et cetera. The digital version of what we were talking about in the previous slide, people, places and things. The next layer is the data layer, that breaks down the silos and connects the data elements through semantics and everything is coherent. And then the bottom layers, the legacy operational systems feed that data layer. George, explain what's different here, the graph database element, you talk about the relational query capabilities, and why can't I just throw memory at solving this problem? >> Some of the graph databases do throw memory at the problem and maybe without naming names, some of them live entirely in memory. And what you're dealing with is a prerelational in-memory database system where you navigate between elements, and the issue with that is we've had SQL for 50 years, so we don't have to navigate, we can say what we want without how to get it. That's the core of the problem. >> Okay. So if I may, I just want to drill into this a little bit. So you're talking about the expressiveness of a graph. Alex, if you'd bring that back out, the fourth bullet, expressiveness of a graph database with the relational ease of query. Can you explain what you mean by that? >> Yeah, so graphs are great because when you can describe anything with a graph, that's why they're becoming so popular. Expressive means you can represent anything easily. They're conducive to, you might say, in a world where we now want like the metaverse, like with a 3D world, and I don't mean the Facebook metaverse, I mean like the business metaverse when we want to capture data about everything, but we want it in context, we want to build a set of digital twins that represent everything going on in the world. And Uber is a tiny example of that. Uber built a graph to represent all the drivers and riders and maps and routes. But what you need out of a database isn't just a way to store stuff and update stuff. You need to be able to ask questions of it, you need to be able to query it. And if you go back to prerelational days, you had to know how to find your way to the data. It's sort of like when you give directions to someone and they didn't have a GPS system and a mapping system, you had to give them turn by turn directions. Whereas when you have a GPS and a mapping system, which is like the relational thing, you just say where you want to go, and it spits out the turn by turn directions, which let's say, the car might follow or whoever you're directing would follow. But the point is, it's much easier in a relational database to say, "I just want to get these results. You figure out how to get it." The graph database, they have not taken over the world because in some ways, it's taking a 50 year leap backwards. >> Alright, got it. Okay. Let's take a look at how the current Databricks offerings map to that ideal state that we just laid out. So to do that, we put together this chart that looks at the key elements of the Databricks portfolio, the core capability, the weakness, and the threat that may loom. Start with the Delta Lake, that's the storage layer, which is great for files and tables. It's got true separation of compute and storage, I want you to double click on that George, as independent elements, but it's weaker for the type of low latency ingest that we see coming in the future. And some of the threats highlighted here. AWS could add transactional tables to S3, Iceberg adoption is picking up and could accelerate, that could disrupt Databricks. George, add some color here please? >> Okay, so this is the sort of a classic competitive forces where you want to look at, so what are customers demanding? What's competitive pressure? What are substitutes? Even what your suppliers might be pushing. Here, Delta Lake is at its core, a set of transactional tables that sit on an object store. So think of it in a database system, this is the storage engine. So since S3 has been getting stronger for 15 years, you could see a scenario where they add transactional tables. We have an open source alternative in Iceberg, which Snowflake and others support. But at the same time, Databricks has built an ecosystem out of tools, their own and others, that read and write to Delta tables, that's what makes the Delta Lake and ecosystem. So they have a catalog, the whole machine learning tool chain talks directly to the data here. That was their great advantage because in the past with Snowflake, you had to pull all the data out of the database before the machine learning tools could work with it, that was a major shortcoming. They fixed that. But the point here is that even before we get to the semantic layer, the core foundation is under threat. >> Yep. Got it. Okay. We got a lot of ground to cover. So we're going to take a look at the Spark Execution Engine next. Think of that as the refinery that runs really efficient batch processing. That's kind of what disrupted the DOOp in a large way, but it's not Python friendly and that's an issue because the data science and the data engineering crowd are moving in that direction, and/or they're using DBT. George, we had Tristan Handy on at Supercloud, really interesting discussion that you and I did. Explain why this is an issue for Databricks? >> So once the data lake was in place, what people did was they refined their data batch, and Spark has always had streaming support and it's gotten better. The underlying storage as we've talked about is an issue. But basically they took raw data, then they refined it into tables that were like customers and products and partners. And then they refined that again into what was like gold artifacts, which might be business intelligence metrics or dashboards, which were collections of metrics. But they were running it on the Spark Execution Engine, which it's a Java-based engine or it's running on a Java-based virtual machine, which means all the data scientists and the data engineers who want to work with Python are really working in sort of oil and water. Like if you get an error in Python, you can't tell whether the problems in Python or where it's in Spark. There's just an impedance mismatch between the two. And then at the same time, the whole world is now gravitating towards DBT because it's a very nice and simple way to compose these data processing pipelines, and people are using either SQL in DBT or Python in DBT, and that kind of is a substitute for doing it all in Spark. So it's under threat even before we get to that semantic layer, it so happens that DBT itself is becoming the authoring environment for the semantic layer with business intelligent metrics. But that's again, this is the second element that's under direct substitution and competitive threat. >> Okay, let's now move down to the third element, which is the Photon. Photon is Databricks' BI Lakehouse, which has integration with the Databricks tooling, which is very rich, it's newer. And it's also not well suited for high concurrency and low latency use cases, which we think are going to increasingly become the norm over time. George, the call out threat here is customers want to connect everything to a semantic layer. Explain your thinking here and why this is a potential threat to Databricks? >> Okay, so two issues here. What you were touching on, which is the high concurrency, low latency, when people are running like thousands of dashboards and data is streaming in, that's a problem because SQL data warehouse, the query engine, something like that matures over five to 10 years. It's one of these things, the joke that Andy Jassy makes just in general, he's really talking about Azure, but there's no compression algorithm for experience. The Snowflake guy started more than five years earlier, and for a bunch of reasons, that lead is not something that Databricks can shrink. They'll always be behind. So that's why Snowflake has transactional tables now and we can get into that in another show. But the key point is, so near term, it's struggling to keep up with the use cases that are core to business intelligence, which is highly concurrent, lots of users doing interactive query. But then when you get to a semantic layer, that's when you need to be able to query data that might have thousands or tens of thousands or hundreds of thousands of joins. And that's a SQL query engine, traditional SQL query engine is just not built for that. That's the core problem of traditional relational databases. >> Now this is a quick aside. We always talk about Snowflake and Databricks in sort of the same context. We're not necessarily saying that Snowflake is in a position to tackle all these problems. We'll deal with that separately. So we don't mean to imply that, but we're just sort of laying out some of the things that Snowflake or rather Databricks customers we think, need to be thinking about and having conversations with Databricks about and we hope to have them as well. We'll come back to that in terms of sort of strategic options. But finally, when come back to the table, we have Databricks' AI/ML Tool Chain, which has been an awesome capability for the data science crowd. It's comprehensive, it's a one-stop shop solution, but the kicker here is that it's optimized for supervised model building. And the concern is that foundational models like GPT could cannibalize the current Databricks tooling, but George, can't Databricks, like other software companies, integrate foundation model capabilities into its platform? >> Okay, so the sound bite answer to that is sure, IBM 3270 terminals could call out to a graphical user interface when they're running on the XT terminal, but they're not exactly good citizens in that world. The core issue is Databricks has this wonderful end-to-end tool chain for training, deploying, monitoring, running inference on supervised models. But the paradigm there is the customer builds and trains and deploys each model for each feature or application. In a world of foundation models which are pre-trained and unsupervised, the entire tool chain is different. So it's not like Databricks can junk everything they've done and start over with all their engineers. They have to keep maintaining what they've done in the old world, but they have to build something new that's optimized for the new world. It's a classic technology transition and their mentality appears to be, "Oh, we'll support the new stuff from our old stuff." Which is suboptimal, and as we'll talk about, their biggest patron and the company that put them on the map, Microsoft, really stopped working on their old stuff three years ago so that they could build a new tool chain optimized for this new world. >> Yeah, and so let's sort of close with what we think the options are and decisions that Databricks has for its future architecture. They're smart people. I mean we've had Ali Ghodsi on many times, super impressive. I think they've got to be keenly aware of the limitations, what's going on with foundation models. But at any rate, here in this chart, we lay out sort of three scenarios. One is re-architect the platform by incrementally adopting new technologies. And example might be to layer a graph query engine on top of its stack. They could license key technologies like graph database, they could get aggressive on M&A and buy-in, relational knowledge graphs, semantic technologies, vector database technologies. George, as David Floyer always says, "A lot of ways to skin a cat." We've seen companies like, even think about EMC maintained its relevance through M&A for many, many years. George, give us your thought on each of these strategic options? >> Okay, I find this question the most challenging 'cause remember, I used to be an equity research analyst. I worked for Frank Quattrone, we were one of the top tech shops in the banking industry, although this is 20 years ago. But the M&A team was the top team in the industry and everyone wanted them on their side. And I remember going to meetings with these CEOs, where Frank and the bankers would say, "You want us for your M&A work because we can do better." And they really could do better. But in software, it's not like with EMC in hardware because with hardware, it's easier to connect different boxes. With software, the whole point of a software company is to integrate and architect the components so they fit together and reinforce each other, and that makes M&A harder. You can do it, but it takes a long time to fit the pieces together. Let me give you examples. If they put a graph query engine, let's say something like TinkerPop, on top of, I don't even know if it's possible, but let's say they put it on top of Delta Lake, then you have this graph query engine talking to their storage layer, Delta Lake. But if you want to do analysis, you got to put the data in Photon, which is not really ideal for highly connected data. If you license a graph database, then most of your data is in the Delta Lake and how do you sync it with the graph database? If you do sync it, you've got data in two places, which kind of defeats the purpose of having a unified repository. I find this semantic layer option in number three actually more promising, because that's something that you can layer on top of the storage layer that you have already. You just have to figure out then how to have your query engines talk to that. What I'm trying to highlight is, it's easy as an analyst to say, "You can buy this company or license that technology." But the really hard work is making it all work together and that is where the challenge is. >> Yeah, and well look, I thank you for laying that out. We've seen it, certainly Microsoft and Oracle. I guess you might argue that well, Microsoft had a monopoly in its desktop software and was able to throw off cash for a decade plus while it's stock was going sideways. Oracle had won the database wars and had amazing margins and cash flow to be able to do that. Databricks isn't even gone public yet, but I want to close with some of the players to watch. Alex, if you'd bring that back up, number four here. AWS, we talked about some of their options with S3 and it's not just AWS, it's blob storage, object storage. Microsoft, as you sort of alluded to, was an early go-to market channel for Databricks. We didn't address that really. So maybe in the closing comments we can. Google obviously, Snowflake of course, we're going to dissect their options in future Breaking Analysis. Dbt labs, where do they fit? Bob Muglia's company, Relational.ai, why are these players to watch George, in your opinion? >> So everyone is trying to assemble and integrate the pieces that would make building data applications, data products easy. And the critical part isn't just assembling a bunch of pieces, which is traditionally what AWS did. It's a Unix ethos, which is we give you the tools, you put 'em together, 'cause you then have the maximum choice and maximum power. So what the hyperscalers are doing is they're taking their key value stores, in the case of ASW it's DynamoDB, in the case of Azure it's Cosmos DB, and each are putting a graph query engine on top of those. So they have a unified storage and graph database engine, like all the data would be collected in the key value store. Then you have a graph database, that's how they're going to be presenting a foundation for building these data apps. Dbt labs is putting a semantic layer on top of data lakes and data warehouses and as we'll talk about, I'm sure in the future, that makes it easier to swap out the underlying data platform or swap in new ones for specialized use cases. Snowflake, what they're doing, they're so strong in data management and with their transactional tables, what they're trying to do is take in the operational data that used to be in the province of many state stores like MongoDB and say, "If you manage that data with us, it'll be connected to your analytic data without having to send it through a pipeline." And that's hugely valuable. Relational.ai is the wildcard, 'cause what they're trying to do, it's almost like a holy grail where you're trying to take the expressiveness of connecting all your data in a graph but making it as easy to query as you've always had it in a SQL database or I should say, in a relational database. And if they do that, it's sort of like, it'll be as easy to program these data apps as a spreadsheet was compared to procedural languages, like BASIC or Pascal. That's the implications of Relational.ai. >> Yeah, and again, we talked before, why can't you just throw this all in memory? We're talking in that example of really getting down to differences in how you lay the data out on disk in really, new database architecture, correct? >> Yes. And that's why it's not clear that you could take a data lake or even a Snowflake and why you can't put a relational knowledge graph on those. You could potentially put a graph database, but it'll be compromised because to really do what Relational.ai has done, which is the ease of Relational on top of the power of graph, you actually need to change how you're storing your data on disk or even in memory. So you can't, in other words, it's not like, oh we can add graph support to Snowflake, 'cause if you did that, you'd have to change, or in your data lake, you'd have to change how the data is physically laid out. And then that would break all the tools that talk to that currently. >> What in your estimation, is the timeframe where this becomes critical for a Databricks and potentially Snowflake and others? I mentioned earlier midterm, are we talking three to five years here? Are we talking end of decade? What's your radar say? >> I think something surprising is going on that's going to sort of come up the tailpipe and take everyone by storm. All the hype around business intelligence metrics, which is what we used to put in our dashboards where bookings, billings, revenue, customer, those things, those were the key artifacts that used to live in definitions in your BI tools, and DBT has basically created a standard for defining those so they live in your data pipeline or they're defined in their data pipeline and executed in the data warehouse or data lake in a shared way, so that all tools can use them. This sounds like a digression, it's not. All this stuff about data mesh, data fabric, all that's going on is we need a semantic layer and the business intelligence metrics are defining common semantics for your data. And I think we're going to find by the end of this year, that metrics are how we annotate all our analytic data to start adding common semantics to it. And we're going to find this semantic layer, it's not three to five years off, it's going to be staring us in the face by the end of this year. >> Interesting. And of course SVB today was shut down. We're seeing serious tech headwinds, and oftentimes in these sort of downturns or flat turns, which feels like this could be going on for a while, we emerge with a lot of new players and a lot of new technology. George, we got to leave it there. Thank you to George Gilbert for excellent insights and input for today's episode. I want to thank Alex Myerson who's on production and manages the podcast, of course Ken Schiffman as well. Kristin Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our EIC over at Siliconangle.com, he does some great editing. Remember all these episodes, they're available as podcasts. Wherever you listen, all you got to do is search Breaking Analysis Podcast, we publish each week on wikibon.com and siliconangle.com, or you can email me at David.Vellante@siliconangle.com, or DM me @DVellante. Comment on our LinkedIn post, and please do check out ETR.ai, great survey data, enterprise tech focus, phenomenal. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching, and we'll see you next time on Breaking Analysis.

Published Date : Mar 10 2023

SUMMARY :

bringing you data-driven core elements of the Databricks portfolio and pervasiveness in the data and that was where you went for data. and Cloudera set out to fix that. the reason you see and the robustness of Databricks and their big challenge and the data locked into in the real world and decisions Yes, and the mission of that is propelling the likes that the way you manage that data, is the fundamental problem because the joins are difficult and slow. and connects the data and the issue with that is the fourth bullet, expressiveness and it spits out the and the threat that may loom. because in the past with Snowflake, Think of that as the refinery So once the data lake was in place, George, the call out threat here But the key point is, in sort of the same context. and the company that put One is re-architect the platform and architect the components some of the players to watch. in the case of ASW it's DynamoDB, and why you can't put a relational and executed in the data and manages the podcast, of

ENTITIES

Entity	Category	Confidence
Alex Myerson	PERSON	0.99+
David Floyer	PERSON	0.99+
Mike Olson	PERSON	0.99+
2014	DATE	0.99+
George Gilbert	PERSON	0.99+
Dave Vellante	PERSON	0.99+
George	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
Ken Schiffman	PERSON	0.99+
Andy Jassy	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Erik Bradley	PERSON	0.99+
Dave	PERSON	0.99+
Uber	ORGANIZATION	0.99+
thousands	QUANTITY	0.99+
Sun Microsystems	ORGANIZATION	0.99+
50 years	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Bob Muglia	PERSON	0.99+
Gartner	ORGANIZATION	0.99+
Airbnb	ORGANIZATION	0.99+
60 years	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
Ali Ghodsi	PERSON	0.99+
2010	DATE	0.99+
Databricks	ORGANIZATION	0.99+
Kristin Martin	PERSON	0.99+
Rob Hof	PERSON	0.99+
three	QUANTITY	0.99+
15 years	QUANTITY	0.99+
Databricks'	ORGANIZATION	0.99+
two places	QUANTITY	0.99+
Boston	LOCATION	0.99+
Tristan Handy	PERSON	0.99+
M&A	ORGANIZATION	0.99+
Frank Quattrone	PERSON	0.99+
second element	QUANTITY	0.99+
Daren Brabham	PERSON	0.99+
TechAlpha Partners	ORGANIZATION	0.99+
third element	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+
50 year	QUANTITY	0.99+
40%	QUANTITY	0.99+
Cloudera	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
five years	QUANTITY	0.99+

Robert Nishihara, Anyscale | AWS Startup Showcase S3 E1

(upbeat music) >> Hello everyone. Welcome to theCube's presentation of the "AWS Startup Showcase." The topic this episode is AI and machine learning, top startups building foundational model infrastructure. This is season three, episode one of the ongoing series covering exciting startups from the AWS ecosystem. And this time we're talking about AI and machine learning. I'm your host, John Furrier. I'm excited I'm joined today by Robert Nishihara, who's the co-founder and CEO of a hot startup called Anyscale. He's here to talk about Ray, the open source project, Anyscale's infrastructure for foundation as well. Robert, thank you for joining us today. >> Yeah, thanks so much as well. >> I've been following your company since the founding pre pandemic and you guys really had a great vision scaled up and in a perfect position for this big wave that we all see with ChatGPT and OpenAI that's gone mainstream. Finally, AI has broken out through the ropes and now gone mainstream, so I think you guys are really well positioned. I'm looking forward to to talking with you today. But before we get into it, introduce the core mission for Anyscale. Why do you guys exist? What is the North Star for Anyscale? >> Yeah, like you mentioned, there's a tremendous amount of excitement about AI right now. You know, I think a lot of us believe that AI can transform just every different industry. So one of the things that was clear to us when we started this company was that the amount of compute needed to do AI was just exploding. Like to actually succeed with AI, companies like OpenAI or Google or you know, these companies getting a lot of value from AI, were not just running these machine learning models on their laptops or on a single machine. They were scaling these applications across hundreds or thousands or more machines and GPUs and other resources in the Cloud. And so to actually succeed with AI, and this has been one of the biggest trends in computing, maybe the biggest trend in computing in, you know, in recent history, the amount of compute has been exploding. And so to actually succeed with that AI, to actually build these scalable applications and scale the AI applications, there's a tremendous software engineering lift to build the infrastructure to actually run these scalable applications. And that's very hard to do. So one of the reasons many AI projects and initiatives fail is that, or don't make it to production, is the need for this scale, the infrastructure lift, to actually make it happen. So our goal here with Anyscale and Ray, is to make that easy, is to make scalable computing easy. So that as a developer or as a business, if you want to do AI, if you want to get value out of AI, all you need to know is how to program on your laptop. Like, all you need to know is how to program in Python. And if you can do that, then you're good to go. Then you can do what companies like OpenAI or Google do and get value out of machine learning. >> That programming example of how easy it is with Python reminds me of the early days of Cloud, when infrastructure as code was talked about was, it was just code the infrastructure programmable. That's super important. That's what AI people wanted, first program AI. That's the new trend. And I want to understand, if you don't mind explaining, the relationship that Anyscale has to these foundational models and particular the large language models, also called LLMs, was seen with like OpenAI and ChatGPT. Before you get into the relationship that you have with them, can you explain why the hype around foundational models? Why are people going crazy over foundational models? What is it and why is it so important? >> Yeah, so foundational models and foundation models are incredibly important because they enable businesses and developers to get value out of machine learning, to use machine learning off the shelf with these large models that have been trained on tons of data and that are useful out of the box. And then, of course, you know, as a business or as a developer, you can take those foundational models and repurpose them or fine tune them or adapt them to your specific use case and what you want to achieve. But it's much easier to do that than to train them from scratch. And I think there are three, for people to actually use foundation models, there are three main types of workloads or problems that need to be solved. One is training these foundation models in the first place, like actually creating them. The second is fine tuning them and adapting them to your use case. And the third is serving them and actually deploying them. Okay, so Ray and Anyscale are used for all of these three different workloads. Companies like OpenAI or Cohere that train large language models. Or open source versions like GPTJ are done on top of Ray. There are many startups and other businesses that fine tune, that, you know, don't want to train the large underlying foundation models, but that do want to fine tune them, do want to adapt them to their purposes, and build products around them and serve them, those are also using Ray and Anyscale for that fine tuning and that serving. And so the reason that Ray and Anyscale are important here is that, you know, building and using foundation models requires a huge scale. It requires a lot of data. It requires a lot of compute, GPUs, TPUs, other resources. And to actually take advantage of that and actually build these scalable applications, there's a lot of infrastructure that needs to happen under the hood. And so you can either use Ray and Anyscale to take care of that and manage the infrastructure and solve those infrastructure problems. Or you can build the infrastructure and manage the infrastructure yourself, which you can do, but it's going to slow your team down. It's going to, you know, many of the businesses we work with simply don't want to be in the business of managing infrastructure and building infrastructure. They want to focus on product development and move faster. >> I know you got a keynote presentation we're going to go to in a second, but I think you hit on something I think is the real tipping point, doing it yourself, hard to do. These are things where opportunities are and the Cloud did that with data centers. Turned a data center and made it an API. The heavy lifting went away and went to the Cloud so people could be more creative and build their product. In this case, build their creativity. Is that kind of what's the big deal? Is that kind of a big deal happening that you guys are taking the learnings and making that available so people don't have to do that? >> That's exactly right. So today, if you want to succeed with AI, if you want to use AI in your business, infrastructure work is on the critical path for doing that. To do AI, you have to build infrastructure. You have to figure out how to scale your applications. That's going to change. We're going to get to the point, and you know, with Ray and Anyscale, we're going to remove the infrastructure from the critical path so that as a developer or as a business, all you need to focus on is your application logic, what you want the the program to do, what you want your application to do, how you want the AI to actually interface with the rest of your product. Now the way that will happen is that Ray and Anyscale will still, the infrastructure work will still happen. It'll just be under the hood and taken care of by Ray in Anyscale. And so I think something like this is really necessary for AI to reach its potential, for AI to have the impact and the reach that we think it will, you have to make it easier to do. >> And just for clarification to point out, if you don't mind explaining the relationship of Ray and Anyscale real quick just before we get into the presentation. >> So Ray is an open source project. We created it. We were at Berkeley doing machine learning. We started Ray so that, in order to provide an easy, a simple open source tool for building and running scalable applications. And Anyscale is the managed version of Ray, basically we will run Ray for you in the Cloud, provide a lot of tools around the developer experience and managing the infrastructure and providing more performance and superior infrastructure. >> Awesome. I know you got a presentation on Ray and Anyscale and you guys are positioning as the infrastructure for foundational models. So I'll let you take it away and then when you're done presenting, we'll come back, I'll probably grill you with a few questions and then we'll close it out so take it away. >> Robert: Sounds great. So I'll say a little bit about how companies are using Ray and Anyscale for foundation models. The first thing I want to mention is just why we're doing this in the first place. And the underlying observation, the underlying trend here, and this is a plot from OpenAI, is that the amount of compute needed to do machine learning has been exploding. It's been growing at something like 35 times every 18 months. This is absolutely enormous. And other people have written papers measuring this trend and you get different numbers. But the point is, no matter how you slice and dice it, it' a astronomical rate. Now if you compare that to something we're all familiar with, like Moore's Law, which says that, you know, the processor performance doubles every roughly 18 months, you can see that there's just a tremendous gap between the needs, the compute needs of machine learning applications, and what you can do with a single chip, right. So even if Moore's Law were continuing strong and you know, doing what it used to be doing, even if that were the case, there would still be a tremendous gap between what you can do with the chip and what you need in order to do machine learning. And so given this graph, what we've seen, and what has been clear to us since we started this company, is that doing AI requires scaling. There's no way around it. It's not a nice to have, it's really a requirement. And so that led us to start Ray, which is the open source project that we started to make it easy to build these scalable Python applications and scalable machine learning applications. And since we started the project, it's been adopted by a tremendous number of companies. Companies like OpenAI, which use Ray to train their large models like ChatGPT, companies like Uber, which run all of their deep learning and classical machine learning on top of Ray, companies like Shopify or Spotify or Instacart or Lyft or Netflix, ByteDance, which use Ray for their machine learning infrastructure. Companies like Ant Group, which makes Alipay, you know, they use Ray across the board for fraud detection, for online learning, for detecting money laundering, you know, for graph processing, stream processing. Companies like Amazon, you know, run Ray at a tremendous scale and just petabytes of data every single day. And so the project has seen just enormous adoption since, over the past few years. And one of the most exciting use cases is really providing the infrastructure for building training, fine tuning, and serving foundation models. So I'll say a little bit about, you know, here are some examples of companies using Ray for foundation models. Cohere trains large language models. OpenAI also trains large language models. You can think about the workloads required there are things like supervised pre-training, also reinforcement learning from human feedback. So this is not only the regular supervised learning, but actually more complex reinforcement learning workloads that take human input about what response to a particular question, you know is better than a certain other response. And incorporating that into the learning. There's open source versions as well, like GPTJ also built on top of Ray as well as projects like Alpa coming out of UC Berkeley. So these are some of the examples of exciting projects in organizations, training and creating these large language models and serving them using Ray. Okay, so what actually is Ray? Well, there are two layers to Ray. At the lowest level, there's the core Ray system. This is essentially low level primitives for building scalable Python applications. Things like taking a Python function or a Python class and executing them in the cluster setting. So Ray core is extremely flexible and you can build arbitrary scalable applications on top of Ray. So on top of Ray, on top of the core system, what really gives Ray a lot of its power is this ecosystem of scalable libraries. So on top of the core system you have libraries, scalable libraries for ingesting and pre-processing data, for training your models, for fine tuning those models, for hyper parameter tuning, for doing batch processing and batch inference, for doing model serving and deployment, right. And a lot of the Ray users, the reason they like Ray is that they want to run multiple workloads. They want to train and serve their models, right. They want to load their data and feed that into training. And Ray provides common infrastructure for all of these different workloads. So this is a little overview of what Ray, the different components of Ray. So why do people choose to go with Ray? I think there are three main reasons. The first is the unified nature. The fact that it is common infrastructure for scaling arbitrary workloads, from data ingest to pre-processing to training to inference and serving, right. This also includes the fact that it's future proof. AI is incredibly fast moving. And so many people, many companies that have built their own machine learning infrastructure and standardized on particular workflows for doing machine learning have found that their workflows are too rigid to enable new capabilities. If they want to do reinforcement learning, if they want to use graph neural networks, they don't have a way of doing that with their standard tooling. And so Ray, being future proof and being flexible and general gives them that ability. Another reason people choose Ray in Anyscale is the scalability. This is really our bread and butter. This is the reason, the whole point of Ray, you know, making it easy to go from your laptop to running on thousands of GPUs, making it easy to scale your development workloads and run them in production, making it easy to scale, you know, training to scale data ingest, pre-processing and so on. So scalability and performance, you know, are critical for doing machine learning and that is something that Ray provides out of the box. And lastly, Ray is an open ecosystem. You can run it anywhere. You can run it on any Cloud provider. Google, you know, Google Cloud, AWS, Asure. You can run it on your Kubernetes cluster. You can run it on your laptop. It's extremely portable. And not only that, it's framework agnostic. You can use Ray to scale arbitrary Python workloads. You can use it to scale and it integrates with libraries like TensorFlow or PyTorch or JAX or XG Boost or Hugging Face or PyTorch Lightning, right, or Scikit-learn or just your own arbitrary Python code. It's open source. And in addition to integrating with the rest of the machine learning ecosystem and these machine learning frameworks, you can use Ray along with all of the other tooling in the machine learning ecosystem. That's things like weights and biases or ML flow, right. Or you know, different data platforms like Databricks, you know, Delta Lake or Snowflake or tools for model monitoring for feature stores, all of these integrate with Ray. And that's, you know, Ray provides that kind of flexibility so that you can integrate it into the rest of your workflow. And then Anyscale is the scalable compute platform that's built on top, you know, that provides Ray. So Anyscale is a managed Ray service that runs in the Cloud. And what Anyscale does is it offers the best way to run Ray. And if you think about what you get with Anyscale, there are fundamentally two things. One is about moving faster, accelerating the time to market. And you get that by having the managed service so that as a developer you don't have to worry about managing infrastructure, you don't have to worry about configuring infrastructure. You also, it provides, you know, optimized developer workflows. Things like easily moving from development to production, things like having the observability tooling, the debug ability to actually easily diagnose what's going wrong in a distributed application. So things like the dashboards and the other other kinds of tooling for collaboration, for monitoring and so on. And then on top of that, so that's the first bucket, developer productivity, moving faster, faster experimentation and iteration. The second reason that people choose Anyscale is superior infrastructure. So this is things like, you know, cost deficiency, being able to easily take advantage of spot instances, being able to get higher GPU utilization, things like faster cluster startup times and auto scaling. Things like just overall better performance and faster scheduling. And so these are the kinds of things that Anyscale provides on top of Ray. It's the managed infrastructure. It's fast, it's like the developer productivity and velocity as well as performance. So this is what I wanted to share about Ray in Anyscale. >> John: Awesome. >> Provide that context. But John, I'm curious what you think. >> I love it. I love the, so first of all, it's a platform because that's the platform architecture right there. So just to clarify, this is an Anyscale platform, not- >> That's right. >> Tools. So you got tools in the platform. Okay, that's key. Love that managed service. Just curious, you mentioned Python multiple times, is that because of PyTorch and TensorFlow or Python's the most friendly with machine learning or it's because it's very common amongst all developers? >> That's a great question. Python is the language that people are using to do machine learning. So it's the natural starting point. Now, of course, Ray is actually designed in a language agnostic way and there are companies out there that use Ray to build scalable Java applications. But for the most part right now we're focused on Python and being the best way to build these scalable Python and machine learning applications. But, of course, down the road there always is that potential. >> So if you're slinging Python code out there and you're watching that, you're watching this video, get on Anyscale bus quickly. Also, I just, while you were giving the presentation, I couldn't help, since you mentioned OpenAI, which by the way, congratulations 'cause they've had great scale, I've noticed in their rapid growth 'cause they were the fastest company to the number of users than anyone in the history of the computer industry, so major successor, OpenAI and ChatGPT, huge fan. I'm not a skeptic at all. I think it's just the beginning, so congratulations. But I actually typed into ChatGPT, what are the top three benefits of Anyscale and came up with scalability, flexibility, and ease of use. Obviously, scalability is what you guys are called. >> That's pretty good. >> So that's what they came up with. So they nailed it. Did you have an inside prompt training, buy it there? Only kidding. (Robert laughs) >> Yeah, we hard coded that one. >> But that's the kind of thing that came up really, really quickly if I asked it to write a sales document, it probably will, but this is the future interface. This is why people are getting excited about the foundational models and the large language models because it's allowing the interface with the user, the consumer, to be more human, more natural. And this is clearly will be in every application in the future. >> Absolutely. This is how people are going to interface with software, how they're going to interface with products in the future. It's not just something, you know, not just a chat bot that you talk to. This is going to be how you get things done, right. How you use your web browser or how you use, you know, how you use Photoshop or how you use other products. Like you're not going to spend hours learning all the APIs and how to use them. You're going to talk to it and tell it what you want it to do. And of course, you know, if it doesn't understand it, it's going to ask clarifying questions. You're going to have a conversation and then it'll figure it out. >> This is going to be one of those things, we're going to look back at this time Robert and saying, "Yeah, from that company, that was the beginning of that wave." And just like AWS and Cloud Computing, the folks who got in early really were in position when say the pandemic came. So getting in early is a good thing and that's what everyone's talking about is getting in early and playing around, maybe replatforming or even picking one or few apps to refactor with some staff and managed services. So people are definitely jumping in. So I have to ask you the ROI cost question. You mentioned some of those, Moore's Law versus what's going on in the industry. When you look at that kind of scale, the first thing that jumps out at people is, "Okay, I love it. Let's go play around." But what's it going to cost me? Am I going to be tied to certain GPUs? What's the landscape look like from an operational standpoint, from the customer? Are they locked in and the benefit was flexibility, are you flexible to handle any Cloud? What is the customers, what are they looking at? Basically, that's my question. What's the customer looking at? >> Cost is super important here and many of the companies, I mean, companies are spending a huge amount on their Cloud computing, on AWS, and on doing AI, right. And I think a lot of the advantage of Anyscale, what we can provide here is not only better performance, but cost efficiency. Because if we can run something faster and more efficiently, it can also use less resources and you can lower your Cloud spending, right. We've seen companies go from, you know, 20% GPU utilization with their current setup and the current tools they're using to running on Anyscale and getting more like 95, you know, 100% GPU utilization. That's something like a five x improvement right there. So depending on the kind of application you're running, you know, it's a significant cost savings. We've seen companies that have, you know, processing petabytes of data every single day with Ray going from, you know, getting order of magnitude cost savings by switching from what they were previously doing to running their application on Ray. And when you have applications that are spending, you know, potentially $100 million a year and getting a 10 X cost savings is just absolutely enormous. So these are some of the kinds of- >> Data infrastructure is super important. Again, if the customer, if you're a prospect to this and thinking about going in here, just like the Cloud, you got infrastructure, you got the platform, you got SaaS, same kind of thing's going to go on in AI. So I want to get into that, you know, ROI discussion and some of the impact with your customers that are leveraging the platform. But first I hear you got a demo. >> Robert: Yeah, so let me show you, let me give you a quick run through here. So what I have open here is the Anyscale UI. I've started a little Anyscale Workspace. So Workspaces are the Anyscale concept for interactive developments, right. So here, imagine I'm just, you want to have a familiar experience like you're developing on your laptop. And here I have a terminal. It's not on my laptop. It's actually in the cloud running on Anyscale. And I'm just going to kick this off. This is going to train a large language model, so OPT. And it's doing this on 32 GPUs. We've got a cluster here with a bunch of CPU cores, bunch of memory. And as that's running, and by the way, if I wanted to run this on instead of 32 GPUs, 64, 128, this is just a one line change when I launch the Workspace. And what I can do is I can pull up VS code, right. Remember this is the interactive development experience. I can look at the actual code. Here it's using Ray train to train the torch model. We've got the training loop and we're saying that each worker gets access to one GPU and four CPU cores. And, of course, as I make the model larger, this is using deep speed, as I make the model larger, I could increase the number of GPUs that each worker gets access to, right. And how that is distributed across the cluster. And if I wanted to run on CPUs instead of GPUs or a different, you know, accelerator type, again, this is just a one line change. And here we're using Ray train to train the models, just taking my vanilla PyTorch model using Hugging Face and then scaling that across a bunch of GPUs. And, of course, if I want to look at the dashboard, I can go to the Ray dashboard. There are a bunch of different visualizations I can look at. I can look at the GPU utilization. I can look at, you know, the CPU utilization here where I think we're currently loading the model and running that actual application to start the training. And some of the things that are really convenient here about Anyscale, both I can get that interactive development experience with VS code. You know, I can look at the dashboards. I can monitor what's going on. It feels, I have a terminal, it feels like my laptop, but it's actually running on a large cluster. And I can, with however many GPUs or other resources that I want. And so it's really trying to combine the best of having the familiar experience of programming on your laptop, but with the benefits, you know, being able to take advantage of all the resources in the Cloud to scale. And it's like when, you know, you're talking about cost efficiency. One of the biggest reasons that people waste money, one of the silly reasons for wasting money is just forgetting to turn off your GPUs. And what you can do here is, of course, things will auto terminate if they're idle. But imagine you go to sleep, I have this big cluster. You can turn it off, shut off the cluster, come back tomorrow, restart the Workspace, and you know, your big cluster is back up and all of your code changes are still there. All of your local file edits. It's like you just closed your laptop and came back and opened it up again. And so this is the kind of experience we want to provide for our users. So that's what I wanted to share with you. >> Well, I think that whole, couple of things, lines of code change, single line of code change, that's game changing. And then the cost thing, I mean human error is a big deal. People pass out at their computer. They've been coding all night or they just forget about it. I mean, and then it's just like leaving the lights on or your water running in your house. It's just, at the scale that it is, the numbers will add up. That's a huge deal. So I think, you know, compute back in the old days, there's no compute. Okay, it's just compute sitting there idle. But you know, data cranking the models is doing, that's a big point. >> Another thing I want to add there about cost efficiency is that we make it really easy to use, if you're running on Anyscale, to use spot instances and these preemptable instances that can just be significantly cheaper than the on-demand instances. And so when we see our customers go from what they're doing before to using Anyscale and they go from not using these spot instances 'cause they don't have the infrastructure around it, the fault tolerance to handle the preemption and things like that, to being able to just check a box and use spot instances and save a bunch of money. >> You know, this was my whole, my feature article at Reinvent last year when I met with Adam Selipsky, this next gen Cloud is here. I mean, it's not auto scale, it's infrastructure scale. It's agility. It's flexibility. I think this is where the world needs to go. Almost what DevOps did for Cloud and what you were showing me that demo had this whole SRE vibe. And remember Google had site reliability engines to manage all those servers. This is kind of like an SRE vibe for data at scale. I mean, a similar kind of order of magnitude. I mean, I might be a little bit off base there, but how would you explain it? >> It's a nice analogy. I mean, what we are trying to do here is get to the point where developers don't think about infrastructure. Where developers only think about their application logic. And where businesses can do AI, can succeed with AI, and build these scalable applications, but they don't have to build, you know, an infrastructure team. They don't have to develop that expertise. They don't have to invest years in building their internal machine learning infrastructure. They can just focus on the Python code, on their application logic, and run the stuff out of the box. >> Awesome. Well, I appreciate the time. Before we wrap up here, give a plug for the company. I know you got a couple websites. Again, go, Ray's got its own website. You got Anyscale. You got an event coming up. Give a plug for the company looking to hire. Put a plug in for the company. >> Yeah, absolutely. Thank you. So first of all, you know, we think AI is really going to transform every industry and the opportunity is there, right. We can be the infrastructure that enables all of that to happen, that makes it easy for companies to succeed with AI, and get value out of AI. Now we have, if you're interested in learning more about Ray, Ray has been emerging as the standard way to build scalable applications. Our adoption has been exploding. I mentioned companies like OpenAI using Ray to train their models. But really across the board companies like Netflix and Cruise and Instacart and Lyft and Uber, you know, just among tech companies. It's across every industry. You know, gaming companies, agriculture, you know, farming, robotics, drug discovery, you know, FinTech, we see it across the board. And all of these companies can get value out of AI, can really use AI to improve their businesses. So if you're interested in learning more about Ray and Anyscale, we have our Ray Summit coming up in September. This is going to highlight a lot of the most impressive use cases and stories across the industry. And if your business, if you want to use LLMs, you want to train these LLMs, these large language models, you want to fine tune them with your data, you want to deploy them, serve them, and build applications and products around them, give us a call, talk to us. You know, we can really take the infrastructure piece, you know, off the critical path and make that easy for you. So that's what I would say. And, you know, like you mentioned, we're hiring across the board, you know, engineering, product, go-to-market, and it's an exciting time. >> Robert Nishihara, co-founder and CEO of Anyscale, congratulations on a great company you've built and continuing to iterate on and you got growth ahead of you, you got a tailwind. I mean, the AI wave is here. I think OpenAI and ChatGPT, a customer of yours, have really opened up the mainstream visibility into this new generation of applications, user interface, roll of data, large scale, how to make that programmable so we're going to need that infrastructure. So thanks for coming on this season three, episode one of the ongoing series of the hot startups. In this case, this episode is the top startups building foundational model infrastructure for AI and ML. I'm John Furrier, your host. Thanks for watching. (upbeat music)

Published Date : Mar 9 2023

SUMMARY :

episode one of the ongoing and you guys really had and other resources in the Cloud. and particular the large language and what you want to achieve. and the Cloud did that with data centers. the point, and you know, if you don't mind explaining and managing the infrastructure and you guys are positioning is that the amount of compute needed to do But John, I'm curious what you think. because that's the platform So you got tools in the platform. and being the best way to of the computer industry, Did you have an inside prompt and the large language models and tell it what you want it to do. So I have to ask you and you can lower your So I want to get into that, you know, and you know, your big cluster is back up So I think, you know, the on-demand instances. and what you were showing me that demo and run the stuff out of the box. I know you got a couple websites. and the opportunity is there, right. and you got growth ahead

ENTITIES

Entity	Category	Confidence
Robert Nishihara	PERSON	0.99+
John	PERSON	0.99+
Robert	PERSON	0.99+
John Furrier	PERSON	0.99+
Netflix	ORGANIZATION	0.99+
35 times	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
$100 million	QUANTITY	0.99+
Uber	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
100%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Ant Group	ORGANIZATION	0.99+
first	QUANTITY	0.99+
Python	TITLE	0.99+
20%	QUANTITY	0.99+
32 GPUs	QUANTITY	0.99+
Lyft	ORGANIZATION	0.99+
hundreds	QUANTITY	0.99+
tomorrow	DATE	0.99+
Anyscale	ORGANIZATION	0.99+
three	QUANTITY	0.99+
128	QUANTITY	0.99+
September	DATE	0.99+
today	DATE	0.99+
Moore's Law	TITLE	0.99+
Adam Selipsky	PERSON	0.99+
PyTorch	TITLE	0.99+
Ray	ORGANIZATION	0.99+
second reason	QUANTITY	0.99+
64	QUANTITY	0.99+
each worker	QUANTITY	0.99+
each worker	QUANTITY	0.99+
Photoshop	TITLE	0.99+
UC Berkeley	ORGANIZATION	0.99+
Java	TITLE	0.99+
Shopify	ORGANIZATION	0.99+
OpenAI	ORGANIZATION	0.99+
Anyscale	PERSON	0.99+
third	QUANTITY	0.99+
two things	QUANTITY	0.99+
ByteDance	ORGANIZATION	0.99+
Spotify	ORGANIZATION	0.99+
One	QUANTITY	0.99+
95	QUANTITY	0.99+
Asure	ORGANIZATION	0.98+
one line	QUANTITY	0.98+
one GPU	QUANTITY	0.98+
ChatGPT	TITLE	0.98+
TensorFlow	TITLE	0.98+
last year	DATE	0.98+
first bucket	QUANTITY	0.98+
both	QUANTITY	0.98+
two layers	QUANTITY	0.98+
Cohere	ORGANIZATION	0.98+
Alipay	ORGANIZATION	0.98+
Ray	PERSON	0.97+
one	QUANTITY	0.97+
Instacart	ORGANIZATION	0.97+

Adam Wenchel, Arthur.ai | CUBE Conversation

(bright upbeat music) >> Hello and welcome to this Cube Conversation. I'm John Furrier, host of theCUBE. We've got a great conversation featuring Arthur AI. I'm your host. I'm excited to have Adam Wenchel who's the Co-Founder and CEO. Thanks for joining us today, appreciate it. >> Yeah, thanks for having me on, John, looking forward to the conversation. >> I got to say, it's been an exciting world in AI or artificial intelligence. Just an explosion of interest kind of in the mainstream with the language models, which people don't really get, but they're seeing the benefits of some of the hype around OpenAI. Which kind of wakes everyone up to, "Oh, I get it now." And then of course the pessimism comes in, all the skeptics are out there. But this breakthrough in generative AI field is just awesome, it's really a shift, it's a wave. We've been calling it probably the biggest inflection point, then the others combined of what this can do from a surge standpoint, applications. I mean, all aspects of what we used to know is the computing industry, software industry, hardware, is completely going to get turbo. So we're totally obviously bullish on this thing. So, this is really interesting. So my first question is, I got to ask you, what's you guys taking? 'Cause you've been doing this, you're in it, and now all of a sudden you're at the beach where the big waves are. What's the explosion of interest is there? What are you seeing right now? >> Yeah, I mean, it's amazing, so for starters, I've been in AI for over 20 years and just seeing this amount of excitement and the growth, and like you said, the inflection point we've hit in the last six months has just been amazing. And, you know, what we're seeing is like people are getting applications into production using LLMs. I mean, really all this excitement just started a few months ago, with ChatGPT and other breakthroughs and the amount of activity and the amount of new systems that we're seeing hitting production already so soon after that is just unlike anything we've ever seen. So it's pretty awesome. And, you know, these language models are just, they could be applied in so many different business contexts and that it's just the amount of value that's being created is again, like unprecedented compared to anything. >> Adam, you know, you've been in this for a while, so it's an interesting point you're bringing up, and this is a good point. I was talking with my friend John Markoff, former New York Times journalist and he was talking about, there's been a lot of work been done on ethics. So there's been, it's not like it's new. It's like been, there's a lot of stuff that's been baking over many, many years and, you know, decades. So now everyone wakes up in the season, so I think that is a key point I want to get into some of your observations. But before we get into it, I want you to explain for the folks watching, just so we can kind of get a definition on the record. What's an LLM, what's a foundational model and what's generative ai? Can you just quickly explain the three things there? >> Yeah, absolutely. So an LLM or a large language model, it's just a large, they would imply a large language model that's been trained on a huge amount of data typically pulled from the internet. And it's a general purpose language model that can be built on top for all sorts of different things, that includes traditional NLP tasks like document classification and sentiment understanding. But the thing that's gotten people really excited is it's used for generative tasks. So, you know, asking it to summarize documents or asking it to answer questions. And these aren't new techniques, they've been around for a while, but what's changed is just this new class of models that's based on new architectures. They're just so much more capable that they've gone from sort of science projects to something that's actually incredibly useful in the real world. And there's a number of companies that are making them accessible to everyone so that you can build on top of them. So that's the other big thing is, this kind of access to these models that can power generative tasks has been democratized in the last few months and it's just opening up all these new possibilities. And then the third one you mentioned foundation models is sort of a broader term for the category that includes LLMs, but it's not just language models that are included. So we've actually seen this for a while in the computer vision world. So people have been building on top of computer vision models, pre-trained computer vision models for a while for image classification, object detection, that's something we've had customers doing for three or four years already. And so, you know, like you said, there are antecedents to like, everything that's happened, it's not entirely new, but it does feel like a step change. >> Yeah, I did ask ChatGPT to give me a riveting introduction to you and it gave me an interesting read. If we have time, I'll read it. It's kind of, it's fun, you get a kick out of it. "Ladies and gentlemen, today we're a privileged "to have Adam Wenchel, Founder of Arthur who's going to talk "about the exciting world of artificial intelligence." And then it goes on with some really riveting sentences. So if we have time, I'll share that, it's kind of funny. It was good. >> Okay. >> So anyway, this is what people see and this is why I think it's exciting 'cause I think people are going to start refactoring what they do. And I've been saying this on theCUBE now for about a couple months is that, you know, there's a scene in "Moneyball" where Billy Beane sits down with the Red Sox owner and the Red Sox owner says, "If people aren't rebuilding their teams on your model, "they're going to be dinosaurs." And it reminds me of what's happening right now. And I think everyone that I talk to in the business sphere is looking at this and they're connecting the dots and just saying, if we don't rebuild our business with this new wave, they're going to be out of business because there's so much efficiency, there's so much automation, not like DevOps automation, but like the generative tasks that will free up the intellect of people. Like just the simple things like do an intro or do this for me, write some code, write a countermeasure to a hack. I mean, this is kind of what people are doing. And you mentioned computer vision, again, another huge field where 5G things are coming on, it's going to accelerate. What do you say to people when they kind of are leaning towards that, I need to rethink my business? >> Yeah, it's 100% accurate and what's been amazing to watch the last few months is the speed at which, and the urgency that companies like Microsoft and Google or others are actually racing to, to do that rethinking of their business. And you know, those teams, those companies which are large and haven't always been the fastest moving companies are working around the clock. And the pace at which they're rolling out LLMs across their suite of products is just phenomenal to watch. And it's not just the big, the large tech companies as well, I mean, we're seeing the number of startups, like we get, every week a couple of new startups get in touch with us for help with their LLMs and you know, there's just a huge amount of venture capital flowing into it right now because everyone realizes the opportunities for transforming like legal and healthcare and content creation in all these different areas is just wide open. And so there's a massive gold rush going on right now, which is amazing. >> And the cloud scale, obviously horizontal scalability of the cloud brings us to another level. We've been seeing data infrastructure since the Hadoop days where big data was coined. Now you're seeing this kind of take fruit, now you have vertical specialization where data shines, large language models all of a set up perfectly for kind of this piece. And you know, as you mentioned, you've been doing it for a long time. Let's take a step back and I want to get into how you started the company, what drove you to start it? Because you know, as an entrepreneur you're probably saw this opportunity before other people like, "Hey, this is finally it, it's here." Can you share the origination story of what you guys came up with, how you started it, what was the motivation and take us through that origination story. >> Yeah, absolutely. So as I mentioned, I've been doing AI for many years. I started my career at DARPA, but it wasn't really until 2015, 2016, my previous company was acquired by Capital One. Then I started working there and shortly after I joined, I was asked to start their AI team and scale it up. And for the first time I was actually doing it, had production models that we were working with, that was at scale, right? And so there was hundreds of millions of dollars of business revenue and certainly a big group of customers who were impacted by the way these models acted. And so it got me hyper-aware of these issues of when you get models into production, it, you know. So I think people who are earlier in the AI maturity look at that as a finish line, but it's really just the beginning and there's this constant drive to make them better, make sure they're not degrading, make sure you can explain what they're doing, if they're impacting people, making sure they're not biased. And so at that time, there really weren't any tools to exist to do this, there wasn't open source, there wasn't anything. And so after a few years there, I really started talking to other people in the industry and there was a really clear theme that this needed to be addressed. And so, I joined with my Co-Founder John Dickerson, who was on the faculty in University of Maryland and he'd been doing a lot of research in these areas. And so we ended up joining up together and starting Arthur. >> Awesome. Well, let's get into what you guys do. Can you explain the value proposition? What are people using you for now? Where's the action? What's the customers look like? What do prospects look like? Obviously you mentioned production, this has been the theme. It's not like people woke up one day and said, "Hey, I'm going to put stuff into production." This has kind of been happening. There's been companies that have been doing this at scale and then yet there's a whole follower model coming on mainstream enterprise and businesses. So there's kind of the early adopters are there now in production. What do you guys do? I mean, 'cause I think about just driving the car off the lot is not, you got to manage operations. I mean, that's a big thing. So what do you guys do? Talk about the value proposition and how you guys make money? >> Yeah, so what we do is, listen, when you go to validate ahead of deploying these models in production, starts at that point, right? So you want to make sure that if you're going to be upgrading a model, if you're going to replacing one that's currently in production, that you've proven that it's going to perform well, that it's going to be perform ethically and that you can explain what it's doing. And then when you launch it into production, traditionally data scientists would spend 25, 30% of their time just manually checking in on their model day-to-day babysitting as we call it, just to make sure that the data hasn't drifted, the model performance hasn't degraded, that a programmer did make a change in an upstream data system. You know, there's all sorts of reasons why the world changes and that can have a real adverse effect on these models. And so what we do is bring the same kind of automation that you have for other kinds of, let's say infrastructure monitoring, application monitoring, we bring that to your AI systems. And that way if there ever is an issue, it's not like weeks or months till you find it and you find it before it has an effect on your P&L and your balance sheet, which is too often before they had tools like Arthur, that was the way they were detected. >> You know, I was talking to Swami at Amazon who I've known for a long time for 13 years and been on theCUBE multiple times and you know, I watched Amazon try to pick up that sting with stage maker about six years ago and so much has happened since then. And he and I were talking about this wave, and I kind of brought up this analogy to how when cloud started, it was, Hey, I don't need a data center. 'Cause when I did my startup that time when Amazon, one of my startups at that time, my choice was put a box in the colo, get all the configuration before I could write over the line of code. So the cloud became the benefit for that and you can stand up stuff quickly and then it grew from there. Here it's kind of the same dynamic, you don't want to have to provision a large language model or do all this heavy lifting. So that seeing companies coming out there saying, you can get started faster, there's like a new way to get it going. So it's kind of like the same vibe of limiting that heavy lifting. >> Absolutely. >> How do you look at that because this seems to be a wave that's going to be coming in and how do you guys help companies who are going to move quickly and start developing? >> Yeah, so I think in the race to this kind of gold rush mentality, race to get these models into production, there's starting to see more sort of examples and evidence that there are a lot of risks that go along with it. Either your model says things, your system says things that are just wrong, you know, whether it's hallucination or just making things up, there's lots of examples. If you go on Twitter and the news, you can read about those, as well as sort of times when there could be toxic content coming out of things like that. And so there's a lot of risks there that you need to think about and be thoughtful about when you're deploying these systems. But you know, you need to balance that with the business imperative of getting these things into production and really transforming your business. And so that's where we help people, we say go ahead, put them in production, but just make sure you have the right guardrails in place so that you can do it in a smart way that's going to reflect well on you and your company. >> Let's frame the challenge for the companies now that you have, obviously there's the people who doing large scale production and then you have companies maybe like as small as us who have large linguistic databases or transcripts for example, right? So what are customers doing and why are they deploying AI right now? And is it a speed game, is it a cost game? Why have some companies been able to deploy AI at such faster rates than others? And what's a best practice to onboard new customers? >> Yeah, absolutely. So I mean, we're seeing across a bunch of different verticals, there are leaders who have really kind of started to solve this puzzle about getting AI models into production quickly and being able to iterate on them quickly. And I think those are the ones that realize that imperative that you mentioned earlier about how transformational this technology is. And you know, a lot of times, even like the CEOs or the boards are very personally kind of driving this sense of urgency around it. And so, you know, that creates a lot of movement, right? And so those companies have put in place really smart infrastructure and rails so that people can, data scientists aren't encumbered by having to like hunt down data, get access to it. They're not encumbered by having to stand up new platforms every time they want to deploy an AI system, but that stuff is already in place. There's a really nice ecosystem of products out there, including Arthur, that you can tap into. Compared to five or six years ago when I was building at a top 10 US bank, at that point you really had to build almost everything yourself and that's not the case now. And so it's really nice to have things like, you know, you mentioned AWS SageMaker and a whole host of other tools that can really accelerate things. >> What's your profile customer? Is it someone who already has a team or can people who are learning just dial into the service? What's the persona? What's the pitch, if you will, how do you align with that customer value proposition? Do people have to be built out with a team and in play or is it pre-production or can you start with people who are just getting going? >> Yeah, people do start using it pre-production for validation, but I think a lot of our customers do have a team going and they're starting to put, either close to putting something into production or about to, it's everything from large enterprises that have really sort of complicated, they have dozens of models running all over doing all sorts of use cases to tech startups that are very focused on a single problem, but that's like the lifeblood of the company and so they need to guarantee that it works well. And you know, we make it really easy to get started, especially if you're using one of the common model development platforms, you can just kind of turn key, get going and make sure that you have a nice feedback loop. So then when your models are out there, it's pointing out, areas where it's performing well, areas where it's performing less well, giving you that feedback so that you can make improvements, whether it's in training data or futurization work or algorithm selection. There's a number of, you know, depending on the symptoms, there's a number of things you can do to increase performance over time and we help guide people on that journey. >> So Adam, I have to ask, since you have such a great customer base and they're smart and they got teams and you're on the front end, I mean, early adopters is kind of an overused word, but they're killing it. They're putting stuff in the production's, not like it's a test, it's not like it's early. So as the next wave comes of fast followers, how do you see that coming online? What's your vision for that? How do you see companies that are like just waking up out of the frozen, you know, freeze of like old IT to like, okay, they got cloud, but they're not yet there. What do you see in the market? I see you're in the front end now with the top people really nailing AI and working hard. What's the- >> Yeah, I think a lot of these tools are becoming, or every year they get easier, more accessible, easier to use. And so, you know, even for that kind of like, as the market broadens, it takes less and less of a lift to put these systems in place. And the thing is, every business is unique, they have their own kind of data and so you can use these foundation models which have just been trained on generic data. They're a great starting point, a great accelerant, but then, in most cases you're either going to want to create a model or fine tune a model using data that's really kind of comes from your particular customers, the people you serve and so that it really reflects that and takes that into account. And so I do think that these, like the size of that market is expanding and its broadening as these tools just become easier to use and also the knowledge about how to build these systems becomes more widespread. >> Talk about your customer base you have now, what's the makeup, what size are they? Give a taste a little bit of a customer base you got there, what's they look like? I'll say Capital One, we know very well while you were at there, they were large scale, lot of data from fraud detection to all kinds of cool stuff. What do your customers now look like? >> Yeah, so we have a variety, but I would say one area we're really strong, we have several of the top 10 US banks, that's not surprising, that's a strength for us, but we also have Fortune 100 customers in healthcare, in manufacturing, in retail, in semiconductor and electronics. So what we find is like in any sort of these major verticals, there's typically, you know, one, two, three kind of companies that are really leading the charge and are the ones that, you know, in our opinion, those are the ones that for the next multiple decades are going to be the leaders, the ones that really kind of lead the charge on this AI transformation. And so we're very fortunate to be working with some of those. And then we have a number of startups as well who we love working with just because they're really pushing the boundaries technologically and so they provide great feedback and make sure that we're continuing to innovate and staying abreast of everything that's going on. >> You know, these early markups, even when the hyperscalers were coming online, they had to build everything themselves. That's the new, they're like the alphas out there building it. This is going to be a big wave again as that fast follower comes in. And so when you look at the scale, what advice would you give folks out there right now who want to tee it up and what's your secret sauce that will help them get there? >> Yeah, I think that the secret to teeing it up is just dive in and start like the, I think these are, there's not really a secret. I think it's amazing how accessible these are. I mean, there's all sorts of ways to access LLMs either via either API access or downloadable in some cases. And so, you know, go ahead and get started. And then our secret sauce really is the way that we provide that performance analysis of what's going on, right? So we can tell you in a very actionable way, like, hey, here's where your model is doing good things, here's where it's doing bad things. Here's something you want to take a look at, here's some potential remedies for it. We can help guide you through that. And that way when you're putting it out there, A, you're avoiding a lot of the common pitfalls that people see and B, you're able to really kind of make it better in a much faster way with that tight feedback loop. >> It's interesting, we've been kind of riffing on this supercloud idea because it was just different name than multicloud and you see apps like Snowflake built on top of AWS without even spending any CapEx, you just ride that cloud wave. This next AI, super AI wave is coming. I don't want to call AIOps because I think there's a different distinction. If you, MLOps and AIOps seem a little bit old, almost a few years back, how do you view that because everyone's is like, "Is this AIOps?" And like, "No, not kind of, but not really." How would you, you know, when someone says, just shoots off the hip, "Hey Adam, aren't you doing AIOps?" Do you say, yes we are, do you say, yes, but we do differently because it's doesn't seem like it's the same old AIOps. What's your- >> Yeah, it's a good question. AIOps has been a term that was co-opted for other things and MLOps also has people have used it for different meanings. So I like the term just AI infrastructure, I think it kind of like describes it really well and succinctly. >> But you guys are doing the ops. I mean that's the kind of ironic thing, it's like the next level, it's like NextGen ops, but it's not, you don't want to be put in that bucket. >> Yeah, no, it's very operationally focused platform that we have, I mean, it fires alerts, people can action off them. If you're familiar with like the way people run security operations centers or network operations centers, we do that for data science, right? So think of it as a DSOC, a Data Science Operations Center where all your models, you might have hundreds of models running across your organization, you may have five, but as problems are detected, alerts can be fired and you can actually work the case, make sure they're resolved, escalate them as necessary. And so there is a very strong operational aspect to it, you're right. >> You know, one of the things I think is interesting is, is that, if you don't mind commenting on it, is that the aspect of scale is huge and it feels like that was made up and now you have scale and production. What's your reaction to that when people say, how does scale impact this? >> Yeah, scale is huge for some of, you know, I think, I think look, the highest leverage business areas to apply these to, are generally going to be the ones at the biggest scale, right? And I think that's one of the advantages we have. Several of us come from enterprise backgrounds and we're used to doing things enterprise grade at scale and so, you know, we're seeing more and more companies, I think they started out deploying AI and sort of, you know, important but not necessarily like the crown jewel area of their business, but now they're deploying AI right in the heart of things and yeah, the scale that some of our companies are operating at is pretty impressive. >> John: Well, super exciting, great to have you on and congratulations. I got a final question for you, just random. What are you most excited about right now? Because I mean, you got to be pretty pumped right now with the way the world is going and again, I think this is just the beginning. What's your personal view? How do you feel right now? >> Yeah, the thing I'm really excited about for the next couple years now, you touched on it a little bit earlier, but is a sort of convergence of AI and AI systems with sort of turning into AI native businesses. And so, as you sort of do more, get good further along this transformation curve with AI, it turns out that like the better the performance of your AI systems, the better the performance of your business. Because these models are really starting to underpin all these key areas that cumulatively drive your P&L. And so one of the things that we work a lot with our customers is to do is just understand, you know, take these really esoteric data science notions and performance and tie them to all their business KPIs so that way you really are, it's kind of like the operating system for running your AI native business. And we're starting to see more and more companies get farther along that maturity curve and starting to think that way, which is really exciting. >> I love the AI native. I haven't heard any startup yet say AI first, although we kind of use the term, but I guarantee that's going to come in all the pitch decks, we're an AI first company, it's going to be great run. Adam, congratulations on your success to you and the team. Hey, if we do a few more interviews, we'll get the linguistics down. We can have bots just interact with you directly and ask you, have an interview directly. >> That sounds good, I'm going to go hang out on the beach, right? So, sounds good. >> Thanks for coming on, really appreciate the conversation. Super exciting, really important area and you guys doing great work. Thanks for coming on. >> Adam: Yeah, thanks John. >> Again, this is Cube Conversation. I'm John Furrier here in Palo Alto, AI going next gen. This is legit, this is going to a whole nother level that's going to open up huge opportunities for startups, that's going to use opportunities for investors and the value to the users and the experience will come in, in ways I think no one will ever see. So keep an eye out for more coverage on siliconangle.com and theCUBE.net, thanks for watching. (bright upbeat music)

Published Date : Mar 3 2023

SUMMARY :

I'm excited to have Adam Wenchel looking forward to the conversation. kind of in the mainstream and that it's just the amount Adam, you know, you've so that you can build on top of them. to give me a riveting introduction to you And you mentioned computer vision, again, And you know, those teams, And you know, as you mentioned, of when you get models into off the lot is not, you and that you can explain what it's doing. So it's kind of like the same vibe so that you can do it in a smart way And so, you know, that creates and make sure that you out of the frozen, you know, and so you can use these foundation models a customer base you got there, that are really leading the And so when you look at the scale, And so, you know, go how do you view that So I like the term just AI infrastructure, I mean that's the kind of ironic thing, and you can actually work the case, is that the aspect of and so, you know, we're seeing exciting, great to have you on so that way you really are, success to you and the team. out on the beach, right? and you guys doing great work. and the value to the users and

ENTITIES

Entity	Category	Confidence
John Markoff	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Adam Wenchel	PERSON	0.99+
John	PERSON	0.99+
Red Sox	ORGANIZATION	0.99+
John Dickerson	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Adam	PERSON	0.99+
John Furrier	PERSON	0.99+
Palo Alto	LOCATION	0.99+
2015	DATE	0.99+
Capital One	ORGANIZATION	0.99+
five	QUANTITY	0.99+
100%	QUANTITY	0.99+
2016	DATE	0.99+
13 years	QUANTITY	0.99+
Snowflake	TITLE	0.99+
three	QUANTITY	0.99+
first question	QUANTITY	0.99+
two	QUANTITY	0.99+
five	DATE	0.99+
today	DATE	0.99+
one	QUANTITY	0.99+
four years	QUANTITY	0.99+
Billy Beane	PERSON	0.99+
over 20 years	QUANTITY	0.99+
DARPA	ORGANIZATION	0.99+
third one	QUANTITY	0.98+
AWS	ORGANIZATION	0.98+
siliconangle.com	OTHER	0.98+
University of Maryland	ORGANIZATION	0.97+
first time	QUANTITY	0.97+
US	LOCATION	0.97+
first	QUANTITY	0.96+
six years ago	DATE	0.96+
New York Times	ORGANIZATION	0.96+
ChatGPT	ORGANIZATION	0.96+
Swami	PERSON	0.95+
ChatGPT	TITLE	0.95+
hundreds of models	QUANTITY	0.95+
25, 30%	QUANTITY	0.95+
single problem	QUANTITY	0.95+
hundreds of millions of dollars	QUANTITY	0.95+
10	QUANTITY	0.94+
Moneyball	TITLE	0.94+
wave	EVENT	0.91+
three things	QUANTITY	0.9+
AIOps	TITLE	0.9+
last six months	DATE	0.89+
few months ago	DATE	0.88+
big	EVENT	0.86+
next couple years	DATE	0.86+
DevOps	TITLE	0.85+
Arthur	PERSON	0.85+
CUBE	ORGANIZATION	0.83+
dozens of models	QUANTITY	0.8+
a few years back	DATE	0.8+
six years ago	DATE	0.78+
theCUBE	ORGANIZATION	0.76+
SageMaker	TITLE	0.75+
decades	QUANTITY	0.75+
Twitter	ORGANIZATION	0.74+
MLOps	TITLE	0.74+
supercloud	ORGANIZATION	0.73+
super AI wave	EVENT	0.73+
a couple months	QUANTITY	0.72+
Arthur	ORGANIZATION	0.72+
100 customers	QUANTITY	0.71+
Cube Conversation	EVENT	0.69+
theCUBE.net	OTHER	0.67+

Phil Kippen, Snowflake, Dave Whittington, AT&T & Roddy Tranum, AT&T | | MWC Barcelona 2023

(gentle music) >> Narrator: "TheCUBE's" live coverage is made possible by funding from Dell Technologies, creating technologies that drive human progress. (upbeat music) >> Hello everybody, welcome back to day four of "theCUBE's" coverage of MWC '23. We're here live at the Fira in Barcelona. Wall-to-wall coverage, John Furrier is in our Palo Alto studio, banging out all the news. Really, the whole week we've been talking about the disaggregation of the telco network, the new opportunities in telco. We're really excited to have AT&T and Snowflake here. Dave Whittington is the AVP, at the Chief Data Office at AT&T. Roddy Tranum is the Assistant Vice President, for Channel Performance Data and Tools at AT&T. And Phil Kippen, the Global Head Of Industry-Telecom at Snowflake, Snowflake's new telecom business. Snowflake just announced earnings last night. Typical Scarpelli, they beat earnings, very conservative guidance, stocks down today, but we like Snowflake long term, they're on that path to 10 billion. Guys, welcome to "theCUBE." Thanks so much >> Phil: Thank you. >> for coming on. >> Dave and Roddy: Thanks Dave. >> Dave, let's start with you. The data culture inside of telco, We've had this, we've been talking all week about this monolithic system. Super reliable. You guys did a great job during the pandemic. Everything shifting to landlines. We didn't even notice, you guys didn't miss a beat. Saved us. But the data culture's changing inside telco. Explain that. >> Well, absolutely. So, first of all IoT and edge processing is bringing forth new and exciting opportunities all the time. So, we're bridging the world between a lot of the OSS stuff that we can do with edge processing. But bringing that back, and now we're talking about working, and I would say traditionally, we talk data warehouse. Data warehouse and big data are now becoming a single mesh, all right? And the use cases and the way you can use those, especially I'm taking that edge data and bringing it back over, now I'm running AI and ML models on it, and I'm pushing back to the edge, and I'm combining that with my relational data. So that mesh there is making all the difference. We're getting new use cases that we can do with that. And it's just, and the volume of data is immense. >> Now, I love ChatGPT, but I'm hoping your data models are more accurate than ChatGPT. I never know. Sometimes it's really good, sometimes it's really bad. But enterprise, you got to be clean with your AI, don't you? >> Not only you have to be clean, you have to monitor it for bias and be ethical about it. We're really good about that. First of all with AT&T, our brand is Platinum. We take care of that. So, we may not be as cutting-edge risk takers as others, but when we go to market with an AI or an ML or a product, it's solid. >> Well hey, as telcos go, you guys are leaning into the Cloud. So I mean, that's a good starting point. Roddy, explain your role. You got an interesting title, Channel Performance Data and Tools, what's that all about? >> So literally anything with our consumer, retail, concenters' channels, all of our channels, from a data perspective and metrics perspective, what it takes to run reps, agents, all the way to leadership levels, scorecards, how you rank in the business, how you're driving the business, from sales, service, customer experience, all that data infrastructure with our great partners on the CDO side, as well as Snowflake, that comes from my team. >> And that's traditionally been done in a, I don't mean the pejorative, but we're talking about legacy, monolithic, sort of data warehouse technologies. >> Absolutely. >> We have a love-hate relationship with them. It's what we had. It's what we used, right? And now that's evolving. And you guys are leaning into the Cloud. >> Dramatic evolution. And what Snowflake's enabled for us is impeccable. We've talked about having, people have dreamed of one data warehouse for the longest time and everything in one system. Really, this is the only way that becomes a reality. The more you get in Snowflake, we can have golden source data, and instead of duplicating that 50 times across AT&T, it's in one place, we just share it, everybody leverages it, and now it's not duplicated, and the process efficiency is just incredible. >> But it really hinges on that separation of storage and compute. And we talk about the monolithic warehouse, and one of the nightmares I've lived with, is having a monolithic warehouse. And let's just go with some of my primary, traditional customers, sales, marketing and finance. They are leveraging BSS OSS data all the time. For me to coordinate a deployment, I have to make sure that each one of these units can take an outage, if it's going to be a long deployment. With the separation of storage, compute, they own their own compute cluster. So I can move faster for these people. 'Cause if finance, I can implement his code without impacting finance or marketing. This brings in CI/CD to more reality. It brings us faster to market with more features. So if he wants to implement a new comp plan for the field reps, or we're reacting to the marketplace, where one of our competitors has done something, we can do that in days, versus waiting weeks or months. >> And we've reported on this a lot. This is the brilliance of Snowflake's founders, that whole separation >> Yep. >> from compute and data. I like Dave, that you're starting with sort of the business flexibility, 'cause there's a cost element of this too. You can dial down, you can turn off compute, and then of course the whole world said, "Hey, that's a good idea." And a VC started throwing money at Amazon, but Redshift said, "Oh, we can do that too, sort of, can't turn off the compute." But I want to ask you Phil, so, >> Sure. >> it looks from my vantage point, like you're taking your Data Cloud message which was originally separate compute from storage simplification, now data sharing, automated governance, security, ultimately the marketplace. >> Phil: Right. >> Taking that same model, break down the silos into telecom, right? It's that same, >> Mm-hmm. >> sorry to use the term playbook, Frank Slootman tells me he doesn't use playbooks, but he's not a pattern matcher, but he's a situational CEO, he says. But the situation in telco calls for that type of strategy. So explain what you guys are doing in telco. >> I think there's, so, what we're launching, we launched last week, and it really was three components, right? So we had our platform as you mentioned, >> Dave: Mm-hmm. >> and that platform is being utilized by a number of different companies today. We also are adding, for telecom very specifically, we're adding capabilities in marketplace, so that service providers can not only use some of the data and apps that are in marketplace, but as well service providers can go and sell applications or sell data that they had built. And then as well, we're adding our ecosystem, it's telecom-specific. So, we're bringing partners in, technology partners, and consulting and services partners, that are very much focused on telecoms and what they do internally, but also helping them monetize new services. >> Okay, so it's not just sort of generic Snowflake into telco? You have specific value there. >> We're purposing the platform specifically for- >> Are you a telco guy? >> I am. You are, okay. >> Total telco guy absolutely. >> So there you go. You see that Snowflake is actually an interesting organizational structure, 'cause you're going after verticals, which is kind of rare for a company of your sort of inventory, I'll say, >> Absolutely. >> I don't mean that as a negative. (Dave laughs) So Dave, take us through the data journey at AT&T. It's a long history. You don't have to go back to the 1800s, but- (Dave laughs) >> Thank you for pointing out, we're a 149-year-old company. So, Jesse James was one of the original customers, (Dave laughs) and we have no longer got his data. So, I'll go back. I've been 17 years singular AT&T, and I've watched it through the whole journey of, where the monolithics were growing, when the consolidation of small, wireless carriers, and we went through that boom. And then we've gone through mergers and acquisitions. But, Hadoop came out, and it was going to solve all world hunger. And we had all the aspects of, we're going to monetize and do AI and ML, and some of the things we learned with Hadoop was, we had this monolithic warehouse, we had this file-based-structured Hadoop, but we really didn't know how to bring this all together. And we were bringing items over to the relational, and we were taking the relational and bringing it over to the warehouse, and trying to, and it was a struggle. Let's just go there. And I don't think we were the only company to struggle with that, but we learned a lot. And so now as tech is finally emerging, with the cloud, companies like Snowflake, and others that can handle that, where we can create, we were discussing earlier, but it becomes more of a conducive mesh that's interoperable. So now we're able to simplify that environment. And the cloud is a big thing on that. 'Cause you could not do this on-prem with on-prem technologies. It would be just too cost prohibitive, and too heavy of lifting, going back and forth, and managing the data. The simplicity the cloud brings with a smaller set of tools, and I'll say in the data space specifically, really allows us, maybe not a single instance of data for all use cases, but a greatly reduced ecosystem. And when you simplify your ecosystem, you simplify speed to market and data management. >> So I'm going to ask you, I know it's kind of internal organizational plumbing, but it'll inform my next question. So, Dave, you're with the Chief Data Office, and Roddy, you're kind of, you all serve in the business, but you're really serving the, you're closer to those guys, they're banging on your door for- >> Absolutely. I try to keep the 130,000 users who may or may not have issues sometimes with our data and metrics, away from Dave. And he just gets a call from me. >> And he only calls when he has a problem. He's never wished me happy birthday. (Dave and Phil laugh) >> So the reason I asked that is because, you describe Dave, some of the Hadoop days, and again love-hate with that, but we had hyper-specialized roles. We still do. You've got data engineers, data scientists, data analysts, and you've got this sort of this pipeline, and it had to be this sequential pipeline. I know Snowflake and others have come to simplify that. My question to you is, how is that those roles, how are those roles changing? How is data getting closer to the business? Everybody talks about democratizing business. Are you doing that? What's a real use example? >> From our perspective, those roles, a lot of those roles on my team for years, because we're all about efficiency, >> Dave: Mm-hmm. >> we cut across those areas, and always have cut across those areas. So now we're into a space where things have been simplified, data processes and copying, we've gone from 40 data processes down to five steps now. We've gone from five steps to one step. We've gone from days, now take hours, hours to minutes, minutes to seconds. Literally we're seeing that time in and time out with Snowflake. So these resources that have spent all their time on data engineering and moving data around, are now freed up more on what they have skills for and always have, the data analytics area of the business, and driving the business forward, and new metrics and new analysis. That's some of the great operational value that we've seen here. As this simplification happens, it frees up brain power. >> So, you're pumping data from the OSS, the BSS, the OKRs everywhere >> Everywhere. >> into Snowflake? >> Scheduling systems, you name it. If you can think of what drives our retail and centers and online, all that data, scheduling system, chat data, call center data, call detail data, all of that enters into this common infrastructure to manage the business on a day in and day out basis. >> How are the roles and the skill sets changing? 'Cause you're doing a lot less ETL, you're doing a lot less moving of data around. There were guys that were probably really good at that. I used to joke in the, when I was in the storage world, like if your job is bandaging lungs, you need to look for a new job, right? So, and they did and people move on. So, are you able to sort of redeploy those assets, and those people, those human resources? >> These folks are highly skilled. And we were talking about earlier, SQL hasn't gone away. Relational databases are not going away. And that's one thing that's made this migration excellent, they're just transitioning their skills. Experts in legacy systems are now rapidly becoming experts on the Snowflake side. And it has not been that hard a transition. There are certainly nuances, things that don't operate as well in the cloud environment that we have to learn and optimize. But we're making that transition. >> Dave: So just, >> Please. >> within the Chief Data Office we have a couple of missions, and Roddy is a great partner and an example of how it works. We try to bring the data for democratization, so that we have one interface, now hopefully know we just have a logical connection back to these Snowflake instances that we connect. But we're providing that governance and cleansing, and if there's a business rule at the enterprise level, we provide it. But the goal at CDO is to make sure that business units like Roddy or marketing or finance, that they can come to a platform that's reliable, robust, and self-service. I don't want to be in his way. So I feel like I'm providing a sub-level of platform, that he can come to and anybody can come to, and utilize, that they're not having to go back and undo what's in Salesforce, or ServiceNow, or in our billers. So, I'm sort of that layer. And then making sure that that ecosystem is robust enough for him to use. >> And that self-service infrastructure is predominantly through the Azure Cloud, correct? >> Dave: Absolutely. >> And you work on other clouds, but it's predominantly through Azure? >> We're predominantly in Azure, yeah. >> Dave: That's the first-party citizen? >> Yeah. >> Okay, I like to think in terms sometimes of data products, and I know you've mentioned upfront, you're Gold standard or Platinum standard, you're very careful about personal information. >> Dave: Yeah. >> So you're not trying to sell, I'm an AT&T customer, you're not trying to sell my data, and make money off of my data. So the value prop and the business case for Snowflake is it's simpler. You do things faster, you're in the cloud, lower cost, et cetera. But I presume you're also in the business, AT&T, of making offers and creating packages for customers. I look at those as data products, 'cause it's not a, I mean, yeah, there's a physical phone, but there's data products behind it. So- >> It ultimately is, but not everybody always sees it that way. Data reporting often can be an afterthought. And we're making it more on the forefront now. >> Yeah, so I like to think in terms of data products, I mean even if the financial services business, it's a data business. So, if we can think about that sort of metaphor, do you see yourselves as data product builders? Do you have that, do you think about building products in that regard? >> Within the Chief Data Office, we have a data product team, >> Mm-hmm. >> and by the way, I wouldn't be disingenuous if I said, oh, we're very mature in this, but no, it's where we're going, and it's somewhat of a journey, but I've got a peer, and their whole job is to go from, especially as we migrate from cloud, if Roddy or some other group was using tables three, four and five and joining them together, it's like, "Well look, this is an offer for data product, so let's combine these and put it up in the cloud, and here's the offer data set product, or here's the opportunity data product," and it's a journey. We're on the way, but we have dedicated staff and time to do this. >> I think one of the hardest parts about that is the organizational aspects of it. Like who owns the data now, right? It used to be owned by the techies, and increasingly the business lines want to have access, you're providing self-service. So there's a discussion about, "Okay, what is a data product? Who's responsible for that data product? Is it in my P&L or your P&L? Somebody's got to sign up for that number." So, it sounds like those discussions are taking place. >> They are. And, we feel like we're more the, and CDO at least, we feel more, we're like the guardians, and the shepherds, but not the owners. I mean, we have a role in it all, but he owns his metrics. >> Yeah, and even from our perspective, we see ourselves as an enabler of making whatever AT&T wants to make happen in terms of the key products and officers' trade-in offers, trade-in programs, all that requires this data infrastructure, and managing reps and agents, and what they do from a channel performance perspective. We still ourselves see ourselves as key enablers of that. And we've got to be flexible, and respond quickly to the business. >> I always had empathy for the data engineer, and he or she had to service all these different lines of business with no business context. >> Yeah. >> Like the business knows good data from bad data, and then they just pound that poor individual, and they're like, "Okay, I'm doing my best. It's just ones and zeros to me." So, it sounds like that's, you're on that path. >> Yeah absolutely, and I think, we do have refined, getting more and more refined owners of, since Snowflake enables these golden source data, everybody sees me and my organization, channel performance data, go to Roddy's team, we have a great team, and we go to Dave in terms of making it all happen from a data infrastructure perspective. So we, do have a lot more refined, "This is where you go for the golden source, this is where it is, this is who owns it. If you want to launch this product and services, and you want to manage reps with it, that's the place you-" >> It's a strong story. So Chief Data Office doesn't own the data per se, but it's your responsibility to provide the self-service infrastructure, and make sure it's governed properly, and in as automated way as possible. >> Well, yeah, absolutely. And let me tell you more, everybody talks about single version of the truth, one instance of the data, but there's context to that, that we are taking, trying to take advantage of that as we do data products is, what's the use case here? So we may have an entity of Roddy as a prospective customer, and we may have a entity of Roddy as a customer, high-value customer over here, which may have a different set of mix of data and all, but as a data product, we can then create those for those specific use cases. Still point to the same data, but build it in different constructs. One for marketing, one for sales, one for finance. By the way, that's where your data engineers are struggling. >> Yeah, yeah, of course. So how do I serve all these folks, and really have the context-common story in telco, >> Absolutely. >> or are these guys ahead of the curve a little bit? Or where would you put them? >> I think they're definitely moving a lot faster than the industry is generally. I think the enabling technologies, like for instance, having that single copy of data that everybody sees, a single pane of glass, right, that's definitely something that everybody wants to get to. Not many people are there. I think, what AT&T's doing, is most definitely a little bit further ahead than the industry generally. And I think the successes that are coming out of that, and the learning experiences are starting to generate momentum within AT&T. So I think, it's not just about the product, and having a product now that gives you a single copy of data. It's about the experiences, right? And now, how the teams are getting trained, domains like network engineering for instance. They typically haven't been a part of data discussions, because they've got a lot of data, but they're focused on the infrastructure. >> Mm. >> So, by going ahead and deploying this platform, for platform's purpose, right, and the business value, that's one thing, but also to start bringing, getting that experience, and bringing new experience in to help other groups that traditionally hadn't been data-centric, that's also a huge step ahead, right? So you need to enable those groups. >> A big complaint of course we hear at MWC from carriers is, "The over-the-top guys are killing us. They're riding on our networks, et cetera, et cetera. They have all the data, they have all the client relationships." Do you see your client relationships changing as a result of sort of your data culture evolving? >> Yes, I'm not sure I can- >> It's a loaded question, I know. >> Yeah, and then I, so, we want to start embedding as much into our network on the proprietary value that we have, so we can start getting into that OTT play, us as any other carrier, we have distinct advantages of what we can do at the edge, and we just need to start exploiting those. But you know, 'cause whether it's location or whatnot, so we got to eat into that. Historically, the network is where we make our money in, and we stack the services on top of it. It used to be *69. >> Dave: Yeah. >> If anybody remembers that. >> Dave: Yeah, of course. (Dave laughs) >> But you know, it was stacked on top of our network. Then we stack another product on top of it. It'll be in the edge where we start providing distinct values to other partners as we- >> I mean, it's a great business that you're in. I mean, if they're really good at connectivity. >> Dave: Yeah. >> And so, it sounds like it's still to be determined >> Dave: Yeah. >> where you can go with this. You have to be super careful with private and for personal information. >> Dave: Yep. >> Yeah, but the opportunities are enormous. >> There's a lot. >> Yeah, particularly at the edge, looking at, private networks are just an amazing opportunity. Factories and name it, hospital, remote hospitals, remote locations. I mean- >> Dave: Connected cars. >> Connected cars are really interesting, right? I mean, if you start communicating car to car, and actually drive that, (Dave laughs) I mean that's, now we're getting to visit Xen Fault Tolerance people. This is it. >> Dave: That's not, let's hold the traffic. >> Doesn't scare me as much as we actually learn. (all laugh) >> So how's the show been for you guys? >> Dave: Awesome. >> What're your big takeaways from- >> Tremendous experience. I mean, someone who doesn't go outside the United States much, I'm a homebody. The whole experience, the whole trip, city, Mobile World Congress, the technologies that are out here, it's been a blast. >> Anything, top two things you learned, advice you'd give to others, your colleagues out in general? >> In general, we talked a lot about technologies today, and we talked a lot about data, but I'm going to tell you what, the accelerator that you cannot change, is the relationship that we have. So when the tech and the business can work together toward a common goal, and it's a partnership, you get things done. So, I don't know how many CDOs or CIOs or CEOs are out there, but this connection is what accelerates and makes it work. >> And that is our audience Dave. I mean, it's all about that alignment. So guys, I really appreciate you coming in and sharing your story in "theCUBE." Great stuff. >> Thank you. >> Thanks a lot. >> All right, thanks everybody. Thank you for watching. I'll be right back with Dave Nicholson. Day four SiliconANGLE's coverage of MWC '23. You're watching "theCUBE." (gentle music)

Published Date : Mar 2 2023

SUMMARY :

that drive human progress. And Phil Kippen, the Global But the data culture's of the OSS stuff that we But enterprise, you got to be So, we may not be as cutting-edge Channel Performance Data and all the way to leadership I don't mean the pejorative, And you guys are leaning into the Cloud. and the process efficiency and one of the nightmares I've lived with, This is the brilliance of the business flexibility, like you're taking your Data Cloud message But the situation in telco and that platform is being utilized You have specific value there. I am. So there you go. I don't mean that as a negative. and some of the things we and Roddy, you're kind of, And he just gets a call from me. (Dave and Phil laugh) and it had to be this sequential pipeline. and always have, the data all of that enters into How are the roles and in the cloud environment that But the goal at CDO is to and I know you've mentioned upfront, So the value prop and the on the forefront now. I mean even if the and by the way, I wouldn't and increasingly the business and the shepherds, but not the owners. and respond quickly to the business. and he or she had to service Like the business knows and we go to Dave in terms doesn't own the data per se, and we may have a entity and really have the and having a product now that gives you and the business value, that's one thing, They have all the data, on the proprietary value that we have, Dave: Yeah, of course. It'll be in the edge business that you're in. You have to be super careful Yeah, but the particularly at the edge, and actually drive that, let's hold the traffic. much as we actually learn. the whole trip, city, is the relationship that we have. and sharing your story in "theCUBE." Thank you for watching.

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Dave Whittington	PERSON	0.99+
Frank Slootman	PERSON	0.99+
Roddy	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Phil	PERSON	0.99+
Phil Kippen	PERSON	0.99+
AT&T	ORGANIZATION	0.99+
Jesse James	PERSON	0.99+
AT&T.	ORGANIZATION	0.99+
five steps	QUANTITY	0.99+
Dave Nicholson	PERSON	0.99+
John Furrier	PERSON	0.99+
50 times	QUANTITY	0.99+
Snowflake	ORGANIZATION	0.99+
Roddy Tranum	PERSON	0.99+
10 billion	QUANTITY	0.99+
one step	QUANTITY	0.99+
17 years	QUANTITY	0.99+
130,000 users	QUANTITY	0.99+
United States	LOCATION	0.99+
1800s	DATE	0.99+
last week	DATE	0.99+
Barcelona	LOCATION	0.99+
Palo Alto	LOCATION	0.99+
Dell Technologies	ORGANIZATION	0.99+
last night	DATE	0.99+
MWC '23	EVENT	0.98+
telco	ORGANIZATION	0.98+
one system	QUANTITY	0.98+
one	QUANTITY	0.98+
40 data processes	QUANTITY	0.98+
today	DATE	0.98+
one place	QUANTITY	0.97+
P&L	ORGANIZATION	0.97+
telcos	ORGANIZATION	0.97+
CDO	ORGANIZATION	0.97+
149-year-old	QUANTITY	0.97+
five	QUANTITY	0.97+
single	QUANTITY	0.96+
three components	QUANTITY	0.96+
One	QUANTITY	0.96+

John Kreisa, Couchbase | MWC Barcelona 2023

>> Narrator: TheCUBE's live coverage is made possible by funding from Dell Technologies, creating technologies that drive human progress. (upbeat music intro) (logo background tingles) >> Hi everybody, welcome back to day three of MWC23, my name is Dave Vellante and we're here live at the Theater of Barcelona, Lisa Martin, David Nicholson, John Furrier's in our studio in Palo Alto. Lot of buzz at the show, the Mobile World Daily Today, front page, Netflix chief hits back in fair share row, Greg Peters, the co-CEO of Netflix, talking about how, "Hey, you guys want to tax us, the telcos want to tax us, well, maybe you should help us pay for some of the content. Your margins are higher, you have a monopoly, you know, we're delivering all this value, you're bundling Netflix in, from a lot of ISPs so hold on, you know, pump the brakes on that tax," so that's the big news. Lockheed Martin, FOSS issues, AI guidelines, says, "AI's not going to take over your job anytime soon." Although I would say, your job's going to be AI-powered for the next five years. We're going to talk about data, we've been talking about the disaggregation of the telco stack, part of that stack is a data layer. John Kreisa is here, the CMO of Couchbase, John, you know, we've talked about all week, the disaggregation of the telco stacks, they got, you know, Silicon and operating systems that are, you know, real time OS, highly reliable, you know, compute infrastructure all the way up through a telemetry stack, et cetera. And that's a proprietary block that's really exploding, it's like the big bang, like we saw in the enterprise 20 years ago and we haven't had much discussion about that data layer, sort of that horizontal data layer, that's the market you play in. You know, Couchbase obviously has a lot of telco customers- >> John: That's right. >> We've seen, you know, Snowflake and others launch telco businesses. What are you seeing when you talk to customers at the show? What are they doing with that data layer? >> Yeah, so they're building applications to drive and power unique experiences for their users, but of course, it all starts with where the data is. So they're building mobile applications where they're stretching it out to the edge and you have to move the data to the edge, you have to have that capability to deliver that highly interactive experience to their customers or for their own internal use cases out to that edge, so seeing a lot of that with Couchbase and with our customers in telco. >> So what do the telcos want to do with data? I mean, they've got the telemetry data- >> John: Yeah. >> Now they frequently complain about the over-the-top providers that have used that data, again like Netflix, to identify customer demand for content and they're mopping that up in a big way, you know, certainly Amazon and shopping Google and ads, you know, they're all using that network. But what do the telcos do today and what do they want to do in the future? They're all talking about monetization, how do they monetize that data? >> Yeah, well, by taking that data, there's insight to be had, right? So by usage patterns and what's happening, just as you said, so they can deliver a better experience. It's all about getting that edge, if you will, on their competition and so taking that data, using it in a smart way, gives them that edge to deliver a better service and then grow their business. >> We're seeing a lot of action at the edge and, you know, the edge can be a Home Depot or a Lowe's store, but it also could be the far edge, could be a, you know, an oil drilling, an oil rig, it could be a racetrack, you know, certainly hospitals and certain, you know, situations. So let's think about that edge, where there's maybe not a lot of connectivity, there might be private networks going in, in the future- >> John: That's right. >> Private 5G networks. What's the data flow look like there? Do you guys have any customers doing those types of use cases? >> Yeah, absolutely. >> And what are they doing with the data? >> Yeah, absolutely, we've got customers all across, so telco and transportation, all kinds of service delivery and healthcare, for example, we've got customers who are delivering healthcare out at the edge where they have a remote location, they're able to deliver healthcare, but as you said, there's not always connectivity, so they need to have the applications, need to continue to run and then sync back once they have that connectivity. So it's really having the ability to deliver a service, reliably and then know that that will be synced back to some central server when they have connectivity- >> So the processing might occur where the data- >> Compute at the edge. >> How do you sync back? What is that technology? >> Yeah, so there's, so within, so Couchbase and Couchbase's case, we have an autonomous sync capability that brings it back to the cloud once they get back to whether it's a private network that they want to run over, or if they're doing it over a public, you know, wifi network, once it determines that there's connectivity and, it can be peer-to-peer sync, so different edge apps communicating with each other and then ultimately communicating back to a central server. >> I mean, the other theme here, of course, I call it the software-defined telco, right? But you got to have, you got to run on something, got to have hardware. So you see companies like AWS putting Outposts, out to the edge, Outposts, you know, doesn't really run a lot of database to mind, I mean, it runs RDS, you know, maybe they're going to eventually work with companies like... I mean, you're a partner of AWS- >> John: We are. >> Right? So do you see that kind of cloud infrastructure that's moving to the edge? Do you see that as an opportunity for companies like Couchbase? >> Yeah, we do. We see customers wanting to push more and more of that compute out to the edge and so partnering with AWS gives us that opportunity and we are certified on Outpost and- >> Oh, you are? >> We are, yeah. >> Okay. >> Absolutely. >> When did that, go down? >> That was last year, but probably early last year- >> So I can run Couchbase at the edge, on Outpost? >> Yeah, that's right. >> I mean, you know, Outpost adoption has been slow, we've reported on that, but are you seeing any traction there? Are you seeing any nibbles? >> Starting to see some interest, yeah, absolutely. And again, it has to be for the right use case, but again, for service delivery, things like healthcare and in transportation, you know, they're starting to see where they want to have that compute, be very close to where the actions happen. >> And you can run on, in the data center, right? >> That's right. >> You can run in the cloud, you know, you see HPE with GreenLake, you see Dell with Apex, that's essentially their Outposts. >> Yeah. >> They're saying, "Hey, we're going to take our whole infrastructure and make it as a service." >> Yeah, yeah. >> Right? And so you can participate in those environments- >> We do. >> And then so you've got now, you know, we call it supercloud, you've got the on-prem, you've got the, you can run in the public cloud, you can run at the edge and you want that consistent experience- >> That's right. >> You know, from a data layer- >> That's right. >> So is that really the strategy for a data company is taking or should be taking, that horizontal layer across all those use cases? >> You do need to think holistically about it, because you need to be able to deliver as a, you know, as a provider, wherever the customer wants to be able to consume that application. So you do have to think about any of the public clouds or private networks and all the way to the edge. >> What's different John, about the telco business versus the traditional enterprise? >> Well, I mean, there's scale, I mean, one thing they're dealing with, particularly for end user-facing apps, you're dealing at a very very high scale and the expectation that you're going to deliver a very interactive experience. So I'd say one thing in particular that we are focusing on, is making sure we deliver that highly interactive experience but it's the scale of the number of users and customers that they have, and the expectation that your application's always going to work. >> Speaking of applications, I mean, it seems like that's where the innovation is going to come from. We saw yesterday, GSMA announced, I think eight APIs telco APIs, you know, we were talking on theCUBE, one of the analysts was like, "Eight, that's nothing," you know, "What do these guys know about developers?" But you know, as Daniel Royston said, "Eight's better than zero." >> Right? >> So okay, so we're starting there, but the point being, it's all about the apps, that's where the innovation's going to come from- >> That's right. >> So what are you seeing there, in terms of building on top of the data app? >> Right, well you have to provide, I mean, have to provide the APIs and the access because it is really, the rubber meets the road, with the developers and giving them the ability to create those really rich applications where they want and create the experiences and innovate and change the way that they're giving those experiences. >> Yeah, so what's your relationship with developers at Couchbase? >> John: Yeah. >> I mean, talk about that a little bit- >> Yeah, yeah, so we have a great relationship with developers, something we've been investing more and more in, in terms of things like developer relations teams and community, Couchbase started in open source, continue to be based on open source projects and of course, those are very developer centric. So we provide all the consistent APIs for developers to create those applications, whether it's something on Couchbase Lite, which is our kind of edge-based database, or how they can sync that data back and we actually automate a lot of that syncing which is a very difficult developer task which lends them to one of the developer- >> What I'm trying to figure out is, what's the telco developer look like? Is that a developer that comes from the enterprise and somebody comes from the blockchain world, or AI or, you know, there really doesn't seem to be a lot of developer talk here, but there's a huge opportunity. >> Yeah, yeah. >> And, you know, I feel like, the telcos kind of remind me of, you know, a traditional legacy company trying to get into the developer world, you know, even Oracle, okay, they bought Sun, they got Java, so I guess they have developers, but you know, IBM for years tried with Bluemix, they had to end up buying Red Hat, really, and that gave them the developer community. >> Yep. >> EMC used to have a thing called EMC Code, which was a, you know, good effort, but eh. And then, you know, VMware always trying to do that, but, so as you move up the stack obviously, you have greater developer affinity. Where do you think the telco developer's going to come from? How's that going to evolve? >> Yeah, it's interesting, and I think they're... To kind of get to your first question, I think they're fairly traditional enterprise developers and when we break that down, we look at it in terms of what the developer persona is, are they a front-end developer? Like they're writing that front-end app, they don't care so much about the infrastructure behind or are they a full stack developer and they're really involved in the entire application development lifecycle? Or are they living at the backend and they're really wanting to just focus in on that data layer? So we lend towards all of those different personas and we think about them in terms of the APIs that we create, so that's really what the developers are for telcos is, there's a combination of those front-end and full stack developers and so for them to continue to innovate they need to appeal to those developers and that's technology, like Couchbase, is what helps them do that. >> Yeah and you think about the Apples, you know, the app store model or Apple sort of says, "Okay, here's a developer kit, go create." >> John: Yeah. >> "And then if it's successful, you're going to be successful and we're going to take a vig," okay, good model. >> John: Yeah. >> I think I'm hearing, and maybe I misunderstood this, but I think it was the CEO or chairman of Ericsson on the day one keynotes, was saying, "We are going to monetize the, essentially the telemetry data, you know, through APIs, we're going to charge for that," you know, maybe that's not the best approach, I don't know, I think there's got to be some innovation on top. >> John: Yeah. >> Now maybe some of these greenfield telcos are going to do like, you take like a dish networks, what they're doing, they're really trying to drive development layers. So I think it's like this wild west open, you know, community that's got to be formed and right now it's very unclear to me, do you have any insights there? >> I think it is more, like you said, Wild West, I think there's no emerging standard per se for across those different company types and sort of different pieces of the industry. So consequently, it does need to form some more standards in order to really help it grow and I think you're right, you have to have the right APIs and the right access in order to properly monetize, you have to attract those developers or you're not going to be able to monetize properly. >> Do you think that if, in thinking about your business and you know, you've always sold to telcos, but now it's like there's this transformation going on in telcos, will that become an increasingly larger piece of your business or maybe even a more important piece of your business? Or it's kind of be steady state because it's such a slow moving industry? >> No, it is a big and increasing piece of our business, I think telcos like other enterprises, want to continue to innovate and so they look to, you know, technologies like, Couchbase document database that allows them to have more flexibility and deliver the speed that they need to deliver those kinds of applications. So we see a lot of migration off of traditional legacy infrastructure in order to build that new age interface and new age experience that they want to deliver. >> A lot of buzz in Silicon Valley about open AI and Chat GPT- >> Yeah. >> You know, what's your take on all that? >> Yeah, we're looking at it, I think it's exciting technology, I think there's a lot of applications that are kind of, a little, sort of innovate traditional interfaces, so for example, you can train Chat GPT to create code, sample code for Couchbase, right? You can go and get it to give you that sample app which gets you a headstart or you can actually get it to do a better job of, you know, sorting through your documentation, like Chat GPT can do a better job of helping you get access. So it improves the experience overall for developers, so we're excited about, you know, what the prospect of that is. >> So you're playing around with it, like everybody is- >> Yeah. >> And potentially- >> Looking at use cases- >> Ways tO integrate, yeah. >> Hundred percent. >> So are we. John, thanks for coming on theCUBE. Always great to see you, my friend. >> Great, thanks very much. >> All right, you're welcome. All right, keep it right there, theCUBE will be back live from Barcelona at the theater. SiliconANGLE's continuous coverage of MWC23. Go to siliconangle.com for all the news, theCUBE.net is where all the videos are, keep it right there. (cheerful upbeat music outro)

Published Date : Mar 1 2023

SUMMARY :

that drive human progress. that's the market you play in. We've seen, you know, and you have to move the data to the edge, you know, certainly Amazon that edge, if you will, it could be a racetrack, you know, Do you guys have any customers the applications, need to over a public, you know, out to the edge, Outposts, you know, of that compute out to the edge in transportation, you know, You can run in the cloud, you know, and make it as a service." to deliver as a, you know, and the expectation that But you know, as Daniel Royston said, and change the way that they're continue to be based on open or AI or, you know, there developer world, you know, And then, you know, VMware and so for them to continue to innovate about the Apples, you know, and we're going to take data, you know, through APIs, are going to do like, you and the right access in and so they look to, you know, so we're excited about, you know, yeah. Always great to see you, Go to siliconangle.com for all the news,

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
John	PERSON	0.99+
Greg Peters	PERSON	0.99+
Daniel Royston	PERSON	0.99+
Lisa Martin	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Ericsson	ORGANIZATION	0.99+
David Nicholson	PERSON	0.99+
Palo Alto	LOCATION	0.99+
John Kreisa	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Netflix	ORGANIZATION	0.99+
last year	DATE	0.99+
Silicon Valley	LOCATION	0.99+
GSMA	ORGANIZATION	0.99+
Java	TITLE	0.99+
Lowe	ORGANIZATION	0.99+
first question	QUANTITY	0.99+
Lockheed Martin	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
telcos	ORGANIZATION	0.99+
Dell Technologies	ORGANIZATION	0.99+
Dell	ORGANIZATION	0.99+
yesterday	DATE	0.99+
Eight	QUANTITY	0.99+
one	QUANTITY	0.99+
Chat GPT	TITLE	0.99+
Hundred percent	QUANTITY	0.99+
Apple	ORGANIZATION	0.99+
telco	ORGANIZATION	0.98+
Couchbase	ORGANIZATION	0.98+
John Furrier	PERSON	0.98+
siliconangle.com	OTHER	0.98+
Apex	ORGANIZATION	0.98+
Home Depot	ORGANIZATION	0.98+
early last year	DATE	0.98+
Barcelona	LOCATION	0.98+
20 years ago	DATE	0.98+
MWC23	EVENT	0.97+
Bluemix	ORGANIZATION	0.96+
Sun	ORGANIZATION	0.96+
SiliconANGLE	ORGANIZATION	0.96+
theCUBE	ORGANIZATION	0.95+
GreenLake	ORGANIZATION	0.94+
Apples	ORGANIZATION	0.94+
Snowflake	ORGANIZATION	0.93+
Outpost	ORGANIZATION	0.93+
VMware	ORGANIZATION	0.93+
zero	QUANTITY	0.93+
EMC	ORGANIZATION	0.91+
day three	QUANTITY	0.9+
today	DATE	0.89+
Mobile World Daily Today	TITLE	0.88+
Wild West	ORGANIZATION	0.88+
theCUBE.net	OTHER	0.87+
app store	TITLE	0.86+
one thing	QUANTITY	0.86+
EMC Code	TITLE	0.86+
Couchbase	TITLE	0.85+

Breaking Analysis: MWC 2023 goes beyond consumer & deep into enterprise tech

>> From theCUBE Studios in Palo Alto in Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante. >> While never really meant to be a consumer tech event, the rapid ascendancy of smartphones sucked much of the air out of Mobile World Congress over the years, now MWC. And while the device manufacturers continue to have a major presence at the show, the maturity of intelligent devices, longer life cycles, and the disaggregation of the network stack, have put enterprise technologies front and center in the telco business. Semiconductor manufacturers, network equipment players, infrastructure companies, cloud vendors, software providers, and a spate of startups are eyeing the trillion dollar plus communications industry as one of the next big things to watch this decade. Hello, and welcome to this week's Wikibon CUBE Insights, powered by ETR. In this Breaking Analysis, we bring you part two of our ongoing coverage of MWC '23, with some new data on enterprise players specifically in large telco environments, a brief glimpse at some of the pre-announcement news and corresponding themes ahead of MWC, and some of the key announcement areas we'll be watching at the show on theCUBE. Now, last week we shared some ETR data that showed how traditional enterprise tech players were performing, specifically within the telecoms vertical. Here's a new look at that data from ETR, which isolates the same companies, but cuts the data for what ETR calls large telco. The N in this cut is 196, down from 288 last week when we included all company sizes in the dataset. Now remember the two dimensions here, on the y-axis is net score, or spending momentum, and on the x-axis is pervasiveness in the data set. The table insert in the upper left informs how the dots and companies are plotted, and that red dotted line, the horizontal line at 40%, that indicates a highly elevated net score. Now while the data are not dramatically different in terms of relative positioning, there are a couple of changes at the margin. So just going down the list and focusing on net score. Azure is comparable, but slightly lower in this sector in the large telco than it was overall. Google Cloud comes in at number two, and basically swapped places with AWS, which drops slightly in the large telco relative to overall telco. Snowflake is also slightly down by one percentage point, but maintains its position. Remember Snowflake, overall, its net score is much, much higher when measuring across all verticals. Snowflake comes down in telco, and relative to overall, a little bit down in large telco, but it's making some moves to attack this market that we'll talk about in a moment. Next are Red Hat OpenStack and Databricks. About the same in large tech telco as they were an overall telco. Then there's Dell next that has a big presence at MWC and is getting serious about driving 16G adoption, and new servers, and edge servers, and other partnerships. Cisco and Red Hat OpenShift basically swapped spots when moving from all telco to large telco, as Cisco drops and Red Hat bumps up a bit. And VMware dropped about four percentage points in large telco. Accenture moved up dramatically, about nine percentage points in big telco, large telco relative to all telco. HPE dropped a couple of percentage points. Oracle stayed about the same. And IBM surprisingly dropped by about five points. So look, I understand not a ton of change in terms of spending momentum in the large sector versus telco overall, but some deltas. The bottom line for enterprise players is one, they're just getting started in this new disruption journey that they're on as the stack disaggregates. Two, all these players have experience in delivering horizontal solutions, but now working with partners and identifying big problems to be solved, and three, many of these companies are generally not the fastest moving firms relative to smaller disruptive disruptors. Now, cloud has been an exception in fairness. But the good news for the legacy infrastructure and IT companies is that the telco transformation and the 5G buildout is going to take years. So it's moving at a pace that is very favorable to many of these companies. Okay, so looking at just some of the pre-announcement highlights that have hit the wire this week, I want to give you a glimpse of the diversity of innovation that is occurring in the telecommunication space. You got semiconductor manufacturers, device makers, network equipment players, carriers, cloud vendors, enterprise tech companies, software companies, startups. Now we've included, you'll see in this list, we've included OpeRAN, that logo, because there's so much buzz around the topic and we're going to come back to that. But suffice it to say, there's no way we can cover all the announcements from the 2000 plus exhibitors at the show. So we're going to cherry pick here and make a few call outs. Hewlett Packard Enterprise announced an acquisition of an Italian private cellular network company called AthoNet. Zeus Kerravala wrote about it on SiliconANGLE if you want more details. Now interestingly, HPE has a partnership with Solana, which also does private 5G. But according to Zeus, Solona is more of an out-of-the-box solution, whereas AthoNet is designed for the core and requires more integration. And as you'll see in a moment, there's going to be a lot of talk at the show about private network. There's going to be a lot of news there from other competitors, and we're going to be watching that closely. And while many are concerned about the P5G, private 5G, encroaching on wifi, Kerravala doesn't see it that way. Rather, he feels that these private networks are really designed for more industrial, and you know mission critical environments, like factories, and warehouses that are run by robots, et cetera. 'Cause these can justify the increased expense of private networks. Whereas wifi remains a very low cost and flexible option for, you know, whatever offices and homes. Now, over to Dell. Dell announced its intent to go hard after opening up the telco network with the announcement that in the second half of this year it's going to begin shipping its infrastructure blocks for Red Hat. Remember it's like kind of the converged infrastructure for telco with a more open ecosystem and sort of more flexible, you know, more mature engineered system. Dell has also announced a range of PowerEdge servers for a variety of use cases. A big wide line bringing forth its 16G portfolio and aiming squarely at the telco space. Dell also announced, here we go, a private wireless offering with airspan, and Expedo, and a solution with AthoNet, the company HPE announced it was purchasing. So I guess Dell and HPE are now partnering up in the private wireless space, and yes, hell is freezing over folks. We'll see where that relationship goes in the mid- to long-term. Dell also announced new lab and certification capabilities, which we said last week was going to be critical for the further adoption of open ecosystem technology. So props to Dell for, you know, putting real emphasis and investment in that. AWS also made a number of announcements in this space including private wireless solutions and associated managed services. AWS named Deutsche Telekom, Orange, T-Mobile, Telefonica, and some others as partners. And AWS announced the stepped up partnership, specifically with T-Mobile, to bring AWS services to T-Mobile's network portfolio. Snowflake, back to Snowflake, announced its telecom data cloud. Remember we showed the data earlier, it's Snowflake not as strong in the telco sector, but they're continuing to move toward this go-to market alignment within key industries, realigning their go-to market by vertical. It also announced that AT&T, and a number of other partners, are collaborating to break down data silos specifically in telco. Look, essentially, this is Snowflake taking its core value prop to the telco vertical and forming key partnerships that resonate in the space. So think simplification, breaking down silos, data sharing, eventually data monetization. Samsung previewed its future capability to allow smartphones to access satellite services, something Apple has previously done. AMD, Intel, Marvell, Qualcomm, are all in the act, all the semiconductor players. Qualcomm for example, announced along with Telefonica, and Erickson, a 5G millimeter network that will be showcased in Spain at the event this coming week using Qualcomm Snapdragon chipset platform, based on none other than Arm technology. Of course, Arm we said is going to dominate the edge, and is is clearly doing so. It's got the volume advantage over, you know, traditional Intel, you know, X86 architectures. And it's no surprise that Microsoft is touting its open AI relationship. You're going to hear a lot of AI talk at this conference as is AI is now, you know, is the now topic. All right, we could go on and on and on. There's just so much going on at Mobile World Congress or MWC, that we just wanted to give you a glimpse of some of the highlights that we've been watching. Which brings us to the key topics and issues that we'll be exploring at MWC next week. We touched on some of this last week. A big topic of conversation will of course be, you know, 5G. Is it ever going to become real? Is it, is anybody ever going to make money at 5G? There's so much excitement around and anticipation around 5G. It has not lived up to the hype, but that's because the rollout, as we've previous reported, is going to take years. And part of that rollout is going to rely on the disaggregation of the hardened telco stack, as we reported last week and in previous Breaking Analysis episodes. OpenRAN is a big component of that evolution. You know, as our RAN intelligent controllers, RICs, which essentially the brain of OpenRAN, if you will. Now as we build out 5G networks at massive scale and accommodate unprecedented volumes of data and apply compute-hungry AI to all this data, the issue of energy efficiency is going to be front and center. It has to be. Not only is it a, you know, hot political issue, the reality is that improving power efficiency is compulsory or the whole vision of telco's future is going to come crashing down. So chip manufacturers, equipment makers, cloud providers, everybody is going to be doubling down and clicking on this topic. Let's talk about AI. AI as we said, it is the hot topic right now, but it is happening not only in consumer, with things like ChatGPT. And think about the theme of this Breaking Analysis in the enterprise, AI in the enterprise cannot be ChatGPT. It cannot be error prone the way ChatGPT is. It has to be clean, reliable, governed, accurate. It's got to be ethical. It's got to be trusted. Okay, we're going to have Zeus Kerravala on the show next week and definitely want to get his take on private networks and how they're going to impact wifi. You know, will private networks cannibalize wifi? If not, why not? He wrote about this again on SiliconANGLE if you want more details, and we're going to unpack that on theCUBE this week. And finally, as always we'll be following the data flows to understand where and how telcos, cloud players, startups, software companies, disruptors, legacy companies, end customers, how are they going to make money from new data opportunities? 'Cause we often say in theCUBE, don't ever bet against data. All right, that's a wrap for today. Remember theCUBE is going to be on location at MWC 2023 next week. We got a great set. We're in the walkway in between halls four and five, right in Congress Square, stand CS-60. Look for us, we got a full schedule. If you got a great story or you have news, stop by. We're going to try to get you on the program. I'll be there with Lisa Martin, co-hosting, David Nicholson as well, and the entire CUBE crew, so don't forget to come by and see us. I want to thank Alex Myerson, who's on production and manages the podcast, and Ken Schiffman, as well, in our Boston studio. Kristen Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our editor-in-chief over at SiliconANGLE.com. He does some great editing. Thank you. All right, remember all these episodes they are available as podcasts wherever you listen. All you got to do is search Breaking Analysis podcasts. I publish each week on Wikibon.com and SiliconANGLE.com. All the video content is available on demand at theCUBE.net, or you can email me directly if you want to get in touch David.Vellante@SiliconANGLE.com or DM me @DVellante, or comment on our LinkedIn posts. And please do check out ETR.ai for the best survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights, powered by ETR. Thanks for watching. We'll see you next week at Mobile World Congress '23, MWC '23, or next time on Breaking Analysis. (bright music)

Published Date : Feb 25 2023

SUMMARY :

bringing you data-driven in the mid- to long-term.

ENTITIES

Entity	Category	Confidence
David Nicholson	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Alex Myerson	PERSON	0.99+
Orange	ORGANIZATION	0.99+
Qualcomm	ORGANIZATION	0.99+
HPE	ORGANIZATION	0.99+
Telefonica	ORGANIZATION	0.99+
Kristen Martin	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
AMD	ORGANIZATION	0.99+
Spain	LOCATION	0.99+
T-Mobile	ORGANIZATION	0.99+
Ken Schiffman	PERSON	0.99+
Deutsche Telekom	ORGANIZATION	0.99+
Hewlett Packard Enterprise	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
Cheryl Knight	PERSON	0.99+
Marvell	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Samsung	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
AT&T	ORGANIZATION	0.99+
Dell	ORGANIZATION	0.99+
Intel	ORGANIZATION	0.99+
Rob Hof	PERSON	0.99+
Palo Alto	LOCATION	0.99+
Oracle	ORGANIZATION	0.99+
40%	QUANTITY	0.99+
last week	DATE	0.99+
AthoNet	ORGANIZATION	0.99+
Erickson	ORGANIZATION	0.99+
Congress Square	LOCATION	0.99+
Accenture	ORGANIZATION	0.99+
next week	DATE	0.99+
Mobile World Congress	EVENT	0.99+
Solana	ORGANIZATION	0.99+
Boston	LOCATION	0.99+
two dimensions	QUANTITY	0.99+
ETR	ORGANIZATION	0.99+
MWC '23	EVENT	0.99+
MWC	EVENT	0.99+
288	QUANTITY	0.98+
today	DATE	0.98+
this week	DATE	0.98+
Solona	ORGANIZATION	0.98+
David.Vellante@SiliconANGLE.com	OTHER	0.98+
telco	ORGANIZATION	0.98+
Two	QUANTITY	0.98+
each week	QUANTITY	0.97+
Zeus Kerravala	PERSON	0.97+
MWC 2023	EVENT	0.97+
about five points	QUANTITY	0.97+
theCUBE.net	OTHER	0.97+
Red Hat	ORGANIZATION	0.97+
Snowflake	TITLE	0.96+
one	QUANTITY	0.96+
Databricks	ORGANIZATION	0.96+
three	QUANTITY	0.96+
theCUBE Studios	ORGANIZATION	0.96+

Paola Peraza Calderon & Viraj Parekh, Astronomer | Cube Conversation

(soft electronic music) >> Hey everyone, welcome to this CUBE conversation as part of the AWS Startup Showcase, season three, episode one, featuring Astronomer. I'm your host, Lisa Martin. I'm in the CUBE's Palo Alto Studios, and today excited to be joined by a couple of guests, a couple of co-founders from Astronomer. Viraj Parekh is with us, as is Paola Peraza-Calderon. Thanks guys so much for joining us. Excited to dig into Astronomer. >> Thank you so much for having us. >> Yeah, thanks for having us. >> Yeah, and we're going to be talking about the role of data orchestration. Paola, let's go ahead and start with you. Give the audience that understanding, that context about Astronomer and what it is that you guys do. >> Mm-hmm. Yeah, absolutely. So, Astronomer is a, you know, we're a technology and software company for modern data orchestration, as you said, and we're the driving force behind Apache Airflow. The Open Source Workflow Management tool that's since been adopted by thousands and thousands of users, and we'll dig into this a little bit more. But, by data orchestration, we mean data pipeline, so generally speaking, getting data from one place to another, transforming it, running it on a schedule, and overall just building a central system that tangibly connects your entire ecosystem of data services, right. So what, that's Redshift, Snowflake, DVT, et cetera. And so tangibly, we build, we at Astronomer here build products powered by Apache Airflow for data teams and for data practitioners, so that they don't have to. So, we sell to data engineers, data scientists, data admins, and we really spend our time doing three things. So, the first is that we build Astro, our flagship cloud service that we'll talk more on. But here, we're really building experiences that make it easier for data practitioners to author, run, and scale their data pipeline footprint on the cloud. And then, we also contribute to Apache Airflow as an open source project and community. So, we cultivate the community of humans, and we also put out open source developer tools that actually make it easier for individual data practitioners to be productive in their day-to-day jobs, whether or not they actually use our product and and pay us money or not. And then of course, we also have professional services and education and all of these things around our commercial products that enable folks to use our products and use Airflow as effectively as possible. So yeah, super, super happy with everything we've done and hopefully that gives you an idea of where we're starting. >> Awesome, so when you're talking with those, Paola, those data engineers, those data scientists, how do you define data orchestration and what does it mean to them? >> Yeah, yeah, it's a good question. So, you know, if you Google data orchestration you're going to get something about an automated process for organizing silo data and making it accessible for processing and analysis. But, to your question, what does that actually mean, you know? So, if you look at it from a customer's perspective, we can share a little bit about how we at Astronomer actually do data orchestration ourselves and the problems that it solves for us. So, as many other companies out in the world do, we at Astronomer need to monitor how our own customers use our products, right? And so, we have a weekly meeting, for example, that goes through a dashboard and a dashboarding tool called Sigma where we see the number of monthly customers and how they're engaging with our product. But, to actually do that, you know, we have to use data from our application database, for example, that has behavioral data on what they're actually doing in our product. We also have data from third party API tools, like Salesforce and HubSpot, and other ways in which our customer, we actually engage with our customers and their behavior. And so, our data team internally at Astronomer uses a bunch of tools to transform and use that data, right? So, we use FiveTran, for example, to ingest. We use Snowflake as our data warehouse. We use other tools for data transformations. And even, if we at Astronomer don't do this, you can imagine a data team also using tools like, Monte Carlo for data quality, or Hightouch for Reverse ETL, or things like that. And, I think the point here is that data teams, you know, that are building data-driven organizations have a plethora of tooling to both ingest the right data and come up with the right interfaces to transform and actually, interact with that data. And so, that movement and sort of synchronization of data across your ecosystem is exactly what data orchestration is responsible for. Historically, I think, and Raj will talk more about this, historically, schedulers like KRON and Oozie or Control-M have taken a role here, but we think that Apache Airflow has sort of risen over the past few years as the defacto industry standard for writing data pipelines that do tasks, that do data jobs that interact with that ecosystem of tools in your organization. And so, beyond that sort of data pipeline unit, I think where we see it is that data acquisition is not only writing those data pipelines that move your data, but it's also all the things around it, right, so, CI/CD tool and Secrets Management, et cetera. So, a long-winded answer here, but I think that's how we talk about it here at Astronomer and how we're building our products. >> Excellent. Great context, Paola. Thank you. Viraj, let's bring you into the conversation. Every company these days has to be a data company, right? They've got to be a software company- >> Mm-hmm. >> whether it's my bank or my grocery store. So, how are companies actually doing data orchestration today, Viraj? >> Yeah, it's a great question. So, I think one thing to think about is like, on one hand, you know, data orchestration is kind of a new category that we're helping define, but on the other hand, it's something that companies have been doing forever, right? You need to get data moving to use it, you know. You've got it all in place, aggregate it, cleaning it, et cetera. So, when you look at what companies out there are doing, right. Sometimes, if you're a more kind of born in the cloud company, as we say, you'll adopt all these cloud native tooling things your cloud provider gives you. If you're a bank or another sort of institution like that, you know, you're probably juggling an even wider variety of tools. You're thinking about a cloud migration. You might have things like Kron running in one place, Uzi running somewhere else, Informatics running somewhere else, while you're also trying to move all your workloads to the cloud. So, there's quite a large spectrum of what the current state is for companies. And then, kind of like Paola was saying, Apache Airflow started in 2014, and it was actually started by Airbnb, and they put out this blog post that was like, "Hey here's how we use Apache Airflow to orchestrate our data across all their sources." And really since then, right, it's almost been a decade since then, Airflow emerged as the open source standard, and there's companies of all sorts using it. And, it's really used to tie all these tools together, especially as that number of tools increases, companies move to hybrid cloud, hybrid multi-cloud strategies, and so on and so forth. But you know, what we found is that if you go to any company, especially a larger one and you say like, "Hey, how are you doing data orchestration?" They'll probably say something like, "Well, I have five data teams, so I have eight different ways I do data orchestration." Right. This idea of data orchestration's been there but the right way to do it, kind of all the abstractions you need, the way your teams need to work together, and so on and so forth, hasn't really emerged just yet, right? It's such a quick moving space that companies have to combine what they were doing before with what their new business initiatives are today. So, you know, what we really believe here at Astronomer is Airflow is the core of how you solve data orchestration for any sort of use case, but it's not everything. You know, it needs a little more. And, that's really where our commercial product, Astro comes in, where we've built, not only the most tried and tested airflow experience out there. We do employ a majority of the Airflow Core Committers, right? So, we're kind of really deep in the project. We've also built the right things around developer tooling, observability, and reliability for customers to really rely on Astro as the heart of the way they do data orchestration, and kind of think of it as the foundational layer that helps tie together all the different tools, practices and teams large companies have to do today. >> That foundational layer is absolutely critical. You've both mentioned open source software. Paola, I want to go back to you, and just give the audience an understanding of how open source really plays into Astronomer's mission as a company, and into the technologies like Astro. >> Mm-hmm. Yeah, absolutely. I mean, we, so we at Astronomers started using Airflow and actually building our products because Airflow is open source and we were our own customers at the beginning of our company journey. And, I think the open source community is at the core of everything we do. You know, without that open source community and culture, I think, you know, we have less of a business, and so, we're super invested in continuing to cultivate and grow that. And, I think there's a couple sort of concrete ways in which we do this that personally make me really excited to do my own job. You know, for one, we do things like we organize meetups and we sponsor the Airflow Summit and there's these sort of baseline community efforts that I think are really important and that reminds you, hey, there just humans trying to do their jobs and learn and use both our technology and things that are out there and contribute to it. So, making it easier to contribute to Airflow, for example, is another one of our efforts. As Viraj mentioned, we also employ, you know, engineers internally who are on our team whose full-time job is to make the open source project better. Again, regardless of whether or not you're a customer of ours or not, we want to make sure that we continue to cultivate the Airflow project in and of itself. And, we're also building developer tooling that might not be a part of the Apache Open Source project, but is still open source. So, we have repositories in our own sort of GitHub organization, for example, with tools that individual data practitioners, again customers are not, can use to make them be more productive in their day-to-day jobs with Airflow writing Dags for the most common use cases out there. The last thing I'll say is how important I think we've found it to build sort of educational resources and documentation and best practices. Airflow can be complex. It's been around for a long time. There's a lot of really, really rich feature sets. And so, how do we enable folks to actually use those? And that comes in, you know, things like webinars, and best practices, and courses and curriculum that are free and accessible and open to the community are just some of the ways in which I think we're continuing to invest in that open source community over the next year and beyond. >> That's awesome. It sounds like open source is really core, not only to the mission, but really to the heart of the organization. Viraj, I want to go back to you and really try to understand how does Astronomer fit into the wider modern data stack and ecosystem? Like what does that look like for customers? >> Yeah, yeah. So, both in the open source and with our commercial customers, right? Folks everywhere are trying to tie together a huge variety of tools in order to start making sense of their data. And you know, I kind of think of it almost like as like a pyramid, right? At the base level, you need things like data reliability, data, sorry, data freshness, data availability, and so on and so forth, right? You just need your data to be there. (coughs) I'm sorry. You just need your data to be there, and you need to make it predictable when it's going to be there. You need to make sure it's kind of correct at the highest level, some quality checks, and so on and so forth. And oftentimes, that kind of takes the case of ELT or ETL use cases, right? Taking data from somewhere and moving it somewhere else, usually into some sort of analytics destination. And, that's really what businesses can do to just power the core parts of getting insights into how their business is going, right? How much revenue did I had? What's in my pipeline, salesforce, and so on and so forth. Once that kind of base foundation is there and people can get the data they need, how they need it, it really opens up a lot for what customers can do. You know, I think one of the trendier things out there right now is MLOps, and how do companies actually put machine learning into production? Well, when you think about it you kind of have to squint at it, right? Like, machine learning pipelines are really just any other data pipeline. They just have a certain set of needs that might not not be applicable to ELT pipelines. And, when you kind of have a common layer to tie together all the ways data can move through your organization, that's really what we're trying to make it so companies can do. And, that happens in financial services where, you know, we have some customers who take app data coming from their mobile apps, and actually run it through their fraud detection services to make sure that all the activity is not fraudulent. We have customers that will run sports betting models on our platform where they'll take data from a bunch of public APIs around different sporting events that are happening, transform all of that in a way their data scientist can build models with it, and then actually bet on sports based on that output. You know, one of my favorite use cases I like to talk about that we saw in the open source is we had there was one company whose their business was to deliver blood transfusions via drone into remote parts of the world. And, it was really cool because they took all this data from all sorts of places, right? Kind of orchestrated all the aggregation and cleaning and analysis that happened had to happen via airflow and the end product would be a drone being shot out into a real remote part of the world to actually give somebody blood who needed it there. Because it turns out for certain parts of the world, the easiest way to deliver blood to them is via drone and not via some other, some other thing. So, these kind of, all the things people do with the modern data stack is absolutely incredible, right? Like you were saying, every company's trying to be a data-driven company. What really energizes me is knowing that like, for all those best, super great tools out there that power a business, we get to be the connective tissue, or the, almost like the electricity that kind of ropes them all together and makes so people can actually do what they need to do. >> Right. Phenomenal use cases that you just described, Raj. I mean, just the variety alone of what you guys are able to do and impact is so cool. So Paola, when you're with those data engineers, those data scientists, and customer conversations, what's your pitch? Why use Astro? >> Mm-hmm. Yeah, yeah, it's a good question. And honestly, to piggyback off of Viraj, there's so many. I think what keeps me so energized is how mission critical both our product and data orchestration is, and those use cases really are incredible and we work with customers of all shapes and sizes. But, to answer your question, right, so why use Astra? Why use our commercial products? There's so many people using open source, why pay for something more than that? So, you know, the baseline for our business really is that Airflow has grown exponentially over the last five years, and like we said has become an industry standard that we're confident there's a huge opportunity for us as a company and as a team. But, we also strongly believe that being great at running Airflow, you know, doesn't make you a successful company at what you do. What makes you a successful company at what you do is building great products and solving problems and solving pin points of your own customers, right? And, that differentiating value isn't being amazing at running Airflow. That should be our job. And so, we want to abstract those customers from meaning to do things like manage Kubernetes infrastructure that you need to run Airflow, and then hiring someone full-time to go do that. Which can be hard, but again doesn't add differentiating value to your team, or to your product, or to your customers. So, folks to get away from managing that infrastructure sort of a base, a base layer. Folks who are looking for differentiating features that make their team more productive and allows them to spend less time tweaking Airflow configurations and more time working with the data that they're getting from their business. For help, getting, staying up with Airflow releases. There's a ton of, we've actually been pretty quick to come out with new Airflow features and releases, and actually just keeping up with that feature set and working strategically with a partner to help you make the most out of those feature sets is a key part of it. And, really it's, especially if you're an organization who currently is committed to using Airflow, you likely have a lot of Airflow environments across your organization. And, being able to see those Airflow environments in a single place and being able to enable your data practitioners to create Airflow environments with a click of a button, and then use, for example, our command line to develop your Airflow Dags locally and push them up to our product, and use all of the sort of testing and monitoring and observability that we have on top of our product is such a key. It sounds so simple, especially if you use Airflow, but really those things are, you know, baseline value props that we have for the customers that continue to be excited to work with us. And of course, I think we can go beyond that and there's, we have ambitions to add whole, a whole bunch of features and expand into different types of personas. >> Right? >> But really our main value prop is for companies who are committed to Airflow and want to abstract themselves and make use of some of the differentiating features that we now have at Astronomer. >> Got it. Awesome. >> Thank you. One thing, one thing I'll add to that, Paola, and I think you did a good job of saying is because every company's trying to be a data company, companies are at different parts of their journey along that, right? And we want to meet customers where they are, and take them through it to where they want to go. So, on one end you have folks who are like, "Hey, we're just building a data team here. We have a new initiative. We heard about Airflow. How do you help us out?" On the farther end, you know, we have some customers that have been using Airflow for five plus years and they're like, "Hey, this is awesome. We have 10 more teams we want to bring on. How can you help with this? How can we do more stuff in the open source with you? How can we tell our story together?" And, it's all about kind of taking this vast community of data users everywhere, seeing where they're at, and saying like, "Hey, Astro and Airflow can take you to the next place that you want to go." >> Which is incredibly- >> Mm-hmm. >> and you bring up a great point, Viraj, that every company is somewhere in a different place on that journey. And it's, and it's complex. But it sounds to me like a lot of what you're doing is really stripping away a lot of the complexity, really enabling folks to use their data as quickly as possible, so that it's relevant and they can serve up, you know, the right products and services to whoever wants what. Really incredibly important. We're almost out of time, but I'd love to get both of your perspectives on what's next for Astronomer. You give us a a great overview of what the company's doing, the value in it for customers. Paola, from your lens as one of the co-founders, what's next? >> Yeah, I mean, I think we'll continue to, I think cultivate in that open source community. I think we'll continue to build products that are open sourced as part of our ecosystem. I also think that we'll continue to build products that actually make Airflow, and getting started with Airflow, more accessible. So, sort of lowering that barrier to entry to our products, whether that's price wise or infrastructure requirement wise. I think making it easier for folks to get started and get their hands on our product is super important for us this year. And really it's about, I think, you know, for us, it's really about focused execution this year and all of the sort of core principles that we've been talking about. And continuing to invest in all of the things around our product that again, enable teams to use Airflow more effectively and efficiently. >> And that efficiency piece is, everybody needs that. Last question, Viraj, for you. What do you see in terms of the next year for Astronomer and for your role? >> Yeah, you know, I think Paola did a really good job of laying it out. So it's, it's really hard to disagree with her on anything, right? I think executing is definitely the most important thing. My own personal bias on that is I think more than ever it's important to really galvanize the community around airflow. So, we're going to be focusing on that a lot. We want to make it easier for our users to get get our product into their hands, be that open source users or commercial users. And last, but certainly not least, is we're also really excited about Data Lineage and this other open source project in our umbrella called Open Lineage to make it so that there's a standard way for users to get lineage out of different systems that they use. When we think about what's in store for data lineage and needing to audit the way automated decisions are being made. You know, I think that's just such an important thing that companies are really just starting with, and I don't think there's a solution that's emerged that kind of ties it all together. So, we think that as we kind of grow the role of Airflow, right, we can also make it so that we're helping solve, we're helping customers solve their lineage problems all in Astro, which is our kind of the best of both worlds for us. >> Awesome. I can definitely feel and hear the enthusiasm and the passion that you both bring to Astronomer, to your customers, to your team. I love it. We could keep talking more and more, so you're going to have to come back. (laughing) Viraj, Paola, thank you so much for joining me today on this showcase conversation. We really appreciate your insights and all the context that you provided about Astronomer. >> Thank you so much for having us. >> My pleasure. For my guests, I'm Lisa Martin. You're watching this Cube conversation. (soft electronic music)

Published Date : Feb 21 2023

SUMMARY :

to this CUBE conversation Thank you so much and what it is that you guys do. and hopefully that gives you an idea and the problems that it solves for us. to be a data company, right? So, how are companies actually kind of all the abstractions you need, and just give the And that comes in, you of the organization. and analysis that happened that you just described, Raj. that you need to run Airflow, that we now have at Astronomer. Awesome. and I think you did a good job of saying and you bring up a great point, Viraj, and all of the sort of core principles and for your role? and needing to audit the and all the context that you (soft electronic music)

ENTITIES

Entity	Category	Confidence
Viraj Parekh	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Paola	PERSON	0.99+
Viraj	PERSON	0.99+
2014	DATE	0.99+
Astronomer	ORGANIZATION	0.99+
Paola Peraza-Calderon	PERSON	0.99+
Paola Peraza Calderon	PERSON	0.99+
Airflow	ORGANIZATION	0.99+
Airbnb	ORGANIZATION	0.99+
five plus years	QUANTITY	0.99+
Astro	ORGANIZATION	0.99+
Raj	PERSON	0.99+
Uzi	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
first	QUANTITY	0.99+
both	QUANTITY	0.99+
today	DATE	0.99+
Kron	ORGANIZATION	0.99+
10 more teams	QUANTITY	0.98+
Astronomers	ORGANIZATION	0.98+
Astra	ORGANIZATION	0.98+
one	QUANTITY	0.98+
Airflow	TITLE	0.98+
Informatics	ORGANIZATION	0.98+
Monte Carlo	TITLE	0.98+
this year	DATE	0.98+
HubSpot	ORGANIZATION	0.98+
one company	QUANTITY	0.97+
Astronomer	TITLE	0.97+
next year	DATE	0.97+
Apache	ORGANIZATION	0.97+
Airflow Summit	EVENT	0.97+
AWS	ORGANIZATION	0.95+
both worlds	QUANTITY	0.93+
KRON	ORGANIZATION	0.93+
CUBE	ORGANIZATION	0.92+
M	ORGANIZATION	0.92+
Redshift	TITLE	0.91+
Snowflake	TITLE	0.91+
five data teams	QUANTITY	0.91+
GitHub	ORGANIZATION	0.91+
Oozie	ORGANIZATION	0.9+
Data Lineage	ORGANIZATION	0.9+

Breaking Analysis: MWC 2023 highlights telco transformation & the future of business

>> From the Cube Studios in Palo Alto in Boston, bringing you data-driven insights from The Cube and ETR. This is "Breaking Analysis" with Dave Vellante. >> The world's leading telcos are trying to shed the stigma of being monopolies lacking innovation. Telcos have been great at operational efficiency and connectivity and living off of transmission, and the costs and expenses or revenue associated with that transmission. But in a world beyond telephone poles and basic wireless and mobile services, how will telcos modernize and become more agile and monetize new opportunities brought about by 5G and private wireless and a spate of new innovations and infrastructure, cloud data and apps? Hello, and welcome to this week's Wikibon CUBE Insights powered by ETR. In this breaking analysis and ahead of Mobile World Congress or now, MWC23, we explore the evolution of the telco business and how the industry is in many ways, mimicking transformations that took place decades ago in enterprise IT. We'll model some of the traditional enterprise vendors using ETR data and investigate how they're faring in the telecommunications sector, and we'll pose some of the key issues facing the industry this decade. First, let's take a look at what the GSMA has in store for MWC23. GSMA is the host of what used to be called Mobile World Congress. They've set the theme for this year's event as "Velocity" and they've rebranded MWC to reflect the fact that mobile technology is only one part of the story. MWC has become one of the world's premier events highlighting innovations not only in Telco, mobile and 5G, but the collision between cloud, infrastructure, apps, private networks, smart industries, machine intelligence, and AI, and more. MWC comprises an enormous ecosystem of service providers, technology companies, and firms from virtually every industry including sports and entertainment. And as well, GSMA, along with its venue partner at the Fira Barcelona, have placed a major emphasis on sustainability and public and private partnerships. Virtually every industry will be represented at the event because every industry is impacted by the trends and opportunities in this space. GSMA has said it expects 80,000 attendees at MWC this year, not quite back to 2019 levels, but trending in that direction. Of course, attendance from Chinese participants has historically been very high at the show, and obviously the continued travel issues from that region are affecting the overall attendance, but still very strong. And despite these concerns, Huawei, the giant Chinese technology company. has the largest physical presence of any exhibitor at the show. And finally, GSMA estimates that more than $300 million in economic benefit will result from the event which takes place at the end of February and early March. And The Cube will be back at MWC this year with a major presence thanks to our anchor sponsor, Dell Technologies and other supporters of our content program, including Enterprise Web, ArcaOS, VMware, Snowflake, Cisco, AWS, and others. And one of the areas we're interested in exploring is the evolution of the telco stack. It's a topic that's often talked about and one that we've observed taking place in the 1990s when the vertically integrated IBM mainframe monopoly gave way to a disintegrated and horizontal industry structure. And in many ways, the same thing is happening today in telecommunications, which is shown on the left-hand side of this diagram. Historically, telcos have relied on a hardened, integrated, and incredibly reliable, and secure set of hardware and software services that have been fully vetted and tested, and certified, and relied upon for decades. And at the top of that stack on the left are the crown jewels of the telco stack, the operational support systems and the business support systems. For the OSS, we're talking about things like network management, network operations, service delivery, quality of service, fulfillment assurance, and things like that. For the BSS systems, these refer to customer-facing elements of the stack, like revenue, order management, what products they sell, billing, and customer service. And what we're seeing is telcos have been really good at operational efficiency and making money off of transport and connectivity, but they've lacked the innovation in services and applications. They own the pipes and that works well, but others, be the over-the-top content companies, or private network providers and increasingly, cloud providers have been able to bypass the telcos, reach around them, if you will, and drive innovation. And so, the right-most diagram speaks to the need to disaggregate pieces of the stack. And while the similarities to the 1990s in enterprise IT are greater than the differences, there are things that are different. For example, the granularity of hardware infrastructure will not likely be as high where competition occurred back in the 90s at every layer of the value chain with very little infrastructure integration. That of course changed in the 2010s with converged infrastructure and hyper-converged and also software defined. So, that's one difference. And the advent of cloud, containers, microservices, and AI, none of that was really a major factor in the disintegration of legacy IT. And that probably means that disruptors can move even faster than did the likes of Intel and Microsoft, Oracle, Cisco, and the Seagates of the 1990s. As well, while many of the products and services will come from traditional enterprise IT names like Dell, HPE, Cisco, Red Hat, VMware, AWS, Microsoft, Google, et cetera, many of the names are going to be different and come from traditional network equipment providers. These are names like Ericsson and Huawei, and Nokia, and other names, like Wind River, and Rakuten, and Dish Networks. And there are enormous opportunities in data to help telecom companies and their competitors go beyond telemetry data into more advanced analytics and data monetization. There's also going to be an entirely new set of apps based on the workloads and use cases ranging from hospitals, sports arenas, race tracks, shipping ports, you name it. Virtually every vertical will participate in this transformation as the industry evolves its focus toward innovation, agility, and open ecosystems. Now remember, this is not a binary state. There are going to be greenfield companies disrupting the apple cart, but the incumbent telcos are going to have to continue to ensure newer systems work with their legacy infrastructure, in their OSS and BSS existing systems. And as we know, this is not going to be an overnight task. Integration is a difficult thing, transformations, migrations. So that's what makes this all so interesting because others can come in with Greenfield and potentially disrupt. There'll be interesting partnerships and ecosystems will form and coalitions will also form. Now, we mentioned that several traditional enterprise companies are or will be playing in this space. Now, ETR doesn't have a ton of data on specific telecom equipment and software providers, but it does have some interesting data that we cut for this breaking analysis. What we're showing here in this graphic is some of the names that we've followed over the years and how they're faring. Specifically, we did the cut within the telco sector. So the Y-axis here shows net score or spending velocity. And the horizontal axis, that shows the presence or pervasiveness in the data set. And that table insert in the upper left, that informs as to how the dots are plotted. You know, the two columns there, net score and the ends. And that red-dotted line, that horizontal line at 40%, that is an indicator of a highly elevated level. Anything above that, we consider quite outstanding. And what we'll do now is we'll comment on some of the cohorts and share with you how they're doing in telecommunications, and that sector, that vertical relative to their position overall in the data set. Let's start with the public cloud players. They're prominent in every industry. Telcos, telecommunications is no exception and it's quite an interesting cohort here. On the one hand, they can help telecommunication firms modernize and become more agile by eliminating the heavy lifting and you know, all the cloud, you know, value prop, data center costs, and the cloud benefits. At the same time, public cloud players are bringing their services to the edge, building out their own global networks and are a disruptive force to traditional telcos. All right, let's talk about Azure first. Their net score is basically identical to telco relative to its overall average. AWS's net score is higher in telco by just a few percentage points. Google Cloud platform is eight percentage points higher in telco with a 53% net score. So all three hyperscalers have an equal or stronger presence in telco than their average overall. Okay, let's look at the traditional enterprise hardware and software infrastructure cohort. Dell, Cisco, HPE, Red Hat, VMware, and Oracle. We've highlighted in this chart just as sort of indicators or proxies. Dell's net score's 10 percentage points higher in telco than its overall average. Interesting. Cisco's is a bit higher. HPE's is actually lower by about nine percentage points in the ETR survey, and VMware's is lower by about four percentage points. Now, Red Hat is really interesting. OpenStack, as we've previously reported is popular with telcos who want to build out their own private cloud. And the data shows that Red Hat OpenStack's net score is 15 percentage points higher in the telco sector than its overall average. OpenShift, on the other hand, has a net score that's four percentage points lower in telco than its overall average. So this to us talks to the pace of adoption of microservices and containers. You know, it's going to happen, but it's going to happen more slowly. Finally, Oracle's spending momentum is somewhat lower in the sector than its average, despite the firm having a decent telco business. IBM and Accenture, heavy services companies are both lower in this sector than their average. And real quickly, snowflake's net score is much lower by about 12 percentage points relative to its very high average net score of 62%. But we look for them to be a player in this space as telcos need to modernize their analytics stack and share data in a governed manner. Databricks' net score is also much lower than its average by about 13 points. And same, I would expect them to be a player as open architectures and cloud gains steam in telco. All right, let's close out now on what we're going to be talking about at MWC23 and some of the key issues that we'll be unpacking. We've talked about stack disaggregation in this breaking analysis, but the key here will be the pace at which it will reach the operational efficiency and reliability of closed stacks. Telcos, you know, in a large part, they're engineering heavy firms and much of their work takes place, kind of in the basement, in the dark. It's not really a big public hype machine, and they tend to move slowly and cautiously. While they understand the importance of agility, they're going to be careful because, you know, it's in their DNA. And so at the same time, if they don't move fast enough, they're going to get hurt and disrupted by competitors. So that's going to be a topic of conversation, and we'll be looking for proof points. And the other comment I'll make is around integration. Telcos because of their conservatism will benefit from better testing and those firms that can innovate on the testing front and have labs and certifications and innovate at that level, with an ecosystem are going to be in a better position. Because open sometimes means wild west. So the more players like Dell, HPE, Cisco, Red Hat, et cetera, that do that and align with their ecosystems and provide those resources, the faster adoption is going to go. So we'll be looking for, you know, who's actually doing that, Open RAN or Radio Access Networks. That fits in this discussion because O-RAN is an emerging network architecture. It essentially enables the use of open technologies from an ecosystem and over time, look at O-RAN is going to be open, but the questions, you know, a lot of questions remain as to when it will be able to deliver the operational efficiency of traditional RAN. Got some interesting dynamics going on. Rakuten is a company that's working hard on this problem, really focusing on operational efficiency. Then you got Dish Networks. They're also embracing O-RAN. They're coming at it more from service innovation. So that's something that we'll be monitoring and unpacking. We're going to look at cloud as a disruptor. On the one hand, cloud can help drive agility, as we said earlier and optionality, and innovation for incumbent telcos. But the flip side is going to also do the same for startups trying to disrupt and cloud attracts startups. While some of the telcos are actually embracing the cloud, many are being cautious. So that's going to be an interesting topic of discussion. And there's private wireless networks and 5G, and hyperlocal private networks, they're being deployed, you know, at the edge. This idea of open edge is also a really hot topic and this trend is going to accelerate. You know, the importance here is that the use cases are going to be widely varied. The needs of a hospital are going to be different than those of a sports venue are different from a remote drilling location, and energy or a concert venue. Things like real-time AI inference and data flows are going to bring new services and monetization opportunities. And many firms are going to be bypassing traditional telecommunications networks to build these out. Satellites as well, we're going to see, you know, in this decade, you're going to have, you're going to look down at Google Earth and you're going to see real-time. You know, today you see snapshots and so, lots of innovations going in that space. So how is this going to disrupt industries and traditional industry structures? Now, as always, we'll be looking at data angles, right? 'Cause it's in The Cube's DNA to follow the data and what opportunities and risks data brings. The Cube is going to be on location at MWC23 at the end of the month. We got a great set. We're in the walkway between halls four and five, right in Congress Square, it's booths CS60. So we'll have a full, they're called Stan CS60. We have a full schedule. I'm going to be there with Lisa Martin, Dave Nicholson and the entire Cube crew, so don't forget to stop by. All right, that's a wrap. I want to thank Alex Myerson, who's on production and manages the podcast, Ken Schiffman as well. Kristin Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our editor-in-chief over at Silicon Angle, does some great stuff for us. Thank you all. Remember, all these episodes are available as podcasts. Wherever you listen, just search "Breaking Analysis" podcasts I publish each week on wikibon.com and silicon angle.com. And all the video content is available on demand at thecube.net. You can email me directly at david.vellante@silicon angle.com. You can DM me at dvellante or comment on my LinkedIn post. Please do check out etr.ai for the best survey data in the enterprise tech business. This is Dave Vellante for The Cube Insights powered by ETR. Thanks for watching and we'll see you at Mobile World Congress, and/or at next time on "Breaking Analysis." (bright music) (bright music fades)

Published Date : Feb 18 2023

SUMMARY :

From the Cube Studios and some of the key issues

ENTITIES

Entity	Category	Confidence
Alex Myerson	PERSON	0.99+
Lisa Martin	PERSON	0.99+
Dave Nicholson	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
Ericsson	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Dell	ORGANIZATION	0.99+
Huawei	ORGANIZATION	0.99+
Ken Schiffman	PERSON	0.99+
Kristin Martin	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Nokia	ORGANIZATION	0.99+
Rakuten	ORGANIZATION	0.99+
Rob Hof	PERSON	0.99+
Oracle	ORGANIZATION	0.99+
Red Hat	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
GSMA	ORGANIZATION	0.99+
Accenture	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
2019	DATE	0.99+
53%	QUANTITY	0.99+
Palo Alto	LOCATION	0.99+
Wind River	ORGANIZATION	0.99+
HPE	ORGANIZATION	0.99+
Dell Technologies	ORGANIZATION	0.99+
more than $300 million	QUANTITY	0.99+
40%	QUANTITY	0.99+
Telcos	ORGANIZATION	0.99+
Congress Square	LOCATION	0.99+
First	QUANTITY	0.99+
VMware	ORGANIZATION	0.99+
Telco	ORGANIZATION	0.99+
Dish Networks	ORGANIZATION	0.99+
telco	ORGANIZATION	0.99+
2010s	DATE	0.99+
Intel	ORGANIZATION	0.99+
david.vellante@silicon angle.com	OTHER	0.99+
MWC23	EVENT	0.99+
1990s	DATE	0.99+
62%	QUANTITY	0.99+
Mobile World Congress	EVENT	0.99+
two columns	QUANTITY	0.99+
each week	QUANTITY	0.99+
Seagates	ORGANIZATION	0.99+
Red Hat	ORGANIZATION	0.99+
today	DATE	0.99+
early March	DATE	0.99+
both	QUANTITY	0.99+
thecube.net	OTHER	0.99+
MWC	EVENT	0.99+
ETR	ORGANIZATION	0.98+
this year	DATE	0.98+
Cube Studios	ORGANIZATION	0.98+
one part	QUANTITY	0.98+
Chinese	OTHER	0.98+
Boston	LOCATION	0.98+
decades ago	DATE	0.97+
three	QUANTITY	0.97+
90s	DATE	0.97+
about 13 points	QUANTITY	0.97+

Ed Walsh & Thomas Hazel | A New Database Architecture for Supercloud

(bright music) >> Hi, everybody, this is Dave Vellante, welcome back to Supercloud 2. Last August, at the first Supercloud event, we invited the broader community to help further define Supercloud, we assessed its viability, and identified the critical elements and deployment models of the concept. The objectives here at Supercloud too are, first of all, to continue to tighten and test the concept, the second is, we want to get real world input from practitioners on the problems that they're facing and the viability of Supercloud in terms of applying it to their business. So on the program, we got companies like Walmart, Sachs, Western Union, Ionis Pharmaceuticals, NASDAQ, and others. And the third thing that we want to do is we want to drill into the intersection of cloud and data to project what the future looks like in the context of Supercloud. So in this segment, we want to explore the concept of data architectures and what's going to be required for Supercloud. And I'm pleased to welcome one of our Supercloud sponsors, ChaosSearch, Ed Walsh is the CEO of the company, with Thomas Hazel, who's the Founder, CTO, and Chief Scientist. Guys, good to see you again, thanks for coming into our Marlborough studio. >> Always great. >> Great to be here. >> Okay, so there's a little debate, I'm going to put you right in the spot. (Ed chuckling) A little debate going on in the community started by Bob Muglia, a former CEO of Snowflake, and he was at Microsoft for a long time, and he looked at the Supercloud definition, said, "I think you need to tighten it up a little bit." So, here's what he came up with. He said, "A Supercloud is a platform that provides a programmatically consistent set of services hosted on heterogeneous cloud providers." So he's calling it a platform, not an architecture, which was kind of interesting. And so presumably the platform owner is going to be responsible for the architecture, but Dr. Nelu Mihai, who's a computer scientist behind the Cloud of Clouds Project, he chimed in and responded with the following. He said, "Cloud is a programming paradigm supporting the entire lifecycle of applications with data and logic natively distributed. Supercloud is an open architecture that integrates heterogeneous clouds in an agnostic manner." So, Ed, words matter. Is this an architecture or is it a platform? >> Put us on the spot. So, I'm sure you have concepts, I would say it's an architectural or design principle. Listen, I look at Supercloud as a mega trend, just like cloud, just like data analytics. And some companies are using the principle, design principles, to literally get dramatically ahead of everyone else. I mean, things you couldn't possibly do if you didn't use cloud principles, right? So I think it's a Supercloud effect, you're able to do things you're not able to. So I think it's more a design principle, but if you do it right, you get dramatic effect as far as customer value. >> So the conversation that we were having with Muglia, and Tristan Handy of dbt Labs, was, I'll set it up as the following, and, Thomas, would love to get your thoughts, if you have a CRM, think about applications today, it's all about forms and codifying business processes, you type a bunch of stuff into Salesforce, and all the salespeople do it, and this machine generates a forecast. What if you have this new type of data app that pulls data from the transaction system, the e-commerce, the supply chain, the partner ecosystem, et cetera, and then, without humans, actually comes up with a plan. That's their vision. And Muglia was saying, in order to do that, you need to rethink data architectures and database architectures specifically, you need to get down to the level of how the data is stored on the disc. What are your thoughts on that? Well, first of all, I'm going to cop out, I think it's actually both. I do think it's a design principle, I think it's not open technology, but open APIs, open access, and you can build a platform on that design principle architecture. Now, I'm a database person, I love solving the database problems. >> I'm waited for you to launch into this. >> Yeah, so I mean, you know, Snowflake is a database, right? It's a distributed database. And we wanted to crack those codes, because, multi-region, multi-cloud, customers wanted access to their data, and their data is in a variety of forms, all these services that you're talked about. And so what I saw as a core principle was cloud object storage, everyone streams their data to cloud object storage. From there we said, well, how about we rethink database architecture, rethink file format, so that we can take each one of these services and bring them together, whether distributively or centrally, such that customers can access and get answers, whether it's operational data, whether it's business data, AKA search, or SQL, complex distributed joins. But we had to rethink the architecture. I like to say we're not a first generation, or a second, we're a third generation distributed database on pure, pure cloud storage, no caching, no SSDs. Why? Because all that availability, the cost of time, is a struggle, and cloud object storage, we think, is the answer. >> So when you're saying no caching, so when I think about how companies are solving some, you know, pretty hairy problems, take MySQL Heatwave, everybody thought Oracle was going to just forget about MySQL, well, they come out with Heatwave. And the way they solve problems, and you see their benchmarks against Amazon, "Oh, we crush everybody," is they put it all in memory. So you said no caching? You're not getting performance through caching? How is that true, and how are you getting performance? >> Well, so five, six years ago, right? When you realize that cloud object storage is going to be everywhere, and it's going to be a core foundational, if you will, fabric, what would you do? Well, a lot of times the second generation say, "We'll take it out of cloud storage, put in SSDs or something, and put into cache." And that adds a lot of time, adds a lot of costs. But I said, what if, what if we could actually make the first read hot, the first read distributed joins and searching? And so what we went out to do was said, we can't cache, because that's adds time, that adds cost. We have to make cloud object storage high performance, like it feels like a caching SSD. That's where our patents are, that's where our technology is, and we've spent many years working towards this. So, to me, if you can crack that code, a lot of these issues we're talking about, multi-region, multicloud, different services, everybody wants to send their data to the data lake, but then they move it out, we said, "Keep it right there." >> You nailed it, the data gravity. So, Bob's right, the data's coming in, and you need to get the data from everywhere, but you need an environment that you can deal with all that different schema, all the different type of technology, but also at scale. Bob's right, you cannot use memory or SSDs to cache that, that doesn't scale, it doesn't scale cost effectively. But if you could, and what you did, is you made object storage, S3 first, but object storage, the only persistence by doing that. And then we get performance, we should talk about it, it's literally, you know, hundreds of terabytes of queries, and it's done in seconds, it's done without memory caching. We have concepts of caching, but the only caching, the only persistence, is actually when we're doing caching, we're just keeping another side-eye track of things on the S3 itself. So we're using, actually, the object storage to be a database, which is kind of where Bob was saying, we agree, but that's what you started at, people thought you were crazy. >> And maybe make it live. Don't think of it as archival or temporary space, make it live, real time streaming, operational data. What we do is make it smart, we see the data coming in, we uniquely index it such that you can get your use cases, that are search, observability, security, or backend operational. But we don't have to have this, I dunno, static, fixed, siloed type of architecture technologies that were traditionally built prior to Supercloud thinking. >> And you don't have to move everything, essentially, you can do it wherever the data lands, whatever cloud across the globe, you're able to bring it together, you get the cost effectiveness, because the only persistence is the cheapest storage persistent layer you can buy. But the key thing is you cracked the code. >> We had to crack the code, right? That was the key thing. >> That's where the plans are. >> And then once you do that, then everything else gets easier to scale, your architecture, across regions, across cloud. >> Now, it's a general purpose database, as Bob was saying, but we use that database to solve a particular issue, which is around operational data, right? So, we agree with Bob's. >> Interesting. So this brings me to this concept of data, Jimata Gan is one of our speakers, you know, we talk about data fabric, which is a NetApp, originally NetApp concept, Gartner's kind of co-opted it. But so, the basic concept is, data lives everywhere, whether it's an S3 bucket, or a SQL database, or a data lake, it's just a node on the data mesh. So in your view, how does this fit in with Supercloud? Ed, you've said that you've built, essentially, an enabler for that, for the data mesh, I think you're an enabler for the Supercloud-like principles. This is a big, chewy opportunity, and it requires, you know, a team approach. There's got to be an ecosystem, there's not going to be one Supercloud to rule them all, so where does the ecosystem fit into the discussion, and where do you fit into the ecosystem? >> Right, so we agree completely, there's not one Supercloud in effect, but we use Supercloud principles to build our platform, and then, you know, the ecosystem's going to be built on leveraging what everyone else's secret powers are, right? So our power, our superpower, based upon what we built is, we deal with, if you're having any scale, or cost effective scale issues, with data, machine generated data, like business observability or security data, we are your force multiplier, we will take that in singularly, just let it, simply put it in your object storage wherever it sits, and we give you uniformity access to that using OpenAPI access, SQL, or you know, Elasticsearch API. So, that's what we do, that's our superpower. So I'll play it into data mesh, that's a perfect, we are a node on a data mesh, but I'll play it in the soup about how, the ecosystem, we see it kind of playing, and we talked about it in just in the last couple days, how we see this kind of possibly. Short term, our superpowers, we deal with this data that's coming at these environments, people, customers, building out observability or security environments, or vendors that are selling their own Supercloud, I do observability, the Datadogs of the world, dot dot dot, the Splunks of the world, dot dot dot, and security. So what we do is we fit in naturally. What we do is a cost effective scale, just land it anywhere in the world, we deal with ingest, and it's a cost effective, an order of magnitude, or two or three order magnitudes more cost effective. Allows them, their customers are asking them to do the impossible, "Give me fast monitoring alerting. I want it snappy, but I want it to keep two years of data, (laughs) and I want it cost effective." It doesn't work. They're good at the fast monitoring alerting, we're good at the long-term retention. And yet there's some gray area between those two, but one to one is actually cheaper, so we would partner. So the first ecosystem plays, who wants to have the ability to, really, all the data's in those same environments, the security observability players, they can literally, just through API, drag our data into their point to grab. We can make it seamless for customers. Right now, we make it helpful to customers. Your Datadog, we make a button, easy go from Datadog to us for logs, save you money. Same thing with Grafana. But you can also look at ecosystem, those same vendors, it used to be a year ago it was, you know, its all about how can you grow, like it's growth at all costs, now it's about cogs. So literally we can go an environment, you supply what your customer wants, but we can help with cogs. And one-on one in a partnership is better than you trying to build on your own. >> Thomas, you were saying you make the first read fast, so you think about Snowflake. Everybody wants to talk about Snowflake and Databricks. So, Snowflake, great, but you got to get the data in there. All right, so that's, can you help with that problem? >> I mean we want simple in, right? And if you have to have structure in, you're not simple. So the idea that you have a simple in, data lake, schema read type philosophy, but schema right type performance. And so what I wanted to do, what we have done, is have that simple lake, and stream that data real time, and those access points of Search or SQL, to go after whatever business case you need, security observability, warehouse integration. But the key thing is, how do I make that click, click, click answer, and do it quickly? And so what we want to do is, that first read has to be fast. Why? 'Cause then you're going to do all this siloing, layers, complexity. If your first read's not fast, you're at a disadvantage, particularly in cost. And nobody says I want less data, but everyone has to, whether they say we're going to shorten the window, we're going to use AI to choose, but in a security moment, when you don't have that answer, you're in trouble. And that's why we are this service, this Supercloud service, if you will, providing access, well-known search, well-known SQL type access, that if you just have one access point, you're at a disadvantage. >> We actually talked about Snowflake and BigQuery, and a different platform, Data Bricks. That's kind of where we see the phase two of ecosystem. One is easy, the low-hanging fruit is observability and security firms. But the next one is, what we do, our super power is dealing with this messy data that schema is changing like night and day. Pipelines are tough, and it's changing all the time, but you want these things fast, and it's big data around the world. That's the next point, just use us alongside, or inside, one of their platforms, and now we get the best of both worlds. Our superpower is keeping this messy data as a streaming, okay, not a batch thing, allow you to do that. So, that's the second one. And then to be honest, the third one, which plays you to Supercloud, it also plays perfectly in the data mesh, is if you really go to the ultimate thing, what we have done is made object storage, S3, GCS, and blob storage, we made it a database. Put, get, complex query with big joins. You know, so back to your original thing, and Muglia teed it up perfectly, we've done that. Now imagine if that's an ecosystem, who would want that? If it's, again, it's uniform available across all the regions, across all the clouds, and it's right next to where you are building a service, or a client's trying, that's where the ecosystem, I think people are going to use Superclouds for their superpowers. We're really good at this, allows that short term. I think the Snowflakes and the Data Bricks are the medium term, you know? And then I think eventually gets to, hey, listen if you can make object storage fast, you can just go after it with simple SQL queries, or elastic. Who would want that? I think that's where people are going to leverage it. It's not going to be one Supercloud, and we leverage the super clouds. >> Our viewpoint is smart object storage can be programmable, and so we agree with Bob, but we're not saying do it here, do it here. This core, fundamental layer across regions, across clouds, that everyone has? Simple in. Right now, it's hard to get data in for access for analysis. So we said, simply, we'll automate the entire process, give you API access across regions, across clouds. And again, how do you do a distributed join that's fast? How do you do a distributed join that doesn't cost you an arm or a leg? And how do you do it at scale? And that's where we've been focused. >> So prior, the cloud object store was a niche. >> Yeah. >> S3 obviously changed that. How standard is, essentially, object store across the different cloud platforms? Is that a problem for you? Is that an easy thing to solve? >> Well, let's talk about it. I mean we've fundamentally, yeah we've extracted it, but fundamentally, cloud object storage, put, get, and list. That's why it's so scalable, 'cause it doesn't have all these other components. That complexity is where we have moved up, and provide direct analytical API access. So because of its simplicity, and costs, and security, and reliability, it can scale naturally. I mean, really, distributed object storage is easy, it's put-get anywhere, now what we've done is we put a layer of intelligence, you know, call it smart object storage, where access is simple. So whether it's multi-region, do a query across, or multicloud, do a query across, or hunting, searching. >> We've had clients doing Amazon and Google, we have some Azure, but we see Amazon and Google more, and it's a consistent service across all of them. Just literally put your data in the bucket of choice, or folder of choice, click a couple buttons, literally click that to say "that's hot," and after that, it's hot, you can see it. But we're not moving data, the data gravity issue, that's the other. That it's already natively flowing to these pools of object storage across different regions and clouds. We don't move it, we index it right there, we're spinning up stateless compute, back to the Supercloud concept. But now that allows us to do all these other things, right? >> And it's no longer just cheap and deep object storage. Right? >> Yeah, we make it the same, like you have an analytic platform regardless of where you're at, you don't have to worry about that. Yeah, we deal with that, we deal with a stateless compute coming up -- >> And make it programmable. Be able to say, "I want this bucket to provide these answers." Right, that's really the hope, the vision. And the complexity to build the entire stack, and then connect them together, we said, the fabric is cloud storage, we just provide the intelligence on top. >> Let's bring it back to the customers, and one of the things we're exploring in Supercloud too is, you know, is Supercloud a solution looking for a problem? Is a multicloud really a problem? I mean, you hear, you know, a lot of the vendor marketing says, "Oh, it's a disaster, because it's all different across the clouds." And I talked to a lot of customers even as part of Supercloud too, they're like, "Well, I solved that problem by just going mono cloud." Well, but then you're not able to take advantage of a lot of the capabilities and the primitives that, you know, like Google's data, or you like Microsoft's simplicity, their RPA, whatever it is. So what are customers telling you, what are their near term problems that they're trying to solve today, and how are they thinking about the future? >> Listen, it's a real problem. I think it started, I think this is a a mega trend, just like cloud. Just, cloud data, and I always add, analytics, are the mega trends. If you're looking at those, if you're not considering using the Supercloud principles, in other words, leveraging what I have, abstracting it out, and getting the most out of that, and then build value on top, I think you're not going to be able to keep up, In fact, no way you're going to keep up with this data volume. It's a geometric challenge, and you're trying to do linear things. So clients aren't necessarily asking, hey, for Supercloud, but they're really saying, I need to have a better mechanism to simplify this and get value across it, and how do you abstract that out to do that? And that's where they're obviously, our conversations are more amazed what we're able to do, and what they're able to do with our platform, because if you think of what we've done, the S3, or GCS, or object storage, is they can't imagine the ingest, they can't imagine how easy, time to glass, one minute, no matter where it lands in the world, querying this in seconds for hundreds of terabytes squared. People are amazed, but that's kind of, so they're not asking for that, but they are amazed. And then when you start talking on it, if you're an enterprise person, you're building a big cloud data platform, or doing data or analytics, if you're not trying to leverage the public clouds, and somehow leverage all of them, and then build on top, then I think you're missing it. So they might not be asking for it, but they're doing it. >> And they're looking for a lens, you mentioned all these different services, how do I bring those together quickly? You know, our viewpoint, our service, is I have all these streams of data, create a lens where they want to go after it via search, go after via SQL, bring them together instantly, no e-tailing out, no define this table, put into this database. We said, let's have a service that creates a lens across all these streams, and then make those connections. I want to take my CRM with my Google AdWords, and maybe my Salesforce, how do I do analysis? Maybe I want to hunt first, maybe I want to join, maybe I want to add another stream to it. And so our viewpoint is, it's so natural to get into these lake platforms and then provide lenses to get that access. >> And they don't want it separate, they don't want something different here, and different there. They want it basically -- >> So this is our industry, right? If something new comes out, remember virtualization came out, "Oh my God, this is so great, it's going to solve all these problems." And all of a sudden it just got to be this big, more complex thing. Same thing with cloud, you know? It started out with S3, and then EC2, and now hundreds and hundreds of different services. So, it's a complex matter for a lot of people, and this creates problems for customers, especially when you got divisions that are using different clouds, and you're saying that the solution, or a solution for the part of the problem, is to really allow the data to stay in place on S3, use that standard, super simple, but then give it what, Ed, you've called superpower a couple of times, to make it fast, make it inexpensive, and allow you to do that across clouds. >> Yeah, yeah. >> I'll give you guys the last word on that. >> No, listen, I think, we think Supercloud allows you to do a lot more. And for us, data, everyone says more data, more problems, more budget issue, everyone knows more data is better, and we show you how to do it cost effectively at scale. And we couldn't have done it without the design principles of we're leveraging the Supercloud to get capabilities, and because we use super, just the object storage, we're able to get these capabilities of ingest, scale, cost effectiveness, and then we built on top of this. In the end, a database is a data platform that allows you to go after everything distributed, and to get one platform for analytics, no matter where it lands, that's where we think the Supercloud concepts are perfect, that's where our clients are seeing it, and we're kind of excited about it. >> Yeah a third generation database, Supercloud database, however we want to phrase it, and make it simple, but provide the value, and make it instant. >> Guys, thanks so much for coming into the studio today, I really thank you for your support of theCUBE, and theCUBE community, it allows us to provide events like this and free content. I really appreciate it. >> Oh, thank you. >> Thank you. >> All right, this is Dave Vellante for John Furrier in theCUBE community, thanks for being with us today. You're watching Supercloud 2, keep it right there for more thought provoking discussions around the future of cloud and data. (bright music)

Published Date : Feb 17 2023

SUMMARY :

And the third thing that we want to do I'm going to put you right but if you do it right, So the conversation that we were having I like to say we're not a and you see their So, to me, if you can crack that code, and you need to get the you can get your use cases, But the key thing is you cracked the code. We had to crack the code, right? And then once you do that, So, we agree with Bob's. and where do you fit into the ecosystem? and we give you uniformity access to that so you think about Snowflake. So the idea that you have are the medium term, you know? and so we agree with Bob, So prior, the cloud that an easy thing to solve? you know, call it smart object storage, and after that, it's hot, you can see it. And it's no longer just you don't have to worry about And the complexity to and one of the things we're and how do you abstract it's so natural to get and different there. and allow you to do that across clouds. I'll give you guys and we show you how to do it but provide the value, I really thank you for around the future of cloud and data.

ENTITIES

Entity	Category	Confidence
Walmart	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
NASDAQ	ORGANIZATION	0.99+
Bob Muglia	PERSON	0.99+
Thomas	PERSON	0.99+
Thomas Hazel	PERSON	0.99+
Ionis Pharmaceuticals	ORGANIZATION	0.99+
Western Union	ORGANIZATION	0.99+
Ed Walsh	PERSON	0.99+
Bob	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Nelu Mihai	PERSON	0.99+
Sachs	ORGANIZATION	0.99+
Tristan Handy	PERSON	0.99+
two	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
two years	QUANTITY	0.99+
Supercloud 2	TITLE	0.99+
first	QUANTITY	0.99+
Last August	DATE	0.99+
three	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
Snowflake	ORGANIZATION	0.99+
both	QUANTITY	0.99+
dbt Labs	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
Ed	PERSON	0.99+
Gartner	ORGANIZATION	0.99+
Jimata Gan	PERSON	0.99+
third one	QUANTITY	0.99+
one minute	QUANTITY	0.99+
second	QUANTITY	0.99+
first generation	QUANTITY	0.99+
third generation	QUANTITY	0.99+
Grafana	ORGANIZATION	0.99+
second generation	QUANTITY	0.99+
second one	QUANTITY	0.99+
hundreds of terabytes	QUANTITY	0.98+
SQL	TITLE	0.98+
five	DATE	0.98+
one	QUANTITY	0.98+
Databricks	ORGANIZATION	0.98+
a year ago	DATE	0.98+
ChaosSearch	ORGANIZATION	0.98+
Muglia	PERSON	0.98+
MySQL	TITLE	0.98+
both worlds	QUANTITY	0.98+
third thing	QUANTITY	0.97+
Marlborough	LOCATION	0.97+
theCUBE	ORGANIZATION	0.97+
today	DATE	0.97+
Supercloud	ORGANIZATION	0.97+
Elasticsearch	TITLE	0.96+
NetApp	TITLE	0.96+
Datadog	ORGANIZATION	0.96+
One	QUANTITY	0.96+
EC2	TITLE	0.96+
each one	QUANTITY	0.96+
S3	TITLE	0.96+
one platform	QUANTITY	0.95+
Supercloud 2	EVENT	0.95+
first read	QUANTITY	0.95+
six years ago	DATE	0.95+

Daren Brabham & Erik Bradley | What the Spending Data Tells us About Supercloud

(gentle synth music) (music ends) >> Welcome back to Supercloud 2, an open industry collaboration between technologists, consultants, analysts, and of course practitioners to help shape the future of cloud. At this event, one of the key areas we're exploring is the intersection of cloud and data. And how building value on top of hyperscale clouds and across clouds is evolving, a concept of course we call "Supercloud". And we're pleased to welcome our friends from Enterprise Technology research, Erik Bradley and Darren Brabham. Guys, thanks for joining us, great to see you. we love to bring the data into these conversations. >> Thank you for having us, Dave, I appreciate it. >> Yeah, thanks. >> You bet. And so, let me do the setup on what is Supercloud. It's a concept that we've floated, Before re:Invent 2021, based on the idea that cloud infrastructure is becoming ubiquitous, incredibly powerful, but there's a lack of standards across the big three clouds. That creates friction. So we defined over the period of time, you know, better part of a year, a set of essential elements, deployment models for so-called supercloud, which create this common experience for specific cloud services that, of course, again, span multiple clouds and even on-premise data. So Erik, with that as background, I wonder if you could add your general thoughts on the term supercloud, maybe play proxy for the CIO community, 'cause you do these round tables, you talk to these guys all the time, you gather a lot of amazing information from senior IT DMs that compliment your survey. So what are your thoughts on the term and the concept? >> Yeah, sure. I'll even go back to last year when you and I did our predictions panel, right? And we threw it out there. And to your point, you know, there's some haters. Anytime you throw out a new term, "Is it marketing buzz? Is it worth it? Why are you even doing it?" But you know, from my own perspective, and then also speaking to the IT DMs that we interview on a regular basis, this is just a natural evolution. It's something that's inevitable in enterprise tech, right? The internet was not built for what it has become. It was never intended to be the underlying infrastructure of our daily lives and work. The cloud also was not built to be what it's become. But where we're at now is, we have to figure out what the cloud is and what it needs to be to be scalable, resilient, secure, and have the governance wrapped around it. And to me that's what supercloud is. It's a way to define operantly, what the next generation, the continued iteration and evolution of the cloud and what its needs to be. And that's what the supercloud means to me. And what depends, if you want to call it metacloud, supercloud, it doesn't matter. The point is that we're trying to define the next layer, the next future of work, which is inevitable in enterprise tech. Now, from the IT DM perspective, I have two interesting call outs. One is from basically a senior developer IT architecture and DevSecOps who says he uses the term all the time. And the reason he uses the term, is that because multi-cloud has a stigma attached to it, when he is talking to his business executives. (David chuckles) the stigma is because it's complex and it's expensive. So he switched to supercloud to better explain to his business executives and his CFO and his CIO what he's trying to do. And we can get into more later about what it means to him. But the inverse of that, of course, is a good CSO friend of mine for a very large enterprise says the concern with Supercloud is the reduction of complexity. And I'll explain, he believes anything that takes the requirement of specific expertise out of the equation, even a little bit, as a CSO worries him. So as you said, David, always two sides to the coin, but I do believe supercloud is a relevant term, and it is necessary because the cloud is continuing to be defined. >> You know, that's really interesting too, 'cause you know, Darren, we use Snowflake a lot as an example, sort of early supercloud, and you think from a security standpoint, we've always pushed Amazon and, "Are you ever going to kind of abstract the complexity away from all these primitives?" and their position has always been, "Look, if we produce these primitives, and offer these primitives, we we can move as the market moves. When you abstract, then it becomes harder to peel the layers." But Darren, from a data standpoint, like I say, we use Snowflake a lot. I think of like Tim Burners-Lee when Web 2.0 came out, he said, "Well this is what the internet was always supposed to be." So in a way, you know, supercloud is maybe what multi-cloud was supposed to be. But I mean, you think about data sharing, Darren, across clouds, it's always been a challenge. Snowflake always, you know, obviously trying to solve that problem, as are others. But what are your thoughts on the concept? >> Yeah, I think the concept fits, right? It is reflective of, it's a paradigm shift, right? Things, as a pendulum have swung back and forth between needing to piece together a bunch of different tools that have specific unique use cases and they're best in breed in what they do. And then focusing on the duct tape that holds 'em all together and all the engineering complexity and skill, it shifted from that end of the pendulum all the way back to, "Let's streamline this, let's simplify it. Maybe we have budget crunches and we need to consolidate tools or eliminate tools." And so then you kind of see this back and forth over time. And with data and analytics for instance, a lot of organizations were trying to bring the data closer to the business. That's where we saw self-service analytics coming in. And tools like Snowflake, what they did was they helped point to different databases, they helped unify data, and organize it in a single place that was, you know, in a sense neutral, away from a single cloud vendor or a single database, and allowed the business to kind of be more flexible in how it brought stuff together and provided it out to the business units. So Snowflake was an example of one of those times where we pulled back from the granular, multiple points of the spear, back to a simple way to do things. And I think Snowflake has continued to kind of keep that mantle to a degree, and we see other tools trying to do that, but that's all it is. It's a paradigm shift back to this kind of meta abstraction layer that kind of simplifies what is the reality, that you need a complex multi-use case, multi-region way of doing business. And it sort of reflects the reality of that. >> And you know, to me it's a spectrum. As part of Supercloud 2, we're talking to a number of of practitioners, Ionis Pharmaceuticals, US West, we got Walmart. And it's a spectrum, right? In some cases the practitioner's saying, "You know, the way I solve multi-cloud complexity is mono-cloud, I just do one cloud." (laughs) Others like Walmart are saying, "Hey, you know, we actually are building an abstraction layer ourselves, take advantage of it." So my general question to both of you is, is this a concept, is the lack of standards across clouds, you know, really a problem, you know, or is supercloud a solution looking for a problem? Or do you hear from practitioners that "No, this is really an issue, we have to bring together a set of standards to sort of unify our cloud estates." >> Allow me to answer that at a higher level, and then we're going to hand it over to Dr. Brabham because he is a little bit more detailed on the realtime streaming analytics use cases, which I think is where we're going to get to. But to answer that question, it really depends on the size and the complexity of your business. At the very large enterprise, Dave, Yes, a hundred percent. This needs to happen. There is complexity, there is not only complexity in the compute and actually deploying the applications, but the governance and the security around them. But for lower end or, you know, business use cases, and for smaller businesses, it's a little less necessary. You certainly don't need to have all of these. Some of the things that come into mind from the interviews that Darren and I have done are, you know, financial services, if you're doing real-time trading, anything that has real-time data metrics involved in your transactions, is going to be necessary. And another use case that we hear about is in online travel agencies. So I think it is very relevant, the complexity does need to be solved, and I'll allow Darren to explain a little bit more about how that's used from an analytics perspective. >> Yeah, go for it. >> Yeah, exactly. I mean, I think any modern, you know, multinational company that's going to have a footprint in the US and Europe, in China, or works in different areas like manufacturing, where you're probably going to have on-prem instances that will stay on-prem forever, for various performance reasons. You have these complicated governance and security and regulatory issues. So inherently, I think, large multinational companies and or companies that are in certain areas like finance or in, you know, online e-commerce, or things that need real-time data, they inherently are going to have a very complex environment that's going to need to be managed in some kind of cleaner way. You know, they're looking for one door to open, one pane of glass to look at, one thing to do to manage these multi points. And, streaming's a good example of that. I mean, not every organization has a real-time streaming use case, and may not ever, but a lot of organizations do, a lot of industries do. And so there's this need to use, you know, they want to use open-source tools, they want to use Apache Kafka for instance. They want to use different megacloud vendors offerings, like Google Pub/Sub or you know, Amazon Kinesis Firehose. They have all these different pieces they want to use for different use cases at different stages of maturity or proof of concept, you name it. They're going to have to have this complexity. And I think that's why we're seeing this need, to have sort of this supercloud concept, to juggle all this, to wrangle all of it. 'Cause the reality is, it's complex and you have to simplify it somehow. >> Great, thanks you guys. All right, let's bring up the graphic, and take a look. Anybody who follows the breaking analysis, which is co-branded with ETR Cube Insights powered by ETR, knows we like to bring data to the table. ETR does amazing survey work every quarter, 1200 plus 1500 practitioners that that answer a number of questions. The vertical axis here is net score, which is ETR's proprietary methodology, which is a measure of spending momentum, spending velocity. And the horizontal axis here is overlap, but it's the presence pervasiveness, and the dataset, the ends, that table insert on the bottom right shows you how the dots are plotted, the net score and then the ends in the survey. And what we've done is we've plotted a bunch of the so-called supercloud suspects, let's start in the upper right, the cloud platforms. Without these hyperscale clouds, you can't have a supercloud. And as always, Azure and AWS, up and to the right, it's amazing we're talking about, you know, 80 plus billion dollar company in AWS. Azure's business is, if you just look at the IaaS is in the 50 billion range, I mean it's just amazing to me the net scores here. Anything above 40% we consider highly elevated. And you got Azure and you got Snowflake, Databricks, HashiCorp, we'll get to them. And you got AWS, you know, right up there at that size, it's quite amazing. With really big ends as well, you know, 700 plus ends in the survey. So, you know, kind of half the survey actually has these platforms. So my question to you guys is, what are you seeing in terms of cloud adoption within the big three cloud players? I wonder if you could could comment, maybe Erik, you could start. >> Yeah, sure. Now we're talking data, now I'm happy. So yeah, we'll get into some of it. Right now, the January, 2023 TSIS is approaching 1500 survey respondents. One caveat, it's not closed yet, it will close on Friday, but with an end that big we are over statistically significant. We also recently did a cloud survey, and there's a couple of key points on that I want to get into before we get into individual vendors. What we're seeing here, is that annual spend on cloud infrastructure is expected to grow at almost a 70% CAGR over the next three years. The percentage of those workloads for cloud infrastructure are expected to grow over 70% as three years as well. And as you mentioned, Azure and AWS are still dominant. However, we're seeing some share shift spreading around a little bit. Now to get into the individual vendors you mentioned about, yes, Azure is still number one, AWS is number two. What we're seeing, which is incredibly interesting, CloudFlare is number three. It's actually beating GCP. That's the first time we've seen it. What I do want to state, is this is on net score only, which is our measure of spending intentions. When you talk about actual pervasion in the enterprise, it's not even close. But from a spending velocity intention point of view, CloudFlare is now number three above GCP, and even Salesforce is creeping up to be at GCPs level. So what we're seeing here, is a continued domination by Azure and AWS, but some of these other players that maybe might fit into your moniker. And I definitely want to talk about CloudFlare more in a bit, but I'm going to stop there. But what we're seeing is some of these other players that fit into your Supercloud moniker, are starting to creep up, Dave. >> Yeah, I just want to clarify. So as you also know, we track IaaS and PaaS revenue and we try to extract, so AWS reports in its quarterly earnings, you know, they're just IaaS and PaaS, they don't have a SaaS play, a little bit maybe, whereas Microsoft and Google include their applications and so we extract those out and if you do that, AWS is bigger, but in the surveys, you know, customers, they see cloud, SaaS to them as cloud. So that's one of the reasons why you see, you know, Microsoft as larger in pervasion. If you bring up that survey again, Alex, the survey results, you see them further to the right and they have higher spending momentum, which is consistent with what you see in the earnings calls. Now, interesting about CloudFlare because the CEO of CloudFlare actually, and CloudFlare itself uses the term supercloud basically saying, "Hey, we're building a new type of internet." So what are your thoughts? Do you have additional information on CloudFlare, Erik that you want to share? I mean, you've seen them pop up. I mean this is a really interesting company that is pretty forward thinking and vocal about how it's disrupting the industry. >> Sure, we've been tracking 'em for a long time, and even from the disruption of just a traditional CDN where they took down Akamai and what they're doing. But for me, the definition of a true supercloud provider can't just be one instance. You have to have multiple. So it's not just the cloud, it's networking aspect on top of it, it's also security. And to me, CloudFlare is the only one that has all of it. That they actually have the ability to offer all of those things. Whereas you look at some of the other names, they're still piggybacking on the infrastructure or platform as a service of the hyperscalers. CloudFlare does not need to, they actually have the cloud, the networking, and the security all themselves. So to me that lends credibility to their own internal usage of that moniker Supercloud. And also, again, just what we're seeing right here that their net score is now creeping above AGCP really does state it. And then just one real last thing, one of the other things we do in our surveys is we track adoption and replacement reasoning. And when you look at Cloudflare's adoption rate, which is extremely high, it's based on technical capabilities, the breadth of their feature set, it's also based on what we call the ability to avoid stack alignment. So those are again, really supporting reasons that makes CloudFlare a top candidate for your moniker of supercloud. >> And they've also announced an object store (chuckles) and a database. So, you know, that's going to be, it takes a while as you well know, to get database adoption going, but you know, they're ambitious and going for it. All right, let's bring the chart back up, and I want to focus Darren in on the ecosystem now, and really, we've identified Snowflake and Databricks, it's always fun to talk about those guys, and there are a number of other, you know, data platforms out there, but we use those too as really proxies for leaders. We got a bunch of the backup guys, the data protection folks, Rubric, Cohesity, and Veeam. They're sort of in a cluster, although Rubric, you know, ahead of those guys in terms of spending momentum. And then VMware, Tanzu and Red Hat as sort of the cross cloud platform. But I want to focus, Darren, on the data piece of it. We're seeing a lot of activity around data sharing, governed data sharing. Databricks is using Delta Sharing as their sort of place, Snowflakes is sort of this walled garden like the app store. What are your thoughts on, you know, in the context of Supercloud, cross cloud capabilities for the data platforms? >> Yeah, good question. You know, I think Databricks is an interesting player because they sort of have made some interesting moves, with their Data Lakehouse technology. So they're trying to kind of complicate, or not complicate, they're trying to take away the complications of, you know, the downsides of data warehousing and data lakes, and trying to find that middle ground, where you have the benefits of a managed, governed, you know, data warehouse environment, but you have sort of the lower cost, you know, capability of a data lake. And so, you know, Databricks has become really attractive, especially by data scientists, right? We've been tracking them in the AI machine learning sector for quite some time here at ETR, attractive for a data scientist because it looks and acts like a lake, but can have some managed capabilities like a warehouse. So it's kind of the best of both worlds. So in some ways I think you've seen sort of a data science driver for the adoption of Databricks that has now become a little bit more mainstream across the business. Snowflake, maybe the other direction, you know, it's a cloud data warehouse that you know, is starting to expand its capabilities and add on new things like Streamlit is a good example in the analytics space, with apps. So you see these tools starting to branch and creep out a bit, but they offer that sort of neutrality, right? We heard one IT decision maker we recently interviewed that referred to Snowflake and Databricks as the quote unquote Switzerland of what they do. And so there's this desirability from an organization to find these tools that can solve the complex multi-headed use-case of data and analytics, which every business unit needs in different ways. And figure out a way to do that, an elegant way that's governed and centrally managed, that federated kind of best of both worlds that you get by bringing the data close to the business while having a central governed instance. So these tools are incredibly powerful and I think there's only going to be room for growth, for those two especially. I think they're going to expand and do different things and maybe, you know, join forces with others and a lot of the power of what they do well is trying to define these connections and find these partnerships with other vendors, and try to be seen as the nice add-on to your existing environment that plays nicely with everyone. So I think that's where those two tools are going, but they certainly fit this sort of label of, you know, trying to be that supercloud neutral, you know, layer that unites everything. >> Yeah, and if you bring the graphic back up, please, there's obviously big data plays in each of the cloud platforms, you know, Microsoft, big database player, AWS is, you know, 11, 12, 15, data stores. And of course, you know, BigQuery and other, you know, data platforms within Google. But you know, I'm not sure the big cloud guys are going to go hard after so-called supercloud, cross-cloud services. Although, we see Oracle getting in bed with Microsoft and Azure, with a database service that is cross-cloud, certainly Google with Anthos and you know, you never say never with with AWS. I guess what I would say guys, and I'll I'll leave you with this is that, you know, just like all players today are cloud players, I feel like anybody in the business or most companies are going to be so-called supercloud players. In other words, they're going to have a cross-cloud strategy, they're going to try to build connections if they're coming from on-prem like a Dell or an HPE, you know, or Pure or you know, many of these other companies, Cohesity is another one. They're going to try to connect to their on-premise states, of course, and create a consistent experience. It's natural that they're going to have sort of some consistency across clouds. You know, the big question is, what's that spectrum look like? I think on the one hand you're going to have some, you know, maybe some rudimentary, you know, instances of supercloud or maybe they just run on the individual clouds versus where Snowflake and others and even beyond that are trying to go with a single global instance, basically building out what I would think of as their own cloud, and importantly their own ecosystem. I'll give you guys the last thought. Maybe you could each give us, you know, closing thoughts. Maybe Darren, you could start and Erik, you could bring us home on just this entire topic, the future of cloud and data. >> Yeah, I mean I think, you know, two points to make on that is, this question of these, I guess what we'll call legacy on-prem players. These, mega vendors that have been around a long time, have big on-prem footprints and a lot of people have them for that reason. I think it's foolish to assume that a company, especially a large, mature, multinational company that's been around a long time, it's foolish to think that they can just uproot and leave on-premises entirely full scale. There will almost always be an on-prem footprint from any company that was not, you know, natively born in the cloud after 2010, right? I just don't think that's reasonable anytime soon. I think there's some industries that need on-prem, things like, you know, industrial manufacturing and so on. So I don't think on-prem is going away, and I think vendors that are going to, you know, go very cloud forward, very big on the cloud, if they neglect having at least decent connectors to on-prem legacy vendors, they're going to miss out. So I think that's something that these players need to keep in mind is that they continue to reach back to some of these players that have big footprints on-prem, and make sure that those integrations are seamless and work well, or else their customers will always have a multi-cloud or hybrid experience. And then I think a second point here about the future is, you know, we talk about the three big, you know, cloud providers, the Google, Microsoft, AWS as sort of the opposite of, or different from this new supercloud paradigm that's emerging. But I want to kind of point out that, they will always try to make a play to become that and I think, you know, we'll certainly see someone like Microsoft trying to expand their licensing and expand how they play in order to become that super cloud provider for folks. So also don't want to downplay them. I think you're going to see those three big players continue to move, and take over what players like CloudFlare are doing and try to, you know, cut them off before they get too big. So, keep an eye on them as well. >> Great points, I mean, I think you're right, the first point, if you're Dell, HPE, Cisco, IBM, your strategy should be to make your on-premise state as cloud-like as possible and you know, make those differences as minimal as possible. And you know, if you're a customer, then the business case is going to be low for you to move off of that. And I think you're right. I think the cloud guys, if this is a real problem, the cloud guys are going to play in there, and they're going to make some money at it. Erik, bring us home please. >> Yeah, I'm going to revert back to our data and this on the macro side. So to kind of support this concept of a supercloud right now, you know Dave, you and I know, we check overall spending and what we're seeing right now is total year spent is expected to only be 4.6%. We ended 2022 at 5% even though it began at almost eight and a half. So this is clearly declining and in that environment, we're seeing the top two strategies to reduce spend are actually vendor consolidation with 36% of our respondents saying they're actively seeking a way to reduce their number of vendors, and consolidate into one. That's obviously supporting a supercloud type of play. Number two is reducing excess cloud resources. So when I look at both of those combined, with a drop in the overall spending reduction, I think you're on the right thread here, Dave. You know, the overall macro view that we're seeing in the data supports this happening. And if I can real quick, couple of names we did not touch on that I do think deserve to be in this conversation, one is HashiCorp. HashiCorp is the number one player in our infrastructure sector, with a 56% net score. It does multiple things within infrastructure and it is completely agnostic to your environment. And if we're also speaking about something that's just a singular feature, we would look at Rubric for data, backup, storage, recovery. They're not going to offer you your full cloud or your networking of course, but if you are looking for your backup, recovery, and storage Rubric, also number one in that sector with a 53% net score. Two other names that deserve to be in this conversation as we watch it move and evolve. >> Great, thank you for bringing that up. Yeah, we had both of those guys in the chart and I failed to focus in on HashiCorp. And clearly a Supercloud enabler. All right guys, we got to go. Thank you so much for joining us, appreciate it. Let's keep this conversation going. >> Always enjoy talking to you Dave, thanks. >> Yeah, thanks for having us. >> All right, keep it right there for more content from Supercloud 2. This is Dave Valente for John Ferg and the entire Cube team. We'll be right back. (gentle synth music) (music fades)

Published Date : Feb 17 2023

SUMMARY :

is the intersection of cloud and data. Thank you for having period of time, you know, and evolution of the cloud So in a way, you know, supercloud the data closer to the business. So my general question to both of you is, the complexity does need to be And so there's this need to use, you know, So my question to you guys is, And as you mentioned, Azure but in the surveys, you know, customers, the ability to offer and there are a number of other, you know, and maybe, you know, join forces each of the cloud platforms, you know, the three big, you know, And you know, if you're a customer, you and I know, we check overall spending and I failed to focus in on HashiCorp. to you Dave, thanks. Ferg and the entire Cube team.

ENTITIES

Entity	Category	Confidence
IBM	ORGANIZATION	0.99+
Cisco	ORGANIZATION	0.99+
Erik	PERSON	0.99+
Dell	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
John Ferg	PERSON	0.99+
Dave	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Erik Bradley	PERSON	0.99+
David	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Dave Valente	PERSON	0.99+
January, 2023	DATE	0.99+
China	LOCATION	0.99+
US	LOCATION	0.99+
HPE	ORGANIZATION	0.99+
50 billion	QUANTITY	0.99+
Ionis Pharmaceuticals	ORGANIZATION	0.99+
Darren Brabham	PERSON	0.99+
56%	QUANTITY	0.99+
4.6%	QUANTITY	0.99+
Europe	LOCATION	0.99+
Oracle	ORGANIZATION	0.99+
53%	QUANTITY	0.99+
36%	QUANTITY	0.99+
Tanzu	ORGANIZATION	0.99+
Darren	PERSON	0.99+
1200	QUANTITY	0.99+
Red Hat	ORGANIZATION	0.99+
VMware	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Friday	DATE	0.99+
Rubric	ORGANIZATION	0.99+
last year	DATE	0.99+
two sides	QUANTITY	0.99+
Databricks	ORGANIZATION	0.99+
5%	QUANTITY	0.99+
Cohesity	ORGANIZATION	0.99+
two tools	QUANTITY	0.99+
Veeam	ORGANIZATION	0.99+
CloudFlare	TITLE	0.99+
two	QUANTITY	0.99+
both	QUANTITY	0.99+
2022	DATE	0.99+
One	QUANTITY	0.99+
Daren Brabham	PERSON	0.99+
three years	QUANTITY	0.99+
TSIS	ORGANIZATION	0.99+
Brabham	PERSON	0.99+
CloudFlare	ORGANIZATION	0.99+
1500 survey respondents	QUANTITY	0.99+
second point	QUANTITY	0.99+
first point	QUANTITY	0.98+
Snowflake	TITLE	0.98+
one	QUANTITY	0.98+
Supercloud	ORGANIZATION	0.98+
ETR	ORGANIZATION	0.98+
Snowflake	ORGANIZATION	0.98+
Akamai	ORGANIZATION	0.98+

Supercloud Applications & Developer Impact | Supercloud2

(gentle music) >> Okay, welcome back to Supercloud 2, live here in Palo Alto, California for our live stage performance. Supercloud 2 is our second Supercloud event. We're going to get these out as fast as we can every couple months. It's our second one, you'll see two and three this year. I'm John Furrier, my co-host, Dave Vellante. A panel here to break down the Supercloud momentum, the wave, and the developer impact that we bringing back Vittorio Viarengo, who's a VP for Cross-Cloud Services at VMware. Sarbjeet Johal, industry influencer and Analyst at StackPayne, his company, Cube alumni and Influencer. Sarbjeet, great to see you. Vittorio, thanks for coming back. >> Nice to be here. >> My pleasure. >> Vittorio, you just gave a keynote where we unpacked the cross-cloud services, what VMware is doing, how you guys see it, not just from VMware's perspective, but VMware looking out broadly at the industry and developers came up and you were like, "Developers, developer, developers", kind of a goof on the Steve Ballmer famous meme that everyone's seen. This is a huge star, sorry, I mean a big piece of it. The developers are the canary in the coal mines. They're the ones who are being asked to code the digital transformation, which is fully business transformation and with the market the way it is right now in terms of the accelerated technology, every enterprise grade business model's changing. The technology is evolving, the builders are kind of, they want go faster. I'm saying they're stuck in a way, but that's my opinion, but there's a lot of growth. >> Yeah. >> The impact, they got to get released up and let it go. Those developers need to accelerate faster. It's been a big part of productivity, and the conversations we've had. So developer impact is huge in Supercloud. What's your, what do you guys think about this? We'll start with you, Sarbjeet. >> Yeah, actually, developers are the masons of the digital empires I call 'em, right? They lay every brick and build all these big empires. On the left side of the SDLC, or the, you know, when you look at the system operations, developer is number one cost from economic side of things, and from technology side of things, they are tech hungry people. They are developers for that reason because developer nights are long, hours are long, they forget about when to eat, you know, like, I've been a developer, I still code. So you want to keep them happy, you want to hug your developers. We always say that, right? Vittorio said that right earlier. The key is to, in this context, in the Supercloud context, is that developers don't mind mucking around with platforms or APIs or new languages, but they hate the infrastructure part. That's a fact. They don't want to muck around with servers. It's friction for them, it is like they don't want to muck around even with the VMs. So they want the programmability to the nth degree. They want to automate everything, so that's how they think and cloud is the programmable infrastructure, industrialization of infrastructure in many ways. So they are happy with where we are going, and we need more abstraction layers for some developers. By the way, I have this sort of thinking frame for last year or so, not all developers are same, right? So if you are a developer at an ISV, you behave differently. If you are a developer at a typical enterprise, you behave differently or you are forced to behave differently because you're not writing software.- >> Well, developers, developers have changed, I mean, Vittorio, you and I were talking earlier on the keynote, and this is kind of the key point is what is a developer these days? If everything is software enabled, I mean, even hardware interviews we do with Nvidia, and Amazon and other people building silicon, they all say the same thing, "It's software on a chip." So you're seeing the role of software up and down the stack and the role of the stack is changing. The old days of full stack developer, what does that even mean? I mean, the cloud is a half a stack kind of right there. So, you know, developers are certainly more agile, but cloud native, I mean VMware is epitome of operations, IT operations, and the Tan Zoo initiative, you guys started, you went after the developers to look at them, and ask them questions, "What do you need?", "How do you transform the Ops from virtualization?" Again, back to your point, so this hardware abstraction, what is software, what is cloud native? It's kind of messy equation these days. How do you guys grokel with that? >> I would argue that developers don't want the Supercloud. I dropped that up there, so, >> Dave: Why not? >> Because developers, they, once they get comfortable in AWS or Google, because they're doing some AI stuff, which is, you know, very trendy right now, or they are in IBM, any of the IPA scaler, professional developers, system developers, they love that stuff, right? Yeah, they don't, the infrastructure gets in the way, but they're just, the problem is, and I think the Supercloud should be driven by the operators because as we discussed, the operators have been left behind because they're busy with day-to-day jobs, and in most cases IT is centralized, developers are in the business units. >> John: Yeah. >> Right? So they get the mandate from the top, say, "Our bank, they're competing against". They gave teenagers or like young people the ability to do all these new things online, and Venmo and all this integration, where are we? "Oh yeah, we can do it", and then build it, and then deploy it, "Okay, we caught up." but now the operators are back in the private cloud trying to keep the backend system running and so I think the Supercloud is needed for the primarily, initially, for the operators to get in front of the developers, fit in the workflow, but lay the foundation so it is secure.- >> So, so I love this thinking because I love the rift, because the rift points to what is the target audience for the value proposition and if you're a developer, Supercloud enables you so you shouldn't have to deal with Supercloud. >> Exactly. >> What you're saying is get the operating environment or operating system done properly, whether it's architecture, building the platform, this comes back to architecture platform conversations. What is the future platform? Is it a vendor supplied or is it customer created platform? >> Dave: So developers want best to breed, is what you just said. >> Vittorio: Yeah. >> Right and operators, they, 'cause developers don't want to deal with governance, they don't want to deal with security, >> No. >> They don't want to deal with spinning up infrastructure. That's the role of the operator, but that's where Supercloud enables, to John's point, the developer, so to your question, is it a platform where the platform vendor is responsible for the architecture, or there is it an architectural standard that spans multiple clouds that has to emerge? Based on what you just presented earlier, Vittorio, you are the determinant of the architecture. It's got to be open, but you guys determine that, whereas the nirvana is, "Oh no, it's all open, and it just kind of works." >> Yeah, so first of all, let's all level set on one thing. You cannot tell developers what to do. >> Dave: Right, great >> At least great developers, right? Cannot tell them what to do. >> Dave: So that's what, that's the way I want to sort of, >> You can tell 'em what's possible. >> There's a bottle on that >> If you tell 'em what's possible, they'll test it, they'll look at it, but if you try to jam it down their throat, >> Yeah. >> Dave: You can't tell 'em how to do it, just like your point >> Let me answer your answer the question. >> Yeah, yeah. >> So I think we need to build an architect, help them build an architecture, but it cannot be proprietary, has to be built on what works in the cloud and so what works in the cloud today is Kubernetes, is you know, number of different open source project that you need to enable and then provide, use this, but when I first got exposed to Kubernetes, I said, "Hallelujah!" We had a runtime that works the same everywhere only to realize there are 12 different distributions. So that's where we come in, right? And other vendors come in to say, "Hey, no, we can make them all look the same. So you still use Kubernetes, but we give you a place to build, to set those operation policy once so that you don't create friction for the developers because that's the last thing you want to do." >> Yeah, actually, coming back to the same point, not all developers are same, right? So if you're ISV developer, you want to go to the lowest sort of level of the infrastructure and you want to shave off the milliseconds from to get that performance, right? If you're working at AWS, you are doing that. If you're working at scale at Facebook, you're doing that. At Twitter, you're doing that, but when you go to DMV and Kansas City, you're not doing that, right? So your developers are different in nature. They are given certain parameters to work with, certain sort of constraints on the budget side. They are educated at a different level as well. Like they don't go to that end of the degree of sort of automation, if you will. So you cannot have the broad stroking of developers. We are talking about a citizen developer these days. That's a extreme low, >> You mean Low-Code. >> Yeah, Low-Code, No-code, yeah, on the extreme side. On one side, that's citizen developers. On the left side is the professional developers, when you say developers, your mind goes to the professional developers, like the hardcore developers, they love the flexibility, you know, >> John: Well app, developers too, I mean. >> App developers, yeah. >> You're right a lot of, >> Sarbjeet: Infrastructure platform developers, app developers, yes. >> But there are a lot of customers, its a spectrum, you're saying. >> Yes, it's a spectrum >> There's a lot of customers don't want deal with that muck. >> Yeah. >> You know, like you said, AWS, Twitter, the sophisticated developers do, but there's a whole suite of developers out there >> Yeah >> That just want tools that are abstracted. >> Within a company, within a company. Like how I see the Supercloud is there shouldn't be anything which blocks the developers, like their view of the world, of the future. Like if you're blocked as a developer, like something comes in front of you, you are not developer anymore, believe me, (John laughing) so you'll go somewhere else >> John: First of all, I'm, >> You'll leave the company by the way. >> Dave: Yeah, you got to quit >> Yeah, you will quit, you will go where the action is, where there's no sort of blockage there. So like if you put in front of them like a huge amount of a distraction, they don't like it, so they don't, >> Well, the idea of a developer, >> Coming back to that >> Let's get into 'cause you mentioned platform. Get year in the term platform engineering now. >> Yeah. >> Platform developer. You know, I remember back in, and I think there's still a term used today, but when I graduated my computer science degree, we were called "Software engineers," right? Do people use that term "Software engineering", or is it "Software development", or they the same, are they different? >> Well, >> I think there's a, >> So, who's engineering what? Are they engineering or are they developing? Or both? Well, I think it the, you made a great point. There is a factor of, I had the, I was blessed to work with Adam Bosworth, that is the guy that created some of the abstraction layer, like Visual Basic and Microsoft Access and he had so, he made his whole career thinking about this layer, and he always talk about the professional developers, the developers that, you know, give him a user manual, maybe just go at the APIs, he'll build anything, right, from system engine, go down there, and then through obstruction, you get the more the procedural logic type of engineers, the people that used to be able to write procedural logic and visual basic and so on and so forth. I think those developers right now are a little cut out of the picture. There's some No-code, Low-Code environment that are maybe gain some traction, I caught up with Adam Bosworth two weeks ago in New York and I asked him "What's happening to this higher level developers?" and you know what he is told me, and he is always a little bit out there, so I'm going to use his thought process here. He says, "ChapGPT", I mean, they will get to a point where this high level procedural logic will be written by, >> John: Computers. >> Computers, and so we may not need as many at the high level, but we still need the engineers down there. The point is the operation needs to get in front of them >> But, wait, wait, you seen the ChatGPT meme, I dunno if it's a Dilbert thing where it's like, "Time to tic" >> Yeah, yeah, yeah, I did that >> "Time to develop the code >> Five minutes, time to decode", you know, to debug the codes like five hours. So you know, the whole equation >> Well, this ChatGPT is a hot wave, everyone's been talking about it because I think it illustrates something that's NextGen, feels NextGen, and it's just getting started so it's going to get better. I mean people are throwing stones at it, but I think it's amazing. It's the equivalent of me seeing the browser for the first time, you know, like, "Wow, this is really compelling." This is game-changing, it's not just keyword chat bots. It's like this is real, this is next level, and I think the Supercloud wave that people are getting behind points to that and I think the question of Ops and Dev comes up because I think if you limit the infrastructure opportunity for a developer, I think they're going to be handicapped. I mean that's a general, my opinion, the thesis is you give more aperture to developers, more choice, more capabilities, more good things could happen, policy, and that's why you're seeing the convergence of networking people, virtualization talent, operational talent, get into the conversation because I think it's an infrastructure engineering opportunity. I think this is a seminal moment in a new stack that's emerging from an infrastructure, software virtualization, low-code, no-code layer that will be completely programmable by things like the next Chat GPT or something different, but yet still the mechanics and the plumbing will still need engineering. >> Sarbjeet: Oh yeah. >> So there's still going to be more stuff coming on. >> Yeah, we have, with the cloud, we have made the infrastructure programmable and you give the programmability to the programmer, they will be very creative with that and so we are being very creative with our infrastructure now and on top of that, we are being very creative with the silicone now, right? So we talk about that. That's part of it, by the way. So you write the code to the particle's silicone now, and on the flip side, the silicone is built for certain use cases for AI Inference and all that. >> You saw this at CES? >> Yeah, I saw at CES, the scenario is this, the Bosch, I spoke to Bosch, I spoke to John Deere, I spoke to AWS guys, >> Yeah. >> They were showcasing their technology there and I was spoke to Azure guys as well. So the Bosch is a good example. So they are building, they are right now using AWS. I have that interview on camera, I will put it some sometime later on there online. So they're using AWS on the back end now, but Bosch is the number one, number one or number two depending on what day it is of the year, supplier of the componentry to the auto industry, and they are creating a platform for our auto industry, so is Qualcomm actually by the way, with the Snapdragon. So they told me that customers, their customers, BMW, Audi, all the manufacturers, they demand the diversity of the backend. Like they don't want all, they, all of them don't want to go to AWS. So they want the choice on the backend. So whatever they cook in the middle has to work, they have to sprinkle the data for the data sovereign side because they have Chinese car makers as well, and for, you know, for other reasons, competitive reasons and like use. >> People don't go to, aw, people don't go to AWS either for political reasons or like competitive reasons or specific use cases, but for the most part, generally, I haven't met anyone who hasn't gone first choice with either, but that's me personally. >> No, but they're building. >> Point is the developer wants choice at the back end is what I'm hearing, but then finish that thought. >> Their developers want the choice, they want the choice on the back end, number one, because the customers are asking for, in this case, the customers are asking for it, right? But the customers requirements actually drive, their economics drives that decision making, right? So in the middle they have to, they're forced to cook up some solution which is vendor neutral on the backend or multicloud in nature. So >> Yeah, >> Every >> I mean I think that's nirvana. I don't think, I personally don't see that happening right now. I mean, I don't see the parody with clouds. So I think that's a challenge. I mean, >> Yeah, true. >> I mean the fact of the matter is if the development teams get fragmented, we had this chat with Kit Colbert last time, I think he's going to come on and I think he's going to talk about his keynote in a few, in an hour or so, development teams is this, the cloud is heterogenous, which is great. It's complex, which is challenging. You need skilled engineering to manage these clouds. So if you're a CIO and you go all in on AWS, it's hard. Then to then go out and say, "I want to be completely multi-vendor neutral" that's a tall order on many levels and this is the multicloud challenge, right? So, the question is, what's the strategy for me, the CIO or CISO, what do I do? I mean, to me, I would go all in on one and start getting hedges and start playing and then look at some >> Crystal clear. Crystal clear to me. >> Go ahead. >> If you're a CIO today, you have to build a platform engineering team, no question. 'Cause if we agree that we cannot tell the great developers what to do, we have to create a platform engineering team that using pieces of the Supercloud can build, and let's make this very pragmatic and give examples. First you need to be able to lay down the run time, okay? So you need a way to deploy multiple different Kubernetes environment in depending on the cloud. Okay, now we got that. The second part >> That's like table stakes. >> That are table stake, right? But now what is the advantage of having a Supercloud service to do that is that now you can put a policy in one place and it gets distributed everywhere consistently. So for example, you want to say, "If anybody in this organization across all these different buildings, all these developers don't even know, build a PCI compliant microservice, They can only talk to PCI compliant microservice." Now, I sleep tight. The developers still do that. Of course they're going to get their hands slapped if they don't encrypt some messages and say, "Oh, that should have been encrypted." So number one. The second thing I want to be able to say, "This service that this developer built over there better satisfy this SLA." So if the SLA is not satisfied, boom, I automatically spin up multiple instances to certify the SLA. Developers unencumbered, they don't even know. So this for me is like, CIO build a platform engineering team using one of the many Supercloud services that allow you to do that and lay down. >> And part of that is that the vendor behavior is such, 'cause the incentive is that they don't necessarily always work together. (John chuckling) I'll give you an example, we're going to hear today from Western Union. They're AWS shop, but they want to go to Google, they want to use some of Google's AI tools 'cause they're good and maybe they're even arguably better, but they're also a Snowflake customer and what you'll hear from them is Amazon and Snowflake are working together so that SageMaker can be integrated with Snowflake but Google said, "No, you want to use our AI tools, you got to use BigQuery." >> Yeah. >> Okay. So they say, "Ah, forget it." So if you have a platform engineering team, you can maybe solve some of that vendor friction and get competitive advantage. >> I think that the future proximity concept that I talk about is like, when you're doing one thing, you want to do another thing. Where do you go to get that thing, right? So that is very important. Like your question, John, is that your point is that AWS is ahead of the pack, which is true, right? They have the >> breadth of >> Infrastructure by a lot >> infrastructure service, right? They breadth of services, right? So, how do you, When do you bring in other cloud providers, right? So I believe that you should standardize on one cloud provider, like that's your primary, and for others, bring them in on as needed basis, in the subsection or sub portfolio of your applications or your platforms, what ever you can. >> So yeah, the Google AI example >> Yeah, I mean, >> Or the Microsoft collaboration software example. I mean there's always or the M and A. >> Yeah, but- >> You're going to get to run Windows, you can run Windows on Amazon, so. >> By the way, Supercloud doesn't mean that you cannot do that. So the perfect example is say that you're using Azure because you have a SQL server intensive workload. >> Yep >> And you're using Google for ML, great. If you are using some differentiated feature of this cloud, you'll have to go somewhere and configure this widget, but what you can abstract with the Supercloud is the lifecycle manage of the service that runs on top, right? So how does the service get deployed, right? How do you monitor performance? How do you lifecycle it? How you secure it that you can abstract and that's the value and eventually value will win. So the customers will find what is the values, obstructing in making it uniform or going deeper? >> How about identity? Like take identity for instance, you know, that's an opportunity to abstract. Whether I use Microsoft Identity or Okta, and I can abstract that. >> Yeah, and then we have APIs and standards that we can use so eventually I think where there is enough pain, the right open source will emerge to solve that problem. >> Dave: Yeah, I can use abstract things like object store, right? That's pretty simple. >> But back to the engineering question though, is that developers, developers, developers, one thing about developers psychology is if something's not right, they say, "Go get fixing. I'm not touching it until you fix it." They're very sticky about, if something's not working, they're not going to do it again, right? So you got to get it right for developers. I mean, they'll maybe tolerate something new, but is the "juice worth the squeeze" as they say, right? So you can't go to direct say, "Hey, it's, what's a work in progress? We're going to get our infrastructure together and the world's going to be great for you, but just hang tight." They're going to be like, "Get your shit together then talk to me." So I think that to me is the question. It's an Ops question, but where's that value for the developer in Supercloud where the capabilities are there, there's less friction, it's simpler, it solves the complexity problem. I don't need these high skilled labor to manage Amazon. I got services exposed. >> That's what we talked about earlier. It's like the Walmart example. They basically, they took away from the developer the need to spin up infrastructure and worry about all the governance. I mean, it's not completely there yet. So the developer could focus on what he or she wanted to do. >> But there's a big, like in our industry, there's a big sort of flaw or the contention between developers and operators. Developers want to be on the cutting edge, right? And operators want to be on the stability, you know, like we want governance. >> Yeah, totally. >> Right, so they want to control, developers are like these little bratty kids, right? And they want Legos, like they want toys, right? Some of them want toys by way. They want Legos, they want to build there and they want make a mess out of it. So you got to make sure. My number one advice in this context is that do it up your application portfolio and, or your platform portfolio if you are an ISV, right? So if you are ISV you most probably, you're building a platform these days, do it up in a way that you can say this portion of our applications and our platform will adhere to what you are saying, standardization, you know, like Kubernetes, like slam dunk, you know, it works across clouds and in your data center hybrid, you know, whole nine yards, but there is some subset on the next door systems of innovation. Everybody has, it doesn't matter if you're DMV of Kansas or you are, you know, metaverse, right? Or Meta company, right, which is Facebook, they have it, they are building something new. For that, give them some freedom to choose different things like play with non-standard things. So that is the mantra for moving forward, for any enterprise. >> Do you think developers are happy with the infrastructure now or are they wanting people to get their act together? I mean, what's your reaction, or you think. >> Developers are happy as long as they can do their stuff, which is running code. They want to write code and innovate. So to me, when Ballmer said, "Developer, develop, Developer, what he meant was, all you other people get your act together so these developers can do their thing, and to me the Supercloud is the way for IT to get there and let developer be creative and go fast. Why not, without getting in trouble. >> Okay, let's wrap up this segment with a super clip. Okay, we're going to do a sound bite that we're going to make into a short video for each of you >> All right >> On you guys summarizing why Supercloud's important, why this next wave is relevant for the practitioners, for the industry and we'll turn this into an Instagram reel, YouTube short. So we'll call it a "Super clip. >> Alright, >> Sarbjeet, you want, you want some time to think about it? You want to go first? Vittorio, you want. >> I just didn't mind. (all laughing) >> No, okay, okay. >> I'll do it again. >> Go back. No, we got a fresh one. We'll going to already got that one in the can. >> I'll go. >> Sarbjeet, you go first. >> I'll go >> What's your super clip? >> In software systems, abstraction is your friend. I always say that. Abstraction is your friend, even if you're super professional developer, abstraction is your friend. We saw from the MFC library from C++ days till today. Abstract, use abstraction. Do not try to reinvent what's already being invented. Leverage cloud, leverage the platform side of the cloud. Not just infrastructure service, but platform as a service side of the cloud as well, and Supercloud is a meta platform built on top of these infrastructure services from three or four or five cloud providers. So use that and embrace the programmability, embrace the abstraction layer. That's the key actually, and developers who are true developers or professional developers as you said, they know that. >> Awesome. Great super clip. Vittorio, another shot at the plate here for super clip. Go. >> Multicloud is awesome. There's a reason why multicloud happened, is because gave our developers the ability to innovate fast and ever before. So if you are embarking on a digital transformation journey, which I call a survival journey, if you're not innovating and transforming, you're not going to be around in business three, five years from now. You have to adopt the Supercloud so the developer can be developer and keep building great, innovating digital experiences for your customers and IT can get in front of it and not get in trouble together. >> Building those super apps with Supercloud. That was a great super clip. Vittorio, thank you for sharing. >> Thanks guys. >> Sarbjeet, thanks for coming on talking about the developer impact Supercloud 2. On our next segment, coming up right now, we're going to hear from Walmart enterprise architect, how they are building and they are continuing to innovate, to build their own Supercloud. Really informative, instructive from a practitioner doing it in real time. Be right back with Walmart here in Palo Alto. Thanks for watching. (gentle music)

Published Date : Feb 17 2023

SUMMARY :

the Supercloud momentum, and developers came up and you were like, and the conversations we've had. and cloud is the and the role of the stack is changing. I dropped that up there, so, developers are in the business units. the ability to do all because the rift points to What is the future platform? is what you just said. the developer, so to your question, You cannot tell developers what to do. Cannot tell them what to do. You can tell 'em your answer the question. but we give you a place to build, and you want to shave off the milliseconds they love the flexibility, you know, platform developers, you're saying. don't want deal with that muck. that are abstracted. Like how I see the Supercloud is So like if you put in front of them you mentioned platform. and I think there's the developers that, you The point is the operation to decode", you know, the browser for the first time, you know, going to be more stuff coming on. and on the flip side, the middle has to work, but for the most part, generally, Point is the developer So in the middle they have to, the parody with clouds. I mean the fact of the matter Crystal clear to me. in depending on the cloud. So if the SLA is not satisfied, boom, 'cause the incentive is that So if you have a platform AWS is ahead of the pack, So I believe that you should standardize or the M and A. you can run Windows on Amazon, so. So the perfect example is abstract and that's the value Like take identity for instance, you know, the right open source will Dave: Yeah, I can use abstract things and the world's going to be great for you, the need to spin up infrastructure on the stability, you know, So that is the mantra for moving forward, Do you think developers are happy and to me the Supercloud is for each of you for the industry you want some time to think about it? I just didn't mind. got that one in the can. platform side of the cloud. Vittorio, another shot at the the ability to innovate thank you for sharing. the developer impact Supercloud 2.

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Dave	PERSON	0.99+
BMW	ORGANIZATION	0.99+
Walmart	ORGANIZATION	0.99+
John	PERSON	0.99+
Sarbjeet	PERSON	0.99+
John Furrier	PERSON	0.99+
Bosch	ORGANIZATION	0.99+
Vittorio	PERSON	0.99+
Nvidia	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Audi	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Steve Ballmer	PERSON	0.99+
Qualcomm	ORGANIZATION	0.99+
Adam Bosworth	PERSON	0.99+
Palo Alto	LOCATION	0.99+
Facebook	ORGANIZATION	0.99+
New York	LOCATION	0.99+
Vittorio Viarengo	PERSON	0.99+
Kit Colbert	PERSON	0.99+
Ballmer	PERSON	0.99+
four	QUANTITY	0.99+
Sarbjeet Johal	PERSON	0.99+
five hours	QUANTITY	0.99+
VMware	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Palo Alto, California	LOCATION	0.99+
Microsoft	ORGANIZATION	0.99+
Five minutes	QUANTITY	0.99+
NextGen	ORGANIZATION	0.99+
StackPayne	ORGANIZATION	0.99+
Visual Basic	TITLE	0.99+
second part	QUANTITY	0.99+
12 different distributions	QUANTITY	0.99+
CES	EVENT	0.99+
First	QUANTITY	0.99+
Twitter	ORGANIZATION	0.99+
Kansas City	LOCATION	0.99+
second one	QUANTITY	0.99+
three	QUANTITY	0.99+
both	QUANTITY	0.99+
Kansas	LOCATION	0.98+
first time	QUANTITY	0.98+
Windows	TITLE	0.98+
last year	DATE	0.98+

Is Data Mesh the Killer App for Supercloud | Supercloud2

(gentle bright music) >> Okay, welcome back to our "Supercloud 2" event live coverage here at stage performance in Palo Alto syndicating around the world. I'm John Furrier with Dave Vellante. We've got exclusive news and a scoop here for SiliconANGLE and theCUBE. Zhamak Dehghani, creator of data mesh has formed a new company called NextData.com NextData, she's a cube alumni and contributor to our Supercloud initiative, as well as our coverage and breaking analysis with Dave Vellante on data, the killer app for Supercloud. Zhamak, great to see you. Thank you for coming into the studio and congratulations on your newly formed venture and continued success on the data mesh. >> Thank you so much. It's great to be here. Great to see you in person. >> Dave: Yeah, finally. >> John: Wonderful. Your contributions to the data conversation has been well-documented certainly by us and others in the industry. Data mesh taking the world by storm. Some people are debating it, throwing, you know, cold water on it. Some are, I think, it's the next big thing. Tell us about the data mesh super data apps that are emerging out of cloud. >> I mean, data mesh, as you said, it's, you know, the pain point that it surfaced were universal. Everybody said, "Oh, why didn't I think of that?" You know, it was just an obvious next step and people are approaching it, implementing it. I guess the last few years, I've been involved in many of those implementations, and I guess Supercloud is somewhat a prerequisite for it because it's data mesh and building applications using data mesh is about sharing data responsibly across boundaries. And those boundaries include boundaries, organizational boundaries cloud technology boundaries and trust boundaries. >> I want to bring that up because your venture, NextData which is new, just formed. Tell us about that. What wave is that riding? What specifically are you targeting? What's the pain point? >> Zhamak: Absolutely, yes. So next data is the result of, I suppose, the pains that I suffered from implementing a database for many of the organizations. Basically, a lot of organizations that I've worked with, they want decentralized data. So they really embrace this idea of decentralized ownership of the data, but yet they want interconnectivity through standard APIs, yet they want discoverability and governance. So they want to have policies implemented, they want to govern that data, they want to be able to discover that data and yet they want to decentralize it. And we do that with a developer experience that is easy and native to a generalist developer. So we try to find, I guess, the common denominator that solves those problems and enables that developer experience for data sharing. >> John: Since you just announced the news, what's been the reaction? >> Zhamak: I just announced the news right now, so what's the reaction? >> John: But people in the industry that know you, you did a lot of work in the area. What have been some of the feedback on the new venture in terms of the approach, the customers, problem? >> Yeah, so we've been in stealth modes, so we haven't publicly talked about it, but folks that have been close to us in fact have reached out. We already have implementations of our pilot platform with early customers, which is super exciting. And we're going to have multiple of those. Of course, we're a tiny, tiny company. We can have many of those where we are going to have multiple pilots, implementations of our platform in real world. We're real global large scale organizations that have real world problems. So we're not going to build our platform in vacuum. And that's what's happening right now. >> Zhamak: When I think about your role at ThoughtWorks, you had a very wide observation space with a number of clients helping them implement data mesh and other things as well prior to your data mesh initiative. But when I look at data mesh, at least the ones that I've seen, they're very narrow. I think of JPMC, I think of HelloFresh. They're generally obviously not surprising. They don't include the big vision of inclusivity across clouds across different data stores. But it seems like people are having to go through some gymnastics to get to, you know, the organizational reality of decentralizing data, and at least pushing data ownership to the line of business. How are you approaching or are you approaching, solving that problem? Are you taking a narrow slice? What can you tell us about Next Data? >> Zhamak: Sure, yeah, absolutely. Gymnastics, the cute word to describe what the organizations have to go through. And one of those problems is that, you know, the data, as you know, resides on different platforms. It's owned by different people, it's processed by pipelines that who owns them. So there's this very disparate and disconnected set of technologies that were very useful for when we thought about data and processing as a centralized problem. But when you think about data as a decentralized problem, the cost of integration of these technologies in a cohesive developer experience is what's missing. And we want to focus on that cohesive end-to-end developer experience to share data responsibly in this autonomous units, we call them data products, I guess in data mesh, right? That constitutes computation, that governs that data policies, discoverability. So I guess, I heard this expression in the last talks that you can have your cake and eat it too. So we want people have their cakes, which is, you know, data in different places, decentralization and eat it too, which is interconnected access to it. So we start with standardizing and codifying this idea of a data product container that encapsulates data computation, APIs to get to it in a technology agnostic way, in an open way. And then, sit on top and use existing existing tech, you know, Snowflake, Databricks, whatever exists, you know, the millions of dollars of investments that companies have made, sit on top of those but create this cohesive, integrated experience where data product is a first class primitive. And that's really key here, that the language, and the modeling that we use is really native to data mesh is that I will make a data product, I'm sharing a data product, and that encapsulates on providing metadata about this. I'm providing computation that's constantly changing the data. I'm providing the API for that. So we're trying to kind of codify and create a new developer experience based on that. And developer, both from provider side and user side connected to peer-to-peer data sharing with data product as a primitive first class concept. >> Okay, so the idea would be developers would build applications leveraging those data products which are discoverable and governed. Now, today you see some companies, you know, take a snowflake for example. >> Zhamak: Yeah. >> Attempting to do that within their own little walled garden. They even, at one point, used the term, "Mesh." I dunno if they pull back on that. And then they sort of became aware of some of your work. But a lot of the things that they're doing within their little insulated environment, you know, support that, that, you know, governance, they're building out an ecosystem. What's different in your vision? >> Exactly. So we realize that, you know, and this is a reality, like you go to organizations, they have a snowflake and half of the organization happily operates on Snowflake. And on the other half, oh, we are on, you know, bare infrastructure on AWS, or we are on Databricks. This is the realities, you know, this Supercloud that's written up here. It's about working across boundaries of technology. So we try to embrace that. And even for our own technology with the way we're building it, we say, "Okay, nobody's going to use next data mesh operating system. People will have different platforms." So you have to build with openness in mind, and in case of Snowflake, I think, you know, they have I'm sure very happy customers as long as customers can be on Snowflake. But once you cross that boundary of platforms then that becomes a problem. And we try to keep that in mind in our solution. >> So, it's worth reviewing that basically, the concept of data mesh is that, whether you're a data lake or a data warehouse, an S3 bucket, an Oracle database as well, they should be inclusive inside of the data. >> We did a session with AWS on the startup showcase, data as code. And remember, I wrote a blog post in 2007 called, "Data's the new developer kit." Back then, they used to call 'em developer kits, if you remember. And that we said at that time, whoever can code data >> Zhamak: Yes. >> Will have a competitive advantage. >> Aren't there machines going to be doing that? Didn't we just hear that? >> Well we have, and you know, Hey Siri, hey Cube. Find me that best video for data mesh. There it is. I mean, this is the point, like what's happening is that, now, data has to be addressable >> Zhamak: Yes. >> For machines and for coding. >> Zhamak: Yes. >> Because as you need to call the data. So the question is, how do you manage the complexity of big things as promiscuous as possible, making it available as well as then governing it because it's a trade off. The more you make open >> Zhamak: Definitely. >> The better the machine learning. >> Zhamak: Yes. >> But yet, the governance issue, so this is the, you need an OS to handle this maybe. >> Yes, well, we call our mental model for our platform is an OS operating system. Operating systems, you know, have shown us how you can kind of abstract what's complex and take care of, you know, a lot of complexities, but yet provide an open and, you know, dynamic enough interface. So we think about it that way. We try to solve the problem of policies live with the data. An enforcement of the policies happens at the most granular level which is, in this concept, the data product. And that would happen whether you read, write, or access a data product. But we can never imagine what are these policies could be. So our thinking is, okay, we should have a open policy framework that can allow organizations write their own policy drivers, and policy definitions, and encode it and encapsulated in this data product container. But I'm not going to fool myself to say that, you know, that's going to solve the problem that you just described. I think we are in this, I don't know, if I look into my crystal ball, what I think might happen is that right now, the primitives that we work with to train machine-learning model are still bits and bites in data. They're fields, rows, columns, right? And that creates quite a large surface area, an attack area for, you know, for privacy of the data. So perhaps, one of the trends that we might see is this evolution of data APIs to become more and more computational aware to bring the compute to the data to reduce that surface area so you can really leave the control of the data to the sovereign owners of that data, right? So that data product. So I think the evolution of our data APIs perhaps will become more and more computational. So you describe what you want, and the data owner decides, you know, how to manage the- >> John: That's interesting, Dave, 'cause it's almost like we just talked about ChatGPT in the last segment with you, who's a machine learning, could really been around the industry. It's almost as if you're starting to see reason come into the data, reasoning. It's like you starting to see not just metadata, using the data to reason so that you don't have to expose the raw data. It's almost like a, I won't say curation layer, but an intelligence layer. >> Zhamak: Exactly. >> Can you share your vision on that 'cause that seems to be where the dots are connecting. >> Zhamak: Yes, this is perhaps further into the future because just from where we stand, we have to create still that bridge of familiarity between that future and present. So we are still in that bridge-making mode, however, by just the basic notion of saying, "I'm going to put an API in front of my data, and that API today might be as primitive as a level of indirection as in you tell me what you want, tell me who you are, let me go process that, all the policies and lineage, and insert all of this intelligence that need to happen. And then I will, today, I will still give you a file. But by just defining that API and standardizing it, now we have this amazing extension point that we can say, "Well, the next revision of this API, you not just tell me who you are, but you actually tell me what intelligence you're after. What's a logic that I need to go and now compute on your API?" And you can kind of evolve that, right? Now you have a point of evolution to this very futuristic, I guess, future where you just describe the question that you're asking from the chat. >> Well, this is the Supercloud, Dave. >> I have a question from a fan, I got to get it in. It's George Gilbert. And so, his question is, you're blowing away the way we synchronize data from operational systems to the data stack to applications. So the concern that he has, and he wants your feedback on this, "Is the data product app devs get exposed to more complexity with respect to moving data between data products or maybe it's attributes between data products, how do you respond to that? How do you see, is that a problem or is that something that is overstated, or do you have an answer for that?" >> Zhamak: Absolutely. So I think there's a sweet spot in getting data developers, data product developers closer to the app, but yet not burdening them with the complexity of the application and application logic, and yet reducing their cognitive load by localizing what they need to know about which is that domain where they're operating within. Because what's happening right now? what's happening right now is that data engineers, a ton of empathy for them for their high threshold of pain that they can, you know, deal with, they have been centralized, they've put into the data team, and they have been given this unbelievable task of make meaning out of data, put semantic over it, curates it, cleans it, and so on. So what we are saying is that get those folks embedded into the domain closer to the application developers, these are still separately moving units. Your app and your data products are independent but yet tightly closed with each other, tightly coupled with each other based on the context of the domain, so reduce cognitive load by localizing what they need to know about to the domain, get them closer to the application but yet have them them separate from app because app provides a very different service. Transactional data for my e-commerce transaction, data product provides a very different service, longitudinal data for the, you know, variety of this intelligent analysis that I can do on the data. But yet, it's all within the domain of e-commerce or sales or whatnot. >> So a lot of decoupling and coupling create that cohesiveness. >> Zhamak: Absolutely. >> Architecture. So I have to ask you, this is an interesting question 'cause it came up on theCUBE all last year. Back on the old server, data center days and cloud, SRE, Google coined the term, "Site Reliability Engineer" for someone to look over the hundreds of thousands of servers. We asked a question to data engineering community who have been suffering, by the way, agree. Is there an SRE-like role for data? Because in a way, data engineering, that platform engineer, they are like the SRE for data. In other words, managing the large scale to enable automation and cell service. What's your thoughts and reaction to that? >> Zhamak: Yes, exactly. So, maybe we go through that history of how SRE came to be. So we had the first DevOps movement which was, remove the wall between dev and ops and bring them together. So you have one cross-functional units of the organization that's responsible for, you build it you run it, right? So then there is no, I'm going to just shoot my application over the wall for somebody else to manage it. So we did that, and then we said, "Okay, as we decentralized and had this many microservices running around, we had to create a layer that abstracted a lot of the complexity around running now a lot or monitoring, observing and running a lot while giving autonomy to this cross-functional team." And that's where the SRE, a new generation of engineers came to exist. So I think if I just look- >> Hence Borg, hence Kubernetes. >> Hence, hence, exactly. Hence chaos engineering, hence embracing the complexity and messiness, right? And putting engineering discipline to embrace that and yet give a cohesive and high integrity experience of those systems. So I think, if we look at that evolution, perhaps something like that is happening by bringing data and apps closer and make them these domain-oriented data product teams or domain oriented cross-functional teams, full stop, and still have a very advanced maybe at the platform infrastructure level kind of operational team that they're not busy doing two jobs which is taking care of domains and the infrastructure, but they're building infrastructure that is embracing that complexity, interconnectivity of this data process. >> John: So you see similarities. >> Absolutely, but I feel like we're probably in a more early days of that movement. >> So it's a data DevOps kind of thing happening where scales happening. It's good things are happening yet. Eh, a little bit fast and loose with some complexities to clean up. >> Yes, yes. This is a different restructure. As you said we, you know, the job of this industry as a whole on architects is decompose, recompose, decompose, recomposing a new way, and now we're like decomposing centralized team, recomposing them as domains and- >> John: So is data mesh the killer app for Supercloud? >> You had to do this for me. >> Dave: Sorry, I couldn't- (John and Dave laughing) >> Zhamak: What do you want me to say, Dave? >> John: Yes. >> Zhamak: Yes of course. >> I mean Supercloud, I think it's, really the terminology's Supercloud, Opencloud. But I think, in spirits of it, this embracing of diversity and giving autonomy for people to make decisions for what's right for them and not yet lock them in. I think just embracing that is baked into how data mesh assume the world would work. >> John: Well thank you so much for coming on Supercloud too, really appreciate it. Data has driven this conversation. Your success of data mesh has really opened up the conversation and exposed the slow moving data industry. >> Dave: Been a great catalyst. (John laughs) >> John: That's now going well. We can move faster, so thanks for coming on. >> Thank you for hosting me. It was wonderful. >> Okay, Supercloud 2 live here in Palo Alto. Our stage performance, I'm John Furrier with Dave Vellante. We're back with more after this short break, Stay with us all day for Supercloud 2. (gentle bright music)

Published Date : Feb 17 2023

SUMMARY :

and continued success on the data mesh. Great to see you in person. and others in the industry. I guess the last few years, What's the pain point? a database for many of the organizations. in terms of the approach, but folks that have been close to us to get to, you know, the data, as you know, resides Okay, so the idea would be developers But a lot of the things that they're doing This is the realities, you know, inside of the data. And that we said at that Well we have, and you know, So the question is, how do so this is the, you need and the data owner decides, you know, so that you don't have 'cause that seems to be where of this API, you not So the concern that he has, into the domain closer to So a lot of decoupling So I have to ask you, this a lot of the complexity of domains and the infrastructure, in a more early days of that movement. to clean up. the job of this industry the world would work. John: Well thank you so much for coming Dave: Been a great catalyst. We can move faster, so Thank you for hosting me. after this short break,

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
John	PERSON	0.99+
Zhamak	PERSON	0.99+
Dave	PERSON	0.99+
George Gilbert	PERSON	0.99+
AWS	ORGANIZATION	0.99+
2007	DATE	0.99+
Palo Alto	LOCATION	0.99+
John Furrier	PERSON	0.99+
John Furrier	PERSON	0.99+
Zhamak Dehghani	PERSON	0.99+
JPMC	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Dav	PERSON	0.99+
two jobs	QUANTITY	0.99+
Supercloud	ORGANIZATION	0.99+
NextData	ORGANIZATION	0.99+
today	DATE	0.99+
Opencloud	ORGANIZATION	0.99+
last year	DATE	0.99+
Siri	TITLE	0.99+
ThoughtWorks	ORGANIZATION	0.98+
NextData.com	ORGANIZATION	0.98+
Supercloud 2	EVENT	0.98+
both	QUANTITY	0.98+
one	QUANTITY	0.98+
HelloFresh	ORGANIZATION	0.98+
first	QUANTITY	0.98+
millions of dollars	QUANTITY	0.96+
Snowflake	EVENT	0.96+
Oracle	ORGANIZATION	0.96+
SRE	TITLE	0.94+
Snowflake	ORGANIZATION	0.94+
Cube	PERSON	0.93+
Zhama	PERSON	0.92+
Data Mesh the Killer App	TITLE	0.92+
SiliconANGLE	ORGANIZATION	0.91+
Databricks	ORGANIZATION	0.9+
first class	QUANTITY	0.89+
Supercloud 2	ORGANIZATION	0.88+
theCUBE	ORGANIZATION	0.88+
hundreds of thousands	QUANTITY	0.85+
one point	QUANTITY	0.84+
Zham	PERSON	0.83+
Supercloud	EVENT	0.83+
ChatGPT	ORGANIZATION	0.72+
SRE	ORGANIZATION	0.72+
Borg	PERSON	0.7+
Snowflake	TITLE	0.66+
Supercloud	TITLE	0.65+
half	QUANTITY	0.64+

Opening Keynote | Supercloud2

(intro music plays) >> Okay, welcome back to Supercloud 2. I'm John Furrier with my co-host, Dave Vellante, here in our Palo Alto Studio, with a live performance all day unpacking the wave of Supercloud. This is our second edition. Back for keynote review here is Vittorio Viarengo, talking about the hype and the reality of the Supercloud momentum. Vittorio, great to see you. You got a presentation. Looking forward to hearing the update. >> It's always great to be here on this stage with you guys. >> John Furrier: (chuckles) So the business imperative for cloud right now is clear and the Supercloud wave points to the builders and they want to break through. VMware, you guys have a lot of builders in the ecosystem. Where do you guys see multicloud today? What's going on? >> So, what we see is, when we talk with our customers is that customers are in a state of cloud chaos. Raghu Raghuram, our CEO, introduced this term at our user conference and it really resonated with our customers. And the chaos comes from the fact that most enterprises have applications spread across private cloud, multiple hyperscalers, and the edge increasingly. And so with that, every hyperscaler brings their own vertical integrated stack of infrastructure development, platform security, and so on and so forth. And so our customers are left with a ballooning cost because they have to train their employees across multiple stacks. And the costs are only going up. >> John Furrier: Have you talked about the Supercloud with your customers? What are they looking for when they look at the business value of Cross-Cloud Services? Why are they digging into it? What are some of the reasons? >> First of all, let's put this in perspective. 90, 87% of customers use two or more cloud including the private cloud. And 55%, get this, 55% use three or more clouds, right? And so, when you talk to these customers they're all asking for two things. One, they find that managing the multicloud is more difficult than the private cloud. And that goes without saying because it's new, they don't have the skills, and they have many of these. And pretty much everybody, 87% of them, are seeing their cost getting out of control. And so they need a new approach. We believe that the industry needs a new approach to solving the multicloud problem, which you guys have introduced and you call it the Supercloud. We call it Cross-Cloud Services. But the idea is that- and the parallel goes back to the private cloud. In the private cloud, if you remember the old days, before we called it the private cloud, we would install SAP. And the CIO would go, "Oh, I hear SAP works great on HP hardware. Oh, let's buy the HP stack", right? (hosts laugh) And then you go, "Oh, oh, Oracle databases. They run phenomenally on Sun Stack." That's another stack. And it wasn't sustainable, right? And so, VMware came in with virtualization and made everything look the same. And we unleashed a tremendous era of growth and speed and cost saving for our customers. So we believe, and I think the industry also believes, if you look at the success of Supercloud, first instance and today, that we need to create a new level of abstraction in the cloud. And this abstraction needs to be at a higher level. It needs to be built around the lingua franca of the cloud, which is Kubernetes, APIs, open source stacks. And by doing so, we're going to allow our customers to have a more unified way of building, managing, running, connecting, and securing applications across cloud. >> So where should that standardization occur? 'Cause we're going to hear from some customers today. When I ask them about cloud chaos, they're like, "Well, the way we deal with cloud chaos is MonoCloud". They sort of put on the blinders, right? But of course, they may be risking not being able to take advantage of best-of-breed. So where should that standardization layer occur across clouds? >> [Vittorio Viarengo] Well, I also hear that from some customers. "Oh, we are one cloud". They are in denial. There's no question about it. In fact, when I met at our user conference with a number of CIOs, and I went around the room and I asked them, I saw the entire spectrum. (laughs) The person is in denial. "Oh, we're using AWS." I said, "Great." "And the private cloud, so we're all set." "Okay, thank you. Next." "Oh, the business units are using AWS." "Ah, okay. So you have three." "Oh, and we just bought a company that is using Google back in Europe." So, okay, so you got four right there. So that person in denial. Then, you have the second category of customers that are seeing the problem, they're ahead of the pack, and they're building their solution. We're going to hear from Walmart later today. >> Dave Vellante: Yeah. >> So they're building their own. Not everybody has the skills and the scale of Walmart to build their own. >> Dave Vellante: Right. >> So, eventually, then you get to the third category of customers. They're actually buying solutions from one of the many ISVs that you are going to talk with today. You know, whether it is Azure Corp or Snowflake or all this. I will argue, any new company, any new ISV, is by definition a multicloud service company, right? And so these people... Or they're buying our Cross-Cloud Services to solve this problem. So that's the spectrum of customers out there. >> What's the stack you're focusing on specifically? What is VMware? Because virtualization is not going away. You're seeing a lot more in the cloud with networking, for example, this abstraction layer. What specifically are you guys focusing on? >> [Vittorio Viarengo] So, I like to talk about this beyond what VMware does, just 'cause I think this is an industry movement. A market is forming around multicloud services. And so it's an approach that pretty much a whole industry is taking of building this abstraction layer. In our approach, it is to bring these services together to simplify things even further. So, initially, we were the first to see multicloud happening. You know, Raghu and Sanjay, back in what, like 2016, 17, saw this coming and our first foray in multicloud was to take this sphere and our hypervisor and port it natively on all the hyperscaling, which is a phenomenal solution to get your enterprise application in the cloud and modernize them. But then we realized that customers were already in the cloud natively. And so we had to have (all chuckle) a religion discussion internally and drop that hypervisor religion and say, "Hey, we need to go and help our customers where they are, in a native cloud". And that's where we brought back Pivotal. We built tons around it. We shifted. And then Aria. And so basically, our evolution was to go from, you know, our hypervisor to cloud native. And then eventually we ended up at what we believe is the most comprehensive multicloud services solution that covers Application Development with Tanzu, Management with Aria, and then you have NSX for security and user computing for connectivity. And so we believe that we have the most comprehensive set of integrated services to solve the challenges of multicloud, bringing excess simplicity into the picture. >> John Furrier: As some would say, multicloud and multi environment, when you get to the distributed computing with the edge, you're going to need that capability. And you guys have been very successful with private cloud. But to be devil's advocate, you guys have been great with private cloud, but some are saying like, you guys don't get public cloud yet. How do you answer that? Because there's a lot of work that you guys have done in public cloud and it seems like private cloud successes are moving up into public cloud. Like networking. You're seeing a lot of that being configured in. So the enterprise-grade solutions are moving into the cloud. So what would you say to the skeptics out there that say, "Oh, I think you got private cloud nailed down, but you don't really have public cloud." (chuckles) >> [Vittorio Viarengo] First of all, we love skeptics. Our engineering team love skeptics and love to prove them wrong. (John laughs) And I would never ever bet against our engineering team. So I believe that VMware has been so successful in building a private cloud and the technology that actually became the foundation for the public cloud. But that is always hard, to be known in a new environment, right? There's always that period where you have to prove yourself. But what I love about VMware is that VMware has what I believe, what I like to call "enterprise pragmatism". The private cloud is not going away. So we're going to help our customers there, and then, as they move to the cloud, we are going to give them an option to adopt the cloud at their own pace, with VMware cloud, to allow them to move to the cloud and be able to rely on the enterprise-class capabilities we built on-prem in the cloud. But then with Tanzu and Aria and the rest of the Cross-Cloud Service portfolio, being able to meet them where they are. If they're already in the cloud, have them have a single place to build application, a single place to manage application, and so on and so forth. >> John Furrier: You know, Dave, we were talking in the opening. Vittorio, I want to get your reaction to this because we were saying in the opening that the market's obviously pushing this next gen. You see ChatGPT and the success of these new apps that are coming out. The business models are demanding kind of a digital transformation. The tech, the builders, are out there, and you guys have a interesting view because your customer base is almost the canary in the coal mine because this is an Operations challenge as well as just enabling the cloud native. So, I want to get your thoughts on, you know, your customer base, VMware customers. They've been in IT Ops for generations. And now, as that crowd moves and sees this Supercloud environment, it's IT again, but it's everywhere. It's not just IT in a data center. It's on-premises, it's cloud, it's edge. So, almost, your customer base is like a canary in the coal mine for this movement of how do you operationalize multiple environments? Which includes clouds, which includes apps. I mean, this is the core question. >> [Vittorio Viarengo] Yeah. And I want to make this an industry conversation. Forget about VMware for a second. We believe that there are like four or five major pillars that you need to implement to create this level of abstraction. It starts from observability. If you don't know- You need to know where your apps are, where your data is, how the the applications are performing, what is the security posture, what is their performance? So then, you can do something about it. We call that the observability part of this, creating this abstraction. The second one is security. So you need to be- Sorry. Infrastructure. An infrastructure. Creating an abstraction layer for infrastructure means to be able to give the applications, and the developer who builds application, the right infrastructure for the application at the right time. Whether it is a VM, whether it's a Kubernetes cluster, or whether it's microservices, and so on and so forth. And so, that allows our developers to think about infrastructure just as code. If it is available, whatever application needs, whatever the cost makes sense for my application, right? The third part of security, and I can give you a very, very simple example. Say that I was talking to a CIO of a major insurance company in Europe and he is saying to me, "The developers went wild, built all these great front office applications. Now the business is coming to me and says, 'What is my compliance report?'" And the guy is saying, "Say that I want to implement the policy that says, 'I want to encrypt all my data no matter where it resides.' How does it do it? It needs to have somebody logging in into Amazon and configure it, then go to Google, configure it, go to the private cloud." That's time and cost, right? >> Yeah. >> So, you need to have a way to enforce security policy from the infrastructure to the app to the firewall in one place and distribute it across. And finally, the developer experience, right? Developers, developers, developers. (all laugh) We're always trying to keep up with... >> Host: You can dance if you want to do... >> [Vittorio Viarengo] Yeah, let's not make a fool of ourselves. More than usual. Developers are the kings and queens of the hill. They are. Why? Because they build the application. They're making us money and saving us money. And so we need- And right now, they have to go into these different stacks. So, you need to give developers two things. One, a common development experience across this different Kubernetes distribution. And two, a way for the operators. To your point. The operators have fallen behind the developers. And they cannot go to the developer there and tell them, "This is how you're going to do things." They have to see how they're doing things and figure out how to bring the gallery underneath so that developers can be developers, but the operators can lay down the tracks and the infrastructure there is secure and compliant. >> Dave Vellante: So two big inferences from that. One is self-serve infrastructure. You got- In a decentralized cloud, a Supercloud world, you got to have self-serve infrastructure, you got to be simple. And the second is governance. You mentioned security, but it's also governance. You know, data sovereignty as we talked about. So the question I have, Vittorio, is where does the customer start? >> [Vittorio Viarengo] So I, it always depends on the business need, but to me, the foundational layer is observability. If you don't know where your staff is, you cannot manage, you cannot secure it, you cannot manage its cost, right? So I think observability is the bar to entry. And then it depends on the business needs, right? So, we go back to the CIO that I talked to. He is clearly struggling with compliance and security. >> Hosts: Mm hmm. >> And so, like many customers. And so, that's maybe where they start. There are other customers that are a little behind the head of the pack in terms of building applications, right? And so they're looking at these, you know, innovative companies that have the developers that get the cloud and build all these application. They are leader in the industry. They're saying, "How do I get some of that?" Well, the way you get some of that is by adopting modern application development and platform operational capabilities. So, that's maybe, that's where they should start. And so on and so forth. It really depends on the business. To me, observability is the foundational part of this. >> John Furrier: Vittorio, we've been on this conversation with you for over a year and a half now with Supercloud. You've been a leader in seeing the wave, you and Raghu and the team at VMware, among other industry leaders. This is our second event. If you're- In the minute and a half that we have left, when you get asked, "what is this Supercloud multicloud Cross-Cloud thing? What's it mean?" I mean, I mentioned earlier, the market, the business models are changing, tech's changing, society needs more economic value out of the cloud. Builders are out there. If someone says, "Hey, Vittorio, what's the bottom line? What's really going on? Why should I pay attention to this wave? What's going on?" How would you describe the relevance of Supercloud? >> I think that this industry is full of smart vendors and smart customers. And if we are smart about it, we look at the history of IT and the history of IT repeats itself over and over again. You follow the- He said, "Follow the money." I say, "Follow the developers." That's how I made my career. I follow great developers. I look at, you know, Kit Colbert. I say, "Okay. I'm going to get behind that guy wherever he is going." And I try to add value to that person. I look at Raghu and all the great engineers that I was blessed to work with. And so the engineers go and explore new territories and then the rest of the stacks moves around. The developers have gone multicloud. And just like in any iteration of IT, at some point, the way you get the right scales at the right cost is with abstractions. And you can see it everywhere from, you know, bits and bytes, integration, to SOA, to APIs and microservices. You can see it now from best-of-breed hyperscaler across multiple clouds to creating an abstraction layer, a Supercloud, that creates a unified way of building, managing, running, securing, and accessing applications. So if you're a customer- (laughs) A minute and a half. (hosts chuckle) If you are customers that are out there and feeling the pain, you got to adopt this. If you are customers that is behind and saying, "Maybe you're in denial" look at the customers that are solving the problems today, and we're going to have some today. See what they're doing and learn from them so you don't make the same mistakes and you can get there ahead of it. >> Dave Vellante: Gracely's Law, John. Brian Gracely. That history repeats itself and- >> John Furrier: And I think one of these, "follow the developers" is interesting. And the other big wave, I want to get your comment real quick, is that developers aren't just application developers. They're network developers. The stack has completely been software-enabled so that you have software-defined networking, you have all kinds of software at all aspects of observability, infrastructure, security. The developers are everywhere. It's not just software. Software is everywhere. >> [Vittorio Viarengo] Yeah. Developers, developers, developers. The other thing that we can tell, I can tell, and we know, because we live in Silicon Valley. We worship developers but if you are out there in manufacturing, healthcare... If you have developers that understand this stuff, pamper them, keep them happy. (hosts laugh) If you don't have them, figure out where they hang out and go recruit them because developers indeed make the IT world go round. >> John Furrier: Vittorio, thank you for coming on with that opening keynote here for Supercloud 2. We're going to unpack what Supercloud is all about in our second edition of our live performance here in Palo Alto. Virtual event. We're going to talk to customers, experts, leaders, investors, everyone who's looking at the future, what's being enabled by this new big wave coming on called Supercloud. I'm John Furrier with Dave Vellante. We'll be right back after this short break. (ambient theme music plays)

Published Date : Feb 17 2023

SUMMARY :

of the Supercloud momentum. on this stage with you guys. and the Supercloud wave And the chaos comes from the fact And the CIO would go, "Well, the way we deal with that are seeing the problem, and the scale of Walmart So that's the spectrum You're seeing a lot more in the cloud and then you have NSX for security And you guys have been very and the rest of the that the market's obviously Now the business is coming to me and says, from the infrastructure if you want to do... and the infrastructure there And the second is governance. is the bar to entry. Well, the way you get some of that out of the cloud. the way you get the right scales Dave Vellante: Gracely's Law, John. And the other big wave, make the IT world go round. We're going to unpack what

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Dave	PERSON	0.99+
Europe	LOCATION	0.99+
John Furrier	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Vittorio Viarengo	PERSON	0.99+
Vittorio	PERSON	0.99+
Kit Colbert	PERSON	0.99+
Palo Alto	LOCATION	0.99+
John	PERSON	0.99+
VMware	ORGANIZATION	0.99+
Brian Gracely	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Silicon Valley	LOCATION	0.99+
three	QUANTITY	0.99+
55%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
Azure Corp	ORGANIZATION	0.99+
four	QUANTITY	0.99+
two	QUANTITY	0.99+
two things	QUANTITY	0.99+
third category	QUANTITY	0.99+
87%	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
Snowflake	ORGANIZATION	0.99+
2016	DATE	0.99+
second edition	QUANTITY	0.99+
A minute and a half	QUANTITY	0.99+
second event	QUANTITY	0.99+
second category	QUANTITY	0.99+
Raghu Raghuram	PERSON	0.99+
One	QUANTITY	0.99+
Supercloud2	EVENT	0.99+
first	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
Tanzu	ORGANIZATION	0.99+
today	DATE	0.99+
Supercloud	ORGANIZATION	0.98+
Aria	ORGANIZATION	0.98+
third part	QUANTITY	0.98+
Gracely	PERSON	0.98+
one	QUANTITY	0.98+
second	QUANTITY	0.97+
HP	ORGANIZATION	0.97+
second one	QUANTITY	0.97+
five major pillars	QUANTITY	0.97+
SAP	ORGANIZATION	0.97+
17	DATE	0.97+
over a year and a half	QUANTITY	0.96+
First	QUANTITY	0.96+
one cloud	QUANTITY	0.96+
first instance	QUANTITY	0.96+

Discussion about Walmart's Approach | Supercloud2

(upbeat electronic music) >> Okay, welcome back to Supercloud 2, live here in Palo Alto. I'm John Furrier, with Dave Vellante. Again, all day wall-to-wall coverage, just had a great interview with Walmart, we've got a Next interview coming up, you're going to hear from Bob Muglia and Tristan Handy, two experts, both experienced entrepreneurs, executives in technology. We're here to break down what just happened with Walmart, and what's coming up with George Gilbert, former colleague, Wikibon analyst, Gartner Analyst, and now independent investor and expert. George, great to see you, I know you're following this space. Like you read about it, remember the first days when Dataverse came out, we were talking about them coming out of Berkeley? >> Dave: Snowflake. >> John: Snowflake. >> Dave: Snowflake In the early days. >> We, collectively, have been chronicling the data movement since 2010, you were part of our team, now you've got your nose to the grindstone, you're seeing the next wave. What's this all about? Walmart building their own super cloud, we got Bob Muglia talking about how these next wave of apps are coming. What are the super apps? What's the super cloud to you? >> Well, this key's off Dave's really interesting questions to Walmart, which was like, how are they building their supercloud? 'Cause it makes a concrete example. But what was most interesting about his description of the Walmart WCMP, I forgot what it stood for. >> Dave: Walmart Cloud Native Platform. >> Walmart, okay. He was describing where the logic could run in these stateless containers, and maybe eventually serverless functions. But that's just it, and that's the paradigm of microservices, where the logic is in this stateless thing, where you can shoot it, or it fails, and you can spin up another one, and you've lost nothing. >> That was their triplet model. >> Yeah, in fact, and that was what they were trying to move to, where these things move fluidly between data centers. >> But there's a but, right? Which is they're all stateless apps in the cloud. >> George: Yeah. >> And all their stateful apps are on-prem and VMs. >> Or the stateful part of the apps are in VMs. >> Okay. >> And so if they really want to lift their super cloud layer off of this different provider's infrastructure, they're going to need a much more advanced software platform that manages data. And that goes to the -- >> Muglia and Handy, that you and I did, that's coming up next. So the big takeaway there, George, was, I'll set it up and you can chime in, a new breed of data apps is emerging, and this highly decentralized infrastructure. And Tristan Handy of DBT Labs has a sort of a solution to begin the journey today, Muglia is working on something that's way out there, describe what you learned from it. >> Okay. So to talk about what the new data apps are, and then the platform to run them, I go back to the using what will probably be seen as one of the first data app examples, was Uber, where you're describing entities in the real world, riders, drivers, routes, city, like a city plan, these are all defined by data. And the data is described in a structure called a knowledge graph, for lack of a, no one's come up with a better term. But that means the tough, the stuff that Jack built, which was all stateless and sits above cloud vendors' infrastructure, it needs an entirely different type of software that's much, much harder to build. And the way Bob described it is, you're going to need an entirely new data management infrastructure to handle this. But where, you know, we had this really colorful interview where it was like Rock 'Em Sock 'Em, but they weren't really that much in opposition to each other, because Tristan is going to define this layer, starting with like business intelligence metrics, where you're defining things like bookings, billings, and revenue, in business terms, not in SQL terms -- >> Well, business terms, if I can interrupt, he said the one thing we haven't figured out how to APIify is KPIs that sit inside of a data warehouse, and that's essentially what he's doing. >> George: That's what he's doing, yes. >> Right. And so then you can now expose those APIs, those KPIs, that sit inside of a data warehouse, or a data lake, a data store, whatever, through APIs. >> George: And the difference -- >> So what does that do for you? >> Okay, so all of a sudden, instead of working at technical data terms, where you're dealing with tables and columns and rows, you're dealing instead with business entities, using the Uber example of drivers, riders, routes, you know, ETA prices. But you can define, DBT will be able to define those progressively in richer terms, today they're just doing things like bookings, billings, and revenue. But Bob's point was, today, the data warehouse that actually runs that stuff, whereas DBT defines it, the data warehouse that runs it, you can't do it with relational technology >> Dave: Relational totality, cashing architecture. >> SQL, you can't -- >> SQL caching architectures in memory, you can't do it, you've got to rethink down to the way the data lake is laid out on the disk or cache. Which by the way, Thomas Hazel, who's speaking later, he's the chief scientist and founder at Chaos Search, he says, "I've actually done this," basically leave it in an S3 bucket, and I'm going to query it, you know, with no caching. >> All right, so what I hear you saying then, tell me if I got this right, there are some some things that are inadequate in today's world, that's not compatible with the Supercloud wave. >> Yeah. >> Specifically how you're using storage, and data, and stateful. >> Yes. >> And then the software that makes it run, is that what you're saying? >> George: Yeah. >> There's one other thing you mentioned to me, it's like, when you're using a CRM system, a human is inputting data. >> George: Nothing happens till the human does something. >> Right, nothing happens until that data entry occurs. What you're talking about is a world that self forms, polling data from the transaction system, or the ERP system, and then builds a plan without human intervention. >> Yeah. Something in the real world happens, where the user says, "I want a ride." And then the software goes out and says, "Okay, we got to match a driver to the rider, we got to calculate how long it takes to get there, how long to deliver 'em." That's not driven by a form, other than the first person hitting a button and saying, "I want a ride." All the other stuff happens autonomously, driven by data and analytics. >> But my question was different, Dave, so I want to get specific, because this is where the startups are going to come in, this is the disruption. Snowflake is a data warehouse that's in the cloud, they call it a data cloud, they refactored it, they did it differently, the success, we all know it looks like. These areas where it's inadequate for the future are areas that'll probably be either disrupted, or refactored. What is that? >> That's what Muglia's contention is, that the DBT can start adding that layer where you define these business entities, they're like mini digital twins, you can define them, but the data warehouse isn't strong enough to actually manage and run them. And Muglia is behind a company that is rethinking the database, really in a fundamental way that hasn't been done in 40 or 50 years. It's the first, in his contention, the first real rethink of database technology in a fundamental way since the rise of the relational database 50 years ago. >> And I think you admit it's a real Hail Mary, I mean it's quite a long shot right? >> George: Yes. >> Huge potential. >> But they're pretty far along. >> Well, we've been talking on theCUBE for 12 years, and what, 10 years going to AWS Reinvent, Dave, that no one database will rule the world, Amazon kind of showed that with them. What's different, is it databases are changing, or you can have multiple databases, or? >> It's a good question. And the reason we've had multiple different types of databases, each one specialized for a different type of workload, but actually what Muglia is behind is a new engine that would essentially, you'll never get rid of the data warehouse, or the equivalent engine in like a Databricks datalake house, but it's a new engine that manages the thing that describes all the data and holds it together, and that's the new application platform. >> George, we have one minute left, I want to get real quick thought, you're an investor, and we know your history, and the folks watching, George's got a deep pedigree in investment data, and we can testify against that. If you're going to invest in a company right now, if you're a customer, I got to make a bet, what does success look like for me, what do I want walking through my door, and what do I want to send out? What companies do I want to look at? What's the kind of of vendor do I want to evaluate? Which ones do I want to send home? >> Well, the first thing a customer really has to do when they're thinking about next gen applications, all the people have told you guys, "we got to get our data in order," getting that data in order means building an integrated view of all your data landscape, which is data coming out of all your applications. It starts with the data model, so, today, you basically extract data from all your operational systems, put it in this one giant, central place, like a warehouse or lake house, but eventually you want this, whether you call it a fabric or a mesh, it's all the data that describes how everything hangs together as in one big knowledge graph. There's different ways to implement that. And that's the most critical thing, 'cause that describes your Uber landscape, your Uber platform. >> That's going to power the digital transformation, which will power the business transformation, which powers the business model, which allows the builders to build -- >> Yes. >> Coders to code. That's Supercloud application. >> Yeah. >> George, great stuff. Next interview you're going to see right here is Bob Muglia and Tristan Handy, they're going to unpack this new wave. Great segment, really worth unpacking and reading between the lines with George, and Dave Vellante, and those two great guests. And then we'll come back here for the studio for more of the live coverage of Supercloud 2. Thanks for watching. (upbeat electronic music)

Published Date : Feb 17 2023

SUMMARY :

remember the first days What's the super cloud to you? of the Walmart WCMP, I and that's the paradigm of microservices, and that was what they stateless apps in the cloud. And all their stateful of the apps are in VMs. And that goes to the -- Muglia and Handy, that you and I did, But that means the tough, he said the one thing we haven't And so then you can now the data warehouse that runs it, Dave: Relational totality, Which by the way, Thomas I hear you saying then, and data, and stateful. thing you mentioned to me, George: Nothing happens polling data from the transaction Something in the real world happens, that's in the cloud, that the DBT can start adding that layer Amazon kind of showed that with them. and that's the new application platform. and the folks watching, all the people have told you guys, Coders to code. for more of the live

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
George	PERSON	0.99+
Bob Muglia	PERSON	0.99+
Tristan Handy	PERSON	0.99+
Dave	PERSON	0.99+
Bob	PERSON	0.99+
Thomas Hazel	PERSON	0.99+
George Gilbert	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Walmart	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
Palo Alto	LOCATION	0.99+
Chaos Search	ORGANIZATION	0.99+
Jack	PERSON	0.99+
Tristan	PERSON	0.99+
12 years	QUANTITY	0.99+
Berkeley	LOCATION	0.99+
Uber	ORGANIZATION	0.99+
first	QUANTITY	0.99+
DBT Labs	ORGANIZATION	0.99+
10 years	QUANTITY	0.99+
two experts	QUANTITY	0.99+
Supercloud 2	TITLE	0.99+
Gartner	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
both	QUANTITY	0.99+
Muglia	ORGANIZATION	0.99+
one minute	QUANTITY	0.99+
40	QUANTITY	0.99+
two great guests	QUANTITY	0.98+
Wikibon	ORGANIZATION	0.98+
50 years	QUANTITY	0.98+
John	PERSON	0.98+
Rock 'Em Sock 'Em	TITLE	0.98+
today	DATE	0.98+
first person	QUANTITY	0.98+
Databricks	ORGANIZATION	0.98+
S3	COMMERCIAL_ITEM	0.97+
50 years ago	DATE	0.97+
2010	DATE	0.97+
Mary	PERSON	0.96+
first days	QUANTITY	0.96+
SQL	TITLE	0.96+
one	QUANTITY	0.95+
Supercloud wave	EVENT	0.95+
each one	QUANTITY	0.93+
DBT	ORGANIZATION	0.91+
Supercloud	TITLE	0.91+
Supercloud2	TITLE	0.91+
Supercloud 2	ORGANIZATION	0.89+
Snowflake	TITLE	0.86+
Dataverse	ORGANIZATION	0.83+
triplet	QUANTITY	0.78+

Exploring a Supercloud Architecture | Supercloud2

(upbeat music) >> Welcome back everyone to Supercloud 2, live here in Palo Alto, our studio, where we're doing a live stage performance and virtually syndicating out around the world. I'm John Furrier with Dave Vellante, my co-host with the The Cube here. We've got Kit Colbert, the CTO of VM. We're doing a keynote on Cloud Chaos, the evolution of SuperCloud Architecture Kit. Great to see you, thanks for coming on. >> Yeah, thanks for having me back. It's great to be here for Supercloud 2. >> And so we're going to dig into it. We're going to do a Q&A. We're going to let you present. You got some slides. I really want to get this out there, it's really compelling story. Do the presentation and then we'll come back and discuss. Take it away. >> Yeah, well thank you. So, we had a great time at the original Supercloud event, since then, been talking to a lot of customers, and started to better formulate some of the thinking that we talked about last time So, let's jump into it. Just a few quick slides to sort of set the tone here. So, if we go to the the next slide, what that shows is the journey that we see customers on today, going from what we call Cloud First into this phase that many customers are stuck in, called Cloud Chaos, and where they want to get to, and this is the term customers actually use, we didn't make this up, we heard it from customers. This notion of Cloud Smart, right? How do they use cloud more effectively, more intelligently? Now, if you walk through this journey, customers start with Cloud First. They usually select a single cloud that they're going to standardize on, and when they do that, they have to build out a whole bunch of functionality around that cloud. Things you can see there on the screen, disaster recovery, security, how do they monitor it or govern it? Like, these are things that are non-negotiable, you've got to figure it out, and typically what they do is, they leverage solutions that are specific for that cloud, and that's fine when you have just one cloud. But if we build out here, what we see is that most customers are using more than just one, they're actually using multiple, not necessarily 10 or however many on the screen, but this is just as an example. And so what happens is, they have to essentially duplicate or replicate that stack they've built for each different cloud, and they do so in a kind of a siloed manner. This results in the Cloud Chaos term that that we talked about before. And this is where most businesses out there are, they're using two, maybe three public clouds. They've got some stuff on-prem and they've also got some stuff out at the edge. This is apps, data, et cetera. So, this is the situation, this is sort of that Cloud Chaos. So, the question is, how do we move from this phase to Cloud Smart? And this is where the architecture comes in. This is why architecture, I think, is so important. It's really about moving away from these single cloud services that just solve a problem for one cloud, to something we call a Cross-Cloud service. Something that can support a set of functionality across all clouds, and that means not just public clouds, but also private clouds, edge, et cetera, and when you evolve that across the board, what you get is this sort of Supercloud. This notion that we're talking about here, where you combine these cross-cloud services in many different categories. You can see some examples there on the screen. This is not meant to be a complete set of things, but just examples of what can be done. So, this is sort of the transition and transformation that we're talking about here, and I think the architecture piece comes in both for the individual cloud services as well as that Supercloud concept of how all those services come together. >> Great presentation., thanks for sharing. If you could pop back to that slide, on the Cloud Chaos one. I just want to get your thoughts on something there. This is like the layout of the stack. So, this slide here that I'm showing on the screen, that you presented, okay, take us through that complexity. This is the one where I wanted though, that looks like a spaghetti code mix. >> Yes. >> So, do you turn this into a Supercloud stack, right? Is that? >> well, I think it's, it's an evolving state that like, let's take one of these examples, like security. So, instead of implementing security individually in different ways, using different technologies, different tooling for each cloud, what you would do is say, "Hey, I want a single security solution that works across all clouds", right? A concrete example of this would be secure software supply chain. This is probably one of the top ones that I hear when I talk to customers. How do I know that the software I'm building is truly what I expect it to be, and not something that some hacker has gotten into, and polluted with malicious code? And what they do is that, typically today, their teams have gone off and created individual secure software supply chain solutions for each cloud. So, now they could say, "Hey, I can take a single implementation and just have different endpoints." It could go to Google, or AWS, or on-prem, or wherever have you, right? So, that's the sort of architectural evolution that we're talking about. >> You know, one of the things we hear, Dave, you've been on theCUBE all the time, and we, when we talk privately with customers who are asking us like, what's, what's going on? They have the same complaint, "I don't want to build a team, a dev team, for that stack." So, if you go back to that slide again, you'll see that, that illustrates the tech stack for the clouds and the clouds at the bottom. So, the number one complaint we hear, and I want to get your reaction to that, "I don't want to have a team to have to work on that. So, I'm going to pick one and then have a hedge secondary one, as a backup." Here, that's one, that's four, five, eight, ten, ten environments. >> Yeah, I got a lot. >> That's going to be the reality, so, what's the technical answer to that? >> Yeah, well first of all, let me just say, this picture is again not totally representative of reality oftentimes, because while that picture shows a solution for every cloud, oftentimes that's not the case. Oftentimes it's a line of business going off, starting to use a new cloud. They might solve one or two things, but usually not security, usually not some of these other things, right? So, I think from a technical standpoint, where you want to get to is, yes, that sort of common service, with a common operational team behind it, that is trained on that, that can work across clouds. And that's really I think the important evolution here, is that you don't need to replicate these operational teams, one for each cloud. You can actually have them more focused across all those clouds. >> Yeah, in fact, we were commenting on the opening today. Dave and I were talking about the benefits of the cloud. It's heterogeneous, which is a good thing, but it's complex. There's skill gaps and skill required, but at the end of the day, self-service of the cloud, and the elastic nature of it makes it the benefit. So, if you try to create too many common services, you lose the value of the cloud. So, what's the trade off, in your mind right now as customers start to look at okay, identity, maybe I'll have one single sign on, that's an obvious one. Other ones? What are the areas people are looking at from a combination, common set of services? Where do they start? What's the choices? What are some of the trade offs? 'Cause you can't do it everything. >> No, it's a great question. So, that's actually a really good point and as I answer your question, before I answer your question, the important point about that, as you saw here, you know, across cloud services or these set of Cross-Cloud services, the things that comprise the Supercloud, at least in my view, the point is not necessarily to completely abstract the underlying cloud. The point is to give a business optionality and choice, in terms of what it wants to abstract, and I think that gets to your question, is how much do you actually want to abstract from the underlying cloud? Now, what I find, is that typically speaking, cloud choice is driven at least from a developer or app team perspective, by the best of breed services. What higher level application type services do you need? A database or AI, you know, ML systems, for your application, and that's going to drive your choice of the cloud. So oftentimes, businesses I talk to, want to allow those services to shine through, but for other things that are not necessarily highly differentiated and yet are absolutely critical to creating a successful application, those are things that you want to standardize. Again, like things like security, the supply chain piece, cost management, like these things you need to, and you know, things like cogs become really, really important when you start operating at scale. So, those are the things in it that I see people wanting to focus on. >> So, there's a majority model. >> Yes. >> All right, and we heard of earlier from Walmart, who's fairly, you know, advanced, but at the same time their supercloud is pretty immature. So, what are you seeing in terms of supercloud momentum, crosscloud momentum? What's the starting point for customers? >> Yeah, so it's interesting, right, on that that three-tiered journey that I talked about, this Cloud Smart notion is, that is adoption of what you might call a supercloud or architecture, and most folks aren't there yet. Even the really advanced ones, even the really large ones, and I think it's because of the fact that, we as an industry are still figuring this out. We as an industry did not realize this sort of Cloud Chaos state could happen, right? We didn't, I think most folks thought they could standardize on one cloud and that'd be it, but as time has shown, that's simply not the case. As much as one might try to do that, that's not where you end up. So, I think there's two, there's two things here. Number one, for folks that are early in to the cloud, and are in this Cloud Chaos phase, we see the path out through standardization of these cross-cloud services through adoption of this sort of supercloud architecture, but the other thing I think is particularly exciting, 'cause I talked to a number of of businesses who are not yet in the Cloud Chaos phase. They're earlier on in the cloud journey, and I think the opportunity there is that they don't have to go through Cloud Chaos. They can actually skip that whole phase if they adopt this supercloud architecture from the beginning, and I think being thoughtful around that is really the key here. >> It's interesting, 'cause we're going to hear from Ionis Pharmaceuticals later, and they, yes there are multiple clouds, but the multiple clouds are largely separate, and so it's a business unit using that. So, they're not in Cloud Chaos, but they're not tapping the advantages that you could get for best of breed across those business units. So, to your point, they have an opportunity to actually build that architecture or take advantage of those cross-cloud services, prior to reaching cloud chaos. >> Well, I, actually, you know, I'd love to hear from them if, 'cause you say they're not in Cloud Chaos, but are they, I mean oftentimes I find that each BU, each line of business may feel like they're fine, in of themselves. >> Yes, exactly right, yes. >> But when you look at it from an overall company perspective, they're like, okay, things are pretty chaotic here. We don't have standardization, I don't, you know, like, again, security compliance, these things, especially in many regulated industries, become huge problems when you're trying to run applications across multiple clouds, but you don't have any of those company-wide standardizations. >> Well, this is a point. So, they have a big deal with AstraZeneca, who's got this huge ecosystem, they want to start sharing data across those ecosystem, and that's when they will, you know, that Cloud Chaos will, you know, come, come to fore, you would think. I want to get your take on something that Bob Muglia said earlier, which is, he kind of said, "Hey Dave, you guys got to tighten up your definition a little bit." So, he said a supercloud is a platform that provides programmatically consistent services hosted on heterogeneous cloud providers. So, you know, thank you, that was nice and simple. However others in the community, we're going to hear from Dr. Nelu Mihai later, says, no, no, wait a minute, it's got to be an architecture, not a platform. Where do you land on this architecture v. platform thing? >> I look at it as, I dunno if it's, you call it maturity or just kind of a time horizon thing, but for me when I hear the word platform, I typically think of a single vendor. A single vendor provides this platform. That's kind of the beauty of a platform, is that there is a simplicity usually consistency to it. >> They did the architecture. (laughing) >> Yeah, exactly but I mean, well, there's obviously architecture behind it, has to be, but you as a customer don't necessarily need to deal with that. Now, I think one of the opportunities with Supercloud is that it's not going to be, or there is no single vendor that can solve all these problems. It's got to be the industry coming together as a community, inter-operating, working together, and so, that's why, for me, I think about it as an architecture, that there's got to be these sort of, well-defined categories of functionality. There's got to be well-defined interfaces between those categories of functionality to enable modularity, to enable businesses to be able to pick and choose the right sorts of services, and then weave those together into an overall supercloud. >> Okay, so you're not pitching, necessarily the platform, you're saying, hey, we have an architecture that's open. I go back to something that Vittorio said on August 9th, with the first Supercloud, because as well, remember we talked about abstracting, but at the same time giving developers access to those primitives. So he said, and this, I think your answer sort of confirms this. "I want to have my cake eat it too and not gain weight." >> (laughing) Right. Well and I think that's where the platform aspect can eventually come, after we've gotten aligned architecture, you're going to start to naturally see some vendors step up to take on some of the remaining complexity there. So, I do see platforms eventually emerging here, but I think where we have to start as an industry is around aligning, okay, what does this definition mean? What does that architecture look like? How do we enable interoperability? And then we can take the next step. >> Because it depends too, 'cause I would say Snowflake has a platform, and they've just defined the architecture, but we're not talking about infrastructure here, obviously, we're talking about something else. >> Well, I think that the Snowflake talks about, what he talks about, security and data, you're going to start to see the early movement around areas that are very spanning oriented, and I think that's the beginning of the trend and I think there's going to be a lot more, I think on the infrastructure side. And to your point about the platform architecture, that's actually a really good thought exercise because it actually makes you think about what you're designing in the first place, and that's why I want to get your reaction. >> Quote from- >> Well I just have to interrupt since, later on, you're going to hear from near Nir Zuk of Palo Alto Network. He says architecture and security historically, they don't go hand in hand, 'cause it's a big mess. >> It depends if you're whacking the mole or you actually proactively building something. Well Kit, I want to get your reaction from a quote from someone in our community who said about Supercloud, you know, "The Supercloud's great, there are issues around computer science rigors, and customer requirements." So, there's some issues around the science itself as well as not just listen to the customer, 'cause if that's the case, we'd have a better database, a better Oracle, right, so, but there's other, this tech involved, new tech. We need an open architecture with universal data modeling interconnecting among them, connectivity is a part of security, and then, once we get through that gate, figuring out the technical, the data, and the customer requirements, they say "Supercloud should be a loosely coupled platform with open architecture, plug and play, specialized services, ready for optimization, automation that can stand the test of time." What's your reaction to that sentiment? You like it, is that, does that sound good? >> Yeah, no, broadly aligns with my thinking, I think, and what I see from talking with customers as well. I mean, I like the, again, the, you know, listening to customer needs, prioritizing those things, focusing on some of the connective tissue networking, and data and some of these aspects talking about the open architecture, the interoperability, those are all things I think are absolutely critical. And then, yeah, like I think at the end. >> On the computer science side, do you see some science and engineering things that need to be engineered differently? We heard databases are radically going to change and that are inadequate for the new architecture. What are some of the things like that, from a science standpoint? >> Yeah, yeah, yeah. Some of the more academic research type things. >> More tech, or more better tech or is it? >> Yeah, look, absolutely. I mean I think that there's a bunch around, certainly around the data piece, around, you know, there's issues of data gravity, data mobility. How do you want to do that in a way that's performant? There's definitely issues around security as well. Like how do you enable like trust in these environments, there's got to be some sort of hardware rooted trusts, and you know, a whole bunch of various types of aspects there. >> So, a lot of work still be done. >> Yes, I think so. And that's why I look at this as, this is not a one year thing, or you know, it's going to be multi-years, and I think again, it's about all of us in the industry working together to come to an aligned picture of what that looks like. >> So, as the world's moved from private cloud to public cloud and now Cross-cloud services, supercloud, metacloud, whatever you want to call it, how have you sort of changed the way engineering's organized, developers sort of approached the problem? Has it changed and how? >> Yeah, absolutely. So, you know, it's funny, we at VMware, going through the same challenges as our customers and you know, any business, right? We use multiple clouds, we got a big, of course, on-prem footprint. You know, what we're doing is similar to what I see in many other customers, which, you see the evolution of a platform team, and so the platform team is really in charge of trying to develop a lot of these underlying services to allow our lines of business, our product teams, to be able to move as quickly as possible, to focus on the building, while we help with a lot of the operational overheads, right? We maintain security, compliance, all these other things. We also deal with, yeah, just making the developer's life as simple as possible. So, they do need to know some stuff about, you know, each public cloud they're using, those public cloud services, but at the same, time we can abstract a lot of the details they don't need to be in. So, I think this sort of delineation or separation, I should say, between the underlying platform team and the product teams is a very, very common pattern. >> You know, I noticed the four layers you talked about were observability, infrastructure, security and developers, on that slide, the last slide you had at the top, that was kind of the abstraction key areas that you guys at VMware are working? >> Those were just some groupings that we've come up with, but we like to debate them. >> I noticed data's in every one of them. >> Yeah, yep, data is key. >> It's not like, so, back to the data questions that security is called out as a pillar. Observability is just kind of watching everything, but it's all pretty much data driven. Of the four layers that you see, I take that as areas that you can. >> Standardize. >> Consistently rely on to have standard services. >> Yes. >> Which one do you start with? What's the, is there order of operations? >> Well, that's, I mean. >> 'Cause I think infrastructure's number one, but you had observability, you need to know what's going on. >> Yeah, well it really, it's highly dependent. Again, it depends on the business that we talk to and what, I mean, it really goes back to, what are your business priorities, right? And we have some customers who may want to get out of a data center, they want to evacuate the data center, and so what they want is then, consistent infrastructure, so they can just move those applications up to the cloud. They don't want to have to refactor them and we'll do it later, but there's an immediate and sort of urgent problem that they have. Other customers I talk to, you know, security becomes top of mind, or maybe compliance, because they're in a regulated industry. So, those are the sort of services they want to prioritize. So, I would say there is no single right answer, no one size fits all. The point about this architecture is really around the optionality of it, as it allows you as a business to decide what's most important and where you want to prioritize. >> How about the deployment models kit? Do, does a customer have that flexibility from a deployment model standpoint or do I have to, you know, approach it a specific way? Can you address that? >> Yeah, I mean deployment models, you're talking about how they how they consume? >> So, for instance, yeah, running a control plane in the cloud. >> Got it, got it. >> And communicating elsewhere or having a single global instance or instantiating that instance, and? >> So, that's a good point actually, and you know, the white paper that we released back in August, around this sort of concept, the Cross-cloud service. This is some of the stuff we need to figure out as an industry. So, you know when we talk about a Cross-cloud service, we can mean actually any of the things you just talked about. It could be a single instance that runs, let's say in one public cloud, but it supports all of 'em. Or it could be one that's multi-instance and that runs in each of the clouds, and that customers can take dependencies on whichever one, depending on what their use cases are or the, even going further than that, there's a type of Cross-cloud service that could actually be instantiated even in an air gapped or offline environment, and we have many, many businesses, especially heavily regulated ones that have that requirement, so I think, you know. >> Global don't forget global, regions, locales. >> Yeah, there's all sorts of performance latency issues that can be concerned about. So, most services today are the former, there are single sort of instance or set of instances within a single cloud that support multiple clouds, but I think what we're doing and where we're going with, you know, things like what we see with Kubernetes and service meshes and all these things, will better enable folks to hit these different types of cross-cloud service architectures. So, today, you as a customer probably wouldn't have too much choice, but where we're going, you'll see a lot more choice in the future. >> If you had to summarize for folks watching the importance of Supercloud movement, multi-cloud, cross-cloud services, as an industry in flexible, 'cause I'm always riffing on the whole old school network protocol stacks that got disrupted by TCP/IP, that's a little bit dated, we got people on the chat that are like, you know, 20 years old that weren't even born then. So, but this is a, one of those inflection points that's once in a generation inflection point, I'm sure you agree. What scoped the order of magnitude of the change and the opportunity around the marketplace, the business models, the technology, and ultimately benefits the society. >> Yeah. Wow. Getting bigger. >> You have 10 seconds, go. >> I know. Yeah. (laughing) No, look, so I think it is what we're seeing is really the next phase of what you might call cloud, right? This notion of delivering services, the way they've been packaged together, traditionally by the hyperscalers is now being challenged. and what we're seeing is really opening that up to new levels of innovation, and I think that will be huge for businesses because it'll help meet them where they are. Instead of needing to contort the businesses to, you know, make it work with the technology, the technology will support the business and where it's going. Give people more optionality, more flexibility in order to get there, and I think in the end, for us as individuals, it will just make for better experiences, right? You can get better performance, better interactivity, given that devices are so much of what we do, and so much of what we interact with all the time. This sort of flexibility and optionality will fundamentally better for us as individuals in our experiences. >> And we're seeing that with ChatGPT, everyone's talking about, just early days. There'll be more and more of things like that, that are next gen, like obviously like, wow, that's a fall out of your chair moment. >> It'll be the next wave of innovation that's unleashed. >> All right, Kit Colbert, thanks for coming on and sharing and exploring the Supercloud architecture, Cloud Chaos, the Cloud Smart, there's a transition progression happening and it's happening fast. This is the supercloud wave. If you're not on this wave, you'll be driftwood. That's a Pat Gelsinger quote on theCUBE. This is theCUBE Be right back with more Supercloud coverage, here in Palo Alto after this break. (upbeat music) (upbeat music continues)

Published Date : Feb 17 2023

SUMMARY :

We've got Kit Colbert, the CTO of VM. It's great to be here for Supercloud 2. We're going to let you present. and when you evolve that across the board, This is like the layout of the stack. How do I know that the So, the number one complaint we hear, is that you don't need to replicate and the elastic nature of and I think that gets to your question, So, what are you seeing in terms but the other thing I think that you could get for best of breed Well, I, actually, you know, I don't, you know, like, and that's when they will, you know, That's kind of the beauty of a platform, They did the architecture. is that it's not going to be, but at the same time Well and I think that's and they've just defined the architecture, beginning of the trend Well I just have to and the customer requirements, focusing on some of the that need to be engineered differently? Some of the more academic and you know, a whole bunch or you know, it's going to be multi-years, of the details they don't need to be in. that we've come up with, Of the four layers that you see, to have standard services. but you had observability, you is really around the optionality of it, running a control plane in the cloud. and that runs in each of the clouds, Global don't forget and where we're going with, you know, and the opportunity of what you might call cloud, right? that are next gen, like obviously like, It'll be the next wave of and exploring the Supercloud architecture,

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Bob Muglia	PERSON	0.99+
Kit Colbert	PERSON	0.99+
August 9th	DATE	0.99+
Palo Alto	LOCATION	0.99+
AWS	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Pat Gelsinger	PERSON	0.99+
10 seconds	QUANTITY	0.99+
two	QUANTITY	0.99+
Ionis Pharmaceuticals	ORGANIZATION	0.99+
Walmart	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
AstraZeneca	ORGANIZATION	0.99+
Nelu Mihai	PERSON	0.99+
August	DATE	0.99+
two things	QUANTITY	0.99+
one	QUANTITY	0.99+
Supercloud	ORGANIZATION	0.99+
Vittorio	PERSON	0.99+
20 years	QUANTITY	0.99+
10	QUANTITY	0.99+
one year	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
each	QUANTITY	0.99+
Kit	PERSON	0.99+
three	QUANTITY	0.99+
first	QUANTITY	0.99+
today	DATE	0.98+
both	QUANTITY	0.98+
each cloud	QUANTITY	0.98+
one cloud	QUANTITY	0.97+
each cloud	QUANTITY	0.97+
ten	QUANTITY	0.97+
VMware	ORGANIZATION	0.96+
five	QUANTITY	0.96+
single cloud	QUANTITY	0.96+
single	QUANTITY	0.96+
each line	QUANTITY	0.96+
supercloud wave	EVENT	0.96+
single instance	QUANTITY	0.95+
Palo Alto Network	ORGANIZATION	0.95+
four	QUANTITY	0.94+
eight	QUANTITY	0.94+
single vendor	QUANTITY	0.94+
Cloud Chaos	TITLE	0.94+
Nir Zuk	PERSON	0.94+
three-tiered	QUANTITY	0.93+
Cloud First	TITLE	0.91+
four layers	QUANTITY	0.91+
Cloud Smart	TITLE	0.91+
Supercloud	TITLE	0.89+
single implementation	QUANTITY	0.88+
Supercloud 2	EVENT	0.87+
first place	QUANTITY	0.84+
single right answer	QUANTITY	0.84+
once	QUANTITY	0.83+
single sort	QUANTITY	0.82+

Chat w/ Arctic Wolf exec re: budget restraints could lead to lax cloud security

>> Now we're recording. >> All right. >> Appreciate that, Hannah. >> Yeah, so I mean, I think in general we continue to do very, very well as a company. I think like everybody, there's economic headwinds today that are unavoidable, but I think we have a couple things going for us. One, we're in the cyberspace, which I think is, for the most part, recession proof as an industry. I think the impact of a recession will impact some vendors and some categories, but in general, I think the industry is pretty resilient. It's like the power industry, no? Recession or not, you still need electricity to your house. Cybersecurity is almost becoming a utility like that as far as the needs of companies go. I think for us, we also have the ability to do the security, the security operations, for a lot of companies, and if you look at the value proposition, the ROI for the cost of less than one to maybe two or three, depending on how big you are as a customer, what you'd have to pay for half to three security operations people, we can give you a full security operations. And so the ROI is is almost kind of brain dead simple, and so that keeps us going pretty well. And I think the other areas, we remove all that complexity for people. So in a world where you got other problems to worry about, handling all the security complexity is something that adds to that ROI. So for us, I think what we're seeing is mostly is some of the larger deals are taking a little bit longer than they have, some of the large enterprise deals, 'cause I think they are being a little more cautious about how they spend it, but in general, business is still kind of cranking along. >> Anything you can share with me that you guys have talked about publicly in terms of any metrics, or what can you tell me other than cranking? >> Yeah, I mean, I would just say we're still very, very high growth, so I think our financial profile would kind of still put us clearly in the cyber unicorn position, but I think other than that, we don't really share business metrics as a private- >> Okay, so how about headcount? >> Still growing. So we're not growing as fast as we've been growing, but I don't think we were anyway. I think we kind of, we're getting to the point of critical mass. We'll start to grow in a more kind of normal course and speed. I don't think we overhired like a lot of companies did in the past, even though we added, almost doubled the size of the company in the last 18 months. So we're still hiring, but very kind of targeted to certain roles going forward 'cause I do think we're kind of at critical mass in some of the other functions. >> You disclose headcount or no? >> We do not. >> You don't, okay. And never have? >> Not that I'm aware of, no. >> Okay, on the macro, I don't know if security's recession proof, but it's less susceptible, let's say. I've had Nikesh Arora on recently, we're at Palo Alto's Ignite, and he was saying, "Look," it's just like you were saying, "Larger deal's a little harder." A lot of times customers, he was saying customers are breaking larger deals into smaller deals, more POCs, more approvals, more people to get through the approval, not whole, blah, blah, blah. Now they're a different animal, I understand, but are you seeing similar trends, and how are you dealing with that? >> Yeah, I think the exact same trends, and I think it's just in a world where spending a dollar matters, I think a lot more oversight comes into play, a lot more reviewers, and can you shave it down here? Can you reduce the scope of the project to save money there? And I think it just caused a lot of those things. I think, in the large enterprise, I think most of those deals for companies like us and Palo and CrowdStrike and kind of the upper tier companies, they'll still go through. I think they'll just going to take a lot longer, and, yeah, maybe they're 80% of what they would've been otherwise, but there's still a lot of business to be had out there. >> So how are you dealing with that? I mean, you're talking about you double the size of the company. Is it kind of more focused on go-to-market, more sort of, maybe not overlay, but sort of SE types that are going to be doing more handholding. How have you dealt with that? Or have you just sort of said, "Hey, it is what it is, and we're not going to, we're not going to tactically respond to. We got long-term direction"? >> Yeah, I think it's more the latter. I think for us, it's we've gone through all these things before. It just takes longer now. So a lot of the steps we're taking are the same steps. We're still involved in a lot of POCs, we're involved in a lot of demos, and I don't think that changed. It's just the time between your POC and when someone sends you the PO, there's five more people now got to review things and go through a budget committee and all sorts of stuff like that. I think where we're probably focused more now is adding more and more capabilities just so we continue to be on the front foot of innovation and being relevant to the market, and trying to create more differentiators for us and the competitors. That's something that's just built into our culture, and we don't want to slow that down. And so even though the business is still doing extremely, extremely well, we want to keep investing in kind of technology. >> So the deal size, is it fair to say the initial deal size for new accounts, while it may be smaller, you're adding more capabilities, and so over time, your average contract values will go up? Are you seeing that trend? Or am I- >> Well, I would say I don't even necessarily see our average deal size has gotten smaller. I think in total, it's probably gotten a little bigger. I think what happens is when something like this happens, the old cream rises to the top thing, I think, comes into play, and you'll see some organizations instead of doing a deal with three or four vendors, they may want to pick one or two and really kind of put a lot of energy behind that. For them, they're maybe spending a little less money, but for those vendors who are amongst those getting chosen, I think they're doing pretty good. So our average deal size is pretty stable. For us, it's just a temporal thing. It's just the larger deals take a little bit longer. I don't think we're seeing much of a deal velocity difference in our mid-market commercial spaces, but in the large enterprise it's a little bit slower. But for us, we have ambitious plans in our strategy or on how we want to execute and what we want to build, and so I think we want to just continue to make sure we go down that path technically. >> So I have some questions on sort of the target markets and the cohorts you're going after, and I have some product questions. I know we're somewhat limited on time, but the historical focus has been on SMB, and I know you guys have gone in into enterprise. I'm curious as to how that's going. Any guidance you can give me on mix? Or when I talk to the big guys, right, you know who they are, the big managed service providers, MSSPs, and they're like, "Poo poo on Arctic Wolf," like, "Oh, they're (groans)." I said, "Yeah, that's what they used to say about the PC. It's just a toy. Or Microsoft SQL Server." But so I kind of love that narrative for you guys, but I'm curious from your words as to, what is that enterprise? How's the historical business doing, and how's the entrance into the enterprise going? What kind of hurdles are you having, blockers are you having to remove? Any color you can give me there would be super helpful. >> Yeah, so I think our commercial S&B business continues to do really good. Our mid-market is a very strong market for us. And I think while a lot of companies like to focus purely on large enterprise, there's a lot more mid-market companies, and a much larger piece of the IT puzzle collectively is in mid-market than it is large enterprise. That being said, we started to get pulled into the large enterprise not because we're a toy but because we're quite a comprehensive service. And so I think what we're trying to do from a roadmap perspective is catch up with some of the kind of capabilities that a large enterprise would want from us that a potential mid-market customer wouldn't. In some case, it's not doing more. It's just doing it different. Like, so we have a very kind of hands-on engagement with some of our smaller customers, something we call our concierge. Some of the large enterprises want more of a hybrid where they do some stuff and you do some stuff. And so kind of building that capability into the platform is something that's really important for us. Just how we engage with them as far as giving 'em access to their data, the certain APIs they want, things of that nature, what we're building out for large enterprise, but the demand by large enterprise on our business is enormous. And so it's really just us kind of catching up with some of the kind of the features that they want that we lack today, but many of 'em are still signing up with us, obviously, and in lieu of that, knowing that it's coming soon. And so I think if you look at the growth of our large enterprise, it's one of our fastest growing segments, and I think it shows anything but we're a toy. I would be shocked, frankly, if there's an MSSP, and, of course, we don't see ourself as an MSSP, but I'd be shocked if any of them operate a platform at the scale that ours operates. >> Okay, so wow. A lot I want to unpack there. So just to follow up on that last question, you don't see yourself as an MSSP because why, you see yourselves as a technology platform? >> Yes, I mean, the vast, vast, vast majority of what we deliver is our own technology. So we integrate with third-party solutions mostly to bring in that telemetry. So we've built our own platform from the ground up. We have our own threat intelligence, our own detection logic. We do have our own agents and network sensors. MSSP is typically cobbling together other tools, third party off-the-shelf tools to run their SOC. Ours is all homegrown technology. So I have a whole group called Arctic Wolf Labs, is building, just cranking out ML-based detections, building out infrastructure to take feeds in from a variety of different sources. We have a full integration kind of effort where we integrate into other third parties. So when we go into a customer, we can leverage whatever they have, but at the same time, we produce some tech that if they're lacking in a certain area, we can provide that tech, particularly around things like endpoint agents and network sensors and the like. >> What about like identity, doing your own identity? >> So we don't do our own identity, but we take feeds in from things like Okta and Active Directory and the like, and we have detection logic built on top of that. So part of our value add is we were XDR before XDR was the cool thing to talk about, meaning we can look across multiple attack surfaces and come to a security conclusion where most EDR vendors started with looking just at the endpoint, right? And then they called themselves XDR because now they took in a network feed, but they still looked at it as a separate network detection. We actually look at the things across multiple attack surfaces and stitch 'em together to look at that from a security perspective. In some cases we have automatic detections that will fire. In other cases, we can surface some to a security professional who can go start pulling on that thread. >> So you don't need to purchase CrowdStrike software and integrate it. You have your own equivalent essentially. >> Well, we'll take a feed from the CrowdStrike endpoint into our platform. We don't have to rely on their detections and their alerts, and things of that nature. Now obviously anything they discover we pull in as well, it's just additional context, but we have all our own tech behind it. So we operate kind of at an MSSP scale. We have a similar value proposition in the sense that we'll use whatever the customer has, but once that data kind of comes into our pipeline, it's all our own homegrown tech from there. >> But I mean, what I like about the MSSP piece of your business is it's very high touch. It's very intimate. What I like about what you're saying is that it's software-like economics, so software, software-like part of it. >> That's what makes us the unicorn, right? Is we do have, our concierges is very hands-on. We continue to drive automation that makes our concierge security professionals more efficient, but we always want that customer to have that concierge person as, is almost an extension to their security team, or in some cases, for companies that don't even have a security team, as their security team. As we go down the path, as I mentioned, one of the things we want to be able to do is start to have a more flexible model where we can have that high touch if you want it. We can have the high touch on certain occasions, and you can do stuff. We can have low touch, like we can span the spectrum, but we never want to lose our kind of unique value proposition around the concierge, but we also want to make sure that we're providing an interface that any customer would want to use. >> So given that sort of software-like economics, I mean, services companies need this too, but especially in software, things like net revenue retention and churn are super important. How are those metrics looking? What can you share with me there? >> Yeah, I mean, again, we don't share those metrics publicly, but all's I can continue to repeat is, if you looked at all of our financial metrics, I think you would clearly put us in the unicorn category. I think very few companies are going to have the level of growth that we have on the amount of ARR that we have with the net revenue retention and the churn and upsell. All those aspects continue to be very, very strong for us. >> I want to go back to the sort of enterprise conversation. So large enterprises would engage with you as a complement to their existing SOC, correct? Is that a fair statement or not necessarily? >> It's in some cases. In some cases, they're looking to not have a SOC. So we run into a lot of cases where they want to replace their SIEM, and they want a solution like Arctic Wolf to do that. And so there's a poll, I can't remember, I think it was Forrester, IDC, one of them did it a couple years ago, and they found out that 70% of large enterprises do not want to build the SOC, and it's not 'cause they don't need one, it's 'cause they can't afford it, they can't staff it, they don't have the expertise. And you think about if you're a tech company or a bank, or something like that, of course you can do it, but if you're an international plumbing distributor, you're not going to (chuckles), someone's not going to graduate from Stanford with a cybersecurity degree and go, "Cool, I want to go work for a plumbing distributor in their SOC," right? So they're going to have trouble kind of bringing in the right talent, and as a result, it's difficult to go make a multimillion-dollar investment into a SOC if you're not going to get the quality people to operate it, so they turn to companies like us. >> Got it, so, okay, so you're talking earlier about capabilities that large enterprises require that there might be some gaps, you might lack some features. A couple questions there. One is, when you do some of those, I inferred some of that is integrations. Are those integrations sort of one-off snowflakes or are you finding that you're able to scale those across the large enterprises? That's my first question. >> Yeah, so most of the integrations are pretty straightforward. I think where we run into things that are kind of enterprise-centric, they definitely want open APIs, they want access to our platform, which we don't do today, which we are going to be doing, but we don't do that yet today. They want to do more of a SIEM replacement. So we're really kind of what we call an open XDR platform, so there's things that we would need to build to kind of do raw log ingestion. I mean, we do this today. We have raw log ingestion, we have log storage, we have log searching, but there's like some of the compliance scenarios that they need out of their SIEM. We don't do those today. And so that's kind of holding them back from getting off their SIEM and going fully onto a solution like ours. Then the other one is kind of the level of customization, so the ability to create a whole bunch of custom rules, and that ties back to, "I want to get off my SIEM. I've built all these custom rules in my SIEM, and it's great that you guys do all this automatic AI stuff in the background, but I need these very specific things to be executed on." And so trying to build an interface for them to be able to do that and then also simulate it, again, because, no matter how big they are running their SIEM and their SOC... Like, we talked to one of the largest financial institutions in the world. As far as we were told, they have the largest individual company SOC in the world, and we operate almost 15 times their size. So we always have to be careful because this is a cloud-based native platform, but someone creates some rule that then just craters the performance of the whole platform, so we have to build kind of those guardrails around it. So those are the things primarily that the large enterprises are asking for. Most of those issues are not holding them back from coming. They want to know they're coming, and we're working on all of those. >> Cool, and see, just aside, I was talking to CISO the other day, said, "If it weren't for my compliance and audit group, I would chuck my SIEM." I mean, everybody wants to get rid of their SIEM. >> I've never met anyone who likes their SIEM. >> Do you feel like you've achieved product market fit in the larger enterprise or is that still something that you're sorting out? >> So I think we know, like, we're on a path to do that. We're on a provable path to do that, so I don't think there's any surprises left. I think everything that we know we need to do for that is someone's writing code for it today. It's just a matter of getting it through the system and getting into production. So I feel pretty good about it. I think that's why we are seeing such a high growth rate in our large enterprise business, 'cause we share that feedback with some of those key customers. We have a Customer Advisory Board that we share a lot of this information with. So yeah, I mean, I feel pretty good about what we need to do. We're certainly operate at large enterprise scales, so taking in the amount of the volume of data they're going to have and the types of integrations they need. We're comfortable with that. It's just more or less the interfaces that a large enterprise would want that some of the smaller companies don't ask for. >> Do you have enough tenure in the market to get a sense as to stickiness or even indicators that will lead toward retention? Have you been at it long enough in the enterprise or you still, again, figuring that out? >> Yeah, no, I think we've been at it long enough, and our retention rates are extremely high. If anything, kind of our net retention rates, well over 100% 'cause we have opportunities to upsell into new modules and expanding the coverage of what they have today. I think the areas that if you cornered enterprise that use us and things they would complain about are things I just told you about, right? There's still some things I want to do in my Splunk, and I need an API to pull my data out and put it in my Splunk and stuff like that, and those are the things we want to enable. >> Yeah, so I can't wait till you guys go public because you got Snowflake up here, and you got Veritas down here, and I'm very curious as to where you guys go. When's the IPO? You want to tell me that? (chuckling) >> Unfortunately, it's not up to us right now. You got to get the markets- >> Yeah, I hear you. Right, if the market were better. Well, if the market were better, you think you'd be out? >> Yeah, I mean, we'd certainly be a viable candidate to go. >> Yeah, there you go. I have a question for you because I don't have a SOC. I run a small business with my co-CEO. We're like 30, 40 people W-2s, we got another 50 or so contractors, and I'm always like have one eye, sleep with one eye open 'cause of security. What is your ideal SMB customer? Think S. >> Yeah. >> Would I fit? >> Yeah, I mean you're you're right in the sweet spot. I think where the company started and where we still have a lot of value proposition, which is companies like, like you said it, you sleep with one eye open, but you don't have necessarily the technical acumen to be able to do that security for yourself, and that's where we fit in. We bring kind of this whole security, we call it Security Operations Cloud, to bear, and we have some of the best professionals in the world who can basically be your SOC for less than it would cost you to hire somebody right out of college to do IT stuff. And so the value proposition's there. You're going to get the best of the best, providing you a kind of a security service that you couldn't possibly build on your own, and that way you can go to bed at night and close both eyes. >> So (chuckling) I'm sure something else would keep me up. But so in thinking about that, our Amazon bill keeps growing and growing and growing. What would it, and I presume I can engage with you on a monthly basis, right? As a consumption model, or how's the pricing work? >> Yeah, so there's two models that we have. So typically the kind of the monthly billing type of models would be through one of our MSP partners, where they have monthly billing capabilities. Usually direct with us is more of a longer term deal, could be one, two, or three, or it's up to the customer. And so we have both of those engagement models. Were doing more and more and more through MSPs today because of that model you just described, and they do kind of target the very S in the SMB as well. >> I mean, rough numbers, even ranges. If I wanted to go with the MSP monthly, I mean, what would a small company like mine be looking at a month? >> Honestly, I do not even know the answer to that. >> We're not talking hundreds of thousands of dollars a month? >> No. God, no. God, no. No, no, no. >> I mean, order of magnitude, we're talking thousands, tens of thousands? >> Thousands, on a monthly basis. Yeah. >> Yeah, yeah. Thousands per month. So if I were to budget between 20 and $50,000 a year, I'm definitely within the envelope. Is that fair? I mean, I'm giving a wide range >> That's fair. just to try to make- >> No, that's fair. >> And if I wanted to go direct with you, I would be signing up for a longer term agreement, correct, like I do with Salesforce? >> Yeah, yeah, a year. A year would, I think, be the minimum for that, and, yeah, I think the budget you set aside is kind of right in the sweet spot there. >> Yeah, I'm interested, I'm going to... Have a sales guy call me (chuckles) somehow. >> All right, will do. >> No, I'm serious. I want to start >> I will. >> investigating these things because we sell to very large organizations. I mean, name a tech company. That's our client base, except for Arctic Wolf. We should talk about that. And increasingly they're paranoid about data protection agreements, how you're protecting your data, our data. We write a lot of software and deliver it as part of our services, so it's something that's increasingly important. It's certainly a board level discussion and beyond, and most large organizations and small companies oftentimes don't think about it or try not to. They just put their head in the sand and, "We don't want to be doing that," so. >> Yeah, I will definitely have someone get in touch with you. >> Cool. Let's see. Anything else you can tell me on the product side? Are there things that you're doing that we talked about, the gaps at the high end that you're, some of the features that you're building in, which was super helpful. Anything in the SMB space that you want to share? >> Yeah, I think the biggest thing that we're doing technically now is really trying to drive more and more automation and efficiency through our operations, and that comes through really kind of a generous use of AI. So building models around more efficient detections based upon signal, but also automating the actions of our operators so we can start to learn through the interface. When they do A and B, they always do C. Well, let's just do C for them, stuff like that. Then also building more automation as far as the response back to third-party solutions as well so we can remediate more directly on third-party products without having to get into the consoles or having our customers do it. So that's really just trying to drive efficiency in the system, and that helps provide better security outcomes but also has a big impact on our margins as well. >> I know you got to go, but I want to show you something real quick. I have data. I do a weekly program called "Breaking Analysis," and I have a partner called ETR, Enterprise Technology Research, and they have a platform. I don't know if you can see this. They have a survey platform, and each quarter, they do a survey of about 1,500 IT decision makers. They also have a survey on, they call ETS, Emerging Technology Survey. So it's private companies. And I don't want to go into it too much, but this is a sentiment graph. This is net sentiment. >> Just so you know, all I see is a white- >> Yeah, just a white bar. >> Oh, that's weird. Oh, whiteboard. Oh, here we go. How about that? >> There you go. >> Yeah, so this is a sentiment graph. So this is net sentiment and this is mindshare. And if I go to Arctic Wolf... So it's typical security, right? The 8,000 companies. And when I go here, what impresses me about this is you got a decent mindshare, that's this axis, but you've also got an N in the survey. It's about 1,500 in the survey, It's 479 Arctic Wolf customers responded to this. 57% don't know you. Oh, sorry, they're aware of you, but no plan to evaluate; 19% plan to evaluate, 7% are evaluating; 11%, no plan to utilize even though they've evaluated you; and 1% say they've evaluated you and plan to utilize. It's a small percentage, but actually it's not bad in the random sample of the world about that. And so obviously you want to get that number up, but this is a really impressive position right here that I wanted to just share with you. I do a lot of analysis weekly, and this is a really, it's completely independent survey, and you're sort of separating from the pack, as you can see. So kind of- >> Well, it's good to see that. And I think that just is a further indicator of what I was telling you. We continue to have a strong financial performance. >> Yeah, in a good market. Okay, well, thanks you guys. And hey, if I can get this recording, Hannah, I may even figure out how to write it up. (chuckles) That would be super helpful. >> Yes. We'll get that up. >> And David or Hannah, if you can send me David's contact info so I can get a salesperson in touch with him. (Hannah chuckling) >> Yeah, great. >> Yeah, we'll work on that as well. Thanks so much for both your time. >> Thanks a lot. It was great talking with you. >> Thanks, you guys. Great to meet you. >> Thank you. >> Bye. >> Bye.

Published Date : Feb 15 2023

SUMMARY :

I think for us, we also have the ability I don't think we overhired And never have? and how are you dealing with that? I think they'll just going to that are going to be So a lot of the steps we're and so I think we want to just continue and the cohorts you're going after, And so I think if you look at the growth So just to follow up but at the same time, we produce some tech and Active Directory and the like, So you don't need to but we have all our own tech behind it. like about the MSSP piece one of the things we want So given that sort of of growth that we have on the So large enterprises would engage with you kind of bringing in the right I inferred some of that is integrations. and it's great that you guys do to get rid of their SIEM. I've never met anyone I think everything that we and expanding the coverage to where you guys go. You got to get the markets- Well, if the market were Yeah, I mean, we'd certainly I have a question for you and that way you can go to bed I can engage with you because of that model you just described, the MSP monthly, I mean, know the answer to that. No. God, no. Thousands, on a monthly basis. I mean, I'm giving just to try to make- is kind of right in the sweet spot there. Yeah, I'm interested, I'm going to... I want to start because we sell to very get in touch with you. doing that we talked about, of our operators so we can start to learn I don't know if you can see this. Oh, here we go. from the pack, as you can see. And I think that just I may even figure out how to write it up. if you can send me David's contact info Thanks so much for both your time. great talking with you. Great to meet you.

ENTITIES

Entity	Category	Confidence
David	PERSON	0.99+
Hannah	PERSON	0.99+
two models	QUANTITY	0.99+
three	QUANTITY	0.99+
Arctic Wolf Labs	ORGANIZATION	0.99+
one	QUANTITY	0.99+
80%	QUANTITY	0.99+
70%	QUANTITY	0.99+
Arctic Wolf	ORGANIZATION	0.99+
two	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
30	QUANTITY	0.99+
Palo	ORGANIZATION	0.99+
479	QUANTITY	0.99+
half	QUANTITY	0.99+
19%	QUANTITY	0.99+
first question	QUANTITY	0.99+
Forrester	ORGANIZATION	0.99+
50	QUANTITY	0.99+
8,000 companies	QUANTITY	0.99+
Thousands	QUANTITY	0.99+
1%	QUANTITY	0.99+
7%	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
57%	QUANTITY	0.99+
IDC	ORGANIZATION	0.99+
CrowdStrike	ORGANIZATION	0.99+
today	DATE	0.99+
A year	QUANTITY	0.99+
one eye	QUANTITY	0.99+
both	QUANTITY	0.99+
both eyes	QUANTITY	0.99+
each quarter	QUANTITY	0.99+
less than one	QUANTITY	0.98+
11%	QUANTITY	0.98+
One	QUANTITY	0.98+
five more people	QUANTITY	0.98+
axis	ORGANIZATION	0.98+
thousands	QUANTITY	0.98+
tens of thousands	QUANTITY	0.97+
Veritas	ORGANIZATION	0.97+
about 1,500 IT decision makers	QUANTITY	0.97+
20	QUANTITY	0.97+
a year	QUANTITY	0.96+
Salesforce	ORGANIZATION	0.96+
ETS	ORGANIZATION	0.96+
Stanford	ORGANIZATION	0.96+
40 people	QUANTITY	0.95+
over 100%	QUANTITY	0.95+
couple years ago	DATE	0.95+
CISO	ORGANIZATION	0.94+
four vendors	QUANTITY	0.94+
$50,000 a year	QUANTITY	0.93+
about 1,500	QUANTITY	0.92+
Enterprise Technology Research	ORGANIZATION	0.92+
almost 15 times	QUANTITY	0.91+
couple questions	QUANTITY	0.91+
CrowdStrike	TITLE	0.9+
hundreds of thousands of dollars a month	QUANTITY	0.9+
ETR	ORGANIZATION	0.88+
last 18 months	DATE	0.87+
SQL Server	TITLE	0.84+
three security	QUANTITY	0.84+
Breaking Analysis	TITLE	0.82+
Thousands per month	QUANTITY	0.8+
XDR	TITLE	0.79+
a month	QUANTITY	0.74+
SIEM	TITLE	0.74+
Arctic	ORGANIZATION	0.74+

AWS Startup Showcase S3E1

(upbeat electronic music) >> Hello everyone, welcome to this CUBE conversation here from the studios in the CUBE in Palo Alto, California. I'm John Furrier, your host. We're featuring a startup, Astronomer. Astronomer.io is the URL, check it out. And we're going to have a great conversation around one of the most important topics hitting the industry, and that is the future of machine learning and AI, and the data that powers it underneath it. There's a lot of things that need to get done, and we're excited to have some of the co-founders of Astronomer here. Viraj Parekh, who is co-founder of Astronomer, and Paola Peraza Calderon, another co-founder, both with Astronomer. Thanks for coming on. First of all, how many co-founders do you guys have? >> You know, I think the answer's around six or seven. I forget the exact, but there's really been a lot of people around the table who've worked very hard to get this company to the point that it's at. We have long ways to go, right? But there's been a lot of people involved that have been absolutely necessary for the path we've been on so far. >> Thanks for that, Viraj, appreciate that. The first question I want to get out on the table, and then we'll get into some of the details, is take a minute to explain what you guys are doing. How did you guys get here? Obviously, multiple co-founders, sounds like a great project. The timing couldn't have been better. ChatGPT has essentially done so much public relations for the AI industry to kind of highlight this shift that's happening. It's real, we've been chronicalizing, take a minute to explain what you guys do. >> Yeah, sure, we can get started. So, yeah, when Viraj and I joined Astronomer in 2017, we really wanted to build a business around data, and we were using an open source project called Apache Airflow that we were just using sort of as customers ourselves. And over time, we realized that there was actually a market for companies who use Apache Airflow, which is a data pipeline management tool, which we'll get into, and that running Airflow is actually quite challenging, and that there's a big opportunity for us to create a set of commercial products and an opportunity to grow that open source community and actually build a company around that. So the crux of what we do is help companies run data pipelines with Apache Airflow. And certainly we've grown in our ambitions beyond that, but that's sort of the crux of what we do for folks. >> You know, data orchestration, data management has always been a big item in the old classic data infrastructure. But with AI, you're seeing a lot more emphasis on scale, tuning, training. Data orchestration is the center of the value proposition, when you're looking at coordinating resources, it's one of the most important things. Can you guys explain what data orchestration entails? What does it mean? Take us through the definition of what data orchestration entails. >> Yeah, for sure. I can take this one, and Viraj, feel free to jump in. So if you google data orchestration, here's what you're going to get. You're going to get something that says, "Data orchestration is the automated process" "for organizing silo data from numerous" "data storage points, standardizing it," "and making it accessible and prepared for data analysis." And you say, "Okay, but what does that actually mean," right, and so let's give sort of an an example. So let's say you're a business and you have sort of the following basic asks of your data team, right? Okay, give me a dashboard in Sigma, for example, for the number of customers or monthly active users, and then make sure that that gets updated on an hourly basis. And then number two, a consistent list of active customers that I have in HubSpot so that I can send them a monthly product newsletter, right? Two very basic asks for all sorts of companies and organizations. And when that data team, which has data engineers, data scientists, ML engineers, data analysts get that request, they're looking at an ecosystem of data sources that can help them get there, right? And that includes application databases, for example, that actually have in product user behavior and third party APIs from tools that the company uses that also has different attributes and qualities of those customers or users. And that data team needs to use tools like Fivetran to ingest data, a data warehouse, like Snowflake or Databricks to actually store that data and do analysis on top of it, a tool like DBT to do transformations and make sure that data is standardized in the way that it needs to be, a tool like Hightouch for reverse ETL. I mean, we could go on and on. There's so many partners of ours in this industry that are doing really, really exciting and critical things for those data movements. And the whole point here is that data teams have this plethora of tooling that they use to both ingest the right data and come up with the right interfaces to transform and interact with that data. And data orchestration, in our view, is really the heartbeat of all of those processes, right? And tangibly the unit of data orchestration is a data pipeline, a set of tasks or jobs that each do something with data over time and eventually run that on a schedule to make sure that those things are happening continuously as time moves on and the company advances. And so, for us, we're building a business around Apache Airflow, which is a workflow management tool that allows you to author, run, and monitor data pipelines. And so when we talk about data orchestration, we talk about sort of two things. One is that crux of data pipelines that, like I said, connect that large ecosystem of data tooling in your company. But number two, it's not just that data pipeline that needs to run every day, right? And Viraj will probably touch on this as we talk more about Astronomer and our value prop on top of Airflow. But then it's all the things that you need to actually run data and production and make sure that it's trustworthy, right? So it's actually not just that you're running things on a schedule, but it's also things like CICD tooling, secure secrets management, user permissions, monitoring, data lineage, documentation, things that enable other personas in your data team to actually use those tools. So long-winded way of saying that it's the heartbeat, we think, of of the data ecosystem, and certainly goes beyond scheduling, but again, data pipelines are really at the center of it. >> One of the things that jumped out, Viraj, if you can get into this, I'd like to hear more about how you guys look at all those little tools that are out. You mentioned a variety of things. You look at the data infrastructure, it's not just one stack. You've got an analytic stack, you've got a realtime stack, you've got a data lake stack, you got an AI stack potentially. I mean you have these stacks now emerging in the data world that are fundamental, that were once served by either a full package, old school software, and then a bunch of point solution. You mentioned Fivetran there, I would say in the analytics stack. Then you got S3, they're on the data lake stack. So all these things are kind of munged together. >> Yeah. >> How do you guys fit into that world? You make it easier, or like, what's the deal? >> Great question, right? And you know, I think that one of the biggest things we've found in working with customers over the last however many years is that if a data team is using a bunch of tools to get what they need done, and the number of tools they're using is growing exponentially and they're kind of roping things together here and there, that's actually a sign of a productive team, not a bad thing, right? It's because that team is moving fast. They have needs that are very specific to them, and they're trying to make something that's exactly tailored to their business. So a lot of times what we find is that customers have some sort of base layer, right? That's kind of like, it might be they're running most of the things in AWS, right? And then on top of that, they'll be using some of the things AWS offers, things like SageMaker, Redshift, whatever, but they also might need things that their cloud can't provide. Something like Fivetran, or Hightouch, those are other tools. And where data orchestration really shines, and something that we've had the pleasure of helping our customers build, is how do you take all those requirements, all those different tools and whip them together into something that fulfills a business need? So that somebody can read a dashboard and trust the number that it says, or somebody can make sure that the right emails go out to their customers. And Airflow serves as this amazing kind of glue between that data stack, right? It's to make it so that for any use case, be it ELT pipelines, or machine learning, or whatever, you need different things to do them, and Airflow helps tie them together in a way that's really specific for a individual business' needs. >> Take a step back and share the journey of what you guys went through as a company startup. So you mentioned Apache, open source. I was just having an interview with a VC, we were talking about foundational models. You got a lot of proprietary and open source development going on. It's almost the iPhone/Android moment in this whole generative space and foundational side. This is kind of important, the open source piece of it. Can you share how you guys started? And I can imagine your customers probably have their hair on fire and are probably building stuff on their own. Are you guys helping them? Take us through, 'cause you guys are on the front end of a big, big wave, and that is to make sense of the chaos, rain it in. Take us through your journey and why this is important. >> Yeah, Paola, I can take a crack at this, then I'll kind of hand it over to you to fill in whatever I miss in details. But you know, like Paola is saying, the heart of our company is open source, because we started using Airflow as an end user and started to say like, "Hey wait a second," "more and more people need this." Airflow, for background, started at Airbnb, and they were actually using that as a foundation for their whole data stack. Kind of how they made it so that they could give you recommendations, and predictions, and all of the processes that needed orchestrated. Airbnb created Airflow, gave it away to the public, and then fast forward a couple years and we're building a company around it, and we're really excited about that. >> That's a beautiful thing. That's exactly why open source is so great. >> Yeah, yeah. And for us, it's really been about watching the community and our customers take these problems, find a solution to those problems, standardize those solutions, and then building on top of that, right? So we're reaching to a point where a lot of our earlier customers who started to just using Airflow to get the base of their BI stack down and their reporting in their ELP infrastructure, they've solved that problem and now they're moving on to things like doing machine learning with their data, because now that they've built that foundation, all the connective tissue for their data arriving on time and being orchestrated correctly is happening, they can build a layer on top of that. And it's just been really, really exciting kind of watching what customers do once they're empowered to pick all the tools that they need, tie them together in the way they need to, and really deliver real value to their business. >> Can you share some of the use cases of these customers? Because I think that's where you're starting to see the innovation. What are some of the companies that you're working with, what are they doing? >> Viraj, I'll let you take that one too. (group laughs) >> So you know, a lot of it is... It goes across the gamut, right? Because it doesn't matter what you are, what you're doing with data, it needs to be orchestrated. So there's a lot of customers using us for their ETL and ELT reporting, right? Just getting data from other disparate sources into one place and then building on top of that. Be it building dashboards, answering questions for the business, building other data products and so on and so forth. From there, these use cases evolve a lot. You do see folks doing things like fraud detection, because Airflow's orchestrating how transactions go, transactions get analyzed. They do things like analyzing marketing spend to see where your highest ROI is. And then you kind of can't not talk about all of the machine learning that goes on, right? Where customers are taking data about their own customers, kind of analyze and aggregating that at scale, and trying to automate decision making processes. So it goes from your most basic, what we call data plumbing, right? Just to make sure data's moving as needed, all the ways to your more exciting expansive use cases around automated decision making and machine learning. >> And I'd say, I mean, I'd say that's one of the things that I think gets me most excited about our future, is how critical Airflow is to all of those processes, and I think when you know a tool is valuable is when something goes wrong and one of those critical processes doesn't work. And we know that our system is so mission critical to answering basic questions about your business and the growth of your company for so many organizations that we work with. So it's, I think, one of the things that gets Viraj and I and the rest of our company up every single morning is knowing how important the work that we do for all of those use cases across industries, across company sizes, and it's really quite energizing. >> It was such a big focus this year at AWS re:Invent, the role of data. And I think one of the things that's exciting about the open AI and all the movement towards large language models is that you can integrate data into these models from outside. So you're starting to see the integration easier to deal with. Still a lot of plumbing issues. So a lot of things happening. So I have to ask you guys, what is the state of the data orchestration area? Is it ready for disruption? Has it already been disrupted? Would you categorize it as a new first inning kind of opportunity, or what's the state of the data orchestration area right now? Both technically and from a business model standpoint. How would you guys describe that state of the market? >> Yeah, I mean, I think in a lot of ways, in some ways I think we're category creating. Schedulers have been around for a long time. I released a data presentation sort of on the evolution of going from something like Kron, which I think was built in like the 1970s out of Carnegie Mellon. And that's a long time ago, that's 50 years ago. So sort of like the basic need to schedule and do something with your data on a schedule is not a new concept. But to our point earlier, I think everything that you need around your ecosystem, first of all, the number of data tools and developer tooling that has come out industry has 5X'd over the last 10 years. And so obviously as that ecosystem grows, and grows, and grows, and grows, the need for orchestration only increases. And I think, as Astronomer, I think we... And we work with so many different types of companies, companies that have been around for 50 years, and companies that got started not even 12 months ago. And so I think for us it's trying to, in a ways, category create and adjust sort of what we sell and the value that we can provide for companies all across that journey. There are folks who are just getting started with orchestration, and then there's folks who have such advanced use case, 'cause they're hitting sort of a ceiling and only want to go up from there. And so I think we, as a company, care about both ends of that spectrum, and certainly want to build and continue building products for companies of all sorts, regardless of where they are on the maturity curve of data orchestration. >> That's a really good point, Paola. And I think the other thing to really take into account is it's the companies themselves, but also individuals who have to do their jobs. If you rewind the clock like 5 or 10 years ago, data engineers would be the ones responsible for orchestrating data through their org. But when we look at our customers today, it's not just data engineers anymore. There's data analysts who sit a lot closer to the business, and the data scientists who want to automate things around their models. So this idea that orchestration is this new category is right on the money. And what we're finding is the need for it is spreading to all parts of the data team, naturally where Airflow's emerged as an open source standard and we're hoping to take things to the next level. >> That's awesome. We've been up saying that the data market's kind of like the SRE with servers, right? You're going to need one person to deal with a lot of data, and that's data engineering, and then you're got to have the practitioners, the democratization. Clearly that's coming in what you're seeing. So I have to ask, how do you guys fit in from a value proposition standpoint? What's the pitch that you have to customers, or is it more inbound coming into you guys? Are you guys doing a lot of outreach, customer engagements? I'm sure they're getting a lot of great requirements from customers. What's the current value proposition? How do you guys engage? >> Yeah, I mean, there's so many... Sorry, Viraj, you can jump in. So there's so many companies using Airflow, right? So the baseline is that the open source project that is Airflow that came out of Airbnb, over five years ago at this point, has grown exponentially in users and continues to grow. And so the folks that we sell to primarily are folks who are already committed to using Apache Airflow, need data orchestration in their organization, and just want to do it better, want to do it more efficiently, want to do it without managing that infrastructure. And so our baseline proposition is for those organizations. Now to Viraj's point, obviously I think our ambitions go beyond that, both in terms of the personas that we addressed and going beyond that data engineer, but really it's to start at the baseline, as we continue to grow our our company, it's really making sure that we're adding value to folks using Airflow and help them do so in a better way, in a larger way, in a more efficient way, and that's really the crux of who we sell to. And so to answer your question on, we get a lot of inbound because they're... >> You have a built in audience. (laughs) >> The world that use it. Those are the folks who we talk to and come to our website and chat with us and get value from our content. I mean, the power of the opensource community is really just so, so big, and I think that's also one of the things that makes this job fun. >> And you guys are in a great position. Viraj, you can comment a little, get your reaction. There's been a big successful business model to starting a company around these big projects for a lot of reasons. One is open source is continuing to be great, but there's also supply chain challenges in there. There's also we want to continue more innovation and more code and keeping it free and and flowing. And then there's the commercialization of productizing it, operationalizing it. This is a huge new dynamic, I mean, in the past 5 or so years, 10 years, it's been happening all on CNCF from other areas like Apache, Linux Foundation, they're all implementing this. This is a huge opportunity for entrepreneurs to do this. >> Yeah, yeah. Open source is always going to be core to what we do, because we wouldn't exist without the open source community around us. They are huge in numbers. Oftentimes they're nameless people who are working on making something better in a way that everybody benefits from it. But open source is really hard, especially if you're a company whose core competency is running a business, right? Maybe you're running an e-commerce business, or maybe you're running, I don't know, some sort of like, any sort of business, especially if you're a company running a business, you don't really want to spend your time figuring out how to run open source software. You just want to use it, you want to use the best of it, you want to use the community around it, you want to be able to google something and get answers for it, you want the benefits of open source. You don't have the time or the resources to invest in becoming an expert in open source, right? And I think that dynamic is really what's given companies like us an ability to kind of form businesses around that in the sense that we'll make it so people get the best of both worlds. You'll get this vast open ecosystem that you can build on top of, that you can benefit from, that you can learn from. But you won't have to spend your time doing undifferentiated heavy lifting. You can do things that are just specific to your business. >> It's always been great to see that business model evolve. We used a debate 10 years ago, can there be another Red Hat? And we said, not really the same, but there'll be a lot of little ones that'll grow up to be big soon. Great stuff. Final question, can you guys share the history of the company? The milestones of Astromer's journey in data orchestration? >> Yeah, we could. So yeah, I mean, I think, so Viraj and I have obviously been at Astronomer along with our other founding team and leadership folks for over five years now. And it's been such an incredible journey of learning, of hiring really amazing people, solving, again, mission critical problems for so many types of organizations. We've had some funding that has allowed us to invest in the team that we have and in the software that we have, and that's been really phenomenal. And so that investment, I think, keeps us confident, even despite these sort of macroeconomic conditions that we're finding ourselves in. And so honestly, the milestones for us are focusing on our product, focusing on our customers over the next year, focusing on that market for us that we know can get valuable out of what we do, and making developers' lives better, and growing the open source community and making sure that everything that we're doing makes it easier for folks to get started, to contribute to the project and to feel a part of the community that we're cultivating here. >> You guys raised a little bit of money. How much have you guys raised? >> Don't know what the total is, but it's in the ballpark over $200 million. It feels good to... >> A little bit of capital. Got a little bit of cap to work with there. Great success. I know as a Series C Financing, you guys have been down. So you're up and running, what's next? What are you guys looking to do? What's the big horizon look like for you from a vision standpoint, more hiring, more product, what is some of the key things you're looking at doing? >> Yeah, it's really a little of all of the above, right? Kind of one of the best and worst things about working at earlier stage startups is there's always so much to do and you often have to just kind of figure out a way to get everything done. But really investing our product over the next, at least over the course of our company lifetime. And there's a lot of ways we want to make it more accessible to users, easier to get started with, easier to use, kind of on all areas there. And really, we really want to do more for the community, right, like I was saying, we wouldn't be anything without the large open source community around us. And we want to figure out ways to give back more in more creative ways, in more code driven ways, in more kind of events and everything else that we can keep those folks galvanized and just keep them happy using Airflow. >> Paola, any final words as we close out? >> No, I mean, I'm super excited. I think we'll keep growing the team this year. We've got a couple of offices in the the US, which we're excited about, and a fully global team that will only continue to grow. So Viraj and I are both here in New York, and we're excited to be engaging with our coworkers in person finally, after years of not doing so. We've got a bustling office in San Francisco as well. So growing those teams and continuing to hire all over the world, and really focusing on our product and the open source community is where our heads are at this year. So, excited. >> Congratulations. 200 million in funding, plus. Good runway, put that money in the bank, squirrel it away. It's a good time to kind of get some good interest on it, but still grow. Congratulations on all the work you guys do. We appreciate you and the open source community does, and good luck with the venture, continue to be successful, and we'll see you at the Startup Showcase. >> Thank you. >> Yeah, thanks so much, John. Appreciate it. >> Okay, that's the CUBE Conversation featuring astronomer.io, that's the website. Astronomer is doing well. Multiple rounds of funding, over 200 million in funding. Open source continues to lead the way in innovation. Great business model, good solution for the next gen cloud scale data operations, data stacks that are emerging. I'm John Furrier, your host, thanks for watching. (soft upbeat music)

Published Date : Feb 14 2023

SUMMARY :

and that is the future of for the path we've been on so far. for the AI industry to kind of highlight So the crux of what we center of the value proposition, that it's the heartbeat, One of the things and the number of tools they're using of what you guys went and all of the processes That's a beautiful thing. all the tools that they need, What are some of the companies Viraj, I'll let you take that one too. all of the machine learning and the growth of your company that state of the market? and the value that we can provide and the data scientists that the data market's And so the folks that we sell to You have a built in audience. one of the things that makes this job fun. in the past 5 or so years, 10 years, that you can build on top of, the history of the company? and in the software that we have, How much have you guys raised? but it's in the ballpark What's the big horizon look like for you Kind of one of the best and worst things and continuing to hire the work you guys do. Yeah, thanks so much, John. for the next gen cloud

ENTITIES

Entity	Category	Confidence
Viraj Parekh	PERSON	0.99+
Paola	PERSON	0.99+
Viraj	PERSON	0.99+
John	PERSON	0.99+
John Furrier	PERSON	0.99+
Airbnb	ORGANIZATION	0.99+
2017	DATE	0.99+
San Francisco	LOCATION	0.99+
New York	LOCATION	0.99+
Apache	ORGANIZATION	0.99+
US	LOCATION	0.99+
Two	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Paola Peraza Calderon	PERSON	0.99+
1970s	DATE	0.99+
first question	QUANTITY	0.99+
Palo Alto, California	LOCATION	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
Airflow	TITLE	0.99+
both	QUANTITY	0.99+
Linux Foundation	ORGANIZATION	0.99+
200 million	QUANTITY	0.99+
Astronomer	ORGANIZATION	0.99+
One	QUANTITY	0.99+
over 200 million	QUANTITY	0.99+
over $200 million	QUANTITY	0.99+
this year	DATE	0.99+
10 years ago	DATE	0.99+
HubSpot	ORGANIZATION	0.98+
Fivetran	ORGANIZATION	0.98+
50 years ago	DATE	0.98+
over five years	QUANTITY	0.98+
one stack	QUANTITY	0.98+
12 months ago	DATE	0.98+
10 years	QUANTITY	0.97+
Both	QUANTITY	0.97+
Apache Airflow	TITLE	0.97+
both worlds	QUANTITY	0.97+
CNCF	ORGANIZATION	0.97+
one	QUANTITY	0.97+
ChatGPT	ORGANIZATION	0.97+
5	DATE	0.97+
next year	DATE	0.96+
Astromer	ORGANIZATION	0.96+
today	DATE	0.95+
5X	QUANTITY	0.95+
over five years ago	DATE	0.95+
CUBE	ORGANIZATION	0.94+
two things	QUANTITY	0.94+
each	QUANTITY	0.93+
one person	QUANTITY	0.93+
First	QUANTITY	0.92+
S3	TITLE	0.91+
Carnegie Mellon	ORGANIZATION	0.91+
Startup Showcase	EVENT	0.91+

AWS Startup Showcase S3E1

(soft music) >> Hello everyone, welcome to this Cube conversation here from the studios of theCube in Palo Alto, California. John Furrier, your host. We're featuring a startup, Astronomer, astronomer.io is the url. Check it out. And we're going to have a great conversation around one of the most important topics hitting the industry, and that is the future of machine learning and AI and the data that powers it underneath it. There's a lot of things that need to get done, and we're excited to have some of the co-founders of Astronomer here. Viraj Parekh, who is co-founder and Paola Peraza Calderon, another co-founder, both with Astronomer. Thanks for coming on. First of all, how many co-founders do you guys have? >> You know, I think the answer's around six or seven. I forget the exact, but there's really been a lot of people around the table, who've worked very hard to get this company to the point that it's at. And we have long ways to go, right? But there's been a lot of people involved that are, have been absolutely necessary for the path we've been on so far. >> Thanks for that, Viraj, appreciate that. The first question I want to get out on the table, and then we'll get into some of the details, is take a minute to explain what you guys are doing. How did you guys get here? Obviously, multiple co-founders sounds like a great project. The timing couldn't have been better. ChatGPT has essentially done so much public relations for the AI industry. Kind of highlight this shift that's happening. It's real. We've been chronologicalizing, take a minute to explain what you guys do. >> Yeah, sure. We can get started. So yeah, when Astronomer, when Viraj and I joined Astronomer in 2017, we really wanted to build a business around data and we were using an open source project called Apache Airflow, that we were just using sort of as customers ourselves. And over time, we realized that there was actually a market for companies who use Apache Airflow, which is a data pipeline management tool, which we'll get into. And that running Airflow is actually quite challenging and that there's a lot of, a big opportunity for us to create a set of commercial products and opportunity to grow that open source community and actually build a company around that. So the crux of what we do is help companies run data pipelines with Apache Airflow. And certainly we've grown in our ambitions beyond that, but that's sort of the crux of what we do for folks. >> You know, data orchestration, data management has always been a big item, you know, in the old classic data infrastructure. But with AI you're seeing a lot more emphasis on scale, tuning, training. You know, data orchestration is the center of the value proposition when you're looking at coordinating resources, it's one of the most important things. Could you guys explain what data orchestration entails? What does it mean? Take us through the definition of what data orchestration entails. >> Yeah, for sure. I can take this one and Viraj feel free to jump in. So if you google data orchestration, you know, here's what you're going to get. You're going to get something that says, data orchestration is the automated process for organizing silo data from numerous data storage points to organizing it and making it accessible and prepared for data analysis. And you say, okay, but what does that actually mean, right? And so let's give sort of an example. So let's say you're a business and you have sort of the following basic asks of your data team, right? Hey, give me a dashboard in Sigma, for example, for the number of customers or monthly active users and then make sure that that gets updated on an hourly basis. And then number two, a consistent list of active customers that I have in HubSpot so that I can send them a monthly product newsletter, right? Two very basic asks for all sorts of companies and organizations. And when that data team, which has data engineers, data scientists, ML engineers, data analysts get that request, they're looking at an ecosystem of data sources that can help them get there, right? And that includes application databases, for example, that actually have end product user behavior and third party APIs from tools that the company uses that also has different attributes and qualities of those customers or users. And that data team needs to use tools like Fivetran, to ingest data, a data warehouse like Snowflake or Databricks to actually store that data and do analysis on top of it, a tool like DBT to do transformations and make sure that that data is standardized in the way that it needs to be, a tool like Hightouch for reverse ETL. I mean, we could go on and on. There's so many partners of ours in this industry that are doing really, really exciting and critical things for those data movements. And the whole point here is that, you know, data teams have this plethora of tooling that they use to both ingest the right data and come up with the right interfaces to transform and interact with that data. And data orchestration in our view is really the heartbeat of all of those processes, right? And tangibly the unit of data orchestration, you know, is a data pipeline, a set of tasks or jobs that each do something with data over time and eventually run that on a schedule to make sure that those things are happening continuously as time moves on. And, you know, the company advances. And so, you know, for us, we're building a business around Apache Airflow, which is a workflow management tool that allows you to author, run and monitor data pipelines. And so when we talk about data orchestration, we talk about sort of two things. One is that crux of data pipelines that, like I said, connect that large ecosystem of data tooling in your company. But number two, it's not just that data pipeline that needs to run every day, right? And Viraj will probably touch on this as we talk more about Astronomer and our value prop on top of Airflow. But then it's all the things that you need to actually run data and production and make sure that it's trustworthy, right? So it's actually not just that you're running things on a schedule, but it's also things like CI/CD tooling, right? Secure secrets management, user permissions, monitoring, data lineage, documentation, things that enable other personas in your data team to actually use those tools. So long-winded way of saying that, it's the heartbeat that we think of the data ecosystem and certainly goes beyond scheduling, but again, data pipelines are really at the center of it. >> You know, one of the things that jumped out Viraj, if you can get into this, I'd like to hear more about how you guys look at all those little tools that are out there. You mentioned a variety of things. You know, if you look at the data infrastructure, it's not just one stack. You've got an analytic stack, you've got a realtime stack, you've got a data lake stack, you got an AI stack potentially. I mean you have these stacks now emerging in the data world that are >> Yeah. - >> fundamental, but we're once served by either a full package, old school software, and then a bunch of point solution. You mentioned Fivetran there, I would say in the analytics stack. Then you got, you know, S3, they're on the data lake stack. So all these things are kind of munged together. >> Yeah. >> How do you guys fit into that world? You make it easier or like, what's the deal? >> Great question, right? And you know, I think that one of the biggest things we've found in working with customers over, you know, the last however many years, is that like if a data team is using a bunch of tools to get what they need done and the number of tools they're using is growing exponentially and they're kind of roping things together here and there, that's actually a sign of a productive team, not a bad thing, right? It's because that team is moving fast. They have needs that are very specific to them and they're trying to make something that's exactly tailored to their business. So a lot of times what we find is that customers have like some sort of base layer, right? That's kind of like, you know, it might be they're running most of the things in AWS, right? And then on top of that, they'll be using some of the things AWS offers, you know, things like SageMaker, Redshift, whatever. But they also might need things that their Cloud can't provide, you know, something like Fivetran or Hightouch or anything of those other tools and where data orchestration really shines, right? And something that we've had the pleasure of helping our customers build, is how do you take all those requirements, all those different tools and whip them together into something that fulfills a business need, right? Something that makes it so that somebody can read a dashboard and trust the number that it says or somebody can make sure that the right emails go out to their customers. And Airflow serves as this amazing kind of glue between that data stack, right? It's to make it so that for any use case, be it ELT pipelines or machine learning or whatever, you need different things to do them and Airflow helps tie them together in a way that's really specific for a individual business's needs. >> Take a step back and share the journey of what your guys went through as a company startup. So you mentioned Apache open source, you know, we were just, I was just having an interview with the VC, we were talking about foundational models. You got a lot of proprietary and open source development going on. It's almost the iPhone, Android moment in this whole generative space and foundational side. This is kind of important, the open source piece of it. Can you share how you guys started? And I can imagine your customers probably have their hair on fire and are probably building stuff on their own. How do you guys, are you guys helping them? Take us through, 'cuz you guys are on the front end of a big, big wave and that is to make sense of the chaos, reigning it in. Take us through your journey and why this is important. >> Yeah Paola, I can take a crack at this and then I'll kind of hand it over to you to fill in whatever I miss in details. But you know, like Paola is saying, the heart of our company is open source because we started using Airflow as an end user and started to say like, "Hey wait a second". Like more and more people need this. Airflow, for background, started at Airbnb and they were actually using that as the foundation for their whole data stack. Kind of how they made it so that they could give you recommendations and predictions and all of the processes that need to be or needed to be orchestrated. Airbnb created Airflow, gave it away to the public and then, you know, fast forward a couple years and you know, we're building a company around it and we're really excited about that. >> That's a beautiful thing. That's exactly why open source is so great. >> Yeah, yeah. And for us it's really been about like watching the community and our customers take these problems, find solution to those problems, build standardized solutions, and then building on top of that, right? So we're reaching to a point where a lot of our earlier customers who started to just using Airflow to get the base of their BI stack down and their reporting and their ELP infrastructure, you know, they've solved that problem and now they're moving onto things like doing machine learning with their data, right? Because now that they've built that foundation, all the connective tissue for their data arriving on time and being orchestrated correctly is happening, they can build the layer on top of that. And it's just been really, really exciting kind of watching what customers do once they're empowered to pick all the tools that they need, tie them together in the way they need to, and really deliver real value to their business. >> Can you share some of the use cases of these customers? Because I think that's where you're starting to see the innovation. What are some of the companies that you're working with, what are they doing? >> Raj, I'll let you take that one too. (all laughing) >> Yeah. (all laughing) So you know, a lot of it is, it goes across the gamut, right? Because all doesn't matter what you are, what you're doing with data, it needs to be orchestrated. So there's a lot of customers using us for their ETL and ELT reporting, right? Just getting data from all the disparate sources into one place and then building on top of that, be it building dashboards, answering questions for the business, building other data products and so on and so forth. From there, these use cases evolve a lot. You do see folks doing things like fraud detection because Airflow's orchestrating how transactions go. Transactions get analyzed, they do things like analyzing marketing spend to see where your highest ROI is. And then, you know, you kind of can't not talk about all of the machine learning that goes on, right? Where customers are taking data about their own customers kind of analyze and aggregating that at scale and trying to automate decision making processes. So it goes from your most basic, what we call like data plumbing, right? Just to make sure data's moving as needed. All the ways to your more exciting and sexy use cases around like automated decision making and machine learning. >> And I'd say, I mean, I'd say that's one of the things that I think gets me most excited about our future is how critical Airflow is to all of those processes, you know? And I think when, you know, you know a tool is valuable is when something goes wrong and one of those critical processes doesn't work. And we know that our system is so mission critical to answering basic, you know, questions about your business and the growth of your company for so many organizations that we work with. So it's, I think one of the things that gets Viraj and I, and the rest of our company up every single morning, is knowing how important the work that we do for all of those use cases across industries, across company sizes. And it's really quite energizing. >> It was such a big focus this year at AWS re:Invent, the role of data. And I think one of the things that's exciting about the open AI and all the movement towards large language models, is that you can integrate data into these models, right? From outside, right? So you're starting to see the integration easier to deal with, still a lot of plumbing issues. So a lot of things happening. So I have to ask you guys, what is the state of the data orchestration area? Is it ready for disruption? Is it already been disrupted? Would you categorize it as a new first inning kind of opportunity or what's the state of the data orchestration area right now? Both, you know, technically and from a business model standpoint, how would you guys describe that state of the market? >> Yeah, I mean I think, I think in a lot of ways we're, in some ways I think we're categoric rating, you know, schedulers have been around for a long time. I recently did a presentation sort of on the evolution of going from, you know, something like KRON, which I think was built in like the 1970s out of Carnegie Mellon. And you know, that's a long time ago. That's 50 years ago. So it's sort of like the basic need to schedule and do something with your data on a schedule is not a new concept. But to our point earlier, I think everything that you need around your ecosystem, first of all, the number of data tools and developer tooling that has come out the industry has, you know, has some 5X over the last 10 years. And so obviously as that ecosystem grows and grows and grows and grows, the need for orchestration only increases. And I think, you know, as Astronomer, I think we, and there's, we work with so many different types of companies, companies that have been around for 50 years and companies that got started, you know, not even 12 months ago. And so I think for us, it's trying to always category create and adjust sort of what we sell and the value that we can provide for companies all across that journey. There are folks who are just getting started with orchestration and then there's folks who have such advanced use case 'cuz they're hitting sort of a ceiling and only want to go up from there. And so I think we as a company, care about both ends of that spectrum and certainly have want to build and continue building products for companies of all sorts, regardless of where they are on the maturity curve of data orchestration. >> That's a really good point Paola. And I think the other thing to really take into account is it's the companies themselves, but also individuals who have to do their jobs. You know, if you rewind the clock like five or 10 years ago, data engineers would be the ones responsible for orchestrating data through their org. But when we look at our customers today, it's not just data engineers anymore. There's data analysts who sit a lot closer to the business and the data scientists who want to automate things around their models. So this idea that orchestration is this new category is spot on, is right on the money. And what we're finding is it's spreading, the need for it, is spreading to all parts of the data team naturally where Airflows have emerged as an open source standard and we're hoping to take things to the next level. >> That's awesome. You know, we've been up saying that the data market's kind of like the SRE with servers, right? You're going to need one person to deal with a lot of data and that's data engineering and then you're going to have the practitioners, the democratization. Clearly that's coming in what you're seeing. So I got to ask, how do you guys fit in from a value proposition standpoint? What's the pitch that you have to customers or is it more inbound coming into you guys? Are you guys doing a lot of outreach, customer engagements? I'm sure they're getting a lot of great requirements from customers. What's the current value proposition? How do you guys engage? >> Yeah, I mean we've, there's so many, there's so many. Sorry Raj, you can jump in. - >> It's okay. So there's so many companies using Airflow, right? So our, the baseline is that the open source project that is Airflow that was, that came out of Airbnb, you know, over five years ago at this point, has grown exponentially in users and continues to grow. And so the folks that we sell to primarily are folks who are already committed to using Apache Airflow, need data orchestration in the organization and just want to do it better, want to do it more efficiently, want to do it without managing that infrastructure. And so our baseline proposition is for those organizations. Now to Raj's point, obviously I think our ambitions go beyond that, both in terms of the personas that we addressed and going beyond that data engineer, but really it's for, to start at the baseline. You know, as we continue to grow our company, it's really making sure that we're adding value to folks using Airflow and help them do so in a better way, in a larger way and a more efficient way. And that's really the crux of who we sell to. And so to answer your question on, we actually, we get a lot of inbound because they're are so many - >> A built-in audience. >> In the world that use it, that those are the folks who we talk to and come to our website and chat with us and get value from our content. I mean the power of the open source community is really just so, so big. And I think that's also one of the things that makes this job fun, so. >> And you guys are in a great position, Viraj, you can comment, to get your reaction. There's been a big successful business model to starting a company around these big projects for a lot of reasons. One is open source is continuing to be great, but there's also supply chain challenges in there. There's also, you know, we want to continue more innovation and more code and keeping it free and and flowing. And then there's the commercialization of product-izing it, operationalizing it. This is a huge new dynamic. I mean, in the past, you know, five or so years, 10 years, it's been happening all on CNCF from other areas like Apache, Linux Foundation, they're all implementing this. This is a huge opportunity for entrepreneurs to do this. >> Yeah, yeah. Open source is always going to be core to what we do because, you know, we wouldn't exist without the open source community around us. They are huge in numbers. Oftentimes they're nameless people who are working on making something better in a way that everybody benefits from it. But open source is really hard, especially if you're a company whose core competency is running a business, right? Maybe you're running e-commerce business or maybe you're running, I don't know, some sort of like any sort of business, especially if you're a company running a business, you don't really want to spend your time figuring out how to run open source software. You just want to use it, you want to use the best of it, you want to use the community around it. You want to take, you want to be able to google something and get answers for it. You want the benefits of open source. You don't want to have, you don't have the time or the resources to invest in becoming an expert in open source, right? And I think that dynamic is really what's given companies like us an ability to kind of form businesses around that, in the sense that we'll make it so people get the best of both worlds. You'll get this vast open ecosystem that you can build on top of, you can benefit from, that you can learn from, but you won't have to spend your time doing undifferentiated heavy lifting. You can do things that are just specific to your business. >> It's always been great to see that business model evolved. We used to debate 10 years ago, can there be another red hat? And we said, not really the same, but there'll be a lot of little ones that'll grow up to be big soon. Great stuff. Final question, can you guys share the history of the company, the milestones of the Astronomer's journey in data orchestration? >> Yeah, we could. So yeah, I mean, I think, so Raj and I have obviously been at astronomer along with our other founding team and leadership folks, for over five years now. And it's been such an incredible journey of learning, of hiring really amazing people. Solving again, mission critical problems for so many types of organizations. You know, we've had some funding that has allowed us to invest in the team that we have and in the software that we have. And that's been really phenomenal. And so that investment, I think, keeps us confident even despite these sort of macroeconomic conditions that we're finding ourselves in. And so honestly, the milestones for us are focusing on our product, focusing on our customers over the next year, focusing on that market for us, that we know can get value out of what we do. And making developers' lives better and growing the open source community, you know, and making sure that everything that we're doing makes it easier for folks to get started to contribute to the project and to feel a part of the community that we're cultivating here. >> You guys raised a little bit of money. How much have you guys raised? >> I forget what the total is, but it's in the ballpark of 200, over $200 million. So it feels good - >> A little bit of capital. Got a little bit of cash to work with there. Great success. I know it's a Series C financing, you guys been down, so you're up and running. What's next? What are you guys looking to do? What's the big horizon look like for you? And from a vision standpoint, more hiring, more product, what is some of the key things you're looking at doing? >> Yeah, it's really a little of all of the above, right? Like, kind of one of the best and worst things about working at earlier stage startups is there's always so much to do and you often have to just kind of figure out a way to get everything done, but really invest in our product over the next, at least the next, over the course of our company lifetime. And there's a lot of ways we wanting to just make it more accessible to users, easier to get started with, easier to use all kind of on all areas there. And really, we really want to do more for the community, right? Like I was saying, we wouldn't be anything without the large open source community around us. And we want to figure out ways to give back more in more creative ways, in more code driven ways and more kind of events and everything else that we can do to keep those folks galvanized and just keeping them happy using Airflow. >> Paola, any final words as we close out? >> No, I mean, I'm super excited. You know, I think we'll keep growing the team this year. We've got a couple of offices in the US which we're excited about, and a fully global team that will only continue to grow. So Viraj and I are both here in New York and we're excited to be engaging with our coworkers in person. Finally, after years of not doing so, we've got a bustling office in San Francisco as well. So growing those teams and continuing to hire all over the world and really focusing on our product and the open source community is where our heads are at this year, so. >> Congratulations. - >> Excited. 200 million in funding plus good runway. Put that money in the bank, squirrel it away. You know, it's good to kind of get some good interest on it, but still grow. Congratulations on all the work you guys do. We appreciate you and the open sourced community does and good luck with the venture. Continue to be successful and we'll see you at the Startup Showcase. >> Thank you. - >> Yeah, thanks so much, John. Appreciate it. - >> It's theCube conversation, featuring astronomer.io, that's the website. Astronomer is doing well. Multiple rounds of funding, over 200 million in funding. Open source continues to lead the way in innovation. Great business model. Good solution for the next gen, Cloud, scale, data operations, data stacks that are emerging. I'm John Furrier, your host. Thanks for watching. (soft music)

Published Date : Feb 8 2023

SUMMARY :

and that is the future of for the path we've been on so far. take a minute to explain what you guys do. and that there's a lot of, of the value proposition And that data team needs to use tools You know, one of the and then a bunch of point solution. and the number of tools they're using and that is to make sense of the chaos, and all of the processes that need to be That's a beautiful thing. you know, they've solved that problem What are some of the companies Raj, I'll let you take that one too. And then, you know, and the growth of your company So I have to ask you guys, and companies that got started, you know, and the data scientists that the data market's kind of you can jump in. And so the folks that we and come to our website and chat with us I mean, in the past, you to what we do because, you history of the company, and in the software that we have. How much have you guys raised? but it's in the ballpark What are you guys looking to do? and you often have to just kind of and the open source community the work you guys do. Yeah, thanks so much, John. that's the website.

ENTITIES

Entity	Category	Confidence
Viraj Parekh	PERSON	0.99+
Paola	PERSON	0.99+
Viraj	PERSON	0.99+
John Furrier	PERSON	0.99+
John	PERSON	0.99+
Raj	PERSON	0.99+
Airbnb	ORGANIZATION	0.99+
US	LOCATION	0.99+
2017	DATE	0.99+
New York	LOCATION	0.99+
Paola Peraza Calderon	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Apache	ORGANIZATION	0.99+
San Francisco	LOCATION	0.99+
Palo Alto, California	LOCATION	0.99+
1970s	DATE	0.99+
10 years	QUANTITY	0.99+
five	QUANTITY	0.99+
Two	QUANTITY	0.99+
first question	QUANTITY	0.99+
over 200 million	QUANTITY	0.99+
both	QUANTITY	0.99+
Both	QUANTITY	0.99+
over $200 million	QUANTITY	0.99+
Linux Foundation	ORGANIZATION	0.99+
50 years ago	DATE	0.99+
one	QUANTITY	0.99+
five	DATE	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
this year	DATE	0.98+
One	QUANTITY	0.98+
Airflow	TITLE	0.98+
10 years ago	DATE	0.98+
Carnegie Mellon	ORGANIZATION	0.98+
over five years	QUANTITY	0.98+
200	QUANTITY	0.98+
12 months ago	DATE	0.98+
both worlds	QUANTITY	0.98+
5X	QUANTITY	0.98+
ChatGPT	ORGANIZATION	0.98+
first	QUANTITY	0.98+
one stack	QUANTITY	0.97+
one person	QUANTITY	0.97+
two things	QUANTITY	0.97+
Fivetran	ORGANIZATION	0.96+
seven	QUANTITY	0.96+
next year	DATE	0.96+
today	DATE	0.95+
50 years	QUANTITY	0.95+
each	QUANTITY	0.95+
theCube	ORGANIZATION	0.94+
HubSpot	ORGANIZATION	0.93+
Sigma	ORGANIZATION	0.92+
Series C	OTHER	0.92+
Astronomer	ORGANIZATION	0.91+
astronomer.io	OTHER	0.91+
Hightouch	TITLE	0.9+
one place	QUANTITY	0.9+
Android	TITLE	0.88+
Startup Showcase	EVENT	0.88+
Apache Airflow	TITLE	0.86+
CNCF	ORGANIZATION	0.86+

theCUBE's New Analyst Talks Cloud & DevOps

(light music) >> Hi everybody. Welcome to this Cube Conversation. I'm really pleased to announce a collaboration with Rob Strechay. He's a guest cube analyst, and we'll be working together to extract the signal from the noise. Rob is a long-time product pro, working at a number of firms including AWS, HP, HPE, NetApp, Snowplow. I did a stint as an analyst at Enterprise Strategy Group. Rob, good to see you. Thanks for coming into our Marlboro Studios. >> Well, thank you for having me. It's always great to be here. >> I'm really excited about working with you. We've known each other for a long time. You've been in the Cube a bunch. You know, you're in between gigs, and I think we can have a lot of fun together. Covering events, covering trends. So. let's get into it. What's happening out there? We're sort of exited the isolation economy. Things were booming. Now, everybody's tapping the brakes. From your standpoint, what are you seeing out there? >> Yeah. I'm seeing that people are really looking how to get more out of their data. How they're bringing things together, how they're looking at the costs of Cloud, and understanding how are they building out their SaaS applications. And understanding that when they go in and actually start to use Cloud, it's not only just using the base services anymore. They're looking at, how do I use these platforms as a service? Some are easier than others, and they're trying to understand, how do I get more value out of that relationship with the Cloud? They're also consolidating the number of Clouds that they have, I would say to try to better optimize their spend, and getting better pricing for that matter. >> Are you seeing people unhook Clouds, or just reduce maybe certain Cloud activities and going maybe instead of 60/40 going 90/10? >> Correct. It's more like the 90/10 type of rule where they're starting to say, Hey I'm not going to get rid of Azure or AWS or Google. I'm going to move a portion of this over that I was using on this one service. Maybe I got a great two-year contract to start with on this platform as a service or a database as a service. I'm going to unhook from that and maybe go with an independent. Maybe with something like a Snowflake or a Databricks on top of another Cloud, so that I can consolidate down. But it also gives them more flexibility as well. >> In our last breaking analysis, Rob, we identified six factors that were reducing Cloud consumption. There were factors and customer tactics. And I want to get your take on this. So, some of the factors really, you got fewer mortgage originations. FinTech, obviously big Cloud user. Crypto, not as much activity there. Lower ad spending means less Cloud. And then one of 'em, which you kind of disagreed with was less, less analytics, you know, fewer... Less frequency of calculations. I'll come back to that. But then optimizing compute using Graviton or AMD instances moving to cheaper storage tiers. That of course makes sense. And then optimize pricing plans. Maybe going from On Demand, you know, to, you know, instead of pay by the drink, buy in volume. Okay. So, first of all, do those make sense to you with the exception? We'll come back and talk about the analytics piece. Is that what you're seeing from customers? >> Yeah, I think so. I think that was pretty much dead on with what I'm seeing from customers and the ones that I go out and talk to. A lot of times they're trying to really monetize their, you know, understand how their business utilizes these Clouds. And, where their spend is going in those Clouds. Can they use, you know, lower tiers of storage? Do they really need the best processors? Do they need to be using Intel or can they get away with AMD or Graviton 2 or 3? Or do they need to move in? And, I think when you look at all of these Clouds, they always have pricing curves that are arcs from the newest to the oldest stuff. And you can play games with that. And understanding how you can actually lower your costs by looking at maybe some of the older generation. Maybe your application was written 10 years ago. You don't necessarily have to be on the best, newest processor for that application per se. >> So last, I want to come back to this whole analytics piece. Last June, I think it was June, Dev Ittycheria, who's the-- I call him Dev. Spelled Dev, pronounced Dave. (chuckles softly) Same pronunciation, different spelling. Dev Ittycheria, CEO of Mongo, on the earnings call. He was getting, you know, hit. Things were starting to get a little less visible in terms of, you know, the outlook. And people were pushing him like... Because you're in the Cloud, is it easier to dial down? And he said, because we're the document database, we support transaction applications. We're less discretionary than say, analytics. Well on the Snowflake earnings call, that same month or the month after, they were all over Slootman and Scarpelli. Oh, the Mongo CEO said that they're less discretionary than analytics. And Snowflake was an interesting comment. They basically said, look, we're the Cloud. You can dial it up, you can dial it down, but the area under the curve over a period of time is going to be the same, because they get their customers to commit. What do you say? You disagreed with the notion that people are running their calculations less frequently. Is that because they're trying to do a better job of targeting customers in near real time? What are you seeing out there? >> Yeah, I think they're moving away from using people and more expensive marketing. Or, they're trying to figure out what's my Google ad spend, what's my Meta ad spend? And what they're trying to do is optimize that spend. So, what is the return on advertising, or the ROAS as they would say. And what they're looking to do is understand, okay, I have to collect these analytics that better understand where are these people coming from? How do they get to my site, to my store, to my whatever? And when they're using it, how do they they better move through that? What you're also seeing is that analytics is not only just for kind of the retail or financial services or things like that, but then they're also, you know, using that to make offers in those categories. When you move back to more, you know, take other companies that are building products and SaaS delivered products. They may actually go and use this analytics for making the product better. And one of the big reasons for that is maybe they're dialing back how many product managers they have. And they're looking to be more data driven about how they actually go and build the product out or enhance the product. So maybe they're, you know, an online video service and they want to understand why people are either using or not using the whiteboard inside the product. And they're collecting a lot of that product analytics in a big way so that they can go through that. And they're doing it in a constant manner. This first party type tracking within applications is growing rapidly by customers. >> So, let's talk about who wins in that. So, obviously the Cloud guys, AWS, Google and Azure. I want to come back and unpack that a little bit. Databricks and Snowflake, we reported on our last breaking analysis, it kind of on a collision course. You know, a couple years ago we were thinking, okay, AWS, Snowflake and Databricks, like perfect sandwich. And then of course they started to become more competitive. My sense is they still, you know, compliment each other in the field, right? But, you know, publicly, they've got bigger aspirations, they get big TAMs that they're going after. But it's interesting, the data shows that-- So, Snowflake was off the charts in terms of spending momentum and our EPR surveys. Our partner down in New York, they kind of came into line. They're both growing in terms of market presence. Databricks couldn't get to IPO. So, we don't have as much, you know, visibility on their financials. You know, Snowflake obviously highly transparent cause they're a public company. And then you got AWS, Google and Azure. And it seems like AWS appears to be more partner friendly. Microsoft, you know, depends on what market you're in. And Google wants to sell BigQuery. >> Yeah. >> So, what are you seeing in the public Cloud from a data platform perspective? >> Yeah. I think that was pretty astute in what you were talking about there, because I think of the three, Google is definitely I think a little bit behind in how they go to market with their partners. Azure's done a fantastic job of partnering with these companies to understand and even though they may have Synapse as their go-to and where they want people to go to do AI and ML. What they're looking at is, Hey, we're going to also be friendly with Snowflake. We're also going to be friendly with a Databricks. And I think that, Amazon has always been there because that's where the market has been for these developers. So, many, like Databricks' and the Snowflake's have gone there first because, you know, Databricks' case, they built out on top of S3 first. And going and using somebody's object layer other than AWS, was not as simple as you would think it would be. Moving between those. >> So, one of the financial meetups I said meetup, but the... It was either the CEO or the CFO. It was either Slootman or Scarpelli talking at, I don't know, Merrill Lynch or one of the other financial conferences said, I think it was probably their Q3 call. Snowflake said 80% of our business goes through Amazon. And he said to this audience, the next day we got a call from Microsoft. Hey, we got to do more. And, we know just from reading the financial statements that Snowflake is getting concessions from Amazon, they're buying in volume, they're renegotiating their contracts. Amazon gets it. You know, lower the price, people buy more. Long term, we're all going to make more money. Microsoft obviously wants to get into that game with Snowflake. They understand the momentum. They said Google, not so much. And I've had customers tell me that they wanted to use Google's AI with Snowflake, but they can't, they got to go to to BigQuery. So, honestly, I haven't like vetted that so. But, I think it's true. But nonetheless, it seems like Google's a little less friendly with the data platform providers. What do you think? >> Yeah, I would say so. I think this is a place that Google looks and wants to own. Is that now, are they doing the right things long term? I mean again, you know, you look at Google Analytics being you know, basically outlawed in five countries in the EU because of GDPR concerns, and compliance and governance of data. And I think people are looking at Google and BigQuery in general and saying, is it the best place for me to go? Is it going to be in the right places where I need it? Still, it's still one of the largest used databases out there just because it underpins a number of the Google services. So you almost get, like you were saying, forced into BigQuery sometimes, if you want to use the tech on top. >> You do strategy. >> Yeah. >> Right? You do strategy, you do messaging. Is it the right call by Google? I mean, it's not a-- I criticize Google sometimes. But, I'm not sure it's the wrong call to say, Hey, this is our ace in the hole. >> Yeah. >> We got to get people into BigQuery. Cause, first of all, BigQuery is a solid product. I mean it's Cloud native and it's, you know, by all, it gets high marks. So, why give the competition an advantage? Let's try to force people essentially into what is we think a great product and it is a great product. The flip side of that is, they're giving up some potential partner TAM and not treating the ecosystem as well as one of their major competitors. What do you do if you're in that position? >> Yeah, I think that that's a fantastic question. And the question I pose back to the companies I've worked with and worked for is, are you really looking to have vendor lock-in as your key differentiator to your service? And I think when you start to look at these companies that are moving away from BigQuery, moving to even, Databricks on top of GCS in Google, they're looking to say, okay, I can go there if I have to evacuate from GCP and go to another Cloud, I can stay on Databricks as a platform, for instance. So I think it's, people are looking at what platform as a service, database as a service they go and use. Because from a strategic perspective, they don't want that vendor locking. >> That's where Supercloud becomes interesting, right? Because, if I can run on Snowflake or Databricks, you know, across Clouds. Even Oracle, you know, they're getting into business with Microsoft. Let's talk about some of the Cloud players. So, the big three have reported. >> Right. >> We saw AWSs Cloud growth decelerated down to 20%, which is I think the lowest growth rate since they started to disclose public numbers. And they said they exited, sorry, they said January they grew at 15%. >> Yeah. >> Year on year. Now, they had some pretty tough compares. But nonetheless, 15%, wow. Azure, kind of mid thirties, and then Google, we had kind of low thirties. But, well behind in terms of size. And Google's losing probably almost $3 billion annually. But, that's not necessarily a bad thing by advocating and investing. What's happening with the Cloud? Is AWS just running into the law, large numbers? Do you think we can actually see a re-acceleration like we have in the past with AWS Cloud? Azure, we predicted is going to be 75% of AWS IAS revenues. You know, we try to estimate IAS. >> Yeah. >> Even though they don't share that with us. That's a huge milestone. You'd think-- There's some people who have, I think, Bob Evans predicted a while ago that Microsoft would surpass AWS in terms of size. You know, what do you think? >> Yeah, I think that Azure's going to keep to-- Keep growing at a pretty good clip. I think that for Azure, they still have really great account control, even though people like to hate Microsoft. The Microsoft sellers that are out there making those companies successful day after day have really done a good job of being in those accounts and helping people. I was recently over in the UK. And the UK market between AWS and Azure is pretty amazing, how much Azure there is. And it's growing within Europe in general. In the states, it's, you know, I think it's growing well. I think it's still growing, probably not as fast as it is outside the U.S. But, you go down to someplace like Australia, it's also Azure. You hear about Azure all the time. >> Why? Is that just because of the Microsoft's software state? It's just so convenient. >> I think it has to do with, you know, and you can go with the reasoning they don't break out, you know, Office 365 and all of that out of their numbers is because they have-- They're in all of these accounts because the office suite is so pervasive in there. So, they always have reasons to go back in and, oh by the way, you're on these old SQL licenses. Let us move you up here and we'll be able to-- We'll support you on the old version, you know, with security and all of these things. And be able to move you forward. So, they have a lot of, I guess you could say, levers to stay in those accounts and be interesting. At least as part of the Cloud estate. I think Amazon, you know, is hitting, you know, the large number. Laws of large numbers. But I think that they're also going through, and I think this was seen in the layoffs that they were making, that they're looking to understand and have profitability in more of those services that they have. You know, over 350 odd services that they have. And you know, as somebody who went there and helped to start yet a new one, while I was there. And finally, it went to beta back in September, you start to look at the fact that, that number of services, people, their own sellers don't even know all of their services. It's impossible to comprehend and sell that many things. So, I think what they're going through is really looking to rationalize a lot of what they're doing from a services perspective going forward. They're looking to focus on more profitable services and bringing those in. Because right now it's built like a layer cake where you have, you know, S3 EBS and EC2 on the bottom of the layer cake. And then maybe you have, you're using IAM, the authorization and authentication in there and you have all these different services. And then they call it EMR on top. And so, EMR has to pay for that entire layer cake just to go and compete against somebody like Mongo or something like that. So, you start to unwind the costs of that. Whereas Azure, went and they build basically ground up services for the most part. And Google kind of falls somewhere in between in how they build their-- They're a sort of layer cake type effect, but not as many layers I guess you could say. >> I feel like, you know, Amazon's trying to be a platform for the ecosystem. Yes, they have their own products and they're going to sell. And that's going to drive their profitability cause they don't have to split the pie. But, they're taking a piece of-- They're spinning the meter, as Ziyas Caravalo likes to say on every time Snowflake or Databricks or Mongo or Atlas is, you know, running on their system. They take a piece of the action. Now, Microsoft does that as well. But, you look at Microsoft and security, head-to-head competitors, for example, with a CrowdStrike or an Okta in identity. Whereas, it seems like at least for now, AWS is a more friendly place for the ecosystem. At the same time, you do a lot of business in Microsoft. >> Yeah. And I think that a lot of companies have always feared that Amazon would just throw, you know, bodies at it. And I think that people have come to the realization that a two pizza team, as Amazon would call it, is eight people. I think that's, you know, two slices per person. I'm a little bit fat, so I don't know if that's enough. But, you start to look at it and go, okay, if they're going to start out with eight engineers, if I'm a startup and they're part of my ecosystem, do I really fear them or should I really embrace them and try to partner closer with them? And I think the smart people and the smart companies are partnering with them because they're realizing, Amazon, unless they can see it to, you know, a hundred million, $500 million market, they're not going to throw eight to 16 people at a problem. I think when, you know, you could say, you could look at the elastic with OpenSearch and what they did there. And the licensing terms and the battle they went through. But they knew that Elastic had a huge market. Also, you had a number of ecosystem companies building on top of now OpenSearch, that are now domain on top of Amazon as well. So, I think Amazon's being pretty strategic in how they're doing it. I think some of the-- It'll be interesting. I think this year is a payout year for the cuts that they're making to some of the services internally to kind of, you know, how do we take the fat off some of those services that-- You know, you look at Alexa. I don't know how much revenue Alexa really generates for them. But it's a means to an end for a number of different other services and partners. >> What do you make of this ChatGPT? I mean, Microsoft obviously is playing that card. You want to, you want ChatGPT in the Cloud, come to Azure. Seems like AWS has to respond. And we know Google is, you know, sharpening its knives to come up with its response. >> Yeah, I mean Google just went and talked about Bard for the first time this week and they're in private preview or I guess they call it beta, but. Right at the moment to select, select AI users, which I have no idea what that means. But that's a very interesting way that they're marketing it out there. But, I think that Amazon will have to respond. I think they'll be more measured than say, what Google's doing with Bard and just throwing it out there to, hey, we're going into beta now. I think they'll look at it and see where do we go and how do we actually integrate this in? Because they do have a lot of components of AI and ML underneath the hood that other services use. And I think that, you know, they've learned from that. And I think that they've already done a good job. Especially for media and entertainment when you start to look at some of the ways that they use it for helping do graphics and helping to do drones. I think part of their buy of iRobot was the fact that iRobot was a big user of RoboMaker, which is using different models to train those robots to go around objects and things like that, so. >> Quick touch on Kubernetes, the whole DevOps World we just covered. The Cloud Native Foundation Security, CNCF. The security conference up in Seattle last week. First time they spun that out kind of like reinforced, you know, AWS spins out, reinforced from reinvent. Amsterdam's coming up soon, the CubeCon. What should we expect? What's hot in Cubeland? >> Yeah, I think, you know, Kubes, you're going to be looking at how OpenShift keeps growing and I think to that respect you get to see the momentum with people like Red Hat. You see others coming up and realizing how OpenShift has gone to market as being, like you were saying, partnering with those Clouds and really making it simple. I think the simplicity and the manageability of Kubernetes is going to be at the forefront. I think a lot of the investment is still going into, how do I bring observability and DevOps and AIOps and MLOps all together. And I think that's going to be a big place where people are going to be looking to see what comes out of CubeCon in Amsterdam. I think it's that manageability ease of use. >> Well Rob, I look forward to working with you on behalf of the whole Cube team. We're going to do more of these and go out to some shows extract the signal from the noise. Really appreciate you coming into our studio. >> Well, thank you for having me on. Really appreciate it. >> You're really welcome. All right, keep it right there, or thanks for watching. This is Dave Vellante for the Cube. And we'll see you next time. (light music)

Published Date : Feb 7 2023

SUMMARY :

I'm really pleased to It's always great to be here. and I think we can have the number of Clouds that they have, contract to start with those make sense to you And, I think when you look in terms of, you know, the outlook. And they're looking to My sense is they still, you know, in how they go to market And he said to this audience, is it the best place for me to go? You do strategy, you do messaging. and it's, you know, And I think when you start Even Oracle, you know, since they started to to be 75% of AWS IAS revenues. You know, what do you think? it's, you know, I think it's growing well. Is that just because of the And be able to move you forward. I feel like, you know, I think when, you know, you could say, And we know Google is, you know, And I think that, you know, you know, AWS spins out, and I think to that respect forward to working with you Well, thank you for having me on. And we'll see you next time.

ENTITIES

Entity	Category	Confidence
Amazon	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Bob Evans	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
HP	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Rob	PERSON	0.99+
Google	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Rob Strechay	PERSON	0.99+
New York	LOCATION	0.99+
September	DATE	0.99+
Seattle	LOCATION	0.99+
January	DATE	0.99+
Dev Ittycheria	PERSON	0.99+
HPE	ORGANIZATION	0.99+
NetApp	ORGANIZATION	0.99+
Amsterdam	LOCATION	0.99+
75%	QUANTITY	0.99+
UK	LOCATION	0.99+
AWSs	ORGANIZATION	0.99+
June	DATE	0.99+
Snowplow	ORGANIZATION	0.99+
eight	QUANTITY	0.99+
80%	QUANTITY	0.99+
Scarpelli	PERSON	0.99+
15%	QUANTITY	0.99+
Australia	LOCATION	0.99+
Mongo	ORGANIZATION	0.99+
Slootman	PERSON	0.99+
two-year	QUANTITY	0.99+
AMD	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
Databricks	ORGANIZATION	0.99+
six factors	QUANTITY	0.99+
three	QUANTITY	0.99+
Merrill Lynch	ORGANIZATION	0.99+
Last June	DATE	0.99+
five countries	QUANTITY	0.99+
eight people	QUANTITY	0.99+
U.S.	LOCATION	0.99+
last week	DATE	0.99+
16 people	QUANTITY	0.99+
Databricks'	ORGANIZATION	0.99+

Breaking Analysis: Cloud players sound a cautious tone for 2023

>> From the Cube Studios in Palo Alto in Boston bringing you data-driven insights from the Cube and ETR. This is Breaking Analysis with Dave Vellante. >> The unraveling of market enthusiasm continued in Q4 of 2022 with the earnings reports from the US hyperscalers, the big three now all in. As we said earlier this year, even the cloud is an immune from the macro headwinds and the cracks in the armor that we saw from the data that we shared last summer, they're playing out into 2023. For the most part actuals are disappointing beyond expectations including our own. It turns out that our estimates for the big three hyperscaler's revenue missed by 1.2 billion or 2.7% lower than we had forecast from even our most recent November estimates. And we expect continued decelerating growth rates for the hyperscalers through the summer of 2023 and we don't think that's going to abate until comparisons get easier. Hello and welcome to this week's Wikibon Cube Insights powered by ETR. In this Breaking Analysis, we share our view of what's happening in cloud markets not just for the hyperscalers but other firms that have hitched a ride on the cloud. And we'll share new ETR data that shows why these trends are playing out tactics that customers are employing to deal with their cost challenges and how long the pain is likely to last. You know, riding the cloud wave, it's a two-edged sword. Let's look at the players that have gone all in on or are exposed to both the positive and negative trends of cloud. Look the cloud has been a huge tailwind for so many companies like Snowflake and Databricks, Workday, Salesforce, Mongo's move with Atlas, Red Hats Cloud strategy with OpenShift and so forth. And you know, the flip side is because cloud is elastic what comes up can also go down very easily. Here's an XY graphic from ETR that shows spending momentum or net score on the vertical axis and market presence in the dataset on the horizontal axis provision or called overlap. This is data from the January 2023 survey and that the red dotted lines show the positions of several companies that we've highlighted going back to January 2021. So let's unpack this for a bit starting with the big three hyperscalers. The first point is AWS and Azure continue to solidify their moat relative to Google Cloud platform. And we're going to get into this in a moment, but Azure and AWS revenues are five to six times that of GCP for IaaS. And at those deltas, Google should be gaining ground much faster than the big two. The second point on Google is notice the red line on GCP relative to its starting point. While it appears to be gaining ground on the horizontal axis, its net score is now below that of AWS and Azure in the survey. So despite its significantly smaller size it's just not keeping pace with the leaders in terms of market momentum. Now looking at AWS and Microsoft, what we see is basically AWS is holding serve. As we know both Google and Microsoft benefit from including SaaS in their cloud numbers. So the fact that AWS hasn't seen a huge downward momentum relative to a January 2021 position is one positive in the data. And both companies are well above that magic 40% line on the Y-axis, anything above 40% we consider to be highly elevated. But the fact remains that they're down as are most of the names on this chart. So let's take a closer look. I want to start with Snowflake and Databricks. Snowflake, as we reported from several quarters back came down to Earth, it was up in the 80% range in the Y-axis here. And it's still highly elevated in the 60% range and it continues to move to the right, which is positive but as we'll address in a moment it's customers can dial down consumption just as in any cloud. Now, Databricks is really interesting. It's not a public company, it never made it to IPO during the sort of tech bubble. So we don't have the same level of transparency that we do with other companies that did make it through. But look at how much more prominent it is on the X-axis relative to January 2021. And it's net score is basically held up over that period of time. So that's a real positive for Databricks. Next, look at Workday and Salesforce. They've held up relatively well, both inching to the right and generally holding their net scores. Same from Mongo, which is the brown dot above its name that says Elastic, it says a little gets a little crowded which Elastic's actually the blue dot above it. But generally, SaaS is harder to dial down, Workday, Salesforce, Oracles, SaaS and others. So it's harder to dial down because commitments have been made in advance, they're kind of locked in. Now, one of the discussions from last summer was as Mongo, less discretionary than analytics i.e. Snowflake. And it's an interesting debate but maybe Snowflake customers, you know, they're also generally committed to a dollar amount. So over time the spending is going to be there. But in the short term, yeah maybe Snowflake customers can dial down. Now that highlighted dotted red line, that bolded one is Datadog and you can see it's made major strides on the X-axis but its net score has decelerated quite dramatically. Openshift's momentum in the survey has dropped although IBM just announced that OpenShift has a a billion dollar ARR and I suspect what's happening there is IBM consulting is bundling OpenShift into its modernization projects. It's got a, that sort of captive base if you will. And as such it's probably not as top of mind to the respondents but I'll bet you the developers are certainly aware of it. Now the other really notable call out here is CloudFlare, We've reported on them earlier. Cloudflare's net score has held up really well since January of 2021. It really hasn't seen the downdraft of some of these others, but it's making major major moves to the right gaining market presence. We really like how CloudFlare is performing. And the last comment is on Oracle which as you can see, despite its much, much lower net score continues to gain ground in the market and thrive from a profitability standpoint. But the data pretty clearly shows that there's a downdraft in the market. Okay, so what's happening here? Let's dig deeper into this data. Here's a graphic from the most recent ETR drill down asking customers that said they were going to cut spending what technique they're using to do so. Now, as we've previously reported, consolidating redundant vendors is by far the most cited approach but there's two key points we want to make here. One is reducing excess cloud resources. As you can see in the bars is the second most cited technique and it's up from the previous polling period. The second we're not showing, you know directly but we've got some red call outs there. Reducing cloud costs jumps to 29% and 28% respectively in financial services and tech telco. And it's much closer to second. It's basically neck and neck with consolidating redundant vendors in those two industries. So they're being really aggressive about optimizing cloud cost. Okay, so as we said, cloud is great 'cause you can dial it up but it's just as easy to dial down. We've identified six factors that customers tell us are affecting their cloud consumption and there are probably more, if you got more we'd love to hear them but these are the ones that are fairly prominent that have hit our radar. First, rising mortgage rates mean banks are processing fewer loans means less cloud. The crypto crash means less trading activity and that means less cloud resources. Third lower ad spend has led companies to reduce not only you know, their ad buying but also their frequency of running their analytics and their calculations. And they're also often using less data, maybe compressing the timeframe of the corpus down to a shorter time period. Also very prominent is down to the bottom left, using lower cost compute instances. For example, Graviton from AWS or AMD chips and tiering storage to cheaper S3 or deep archived tiers. And finally, optimizing based on better pricing plans. So customers are moving from, you know, smaller companies in particular moving maybe from on demand or other larger companies that are experimenting using on demand or they're moving to spot pricing or reserved instances or optimized savings plans. That all lowers cost and that means less cloud resource consumption and less cloud revenue. Now in the days when everything was on prem CFOs, what would they do? They would freeze CapEx and IT Pros would have to try to do more with less and often that meant a lot of manual tasks. With the cloud it's much easier to move things around. It still takes some thinking and some effort but it's dramatically simpler to do so. So you can get those savings a lot faster. Now of course the other huge factor is you can cut or you can freeze. And this graphic shows data from a recent ETR survey with 159 respondents and you can see the meaningful uptick in hiring freezes, freezing new IT deployments and layoffs. And as we've been reporting, this has been trending up since earlier last year. And note the call out, this is especially prominent in retail sectors, all three of these techniques jump up in retail and that's a bit of a concern because oftentimes consumer spending helps the economy make a softer landing out of a pullback. But this is a potential canary in the coal mine. If retail firms are pulling back it's because consumers aren't spending as much. And so we're keeping a close eye on that. So let's boil this down to the market data and what this all means. So in this graphic we show our estimates for Q4 IaaS revenues compared to the "actual" IaaS revenues. And we say quote because AWS is the only one that reports, you know clean revenue and IaaS, Azure and GCP don't report actuals. Why would they? Because it would make them look even, you know smaller relative to AWS. Rather, they bury the figures in overall cloud which includes their, you know G-Suite for Google and all the Microsoft SaaS. And then they give us little tidbits about in Microsoft's case, Azure, they give growth rates. Google gives kind of relative growth of GCP. So, and we use survey data and you know, other data to try to really pinpoint and we've been covering this for, I don't know, five or six years ever since the cloud really became a thing. But looking at the data, we had AWS growing at 25% this quarter and it came in at 20%. So a significant decline relative to our expectations. AWS announced that it exited December, actually, sorry it's January data showed about a 15% mid-teens growth rate. So that's, you know, something we're watching. Azure was two points off our forecast coming in at 38% growth. It said it exited December in the 35% growth range and it said that it's expecting five points of deceleration off of that. So think 30% for Azure. GCP came in three points off our expectation coming in 35% and Alibaba has yet to report but we've shaved a bid off that forecast based on some survey data and you know what maybe 9% is even still not enough. Now for the year, the big four hyperscalers generated almost 160 billion of revenue, but that was 7 billion lower than what what we expected coming into 2022. For 2023, we're expecting 21% growth for a total of 193.3 billion. And while it's, you know, lower, you know, significantly lower than historical expectations it's still four to five times the overall spending forecast that we just shared with you in our predictions post of between 4 and 5% for the overall market. We think AWS is going to come in in around 93 billion this year with Azure closing in at over 71 billion. This is, again, we're talking IaaS here. Now, despite Amazon focusing investors on the fact that AWS's absolute dollar growth is still larger than its competitors. By our estimates Azure will come in at more than 75% of AWS's forecasted revenue. That's a significant milestone. AWS is operating margins by the way declined significantly this past quarter, dropping from 30% of revenue to 24%, 30% the year earlier to 24%. Now that's still extremely healthy and we've seen wild fluctuations like this before so I don't get too freaked out about that. But I'll say this, Microsoft has a marginal cost advantage relative to AWS because one, it has a captive cloud on which to run its massive software estate. So it can just throw software at its own cloud and two software marginal costs. Marginal economics despite AWS's awesomeness in high degrees of automation, software is just a better business. Now the upshot for AWS is the ecosystem. AWS is essentially in our view positioning very smartly as a platform for data partners like Snowflake and Databricks, security partners like CrowdStrike and Okta and Palo Alto and many others and SaaS companies. You know, Microsoft is more competitive even though AWS does have competitive products. Now of course Amazon's competitive to retail companies so that's another factor but generally speaking for tech players, Amazon is a really thriving ecosystem that is a secret weapon in our view. AWS happy to spin the meter with its partners even though it sells competitive products, you know, more so in our view than other cloud players. Microsoft, of course is, don't forget is hyping now, we're hearing a lot OpenAI and ChatGPT we reported last week in our predictions post. How OpenAI is shot up in terms of market sentiment in ETR's emerging technology company surveys and people are moving to Azure to get OpenAI and get ChatGPT that is a an interesting lever. Amazon in our view has to have a response. They have lots of AI and they're going to have to make some moves there. Meanwhile, Google is emphasizing itself as an AI first company. In fact, Google spent at least five minutes of continuous dialogue, nonstop on its AI chops during its latest earnings call. So that's an area that we're watching very closely as the buzz around large language models continues. All right, let's wrap up with some assumptions for 2023. We think SaaS players are going to continue to be sticky. They're going to be somewhat insulated from all these downdrafts because they're so tied in and customers, you know they make the commitment up front, you've got the lock in. Now having said that, we do expect some backlash over time on the onerous and generally customer unfriendly pricing models of most large SaaS companies. But that's going to play out over a longer period of time. Now for cloud generally and the hyperscalers specifically we do expect accelerating growth rates into Q3 but the amplitude of the demand swings from this rubber band economy, we expect to continue to compress and become more predictable throughout the year. Estimates are coming down, CEOs we think are going to be more cautious when the market snaps back more cautious about hiring and spending and as such a perhaps we expect a more orderly return to growth which we think will slightly accelerate in Q4 as comps get easier. Now of course the big risk to these scenarios is of course the economy, the FED, consumer spending, inflation, supply chain, energy prices, wars, geopolitics, China relations, you know, all the usual stuff. But as always with our partners at ETR and the Cube community, we're here for you. We have the data and we'll be the first to report when we see a change at the margin. Okay, that's a wrap for today. I want to thank Alex Morrison who's on production and manages the podcast, Ken Schiffman as well out of our Boston studio getting this up on LinkedIn Live. Thank you for that. Kristen Martin also and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hof is our Editor-in-Chief over at siliconangle.com. He does some great editing for us. Thank you all. Remember all these episodes are available as podcast. Wherever you listen, just search Breaking Analysis podcast. I publish each week on wikibon.com, at siliconangle.com where you can see all the data and you want to get in touch. Just all you can do is email me david.vellante@siliconangle.com or DM me @dvellante if you if you got something interesting, I'll respond. If you don't, it's either 'cause I'm swamped or it's just not tickling me. You can comment on our LinkedIn post as well. And please check out ETR.ai for the best survey data in the enterprise tech business. This is Dave Vellante for the Cube Insights powered by ETR. Thanks for watching and we'll see you next time on Breaking Analysis. (gentle upbeat music)

Published Date : Feb 4 2023

SUMMARY :

From the Cube Studios and how long the pain is likely to last.

ENTITIES

Entity	Category	Confidence
Alex Morrison	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Alibaba	ORGANIZATION	0.99+
Cheryl Knight	PERSON	0.99+
Kristen Martin	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Ken Schiffman	PERSON	0.99+
January 2021	DATE	0.99+
Microsoft	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
Rob Hof	PERSON	0.99+
2.7%	QUANTITY	0.99+
January	DATE	0.99+
Amazon	ORGANIZATION	0.99+
December	DATE	0.99+
January of 2021	DATE	0.99+
five	QUANTITY	0.99+
January 2023	DATE	0.99+
Snowflake	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
1.2 billion	QUANTITY	0.99+
20%	QUANTITY	0.99+
IBM	ORGANIZATION	0.99+
Databricks	ORGANIZATION	0.99+
29%	QUANTITY	0.99+
30%	QUANTITY	0.99+
six factors	QUANTITY	0.99+
second point	QUANTITY	0.99+
24%	QUANTITY	0.99+
2022	DATE	0.99+
david.vellante@siliconangle.com	OTHER	0.99+
X-axis	ORGANIZATION	0.99+
2023	DATE	0.99+
28%	QUANTITY	0.99+
193.3 billion	QUANTITY	0.99+
ETR	ORGANIZATION	0.99+
38%	QUANTITY	0.99+
7 billion	QUANTITY	0.99+
21%	QUANTITY	0.99+
Earth	LOCATION	0.99+
25%	QUANTITY	0.99+
Mongo	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Atlas	ORGANIZATION	0.99+
two industries	QUANTITY	0.99+
last week	DATE	0.99+
six years	QUANTITY	0.99+
first point	QUANTITY	0.99+
Red Hats	ORGANIZATION	0.99+
35%	QUANTITY	0.99+
four	QUANTITY	0.99+
159 respondents	QUANTITY	0.99+
Okta	ORGANIZATION	0.99+

Opher Kahane, Sonoma Ventures | CloudNativeSecurityCon 23

(uplifting music) >> Hello, welcome back to theCUBE's coverage of CloudNativeSecurityCon, the inaugural event, in Seattle. I'm John Furrier, host of theCUBE, here in the Palo Alto Studios. We're calling it theCUBE Center. It's kind of like our Sports Center for tech. It's kind of remote coverage. We've been doing this now for a few years. We're going to amp it up this year as more events are remote, and happening all around the world. So, we're going to continue the coverage with this segment focusing on the data stack, entrepreneurial opportunities around all things security, and as, obviously, data's involved. And our next guest is a friend of theCUBE, and CUBE alumni from 2013, entrepreneur himself, turned, now, venture capitalist angel investor, with his own firm, Opher Kahane, Managing Director, Sonoma Ventures. Formerly the founder of Origami, sold to Intuit a few years back. Focusing now on having a lot of fun, angel investing on boards, focusing on data-driven applications, and stacks around that, and all the stuff going on in, really, in the wheelhouse for what's going on around security data. Opher, great to see you. Thanks for coming on. >> My pleasure. Great to be back. It's been a while. >> So you're kind of on Easy Street now. You did the entrepreneurial venture, you've worked hard. We were on together in 2013 when theCUBE just started. XCEL Partners had an event in Stanford, XCEL, and they had all the features there. We interviewed Satya Nadella, who was just a manager at Microsoft at that time, he was there. He's now the CEO of Microsoft. >> Yeah, he was. >> A lot's changed in nine years. But congratulations on your venture you sold, and you got an exit there, and now you're doing a lot of investments. I'd love to get your take, because this is really the biggest change I've seen in the past 12 years, around an inflection point around a lot of converging forces. Data, which, big data, 10 years ago, was a big part of your career, but now it's accelerated, with cloud scale. You're seeing people building scale on top of other clouds, and becoming their own cloud. You're seeing data being a big part of it. Cybersecurity kind of has not really changed much, but it's the most important thing everyone's talking about. So, developers are involved, data's involved, a lot of entrepreneurial opportunities. So I'd love to get your take on how you see the current situation, as it relates to what's gone on in the past five years or so. What's the big story? >> So, a lot of big stories, but I think a lot of it has to do with a promise of making value from data, whether it's for cybersecurity, for Fintech, for DevOps, for RevTech startups and companies. There's a lot of challenges in actually driving and monetizing the value from data with velocity. Historically, the challenge has been more around, "How do I store data at massive scale?" And then you had the big data infrastructure company, like Cloudera, and MapR, and others, deal with it from a scale perspective, from a storage perspective. Then you had a whole layer of companies that evolved to deal with, "How do I index massive scales of data, for quick querying, and federated access, et cetera?" But now that a lot of those underlying problems, if you will, have been solved, to a certain extent, although they're always being stretched, given the scale of data, and its utility is becoming more and more massive, in particular with AI use cases being very prominent right now, the next level is how to actually make value from the data. How do I manage the full lifecycle of data in complex environments, with complex organizations, complex use cases? And having seen this from the inside, with Origami Logic, as we dealt with a lot of large corporations, and post-acquisition by Intuit, and a lot of the startups I'm involved with, it's clear that we're now onto that next step. And you have fundamental new paradigms, such as data mesh, that attempt to address that complexity, and responsibly scaling access, and democratizing access in the value monetization from data, across large organizations. You have a slew of startups that are evolving to help the entire lifecycle of data, from the data engineering side of it, to the data analytics side of it, to the AI use cases side of it. And it feels like the early days, to a certain extent, of the revolution that we've seen in transition from traditional databases, to data warehouses, to cloud-based data processing, and big data. It feels like we're at the genesis of that next wave. And it's super, super exciting, for me at least, as someone who's sitting more in the coach seat, rather than being on the pitch, and building startups, helping folks as they go through those motions. >> So that's awesome. I want to get into some of these data infrastructure dynamics you mentioned, but before that, talk to the audience around what you're working on now. You've been a successful entrepreneur, you're focused on angel investing, so, super-early seed stage. What kind of deals are you looking at? What's interesting to you? What is Sonoma Ventures looking for, and what are some of the entrepreneurial dynamics that you're seeing right now, from a startup standpoint? >> Cool, so, at a macro level, this is a little bit of background of my history, because it shapes very heavily what it is that I'm looking at. So, I've been very fortunate with entrepreneurial career. I founded three startups. All three of them are successful. Final two were sold, the first one merged and went public. And my third career has been about data, moving data, passing data, processing data, generating insights from it. And, at this phase, I wanted to really evolve from just going and building startup number four, from going through the same motions again. A 10 year adventure, I'm a little bit too old for that, I guess. But the next best thing is to sit from a point whereby I can be more elevated in where I'm dealing with, and broaden the variety of startups I'm focused on, rather than just do your own thing, and just go very, very deep into it. Now, what specifically am I focused on at Sonoma Ventures? So, basically, looking at what I refer to as a data-driven application stack. Anything from the low-level data infrastructure and cloud infrastructure, that helps any persona in the data universe maximize value for data, from their particular point of view, for their particular role, whether it's data analysts, data scientists, data engineers, cloud engineers, DevOps folks, et cetera. All the way up to the application layer, in applications that are very data-heavy. And what are very typical data-heavy applications? FinTech, cyber, Web3, revenue technologies, and product and DevOps. So these are the areas we're focused on. I have almost 23 or 24 startups in the portfolio that span all these different areas. And this is in terms of the aperture. Now, typically, focus on pre-seed, seed. Sometimes a little bit later stage, but this is the primary focus. And it's really about partnering with entrepreneurs, and helping them make, if you will, original mistakes, avoid the mistakes I made. >> Yeah. >> And take it to the next level, whatever the milestone they're driving with. So I'm very, very hands-on with many of those startups. Now, what is it that's happening right now, initially, and why is it so exciting? So, on one hand, you have this scaling of data and its complexity, yet lagging value creation from it, across those different personas we've touched on. So that's one fundamental opportunity which is secular. The other one, which is more a cyclic situation, is the fact that we're going through a down cycle in tech, as is very evident in the public markets, and everything we're hearing about funding going slower and lower, terms shifting more into the hands of typical VCs versus entrepreneur-friendly market, and so on and so forth. And a very significant amount of layoffs. Now, when you combine these two trends together, you're observing a very interesting thing, that a lot of folks, really bright folks, who have sold a startup to a company, or have been in the guts of the large startup, or a large corporation, have, hands-on, experienced all those challenges we've spoken about earlier, in turf, maximizing value from data, irrespective of their role, in a specific angle, or vantage point they have on those challenges. So, for many of them, it's an opportunity to, "Now, let me now start a startup. I've been laid off, maybe, or my company's stock isn't doing as well as it used to, as a large corporation. Now I have an opportunity to actually go and take my entrepreneurial passion, and apply it to a product and experience as part of this larger company." >> Yeah. >> And you see a slew of folks who are emerging with these great ideas. So it's a very, very exciting period of time to innovate. >> It's interesting, a lot of people look at, I mean, I look at Snowflake as an example of a company that refactored data warehouses. They just basically took data warehouse, and put it on the cloud, and called it a data cloud. That, to me, was compelling. They didn't pay any CapEx. They rode Amazon's wave there. So, a similar thing going on with data. You mentioned this, and I see it as an enabling opportunity. So whether it's cybersecurity, FinTech, whatever vertical, you have an enablement. Now, you mentioned data infrastructure. It's a super exciting area, as there's so many stacks emerging. We got an analytics stack, there's real-time stacks, there's data lakes, AI stack, foundational models. So, you're seeing an explosion of stacks, different tools probably will emerge. So, how do you look at that, as a seasoned entrepreneur, now investor? Is that a good thing? Is that just more of the market? 'Cause it just seems like more and more kind of decomposed stacks targeted at use cases seems to be a trend. >> Yeah. >> And how do you vet that, is it? >> So it's a great observation, and if you take a step back and look at the evolution of technology over the last 30 years, maybe longer, you always see these cycles of expansion, fragmentation, contraction, expansion, contraction. Go decentralize, go centralize, go decentralize, go centralize, as manifested in different types of technology paradigms. From client server, to storage, to microservices, to et cetera, et cetera. So I think we're going through another big bang, to a certain extent, whereby end up with more specialized data stacks for specific use cases, as you need performance, the data models, the tooling to best adapt to the particular task at hand, and the particular personas at hand. As the needs of the data analysts are quite different from the needs of an NL engineer, it's quite different from the needs of the data engineer. And what happens is, when you end up with these siloed stacks, you end up with new fragmentation, and new gaps that need to be filled with a new layer of innovation. And I suspect that, in part, that's what we're seeing right now, in terms of the next wave of data innovation. Whether it's in a service of FinTech use cases, or cyber use cases, or other, is a set of tools that end up having to try and stitch together those elements and bridge between them. So I see that as a fantastic gap to innovate around. I see, also, a fundamental need in creating a common data language, and common data management processes and governance across those different personas, because ultimately, the same underlying data these folks need, albeit in different mediums, different access models, different velocities, et cetera, the subject matter, if you will, the underlying raw data, and some of the taxonomies right on top of it, do need to be consistent. So, once again, a great opportunity to innovate, whether it's about semantic layers, whether it's about data mesh, whether it's about CICD tools for data engineers, and so on and so forth. >> I got to ask you, first of all, I see you have a friend you brought into the interview. You have a dog in the background who made a little cameo appearance. And that's awesome. Sitting right next to you, making sure everything's going well. On the AI thing, 'cause I think that's the hot trend here. >> Yeah. >> You're starting to see, that ChatGPT's got everyone excited, because it's kind of that first time you see kind of next-gen functionality, large-language models, where you can bring data in, and it integrates well. So, to me, I think, connecting the dots, this kind of speaks to the beginning of what will be a trend of really blending of data stacks together, or blending of models. And so, as more data modeling emerges, you start to have this AI stack kind of situation, where you have things out there that you can compose. It's almost very developer-friendly, conceptually. This is kind of new, but kind of the same concept's been working on with Google and others. How do you see this emerging, as an investor? What are some of the things that you're excited about, around the ChatGPT kind of things that's happening? 'Cause it brings it mainstream. Again, a million downloads, fastest applications get a million downloads, even among all the successes. So it's obviously hit a nerve. People are talking about it. What's your take on that? >> Yeah, so, I think that's a great point, and clearly, it feels like an iPhone moment, right, to the industry, in this case, AI, and lots of applications. And I think there's, at a high level, probably three different layers of innovation. One is on top of those platforms. What use cases can one bring to the table that would drive on top of a ChatGPT-like service? Whereby, the startup, the company, can bring some unique datasets to infuse and add value on top of it, by custom-focusing it and purpose-building it for a particular use case or particular vertical. Whether it's applying it to customer service, in a particular vertical, applying it to, I don't know, marketing content creation, and so on and so forth. That's one category. And I do know that, as one of my startups is in Y Combinator, this season, winter '23, they're saying that a very large chunk of the YC companies in this cycle are about GPT use cases. So we'll see a flurry of that. The next layer, the one below that, is those who actually provide those platforms, whether it's ChatGPT, whatever will emerge from the partnership with Microsoft, and any competitive players that emerge from other startups, or from the big cloud providers, whether it's Facebook, if they ever get into this, and Google, which clearly will, as they need to, to survive around search. The third layer is the enabling layer. As you're going to have more and more of those different large-language models and use case running on top of it, the underlying layers, all the way down to cloud infrastructure, the data infrastructure, and the entire set of tools and systems, that take raw data, and massage it into useful, labeled, contextualized features and data to feed the models, the AI models, whether it's during training, or during inference stages, in production. Personally, my focus is more on the infrastructure than on the application use cases. And I believe that there's going to be a massive amount of innovation opportunity around that, to reach cost-effective, quality, fair models that are deployed easily and maintained easily, or at least with as little pain as possible, at scale. So there are startups that are dealing with it, in various areas. Some are about focusing on labeling automation, some about fairness, about, speaking about cyber, protecting models from threats through data and other issues with it, and so on and so forth. And I believe that this will be, too, a big driver for massive innovation, the infrastructure layer. >> Awesome, and I love how you mentioned the iPhone moment. I call it the browser moment, 'cause it felt that way for me, personally. >> Yep. >> But I think, from a business model standpoint, there is that iPhone shift. It's not the BlackBerry. It's a whole 'nother thing. And I like that. But I do have to ask you, because this is interesting. You mentioned iPhone. iPhone's mostly proprietary. So, in these machine learning foundational models, >> Yeah. >> you're starting to see proprietary hardware, bolt-on, acceleration, bundled together, for faster uptake. And now you got open source emerging, as two things. It's almost iPhone-Android situation happening. >> Yeah. >> So what's your view on that? Because there's pros and cons for either one. You're seeing a lot of these machine learning laws are very proprietary, but they work, and do you care, right? >> Yeah. >> And then you got open source, which is like, "Okay, let's get some upsource code, and let people verify it, and then build with that." Is it a balance? >> Yes, I think- >> Is it mutually exclusive? What's your view? >> I think it's going to be, markets will drive the proportion of both, and I think, for a certain use case, you'll end up with more proprietary offerings. With certain use cases, I guess the fundamental infrastructure for ChatGPT-like, let's say, large-language models and all the use cases running on top of it, that's likely going to be more platform-oriented and open source, and will allow innovation. Think of it as the equivalent of iPhone apps or Android apps running on top of those platforms, as in AI apps. So we'll have a lot of that. Now, when you start going a little bit more into the guts, the lower layers, then it's clear that, for performance reasons, in particular, for certain use cases, we'll end up with more proprietary offerings, whether it's advanced silicon, such as some of the silicon that emerged from entrepreneurs who have left Google, around TensorFlow, and all the silicon that powers that. You'll see a lot of innovation in that area as well. It hopefully intends to improve the cost efficiency of running large AI-oriented workloads, both in inference and in learning stages. >> I got to ask you, because this has come up a lot around Azure and Microsoft. Microsoft, pretty good move getting into the ChatGPT >> Yep. >> and the open AI, because I was talking to someone who's a hardcore Amazon developer, and they said, they swore they would never use Azure, right? One of those types. And they're spinning up Azure servers to get access to the API. So, the developers are flocking, as you mentioned. The YC class is all doing large data things, because you can now program with data, which is amazing, which is amazing. So, what's your take on, I know you got to be kind of neutral 'cause you're an investor, but you got, Amazon has to respond, Google, essentially, did all the work, so they have to have a solution. So, I'm expecting Google to have something very compelling, but Microsoft, right now, is going to just, might run the table on developers, this new wave of data developers. What's your take on the cloud responses to this? What's Amazon, what do you think AWS is going to do? What should Google be doing? What's your take? >> So, each of them is coming from a slightly different angle, of course. I'll say, Google, I think, has massive assets in the AI space, and their underlying cloud platform, I think, has been designed to support such complicated workloads, but they have yet to go as far as opening it up the same way ChatGPT is now in that Microsoft partnership, and Azure. Good question regarding Amazon. AWS has had a significant investment in AI-related infrastructure. Seeing it through my startups, through other lens as well. How will they respond to that higher layer, above and beyond the low level, if you will, AI-enabling apparatuses? How do they elevate to at least one or two layers above, and get to the same ChatGPT layer, good question. Is there an acquisition that will make sense for them to accelerate it, maybe. Is there an in-house development that they can reapply from a different domain towards that, possibly. But I do suspect we'll end up with acquisitions as the arms race around the next level of cloud wars emerges, and it's going to be no longer just about the basic tooling for basic cloud-based applications, and the infrastructure, and the cost management, but rather, faster time to deliver AI in data-heavy applications. Once again, each one of those cloud suppliers, their vendor is coming with different assets, and different pros and cons. All of them will need to just elevate the level of the fight, if you will, in this case, to the AI layer. >> It's going to be very interesting, the different stacks on the data infrastructure, like I mentioned, analytics, data lake, AI, all happening. It's going to be interesting to see how this turns into this AI cloud, like data clouds, data operating systems. So, super fascinating area. Opher, thank you for coming on and sharing your expertise with us. Great to see you, and congratulations on the work. I'll give you the final word here. Give a plugin for what you're looking for for startup seats, pre-seeds. What's the kind of profile that gets your attention, from a seed, pre-seed candidate or entrepreneur? >> Cool, first of all, it's my pleasure. Enjoy our chats, as always. Hopefully the next one's not going to be in nine years. As to what I'm looking for, ideally, smart data entrepreneurs, who have come from a particular domain problem, or problem domain, that they understand, they felt it in their own 10 fingers, or millions of neurons in their brains, and they figured out a way to solve it. Whether it's a data infrastructure play, a cloud infrastructure play, or a very, very smart application that takes advantage of data at scale. These are the things I'm looking for. >> One final, final question I have to ask you, because you're a seasoned entrepreneur, and now coach. What's different about the current entrepreneurial environment right now, vis-a-vis, the past decade? What's new? Is it different, highly accelerated? What advice do you give entrepreneurs out there who are putting together their plan? Obviously, a global resource pool now of engineering. It might not be yesterday's formula for success to putting a venture together to get to that product-market fit. What's new and different, and what's your advice to the folks out there about what's different about the current environment for being an entrepreneur? >> Fantastic, so I think it's a great question. So I think there's a few axes of difference, compared to, let's say, five years ago, 10 years ago, 15 years ago. First and foremost, given the amount of infrastructure out there, the amount of open-source technologies, amount of developer toolkits and frameworks, trying to develop an application, at least at the application layer, is much faster than ever. So, it's faster and cheaper, to the most part, unless you're building very fundamental, core, deep tech, where you still have a big technology challenge to deal with. And absent that, the challenge shifts more to how do you manage my resources, to product-market fit, how are you integrating the GTM lens, the go-to-market lens, as early as possible in the product-market fit cycle, such that you reach from pre-seed to seed, from seed to A, from A to B, with an optimal amount of velocity, and a minimal amount of resources. One big difference, specifically as of, let's say, beginning of this year, late last year, is that money is no longer free for entrepreneurs, which means that you need to operate and build startup in an environment with a lot more constraints. And in my mind, some of the best startups that have ever been built, and some of the big market-changing, generational-changing, if you will, technology startups, in their respective industry verticals, have actually emerged from these times. And these tend to be the smartest, best startups that emerge because they operate with a lot less money. Money is not as available for them, which means that they need to make tough decisions, and make verticals every day. What you don't need to do, you can kick the cow down the road. When you have plenty of money, and it cushions for a lot of mistakes, you don't have that cushion. And hopefully we'll end up with companies with a more agile, more, if you will, resilience, and better cultures in making those tough decisions that startups need to make every day. Which is why I'm super, super excited to see the next batch of amazing unicorns, true unicorns, not just valuation, market rising with the water type unicorns that emerged from this particular era, which we're in the beginning of. And very much enjoy working with entrepreneurs during this difficult time, the times we're in. >> The next 24 months will be the next wave, like you said, best time to do a company. Remember, Airbnb's pitch was, "We'll rent cots in apartments, and sell cereal." Boy, a lot of people passed on that deal, in that last down market, that turned out to be a game-changer. So the crazy ideas might not be that bad. So it's all about the entrepreneurs, and >> 100%. >> this is a big wave, and it's certainly happening. Opher, thank you for sharing. Obviously, data is going to change all the markets. Refactoring, security, FinTech, user experience, applications are going to be changed by data, data operating system. Thanks for coming on, and thanks for sharing. Appreciate it. >> My pleasure. Have a good one. >> Okay, more coverage for the CloudNativeSecurityCon inaugural event. Data will be the key for cybersecurity. theCUBE's coverage continues after this break. (uplifting music)

Published Date : Feb 2 2023

SUMMARY :

and happening all around the world. Great to be back. He's now the CEO in the past five years or so. and a lot of the startups What kind of deals are you looking at? and broaden the variety of and apply it to a product and experience And you see a slew of folks and put it on the cloud, and new gaps that need to be filled You have a dog in the background but kind of the same and the entire set of tools and systems, I call it the browser moment, But I do have to ask you, And now you got open source and do you care, right? and then build with that." and all the use cases I got to ask you, because and the open AI, and it's going to be no longer What's the kind of profile These are the things I'm looking for. about the current environment and some of the big market-changing, So it's all about the entrepreneurs, and to change all the markets. Have a good one. for the CloudNativeSecurityCon

ENTITIES

Entity	Category	Confidence
Satya Nadella	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
2013	DATE	0.99+
Opher	PERSON	0.99+
CapEx	ORGANIZATION	0.99+
Seattle	LOCATION	0.99+
John Furrier	PERSON	0.99+
Sonoma Ventures	ORGANIZATION	0.99+
BlackBerry	ORGANIZATION	0.99+
10 fingers	QUANTITY	0.99+
Airbnb	ORGANIZATION	0.99+
CUBE	ORGANIZATION	0.99+
nine years	QUANTITY	0.99+
Facebook	ORGANIZATION	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
Origami Logic	ORGANIZATION	0.99+
Origami	ORGANIZATION	0.99+
Intuit	ORGANIZATION	0.99+
RevTech	ORGANIZATION	0.99+
each	QUANTITY	0.99+
Opher Kahane	PERSON	0.99+
CloudNativeSecurityCon	EVENT	0.99+
Palo Alto Studios	LOCATION	0.99+
yesterday	DATE	0.99+
One	QUANTITY	0.99+
First	QUANTITY	0.99+
third layer	QUANTITY	0.98+
theCUBE	ORGANIZATION	0.98+
two layers	QUANTITY	0.98+
Android	TITLE	0.98+
third career	QUANTITY	0.98+
two things	QUANTITY	0.98+
both	QUANTITY	0.98+
MapR	ORGANIZATION	0.98+
one	QUANTITY	0.98+
one category	QUANTITY	0.98+
late last year	DATE	0.98+
millions of neurons	QUANTITY	0.98+
a million downloads	QUANTITY	0.98+
three startups	QUANTITY	0.98+
10 years ago	DATE	0.97+
Fintech	ORGANIZATION	0.97+
winter '23	DATE	0.97+
first one	QUANTITY	0.97+
this year	DATE	0.97+
Stanford	LOCATION	0.97+
Cloudera	ORGANIZATION	0.97+
theCUBE Center	ORGANIZATION	0.96+
five years ago	DATE	0.96+
10 year	QUANTITY	0.96+
ChatGPT	TITLE	0.96+
three	QUANTITY	0.95+
first time	QUANTITY	0.95+
XCEL Partners	ORGANIZATION	0.95+
15 years ago	DATE	0.94+
24 startups	QUANTITY	0.93+

Jon Turow, Madrona Venture Group | CloudNativeSecurityCon 23

(upbeat music) >> Hello and welcome back to theCUBE. We're here in Palo Alto, California. I'm your host, John Furrier with a special guest here in the studio. As part of our Cloud Native SecurityCon Coverage we had an opportunity to bring in Jon Turow who is the partner at Madrona Venture Partners formerly with AWS and to talk about machine learning, foundational models, and how the future of AI is going to be impacted by some of the innovation around what's going on in the industry. ChatGPT has taken the world by storm. A million downloads, fastest to the million downloads there. Before some were saying it's just a gimmick. Others saying it's a game changer. Jon's here to break it down, and great to have you on. Thanks for coming in. >> Thanks John. Glad to be here. >> Thanks for coming on. So first of all, I'm glad you're here. First of all, because two things. One, you were formerly with AWS, got a lot of experience running projects at AWS. Now a partner at Madrona, a great firm doing great deals, and they had this future at modern application kind of thesis. Now you are putting out some content recently around foundational models. You're deep into computer vision. You were the IoT general manager at AWS among other things, Greengrass. So you know a lot about data. You know a lot about some of this automation, some of the edge stuff. You've been in the middle of all these kind of areas that now seem to be the next wave coming. So I wanted to ask you what your thoughts are of how the machine learning and this new automation wave is coming in, this AI tools are coming out. Is it a platform? Is it going to be smarter? What feeds AI? What's your take on this whole foundational big movement into AI? What's your general reaction to all this? >> So, thanks, Jon, again for having me here. Really excited to talk about these things. AI has been coming for a long time. It's been kind of the next big thing. Always just over the horizon for quite some time. And we've seen really compelling applications in generations before and until now. Amazon and AWS have introduced a lot of them. My firm, Madrona Venture Group has invested in some of those early players as well. But what we're seeing now is something categorically different. That's really exciting and feels like a durable change. And I can try and explain what that is. We have these really large models that are useful in a general way. They can be applied to a lot of different tasks beyond the specific task that the designers envisioned. That makes them more flexible, that makes them more useful for building applications than what we've seen before. And so that, we can talk about the depths of it, but in a nutshell, that's why I think people are really excited. >> And I think one of the things that you wrote about that jumped out at me is that this seems to be this moment where there's been a multiple decades of nerds and computer scientists and programmers and data thinkers around waiting for AI to blossom. And it's like they're scratching that itch. Every year is going to be, and it's like the bottleneck's always been compute power. And we've seen other areas, genome sequencing, all kinds of high computation things where required high forms computing. But now there's no real bottleneck to compute. You got cloud. And so you're starting to see the emergence of a massive acceleration of where AI's been and where it needs to be going. Now, it's almost like it's got a reboot. It's almost a renaissance in the AI community with a whole nother macro environmental things happening. Cloud, younger generation, applications proliferate from mobile to cloud native. It's the perfect storm for this kind of moment to switch over. Am I overreading that? Is that right? >> You're right. And it's been cooking for a cycle or two. And let me try and explain why that is. We have cloud and AWS launch in whatever it was, 2006, and offered more compute to more people than really was possible before. Initially that was about taking existing applications and running them more easily in a bigger scale. But in that period of time what's also become possible is new kinds of computation that really weren't practical or even possible without that vast amount of compute. And so one result that came of that is something called the transformer AI model architecture. And Google came out with that, published a paper in 2017. And what that says is, with a transformer model you can actually train an arbitrarily large amount of data into a model, and see what happens. That's what Google demonstrated in 2017. The what happens is the really exciting part because when you do that, what you start to see, when models exceed a certain size that we had never really seen before all of a sudden they get what we call emerging capabilities of complex reasoning and reasoning outside a domain and reasoning with data. The kinds of things that people describe as spooky when they play with something like ChatGPT. That's the underlying term. We don't as an industry quite know why it happens or how it happens, but we can measure that it does. So cloud enables new kinds of math and science. New kinds of math and science allow new kinds of experimentation. And that experimentation has led to this new generation of models. >> So one of the debates we had on theCUBE at our Supercloud event last month was, what's the barriers to entry for say OpenAI, for instance? Obviously, I weighed in aggressively and said, "The barriers for getting into cloud are high because all the CapEx." And Howie Xu formerly VMware, now at ZScaler, he's an AI machine learning guy. He was like, "Well, you can spend $100 million and replicate it." I saw a quote that set up for 180,000 I can get this other package. What's the barriers to entry? Is ChatGPT or OpenAI, does it have sustainability? Is it easy to get into? What is the market like for AI? I mean, because a lot of entrepreneurs are jumping in. I mean, I just read a story today. San Francisco's got more inbound migration because of the AI action happening, Seattle's booming, Boston with MIT's been working on neural networks for generations. That's what we've found the answer. Get off the neural network, Boston jump on the AI bus. So there's total excitement for this. People are enthusiastic around this area. >> You can think of an iPhone versus Android tension that's happening today. In the iPhone world, there are proprietary models from OpenAI who you might consider as the leader. There's Cohere, there's AI21, there's Anthropic, Google's going to have their own, and a few others. These are proprietary models that developers can build on top of, get started really quickly. They're measured to have the highest accuracy and the highest performance today. That's the proprietary side. On the other side, there is an open source part of the world. These are a proliferation of model architectures that developers and practitioners can take off the shelf and train themselves. Typically found in Hugging face. What people seem to think is that the accuracy and performance of the open source models is something like 18 to 20 months behind the accuracy and performance of the proprietary models. But on the other hand, there's infinite flexibility for teams that are capable enough. So you're going to see teams choose sides based on whether they want speed or flexibility. >> That's interesting. And that brings up a point I was talking to a startup and the debate was, do you abstract away from the hardware and be software-defined or software-led on the AI side and let the hardware side just extremely accelerate on its own, 'cause it's flywheel? So again, back to proprietary, that's with hardware kind of bundled in, bolted on. Is it accelerator or is it bolted on or is it part of it? So to me, I think that the big struggle in understanding this is that which one will end up being right. I mean, is it a beta max versus VHS kind of thing going on? Or iPhone, Android, I mean iPhone makes a lot of sense, but if you're Apple, but is there an Apple moment in the machine learning? >> In proprietary models, here does seem to be a jump ball. That there's going to be a virtuous flywheel that emerges that, for example, all these excitement about ChatGPT. What's really exciting about it is it's really easy to use. The technology isn't so different from what we've seen before even from OpenAI. You mentioned a million users in a short period of time, all providing training data for OpenAI that makes their underlying models, their next generation even better. So it's not unreasonable to guess that there's going to be power laws that emerge on the proprietary side. What I think history has shown is that iPhone, Android, Windows, Linux, there seems to be gravity towards this yin and yang. And my guess, and what other people seem to think is going to be the case is that we're going to continue to see these two poles of AI. >> So let's get into the relationship with data because I've been emerging myself with ChatGPT, fascinated by the ease of use, yes, but also the fidelity of how you query it. And I felt like when I was doing writing SQL back in the eighties and nineties where SQL was emerging. You had to be really a guru at the SQL to get the answers you wanted. It seems like the querying into ChatGPT is a good thing if you know how to talk to it. Labeling whether your input is and it does a great job if you feed it right. If you ask a generic questions like Google. It's like a Google search. It gives you great format, sounds credible, but the facts are kind of wrong. >> That's right. >> That's where general consensus is coming on. So what does that mean? That means people are on one hand saying, "Ah, it's bullshit 'cause it's wrong." But I look at, I'm like, "Wow, that's that's compelling." 'Cause if you feed it the right data, so now we're in the data modeling here, so the role of data's going to be critical. Is there a data operating system emerging? Because if this thing continues to go the way it's going you can almost imagine as you would look at companies to invest in. Who's going to be right on this? What's going to scale? What's sustainable? What could build a durable company? It might not look what like what people think it is. I mean, I remember when Google started everyone thought it was the worst search engine because it wasn't a portal. But it was the best organic search on the planet became successful. So I'm trying to figure out like, okay, how do you read this? How do you read the tea leaves? >> Yeah. There are a few different ways that companies can differentiate themselves. Teams with galactic capabilities to take an open source model and then change the architecture and retrain and go down to the silicon. They can do things that might not have been possible for other teams to do. There's a company that that we're proud to be investors in called RunwayML that provides video accelerated, sorry, AI accelerated video editing capabilities. They were used in everything, everywhere all at once and some others. In order to build RunwayML, they needed a vision of what the future was going to look like and they needed to make deep contributions to the science that was going to enable all that. But not every team has those capabilities, maybe nor should they. So as far as how other teams are going to differentiate there's a couple of things that they can do. One is called prompt engineering where they shape on behalf of their own users exactly how the prompt to get fed to the underlying model. It's not clear whether that's going to be a durable problem or whether like Google, we consumers are going to start to get more intuitive about this. That's one. The second is what's called information retrieval. How can I get information about the world outside, information from a database or a data store or whatever service into these models so they can reason about them. And the third is, this is going to sound funny, but attribution. Just like you would do in a news report or an academic paper. If you can state where your facts are coming from, the downstream consumer or the human being who has to use that information actually is going to be able to make better sense of it and rely better on it. So that's prompt engineering, that's retrieval, and that's attribution. >> So that brings me to my next point I want to dig in on is the foundational model stack that you published. And I'll start by saying that with ChatGPT, if you take out the naysayers who are like throwing cold water on it about being a gimmick or whatever, and then you got the other side, I would call the alpha nerds who are like they can see, "Wow, this is amazing." This is truly NextGen. This isn't yesterday's chatbot nonsense. They're like, they're all over it. It's that everybody's using it right now in every vertical. I heard someone using it for security logs. I heard a data center, hardware vendor using it for pushing out appsec review updates. I mean, I've heard corner cases. We're using it for theCUBE to put our metadata in. So there's a horizontal use case of value. So to me that tells me it's a market there. So when you have horizontal scalability in the use case you're going to have a stack. So you publish this stack and it has an application at the top, applications like Jasper out there. You're seeing ChatGPT. But you go after the bottom, you got silicon, cloud, foundational model operations, the foundational models themselves, tooling, sources, actions. Where'd you get this from? How'd you put this together? Did you just work backwards from the startups or was there a thesis behind this? Could you share your thoughts behind this foundational model stack? >> Sure. Well, I'm a recovering product manager and my job that I think about as a product manager is who is my customer and what problem he wants to solve. And so to put myself in the mindset of an application developer and a founder who is actually my customer as a partner at Madrona, I think about what technology and resources does she need to be really powerful, to be able to take a brilliant idea, and actually bring that to life. And if you spend time with that community, which I do and I've met with hundreds of founders now who are trying to do exactly this, you can see that the stack is emerging. In fact, we first drew it in, not in January 2023, but October 2022. And if you look at the difference between the October '22 and January '23 stacks you're going to see that holes in the stack that we identified in October around tooling and around foundation model ops and the rest are organically starting to get filled because of how much demand from the developers at the top of the stack. >> If you look at the young generation coming out and even some of the analysts, I was just reading an analyst report on who's following the whole data stacks area, Databricks, Snowflake, there's variety of analytics, realtime AI, data's hot. There's a lot of engineers coming out that were either data scientists or I would call data platform engineering folks are becoming very key resources in this area. What's the skillset emerging and what's the mindset of that entrepreneur that sees the opportunity? How does these startups come together? Is there a pattern in the formation? Is there a pattern in the competency or proficiency around the talent behind these ventures? >> Yes. I would say there's two groups. The first is a very distinct pattern, John. For the past 10 years or a little more we've seen a pattern of democratization of ML where more and more people had access to this powerful science and technology. And since about 2017, with the rise of the transformer architecture in these foundation models, that pattern has reversed. All of a sudden what has become broader access is now shrinking to a pretty small group of scientists who can actually train and manipulate the architectures of these models themselves. So that's one. And what that means is the teams who can do that have huge ability to make the future happen in ways that other people don't have access to yet. That's one. The second is there is a broader population of people who by definition has even more collective imagination 'cause there's even more people who sees what should be possible and can use things like the proprietary models, like the OpenAI models that are available off the shelf and try to create something that maybe nobody has seen before. And when they do that, Jasper AI is a great example of that. Jasper AI is a company that creates marketing copy automatically with generative models such as GPT-3. They do that and it's really useful and it's almost fun for a marketer to use that. But there are going to be questions of how they can defend that against someone else who has access to the same technology. It's a different population of founders who has to find other sources of differentiation without being able to go all the way down to the the silicon and the science. >> Yeah, and it's going to be also opportunity recognition is one thing. Building a viable venture product market fit. You got competition. And so when things get crowded you got to have some differentiation. I think that's going to be the key. And that's where I was trying to figure out and I think data with scale I think are big ones. Where's the vulnerability in the stack in terms of gaps? Where's the white space? I shouldn't say vulnerability. I should say where's the opportunity, where's the white space in the stack that you see opportunities for entrepreneurs to attack? >> I would say there's two. At the application level, there is almost infinite opportunity, John, because almost every kind of application is about to be reimagined or disrupted with a new generation that takes advantage of this really powerful new technology. And so if there is a kind of application in almost any vertical, it's hard to rule something out. Almost any vertical that a founder wishes she had created the original app in, well, now it's her time. So that's one. The second is, if you look at the tooling layer that we discussed, tooling is a really powerful way that you can provide more flexibility to app developers to get more differentiation for themselves. And the tooling layer is still forming. This is the interface between the models themselves and the applications. Tools that help bring in data, as you mentioned, connect to external actions, bring context across multiple calls, chain together multiple models. These kinds of things, there's huge opportunity there. >> Well, Jon, I really appreciate you coming in. I had a couple more questions, but I will take a minute to read some of your bios for the audience and we'll get into, I won't embarrass you, but I want to set the context. You said you were recovering product manager, 10 plus years at AWS. Obviously, recovering from AWS, which is a whole nother dimension of recovering. In all seriousness, I talked to Andy Jassy around that time and Dr. Matt Wood and it was about that time when AI was just getting on the radar when they started. So you guys started seeing the wave coming in early on. So I remember at that time as Amazon was starting to grow significantly and even just stock price and overall growth. From a tech perspective, it was pretty clear what was coming, so you were there when this tsunami hit. >> Jon: That's right. >> And you had a front row seat building tech, you were led the product teams for Computer Vision AI, Textract, AI intelligence for document processing, recognition for image and video analysis. You wrote the business product plan for AWS IoT and Greengrass, which we've covered a lot in theCUBE, which extends out to the whole edge thing. So you know a lot about AI/ML, edge computing, IOT, messaging, which I call the law of small numbers that scale become big. This is a big new thing. So as a former AWS leader who's been there and at Madrona, what's your investment thesis as you start to peruse the landscape and talk to entrepreneurs as you got the stack? What's the big picture? What are you looking for? What's the thesis? How do you see this next five years emerging? >> Five years is a really long time given some of this science is only six months out. I'll start with some, no pun intended, some foundational things. And we can talk about some implications of the technology. The basics are the same as they've always been. We want, what I like to call customers with their hair on fire. So they have problems, so urgent they'll buy half a product. The joke is if your hair is on fire you might want a bucket of cold water, but you'll take a tennis racket and you'll beat yourself over the head to put the fire out. You want those customers 'cause they'll meet you more than halfway. And when you find them, you can obsess about them and you can get better every day. So we want customers with their hair on fire. We want founders who have empathy for those customers, understand what is going to be required to serve them really well, and have what I like to call founder-market fit to be able to build the products that those customers are going to need. >> And because that's a good strategy from an emerging, not yet fully baked out requirements definition. >> Jon: That's right. >> Enough where directionally they're leaning in, more than in, they're part of the product development process. >> That's right. And when you're doing early stage development, which is where I personally spend a lot of my time at the seed and A and a little bit beyond that stage often that's going to be what you have to go on because the future is going to be so complex that you can't see the curves beyond it. But if you have customers with their hair on fire and talented founders who have the capability to serve those customers, that's got me interested. >> So if I'm an entrepreneur, I walk in and say, "I have customers that have their hair on fire." What kind of checks do you write? What's the kind of the average you're seeing for seed and series? Probably seed, seed rounds and series As. >> It can depend. I have seen seed rounds of double digit million dollars. I have seen seed rounds much smaller than that. It really depends on what is going to be the right thing for these founders to prove out the hypothesis that they're testing that says, "Look, we have this customer with her hair on fire. We think we can build at least a tennis racket that she can use to start beating herself over the head and put the fire out. And then we're going to have something really interesting that we can scale up from there and we can make the future happen. >> So it sounds like your advice to founders is go out and find some customers, show them a product, don't obsess over full completion, get some sort of vibe on fit and go from there. >> Yeah, and I think by the time founders come to me they may not have a product, they may not have a deck, but if they have a customer with her hair on fire, then I'm really interested. >> Well, I always love the professional services angle on these markets. You go in and you get some business and you understand it. Walk away if you don't like it, but you see the hair on fire, then you go in product mode. >> That's right. >> All Right, Jon, thank you for coming on theCUBE. Really appreciate you stopping by the studio and good luck on your investments. Great to see you. >> You too. >> Thanks for coming on. >> Thank you, Jon. >> CUBE coverage here at Palo Alto. I'm John Furrier, your host. More coverage with CUBE Conversations after this break. (upbeat music)

Published Date : Feb 2 2023

SUMMARY :

and great to have you on. that now seem to be the next wave coming. It's been kind of the next big thing. is that this seems to be this moment and offered more compute to more people What's the barriers to entry? is that the accuracy and the debate was, do you that there's going to be power laws but also the fidelity of how you query it. going to be critical. exactly how the prompt to get So that brings me to my next point and actually bring that to life. and even some of the analysts, But there are going to be questions Yeah, and it's going to be and the applications. the radar when they started. and talk to entrepreneurs the head to put the fire out. And because that's a good of the product development process. that you can't see the curves beyond it. What kind of checks do you write? and put the fire out. to founders is go out time founders come to me and you understand it. stopping by the studio More coverage with CUBE

ENTITIES

Entity	Category	Confidence
Amazon	ORGANIZATION	0.99+
Jon	PERSON	0.99+
AWS	ORGANIZATION	0.99+
John	PERSON	0.99+
John Furrier	PERSON	0.99+
Andy Jassy	PERSON	0.99+
2017	DATE	0.99+
January 2023	DATE	0.99+
Jon Turow	PERSON	0.99+
October	DATE	0.99+
18	QUANTITY	0.99+
MIT	ORGANIZATION	0.99+
$100 million	QUANTITY	0.99+
Palo Alto	LOCATION	0.99+
10 plus years	QUANTITY	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
Google	ORGANIZATION	0.99+
two	QUANTITY	0.99+
October 2022	DATE	0.99+
hundreds	QUANTITY	0.99+
Madrona	ORGANIZATION	0.99+
Apple	ORGANIZATION	0.99+
Madrona Venture Partners	ORGANIZATION	0.99+
January '23	DATE	0.99+
two groups	QUANTITY	0.99+
Matt Wood	PERSON	0.99+
Madrona Venture Group	ORGANIZATION	0.99+
180,000	QUANTITY	0.99+
October '22	DATE	0.99+
Jasper	TITLE	0.99+
Palo Alto, California	LOCATION	0.99+
six months	QUANTITY	0.99+
2006	DATE	0.99+
million downloads	QUANTITY	0.99+
Five years	QUANTITY	0.99+
SQL	TITLE	0.99+
last month	DATE	0.99+
two poles	QUANTITY	0.99+
first	QUANTITY	0.99+
Howie Xu	PERSON	0.99+
VMware	ORGANIZATION	0.99+
third	QUANTITY	0.99+
20 months	QUANTITY	0.99+
Greengrass	ORGANIZATION	0.99+
Madrona Venture Group	ORGANIZATION	0.98+
second	QUANTITY	0.98+
One	QUANTITY	0.98+
Supercloud	EVENT	0.98+
RunwayML	TITLE	0.98+
San Francisco	LOCATION	0.98+
ZScaler	ORGANIZATION	0.98+
yesterday	DATE	0.98+
one	QUANTITY	0.98+
First	QUANTITY	0.97+
CapEx	ORGANIZATION	0.97+
eighties	DATE	0.97+
ChatGPT	TITLE	0.96+
Dr.	PERSON	0.96+

Breaking Analysis: Enterprise Technology Predictions 2023

(upbeat music beginning) >> From the Cube Studios in Palo Alto and Boston, bringing you data-driven insights from the Cube and ETR, this is "Breaking Analysis" with Dave Vellante. >> Making predictions about the future of enterprise tech is more challenging if you strive to lay down forecasts that are measurable. In other words, if you make a prediction, you should be able to look back a year later and say, with some degree of certainty, whether the prediction came true or not, with evidence to back that up. Hello and welcome to this week's Wikibon Cube Insights, powered by ETR. In this breaking analysis, we aim to do just that, with predictions about the macro IT spending environment, cost optimization, security, lots to talk about there, generative AI, cloud, and of course supercloud, blockchain adoption, data platforms, including commentary on Databricks, snowflake, and other key players, automation, events, and we may even have some bonus predictions around quantum computing, and perhaps some other areas. To make all this happen, we welcome back, for the third year in a row, my colleague and friend Eric Bradley from ETR. Eric, thanks for all you do for the community, and thanks for being part of this program. Again. >> I wouldn't miss it for the world. I always enjoy this one. Dave, good to see you. >> Yeah, so let me bring up this next slide and show you, actually come back to me if you would. I got to show the audience this. These are the inbounds that we got from PR firms starting in October around predictions. They know we do prediction posts. And so they'll send literally thousands and thousands of predictions from hundreds of experts in the industry, technologists, consultants, et cetera. And if you bring up the slide I can show you sort of the pattern that developed here. 40% of these thousands of predictions were from cyber. You had AI and data. If you combine those, it's still not close to cyber. Cost optimization was a big thing. Of course, cloud, some on DevOps, and software. Digital... Digital transformation got, you know, some lip service and SaaS. And then there was other, it's kind of around 2%. So quite remarkable, when you think about the focus on cyber, Eric. >> Yeah, there's two reasons why I think it makes sense, though. One, the cybersecurity companies have a lot of cash, so therefore the PR firms might be working a little bit harder for them than some of their other clients. (laughs) And then secondly, as you know, for multiple years now, when we do our macro survey, we ask, "What's your number one spending priority?" And again, it's security. It just isn't going anywhere. It just stays at the top. So I'm actually not that surprised by that little pie chart there, but I was shocked that SaaS was only 5%. You know, going back 10 years ago, that would've been the only thing anyone was talking about. >> Yeah. So true. All right, let's get into it. First prediction, we always start with kind of tech spending. Number one is tech spending increases between four and 5%. ETR has currently got it at 4.6% coming into 2023. This has been a consistently downward trend all year. We started, you know, much, much higher as we've been reporting. Bottom line is the fed is still in control. They're going to ease up on tightening, is the expectation, they're going to shoot for a soft landing. But you know, my feeling is this slingshot economy is going to continue, and it's going to continue to confound, whether it's supply chains or spending. The, the interesting thing about the ETR data, Eric, and I want you to comment on this, the largest companies are the most aggressive to cut. They're laying off, smaller firms are spending faster. They're actually growing at a much larger, faster rate as are companies in EMEA. And that's a surprise. That's outpacing the US and APAC. Chime in on this, Eric. >> Yeah, I was surprised on all of that. First on the higher level spending, we are definitely seeing it coming down, but the interesting thing here is headlines are making it worse. The huge research shop recently said 0% growth. We're coming in at 4.6%. And just so everyone knows, this is not us guessing, we asked 1,525 IT decision-makers what their budget growth will be, and they came in at 4.6%. Now there's a huge disparity, as you mentioned. The Fortune 500, global 2000, barely at 2% growth, but small, it's at 7%. So we're at a situation right now where the smaller companies are still playing a little bit of catch up on digital transformation, and they're spending money. The largest companies that have the most to lose from a recession are being more trepidatious, obviously. So they're playing a "Wait and see." And I hope we don't talk ourselves into a recession. Certainly the headlines and some of their research shops are helping it along. But another interesting comment here is, you know, energy and utilities used to be called an orphan and widow stock group, right? They are spending more than anyone, more than financials insurance, more than retail consumer. So right now it's being driven by mid, small, and energy and utilities. They're all spending like gangbusters, like nothing's happening. And it's the rest of everyone else that's being very cautious. >> Yeah, so very unpredictable right now. All right, let's go to number two. Cost optimization remains a major theme in 2023. We've been reporting on this. You've, we've shown a chart here. What's the primary method that your organization plans to use? You asked this question of those individuals that cited that they were going to reduce their spend and- >> Mhm. >> consolidating redundant vendors, you know, still leads the way, you know, far behind, cloud optimization is second, but it, but cloud continues to outpace legacy on-prem spending, no doubt. Somebody, it was, the guy's name was Alexander Feiglstorfer from Storyblok, sent in a prediction, said "All in one becomes extinct." Now, generally I would say I disagree with that because, you know, as we know over the years, suites tend to win out over, you know, individual, you know, point products. But I think what's going to happen is all in one is going to remain the norm for these larger companies that are cutting back. They want to consolidate redundant vendors, and the smaller companies are going to stick with that best of breed and be more aggressive and try to compete more effectively. What's your take on that? >> Yeah, I'm seeing much more consolidation in vendors, but also consolidation in functionality. We're seeing people building out new functionality, whether it's, we're going to talk about this later, so I don't want to steal too much of our thunder right now, but data and security also, we're seeing a functionality creep. So I think there's further consolidation happening here. I think niche solutions are going to be less likely, and platform solutions are going to be more likely in a spending environment where you want to reduce your vendors. You want to have one bill to pay, not 10. Another thing on this slide, real quick if I can before I move on, is we had a bunch of people write in and some of the answer options that aren't on this graph but did get cited a lot, unfortunately, is the obvious reduction in staff, hiring freezes, and delaying hardware, were three of the top write-ins. And another one was offshore outsourcing. So in addition to what we're seeing here, there were a lot of write-in options, and I just thought it would be important to state that, but essentially the cost optimization is by and far the highest one, and it's growing. So it's actually increased in our citations over the last year. >> And yeah, specifically consolidating redundant vendors. And so I actually thank you for bringing that other up, 'cause I had asked you, Eric, is there any evidence that repatriation is going on and we don't see it in the numbers, we don't see it even in the other, there was, I think very little or no mention of cloud repatriation, even though it might be happening in this in a smattering. >> Not a single mention, not one single mention. I went through it for you. Yep. Not one write-in. >> All right, let's move on. Number three, security leads M&A in 2023. Now you might say, "Oh, well that's a layup," but let me set this up Eric, because I didn't really do a great job with the slide. I hid the, what you've done, because you basically took, this is from the emerging technology survey with 1,181 responses from November. And what we did is we took Palo Alto and looked at the overlap in Palo Alto Networks accounts with these vendors that were showing on this chart. And Eric, I'm going to ask you to explain why we put a circle around OneTrust, but let me just set it up, and then have you comment on the slide and take, give us more detail. We're seeing private company valuations are off, you know, 10 to 40%. We saw a sneak, do a down round, but pretty good actually only down 12%. We've seen much higher down rounds. Palo Alto Networks we think is going to get busy. Again, they're an inquisitive company, they've been sort of quiet lately, and we think CrowdStrike, Cisco, Microsoft, Zscaler, we're predicting all of those will make some acquisitions and we're thinking that the targets are somewhere in this mess of security taxonomy. Other thing we're predicting AI meets cyber big time in 2023, we're going to probably going to see some acquisitions of those companies that are leaning into AI. We've seen some of that with Palo Alto. And then, you know, your comment to me, Eric, was "The RSA conference is going to be insane, hopping mad, "crazy this April," (Eric laughing) but give us your take on this data, and why the red circle around OneTrust? Take us back to that slide if you would, Alex. >> Sure. There's a few things here. First, let me explain what we're looking at. So because we separate the public companies and the private companies into two separate surveys, this allows us the ability to cross-reference that data. So what we're doing here is in our public survey, the tesis, everyone who cited some spending with Palo Alto, meaning they're a Palo Alto customer, we then cross-reference that with the private tech companies. Who also are they spending with? So what you're seeing here is an overlap. These companies that we have circled are doing the best in Palo Alto's accounts. Now, Palo Alto went and bought Twistlock a few years ago, which this data slide predicted, to be quite honest. And so I don't know if they necessarily are going to go after Snyk. Snyk, sorry. They already have something in that space. What they do need, however, is more on the authentication space. So I'm looking at OneTrust, with a 45% overlap in their overall net sentiment. That is a company that's already existing in their accounts and could be very synergistic to them. BeyondTrust as well, authentication identity. This is something that Palo needs to do to move more down that zero trust path. Now why did I pick Palo first? Because usually they're very inquisitive. They've been a little quiet lately. Secondly, if you look at the backdrop in the markets, the IPO freeze isn't going to last forever. Sooner or later, the IPO markets are going to open up, and some of these private companies are going to tap into public equity. In the meantime, however, cash funding on the private side is drying up. If they need another round, they're not going to get it, and they're certainly not going to get it at the valuations they were getting. So we're seeing valuations maybe come down where they're a touch more attractive, and Palo knows this isn't going to last forever. Cisco knows that, CrowdStrike, Zscaler, all these companies that are trying to make a push to become that vendor that you're consolidating in, around, they have a chance now, they have a window where they need to go make some acquisitions. And that's why I believe leading up to RSA, we're going to see some movement. I think it's going to pretty, a really exciting time in security right now. >> Awesome. Thank you. Great explanation. All right, let's go on the next one. Number four is, it relates to security. Let's stay there. Zero trust moves from hype to reality in 2023. Now again, you might say, "Oh yeah, that's a layup." A lot of these inbounds that we got are very, you know, kind of self-serving, but we always try to put some meat in the bone. So first thing we do is we pull out some commentary from, Eric, your roundtable, your insights roundtable. And we have a CISO from a global hospitality firm says, "For me that's the highest priority." He's talking about zero trust because it's the best ROI, it's the most forward-looking, and it enables a lot of the business transformation activities that we want to do. CISOs tell me that they actually can drive forward transformation projects that have zero trust, and because they can accelerate them, because they don't have to go through the hurdle of, you know, getting, making sure that it's secure. Second comment, zero trust closes that last mile where once you're authenticated, they open up the resource to you in a zero trust way. That's a CISO of a, and a managing director of a cyber risk services enterprise. Your thoughts on this? >> I can be here all day, so I'm going to try to be quick on this one. This is not a fluff piece on this one. There's a couple of other reasons this is happening. One, the board finally gets it. Zero trust at first was just a marketing hype term. Now the board understands it, and that's why CISOs are able to push through it. And what they finally did was redefine what it means. Zero trust simply means moving away from hardware security, moving towards software-defined security, with authentication as its base. The board finally gets that, and now they understand that this is necessary and it's being moved forward. The other reason it's happening now is hybrid work is here to stay. We weren't really sure at first, large companies were still trying to push people back to the office, and it's going to happen. The pendulum will swing back, but hybrid work's not going anywhere. By basically on our own data, we're seeing that 69% of companies expect remote and hybrid to be permanent, with only 30% permanent in office. Zero trust works for a hybrid environment. So all of that is the reason why this is happening right now. And going back to our previous prediction, this is why we're picking Palo, this is why we're picking Zscaler to make these acquisitions. Palo Alto needs to be better on the authentication side, and so does Zscaler. They're both fantastic on zero trust network access, but they need the authentication software defined aspect, and that's why we think this is going to happen. One last thing, in that CISO round table, I also had somebody say, "Listen, Zscaler is incredible. "They're doing incredibly well pervading the enterprise, "but their pricing's getting a little high," and they actually think Palo Alto is well-suited to start taking some of that share, if Palo can make one move. >> Yeah, Palo Alto's consolidation story is very strong. Here's my question and challenge. Do you and me, so I'm always hardcore about, okay, you've got to have evidence. I want to look back at these things a year from now and say, "Did we get it right? Yes or no?" If we got it wrong, we'll tell you we got it wrong. So how are we going to measure this? I'd say a couple things, and you can chime in. One is just the number of vendors talking about it. That's, but the marketing always leads the reality. So the second part of that is we got to get evidence from the buying community. Can you help us with that? >> (laughs) Luckily, that's what I do. I have a data company that asks thousands of IT decision-makers what they're adopting and what they're increasing spend on, as well as what they're decreasing spend on and what they're replacing. So I have snapshots in time over the last 11 years where I can go ahead and compare and contrast whether this adoption is happening or not. So come back to me in 12 months and I'll let you know. >> Now, you know, I will. Okay, let's bring up the next one. Number five, generative AI hits where the Metaverse missed. Of course everybody's talking about ChatGPT, we just wrote last week in a breaking analysis with John Furrier and Sarjeet Joha our take on that. We think 2023 does mark a pivot point as natural language processing really infiltrates enterprise tech just as Amazon turned the data center into an API. We think going forward, you're going to be interacting with technology through natural language, through English commands or other, you know, foreign language commands, and investors are lining up, all the VCs are getting excited about creating something competitive to ChatGPT, according to (indistinct) a hundred million dollars gets you a seat at the table, gets you into the game. (laughing) That's before you have to start doing promotion. But he thinks that's what it takes to actually create a clone or something equivalent. We've seen stuff from, you know, the head of Facebook's, you know, AI saying, "Oh, it's really not that sophisticated, ChatGPT, "it's kind of like IBM Watson, it's great engineering, "but you know, we've got more advanced technology." We know Google's working on some really interesting stuff. But here's the thing. ETR just launched this survey for the February survey. It's in the field now. We circle open AI in this category. They weren't even in the survey, Eric, last quarter. So 52% of the ETR survey respondents indicated a positive sentiment toward open AI. I added up all the sort of different bars, we could double click on that. And then I got this inbound from Scott Stevenson of Deep Graham. He said "AI is recession-proof." I don't know if that's the case, but it's a good quote. So bring this back up and take us through this. Explain this chart for us, if you would. >> First of all, I like Scott's quote better than the Facebook one. I think that's some sour grapes. Meta just spent an insane amount of money on the Metaverse and that's a dud. Microsoft just spent money on open AI and it is hot, undoubtedly hot. We've only been in the field with our current ETS survey for a week. So my caveat is it's preliminary data, but I don't care if it's preliminary data. (laughing) We're getting a sneak peek here at what is the number one net sentiment and mindshare leader in the entire machine-learning AI sector within a week. It's beating Data- >> 600. 600 in. >> It's beating Databricks. And we all know Databricks is a huge established enterprise company, not only in machine-learning AI, but it's in the top 10 in the entire survey. We have over 400 vendors in this survey. It's number eight overall, already. In a week. This is not hype. This is real. And I could go on the NLP stuff for a while. Not only here are we seeing it in open AI and machine-learning and AI, but we're seeing NLP in security. It's huge in email security. It's completely transforming that area. It's one of the reasons I thought Palo might take Abnormal out. They're doing such a great job with NLP in this email side, and also in the data prep tools. NLP is going to take out data prep tools. If we have time, I'll discuss that later. But yeah, this is, to me this is a no-brainer, and we're already seeing it in the data. >> Yeah, John Furrier called, you know, the ChatGPT introduction. He said it reminded him of the Netscape moment, when we all first saw Netscape Navigator and went, "Wow, it really could be transformative." All right, number six, the cloud expands to supercloud as edge computing accelerates and CloudFlare is a big winner in 2023. We've reported obviously on cloud, multi-cloud, supercloud and CloudFlare, basically saying what multi-cloud should have been. We pulled this quote from Atif Kahn, who is the founder and CTO of Alkira, thanks, one of the inbounds, thank you. "In 2023, highly distributed IT environments "will become more the norm "as organizations increasingly deploy hybrid cloud, "multi-cloud and edge settings..." Eric, from one of your round tables, "If my sources from edge computing are coming "from the cloud, that means I have my workloads "running in the cloud. "There is no one better than CloudFlare," That's a senior director of IT architecture at a huge financial firm. And then your analysis shows CloudFlare really growing in pervasion, that sort of market presence in the dataset, dramatically, to near 20%, leading, I think you had told me that they're even ahead of Google Cloud in terms of momentum right now. >> That was probably the biggest shock to me in our January 2023 tesis, which covers the public companies in the cloud computing sector. CloudFlare has now overtaken GCP in overall spending, and I was shocked by that. It's already extremely pervasive in networking, of course, for the edge networking side, and also in security. This is the number one leader in SaaSi, web access firewall, DDoS, bot protection, by your definition of supercloud, which we just did a couple of weeks ago, and I really enjoyed that by the way Dave, I think CloudFlare is the one that fits your definition best, because it's bringing all of these aspects together, and most importantly, it's cloud agnostic. It does not need to rely on Azure or AWS to do this. It has its own cloud. So I just think it's, when we look at your definition of supercloud, CloudFlare is the poster child. >> You know, what's interesting about that too, is a lot of people are poo-pooing CloudFlare, "Ah, it's, you know, really kind of not that sophisticated." "You don't have as many tools," but to your point, you're can have those tools in the cloud, Cloudflare's doing serverless on steroids, trying to keep things really simple, doing a phenomenal job at, you know, various locations around the world. And they're definitely one to watch. Somebody put them on my radar (laughing) a while ago and said, "Dave, you got to do a breaking analysis on CloudFlare." And so I want to thank that person. I can't really name them, 'cause they work inside of a giant hyperscaler. But- (Eric laughing) (Dave chuckling) >> Real quickly, if I can from a competitive perspective too, who else is there? They've already taken share from Akamai, and Fastly is their really only other direct comp, and they're not there. And these guys are in poll position and they're the only game in town right now. I just, I don't see it slowing down. >> I thought one of your comments from your roundtable I was reading, one of the folks said, you know, CloudFlare, if my workloads are in the cloud, they are, you know, dominant, they said not as strong with on-prem. And so Akamai is doing better there. I'm like, "Okay, where would you want to be?" (laughing) >> Yeah, which one of those two would you rather be? >> Right? Anyway, all right, let's move on. Number seven, blockchain continues to look for a home in the enterprise, but devs will slowly begin to adopt in 2023. You know, blockchains have got a lot of buzz, obviously crypto is, you know, the killer app for blockchain. Senior IT architect in financial services from your, one of your insight roundtables said quote, "For enterprises to adopt a new technology, "there have to be proven turnkey solutions. "My experience in talking with my peers are, "blockchain is still an open-source component "where you have to build around it." Now I want to thank Ravi Mayuram, who's the CTO of Couchbase sent in, you know, one of the predictions, he said, "DevOps will adopt blockchain, specifically Ethereum." And he referenced actually in his email to me, Solidity, which is the programming language for Ethereum, "will be in every DevOps pro's playbook, "mirroring the boom in machine-learning. "Newer programming languages like Solidity "will enter the toolkits of devs." His point there, you know, Solidity for those of you don't know, you know, Bitcoin is not programmable. Solidity, you know, came out and that was their whole shtick, and they've been improving that, and so forth. But it, Eric, it's true, it really hasn't found its home despite, you know, the potential for smart contracts. IBM's pushing it, VMware has had announcements, and others, really hasn't found its way in the enterprise yet. >> Yeah, and I got to be honest, I don't think it's going to, either. So when we did our top trends series, this was basically chosen as an anti-prediction, I would guess, that it just continues to not gain hold. And the reason why was that first comment, right? It's very much a niche solution that requires a ton of custom work around it. You can't just plug and play it. And at the end of the day, let's be very real what this technology is, it's a database ledger, and we already have database ledgers in the enterprise. So why is this a priority to move to a different database ledger? It's going to be very niche cases. I like the CTO comment from Couchbase about it being adopted by DevOps. I agree with that, but it has to be a DevOps in a very specific use case, and a very sophisticated use case in financial services, most likely. And that's not across the entire enterprise. So I just think it's still going to struggle to get its foothold for a little bit longer, if ever. >> Great, thanks. Okay, let's move on. Number eight, AWS Databricks, Google Snowflake lead the data charge with Microsoft. Keeping it simple. So let's unpack this a little bit. This is the shared accounts peer position for, I pulled data platforms in for analytics, machine-learning and AI and database. So I could grab all these accounts or these vendors and see how they compare in those three sectors. Analytics, machine-learning and database. Snowflake and Databricks, you know, they're on a crash course, as you and I have talked about. They're battling to be the single source of truth in analytics. They're, there's going to be a big focus. They're already started. It's going to be accelerated in 2023 on open formats. Iceberg, Python, you know, they're all the rage. We heard about Iceberg at Snowflake Summit, last summer or last June. Not a lot of people had heard of it, but of course the Databricks crowd, who knows it well. A lot of other open source tooling. There's a company called DBT Labs, which you're going to talk about in a minute. George Gilbert put them on our radar. We just had Tristan Handy, the CEO of DBT labs, on at supercloud last week. They are a new disruptor in data that's, they're essentially making, they're API-ifying, if you will, KPIs inside the data warehouse and dramatically simplifying that whole data pipeline. So really, you know, the ETL guys should be shaking in their boots with them. Coming back to the slide. Google really remains focused on BigQuery adoption. Customers have complained to me that they would like to use Snowflake with Google's AI tools, but they're being forced to go to BigQuery. I got to ask Google about that. AWS continues to stitch together its bespoke data stores, that's gone down that "Right tool for the right job" path. David Foyer two years ago said, "AWS absolutely is going to have to solve that problem." We saw them start to do it in, at Reinvent, bringing together NoETL between Aurora and Redshift, and really trying to simplify those worlds. There's going to be more of that. And then Microsoft, they're just making it cheap and easy to use their stuff, you know, despite some of the complaints that we hear in the community, you know, about things like Cosmos, but Eric, your take? >> Yeah, my concern here is that Snowflake and Databricks are fighting each other, and it's allowing AWS and Microsoft to kind of catch up against them, and I don't know if that's the right move for either of those two companies individually, Azure and AWS are building out functionality. Are they as good? No they're not. The other thing to remember too is that AWS and Azure get paid anyway, because both Databricks and Snowflake run on top of 'em. So (laughing) they're basically collecting their toll, while these two fight it out with each other, and they build out functionality. I think they need to stop focusing on each other, a little bit, and think about the overall strategy. Now for Databricks, we know they came out first as a machine-learning AI tool. They were known better for that spot, and now they're really trying to play catch-up on that data storage compute spot, and inversely for Snowflake, they were killing it with the compute separation from storage, and now they're trying to get into the MLAI spot. I actually wouldn't be surprised to see them make some sort of acquisition. Frank Slootman has been a little bit quiet, in my opinion there. The other thing to mention is your comment about DBT Labs. If we look at our emerging technology survey, last survey when this came out, DBT labs, number one leader in that data integration space, I'm going to just pull it up real quickly. It looks like they had a 33% overall net sentiment to lead data analytics integration. So they are clearly growing, it's fourth straight survey consecutively that they've grown. The other name we're seeing there a little bit is Cribl, but DBT labs is by far the number one player in this space. >> All right. Okay, cool. Moving on, let's go to number nine. With Automation mixer resurgence in 2023, we're showing again data. The x axis is overlap or presence in the dataset, and the vertical axis is shared net score. Net score is a measure of spending momentum. As always, you've seen UI path and Microsoft Power Automate up until the right, that red line, that 40% line is generally considered elevated. UI path is really separating, creating some distance from Automation Anywhere, they, you know, previous quarters they were much closer. Microsoft Power Automate came on the scene in a big way, they loom large with this "Good enough" approach. I will say this, I, somebody sent me a results of a (indistinct) survey, which showed UiPath actually had more mentions than Power Automate, which was surprising, but I think that's not been the case in the ETR data set. We're definitely seeing a shift from back office to front soft office kind of workloads. Having said that, software testing is emerging as a mainstream use case, we're seeing ML and AI become embedded in end-to-end automations, and low-code is serving the line of business. And so this, we think, is going to increasingly have appeal to organizations in the coming year, who want to automate as much as possible and not necessarily, we've seen a lot of layoffs in tech, and people... You're going to have to fill the gaps with automation. That's a trend that's going to continue. >> Yep, agreed. At first that comment about Microsoft Power Automate having less citations than UiPath, that's shocking to me. I'm looking at my chart right here where Microsoft Power Automate was cited by over 60% of our entire survey takers, and UiPath at around 38%. Now don't get me wrong, 38% pervasion's fantastic, but you know you're not going to beat an entrenched Microsoft. So I don't really know where that comment came from. So UiPath, looking at it alone, it's doing incredibly well. It had a huge rebound in its net score this last survey. It had dropped going through the back half of 2022, but we saw a big spike in the last one. So it's got a net score of over 55%. A lot of people citing adoption and increasing. So that's really what you want to see for a name like this. The problem is that just Microsoft is doing its playbook. At the end of the day, I'm going to do a POC, why am I going to pay more for UiPath, or even take on another separate bill, when we know everyone's consolidating vendors, if my license already includes Microsoft Power Automate? It might not be perfect, it might not be as good, but what I'm hearing all the time is it's good enough, and I really don't want another invoice. >> Right. So how does UiPath, you know, and Automation Anywhere, how do they compete with that? Well, the way they compete with it is they got to have a better product. They got a product that's 10 times better. You know, they- >> Right. >> they're not going to compete based on where the lowest cost, Microsoft's got that locked up, or where the easiest to, you know, Microsoft basically give it away for free, and that's their playbook. So that's, you know, up to UiPath. UiPath brought on Rob Ensslin, I've interviewed him. Very, very capable individual, is now Co-CEO. So he's kind of bringing that adult supervision in, and really tightening up the go to market. So, you know, we know this company has been a rocket ship, and so getting some control on that and really getting focused like a laser, you know, could be good things ahead there for that company. Okay. >> One of the problems, if I could real quick Dave, is what the use cases are. When we first came out with RPA, everyone was super excited about like, "No, UiPath is going to be great for super powerful "projects, use cases." That's not what RPA is being used for. As you mentioned, it's being used for mundane tasks, so it's not automating complex things, which I think UiPath was built for. So if you were going to get UiPath, and choose that over Microsoft, it's going to be 'cause you're doing it for more powerful use case, where it is better. But the problem is that's not where the enterprise is using it. The enterprise are using this for base rote tasks, and simply, Microsoft Power Automate can do that. >> Yeah, it's interesting. I've had people on theCube that are both Microsoft Power Automate customers and UiPath customers, and I've asked them, "Well you know, "how do you differentiate between the two?" And they've said to me, "Look, our users and personal productivity users, "they like Power Automate, "they can use it themselves, and you know, "it doesn't take a lot of, you know, support on our end." The flip side is you could do that with UiPath, but like you said, there's more of a focus now on end-to-end enterprise automation and building out those capabilities. So it's increasingly a value play, and that's going to be obviously the challenge going forward. Okay, my last one, and then I think you've got some bonus ones. Number 10, hybrid events are the new category. Look it, if I can get a thousand inbounds that are largely self-serving, I can do my own here, 'cause we're in the events business. (Eric chuckling) Here's the prediction though, and this is a trend we're seeing, the number of physical events is going to dramatically increase. That might surprise people, but most of the big giant events are going to get smaller. The exception is AWS with Reinvent, I think Snowflake's going to continue to grow. So there are examples of physical events that are growing, but generally, most of the big ones are getting smaller, and there's going to be many more smaller intimate regional events and road shows. These micro-events, they're going to be stitched together. Digital is becoming a first class citizen, so people really got to get their digital acts together, and brands are prioritizing earned media, and they're beginning to build their own news networks, going direct to their customers. And so that's a trend we see, and I, you know, we're right in the middle of it, Eric, so you know we're going to, you mentioned RSA, I think that's perhaps going to be one of those crazy ones that continues to grow. It's shrunk, and then it, you know, 'cause last year- >> Yeah, it did shrink. >> right, it was the last one before the pandemic, and then they sort of made another run at it last year. It was smaller but it was very vibrant, and I think this year's going to be huge. Global World Congress is another one, we're going to be there end of Feb. That's obviously a big big show, but in general, the brands and the technology vendors, even Oracle is going to scale down. I don't know about Salesforce. We'll see. You had a couple of bonus predictions. Quantum and maybe some others? Bring us home. >> Yeah, sure. I got a few more. I think we touched upon one, but I definitely think the data prep tools are facing extinction, unfortunately, you know, the Talons Informatica is some of those names. The problem there is that the BI tools are kind of including data prep into it already. You know, an example of that is Tableau Prep Builder, and then in addition, Advanced NLP is being worked in as well. ThoughtSpot, Intelius, both often say that as their selling point, Tableau has Ask Data, Click has Insight Bot, so you don't have to really be intelligent on data prep anymore. A regular business user can just self-query, using either the search bar, or even just speaking into what it needs, and these tools are kind of doing the data prep for it. I don't think that's a, you know, an out in left field type of prediction, but it's the time is nigh. The other one I would also state is that I think knowledge graphs are going to break through this year. Neo4j in our survey is growing in pervasion in Mindshare. So more and more people are citing it, AWS Neptune's getting its act together, and we're seeing that spending intentions are growing there. Tiger Graph is also growing in our survey sample. I just think that the time is now for knowledge graphs to break through, and if I had to do one more, I'd say real-time streaming analytics moves from the very, very rich big enterprises to downstream, to more people are actually going to be moving towards real-time streaming, again, because the data prep tools and the data pipelines have gotten easier to use, and I think the ROI on real-time streaming is obviously there. So those are three that didn't make the cut, but I thought deserved an honorable mention. >> Yeah, I'm glad you did. Several weeks ago, we did an analyst prediction roundtable, if you will, a cube session power panel with a number of data analysts and that, you know, streaming, real-time streaming was top of mind. So glad you brought that up. Eric, as always, thank you very much. I appreciate the time you put in beforehand. I know it's been crazy, because you guys are wrapping up, you know, the last quarter survey in- >> Been a nuts three weeks for us. (laughing) >> job. I love the fact that you're doing, you know, the ETS survey now, I think it's quarterly now, right? Is that right? >> Yep. >> Yep. So that's phenomenal. >> Four times a year. I'll be happy to jump on with you when we get that done. I know you were really impressed with that last time. >> It's unbelievable. This is so much data at ETR. Okay. Hey, that's a wrap. Thanks again. >> Take care Dave. Good seeing you. >> All right, many thanks to our team here, Alex Myerson as production, he manages the podcast force. Ken Schiffman as well is a critical component of our East Coast studio. Kristen Martin and Cheryl Knight help get the word out on social media and in our newsletters. And Rob Hoof is our editor-in-chief. He's at siliconangle.com. He's just a great editing for us. Thank you all. Remember all these episodes that are available as podcasts, wherever you listen, podcast is doing great. Just search "Breaking analysis podcast." Really appreciate you guys listening. I publish each week on wikibon.com and siliconangle.com, or you can email me directly if you want to get in touch, david.vellante@siliconangle.com. That's how I got all these. I really appreciate it. I went through every single one with a yellow highlighter. It took some time, (laughing) but I appreciate it. You could DM me at dvellante, or comment on our LinkedIn post and please check out etr.ai. Its data is amazing. Best survey data in the enterprise tech business. This is Dave Vellante for theCube Insights, powered by ETR. Thanks for watching, and we'll see you next time on "Breaking Analysis." (upbeat music beginning) (upbeat music ending)

Published Date : Jan 29 2023

SUMMARY :

insights from the Cube and ETR, do for the community, Dave, good to see you. actually come back to me if you would. It just stays at the top. the most aggressive to cut. that have the most to lose What's the primary method still leads the way, you know, So in addition to what we're seeing here, And so I actually thank you I went through it for you. I'm going to ask you to explain and they're certainly not going to get it to you in a zero trust way. So all of that is the One is just the number of So come back to me in 12 So 52% of the ETR survey amount of money on the Metaverse and also in the data prep tools. the cloud expands to the biggest shock to me "Ah, it's, you know, really and Fastly is their really the folks said, you know, for a home in the enterprise, Yeah, and I got to be honest, in the community, you know, and I don't know if that's the right move and the vertical axis is shared net score. So that's really what you want Well, the way they compete So that's, you know, One of the problems, if and that's going to be obviously even Oracle is going to scale down. and the data pipelines and that, you know, Been a nuts three I love the fact I know you were really is so much data at ETR. and we'll see you next time

ENTITIES

Entity	Category	Confidence
Alex Myerson	PERSON	0.99+
Eric	PERSON	0.99+
Eric Bradley	PERSON	0.99+
Cisco	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Rob Hoof	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
10	QUANTITY	0.99+
Ravi Mayuram	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
George Gilbert	PERSON	0.99+
Ken Schiffman	PERSON	0.99+
AWS	ORGANIZATION	0.99+
Tristan Handy	PERSON	0.99+
Dave	PERSON	0.99+
Atif Kahn	PERSON	0.99+
November	DATE	0.99+
Frank Slootman	PERSON	0.99+
APAC	ORGANIZATION	0.99+
Zscaler	ORGANIZATION	0.99+
Palo	ORGANIZATION	0.99+
David Foyer	PERSON	0.99+
February	DATE	0.99+
January 2023	DATE	0.99+
DBT Labs	ORGANIZATION	0.99+
October	DATE	0.99+
Rob Ensslin	PERSON	0.99+
Scott Stevenson	PERSON	0.99+
John Furrier	PERSON	0.99+
69%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
CrowdStrike	ORGANIZATION	0.99+
4.6%	QUANTITY	0.99+
10 times	QUANTITY	0.99+
2023	DATE	0.99+
Scott	PERSON	0.99+
1,181 responses	QUANTITY	0.99+
Palo Alto	ORGANIZATION	0.99+
third year	QUANTITY	0.99+
Boston	LOCATION	0.99+
Alex	PERSON	0.99+
thousands	QUANTITY	0.99+
OneTrust	ORGANIZATION	0.99+
45%	QUANTITY	0.99+
33%	QUANTITY	0.99+
Databricks	ORGANIZATION	0.99+
two reasons	QUANTITY	0.99+
Palo Alto	LOCATION	0.99+
last year	DATE	0.99+
BeyondTrust	ORGANIZATION	0.99+
7%	QUANTITY	0.99+
IBM	ORGANIZATION	0.99+

Is Data Mesh the Next Killer App for Supercloud?

(upbeat music) >> Welcome back to our Supercloud 2 event live coverage here of stage performance in Palo Alto syndicating around the world. I'm John Furrier with Dave Vellante. We got exclusive news and a scoop here for SiliconANGLE in theCUBE. Zhamak Dehghani, creator of data mesh has formed a new company called Nextdata.com, Nextdata. She's a cube alumni and contributor to our supercloud initiative, as well as our coverage and Breaking Analysis with Dave Vellante on data, the killer app for supercloud. Zhamak, great to see you. Thank you for coming into the studio and congratulations on your newly formed venture and continued success on the data mesh. >> Thank you so much. It's great to be here. Great to see you in person. >> Dave: Yeah, finally. >> Wonderful. Your contributions to the data conversation has been well documented certainly by us and others in the industry. Data mesh taking the world by storm. Some people are debating it, throwing cold water on it. Some are thinking it's the next big thing. Tell us about the data mesh, super data apps that are emerging out of cloud. >> I mean, data mesh, as you said, the pain point that it surface were universal. Everybody said, "Oh, why didn't I think of that?" It was just an obvious next step and people are approaching it, implementing it. I guess the last few years I've been involved in many of those implementations and I guess supercloud is somewhat a prerequisite for it because it's data mesh and building applications using data mesh is about sharing data responsibly across boundaries. And those boundaries include organizational boundaries, cloud technology boundaries, and trust boundaries. >> I want to bring that up because your venture, Nextdata, which is new just formed. Tell us about that. What wave is that riding? What specifically are you targeting? What's the pain point? >> Absolutely. Yes, so Nextdata is the result of, I suppose the pains that I suffered from implementing data mesh for many of the organizations. Basically a lot of organizations that I've worked with they want decentralized data. So they really embrace this idea of decentralized ownership of the data, but yet they want interconnectivity through standard APIs, yet they want discoverability and governance. So they want to have policies implemented, they want to govern that data, they want to be able to discover that data, and yet they want to decentralize it. And we do that with a developer experience that is easy and native to a generalist developer. So we try to find the, I guess the common denominator that solves those problems and enables that developer experience for data sharing. >> Since you just announced the news, what's been the reaction? >> I just announced the news right now, so what's the reaction? >> But people in the industry know you did a lot of work in the area. What have been some of the feedback on the new venture in terms of the approach, the customers, problem? >> Yeah, so we've been in stealth mode so we haven't publicly talked about it, but folks that have been close to us, in fact have reached that we already have implementations of our pilot platform with early customers, which is super exciting. And we going to have multiple of those. Of course, we're a tiny, tiny company. We can have many of those, but we are going to have multiple pilot implementations of our platform in real world where real global large scale organizations that have real world problems. So we're not going to build our platform in vacuum. And that's what's happening right now. >> Zhamak, when I think about your role at ThoughtWorks, you had a very wide observation space with a number of clients, helping them implement data mesh and other things as well prior to your data mesh initiative. But when I look at data mesh, at least the ones that I've seen, they're very narrow. I think of JPMC, I think of HelloFresh. They're generally, obviously not surprising, they don't include the big vision of inclusivity across clouds, across different data storage. But it seems like people are having to go through some gymnastics to get to the organizational reality of decentralizing data and at least pushing data ownership to the line of business. How are you approaching, or are you approaching solving that problem? Are you taking a narrow slice? What can you tell us about Nextdata? >> Yeah, absolutely. Gymnastics, the cute word to describe what the organizations have to go through. And one of those problems is that the data as you know resides on different platforms, it's owned by different people, is processed by pipelines that who knows who owns them. So there's this very disparate and disconnected set of technologies that were very useful for when we thought about data and processing as a centralized problem. But when you think about data as a decentralized problem the cost of integration of these technologies in a cohesive developer experience is what's missing. And we want to focus on that cohesive end-to-end developer experience to share data responsibly in these autonomous units. We call them data products, I guess in data mesh. That constitutes computation. That governs that data policies, discoverability. So I guess, I heard this expression in the last talks that you can have your cake and eat it too. So we want people have their cakes, which is data in different places, decentralization, and eat it too, which is interconnected access to it. So we start with standardizing and codifying this idea of a data product container that encapsulates data computation APIs to get to it in a technology agnostic way, in an open way. And then sit on top and use existing tech, Snowflake, Databricks, whatever exists, the millions of dollars of investments that companies have made, sit on top of those but create this cohesive, integrated experience where data product is a first class primitive. And that's really key here. The language and the modeling that we use is really native to data mesh, which is that I'm building a data product I'm sharing a data product, and that encapsulates I'm providing metadata about this. I'm providing computation that's constantly changing the data. I'm providing the API for that. So we we're trying to kind of codify and create a new developer experience based on that. And developer, both from provider side and user side, connected to peer-to-peer data sharing with data product as a primitive first class concept. >> So the idea would be developers would build applications leveraging those data products, which are discoverable and governed. Now today you see some companies, take a Snowflake for example, attempting to do that within their own little walled garden. They even at one point used the term mesh. I don't know if they pull back on that. And then they became aware of some of your work. But a lot of the things that they're doing within their little insulated environment support that governance, they're building out an ecosystem. What's different in your vision? >> Exactly. So we realized that, and this is a reality, like you go to organizations, they have a Snowflake and half of the organization happily operates on Snowflake. And on the other half, "oh, we are on Bare infrastructure on AWS or we are on Databricks." This is the reality. This supercloud that's written up here, it's about working across boundaries of technology. So we try to embrace that. And even for our own technology with the way we're building it, we say, "Okay, nobody's going to use Nextdata, data mesh operating system. People will have different platforms." So you have to build with openness in mind and in case of Snowflake, I think, they have very, I'm sure very happy customers as long as customers can be on Snowflake. But once you cross that boundary of platforms then that becomes a problem. And we try to keep that in mind in our solution. >> So it's worth reviewing that basically the concept of data mesh is that whether you're a data lake or a data warehouse, an S3 bucket, an Oracle database as well, they should be inclusive inside of the data. >> We did a session with AWS on the startup showcase, data as code. And remember I wrote a blog post in 2007 called "Data as the New Developer Kit" back then we used to call them developer kits if you remember. And that we said at that time, whoever can code data will have a competitive advantage. >> Aren't the machines going to be doing that? Didn't we just hear that? >> Well, we have. Hey, Siri. Hey, Cube, find me that best video for data mesh. There it is. But this is the point, like what's happening is that now data has to be addressable. for machines and for coding because as you need to call the data. So the question is how do you manage the complexity of big things as promiscuous as possible, making it available, as well as then governing it? Because it's a trade off. The more you make open, the better the machine learning. But yet the governance issue, so this is the, you need an OS to handle this maybe. >> Yes. So yes, well we call, our mental model for our platform is an OS operating system. Operating systems have shown us how you can abstract what's complex and take care of a lot of complexities, but yet provide an open and dynamic enough interface. So we think about it that way. Just, we try to solve the problem of policies live with the data, an enforcement of the policies happens at the most granular level, which is in this concept of the data product. And that would happen whether you read, write or access a data product. But we can never imagine what are these policies could be. So our thinking is we should have a policy, open policy framework that can allow organizations write their own policy drivers and policy definitions and encode it and encapsulated in this data product container. But I'm not going to fool myself to say that, that's going to solve the problem that you just described. I think we are in this, I don't know, if I look into my crystal ball, what I think might happen is that right now the primitives that we work with to train machine learning model are still bits and bytes and data. They're fields, rows, columns and that creates quite a large surface area and attack area for privacy of the data. So perhaps one of the trends that we might see is this evolution of data APIs to become more and more computational aware to bring the compute to the data to reduce that surface area. So you can really leave the control of the data to the sovereign owners of that data. So that data product. So I think that evolution of our data APIs perhaps will become more and more computational. So you describe what you want and the data owner decides how to manage. >> That's interesting, Dave, 'cause it's almost like we just talked about ChatGPT in the last segment we had with you. It was a machine learning have been around the industry. It's almost as if you're starting to see reason come into, the data reasoning is like starting to see not just metadata. Using the data to reason so that you don't have to expose the raw data. So almost like a, I won't say curation layer, but an intelligence layer. >> Zhamak: Exactly. >> Can you share your vision on that? 'Cause that seems to be where the dots are connecting. >> Yes, perhaps further into the future because just from where we stand, we have to create still that bridge of familiarity between that future and present. So we are still in that bridge making mode. However, by just the basic notion of saying, "I'm going to put an API in front of my data." And that API today might be as primitive as a level of indirection, as in you tell me what you want, tell me who you are, let me go process that, all the policies and lineage and insert all of this intelligence that need to happen. And then today, I will still give you a file. But by just defining that API and standardizing it now we have this amazing extension point that we can say, "Well, the next revision of this API, you not just tell me who you are, but you actually tell me what intelligence you're after. What's a logic that I need to go and now compute on your API?" And you can evolve that. Now you have a point of evolution to this very futuristic, I guess, future where you just described the question that you're asking from the ChatGPT. >> Well, this is the supercloud, go ahead, Dave. >> I have a question from a fan, I got to get it in. It's George Gilbert. And so his question is, you're blowing away the way we synchronize data from operational systems to the data stack to applications. So the concern that he has and he wants your feedback on this, is the data product app devs get exposed to more complexity with respect to moving data between data products or maybe it's attributes between data products? How do you respond to that? How do you see? Is that a problem? Is that something that is overstated or do you have an answer for that? >> Absolutely. So I think there's a sweet spot in getting data developers, data product developers closer to the app, but yet not overburdening them with the complexity of the application and application logic and yet reducing their cognitive load by localizing what they need to know about, which is that domain where they're operating within. Because what's happening right now? What's happening right now is that data engineers with, a ton of empathy for them for their high threshold of pain that they can deal with, they have been centralized, they've put into the data team, and they have been given this unbelievable task of make meaning out of data, put semantic over it, curate it, cleans it, and so on. So what we are saying is that get those folks embedded into the domain closer to the application developers. These are still separately moving units. Your app and your data products are independent, but yet tightly closed with each other, tightly coupled with each other based on the context of the domain. So reduce cognitive load by localizing what they need to know about to the domain, get them closer to the application, but yet have them separate from app because app provides a very different service. Transactional data for my e-commerce transaction. Data product provides a very different service. Longitudinal data for the variety of this intelligent analysis that I can do on the data. But yet it's all within the domain of e-commerce or sales or whatnot. >> It's a lot of decoupling and coupling create that cohesiveness architecture. So I have to ask you, this is an interesting question 'cause it came up on theCUBE all last year. Back on the old server data center days and cloud, SRE, Google coined the term, site reliability engineer, for someone to look over the hundreds of thousands of servers. We asked the question to data engineering community who have been suffering, by the way, I agree. Is there an SRE like role for data? Because in a way data engineering, that platform engineer, they are like the SRE for data. In other words managing the large scale to enable automation and cell service. What's your thoughts and reaction to that? >> Yes, exactly. So maybe we go through that history of how SRE came to be. So we had the first DevOps movement, which was remove the wall between dev and ops and bring them together. So you have one unit of one cross-functional units of the organization that's responsible for you build it, you run it. So then there is no, I'm going to just shoot my application over the wall for somebody else to manage it. So we did that and then we said, okay, there is a ton, as we decentralized and had these many microservices running around, we had to create a layer that abstracted a lot of the complexity around running now a lot or monitoring, observing, and running a lot while giving autonomy to this cross-functional team. And that's where the SRE, a new generation of engineers came to exist. So I think if I just look at. >> Hence, Kubernetes. >> Hence, hence, exactly. Hence, chaos engineering. Hence, embracing the complexity and messiness. And putting engineering discipline to embrace that and yet give a cohesive and high integrity experience of those systems. So I think if we look at that evolution, perhaps something like that is happening by bringing data and apps closer and make them these domain-oriented data product teams or domain-oriented cross-functional teams full stop and still have a very advanced maybe at the platform level, infrastructure level operational team that they're not busy doing two jobs, which is taking care of domains and the infrastructure, but they're building infrastructure that is embracing that complexity, interconnectivity of this data process. >> So you see similarities? >> I see, absolutely. But I feel like we're probably in a more early days of that movement. >> So it's a data DevOps kind of thing happening where scales happening. It's good things are happening, yet a little bit fast and loose with some complexities to clean up. >> Yes. This is a different restructure. As you said, the job of this industry as a whole, an architect, is decompose recompose, decompose recompose in new way and now we're like decomposing centralized team, recomposing them as domains. >> So is data mesh the killer app for supercloud? >> You had to do this to me. >> Sorry, I couldn't resist. >> I know. Of course you want me to say this. >> Yes. >> Yes, of course. I mean, supercloud, I think it's really, the terminology supercloud, open cloud, but I think in spirits of it this embracing of diversity and giving autonomy for people to make decisions for what's right for them and not yet lock them in. I think just embracing that is baked into how data mesh assume the world would work. >> Well, thank you so much for coming on Supercloud 2. We really appreciate it. Data has driven this conversation. Your success of data mesh has really opened up the conversation and exposed the slow moving data industry. >> Dave: Been a great catalyst. >> That's now going well. We can move faster. So thanks for coming on. >> Thank you for hosting me. It was wonderful. >> Supercloud 2 live here in Palo Alto, our stage performance. I'm John Furrier with Dave Vellante. We'll back with more after this short break. Stay with us all day for Supercloud 2. (upbeat music)

Published Date : Jan 25 2023

SUMMARY :

and continued success on the data mesh. Great to see you in person. and others in the industry. I guess the last few What's the pain point? for many of the organizations. But people in the industry know you did but folks that have been close to us, at least the ones that I've is that the data as you know But a lot of the things that they're doing and half of the organization that basically the concept of data mesh And that we said at that time, is that now data has to be addressable. and the data owner decides how to manage. the data reasoning is like starting to see 'Cause that seems to be where What's a logic that I need to go Well, this is the So the concern that he has into the domain closer to We asked the question to of the organization that's responsible So I think if we look at that evolution, in a more early days of that movement. So it's a data DevOps As you said, the job of Of course you want me to say this. assume the world would work. the conversation and exposed So thanks for coming on. Thank you for hosting me. I'm John Furrier with Dave Vellante.

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Dave	PERSON	0.99+
AWS	ORGANIZATION	0.99+
2007	DATE	0.99+
George Gilbert	PERSON	0.99+
Zhamak Dehghani	PERSON	0.99+
Nextdata	ORGANIZATION	0.99+
Zhamak	PERSON	0.99+
Palo Alto	LOCATION	0.99+
Google	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
one	QUANTITY	0.99+
Nextdata.com	ORGANIZATION	0.99+
two jobs	QUANTITY	0.99+
JPMC	ORGANIZATION	0.99+
today	DATE	0.99+
HelloFresh	ORGANIZATION	0.99+
ThoughtWorks	ORGANIZATION	0.99+
last year	DATE	0.99+
Supercloud 2	EVENT	0.99+
Oracle	ORGANIZATION	0.98+
first	QUANTITY	0.98+
Siri	TITLE	0.98+
Cube	PERSON	0.98+
Databricks	ORGANIZATION	0.98+
Snowflake	ORGANIZATION	0.97+
Supercloud	ORGANIZATION	0.97+
both	QUANTITY	0.97+
one unit	QUANTITY	0.97+
Snowflake	TITLE	0.96+
SRE	TITLE	0.95+
millions of dollars	QUANTITY	0.94+
first class	QUANTITY	0.94+
hundreds of thousands of servers	QUANTITY	0.92+
supercloud	ORGANIZATION	0.92+
one point	QUANTITY	0.92+
Supercloud 2	TITLE	0.89+
ChatGPT	ORGANIZATION	0.81+
half	QUANTITY	0.81+
Data Mesh the Next Killer App	TITLE	0.78+
supercloud	TITLE	0.75+
a ton	QUANTITY	0.73+
Supercloud 2	ORGANIZATION	0.72+
SiliconANGLE	ORGANIZATION	0.7+
DevOps	TITLE	0.66+
Snowflake	EVENT	0.59+
S3	TITLE	0.54+
last	DATE	0.54+
supercloud	EVENT	0.48+
Kubernetes	TITLE	0.47+

Breaking Analysis: ChatGPT Won't Give OpenAI First Mover Advantage

>> From theCUBE Studios in Palo Alto in Boston, bringing you data-driven insights from theCUBE and ETR. This is Breaking Analysis with Dave Vellante. >> OpenAI The company, and ChatGPT have taken the world by storm. Microsoft reportedly is investing an additional 10 billion dollars into the company. But in our view, while the hype around ChatGPT is justified, we don't believe OpenAI will lock up the market with its first mover advantage. Rather, we believe that success in this market will be directly proportional to the quality and quantity of data that a technology company has at its disposal, and the compute power that it could deploy to run its system. Hello and welcome to this week's Wikibon CUBE insights, powered by ETR. In this Breaking Analysis, we unpack the excitement around ChatGPT, and debate the premise that the company's early entry into the space may not confer winner take all advantage to OpenAI. And to do so, we welcome CUBE collaborator, alum, Sarbjeet Johal, (chuckles) and John Furrier, co-host of the Cube. Great to see you Sarbjeet, John. Really appreciate you guys coming to the program. >> Great to be on. >> Okay, so what is ChatGPT? Well, actually we asked ChatGPT, what is ChatGPT? So here's what it said. ChatGPT is a state-of-the-art language model developed by OpenAI that can generate human-like text. It could be fine tuned for a variety of language tasks, such as conversation, summarization, and language translation. So I asked it, give it to me in 50 words or less. How did it do? Anything to add? >> Yeah, think it did good. It's large language model, like previous models, but it started applying the transformers sort of mechanism to focus on what prompt you have given it to itself. And then also the what answer it gave you in the first, sort of, one sentence or two sentences, and then introspect on itself, like what I have already said to you. And so just work on that. So it it's self sort of focus if you will. It does, the transformers help the large language models to do that. >> So to your point, it's a large language model, and GPT stands for generative pre-trained transformer. >> And if you put the definition back up there again, if you put it back up on the screen, let's see it back up. Okay, it actually missed the large, word large. So one of the problems with ChatGPT, it's not always accurate. It's actually a large language model, and it says state of the art language model. And if you look at Google, Google has dominated AI for many times and they're well known as being the best at this. And apparently Google has their own large language model, LLM, in play and have been holding it back to release because of backlash on the accuracy. Like just in that example you showed is a great point. They got almost right, but they missed the key word. >> You know what's funny about that John, is I had previously asked it in my prompt to give me it in less than a hundred words, and it was too long, I said I was too long for Breaking Analysis, and there it went into the fact that it's a large language model. So it largely, it gave me a really different answer the, for both times. So, but it's still pretty amazing for those of you who haven't played with it yet. And one of the best examples that I saw was Ben Charrington from This Week In ML AI podcast. And I stumbled on this thanks to Brian Gracely, who was listening to one of his Cloudcasts. Basically what Ben did is he took, he prompted ChatGPT to interview ChatGPT, and he simply gave the system the prompts, and then he ran the questions and answers into this avatar builder and sped it up 2X so it didn't sound like a machine. And voila, it was amazing. So John is ChatGPT going to take over as a cube host? >> Well, I was thinking, we get the questions in advance sometimes from PR people. We should actually just plug it in ChatGPT, add it to our notes, and saying, "Is this good enough for you? Let's ask the real question." So I think, you know, I think there's a lot of heavy lifting that gets done. I think the ChatGPT is a phenomenal revolution. I think it highlights the use case. Like that example we showed earlier. It gets most of it right. So it's directionally correct and it feels like it's an answer, but it's not a hundred percent accurate. And I think that's where people are seeing value in it. Writing marketing, copy, brainstorming, guest list, gift list for somebody. Write me some lyrics to a song. Give me a thesis about healthcare policy in the United States. It'll do a bang up job, and then you got to go in and you can massage it. So we're going to do three quarters of the work. That's why plagiarism and schools are kind of freaking out. And that's why Microsoft put 10 billion in, because why wouldn't this be a feature of Word, or the OS to help it do stuff on behalf of the user. So linguistically it's a beautiful thing. You can input a string and get a good answer. It's not a search result. >> And we're going to get your take on on Microsoft and, but it kind of levels the playing- but ChatGPT writes better than I do, Sarbjeet, and I know you have some good examples too. You mentioned the Reed Hastings example. >> Yeah, I was listening to Reed Hastings fireside chat with ChatGPT, and the answers were coming as sort of voice, in the voice format. And it was amazing what, he was having very sort of philosophy kind of talk with the ChatGPT, the longer sentences, like he was going on, like, just like we are talking, he was talking for like almost two minutes and then ChatGPT was answering. It was not one sentence question, and then a lot of answers from ChatGPT and yeah, you're right. I, this is our ability. I've been thinking deep about this since yesterday, we talked about, like, we want to do this segment. The data is fed into the data model. It can be the current data as well, but I think that, like, models like ChatGPT, other companies will have those too. They can, they're democratizing the intelligence, but they're not creating intelligence yet, definitely yet I can say that. They will give you all the finite answers. Like, okay, how do you do this for loop in Java, versus, you know, C sharp, and as a programmer you can do that, in, but they can't tell you that, how to write a new algorithm or write a new search algorithm for you. They cannot create a secretive code for you to- >> Not yet. >> Have competitive advantage. >> Not yet, not yet. >> but you- >> Can Google do that today? >> No one really can. The reasoning side of the data is, we talked about at our Supercloud event, with Zhamak Dehghani who's was CEO of, now of Nextdata. This next wave of data intelligence is going to come from entrepreneurs that are probably cross discipline, computer science and some other discipline. But they're going to be new things, for example, data, metadata, and data. It's hard to do reasoning like a human being, so that needs more data to train itself. So I think the first gen of this training module for the large language model they have is a corpus of text. Lot of that's why blog posts are, but the facts are wrong and sometimes out of context, because that contextual reasoning takes time, it takes intelligence. So machines need to become intelligent, and so therefore they need to be trained. So you're going to start to see, I think, a lot of acceleration on training the data sets. And again, it's only as good as the data you can get. And again, proprietary data sets will be a huge winner. Anyone who's got a large corpus of content, proprietary content like theCUBE or SiliconANGLE as a publisher will benefit from this. Large FinTech companies, anyone with large proprietary data will probably be a big winner on this generative AI wave, because it just, it will eat that up, and turn that back into something better. So I think there's going to be a lot of interesting things to look at here. And certainly productivity's going to be off the charts for vanilla and the internet is going to get swarmed with vanilla content. So if you're in the content business, and you're an original content producer of any kind, you're going to be not vanilla, so you're going to be better. So I think there's so much at play Dave (indistinct). >> I think the playing field has been risen, so we- >> Risen and leveled? >> Yeah, and leveled to certain extent. So it's now like that few people as consumers, as consumers of AI, we will have a advantage and others cannot have that advantage. So it will be democratized. That's, I'm sure about that. But if you take the example of calculator, when the calculator came in, and a lot of people are, "Oh, people can't do math anymore because calculator is there." right? So it's a similar sort of moment, just like a calculator for the next level. But, again- >> I see it more like open source, Sarbjeet, because like if you think about what ChatGPT's doing, you do a query and it comes from somewhere the value of a post from ChatGPT is just a reuse of AI. The original content accent will be come from a human. So if I lay out a paragraph from ChatGPT, did some heavy lifting on some facts, I check the facts, save me about maybe- >> Yeah, it's productive. >> An hour writing, and then I write a killer two, three sentences of, like, sharp original thinking or critical analysis. I then took that body of work, open source content, and then laid something on top of it. >> And Sarbjeet's example is a good one, because like if the calculator kids don't do math as well anymore, the slide rule, remember we had slide rules as kids, remember we first started using Waze, you know, we were this minority and you had an advantage over other drivers. Now Waze is like, you know, social traffic, you know, navigation, everybody had, you know- >> All the back roads are crowded. >> They're car crowded. (group laughs) Exactly. All right, let's, let's move on. What about this notion that futurist Ray Amara put forth and really Amara's Law that we're showing here, it's, the law is we, you know, "We tend to overestimate the effect of technology in the short run and underestimate it in the long run." Is that the case, do you think, with ChatGPT? What do you think Sarbjeet? >> I think that's true actually. There's a lot of, >> We don't debate this. >> There's a lot of awe, like when people see the results from ChatGPT, they say what, what the heck? Like, it can do this? But then if you use it more and more and more, and I ask the set of similar question, not the same question, and it gives you like same answer. It's like reading from the same bucket of text in, the interior read (indistinct) where the ChatGPT, you will see that in some couple of segments. It's very, it sounds so boring that the ChatGPT is coming out the same two sentences every time. So it is kind of good, but it's not as good as people think it is right now. But we will have, go through this, you know, hype sort of cycle and get realistic with it. And then in the long term, I think it's a great thing in the short term, it's not something which will (indistinct) >> What's your counter point? You're saying it's not. >> I, no I think the question was, it's hyped up in the short term and not it's underestimated long term. That's what I think what he said, quote. >> Yes, yeah. That's what he said. >> Okay, I think that's wrong with this, because this is a unique, ChatGPT is a unique kind of impact and it's very generational. People have been comparing it, I have been comparing to the internet, like the web, web browser Mosaic and Netscape, right, Navigator. I mean, I clearly still remember the days seeing Navigator for the first time, wow. And there weren't not many sites you could go to, everyone typed in, you know, cars.com, you know. >> That (indistinct) wasn't that overestimated, the overhyped at the beginning and underestimated. >> No, it was, it was underestimated long run, people thought. >> But that Amara's law. >> That's what is. >> No, they said overestimated? >> Overestimated near term underestimated- overhyped near term, underestimated long term. I got, right I mean? >> Well, I, yeah okay, so I would then agree, okay then- >> We were off the charts about the internet in the early days, and it actually exceeded our expectations. >> Well there were people who were, like, poo-pooing it early on. So when the browser came out, people were like, "Oh, the web's a toy for kids." I mean, in 1995 the web was a joke, right? So '96, you had online populations growing, so you had structural changes going on around the browser, internet population. And then that replaced other things, direct mail, other business activities that were once analog then went to the web, kind of read only as you, as we always talk about. So I think that's a moment where the hype long term, the smart money, and the smart industry experts all get the long term. And in this case, there's more poo-pooing in the short term. "Ah, it's not a big deal, it's just AI." I've heard many people poo-pooing ChatGPT, and a lot of smart people saying, "No this is next gen, this is different and it's only going to get better." So I think people are estimating a big long game on this one. >> So you're saying it's bifurcated. There's those who say- >> Yes. >> Okay, all right, let's get to the heart of the premise, and possibly the debate for today's episode. Will OpenAI's early entry into the market confer sustainable competitive advantage for the company. And if you look at the history of tech, the technology industry, it's kind of littered with first mover failures. Altair, IBM, Tandy, Commodore, they and Apple even, they were really early in the PC game. They took a backseat to Dell who came in the scene years later with a better business model. Netscape, you were just talking about, was all the rage in Silicon Valley, with the first browser, drove up all the housing prices out here. AltaVista was the first search engine to really, you know, index full text. >> Owned by Dell, I mean DEC. >> Owned by Digital. >> Yeah, Digital Equipment >> Compaq bought it. And of course as an aside, Digital, they wanted to showcase their hardware, right? Their super computer stuff. And then so Friendster and MySpace, they came before Facebook. The iPhone certainly wasn't the first mobile device. So lots of failed examples, but there are some recent successes like AWS and cloud. >> You could say smartphone. So I mean. >> Well I know, and you can, we can parse this so we'll debate it. Now Twitter, you could argue, had first mover advantage. You kind of gave me that one John. Bitcoin and crypto clearly had first mover advantage, and sustaining that. Guys, will OpenAI make it to the list on the right with ChatGPT, what do you think? >> I think categorically as a company, it probably won't, but as a category, I think what they're doing will, so OpenAI as a company, they get funding, there's power dynamics involved. Microsoft put a billion dollars in early on, then they just pony it up. Now they're reporting 10 billion more. So, like, if the browsers, Microsoft had competitive advantage over Netscape, and used monopoly power, and convicted by the Department of Justice for killing Netscape with their monopoly, Netscape should have had won that battle, but Microsoft killed it. In this case, Microsoft's not killing it, they're buying into it. So I think the embrace extend Microsoft power here makes OpenAI vulnerable for that one vendor solution. So the AI as a company might not make the list, but the category of what this is, large language model AI, is probably will be on the right hand side. >> Okay, we're going to come back to the government intervention and maybe do some comparisons, but what are your thoughts on this premise here? That, it will basically set- put forth the premise that it, that ChatGPT, its early entry into the market will not confer competitive advantage to >> For OpenAI. >> To Open- Yeah, do you agree with that? >> I agree with that actually. It, because Google has been at it, and they have been holding back, as John said because of the scrutiny from the Fed, right, so- >> And privacy too. >> And the privacy and the accuracy as well. But I think Sam Altman and the company on those guys, right? They have put this in a hasty way out there, you know, because it makes mistakes, and there are a lot of questions around the, sort of, where the content is coming from. You saw that as your example, it just stole the content, and without your permission, you know? >> Yeah. So as quick this aside- >> And it codes on people's behalf and the, those codes are wrong. So there's a lot of, sort of, false information it's putting out there. So it's a very vulnerable thing to do what Sam Altman- >> So even though it'll get better, others will compete. >> So look, just side note, a term which Reid Hoffman used a little bit. Like he said, it's experimental launch, like, you know, it's- >> It's pretty damn good. >> It is clever because according to Sam- >> It's more than clever. It's good. >> It's awesome, if you haven't used it. I mean you write- you read what it writes and you go, "This thing writes so well, it writes so much better than you." >> The human emotion drives that too. I think that's a big thing. But- >> I Want to add one more- >> Make your last point. >> Last one. Okay. So, but he's still holding back. He's conducting quite a few interviews. If you want to get the gist of it, there's an interview with StrictlyVC interview from yesterday with Sam Altman. Listen to that one it's an eye opening what they want- where they want to take it. But my last one I want to make it on this point is that Satya Nadella yesterday did an interview with Wall Street Journal. I think he was doing- >> You were not impressed. >> I was not impressed because he was pushing it too much. So Sam Altman's holding back so there's less backlash. >> Got 10 billion reasons to push. >> I think he's almost- >> Microsoft just laid off 10000 people. Hey ChatGPT, find me a job. You know like. (group laughs) >> He's overselling it to an extent that I think it will backfire on Microsoft. And he's over promising a lot of stuff right now, I think. I don't know why he's very jittery about all these things. And he did the same thing during Ignite as well. So he said, "Oh, this AI will write code for you and this and that." Like you called him out- >> The hyperbole- >> During your- >> from Satya Nadella, he's got a lot of hyperbole. (group talks over each other) >> All right, Let's, go ahead. >> Well, can I weigh in on the whole- >> Yeah, sure. >> Microsoft thing on whether OpenAI, here's the take on this. I think it's more like the browser moment to me, because I could relate to that experience with ChatG, personally, emotionally, when I saw that, and I remember vividly- >> You mean that aha moment (indistinct). >> Like this is obviously the future. Anything else in the old world is dead, website's going to be everywhere. It was just instant dot connection for me. And a lot of other smart people who saw this. Lot of people by the way, didn't see it. Someone said the web's a toy. At the company I was worked for at the time, Hewlett Packard, they like, they could have been in, they had invented HTML, and so like all this stuff was, like, they just passed, the web was just being passed over. But at that time, the browser got better, more websites came on board. So the structural advantage there was online web usage was growing, online user population. So that was growing exponentially with the rise of the Netscape browser. So OpenAI could stay on the right side of your list as durable, if they leverage the category that they're creating, can get the scale. And if they can get the scale, just like Twitter, that failed so many times that they still hung around. So it was a product that was always successful, right? So I mean, it should have- >> You're right, it was terrible, we kept coming back. >> The fail whale, but it still grew. So OpenAI has that moment. They could do it if Microsoft doesn't meddle too much with too much power as a vendor. They could be the Netscape Navigator, without the anti-competitive behavior of somebody else. So to me, they have the pole position. So they have an opportunity. So if not, if they don't execute, then there's opportunity. There's not a lot of barriers to entry, vis-a-vis say the CapEx of say a cloud company like AWS. You can't replicate that, Many have tried, but I think you can replicate OpenAI. >> And we're going to talk about that. Okay, so real quick, I want to bring in some ETR data. This isn't an ETR heavy segment, only because this so new, you know, they haven't coverage yet, but they do cover AI. So basically what we're seeing here is a slide on the vertical axis's net score, which is a measure of spending momentum, and in the horizontal axis's is presence in the dataset. Think of it as, like, market presence. And in the insert right there, you can see how the dots are plotted, the two columns. And so, but the key point here that we want to make, there's a bunch of companies on the left, is he like, you know, DataRobot and C3 AI and some others, but the big whales, Google, AWS, Microsoft, are really dominant in this market. So that's really the key takeaway that, can we- >> I notice IBM is way low. >> Yeah, IBM's low, and actually bring that back up and you, but then you see Oracle who actually is injecting. So I guess that's the other point is, you're not necessarily going to go buy AI, and you know, build your own AI, you're going to, it's going to be there and, it, Salesforce is going to embed it into its platform, the SaaS companies, and you're going to purchase AI. You're not necessarily going to build it. But some companies obviously are. >> I mean to quote IBM's general manager Rob Thomas, "You can't have AI with IA." information architecture and David Flynn- >> You can't Have AI without IA >> without, you can't have AI without IA. You can't have, if you have an Information Architecture, you then can power AI. Yesterday David Flynn, with Hammersmith, was on our Supercloud. He was pointing out that the relationship of storage, where you store things, also impacts the data and stressablity, and Zhamak from Nextdata, she was pointing out that same thing. So the data problem factors into all this too, Dave. >> So you got the big cloud and internet giants, they're all poised to go after this opportunity. Microsoft is investing up to 10 billion. Google's code red, which was, you know, the headline in the New York Times. Of course Apple is there and several alternatives in the market today. Guys like Chinchilla, Bloom, and there's a company Jasper and several others, and then Lena Khan looms large and the government's around the world, EU, US, China, all taking notice before the market really is coalesced around a single player. You know, John, you mentioned Netscape, they kind of really, the US government was way late to that game. It was kind of game over. And Netscape, I remember Barksdale was like, "Eh, we're going to be selling software in the enterprise anyway." and then, pshew, the company just dissipated. So, but it looks like the US government, especially with Lena Khan, they're changing the definition of antitrust and what the cause is to go after people, and they're really much more aggressive. It's only what, two years ago that (indistinct). >> Yeah, the problem I have with the federal oversight is this, they're always like late to the game, and they're slow to catch up. So in other words, they're working on stuff that should have been solved a year and a half, two years ago around some of the social networks hiding behind some of the rules around open web back in the days, and I think- >> But they're like 15 years late to that. >> Yeah, and now they got this new thing on top of it. So like, I just worry about them getting their fingers. >> But there's only two years, you know, OpenAI. >> No, but the thing (indistinct). >> No, they're still fighting other battles. But the problem with government is that they're going to label Big Tech as like a evil thing like Pharma, it's like smoke- >> You know Lena Khan wants to kill Big Tech, there's no question. >> So I think Big Tech is getting a very seriously bad rap. And I think anything that the government does that shades darkness on tech, is politically motivated in most cases. You can almost look at everything, and my 80 20 rule is in play here. 80% of the government activity around tech is bullshit, it's politically motivated, and the 20% is probably relevant, but off the mark and not organized. >> Well market forces have always been the determining factor of success. The governments, you know, have been pretty much failed. I mean you look at IBM's antitrust, that, what did that do? The market ultimately beat them. You look at Microsoft back in the day, right? Windows 95 was peaking, the government came in. But you know, like you said, they missed the web, right, and >> so they were hanging on- >> There's nobody in government >> to Windows. >> that actually knows- >> And so, you, I think you're right. It's market forces that are going to determine this. But Sarbjeet, what do you make of Microsoft's big bet here, you weren't impressed with with Nadella. How do you think, where are they going to apply it? Is this going to be a Hail Mary for Bing, or is it going to be applied elsewhere? What do you think. >> They are saying that they will, sort of, weave this into their products, office products, productivity and also to write code as well, developer productivity as well. That's a big play for them. But coming back to your antitrust sort of comments, right? I believe the, your comment was like, oh, fed was late 10 years or 15 years earlier, but now they're two years. But things are moving very fast now as compared to they used to move. >> So two years is like 10 Years. >> Yeah, two years is like 10 years. Just want to make that point. (Dave laughs) This thing is going like wildfire. Any new tech which comes in that I think they're going against distribution channels. Lina Khan has commented time and again that the marketplace model is that she wants to have some grip on. Cloud marketplaces are a kind of monopolistic kind of way. >> I don't, I don't see this, I don't see a Chat AI. >> You told me it's not Bing, you had an interesting comment. >> No, no. First of all, this is great from Microsoft. If you're Microsoft- >> Why? >> Because Microsoft doesn't have the AI chops that Google has, right? Google is got so much core competency on how they run their search, how they run their backends, their cloud, even though they don't get a lot of cloud market share in the enterprise, they got a kick ass cloud cause they needed one. >> Totally. >> They've invented SRE. I mean Google's development and engineering chops are off the scales, right? Amazon's got some good chops, but Google's got like 10 times more chops than AWS in my opinion. Cloud's a whole different story. Microsoft gets AI, they get a playbook, they get a product they can render into, the not only Bing, productivity software, helping people write papers, PowerPoint, also don't forget the cloud AI can super help. We had this conversation on our Supercloud event, where AI's going to do a lot of the heavy lifting around understanding observability and managing service meshes, to managing microservices, to turning on and off applications, and or maybe writing code in real time. So there's a plethora of use cases for Microsoft to deploy this. combined with their R and D budgets, they can then turbocharge more research, build on it. So I think this gives them a car in the game, Google may have pole position with AI, but this puts Microsoft right in the game, and they already have a lot of stuff going on. But this just, I mean everything gets lifted up. Security, cloud, productivity suite, everything. >> What's under the hood at Google, and why aren't they talking about it? I mean they got to be freaked out about this. No? Or do they have kind of a magic bullet? >> I think they have the, they have the chops definitely. Magic bullet, I don't know where they are, as compared to the ChatGPT 3 or 4 models. Like they, but if you look at the online sort of activity and the videos put out there from Google folks, Google technology folks, that's account you should look at if you are looking there, they have put all these distinctions what ChatGPT 3 has used, they have been talking about for a while as well. So it's not like it's a secret thing that you cannot replicate. As you said earlier, like in the beginning of this segment, that anybody who has more data and the capacity to process that data, which Google has both, I think they will win this. >> Obviously living in Palo Alto where the Google founders are, and Google's headquarters next town over we have- >> We're so close to them. We have inside information on some of the thinking and that hasn't been reported by any outlet yet. And that is, is that, from what I'm hearing from my sources, is Google has it, they don't want to release it for many reasons. One is it might screw up their search monopoly, one, two, they're worried about the accuracy, 'cause Google will get sued. 'Cause a lot of people are jamming on this ChatGPT as, "Oh it does everything for me." when it's clearly not a hundred percent accurate all the time. >> So Lina Kahn is looming, and so Google's like be careful. >> Yeah so Google's just like, this is the third, could be a third rail. >> But the first thing you said is a concern. >> Well no. >> The disruptive (indistinct) >> What they will do is do a Waymo kind of thing, where they spin out a separate company. >> They're doing that. >> The discussions happening, they're going to spin out the separate company and put it over there, and saying, "This is AI, got search over there, don't touch that search, 'cause that's where all the revenue is." (chuckles) >> So, okay, so that's how they deal with the Clay Christensen dilemma. What's the business model here? I mean it's not advertising, right? Is it to charge you for a query? What, how do you make money at this? >> It's a good question, I mean my thinking is, first of all, it's cool to type stuff in and see a paper get written, or write a blog post, or gimme a marketing slogan for this or that or write some code. I think the API side of the business will be critical. And I think Howie Xu, I know you're going to reference some of his comments yesterday on Supercloud, I think this brings a whole 'nother user interface into technology consumption. I think the business model, not yet clear, but it will probably be some sort of either API and developer environment or just a straight up free consumer product, with some sort of freemium backend thing for business. >> And he was saying too, it's natural language is the way in which you're going to interact with these systems. >> I think it's APIs, it's APIs, APIs, APIs, because these people who are cooking up these models, and it takes a lot of compute power to train these and to, for inference as well. Somebody did the analysis on the how many cents a Google search costs to Google, and how many cents the ChatGPT query costs. It's, you know, 100x or something on that. You can take a look at that. >> A 100x on which side? >> You're saying two orders of magnitude more expensive for ChatGPT >> Much more, yeah. >> Than for Google. >> It's very expensive. >> So Google's got the data, they got the infrastructure and they got, you're saying they got the cost (indistinct) >> No actually it's a simple query as well, but they are trying to put together the answers, and they're going through a lot more data versus index data already, you know. >> Let me clarify, you're saying that Google's version of ChatGPT is more efficient? >> No, I'm, I'm saying Google search results. >> Ah, search results. >> What are used to today, but cheaper. >> But that, does that, is that going to confer advantage to Google's large language (indistinct)? >> It will, because there were deep science (indistinct). >> Google, I don't think Google search is doing a large language model on their search, it's keyword search. You know, what's the weather in Santa Cruz? Or how, what's the weather going to be? Or you know, how do I find this? Now they have done a smart job of doing some things with those queries, auto complete, re direct navigation. But it's, it's not entity. It's not like, "Hey, what's Dave Vellante thinking this week in Breaking Analysis?" ChatGPT might get that, because it'll get your Breaking Analysis, it'll synthesize it. There'll be some, maybe some clips. It'll be like, you know, I mean. >> Well I got to tell you, I asked ChatGPT to, like, I said, I'm going to enter a transcript of a discussion I had with Nir Zuk, the CTO of Palo Alto Networks, And I want you to write a 750 word blog. I never input the transcript. It wrote a 750 word blog. It attributed quotes to him, and it just pulled a bunch of stuff that, and said, okay, here it is. It talked about Supercloud, it defined Supercloud. >> It's made, it makes you- >> Wow, But it was a big lie. It was fraudulent, but still, blew me away. >> Again, vanilla content and non accurate content. So we are going to see a surge of misinformation on steroids, but I call it the vanilla content. Wow, that's just so boring, (indistinct). >> There's so many dangers. >> Make your point, cause we got to, almost out of time. >> Okay, so the consumption, like how do you consume this thing. As humans, we are consuming it and we are, like, getting a nicely, like, surprisingly shocked, you know, wow, that's cool. It's going to increase productivity and all that stuff, right? And on the danger side as well, the bad actors can take hold of it and create fake content and we have the fake sort of intelligence, if you go out there. So that's one thing. The second thing is, we are as humans are consuming this as language. Like we read that, we listen to it, whatever format we consume that is, but the ultimate usage of that will be when the machines can take that output from likes of ChatGPT, and do actions based on that. The robots can work, the robot can paint your house, we were talking about, right? Right now we can't do that. >> Data apps. >> So the data has to be ingested by the machines. It has to be digestible by the machines. And the machines cannot digest unorganized data right now, we will get better on the ingestion side as well. So we are getting better. >> Data, reasoning, insights, and action. >> I like that mall, paint my house. >> So, okay- >> By the way, that means drones that'll come in. Spray painting your house. >> Hey, it wasn't too long ago that robots couldn't climb stairs, as I like to point out. Okay, and of course it's no surprise the venture capitalists are lining up to eat at the trough, as I'd like to say. Let's hear, you'd referenced this earlier, John, let's hear what AI expert Howie Xu said at the Supercloud event, about what it takes to clone ChatGPT. Please, play the clip. >> So one of the VCs actually asked me the other day, right? "Hey, how much money do I need to spend, invest to get a, you know, another shot to the openAI sort of the level." You know, I did a (indistinct) >> Line up. >> A hundred million dollar is the order of magnitude that I came up with, right? You know, not a billion, not 10 million, right? So a hundred- >> Guys a hundred million dollars, that's an astoundingly low figure. What do you make of it? >> I was in an interview with, I was interviewing, I think he said hundred million or so, but in the hundreds of millions, not a billion right? >> You were trying to get him up, you were like "Hundreds of millions." >> Well I think, I- >> He's like, eh, not 10, not a billion. >> Well first of all, Howie Xu's an expert machine learning. He's at Zscaler, he's a machine learning AI guy. But he comes from VMware, he's got his technology pedigrees really off the chart. Great friend of theCUBE and kind of like a CUBE analyst for us. And he's smart. He's right. I think the barriers to entry from a dollar standpoint are lower than say the CapEx required to compete with AWS. Clearly, the CapEx spending to build all the tech for the run a cloud. >> And you don't need a huge sales force. >> And in some case apps too, it's the same thing. But I think it's not that hard. >> But am I right about that? You don't need a huge sales force either. It's, what, you know >> If the product's good, it will sell, this is a new era. The better mouse trap will win. This is the new economics in software, right? So- >> Because you look at the amount of money Lacework, and Snyk, Snowflake, Databrooks. Look at the amount of money they've raised. I mean it's like a billion dollars before they get to IPO or more. 'Cause they need promotion, they need go to market. You don't need (indistinct) >> OpenAI's been working on this for multiple five years plus it's, hasn't, wasn't born yesterday. Took a lot of years to get going. And Sam is depositioning all the success, because he's trying to manage expectations, To your point Sarbjeet, earlier. It's like, yeah, he's trying to "Whoa, whoa, settle down everybody, (Dave laughs) it's not that great." because he doesn't want to fall into that, you know, hero and then get taken down, so. >> It may take a 100 million or 150 or 200 million to train the model. But to, for the inference to, yeah to for the inference machine, It will take a lot more, I believe. >> Give it, so imagine, >> Because- >> Go ahead, sorry. >> Go ahead. But because it consumes a lot more compute cycles and it's certain level of storage and everything, right, which they already have. So I think to compute is different. To frame the model is a different cost. But to run the business is different, because I think 100 million can go into just fighting the Fed. >> Well there's a flywheel too. >> Oh that's (indistinct) >> (indistinct) >> We are running the business, right? >> It's an interesting number, but it's also kind of, like, context to it. So here, a hundred million spend it, you get there, but you got to factor in the fact that the ways companies win these days is critical mass scale, hitting a flywheel. If they can keep that flywheel of the value that they got going on and get better, you can almost imagine a marketplace where, hey, we have proprietary data, we're SiliconANGLE in theCUBE. We have proprietary content, CUBE videos, transcripts. Well wouldn't it be great if someone in a marketplace could sell a module for us, right? We buy that, Amazon's thing and things like that. So if they can get a marketplace going where you can apply to data sets that may be proprietary, you can start to see this become bigger. And so I think the key barriers to entry is going to be success. I'll give you an example, Reddit. Reddit is successful and it's hard to copy, not because of the software. >> They built the moat. >> Because you can, buy Reddit open source software and try To compete. >> They built the moat with their community. >> Their community, their scale, their user expectation. Twitter, we referenced earlier, that thing should have gone under the first two years, but there was such a great emotional product. People would tolerate the fail whale. And then, you know, well that was a whole 'nother thing. >> Then a plane landed in (John laughs) the Hudson and it was over. >> I think verticals, a lot of verticals will build applications using these models like for lawyers, for doctors, for scientists, for content creators, for- >> So you'll have many hundreds of millions of dollars investments that are going to be seeping out. If, all right, we got to wrap, if you had to put odds on it that that OpenAI is going to be the leader, maybe not a winner take all leader, but like you look at like Amazon and cloud, they're not winner take all, these aren't necessarily winner take all markets. It's not necessarily a zero sum game, but let's call it winner take most. What odds would you give that open AI 10 years from now will be in that position. >> If I'm 0 to 10 kind of thing? >> Yeah, it's like horse race, 3 to 1, 2 to 1, even money, 10 to 1, 50 to 1. >> Maybe 2 to 1, >> 2 to 1, that's pretty low odds. That's basically saying they're the favorite, they're the front runner. Would you agree with that? >> I'd say 4 to 1. >> Yeah, I was going to say I'm like a 5 to 1, 7 to 1 type of person, 'cause I'm a skeptic with, you know, there's so much competition, but- >> I think they're definitely the leader. I mean you got to say, I mean. >> Oh there's no question. There's no question about it. >> The question is can they execute? >> They're not Friendster, is what you're saying. >> They're not Friendster and they're more like Twitter and Reddit where they have momentum. If they can execute on the product side, and if they don't stumble on that, they will continue to have the lead. >> If they say stay neutral, as Sam is, has been saying, that, hey, Microsoft is one of our partners, if you look at their company model, how they have structured the company, then they're going to pay back to the investors, like Microsoft is the biggest one, up to certain, like by certain number of years, they're going to pay back from all the money they make, and after that, they're going to give the money back to the public, to the, I don't know who they give it to, like non-profit or something. (indistinct) >> Okay, the odds are dropping. (group talks over each other) That's a good point though >> Actually they might have done that to fend off the criticism of this. But it's really interesting to see the model they have adopted. >> The wildcard in all this, My last word on this is that, if there's a developer shift in how developers and data can come together again, we have conferences around the future of data, Supercloud and meshs versus, you know, how the data world, coding with data, how that evolves will also dictate, 'cause a wild card could be a shift in the landscape around how developers are using either machine learning or AI like techniques to code into their apps, so. >> That's fantastic insight. I can't thank you enough for your time, on the heels of Supercloud 2, really appreciate it. All right, thanks to John and Sarbjeet for the outstanding conversation today. Special thanks to the Palo Alto studio team. My goodness, Anderson, this great backdrop. You guys got it all out here, I'm jealous. And Noah, really appreciate it, Chuck, Andrew Frick and Cameron, Andrew Frick switching, Cameron on the video lake, great job. And Alex Myerson, he's on production, manages the podcast for us, Ken Schiffman as well. Kristen Martin and Cheryl Knight help get the word out on social media and our newsletters. Rob Hof is our editor-in-chief over at SiliconANGLE, does some great editing, thanks to all. Remember, all these episodes are available as podcasts. All you got to do is search Breaking Analysis podcast, wherever you listen. Publish each week on wikibon.com and siliconangle.com. Want to get in touch, email me directly, david.vellante@siliconangle.com or DM me at dvellante, or comment on our LinkedIn post. And by all means, check out etr.ai. They got really great survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching, We'll see you next time on Breaking Analysis. (electronic music)

Published Date : Jan 20 2023

SUMMARY :

bringing you data-driven and ChatGPT have taken the world by storm. So I asked it, give it to the large language models to do that. So to your point, it's So one of the problems with ChatGPT, and he simply gave the system the prompts, or the OS to help it do but it kind of levels the playing- and the answers were coming as the data you can get. Yeah, and leveled to certain extent. I check the facts, save me about maybe- and then I write a killer because like if the it's, the law is we, you know, I think that's true and I ask the set of similar question, What's your counter point? and not it's underestimated long term. That's what he said. for the first time, wow. the overhyped at the No, it was, it was I got, right I mean? the internet in the early days, and it's only going to get better." So you're saying it's bifurcated. and possibly the debate the first mobile device. So I mean. on the right with ChatGPT, and convicted by the Department of Justice the scrutiny from the Fed, right, so- And the privacy and thing to do what Sam Altman- So even though it'll get like, you know, it's- It's more than clever. I mean you write- I think that's a big thing. I think he was doing- I was not impressed because You know like. And he did the same thing he's got a lot of hyperbole. the browser moment to me, So OpenAI could stay on the right side You're right, it was terrible, They could be the Netscape Navigator, and in the horizontal axis's So I guess that's the other point is, I mean to quote IBM's So the data problem factors and the government's around the world, and they're slow to catch up. Yeah, and now they got years, you know, OpenAI. But the problem with government to kill Big Tech, and the 20% is probably relevant, back in the day, right? are they going to apply it? and also to write code as well, that the marketplace I don't, I don't see you had an interesting comment. No, no. First of all, the AI chops that Google has, right? are off the scales, right? I mean they got to be and the capacity to process that data, on some of the thinking So Lina Kahn is looming, and this is the third, could be a third rail. But the first thing What they will do out the separate company Is it to charge you for a query? it's cool to type stuff in natural language is the way and how many cents the and they're going through Google search results. It will, because there were It'll be like, you know, I mean. I never input the transcript. Wow, But it was a big lie. but I call it the vanilla content. Make your point, cause we And on the danger side as well, So the data By the way, that means at the Supercloud event, So one of the VCs actually What do you make of it? you were like "Hundreds of millions." not 10, not a billion. Clearly, the CapEx spending to build all But I think it's not that hard. It's, what, you know This is the new economics Look at the amount of And Sam is depositioning all the success, or 150 or 200 million to train the model. So I think to compute is different. not because of the software. Because you can, buy They built the moat And then, you know, well that the Hudson and it was over. that are going to be seeping out. Yeah, it's like horse race, 3 to 1, 2 to 1, that's pretty low odds. I mean you got to say, I mean. Oh there's no question. is what you're saying. and if they don't stumble on that, the money back to the public, to the, Okay, the odds are dropping. the model they have adopted. Supercloud and meshs versus, you know, on the heels of Supercloud

ENTITIES

Entity	Category	Confidence
John	PERSON	0.99+
Sarbjeet	PERSON	0.99+
Brian Gracely	PERSON	0.99+
Lina Khan	PERSON	0.99+
Dave Vellante	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Reid Hoffman	PERSON	0.99+
Alex Myerson	PERSON	0.99+
Lena Khan	PERSON	0.99+
Sam Altman	PERSON	0.99+
Apple	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Rob Thomas	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Ken Schiffman	PERSON	0.99+
Google	ORGANIZATION	0.99+
David Flynn	PERSON	0.99+
Sam	PERSON	0.99+
Noah	PERSON	0.99+
Ray Amara	PERSON	0.99+
10 billion	QUANTITY	0.99+
150	QUANTITY	0.99+
Rob Hof	PERSON	0.99+
Chuck	PERSON	0.99+
Palo Alto	LOCATION	0.99+
Howie Xu	PERSON	0.99+
Anderson	PERSON	0.99+
Cheryl Knight	PERSON	0.99+
John Furrier	PERSON	0.99+
Hewlett Packard	ORGANIZATION	0.99+
Santa Cruz	LOCATION	0.99+
1995	DATE	0.99+
Lina Kahn	PERSON	0.99+
Zhamak Dehghani	PERSON	0.99+
50 words	QUANTITY	0.99+
Hundreds of millions	QUANTITY	0.99+
Compaq	ORGANIZATION	0.99+
10	QUANTITY	0.99+
Kristen Martin	PERSON	0.99+
two sentences	QUANTITY	0.99+
Dave	PERSON	0.99+
hundreds of millions	QUANTITY	0.99+
Satya Nadella	PERSON	0.99+
Cameron	PERSON	0.99+
100 million	QUANTITY	0.99+
Silicon Valley	LOCATION	0.99+
one sentence	QUANTITY	0.99+
10 million	QUANTITY	0.99+
yesterday	DATE	0.99+
Clay Christensen	PERSON	0.99+
Sarbjeet Johal	PERSON	0.99+
Netscape	ORGANIZATION	0.99+

Recommend Videos

Sentiment Analysis

AWS Comprehend

Search Results for first snowflake: