Is Data Mesh the Next Killer App for Supercloud?
(upbeat music) >> Welcome back to our Supercloud 2 event, live coverage here of our stage performance in Palo Alto, syndicating around the world. I'm John Furrier with Dave Vellante. We've got exclusive news and a scoop here for SiliconANGLE and theCUBE. Zhamak Dehghani, creator of data mesh, has formed a new company called Nextdata.com, Nextdata. She's a CUBE alum and a contributor to our supercloud initiative, as well as our coverage and Breaking Analysis with Dave Vellante on data, the killer app for supercloud. Zhamak, great to see you. Thank you for coming into the studio, and congratulations on your newly formed venture and continued success on the data mesh. >> Thank you so much. It's great to be here. Great to see you in person. >> Dave: Yeah, finally. >> Wonderful. Your contributions to the data conversation have been well documented, certainly by us and others in the industry. Data mesh is taking the world by storm. Some people are debating it, throwing cold water on it. Some are thinking it's the next big thing. Tell us about the data mesh super data apps that are emerging out of the cloud. >> I mean, data mesh, as you said, the pain points that it surfaced were universal. Everybody said, "Oh, why didn't I think of that?" It was just an obvious next step, and people are approaching it, implementing it. I guess over the last few years I've been involved in many of those implementations, and I guess supercloud is somewhat a prerequisite for it, because data mesh, and building applications using data mesh, is about sharing data responsibly across boundaries. And those boundaries include organizational boundaries, cloud technology boundaries, and trust boundaries. >> I want to bring that up because your venture, Nextdata, which is new, just formed. Tell us about that. What wave is it riding? What specifically are you targeting? What's the pain point? >> Absolutely. Yes, so Nextdata is the result of, I suppose, the pains that I suffered from implementing data mesh for many of the organizations. Basically, a lot of organizations that I've worked with want decentralized data. So they really embrace this idea of decentralized ownership of the data, but yet they want interconnectivity through standard APIs, yet they want discoverability and governance. So they want to have policies implemented, they want to govern that data, they want to be able to discover that data, and yet they want to decentralize it. And we do that with a developer experience that is easy and native to a generalist developer. So we try to find, I guess, the common denominator that solves those problems and enables that developer experience for data sharing. >> Since you just announced the news, what's been the reaction? >> I just announced the news right now, so what's the reaction? >> But people in the industry know you did a lot of work in the area. What has been some of the feedback on the new venture in terms of the approach, the customers, the problem? >> Yeah, so we've been in stealth mode, so we haven't publicly talked about it, but folks that have been close to us have in fact reached out. We already have implementations of our pilot platform with early customers, which is super exciting. And we're going to have multiple of those. Of course, we're a tiny, tiny company. We can't have many of those, but we are going to have multiple pilot implementations of our platform in the real world, with real global, large-scale organizations that have real-world problems. So we're not going to build our platform in a vacuum.
And that's what's happening right now. >> Zhamak, when I think about your role at ThoughtWorks, you had a very wide observation space, with a number of clients, helping them implement data mesh and other things as well, prior to your data mesh initiative. But when I look at data mesh implementations, at least the ones that I've seen, they're very narrow. I think of JPMC, I think of HelloFresh. They're generally, obviously not surprising, they don't include the big vision of inclusivity across clouds, across different data stores. But it seems like people are having to go through some gymnastics to get to the organizational reality of decentralizing data and at least pushing data ownership to the line of business. How are you approaching, or are you approaching, solving that problem? Are you taking a narrow slice? What can you tell us about Nextdata? >> Yeah, absolutely. Gymnastics, that's a cute word to describe what the organizations have to go through. And one of those problems is that the data, as you know, resides on different platforms, it's owned by different people, and it's processed by pipelines that nobody knows who owns. So there's this very disparate and disconnected set of technologies that were very useful when we thought about data and processing as a centralized problem. But when you think about data as a decentralized problem, the cost of integrating these technologies into a cohesive developer experience is what's missing. And we want to focus on that cohesive end-to-end developer experience to share data responsibly in these autonomous units, we call them data products in data mesh, that constitute the computation, the policies that govern that data, and discoverability. So I guess I heard this expression in the last talks: you can have your cake and eat it too. So we want people to have their cake, which is data in different places, decentralization, and eat it too, which is interconnected access to it. So we start with standardizing and codifying this idea of a data product container that encapsulates data, computation, and APIs to get to it, in a technology-agnostic way, in an open way. And then we sit on top and use existing tech, Snowflake, Databricks, whatever exists, the millions of dollars of investments that companies have made; we sit on top of those but create this cohesive, integrated experience where the data product is a first-class primitive. And that's really key here. The language and the modeling that we use is really native to data mesh, which is that I'm building a data product, I'm sharing a data product, and that encapsulates everything: I'm providing metadata about it, I'm providing the computation that's constantly changing the data, I'm providing the API for that. So we're trying to codify and create a new developer experience based on that, with developers, on both the provider side and the user side, connected through peer-to-peer data sharing with the data product as a first-class concept.
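To make the data product container Dehghani describes more concrete, here is a minimal sketch in Python. Every name and field below (the DataProduct class, the OutputPort, the policy callables) is an illustrative assumption, not Nextdata's actual API; the point is simply that the data, the computation that keeps it fresh, the policies that govern it, and the discovery metadata travel together as one technology-agnostic unit.

```python
# A minimal, illustrative sketch of a "data product container". The class and
# field names are assumptions made for explanation, not an actual product API.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class OutputPort:
    """A technology-agnostic way to get at the data (a file, a table, an API)."""
    name: str
    media_type: str                     # e.g. "text/csv", "application/parquet"
    reader: Callable[[], Any]           # how a consumer actually pulls the data


@dataclass
class Policy:
    """A policy that travels with the data and is checked on every access."""
    name: str
    check: Callable[[Dict[str, Any]], bool]   # consumer context -> allowed?


@dataclass
class DataProduct:
    """Encapsulates data, computation, APIs, policies, and discovery metadata."""
    name: str
    domain: str
    description: str
    policies: List[Policy] = field(default_factory=list)
    output_ports: Dict[str, OutputPort] = field(default_factory=dict)

    def discover(self) -> Dict[str, str]:
        """Metadata a catalog could index so the product is discoverable."""
        return {"name": self.name, "domain": self.domain,
                "description": self.description,
                "ports": ", ".join(self.output_ports)}

    def read(self, port: str, consumer: Dict[str, Any]) -> Any:
        """Peer-to-peer access: enforce every policy, then serve the port."""
        for policy in self.policies:
            if not policy.check(consumer):
                raise PermissionError(f"{policy.name} denied {consumer.get('id')}")
        return self.output_ports[port].reader()


# Example: an e-commerce "orders" product shared responsibly across boundaries.
orders = DataProduct(
    name="orders",
    domain="e-commerce",
    description="Longitudinal order history, refreshed daily",
    policies=[Policy("eu-region-only", lambda c: c.get("region") == "EU")],
    output_ports={"csv": OutputPort("csv", "text/csv",
                                    lambda: "order_id,total\n1,42.00\n")},
)
print(orders.discover())
print(orders.read("csv", {"id": "analytics-team", "region": "EU"}))
```

A catalog could index what discover() returns, while read() is the peer-to-peer access path where governance is enforced, regardless of which platform actually holds the bytes.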
>> So the idea would be developers would build applications leveraging those data products, which are discoverable and governed. Now, today you see some companies, take a Snowflake for example, attempting to do that within their own little walled garden. They even at one point used the term "mesh"; I don't know if they pulled back on that. And then they became aware of some of your work. But a lot of the things that they're doing within their little insulated environment support that governance, and they're building out an ecosystem. What's different in your vision? >> Exactly. So we realized that, and this is a reality, like you go to organizations, they have a Snowflake and half of the organization happily operates on Snowflake, and the other half says, "Oh, we are on bare infrastructure on AWS, or we are on Databricks." This is the reality. This supercloud that's written up here, it's about working across boundaries of technology. So we try to embrace that. And even for our own technology, with the way we're building it, we say, "Okay, nobody's going to use only the Nextdata data mesh operating system. People will have different platforms." So you have to build with openness in mind, and in the case of Snowflake, I'm sure they have very happy customers, as long as those customers can stay on Snowflake. But once you cross that boundary of platforms, then that becomes a problem. And we try to keep that in mind in our solution. >> So it's worth reviewing that basically the concept of data mesh is that whether it's a data lake or a data warehouse, an S3 bucket, or an Oracle database as well, they should all be included inside of the data mesh. >> We did a session with AWS on the startup showcase, data as code. And remember, I wrote a blog post in 2007 called "Data as the New Developer Kit"; back then we used to call them developer kits, if you remember. And we said at that time, whoever can code data will have a competitive advantage. >> Aren't the machines going to be doing that? Didn't we just hear that? >> Well, we have. Hey, Siri. Hey, Cube, find me that best video for data mesh. There it is. But this is the point: what's happening is that now data has to be addressable for machines and for coding, because you need to call the data. So the question is, how do you manage the complexity of making data as promiscuous as possible, making it available, as well as then governing it? Because it's a trade-off. The more you make it open, the better the machine learning. But yet there's the governance issue, so this is where you need an OS to handle this, maybe. >> Yes, well, our mental model for our platform is an OS, an operating system. Operating systems have shown us how you can abstract what's complex and take care of a lot of complexities, but yet provide an open and dynamic enough interface. So we think about it that way. We try to solve the problem so that policies live with the data, and enforcement of the policies happens at the most granular level, which in this concept is the data product. And that would happen whether you read, write, or access a data product. But we can never imagine what all these policies could be. So our thinking is we should have an open policy framework that can allow organizations to write their own policy drivers and policy definitions, and encode them and encapsulate them in this data product container. But I'm not going to fool myself to say that that's going to solve the problem that you just described. I think we are in this, I don't know, if I look into my crystal ball, what I think might happen is that right now the primitives that we work with to train machine learning models are still bits and bytes of data. They're fields, rows, columns, and that creates quite a large surface area, an attack area, for the privacy of the data. So perhaps one of the trends that we might see is this evolution of data APIs to become more and more computationally aware, to bring the compute to the data to reduce that surface area, so you can really leave the control of the data to the sovereign owners of that data, to that data product. So I think the evolution of our data APIs perhaps will become more and more computational: you describe what you want, and the data owner decides how to manage it.
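The open policy framework she describes above can be pictured as a small plugin system: declarative policy definitions are encoded inside the data product container, and organizations register their own policy drivers that enforce them on every read, write, or access. This is a hypothetical sketch; the driver registry, the policy kinds, and the masking rule are all invented for illustration and are not an actual Nextdata (or OPA) interface.

```python
# Hypothetical sketch of an open policy framework: declarative policy
# definitions travel with a data product, and pluggable "drivers" enforce
# them at the most granular level, on every access.
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
PolicyDriver = Callable[[Dict[str, Any], Dict[str, Any], List[Row]], List[Row]]
DRIVERS: Dict[str, PolicyDriver] = {}


def driver(kind: str) -> Callable[[PolicyDriver], PolicyDriver]:
    """Organizations register their own drivers for policy kinds they define."""
    def register(fn: PolicyDriver) -> PolicyDriver:
        DRIVERS[kind] = fn
        return fn
    return register


@driver("allow-purpose")
def allow_purpose(definition: Dict[str, Any], context: Dict[str, Any],
                  rows: List[Row]) -> List[Row]:
    # Deny the whole read if the consumer's declared purpose is not allowed.
    if context.get("purpose") not in definition["purposes"]:
        raise PermissionError("purpose not permitted by the data owner")
    return rows


@driver("mask-fields")
def mask_fields(definition: Dict[str, Any], context: Dict[str, Any],
                rows: List[Row]) -> List[Row]:
    # Enforcement at the most granular level: rewrite fields on the way out.
    return [{k: ("***" if k in definition["fields"] else v) for k, v in row.items()}
            for row in rows]


def enforce(policies: List[Dict[str, Any]], context: Dict[str, Any],
            rows: List[Row]) -> List[Row]:
    """Run every policy encoded in the data product container, in order."""
    for policy in policies:
        rows = DRIVERS[policy["kind"]](policy, context, rows)
    return rows


# Declarative policy definitions that would live inside the container.
policies = [
    {"kind": "allow-purpose", "purposes": ["fraud-analysis"]},
    {"kind": "mask-fields", "fields": ["email"]},
]
rows = [{"order_id": 1, "email": "a@example.com", "total": 42.0}]
print(enforce(policies, {"purpose": "fraud-analysis"}, rows))
```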
>> That's interesting, Dave, 'cause it's almost like what we just talked about with ChatGPT in the last segment; machine learning has been around the industry for a while. It's almost as if you're starting to see reasoning come into the data, not just metadata: using the data to reason so that you don't have to expose the raw data. So it's almost like, I won't say a curation layer, but an intelligence layer. >> Zhamak: Exactly. >> Can you share your vision on that? 'Cause that seems to be where the dots are connecting. >> Yes, that's perhaps further into the future, because just from where we stand, we still have to create that bridge of familiarity between that future and the present. So we are still in that bridge-making mode. However, by just the basic notion of saying, "I'm going to put an API in front of my data," and that API today might be as primitive as a level of indirection, as in: you tell me what you want, tell me who you are, let me go process all the policies and lineage and insert all of this intelligence that needs to happen, and then, today, I will still give you a file. But by just defining that API and standardizing it, now we have this amazing extension point where we can say, "Well, in the next revision of this API, you don't just tell me who you are, you actually tell me what intelligence you're after, what logic I need to go and compute behind that API." And you can evolve that. Now you have a point of evolution toward this very futuristic, I guess, future where you just describe the question that you're asking, the way you would of ChatGPT.
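One way to read the "data APIs become computational" idea is as an API versioning story: the first revision is a level of indirection that checks who you are and, after the policy work, still hands back a file; a later revision lets the consumer describe the logic it wants run, so the compute moves to the data and the raw rows never leave the owner's control. The sketch below is speculative; both endpoint shapes and the tiny in-memory dataset are invented for illustration.

```python
# Speculative sketch of a data API evolving from "give me a file" (v1) to
# "run this described computation next to the data" (v2). Shapes are invented.
from typing import Any, Dict, List

_ROWS: List[Dict[str, Any]] = [
    {"country": "DE", "total": 42.0},
    {"country": "DE", "total": 8.0},
    {"country": "FR", "total": 19.5},
]


def _authorized(who: str) -> bool:
    # Stand-in for the policy, lineage, and intelligence checks the owner inserts.
    return who == "analytics-team"


def get_v1(who: str) -> str:
    """v1: tell me what you want and who you are; today I still give you a file."""
    if not _authorized(who):
        raise PermissionError(who)
    lines = ["country,total"] + [f"{r['country']},{r['total']}" for r in _ROWS]
    return "\n".join(lines)


def query_v2(who: str, request: Dict[str, Any]) -> Any:
    """v2: describe the intelligence you're after; only the answer leaves."""
    if not _authorized(who):
        raise PermissionError(who)
    rows = [r for r in _ROWS if r["country"] == request.get("where_country")]
    if request.get("compute") == "sum_total":
        return sum(r["total"] for r in rows)   # raw rows never leave the owner
    raise ValueError("computation not supported by the data owner")


print(get_v1("analytics-team"))
print(query_v2("analytics-team", {"where_country": "DE", "compute": "sum_total"}))
```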
>> Well, this is the supercloud. Go ahead, Dave. >> I have a question from a fan, I've got to get it in. It's George Gilbert. And so his question is: you're blowing away the way we synchronize data from operational systems to the data stack to applications. So the concern that he has, and he wants your feedback on this, is that data product app devs get exposed to more complexity with respect to moving data between data products, or maybe it's attributes between data products. How do you respond to that? How do you see it? Is that a problem, is it something that is overstated, or do you have an answer for that? >> Absolutely. So I think there's a sweet spot in getting data developers, data product developers, closer to the app, but yet not overburdening them with the complexity of the application and application logic, and reducing their cognitive load by localizing what they need to know about, which is the domain they're operating within. Because what's happening right now is that data engineers, and I have a ton of empathy for them, for the high threshold of pain that they can deal with, have been centralized, they've been put into the data team, and they have been given this unbelievable task of making meaning out of data, putting semantics over it, curating it, cleaning it, and so on. So what we are saying is: get those folks embedded into the domain, closer to the application developers. These are still separately moving units. Your app and your data products are independent, but yet tightly coupled with each other based on the context of the domain. So reduce cognitive load by localizing what they need to know about to the domain, get them closer to the application, but yet have them separate from the app, because the app provides a very different service, transactional data for my e-commerce transaction, and the data product provides a very different service, longitudinal data for the variety of intelligent analysis that I can do on the data. But yet it's all within the domain of e-commerce or sales or whatnot. >> So it's a lot of decoupling and coupling to create that cohesive architecture. So I have to ask you, this is an interesting question 'cause it came up on theCUBE all last year. Back in the old server, data center days and into cloud, Google coined the term SRE, site reliability engineer, for someone to look over the hundreds of thousands of servers. We asked the question to the data engineering community, who have been suffering, by the way, and I agree. Is there an SRE-like role for data? Because in a way, data engineering, that platform engineer, they are like the SRE for data. In other words, managing the large scale to enable automation and self-service. What are your thoughts and reaction to that? >> Yes, exactly. So maybe we go through the history of how SRE came to be. We had the first DevOps movement, which was: remove the wall between dev and ops and bring them together. So you have one cross-functional unit of the organization that's responsible for "you build it, you run it." So then there is no "I'm going to just shoot my application over the wall for somebody else to manage it." So we did that, and then, as we decentralized and had this many microservices running around, we had to create a layer that abstracted a lot of the complexity around running, monitoring, and observing a lot of services while giving autonomy to these cross-functional teams. And that's where the SRE, a new generation of engineers, came to exist. So I think if I just look at- >> Hence, Kubernetes. >> Hence, hence, exactly. Hence, chaos engineering. Hence, embracing the complexity and messiness, and putting engineering discipline around embracing that and yet giving a cohesive and high-integrity experience of those systems. So I think if we look at that evolution, perhaps something like that is happening by bringing data and apps closer and making them these domain-oriented data product teams, or domain-oriented cross-functional teams full stop, and still having a very advanced operational team, maybe at the platform level, the infrastructure level, that isn't busy doing two jobs, taking care of the domains and the infrastructure, but is building infrastructure that embraces that complexity and interconnectivity of this data process. >> So you see similarities? >> I see, absolutely. But I feel like we're probably in the earlier days of that movement. >> So it's a data DevOps kind of thing happening, where scale is happening. Good things are happening, yet it's a little bit fast and loose, with some complexities to clean up. >> Yes. This is a different restructure. As you said, the job of this industry as a whole, as architects, is to decompose and recompose, decompose and recompose in a new way, and now we're decomposing the centralized team and recomposing them as domains. >> So is data mesh the killer app for supercloud? >> You had to do this to me. >> Sorry, I couldn't resist. >> I know. Of course you want me to say this. >> Yes. >> Yes, of course.
I mean, supercloud, I think it's really, the terminology is supercloud, open cloud, but I think the spirit of it, this embracing of diversity and giving autonomy for people to make the decisions that are right for them without locking them in, I think just embracing that is baked into how data mesh assumes the world would work. >> Well, thank you so much for coming on Supercloud 2. We really appreciate it. Data has driven this conversation. The success of data mesh has really opened up the conversation and exposed the slow-moving data industry. >> Dave: Been a great catalyst. >> That's now going well; we can move faster. So thanks for coming on. >> Thank you for hosting me. It was wonderful. >> Supercloud 2, live here in Palo Alto, our stage performance. I'm John Furrier with Dave Vellante. We'll be back with more after this short break. Stay with us all day for Supercloud 2. (upbeat music)
Breaking Analysis: ChatGPT Won't Give OpenAI First Mover Advantage
>> From theCUBE Studios in Palo Alto and in Boston, bringing you data-driven insights from theCUBE and ETR, this is Breaking Analysis with Dave Vellante. >> OpenAI, the company, and ChatGPT have taken the world by storm. Microsoft reportedly is investing an additional 10 billion dollars into the company. But in our view, while the hype around ChatGPT is justified, we don't believe OpenAI will lock up the market with its first mover advantage. Rather, we believe that success in this market will be directly proportional to the quality and quantity of data that a technology company has at its disposal, and the compute power that it can deploy to run its systems. Hello and welcome to this week's Wikibon CUBE Insights, powered by ETR. In this Breaking Analysis, we unpack the excitement around ChatGPT and debate the premise that the company's early entry into the space may not confer winner-take-all advantage to OpenAI. And to do so, we welcome CUBE collaborator and alum Sarbjeet Johal, (chuckles) and John Furrier, co-host of theCUBE. Great to see you, Sarbjeet, John. Really appreciate you guys coming to the program. >> Great to be on. >> Okay, so what is ChatGPT? Well, actually we asked ChatGPT, what is ChatGPT? So here's what it said: "ChatGPT is a state-of-the-art language model developed by OpenAI that can generate human-like text. It could be fine tuned for a variety of language tasks, such as conversation, summarization, and language translation." So I asked it to give it to me in 50 words or less. How did it do? Anything to add? >> Yeah, I think it did well. It's a large language model, like previous models, but it applies the transformer mechanism to focus on the prompt you have given it and on its own output, the answer it gave you in the first sentence or two, and then it introspects on what it has already said and works on that. So it's self-focused, if you will; the transformers help the large language models do that. >> So to your point, it's a large language model, and GPT stands for generative pre-trained transformer.
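Sarbjeet's "focus on the prompt and on what it has already said" is, loosely, self-attention plus autoregressive decoding: each new token is scored against every token that came before it, prompt and prior output alike. The toy NumPy sketch below illustrates only the scaled dot-product attention step for a handful of made-up token vectors; the numbers and vocabulary are invented, and a real GPT-style model stacks many such layers with learned weights.

```python
# Toy illustration of causal (masked) self-attention, the mechanism that lets a
# GPT-style model weigh the prompt and its own earlier output when it predicts
# the next token. Vectors and tokens here are made up; real models learn them.
import numpy as np

tokens = ["what", "is", "data", "mesh", "?"]        # prompt so far
d = 4                                               # tiny embedding size
rng = np.random.default_rng(0)
x = rng.normal(size=(len(tokens), d))               # stand-in token embeddings

# Learned projections in a real model; random stand-ins here.
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / np.sqrt(d)                       # how much each position attends to each other
mask = np.triu(np.ones_like(scores), k=1).astype(bool)
scores[mask] = -np.inf                              # causal: can't look at future tokens

weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)      # softmax over previous tokens
context = weights @ V                               # context vector per position

# The last row shows how strongly the newest position attends to everything
# said so far, prompt and prior output alike.
for tok, w in zip(tokens, weights[-1]):
    print(f"{tok:>5s}: {w:.2f}")
```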
>> And if you put the definition back up there again, if you put it back up on the screen, let's see it. Okay, it actually missed the word "large." So one of the problems with ChatGPT is that it's not always accurate. It's actually a large language model, and it says "state-of-the-art language model." And if you look at Google, Google has dominated AI for a long time, and they're well known as being the best at this. And apparently Google has their own large language model, an LLM, in play, and they have been holding it back from release because of backlash on the accuracy. Just that example you showed is a great point: it got it almost right, but it missed the key word. >> You know what's funny about that, John, is I had previously asked it in my prompt to give it to me in less than a hundred words, and it was too long. I said it was too long for Breaking Analysis, and there it went into the fact that it's a large language model. So it gave me a really different answer both times. But it's still pretty amazing for those of you who haven't played with it yet. And one of the best examples that I saw was Ben Charrington from the This Week In ML AI podcast. And I stumbled on this thanks to Brian Gracely, who was listening to one of his Cloudcasts. Basically what Ben did is he prompted ChatGPT to interview ChatGPT; he simply gave the system the prompts, and then he ran the questions and answers into this avatar builder and sped it up 2X so it didn't sound like a machine. And voila, it was amazing. So John, is ChatGPT going to take over as a cube host? >> Well, I was thinking, we get the questions in advance sometimes from PR people. We should actually just plug them into ChatGPT, add them to our notes, and say, "Is this good enough for you? Let's ask the real question." So I think there's a lot of heavy lifting that gets done. I think ChatGPT is a phenomenal revolution. I think it highlights the use case. Like that example we showed earlier, it gets most of it right. So it's directionally correct and it feels like it's an answer, but it's not a hundred percent accurate. And I think that's where people are seeing value in it: writing marketing copy, brainstorming, a guest list, a gift list for somebody. Write me some lyrics to a song. Give me a thesis about healthcare policy in the United States. It'll do a bang-up job, and then you've got to go in and massage it. So it's going to do three quarters of the work. That's why plagiarism and schools are kind of freaking out. And that's why Microsoft put 10 billion in, because why wouldn't this be a feature of Word, or the OS, to help it do stuff on behalf of the user? So linguistically it's a beautiful thing. You can input a string and get a good answer. It's not a search result.
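The ChatGPT-interviewing-ChatGPT experiment, and John's half-joke about feeding the pre-brief questions in, both boil down to a small prompt loop. The sketch below assumes a hypothetical chat(messages) helper standing in for whatever chat-completion endpoint you have access to; the function name, the role labels, and the seed topic are illustrative, not a specific vendor's API.

```python
# A minimal self-interview loop: one "interviewer" persona generates questions,
# a "guest" persona answers them. chat() is a hypothetical stand-in for any
# chat-completion endpoint; swap in your provider's client here.
from typing import Dict, List

Message = Dict[str, str]


def chat(messages: List[Message]) -> str:
    """Placeholder for a real model call; echoes a canned reply for the demo."""
    last = messages[-1]["content"]
    return f"(model reply to: {last[:60]}...)"


def self_interview(topic: str, turns: int = 3) -> List[Message]:
    interviewer: List[Message] = [{
        "role": "system",
        "content": f"You are a tech-show host interviewing a guest about {topic}. "
                   "Ask one sharp question at a time."}]
    guest: List[Message] = [{
        "role": "system",
        "content": f"You are the guest, an expert on {topic}. Answer concisely."}]
    transcript: List[Message] = []
    answer = f"Let's talk about {topic}."
    for _ in range(turns):
        # The host sees the guest's last answer and produces the next question.
        interviewer.append({"role": "user", "content": answer})
        question = chat(interviewer)
        interviewer.append({"role": "assistant", "content": question})

        # The guest sees the question and produces the next answer.
        guest.append({"role": "user", "content": question})
        answer = chat(guest)
        guest.append({"role": "assistant", "content": answer})

        transcript += [{"role": "host", "content": question},
                       {"role": "guest", "content": answer}]
    return transcript


for line in self_interview("data mesh and supercloud"):
    print(f"{line['role']:>5s}: {line['content']}")
```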
>> And we're going to get your take on Microsoft, but it kind of levels the playing field- ChatGPT writes better than I do, Sarbjeet, and I know you have some good examples too. You mentioned the Reed Hastings example. >> Yeah, I was listening to Reed Hastings' fireside chat with ChatGPT, and the answers were coming in voice format. And it was amazing; he was having a very philosophical kind of talk with ChatGPT, with longer sentences, like he was going on, just like we are talking. He was talking for almost two minutes and then ChatGPT was answering. It was not a one-sentence question and then a lot of answers from ChatGPT, and yeah, you're right. I've been thinking deeply about this since yesterday, when we talked about wanting to do this segment. The data is fed into the model, and it can be current data as well, but I think models like ChatGPT, and other companies will have those too, are democratizing the intelligence, but they're not creating intelligence yet, I can definitely say that. They will give you all the finite answers, like, okay, how do you do this for loop in Java versus C sharp, and as a programmer you can do that, but they can't tell you how to write a new algorithm or write a new search algorithm for you. They cannot create secret code for you to- >> Not yet. >> Have competitive advantage. >> Not yet, not yet. >> But you- >> Can Google do that today? >> No one really can. The reasoning side of the data is what we talked about at our Supercloud event with Zhamak Dehghani, who's now CEO of Nextdata. This next wave of data intelligence is going to come from entrepreneurs that are probably cross-discipline, computer science and some other discipline. But there are going to be new things, for example, around data and metadata. It's hard to do reasoning like a human being, so it needs more data to train itself. So I think the first gen of training for the large language models they have is a corpus of text, a lot of it blog posts, but the facts are wrong and sometimes out of context, because that contextual reasoning takes time, it takes intelligence. So machines need to become intelligent, and therefore they need to be trained. So you're going to start to see, I think, a lot of acceleration on training the data sets. And again, it's only as good as the data you can get. And again, proprietary data sets will be a huge winner. Anyone who's got a large corpus of content, proprietary content like theCUBE or SiliconANGLE as a publisher, will benefit from this. Large FinTech companies, anyone with large proprietary data, will probably be a big winner in this generative AI wave, because it will just eat that up and turn it back into something better. So I think there's going to be a lot of interesting things to look at here. And certainly productivity's going to be off the charts, and the internet is going to get swarmed with vanilla content. So if you're in the content business and you're an original content producer of any kind, you're going to be not vanilla, so you're going to be better. So I think there's so much at play, Dave (indistinct). >> I think the playing field has been raised, so we- >> Raised and leveled? >> Yeah, and leveled to a certain extent. So it's not like a few people, as consumers of AI, will have an advantage that others cannot have. It will be democratized, I'm sure about that. But if you take the example of the calculator, when the calculator came in, a lot of people said, "Oh, people can't do math anymore because the calculator is there," right? So it's a similar sort of moment, just like a calculator for the next level. But, again- >> I see it more like open source, Sarbjeet, because if you think about what ChatGPT's doing, you do a query and it comes from somewhere; the value of a post from ChatGPT is just a reuse of AI. The original content accent will come from a human. So if I lay out a paragraph from ChatGPT that did some heavy lifting on some facts, I check the facts, it saves me maybe- >> Yeah, it's productive. >> An hour of writing, and then I write a killer two, three sentences of sharp original thinking or critical analysis. I've then taken that body of work, open source content, and laid something on top of it. >> And Sarbjeet's example is a good one, because like the calculator, kids don't do math as well anymore, or the slide rule, remember we had slide rules as kids. Remember when we first started using Waze? We were this minority and you had an advantage over other drivers. Now Waze is, you know, social traffic navigation, and everybody has it- >> All the back roads are crowded. >> They're crowded. (group laughs) Exactly. All right, let's move on. What about this notion that futurist Roy Amara put forth, really Amara's Law, that we're showing here? The law is, "We tend to overestimate the effect of technology in the short run and underestimate it in the long run." Is that the case, do you think, with ChatGPT? What do you think, Sarbjeet? >> I think that's true, actually. There's a lot of- >> We don't debate this. >> There's a lot of awe, like when people see the results from ChatGPT, they say, what the heck? Like, it can do this?
But then if you use it more and more, and I ask a set of similar questions, not the same question, it gives you, like, the same answer. It's like reading from the same bucket of text (indistinct) where ChatGPT, you will see that in a couple of segments, it sounds so boring that ChatGPT is coming out with the same two sentences every time. So it is kind of good, but it's not as good as people think it is right now. But we will go through this, you know, hype sort of cycle and get realistic with it. And then in the long term, I think it's a great thing; in the short term, it's not something which will (indistinct) >> What's your counterpoint? You're saying it's not- >> No, I think the question was that it's hyped up in the short term and underestimated in the long term. That's what I think he said, quote. >> Yes, yeah. That's what he said. >> Okay, I think that's wrong in this case, because ChatGPT is a unique kind of impact and it's very generational. People have been comparing it, I have been comparing it, to the internet, like the web, the web browser, Mosaic and Netscape, right, Navigator. I mean, I clearly still remember the days seeing Navigator for the first time, wow. And there weren't many sites you could go to; everyone typed in, you know, cars.com. >> That (indistinct) wasn't that overestimated, overhyped at the beginning and underestimated? >> No, it was underestimated in the long run, people thought. >> But that's Amara's law. >> That's what it is. >> No, they said overestimated? >> Overhyped near term, underestimated long term. I got it right, I mean? >> Well, yeah, okay, so I would then agree, okay then- >> We were off the charts about the internet in the early days, and it actually exceeded our expectations. >> Well, there were people who were poo-pooing it early on. So when the browser came out, people were like, "Oh, the web's a toy for kids." I mean, in 1995 the web was a joke, right? So in '96, you had online populations growing, so you had structural changes going on around the browser, the internet population. And then that replaced other things: direct mail, other business activities that were once analog then went to the web, kind of read-only, as we always talk about. So I think that's a moment where the smart money and the smart industry experts all get the long term. And in this case, there's more poo-pooing in the short term. "Ah, it's not a big deal, it's just AI." I've heard many people poo-pooing ChatGPT, and a lot of smart people saying, "No, this is next gen, this is different, and it's only going to get better." So I think people are anticipating a big long game on this one. >> So you're saying it's bifurcated. There's those who say- >> Yes. >> Okay, all right, let's get to the heart of the premise, and possibly the debate, for today's episode. Will OpenAI's early entry into the market confer sustainable competitive advantage for the company? If you look at the history of tech, the technology industry is kind of littered with first-mover failures. Altair, IBM, Tandy, Commodore, and even Apple, they were really early in the PC game. They took a backseat to Dell, who came on the scene years later with a better business model. Netscape, you were just talking about, was all the rage in Silicon Valley with the first browser; drove up all the housing prices out here.
AltaVista was the first search engine to really, you know, index full text. >> Owned by Dell, I mean DEC. >> Owned by Digital. >> Yeah, Digital Equipment. >> Compaq bought it. And of course, as an aside, Digital wanted to showcase their hardware, right? Their supercomputer stuff. And then Friendster and MySpace came before Facebook. The iPhone certainly wasn't the first mobile device. So lots of failed examples, but there are some recent successes, like AWS and cloud. >> You could say smartphone. So I mean- >> Well, I know, and we can parse this, so we'll debate it. Now Twitter, you could argue, had first mover advantage. You kind of gave me that one, John. Bitcoin and crypto clearly had first mover advantage, and sustained it. Guys, will OpenAI make it to the list on the right with ChatGPT, what do you think? >> I think as a company it probably won't, but as a category, I think what they're doing will. So OpenAI as a company, they get funding, there's power dynamics involved. Microsoft put a billion dollars in early on, and then they just ponied it up. Now they're reportedly putting in 10 billion more. So, like with the browsers, Microsoft had competitive advantage over Netscape, used monopoly power, and was convicted by the Department of Justice for killing Netscape with their monopoly. Netscape should have won that battle, but Microsoft killed it. In this case, Microsoft's not killing it, they're buying into it. So I think the embrace-and-extend Microsoft power here makes OpenAI vulnerable as that one-vendor solution. So OpenAI as a company might not make the list, but the category of what this is, large language model AI, probably will be on the right-hand side. >> Okay, we're going to come back to the government intervention and maybe do some comparisons, but what are your thoughts on this premise here? The premise we put forth is that ChatGPT's early entry into the market will not confer competitive advantage to- >> For OpenAI. >> To Open- Yeah, do you agree with that? >> I agree with that, actually, because Google has been at it, and they have been holding back, as John said, because of the scrutiny from the Fed, right, so- >> And privacy too. >> And the privacy and the accuracy as well. But I think Sam Altman and the company, those guys, right? They have put this out there in a hasty way, you know, because it makes mistakes, and there are a lot of questions around where the content is coming from. You saw that in your example; it just stole the content, without your permission, you know? >> Yeah. So as a quick aside- >> And it codes on people's behalf, and those codes are wrong. So there's a lot of, sort of, false information it's putting out there. So it's a very vulnerable thing to do what Sam Altman- >> So even though it'll get better, others will compete. >> So look, just a side note, a term which Reid Hoffman used a little bit. Like he said, it's an experimental launch, like, you know, it's- >> It's pretty damn good. >> It is clever, because according to Sam- >> It's more than clever. It's good. >> It's awesome, if you haven't used it. I mean, you read what it writes and you go, "This thing writes so well, it writes so much better than you." >> The human emotion drives that too. I think that's a big thing. But- >> I want to add one more- >> Make your last point. >> Last one. Okay. So, but he's still holding back. He's conducting quite a few interviews.
If you want to get the gist of it, there's a StrictlyVC interview from yesterday with Sam Altman. Listen to that one, it's eye-opening where they want to take it. But the last point I want to make on this is that Satya Nadella yesterday did an interview with the Wall Street Journal. I think he was doing- >> You were not impressed. >> I was not impressed, because he was pushing it too much. So Sam Altman's holding back so there's less backlash. >> Got 10 billion reasons to push. >> I think he's almost- >> Microsoft just laid off 10,000 people. Hey ChatGPT, find me a job. You know, like. (group laughs) >> He's overselling it to an extent that I think it will backfire on Microsoft. And he's overpromising a lot of stuff right now, I think. I don't know why he's so jittery about all these things. And he did the same thing during Ignite as well. So he said, "Oh, this AI will write code for you and this and that." Like you called him out- >> The hyperbole- >> During your- >> from Satya Nadella, he's got a lot of hyperbole. (group talks over each other) >> All right, let's, go ahead. >> Well, can I weigh in on the whole- >> Yeah, sure. >> Microsoft thing on whether OpenAI, here's the take on this. I think it's more like the browser moment to me, because I could relate to that experience with ChatGPT, personally, emotionally, when I saw that, and I remember vividly- >> You mean that aha moment (indistinct). >> Like this is obviously the future. Anything else in the old world is dead, websites are going to be everywhere. It was just instant dot-connection for me. And a lot of other smart people saw this. A lot of people, by the way, didn't see it. Someone said the web's a toy. At the company I worked for at the time, Hewlett Packard, they could have been in, they had invented HTML, and so all this stuff was, like, they just passed, the web was just being passed over. But at that time, the browser got better, more websites came on board. So the structural advantage there was online web usage was growing, the online user population. So that was growing exponentially with the rise of the Netscape browser. So OpenAI could stay on the right side of your list as durable, if they leverage the category that they're creating, can get the scale. And if they can get the scale, just like Twitter, that failed so many times but still hung around. So it was a product that was always successful, right? So I mean, it should have- >> You're right, it was terrible, we kept coming back. >> The fail whale, but it still grew. So OpenAI has that moment. They could do it if Microsoft doesn't meddle too much with too much power as a vendor. They could be the Netscape Navigator, without the anti-competitive behavior of somebody else. So to me, they have the pole position. So they have an opportunity. And if they don't execute, then there's opportunity for others. There's not a lot of barriers to entry, vis-a-vis, say, the CapEx of a cloud company like AWS. You can't replicate that, many have tried, but I think you can replicate OpenAI. >> And we're going to talk about that. Okay, so real quick, I want to bring in some ETR data. This isn't an ETR-heavy segment, only because this is so new, you know, they don't have coverage yet, but they do cover AI. So basically what we're seeing here is a slide where the vertical axis is net score, which is a measure of spending momentum, and the horizontal axis is presence in the dataset. Think of it as, like, market presence.
And in the insert right there, you can see how the dots are plotted, the two columns. And so the key point here that we want to make is, there's a bunch of companies on the left, like, you know, DataRobot and C3 AI and some others, but the big whales, Google, AWS, Microsoft, are really dominant in this market. So that's really the key takeaway, that, can we- >> I notice IBM is way low. >> Yeah, IBM's low, and actually bring that back up, and then you see Oracle, who actually is injecting it. So I guess that's the other point: you're not necessarily going to go buy AI and, you know, build your own AI, it's going to be there, and Salesforce is going to embed it into its platform, the SaaS companies, and you're going to purchase AI. You're not necessarily going to build it. But some companies obviously are. >> I mean, to quote IBM's general manager Rob Thomas, "You can't have AI with IA," information architecture, and David Flynn- >> You can't have AI without IA. >> Without, you can't have AI without IA. If you have an information architecture, you then can power AI. Yesterday David Flynn, with Hammerspace, was on our Supercloud. He was pointing out that the relationship of storage, where you store things, also impacts the data and its accessibility, and Zhamak from NextData, she was pointing out that same thing. So the data problem factors into all this too, Dave. >> So you've got the big cloud and internet giants, they're all poised to go after this opportunity. Microsoft is investing up to 10 billion. Google's code red, which was, you know, the headline in the New York Times. Of course Apple is there, and there are several alternatives in the market today. Things like Chinchilla, BLOOM, and there's a company Jasper and several others, and then Lina Khan looms large, and the governments around the world, EU, US, China, are all taking notice before the market really has coalesced around a single player. You know, John, you mentioned Netscape, they kind of really, the US government was way late to that game. It was kind of game over. And Netscape, I remember Barksdale was like, "Eh, we're going to be selling software in the enterprise anyway." And then, pshew, the company just dissipated. So, but it looks like the US government, especially with Lina Khan, they're changing the definition of antitrust and what the cause is to go after people, and they're really much more aggressive. It's only what, two years ago that (indistinct). >> Yeah, the problem I have with the federal oversight is this: they're always, like, late to the game, and they're slow to catch up. So in other words, they're working on stuff that should have been solved a year and a half, two years ago, around some of the social networks hiding behind some of the rules around the open web back in the day, and I think- >> But they're like 15 years late to that. >> Yeah, and now they've got this new thing on top of it. So, like, I just worry about them getting their fingers in it. >> But it's only two years, you know, OpenAI. >> No, but the thing (indistinct). >> No, they're still fighting other battles. But the problem with government is that they're going to label Big Tech as, like, an evil thing, like Pharma, it's like smoke- >> You know Lina Khan wants to kill Big Tech, there's no question. >> So I think Big Tech is getting a seriously bad rap. And I think anything that the government does that casts a shadow on tech is politically motivated in most cases.
You can almost look at everything, and my 80/20 rule is in play here. 80% of the government activity around tech is bullshit, it's politically motivated, and the 20% is probably relevant, but off the mark and not organized. >> Well, market forces have always been the determining factor of success. The governments, you know, have pretty much failed. I mean, you look at IBM's antitrust, what did that do? The market ultimately beat them. You look at Microsoft back in the day, right? Windows 95 was peaking, the government came in. But you know, like you said, they missed the web, right, and >> so they were hanging on- >> There's nobody in government >> to Windows. >> that actually knows- >> And so, you, I think you're right. It's market forces that are going to determine this. But Sarbjeet, what do you make of Microsoft's big bet here, you weren't impressed with Nadella. How do you think, where are they going to apply it? Is this going to be a Hail Mary for Bing, or is it going to be applied elsewhere? What do you think? >> They are saying that they will, sort of, weave this into their products, Office products, productivity, and also to write code as well, developer productivity as well. That's a big play for them. But coming back to your antitrust sort of comments, right? I believe your comment was like, oh, the Fed was late 10 years or 15 years earlier, but now they're two years late. But things are moving very fast now compared to how they used to move. >> So two years is like 10 years. >> Yeah, two years is like 10 years. Just want to make that point. (Dave laughs) This thing is going like wildfire. Any new tech which comes in, I think they're going after the distribution channels. Lina Khan has commented time and again that the marketplace model is something she wants to have some grip on. Cloud marketplaces are kind of monopolistic in a way. >> I don't, I don't see this, I don't see a Chat AI. >> You told me it's not Bing, you had an interesting comment. >> No, no. First of all, this is great for Microsoft. If you're Microsoft- >> Why? >> Because Microsoft doesn't have the AI chops that Google has, right? Google has got so much core competency in how they run their search, how they run their backends, their cloud. Even though they don't get a lot of cloud market share in the enterprise, they've got a kick-ass cloud 'cause they needed one. >> Totally. >> They invented SRE. I mean, Google's development and engineering chops are off the scale, right? Amazon's got some good chops, but Google's got like 10 times more chops than AWS, in my opinion. Cloud's a whole different story. Microsoft gets AI, they get a playbook, they get a product they can render it into, not only Bing, but productivity software, helping people write papers, PowerPoint, and also, don't forget, the cloud, where AI can super help. We had this conversation at our Supercloud event, where AI's going to do a lot of the heavy lifting around understanding observability and managing service meshes, to managing microservices, to turning on and off applications, and maybe writing code in real time. So there's a plethora of use cases for Microsoft to deploy this. Combined with their R&D budgets, they can then turbocharge more research, build on it. So I think this gives them a card in the game. Google may have pole position with AI, but this puts Microsoft right in the game, and they already have a lot of stuff going on. But this just, I mean, everything gets lifted up. Security, cloud, productivity suite, everything.
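As an aside on John's point about AI doing the heavy lifting in cloud operations, here is a minimal sketch of that pattern: summarize telemetry, ask a language model for a suggested remediation, and keep a human in the loop before anything runs. The `ServiceMetrics` type, the `ask_model` stub, and the numbers are all hypothetical; a real setup would call whatever LLM API you use and wire the suggestion into your own runbook tooling.

```python
# Toy sketch of LLM-assisted ops: telemetry summary in, suggested action out,
# human approval before anything is executed. The model call is a stub.

from dataclasses import dataclass

@dataclass
class ServiceMetrics:
    name: str
    error_rate: float      # fraction of failed requests
    p99_latency_ms: float  # 99th-percentile latency

def ask_model(prompt: str) -> str:
    """Stand-in for a real LLM API call (hypothetical)."""
    return "Suggested action: roll back checkout-service to the previous release."

def suggest_remediation(metrics: ServiceMetrics) -> str:
    prompt = (
        f"Service {metrics.name} has error rate {metrics.error_rate:.1%} "
        f"and p99 latency {metrics.p99_latency_ms:.0f} ms. "
        "Suggest one remediation step."
    )
    return ask_model(prompt)

if __name__ == "__main__":
    suggestion = suggest_remediation(ServiceMetrics("checkout-service", 0.07, 1200))
    print(suggestion)  # An operator reviews this before anything is acted on.
```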
>> What's under the hood at Google, and why aren't they talking about it? I mean, they've got to be freaked out about this. No? Or do they have kind of a magic bullet? >> I think they have the chops, definitely. Magic bullet, I don't know where they are, as compared to the GPT-3 or 4 models. But if you look at the online sort of activity and the videos put out there from Google folks, Google technology folks, that's an account you should look at if you're looking there, they have put out all the techniques that GPT-3 has used, they have been talking about them for a while as well. So it's not like it's a secret thing that you cannot replicate. As you said earlier, like in the beginning of this segment, anybody who has more data and the capacity to process that data, and Google has both, I think they will win this. >> Obviously living in Palo Alto, where the Google founders are, and Google's headquarters next town over, we have- >> We're so close to them. We have inside information on some of the thinking, and that hasn't been reported by any outlet yet. And that is, from what I'm hearing from my sources, Google has it, they don't want to release it for many reasons. One is it might screw up their search monopoly; two, they're worried about the accuracy, 'cause Google will get sued. 'Cause a lot of people are jamming on this ChatGPT as, "Oh, it does everything for me," when it's clearly not a hundred percent accurate all the time. >> So Lina Khan is looming, and so Google's like, be careful. >> Yeah, so Google's just like, this is the third, could be a third rail. >> But the first thing you said is a concern. >> Well, no. >> The disruptive (indistinct) >> What they will do is do a Waymo kind of thing, where they spin out a separate company. >> They're doing that. >> The discussions are happening; they're going to spin out a separate company and put it over there, and say, "This is AI, we've got search over there, don't touch that search, 'cause that's where all the revenue is." (chuckles) >> So, okay, so that's how they deal with the Clay Christensen dilemma. What's the business model here? I mean, it's not advertising, right? Is it to charge you for a query? How do you make money at this? >> It's a good question. I mean, my thinking is, first of all, it's cool to type stuff in and see a paper get written, or write a blog post, or gimme a marketing slogan for this or that, or write some code. I think the API side of the business will be critical. And I think Howie Xu, I know you're going to reference some of his comments yesterday on Supercloud, I think this brings a whole 'nother user interface into technology consumption. I think the business model is not yet clear, but it will probably be some sort of either API and developer environment, or just a straight-up free consumer product with some sort of freemium backend thing for business. >> And he was saying too, natural language is the way in which you're going to interact with these systems. >> I think it's APIs, it's APIs, APIs, APIs, because these people who are cooking up these models, it takes a lot of compute power to train these, and for inference as well. Somebody did the analysis on how many cents a Google search costs Google, and how many cents a ChatGPT query costs. It's, you know, 100x or something like that. You can take a look at that. >> A 100x on which side? >> You're saying two orders of magnitude more expensive for ChatGPT? >> Much more, yeah. >> Than for Google.
>> It's very expensive. >> So Google's got the data, they've got the infrastructure, and they've got, you're saying they've got the cost (indistinct) >> No, actually it's a simple query as well, but they are trying to put together the answers, and they're going through a lot more data versus already-indexed data, you know. >> Let me clarify, you're saying that Google's version of ChatGPT is more efficient? >> No, I'm saying Google search results. >> Ah, search results. >> What we're used to today, but cheaper. >> But does that confer advantage to Google's large language (indistinct)? >> It will, because there's deep science (indistinct). >> Google, I don't think Google search is doing a large language model on their search, it's keyword search. You know, what's the weather in Santa Cruz? Or what's the weather going to be? Or, you know, how do I find this? Now, they have done a smart job of doing some things with those queries, auto-complete, redirect navigation. But it's not entity-based. It's not like, "Hey, what's Dave Vellante thinking this week in Breaking Analysis?" ChatGPT might get that, because it'll get your Breaking Analysis, it'll synthesize it. There'll be some, maybe some clips. It'll be like, you know, I mean. >> Well, I've got to tell you, I asked ChatGPT to, like, I said, I'm going to enter a transcript of a discussion I had with Nir Zuk, the CTO of Palo Alto Networks, and I want you to write a 750-word blog. I never input the transcript. It wrote a 750-word blog. It attributed quotes to him, and it just pulled a bunch of stuff and said, okay, here it is. It talked about Supercloud, it defined Supercloud. >> It's made, it makes you- >> Wow. But it was a big lie. It was fraudulent, but still, it blew me away. >> Again, vanilla content and non-accurate content. So we are going to see a surge of misinformation on steroids, but I call it the vanilla content. Wow, that's just so boring, (indistinct). >> There's so many dangers. >> Make your point, 'cause we're almost out of time. >> Okay, so the consumption, like how do you consume this thing? As humans, we are consuming it and we are, like, getting nicely, surprisingly shocked, you know, wow, that's cool. It's going to increase productivity and all that stuff, right? And on the danger side as well, the bad actors can take hold of it and create fake content, and we have the fake sort of intelligence, if you go out there. So that's one thing. The second thing is, we as humans are consuming this as language. Like we read it, we listen to it, whatever format we consume it in, but the ultimate usage of that will be when the machines can take that output from the likes of ChatGPT and do actions based on that. The robots can work, the robot can paint your house, we were talking about, right? Right now we can't do that. >> Data apps. >> So the data has to be ingested by the machines. It has to be digestible by the machines. And the machines cannot digest unorganized data right now; we will get better on the ingestion side as well. So we are getting better. >> Data, reasoning, insights, and action. >> I like that model, paint my house. >> So, okay- >> By the way, that means drones will come in, spray painting your house. >> Hey, it wasn't too long ago that robots couldn't climb stairs, as I like to point out. Okay, and of course it's no surprise the venture capitalists are lining up to eat at the trough, as I like to say.
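A quick editorial aside on the "100x" exchange above: the snippet below is only a back-of-envelope way to frame the per-query cost comparison. Both cost figures and the query volume are placeholder assumptions, not sourced numbers; published estimates vary widely, so treat the output as illustrative arithmetic, nothing more.

```python
# Back-of-envelope comparison of cost per query, using placeholder numbers.
# Neither figure is sourced; the point is only to show how the "order of
# magnitude more expensive" claim is framed and why volume makes it matter.

search_cost_per_query = 0.002   # assumed: a fraction of a cent for a keyword search
llm_cost_per_query    = 0.03    # assumed: a few cents for an LLM-generated answer

ratio = llm_cost_per_query / search_cost_per_query
print(f"LLM query is roughly {ratio:.0f}x the assumed search cost")  # ~15x with these inputs

# Multiply by an assumed Google-scale volume to see the incremental cost:
daily_queries = 8.5e9           # assumed queries per day
extra_cost_per_day = daily_queries * (llm_cost_per_query - search_cost_per_query)
print(f"Incremental cost at that volume: ~${extra_cost_per_day / 1e6:.0f}M per day")
```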
Let's hear, you referenced this earlier, John, let's hear what AI expert Howie Xu said at the Supercloud event about what it takes to clone ChatGPT. Please play the clip. >> So one of the VCs actually asked me the other day, right? "Hey, how much money do I need to spend, to invest, to get, you know, another shot at the OpenAI sort of level?" You know, I did a (indistinct) >> Line up. >> A hundred million dollars is the order of magnitude that I came up with, right? You know, not a billion, not 10 million, right? So a hundred- >> Guys, a hundred million dollars, that's an astoundingly low figure. What do you make of it? >> I was in an interview with, I was interviewing, I think he said a hundred million or so, but in the hundreds of millions, not a billion, right? >> You were trying to get him up, you were like, "Hundreds of millions." >> Well, I think, I- >> He's like, eh, not 10, not a billion. >> Well, first of all, Howie Xu's an expert in machine learning. He's at Zscaler, he's a machine learning AI guy. But he comes from VMware, and his technology pedigree is really off the chart. Great friend of theCUBE and kind of like a CUBE analyst for us. And he's smart. He's right. I think the barriers to entry from a dollar standpoint are lower than, say, the CapEx required to compete with AWS. Clearly, the CapEx spending to build all the tech to run a cloud. >> And you don't need a huge sales force. >> And in some cases apps too, it's the same thing. But I think it's not that hard. >> But am I right about that? You don't need a huge sales force either. It's, what, you know >> If the product's good, it will sell, this is a new era. The better mousetrap will win. This is the new economics in software, right? So- >> Because you look at the amount of money Lacework, and Snyk, Snowflake, Databricks. Look at the amount of money they've raised. I mean, it's like a billion dollars or more before they get to IPO. 'Cause they need promotion, they need go-to-market. You don't need (indistinct) >> OpenAI's been working on this for five-plus years; it wasn't born yesterday. It took a lot of years to get going. And Sam is depositioning all the success, because he's trying to manage expectations, to your point earlier, Sarbjeet. It's like, yeah, he's like, "Whoa, whoa, settle down everybody, (Dave laughs) it's not that great," because he doesn't want to fall into that, you know, hero-and-then-get-taken-down thing, so. >> It may take 100 million or 150 or 200 million to train the model. But for the inference, yeah, for the inference machine, it will take a lot more, I believe. >> Give it, so imagine, >> Because- >> Go ahead, sorry. >> Go ahead. But because it consumes a lot more compute cycles and a certain level of storage and everything, right, which they already have. So I think the compute is different. To train the model is a different cost. But to run the business is different, because I think 100 million can go into just fighting the Fed. >> Well, there's a flywheel too. >> Oh, that's (indistinct) >> (indistinct) >> We are running the business, right? >> It's an interesting number, but there's also, kind of, context to it. So here, a hundred million, spend it, you get there, but you've got to factor in the fact that the way companies win these days is critical mass scale, hitting a flywheel. If they can keep that flywheel of the value that they've got going on and get better, you can almost imagine a marketplace where, hey, we have proprietary data, we're SiliconANGLE and theCUBE.
We have proprietary content, CUBE videos, transcripts. Well, wouldn't it be great if someone in a marketplace could sell a module for us, right? We'd buy that, Amazon's thing and things like that. So if they can get a marketplace going where you can apply it to data sets that may be proprietary, you can start to see this become bigger. And so I think the key barrier to entry is going to be success. I'll give you an example, Reddit. Reddit is successful and it's hard to copy, not because of the software. >> They built the moat. >> Because you can take Reddit's open source software and try to compete. >> They built the moat with their community. >> Their community, their scale, their user expectations. Twitter, we referenced earlier, that thing should have gone under in the first two years, but it was such a great emotional product. People would tolerate the fail whale. And then, you know, well, that was a whole 'nother thing. >> Then a plane landed in (John laughs) the Hudson and it was over. >> I think verticals, a lot of verticals will build applications using these models, like for lawyers, for doctors, for scientists, for content creators, for- >> So you'll have many hundreds of millions of dollars of investments that are going to be seeping out. All right, we've got to wrap. If you had to put odds on it that OpenAI is going to be the leader, maybe not a winner-take-all leader, but, like, you look at Amazon and cloud, they're not winner-take-all, these aren't necessarily winner-take-all markets. It's not necessarily a zero-sum game, but let's call it winner-take-most. What odds would you give that OpenAI 10 years from now will be in that position? >> If I'm doing a 0 to 10 kind of thing? >> Yeah, it's like a horse race, 3 to 1, 2 to 1, even money, 10 to 1, 50 to 1. >> Maybe 2 to 1. >> 2 to 1, that's pretty low odds. That's basically saying they're the favorite, they're the front runner. Would you agree with that? >> I'd say 4 to 1. >> Yeah, I was going to say I'm like a 5 to 1, 7 to 1 type of person, 'cause I'm a skeptic with, you know, there's so much competition, but- >> I think they're definitely the leader. I mean, you've got to say, I mean. >> Oh, there's no question. There's no question about it. >> The question is, can they execute? >> They're not Friendster, is what you're saying. >> They're not Friendster, and they're more like Twitter and Reddit, where they have momentum. If they can execute on the product side, and if they don't stumble on that, they will continue to have the lead. >> If they stay neutral, as Sam has been saying, that, hey, Microsoft is one of our partners, if you look at their company model, how they have structured the company, they're going to pay back the investors, like Microsoft is the biggest one, up to a certain point, like by a certain number of years they're going to pay back from all the money they make, and after that, they're going to give the money back to the public, to the, I don't know who they give it to, like a non-profit or something. (indistinct) >> Okay, the odds are dropping. (group talks over each other) That's a good point though. >> Actually, they might have done that to fend off the criticism of this. But it's really interesting to see the model they have adopted.
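Circling back to Howie Xu's hundred-million-dollar figure, here is a rough, assumption-heavy sketch of the compute cost for a single GPT-3-scale training run, using the common ~6 × parameters × tokens FLOPs rule of thumb. Every input below (model size, token count, GPU throughput, utilization, hourly rate) is an assumed placeholder, not a disclosed OpenAI number; the point is that one run lands in the single-digit millions, which is consistent with a ~$100M budget covering many runs, research staff, data work, fine-tuning, and inference infrastructure rather than a single training job.

```python
# Rough, assumption-laden estimate of compute cost for one large training run.
# Uses the ~6 * params * tokens FLOPs rule of thumb; all inputs are assumptions.

params          = 175e9      # assumed model size (GPT-3 class)
tokens          = 300e9      # assumed training tokens
flops_needed    = 6 * params * tokens                # ~3.15e23 FLOPs

gpu_peak_flops  = 312e12     # assumed A100-class peak throughput
utilization     = 0.35       # assumed sustained utilization
gpu_seconds     = flops_needed / (gpu_peak_flops * utilization)
gpu_hours       = gpu_seconds / 3600

cost_per_gpu_hr = 2.50       # assumed blended USD rate per GPU-hour
compute_cost    = gpu_hours * cost_per_gpu_hr

print(f"GPU-hours: ~{gpu_hours / 1e3:.0f}k")                 # on the order of 800k
print(f"Compute-only cost: ~${compute_cost / 1e6:.1f}M")     # single-digit millions
```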
>> The wildcard in all this, my last word on this, is that if there's a developer shift in how developers and data can come together again, we have conferences around the future of data, Supercloud and meshes versus, you know, how the data world, coding with data, how that evolves will also dictate this, 'cause a wildcard could be a shift in the landscape around how developers are using either machine learning or AI-like techniques to code into their apps, so. >> That's fantastic insight. I can't thank you enough for your time, on the heels of Supercloud 2, really appreciate it. All right, thanks to John and Sarbjeet for the outstanding conversation today. Special thanks to the Palo Alto studio team. My goodness, Anderson, this is a great backdrop. You guys got it all out here, I'm jealous. And Noah, really appreciate it. Chuck, Andrew Frick and Cameron, Andrew Frick switching, Cameron on the video lake, great job. And Alex Myerson, he's on production, manages the podcast for us, Ken Schiffman as well. Kristen Martin and Cheryl Knight help get the word out on social media and our newsletters. Rob Hof is our editor-in-chief over at SiliconANGLE, does some great editing, thanks to all. Remember, all these episodes are available as podcasts. All you've got to do is search "Breaking Analysis podcast," wherever you listen. We publish each week on wikibon.com and siliconangle.com. Want to get in touch? Email me directly, david.vellante@siliconangle.com, or DM me at dvellante, or comment on our LinkedIn post. And by all means, check out etr.ai. They've got really great survey data in the enterprise tech business. This is Dave Vellante for theCUBE Insights powered by ETR. Thanks for watching, we'll see you next time on Breaking Analysis. (electronic music)