Amr Abdelhalem, Fidelity Investments | KubeCon + CloudNativeCon NA 2019

>> Announcer: Live from San Diego, California, it's theCUBE! Covering KubeCon and CloudNativeCon. Brought to you by Red Hat, the Cloud Native Computing Foundation, and its ecosystem partners.
>> Welcome back. I'm Stu Miniman, my cohost is John Troyer, and this is theCUBE's fourth year of coverage of KubeCon, CloudNativeCon 2019. We're here in San Diego and happy to welcome to the program a first-time guest, Amr Abdelhalem, who is the head of Cloud Platforms at Fidelity Investments. Of course, Fidelity, we love talking to an end user. Big financial company. Your boss was up on the main stage in front of 8000 people, just in that room; there's over 12,000 here in person. Fidelity itself, you know, founded in 1946, first computers in 1965. In the last year, you've now got over 500 applications running in the public cloud, and Fidelity also joined the CNCF. So let's start there, Amr, if we would. How does Fidelity look at Kubernetes and CNCF? How does that fit into your company's mission?
>> Absolutely, I mean, thank you so much for inviting me here. Innovation at Fidelity is a big part of the process. We're very focused at this time on cloud computing, machine learning, and AI technology. We had the first financial robot in 2015, I believe. We have the first augmented reality financial advisor, which was actually released this year as a prototype. So as part of that innovation, we're seeing CNCF, cloud computing, and Cloud Native as key to our innovation strategy.
>> All right, maybe if you could, give us a little bit of the breadth and depth of your team, what they cover, cloud platforms. What does that mean inside of Fidelity?
>> Sure, so Fidelity has over 10,000 people in IT. Hundreds and hundreds of development teams, thousands of applications. It's globally distributed. It has all kinds of workloads that you can imagine. And it's in a highly regulated environment as well.
And that's where we are seeing that we are all looking for this autonomy between teams, and agility, and improved time to market and customer experience. And the key for that is Cloud Native. We're seeing Kubernetes and CNCF and Cloud Native technology as a key player for us when we go to a multicloud, hybrid cloud model.
>> Can you talk a little bit more about that portfolio of technologies? You know, there's a lot of talk about public cloud versus on-prem, as if one thing, one knife, is going to be the only thing you need in your kitchen.
>> Amr: Right.
>> So you have a portfolio of platforms, you have a portfolio of destinations and a portfolio of applications. Can you talk a little bit, both about what you're using, and maybe how you're organized to access and address all those needs?
>> Absolutely. So, I think, 2019, I would say, is the year of the multicloud, hybrid cloud model, right? Actually, I would say that 2020 is going to be more about distributed cloud, where you can distribute your workload across multiple cloud providers. We're not there yet. I don't think anyone is there yet. But at least we should start somewhere. We already have this multicloud providing. Distributing the workload itself between, I mean, it's a journey to move thousands of applications and thousands of workloads, and data as well, from on-premises data centers to a public cloud. You need to move through this journey of hybrid cloud models. And be able to move apps slowly and aggressively to other apps.
>> All right. Amr, I want to dig into what you talked about there, multiclouds.
>> Sure.
>> So when you talk about multiple clouds, yes, everybody has that. Walk us through a little bit, you know, where you have workloads and how many public clouds you use in life, but I want to set you up with a premise. You know, we really said, for multicloud to really be a reality--
>> Amr: Right.
>> The value that you extract should be greater than the sum of its parts. And most of us lived through the multi-vendor years, and that wasn't necessarily happiness and joy, when I had to span between those environments. So how do we make sure that multicloud doesn't become the least common denominator or a detriment to what I need to do with my data, my applications, the value that the company has?
>> And that's why we are here. We are actually incorporated at KubeCon for that reason. That's where we see this abstraction layer that guarantees you the portability for moving your application from one cloud provider to another. That capability, the ability to deploy the same workload into multiple clouds, the ability to have the workload itself managed with different characteristics, next to the services that you will find in AWS versus Azure versus Google Cloud and the others. That's where we need that flexibility, and Kubernetes and Cloud Native itself, the ability to have the same deployable structure for your application, the ability to have the same ecosystem around that construction, around that artifact. The ability to move all of that, as-is, from one cloud provider to another cloud provider is a big, big key. And that you can only find with Cloud Native.
>> All right, Amr, can you share which cloud or clouds you're working on today, and what is your roadmap? Do you have a timeline for when that vision becomes reality?
>> At this moment, we're with the major cloud providers; you guys can name them, all the colors.
>> Stu: You're using all of them, okay.
>> All the colors.
>> And how are you using Kubernetes today? Where are you in that journey?
>> So Kubernetes is mainly, I mean, I would say the majority is still running on premises. We are very intensively moving to public cloud on the Kubernetes side. At this moment, actually, we're building an offering inside my team, which is the cloud platform team.
That offering will guarantee that portability between all the cloud providers. So for development teams porting to our platform, it will be kind of seamless for them where it's going to land, whether it's going to land in AWS or Azure or on premises.
>> Okay, joining the CNCF as a member, bring us inside. I understand the journey. Are there any specific goals you have? How do you measure the investment, and what are you hoping to, both as a company as well as part of the community, get out of it?
>> So we have a big goal right now to open source our project, our little project about multicloud, and our focus is mainly on the high regulation part. We're very focused on compliance and security, and in that way, I think, we can contribute back to the open source community around that.
>> So Amr, you talked about, you know, we talked about the platforms here, and Kubernetes, but that goes hand-in-hand with the culture, and the up-skilling, and the organization and the processes. What intrigued me is you said, well, we put some things on Kubernetes on-prem, and then, you know, some things in the cloud, but then we're going to move some of those apps over time, we'll move to other appropriate homes. So that implies that you've changed process and you've changed, or maybe to be able to build cloud native apps, and that was actually separate, in some cases, from being in the public cloud. Is that the case? Can you talk a little bit about how you've approached that, from the perspective of people who are listening or watching who are IT admins, and wondering how a company, a major organization, like your org, gets there?
>> Right, and this is a main challenge. The challenge is not on the technology side itself, or the tools; that seems to be mostly there in the ecosystem at this moment. The challenge is mainly building the culture inside teams. So we're building many, like, star-point or COEs across all of our business units and all of our teams.
And again, to build a culture across 10,000 developers plus, that's a major challenge.
>> And it's funny, because sometimes people go, well, COE is a dirty word, right, don't do a COE, but you said multiple COEs distributed across.
>> So it's like a nuclear reaction: our COEs, the first one will communicate with a few COEs, each one of them will communicate with other COEs, and that's how that chain will go and expand quite quickly.
>> All right.
>> And this is happening at this moment.
>> So, Amr, I have a few friends for whom this is the first time that they've come, and they go into the keynote, or they look at the schedule, and they're a bit overwhelmed.
>> Amr: Right.
>> They say, it's not just Kubernetes, there's dozens and dozens of projects. The ecosystem is sprawling. If you could, give us a little walkthrough as to the projects you're using, any key partners that you're allowed to talk about that are useful in helping you to achieve your mission.
>> So, we're very focused at this moment, actually, on the Kubernetes project itself. We've started exploring some of the open source projects in the CI/CD part. In addition to that, we are starting to use a few frameworks like Flux, which is one of the frameworks for GitOps in general, building this culture of GitOps deployment, and moving toward, like, more ops of deployment; that's one of the areas that we are very invested in. We're exploring service mesh at this time, and I hope, like, maybe next year we can talk about service mesh more.
>> Yeah, is there something that's holding you back on service mesh? 'Cause there's a few options out there at various maturity levels, and who's driving them. What will some of your criteria be?
>> I would say it's mainly, I'm waiting a little bit more. I feel like it's 2014 for me. If we had that discussion sitting here in 2014, you would be discussing Mesos versus Kubernetes versus Swarm. So I think we are still moving at this time, with service mesh as well.
>> Any partners that you can speak to from a technology standpoint that are helping you, that you're allowed to talk about?
>> Amr: Well, I mean, first of all, CNCF.
>> Yeah.
>> I greatly appreciate all their help in that. Most of the public cloud providers are helping us in these areas as well, yeah.
>> I'll be interested in catching you after the show and seeing what you thought. I mean, this is, in some ways, it was a science project a few years ago, and now it's this robust thing. I'm curious, did you bring mostly engineers, mostly managers, a mix of the two?
>> Amr: Mostly engineers, yeah, mostly engineers.
>> Hands on?
>> All hands on. I mean, this is like another change in culture right now, where most of our engineers are in innovation, like, they are full stack engineers. We're using VDI process at this moment to move forward. All our road maps, in turn, have been published; it's being used like an evolving process, to go, like, with continuous deployment and continuous feature enhancement for the teams. So it's fantastic, honestly, yeah.
>> Okay, Amr, what things does your team hope to achieve this week? Anything that is on your roadmap, or on the public open source road map, that you're waiting on? We talked a little bit, service mesh?
>> We're definitely exploring OPA at this moment. I think there's big potential there. So that's one of them, yeah. I think going through that showroom and trying to see what options we have as well, that's an area where we're going to be very interested.
>> OPA, the Open Policy Agent. I mean, you talked about compliance before.
>> Yeah.
>> A few years ago, with folks in the financial industry, you would have some arguments, some discussions, sometimes heated discussions about security in the cloud, et cetera, and a highly regulated industry, yet, kind of, maybe ironically or, maybe surprisingly for some, right? Very advanced in many areas, the whole industry. That's well known if you're in it.
Do you still have to have discussions about compliance and security in the cloud? Maybe, I guess, maybe when you talk about data locality and international borders more?
>> Right, and that's why we already have our own policy management tool, which is built in-house; we built it ourselves. And that's where I see the potential in our moving from building it ourselves to using an open source project, trying to reuse it and contributing back to that open source community, something like OPA, for example. So that's the next generation, where I can see it will help us as well.
>> Amr, any advice you'd give your peers out there, if they're new to the community? Things you've learned along the journey so far?
>> I would say start small, don't boil the ocean. Start with small COEs, small pilot programs. Look for success, look for goals. Technology is great, but don't just move toward technology, because it's a moving target, it will never end. Try to set business goals for yourself, like targets for your project, and that's how you can achieve success.
>> Well, Amr, really appreciate you sharing Fidelity's update.
>> Thank you.
>> Wish you and your team the best of luck here at the show and beyond, and we definitely hope to catch up soon.
>> Thank you, I appreciate it.
>> All right, for John Troyer, I'm Stu Miniman. Be sure to check out theCUBE.net for all of the coverage of this, as well as all the cloud, Cloud Native, and more shows that we have. Thank you for watching theCUBE. (upbeat electronic music)

Published Date : Nov 19 2019


Amr Awadallah - Hadoop Summit 2013 - theCUBE - #HadoopSummit

>>Come back here. This is Silicon Valley coverage of Hadoop Summit. I'm John Furrier, the founder. We're pleased to have a friend inside the Cube. It's rare to have such luminaries: Amr Awadallah, good friend and also co-founder of Cloudera. Really the pioneer in the space that helped build this industry that we're living here at Hadoop Summit. I'm with Dave Vellante from Wikibon.org. Amr, welcome back to the Cube, Cube alumni. Thank you for having me here. Wow, what a journey. You co-founded Cloudera. I remember when you were in stealth mode, I really can't talk about it. And then of course the history of SiliconANGLE being, you know, founded and kind of built in your office when you only had like 20-something employees. Yep. We owe a great deal of gratitude to you, and congratulations to you, Michael Olson, the team, for building an industry. So I just wanted to say thank you. Thank you. And welcome to the Cube. >>Thank you. It's great to be here. >>So what do you think, what's your take on the current Hadoop ecosystem right now? I mean, obviously a lot's happened. I mean it's big now. It's growing up fast. Yeah. The word enterprise grade is out there. You're seeing it move from, you know, trying to change the world. Our first interview, you said, I've seen the future, I want to bring it to the mainstream. It's here. Yeah. It's hitting mainstream right now. Yeah. What's your take on the current situation of the ecosystem and its value? >>Yeah, so I have a quick question first. Should I look to you or look to the camera? Look to >>The camera or both? Whatever you'd like. >>So I think the ecosystem is definitely growing, which is very, very healthy. However, there is a side question there, which is what do you think of all the competition coming into the space? So five years ago when Cloudera was started, it was just Cloudera. There was no other commercial vendor trying to support or enable Hadoop in the industry for enterprises.
And today there are at least 10 of them trying to compete with us, right? And that includes big companies, established companies that decided, hey, we're gonna start addressing the space, but it also includes many, many newcomers, like Hortonworks, that were founded over the last couple of years. That's a healthy thing. I mean, that's absolutely a sign of a growing market. If the market wasn't growing, if there wasn't money in the market, if it was just hype, there wouldn't have been all of these new companies and new ventures showing up. That said, I never look at competition as something that worries me, that I'm afraid now of what's gonna happen to me. That's normal. That's exactly what happens to successful companies. If you look at Red Hat, when Red Hat was launching with Linux, they had 25 competitors or even more, 30 competitors. That's when Red Hat was forming. And today, even of these 25, 30 competitors, there are still six or seven left. So I think it's a very, very healthy sign of the growth of this market and the maturity that it's reaching. >>What do you think about some of the white spaces that are evolving? You guys have obviously been involved in a lot of deployments at Cloudera. Again, you're doing a lot of work with the top names, and the clients that you have aren't usually disclosed cuz you really can't disclose them. What are you seeing right now as the white spaces for things to do in the Hadoop platform? >>It's a very, very good question. So first, I can't talk about the future roadmap. Right now we're becoming a big company, at that level where we can't comment on future roadmaps. >>Ah, that's a sign of the >>Times. You're well media trained, good to see they're doing a good job keeping you >>A, You want more information on that? I can connect you with a pt, >>Please. No, no, no, we're good. We're good. We'll get it outta you.
But, >>But our vision for Cloudera from day one, like you were saying earlier, we saw the future, right? So our vision from day one was really to build this data system where we can have data of any type, whether that data is structured or unstructured or images, it doesn't matter. And then on top of that data, run any type of workload. That workload could be the initial genesis of Hadoop, which is MapReduce, which is batch processing. But now, as we made many announcements through the last few years, we also have Impala for interactive analytics as a workload. We have a very, very strong partnership with SAS for doing machine learning and statistics as a workload. And a few weeks ago we announced search as another workload. So you have multiple types of workloads that can handle different types of problems that you have within your organization and bring all of these workloads to all of your data regardless of type. And that's the vision that we'll continue to deliver on. That's exactly what we're building going into the >>Future. So how's that fit in with YARN, right? We're hearing a lot at this conference about YARN, the ability to, you know, do more with less, a lot of the things that you typically hear within the enterprise. And so talk about that a little bit. >>YARN is a very core part of our platform. In fact, YARN has been part of CDH4 for more than a year now out in the market. So I think we were the first vendor who brought YARN into a distribution of Hadoop out there. It's very, very fundamental to us because that is how we're gonna coordinate. We are gonna be using YARN to coordinate launching all of these different types of workloads. You're gonna have the MapReduce workload, which is very batch oriented. The Impala workload, which is very latency sensitive. The search workload, which is also very latency sensitive.
The machine learning workload, which is more batch oriented, et cetera, et cetera. And YARN is a very, very central piece to helping us coordinate all of these different types of workloads onto the >>Platform. Cloudera has been a great citizen in the community also. You mentioned, and we witnessed, that your team created the industry. You guys were there, you took the chance, you were the first ones commercially funded by the venture capitalists, you know, then others followed, and now there's a huge ecosystem here. Yes. A lot of noise. A lot of people trying to get attention. So I've got to ask you, because I want you to address this, because I know it's been talked about in some of the other blogs: there's a lot of FUD going on around who's doing what, and in some cases maybe flat out, you know, misinformation, and that happens in a growing market, you know, the elbows get sharp. Yes. So I want you to share with the audience anything that you want to say about the FUD around what people say about Cloudera or about others or what you're doing. Just to clarify, cuz there has been, I mean I've gotten back channel information around, you know, not sure the committers this, and it's been well documented. There's a lot of FUD out there. What would you say to the folks out there to clarify >>That? Yes, I would say that our focus should be to continue to work as a community, to push the platform forwards. I would say that at Cloudera we do a lot of contributions. Hortonworks definitely is one of the top contributors out there as well. I'll acknowledge that. So are many, many other companies, and we wanna continue to see the platform evolve. I will stress though that at Cloudera we do have a number of the original project founders working at the company. So it's not just the contribution that we bring, but the fact that we have the founders of these projects working at Cloudera.
And some of these projects actually were created at Cloudera from day one, as opposed to created in some other company, and then you hire the employee and they work for you. So I'll give you examples from Cloudera: Doug Cutting. >>He is the creator of Hadoop. Doug Cutting is also the creator of Lucene, which became Solr, which is part of the search project that we launched recently. Doug Cutting wasn't with Cloudera from day one, right? So when he created these technologies, he actually was at Yahoo, for example; when he created Hadoop he was at Yahoo, wasn't at Cloudera. However, he now works for Cloudera. So we get that because now Doug Cutting works for Cloudera. So that's one example. On the flip side, there are projects like Flume and Sqoop that are now part of every single distribution out there. And Flume and Sqoop were both created at Cloudera. They were actually created inside of Cloudera. Yeah. So the key point is, and that's what I would like all of the vendors out there that are trying to leverage Hadoop and get benefit out of Hadoop to hear: please don't be just takers. >>There are some vendors out there who are just takers. Just wanna take from the open source, take from the open source and don't give back. Right? I'm not gonna name them, but there are a few of them out there. Please, please, please. I mean, that is very, very selfish behavior. It's not gonna help the ecosystem in the long term. We would like to see you both take and give at the same time. So that would be my core message. And that's, for example, why I thank Hortonworks, because that's exactly what Hortonworks is doing. They're both giving and taking at the same >>Time. You guys have always been clear on that. I mean, your contribution to open source has been well documented and there's no question about that. John and I have talked about it a lot, that you guys helped get it all started.
And even Haak when we had 'em on a couple years ago, when Hortonworks came to the market, said, Hey, the more people work on an open source, the better. >>Yeah, >>Exactly. So yeah, it's always been your posture. You're not playing games there. Anyways, having said that, you have a strategy to layer on top of that open source some of your own proprietary code. And so you have choices to make. Yes. In terms of how you allocate those resources. So as an engineering manager, how do you allocate those resources in terms of, okay, what do we do for the community and what do we do for our own, you know, future because of the business model that we chose? How do you make those trade-offs? >>Yes, that's a very, very good question. So first it's important to stress that our core platform, CDH, is open source. Everything we put in the core platform is open source. So for example, Impala, which we launched very recently as GA (we launched the beta last year, but now it's GA), is a hundred percent Apache licensed, a hundred percent open source. Search, which we announced very recently, is also open source. So the platform itself, we're committing to everything in there being open source. Now we believe fundamentally, just from having lots of history in studying the open source markets, from our CEO Mike Olson himself being one of the very first open source people in the world with Sleepycat, the company that he sold to Oracle before founding Cloudera, from our investors helping many other open source companies: to have a successful open source company, you need to have a very good engine between the business model that generates revenue and the product that you are creating. If you don't have a good feedback loop there between these two, you won't be able to sustain the innovation to continue to push the boundaries of how good the product is.
So we strongly believe in that. If your product is literally a hundred percent open source, meaning both the management and every, there is nothing proprietary whatsoever inside of your products. I can't tell what that is. It's >>Taking a picture. >>Oh, sorry, I thought somebody was waiting >>For me. >>Sorry about that. >>It's a cheap signal. >>It >>Was like a's really good. >>I thought it's like a card of paper with some writing. You, >>You have fans out there. They're storming the concert here. >>Okay, that's good to hear. That's good to hear. Sorry about that interruption. So if you have everything a hundred percent open source, that creates two problems. First, you have no differentiation whatsoever, meaning another big corporation, without naming who the big corporations could be, can just take everything you do, literally every single bit of source code you have, and say, Hey, we can do it too. Come to us, don't work with those guys. Right? We have the latest, greatest things that they have. Why do you wanna continue to work with them? So no differentiation is number one, which is very dangerous. And number two, if it's a hundred percent open source and there are lots of other vendors able to take the open source artifact and work with it, then it becomes purely about maintenance and insurance on the products, which is a commodity product, and obviously the prices for that will go down to the ground, and you won't be able to sustain this positive feedback effect between your business model and your product roadmap, and you won't be able to build a long-lasting company. >>So that's why we do have a combination of open source artifacts and proprietary artifacts. Now our proprietary artifacts are always around the management of the system, right? So how do we manage the security of the system? How do we manage the data flow within the system?
How do we manage the services inside of the system across all layers, right? Not just the Hadoop layer but the HBase layer, the ZooKeeper layer, et cetera, et cetera. So that's where we focus our efforts going forward and that's how we differentiate ourselves from other vendors out there. Cloudera Manager and Cloudera Navigator are very unique to us. Nobody else has anything close to those capabilities out there. >>So it sounds like the contributions you make to open source are cultural in nature, I mean DNA of sorts. Right. And so that's something that you guys do cuz you've always done it. Absolutely. And then the artifacts that are proprietary are essentially around rationalizing the revenue opportunity with the expense that you're gonna apply there and making a business case decided >>How to balance. That's one. And then two, the differentiation from other competitors. So these two things. Yes. >>Okay. >>I believe that's fundamental to open source business models. >>Yeah, I mean there are many open source business models, right? You can go pure service, you can go, like you said, you can totally bogart the code. >>There is no pure service open source model company that was able to build a long-lasting, surviving public company; it never happened in history. They always get acquired because it becomes a commodity. I >>Mean, right. I mean, and even IBM, right? >>Amr, I want to ask you about the storage thing. We were talking before camera, the Hortonworks storage announcement, what's your take on that? >>Which one? The Gluster, the one with Red Hat? Yes. Yes. So Red Hat, and yeah, there has been recent news about Red Hat with Hortonworks having a version of the Hadoop platform that uses MapReduce for the computation but uses Red Hat for the storage, right? So Red Hat has a new storage offering that was built based off of a company they acquired that was called Gluster.
And that news was very, very surprising to me. And the reason why it was surprising is that it's correlated with a shift in messaging from Hortonworks. If you look at Hortonworks at Hadoop Summit last year, one of the key messages that they delivered to us was that within the next five years, or by 2015, that was the tagline back then, and you're doing research right now to see if I'm saying the right thing, by 2015 half the world's data will be stored in Hadoop. >>Yes. >>If you look today at the slides, it doesn't say that. >>It says within five years, right? >>No, no, no. It says, well... >>That was the second iteration, within five years. And now they say something different. >>Now they say that by 2015, half the world's data will be processed by Hadoop, instead of stored by Hadoop. And that's a very, very fundamental... >>So it's a nuance. >>It's a very important nuance. >>Well, it's a big deal, because when I first saw that I said, hmm, what does this all mean? And 2015 sounds a little early. >>Yes. >>And now you're saying "processed by." Okay, that's different. >>Yes, exactly. And the reason why is that we believe HDFS is very, very core to the Hadoop platform. It's the storage system of Hadoop. It's really the layer that made Hadoop what it is more than anything else: how scalable, how reliable, and how economical the HDFS storage layer is. So we really ask Hortonworks, and ask all the companies working in the Hadoop community, not to fragment at the storage layer. We need the storage for Hadoop to stay inside of Hadoop and not to fragment that out. That's very, very critical. >>Okay. So, but... >>You're saying that they're indicating through this gesture that... they've not come out saying "we're going to fragment HDFS," but the way that this is positioned might signal... >>No, no, no.
The announcement with Red Hat is... >>That is the direct signal. >>It's literally that you'll be able to run MapReduce directly on top of Red Hat storage instead of HDFS. >>Okay. So I interpreted it as Hortonworks just hedging on its prediction, and I said, okay, I'll give them a break on that. You're saying it's something different. >>It's a shift in strategy, potentially. Yeah. Which can be dangerous. >>Is that a compliance issue? Because, you know, Hadoop's POSIX... >>Yeah. >>Red Hat does have a lot of enterprise customers. So is that just maybe... >>Then invest in making Hadoop POSIX compliant, which actually, by the way, we are investing in as a community. So we are investing in adding POSIX compliance to Hadoop, and we're investing in adding snapshots into Hadoop, which will be coming very, very soon. >>Well, do you think that, pick a year, I don't care if it's 2015, 2020, 2022, whenever, the majority of the world's data will be running in Hadoop? >>The majority of the world's data that has to do with analytics, yes. >>Okay. So that is... >>It's a very important caveat. Yes, exactly. Because there are lots of types of data that are not very suitable for Hadoop at all. For example, the data storage for Oracle database systems. No, you want to store that in NetApp or EMC; you don't want to store that in Hadoop. The data storage for streaming video files, right? For just streaming lots and lots of video files. No, you don't want to store that in Hadoop. >>It's a huge proportion of the data. >>Yeah, which is a huge, huge proportion of data. >>Video files, in fact, could overwhelm the data. >>Yeah. So the nuance: I would say I agree with the "half" thing, but half within the world of data for the purpose of analysis. >>Yeah. Okay. So that narrows it down. >>Yeah, okay.
But it's more reasonable. >>It's still a huge market, by the way. >>It is. Yes. Okay. So what's next for you? You've gone on this journey, you started this company. You've been traveling around like crazy, working with customers. What's the next phase of Amr Awadallah's, you know, career? What do you want to have happen next? I mean, what excites you? What are you working on? >>Yeah, it's just to continue to grow Cloudera to be the biggest company it can be. I mean, we want to be one of the very few companies that were able to take an open source model and turn that into a large publicly traded corporation. >>So you've talked about that. You guys brought a new CEO on, right? Look at the background of the CEO and, you know, clearly he's got some IPO chops. >>Yes. >>So that's an aspiration that you guys have put forth. Okay. >>And you're outward facing now, so you're doing a lot of travel. >>Yes. >>So where have your travels taken you? You've been in China, and obviously you've got a European office open. So what's going on internationally? Give us some sound bites of what's happening in the field. >>Yeah, so internationally, Europe definitely is our next big focus right now. We now have a big operation in Europe, an office presence and a big team there, and it's growing very quickly. I would say Europe is about two years behind the US; that's how the growth usually maps to what's happening here. So our next big market is Europe. We are looking at China. We don't have a big presence in China right now. We have a big presence in Japan, and Japan is growing very quickly. And obviously Canada, along with the US, is growing very quickly as well.
>>Great to have you on theCUBE again, for me personally and for Dave. And I want to say thanks to Cloudera for some great support over the years. You guys have been fantastic. You know, I'd say you've built a great company. It's so hard to build a company. You guys have done a great job. I've got to ask you the final question, because you did bring that first sound bite, which was "I saw the future." This is back when you guys were just in your B round, in the Palo Alto office, just starting to ramp. What's next? What do you see as around the corner? Obviously we're on a trajectory right now. A lot of things are going to get done. POSIX compliance, a lot of stuff's going to fill in. The platform's going to get stronger. >>Yeah. >>We think that open source will win, through all the democratization of open source. What's next? What's around the corner that you're watching personally, that's interesting to you, around where this will take us? >>Yeah. So what's next is having this vision come true, this future vision that you refer to. Meaning having a single platform that can store all of your data, regardless of the type of that data, and allow you to extract value with different types of workloads, whether that be batch, interactive, machine learning, or search, or more, right? There will be more things that come to the platform. But it's how to bring all of your data applications to your data, as opposed to having the data go to them. >>And what are the landmines out there that the industry and community need to avoid to make that a reality? >>The key landmine, it's a bit technical.
The landmine is making sure that the YARN vision continues to evolve, and that we have the capability to properly have a multi-workload resource management system that allows me to run all of these types of workloads without having them step on each other's toes. That's the key step going forward. >>And of course, playing well together in the sandbox. And as always, competition is good. And again, Hadoop is doing great. Amr Awadallah, co-founder of Cloudera, inside theCUBE. This is SiliconANGLE and Wikibon's exclusive coverage of Hadoop Summit here in Silicon Valley. We'll be right back with our next guest after the short break.

Published Date : Jun 27 2013


ENTITIES

Entity	Category	Confidence
Michael Olson	PERSON	0.99+
John	PERSON	0.99+
Europe	LOCATION	0.99+
Mike Olson	PERSON	0.99+
six	QUANTITY	0.99+
John Fur	PERSON	0.99+
China	LOCATION	0.99+
Dave	PERSON	0.99+
Amma Aala	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
Silicon Valley	LOCATION	0.99+
Horton Works	ORGANIZATION	0.99+
Japan	LOCATION	0.99+
2015	DATE	0.99+
25	QUANTITY	0.99+
last year	DATE	0.99+
seven	QUANTITY	0.99+
Oracle	ORGANIZATION	0.99+
Palo Alto	LOCATION	0.99+
25 competitors	QUANTITY	0.99+
Dave Ante	PERSON	0.99+
Ama Aala	PERSON	0.99+
two	QUANTITY	0.99+
two problems	QUANTITY	0.99+
Red Hat	ORGANIZATION	0.99+
30 competitors	QUANTITY	0.99+
Calera	ORGANIZATION	0.99+
today	DATE	0.99+
First	QUANTITY	0.99+
both	QUANTITY	0.99+
ADU Summit	EVENT	0.99+
Hortonworks	ORGANIZATION	0.99+
five years ago	DATE	0.99+
second iteration	QUANTITY	0.99+
one	QUANTITY	0.98+
22,000	QUANTITY	0.98+
Horton	ORGANIZATION	0.98+
first vendor	QUANTITY	0.98+
five years	QUANTITY	0.98+
hundred percent	QUANTITY	0.98+
Red Hat	TITLE	0.98+
Canada	LOCATION	0.98+
Tia	ORGANIZATION	0.98+
Tom	PERSON	0.98+
Hor Works	ORGANIZATION	0.97+
first	QUANTITY	0.97+
Horton	PERSON	0.97+
two things	QUANTITY	0.97+
first interview	QUANTITY	0.97+
Stealth Mo	LOCATION	0.97+
half	QUANTITY	0.96+
Haak	PERSON	0.96+
one example	QUANTITY	0.96+
Hadoop Summit 2013	EVENT	0.95+

Dr. Amr Awadallah - Interview 1 - Hadoop World 2011 - theCUBE

Okay, we're back live in New York City for Hadoop World 2011. I'm John Furrier, the founder of SiliconANGLE.com, and we have a special walk-in guest, Amr Awadallah, the VP of engineering and co-founder of Cloudera, who's going to be on at 2:30 Eastern time on theCUBE to go more in depth. But since we saw him in the hallway and we had a quick spot, we wanted to grab him in here. This is theCUBE, our flagship telecast, where we go out to the events and talk to the smartest people, and I'm here with my co-host. >>I'm Dave Vellante of Wikibon.org. Welcome back. You're a longtime CUBE alum, so we appreciate you coming back on and doing a quick drive-by here. >>Thanks for the nice welcome. >>So, you know, we go talk to the smart people in the room. You're one of the smartest guys that I know, and we've been friends for years. It was my tweet, heard around the world by you, to find space, and we've been sharing the office space at Cloudera for a year. Now we're going to be trying to find space, because you're expanding so fast that we have to get a new home. >>Sorry about that. >>But I wanted to really thank you personally here, live. You've enabled SiliconANGLE and Wikibon; we figured it out early because of you. I mean, we had our nose sniffing around the big data area before it was called big data, but when we met and talked, we'd been tracking the social web, and really it's exploded in an amazing way. I'm just really thankful, because I've had a front-row seat in the trenches with you guys, and it's been amazing. So I want to thank you. >>You're welcome, and it's great to have you on board. >>And so you've been evangelizing in the trenches. You were at Yahoo, you were an EIR at Accel Partners, who are announcing the hundred million dollar fund, which is all great news today. But you've been the real spark at Cloudera. There are others, but I know you're one of the main sparks, as a co-founder. >>A lot of it goes to Jeff Hammerbacher, my co-founder from Facebook. I mean, we both, we said this before, like, we saw
the future. Like, in our companies, we saw the future, where everybody is going to go next. >>And now Jeff's going to be on as well. He's now taking on this whole data science thing, building out a team. We'll drill down on that with him. What do you think about all this? I mean, right now, how do you feel personally, emotionally, looking at the marketplace? Share with us. >>Yeah, I'm very emotional today, actually. Lots of good news. You heard about the funding news? >>Yes, the hundred million dollars for startups. >>No, but the 40. >>Oh yeah, yeah. >>The news was actually supposed to come out today; it came out a bit earlier, yesterday. But yeah, I'm very, very emotional because of that. It's a testament from very big-name investors to how well we are doing, and recognition of how big this wave really is. Also the hundred million fund from Accel, that's also a huge testament, and hopefully lots of new innovations and startups will come out of that. So I'm very emotional about that, but also overwhelmed by the size of this event and how many people are really gravitating towards the technology, which shows how much work we still have to do going forward. So, a bit scared. >>A bit scared. Mike Olson is a great CEO. On stage, a great guy. We love Mike. He's geeky and he's pragmatic, a strategist, and you've got Kirk, who's the operator. But he showed a slide at his keynote that showed the evolution of Hadoop, the core Hadoop, and then he showed it year by year, and now we've got those columns extending and new components coming out. Take us through that progression. Just go back a few years and walk us through why this is going on so fast, what the community is doing, and what happened in 2008.
>>Yeah. When we started, I mean, in 2008, nobody was believing us back then that, hey, this thing is going to be big. We had the belief because we saw it happen firsthand, but many folks were dismissive: "No, no, no, this big data thing is a fad and nobody will care about it." And lo and behold, today it's obviously proving not to be the case. In terms of the maturity of the platform, you're absolutely right. I mean, the slide that Mike showed shows that only thirty percent of the contributions happening today are in the Hadoop core layer. And the overall vision there is very similar to the operating system, right? Except what this really is, is a data operating system: it's how to operate large amounts of data in a big data center. So it's like an operating system for many machines, as opposed to Linux, which is an operating system for a single machine. So Hadoop, when it came out, was only the kernel, only that inner layer. If you look at any operating system, like Windows or Linux and so on, the core functionality is two things: storing files and running applications on top of these files. That's what Windows does, that's what Linux does, and that's what Hadoop does at the heart. But then to really get an operating system to work, you need many ancillary components around it that make it functional. You need libraries, you need applications, you need integration, I/O devices, et cetera. And that's really what's happening in the Hadoop world. It started with the core OS layer, which is Hadoop: HDFS for storage, MapReduce for computation. But now all of these other things are showing up around that core kernel to really make it a fully functional, extensible data operating system. >>I wish I had a little replay button, but let's just put a pause on that, because this is kind of an important point. For folks out there, there are a lot of different metaphors used in this business, so it's
the Linux, "I want to be just like Red Hat," right? We kind of use that term for the business model. Talk a little bit about that. You just mentioned it's not like Linux. Unpack that a little bit deeper for us. What's the difference? You mentioned Linux; can you replay what you just said? >>So I was actually talking about the similarity, and then I can talk about the difference. The similarity is that the heart of Hadoop is a system for storing files, which is HDFS, and a system for running applications on top of these files, which is MapReduce. The heart of Linux is the same thing: a system for storing files, which is ext4, and a system for scheduling applications on top of these files. That's the same heart as Windows and so on. So that's the similarity. The difference is that Linux is made to run on a single node, and Windows is made to run on a single node. Hadoop is really made to run on many, many nodes. So Hadoop cares about taking a rack of servers or a data center of servers and having them look like one big massive mainframe, built out of commodity hardware, that can store arbitrary amounts of data and run any type of... >>Hence the new components, like the Hives of the world. >>So now these new components are coming up. Hive, for example, makes it easier to write queries for Hadoop. It's a SQL language for writing queries on top of Hadoop, so you don't have to go and write them in MapReduce, which we call the assembly language of Hadoop. If you write in MapReduce you will get the most flexibility and the most performance, but only if you know what you're doing. It's very similar to machine code: if you write assembly you will be able to do anything, but you can also shoot yourself in the foot. Same thing with MapReduce, right? When you use Hive, Hive abstracts that out for you. So you write SQL, and then Hive
takes care of doing all of the plumbing work to compile that down to MapReduce for you. So that's Hive. HBase, for example, is a very nice system that augments Hadoop: it makes it low latency, and it makes it support update, insert, and delete transactions, which HDFS does not support out of the box. >>So more like a database? >>It's more like MySQL, yeah. The analogy of MySQL to Linux is very similar to HBase to HDFS. >>And what's your take, with your founder's hat on now, on the business model similarities and differences with Red Hat? >>Yes, so actually they are different. I mean, the similarity stops at open source. We are both open source, right, in the sense that the core system is open source. It's available out there, you can look at the source code, and so on. The difference is that Red Hat actually has a license on their bits. So there's the source code and then there's the bits. When Red Hat compiles the source code into bits, you cannot deploy these bits without having a Red Hat license. With us it's very different. We have the source code, which is all in Apache. We compile the source code into a bunch of bits, which is our distribution, called CDH. These bits are one hundred percent open source, one hundred percent free. You can deploy them and use them, and you don't have to pay us anything. The only reason why you would come back and pay us is for Cloudera Enterprise, which is really for when you go operational. When you become operational and mission-critical, Cloudera Enterprise gives you two things. First, it gives you a proprietary management suite that we built, and it's very unique to us. Nobody in the market has anything close to what we have right now. It makes it easier for you to deploy, configure, monitor, provision, and do capacity planning, security management, et cetera, for Hadoop. >>So that management suite is unique to Cloudera and not part of Apache open source? >>Yes, it's not part of
the open source bits. You only get that as a subscriber to Cloudera. We do have a free version of that that's available for download, and it can run up to 50 nodes, just for you to get up and running quickly. And it's really very simple. It has a very simple installer: you should be able to fire off that software and say, "install Hadoop on these servers," and it will take care of everything else for you. It's like having these installers, you know, when Windows came out in the beginning and it had this nice progress bar and you could install applications very easily. Imagine that now for a cluster of servers, right? That's really what this is. The other reason why people subscribe to Cloudera Enterprise, in addition to getting this management suite, is getting our support services. And support is necessary for any software, even if it's free, even for hardware. Think of it as if I give you a free airplane right now. Here you go, here is an airplane. You can run this airplane and make money from passengers, but you still need somebody to maintain the plane for you, right? You can still go hire your own mechanics to maintain that airplane. But we tell you, if you subscribe with us as the mechanics for your airplane, the support you will get with us will be way better than anything else, and the economics of it will also be way better than having your own staff doing the maintenance for that airplane. >>Okay, final question, and we've got one minute, because we slid you in real quick. For folks watching, Amr is going to come back at 2:30, so come back, that's Eastern time, and we'll have a more in-depth conversation. But just share with the folks watching your view of what's going on in Apache. You know, there's all this kind of weird FUD being thrown around, that Cloudera is not this and that, and you guys are clearly the leader. We talked with Kirk about that; we don't need to
go into that. But just share with us what's going on. What's the real deal happening with Apache, the code, and your unique offering? >>I mean, the real deal, and I advise people to go look at the blog post that our CEO, Mike Olson, wrote, called "The Community Effect." The real deal is that there is a very big, healthy community developing the source code for Hadoop: the core system, which is HDFS and MapReduce, and all the components around that core system. We at Cloudera employ a very large engineering organization. Our engineering organization alone is bigger than many of these other companies in the space, and if you look at the whole company, it is much, much bigger than any of these other players. So we do a lot of contributions to the core system and to the projects around it. However, we are part of the community, and we're definitely doing this with the community. It's not just a Cloudera thing for the core platform. So that's the real deal. >>All right. So here we are with Amr, the co-founder. Congratulations, great funding, and the hundred mil from Accel Partners, who invested in you guys. Congratulations. You're part of the community, we all know that, just kind of clarifying that for the record. And you have a unique differentiator, the management suite and the enterprise stuff, and the experience. >>The experience, yeah. I think a huge differentiation we have is that we have been doing this for three years, ahead of everybody else. We have the experience across all the industries that matter. So when you come to us, we know how to do this in the finance industry, in the retail industry, in the health industry, and in government. So that's something also. >>So, just for the audience out there, Amr is coming back at 2:30 to go deeper. He's the highly decorated general, in uniform with the Cloudera logo. >>Yes, sir. >>Get some of
those for us someday. Great. So we'll see you again. Amr, our great, great friend.
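
Editor's note: the "MapReduce is the assembly language of Hadoop, Hive lets you write SQL" comparison in this interview can be made concrete with a small sketch. The Python below is not Hadoop code and does not use any Hadoop or Hive API; it is a single-process illustration, under that stated assumption, of the three steps a MapReduce job spells out by hand (map, shuffle, reduce) for a word count. The function names, the `words` table, and the `word` column are all hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as a framework would do."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three hand-written phases into one job."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


# The same job expressed declaratively, roughly what a Hive user would write.
# The table and column names are hypothetical, for illustration only.
HIVE_STYLE_QUERY = "SELECT word, COUNT(*) FROM words GROUP BY word"

if __name__ == "__main__":
    print(word_count(["store files run applications", "store more files"]))
```

The hand-written version exposes every step, which is where the flexibility (and the room for mistakes) comes from; the one-line query leaves the map, shuffle, and reduce plumbing to the engine, which is the trade-off described in the interview.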

Published Date : May 1 2012


ENTITIES

Entity	Category	Confidence
Rebecca	PERSON	0.99+
Mike	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
2008	DATE	0.99+
Excel	TITLE	0.99+
Hadoop	TITLE	0.99+
three years	QUANTITY	0.99+
linux	TITLE	0.99+
one-minute	QUANTITY	0.99+
windows	TITLE	0.99+
Michaels	PERSON	0.99+
Jeff	PERSON	0.99+
john furrier	PERSON	0.99+
2011	DATE	0.99+
Linux	TITLE	0.99+
Kirk	PERSON	0.99+
today	DATE	0.99+
thirty percent	QUANTITY	0.99+
Yahoo	ORGANIZATION	0.99+
hbase	TITLE	0.98+
single note	QUANTITY	0.98+
two things	QUANTITY	0.97+
single note	QUANTITY	0.97+
two bits	QUANTITY	0.97+
dave vellante	PERSON	0.97+
HDFS	TITLE	0.97+
10	QUANTITY	0.97+
first	QUANTITY	0.97+
Jerry	PERSON	0.97+
facebook	ORGANIZATION	0.97+
hundred L	QUANTITY	0.96+
both	QUANTITY	0.96+
million dollars	QUANTITY	0.96+
one hundred percent	QUANTITY	0.95+
Red Hat	TITLE	0.95+
August	DATE	0.95+
MapReduce	TITLE	0.95+
Amr Awadallah	PERSON	0.95+
tomorrow	DATE	0.94+
hundred million	QUANTITY	0.94+
Dr.	PERSON	0.94+
hundred million dollar	QUANTITY	0.94+
up to 15 hours	QUANTITY	0.93+
hadoop	TITLE	0.93+
Windows	TITLE	0.93+
single machine	QUANTITY	0.92+
HBase	TITLE	0.92+
new york city	LOCATION	0.9+
years	QUANTITY	0.9+
a year	QUANTITY	0.9+
Apache	ORGANIZATION	0.9+
one	QUANTITY	0.89+
a lot of people	QUANTITY	0.87+
red hat	TITLE	0.85+
Hadoop World	TITLE	0.84+
SiliconANGLE	ORGANIZATION	0.82+
two-thirty	DATE	0.8+
Fudd	PERSON	0.77+
Michaelson road	PERSON	0.74+

Dr. Amr Awadallah - Interview 2 - Hadoop World 2011 - theCUBE

Amr Awadallah is here, the co-founder, back to back. This is theCUBE, SiliconANGLE.com and SiliconANGLE.tv's production of theCUBE, our flagship telecast. We go out to the events. That was a great conversation. It was really just cool. We could have probably hit on a few more things. Obviously well read, awesome, a co-founder of Cloudera. You did a good job teaming up with that co-founder, huh? He's not bad on theCUBE, is he? >>He reads the internet. >>That's what I'm saying. >>Anything that's going on in technology, Jeff knows it. >>He's a CUBE star, you know. I tell you, I'm smarter just by being at Cloudera all those years. And I actually was following what he was saying, sadly, and it didn't bust my brain. So, okay, you're back. We were talking earlier with Mike Olson about the relational database thing, so I'll kind of pick that up where we left off with you. You know, he was really excited. It's like, hey, we saw that relational database movement happen. He was part of that generation. And things are happening in a similar way, still early. So I was trying to really peg with him how early we are on the curve. You know, this is 1,400 people; it's not the Javits Center yet. Maybe Hadoop World next year might be at the Javits Center, 35,000. Just don't go to Vegas. So I'm trying to figure out where we are on that curve. Are we on the upward slope, or down here, not even hitting that? >>I think we're moving up quicker than previous waves. Actually, if you look, for example, at Oracle, I think it took them 15, 20 years until they really became a mature company. VMware, which started about, what, 12, 13 years ago, took maybe eight years to be a big, mature company, and I'm hoping we're going to do it in five.
So a couple more years. >>Highly accelerated. >>Yes. But yeah, I've been surprised by the growth. I have been, right? I'd been warned about enterprise software, and that it takes long for production to take place. >>But the consumerization trend is really changing that. I mean, it seems to be that the enterprise is always last. Why the shorter cycle? >>I think the shorter cycle is coming from having the right solution for the right problem at the right time. I think that's a big part of it. So luck definitely is a big part of this. Now, in terms of why the adoption is changing compared to a couple of decades ago, I think that's coming just because of how quickly the underlying hardware is evolving. Right now, the fact that you can buy a single server with eight to 16 cores and 12 hard drives of two terabytes each is something that's pushing the limits of what you can do with the existing systems, and hence making it more likely for new systems to disrupt them. >>Yeah. We talk about that a lot. It's very easy for people to actually start a big data project. >>Yes. >>For example. And the hardest part is, okay, what problem do I really need to solve? How am I going to monetize it? Right? Those are the hard parts. It's not the underlying technology. >>Yes, that's true. I mean... >>You're saying "eh." >>Because I'm seeing both. I'm seeing cases where you're right: there are some companies that are like, "Oh, this Hadoop thing is so cool. What problem can I solve with it?" And I see other companies that are like, "I have this huge problem," and they don't know that Hadoop exists. And once they know, they just jump on it right away.
It's like, you know, when you have a headache and you're searching for the medicine, and it's aspirin. Wow, it works. >> I was talking to Jeff Hammerbacher before he came on stage, and I didn't even get to it 'cause we were on such a nice riff there, right? Like a bunch of musicians playing guitar together. But we talked about the IT dynamics, and he said something that I thought was right on the money, and SAP is saying the same thing: they're going to the lines of business. >> Yes. >> Because IT is the gatekeeper. It's like selling minicomputers to a mainframe team, or selling client-server to a minicomputer team. >> Yeah. We're seeing both as well. More likely the former, meaning that yes, lines of business and departments adopt the technology, and then IT comes in and sees there are already these five different departments having it, and they think, okay, now we need to formalize this across the organization. >> So what happens then? What are you seeing out there? When that happens, does that mean people get their hands on it, like, hey, we've got a problem to solve? Is that what it comes down to? Well, Hadoop exists, go get Hadoop. They plop it in there, and what does it do? >> So they pop it into their own installation or on the cloud, and they show that this actually is working and solving the problem for them. And when that happens, it's a very easy adoption from there on, because they just go tell IT, we need this right now because it's solving this problem and it's gonna make us much more money. >> Moving it right in. Yes. No problems. >> Is that another reason why the cycle's compressed? I mean, you know, you think of client-server, there was a lot of resistance from IT, and now it's much more... same thing with mobile. I mean, mobile has flipped it, right? So okay, bring it in, we've gotta deal with it. I would think the same thing.
We have a data problem. Let's turn it into an opportunity. >> Yeah. And it goes back to what I said earlier: the right solution for the right problem at the right time. When you have large amounts of unstructured data, there isn't anything else out there that can even touch what Hadoop can do. >> So Amr, I need to just change gears here a minute. The gaming stuff. So we're featured on justin.tv right now on the front page. >> Oh wow. >> But the numbers aren't coming in because there's a competing stream of the recently released Modern Warfare 3. >> Yes. Yes. >> So we have to compete with Modern Warfare 3. So can we talk about Modern Warfare 3 for a minute, and share with the folks what you think of the current version, if you've played it? >> So unfortunately I'm waiting to get back home. I don't have my Xbox with me here. >> I'm talking about... >> My lines of business. >> Boom. Modern Warfare's like a Christmas tree here. Sorry. >> You know, I'm a big gamer. I'm a big video gamer. At Cloudera, every Thursday at 5:30 in the office, we play Call of Duty 4, which is Modern Warfare 1, actually. And I challenge people out there to come challenge our team. Just ping me on Twitter and we'll do a Cloudera versus... >> Let's reframe that. Let me put it out there. Amr Awadallah's company. These are the geeks that invent the future. Jeff Hammerbacher, at Facebook, now at Cloudera. Amr, leading the charge. These guys are gamers. So all the young gamers out there, Amr is saying he's gonna challenge you. At which version? >> Modern Warfare 1. >> Modern Warfare 1. Yes. How do they fire in? Can you set up an external... >> We'll figure it out. We'll figure it out. >> Okay. >> Yeah. Just ping me on Twitter and we'll... >> We can carry it live, actually. We can stream that. >> Yeah, that'd be great. >> Great. >> Yeah.
So I'll tell you, some of our best Hadoop committers and Hadoop developers... >> Paint a picture. Modern Warfare 3. >> Going now to Modern Warfare 3: very excited about the game. I saw the trailers for it. The graphics look just amazing. >> Graphics are amazing. >> I've loved the series since the first one that came out. And I'm looking forward to getting back home to play the game. >> I can't play. My son won't let me play. I'm such a fumbler. I'm a keyboard guy; I can't work the Xbox controller. I have a coordination problem at my age and I'm just a klutz. It's like, "Dad, sorry, charity's over. Can I play with my friends?" You're off the box. But I'm a big gamer. >> But something I wanted to bring up is how to link up gaming with big data and analysis and so on. So I'm a big gamer. I love playing games, but at the same time, whenever I play games, I feel a little bit guilty because it's kind of like wasted time. I mean, yeah, it's fun and I'm getting lots of enjoyment out of it, and it makes my life much more cheerful. But still, how can we harness all of these hours that gamers spend playing a game like Modern Warfare 3? How can we collect and instrument all of the data that's coming from that and come up, for example, with something useful, with predictions? >> This is exactly the kind of application that's mainstream: gaming. >> Yeah. Yeah. >> Danny at Riot Games was telling me, we saw him at Oracle OpenWorld, he was up there for JavaOne. He said that they don't really have a big data platform, and their business is about understanding user behavior, with tons of data about user playing time, who they're playing with. >> Yeah, yeah. >> How they want to get into currency trading, you know. >> I can't mention the names, but some of the biggest gaming companies out there are using Hadoop right now.
And depending on CDH for doing exactly that kind of thing. >> Creating a good user experience. >> Today, they're doing it for the purpose of enhancing the user experience and improving retention. So they do track everything. Like every single bullet you fire, and in baseball, every hit you get, every home run you do. And in a Modern Warfare 3 type of game... >> Consecutive headshots you get. >> Everything, everything is being tracked. Yeah, every headshot you get and so on. But as you said, they are using that information today to sell more products and retain their users. Now what I'm suggesting is, how can you harness that energy for good as well? I mean, making money is good and everything, but how can you harness that for doing something useful, so that all of this entertainment time is also actually productive time as well? I think that'd be a holy grail in this environment if we can achieve that. >> Yeah. It used to be that corn used to be the telegraph of the future of applications, but gaming really is. If you look at gaming, you know, you get the headset on. It's a collaborative environment. >> Oh yeah. >> You've got unified communications. >> Yeah. And you see our teenage kids, how many hours they spend on these things. >> You've got play environments, very social, collaborative. You know, what I'm saying is that that's the future work environment, with Skype evolving. Our multiplayer game is called our job, right? So I'm big on gaming. So all the gamers out there, Amr has challenged you. Got a big data example. What else are we seeing? So let's talk about the software. One of the things you were talking about that I really liked, you were going down the list. On Mike's slide he had all the new features around the core. Can you just go down the core and rattle off your version of what it means and what it is?
So you start off with, say, HBase; we talked about that already. What are the other ones that are out there? >> So the projects that we have right there? >> The projects that are around, those tools that are being built. >> Yeah, so the foundational ones, as we mentioned before, are HDFS for storage and MapReduce for processing. And then the immediate layer above that is how to make MapReduce easier for the masses. Not everybody knows how to write MapReduce in Java; everybody knows SQL, right? So one of the most successful projects right now, the one that has the highest attach rate, meaning that when people install Hadoop they usually install it as well, is Hive. So Hive takes SQL. Jeff Hammerbacher, my co-founder, when he was at Facebook, his team built the Hive system. Essentially Hive takes SQL, so you don't have to learn a new language, you already know SQL, and then converts that into MapReduce for you. That not only expands the developer base for how many people can use Hadoop, but also makes it easier to integrate Hadoop, through ODBC and JDBC, with BI tools like MicroStrategy and Tableau and Informatica, et cetera, et cetera. >> You mentioned R too. You mentioned the R program. >> R as well. Yeah, R is one of our best partnerships. We're very, very happy with them. So that's one of the very key projects, Hive. A sister project to Hive is called Pig. Pig Latin is a language that Yahoo invented. You have to learn the language, but it's very easy to learn compared to MapReduce. And once you learn it, you can specify very deep data pipelines, right? SQL is good for queries. It's not good for data pipelines because it becomes very convoluted. It becomes very hard for the human brain to understand it. So Pig is much more natural to the human. It's more like Perl, very similar to scripting kinds of languages.
So with Pig you can write very, very long data pipelines. Again, a very successful project, doing very, very well. Another key project is HBase, like you said. So HBase allows you to do low latency. You can do very, very quick lookups, and it also allows you to do transactions, so you can do updates and inserts and deletes. So one of the talks here at Hadoop World that we recommend people watch when the videos come out is the talk by Jonathan Gray from Facebook, where he talked about how they use HBase. >> Jonathan's coming on here on theCUBE later. >> Yeah. So drill him on that. They use HBase now for many, many things within Facebook. They have a big team now committed to building and improving HBase with us and with the community at large. And they're using it for their online messaging system. The live mail system in Facebook is powered by HBase right now. Again, at eBay, the Cassini project, they gave a keynote earlier today at the conference as well, is using HBase as well. So HBase is definitely one of the projects that's growing very, very quickly right now within the Hadoop ecosystem. Another key project that Jeff alluded to earlier when he was on here is Flume. So Flume is very instrumental, because you have this nice system, Hadoop, but Hadoop is useless unless you have data inside it. So how do you get the data inside Hadoop? >> So Flume essentially is this very nice framework for having agents all over your infrastructure, inside your web servers, inside your application servers, inside your mobile devices, your network equipment, that collect all of that data and then reliably materialize it inside Hadoop. So Flume does that. Another good project is Oozie. There are so many of them, I don't know how long you want me to keep going here. But Oozie is great. Oozie is a workflow processing system. So Oozie allows you to define a series of jobs, some of them in Pig, some of them in Hive, some of them in MapReduce.
You can define a series of them and then link them to each other and say, only start this job when these other two jobs finish, because I'm waiting for the input from them before I can kick off, and so on. >> So Oozie is a very nice framework that will do that. It will manage the whole graph of jobs for you and retry things when they fail, et cetera, et cetera. Another good project is Whirr, W-H-I-R-R. Whirr allows you to very easily start a Hadoop cluster on top of Amazon EC2, on top of Rackspace, virtualized environments. It's for kicking off Hadoop instances or HBase instances on any virtual infrastructure. >> Okay. VMware, vCloud? >> So it supports all of the major virtualized infrastructure systems out there, Eucalyptus as well, and so on. So that's Whirr, W-H-I-R-R. Avro is another key project. It's Doug Cutting's main project right now. Doug Cutting came on stage with you guys. So Avro is a project about how we encode, within our files, the schema of these files, right? >> Because when you open up a text file and you don't know what the columns mean and how to parse it, it becomes very hard to work with it. So Avro allows you to do that much more easily. It's also useful for doing RPC, what we call remote procedure calls, for having different services talk to each other. Avro is very useful for that as well. And the list keeps going on and on. >> Mahout? >> Yeah, Mahout. Thanks for reminding me of Mahout. We just added Mahout very recently, actually. >> What is that, Amr? I'm not familiar with it. >> So Mahout is a data mining library. Mahout takes some of the most popular data mining algorithms for doing clustering and regression and statistical modeling and implements them using the MapReduce model. >> They have machine learning in it too? >> Yes, yes. So that's the machine learning. So, yes.
Support vector machines and so on. >> What about Sqoop? >> So Sqoop... you know all of them. Thanks for feeding me all the names. >> The ones I don't understand. >> But there are so many of them, right? I can't even remember all of them. So Sqoop actually is a very interesting project. It's short for SQL to Hadoop, hence the name Sqoop, right? So "Sq" from SQL and "oop" from Hadoop, and it also means scoop as in scooping up stuff, like when you scoop up ice cream. And the idea for Sqoop is to make it easy to move data between relational systems, like Oracle, Teradata, Netezza, Vertica and so on, and Hadoop. So you can very simply say sqoop, the name of the table inside the relational system, the name of the file inside Hadoop, and the table will be copied over to the file. And vice versa: you can say sqoop, the name of the file in Hadoop, the name of the table over there, and it'll move it over there. So it's a connectivity tool between the relational world and the Hadoop world. >> Great, great tutorial. >> And all of these are Apache projects. They're all projects built... >> It's not part of your unique proprietary... >> Yes. >> But these are things that you've been contributing to. >> We're contributing to the whole ecosystem. Yes. >> And you understand it very well. >> Yes. >> And it contributes to your knowledge of the marketplace. >> Absolutely. We collaborate with the community on creating these projects. We employ committers and founders for many of these projects. Like Doug Cutting, the founder of Hadoop, he works at Cloudera. The founder of the Oozie project, he works at Cloudera. The founder of ZooKeeper works at Cloudera. So we have a number of them on staff. >> So we had Arun from Hortonworks. Yes. And it was really good because, I tell you, I walk away from that conversation and I gotta say for the folks out there, there really isn't a war going on in Apache. >> There isn't. In Apache, there isn't. I mean, there isn't, to be honest.
Like, in the developer community, we are friends, we're working together. We want to achieve the... >> There's no war. It's all Kumbaya. Everyone understands the rising tide floats all boats. All are playing nice in the same sandbox. Yes. It's just a competitive landscape with Hortonworks. >> In the business. >> Business. Competitive business, PR. >> And PR. We're trying to be friendly, as friendly as we can. >> Yeah, no, I mean, they're hyping it up. But he was cool. Like, hey, you know, we know each other. We all know each other, and we're just gonna offer it free and charge for support. And so are they. And that's okay. And they've got other things going on. But he brought up the question. He said they're launching a management console. So I said, Cloudera's got a significant lead. He kind of didn't really answer the question. So the question is, that's your core bread and butter, that's your... >> Yes and no. Yes and no. I mean, if you look at Cloudera Enterprise, and I mentioned this earlier when we talked in the morning, it has two main things in it. Cloudera Enterprise has the management suite, but it also has the support and maintenance that we provide to our customers and all the experience that we have in our team, as part of that subscription. >> Yes. For a subscription. >> And I wanna stress the point that the fact that I built a sports car doesn't mean that I'm good at running that sports car. The driver of the car usually is much better at driving the car than the guy who built the car, right? So yes, we have many people on staff that are helping build Hadoop, but we have many more people on staff that help run Hadoop at large scale, in the financial industry, retail industry, telecom industry, media industry, health industry, et cetera, et cetera. So that's very, very important for our customers: all that experience that we bring in on how to run the system technically. Yeah.
Within these verticals. >> But their strategy's clear: we're gonna create an open source project within Apache for a management console, and we sell support too. So there'll be a free alternative for management. >> So we have to see. But I mean, look at the product, I mean, our product... >> It's gotta come down to product differentiation. >> Our product has been in the market for two years, and they just started building their product. >> It's alpha. It's just alpha. >> The product is in alpha right now. Yeah. >> Okay. Well, the Apache project, it is Apache, right? >> Yeah. The Apache project is out. So we'll see how it compares to ours. But I think ours is way, way ahead of anything else out there. Yeah. I'd encourage people to try that for themselves and see. >> Essentially, John, when I asked Arun why does the world need Hortonworks, you know, eventually the answer we got was, well, it's free. It needs to be more open. Hadoop needs to be more open. >> No, there's... >> That's not really the reason why Hortonworks... >> No, they want to go make money. >> Exactly. >> He wasn't gonna say that. >> I kept pushing and pushing, and that's ultimately the closest we could get, 'cause you... >> Just listen. Not gonna... >> 12 open source projects. Yes. I mean, yeah, yeah. You can't get much more open. Yeah. >> Look at the management console. >> But you're not shooting on all those. I mean... >> No, no, we absolutely are. >> No, you are contributing. But that's not all your projects. There's other people involved. >> Yeah, we didn't start all of these projects. >> Yeah, that's true. You're contributing heavily to all of them. >> Yes, we are. >> And that's clear. Todd Lipcon said that, you know, he contributed his first patch to HBase in 2008. >> Yes. >> So I mean, you go back through the ranks of your people... >> And Todd now is a committer on HBase, is a committer on Hadoop itself.
So on a number... >> You're clearly the lead, and, you know... >> But there is a concern. We've heard it and I wanna just ask you. So there's a concern that if I build processes around a proprietary management console, I'm gonna end up being locked into that proprietary management console, CA all over again. Now this is so far from CA. >> Yes. Right. >> But that's a concern that some people have expressed, and I think it's one of the reasons why Hortonworks is getting so much attention. So... >> Yes. >> Talk about that. >> It's a very good observation to make, actually. There are two separate things here. There's the platform where all the data sits, and then there's this management console beside the platform. Now why did we make the management console? Why did Cloudera make the management console? Because it makes our job of supporting the customers much more achievable. When a customer calls in and says, we have a problem, help us fix this problem, when they go to our management console there is a button they click that gives us a dump of the state of the cluster. And that's what allows us to very quickly debug what's going on, and within minutes tell them you need to do this and you need to do that. Without that we just can't offer the support services. >> There's real value there. >> Yes. But you have to keep in mind that the underlying platform is completely open source and free. CDH is a hundred percent open source, a hundred percent free, a hundred percent Apache. So a year from now, when it comes time to renew with us, if the customer is not happy with our management suite, is not happy with our support, they can go to Hortonworks. >> Hortonworks. People are afraid... >> Or they can go to IBM. >> The data, you can take the data... >> You don't even need to take the data. You're not gonna move the data. It's the same system, the same software.
Everything in CDH is Apache, right? We're not putting anything in CDH which is not Apache. So a year from now, if you're not happy with our service to you and the value that we're providing, you can switch. There is no lock-in. There is no lock-in. >> And your argument would be the switching costs... >> The only lock-in is happiness. >> Happiness, satisfaction, customer delight. And by the way, we just wrote a piece about those wars and we said the risk of lock-in is low. We made that statement. We've got some heat for it. >> Yes. >> This is sort of at scale, though. What the people who are throwing the tomatoes are saying is that, again, in theory, at scale, the customers are so comfortable with the console that they don't switch. Now my argument was... >> Yes, but that means they're happy with it. That means they're satisfied and happy with it. And it's more economical for them than going and hiring people full-time on staff. >> Yeah. So you're always in check, as long as the customer doesn't feel like it's Oracle. >> Yeah. See, that's different. Oracle is very... >> Oracle is different, right? Here it's like Cisco routers: they get nested into the environment, provide value. That's just good competitive product strategy. If they're happy... >> Yeah. It's called open washing with Oracle. >> I mean, our number one core attribute at the company, the number one value for us, is customer satisfaction. Keeping our customers happy with the service that we provide. >> So differentiate on the product. Keep the commanding lead. That's the strategy. That's what's happening. That's your goal. >> Yes. That's what's happening. Absolutely. >> Okay. Co-founder of Cloudera, always a pleasure to have you on theCUBE. We really appreciate all the hospitality over the year and a half.
And I wanna personally thank you for letting us sit in your office, and we'll miss you. >> And we'll miss you too. >> We'll see you at theCUBE events. Swing by. Thanks for coming on theCUBE, great to see you, and congratulations on all your success. >> Thank you. >> And thanks for the review on Modern Warfare 3. >> Yeah, yeah. Have me on again if there's any gaming stuff, you know.

Published Date : May 1 2012



SiliconANGLE News | Beyond the Buzz: A deep dive into the impact of AI

(upbeat music) >> Hello, everyone, welcome to theCUBE. I'm John Furrier, the host of theCUBE in Palo Alto, California. Also it's SiliconANGLE News. Got two great guests here to talk about AI, the impact of the future of the internet, the applications, the people. Amr Awadallah, the founder and CEO, and Ed Albanese, the COO of Vectara, a new startup that emerged out of the original Cloudera, I would say, 'cause Amr's known, famous for the Cloudera founding, which was really the beginning of the big data movement. And now as AI goes mainstream, there's so much to talk about, so much to go on. And plus the new company is one of the, now what I call the wave, this next big wave, I call it the fifth wave in the industry. You know, you had PCs, you had the internet, you had mobile. This generative AI thing is real. And you're starting to see startups come out in droves. Amr obviously was founder of Cloudera, Big Data, and now Vectara. And Ed Albanese, you guys have a new company. Welcome to the show. >> Thank you. It's great to be here. >> So great to see you. Now the story is theCUBE started in the Cloudera office. Thanks to you, and your friendly entrepreneurship views that you have. We got to know each other over the years. But Cloudera had Hadoop, which was the beginning of what I call the big data wave, which then became what we now call data lakes, data oceans, and data infrastructure that's developed from that. It's almost interesting to look back 12 plus years, and see that what AI is doing now, right now, is opening up the eyes to the mainstream, and the application's almost mind blowing. You know, Satya Nadella called it the Mosaic Moment, didn't say Netscape, he built Netscape (laughing) but called it the Mosaic Moment. You're seeing companies and startups, kind of the alpha geeks running here, because this is the new frontier, and there's real meat on the bone, in terms of like things to do. Why? Why is this happening now?
What is the confluence of the forces happening, that are making this happen? >> Yeah, I mean if you go back to the Cloudera days, with big data, and so on, that was more about data processing. Like how can we process data, so we can extract numbers from it, and do reporting, and maybe take some actions, like this is a fraud transaction, or this is not. And in the meanwhile, many of the researchers working in the neural network, and deep neural network space, were trying to focus on data understanding, like how can I understand the data, and learn from it, so I can take actual actions, based on the data directly, just like a human does. And we were only good at doing that at the level of somebody who was five years old, or seven years old, all the way until about 2013. And starting in 2013, which is only 10 years ago, a number of key innovations started taking place, and each one added on. There was no single major innovation that just took place. It was a couple of really incremental ones, but they added on top of each other, in a very exponentially additive way, that led to, by the end of 2019, we now have models, deep neural network models, that can read and understand human text just like we do. Right? And they can reason about it, and argue with you, and explain it to you. And I think that's what is unlocking this whole new wave of innovation that we're seeing right now. So data understanding would be the essence of it. >> So it's not a Big Bang kind of theory, it's been evolving over time, and I think that the tipping point has been the advancements in other things. I mean look at cloud computing, and look how fast it just crept up on AWS. I mean AWS, you go back three, five years ago, I was talking to Swami yesterday, and their big news about AI, expanding the Hugging Face's relationship with AWS. And just three, five years ago, there wasn't a model training models out there. 
But as compute comes out, and you got more horsepower, these large language models, these foundational models, they're flexible, they're not monolithic silos, they're interacting. There's a whole new, almost fusion of data happening. Do you see that? I mean is that part of this? >> Of course, of course. I mean this wave is building on all the previous waves. We wouldn't be at this point if we did not have hardware that can scale, in a very efficient way. We wouldn't be at this point, if we didn't have data that we're collecting about everything we do, that we're able to process in this way. So this, this movement, this motion, this phase we're in, absolutely builds on the shoulders of all the previous phases. For some of the observers from the outside, when they see chatGPT for the first time, for them it was like, "Oh my god, this just happened overnight." Like it didn't happen overnight. (laughing) GPT itself, like GPT3, which is what chatGPT is based on, was released a year ahead of chatGPT, and many of us were seeing the power it can provide, and what it can do. I don't know if Ed agrees with that. >> Yeah, Ed? >> I do. Although I would acknowledge that the possibilities now, because of what we've hit from a maturity standpoint, have just opened up in an incredible way, that just wasn't tenable even three years ago. And that's what makes it, it's true that it developed incrementally, in the same way that, you know, the possibilities of a mobile handheld device, you know, in 2006 were there, but when the iPhone came out, the possibilities just exploded. And that's the moment we're in. >> Well, I've had many conversations over the past couple months around this area with chatGPT. 
John Markoff told me the other day, that he calls it, "The five dollar toy," because it's not that big of a deal, in context to what AI's doing behind the scenes, and all the work that's done on ethics, that's happened over the years, but it has woken up the mainstream, so everyone immediately jumps to ethics. "Does it work? It's not factual." And everyone who's inside the industry is like, "This is amazing." 'Cause you have two schools of thought there. One's like, people that think, "This is now the beginning of next gen, this is now we're here, this ain't your grandfather's chatbot, okay?" With NLP, it's got reasoning, it's got other things. >> I'm in that camp for sure. >> Yeah. Well I mean, everyone who knows what's going on is in that camp. And as the naysayers start to get through this, and they go, "Wow, it's not just plagiarizing homework, it's helping me be better. Like it could rewrite my memo, bring the lead to the top." So the format of the user interface is interesting, but it's still a data-driven app. >> Absolutely. >> So where does it go from here? 'Cause I'm not even calling this the first inning. This is like pregame, in my opinion. Where do you guys see this going, in terms of scratching the surface to what happens next? >> I mean, I'll start with, I just don't see how an application is going to look the same in the next three years. Who's going to want to input data manually, in a form field? Who is going to want, or expect, to have to put in some text in a search box, and then read through 15 different possibilities, and try to figure out which one of them actually most closely resembles the question they asked? You know, I don't see that happening. Who's going to start with an absolute blank sheet of paper, and expect no help? 
That is not how an application will work in the next three years, and it's going to fundamentally change how people interact and spend time with opening any element on their mobile phone, or on their computer, to get something done. >> Yes. I agree with that. Like every single application, over the next five years, will be rewritten, to fit within this model. So imagine an HR application, I don't want to name companies, but imagine an HR application, and you go into the application and you're clicking on buttons, because you want to take two weeks of vacation, and menus, and clicking here and there, reasons and managers, versus just telling the system, "I'm taking two weeks of vacation, going to Las Vegas," book it, done. >> Yeah. >> And the system just does it for you. If you weren't complete in your input, in your description, for what you want, then the system asks you back, "Did you mean this? Did you mean that? Were you trying to also do this as well?" >> Yeah. >> "What was the reason?" And that will fit it for you, and just do it for you. So I think the user interface that we have with apps, is going to change to be very similar to the user interface that we have with each other. And that's why all these apps will need to evolve. >> I know we don't have a lot of time, 'cause you guys are very busy, but I want to definitely have multiple segments with you guys, on this topic, because there's so much to talk about. There's a lot of parallels going on here. I was talking again with Swami who runs all the AI database at AWS, and I asked him, I go, "This feels a lot like the original AWS. You don't have to provision a data center." A lot of this heavy lifting on the back end, is these large language models, with these foundational models. So the bottleneck in the past, was the energy, and cost to actually do it. Now you're seeing it being stood up faster. So there's definitely going to be a tsunami of apps. I would see that clearly. What is it? We don't know yet. 
But also people who are going to leverage the fact that I can get started building value. So I see a startup boom coming, and I see an application tsunami of refactoring things. >> Yes. >> So the replatforming is already kind of happening. >> Yes. >> OpenAI, chatGPT, whatever. So that's going to be a developer environment. I mean if Amazon turns this into an API, or a Microsoft, what you guys are doing. >> We're turning it into API as well. That's part of what we're doing as well, yes. >> This is why this is exciting. Amr, you've lived the big data dream, and we used to talk, if you didn't have a big data problem, if you weren't full of data, you weren't really getting it. Now people have all the data, and they got to stand this up. >> Yeah. >> So the analogy is again, the mobile, I like the mobile movement, and using mobile as an analogy, most companies were not building for a mobile environment, right? They were just building for the web, and legacy way of doing apps. And as soon as the user expectations shifted, that my expectation now, I need to be able to do my job on this small screen, on the mobile device with a touchscreen. Everybody had to invest in re-architecting, and re-implementing every single app, to fit within that model, and that model of interaction. And we are seeing the exact same thing happen now. And one of the core things we're focused on at Vectara, is how to simplify that for organizations, because a lot of them are overwhelmed by large language models, and ML. >> They don't have the staff. >> Yeah, yeah, yeah. They're understaffed, they don't have the skills. >> But they got developers, they've got DevOps, right? >> Yes. >> So they have the DevSecOps going on. >> Exactly, yes. >> So our goal is to simplify it enough for them that they can start leveraging this technology effectively, within their applications. >> Ed, you're the COO of the company, obviously a startup. You guys are growing. You got great backing, and a good team. 
You've also done a lot of business development, and technical business development in this area. If you look at the landscape right now, and I agree the apps are coming, every company I talk to, that has that chatGPT, you know, epiphany, "Oh my God, look how cool this is. Like magic." Like okay, it's code, settle down. >> Mm hmm. >> But everyone I talk to is using it in a very horizontal way. I talk to a very senior person, very tech alpha geek, very senior person in the industry, technically. They're using it for log data, they're using it for configuration of routers. And in other areas, they're using it for, every vertical has a use case. So this is horizontally scalable from a use case standpoint. When you hear horizontally scalable, the first thing that comes to my mind is cloud, right? >> Mm hmm. >> So cloud, and scalability that way. And the data is very specialized. So now you have this vertical specialization, horizontally scalable, everyone will be refactoring. What do you see, and what are you seeing from customers, that you talk to, and prospects? >> Yeah, I mean put yourself in the shoes of an application developer, who is actually trying to make their application a bit more like magic. And to have that soon-to-be, honestly, expected experience. They've got to think about things like performance, and how efficiently they can actually execute a query, or a question. They've got to think about cost. Generative isn't cheap, like the inference of it. And so you've got to be thoughtful about how and when you take advantage of it, you can't use it as a, you know, everything looks like a nail, and I've got a hammer, and I'm going to hit everything with it, because that will be wasteful. Developers also need to think about how they're going to take advantage of, but not lose their own data. So there has to be some controls around what they feed into the large language model, if anything. 
Like, should they fine tune a large language model with their own data? Can they keep it logically separated, but still take advantage of the powers of a large language model? And they've also got to take advantage, and be aware of the fact that when data is generated, that it is a different class of data. It might not fully be their own. >> Yeah. >> And it may not even be fully verified. And so when the logical cycle starts, of someone making a request, the relationship between that request, and the output, those things have to be stored safely, logically, and identified as such. >> Yeah. >> And taken advantage of in an ongoing fashion. So these are mega problems, each one of them independently, that, you know, you can think of it as middleware companies need to take advantage of, and think about, to help the next wave of application development be logical, sensible, and effective. It's not just calling some raw API on the cloud, like openAI, and then just, you know, you get your answer and you're done, because that is a very brute force approach. >> Well also I will point, first of all, I agree with your statement about the apps experience, that's going to be expected, form filling. Great point. The interesting thing about chatGPT. >> Sorry, it's not just form filling, it's any action you would like to take. >> Yeah. >> Instead of clicking, and dragging, and dropping, and doing it on a menu, or on a touch screen, you just say it, and it happens perfectly. >> Yeah. It's a different interface. And that's why I love that UI/UX experience, that's the people falling out of their chair moment with chatGPT, right? But a lot of the things with chatGPT, if you feed it right, it works great. If you feed it wrong and it goes off the rails, it goes off the rails big. >> Yes, yes. >> So the Bing catastrophes. >> Yeah. >> And that's an example of garbage in, garbage out, classic old school kind of comp-sci phrase that we all use. >> Yep. >> Yes. 
>> This is about data injection, right? It reminds me of the old SQL days, if you had to, if you could sling some SQL, you were a magician, you know, to get the right answer, it's pretty much there. So you got to feed the AI. >> You do. Some people call this, the early word to describe this is prompt engineering. You know, old school, you know, search, or, you know, engagement with data would be, I'm going to, I have a question or I have a query. New school is, I have, I have to issue it a prompt, because I'm trying to get, you know, an action or a reaction, from the system. And the act of engineering, there are a lot of different ways you could do it, all the way from, you know, raw, just I'm going to send you whatever I'm thinking. >> Yeah. >> And you get the unintended outcomes, to more constrained, where I'm going to just use my own data, and I'm going to constrain the initial inputs, the data I already know that's first party, and I trust, to, you know, hyper constrained, where the application is actually, it's looking for certain elements to respond to. >> It's interesting Amr, this is why I love this, because one, we are in the media, we're recording this video now, we'll stream it. But we got all your linguistics, we're talking. >> Yes. >> This is data. >> Yep. >> So the data quality becomes now the new intellectual property, because, if you have that prompt source data, it makes data or content, in our case, the original content, intellectual property. >> Absolutely. >> Because that's the value. And that's where you see chatGPT fall down, is because they're trying to crawl the web, and people think it's search. It's not necessarily search, it's giving you something that you wanted. It is a lot of that, I remember in Cloudera, you said, "Ask the right questions." Remember that phrase you guys had, that slogan? >> Mm hmm. And that's prompt engineering. 
So that's exactly, that's the reinvention of "Ask the right question," is prompt engineering is, if you don't give these models the question in the right way, and very few people know how to frame it in the right way with the right context, then you will get garbage out. Right? That is the garbage in, garbage out. But if you specify the question correctly, and you provide with it the metadata that constrain what that question is going to be acted upon or answered upon, then you'll get much better answers. And that's exactly what we solve at Vectara. >> Okay. So before we get into the last couple minutes we have left, I want to make sure we get a plug in for the opportunity, and the profile of Vectara, your new company. Can you guys both share with me what you think the current situation is? So for the folks who are now having those moments of, "Ah, AI's bullshit," or, "It's not real, it's a lot of stuff," from, "Oh my god, this is magic," to, "Okay, this is the future." >> Yes. >> What would you say to that person, if you're at a cocktail party, or in the elevator say, "Calm down, this is the first inning." How do you explain the dynamics going on right now, to someone who's either in the industry, but not in the ropes? How would you explain like, what this wave's about? How would you describe it, and how would you prepare them for how to change their life around this? >> Yeah, so I'll go first and then I'll let Ed go. Efficiency, efficiency is the description. So we figured out a way to be a lot more efficient, a way where you can write a lot more emails, create way more content, create way more presentations. Developers can develop 10 times faster than they normally would. And that is very similar to what happened during the Industrial Revolution. I always like to look at examples from the past, to read what will happen now, and what will happen in the future. So during the Industrial Revolution, it was about efficiency with our hands, right? 
So I had to make a piece of cloth, like this piece of cloth for this shirt I'm wearing. Our ancestors, they had to spend months taking the cotton, making it into threads, taking the threads, making them into pieces of cloth, and then cutting it. And now a machine makes it just like that, right? And the ancestors now turned from the people that do the thing, to the people that manage the machines that do the thing. And I think the same thing is going to happen now, is our efficiency will be multiplied extremely, as human beings, and we'll be able to do a lot more. And many of us will be able to do things they couldn't do before. So another great example I always like to use is the example of Google Maps, and GPS. Very few of us knew how to drive a car from one location to another, and read a map, and get there correctly. But once that efficiency of an AI, by the way, behind these things is very, very complex AI, that figures out how to do that for us. All of us now became amazing navigators that can go from any point to any point. So that's kind of how I look at the future. >> And that's a great real example of impact. Ed, your take on how you would talk to a friend, or colleague, or anyone who asks like, "How do I make sense of the current situation? Is it real? What's in it for me, and what do I do?" I mean every company's rethinking their business right now, around this. What would you say to them? >> You know, I usually like to show, rather than describe. And so, you know, the other day I just got access, I've been using an application for a long time, called Notion, and it's super popular. There's like 30 or 40 million users. And the new version of Notion came out, which has AI embedded within it. And it's AI that allows you primarily to create. 
So if you could break down the world of AI into find and create, for a minute, just kind of logically separate those two things, find is certainly going to be massively impacted in our experiences as consumers on, you know, Google and Bing, and I can't believe I just said the word Bing in the same sentence as Google, but that's what's happening now (all laughing), because it's a good example of change. >> Yes. >> But also inside the business. But on the create side, you know, Notion is a wiki product, where you try to, you know, note down things that you are thinking about, or you want to share and memorialize. But sometimes you do need help to get it down fast. And just in the first day of using this new product, like my experience has really fundamentally changed. And I think that anybody who would, you know, anybody say for example, that is using an existing app, I would show them, open up the app. Now imagine the possibility of getting a starting point right off the bat, in five seconds of, instead of having to whole cloth draft this thing, imagine getting a starting point that you can modify and edit, or just dispose of and retry again. And that's the potential for me. I can't imagine a scenario where, in a few years from now, I'm going to be satisfied if I don't have a little bit of help, in the same way that I don't manually spell check every email that I send. I automatically spell check it. I love when I'm getting type ahead support inside of Google, or anything. Doesn't mean I always take it, or when texting. >> That's efficiency too. I mean the cloud was about developers getting stuff up quick. >> Exactly. >> All that heavy lifting is there for you, so you don't have to do it. >> Right? >> And you get to the value faster. >> Exactly. I mean, if history taught us one thing, it's, you have to always embrace efficiency, and if you don't fast enough, you will fall behind. 
Again, looking at the industrial revolution, the companies that embraced the industrial revolution, they became the leaders in the world, and the ones who did not, they all like. >> Well the AI thing that we got to watch out for, is watching how it goes off the rails. If it doesn't have the right prompt engineering, or data architecture, infrastructure. >> Yes. >> It's a big part. So this comes back down to your startup, real quick, I know we got a couple minutes left. Talk about the company, the motivation, and we'll do a deeper dive on the company. But what's the motivation? What are you targeting for the market, business model? The tech, let's go. >> Actually, I would like Ed to go first. Go ahead. >> Sure, I mean, we're a developer-first, API-first platform. So the product is oriented around allowing developers who may not be superstars, in being able to either leverage, or choose, or select their own large language models for appropriate use cases. But they want to be able to instantly add the power of large language models into their application set. We started with search, because we think it's going to be one of the first places that people try to take advantage of large language models, to help find information within an application context. And we've built our own large language models, focused on making it very efficient, and elegant, to find information more quickly. So what a developer can do is, within minutes, go up, register for an account, and get access to a set of APIs, that allow them to send data, to be converted into a format that's easy to understand for large language models, vectors. And then secondarily, they can issue queries, ask questions. And they can ask them very, the questions that can be asked, are very natural language questions. 
So we're talking about long form sentences, you know, drill down types of questions, and they can get answers that either come back, depending upon the form factor of the user interface, in list form, or summarized form, where summarized equals the opportunity to kind of see a condensed, singular answer. >> All right. I have a. >> Oh okay, go ahead, you go. >> I was just going to say, I'm going to be a customer for you, because I want, my dream was to have a hologram of theCUBE hosts, me and Dave, and have questions be generated in the metaverse. So you know. (all laughing) >> There'll be no longer any guests here. They'll all be talking to you guys. >> Give a couple bullets, I'll spit out 10 good questions. Publish a story. This brings the automation, I'm sorry to interrupt you. >> No, no. No, no, I was just going to follow on, on the same. So another way to look at exactly what Ed described is, we want to offer you chatGPT for your own data, right? So imagine taking all of the recordings of all of the interviews you have done, and having all of the content of that being ingested by a system, where you can now have a conversation with your own data and say, "Oh, last time when I met Amr, which video games did we talk about? Which movie or book did we use as an analogy for how we should be embracing data science, and big data, which is Moneyball," I know you use Moneyball all the time. And you start having that conversation. So, now the data doesn't become a passive asset that you just have in your organization. No. It's an active participant that's sitting with you, at the table, helping you make decisions. >> One of my favorite things to do with customers, is to go to their site or application, and show them me using it. So for example, one of the customers I talked to was one of the biggest property management companies in the world, that lets people go and rent homes, and houses, and things like that. 
And you know, I went and I showed them me searching through reviews, looking for information, and trying different words, and trying to find out like, you know, is this place quiet? Is it comfortable? And then I put all the same data into our platform, and I showed them the world of difference you can have when you start asking that question wholeheartedly, and getting real information that doesn't have anything to do with the words you asked, but is really focused on the meaning. You know, when I asked like, "Is it quiet?" You know, answers would come back like, "The wind whispered through the trees peacefully," and you know, it's like nothing to do with quiet in the literal word sense, but in the meaning sense, everything to do with it. And that was magical even for them, to see that. >> Well you guys are the front end of this big wave. Congratulations on the startup, Amr. I know you guys got great pedigree in big data, and you've got a great team, and congratulations. Vectara is the name of the company, check 'em out. Again, the startup boom is coming. This will be one of the major waves, generative AI is here. I think we'll look back, and it will be pointed out as a major inflection point in the industry. >> Absolutely. >> There's not a lot of hype behind that. People are seeing it, experts are. So it's going to be fun, thanks for watching. >> Thanks John. (soft music)

Published Date : Feb 23 2023

ENTITIES

Entity	Category	Confidence
John Markoff	PERSON	0.99+
2013	DATE	0.99+
AWS	ORGANIZATION	0.99+
Ed Alban	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
30	QUANTITY	0.99+
10 times	QUANTITY	0.99+
2006	DATE	0.99+
John Furrier	PERSON	0.99+
two weeks	QUANTITY	0.99+
Microsoft	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Ed Albanese	PERSON	0.99+
John	PERSON	0.99+
five seconds	QUANTITY	0.99+
Las Vegas	LOCATION	0.99+
Ed	PERSON	0.99+
iPhone	COMMERCIAL_ITEM	0.99+
10 good questions	QUANTITY	0.99+
Swami	PERSON	0.99+
15 different possibilities	QUANTITY	0.99+
Palo Alto, California	LOCATION	0.99+
Vectara	ORGANIZATION	0.99+
Amr Awadallah	PERSON	0.99+
Google	ORGANIZATION	0.99+
Cloudera	ORGANIZATION	0.99+
first time	QUANTITY	0.99+
both	QUANTITY	0.99+
end of 2019	DATE	0.99+
yesterday	DATE	0.98+
Big Data	ORGANIZATION	0.98+
40 million users	QUANTITY	0.98+
two things	QUANTITY	0.98+
two great guests	QUANTITY	0.98+
12 plus years	QUANTITY	0.98+
one	QUANTITY	0.98+
five dollar	QUANTITY	0.98+
Netscape	ORGANIZATION	0.98+
five years ago	DATE	0.98+
SQL	TITLE	0.98+
first inning	QUANTITY	0.98+
Amr	PERSON	0.97+
two schools	QUANTITY	0.97+
first	QUANTITY	0.97+
10 years ago	DATE	0.97+
One	QUANTITY	0.96+
first day	QUANTITY	0.96+
three	DATE	0.96+
chatGPT	TITLE	0.96+
first places	QUANTITY	0.95+
Bing	ORGANIZATION	0.95+
Notion	TITLE	0.95+
first thing	QUANTITY	0.94+
theCUBE	ORGANIZATION	0.94+
Beyond the Buzz	TITLE	0.94+
Sati Natel	PERSON	0.94+
Industrial Revolution	EVENT	0.93+
one location	QUANTITY	0.93+
three years ago	DATE	0.93+
single application	QUANTITY	0.92+
one thing	QUANTITY	0.91+
first platform	QUANTITY	0.91+
five years old	QUANTITY	0.91+

Analyst Predictions 2023: The Future of Data Management

(upbeat music) >> Hello, this is Dave Vellante with theCUBE, and one of the most gratifying aspects of my role as a host of "theCUBE TV" is I get to cover a wide range of topics. And quite often, we're able to bring to our program a level of expertise that allows us to more deeply explore and unpack some of the topics that we cover throughout the year. And one of our favorite topics, of course, is data. Now, in 2021, after being in isolation for the better part of two years, a group of industry analysts met up at AWS re:Invent and started a collaboration to look at the trends in data and predict what some likely outcomes will be for the coming year. And it resulted in a very popular session that we had last year focused on the future of data management. And I'm very excited and pleased to tell you that the 2023 edition of that predictions episode is back, and with me are five outstanding market analysts, Sanjeev Mohan of SanjMo, Tony Baer of dbInsight, Carl Olofson from IDC, Dave Menninger from Ventana Research, and Doug Henschen, VP and Principal Analyst at Constellation Research. Now, what is it that we're calling you, guys? A data pack like the rat pack? No, no, no, no, that's not it. It's the data crowd, the data crowd, and the crowd includes some of the best minds in the data analyst community. They'll discuss how data management is evolving and what listeners should prepare for in 2023. Guys, welcome back. Great to see you. >> Good to be here. >> Thank you. >> Thanks, Dave. (Tony and Dave faintly speak) >> All right, before we get into 2023 predictions, we thought it'd be good to do a look back at how we did in 2022 and give a transparent assessment of those predictions. So, let's get right into it. We're going to bring these up here, the predictions from 2022, they're color-coded red, yellow, and green to signify the degree of accuracy. And I'm pleased to report there's no red. Well, maybe some of you will want to debate that grading system. 
But as always, we want to be open, so you can decide for yourselves. So, we're going to ask each analyst to review their 2022 prediction and explain their rating and what evidence they have that led them to their conclusion. So, Sanjeev, please kick it off. Your prediction was data governance becomes key. I know that's going to knock you guys over, but elaborate, because you had more detail when you double click on that. >> Yeah, absolutely. Thank you so much, Dave, for having us on the show today. And we self-graded ourselves. I could have very easily made my prediction from last year green, but I mentioned why I left it as yellow. I totally fully believe that data governance was in a renaissance in 2022. And why do I say that? You have to look no further than AWS launching its own data catalog called DataZone. Before that, mid-year, we saw Unity Catalog from Databricks went GA. So, overall, I saw there was tremendous movement. When you see these big players launching a new data catalog, you know that they want to be in this space. And this space is highly critical to everything that I feel we will talk about in today's call. Also, if you look at established players, I spoke at Collibra's conference, data.world, work closely with Alation, Informatica, a bunch of other companies, they all added tremendous new capabilities. So, it did become key. The reason I left it as yellow is because I had made a prediction that Collibra would go IPO, and it did not. And I don't think anyone is going IPO right now. The market is really, really down, the funding in VC IPO market. But other than that, data governance had a banner year in 2022. >> Yeah. Well, thank you for that. And of course, you saw data clean rooms being announced at AWS re:Invent, so more evidence. And I like the fact that you included in your predictions some things that were binary, so you dinged yourself there. So, good job. Okay, Tony Baer, you're up next. Data mesh hits reality check. 
As you see here, you've given yourself a bright green thumbs up. (Tony laughing) Okay. Let's hear why you feel that was the case. What do you mean by reality check? >> Okay. Thanks, Dave, for having us back again. This is something I just wrote and just tried to get away from, and this is just a topic that just won't go away. I did speak with a number of folks, early adopters and non-adopters during the year. And I did find that basically that it pretty much validated what I was expecting, which was that there was a lot more, this has now become a front burner issue. And if I had any doubt in my mind, the evidence I would point to is what was originally intended to be a throwaway post on LinkedIn, which I just quickly scribbled down the night before leaving for re:Invent. I was packing at the time, and for some reason, I was doing a Google search on data mesh. And I happened to have tripped across this ridiculous article, I will not say where, because it doesn't deserve any publicity, about the eight (Dave laughing) best data mesh software companies of 2022. (Tony laughing) One of my predictions was that you'd see data mesh washing. And I just quickly just hopped on that maybe three sentences and wrote it in about a couple minutes saying this is hogwash, essentially. (laughs) And that just reun... And then, I left for re:Invent. And the next night, when I got into my Vegas hotel room, I clicked on my computer. I saw 15,000 hits on that post, which was the most hits of any single post I put all year. And the responses were wildly pro and con. So, it pretty much validates my expectation in that data mesh really did hit a lot more scrutiny over this past year. >> Yeah, thank you for that. I remember that article. I remember rolling my eyes when I saw it, and then I recently, (Tony laughing) I talked to Walmart and they actually invoked Martin Fowler and they said that they're working through their data mesh. 
So, it really takes a lot of thought, and it really, as we've talked about, is really as much an organizational construct. You're not buying data mesh >> Bingo. >> to your point. Okay. Thank you, Tony. Carl Olofson, here we go. You've graded yourself a yellow on the prediction that graph databases take off. Please elaborate. >> Yeah, sure. So, I realized in looking at the prediction that it seemed to imply that graph databases could be a major factor in the data world in 2022, which obviously didn't become the case. It was an error on my part in that I should have said it in the right context. It's really a three to five-year time period that graph databases will really become significant, because they still need accepted methodologies that can be applied in a business context as well as proper tools in order for people to be able to use them seriously. But I stand by the idea that it is taking off, because for one thing, Neo4j, which is the leading independent graph database provider, had a very good year. And also, we're seeing interesting developments in terms of things like AWS with Neptune and with Oracle providing graph support in Oracle database this past year. Those things are, as I said, growing gradually. There are other companies like TigerGraph and so forth, that deserve watching as well. But as far as becoming mainstream, it's going to be a few years before we get all the elements together to make that happen. Like any new technology, you have to create an environment in which ordinary people without a whole ton of technical training can actually apply the technology to solve business problems. >> Yeah, thank you for that. These specialized databases, graph databases, time series databases, you see them embedded into mainstream data platforms, but there's a place for these specialized databases, I would suspect we're going to see new types of databases emerge with all this cloud sprawl that we have and maybe to the edge. 
>> Well, part of it is that it's not as specialized as you might think it. You can apply graphs to great many workloads and use cases. It's just that people have yet to fully explore and discover what those are. >> Yeah. >> And so, it's going to be a process. (laughs) >> All right, Dave Menninger, streaming data permeates the landscape. You gave yourself a yellow. Why? >> Well, I couldn't think of an appropriate combination of yellow and green. Maybe I should have used chartreuse, (Dave laughing) but I was probably a little hard on myself making it yellow. This is another type of specialized data processing, like Carl was talking about with graph databases: stream processing. And nearly every data platform offers streaming capabilities now. Often, it's based on Kafka. If you look at Confluent, their revenues have grown at more than 50%, continue to grow at more than 50% a year. They're expected to do more than half a billion dollars in revenue this year. But the thing that hasn't happened yet, and to be honest, I didn't necessarily expect it to happen in one year, is that streaming hasn't become the default way in which we deal with data. It's still a sidecar to data at rest. And I do expect that we'll continue to see streaming become more and more mainstream. I do expect perhaps in the five-year timeframe that we will first deal with data as streaming and then at rest, but the worlds are starting to merge. And we even see some vendors bringing products to market, such as K2View, Hazelcast, and RisingWave Labs. So, in addition to all those core data platform vendors adding these capabilities, there are new vendors approaching this market as well. >> I like the tough grading system, and it's not trivial. And when you talk to practitioners doing this stuff, there's still some complications in the data pipeline. And so, but I think, you're right, it probably was a yellow plus. Doug Henschen, data lakehouses will emerge as dominant. 
When you talk to people about lakehouses, practitioners, they all use that term. They certainly use the term data lake, but now, they're using lakehouse more and more. What are your thoughts here? Why the green? What's your evidence there? >> Well, I think, I was accurate. I spoke about it specifically as something that vendors would be pursuing. And we saw yet more lakehouse advocacy in 2022. Google introduced its BigLake service alongside BigQuery. Salesforce introduced Genie, which is really a lakehouse architecture. And it was a safe prediction to say vendors are going to be pursuing this in that AWS, Cloudera, Databricks, Microsoft, Oracle, SAP, Salesforce now, IBM, all advocate this idea of a single platform for all of your data. Now, the trend was also supported in that we saw a big embrace of Apache Iceberg in 2022. That's a structured table format. It's used with these lakehouse platforms. It's open, so it ensures portability and it also ensures performance. And that's a structured table that helps with the warehouse side performance. But among those announcements, Snowflake, Google, Cloudera, SAP, Salesforce, IBM, all embraced Iceberg. But keep in mind, again, I'm talking about this as something that vendors are pursuing as their approach. So, they're advocating end users. It's very cutting edge. I'd say the top, leading edge, 5% of companies have really embraced the lakehouse. I think, we're now seeing the fast followers, the next 20 to 25% of firms embracing this idea and embracing a lakehouse architecture. I recall Christian Kleinerman at the big Snowflake event last summer, making the announcement about Iceberg, and he asked for a show of hands for any of you in the audience at the keynote, have you heard of Iceberg? And just a smattering of hands went up. So, the vendors are ahead of the curve. They're pushing this trend, and we're now seeing a little bit more mainstream uptake. >> Good. Doug, I was there. 
It was you, me, and I think, two other hands were up. That was just humorous. (Doug laughing) All right, well, so I liked the fact that we had some yellow and some green. When you think about these things, there's the prediction itself. Did it come true or not? There are the sub predictions that you guys make, and of course, the degree of difficulty. So, thank you for that open assessment. All right, let's get into the 2023 predictions. Let's bring up the predictions. Sanjeev, you're going first. You've got a prediction around unified metadata. What's the prediction, please? >> So, my prediction is that metadata space is currently a mess. It needs to get unified. There are too many use cases of metadata, which are being addressed by disparate systems. For example, data quality has become really big in the last couple of years, data observability, the whole catalog space is actually, people don't like to use the word data catalog anymore, because data catalog sounds like it's a catalog, a museum, if you may, of metadata that you go and admire. So, what I'm saying is that in 2023, we will see that metadata will become the driving force behind things like data ops, things like orchestration of tasks using metadata, not rules. Not saying that if this fails, then do this, if this succeeds, go do that. But it's like getting to the metadata level, and then making a decision as to what to orchestrate, what to automate, how to do data quality check, data observability. So, this space is starting to gel, and I see there'll be more maturation in the metadata space. Even security privacy, some of these topics, which are handled separately. And I'm just talking about data security and data privacy. I'm not talking about infrastructure security. These also need to merge into a unified metadata management piece with some knowledge graph, semantic layer on top, so you can do analytics on it. So, it's no longer something that sits on the side, it's limited in its scope. 
It is actually the very engine, the very glue that is going to connect data producers and consumers. >> Great. Thank you for that. Doug. Doug Henschen, any thoughts on what Sanjeev just said? Do you agree? Do you disagree? >> Well, I agree with many aspects of what he says. I think, there's a huge opportunity for consolidation and streamlining of these aspects of governance. Last year, Sanjeev, you said something like, we'll see more people using catalogs than BI. And I have to disagree. I don't think this is a category that's headed for mainstream adoption. It's a behind the scenes activity for the wonky few, or better yet, companies want machine learning and automation to take care of these messy details. We've seen these waves of management technologies, some of the latest data observability, customer data platform, but they failed to sweep away all the earlier investments in data quality and master data management. So, yes, I hope the latest tech offers glimmers that there's going to be a better, cleaner way of addressing these things. But to my mind, the business leaders, including the CIO, only want to spend as much time and effort and money and resources on these sorts of things to avoid getting breached, ending up in headlines, getting fired or going to jail. So, vendors, bring on the ML and AI smarts and the automation of these sorts of activities. >> So, if I may say something, the reason why we have this dichotomy between data catalog and the BI vendors is because data catalogs are, very soon, not going to be standalone products, in my opinion. They're going to get embedded. So, when you use a BI tool, you'll actually use the catalog to find out what is it that you want to do, whether you are looking for data or you're looking for an existing dashboard. So, the catalog becomes embedded into the BI tool. >> Hey, Dave Menninger, sometimes you have some data in your back pocket. Do you have any stats (chuckles) on this topic? 
>> No, I'm glad you asked, because I'm going to... Now, data catalogs are something that's interesting. Sanjeev made a statement that data catalogs are falling out of favor. I don't care what you call them. They're valuable to organizations. Our research shows that organizations that have adequate data catalog technologies are three times more likely to express satisfaction with their analytics for just the reasons that Sanjeev was talking about. You can find what you want, you know you're getting the right information, you know whether or not it's trusted. So, those are good things. So, we expect to see the capabilities, whether it's embedded or separate. We expect to see those capabilities continue to permeate the market. >> And a lot of those catalogs are driven now by machine learning and things. So, they're learning from those patterns of usage by people when people use the data. (airy laughs) >> All right. Okay. Thank you, guys. All right. Let's move on to the next one. Tony Baer, let's bring up the predictions. You got something in here about the modern data stack. We need to rethink it. Is the modern data stack getting long in the tooth? Is it not so modern anymore? >> I think, in a way, it's got almost too modern. It's gotten too, I don't know if it's being long in the tooth, but it is getting long. The modern data stack, it's traditionally been defined as basically you have the data platform, which would be the operational database and the data warehouse. And in between, you have all the tools that are necessary to essentially get that data from the operational realm or the streaming realm for that matter into basically the data warehouse, or as we might be seeing more and more, the data lakehouse. And I think, what's important here is that, or I think, we have seen a lot of progress, and this would be in the cloud, is with the SaaS services. 
And especially you see that in the modern data stack, which is like all these players, not just the MongoDBs or the Oracles or the Amazons have their database platforms. You see the Informaticas, and all the other players there, and Fivetrans have their own SaaS services. And within those SaaS services, you get a certain degree of simplicity, which is it takes all the housekeeping off the shoulders of the customers. That's a good thing. The problem is that what we're getting to unfortunately is what I would call lots of islands of simplicity, which means that it leaves it (Dave laughing) to the customer to have to integrate or put all that stuff together. It's a complex tool chain. And so, what we really need to think about here, we have too many pieces. And going back to the discussion of catalogs, it's like we have so many catalogs out there, which one do we use? 'Cause chances are most organizations do not rely on a single catalog at this point. What I'm calling on all the data providers or all the SaaS service providers, is to literally get it together and essentially make this modern data stack less of a stack, make it more of a blending of an end-to-end solution. And that can come in a number of different ways. Part of it is where data platform providers have been adding services that are adjacent. And there's some very good examples of this. We've seen progress over the past year or so. For instance, MongoDB integrating search. It's a very common, I guess, sort of tool that basically, that the applications that are developed on MongoDB use, so MongoDB then built it into the database rather than requiring an extra Elasticsearch or OpenSearch stack. Amazon just... AWS just did the zero-ETL, which is a first step towards simplifying the process of going from Aurora to Redshift. You've seen the same thing with Google, BigQuery integrating basically streaming pipelines. And you're seeing also a lot of movement in in-database machine learning. 
So, there's some good moves in this direction. I expect to see more of this this year. Part of it's from basically the SaaS platforms adding some functionality. But I also see more importantly, because you're never going to get... This is like asking your data team and your developers, herding cats, to standardize on the same tool. In most organizations, that is not going to happen. So, take a look at the most popular combinations of tools and start to come up with some pre-built integrations and pre-built orchestrations, and offer some promotional pricing, maybe not quite two-for-one, but in other words, get two products or two services for the price of one and a half. I see a lot of potential for this. And to me, if the cloud was to simplify things, this is the next logical step and I expect to see more of this here. >> Yeah, and you see in Oracle MySQL HeatWave, yet another example of eliminating that ETL. Carl Olofson, today, if you think about the data stack and the application stack, they're largely separate. Do you have any thoughts on how that's going to play out? Does that play into this prediction? What do you think? >> Well, I think, that the... I really like Tony's phrase, islands of simplification. It really says (Tony chuckles) what's going on here, which is that all these different vendors you ask about, about how these stacks work. All these different vendors have their own stack vision. And you can... One application group is going to use one, and another application group is going to use another. And some people will say, let's go to, like you go to an Informatica conference and they say, we should be the center of your universe, but you can't connect everything in your universe to Informatica, so you need to use other things. So, the challenge is how do we make those things work together? As Tony has said, and I totally agree, we're never going to get to the point where people standardize on one organizing system. 
So, the alternative is to have metadata that can be shared amongst those systems and protocols that allow those systems to coordinate their operations. This is standard stuff. It's not easy. But the motive for the vendors is that they can become more active critical players in the enterprise. And of course, the motive for the customer is that things will run better and more completely. So, I've been looking at this in terms of two kinds of metadata. One is the meaning metadata, which says what data can be put together. The other is the operational metadata, which says basically where did it come from? Who created it? What's its current state? What's the security level? Et cetera, et cetera, et cetera. The good news is the operational stuff can actually be done automatically, whereas the meaning stuff requires some human intervention. And as we've already heard from, was it Doug, I think, people are disinclined to put a lot of definition into meaning metadata. So, that may be the harder one, but coordination is key. This problem has been with us forever, but with the addition of new data sources, with streaming data with data in different formats, the whole thing has, it's been like what a customer of mine used to say, "I understand your product can make my system run faster, but right now I just feel I'm putting my problems on roller skates. (chuckles) I don't need that to accelerate what's already not working." >> Excellent. Okay, Carl, let's stay with you. I remember in the early days of the big data movement, Hadoop movement, NoSQL was the big thing. And I remember Amr Awadallah said to us in theCUBE that SQL is the killer app for big data. So, your prediction here, if we bring that up is SQL is back. Please elaborate. >> Yeah. So, of course, some people would say, well, it never left. 
Actually, that's probably closer to true, but in the perception of the marketplace, there's been all this noise about alternative ways of storing, retrieving data, whether it's in key value stores or document databases and so forth. We're getting a lot of messaging that for a while had persuaded people that, oh, we're not going to do analytics in SQL anymore. We're going to use Spark for everything, except that only a handful of people know how to use Spark. Oh, well, that's a problem. Well, how about, and for ordinary conventional business analytics, Spark is like an over-engineered solution to the problem. SQL works just great. What's happened in the past couple years, and what's going to continue to happen is that SQL is insinuating itself into everything we're seeing. We're seeing all the major data lake providers offering SQL support, whether it's Databricks or... And of course, Snowflake is loving this, because that is what they do, and their success certainly points to the success of SQL, even MongoDB. And we were all, I think, at the MongoDB conference where on one day, we hear SQL is dead. They're not teaching SQL in schools anymore, and this kind of thing. And then, a couple days later at the same conference, they announced we're adding a new analytic capability based on SQL. But didn't you just say SQL is dead? So, the reality is that SQL is better understood than most other methods of certainly of retrieving and finding data in a data collection, no matter whether it happens to be relational or non-relational. And even in systems that are very non-relational, such as graph and document databases, their query languages are being built or extended to resemble SQL, because SQL is something people understand. >> Now, you remember when we were in high school and you had to take the... You're debating in the class and you were forced to take one side and defend it. 
So, I was at a Vertica conference one time up on stage with Curt Monash, and I had to take the NoSQL, the world is changing paradigm shift. And so just to be controversial, I said to him, Curt Monash, I said, who really needs ACID compliance anyway? Tony Baer. And so, (chuckles) of course, his head exploded, but what are your thoughts (guests laughing) on all this? >> Well, my first thought is congratulations, Dave, for surviving being up on stage with Curt Monash. >> Amen. (group laughing) >> I definitely would concur with Carl. We actually are definitely seeing a SQL renaissance and if there's any proof of the pudding here, I see lakehouse as being the icing on the cake. As Doug had predicted last year, now, (clears throat) for the record, I think, Doug was about a year ahead of time in his predictions that this year is really the year that I see (clears throat) the lakehouse ecosystems really firming up. You saw the first shots last year. But anyway, on this, data lakes will not go away. I've actually, I'm on the home stretch of doing a market, a landscape on the lakehouse. And lakehouse will not replace data lakes in terms of that. There is the need for those, data scientists who do know Python, who know Spark, to go in there and basically do their thing without all the restrictions or the constraints of a pre-built, pre-designed table structure. I get that. Same thing for developing models. But on the other hand, there is huge need. Basically, (clears throat) maybe MongoDB was saying that we're not teaching SQL anymore. Well, maybe we have an oversupply of SQL developers. Well, I'm being facetious there, but there is a huge skills base in SQL. Analytics have been built on SQL. Then came the lakehouse, and why this really helps to fuel a SQL revival is that the core need in the data lake, what brought on the lakehouse was not so much SQL, it was a need for ACID. And what was the best way to do it? It was through a relational table structure. 
So, the whole idea of ACID in the lakehouse was not to turn it into a transaction database, but to make the data trusted, secure, and more granularly governed, where you could govern down to column and row level, which you really could not do in a data lake or a file system. So, while lakehouse can be queried in a manner, you can go in there with Python or whatever, it's built on a relational table structure. And so, for that end, for those types of data lakes, it becomes the end state. You cannot bypass that table structure as I learned the hard way during my research. So, the bottom line I'd say here is that lakehouse is proof that we're starting to see the revenge of the SQL nerds. (Dave chuckles) >> Excellent. Okay, let's bring back up the predictions. Dave Menninger, this one's really thought-provoking and interesting. We're hearing things like data as code, new data applications, machines actually generating plans with no human involvement. And your prediction is the definition of data is expanding. What do you mean by that? >> So, I think, for too long, we've thought about data as the, I would say facts that we collect the readings off of devices and things like that, but data on its own is really insufficient. Organizations need to manipulate that data and examine derivatives of the data to really understand what's happening in their organization, why has it happened, and to project what might happen in the future. And my comment is that these data derivatives need to be supported and managed just like the data needs to be managed. We can't treat this as entirely separate. Think about all the governance discussions we've had. Think about the metadata discussions we've had. If you separate these things, now you've got more moving parts. We're talking about simplicity and simplifying the stack. So, if these things are treated separately, it creates much more complexity. 
I also think it creates a little bit of a myopic view on the part of the IT organizations that are acquiring these technologies. They need to think more broadly. So, for instance, metrics. Metric stores are becoming a much more common part of the tooling that's part of a data platform. Similarly, feature stores are gaining traction. So, those are designed to promote the reuse and consistency across the AI and ML initiatives. The elements that are used in developing an AI or ML model. And let me go back to metrics and just clarify what I mean by that. So, any type of formula involving the data points. I'm distinguishing metrics from features that are used in AI and ML models. And the data platforms themselves are increasingly managing the models as an element of data. So, just like figuring out how to calculate a metric. Well, if you're going to have the features associated with an AI and ML model, you probably need to be managing the model that's associated with those features. The other element where I see expansion is around external data. Organizations for decades have been focused on the data that they generate within their own organization. We see more and more of these platforms acquiring and publishing data to external third-party sources, whether they're within some sort of a partner ecosystem or whether it's a commercial distribution of that information. And our research shows that when organizations use external data, they derive even more benefits from the various analyses that they're conducting. And the last great frontier in my opinion on this expanding world of data is the world of driver-based planning. Very few of the major data platform providers provide these capabilities today. These are the types of things you would do in a spreadsheet. And we all know the issues associated with spreadsheets. They're hard to govern, they're error-prone. 
And so, if we can take that type of analysis, collecting the occupancy of a rental property, the projected rise in rental rates, the fluctuations perhaps in occupancy, the interest rates associated with financing that property, we can project forward. And that's a very common thing to do. What the income might look like from that property income, the expenses, we can plan and purchase things appropriately. So, I think, we need this broader purview and I'm beginning to see some of those things happen. And the evidence today I would say, is more focused around the metric stores and the feature stores starting to see vendors offer those capabilities. And we're starting to see the ML ops elements of managing the AI and ML models find their way closer to the data platforms as well. >> Very interesting. When I hear metrics, I think of KPIs, I think of data apps, orchestrate people and places and things to optimize around a set of KPIs. It sounds like a metadata challenge more... Somebody once predicted they'll have more metadata than data. Carl, what are your thoughts on this prediction? >> Yeah, I think that what Dave is describing as data derivatives is in a way, another word for what I was calling operational metadata, which is not about the data itself, but how it's used, where it came from, what the rules are governing it, and that kind of thing. If you have a rich enough set of those things, then not only can you do a model of how well your vacation property rental may do in terms of income, but also how well your application that's measuring that is doing for you. In other words, how many times have I used it, how much data have I used and what is the relationship between the data that I've used and the benefits that I've derived from using it? Well, we don't have ways of doing that. What's interesting to me is that folks in the content world are way ahead of us here, because they have always tracked their content using these kinds of attributes. 
Where did it come from? When was it created, when was it modified? Who modified it? And so on and so forth. We need to do more of that with the structured data that we have, so that we can track how it's used. And also, it tells us how well we're doing with it. Is it really benefiting us? Are we being efficient? Are there improvements in processes that we need to consider? Because maybe data gets created and then it isn't used or it gets used, but it gets altered in some way that actually misleads people. (laughs) So, we need the mechanisms to be able to do that. So, I would say that that's... And I'd say that it's true that we need that stuff. I think, that starting to expand is probably the right way to put it. It's going to be expanding for some time. I think, we're still a distance from having all that stuff really working together. >> Maybe we should say it's gestating. (Dave and Carl laughing) >> Sorry, if I may- >> Sanjeev, yeah, I was going to say this... Sanjeev, please comment. This sounds to me like it supports Zhamak Dehghani's principles, but please. >> Absolutely. So, whether we call it data mesh or not, I'm not getting into that conversation, (Dave chuckles) but data (audio breaking) (Tony laughing) everything that I'm hearing what Dave is saying, Carl, this is the year when data products will start to take off. I'm not saying they'll become mainstream. They may take a couple of years to become so, but this is data products, all this thing about vacation rentals and how is it doing, that data is coming from different sources. I'm packaging it into our data product. And to Carl's point, there's a whole operational metadata associated with it. The idea is for organizations to see things like developer productivity, how many releases am I doing of this? What data products are most popular? 
I'm actually right now in the process of formulating this concept that just like we had data catalogs, we are very soon going to be requiring a data products catalog. So, I can discover these data products. I'm not just creating data products left, right, and center. I need to know, do they already exist? What is the usage? If no one is using a data product, maybe I want to retire it and save cost. But this is a data product. Now, there's an associated thing that is also getting debated quite a bit called data contracts. And a data contract to me is literally just formalization of all these aspects of a product. How do you use it? What is the SLA on it, what is the quality that I am prescribing? So, data product, in my opinion, shifts the conversation to the consumers or to the business people. Up to this point when, Dave, you're talking about data and all of data discovery curation is very data producer-centric. So, I think, we'll see a shift more into the consumer space. >> Yeah. Dave, can I just jump in there just very quickly there, which is that what Sanjeev has been saying there, this is really central to what Zhamak has been talking about. It's basically about making, one, data products are about the lifecycle management of data. Metadata is just elemental to that. And essentially, one of the things that she calls for is making data products discoverable. That's exactly what Sanjeev was talking about. >> By the way, did everyone just notice how Sanjeev just snuck in another prediction there? So, we've got- >> Yeah. (group laughing) >> But you- >> Can we also say that he snuck in, I think, the term that we'll remember today, which is metadata museums. >> Yeah, but- >> Yeah. >> And also comment to, Tony, to your last year's prediction, you're really talking about it's not something that you're going to buy from a vendor. >> No. >> It's very specific >> Mm-hmm. >> to an organization, their own data product. So, touche on that one. Okay, last prediction. 
Let's bring them up. Doug Henschen, BI analytics is headed to embedding. What does that mean? >> Well, we all know that conventional BI dashboarding reporting is really commoditized from a vendor perspective. It never enjoyed truly mainstream adoption. Always that 25% of employees are really using these things. I'm seeing rising interest in embedding concise analytics at the point of decision or better still, using analytics as triggers for automation and workflows, and not even necessitating human interaction with visualizations, for example, if we have confidence in the analytics. So, leading companies are pushing for next generation applications, part of this low-code, no-code movement we've seen. And they want to build that decision support right into the app. So, the analytic is right there. Leading enterprise apps vendors, Salesforce, SAP, Microsoft, Oracle, they're all building smart apps with the analytics predictions, even recommendations built into these applications. And I think, the progressive BI analytics vendors are supporting this idea of driving insight to action, not necessarily necessitating humans interacting with it if there's confidence. So, we want prediction, we want embedding, we want automation. This low-code, no-code development movement is very important to bringing the analytics to where people are doing their work. We got to move beyond the, what I call swivel chair integration, between where people do their work and going off to separate reports and dashboards, and having to interpret and analyze before you can go back and take action. >> And Dave Menninger, today, if you want analytics or you want to absorb what's happening in the business, you typically got to go ask an expert, and then wait. So, what are your thoughts on Doug's prediction? >> I'm in total agreement with Doug. I'm going to say that collectively... So, how did we get here? I'm going to say collectively as an industry, we made a mistake. 
We made BI and analytics separate from the operational systems. Now, okay, it wasn't really a mistake. We were limited by the technology available at the time. Decades ago, we had to separate these two systems, so that the analytics didn't impact the operations. You don't want the operations preventing you from being able to do a transaction. But we've gone beyond that now. We can bring these two systems and worlds together and organizations recognize that need to change. As Doug said, the majority of the workforce and the majority of organizations doesn't have access to analytics. That's wrong. (chuckles) We've got to change that. And one of the ways that's going to change is with embedded analytics. 2/3 of organizations recognize that embedded analytics are important and it even ranks higher in importance than AI and ML in those organizations. So, it's interesting. This is a really important topic to the organizations that are consuming these technologies. The good news is it works. Organizations that have embraced embedded analytics are more comfortable with self-service than those that have not, as opposed to turning somebody loose, in the wild with the data. They're given a guided path to the data. And the research shows that 65% of organizations that have adopted embedded analytics are comfortable with self-service compared with just 40% of organizations that are turning people loose in an ad hoc way with the data. So, totally behind Doug's predictions. >> Can I just break in with something here, a comment on what Dave said about what Doug said, which (laughs) is that I totally agree with what you said about embedded analytics. And at IDC, we made a prediction in our future intelligence, future of intelligence service three years ago that this was going to happen. And the thing that we're waiting for is for developers to build... You have to write the applications to work that way. It just doesn't happen automagically. 
Developers have to write applications that reference analytic data and apply it while they're running. And that could involve simple things like complex queries against the live data, which is through something that I've been calling analytic transaction processing. Or it could be through something more sophisticated that involves AI operations as Doug has been suggesting, where the result is enacted pretty much automatically unless the scores are too low and you need to have a human being look at it. So, I think that that is definitely something we've been watching for. I'm not sure how soon it will come, because it seems to take a long time for people to change their thinking. But I think, as Dave was saying, once they do and they apply these principles in their application development, the rewards are great. >> Yeah, this is very much, I would say, very consistent with what we were talking about, I was talking about before, about basically rethinking the modern data stack and going into more of an end-to-end solution. I think, that what we're talking about clearly here is operational analytics. There'll still be a need for your data scientists to go offline just in their data lakes to do all that very exploratory and that deep modeling. But clearly, it just makes sense to bring operational analytics into where people work into their workspace and further flatten that modern data stack. >> But with all this metadata and all this intelligence, we're talking about injecting AI into applications, it does seem like we're entering a new era of not only data, but new era of apps. Today, most applications are about filling forms out or codifying processes and require a human input. 
And it seems like there's enough data now and enough intelligence in the system that the system can actually pull data from, whether it's the transaction system, e-commerce, the supply chain, ERP, and actually do something with that data without human involvement, present it to humans. Do you guys see this as a new frontier? >> I think, that's certainly- >> Very much so, but it's going to take a while, as Carl said. You have to design it, you have to get the prediction into the system, you have to get the analytics at the point of decision has to be relevant to that decision point. >> And I also recall basically a lot of the ERP vendors back like 10 years ago, were promising that. And the fact that we're still looking at the promises shows just how difficult, how much of a challenge it is to get to what Doug's saying. >> One element that could be applied in this case is (indistinct) architecture. If applications are developed that are event-driven rather than following the script or sequence that some programmer or designer had preconceived, then you'll have much more flexible applications. You can inject decisions at various points using this technology much more easily. It's a completely different way of writing applications. And it actually involves a lot more data, which is why we should all like it. (laughs) But in the end (Tony laughing) it's more stable, it's easier to manage, easier to maintain, and it's actually more efficient, which is the result of an MIT study from about 10 years ago, and still, we are not seeing this come to fruition in most business applications. >> And do you think it's going to require a new type of data platform database? Today, data's all far-flung. We see that's all over the clouds and at the edge. Today, you cache- >> We need a super cloud. >> You cache that data, you're throwing into memory. I mentioned, MySQL HeatWave. 
There are other examples where it's a brute force approach, but maybe we need new ways of laying data out on disk and new database architectures, and just when we thought we had it all figured out. >> Well, without referring to disk, which to my mind, is almost like talking about cave painting. I think, that (Dave laughing) all the things that have been mentioned by all of us today are elements of what I'm talking about. In other words, the whole improvement of the data mesh, the improvement of metadata across the board and improvement of the ability to track data and judge its freshness the way we judge the freshness of a melon or something like that, to determine whether we can still use it. Is it still good? That kind of thing. Bringing together data from multiple sources dynamically and in real time requires all the things we've been talking about. All the predictions that we've talked about today add up to elements that can make this happen. >> Well, guys, it's always tremendous to get these wonderful minds together and get your insights, and I love how it shapes the outcome here of the predictions, and let's see how we did. We're going to leave it there. I want to thank Sanjeev, Tony, Carl, David, and Doug. Really appreciate the collaboration and thought that you guys put into these sessions. Really, thank you. >> Thank you. >> Thanks, Dave. >> Thank you for having us. >> Thanks. >> Thank you. >> All right, this is Dave Vellante for theCUBE, signing off for now. Follow these guys on social media. Look for coverage on siliconangle.com, theCUBE.net. Thank you for watching. (upbeat music)

Published Date : Jan 11 2023

SUMMARY :

and pleased to tell you (Tony and Dave faintly speaks) that led them to their conclusion. down, the funding in VC IPO market. And I like how the fact And I happened to have tripped across I talked to Walmart in the prediction of graph databases. But I stand by the idea and maybe to the edge. You can apply graphs to great And so, it's going to streaming data permeates the landscape. and to be honest, I like the tough grading the next 20 to 25% of and of course, the degree of difficulty. that sits on the side, Thank you for that. And I have to disagree. So, the catalog becomes Do you have any stats for just the reasons that And a lot of those catalogs about the modern data stack. and more, the data lakehouse. and the application stack, So, the alternative is to have metadata that SQL is the killer app for big data. but in the perception of the marketplace, and I had to take the NoSQL, being up on stage with Curt Monash. (group laughing) is that the core need in the data lake, And your prediction is the and examine derivatives of the data to optimize around a set of KPIs. that folks in the content world (Dave and Carl laughing) going to say this... shifts the conversation to the consumers And essentially, one of the things (group laughing) the term that we'll remember today, to your last year's prediction, is headed to embedding. and going off to separate happening in the business, so that the analytics didn't And the thing that we're waiting for and that deep modeling. that the system can of decision has to be relevant And the fact that we're But in the end We see that's all over the You cache that data, and improvement of the and I love how it shapes the outcome here Thank you for watching.

ENTITIES

Entity	Category	Confidence
Dave	PERSON	0.99+
Doug Henschen	PERSON	0.99+
Dave Menninger	PERSON	0.99+
Doug	PERSON	0.99+
Carl	PERSON	0.99+
Carl Olofson	PERSON	0.99+
Dave Menninger	PERSON	0.99+
Tony Baer	PERSON	0.99+
Tony	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Collibra	ORGANIZATION	0.99+
Curt Monash	PERSON	0.99+
Sanjeev Mohan	PERSON	0.99+
Christian Kleinerman	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Walmart	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Sanjeev	PERSON	0.99+
Constellation Research	ORGANIZATION	0.99+
IBM	ORGANIZATION	0.99+
Ventana Research	ORGANIZATION	0.99+
2022	DATE	0.99+
Hazelcast	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Tony Baer	PERSON	0.99+
25%	QUANTITY	0.99+
2021	DATE	0.99+
last year	DATE	0.99+
65%	QUANTITY	0.99+
Google	ORGANIZATION	0.99+
today	DATE	0.99+
five-year	QUANTITY	0.99+
TigerGraph	ORGANIZATION	0.99+
Databricks	ORGANIZATION	0.99+
two services	QUANTITY	0.99+
Amazon	ORGANIZATION	0.99+
David	PERSON	0.99+
RisingWave Labs	ORGANIZATION	0.99+

Ali Ghodsi, Databricks | Informatica World 2019

>> Live from Las Vegas, it's theCUBE, covering Informatica World 2019. Brought to you by Informatica. >> Welcome back everyone to theCUBE's live coverage of Informatica World 2019. I'm your host Rebecca Knight, along with my co-host John Furrier. We're joined by Ali Ghodsi, he is the CEO of Databricks, thank you so much for coming on, for returning to theCUBE. You're a CUBE veteran. >> Yes, thank you for having me. >> So I want to pick up on something that you said up on the main stage, and that is that every enterprise on the planet wants to add AI capabilities, but the hardest part of AI is not AI, it's the data. >> Yeah. >> Can you riff on that a little bit for our viewers? Elaborate? >> Yeah, actually, the interesting part is that, if you look at the company that succeeded with AI, the actual AI algorithms they're using, are actually algorithms from the 70s, you know, they're actually developed in the 70s, that's 50 years ago. So then how come they're succeeding now? When actually the same algorithms weren't working in the 70s, so people gave up on them. Like, these things called neural nets, right? Now they're en vogue and they're, you know, super successful. The reason is you have to apply orders of magnitude more data. If you feed those algorithms that we thought were broken orders of magnitude more data, you actually get great results, but that's actually hard. You know, dealing with petabyte scale data and cleaning it, making sure that it's actually the right data for the task at hand is not easy. So that's the part that people are struggling with. >> I saw you up on stage, I'm like ah, Ali's here, Databricks is here, that's awesome. Psyched that you stopped by theCUBE. Been a while. I wanted to get a quick update, 'cause you guys have been on a tear, doing some great work at Cal, we were just told before we came on camera. But what are you doing here? What's the, is there any announcements or news with Informatica? What's the story? 
>> Yeah, it's, we're doing a partnership around Delta Lake, which is our next generation engine that we built, so we're super excited about that. It integrates with all of the Informatica platform. So their ingestion tools, their transformation tools, and the catalog that they also have. So we think together, this can actually really help enterprises make that transition into the AI era. >> So you know, we've been followers, our 10th year, so remember when we were in the Cloudera office of Mike Olsen and Amr Awadallah when we first started and now, Hadoop movement started, and then the cloud came along. Right when you guys started your company, the cloud growth took off. You guys were instrumental in changing the equation in dealing with data, data lakes, whatever they're calling it back then. So now, data, holistically, is a systems architecture. On premise it's a huge challenge, cloud native, well no real challenge, people love that. Data feeds AI, lot of risk taking, lot of reward. We're seeing the SaaS business explode, Zoom communications. The list goes on and on. Do you know, enterprise that's trying to be SaaS is hard. So you can't just take data from an enterprise and make it SaaS-ified. You really got to think differently. What are you guys doing? How have you guys evolved and vectored into that challenge, because this is where your core value proposition initially started change. Take us through that Databricks story and how you're solving that problem today. >> Yeah, it's a great question. Really what happened is that people started collecting a lot of data about a decade ago. And the promise was, you can do great things with this. There are all these aspirational use cases around machine learning, real time, it's going to be amazing. Right? So people started collecting it. They started storing one petabyte, two petabytes, and they kept going back to their boss and saying this project is real successful I now have five petabytes in it. 
But at some point the business said, okay that's great but what can you do with it? What business problems are you actually addressing? What are you solving? And so, in the last couple years there's been a push towards let's prove the value of these data lakes. And actually, many of these projects are falling short. Many are failing. And the reason is, people have just been dumping this data into data lakes without thinking about, the structure, the quality, how it's going to be used. The use cases have been an afterthought. So the number one thing in the top of mind for everyone right now is how do we make these data lakes that we have successful so we can prove some business value to our management? Towards this, this is the main problem that we're focusing on. Towards this, we built something called Delta Lake. It's something you situate on top of your data lake. And what it does is it increases the quality, the reliability, the performance, and the scale of your data lake. >> (John) So it's like a filter. >> Yeah. >> The cream rises to the top. >> (Ali) Exactly. >> Lets the sludge, the data swamp stay below the clean water, if you will. >> Exactly actually you nailed it. So basically, we look at the data as it comes in, filter as you said, and then look at, if there's any quality issues we then put it back in the data lake. It's fine, it can stay there. We'll figure out how to get value out of it later. But if it makes it into the Delta Lake, it will have high quality. Right? So that's great. And since we're anyway already looking at all the data as it's coming in, we might as well also store a lot of indices and a lot of things that let us performance optimize it later on. So that, later, when people are actually trying to use that data they get really high performance, they get really good quality. And we also added ACID transactions to it so that now you're also getting all those transactional use cases working on your existing data lake. 
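(Editor's note: the "filter" idea described above, where incoming records are checked for quality and only the clean ones are promoted while the rest stay in the raw lake, can be sketched roughly as follows. This is a minimal illustration in plain Python, not Delta Lake's actual API or implementation; the `Record` fields, the `passes_quality_check` rule, and the `route` helper are all hypothetical names chosen for the example.)

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    # Hypothetical record shape; real data would have its own schema.
    user_id: Optional[str]
    amount: Optional[float]


def passes_quality_check(record: Record) -> bool:
    # Assumed quality rule: an id must be present and the amount must be
    # a non-negative number.
    return bool(record.user_id) and record.amount is not None and record.amount >= 0


def route(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    # Split incoming records into a curated set (passed the check) and a
    # raw set (kept in the lake so value can still be extracted later).
    curated: List[Record] = []
    raw: List[Record] = []
    for record in records:
        if passes_quality_check(record):
            curated.append(record)
        else:
            raw.append(record)
    return curated, raw
```

(Nothing is discarded in this sketch: records that fail the check are simply not promoted, which mirrors the point that low-quality data "can stay there" in the lake.)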
>> I saw, at my daughter's graduation in Cal Berkeley this weekend and yesterday, people around with Databricks backpacks. Very popular in academic. You guys got the young generation coming in. What's the update on the company? How many employees? What's the traction? Give us a quick business update. >> Yeah we're about 800 employees now. About 100 people in Europe, I would say, and maybe 40-50 people in Asiapac. We're expanding the ME and the Asia business. >> (John) Growth mode. >> Yeah, growth mode. So it's expanding as fast as possible. I mean, I actually, as a CEO, I try to always, slow the hiring down to make sure that we keep the quality bars. So that's actually top of mind for me. But yeah we're-- >> (John) You did Delta Lake on that one. >> Yeah (laughing) >> Exactly. Yeah and we're super excited about working with these universities. We get a lot of graduate students from top universities-- >> And Cal had the first ever class in college of data analytics, what was that? Data analytics are the first inaugural class graduated. Shows how early it is. >> Yeah, yeah, yeah. And actually used Databricks, the community edition, for a class of over a thousand students at Cal used the platform. So they're going to be trained in data science as they come out. >> So I want to ask about that because as you said you're trying to slow down the hiring to make sure that you are maintaining a high bar for your new hires. But yet, I'm sure there's a huge demand because you are in growth mode. So what are you doing? You said you're working with universities to make sure that the next generation is trained up and is capable of performing at Databricks. So tell us more about those efforts. >> Yeah I mean, so, obviously university recruiting is big for us. Cal, I think Databricks has the longest line of all the companies that come there on the career fair day. So, we work very closely with these universities. 
I think, next generation, as they come out, this generation that's coming out today actually is data science trained. So it's a big difference. There is a huge skills gap out there. Every big enterprise you talk to tells you my biggest problem is actually, I don't have skilled people. Can you help me hire people? I say, hey we're not in the recruiting business. But, the good news is, if you look at the universities, they're all training thousands and thousands of data scientists every year now. I can tell you just at Cal, because, I happen to be on the faculty there, is, almost every applicant now, to grad school, wants to do something AI related. Which has actually led to, if you look at all the programs in universities today, people used to do networking, professors used to do networking, say we do intelligent networks. People who do databases say, we do intelligent databases. People who do systems research say, hey we do intelligent systems, right? So what that means is, in a couple years you'll have lots of students coming out and these companies, that are now struggling hiring, then will be able to hire this talent and will actually succeed better with these AI projects. >> As they say in Berkeley, nothing like a good revolution once in a while. AI is kind of changing everyone over. I got to ask you for the young kids out there, and parents who have kids either in elementary school or high school, everyone is trying to figure out, and there's no yet clear playbook, we're starting to see first generation training, but is there a skill set, because there's a range in surface area, you got hardcore coding to ethics, and everything in between from visualization, multiple dimensions of opportunities. 
What skills do you think that people could hone or tweak that may not be on a curriculum that they could get, or pieces of different curriculums in school that would be a good foundation for folks learning and wanting to jump in to data and data value, whether it's coding to ethics? >> Yeah, just looking at my own background and seeing how, what I got to learn in school, the thing that was lacking, compared to what's needed today, is statistics. Understanding of statistics, statistical knowledge, That I think, it's going to be pervasive. So I think, 10, 15 years from now, no matter which field you're in, actually whatever job you have, you have to have some basic level of statistical understanding 'cause the systems you're working with will be, they'll be spitting out statistics and numbers and you need to understand what is false positives, what is this, what is the sample, what is that? What do these things mean? So that's one thing that's definitely missing and actually it's coming, that's one. The second is computing will continue being important. So, in the intersection of those two is, I think a lot of those jobs. >> In all fields, we were talking about earlier, biology, everything's intersecting, biochemistry to whatever right? >> (Ali) Yeah. >> I got to ask you about, well I'm a little old school, I'm 53 years old but I remember when I broke into the business coding, I used to walk into departments, they were called DP, data processing. So we're getting into the data processing world now, you've got statistics, you've got pipeline, these are data concepts. So I got to ask you as companies that are in the enterprise may be slower to move to the cutting edge like you guys are, they got to figure out where to store the data. So can you share your opinion or view on how customers are thinking and how they maybe should be architecting data on premise, in the cloud. 
Certainly cloud's great, if you're getting cloud native for pure SaaS, and born in the cloud like a start-up. But if you're a large enterprise, and you want to be SaaS-like, to have all that benefit, take the risk with the reward of being agile, you got to have data because if you don't feed the data into the machine learning or AI, you're not going to have good AI. So you need to get that data feeding in fast. And if it's constrained with regulation compliance you're screwed. So what's your view on this? Where should it be stored? What's your opinion? >> Yeah, we've had the same opinion for five, six years, right? Which is the data belongs in the cloud. Don't try to do this yourself. Don't try to do this on prem. Don't store it in Hadoop, it's not built for this. Store it in the cloud. In the cloud, first of all, you get a lot of security benefits that the cloud vendors are already working on. So that's one good thing about it. Second, you get it, it's reliable. You get the 10, 11 nines of availability, so that's great, you get that. Start collecting data there. Another reason you want to do it in the cloud is that a lot of the data sets that you need to actually get good quality results, are available in the cloud. Often times what happens with AI is, you build a predictive model, but actually, it's terrible. It didn't work well. So you go back, and then the main trick, the first tricks you use to increase the quality is actually augmenting that data with other data sets. You might purchase those data sets from other vendors. You don't want to be shipping hard drives around or, you know, getting that into your data center. Those will be available in the cloud, so you can augment that data. So we're big fans of storing your data in data lakes, in the cloud. We obviously believe that you need to make that data high quality and reliable. With that we believe the Delta Lake platform, open-source project that we created is a great vehicle for that. 
But I think moving to the cloud is the number one thing. >> (John) And hybrid works with that if you need to have something on premise? >> In my opinion the two worlds are so different, that it's hard. You hear a lot of vendors that say we're the hybrid solution that works on both and so on. But the two models are so different, fundamentally, that it's hard to actually make them work well. I have not yet seen a customer yet or enterprise. You see a lot of offerings, where people say hybrid is the way. Of course, a lot of on prem vendors are now saying, hey, we're the hybrid solution. I haven't actually seen that be successful to be frank. Maybe someone will crack that nut but-- >> I think it's an operational question to see who can make it work. Ali, congratulations on all your success. Great to see you. >> Yeah it's been great having you on the show. >> Thank you so much for having me. >> You are watching theCUBE, Informatica 2019. I'm Rebecca Knight, for John Furrier, stay tuned.

Published Date : May 21 2019

SUMMARY :

Brought to you by Informatica. thank you so much for coming on, for returning to theCUBE. So I want to pick up on something that you said So that's the part that people are struggling with. Psyched that you stopped by theCUBE. and the catalog that they also have. So you know, we've been followers, our 10th year, And the promise was, you can do great things with this. the clean water, if you will. But if it makes it into the Delta Lake, You guys got the young generation coming in. We're expanding the ME and the Asia business. slow the hiring down to make sure that Yeah and we're super excited about And Cal had the first ever class in So they're going to be trained in data science the hiring to make sure that you are But, the good news is, if you look at the I got to ask you for the young kids out there, and numbers and you need to understand So I got to ask you as companies that are in the enterprise is that a lot of the data sets that you need But the two models are so different, fundamentally, to see who can make it work. You are watching theCUBE,

ENTITIES

Entity	Category	Confidence
Rebecca Knight	PERSON	0.99+
Ali Ghodsi	PERSON	0.99+
10	QUANTITY	0.99+
Databricks	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
John Furrier	PERSON	0.99+
Informatica	ORGANIZATION	0.99+
first	QUANTITY	0.99+
five	QUANTITY	0.99+
Cal	ORGANIZATION	0.99+
Ali	PERSON	0.99+
John	PERSON	0.99+
two	QUANTITY	0.99+
two models	QUANTITY	0.99+
thousands	QUANTITY	0.99+
one petabytes	QUANTITY	0.99+
10th year	QUANTITY	0.99+
Second	QUANTITY	0.99+
yesterday	DATE	0.99+
two petabytes	QUANTITY	0.99+
70s	DATE	0.99+
six years	QUANTITY	0.99+
Las Vegas	LOCATION	0.99+
Duke	ORGANIZATION	0.99+
five petabytes	QUANTITY	0.99+
Delta Lake	LOCATION	0.99+
both	QUANTITY	0.99+
Delta Lake	ORGANIZATION	0.99+
second	QUANTITY	0.98+
first tricks	QUANTITY	0.98+
Berkley	LOCATION	0.98+
40-50 people	QUANTITY	0.98+
two worlds	QUANTITY	0.98+
one good thing	QUANTITY	0.98+
one	QUANTITY	0.98+
Asia	LOCATION	0.98+
50 years ago	DATE	0.98+
CUBE	ORGANIZATION	0.97+
Cal Berkley	LOCATION	0.97+
over a thousand students	QUANTITY	0.97+
theCUBE	ORGANIZATION	0.96+
15 years	QUANTITY	0.96+
today	DATE	0.96+
Asiapac	LOCATION	0.96+
Mike Olsen	PERSON	0.96+
Amr Awadallah	PERSON	0.96+
About 100 people	QUANTITY	0.96+
53 years old	QUANTITY	0.95+
about 800 employees	QUANTITY	0.95+
first generation	QUANTITY	0.92+
11 lines	QUANTITY	0.92+
one thing	QUANTITY	0.91+
2019	DATE	0.89+
Informatica World 2019	EVENT	0.88+
SaaS	TITLE	0.86+
a decade ago	DATE	0.85+
thousands of data scientists	QUANTITY	0.84+
SAS	ORGANIZATION	0.84+
this weekend	DATE	0.82+
last couple years	DATE	0.81+
Informatica World	TITLE	0.62+

Mick Hollison, Cloudera | theCUBE NYC 2018

(lively peaceful music) >> Live, from New York, it's The Cube. Covering "The Cube New York City 2018." Brought to you by SiliconANGLE Media and its ecosystem partners. >> Well, everyone, welcome back to The Cube special conversation here in New York City. We're live for Cube NYC. This is our ninth year covering the big data ecosystem, now evolved into AI, machine learning, cloud. All things data in conjunction with Strata Conference, which is going on right around the corner. This is the Cube studio. I'm John Furrier. Dave Vellante. Our next guest is Mick Hollison, who is the CMO, Chief Marketing Officer, of Cloudera. Welcome to The Cube, thanks for joining us. >> Thanks for having me. >> So Cloudera, obviously we love Cloudera. Cube started in Cloudera's office, (laughing) everyone in our community knows that. I keep, keep saying it all the time. But we're so proud to have the honor of working with Cloudera over the years. And, uh, the thing that's interesting though is that the new building in Palo Alto is right in front of the old building where the first Palo Alto office was. So, a lot of success. You have a billboard in the airport. Amr Awadallah is saying, hey, it's a milestone. You're in the airport. But your business is changing. You're reaching new audiences. You have, you're public. You guys are growing up fast. All the data is out there. Tom's doing a great job. But, the business side is changing. Data is everywhere, it's a big, hardcore enterprise conversation. Give us the update, what's new with Cloudera. >> Yeah. Thanks very much for having me again. It's, it's a delight. I've been with the company for about two years now, so I'm officially part of the problem now. (chuckling) It's been a, it's been a great journey thus far. And really the first order of business when I arrived at the company was, like, welcome aboard. We're going public. Time to dig into the S-1 and reimagine who Cloudera is going to be five, ten years out from now. 
And we spent a good deal of time, about three or four months, actually crafting what turned out to be just 38 total words and kind of a vision and mission statement. But the, the most central to those was what we were trying to build. And it was a modern platform for machine learning analytics in the cloud. And, each of those words, when you unpack them a little bit, are very, very important. And this week, at Strata, we're really happy on the modern platform side. We just released Cloudera Enterprise Six. It's the biggest release in the history of the company. There are now over 30 open-source projects embedded into this, something that Amr and Mike could have never imagined back in the day when it was just a couple of projects. So, a very very large and meaningful update to the platform. The next piece is machine learning, and Hilary Mason will be giving the kickoff tomorrow, and she's probably forgotten more about ML and AI than somebody like me will ever know. But she's going to give the audience an update on what we're doing in that space. But, the foundation of having that data management platform, is absolutely fundamental and necessary to do good machine learning. Without good data, without good data management, you can't do good ML or AI. Sounds sort of simple but very true. And then the last thing that we'll be announcing this week, is around the analytics space. So, on the analytic side, we announced Cloudera Data Warehouse and Altus Data Warehouse, which is a PaaS flavor of our new data warehouse offering. And last, but certainly not least, is just the "optimize for the cloud" bit. So, everything that we're doing is optimized not just around a single cloud but around multi-cloud, hybrid-cloud, and really trying to bridge that gap for enterprises and what they're doing today. So, it's a new Cloudera to say the very least, but it's all still based on that core foundation and platform that, you got to know it, with very early on. 
>> And you guys have operating history too, so it's not like it's a pivot for Cloudera. I know for a fact that you guys had very large-scale customers, both with three letter, letters in them, the government, as well as just commercial. So, that's cool. Question I want to ask you is, as the conversation changes from, how many clusters do I have, how am I storing the data, to what problems am I solving because of the enterprises. There's a lot of hard things that enterprises want. They want compliance, all these, you know things that have either legacy. You guys work on those technical products. But, at the end of the day, they want the outcomes, they want to solve some problems. And data is clearly an opportunity and a challenge for large enterprises. What problems are you guys going after, these large enterprises in this modern platform? What are the core problems that you guys knock down? >> Yeah, absolutely. It's a great question. And we sort of categorize the way we think about addressing business problems into three broad categories. We use the terms grow, connect, and protect. So, in the "grow" sense, we help companies build or find new revenue streams. And, this is an amazing part of our business. You see it in everything from doing analytics on clickstreams and helping people understand what's happening with their web visitors and the like, all the way through to people standing up entirely new businesses based simply on their data. One large insurance provider that is a customer of ours, as an example, has taken on the challenge and asked us to engage with them on building really, effectively, insurance as a service. So, think of it as data-driven insurance rates that are gauged based on your driving behaviors in real time. So no longer simply just using demographics as the way that you determine, you know, all 18-year old young men are poor drivers. As it turns out, with actual data you can find out there's some excellent 18 year olds. 
>> Telematics, not demographics! >> Yeah, yeah, yeah, exactly! >> That Tesla don't connect to the >> Exactly! And parents will love this, love this as well, I think. So they can find out exactly how their kids are really behaving by the way. >> They're going to know I rolled through the stop signs in Palo Alto. (laughing) My rates just went up. >> Exactly, exactly. So, so helping people grow new businesses based on their data. The second piece is "Connect". This is not just simply connecting devices, but that's a big part of it, so the IOT world is a big engine for us there. One of our favorite customer stories is a company called Komatsu. It's a mining equipment manufacturer. Think of it as the ones that make those just massive mining machines that are, that are all over the world. They're particularly big in Australia. And, this is equipment that, when you let it sit somewhere, because it doesn't work, it actually starts to sink into the earth. So, being able to do predictive maintenance on that level and type and expense of equipment is very valuable to a company like Komatsu. We're helping them do that. So that's the "Connect" piece. And last is "Protect". Since data is in fact the new oil, the most valuable resource on earth, you really need to be able to protect it. Whether that's from a cyber security threat or it's just meeting compliance and regulations that are put in place by governments. Certainly GDPR has got a lot of people thinking very differently about their data management strategies. So we're helping a number of companies in that space as well. So that's how we kind of categorize what we're doing. >> So Mick, I wonder if you could address how that's all affected the ecosystem. I mean, one of the misconceptions early on was that Hadoop, Big Data, is going to kill the enterprise data warehouse. NoSQL is going to knock out Oracle. And, Mike has always said, "No, we are incremental". And people are like, "Yeah, right". But that's really what's happened here. >> Yes. 
>> EDW was a fundamental component of your big data strategies. As Amr used to say, you know, SQL is the killer app for, for big data. (chuckling) So all those data sources that have been integrated. So you kind of fast forward to today, you talked about IOT and The Edge. You guys have announced, you know, your own data warehouse and platform as a service. So you see this embracing in this hybrid world emerging. How has that affected the evolution of your ecosystem? >> Yeah, it's definitely evolved considerably. So, I think I'd give you a couple of specific areas. So, clearly we've been quite successful in large enterprises, so the big SI type of vendors want a, want a piece of that action these days. And they're, they're much more engaged than they were early days, when they weren't so sure all of this was real. >> I always say, they like to eat at the trough and then the trough is full, so they dive right in. (all laughing) They're definitely very engaged, and they built big data practices and distinctive analytics practices as well. Beyond that, sort of the developer community has also begun to shift. And it's shifted from simply people that could spell, you know, Hive or could spell Kafka and all of the various projects that are involved. And it is elevated, in particular into a data science community. So one of additional communities that we sort of brought on board with what we're doing, not just with the engine and SPARK, but also with tools for data scientists like Cloudera Data Science Workbench, has added that element to the community that really wasn't a part of it, historically. So that's been a nice add on. And then last, but certainly not least, are the cloud providers. And like everybody, they're, those are complicated relationships because on the one hand, they're incredibly valuable partners to it, certainly both Microsoft and Amazon are critical partners for Cloudera, at the same time, they've got competitive offerings. 
So, like most successful software companies there's a lot of coopetition to contend with that also wasn't there just a few years ago when we didn't have cloud offerings, and they didn't have, you know, data warehouse in the cloud offerings. But, those are things that have sort of impacted the ecosystem. >> So, I've got to ask you a marketing question, since you're the CMO. By the way, great message UL. I like the, the "grow, connect, protect." I think that's really easy to understand. >> Thank you. >> And the other one was modern. The phrase, say the phrase again. >> Yeah. It's the "Cloudera builds the modern platform for machine learning analytics optimized for the cloud." >> Very tight mission statement. Question on the name. Cloudera. >> Mmhmm. >> It's spelled, it's actually cloud with ERA in the letters, so "the cloud era." People use that term all the time. We're living in the cloud era. >> Yes. >> Cloud-native is the hottest market right now in the Linux foundation. The CNCF has over two hundred and forty members and growing. Cloud-native clearly has indicated that the new, modern developers here in the renaissance of software development, in general, enterprises want more developers. (laughs) Not that you want to be against developers, because, clearly, they're going to hire developers. >> Absolutely. >> And you're going to enable that. And then you've got the, obviously, cloud-native on-premise dynamic. Hybrid cloud and multi-cloud. So is there plans to think about that cloud era, is it a cloud positioning? You see cloud certainly important in what you guys do, because the cloud creates more compute, more capabilities to move data around. >> Sure. >> And (laughs) process it. And make it, make machine learning go faster, which gives more data, more AI capabilities, >> It's the flywheel you and I were discussing. >> It's the flywheel of, what's the innovation sandwich, Dave? You know? 
(laughs) >> A little bit of data, a little bit of machine intelligence, in the cloud. >> So, the innovation's in play. >> Yeah, absolutely. >> Positioning around Cloud. How are you looking at that? >> Yeah. So, it's a fascinating story. You were with us in the earliest days, so you know that the original architecture of everything that we built was intended to be run in the public cloud. It turns out, in 2008, there were exactly zero customers that wanted all of their data in a public cloud environment. So the company actually pivoted and re-architected the original design of the offerings to work on-prem. And, no sooner did we do that, than it was time to re-architect it yet again. And we are right in the midst of doing that. So, we really have offerings that span the whole gamut. If you want to just pick up your whole current Cloudera environment in an infrastructure as a service model, we offer something called Altus Director that allows you to do that. Just pick up the entire environment, stand it up on AWS, or Microsoft Azure, and off you go. If you want the convenience and the elasticity and the ease of use of a true platform as a service, just this past week we announced Altus Data Warehouse, which is a platform as a service kind of a model for data warehousing. We have the data engineering module for Altus as well. Last, but not least, not everybody's going to sign up for just one cloud vendor. So we're big believers in multi-cloud. And that's why we support the major cloud vendors that are out there. And, in addition to that, it's going to be a hybrid world for as far out as we can see it. People are going to have certain workloads that, either for economics or for security reasons, they're going to continue to want to run in-house. And they're going to have other workloads, certainly more transient workloads, and I think ML and data science will fall into this camp, where the public cloud's going to make a great deal of sense. 
And, allowing companies to bridge that gap while maintaining one security, compliance, and management model, something we call a Shared Data Experience, is really our core differentiator as a business. That's at the very core of what we do. >> Classic cloud workload experience that you're bringing, whether it's on-prem or whatever cloud. >> That's right. >> Cloud is an operating environment for you guys. You look at it just as >> The delivery mechanism. In effect. >> Awesome. All right, future for Cloudera. What can you share with us? I know you're a public company. Can't say any forward-looking statements. Got to do all those disclaimers. But for customers, what's the, what's the North Star for Cloudera? You mentioned going after a much more hardcore enterprise. >> Yes. >> That's clear. What's the North Star for you guys when you talk to customers? What's the big pitch? >> Yeah. I think there's a, there's a couple of really interesting things that we learned about our business over the course of the past six, nine months or so here. One was that the greatest need for our offerings is in very, very large and complex enterprises. They have the most data, not surprisingly. And they have the most business gain to be had from leveraging that data. So we narrowed our focus. We have now identified approximately five thousand global customers, so think of it as kind of Fortune or Forbes 5000. That is our sole focus. So, we are entirely focused on that end of the market. Within that market, there are certain industries that we play particularly well in. We're incredibly well-positioned in financial services. Very well-positioned in healthcare and telecommunications. Any regulated industry that really cares about how they govern and maintain their data is really the great target audience for us. And so, that continues to be the focus for the business. And we're really excited about that narrowing of focus and what opportunities that's going to build for us. 
To not just land new customers, but more to expand our existing ones into a broader and broader set of use cases. >> And data is coming down faster. There's more data growth than ever seen before. It's never stopping. It's only going to get worse. >> We love it. >> Bring it on. >> Any way you look at it, it's getting worse or better. Mick, thanks for spending the time. I know you're super busy with the event going on. Congratulations on the success, and the focus, and the positioning. Appreciate it. Thanks for coming on The Cube. >> Absolutely. Thank you gentlemen. It was a pleasure. >> We are Cube NYC. This is our ninth year doing all action. Everything that's going on in the data world now is horizontally scaling across all aspects of the company, the society, as we know. It's super important, and this is what we're talking about here in New York. This is The Cube, and John Furrier. Dave Vellante. Be back with more after this short break. Stay with us for more coverage from New York City. (upbeat music)

Published Date : Sep 13 2018

SUMMARY :

Brought to you by SiliconANGLE Media This is the Cube studio. is that the new building in Palo Alto is right So, on the analytic side, we announced What are the core problems that you guys knock down? So, in the "grow" sense, we help companies by the way. They're going to know I rolled Since data is in fact the new oil, address how that's all affected the ecosystem. How has that affected the evolution of your ecosystem? in large enterprises, so the big and all of the various projects that are involved. So, I've got to ask you a marketing question, And the other one was modern. optimized for the cloud." Question on the name. We're living in the cloud era. Cloud-native clearly has indicated that the new, because the cloud creates more compute, And (laughs) process it. machine intelligence, in the cloud. How are you looking at that? that the public cloud's going to make a great deal of sense. Classic cloud workload experience that you're bringing, Cloud is an operating environment for you guys. What can you share with us. What's the North Star for you guys is really the great target audience for us. And data is coming down faster. and the positioning. Thank you gentlemen. is horizontally scaling across all aspects of the

ENTITIES

Entity	Category	Confidence
Komatsu	ORGANIZATION	0.99+
Dave Vellante	PERSON	0.99+
Mick Hollison	PERSON	0.99+
Mike	PERSON	0.99+
Australia	LOCATION	0.99+
Amazon	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
2008	DATE	0.99+
Palo Alto	LOCATION	0.99+
Tom	PERSON	0.99+
New York	LOCATION	0.99+
Mick	PERSON	0.99+
John Furrier	PERSON	0.99+
New York City	LOCATION	0.99+
Tesla	ORGANIZATION	0.99+
CNCF	ORGANIZATION	0.99+
Hilary Mason	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
second piece	QUANTITY	0.99+
three letter	QUANTITY	0.99+
North Star	ORGANIZATION	0.99+
Amr Awadallah	PERSON	0.99+
zero customers	QUANTITY	0.99+
five	QUANTITY	0.99+
18 year	QUANTITY	0.99+
ninth year	QUANTITY	0.99+
One	QUANTITY	0.99+
Dave	PERSON	0.99+
this week	DATE	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
both	QUANTITY	0.99+
ten years	QUANTITY	0.98+
four months	QUANTITY	0.98+
over two hundred and forty members	QUANTITY	0.98+
Oracle	ORGANIZATION	0.98+
NYC	LOCATION	0.98+
first	QUANTITY	0.98+
NoSQL	TITLE	0.98+
The Cube	ORGANIZATION	0.98+
over 30 open-source projects	QUANTITY	0.98+
Amr	PERSON	0.98+
today	DATE	0.98+
SQL	TITLE	0.98+
each	QUANTITY	0.98+
GDPR	TITLE	0.98+
tomorrow	DATE	0.98+
Cube	ORGANIZATION	0.97+
approximately five thousand global customers	QUANTITY	0.97+
Strata	ORGANIZATION	0.96+
about two years	QUANTITY	0.96+
Altus	ORGANIZATION	0.96+
earth	LOCATION	0.96+
EDW	TITLE	0.95+
18-year old	QUANTITY	0.95+
Strata Conference	EVENT	0.94+
few years ago	DATE	0.94+
one	QUANTITY	0.94+
AWS	TITLE	0.93+
Altus Data Warehouse	ORGANIZATION	0.93+
first order	QUANTITY	0.93+
single cloud	QUANTITY	0.93+
Cloudera Enterprise Six	TITLE	0.92+
about three	QUANTITY	0.92+
Cloudera	TITLE	0.84+
three broad categories	QUANTITY	0.84+
past six	DATE	0.82+

Kickoff | theCUBE NYC 2018

>> Live from New York, it's theCUBE covering theCUBE New York City 2018. Brought to you by SiliconANGLE Media and its ecosystem partners. (techy music) >> Hello, everyone, welcome to this CUBE special presentation here in New York City for CUBENYC. I'm John Furrier with Dave Vellante. This is our ninth year covering the big data industry, starting with Hadoop World and evolved over the years. This is our ninth year, Dave. We've been covering Hadoop World, Hadoop Summit, Strata Conference, Strata Hadoop. Now it's called Strata Data, I don't know what Strata O'Reilly's going to call it next. As you all know, theCUBE has been present at the creation of the Hadoop big data ecosystem. We're here for our ninth year, certainly a lot's changed. AI's the center of the conversation, and certainly we've seen some horses come in, some haven't come in, and trends have emerged, some gone away, your thoughts. Nine years covering big data. >> Well, John, I remember fondly, vividly, the call that I got. I was in Dallas at a Storage Networking World show and you called and said, "Hey, we're doing Hadoop World, get over there," and of course, Hadoop, big data, was the new, hot thing. I told everybody, "I'm leaving." Most of the people said, "What's Hadoop?" Right, so we came, we started covering, it was people like Jeff Hammerbacher, Amr Awadallah, Doug Cutting, who invented Hadoop, Mike Olson, you know, head of Cloudera at the time, and people like Abi Mehda, who at the time was at B of A, and some of the things we learned then that were profound-- >> Yeah. >> As much as Hadoop is sort of on the back burner now and people really aren't talking about it, some of the things that are profound about Hadoop, really, were the idea, the notion of bringing five megabytes of code to a petabyte of data, for example, or the notion of no schema on write. You know, put it into the database and then figure it out. >> Unstructured data. >> Right. >> Object storage. 
>> And so, that created a state of innovation, of funding. We were talking last night about, you know, many, many years ago at this event this time of the year, concurrent with Strata you would have VCs all over the place. There really aren't a lot of VCs here this year, not a lot of VC parties-- >> Mm-hm. >> As there used to be, so that somewhat waned, but some of the things that we talked about back then, we said that big money and big data is going to be made by the practitioners, not by the vendors, and that's proved true. I mean... >> Yeah. >> The big three Hadoop distro vendors, Cloudera, Hortonworks, and MapR, you know, Cloudera's $2.5 billion valuation, you know, not bad, but it's not a $30, $40 billion value company. The other thing we said is there will be no Red Hat of big data. You said, "Well, the only Red Hat of big data might be Red Hat," and so, (chuckles) that's basically proved true. >> Yeah. >> And so, I think if we look back we always talked about Hadoop and big data being a reduction, the ROI was a reduction on investment. >> Yeah. >> It was a way to have a cheaper data warehouse, and that's essentially-- >> Well, what did we get right and wrong? I mean, let's look at some of the trends. I mean, first of all, I think we got pretty much everything right, as you know. We tend to make the calls pretty accurately with theCUBE. Got a lot of data, we look, we have the analytics in our own system, plus we have the research team digging in, so you know, we pretty much get, do a good job. I think one thing that we predicted was that Hadoop certainly would change the game, and that did. We also predicted that there wouldn't be a Red Hat for Hadoop, that was a prediction. The other prediction was that we said Hadoop won't kill data warehouses, it didn't, and then data lakes came along. You know my position on data lakes. >> Yeah. >> I've always hated the term. 
I always liked data ocean because I think it was much more fluidity of the data, so I think we got that one right and data lakes still doesn't look like it's going to be panning out well. I mean, most people that deploy data lakes, it's really either not a core thing or as part of something else and it's turning into a data swamp, so I think the data lake piece is not panning out the way it, people thought it would be. I think one thing we did get right, also, is that data would be the center of the value proposition, and it continues and remains to be, and I think we're seeing that now, and we said data's the development kit back in 2010 when we said data's going to be part of programming. >> Some of the other things, our early data, and we went out and we talked to a lot of practitioners who are the, it was hard to find in the early days. They were just a select few, I mean, other than inside of Google and Yahoo! But what they told us is that things like SQL and the enterprise data warehouse were key components on their big data strategy, so to your point, you know, it wasn't going to kill the EDW, but it was going to surround it. The other thing we called was cloud. Four years ago our data showed clearly that much of this work, the modeling, the big data wrangling, et cetera, was being done in the cloud, and Cloudera, Hortonworks, and MapR, none of them at the time really had a cloud strategy. Today that's all they're talking about is cloud and hybrid cloud. >> Well, it's interesting, I think it was like four years ago, I think, Dave, when we actually were riffing on the notion of, you know, Cloudera's name. It's called Cloudera, you know. If you spell it out, in Cloudera we're in a cloud era, and I think we were very aggressive at that point. I think Amr Awadallah even made a comment on Twitter. He was like, "I don't understand where you guys are coming from." 
We were actually saying at the time that Cloudera should actually leverage more cloud at that time, and they didn't. They stayed on their IPO track and they had to because they had everything bet on Impala and this data model that they had and being the business model, and then they went public, but I think clearly cloud is now part of Cloudera's story, and I think that's a good call, and it's not too late for them. It never was too late, but you know, Cloudera has executed. I mean, if you look at what's happened with Cloudera, they were the only game in town. When we started theCUBE we were in their office, as most people know in this industry, that we were there with Cloudera when they had like 17 employees. I thought Cloudera was going to run the table, but then what happened was Hortonworks came out of Yahoo! That, I think, changed the game and I think that competitive battle between Hortonworks and Cloudera, in my opinion, changed the industry, because if Hortonworks did not come out of Yahoo!, Cloudera would've had an uncontested run. I think the landscape of the ecosystem would look completely different had Hortonworks not competed, because you think about, Dave, they had that competitive battle for years. The Hortonworks-Cloudera battle, and I think it changed the industry. I think it could've been a different outcome. If Hortonworks wasn't there, I think Cloudera probably would've taken Hadoop and made it so much more, and I think they would've gotten more done. >> Yeah, and I think the other point we have to make here is complexity really hurt the Hadoop ecosystem, and it was just bespoke, new projects coming out all the time, and you had Cloudera, Hortonworks, and maybe to a lesser extent MapR, doing a lot of the heavy lifting, particularly, you know, Hortonworks and Cloudera. 
They had to invest a lot of their R&D in making these systems work and integrating them, and you know, complexity just really broke the back of the Hadoop ecosystem, and so then Spark came in, everybody said, "Oh, Spark's going to basically replace Hadoop." You know, yes and no, the people who got Hadoop right, you know, embraced it and they still use it. Spark definitely simplified things, but now the conversation has turned to AI, John. So, I got to ask you, I'm going to use your line on you in kind of the ask-me-anything segment here. AI, is it same wine, new bottle, or is it really substantively different in your opinion? >> I think it's substantively different. I don't think it's the same wine in a new bottle. I'll tell you... Well, it's kind of, it's like the bad wine... (laughs) Is going to be kind of blended in with the good wine, which is now AI. If you look at this industry, the big data industry, if you look at what O'Reilly did with this conference. I think O'Reilly really has not done a good job with the conference of big data. I think they blew it, I think that they made it a, you know, monetization, closed system when the big data business could've been all about AI in a much deeper way. I think AI is subordinate to cloud, and you mentioned cloud earlier. If you look at all the action within the AI segment, Diane Greene talking about it at Google Next, Amazon, AI is a software layer substrate that will be underpinned by the cloud. 
Cloud will drive more action, you need more compute, that drives more data, more data drives the machine learning, machine learning drives the AI, so I think AI is always going to be dependent upon cloud ends or some sort of high compute resource base, and all the cloud analytics are feeding into these AI models, so I think cloud takes over AI, no doubt, and I think this whole ecosystem of big data gets subsumed under either an AWS, VMworld, Google, and Microsoft Cloud show, and then also I think specialization around data science is going to go off on its own. So, I think you're going to see the breakup of the big data industry as we know it today. Strata Hadoop, Strata Data Conference, that thing's going to crumble into multiple, fractured ecosystems. >> It's already starting to be forked. I think the other thing I want to say about Hadoop is that it actually brought such great awareness to the notion of data, putting data at the core of your company, data and data value, the ability to understand how data at least contributes to the monetization of your company. AI would not be possible without the data. Right, and we've talked about this before. You call it the innovation sandwich. The innovation sandwich, last decade, last three decades, has been Moore's law. The innovation sandwich going forward is data, machine intelligence applied to that data, and cloud for scale, and that's the sandwich of innovation over the next 10 to 20 years. >> Yeah, and I think data is everywhere, so this idea of being a categorical industry segment is a little bit off, I mean, although I know data warehouse is kind of its own category and you're seeing that, but I don't think it's like a Magic Quadrant anymore. Every quadrant has data. >> Mm-hm. >> So, I think data's fundamental, and I think that's why it's going to become a layer within a control plane of either cloud or some other system, I think. I think that's pretty clear, there's no, like, one. 
You can't buy big data, you can't buy AI. I think you can have AI, you know, things like TensorFlow, but it's going to be a completely... Every layer of the stack is going to be impacted by AI and data. >> And I think the big players are going to infuse their applications and their databases with machine intelligence. You're going to see this, you're certainly, you know, seeing it with IBM, the sort of Watson heavy lift. Clearly Google, Amazon, you know, Facebook, Alibaba, and Microsoft, they're infusing AI throughout their entire set of cloud services and applications and infrastructure, and I think that's good news for the practitioners. People aren't... Most companies aren't going to build their own AI, they're going to buy AI, and that's how they close the gap between the sort of data haves and the data have-nots, and again, I want to emphasize that the fundamental difference, to me anyway, is having data at the core. If you look at the top five companies in terms of market value, US companies, Facebook maybe not so much anymore because of the fake news, though Facebook will be back with its two billion users, but Apple, Google, Facebook, Amazon, who am I... And Microsoft, those five have put data at the core and they're the most valuable companies in the stock market from a market cap standpoint, why? Because it's a recognition that that intangible value of the data is actually quite valuable, and even though banks and financial institutions are data companies, their data lives in silos. So, these five have put data at the center, surrounded it with human expertise, as opposed to having humans at the center and having data all over the place. So, how do they, how do these companies close the gap? How do the companies in the flyover states close the gap? 
The way they close the gap, in my view, is they buy technologies that have AI infused in it, and I think the last thing I'll say is I see cloud as the substrate, and AI, and blockchain and other services, as the automation layer on top of it. I think that's going to be the big tailwind for innovation over the next decade. >> Yeah, and obviously the theme of machine learning drives a lot of the conversations here, and that's essentially never going to go away. Machine learning is the core of AI, and I would argue that AI truly doesn't even exist yet. It's machine learning really driving the value, but to put a validation on the fact that cloud is going to be driving AI business is some of the terms in popular conversations we're hearing here in New York around this event and topic, CUBENYC and Strata Conference, is you're hearing Kubernetes and blockchain, and you know, these automation, AI operation kind of conversations. That's an IT conversation, (chuckles) so you know, that's interesting. You've got IT, really, with storage. You've got to store the data, so you can't not talk about workloads and how the data moves with workloads, so you're starting to see data and workloads kind of be tossed in the same conversation, that's a cloud conversation. That is all about multi-cloud. That's why you're seeing Kubernetes, a term I never thought I would be saying at a big data show, but Kubernetes is going to be key for moving workloads around, of which there's data involved. (chuckles) Instrumenting the workloads, data inside the workloads, data driving data. This is where AI and machine learning's going to play, so again, cloud subsumes AI, that's the story, and I think that's going to be the big trend. >> Well, and I think you're right, now. 
I mean, that's why you're hearing the messaging of hybrid cloud from the big distro vendors, and the other thing is you're hearing from a lot of the NoSQL database guys, they're bringing ACID compliance, they're bringing enterprise-grade capability, so you're seeing the world is hybrid. You're seeing those two worlds come together, so... >> Their worlds, it's getting leveled in the playing field out there. It's all about enterprise, B2B, AI, cloud, and data. That's theCUBE bringing you the data here. New York City, CUBENYC, that's the hashtag. Stay with us for more coverage live in New York after this short break. (techy music)

Published Date : Sep 12 2018

SUMMARY :

Brought to you by SiliconANGLE Media for the creation at the Hadoop big data ecosystem. and some of the things we learned then some of the things that are profound about Hadoop, We were talking last night about, you know, but some of the things that we talked about back then, You said, "Well, the only Red Hat of big data might be being a reduction, the ROI was a reduction I mean, first of all, I think we got and I think we're seeing that now, and the enterprise data warehouse were key components and I think we were very aggressive at that point. Yeah, and I think the other point and all the cloud analytics are and cloud for scale, and that's the sandwich Yeah, and I think data is everywhere, and I think that's why it's going to become I think that's going to be the big tailwind and I think that's going to be the big trend. and the other thing is you're hearing New York City, CUBENYC, that's the hashtag.

ENTITIES

Entity	Category	Confidence
Apple	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Diane Greene	PERSON	0.99+
Google	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
John	PERSON	0.99+
Alibaba	ORGANIZATION	0.99+
Dave	PERSON	0.99+
Dave Vellante	PERSON	0.99+
Jeff Hammerbacher	PERSON	0.99+
$30	QUANTITY	0.99+
New York	LOCATION	0.99+
2010	DATE	0.99+
IBM	ORGANIZATION	0.99+
Doug Cutting	PERSON	0.99+
Mike Olson	PERSON	0.99+
Hortonworks	ORGANIZATION	0.99+
Dallas	LOCATION	0.99+
O'Reilly	ORGANIZATION	0.99+
Yahoo	ORGANIZATION	0.99+
Cloudera	ORGANIZATION	0.99+
five	QUANTITY	0.99+
AWS	ORGANIZATION	0.99+
Abi Mehda	PERSON	0.99+
John Furrier	PERSON	0.99+
New York City	LOCATION	0.99+
$2.5 billion	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
MapR	ORGANIZATION	0.99+
Amr Awadallah	PERSON	0.99+
$40 billion	QUANTITY	0.99+
17 employees	QUANTITY	0.99+
VMworld	ORGANIZATION	0.99+
Today	DATE	0.99+
Impala	ORGANIZATION	0.99+
Nine years	QUANTITY	0.99+
four years ago	DATE	0.98+
last night	DATE	0.98+
last decade	DATE	0.98+
Strata Data Conference	EVENT	0.98+
Strata Conference	EVENT	0.98+
Hadoop Summit	EVENT	0.98+
ninth year	QUANTITY	0.98+
Four years ago	DATE	0.98+
two worlds	QUANTITY	0.97+
five companies	QUANTITY	0.97+
today	DATE	0.97+
Strata Hadoop	EVENT	0.97+
Hadoop World	EVENT	0.96+
CUBE	ORGANIZATION	0.96+
Google Next	ORGANIZATION	0.95+
Twitter	ORGANIZATION	0.95+
this year	DATE	0.95+
Spark	ORGANIZATION	0.95+
US	LOCATION	0.94+
CUBENYC	EVENT	0.94+
Strata O'Reilly	ORGANIZATION	0.93+
next decade	DATE	0.93+

Day One Kickoff - DataWorks Summit Europe 2017 - #DW17 - #theCUBE

>> Narrator: Covering DataWorks Summit Europe 2017. Brought to you by Hortonworks. >> Hello everyone, welcome to The Cube's special presentation here in Munich, Germany for DataWorks Summit 2017. This is the Hadoop Summit powered by Hortonworks. This is their event and again, shows the transition from the Hadoop world to the big data world. I'm John Furrier. My co-host Dave Vellante, good to see you Dave. We're back in the seats together, usually on different events, but now here together in Munich. Great beer, great scene here. Small European event for Hortonworks and the ecosystem but it's called DataWorks 2017. Strata Hadoop is calling themselves Strata and Data. They're starting to see the word Hadoop being sunsetted from these events, which is a big theme of this year. The transition from Hadoop being the branded category to Data. >> Well, you're certainly seeing that in a number of ways. The titles of these events. Well, first of all, I love being in Europe. These venues are great, right? They're so Euro, very clean and magnificent. But back to your point. You're seeing the Hadoop Summit now called the DataWorks Summit. You're seeing the Strata Plus Hadoop is now Strata Plus, I don't even know what it is. Right, it's not Hadoop driven anymore. You see it also in Cloudera's IPO. They're going to talk about Hadoop and Hadoop Distro. They're a Hadoop Distro vendor but they talked about being a data management company and John, I think we are entering the era, or well deep into the era of what I have been calling for the last couple of years, profitless prosperity. Really where you see the Cloudera IPO, as you know, they raised money from Intel, over $600 million at a $4.1 billion valuation. The Wall Street Journal says they'll have a tough time getting a billion dollar valuation. 
For every dollar each of these companies spends, Hortonworks and Cloudera, they lose between $1.70 and $2.50, so we've always said at SiliconANGLE, Wikibon and The Cube that the people who are going to make money in big data are the practitioners of big data, and it's hard to find those guys, it's hard to see them but that's really what's happening is the industries are transforming and those are the guys that are putting money into their bottom line. Not so much for technology vendors. >> Great to unpack that but first of all, I want to just say congratulations to Wikibon for getting it right again. As usual Wikibon, ahead of the curve and being out there and getting it right because I think you nailed it and I think Wikibon saw this first of all the research firms, kind of, you know, pat ourselves on the back here, but the truth is that practitioners are making the money and I think you're going to see more of that. In fact, last night as I'm having a nice beer here in Germany, I just like to listen to the conversations in the bar area and a lot of conversations around, real conversations around, you know, doing deals, and you know, deployments. You know, you're hearing about HBase, you're hearing about clusters, you're hearing about service revenue, and I think this is the focus. Cloudera, I think, in a classic Silicon Valley way, their hubris was tempered by their lack of scale. I mean, they didn't really blow it out. I mean, now they do 200 million in revenue. Nothing to shake a stick at, they did a great job, but they're buying revenue and Hortonworks is as well. But the ecosystem is the factor, and this is the wildcard. I'm making a prediction. Profitless prosperity that you point out is right, but I think that it has longevity with these companies like Hortonworks and Cloudera and others, like MapR because the ecosystem's robust. If you factor in the ecosystem revenue that is enough rising tide in my opinion. 
The question is how do they become sustainable as a standalone venture, that Red Hat for Hadoop never worked as Pat Gilson, you know, predicted. So, I think you're going to see a quick shift and pivot quickly by Hortonworks, certainly Cloudera's going to be under the microscope once they go public. I'm expecting that valuation to plummet like a rock. They're going to go public, Silicon Valley people are going to get their exits but. >> Accel will be happy. >> Everyone, yeah, they'll be happy. They already sold in 2013. They did a big sale, I mean, all of them cashed out two years ago when that liquidation event happened with Intel but that's fine. But now it's back to business building and Hortonworks has been doing it for years, so when you see their valuation is less than a billion, so I'm expecting Cloudera to plummet like a rock. I would not buy the IPO at all because I think it's going to go well under a billion dollars. >> And I think it's the right call and as we know, last year, at the end of last year, Fidelity and other mutual funds devalued their holdings in Cloudera and so, you know, you've got this situation where, as you say, a couple hundred, maybe you know, on the way to 300 million in revenue, Hortonworks on the way to 200 million in revenue. Add up the ecosystem, yeah, maybe you get to a billion, throw in all of what IBM and Oracle call big data, and it's kind of a more interesting business, but you've called it same wine, new bottle. Is it a new bottle? Now, what I mean by that is the shift from Hadoop and then again, you read Cloudera's S1, it's all about AI, machine learning, you know, the cloud. Interesting, we'll talk about the cloud a little later, but is it same wine, new bottle, or is this really a shift toward a new era of innovation? >> It's not a new shift. It's the same innovation that the Hortonworks was founded on. 
Big data is a categorical and Hadoop was the horse they rode in on, but I think what's changing is the fact that customers are now putting real projects on the table and the scrutiny around those projects have to produce value, and the value comes down to total cost of ownership and business value. And that's becoming a data specific thing, and you look at all the successes in the big data world, Spark and others, you're seeing a focus on cloud integration and real-time workloads. These are real projects. This isn't fantasy. This isn't hype. This isn't early adopter. These are real companies saying we are moving to a new paradigm of digital transforming our companies and we need cost efficiencies but revenue-producing applications and workloads that are going to be running in the cloud with data at the heart of it. So, this is a customer-forcing function where the customers are generally excited about machine learning, moving to real-time classification of workloads. This is the deal and no hubris, no technology posturing, no open standards, jockeying can right the situation. Customers have demands and they want them filled, and we're going to have a lot of guests on here and I'm going to ask them those direct questions. What are you looking for and? >> Well, I totally agree with what you're saying and when we first met, it was right around the, you know, the mid point of the web 2.0 era, and I remember Tim Berners-Lee commenting on all this excitement, everybody's doing, he said this is what the web was invented to do, and this is what big data was invented to do. It was to produce deep analytics, deep learning, machine learning, you know, cognitive, as IBM likes to brand that, and so, it really is the next era even though people don't like to use the term big data anymore. We were talking to, you know, some of the folks in our community earlier, John, you and I, about some of the challenges. Why is it profitless, you know? 
Why is there so much growth but it's no profit? And you know, we have to point out here that people like Hortonworks and Cloudera, they've made some big bets, take HDFS, for example. And now you have the cloud guys, particularly Amazon, coming in, you know, with S3. Look at YARN, big open source project. But you got Docker and Kubernetes seem to be mopping that up. Tez was supposed to replace MapReduce and now you've got. >> I mean, I wouldn't say mopping up, I mean. >> You've got Spark. >> At the end of the day the ecosystem's going to revolve around what the customers want, and portability of workloads, Kubernetes and microservices, these are areas that just absolutely make a lot of sense and I think, you know, people will move to where the frictionless action is and that's going to happen with Kubernetes and containers and microservices, but that just speaks to the devops culture, and I think Hadoop ecosystem, again, was grounded in the devops culture. So, yeah, there's some projects that are going to maybe go out of favor, but there's other stuff coming up through the ranks in open source and I think it's compelling. >> But where I disagree with what you're saying is well, the point I'm trying to make, is you have to, if you're Cloudera and Hortonworks, you have to support those multiple projects and it's expensive as hell. Whereas the cloud guys put all their wood behind one arrow, to use an old Scott McNealy phrase, and you know, Amazon, I would argue is mopping up in big data. I think the cloud guys, you know, it's ironic to me that Cloudera in the cloud era picked that name, you know, but really never had. >> John: They missed the cloud. >> They've never really had a strong cloud play, and I would say the same thing with Hortonworks and MapR. They have to play in the cloud and they talk about cloud, but they've got to support hybrid, they've got to support on-prem, they got to pick the clouds that they're going to support, AWS, Azure, maybe IBM's cloud. 
>> Look, Cloudera completely missed the cloud era, pun intended. However, they didn't miss open source but what they're great at and what I'm an admirer of Cloudera and Hortonworks on is that their open source ethos is what drove them, and so they kind of got isolated in with some of their product decisions, but that's not a bad thing. I mean, ultimately, I'm really bullish on Cloudera and Hortonworks because the ecosystem points I mentioned earlier are not high on the I wouldn't buy the IPO, I think I'd buy them at a discount, but Cloudera's not going to go away, Dave. They're going to go public. I think the valuation's going to drop like a rock and then settle around a billion, but they have good management. The founders are still there, Michael Olson, Amr Awadallah. So, you're going to see Cloudera transform as a company. They have to do business out in the open and they're not afraid to, obviously they're open source. So, we're going to start to see that transition from a private venture backed, scale up, buy revenue. In the playbook of Silicon Valley venture capital's Accel Partners and Greylock. Now they go public and get liquid and then now next phase of their journey is going to be build a public company and I think that they will do a good job doing it and I'm not down on them at all for that and I think it's just going to be a transition. >> Well, they're going to raise what? A couple 100 million dollars? But this industry, yeah, this industry's cashflow negative, so I agree with you. Open source is great, let's ra-ra for open source and it drives innovation, but how does this industry pay for itself? That's what I want to know. How do you respond to that? >> Well, I think they have sustainable issues around services and I think partnering with the big companies like Intel that have professional services might help them on that front, but Michael Olson said in his founder's letter in his S1, kind of AI washing, he said AI and cognitive. 
But that's okay because Cloudera could easily pivot with their brain power, and same with Hortonworks to AI. Machine learning is very open source driven. Open source culture is growing, it's not going away, so I think Cloudera's in a very good position. >> I think the cloud guys are going to kill them in that game, and cloud guys and IBM are going to cream these profitless startups in that AI and machine learning game. >> We'll see. >> You disagree? >> I disagree, I think. Well, I mean, it depends. I mean, you know, I'm not going to, you know, forecast what the managements might do, but I mean, if I'm cloud looking at what Cloudera's done. >> What would you do? >> I would do exactly what Mike Olson's doing is I'd basically pivot immediately to machine learning. Look at Google. TensorFlow, it's got so much traction with their cloud because it's got machine learning built into it. Open source is where the action is, and that's where you could do a lot of good work and use it as an advantage in that they know that game. I would not count out the open source game. >> So, we know how IBM makes money at that, you know, in theory anyway it wants. We know how Amazon's going to make money at that with their proprietary approach, Microsoft will do the same thing. How do Cloudera and Hortonworks make money? >> I think it's a product transition around getting to the open source with cloud technologies. Amazon is not out to kill open source, so I think there's an opportunity to wedge in a position there, and so they just got to move quickly. If they don't make these decisions then that's a failed execution on the management team at Cloudera and Hortonworks and I think they're on it. So, we'll keep an eye on that. >> No, Amazon's not trying to kill open source, I would agree, but they are bogarting open source in a big way and profiting amazingly from it. >> Well, they just do what Andy Jassy would say, they're customer driven. 
So, if a customer doesn't want to do five things to do one thing, this is back to my point. The customers want real-time workloads. They want it with open source and they don't want all these steps in the cost of ownership. That's why this is not a new shift, it's the same wine, new bottle because now you're just seeing real projects that are demanding successful and efficient code and support and whoever delivers it builds the better mousetrap. In this case, the better mousetrap will win. >> And I'm arguing that the better mousetrap and the better marginal economics, I know I'm like a broken record on this, but if I take Kinesis and DynamoDB and Redshift and wrap it into my big data play, offer it as a service with a set of APIs on the cloud, like AWS is going to do, or is doing, and Azure is doing, that's a better business model than, as you say, five different pieces that I have to cobble together. It's just not economically viable for customers to do that. >> Well, we've got some big news coming up here. We're going to have two days of wall-to-wall coverage of DataWorks 2017. Hortonworks announcing 2.6 of their Hadoop Hortonworks data platform. We're going to talk to Scott Gnau, the CTO, coming up shortly. Stay with us for exclusive coverage of DataWorks in Munich, Germany 2017. We'll be back with more after this short break.

Published Date : Apr 5 2017

SUMMARY :

Brought to you by Hortonworks. Hortonworks and the ecosystem and it's hard to find those guys, and you know, deployments. going to go well under and then again, you read Cloudera's S1, and I'm going to ask them and so, it really is the next era I mean, I wouldn't and that's going to happen with Kubernetes and you know, Amazon, that they're going to support, and I think that they will Well, they're going to raise what? and same with Hortonworks to AI. and cloud guys and IBM are going to cream I mean, you know, and that's where you could to make money at that and so they just got to move quickly. to kill open source, and they don't want all these steps and the better marginal economics, We're going to talk to Scott now, the CTO,

ENTITIES

Entity	Category	Confidence
Dave Vellante	PERSON	0.99+
Michael Olson	PERSON	0.99+
IBM	ORGANIZATION	0.99+
Hortonworks	ORGANIZATION	0.99+
Amazon	ORGANIZATION	0.99+
Europe	LOCATION	0.99+
2013	DATE	0.99+
Andy Jassy	PERSON	0.99+
John	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
Fidelity	ORGANIZATION	0.99+
Oracle	ORGANIZATION	0.99+
Mike Olson	PERSON	0.99+
Germany	LOCATION	0.99+
Munich	LOCATION	0.99+
Wikibon	ORGANIZATION	0.99+
$2.50	QUANTITY	0.99+
Dave	PERSON	0.99+
Scott	PERSON	0.99+
John Furrier	PERSON	0.99+
last year	DATE	0.99+
MapR	ORGANIZATION	0.99+
AWS	ORGANIZATION	0.99+
Microsoft	ORGANIZATION	0.99+
200 million	QUANTITY	0.99+
Pat Gilson	PERSON	0.99+
Intel	ORGANIZATION	0.99+
less than a billion	QUANTITY	0.99+
two days	QUANTITY	0.99+
Scott McNealy	PERSON	0.99+
Tim Berners-Lee	PERSON	0.99+
Silicon Valley	LOCATION	0.99+
over $600 million	QUANTITY	0.99+
The Cube	ORGANIZATION	0.99+
SiliconANGLE	ORGANIZATION	0.99+
DataWorks Summit	EVENT	0.99+
Hadoop	ORGANIZATION	0.98+
Hadoop Distro	ORGANIZATION	0.98+
300 million	QUANTITY	0.98+
two years ago	DATE	0.98+
DataWorks 2017	EVENT	0.98+
Google	ORGANIZATION	0.98+
Hadoop Summit	EVENT	0.98+
each	QUANTITY	0.98+
a billion	QUANTITY	0.97+
DataWorks Summit 2017	EVENT	0.97+
billion dollar	QUANTITY	0.97+
Amr Awadallah	PERSON	0.97+
Munich, Germany	LOCATION	0.97+

Alison Yu, Cloudera - SXSW 2017 - #IntelAI - #theCUBE

(electronic music) >> Announcer: Live from Austin, Texas, it's The Cube. Covering South By Southwest 2017. Brought to you by Intel. Now, here's John Furrier. >> Hey, welcome back, everyone, we're here live in Austin, Texas, for South By Southwest Cube coverage at the Intel AI Lounge, #IntelAI if you're watching, put it out on Twitter. I'm John Furrier of Silicon Angle for the Cube. Our next guest is Alison Yu who's with Cloudera. And in the news today, although they won't comment on it. It's great to see you, social media manager at Cloudera. >> Yes, it's nice to see you as well. >> Great to see you. So, Cloudera has a strategic relationship with Intel. You guys have a strategic investment, Intel, and you guys partner up, so it's well-known in the industry. But what's going on here is interesting, AI for social good is our theme. >> Alison: Yes. >> Cloudera has always been a pay-it-forward company. And I've known the founders, Mike Olson and Amr Awadallah. >> Really all about the community and paying it forward. So Alison, talk about what you guys are working on. Because you're involved in a panel, but also Cloudera Cares. And you guys have teamed up with Thorn, doing some interesting things. >> Alison: Yeah (laughing). >> Take it away! >> Sure, thanks. Thanks for the great intro. So I'll give you a little bit of a brief introduction to Cloudera Cares. Cloudera Cares was founded roughly about three years ago. It was really an employee-driven and -led effort. I kind of stepped into the role and ended up being a little bit more of the leader just by the way it worked out. So we've really gone from, going from, you know, we're just doing soup kitchens and everything else, to strategic partnerships, donating software, professional service hours, things along those lines. >> Which has been very exciting to see our nonprofit partnerships grow in that way. So it really went from almost grass-root efforts to an organized organization now. 
And we start stepping up our strategic partnerships about a year and a half ago. We started with DataKind, is our initial one. About two years ago, we initiated that. Then we a year ago, about in September, we finalized our donation of an enterprise data hub to Thorn, which if you're not aware of they're all about using technology and innovation to stop child-trafficking. So last year, around September or so, we announced the partnership and we donated professional service hours. And then in October, we went with them to Grace Hopper, which is obviously the largest Women in Tech Conference in North America. And we hosted a hackathon and we helped mentor women entering into the tech workforce, and trying to come up with some really cool innovative solutions for them to track and see what's going on with the dark web, so we had quite a few interesting ideas coming out of that. >> Okay, awesome. We had Frederico Gomez Suarez on, who was the technical advisor. >> Alison: Yeah. >> A Microsoft employee, but he's volunteering at Thorn, and this is interesting because this is not just donating to the soup kitchens and what not. >> Alison: Yeah. >> You're starting to see a community approach to philanthropy that's coding RENN. >> Yeah. >> Hackathons turning into community galvanizing communities, and actually taking it to the next level. >> Yeah. So, I think one of the things we realize is tech, while it's so great, we have actually introduced a lot of new problems. So, I don't know if everyone's aware, but in the '80s and '90s, child exploitation had almost completely died. They had almost resolved the issue. With the introduction of technology and the Internet, it opened up a lot more ways for people to go ahead and exploit children, arrange things, in the dark web. So we're trying to figure out a way to use technology to combat a problem that technology kind of created as well, but not only solving it, but rescuing people. 
>> It's a classic security problem, the surface area has increased for this kind of thing. But big data, which is where you guys were founded on in the cloud era that we live in. >> Alison: Yeah. >> Pun intended. (laughing) Using the machine learning now you start with some scale now involved. >> Yes, exactly, and that's what we're really hoping, so we're partnering with Intel in the National Center for Missing & Exploited Children. We're actually kicking off a virtual hackathon tomorrow, and our hope is we can figure out some different innovative ways that AI can be applied to scraping data and finding children. A lot of times we'll see there's not a lot of clues, but for example, if we can upload, if there can be a tool that can upload three or four different angles of a child's face when they go missing, maybe what happens is someone posts a picture on Instagram or Twitter that has a geo tag and this kid is in the background. That would be an amazing way of using AI and machine learning-- >> Yeah. >> Alison: To find a child, right. >> Well, I'll give you guys a plug for Cloudera. And I'll reference Dr. Naveen Rao, who's the GM of Intel's AI group, was on earlier. And he was talking about how there's a lot of storage available, not a lot of compute. Now, Cloudera, you guys have really pioneered the data lake, data hub concept where storage is critical. >> Yeah. >> Now, you got this compute power and machine learning, that's kind of where it comes together. Did I get that right? >> Yeah, and I think it's great that with the partnership with Intel we're able to integrate our technology directly into the hardware, which makes it so much more efficient. You're able to compute massive amounts of data in a very short amount of time, and really come up with real results. And with this partnership, specifically with Thorn and NCMEC, we're seeing that it's real impact for thousands of people last year, I think. 
In the 2016 impact report, Thorn said they identified over 6,000 trafficking victims, of which over 2,000 were children. Right, so that tool that they use is actually built on Cloudera. So, it's great seeing our technology put into place. >> Yeah, that's awesome. I was talking to an Intel person the other day, they have 72 cores now on a processor, on the high-end Xeons. Let's get down to some other things that you're working on. What are you doing here at the show? Do you have things that you're doing? You have a panel? >> Yeah, so at the show, at South by Southwest, we're kicking off a virtual hackathon tomorrow at our Austin offices for South by Southwest. Everyone's welcome to come. I just did the liquor order, so yes, everyone please come. (laughing) >> You just came from Austin's office, you're just coming there. >> Yeah, exactly. So we've-- >> Unlimited Red Bull, pizza, food. (laughing) >> Well, we'll be doing lots and lots tomorrow, but we're kicking that off, we have representatives from Thorn, NCMEC, Google, Intel, all on site to answer questions. That's kind of our kickoff of this month-long virtual hackathon. You don't need to be in Austin to participate, but that is one of the things that we are kicking off. >> And then on Sunday, actually here at the Intel AI Lounge we're doing a panel on AI for Good, and using artificial intelligence to solve problems. >> And we'll be broadcasting that live here on The Cube. So, folks, SiliconAngle.tv will carry that. Alison, talk about the trend that, you weren't here when we were talking about how there's now a new counterculture developing in a good way around community and social change. How real is the trend that you're starting to see these hackathons evolve from what used to be recruiting sessions to people just jamming together to meet each other. Now, you're starting to see the next level of formation where people are organizing collectively-- >> Yeah. >> To impact real issues. >> Yeah. 
>> Is this a real trend or where is that trend, can you speak to that? >> Sure, so from what I've seen from the hackathons what we've been seeing before was it's very company-specific. Only one company wanted to do it, and they would kind of silo themselves, right? Now, we're kind of seeing this coming together of companies that are generally competitors, but they see a great social cause and they decide that they want to band together, regardless of their differences in technology, product, et cetera, for a common good. And, so. >> Like a Thorn. >> For Thorn, you'll see a lot of competitors, so you'll see Facebook and Twitter or Google and Amazon, right? >> John: Yeah. >> And we'll see all these different competitors come together, lend their workforce to us, and have them code for one great project. >> So, you see it as a real trend. >> I do see it as a trend. I saw Thorn last year did a great one with Facebook and on-site with Facebook. This year as we started to introduce this hackathon, we decided that we wanted to do a hackathon series versus just a one-off hackathon. So we're seeing people being able to share code, contribute, work on top of other code, right, and it's very much a sharing community, so we're very excited for that. >> All right, so I got to ask you what's the culture like at Cloudera these days, as you guys prepare to go public? What's the vibe internally of the company, obviously Mike Olson, the founder, is still around, Amr's around. You guys have been growing really fast. Got your new space. What's the vibe like in Cloudera now? >> Honestly, the culture at Cloudera hasn't really changed. So, when I joined three years ago we were much smaller than we are now. But I think one thing that we're really excited about is everyone's still so collaborative, and everyone makes sure to help one another out. So, I think our common goal is really more along the lines of we're one team, and let's put out the best product we can. >> Awesome. 
So, what's South by Southwest mean to you this year? If you had to kind of zoom out and say, okay. What's the theme? We heard Robert Scoble earlier say it's a VR theme. We hear at Intel it's AI. So, there's a plethora of different touchpoints here. What do you see? >> Yeah, so I actually went to the opening keynote this morning, which was great. There was an introduction, and then I don't know if you realized, but Cory Booker was on as well, which is great. >> John: Yep. >> But I think a lot of what we had seen was they called out on stage that artificial intelligence is something that will be a trend for the next year. And I think that's very exciting that Intel really hit the nail on the head with the AI Lounge, right? >> Cory Booker, I'm a big fan. He's from my neighborhood, went to the same school I went to, that my family. So in Northern Valley, Old Tappan. Cory, if you're watching, retweet us, hashtag #IntelAI. So AI's there. >> AI is definitely there. >> No doubt, it's on stage. >> Yes, but I think we're also seeing a very large, just community around how can we make our community better versus let's try to go in these different silos, and just be hyper-aware of what's only in front of us, right? So, we're seeing a lot more from the community as well, just being interested in things that are not immediately in front of us, the wider, either nation, global, et cetera. So, I think that's very exciting people are stepping out of just their own little bubbles, right? And looking and having more compassion for other people, and figuring out how they can give back. >> And, of course, open source at the center of all the innovation as always. (laughing) >> I would like to think so, right? >> It is! I would testify. Machine learning is just a great example, how that's now going up into the cloud. We started to see that really being part of all the apps coming out, which is great because you guys are in the big data business. >> Alison: Yeah. 
>> Okay, Alison, thanks so much for taking the time. Real quick plug for your panel on Sunday here. >> Yeah. >> What are you going to talk about? >> So we're going to be talking a lot about AI for good. We're really going to be talking about the NCMEC, Thorn, Google, Intel, Cloudera partnership. How we've been able to do that, and a lot of what we're going to also concentrate on is how the everyday tech worker can really get involved and give back and contribute. I think there is generally a misconception of if there's not a program at my company, how do I give back? >> John: Yeah. >> And I think Cloudera's a shining example of how a few employees can really enact a lot of change. We went from grassroots, just a few employees, to a global program pretty quickly, so. >> And it's organically grown, which is the formula for success versus some sort of structured company program (laughing). >> Exactly, so we've definitely gone from soup kitchen to strategic partnerships, and being able to donate our own time, our engineers' times, and obviously our software, so. >> Thanks for taking the time to come on our Cube. It's getting crowded in here. It's rocking the house, the house is rocking here at the Intel AI Lounge. If you're watching, check out the hashtag #IntelAI or South by Southwest. I'm John Furrier. I'll be back with more after this short break. (electronic music)

Published Date : Mar 10 2017

ENTITIES

Entity	Category	Confidence
Mike Olson	PERSON	0.99+
Alison	PERSON	0.99+
Robert Scoble	PERSON	0.99+
NCMEC	ORGANIZATION	0.99+
Google	ORGANIZATION	0.99+
John Furrie	PERSON	0.99+
Cloudera	ORGANIZATION	0.99+
Austin	LOCATION	0.99+
John Furrier	PERSON	0.99+
Amazon	ORGANIZATION	0.99+
John	PERSON	0.99+
October	DATE	0.99+
Naveen Rao	PERSON	0.99+
Microsoft	ORGANIZATION	0.99+
Cory Booker	PERSON	0.99+
Alison Yu	PERSON	0.99+
Sunday	DATE	0.99+
Intel	ORGANIZATION	0.99+
Cloudera Cares	ORGANIZATION	0.99+
72 cores	QUANTITY	0.99+
Thorn	ORGANIZATION	0.99+
last year	DATE	0.99+
This year	DATE	0.99+
Amr Awadallah	PERSON	0.99+
a year ago	DATE	0.99+
Facebook	ORGANIZATION	0.99+
Cory	PERSON	0.99+
tomorrow	DATE	0.99+
Austin, Texas	LOCATION	0.99+
Twitter	ORGANIZATION	0.99+
Northern Valley	LOCATION	0.99+
September	DATE	0.99+
2016	DATE	0.99+
DataKind	ORGANIZATION	0.99+
over 6,000 trafficking victims	QUANTITY	0.99+
Frederico Gomez Suarez	PERSON	0.99+
next year	DATE	0.99+
today	DATE	0.99+
over 2,000	QUANTITY	0.99+
three years ago	DATE	0.99+
National Center of Missing Exploited Children	ORGANIZATION	0.98+
SXSW 2017	EVENT	0.98+
one	QUANTITY	0.98+
About two years ago	DATE	0.98+
Amr	ORGANIZATION	0.98+
thousands of people	QUANTITY	0.97+
North America	LOCATION	0.95+
about a year and a half ago	DATE	0.95+
this year	DATE	0.95+
one team	QUANTITY	0.95+

Roddy Martin, Oracle Corp. - Oracle OpenWorld - #oow16 - #theCUBE

>> Announcer: Live, from San Francisco. It's The Cube, covering Oracle OpenWorld 2016. Brought to you by Oracle. Now, here are your hosts, John Furrier and Peter Burris. >> Hey, welcome back everyone, we are live here in San Francisco. This is SiliconANGLE Media's The Cube. It's our flagship program, we go out to the events and extract the signal from the noise. I'm John Furrier, the CEO of SiliconANGLE Media, joined by co-host Peter Burris all week. Three days of wall-to-wall coverage, and this is day three. He's the head of research at SiliconANGLE Media Inc., as well as the general manager of Wikibon research. Our next guest is Roddy Martin, VP of SC Supply Chain Cloud Product Marketing at Oracle. Welcome to The Cube. >> Thank you very much for the opportunity. I look forward to the discussion. >> Thanks for coming on. Really want to hear your thought leadership around the supply chain transformation, because it might be a little bit bumpy depending upon your perspective. But there is a huge opportunity going on in every single theater where software used to be a point solution. The cloud is now an opportunity for customers to think differently, and is a catalyst for essentially a business model change as well as a fundamental data-driven change. Your thoughts on this? What do you see going on? What are the key inflection points? >> So a very interesting part of my background is I came out of the brewing industry in South Africa, and then I led the supply chain practice at AMR Research, which today is Gartner. And we did a lot of studies on, what are companies doing to lead this transformation? Because it's a transformation of the end-to-end business operating model of a company. This is not stitching data together in the traditional supply chain system sense.
So one of the very first foundations that is really fundamental, and Gartner has done a great job of carrying the research forward, is the idea that every company progresses to an end-to-end operating model in five stages of capability, and every one of those builds on the other. So in stage one they're reacting to problems: they never saw the shortage coming and ran out of product. Stage two is I improve performance around projects. Stage three is I drive functional excellence. And in stage four I start working as an end-to-end, outside-in operating model. In other words, I'm driving the business from what's happening in the market and I'm making sure that supply is matching demand. So it's very interesting and it's very important to consider that as the base foundation for this whole discussion. >> So that outside-in is interesting, we've heard this before, a lot of people are going that way, but there's no shortcuts. Can you talk about that, 'cause you talk about how the endpoint is then outside-in. >> Right, when you're operating as a demand-driven end-to-end supply chain operating model, you can't run out of supply, right? So if you saw a change happening in the marketplace but there's nothing to supply, you've really just messed up the business. And so, each of these stages builds on every other stage. So functional excellence is: Am I good at planning? Am I good at product management? Am I good at logistics? Because those are the foundations for operating in the end-to-end business model. This is why what Oracle's launching in the cloud, in fact all of Oracle's developments in the cloud, are so important, because you're effectively building a new process-oriented operating model that spans the entire business. If I started off with ERP systems and then I put logistics in place and tied it together, there's all sorts of disconnects in the business. You pick it up in cycle times, you pick it up in disconnects; sometimes they don't see changes to the marketplace for weeks.
So, this overarching end-to-end supply chain operating model in the cloud is a fundamental enabler. >> So how do you gauge a customer? First of all, I buy everything that you said, but I want to bring up a point, because it seems to me that's the theme of Oracle OpenWorld: traditional applications, and I'll just say the word silo just to use it as a point, have been a domain-specific thing. But to be end-to-end and be outside-in, which is the end game, you have to know how to talk and integrate with other systems, which might have been a problem even if you built the most badass end-to-end system. >> That is a part of the challenge and in fact, a lot of companies that I've worked with over the 15 years I've been researching this, they get stuck for that very reason. In other words, this is a re-engineering of the whole IT infrastructure versus having a thousand consultants come in and tie all my data together over a matter of four years and move 15 instances of whatever system you want to one. >> So, a question on the journey thing, you mentioned thousands of consultants, which customers are now seeing. They want faster mile posts, they want to see faster agility, but a lot of the consultants actually outline the journey for the customer. So they're saying, here's your journey, and they shorten the mile posts for the deliverables. But they're the ones getting paid for it, so is that the right model, should they be outlining the journey for the customer? >> And they are. It's been very interesting because I was a partner with a major global consulting company for four years and I've been mixing with them here, and they're suddenly recognizing that this path to the cloud is a bandwagon they'd better get on, because they're not going to have a thousand consultants deploying whatever ERP system you talk about as the future of IT.
So, what's happening is the business is having much more of a say in this fast deployment, fast time to value, putting these new-- >> So they're driving the journey for parameters? >> They are gearing up for this new journey, the consultants are. >> So, let's get to the fundamentals behind all this and ask a question about it. At the end of the day, digital technologies give customers an option to do their journeys very differently, whether in a B2B sense or a consumer sense. And as they use digital technologies, they're also giving data up, and so we have now a combination where customers are getting something out of digital, they are demanding it as part of the engagement model. They are giving up data along the way, and with the technologies for sensing and doing something with that data in business, we're now figuring out how that impacts business design, process design, and offering design. >> So, that's stage four, where what we talk about is people, process, and technology, versus, in the past, when you had stage one, two, and three: people as one set of projects, process as another set of projects, and technology as another set of projects. >> Yeah, I may or may not take some issue with the model you put out, but it does matter. At the end of the day, what is driving this increasingly is that it used to be that the dominant consideration, I think, and I'm testing you, the dominant consideration was assets. Where is the physical asset, where are the materials, where is the machine, and we'll focus our returns on these things and then presume that there's a demand for it, and now we're getting all this data about demand and that is having an impact on how we talk about arranging the assets. >> That is the inside-out to outside-in. So, let me give you an example without mentioning companies. A major retailer and a major pharmaceutical company.
They share pollen data, they share weather data, they mine Facebook to find out what people are saying about allergies, let's say in New England. And the ragweed's busting and they say, do we have the right levels of inventory, and they're moving inventory to make sure that people aren't on Facebook saying we can't buy this particular product. They're moving inventory, that's the difference. >> So, they're sharing data amongst themselves. >> Yes, and they're collaborating between retailers. >> Arguably a similar example is a retailer that's actually not moving inventory but moving pointers and offering new channel options, so that if they know somebody's going to come into the store and the size may not be there, they can still get it to them that day. >> So, it's very interesting, Procter and Gamble, who I did a lot of work with, and this is public domain information, the CEO drove two fundamental transformation messages in the business. And they called it the two moments of truth. He said, we will always have our product when we say we've got a product. So, if we promote a new product, the consumer goes to the shelf, it will be there. Moment of truth number two, we understand why consumers choose and use our products. And you don't fix number two until you fix number one, because if I wanted a small tube of toothpaste and I went in and there were only big ones, it's the wrong buying signal. So, what you're seeing is that whole flip to measuring what the market's looking for and shaping their demand and then making sure that the assets and the supply system are geared to deliver. >> Right, I want to ask you a question. First of all, I love that point, I love your point about the data, but here's the question: 'cause supply chain has been very instrumentation driven, okay, and that certainly is transforming, but now you mention Procter and Gamble.
We are living in an era where, for the first time in the history of business, you can actually now potentially measure everything. So how is that impacting the reconfiguration of the business model? I mean, Procter and Gamble has those moments of truth, every company will have a moment of truth, which is, everything is now measurable, from advertising to employee things and everything. >> So let's take the asset story versus the on-shelf thing, right, so when I have assets and I'm getting all the data out of my assets, what am I doing with all of that data, right? Because it's not connected to demand. What I've got to know is what demand data I really want to be able to move my assets to the right place. >> Peter: By the way, the shelf is an asset. >> Of course it is, yes. It's a sensing point and it's an asset. They own it, they replenish that shelf. So the point is, data is everywhere, and now the consulting and the BPM organizations supporting them, and companies doing their own business process management, they've got to know what data is really important and what data from the outside-in is going to allow me to leverage a new operating model for my business and become digital. >> So, this is really awesome, I was talking with an Oracle executive last night at one of their customer parties and we had a conversation around this data sharing. This is a new, different behavior. This is a theme of the show that no one's really talking about but it's in plain sight, which is there is a data sharing aspect of systems and vendors and companies. >> Roddy: That's why the cloud is so important. >> John: This is now impacting everything. >> Everything. >> How do companies go forward and do this? What are you seeing, is there a best practice, is there a starting point? Is there a five-step process on that? >> Well, first of all, these transformations are being led by the C-level executive team in a business.
This is no longer somebody who decides to buy a new IT system and plug it in to the business. So, the business is saying, how do we change the operating model of the way we work, right? So, and then, what are the capabilities, and this is where that five-stage model comes in, what capabilities do we need to look at building over the next three years so that we can operate in this end-to-end way, because you can't wake up tomorrow and go from an inside-out asset-driven business to an outside-in demand-driven business in two weeks. It ain't going to happen. >> So what's the progression? What's the progress bar look like when you have that moment of an epiphany and say, you know, I'm the CEO-- >> What's the burning platform of the business? If it's Procter and Gamble, I want X number of one-billion-dollar brands. If you're a pharmaceutical company, you want to launch brand new drugs and you want to do it at half the price and half the speed that you're used to. It's the business articulating, this is why the leadership teams are so fundamental, articulating what's the burning platform and then translating that back into the capabilities-- >> So you've got to reverse engineer. >> Outside-in. >> Outside-in, I love it. >> The way our research says it, and it's very similar but I want to test this, is we say start with context. >> Yes. >> What are you going to do with your customer that you have to do better than everybody else? And then identify the community that you're going to do it with and identify the capabilities that are going to delight that community. So it's context, community, and capabilities. >> Now here's a further piece to context. If context changes, how quickly do I sense that change and how fast can I respond to that change? Because if I've got all my asset capabilities and my supply capabilities locked into one set of context and that changes and I now have to re-engineer my whole business, I may lose the whole show in the process.
I've got to see those changes as they are happening, literally in real time. This is where the internet of things comes in, this is where demand shaping, demand sensing, retailers collaborating, suppliers connected into the supply chain, everybody sharing that information, and the fact is that not many people know how to do it. The culture of business is not yet at the point-- >> That's why I brought up the measurement thing, I mean Procter and Gamble, they used to say to their agencies, we know that 50% of our advertising is good, we don't know which half. So now they can measure it all just like in every other aspect, so this is where the business model-- >> You also have to be careful about whether or not, again going back to context changes, measurements change, data can blow you away. You have to be very smart about how you do it, so a lot of these intelligent things, machine learning, how the models get built, how the insights get delivered, all become very, very important. Very quickly, I have two quick questions for you. One is really proximate to the conversation, one less so, but the proximate one: IoT. IoT has many, many applications. Certainly turning analogue data into digital data so you can build models is a crucial piece of it. But it also has another implication in how you enact the output of that model back into the real world. How do supply chain and IoT come together? >> So if you look at the studies that are being done by Oracle and Gartner et cetera on what's important to the supply chain, two things come up. One is visibility and the other is analytics. Right, so there's tons of data available, to your point just now. That data could cause massive noise to the business unless you know what you're looking at. I know companies that will say, 95% visibility of changes on their demand side is good enough, but I'm good enough on the supply side to be able to adjust. But you've got to know which data to look at. So I'm looking at on-shelf.
I'm looking at what consumers are choosing and using, I'm looking to see what of my contract manufacturers-- >> Peter: Analyze key constraints. >> Bingo, so I think what we're all going to have to learn in the internet of things is we need, again, a cloud-based internet of things platform that does the analytics. >> Because we can rewire things faster. >> Exactly, you can adjust the business to new scenarios based on what you're reading from the demand side and what you're reading from the supply side. >> So you're a great foil for my second question. My second question is, you look back at the history, or the recent history let's call it, of strategy, very asset based. Porter said pick the industry that has the best returns, pick your position in that industry, then choose your games based on the five factor analysis that you want to play to get to that position. Very asset oriented, we're in control, that's going to dictate how things change. What you just suggested was a very, very different way of thinking about strategy. >> Same fundamentals. It's the same fundamentals but it's allowing yourself to adjust those fundamentals based on what's happening in the marketplace. >> Peter: But you're not going to base it on just the assets. >> No, we're not going to base it on the assets unless that's what you've focused on. If you're an engineering company and all you make is machines, you can't suddenly start producing toothpaste, for example. That's why I say it's a reconfiguration of those same principles, but flexible enough to meet demand. >> So how does the world of design and the world of strategy start to come together in the C-suite? >> Fundamentally, because it's the voice of the customer that starts to count. It's the voice of the customer that dictates the strategy. So if my customers don't want green Guinness for Saint Patrick's Day, don't make any, because it's going to hang around and get thrown away, right?
So, the voice of the customer determines what's happening on the demand side and the supply side has to be agile enough to meet that need. >> So, I would suggest keep Guinness the way it is because it's damn good the way it is, so personally I would agree on the Guinness comment. No green Guinness. >> So, what's the South Africa beer? >> Castle Lager. Well, SAB, South African Breweries, has been bought by Anheuser-Busch InBev, a massive big giant. >> We love beer and if there's any beer sponsors out there, we're happy looking for our Budweiser. We want a, maybe an IPA in there. Roddy, thanks for spending the time, coming in with you, appreciate it. Some thought leadership here on reconfiguration and looking at some of the nuances that are really going to impact the buyers. Here on The Cube at Oracle OpenWorld, we'll be back with more live coverage from SiliconANGLE's The Cube after this short break.

Published Date : Sep 22 2016

ENTITIES

Entity	Category	Confidence
Roddy	PERSON	0.99+
John	PERSON	0.99+
Peter Burris	PERSON	0.99+
Roddy Martin	PERSON	0.99+
Peter	PERSON	0.99+
Procter and Gamble	ORGANIZATION	0.99+
John Furrier	PERSON	0.99+
second question	QUANTITY	0.99+
50%	QUANTITY	0.99+
San Francisco	LOCATION	0.99+
New England	LOCATION	0.99+
South Africa	LOCATION	0.99+
Oracle	ORGANIZATION	0.99+
AMR Research	ORGANIZATION	0.99+
Gartner	ORGANIZATION	0.99+
95%	QUANTITY	0.99+
SiliconANGLE Media	ORGANIZATION	0.99+
Porter	PERSON	0.99+
five step	QUANTITY	0.99+
four years	QUANTITY	0.99+
Oracle Corp.	ORGANIZATION	0.99+
15 instances	QUANTITY	0.99+
one billion dollars	QUANTITY	0.99+
SiliconANGLE Media Inc.	ORGANIZATION	0.99+
One	QUANTITY	0.99+
Anheuser-Busch InBev	ORGANIZATION	0.99+
Facebook	ORGANIZATION	0.99+
two quick questions	QUANTITY	0.99+
last night	DATE	0.99+
tomorrow	DATE	0.99+
Saint Patrick's Day	EVENT	0.99+
two moments	QUANTITY	0.98+
Wikibon	ORGANIZATION	0.98+
day three	QUANTITY	0.98+
two weeks	QUANTITY	0.98+
SAB	ORGANIZATION	0.98+
two things	QUANTITY	0.98+
one	QUANTITY	0.98+
each	QUANTITY	0.98+
Three days	QUANTITY	0.97+
First	QUANTITY	0.97+
five stages	QUANTITY	0.97+
five factor	QUANTITY	0.97+
one set	QUANTITY	0.96+
today	DATE	0.95+
SiliconANGLE	ORGANIZATION	0.95+
Oracle Open World 2016	EVENT	0.94+
Oracle Open	EVENT	0.93+
five stage	QUANTITY	0.93+
Stage two	QUANTITY	0.9+
15 years	QUANTITY	0.89+
#oow16	EVENT	0.89+
two fundamental transformation messages	QUANTITY	0.88+
thousands of consultants	QUANTITY	0.88+
stage 4S	OTHER	0.87+
Stage three	QUANTITY	0.85+
stage one	QUANTITY	0.84+
single theater	QUANTITY	0.81+
three	QUANTITY	0.79+
two	QUANTITY	0.78+
first foundations	QUANTITY	0.78+
stage	QUANTITY	0.77+
half	QUANTITY	0.76+
thousand consultants	QUANTITY	0.76+
stage four	QUANTITY	0.76+
SC Supply Chain	ORGANIZATION	0.74+
BPM	ORGANIZATION	0.74+
The Cube	COMMERCIAL_ITEM	0.71+
first	QUANTITY	0.68+
tons	QUANTITY	0.65+


Search Results for Amr: